CN116987715B - Artificial gene driving system - Google Patents
- Publication number
- CN116987715B (application CN202311247476.0A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/415—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from plants
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8241—Phenotypically and genetically modified plants via recombinant DNA technology
- C12N15/8261—Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield
- C12N15/8271—Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance
- C12N15/8274—Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance for herbicide resistance
Abstract
The invention belongs to the field of biotechnology. In particular, the present invention relates to an artificial gene drive system. More specifically, the present invention relates to an artificial gene drive system based on a toxin-antidote mechanism applicable to plants, in which a gene editing system that inactivates a protein essential for pollen tube development serves as the toxin, and a recoded coding sequence of that protein, which encodes the wild-type protein but cannot be targeted by the gene editing system, serves as the antidote. The artificial gene drive system of the present invention can be used to spread traits beneficial to humans into wild populations of organisms.
Description
Technical Field
The invention belongs to the field of biotechnology. In particular, the present invention relates to an artificial gene drive system. More particularly, the present invention relates to an artificial gene drive system based on a toxin-antidote mechanism applicable to plants.
Background
Genetic manipulation of wild populations is of great significance for addressing diverse challenges such as controlling disease vectors (for example, mosquitoes), protecting biodiversity, and mitigating agricultural pest outbreaks. However, under classical Mendelian inheritance and Darwinian selection, traits beneficial to humans are often difficult to spread widely in these populations, because they tend to be selectively neutral or even detrimental to the organism itself. Nevertheless, selfish genetic elements are widespread in nature and can be transmitted to offspring at frequencies exceeding Mendelian expectation (> 50%, taking a diploid heterozygote as an example), thereby gaining an advantage for themselves. Inspired by these natural processes, artificial gene drive systems have been designed to propagate genetic alterations through populations regardless of the possible fitness cost to the individual organism. This technology therefore has the potential to spread human-beneficial traits into wild populations, providing an attractive tool for addressing the aforementioned global challenges.
Various synthetic gene drive systems have been implemented in yeast [1-2], mosquitoes [3-7], Drosophila [8-10], and mice [11] for population modification or population suppression. These gene drive systems are mostly based on homing endonucleases (so-called homing-based drives) and function through a "copy-paste" mechanism: CRISPR-Cas9-mediated DNA double-strand breaks (DSBs) followed by homology-directed repair (HDR) convert heterozygotes into homozygotes, so that the gene drive element is inherited by more than 50% of the offspring. If a cargo gene [3] is linked within the gene drive element, the cargo spreads together with the drive and population modification can be achieved; if the gene drive element itself is located within a fertility-related essential gene [4,5], population suppression can be achieved through continued cleavage of the homologous allele. However, if HDR repair does not succeed, the DSB is repaired by non-homologous end joining (NHEJ), which introduces indels and creates a resistance allele that can no longer be targeted for cleavage by the gRNA. NHEJ is particularly prevalent in plants and therefore poses a significant challenge to the sustained spread of gene drives in plant populations. Mechanisms that do not rely on the HDR repair pathway are thus vital in the pursuit of more efficient synthetic gene drive systems.
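The effect of transmission above 50% can be illustrated with a minimal Python sketch. It assumes a deterministic, randomly mating diploid population with no fitness cost and no resistance alleles; the conversion efficiency `c` and all function names are hypothetical and are not taken from this document.

```python
def heterozygote_transmission(c: float) -> float:
    """Fraction of gametes from a drive/wild-type heterozygote that carry the drive.

    c is an assumed probability that the wild-type allele is converted to the
    drive allele (cleavage followed by HDR). c = 0 gives the Mendelian 50%.
    """
    if not 0.0 <= c <= 1.0:
        raise ValueError("c must be between 0 and 1")
    return 0.5 * (1.0 + c)


def next_frequency(p: float, c: float) -> float:
    """Drive allele frequency after one generation of random mating."""
    t = heterozygote_transmission(c)
    # Homozygotes (frequency p^2) always transmit the drive;
    # heterozygotes (frequency 2p(1-p)) transmit it at rate t.
    return p * p + 2.0 * p * (1.0 - p) * t


def simulate(p0: float, c: float, generations: int) -> list[float]:
    """Drive allele frequencies from generation 0 to the given generation."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], c))
    return freqs
```

With `c = 0` the frequency stays constant, as expected under Mendelian inheritance; any `c > 0` makes the frequency rise each generation in this idealized model. Resistance alleles formed by NHEJ, which the paragraph above identifies as the main obstacle in plants, are deliberately left out of the sketch.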
The toxin-antidote (TA) mechanism, exemplified by the naturally occurring t haplotype in mice, offers a very promising hint. The toxin is usually expressed before meiosis and is therefore present in all four resulting gametes, where it interferes with normal gametogenesis, whereas the antidote is activated after meiosis and can mitigate or neutralize the damage caused by the toxin, giving its carriers a transmission advantage. Although these natural toxin-antidote systems cannot be transferred directly into other species, the advent of CRISPR-Cas9 provides a more versatile way to mimic the natural toxin-antidote strategy. In an artificially designed TA system, an essential gene is cleaved by CRISPR/Cas9 and repaired through the NHEJ pathway, and the resulting loss-of-function (LOF) allele serves as the toxin, while a recoded sequence of the essential gene that cannot be targeted by the gRNA serves as the antidote and rescues the effect of the toxin.
A Toxin-Antidote Recessive Embryo (TARE) drive system [10], also known as Cleave and Rescue (ClvR) [9], has been developed. It targets a gene essential for zygotic development: an individual dies when both alleles of that gene are inactivated (LOF) and no gene drive element is present, whereas individuals that inherit the gene drive element survive. This system has achieved a gene drive transmission rate of 88-95% in female heterozygotes of Drosophila melanogaster [10]. Although very effective, the efficiency of this system depends on carryover of Cas9 activity from the egg cell into the zygote (so that the paternally contributed WT allele can be cleaved), or requires the target gene to be located on a sex chromosome, which somewhat hampers its application across a wide range of species.
In addition, another design principle that has not yet been realized is the Toxin-Antidote Dominant Sperm (TADS) drive [12], which aims to disrupt an essential gene in spermatogenesis. In theory it drives more efficiently and bypasses the above limitations of TARE, because only one copy of the target gene needs to be disrupted rather than both alleles as in the TARE system. However, its technical implementation is hampered by the difficulty of identifying essential genes that affect only spermatogenesis.
Disclosure of Invention
In this work, the inventors developed a gene drive system named CAIN (CRISPR-Assisted Inheritance utilizing NPG1). CAIN is a carefully designed artificial gene drive system based on the toxin-antidote mechanism that can be applied to plants: it exploits the relatively long male gametophyte development of plants and targets and cleaves NPG1 (No Pollen Germination 1), an essential gene for pollen grain germination. Experiments show that in Arabidopsis thaliana the system raised the transmission rate from the Mendelian 50% to 88-99% over two consecutive generations, successfully achieving segregation distortion (biased inheritance) at a markedly super-Mendelian ratio. The success of CAIN indicates its potential for application in a variety of plant species and offers solutions to important challenges, such as slowing the spread of invasive species by biasing the inheritance of sterility genes and managing weed populations by spreading genes that confer sensitivity to certain herbicides, opening a new era of ecological management and sustainable agriculture.
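The expected effect of pollen-stage NPG1 disruption on inheritance can be illustrated with a short calculation. The sketch below is not taken from the specification: it assumes a single hemizygous CAIN insertion that segregates independently of the endogenous NPG1 locus, full penetrance (pollen without functional NPG1 never fertilizes), and equal fertilization success for all viable pollen. The function name `predicted_cain_fraction` is hypothetical.

```python
from fractions import Fraction


def predicted_cain_fraction(alleles_cut: int) -> Fraction:
    """Expected share of CAIN-carrying F1 seeds from a wild-type female x CAIN/+ male cross.

    alleles_cut: number of endogenous NPG1 alleles (0, 1 or 2) disrupted in
    the male germline before meiosis. Illustrative assumptions only.
    """
    if alleles_cut not in (0, 1, 2):
        raise ValueError("alleles_cut must be 0, 1 or 2")
    half = Fraction(1, 2)
    # Chance that a pollen grain inherits a disrupted endogenous NPG1 allele.
    p_disrupted = Fraction(alleles_cut, 2)
    # CAIN pollen always germinates: the recoded NPG1 copy acts as the antidote.
    viable_cain = half
    # Non-CAIN pollen germinates only if it kept a functional NPG1 allele.
    viable_non_cain = half * (1 - p_disrupted)
    return viable_cain / (viable_cain + viable_non_cain)


for n in (0, 1, 2):
    print(n, predicted_cain_fraction(n))
```

Under these assumptions, cutting neither allele leaves Mendelian inheritance unchanged, while cutting both alleles leaves only CAIN-carrying pollen able to fertilize.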
The present invention includes, but is not limited to, the following embodiments:
Embodiment 1. An artificial gene driving system for plants comprising:
a first nucleic acid comprising a coding sequence for a component of a gene editing system that can target a gene encoding a pollen tube development essential protein in the plant, for example its coding sequence, and cause the pollen tube development essential protein to lose function, the coding sequence for the component of the gene editing system being operably linked to a promoter that mediates specific expression during pollen formation;
a second nucleic acid comprising a recoded coding sequence for the pollen tube development essential protein, the recoded sequence encoding a wild-type pollen tube development essential protein and not being targeted by the gene editing system and being operably linked to a native promoter of the pollen tube development essential gene; and
a third nucleic acid comprising a coding sequence for a cargo, e.g., a cargo to be transmitted in a population of the plant.
Embodiment 2. The artificial gene driving system of embodiment 1 wherein the first nucleic acid, the second nucleic acid and the third nucleic acid are located on the same expression construct.
Embodiment 3. The artificial gene driving system of embodiment 1 or 2, wherein the pollen tube development essential protein is No Pollen Germination 1 (NPG1).
Embodiment 4. The artificial gene driving system of embodiment 3, wherein NPG1 comprises an amino acid sequence having at least 70%, at least 75%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, even 100% sequence identity to SEQ ID No. 1.
Embodiment 5. The artificial gene driving system of embodiment 3, wherein the coding sequence of endogenous NPG1 in the plant comprises a nucleotide sequence having at least 70%, at least 75%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, even 100% sequence identity to SEQ ID No. 2.
Embodiment 6. The artificial gene driving system of embodiment 3, wherein the coding sequence of the recoded NPG1 comprises a nucleotide sequence having at least 70%, at least 75%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, even 100% sequence identity to SEQ ID NO. 3, and the recoded NPG1 cannot be targeted by the gene editing system and thus will not be rendered non-functional by the expression of the gene editing system.
Embodiment 7. The artificial gene driving system of embodiment 3, wherein the native promoter of NPG1 comprises a nucleotide sequence having at least 70%, at least 75%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, even 100% sequence identity to SEQ ID No. 4.
Embodiment 8. The artificial gene driving system of any one of embodiments 1-7, wherein the promoter mediating specific expression during pollen formation is the promoter of the DMC1 (Disruption of Meiotic Control 1) gene.
Embodiment 9. The artificial gene driving system of embodiment 8, wherein the DMC1 promoter comprises a nucleotide sequence having at least 70%, at least 75%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, even 100% sequence identity to SEQ ID No. 5.
Embodiment 10. The artificial gene drive system of any one of embodiments 1-7, wherein the promoter that mediates specific expression during pollen formation is the promoter of the TPD1 (Tapetum Determinant 1) gene.
Embodiment 11. The artificial gene driving system of embodiment 10, wherein the TPD1 promoter comprises a nucleotide sequence having at least 70%, at least 75%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, even 100% sequence identity to SEQ ID No. 6.
Embodiment 12. The artificial gene drive system of any one of embodiments 1-11, wherein the gene editing system is selected from CRISPR-, ZFN- or TALEN-based gene editing systems, preferably the gene editing system is a CRISPR-based gene editing system.
Embodiment 13. The artificial gene drive system of embodiment 12 wherein the CRISPR gene editing system comprises a CRISPR nuclease and at least one guide RNA, preferably the CRISPR nuclease is a Cas9 nuclease.
Embodiment 14. The artificial gene drive system of embodiment 13, wherein the coding sequence of the CRISPR nuclease is operably linked to the promoter that mediates specific expression during pollen formation, preferably to the TPD1 promoter.
Embodiment 15. The artificial gene drive system of embodiment 13, wherein the gene editing system comprises a Cas9 nuclease and at least one gRNA targeting endogenous NPG1.
Embodiment 16. The artificial gene driving system of embodiment 15, wherein the at least one gRNA targeting endogenous NPG1 targets a nucleotide sequence selected from any one of SEQ ID NOs 7 to 10.
Embodiment 17. The artificial gene driving system of any of embodiments 1-16, wherein the expression of the cargo is detrimental or beneficial to the plant when the plant is exposed to a particular compound or condition, e.g., the cargo is a herbicide sensitive gene, a gene that disrupts herbicide resistance, a gene that enhances environmental adaptation, or a gene that enhances disease resistance.
Embodiment 18. A method of producing a modified plant for modifying a plant population by gene drive, the method comprising introducing the artificial gene drive system of any one of embodiments 1-17 into at least one plant, thereby obtaining at least one modified plant whose genome has integrated the first nucleic acid, the second nucleic acid, and the third nucleic acid.
Embodiment 19. The method of embodiment 18, wherein the first nucleic acid, second nucleic acid, and third nucleic acid integrated into the genome of the modified plant are closely linked, e.g., located at the same locus.
Embodiment 20. Use of a modified plant for modifying a population of plants by gene drive, wherein the modified plant is obtained by the method of embodiment 18 or 19, or the artificial gene driving system for plants according to any of embodiments 1 to 17 has been introduced into the modified plant, whereby said first, second and third nucleic acids are integrated into its genome.
Embodiment 21. A modified plant for modifying a population of plants by gene drive, wherein the modified plant is obtained by the method of embodiment 18 or 19, or the artificial gene driving system for plants according to any of embodiments 1 to 17 has been introduced into the modified plant, whereby said first, second and third nucleic acids are integrated into its genome.
Embodiment 22. A method of modifying a plant population by gene driving, the method comprising placing at least one modified plant of embodiment 21 into the population of plants and allowing the at least one modified plant to cross with other plants in the plant population.
Embodiment 23. The method of embodiment 22, wherein the method allows the progeny of the at least one modified plant that crosses with other plants in the plant population to cross with other plants and/or progeny in the population.
Embodiment 24. The method of embodiment 22 or 23, resulting in an increased proportion of plants carrying the cargo in the population of modified plants as compared to the population of unmodified plants.
Drawings
FIG. 1, The CAIN gene drive and its theoretical inheritance in Arabidopsis. a, Elements of the CAIN gene drive. The region between the left and right borders is shown, together with the corresponding developmental stages of the Arabidopsis male gametophyte. TPD1: Tapetum Determinant 1. DMC1: Disruption of Meiotic Control 1. NPG1: No Pollen Germination 1. b, Predicted proportion of CAIN-carrying individuals among the F1 offspring of a cross between a wild-type female parent and a CAIN-carrying male parent, assuming that two, one or none of the NPG1 alleles in male germline cells are cut and lose function. Salmon, sky-blue and gray boxes represent the female parent, male parent and offspring, respectively. Dashed boxes represent gametophytes. Red crosses represent pollen grains that fail to germinate.
FIG. 2, Transmission rate of the CAIN gene drive from the T1 to the F1 generation in test crosses. a, Generation of T1 plants by transformation and the subsequent crossing steps. Transgenic T1 plants were obtained by Agrobacterium-mediated transformation with either the control vector (FAST only) or one of the gene drive vectors (DMC-CAIN or TPD-CAIN). T1 plants with a single-locus insertion were used as the male parent and wild-type Col-0 as the female parent to obtain the F1 generation. b, Transmission efficiency of the CAIN gene drive, measured as the proportion of FAST+ F1 seeds among all F1 seeds. Each red dot represents the transmission efficiency in a single silique.
FIG. 3, Genotypes of FAST+ F1 plants at the NPG1 locus in the TPD-CAIN experiment. a, Schematic of the genotyping of F1 somatic tissues (rosette leaves and inflorescences). b, Summary of genotyping results for 16 FAST+ F1 plants at the four gRNA target sites. The symbols "+" and "-" preceding a base or a number denote insertion and deletion events, respectively. A number following the symbol gives the number of nucleotides in an indel longer than 2 nucleotides. The notation "A>C" denotes a base substitution from adenine (A) to cytosine (C). c, Genotyping results at the gRNA11 target site based on Illumina sequencing.
FIG. 4, Transmission rate of TPD-CAIN from the F1 to the F2 generation in backcrosses. Average transmission efficiency of TPD-CAIN in the offspring when FAST+ F1 plants were used as the male parent (a) or the female parent (b).
FIG. 5, Transmission rate of DMC-CAIN from the F1 to the F2 generation. a, Genotyping of inflorescences of F1 plants. b, Summary of genotypes of 12 FAST+ F1 plants at the four gRNA target sites. c, Average transmission efficiency of DMC-CAIN in F2 seeds produced with FAST+ F1 plants as the male parent.
FIG. 6, Genotypes of FAST- F1 and FAST+/- F2 plants at the NPG1 locus. a, Genotyping of leaves of FAST- F1 plants produced with a T1 plant carrying TPD-CAIN as the male parent. b, Summary of genotypes of 11 FAST- F1 plants at the four gRNA target sites. The inferred mechanism (incomplete cutting efficiency or incomplete penetrance) is marked below each F1 plant. c, Genotyping of FAST+ and FAST- F2 plants at the four gRNA target sites by Sanger sequencing. The F2 plants were produced by reciprocal crosses between F1 plants carrying TPD-CAIN (TPD-CAIN/+) and the wild type (+/+).
FIG. 7, Simulations of the spread dynamics of modification-type and suppression-type CAIN drives. a, Computational simulation showing the effect of different settings of male germ cell cutting efficiency (empirical value: 98.4%; artificial settings: 50.0% and 100.0%) and penetrance (empirical value: 96.0%; artificial settings: 50.0% and 100.0%) on CAIN spread dynamics. b, Computational simulation of the spread dynamics of homing, TARE and CAIN drives with an initial release proportion of 1%. The cleavage efficiency of both the homing and TARE drives was set to the maximum. For CAIN, penetrance was set to the empirical value of 96.0%, and different male (empirical value 98.4% or artificial setting 50.0%) and female (empirical value 94.1%, or artificial settings 0.0% and 50.0%) germ cell cutting efficiencies were used. c, Spread dynamics of the suppression-type CAIN drive. The figure shows the number and frequency of CAIN carriers and wild-type individuals in each generation.
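The way cutting efficiency and penetrance shape the spread of a pollen-acting toxin-antidote drive can be explored with a toy deterministic model. The sketch below is not the simulation used for FIG. 7. It assumes a hermaphroditic, randomly mating population with non-overlapping generations, cutting only in the male germline, no fitness cost, no resistance alleles, and pollen in excess so that a heterozygous father sires as many seeds as any other plant. It also ignores disrupted NPG1 alleles that are passed on without the drive. All function and parameter names are hypothetical.

```python
def next_generation(dd, dw, ww, transmission):
    """Return genotype frequencies (drive/drive, drive/wt, wt/wt) after random mating.

    transmission: share of a heterozygous father's successful pollen that carries the drive.
    """
    egg_d = dd + 0.5 * dw               # eggs segregate in Mendelian fashion
    egg_w = ww + 0.5 * dw
    pollen_d = dd + transmission * dw   # biased transmission through pollen
    pollen_w = ww + (1.0 - transmission) * dw
    return (egg_d * pollen_d,
            egg_d * pollen_w + egg_w * pollen_d,
            egg_w * pollen_w)


def simulate(release=0.01, cut=0.984, penetrance=0.96, generations=30):
    """Track the drive allele frequency after releasing heterozygous carriers."""
    # Share of non-drive pollen from a heterozygote that fails to germinate.
    kill = cut * penetrance
    # Surviving pollen: 0.5 drive versus 0.5 * (1 - kill) non-drive.
    transmission = 0.5 / (0.5 + 0.5 * (1.0 - kill))
    dd, dw, ww = 0.0, release, 1.0 - release
    freqs = [dd + 0.5 * dw]
    for _ in range(generations):
        dd, dw, ww = next_generation(dd, dw, ww, transmission)
        freqs.append(dd + 0.5 * dw)
    return freqs


if __name__ == "__main__":
    for generation, freq in enumerate(simulate()):
        print(generation, round(freq, 4))
```

With `cut=0.0` the model reduces to Mendelian inheritance and the allele frequency stays constant, which gives a simple sanity check on the toy dynamics.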
FIG. 8, The four gRNAs involved in the CAIN gene drive. a, The CAIN vector contains four gRNAs in tandem. The NPG1 sequence was altered using synonymous codons, without changing the encoded amino acids, to give the recoded NPG1; the mutated nucleotides are marked with red boxes. b, Positions of the four gRNA target sequences on the genomic sequence. Primers used for genotyping are labeled. A genomic region covering the four target sites was first amplified with the primer pair NPG-gDNA-F1 and NPG-gDNA-R1_2, and the PCR products were then Sanger sequenced. The primers used for sequencing were NPG-gDNA-F1, NPG-gDNA-F2, NPG-gDNA-R1_1 and NPG-gDNA-R2.
FIG. 9, Maps of the CAIN gene drive vectors. The control vector FAST only (a) and the DMC-CAIN (b) and TPD-CAIN (c) vectors are shown, with the total sequence length and main features marked.
FIG. 10, Summary of the experimental procedure. Step 1: Arabidopsis plants of the Col-0 background were transformed with the control vector (FAST only) or with the DMC-CAIN or TPD-CAIN gene drive vector. Successfully transformed seeds were picked directly by the FAST phenotype (red fluorescence). T1 plants with a single-locus insertion were selected by TAIL-PCR and whole-genome sequencing. Step 2: crosses were made with T1 plants as the male parent and Col-0 as the female parent. The percentage of FAST+ seeds among the F1 seeds was taken as the transmission rate of the CAIN gene drive (drive %). Step 3: F1 seeds were sown to obtain F1 plants, and each F1 plant was genotyped as described in the methods section. Step 4: F1 plants of known genotype were crossed, as male or female parent, with Col-0 plants. The drive transmission rate in F2 seeds was also counted. Step 5: F2 seeds were sown to obtain F2 plants, which were genotyped in the same way.
FIG. 11, Types of mutations detected in FAST+ F1 plants. The figure shows the insertions, deletions and single-nucleotide polymorphisms generated at the (a) gRNA2 and (b) gRNA11 target sites, and their positions. An asterisk (*) marks alternative possible alignments, since the underlined bases can be placed on either side of the deletion.
FIG. 12, Reversible TPD-CAIN gene drive. In the design of a new version of the gene drive, TPD-CAIN(n+1), gRNAs(n+1) targeting new sites in NPG1 act as the new toxin, while Recoded(n+1), which is resistant to (cannot be targeted for cleavage by) both gRNAs(n) and gRNAs(n+1), acts as the antidote. When Cas9 is active and cleaves in germ cells, the new toxin destroys both the genomic NPG1 and Recoded(n), so rescue is possible only through Recoded(n+1). In this way, if the new version TPD-CAIN(n+1) is located at the same (homologous) position as the old version TPD-CAIN(n), the new version will eliminate and replace the old version. The new cargo(n+1) linked to it spreads along with it.
FIG. 13, Effect of male germ cell cleavage efficiency, incomplete penetrance and female germ cell cleavage efficiency on the TPD-CAIN system. a, Estimation of male germ cell cleavage efficiency and penetrance. Among the F1 offspring, 94.3% (526/558) of the seeds were FAST+ (TPD-CAIN/+) with the NPG1- genotype at the gRNA11 target site. In addition, 2.6% (5.7% × 5/11) were +/+; NPG1+/- plants and 3.1% (5.7% × 6/11) were +/+; NPG1+/+ plants. In the F2 generation, 94.8% (3868/4080) were FAST+ (TPD-CAIN/+) with the NPG1- genotype at the gRNA11 target site, and the remaining 5.2% were +/+; NPG1+/-. Based on the statistics from the F1 and F2 generations, the average cleavage failure rate was estimated to be 1.6%, i.e., the male germ cell cleavage efficiency was 98.4%. The average penetrance was estimated to be 96.0%. b, Estimation of the female germ cell cleavage efficiency (r). Since no further cleavage of the target site occurs in FAST- F2 plants, the cleavage efficiency r was calculated from the NPG1 genotype at the gRNA11 target site. Since only one of the 34 FAST- F2 plants had the +/+; NPG1+/+ genotype, r was estimated to be 94.1%.
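The percentages quoted for FIG. 13 can be re-derived from the reported counts. The sketch below reproduces the stated F1 and F2 proportions. For the female cleavage efficiency r, the specification does not spell out the formula; the code uses one reading that reproduces the stated 94.1%, namely that half of the eggs of an NPG1+/- F1 mother already carry the disrupted allele, so the uncut fraction among FAST- F2 plants equals (1 - r)/2. That reading, and the helper names, are assumptions.

```python
def percent(numerator, denominator):
    """Percentage rounded to one decimal place, as quoted in the text."""
    return round(100.0 * numerator / denominator, 1)


# F1: FAST+ seeds among all F1 seeds, and the split of the FAST- remainder
# according to the 11 genotyped FAST- F1 plants (5 NPG1+/-, 6 NPG1+/+).
f1_fast_pos = percent(526, 558)
f1_fast_neg = round(100.0 - f1_fast_pos, 1)
f1_neg_het = round(f1_fast_neg * 5 / 11, 1)   # +/+; NPG1+/-
f1_neg_wt = round(f1_fast_neg * 6 / 11, 1)    # +/+; NPG1+/+

# F2: FAST+ seeds among all F2 seeds.
f2_fast_pos = percent(3868, 4080)
f2_fast_neg = round(100.0 - f2_fast_pos, 1)


def female_cut_efficiency(uncut_plants, total_plants):
    """Assumed reading: uncut fraction among FAST- F2 plants = (1 - r) / 2."""
    return round(100.0 * (1.0 - 2.0 * uncut_plants / total_plants), 1)


r = female_cut_efficiency(1, 34)

print(f1_fast_pos, f1_neg_het, f1_neg_wt, f2_fast_pos, f2_fast_neg, r)
```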
FIG. 14, Potential applications of the TPD-CAIN gene drive. TPD-CAIN has two potential directions of application. a, Improving plant fitness: by spreading genes that help specific endangered species adapt to their environment, TPD-CAIN can achieve rapid genetic rescue and make the target species better suited to its habitat. b, Weed management: by spreading genes that confer herbicide sensitivity in the target weed, combined with subsequent local herbicide application, TPD-CAIN enables efficient regional weed management.
Detailed Description
1. Definitions
In the present invention, unless otherwise indicated, scientific and technical terms used herein have the meanings commonly understood by one of ordinary skill in the art. Also, the terms related to protein and nucleic acid chemistry, molecular biology, microbiology and laboratory procedures used herein are terms and conventional procedures that are widely used in the corresponding field. For example, standard recombinant DNA and molecular cloning techniques for use in the present invention are well known to those skilled in the art and are more fully described in the following document: Sambrook, J., Fritsch, E.F., and Maniatis, T., Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, 1989 (hereinafter "Sambrook"). Meanwhile, in order to better understand the present invention, definitions and explanations of related terms are provided below.
As used herein, the term "and/or" encompasses all combinations of items connected by the term, and each combination should be regarded as having been individually listed herein. For example, "A and/or B" encompasses "A", "A and B", and "B". For example, "A, B and/or C" encompasses "A", "B", "C", "A and B", "A and C", "B and C" and "A and B and C".
The term "comprising" is used herein to describe a sequence of a protein or nucleic acid, which may consist of the sequence, or may have additional amino acids or nucleotides at one or both ends of the protein or nucleic acid, but still have the activity described herein. Furthermore, it will be clear to those skilled in the art that the methionine encoded by the start codon at the N-terminus of a polypeptide may be retained in some practical situations (e.g., when expressed in a particular expression system) without substantially affecting the function of the polypeptide. Thus, in describing a particular polypeptide amino acid sequence in the specification and claims, although it may not comprise a methionine encoded at the N-terminus by the initiation codon, a sequence comprising such methionine is also contemplated at this time, and accordingly, the encoding nucleotide sequence may also comprise the initiation codon; and vice versa.
"exogenous" with respect to a sequence means a sequence from a foreign species, or if from the same species, a sequence that has undergone significant alteration in composition and/or locus from its native form by deliberate human intervention.
"Polynucleotide", "nucleic acid sequence", "nucleotide sequence" or "nucleic acid" are used interchangeably and are a single-or double-stranded RNA or DNA polymer, optionally containing synthetic, unnatural or altered nucleotide bases. Nucleotides are referred to by their single letter designations as follows: "A" is adenosine or deoxyadenosine (corresponding to RNA or DNA, respectively), "C" represents cytidine or deoxycytidine, "G" represents guanosine or deoxyguanosine, "U" represents uridine, "T" represents deoxythymidine, "R" represents purine (A or G), "Y" represents pyrimidine (C or T), "K" represents G or T, "H" represents A or C or T, "D" represents A, T or G, "I" represents inosine, and "N" represents any nucleotide.
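The single-letter designations above can be captured in a small lookup table. The following sketch, with the hypothetical names `NUCLEOTIDE_CODES` and `matches`, covers only the codes listed in this paragraph, treats "N" as any of the four DNA bases, and omits inosine ("I"), whose pairing behaviour is not defined here.

```python
# Ambiguity codes as listed in the definition above (DNA alphabet).
NUCLEOTIDE_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "U",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "K": "GT",
    "H": "ACT",
    "D": "AGT",
    "N": "ACGT",  # any nucleotide
}


def matches(pattern: str, sequence: str) -> bool:
    """Return True if every base of `sequence` is allowed by the code at the same position."""
    if len(pattern) != len(sequence):
        return False
    return all(base in NUCLEOTIDE_CODES[code]
               for code, base in zip(pattern.upper(), sequence.upper()))


print(matches("RYKHDN", "ACGATC"))
print(matches("RY", "CC"))
```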
Codon optimization refers to a method of modifying a nucleic acid sequence to enhance expression in a host cell of interest by replacing at least one codon of the native sequence with a more or most frequently used codon in the gene of the host cell (e.g., about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50 or more codons while maintaining the native amino acid sequence).
"polypeptide", "peptide", and "protein" are used interchangeably herein to refer to a polymer of amino acid residues. The term applies to amino acid polymers in which one or more amino acid residues are artificial chemical analogues of the corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers. The terms "polypeptide", "peptide", "amino acid sequence" and "protein" may also include modified forms including, but not limited to, glycosylation, lipid attachment, sulfation, gamma carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.
Sequence "identity" has its art-recognized meaning, and the percent sequence identity between two nucleic acid or polypeptide molecules or regions can be calculated using published techniques. Sequence identity may be measured along the full length of a polynucleotide or polypeptide or along a region of the molecule. (See, e.g., Computational Molecular Biology, Lesk, A.M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D.W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A.M., and Griffin, H.G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991). Although there are many methods of measuring identity between two polynucleotides or polypeptides, the term "identity" is well known to the skilled artisan (Carrillo, H. & Lipman, D., SIAM J Applied Math 48:1073 (1988)).
In peptides or proteins, suitable conservative amino acid substitutions are known to those skilled in the art, and can generally be made without altering the biological activity of the resulting molecule. In general, one skilled in the art recognizes that single amino acid substitutions in the non-essential region of a polypeptide do not substantially alter biological activity (see, e.g., Watson et al., Molecular Biology of the Gene, 4th Edition, 1987, The Benjamin/Cummings Pub. Co., p. 224).
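As an illustration of how a percent identity figure can be obtained, the sketch below counts identical columns in two sequences that have already been aligned to the same length (gaps written as "-"). This is only one common convention: alignment tools differ in how they treat gaps and which length they use as the denominator, and the specification does not prescribe a particular method. The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical residues (gap columns count as mismatches)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore columns where both sequences have a gap.
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no comparable positions")
    identical = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * identical / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """Check an 'at least X% sequence identity' condition."""
    return percent_identity(aligned_a, aligned_b) >= threshold


print(round(percent_identity("ATGCAT", "ATGAAT"), 1))
print(meets_threshold("ATGCAT", "ATGAAT", 70.0))
```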
As used herein, an "expression construct" refers to a vector, such as a recombinant vector, suitable for expression of a nucleotide sequence of interest in an organism. "expression" refers to the production of a functional product. For example, expression of a nucleotide sequence may refer to transcription of the nucleotide sequence (e.g., transcription into mRNA or functional RNA) and/or translation of RNA into a precursor or mature protein.
The "expression construct" of the invention may be a linear nucleic acid fragment, a circular plasmid, a viral vector, or, in some embodiments, may be an RNA (e.g., mRNA) that is capable of translation, such as RNA produced by in vitro transcription.
The "expression construct" of the invention may comprise regulatory sequences of different origin and nucleotide sequences of interest, or regulatory sequences and nucleotide sequences of interest of the same origin but arranged in a manner different from that normally found in nature.
"regulatory sequence" and "regulatory element" are used interchangeably and refer to a nucleotide sequence that is located upstream (5 'non-coding sequence), intermediate or downstream (3' non-coding sequence) of a coding sequence and affects transcription, RNA processing or stability, or translation of the relevant coding sequence. Regulatory sequences may include, but are not limited to, promoters, translation leader sequences, introns, and polyadenylation recognition sequences.
"promoter" refers to a nucleic acid fragment capable of controlling transcription of another nucleic acid fragment. In some embodiments of the invention, the promoter is a promoter capable of controlling transcription of a gene in a cell, whether or not it is derived from the cell. The promoter may be a constitutive or tissue specific or developmentally regulated or inducible promoter.
As used herein, the term "operably linked" refers to a regulatory element (e.g., without limitation, a promoter sequence, a transcription termination sequence, etc.) linked to a nucleic acid sequence (e.g., a coding sequence or an open reading frame) such that transcription of the nucleotide sequence is controlled and regulated by the transcription regulatory element. Techniques for operably linking a regulatory element region to a nucleic acid molecule are known in the art.
"introducing" a nucleic acid molecule (e.g., plasmid, linear nucleic acid fragment, RNA, etc.) or protein into an organism refers to transforming a cell of the organism with the nucleic acid or protein such that the nucleic acid or protein is capable of functioning in the cell. "transformation" as used herein includes both stable transformation and transient transformation. "Stable transformation" refers to the introduction of an exogenous nucleotide sequence into the genome, resulting in stable inheritance of an exogenous gene. Once stably transformed, the exogenous nucleic acid sequence is stably integrated into the genome of the organism and any successive generation thereof. "transient transformation" refers to the introduction of a nucleic acid molecule or protein into a cell to perform a function without stable inheritance of an exogenous gene. In transient transformation, the exogenous nucleic acid sequence is not integrated into the genome.
As used herein, the term "plant" includes whole plants and any progeny, cells, tissues, or parts of plants. The term "plant part" includes any part of a plant, including, for example, but not limited to: seeds (including mature seeds, immature embryos without seed coats, and immature seeds); plant cuttings; plant cells; plant cell cultures; plant organs (e.g., pollen, embryos, flowers, fruits, shoots, leaves, roots, stems, and related explants). The plant tissue or plant organ may be a seed, a callus, or any other population of plant cells organized into structural or functional units. Plant cells or tissue cultures are capable of regenerating plants having the physiological and morphological characteristics of the plant from which the cells or tissue are derived, and of regenerating plants having substantially the same genotype as the plant. In contrast, some plant cells are not capable of regenerating to produce plants. The regenerable cells in the plant cells or tissue culture may be embryos, protoplasts, meristematic cells, callus tissue, pollen, leaves, anthers, roots, root tips, filaments, flowers, kernels, ears, cobs, husks, or stems.
Plant "progeny" includes any subsequent generation of a plant.
2. Artificial gene driving
In one aspect, the present invention provides an artificial gene drive system for plants comprising:
a first nucleic acid comprising a coding sequence for a gene editing system component that can target a gene encoding a pollen tube development essential protein in the plant and cause the pollen tube development essential protein to lose function, the coding sequence of the gene editing system component being operably linked to a promoter that mediates specific expression during pollen formation;
a second nucleic acid comprising a recoded coding sequence for the pollen tube development essential protein, the recoded sequence encoding the functional pollen tube development essential protein and not being targeted by the gene editing system and being operably linked to a native promoter of a gene encoding the pollen tube development essential protein; and
a third nucleic acid comprising a coding sequence for a cargo to be transmitted in a population of the plant.
In some embodiments, the first nucleic acid, the second nucleic acid, and the third nucleic acid are located on the same expression construct.
In some embodiments, the gene encoding a pollen tube development essential protein is an endogenous gene of the plant. In some embodiments, the gene encoding a protein essential for pollen tube development is an exogenous gene that has been introduced into the plant.
In some embodiments, the pollen tube development essential protein is No Pollen Germination 1 (NPG1). NPG1 is involved in the development of the male gametophyte but does not affect female gametophyte development, and is required for the later stages of pollen germination. NPG1 is well conserved among different plants.
In some embodiments, NPG1 comprises an amino acid sequence having at least 70%, at least 75%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, even 100% sequence identity to SEQ ID NO. 1.
In some embodiments, the coding sequence for endogenous NPG1 in a plant comprises a nucleotide sequence having at least 70%, at least 75%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, even 100% sequence identity to SEQ ID No. 2.
In some embodiments, the coding sequence of the recoded NPG1 comprises a nucleotide sequence having at least 70%, at least 75%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, even 100% sequence identity to SEQ ID NO. 3. The coding sequence of the recoded NPG1 cannot be targeted by the gene editing system and thus is not rendered non-functional by the expression of the gene editing system.
In general, a promoter of a gene refers to a sequence on the genome that is about 100bp to about 5kb, e.g., about 500bp to about 3kb, e.g., about 2kb, in length upstream of the translation start site or transcription start site of the coding sequence of the gene.
In some embodiments, the natural promoter of NPG1 comprises a nucleotide sequence having at least 70%, at least 75%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, even 100% sequence identity to SEQ ID NO. 4.
In some embodiments, the promoter that mediates specific expression during pollen formation is the promoter of the DMC1 (Disruption of Meiotic Control 1) gene. In some embodiments, the DMC1 promoter comprises a nucleotide sequence having at least 70%, at least 75%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, even 100% sequence identity to SEQ ID NO. 5. The DMC1 promoter is capable of driving expression of the nucleotide sequence to which it is operably linked in pollen mother cells.
In some preferred embodiments, the promoter that mediates specific expression during pollen formation is the promoter of the TPD1 (Tapetum Determinant 1) gene. In some embodiments, the TPD1 promoter comprises a nucleotide sequence having at least 70%, at least 75%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, even 100% sequence identity to SEQ ID NO. 6. The TPD1 promoter is capable of driving continuous expression of the nucleotide sequence to which it is operably linked while the progenitor cells of the pollen mother cells, i.e., the sporogenous cells, develop into pollen mother cells.
The gene editing system useful in the present invention may be various gene editing systems known in the art as long as they can perform targeted genome editing in plants. The gene editing system may be a CRISPR, ZFN or TALEN based gene editing system. Preferably, the gene editing system is a CRISPR-based gene editing system.
The CRISPR gene editing system may comprise a CRISPR nuclease and at least one guide RNA. The CRISPR nuclease and the guide RNA can form a complex that targets and/or cleaves a genomic target sequence based on the complementarity of the guide RNA to the genomic target sequence.
The "CRISPR nuclease" can be derived from a Cas9 nuclease, including a Cas9 nuclease or a functional variant thereof. The Cas9 nuclease may be a Cas9 nuclease from different species, for example SpCas9 from Streptococcus pyogenes (S. pyogenes) or SaCas9 from Staphylococcus aureus (S. aureus). "Cas9 nuclease" and "Cas9" are used interchangeably herein to refer to an RNA-guided nuclease comprising a Cas9 protein or a fragment thereof (e.g., a protein comprising the active DNA cleavage domain of Cas9 and/or the gRNA binding domain of Cas9). Cas9 is a component of the CRISPR/Cas (clustered regularly interspaced short palindromic repeats and associated systems) genome editing system that, under the direction of a guide RNA, can target and cleave a DNA target sequence to form a DNA double-strand break (DSB).
The "CRISPR nuclease" may also be derived from a Cpf1 nuclease, including a Cpf1 nuclease or a functional variant thereof. The Cpf1 nuclease may be a Cpf1 nuclease from different species, e.g., a Cpf1 nuclease from Francisella novicida U112, Acidaminococcus sp. BV3L6, or Lachnospiraceae bacterium ND2006.
Useful "CRISPR nucleases" can also be derived from Cas3, Cas8a, Cas5, Cas8b, Cas8c, Cas10d, Cse1, Cse2, Csy1, Csy2, Csy3, GSU0054, Cas10, Csm2, Cmr5, Cas10, Csx11, Csx10, Csf1, Csn2, Cas4, C2c1, C2c3, or C2c2 nucleases, including for example these nucleases or functional variants thereof.
In some embodiments, the coding sequence of the CRISPR nuclease is operably linked to the promoter that mediates specific expression during pollen formation, preferably to the TPD1 promoter.
As used herein, "guide RNA" and "gRNA" are used interchangeably and refer to an RNA molecule that is capable of forming a complex with a CRISPR effector protein and of targeting the complex to a target sequence because it has a certain identity to the target sequence. The guide RNA targets the target sequence by base pairing with the complementary strand of the target sequence. For example, the gRNA employed by a Cas9 nuclease or a functional variant thereof is typically composed of crRNA and tracrRNA molecules that are partially complementary and form a complex, wherein the crRNA comprises a guide sequence (also known as a seed sequence) that has sufficient identity to the target sequence to hybridize to the complementary strand of the target sequence and direct the CRISPR complex (Cas9 + crRNA + tracrRNA) to bind specifically to the target sequence. However, it is known in the art that a single guide RNA (sgRNA) can be designed that contains the features of both crRNA and tracrRNA. In contrast, the gRNA employed by a Cpf1 nuclease or a functional variant thereof typically consists only of a mature crRNA molecule, which may also be referred to as an sgRNA. It is within the ability of those skilled in the art to design a suitable gRNA based on the CRISPR nuclease used and the target sequence to be edited. In some embodiments, expression (e.g., transcription) of the guide RNA may be driven by a constitutive promoter. In some embodiments, expression (e.g., transcription) of the guide RNA may be driven by a U6 or U3 promoter.
The gene editing system may target any region of the endogenous gene encoding a pollen tube development essential protein, as long as it is capable of causing loss of function of the endogenous pollen tube development essential protein. For example, the gene editing system may target the endogenous coding sequence of the pollen tube development essential protein, resulting in mutation or incomplete translation of the protein. Alternatively, the gene editing system may target an endogenous regulatory sequence of the pollen tube development essential protein, resulting in the protein not being expressed.
Methods of recoding the coding sequence of the pollen tube development essential protein such that it expresses a functional protein but is no longer targeted by the gene editing system are well known in the art. For example, the nucleotide sequence may be altered by codon degeneracy to remove the target sequence of the gene editing system without altering the encoded protein sequence. However, if the gene editing system targets an endogenous regulatory sequence of the pollen tube development essential protein, the coding sequence of the pollen tube development essential protein contained in the second nucleic acid may also be identical to the wild type coding sequence, which may also be referred to herein as recoded, because it is also not targeted by the gene editing system.
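As a non-limiting illustration of recoding by codon degeneracy, the following Python sketch swaps every codon that overlaps a chosen target window for a synonymous codon and confirms that the encoded protein is unchanged. The function names `translate` and `recode_window`, the toy sequence, and the window coordinates are assumptions made for illustration only; they are not taken from SEQ ID NO. 2 or SEQ ID NO. 3, and a practical design would additionally consider codon usage and confirm that the gRNA no longer matches the recoded site.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codon -> one-letter amino acid ("*" = stop).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate an in-frame, upper-case coding sequence into a protein string."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode_window(cds, start, end):
    """Replace codons overlapping [start, end) with synonymous alternatives."""
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        if i < end and i + 3 > start:  # codon overlaps the target window
            synonyms = sorted(c for c, aa in CODON_TABLE.items()
                              if aa == CODON_TABLE[codon] and c != codon)
            if synonyms:  # Met and Trp have no synonymous codon
                codon = synonyms[0]
        out.append(codon)
    return "".join(out)


toy_cds = "ATGGCTGAAAAGTGGCTTTCA"  # hypothetical sequence, not from NPG1
recoded = recode_window(toy_cds, 3, 15)
print(toy_cds, translate(toy_cds))
print(recoded, translate(recoded))
```

In this toy case the nucleotide sequence inside the window changes at three codons while both sequences translate to the same protein.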
In some embodiments, the gene editing system comprises a Cas9 nuclease and at least one gRNA targeting endogenous NPG1. In some embodiments, the at least one gRNA targeting endogenous NPG1 targets a nucleotide sequence selected from any of SEQ ID NOs 7-10.
The "cargo to be spread in the population of plants" as described herein may be any sequence that is desired to be spread in a population of the plant, such as a wild population. For example, expression of the cargo may be detrimental to the plant when the plant is exposed to a particular compound or condition. For example, the cargo may be a herbicide sensitivity gene, or a gene capable of disrupting existing herbicide resistance. Combined with subsequent application of a given herbicide or specific compound, effective weed management can be achieved locally and within a controllable range. The cargo may also be a gene that affects megaspore cell or embryo development, whereby the population size can be controlled. The cargo may also be a gene that improves environmental adaptability, disease resistance, and the like, thereby improving the adaptability of endangered plants to the natural environment.
In another aspect, the invention provides a method of producing a modified plant for engineering a plant population by gene drive, the method comprising introducing the artificial gene drive system for plants of the invention into at least one plant, thereby obtaining at least one modified plant having a genome into which the first, second, and third nucleic acids are integrated.
In some embodiments, the first nucleic acid, second nucleic acid, and third nucleic acid integrated into the genome of the modified plant are closely linked, e.g., located at the same locus. In some embodiments, the first nucleic acid, second nucleic acid, and/or third nucleic acid are present in a single copy in the genome of the plant.
In the methods of the invention, the artificial gene drive system may be introduced into plants by various methods well known to those skilled in the art. Methods useful for introducing the artificial gene drive system of the invention into plants include, but are not limited to: gene gun method, PEG-mediated protoplast transformation, agrobacterium-mediated transformation, plant virus-mediated transformation, pollen tube channel method, and ovary injection method.
In another aspect, the invention provides a modified plant for engineering a plant population by gene drive, prepared by the method of the invention.
In another aspect, the invention provides a modified plant for engineering a plant population by gene drive, into which the artificial gene drive system for plants of the invention has been introduced, whereby the first, second, and third nucleic acids have been integrated into its genome.
In some embodiments, the first nucleic acid, second nucleic acid, and third nucleic acid integrated into the genome of the modified plant are closely linked, e.g., located at the same locus. In some embodiments, the first nucleic acid, second nucleic acid, and/or third nucleic acid are present in a single copy in the genome of the plant.
In another aspect, the invention provides a method of engineering a population of plants by gene driving, the method comprising placing at least one modified plant of the invention into the population of plants and allowing the at least one modified plant to cross with other plants in the population of plants. In some embodiments, the methods allow the progeny of the at least one modified plant that crosses with other plants in the plant population to cross with other plants and/or progeny in the population. In some embodiments, the method results in an increased proportion of plants carrying the cargo in the population of modified plants as compared to the population of non-modified plants. For example, the population of modified plants comprises at least 1% to 100%, such as at least 1%, at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or even 100% of the plants carrying the cargo.
In another aspect, the invention provides a method of engineering a plant population by gene driving, the method comprising:
i) Introducing the artificial gene driving system for plants of the present invention into at least one plant, thereby obtaining at least one modified plant having the genome integrated with the first, second and third nucleic acids;
ii) placing at least one modified plant obtained in step i) in a population of said plants and allowing said at least one modified plant to cross with other plants in said population of plants.
In some embodiments, the first nucleic acid, second nucleic acid, and third nucleic acid integrated into the genome of the modified plant are closely linked, e.g., located at the same locus. In some embodiments, the first nucleic acid, second nucleic acid, and/or third nucleic acid are present in a single copy in the genome of the plant. In some embodiments, the method allows the progeny of the at least one modified plant that crosses with other plants in the plant population to cross with other plants in the population. In some embodiments, the methods allow the progeny of the at least one modified plant that crosses with other plants in the plant population to cross with other plants and/or progeny in the population. In some embodiments, the method results in an increased proportion of plants carrying the cargo in the population of modified plants as compared to the population of non-modified plants. For example, the population of modified plants comprises at least 1% to 100%, such as at least 1%, at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or even 100% of the plants carrying the cargo.
The plants of the various aspects of the invention may be monocotyledonous or dicotyledonous plants, preferably plants that are predominantly inbred. For example, the plant may be Arabidopsis thaliana, maize, canola, tobacco, grassy weeds, and the like.
Examples
Experimental materials and methods
Plant material and growth conditions
All Arabidopsis lines used in this study were of the Columbia-0 (Col-0) ecotype. Seeds were surface sterilized in 10% sodium hypochlorite for 10 min, washed 3 times with sterile water, and then sown on germination medium (2.2 g/L Murashige and Skoog medium, 10 g/L sucrose, and 7.6 g/L plant agar, pH 5.7). After 2 days of treatment at 4 ℃, the agar plates were transferred to a growth chamber at 22 ℃ with a 16-hour light/8-hour dark photoperiod. Seven-day-old seedlings were transferred to soil and grown in a greenhouse under the same temperature and light conditions for subsequent experiments.
Plasmid construction
All restriction enzymes and enzymes for Gibson ligation were from NEB. High fidelity polymerase (Phanta Max Super-Fidelity DNA Polymerase, P505) and gel extraction kit (FastPure Gel DNA Extraction Mini Kit, DC 301) for Polymerase Chain Reaction (PCR) are both from Vazyme. All plasmids used in this study were cloned using standard molecular biology techniques and purified using StarPrep Fast Plasmid Mini Kit (Genstar, D201).
To construct the FAST-only vector, the FAST marker sequence (ref. 17) (pOLE1:OLE1-TagRFP-Nos terminator) was inserted into the XF675 binary vector by Gibson ligation (ref. 26). Specifically, the promoter and part of the coding sequence of the OLE1 (AT4G25140) gene were first amplified from genomic DNA (gDNA), and the TagRFP and Nos terminator sequences were synthesized according to ref. 17. The HindIII-digested XF675 plasmid and the two fragments were then assembled by Gibson ligation.
To construct the DMC-CAIN and TPD-CAIN vectors, all components were cloned into the XF675 binary vector in five consecutive steps. First, the sgRNA cassette (pU6-SmR-gRNA scaffold-U6 terminator) was amplified from template pHEE401E (Addgene #71287) and inserted by Gibson ligation into XF675 that had been double digested with EcoRI and HindIII. Second, four gRNAs were introduced by simultaneous BsaI digestion and ligation (ref. 27). Third, the plasmid was digested again with HindIII and the FAST marker sequence was cloned in by Gibson ligation, with the HindIII recognition site being filled in at the same time. Fourth, the NPG1 promoter (about 2 kb) was amplified from gDNA, the recoded NPG1 sequence was obtained by mutagenic PCR, and both were inserted by Gibson ligation into the HindIII-digested plasmid. Finally, the plasmid from the previous step was digested with KpnI and the following fragments were assembled by Gibson ligation: the DMC1 (AT3G22880) or TPD1 (AT4G24972) promoter amplified from gDNA, a plant codon-optimized SpCas9 sequence with two nuclear localization signals (NLS) amplified from template pHEE401E, and a Nos terminator amplified from template XF675.
Evaluation of editing efficiency of potential target sites in protoplasts
Candidate gRNAs for the NPG1 (AT2G43040) coding sequence were predicted with CRISPR-P 2.0 (ref. 24) (http://crispr.hzau.edu.cn/CRISPR2/) and CRISPR-GE (ref. 28) (http://skl.scau.edu.cn/home/), and 12 of them were screened for cleavage efficiency in Arabidopsis protoplasts.
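For orientation only, the first step of candidate gRNA prediction for SpCas9 can be sketched as a scan for 20-nt protospacers immediately followed by an NGG PAM on either strand. This naive scan is an assumption-laden simplification: CRISPR-P 2.0 and CRISPR-GE additionally score on-target activity and off-target risk, which is not reproduced here. The function names `revcomp` and `find_spcas9_sites` and the toy input are hypothetical.

```python
import re


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_spcas9_sites(seq):
    """List (strand, start, protospacer, PAM) for every 20-nt + NGG site.

    For the "-" strand, `start` is given in reverse-complement coordinates.
    """
    seq = seq.upper()
    sites = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        # A zero-width lookahead lets overlapping candidate sites be reported.
        for m in re.finditer(r"(?=([ACGT]{20})([ACGT]GG))", s):
            sites.append((strand, m.start(), m.group(1), m.group(2)))
    return sites


toy = "A" * 20 + "TGG"  # hypothetical input, not an NPG1 sequence
for site in find_spcas9_sites(toy):
    print(site)
```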
Each 20-nt gRNA sequence was inserted into the pAtU6-sgRNA vector (Addgene plasmid #119775) by BsaI cleavage and Gibson ligation. Arabidopsis protoplasts were prepared according to refs. 29 and 30. Protoplasts were co-transformed with pAtU6-sgRNA and p2X35S-Cas9 and harvested after 48 hours of incubation at room temperature. In parallel, one tube of protoplasts was transformed with p2X35S-GFP alone and used about 16 hours after transformation to estimate transformation efficiency. The transformation efficiencies of the two biological replicates were 41% and 45%, respectively.
Genomic DNA was extracted from each protoplast sample using a DNA extraction kit (Plant Genomic DNA Kit, DP305, TIANGEN). A 180-220 bp genomic region surrounding each target site was PCR amplified and purified. PCR products from the two biological replicates at the same target site were distinguished by a 6-nt barcode ('ATGCAG') introduced through the primers. All 24 purified PCR product samples were quantified with a NanoDrop 2000, mixed in equal amounts, and sent to Novogene for library construction and Illumina PE150 sequencing.
The gRNA editing efficiency was calculated as the number of edited reads divided by the total number of reads mapped to the target site. Reads with mismatches located within 10 bp upstream of the PAM (NGG) were counted as edited reads. Because the main editing types produced by Cas9 are insertions and deletions, single-base substitutions (SNPs) were excluded when calling editing types.
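The calculation described above can be sketched as follows, assuming that each mapped read has already been reduced to a list of (position, kind) variants relative to the reference, where kind is "ins", "del", or "snp". This read representation, the function names `is_edited` and `editing_efficiency`, and the example values are hypothetical and serve only to make the counting rule concrete.

```python
def is_edited(variants, pam_start, window=10):
    """True if a read carries an indel within `window` bp upstream of the PAM.

    Single-base substitutions ("snp") are ignored, as described in the text.
    """
    lo = pam_start - window
    return any(kind in ("ins", "del") and lo <= pos < pam_start
               for pos, kind in variants)


def editing_efficiency(reads, pam_start):
    """Edited reads divided by all reads mapped to the target site."""
    if not reads:
        return 0.0
    edited = sum(is_edited(variants, pam_start) for variants in reads)
    return edited / len(reads)


# Hypothetical reads for a site whose PAM starts at reference position 20.
reads = [
    [],              # unedited read
    [(17, "del")],   # deletion inside the 10-bp window: edited
    [(15, "snp")],   # substitution only: not counted
    [(3, "ins")],    # insertion outside the window: not counted
]
print(editing_efficiency(reads, pam_start=20))
```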
Single locus insertion driven heterozygote generation and validation
Wild-type Col-0 Arabidopsis thaliana was transformed by the floral dip method (ref. 31) using Agrobacterium strain GV3101 carrying the DMC-CAIN, TPD-CAIN, or FAST-only vector. Successful primary transformants (T1) were selected directly from the harvested dry seeds based on the presence of red fluorescence under a hand-held fluorescence detector (LUYOR 3415 RG).
Agrobacterium-mediated transformation sometimes introduces exogenous DNA sequences at multiple sites in the plant genome. Therefore, 48-50 plants were randomly selected from the T1 plants obtained with DMC-CAIN and TPD-CAIN, and the number of T-DNA insertion sites in each plant was examined by thermal asymmetric interlaced polymerase chain reaction (TAIL-PCR) (ref. 32). Single-site insertion T1 plants were further confirmed by whole-genome sequencing (Novogene, PE150), with the data analyzed using the software TDNAscan (ref. 33) and a semi-automatic pipeline developed by the inventors. This yielded 3 DMC-CAIN T1 plants, 5 TPD-CAIN T1 plants, and 2 FAST-only T1 plants.
Cross pollination and percent drive evaluation
To examine the spread of the gene drive, unopened flowers of wild-type female parents (several Col-0 plants) were emasculated and pollinated with pollen from DMC-CAIN or TPD-CAIN T1 plants. To simulate natural hybridization, pollen from multiple flowers of the male parent was applied to each stigma. Red fluorescence in dry F1 seeds was identified with a hand-held fluorescence detector, and the percentage of red fluorescent seeds among the F1 seeds was taken as the transmission rate of the gene drive in the F1 generation. To test whether the TPD-CAIN gene drive is transmitted through the male parent, FAST+ F1 plants were crossed with wild-type Col-0 plants as either the female or the male parent to obtain the F2 generation. F2 phenotypes were identified in the same way as for F1.
Statistical analysis
To test the significance of the CAIN transmission rate, the exact binomial test (the binom.test() function in R) was used to test the ratio of FAST+ to FAST- seeds against an expected 1:1 ratio. Because the proportion of FAST+ seeds was heterogeneous among fruits (siliques), the heterogeneity was quantified with a replicated G-test using DescTools (https://cran.r-project.org/package=DescTools) in R. Since multiple comparisons can inflate type I errors (false positives), the p.adjust() function in R was used to calculate the false discovery rate (FDR).
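For readers without R at hand, the two-sided exact binomial test and an FDR adjustment can be sketched in plain Python as below. This is not the inventors' analysis code. The Benjamini-Hochberg procedure is assumed here as the FDR method, since the text does not state which method was passed to p.adjust(); the seed counts are hypothetical.

```python
from math import exp, lgamma, log


def binom_test_two_sided(k, n, p=0.5):
    """Two-sided exact binomial p-value: the summed probability of all
    outcomes no more likely than the observed one (requires 0 < p < 1)."""
    def pmf(i):
        log_comb = lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
        return exp(log_comb + i * log(p) + (n - i) * log(1 - p))

    probs = [pmf(i) for i in range(n + 1)]
    cutoff = probs[k] * (1 + 1e-7)  # small tolerance for floating-point ties
    return min(1.0, sum(x for x in probs if x <= cutoff))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for offset, idx in enumerate(order):
        rank = m - offset  # rank of this p-value in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical (FAST+ seeds, total seeds) for three crosses.
crosses = [(93, 100), (52, 100), (61, 100)]
raw = [binom_test_two_sided(k, n) for k, n in crosses]
print(benjamini_hochberg(raw))
```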
Genotyping
The genotype of the target gene NPG1 was identified by amplifying a genomic fragment of about 2 kb that includes the four target sites, followed by Sanger sequencing (FIG. 8b). The reverse amplification primer is located within an intron region to avoid interference from the recoded NPG1 sequence in the drive element. Genomic DNA (gDNA) extracted separately from rosette leaves and inflorescences was used as the template for polymerase chain reaction (PCR). The PCR products were directly subjected to Sanger sequencing. Consistent Sanger sequencing results were generally obtained for leaf and inflorescence samples from the same plant. If multiple peaks in the sequencing result of a leaf sample (indicating possible chimerism or heterogeneity) made the genotype impossible to determine, an inflorescence sample from the same plant was used to determine the genotype. FAST- F1 and F2 plants were genotyped using leaf samples only.
For three F1 plants, the genotype at the gRNA11 target site was determined in leaf and inflorescence samples by Illumina sequencing. PCR products from different tissues each carried a unique barcode sequence introduced during PCR, were mixed in equal amounts, and were sequenced using Illumina PE150. Clean reads, separated by barcode sequence, were aligned with BWA to the genomic sequence surrounding the gRNA11 target site. The read depth covering the gRNA11 region ranged from 107,207 to 150,422. Reads showing a mismatch within the 23-nt target site region were regarded as edited, and the editing efficiency of each sample was determined as the ratio of mismatched reads to the total reads aligned to the target site. Single-base substitutions with a frequency higher than 0.5% and all indels were considered edited types.
Population dynamic simulation
CAIN was computationally modeled using an individual-based stochastic model based on the Wright-Fisher model, which assumes a constant population size and non-overlapping generations. For the modification-type CAIN drive, two unlinked loci, CAIN and NPG1, were considered, and the initial population contained 9,900 wild-type individuals and 100 heterozygotes carrying CAIN (TPD-CAIN/+; NPG1 +/+). A density adjustment strategy was adopted following a previous study (ref. 25). In brief, the scaling factor (S) was calculated with the formula S = 10 / (9 × N / K + 1), where N represents the population size of the current generation and K represents the environmental carrying capacity (i.e., 10,000). The population size of each generation (X) was calculated according to the binomial distribution X ~ B(50, 0.02 × S). For each generation, pairs of individuals were randomly selected to produce offspring, and sexes were randomly assigned. This process was repeated X times, and the offspring generated replaced the parental data.
In each generation, CAIN-carrying male parents could undergo CRISPR-mediated cleavage at the NPG1 locus, with the cleavage efficiency set either to the empirical value (98.4%, FIG. 13a) or to a fixed value (50% or 100%). Cleavage results in loss of NPG1 gene function. The phenotypic penetrance of pollen failure caused by loss of function in male gametes was set either to the empirical value (96.0%, FIG. 13a) or manually (100%). Cleavage of NPG1 in female germ cells of TPD-CAIN/+ plants was also considered, with its efficiency set either to the empirical value (94.1%, FIG. 13b) or manually (0%, 50%, or 100%).
For the homing-type drive, the drive allele in the heterozygous state converts the wild-type allele into the drive allele with 100% efficiency. For the TARE drive, the target gene is a haplosufficient gene essential for embryo development, and Cas9 cleaves the wild-type target gene in germ cells with 100% probability. After fertilization, Cas9 carried by the egg cell further cleaves the paternally derived wild-type target gene, also with 100% efficiency. Embryos with two disrupted alleles of the target gene and no inherited TARE drive element do not continue to develop.
For the suppression-type CAIN drive, CAIN is located inside a haplosufficient male fertility gene, which is thereby disrupted. Except that the initial population size and K were both set to 100,000, all other settings were the same as for the modification-type CAIN drive. Male CAIN homozygotes fail to produce viable pollen.
The simulation was implemented in a custom Python script, available at https://github.com/QianLabWebsite/GeneDrive.
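The density adjustment above can be made concrete with a short sketch that evaluates the scaling factor S = 10 / (9 × N / K + 1) and draws X from B(50, 0.02 × S). This is not the inventors' simulation script; the function names `scaling_factor` and `draw_binomial`, the random seed, and the example population sizes are assumptions, and the sketch only reproduces the two formulas, not how X is used within the full model.

```python
import random


def scaling_factor(n, k):
    """Density-dependent scaling factor S = 10 / (9 * N / K + 1)."""
    return 10.0 / (9.0 * n / k + 1.0)


def draw_binomial(trials, p, rng):
    """One draw from B(trials, p) as a sum of Bernoulli trials."""
    return sum(rng.random() < p for _ in range(trials))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
K = 10_000              # carrying capacity given in the text
for N in (100, 5_000, 10_000, 20_000):
    S = scaling_factor(N, K)
    X = draw_binomial(50, 0.02 * S, rng)
    print(f"N={N:>6}  S={S:.3f}  p={0.02 * S:.4f}  X={X}")
```

At N = K the factor equals 1, so the expected value of X is 50 × 0.02 = 1; in sparse populations S approaches 10, and above carrying capacity it falls below 1.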
Example 1. Design of CAIN, a CRISPR-based poison-antidote gene drive system targeting pollen germination
CAIN consists of three closely linked parts: a poison, an antidote, and a cargo. The poison is a gRNA-Cas9 complex that introduces inactivating mutations in an essential gene associated with pollen germination by triggering DSBs that are subsequently repaired by NHEJ. The antidote is a recoded version of this essential gene, expressed from its native promoter (FIG. 1a). In theory, the poison can disrupt both alleles of the essential gene before meiosis, which would affect germination of all four pollen grains; this effect is neutralized only in pollen grains that carry the antidote. Thus, when a CAIN-carrying plant is crossed with a wild-type plant, only the two pollen grains carrying CAIN can germinate successfully, allowing CAIN to reach 100% transmission (FIG. 1b). Even if only one of the two alleles of the essential gene is disrupted by the poison, CAIN transmission will be two-thirds (FIG. 1b). In either case, the CAIN transmission rate exceeds the 50% expected under Mendelian inheritance.
This process can repeat over successive generations, so that CAIN spreads throughout the population through continued crossing. Although the gene drive system is aimed at outcrossing populations, Arabidopsis thaliana was chosen as the experimental subject because it is a predominantly self-pollinating model plant; all crosses were performed manually throughout the experiments, and strict experimental procedures and management further ensured the ecological safety of the gene drive system before any formal release.
Implementing CAIN requires selecting a gene that is critical for pollen germination. To identify a suitable target gene, a list of genes related to male gametophyte development but not affecting female gametophyte development was retrieved from a previous compilation (ref. 13), and No Pollen Germination 1 (NPG1), a gene required for the late stage of pollen germination (ref. 14), was selected as the target gene. Choosing it as the target gives Cas9 a relatively long time window for cleavage, because Cas9 cleavage needs to be completed before NPG1 acts (FIG. 1a).
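The 100% and two-thirds figures above follow from a simple expectation, sketched below under explicit assumptions: the male parent is heterozygous for CAIN, CAIN is unlinked to the target gene, every CAIN-carrying pollen grain is fully rescued by the antidote, and a pollen grain lacking CAIN germinates only if it inherits an intact allele of the target gene. The function name `expected_transmission` and its parameters are hypothetical and are not part of the described method.

```python
from fractions import Fraction


def expected_transmission(p_both, p_one):
    """Expected fraction of functional pollen that carries CAIN.

    p_both: probability that both target alleles are disrupted before meiosis
    p_one:  probability that exactly one allele is disrupted
    """
    p_none = 1 - p_both - p_one
    # Share of CAIN-free pollen that still holds an intact target allele:
    # all of it if nothing was cut, half if one allele was cut, none otherwise.
    viable_without_cain = p_none + p_one * Fraction(1, 2)
    # CAIN and CAIN-free pollen are produced in equal numbers (1 : 1).
    return 1 / (1 + viable_without_cain)


print(expected_transmission(1, 0))  # both alleles disrupted
print(expected_transmission(0, 1))  # exactly one allele disrupted
print(expected_transmission(0, 0))  # no cleavage (Mendelian baseline)
```

Intermediate cases, for example a mixture of plants or cells with one or two disrupted alleles, give values between one-half and one under the same assumptions.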
Twelve gRNAs within the NPG1 CDS were screened, and their cleavage efficiencies were first tested in Arabidopsis protoplasts. Finally, four gRNAs (gRNA2, gRNA6, gRNA11, and gRNA23) with different DNA cleavage efficiencies were selected and built into the drive elements (FIG. 1a).
To mitigate the negative effects that Cas9 expression in somatic cells may have on plant development and fitness, we selected promoters that are active mainly in germ cells to drive Cas9 expression (FIG. 1a). We constructed two gene drive systems using promoters with different spatiotemporal expression patterns (FIG. 1a). One is the promoter of DMC1 (Disruption of Meiotic Control 1), which can express Cas9 in pollen mother cells within anthers; the other is the promoter of TPD1 (Tapetum Determinant 1), which is continuously expressed while the progenitor cells of the pollen mother cells, i.e., the sporogenous cells, gradually develop into pollen mother cells (ref. 16). The latter provides a longer time window for cleavage (FIG. 1a) and may therefore yield higher overall Cas9 cleavage activity before NPG1 begins to act.
The antidote is a recoded version of the NPG1 gene sequence, driven by its native promoter to complement the gene function. To ensure that the antidote is not cleaved by Cas9, we mutated the gRNA target sites (using synonymous codons, so the amino acid sequence is unaffected; FIG. 8a). In addition, the intron regions of the gene were deleted in this version, which simplifies the construct and avoids interference with subsequent genotyping of the genomic NPG1 (FIG. 8b). For the cargo, a red fluorescent protein marker expressed in dry seeds, named FAST (ref. 17), was chosen so that the spread of the drive could be observed.
In summary, two CAIN vectors, TPD-CAIN and DMC-CAIN, were constructed (FIGS. 1a and 9), each named for the promoter driving Cas9 expression. In addition, a vector containing only the FAST marker was constructed as a negative control (FIG. 9). Each vector was introduced into wild-type Arabidopsis Col-0 by Agrobacterium-mediated floral dip, upon which it inserted at a random genomic position (FIG. 10). First-generation transgenic plants (T1) with a single-site insertion were selected for subsequent analysis; in the absence of gene drive activity, the drive element would be transmitted to offspring at a rate of 50% according to Mendelian inheritance.
Example 2. Significantly increased transmission rate of TPD-CAIN to F1 offspring
To assess whether the constructed CAIN gene drive is successfully transmitted through the male parent, Arabidopsis T1 plants carrying CAIN were used as male parents and crossed with wild-type Col-0 as the female parent (FIG. 2a). Since the maternal Col-0 always transmits a wild-type (WT) allele to the F1 generation, the proportion of CAIN transmitted from the male parent can be determined from the dominant phenotype conferred by FAST (i.e., red fluorescence in F1 seeds).
For DMC-CAIN, only one of the three crosses (i.e., the one with plant D31 as the male parent) showed significantly increased CAIN transmission (exact binomial test, FIG. 2b). In contrast, for TPD-CAIN, all four crosses showed transmission rates of 89.6% to 96.9%, deviating strongly from Mendelian inheritance (FIG. 2b). The negative control (i.e., FAST only) was transmitted at a rate of about 50% (FIG. 2b), consistent with Mendelian inheritance.
Since the observations from each silique can be regarded as an independent test, a replicated G-test of goodness of fit (FIG. 2b) was further used to assess the deviation of the results from 50% while accounting for differences among fruits. In the cross with plant D31 as the male parent, which is the only DMC-CAIN cross showing a transmission rate above 50%, significant heterogeneity was observed among siliques; this is probably due to the lower efficiency of Cas9 cleavage, with stochasticity causing inconsistency among siliques. For TPD-CAIN, by contrast, no significant heterogeneity in transmission rate was observed among siliques, and all four crosses showed transmission efficiencies approaching 100% (FIG. 2b), indicating strong and consistent performance of TPD-CAIN.
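The structure of a replicated G-test against a 1:1 expectation can be sketched as below: a G statistic is computed per silique, the per-silique values are summed (total G), a pooled G is computed from the summed counts, and their difference measures heterogeneity among siliques. Only the statistics and degrees of freedom are computed here; p-values would come from the chi-square distribution, as in the R analysis. The function names and the seed counts are hypothetical.

```python
from math import log


def g_statistic(a, b):
    """G statistic for observed counts (a, b) against an expected 1:1 ratio."""
    expected = (a + b) / 2
    # Terms with a zero count contribute nothing (0 * ln 0 is taken as 0).
    return 2 * sum(obs * log(obs / expected) for obs in (a, b) if obs > 0)


def replicated_g(counts):
    """Return {name: (G, degrees of freedom)} for a replicated G-test."""
    total_g = sum(g_statistic(a, b) for a, b in counts)
    pooled_g = g_statistic(sum(a for a, _ in counts),
                           sum(b for _, b in counts))
    return {
        "total": (total_g, len(counts)),
        "pooled": (pooled_g, 1),
        "heterogeneity": (total_g - pooled_g, len(counts) - 1),
    }


# Hypothetical (FAST+, FAST-) seed counts for three siliques of one cross.
siliques = [(28, 2), (30, 1), (25, 3)]
for name, (g, df) in replicated_g(siliques).items():
    print(f"{name:>13}: G = {g:.3f}, df = {df}")
```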
Example 3 Parent-dependent segregation distortion in TPD-CAIN
To evaluate whether TPD-CAIN achieves segregation distortion by disrupting NPG1 as designed (FIG. 1), NPG1 was genotyped in F1 individuals. Rosette leaves and inflorescences were sampled, genomic DNA was extracted, and the target sites were amplified and Sanger-sequenced for genotyping (FIG. 3a). The results show that all FAST+ F1 plants (i.e., plants carrying the drive element, TPD-CAIN/+; n=16) carried a disrupted NPG1 allele (NPG1-) (FIG. 3b, FIG. 11). Various types of indels were observed at the gRNA2 and gRNA11 target sites in 88% and 100% of the F1 offspring, respectively. In theory, indels of length three could produce CRISPR-resistant alleles without affecting the downstream reading frame, but their rarity (i.e., 0%) at both gRNA sites suggests that the CAIN design has a low rate of forming resistance alleles that retain normal gene function, especially when multiple gRNAs are used in tandem. NPG1- was present in almost all F1 plants, indicating that CRISPR-based NPG1 knockout is highly efficient, which in theory impairs pollen germination and thereby causes segregation distortion.
To further investigate the mechanism of segregation distortion, reciprocal crosses were performed between FAST+ F1 plants and Col-0 to determine whether the segregation distortion of TPD-CAIN depends on the direction of the cross. When FAST+ F1 plants were used as male parents (n=13), the transmission rate of TPD-CAIN was significantly higher than the Mendelian expectation (i.e., 50%; FIG. 4a). In contrast, when FAST+ F1 plants were used as female parents (n=8), the transmission rate of TPD-CAIN did not deviate significantly from 50% (FIG. 4b). These results indicate that the segregation distortion of TPD-CAIN is caused by the functional defect of the male gametophyte that results from NPG1 knockout.
The transmission rate of TPD-CAIN is higher than that of DMC-CAIN (FIG. 2), presumably because the TPD1 promoter drives higher Cas9 activity, so that NPG1 is cleaved more efficiently. To test this hypothesis, NPG1 was also genotyped in FAST+ F1 plants generated by DMC-CAIN (n=12). The results show that only two F1 plants carried two KO alleles (NPG1-/-), one carried one KO allele (NPG1+/-), and the remaining nine (NPG1+/+) carried no KO allele (FIG. 5a-b). This indicates that DMC-CAIN cleaves the NPG1 locus less efficiently than TPD-CAIN (FIG. 3b).
Although DMC-CAIN cleaved less efficiently, it was further evaluated whether DMC-CAIN could produce segregation distortion in the next generation. For this purpose, the two DMC-CAIN/+; NPG1-/- F1 plants were used to pollinate wild-type female parents, and the proportion of red-fluorescent seeds among their F2 seeds was counted (FIG. 5b and 5c). The transmission rate of DMC-CAIN reached 95.9% and 99.5%, respectively. In contrast, in the F2 offspring produced with the DMC-CAIN/+; NPG1+/- F1 plant as male parent, the transmission rate was 63.7% (FIG. 5b and 5c). In the F2 offspring produced by the other nine DMC-CAIN/+; NPG1+/+ F1 plants, the transmission rate was close to 50% (FIG. 5b and 5c). On average, DMC-CAIN was transmitted from F1 to F2 at a rate of 57.5% (3367/5857). This limited transmission highlights the critical role of cleavage efficiency in the efficacy of the CAIN system.
Example 4 Insufficient DNA cleavage and incomplete penetrance can prevent TPD-CAIN transmission from reaching 100%
When T1 plants carrying the gene drive element were used as male parents, the transmission rate of TPD-CAIN was between 89.6% and 96.9% (FIG. 2b); that is, a fraction of the offspring (3.1% to 10.4%) still did not inherit TPD-CAIN (i.e., they were FAST-). Similarly, in the F2 generation produced by F1 plants carrying TPD-CAIN, some offspring (1.0% to 12.2%) did not inherit TPD-CAIN (FIG. 4a). The mechanism that produces these FAST- offspring was therefore explored further.
To determine whether these FAST- F1 plants arose because NPG1 was not successfully cleaved, NPG1 was genotyped in 11 FAST- F1 plants (FIG. 6a). Notably, 6 of them were homozygous for the WT allele at all four target sites (FIG. 6b). This shows that the male gametes provided by the male parent contributed WT alleles, so these FAST- F1 plants were indeed generated by escaping Cas9 cleavage (i.e., insufficient DNA cleavage).
On the other hand, 5 FAST- F1 plants had a WT/KO genotype at the gRNA2 and gRNA11 target sites (FIG. 6b). Since the Col-0 female parent can only provide a WT allele, the KO allele must have been transmitted from the male parent through pollen. This means that a small number of pollen grains carrying an NPG1 KO allele but lacking the drive element were still able to germinate, indicating that the non-germinating phenotype of the NPG1 KO allele is not 100% penetrant, i.e., penetrance is incomplete.
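The two escape routes described above (uncleaved NPG1 and incomplete penetrance) can be combined in a back-of-the-envelope estimate of paternal transmission. The sketch below assumes, as a simplification that is not stated in the source, a heterozygous drive father with an NPG1 +/+ background, independent cleavage of the allele carried by each pollen grain, full rescue of drive-carrying pollen, and no pollen limitation. The function name is hypothetical.

```python
def expected_transmission(cleavage, penetrance):
    """Expected fraction of functional pollen that carries the drive.

    Half of the pollen carries the drive and is always rescued.
    The other half fails only if its NPG1 allele was cleaved
    (probability `cleavage`) AND the knockout phenotype manifests
    (probability `penetrance`).
    """
    viable_without_drive = 1.0 - cleavage * penetrance
    return 0.5 / (0.5 + 0.5 * viable_without_drive)
```

Under these assumptions the expression reduces to 1 / (2 - cleavage × penetrance), so transmission is exactly 50% with no cleavage and 100% only when both cleavage and penetrance are complete. With the estimates reported for male germ cells (98.4% cleavage and 96.0% penetrance, FIG. 13a), the sketch gives a value of roughly 95%, which is of the same order as the transmission rates observed from T1 male parents.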
To understand how DNA cleavage failure and incomplete penetrance may affect CAIN propagation at the population level, computational simulations based on the Wright-Fisher model were performed. This individual-based stochastic model assumes a finite, randomly mating population that reproduces in discrete, non-overlapping generations. The simulation started from 9900 wild-type individuals and 100 TPD-CAIN/+ individuals; the population size was held constant, mating pairs were chosen at random, and DNA cleavage and segregation distortion were restricted to the male parent. Based on the estimated cleavage efficiency in male germ cells and the estimated penetrance (98.4% and 96.0%, respectively; FIG. 13a), the simulations showed that TPD-CAIN needs approximately 17 generations to spread from 1% to 99% of the population, only one generation more than under optimal conditions (i.e., 100% DNA cleavage efficiency and complete penetrance; FIG. 7a). The results show that TPD-CAIN propagation is relatively robust to these effects.
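A Wright-Fisher style simulation of this kind can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the inventors' simulation code: the drive locus and NPG1 are treated as unlinked, each intact NPG1 allele entering a gamete of a drive carrier is cut with a fixed probability, drive-carrying pollen is always rescued, and a father simply supplies another pollen grain when one fails to germinate. All function and parameter names are hypothetical, and the default population size is smaller than the 10,000 individuals described above so that the example runs quickly.

```python
import random

def simulate_cain(n_pop=1000, n_carriers=10, generations=25,
                  male_cleavage=0.984, penetrance=0.96,
                  female_cleavage=0.0, seed=1):
    """Return the frequency of drive carriers in each generation."""
    rng = random.Random(seed)
    # Individual = (drive copies, disrupted NPG1 copies), each 0, 1 or 2.
    pop = [(1, 0)] * n_carriers + [(0, 0)] * (n_pop - n_carriers)
    freqs = [n_carriers / n_pop]

    def gamete(parent, cleavage):
        drive, ko = parent
        has_drive = rng.random() < drive / 2
        has_ko = rng.random() < ko / 2  # loci assumed unlinked
        # In drive carriers, an intact NPG1 allele is cut with prob. `cleavage`.
        if drive > 0 and not has_ko and rng.random() < cleavage:
            has_ko = True
        return has_drive, has_ko

    def pollen(father):
        # Pollen is abundant: redraw until a grain germinates.
        for _ in range(200):
            has_drive, has_ko = gamete(father, male_cleavage)
            fails = has_ko and not has_drive and rng.random() < penetrance
            if not fails:
                return has_drive, has_ko
        return None  # father is effectively male-sterile

    for _ in range(generations):
        offspring = []
        while len(offspring) < n_pop:  # constant population size
            egg = gamete(rng.choice(pop), female_cleavage)
            sperm = pollen(rng.choice(pop))
            if sperm is None:
                continue  # choose a new random mating pair
            offspring.append((egg[0] + sperm[0], egg[1] + sperm[1]))
        pop = offspring
        freqs.append(sum(1 for d, _ in pop if d > 0) / n_pop)
    return freqs
```

Comparing `simulate_cain()` with `simulate_cain(male_cleavage=1.0, penetrance=1.0)` gives a rough feel for how much the two imperfections delay spread, and setting `male_cleavage=0.0` yields a drift-only control. The `female_cleavage` parameter is included so that cutting in female germ cells can be explored in the same framework.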
Example 5 NPG1 allelic conversion is widespread in germ cells of plants carrying TPD-CAIN
It was observed that, when wild-type plants (NPG1+/+) were pollinated with TPD-CAIN, NPG1- pollen, the NPG1+ genotype was not detected by Sanger sequencing in somatic tissues of the resulting FAST+ F1 plants (FIG. 3b; n=16). This suggests that the maternal NPG1+ allele may have undergone CRISPR-mediated DSBs. To test this hypothesis further, leaf and inflorescence samples from three FAST+ F1 plants derived from the male parent T18 were genotyped in detail for NPG1 at the gRNA11 target site using Illumina sequencing. In addition to the major genotype "NPG1-8" (52.9% on average), which was probably inherited from T18 (FIG. 3b), several other genotypes were observed, but uncleaved NPG1+ alleles were relatively rare (0.8% on average; FIG. 3c). These results show that the maternal NPG1+ allele did undergo CRISPR-mediated DSBs, presumably through Cas9 activity after zygote formation, with the DSBs repaired mainly by end joining.
The Cas9 activity responsible for these DSBs may originate from one of two possible scenarios: Cas9 expressed from the embryo genome after fertilization (i.e., zygotic expression), or paternal carry-over of Cas9 protein. The latter appears unlikely because sperm cells contain only limited amounts of protein and RNA. To investigate these two hypotheses, the NPG1 genotype was examined by Sanger sequencing in the F2 plants previously generated by backcrossing F1 plants (FIG. 6c). Whether TPD-CAIN was inherited from the male parent or the female parent, the NPG1+ genotype did not appear in any FAST+ F2 offspring. In contrast, the NPG1+ genotype was detected in all FAST- F2 offspring, even when the male parent carried TPD-CAIN (FIG. 6c). These observations are fully consistent with Cas9 expression after zygote formation and contradict the paternal carry-over explanation.
Analysis of the NPG1 genotypes transmitted from FAST+ F1 plants to F2 plants (FIG. 6c) revealed that one particular NPG1 genotype tended to predominate (i.e., at a frequency well above 50%), even though the F1 parents were heterozygous NPG1+/- when the zygote first formed. For example, at the gRNA11 target site, the FAST+ F1 male parent T18-2-1 (NPG1-8) transmitted three genotypes to F2 plants, but NPG1-8 appeared in 13/16 F2 plants, whereas the remaining two genotypes, NPG1-C and NPG1-30, were each identified in only one plant. These findings suggest that, unlike in somatic tissues, DSBs in germ cells are repaired by HDR using the paternal NPG1- allele as a template, producing allelic conversion.
Example 6 NPG1 cleavage in female germ cells can promote CAIN propagation
Cleavage of NPG1 in female germ cells, followed by its repair, may enhance CAIN propagation by supplementing the NPG1 knockout that occurs in male germ cells. To estimate the NPG1 cleavage efficiency in female germ cells accurately, the genotypes of FAST- F2 plants were analyzed: since no further NPG1 cleavage occurs after F2 zygote formation, these plants retain the original genotype information inherited from their parents. Of the 34 FAST- F2 offspring produced from FAST+ F1 plants (i.e., TPD-CAIN/+) and wild-type pollen, 33 showed a heterozygous NPG1+/- genotype at the gRNA11 target site, from which the NPG1 cleavage efficiency in female germ cells was estimated at 94.1% (FIG. 13b).
To evaluate the extent to which NPG1 cleavage in female germ cells can promote CAIN propagation, this parameter was incorporated into the simulation, with the initial release frequency of CAIN carriers set to 1% (FIG. 7b). The results reveal a context-dependent effect: when the cleavage efficiency in male germ cells is 50%, an additional 50% cleavage in female germ cells accelerates CAIN propagation by about 3 generations. However, with the 98.4% male germ cell cleavage efficiency observed in TPD-CAIN plants, the actual 94.1% female germ cell cleavage speeds up propagation by only about one generation (FIG. 7b). The simulations also show that CAIN propagation lags only a few generations behind a homing drive, yet it is still much faster than frequency-dependent TARE drives, which need a sufficiently high initial release frequency to spread quickly (FIG. 7b).
CAIN/TADS also propagates significantly faster than TARE with a high initial release frequency [18], which points to its potential as a tool for population suppression when integrated into a haplosufficient male fertility gene [25]. To measure the suppression effect, the population dynamics of CAIN were simulated at an initial release frequency of 1% (FIG. 7c). These simulations show a rapid increase in the number and frequency of CAIN carriers (mainly heterozygotes in the early stages and mainly homozygotes in the late stages). At the same time, the total population size gradually decreased until the population went extinct in generation 26 (FIG. 7c), probably because of the growing number of male-sterile CAIN homozygotes. Overall, the computer simulations indicate that the rapid propagation of CAIN gives it the potential to achieve population suppression.
In this study, the present inventors developed CAIN, a CRISPR-based toxin-antidote (TA) gene drive system. It achieves super-Mendelian inheritance by targeting a gene essential for male gametophyte function in Arabidopsis, and it generates far fewer resistance alleles than homing-based drives. The inventors not only provide key insights for the design and application of synthetic TA gene drive systems in plants, but also offer an innovative approach to pressing ecological and agricultural challenges.
Homing-based drives, which have been successfully applied in mosquitoes [3-7], often generate resistance alleles in species with low HDR repair rates, which impedes their spread. Synthetic TA gene drive systems that mimic natural ones can overcome this problem. For example, the synthetic Medea [20,21] system (maternal effect dominant embryonic arrest), inspired by the natural Medea gene drive in the flour beetle, exploits a delicate balance between a maternal miRNA (acting as the toxin) and an antidote expressed in the zygote, and it has been implemented in Drosophila. However, its wider application is limited because it relies on detailed knowledge of Drosophila embryonic development. The design of TARE [10] (also known as ClvR [9]) avoids this limitation by targeting a gene that is essential during individual development, although it depends on carry-over of Cas9 activity from the egg into the zygote. The Medea and TARE/ClvR designs are therefore biased towards female germ cells and may impair fertility [22]. In contrast, the present inventors' design acts on male germ cells. Given that pollen grains far outnumber ovules, the fitness cost can be kept to a minimum.
The design of the present inventors focuses on a gene that is essential for normal male gametophyte function. This strategy exploits the male gametophyte formation process common to plants, in which meiosis is followed by two rounds of mitosis that eventually produce mature pollen grains [23]. The pollen grain then germinates and elongates, delivering the two sperm cells through the stigma to accomplish double fertilization, so that its genetic material is passed on to the offspring. The design and efficacy of the inventors' gene drive system depend on the choice of target gene (NPG1 in this case) as well as on the choice of a highly active promoter that gives strong Cas9 activity in the germline. This approach, although tailored to plants, may also be applicable to animals if genes essential for spermatogenesis can be identified.
Using the FAST marker [17] as cargo, the inventors established that CAIN can achieve efficient segregation distortion in plants. The cargo can be replaced to address various ecological problems, depending on the specific situation and objectives (FIG. 14). For example, the CAIN system could be used to control invasive plants by selecting specific genes, such as genes that affect megasporocyte or embryo development, to achieve population control. Alternatively, CAIN could be used to spread beneficial traits, such as drought or disease resistance genes, to improve the survival of endangered species in the wild. Likewise, specific herbicide susceptibility genes could be introduced to manage weeds more effectively. If widely applied, this strategy might herald a new era of ecological management and sustainable agriculture.
Because gene drive technology is the subject of intense debate and regulatory scrutiny, the inventors took specific measures to ensure ecological safety when designing CAIN. Choosing Arabidopsis, a self-fertilizing model plant, guards against accidental spread of the system, because a gene drive needs outcrossing to spread effectively through a population. In addition, the CAIN design offers a degree of specificity: by screening gRNAs, it can be directed specifically against certain genotypes or ecotypes. Like homing-based drives, CAIN is a zero-threshold drive, i.e., in theory the release of individuals carrying the drive element can lead to its gradual spread throughout the population. Its propagation speed, however, can be tuned by using a weaker gRNA or a weaker promoter to control Cas9 expression, since the potency of the drive is closely tied to cleavage efficiency. This additional flexibility adds a layer of control over the spread of the gene drive and enhances its safe use in different ecological settings.
The gRNAs may have some off-target effects, which is a concern. Although off-target activity is unlikely to hinder CAIN transmission, it may introduce unintended genomic mutations and thereby increase genetic load. Although not emphasized in the results section, the inventors performed a small-scale assay: 16 potential off-target sites for the 4 gRNAs were predicted with CRISPR-P 2.0 [24], and the sequences of these sites were determined by Sanger sequencing in 16 F1 plants generated from four T1 lines. None of these potential off-target sites was edited in the 16 plants tested, supporting the specific targeting of NPG1 by the four gRNAs selected for CAIN.
The ability to update a gene drive element that already exists in a population is a critical safety consideration. The CAIN designed by the present inventors can be functionally replaced. Three preconditions must be met for this process (FIG. 12).
First, the new CAIN drive, CAIN(n+1), must be integrated at the same genomic location as CAIN(n). Although the probability of achieving targeted integration by homology-directed repair (HDR) in plants is relatively low, screening a sufficient number of transformants can compensate for this limitation. Occupying the same locus forces CAIN(n+1) and CAIN(n) into direct competition, ensuring that only one drive element remains.
Second, CAIN(n+1) must use different gRNA targets to disrupt the essential gene, thereby acting as a new toxin. With this change, both the recoded NPG1 in CAIN(n) and the genomic NPG1 become cleavage targets.
Finally, a newly recoded NPG1 that is targeted by the gRNAs of neither CAIN(n) nor CAIN(n+1) serves as the new antidote.
With these three preconditions met, the original CAIN(n) can be replaced by CAIN(n+1), which allows the current cargo to be removed or modified. The method does not require a new target gene and preserves the overall size of the gene drive element, thereby improving compatibility and integration efficiency.
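The antidote in each CAIN generation relies on a recoded NPG1 that differs from the cleavable sequence at the nucleotide level while still encoding the same protein. The sketch below shows how such a recoding could be checked. It uses the first 36 nucleotides of SEQ ID NO. 2 (native coding sequence) and SEQ ID NO. 3 (recoded coding sequence) from the sequence listing; whether this stretch overlaps a particular gRNA target site is not stated here, and the helper names are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def translate(cds):
    """Translate a coding sequence, ignoring any trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))

def recoding_report(native, recoded):
    """Return (same_protein, positions of changed nucleotides, 0-based)."""
    same_protein = translate(native) == translate(recoded)
    changed = [i for i, (a, b) in enumerate(zip(native, recoded)) if a != b]
    return same_protein, changed

# First 36 nt of SEQ ID NO. 2 (native) and SEQ ID NO. 3 (recoded).
NATIVE_5P = "ATGCTCGGGAATCAATCCGCGGATTTTAGTGAGAAG"
RECODED_5P = "ATGCTCGGGAATCAATCGGCAGACTTCTCAGAGAAG"

same_protein, changed_positions = recoding_report(NATIVE_5P, RECODED_5P)
```

Both fragments translate to the start of SEQ ID NO. 1 (MLGNQSADFSEK) although several nucleotides differ, which is the property a recoded antidote needs: a gRNA designed against the native sequence no longer matches, while the encoded protein is unchanged. The same check could be applied to any newly recoded NPG1 intended for CAIN(n+1).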
In summary, the present inventors devised CAIN, a gene drive system based on the CRISPR TA principle and specifically tailored to plants. By targeting the extended male gametophyte stage of the plant life cycle, the inventors successfully demonstrated the efficacy of CAIN in Arabidopsis, setting a benchmark for its use in other species. Since the key component, the NPG1 gene, shows sequence conservation across many plants, extending CAIN to other species is promising. Looking ahead, the gene drive system needs further refinement, including research into its reversibility and adaptability and into the mechanisms that make it controllable. CAIN and similar gene drive systems are expected to reshape ecological management, agriculture, and species conservation in a substantial and far-reaching way.
References:
1.DiCarlo, J.E., Chavez, A., Dietz, S.L., Esvelt, K.M. & Church, G.M. Safeguarding CRISPR-Cas9 gene drives in yeast. Nature biotechnology 33, 1250-1255 (2015).
2.Xu, H. et al. Chromosome drives via CRISPR-Cas9 in yeast. Nature communications 11, 4344 (2020).
3.Gantz, V.M. et al. Highly efficient Cas9-mediated gene drive for population modification of the malaria vector mosquito Anopheles stephensi. Proceedings of the National Academy of Sciences 112, E6736-E6743 (2015).
4.Hammond, A. et al. A CRISPR-Cas9 gene drive system targeting female reproduction in the malaria mosquito vector Anopheles gambiae. Nature biotechnology 34, 78-83 (2016).
5.Kyrou, K. et al. A CRISPR–Cas9 gene drive targeting doublesex causes complete population suppression in caged Anopheles gambiae mosquitoes. Nature biotechnology 36, 1062-1066 (2018).
6.Li, M. et al. Development of a confinable gene drive system in the human disease vector Aedes aegypti. Elife 9, e51701 (2020).
7.Simoni, A. et al. A male-biased sex-distorter gene drive for the human malaria vector Anopheles gambiae. Nature biotechnology 38, 1054-1060 (2020).
8.Gantz, V.M. & Bier, E. The mutagenic chain reaction: a method for converting heterozygous to homozygous mutations. Science 348, 442-444 (2015).
9.Oberhofer, G., Ivy, T. & Hay, B.A. Cleave and Rescue, a novel selfish genetic element and general strategy for gene drive. Proceedings of the National Academy of Sciences 116, 6250-6259 (2019).
10.Champer, J. et al. A toxin-antidote CRISPR gene drive system for regional population modification. Nature communications 11, 1082 (2020).
11.Grunwald, H.A. et al. Super-Mendelian inheritance mediated by CRISPR–Cas9 in the female mouse germline. Nature 566, 105-109 (2019).
12.Champer, J., Kim, I.K., Champer, S.E., Clark, A.G. & Messer, P.W. Performance analysis of novel toxin-antidote CRISPR gene drive systems. BMC biology 18, 1-17 (2020).
13.Muralla, R., Lloyd, J. & Meinke, D. Molecular foundations of reproductive lethality in Arabidopsis thaliana. PloS one 6, e28398 (2011).
14.Golovkin, M. & Reddy, A.S. A calmodulin-binding protein from Arabidopsis has an essential role in pollen germination. Proceedings of the National Academy of Sciences 100, 10558-10563 (2003).
15.Klimyuk, V.I. & Jones, J.D. AtDMC1, the Arabidopsis homologue of the yeast DMC1 gene: characterization, transposon‐induced allelic variation and meiosis‐associated expression. The Plant Journal 11, 1-14 (1997).
16.Yang, S.-L. et al. Tapetum determinant1 is required for cell specialization in the Arabidopsis anther. The Plant Cell 15, 2792-2804 (2003).
17.Shimada, T.L., Shimada, T. & Hara‐Nishimura, I. A rapid and non‐destructive screenable marker, FAST, for identifying transformed seeds of Arabidopsis thaliana. The Plant Journal 61, 519-528 (2010).
18.Zou, J. et al. Comparative proteomic analysis of Arabidopsis mature pollen and germinated pollen. Journal of Integrative Plant Biology 51, 438-455 (2009).
19.Leljak-Levanić, D., Juranić, M. & Sprunck, S. De novo zygotic transcription in wheat (Triticum aestivum L.) includes genes encoding small putative secreted peptides and a protein involved in proteasomal degradation. Plant reproduction 26, 267-285 (2013).
20.Chen, C.-H. et al. A synthetic maternal-effect selfish genetic element drives population replacement in Drosophila. science 316, 597-600 (2007).
21.Buchman, A., Marshall, J.M., Ostrovski, D., Yang, T. & Akbari, O.S. Synthetically engineered Medea gene drive system in the worldwide crop pest Drosophila suzukii. Proceedings of the National Academy of Sciences 115, 4725-4730 (2018).
22.Zanders, S.E. & Unckless, R.L. Fertility costs of meiotic drivers. Current Biology 29, R512-R520 (2019).
23.Schmidt, A., Schmid, M.W. & Grossniklaus, U. Plant germline formation: common concepts and developmental flexibility in sexual and asexual reproduction. Development 142, 229-241 (2015).
24.Liu, H. et al. CRISPR-P 2.0: an improved CRISPR-Cas9 tool for genome editing in plants. Molecular plant 10, 530-532 (2017).
25.Kim, Y.-J., Zhang, D. & Jung, K.-H. Molecular basis of pollen germination in cereals. Trends in Plant Science 24, 1126-1136 (2019).
26.Gibson, D.G. et al. Enzymatic assembly of DNA molecules up to several hundred kilobases. Nature methods 6, 343-345 (2009).
27.Xing, H.-L. et al. A CRISPR/Cas9 toolkit for multiplex genome editing in plants. BMC plant biology 14, 1-12 (2014).
28.Xie, X. et al. CRISPR-GE: a convenient software toolkit for CRISPR-based genome editing. Molecular plant 10, 1246-1249 (2017).
29.Yoo, S.-D., Cho, Y.-H. & Sheen, J. Arabidopsis mesophyll protoplasts: a versatile cell system for transient gene expression analysis. Nature protocols 2, 1565-1572 (2007).
30.Wu, F.-H. et al. Tape-Arabidopsis Sandwich-a simpler Arabidopsis protoplast isolation method. Plant methods 5, 1-10 (2009).
31.Clough, S.J. & Bent, A.F. Floral dip: a simplified method for Agrobacterium‐mediated transformation of Arabidopsis thaliana. The plant journal 16, 735-743 (1998).
32.Liu, Y.G., Mitsukawa, N., Oosumi, T. & Whittier, R.F. Efficient isolation and mapping of Arabidopsis thaliana T‐DNA insert junctions by thermal asymmetric interlaced PCR. The Plant Journal 8, 457-463 (1995).
33.Sun, L. et al. TDNAscan: a software to identify complete and truncated T-DNA insertions. Frontiers in Genetics 10, 685 (2019).
Partial sequences referred to in this application:
SEQ ID NO. 1 Arabidopsis NPG1 amino acid sequence
MLGNQSADFSEKGEDEIVRQLCANGICMKTTEVEAKLDEGNIQEAESSLREGLSLNFEEARALLGRLEYQRGNLEGALRVFEGIDLQAAIQRLQVSVPLEKPATKKNRPREPQQSVSQHAANLVLEAIYLKAKSLQKLGRITEAAHECKSVLDSVEKIFQQGIPDAQVDNKLQETVSHAVELLPALWKESGDYQEAISAYRRALLSQWNLDNDCCARIQKDFAVFLLHSGVEASPPSLGSQIEGSYIPRNNIEEAILLLMILLKKFNLGKAKWDPSVFEHLTFALSLCSQTAVLAKQLEEVMPGVFSRIERWNTLALSYSAAGQNSAAVNLLRKSLHKHEQPDDLVALLLAAKLCSEEPSLAAEGTGYAQRAINNAQGMDEHLKGVGLRMLGLCLGKQAKVPTSDFERSRLQSESLKALDGAIAFEHNNPDLIFELGVQYAEQRNLKAASRYAKEFIDATGGSVLKGWRFLALVLSAQQRFSEAEVVTDAALDETAKWDQGPLLRLKAKLKISQSNPTEAVETYRYLLALVQAQRKSFGPLRTLSQMEEDKVNEFEVWHGLAYLYSSLSHWNDVEVCLKKAGELKQYSASMLHTEGRMWEGRKEFKPALAAFLDGLLLDGSSVPCKVAVGALLSERGKDHQPTLPVARSLLSDALRIDPTNRKAWYYLGMVHKSDGRIADATDCFQAASMLEESDPIESFSTIL
SEQ ID NO. 2 Arabidopsis NPG1 coding sequence
ATGCTCGGGAATCAATCCGCGGATTTTAGTGAGAAGGGGGAAGATGAGATCGTCAGACAGCTTTGTGCTAATGGGATTTGCATGAAAACAACTGAAGTTGAAGCAAAGCTTGATGAAGGAAATATTCAAGAAGCTGAATCTTCTTTGAGAGAAGGATTATCTCTCAATTTCGAGGAAGCAAGAGCACTTCTTGGAAGATTGGAATACCAAAGAGGGAATTTAGAAGGCGCACTTCGTGTCTTTGAAGGTATCGACCTTCAAGCAGCTATCCAGCGGTTACAGGTTTCCGTGCCTCTTGAGAAACCGGCTACTAAGAAAAACCGTCCCCGTGAACCGCAGCAATCAGTTTCTCAGCATGCTGCTAACTTGGTCCTTGAAGCTATCTACTTGAAAGCCAAATCCCTTCAAAAGCTTGGGAGAATAACTGAGGCTGCTCATGAATGCAAGAGTGTTCTTGATTCTGTTGAGAAGATATTTCAGCAAGGGATACCAGATGCTCAAGTGGATAACAAACTTCAAGAAACCGTTAGCCACGCCGTTGAACTACTTCCTGCGCTATGGAAAGAATCTGGTGATTATCAAGAAGCCATATCTGCTTATAGACGCGCGCTTTTAAGCCAATGGAATCTTGATAATGATTGTTGTGCAAGGATTCAAAAAGATTTTGCAGTCTTTCTTTTACATTCTGGAGTCGAAGCGAGTCCACCGAGTTTAGGTTCTCAGATAGAGGGATCGTACATACCTAGAAACAACATAGAAGAAGCCATTCTTCTTCTAATGATTCTTTTAAAGAAGTTTAACCTCGGGAAAGCGAAATGGGATCCGTCTGTGTTTGAGCACCTTACCTTTGCGTTATCTTTATGTAGTCAGACCGCGGTTCTCGCCAAGCAGCTTGAAGAAGTAATGCCTGGTGTGTTTAGCCGTATTGAGCGTTGGAACACTTTGGCTCTTTCTTATAGTGCAGCAGGTCAAAACAGTGCTGCAGTTAACCTTCTTAGAAAGTCTCTGCATAAACACGAACAACCCGATGATCTTGTGGCGCTTTTGTTAGCTGCTAAGCTTTGCAGTGAAGAGCCTTCTTTAGCTGCTGAAGGTACGGGTTATGCGCAGAGAGCGATAAACAATGCTCAAGGTATGGATGAGCATTTGAAAGGCGTTGGTTTGAGGATGTTAGGACTTTGTTTAGGGAAACAAGCGAAGGTTCCGACATCGGATTTTGAAAGATCTCGGCTGCAATCAGAATCATTGAAAGCATTAGATGGAGCTATAGCTTTTGAGCACAATAATCCTGATTTGATCTTTGAGTTAGGTGTTCAATACGCTGAGCAACGGAACTTAAAAGCTGCTTCCCGTTACGCCAAAGAGTTCATCGATGCAACGGGAGGGTCAGTGTTAAAAGGATGGAGATTTCTCGCGCTTGTTTTGTCAGCTCAACAACGGTTTTCAGAAGCAGAAGTTGTGACTGATGCTGCTTTAGATGAAACTGCAAAGTGGGATCAGGGACCTCTCTTGAGACTCAAAGCAAAGCTGAAAATCTCTCAGTCAAATCCAACAGAAGCCGTTGAGACTTATCGTTACCTTCTTGCATTGGTTCAAGCGCAAAGGAAATCTTTCGGACCTCTCAGAACTCTTTCTCAGATGGAGGAAGACAAAGTGAATGAGTTTGAAGTGTGGCATGGCTTGGCTTATCTTTACTCAAGCCTTTCGCATTGGAACGACGTAGAAGTCTGTCTGAAAAAAGCCGGAGAGCTGAAACAATACTCTGCTTCAATGTTGCATACAGAAGGTCGAATGTGGGAAGGACGAAAGGAGTTCAAACCCGCGCTAGCAGCTTTCTTGGACGGTTTATTACTAGACGGATCATCGGTTCCTTGCAAAGTAGCGGTTGGAGCGTTATTGTCCGAAAGAGGGAAAGATCATCAGCCAACTCTCCCCGTGGCTAGAAGTTTGCTCTCTGATGCATTGAGGATCGATCCAACAAACCGAAAAGCTTGGTATTA
CTTAGGAATGGTTCATAAATCTGATGGACGTATAGCTGATGCTACTGATTGCTTCCAAGCTGCTTCTATGCTTGAAGAGTCTGATCCTATTGAAAGCTTCTCAACCATTCTTTAA
SEQ ID NO. 3 Recoded Arabidopsis NPG1 coding sequence
ATGCTCGGGAATCAATCGGCAGACTTCTCAGAGAAGGGGGAAGATGAGATCGTCAGACAGCTTTGTGCTAATGGGATTTGCATGAAAACAACTGAAGTTGAAGCAAAGCTTGATGAAGGAAATATTCAAGAAGCTGAATCTTCTTTGAGAGAAGGATTATCTCTCAATTTCGAGGAAGCAAGAGCACTTCTTGGAAGATTGGAATACCAAAGAGGGAATTTAGAAGGCGCACTTCGTGTCTTTGAAGGTATCGACCTTCAGGCGGCCATACAACGATTACAGGTTTCCGTGCCTCTTGAGAAACCGGCTACTAAGAAAAACCGTCCCCGTGAACCGCAGCAATCAGTTTCTCAGCATGCTGCTAACTTGGTCCTTGAAGCTATCTACTTGAAAGCCAAATCCCTTCAAAAGCTTGGGAGAATAACTGAGGCTGCTCATGAATGCAAGAGTGTTCTTGATTCTGTTGAGAAGATATTTCAGCAAGGGATACCAGATGCTCAAGTGGATAACAAACTTCAAGAAACCGTTAGCCACGCCGTTGAACTACTTCCTGCCTTGTGGAAGGAGAGCGGTGATTATCAAGAAGCCATATCTGCTTATAGACGCGCGCTTTTAAGCCAATGGAATCTTGATAATGATTGTTGTGCAAGGATTCAAAAAGATTTTGCAGTCTTTCTTTTACATTCTGGAGTCGAAGCGAGTCCACCGAGTTTAGGTTCTCAGATAGAGGGATCGTACATACCTAGAAACAACATAGAAGAAGCCATTCTTCTTCTAATGATTCTTTTAAAGAAGTTTAATTTAGGCAAGGCTAAGTGGGATCCGTCTGTGTTTGAGCACCTTACCTTTGCGTTATCTTTATGTAGTCAGACCGCGGTTCTCGCCAAGCAGCTTGAAGAAGTAATGCCTGGTGTGTTTAGCCGTATTGAGCGTTGGAACACTTTGGCTCTTTCTTATAGTGCAGCAGGTCAAAACAGTGCTGCAGTTAACCTTCTTAGAAAGTCTCTGCATAAACACGAACAACCCGATGATCTTGTGGCGCTTTTGTTAGCTGCTAAGCTTTGCAGTGAAGAGCCTTCTTTAGCTGCTGAAGGTACGGGTTATGCGCAGAGAGCGATAAACAATGCTCAAGGTATGGATGAGCATTTGAAAGGCGTTGGTTTGAGGATGTTAGGACTTTGTTTAGGGAAACAAGCGAAGGTTCCGACATCGGATTTTGAAAGATCTCGGCTGCAATCAGAATCATTGAAAGCATTAGATGGAGCTATAGCTTTTGAGCACAATAATCCTGATTTGATCTTTGAGTTAGGTGTTCAATACGCTGAGCAACGGAACTTAAAAGCTGCTTCCCGTTACGCCAAAGAGTTCATCGATGCAACGGGAGGGTCAGTGTTAAAAGGATGGAGATTTCTCGCGCTTGTTTTGTCAGCTCAACAACGGTTTTCAGAAGCAGAAGTTGTGACTGATGCTGCTTTAGATGAAACTGCAAAGTGGGATCAGGGACCTCTCTTGAGACTCAAAGCAAAGCTGAAAATCTCTCAGTCAAATCCAACAGAAGCCGTTGAGACTTATCGTTACCTTCTTGCATTGGTTCAAGCGCAAAGGAAATCTTTCGGACCTCTCAGAACTCTTTCTCAGATGGAGGAAGACAAAGTGAATGAGTTTGAAGTGTGGCATGGCTTGGCTTATCTTTACTCAAGCCTTTCGCATTGGAACGACGTAGAAGTCTGTCTGAAAAAAGCCGGAGAGCTGAAACAATACTCTGCTTCAATGTTGCATACAGAAGGTCGAATGTGGGAAGGACGAAAGGAGTTCAAACCCGCGCTAGCAGCTTTCTTGGACGGTTTATTACTAGACGGATCATCGGTTCCTTGCAAAGTAGCGGTTGGAGCGTTATTGTCCGAAAGAGGGAAAGATCATCAGCCAACTCTCCCCGTGGCTAGAAGTTTGCTCTCTGATGCATTGAGGATCGATCCAACAAACCGAAAAGCTTGGTATTA
CTTAGGAATGGTTCATAAATCTGATGGACGTATAGCTGATGCTACTGATTGCTTCCAAGCTGCTTCTATGCTTGAAGAGTCTGATCCTATTGAAAGCTTCTCAACCATTCTTTAA
SEQ ID NO. 4 Arabidopsis NPG1 native promoter sequence
tatgagtcgagtgtctgacttgtatgagttagggcctagtatgaataaataaacattattaatattaagatagttgttttcgataattgtttgataaggatgccactaaactcatacctcttagcttatacgaattgacttaattagacattaatacattatatctatatattatctagatttataattgctaagccaataggtcaaggtcttgtctaataaatgcatgcacaactaattcagtcaataatgtacctgtataatactacaaataattcaagctaattgatctatattgaagaacaaataagtaatctatttcggatttagtcttatcatgtgtctaaataaacacataactcttaagtcttaatgatttatttttgatagatatcaattataattatacaattacaaatgatttgatgattgactatacgtaagaactaactttgataattttgaattgggacaaatcattgaaggccttacgtttaagctttagatgtttcccaacgccaaaggagaatgaaaaggacagaccatcagtgatttgagtactcaatcaacatatttattatgtactttgagttaattaattttctattaataacaaaaatcaagcttgcacatttcaatgtgataagtatatgaataataatccaagctaatttttaagaaaagaggaatattgaaagcttgcaaattattcgaatgctagaggtccttaccttgcatgcaccttttgtaacaattacctatgggtgtggggaaatctagctagctacatattttcaattatttttccctattaaattgagattattgttataaaagaaaatgcccaaacttaattttcggggtttaaaattttgtttaaaaataaataaaatataagaaaagaaagaaaagtataatttgggttaaggggtttgaatatgattgatttgaatcgtcgtcgaaatgtatacgtcacctaacgcttttgttgctatactagtatcattaagtggaaattttaaagtcattaaaactcttctcatttttgtatttctaaaagagtcttaaggggtttgaatatgatttaaattatcttacaagtgtaaatgccatctaacgcttttgttgttatactagtattatttagtaataagatgctaaagtcactcaaactccagaatcaataatactccaagctatacatattagaattttaaaatagtatgaacactttcgataataaaaataccaaacttatttgggacactaaataagtttgggccgaaaatatttaaaagcccaatttaaactaaaattcatttaggctcttctcttctactaccttcttctatcgagccacaccgaatgaaattagtgaaactgctattggcttgtgaattgtgtgtgatggcgttaaagcctcttttagttcgtaaccgatgaaatgacagtaatagccttgagaaacactgaaaattacagaaagagagtttgaactttgaagacaaaacaggtgtttctatttctctccccgttcacgttctgcaacatcggaagcacgtacggctcctaagactccgttttgcttctttttttttaaaacacattcttattataaatataaaaaaaacaaagagagatcaaaaacaaaaagtttcctctctttttctaattttttaaagtttctttcatcttcttcagatccgaattgtcgccgcgaaattcgtcagtgcagcttcttcttcttcgcgtactttattcgatcggctgtctgaagaacatgaagccgatgatcgtaggtacgttagattcatttttccgaaattggctttttgatttttctgatcgaaacgatgcgaggttcaatttcatcattgttttgaaatctatgacttacaaaagtaatggcgttgacagatttgttcttaataaggacccacatttttgctgaattttggaacaaacattgttcttctttgatttcaaaacaagaattagaa
aattcatttatcatgtaatctatttagctgatgtgacgatgaacagatcaaaggaatgtagtctcgaattgtttaaggttataatgattcctctaagtgaaaaaaaaaaaaaaagcagaaaaaaaagttagaaagatgacaaagttgaagatttctttttgtctttgaagcttctatttttttggtgggtcctttttaagacaatgatttcaattcttggattttgtctgaagaaaaatgttgctgttcttctctttacaatgtttttgattgtgagcttgcgttgacttaaatcatgtcatatattttggtttctcacggttttatttattgtgccaagtgatgcagttgctgctagttacggtggattgatgtttggatggacgcagaaattttgatgtgggtttagtctaaaaggtgaagaaca
SEQ ID NO. 5 DMC1 promoter sequence
cagggaatgttccaatataagacactttaaacgtaagtttagacaatatagacactttccaagttagaggcacttttccttctttttgaaggaaaacttgacttttatacctcttaactaaacaatcgaaaacaataactaaatatatatcttaaccaaacaattaaaaaaataaaagaatttagatacgtagttattaatatagaccattagattgaaaaataaaaattaagatctatggctgagattaaagacaataaatggattaattttttgatgttaaaatctgattagaaaaaggtatttctcttcgtctctagaactaaatctctctctctaaaaaaacaatcgtttctccctttctccttcctgaagatcgttttttcataaatccatagtagtttaaaaacgaagcagagagatgttgaaaatcgtttctcatgaaattaatcgattattctctgtgaagttctttaatccacacaactttcctcatgaacatgataatagtagtaaatggaggtttttcctatggttactctagacgaaggaggatctccttgtgttggacaggtttgtgatttctttccatggattaaaaaaatttgattgtttgtttatgatgaacgattctttggctacggaagagtgtcatggagttctggcgaattctttggctatgtttggtgatttcgtttttaatcaagttgggaatcaataggaaacaactaagcatacaacatagattagaagagatatcaagatggatctaatttaagtaagatttggcgactaattctagatgattagggttatttgtgatttattacaaggcatttgtgttctcattgatttggcgagtaattctgtatgactagggttatttgtgttttcttaaaaagaatttgtgttcttgttgaaatcttgttcattggaattatttgtgtttggtaaatcttcattggtggctaaggatgtgtttgtagctcttacggcgtttgttattggtgatgtccattatggatggcaaattatggatggcacattatggatgatgaatcatggatgacatattatggatgacgcatcatggattgtatattatggattgatatggtgagatttgtaaatcttttggtcttacatgttaagagtaaaagatgaagaattggagaagcatgtctaacatcctaaaaacaagctatatgcggttgatttgctacaaataattttttggtatccataataacaaatccatttaaatatatccattcagaaacctttctactgatccgtatccattctatataccatgtcaataataataggagattcgattaaccgtgttttgtaaagaaaccaaagttccatgtccataaggttttgaaggtggaggtctctgcaaactgaaaaaaaaatcaacaaacaattttttggtgtccataataacgaatccatttaaatatatccattcggaaacctttttactgatctatatccattttatataacatgtccatgataacaggagattcgattaactgaaatctcgatgctacgtagatgaaacgagtttgacacatgagagagagcaaaaatcaaatcaaaccgccattgttgaagaagaagaagtttcttctcattttttacaaagatgaagagagagagaggtgaagagagagagagagatgaagagagagagagagagaaagagagagatgaagagagagagagaaagagagaaaacgtgggttaagataatattttagttaagagggtattttagtaaaaaaacataaaaaagtgcctaatcttttgaaagtgcctaaacacagaaatagttttaaaaaagtgtttaagagtgtaatattctctttttttcacctagattccttctattgaccgtcgatagacggatgataactatgacgtggcattatcgcagccatcaaacaaagtcatgtataacaaagaagagcacacaaacgaaaacaaattca
gttgcggaacccaaattcaaatcaacggaattagaatcacgctttcaattccgtaacccgccattaaaaaccttgaaccctcgaagcaaatcgagcaaagattttcaaatttcgaatttcaaaattctatctctctcactcttccaagcttagagactcttagagcgagaaa
SEQ ID NO. 6 TPD1 promoter sequence
acatagagcttgcatatatttggaggttagattacaagacgagattccatgtgtaacctaattgattaataaggcatctctatttatttgtgactcgacctgatctgatccgggtgggatacaacatgttgtagattagtgttattgataggaaatttgtaacatctctaaatgtttttgacctatgattgttttttttccttacaaacttataccattcccatagccttagcatctgccattgcagtaacattaacgatttccatgtgaaaaacaaccaattttagcaataatttgggttgactttgtcgagatcttggctcaattatatatattcataccattactatataagaactgatgtcttgtttatttgatgtcagacgcctgagaggtttcaaagtttttaaaaaaaaaaatttttaaagagaagcgtgtgtggctttaaataaggtcaactaggaaatgggaatcattcaacaagaagaaaaatgacaaaatgaaatatgaatgaagaggaggagggggtcgagaaaggttgagagaagcagaccaaagctctgcaaaactctgttttattaatgacacattgtgctctgtctgtcaaaagcaatgccttctttctagtgcatttattgcccattcccaaacaaaatatacaaataagtgtaaggatgcatgatatagtttaaaaaactatttgaaatgctcacattctttttgaacttctcttttaaatttgcaaaaaaaaattatatttttttgttccaaaaactgcaagcaaatgttgatacgaacgagccaacttgtcattttatgaccttgttttatctctgccagtcaaataactctttccgttttcgcttttttggcttacttcttactctgttggtttgccttttgtttggccttactttcgtttataggaatcgaatttcaatgttttatctttcctgtcgaaattaaattggtctttctaataaatctcatttttttctttttcaaagatttgtttatttagtgaacaaattcttaagagagtttttttccccaagcaattgaaaatgaatcatgtaatgttgatttttttggtgcaagtttatatagtttgctagaaatttggccttcatacgatatttgaacattttgatataagatttctatcagaagacagaagctacacgattgattcagccagagaaaacaaaagttgaaccgaacgattaaacccacacacaaaaaaaaacaaatagaataagaatgaaggagaaggaatataaaaatgggtacaagaaaaaacatcatcgtcgcaatcataaatgcaattgaaggcgcgtggaaaagagactcgtgtgcttctgatactcccacgtgaggatgtgacaatttaatattacgaattcaataattacccaatctttcttaatctgttaatttatctaagccaatcattcattcctttcacacccgccacggtgtcaatccaaattttttagaatcaccaaattacacctttacccttatattgtttttaatttgtttccgaattttaccaactgtttcaataaaacgtaccaacccatttctggttggatacaagcgggattcattcctatataacatttttaacggtatcattcaatcataccggtccaatttatttctatcatgctatctatataacattttcttacaaaatgtctttctctataccttttcacattcgaaacttcaaaagttaatgtgtcaatttaattacgcataactcgaaaaatgcattttaaaacaattaaaattaaatttatcttaatttgacgttataaaaaaatattgaatatatttccgagaaataataataagagaaaggactataaatacgtctctagtgtgtaatgtgtaacacagacgagagtcctcaaatccattttctctctctatctctctttatcccttcgtcttcttcctcggcgacaccacttgcaggcgctaactcgacgaag
aaggaaaaggtgagagaaactctctgaaaactgtacggatttaaacgtatatatgtgtgtatgtatacgaatctgatggtttagttttctggatttttctccattctctgttgattctactttttttgtttgtttgtttgctttgtttctctgtgtttcacgctgcactacgctccagctttctctttgtttttcagaaccagattgcttttttccatgaaactcgatcgagatttcttactttttcctatttttagtcgctttatgatacgattcatctgtcgcctgattcgcttcatctccttggtttgattttagattttcaatttcttctgtttttggttacgtttgtgttcgctgtgatgaagttttccctgaaacttgttaaaagcgataatgcatttcgccgtcgttttcttcgattttaggtttaagcttctctctctctctttcactgtacattgcgcaacagatttttgattttggtcaaagtttttcaaaatttctgcagtagattccttatatttcaaatcagagaagcgagtgatgttaggagccgcttaaatctggattttcctctgttttatactgttcattgatatgatggatgcaagacaagtcgtgtgataagatactcataaagtttttttcgttctctttcctctggtttttacagattttccggtgttagtcacatcgacgcagaaggaacagagaagaagacgagagtcagcttcattatcaactttagttcttcgacgtctacgcac
SEQ ID NO. 7 gRNA2 target sequence
CCCCTTCTCACTAAAATCCGCGG
SEQ ID NO. 8 gRNA6 target sequence
GATCCCATTTCGCTTTCCCGAGG
SEQ ID NO. 9 gRNA11 target sequence
ACCTTCAAGCAGCTATCCAGCGG
SEQ ID NO. 10 gRNA23 target sequence
CCAGATTCTTTCCATAGCGCAGG
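The four gRNA target sequences above are each 23 nucleotides long and each ends in GG. A minimal sketch follows, assuming (this is not stated in the listing itself) that each entry is a 20-nt protospacer followed by a 3-nt NGG PAM of the kind recognized by a Cas9 nuclease. The helper names `split_target` and `has_ngg_pam` are hypothetical and are used only for illustration.

```python
# Target sequences copied from SEQ ID NOs 7-10 above.
TARGETS = {
    "gRNA2": "CCCCTTCTCACTAAAATCCGCGG",
    "gRNA6": "GATCCCATTTCGCTTTCCCGAGG",
    "gRNA11": "ACCTTCAAGCAGCTATCCAGCGG",
    "gRNA23": "CCAGATTCTTTCCATAGCGCAGG",
}


def split_target(site):
    """Split a 23-nt site into (20-nt protospacer, 3-nt PAM).

    Assumption: the listed sequence is written 5'->3' with the PAM at
    its 3' end; the listing does not say this explicitly.
    """
    if len(site) != 23:
        raise ValueError(f"expected 23 nt, got {len(site)}")
    return site[:20], site[20:]


def has_ngg_pam(site):
    """Return True if the last three bases match the pattern NGG."""
    _, pam = split_target(site)
    return pam[1:] == "GG"


if __name__ == "__main__":
    for name, site in TARGETS.items():
        protospacer, pam = split_target(site)
        print(name, protospacer, pam, has_ngg_pam(site))
```

A check of this kind only confirms that the listed sequences are consistent with the assumed layout; it says nothing about cutting efficiency or off-target activity.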
Claims (22)
1. An artificial gene driving system for a plant, the artificial gene driving system comprising:
a first nucleic acid comprising a coding sequence for a component of a gene editing system that can target and cause the loss of function of a pollen tube development essential protein in the plant, the coding sequence for the component of the gene editing system being operably linked to a promoter that mediates specific expression during pollen formation;
a second nucleic acid comprising a recoded coding sequence for the pollen tube development essential protein, the recoded coding sequence encoding a wild-type pollen tube development essential protein, not being targeted by the gene editing system, and being operably linked to a native promoter of the pollen tube development essential gene; and
a third nucleic acid comprising a coding sequence for a cargo to be transmitted in a population of said plant,
wherein the plant is Arabidopsis thaliana and the pollen tube development essential protein is Arabidopsis thaliana No Pollen Germination 1.
2. The artificial gene drive system of claim 1, wherein the first nucleic acid, second nucleic acid, and third nucleic acid are located on the same expression construct.
3. The artificial gene drive system of claim 2, wherein the No Pollen Germination 1 consists of the amino acid sequence set forth in SEQ ID No. 1.
4. The artificial gene driving system of claim 2, wherein the coding sequence of endogenous No Pollen Germination 1 in the plant consists of the nucleotide sequence set forth in SEQ ID No. 2.
5. The artificial gene drive system of claim 2, wherein the recoded coding sequence of No Pollen Germination 1 consists of the nucleotide sequence set forth in SEQ ID No. 3 and the recoded coding sequence of No Pollen Germination 1 cannot be targeted by the gene editing system so as not to be disabled by expression of the gene editing system.
6. The artificial gene driving system of claim 2, wherein the native promoter of No Pollen Germination 1 consists of the nucleotide sequence set forth in SEQ ID No. 4.
7. The artificial gene drive system of claim 2, wherein the promoter that mediates specific expression during pollen formation is the promoter of the Disruption of Meiotic Control 1 gene.
8. The artificial gene driving system of claim 7, wherein the promoter of the Disruption of Meiotic Control 1 gene consists of the nucleotide sequence shown in SEQ ID NO. 5.
9. The artificial gene drive system of claim 2, wherein the promoter that mediates specific expression during pollen formation is the promoter of the Tapetum Determinant 1 gene.
10. The artificial gene driving system of claim 9, wherein the promoter of the Tapetum Determinant 1 gene consists of the nucleotide sequence shown in SEQ ID NO. 6.
11. The artificial gene driving system of claim 1, wherein the gene editing system is selected from CRISPR, ZFN, or TALEN based gene editing systems.
12. The artificial gene drive system of claim 11, wherein the CRISPR gene editing system comprises a CRISPR nuclease and at least one guide RNA.
13. The artificial gene drive system of claim 12 wherein the coding sequence of the CRISPR nuclease is operably linked to the promoter that mediates specific expression during pollen formation.
14. The artificial gene drive system of claim 12, wherein the gene editing system comprises a Cas9 nuclease and at least one gRNA targeting endogenous No Pollen Germination 1.
15. The artificial gene drive system of claim 14, wherein the at least one gRNA targeting endogenous No Pollen Germination 1 targets a nucleotide sequence selected from any one of SEQ ID NOs 7-10.
16. The artificial gene drive system of any one of claims 1-15, wherein the cargo is a herbicide sensitive gene, a gene that disrupts herbicide resistance, a gene that improves environmental adaptation, or a gene that improves disease resistance.
17. A method of producing a modified plant for gene-drive-based engineering of a plant population, the method comprising introducing the artificial gene drive system of any one of claims 1-16 into at least one plant, thereby obtaining at least one modified plant having the first nucleic acid, second nucleic acid, and third nucleic acid integrated into its genome, wherein the plant is Arabidopsis thaliana.
18. The method of claim 17, wherein the first, second, and third nucleic acids integrated into the genome of the modified plant are closely linked.
19. Use of a modified plant for genetically engineering a plant population, wherein the modified plant is obtained by the method of producing a modified plant for genetically engineering a plant population as claimed in claim 17 or 18, or the modified plant has had the artificial gene drive system for plants as claimed in any of claims 1 to 16 introduced into it, whereby the first, second and third nucleic acids are integrated into the genome of the modified plant, wherein the plant is Arabidopsis thaliana.
20. A method of genetically modifying a population of plants, the method comprising placing at least one modified plant obtained by the method of producing a modified plant for genetically modifying a population of plants of claim 17 or 18 into the population of plants and allowing the at least one modified plant to cross with other plants in the population of plants, wherein the plant is Arabidopsis thaliana.
21. The method of claim 20, wherein the method allows the offspring of the at least one modified plant that crosses with other plants in the plant population to cross with other plants and/or offspring in the population.
22. The method of claim 20 or 21, which results in an increased proportion of plants carrying the cargo in a plant population modified by the method as compared to an unmodified plant population.
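The increase in cargo carriers recited in claim 22 can be illustrated with a toy deterministic model. The sketch below is not taken from the patent; it rests on simplifying assumptions: random mating in an infinite population (Arabidopsis thaliana is in practice largely self-pollinating), Mendelian transmission through ovules, no fitness cost of the construct, and a single hypothetical parameter `e`, the probability that a wild-type pollen grain produced by a heterozygous carrier is rendered non-functional. It also ignores the inheritance of disrupted endogenous alleles by non-carrier offspring. D denotes the linked construct (editing system, recoded rescue, and cargo) and W its absence.

```python
def next_generation(dd, dw, ww, e):
    """Advance genotype frequencies (DD, DW, WW) by one generation.

    e is a hypothetical efficiency: the fraction of W pollen from DW
    plants that fails to function. e = 0 gives plain Mendelian inheritance.
    """
    ovule_d = dd + 0.5 * dw                 # D frequency among ovules
    pollen_d = dd + 0.5 * dw                # functional D pollen
    pollen_w = ww + 0.5 * (1.0 - e) * dw    # functional W pollen
    m = pollen_d / (pollen_d + pollen_w)    # D frequency among functional pollen
    f = ovule_d
    return f * m, f * (1.0 - m) + (1.0 - f) * m, (1.0 - f) * (1.0 - m)


def simulate(release_fraction, e, generations):
    """Return the carrier fraction (DD + DW) for each generation.

    The population starts with heterozygous carriers at release_fraction
    and wild-type plants otherwise.
    """
    dd, dw, ww = 0.0, release_fraction, 1.0 - release_fraction
    carriers = [dd + dw]
    for _ in range(generations):
        dd, dw, ww = next_generation(dd, dw, ww, e)
        carriers.append(dd + dw)
    return carriers


if __name__ == "__main__":
    for e in (0.0, 1.0):
        trajectory = simulate(release_fraction=0.1, e=e, generations=30)
        print(f"e={e}: start={trajectory[0]:.3f} end={trajectory[-1]:.3f}")
```

Under these assumptions the carrier fraction stays near its starting value when `e` is 0 and rises over generations when `e` is greater than 0, which is the qualitative behaviour claim 22 describes. Quantitative predictions would require a model that accounts for selfing and fitness effects.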
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311247476.0A CN116987715B (en) | 2023-09-25 | 2023-09-25 | Artificial gene driving system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116987715A CN116987715A (en) | 2023-11-03 |
CN116987715B true CN116987715B (en) | 2024-01-30 |
Family ID=88525161
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311247476.0A Active CN116987715B (en) | 2023-09-25 | 2023-09-25 | Artificial gene driving system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116987715B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017196858A1 (en) * | 2016-05-09 | 2017-11-16 | Massachusetts Institute Of Technology | Methods to design and use gene drives |
WO2023009993A1 (en) * | 2021-07-26 | 2023-02-02 | Elsoms Developments Limited | Methods and compositions relating to maintainer lines for male-sterility |
WO2023169454A1 (en) * | 2022-03-08 | 2023-09-14 | 中国科学院遗传与发育生物学研究所 | Adenine deaminase and use thereof in base editing |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030061635A1 (en) * | 2001-06-20 | 2003-03-27 | Reddy Anireddy S.N. | Pollen-specific novel calmodulin-binding protein, NPG1 (No Pollen Germination1), promoter, coding sequences and methods for using the same |
US9006515B2 (en) * | 2012-01-06 | 2015-04-14 | Pioneer Hi Bred International Inc | Pollen preferred promoters and methods of use |
WO2020018528A2 (en) * | 2018-07-16 | 2020-01-23 | Board Of Trustees Of Michigan State University | Overcoming self-incompatibility in diploid plants for breeding and production of hybrids |
Non-Patent Citations (5)
Title |
---|
Li Yajing et al.; "Gene drive genetic elements" promote rice speciation; Chinese Science Bulletin (科学通报); Vol. 68, No. 26; pp. 3400-3402 * |
Wang C et al.; A natural gene drive system confers reproductive isolation in rice; Cell; Vol. 186, No. 17; pp. 3577-3592 * |
Golovkin, M. et al.; RecName: Full=Protein NPG1; AltName: Full=NO POLLEN GERMINATION 1; GenBank; 2023; FEATURES, ORIGIN * |
Berube B et al.; Teosinte Pollen Drive guides maize domestication and evolution by RNAi; bioRxiv; doi: 10.1101/2023.07.12.548689 * |
Liu Yang; Establishment of adenine base editors with low safety risk and of pluripotent stem cell lines; China Doctoral Dissertations Full-text Database, Basic Sciences (中国博士学位论文全文数据库基础科学辑); 2023, No. 7; p. A006-3 * |
Also Published As
Publication number | Publication date |
---|---|
CN116987715A (en) | 2023-11-03 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||