CN117242181A - Plant regulatory element and use thereof - Google Patents
- Publication number
- CN117242181A (application number CN202280031323.XA)
- Authority
- CN
- China
- Prior art keywords
- sequence
- synthetic
- crispr
- plant cells
- snrna
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
- 230000001105 regulatory effect Effects 0.000 title claims description 31
- 230000014509 gene expression Effects 0.000 claims abstract description 169
- 108091032973 (ribonucleotides)n+m Proteins 0.000 claims abstract description 29
- 108091027963 non-coding RNA Proteins 0.000 claims abstract description 18
- 102000042567 non-coding RNA Human genes 0.000 claims abstract description 18
- 210000004027 cell Anatomy 0.000 claims description 222
- 108020005004 Guide RNA Proteins 0.000 claims description 219
- 239000012634 fragment Substances 0.000 claims description 174
- 241000196324 Embryophyta Species 0.000 claims description 151
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 97
- 108090000623 proteins and genes Proteins 0.000 claims description 96
- 108010040467 CRISPR-Associated Proteins Proteins 0.000 claims description 79
- 240000008042 Zea mays Species 0.000 claims description 56
- 108020004511 Recombinant DNA Proteins 0.000 claims description 42
- -1 cas1B Proteins 0.000 claims description 31
- 240000002791 Brassica napus Species 0.000 claims description 29
- 244000046052 Phaseolus vulgaris Species 0.000 claims description 26
- 230000008685 targeting Effects 0.000 claims description 23
- 241000219198 Brassica Species 0.000 claims description 15
- 240000007124 Brassica oleracea Species 0.000 claims description 10
- 239000002243 precursor Substances 0.000 claims description 9
- 240000003259 Brassica oleracea var. botrytis Species 0.000 claims description 8
- 101150038500 cas9 gene Proteins 0.000 claims description 8
- 210000000349 chromosome Anatomy 0.000 claims description 8
- 239000002679 microRNA Substances 0.000 claims description 8
- 102000040650 (ribonucleotides)n+m Human genes 0.000 claims description 7
- 101150069031 CSN2 gene Proteins 0.000 claims description 7
- 244000045195 Cicer arietinum Species 0.000 claims description 7
- 241000234643 Festuca arundinacea Species 0.000 claims description 7
- 244000068988 Glycine max Species 0.000 claims description 7
- 108020004459 Small interfering RNA Proteins 0.000 claims description 7
- 101150055191 cas3 gene Proteins 0.000 claims description 7
- 244000105624 Arachis hypogaea Species 0.000 claims description 6
- 101150017047 CSM3 gene Proteins 0.000 claims description 6
- 101150078885 CSY3 gene Proteins 0.000 claims description 6
- 101100275895 Emericella nidulans (strain FGSC A4 / ATCC 38163 / CBS 112.46 / NRRL 194 / M139) csnB gene Proteins 0.000 claims description 6
- 101100007792 Escherichia coli (strain K12) casB gene Proteins 0.000 claims description 6
- 101100219622 Escherichia coli (strain K12) casC gene Proteins 0.000 claims description 6
- 101100382541 Escherichia coli (strain K12) casD gene Proteins 0.000 claims description 6
- 101100005249 Escherichia coli (strain K12) ygcB gene Proteins 0.000 claims description 6
- 108700011259 MicroRNAs Proteins 0.000 claims description 6
- 101100387128 Myxococcus xanthus (strain DK1622) devR gene Proteins 0.000 claims description 6
- 101100387131 Myxococcus xanthus (strain DK1622) devS gene Proteins 0.000 claims description 6
- 101100059152 Thermococcus onnurineus (strain NA1) csm1 gene Proteins 0.000 claims description 6
- 101100273269 Thermus thermophilus (strain ATCC 27634 / DSM 579 / HB8) cse3 gene Proteins 0.000 claims description 6
- 101150090505 cas10 gene Proteins 0.000 claims description 6
- 101150111685 cas4 gene Proteins 0.000 claims description 6
- 101150049463 cas5 gene Proteins 0.000 claims description 6
- 101150106467 cas6 gene Proteins 0.000 claims description 6
- 101150044165 cas7 gene Proteins 0.000 claims description 6
- 101150100788 cmr3 gene Proteins 0.000 claims description 6
- 101150040342 cmr4 gene Proteins 0.000 claims description 6
- 101150095330 cmr5 gene Proteins 0.000 claims description 6
- 101150034961 cmr6 gene Proteins 0.000 claims description 6
- 101150085344 csa5 gene Proteins 0.000 claims description 6
- 101150088639 csm4 gene Proteins 0.000 claims description 6
- 101150022488 csm5 gene Proteins 0.000 claims description 6
- 101150064365 csm6 gene Proteins 0.000 claims description 6
- 101150016576 csy2 gene Proteins 0.000 claims description 6
- 108091032955 Bacterial small RNA Proteins 0.000 claims description 5
- 101100007788 Escherichia coli (strain K12) casA gene Proteins 0.000 claims description 5
- 101100326871 Escherichia coli (strain K12) ygbF gene Proteins 0.000 claims description 5
- 101150117416 cas2 gene Proteins 0.000 claims description 5
- 101150089829 csc-1 gene Proteins 0.000 claims description 5
- 101150056210 csx1 gene Proteins 0.000 claims description 5
- 101150088252 csy1 gene Proteins 0.000 claims description 5
- 230000005030 transcription termination Effects 0.000 claims description 5
- 244000075850 Avena orientalis Species 0.000 claims description 4
- 235000007319 Avena orientalis Nutrition 0.000 claims description 4
- 240000008067 Cucumis sativus Species 0.000 claims description 4
- 101100438439 Escherichia coli (strain K12) ygbT gene Proteins 0.000 claims description 4
- 240000005979 Hordeum vulgare Species 0.000 claims description 4
- 240000004658 Medicago sativa Species 0.000 claims description 4
- 240000007594 Oryza sativa Species 0.000 claims description 4
- 108091007412 Piwi-interacting RNA Proteins 0.000 claims description 4
- 240000003768 Solanum lycopersicum Species 0.000 claims description 4
- 244000061456 Solanum tuberosum Species 0.000 claims description 4
- 101100329497 Thermoproteus tenax (strain ATCC 35583 / DSM 2078 / JCM 9277 / NBRC 100435 / Kra 1) cas2 gene Proteins 0.000 claims description 4
- 244000098338 Triticum aestivum Species 0.000 claims description 4
- 101150000705 cas1 gene Proteins 0.000 claims description 4
- 244000064816 Brassica oleracea var. acephala Species 0.000 claims description 3
- 244000025254 Cannabis sativa Species 0.000 claims description 3
- 244000000626 Daucus carota Species 0.000 claims description 3
- 108010034791 Heterochromatin Proteins 0.000 claims description 3
- 241000209510 Liliopsida Species 0.000 claims description 3
- 240000000111 Saccharum officinarum Species 0.000 claims description 3
- 230000000692 anti-sense effect Effects 0.000 claims description 3
- 210000004458 heterochromatin Anatomy 0.000 claims description 3
- 230000030648 nucleus localization Effects 0.000 claims description 3
- 241000219112 Cucumis Species 0.000 claims description 2
- 235000004341 Gossypium herbaceum Nutrition 0.000 claims description 2
- 240000002024 Gossypium herbaceum Species 0.000 claims description 2
- 240000008415 Lactuca sativa Species 0.000 claims description 2
- 240000003889 Piper guineense Species 0.000 claims description 2
- 244000042314 Vigna unguiculata Species 0.000 claims description 2
- 241001233957 eudicotyledons Species 0.000 claims description 2
- 241000219192 Brassica napus subsp. rapifera Species 0.000 claims 1
- 244000088415 Raphanus sativus Species 0.000 claims 1
- 240000003829 Sorghum propinquum Species 0.000 claims 1
- 108091028113 Trans-activating crRNA Proteins 0.000 claims 1
- 102000039471 Small Nuclear RNA Human genes 0.000 abstract description 184
- 108091029842 small nuclear ribonucleic acid Proteins 0.000 abstract description 184
- 238000000034 method Methods 0.000 abstract description 61
- 108091033409 CRISPR Proteins 0.000 abstract description 35
- 230000001404 mediated effect Effects 0.000 abstract description 31
- 239000000203 mixture Substances 0.000 abstract description 8
- 238000012239 gene modification Methods 0.000 abstract description 3
- 238000010354 CRISPR gene editing Methods 0.000 abstract 1
- 108020004414 DNA Proteins 0.000 description 109
- 108010042407 Endonucleases Proteins 0.000 description 55
- 102100031780 Endonuclease Human genes 0.000 description 51
- 108700019146 Transgenes Proteins 0.000 description 46
- 238000003776 cleavage reaction Methods 0.000 description 42
- 230000007017 scission Effects 0.000 description 42
- 235000002017 Zea mays subsp mays Nutrition 0.000 description 41
- 150000007523 nucleic acids Chemical class 0.000 description 40
- 235000018102 proteins Nutrition 0.000 description 40
- 102000004169 proteins and genes Human genes 0.000 description 40
- 102000053602 DNA Human genes 0.000 description 38
- 125000003729 nucleotide group Chemical group 0.000 description 38
- 108700004991 Cas12a Proteins 0.000 description 37
- 239000002773 nucleotide Substances 0.000 description 37
- 125000006850 spacer group Chemical group 0.000 description 33
- 210000001938 protoplast Anatomy 0.000 description 31
- 230000009466 transformation Effects 0.000 description 31
- 108091079001 CRISPR RNA Proteins 0.000 description 28
- 230000000694 effects Effects 0.000 description 28
- 235000011293 Brassica napus Nutrition 0.000 description 26
- 230000004048 modification Effects 0.000 description 25
- 238000012986 modification Methods 0.000 description 25
- 102000039446 nucleic acids Human genes 0.000 description 25
- 108020004707 nucleic acids Proteins 0.000 description 25
- 230000010354 integration Effects 0.000 description 23
- 239000013612 plasmid Substances 0.000 description 23
- 235000010627 Phaseolus vulgaris Nutrition 0.000 description 22
- 235000016383 Zea mays subsp huehuetenangensis Nutrition 0.000 description 21
- 235000009973 maize Nutrition 0.000 description 21
- 101710163270 Nuclease Proteins 0.000 description 18
- 210000001519 tissue Anatomy 0.000 description 18
- 230000035897 transcription Effects 0.000 description 18
- 238000013518 transcription Methods 0.000 description 18
- 239000003550 marker Substances 0.000 description 17
- 230000001580 bacterial effect Effects 0.000 description 15
- 238000012217 deletion Methods 0.000 description 15
- 230000037430 deletion Effects 0.000 description 15
- 238000010362 genome editing Methods 0.000 description 15
- 238000001890 transfection Methods 0.000 description 15
- 108091026890 Coding region Proteins 0.000 description 14
- 241000589158 Agrobacterium Species 0.000 description 13
- 240000008100 Brassica rapa Species 0.000 description 13
- 235000002566 Capsicum Nutrition 0.000 description 13
- 101150059443 cas12a gene Proteins 0.000 description 13
- 230000000295 complement effect Effects 0.000 description 13
- 238000003780 insertion Methods 0.000 description 13
- 230000037431 insertion Effects 0.000 description 13
- 235000005824 Zea mays ssp. parviglumis Nutrition 0.000 description 12
- 235000005822 corn Nutrition 0.000 description 12
- 244000178993 Brassica juncea Species 0.000 description 11
- 108091081062 Repeated sequence (DNA) Proteins 0.000 description 11
- 230000005782 double-strand break Effects 0.000 description 11
- 230000009261 transgenic effect Effects 0.000 description 11
- 235000011331 Brassica Nutrition 0.000 description 10
- 108020004566 Transfer RNA Proteins 0.000 description 10
- 235000007244 Zea mays Nutrition 0.000 description 10
- 230000035772 mutation Effects 0.000 description 10
- 239000002245 particle Substances 0.000 description 10
- 235000011332 Brassica juncea Nutrition 0.000 description 9
- 235000014700 Brassica juncea var napiformis Nutrition 0.000 description 9
- 235000011292 Brassica rapa Nutrition 0.000 description 9
- 108091070501 miRNA Proteins 0.000 description 9
- 102000040430 polynucleotide Human genes 0.000 description 9
- 108091033319 polynucleotide Proteins 0.000 description 9
- 239000002157 polynucleotide Substances 0.000 description 9
- 235000000540 Brassica rapa subsp rapa Nutrition 0.000 description 8
- 235000019510 Long pepper Nutrition 0.000 description 8
- 240000003455 Piper longum Species 0.000 description 8
- 241000482268 Zea mays subsp. mays Species 0.000 description 8
- 230000008901 benefit Effects 0.000 description 8
- 230000006780 non-homologous end joining Effects 0.000 description 8
- 235000008534 Capsicum annuum var annuum Nutrition 0.000 description 7
- 240000008574 Capsicum frutescens Species 0.000 description 7
- 244000203593 Piper nigrum Species 0.000 description 7
- 238000009395 breeding Methods 0.000 description 7
- 230000001488 breeding effect Effects 0.000 description 7
- 238000013461 design Methods 0.000 description 7
- 108700028369 Alleles Proteins 0.000 description 6
- 108091023037 Aptamer Proteins 0.000 description 6
- 241000894006 Bacteria Species 0.000 description 6
- 235000003899 Brassica oleracea var acephala Nutrition 0.000 description 6
- 244000045232 Canavalia ensiformis Species 0.000 description 6
- 102000004533 Endonucleases Human genes 0.000 description 6
- 206010020649 Hyperkeratosis Diseases 0.000 description 6
- 235000008184 Piper nigrum Nutrition 0.000 description 6
- 108091036066 Three prime untranslated region Proteins 0.000 description 6
- OIRDTQYFTABQOQ-KQYNXXCUSA-N adenosine Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O OIRDTQYFTABQOQ-KQYNXXCUSA-N 0.000 description 6
- 230000015572 biosynthetic process Effects 0.000 description 6
- 239000001390 capsicum minimum Substances 0.000 description 6
- 230000006801 homologous recombination Effects 0.000 description 6
- 238000002744 homologous recombination Methods 0.000 description 6
- 238000004519 manufacturing process Methods 0.000 description 6
- 241000894007 species Species 0.000 description 6
- 230000010474 transient expression Effects 0.000 description 6
- 101100385358 Alicyclobacillus acidoterrestris (strain ATCC 49025 / DSM 3922 / CIP 106132 / NCIMB 13137 / GD3B) cas12b gene Proteins 0.000 description 5
- 108091093088 Amplicon Proteins 0.000 description 5
- 235000012905 Brassica oleracea var viridis Nutrition 0.000 description 5
- 235000003343 Brassica rupestris Nutrition 0.000 description 5
- 235000010520 Canavalia ensiformis Nutrition 0.000 description 5
- 235000010523 Cicer arietinum Nutrition 0.000 description 5
- 108010008532 Deoxyribonuclease I Proteins 0.000 description 5
- 102000007260 Deoxyribonuclease I Human genes 0.000 description 5
- 235000016761 Piper aduncum Nutrition 0.000 description 5
- 108010020764 Transposases Proteins 0.000 description 5
- 102000008579 Transposases Human genes 0.000 description 5
- 239000012636 effector Substances 0.000 description 5
- 230000002068 genetic effect Effects 0.000 description 5
- 210000004940 nucleus Anatomy 0.000 description 5
- 108090000765 processed proteins & peptides Proteins 0.000 description 5
- 230000006798 recombination Effects 0.000 description 5
- 238000005215 recombination Methods 0.000 description 5
- 230000001131 transforming effect Effects 0.000 description 5
- 235000010777 Arachis hypogaea Nutrition 0.000 description 4
- 235000003351 Brassica cretica Nutrition 0.000 description 4
- 235000011299 Brassica oleracea var botrytis Nutrition 0.000 description 4
- 108020004705 Codon Proteins 0.000 description 4
- 240000001980 Cucurbita pepo Species 0.000 description 4
- 230000007018 DNA scission Effects 0.000 description 4
- 235000010469 Glycine max Nutrition 0.000 description 4
- 239000005562 Glyphosate Substances 0.000 description 4
- 244000299507 Gossypium hirsutum Species 0.000 description 4
- 235000017804 Piper guineense Nutrition 0.000 description 4
- 241000758706 Piperaceae Species 0.000 description 4
- 240000004713 Pisum sativum Species 0.000 description 4
- 235000010582 Pisum sativum Nutrition 0.000 description 4
- 102000014450 RNA Polymerase III Human genes 0.000 description 4
- 108010078067 RNA Polymerase III Proteins 0.000 description 4
- 240000006677 Vicia faba Species 0.000 description 4
- 241000219977 Vigna Species 0.000 description 4
- 108010017070 Zinc Finger Nucleases Proteins 0.000 description 4
- 230000003321 amplification Effects 0.000 description 4
- 238000000137 annealing Methods 0.000 description 4
- 230000002759 chromosomal effect Effects 0.000 description 4
- 230000006870 function Effects 0.000 description 4
- 210000004602 germ cell Anatomy 0.000 description 4
- XDDAORKBJWWYJS-UHFFFAOYSA-N glyphosate Chemical compound OC(=O)CNCP(O)(O)=O XDDAORKBJWWYJS-UHFFFAOYSA-N 0.000 description 4
- 229940097068 glyphosate Drugs 0.000 description 4
- 230000000670 limiting effect Effects 0.000 description 4
- 235000010460 mustard Nutrition 0.000 description 4
- 238000003199 nucleic acid amplification method Methods 0.000 description 4
- 229920001184 polypeptide Polymers 0.000 description 4
- 230000008569 process Effects 0.000 description 4
- 102000004196 processed proteins & peptides Human genes 0.000 description 4
- 239000000523 sample Substances 0.000 description 4
- 239000000126 substance Substances 0.000 description 4
- 238000003786 synthesis reaction Methods 0.000 description 4
- 235000017060 Arachis glabrata Nutrition 0.000 description 3
- 235000018262 Arachis monticola Nutrition 0.000 description 3
- 235000005156 Brassica carinata Nutrition 0.000 description 3
- 244000257790 Brassica carinata Species 0.000 description 3
- 235000017647 Brassica oleracea var italica Nutrition 0.000 description 3
- 235000010149 Brassica rapa subsp chinensis Nutrition 0.000 description 3
- 238000010453 CRISPR/Cas method Methods 0.000 description 3
- 240000004160 Capsicum annuum Species 0.000 description 3
- 235000002568 Capsicum frutescens Nutrition 0.000 description 3
- 241001107116 Castanospermum australe Species 0.000 description 3
- 241000219109 Citrullus Species 0.000 description 3
- 108010051219 Cre recombinase Proteins 0.000 description 3
- 241000219130 Cucurbita pepo subsp. pepo Species 0.000 description 3
- 235000003954 Cucurbita pepo var melopepo Nutrition 0.000 description 3
- 230000033616 DNA repair Effects 0.000 description 3
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 3
- 101150106478 GPS1 gene Proteins 0.000 description 3
- 235000010671 Lathyrus sativus Nutrition 0.000 description 3
- 240000005783 Lathyrus sativus Species 0.000 description 3
- 108091034117 Oligonucleotide Proteins 0.000 description 3
- 239000006002 Pepper Substances 0.000 description 3
- 241000242739 Renilla Species 0.000 description 3
- 240000006394 Sorghum bicolor Species 0.000 description 3
- 235000009337 Spinacia oleracea Nutrition 0.000 description 3
- 244000300264 Spinacia oleracea Species 0.000 description 3
- 241000193996 Streptococcus pyogenes Species 0.000 description 3
- 241000194020 Streptococcus thermophilus Species 0.000 description 3
- 238000010459 TALEN Methods 0.000 description 3
- 108010043645 Transcription Activator-Like Effector Nucleases Proteins 0.000 description 3
- 235000010749 Vicia faba Nutrition 0.000 description 3
- JLCPHMBAVCMARE-UHFFFAOYSA-N (oligonucleotide structure; systematic name and SMILES string omitted) Polymers 0.000 description 3
- 238000003491 array Methods 0.000 description 3
- 230000009286 beneficial effect Effects 0.000 description 3
- QKSKPIVNLNLAAV-UHFFFAOYSA-N bis(2-chloroethyl) sulfide Chemical compound ClCCSCCCl QKSKPIVNLNLAAV-UHFFFAOYSA-N 0.000 description 3
- 235000021279 black bean Nutrition 0.000 description 3
- 230000032823 cell division Effects 0.000 description 3
- 238000006243 chemical reaction Methods 0.000 description 3
- 238000010367 cloning Methods 0.000 description 3
- 230000009615 deamination Effects 0.000 description 3
- 238000006481 deamination reaction Methods 0.000 description 3
- 238000002474 experimental method Methods 0.000 description 3
- 239000004009 herbicide Substances 0.000 description 3
- 230000003301 hydrolyzing effect Effects 0.000 description 3
- 230000001939 inductive effect Effects 0.000 description 3
- 238000002955 isolation Methods 0.000 description 3
- 230000007246 mechanism Effects 0.000 description 3
- 235000020232 peanut Nutrition 0.000 description 3
- 238000002360 preparation method Methods 0.000 description 3
- 235000004252 protein component Nutrition 0.000 description 3
- 230000010076 replication Effects 0.000 description 3
- 230000003362 replicative effect Effects 0.000 description 3
- 101150071322 ruvC gene Proteins 0.000 description 3
- 238000012163 sequencing technique Methods 0.000 description 3
- 230000014616 translation Effects 0.000 description 3
- 108020005065 3' Flanking Region Proteins 0.000 description 2
- 108020005029 5' Flanking Region Proteins 0.000 description 2
- 241000193412 Alicyclobacillus acidoterrestris Species 0.000 description 2
- 244000291564 Allium cepa Species 0.000 description 2
- 235000002732 Allium cepa var. cepa Nutrition 0.000 description 2
- 241001135756 Alphaproteobacteria Species 0.000 description 2
- 241000192542 Anabaena Species 0.000 description 2
- 235000000832 Ayote Nutrition 0.000 description 2
- 241000186000 Bifidobacterium Species 0.000 description 2
- 244000060924 Brassica campestris Species 0.000 description 2
- 235000005637 Brassica campestris Nutrition 0.000 description 2
- 244000012866 Brassica narinosa Species 0.000 description 2
- 235000004862 Brassica narinosa Nutrition 0.000 description 2
- 235000011291 Brassica nigra Nutrition 0.000 description 2
- 244000180419 Brassica nigra Species 0.000 description 2
- 235000011301 Brassica oleracea var capitata Nutrition 0.000 description 2
- 235000004221 Brassica oleracea var gemmifera Nutrition 0.000 description 2
- 235000001169 Brassica oleracea var oleracea Nutrition 0.000 description 2
- 244000308368 Brassica oleracea var. gemmifera Species 0.000 description 2
- 235000008744 Brassica perviridis Nutrition 0.000 description 2
- 244000221633 Brassica rapa subsp chinensis Species 0.000 description 2
- 235000000536 Brassica rapa subsp pekinensis Nutrition 0.000 description 2
- 239000002126 C01EB10 - Adenosine Substances 0.000 description 2
- 244000105627 Cajanus indicus Species 0.000 description 2
- 235000010773 Cajanus indicus Nutrition 0.000 description 2
- 235000010518 Canavalia gladiata Nutrition 0.000 description 2
- 235000015844 Citrullus colocynthis Nutrition 0.000 description 2
- 240000000885 Citrullus colocynthis Species 0.000 description 2
- 235000012828 Citrullus lanatus var citroides Nutrition 0.000 description 2
- 229920000742 Cotton Polymers 0.000 description 2
- 235000009854 Cucurbita moschata Nutrition 0.000 description 2
- 235000009852 Cucurbita pepo Nutrition 0.000 description 2
- 235000009804 Cucurbita pepo subsp pepo Nutrition 0.000 description 2
- 244000007835 Cyamopsis tetragonoloba Species 0.000 description 2
- 238000001712 DNA sequencing Methods 0.000 description 2
- 241001135761 Deltaproteobacteria Species 0.000 description 2
- 241000588724 Escherichia coli Species 0.000 description 2
- 101100260928 Escherichia coli tnsB gene Proteins 0.000 description 2
- 101100260929 Escherichia coli tnsC gene Proteins 0.000 description 2
- 108091029865 Exogenous DNA Proteins 0.000 description 2
- 108700028146 Genetic Enhancer Elements Proteins 0.000 description 2
- 108700039691 Genetic Promoter Regions Proteins 0.000 description 2
- 244000020551 Helianthus annuus Species 0.000 description 2
- 235000003222 Helianthus annuus Nutrition 0.000 description 2
- 235000007340 Hordeum vulgare Nutrition 0.000 description 2
- 241000186660 Lactobacillus Species 0.000 description 2
- 241000254158 Lampyridae Species 0.000 description 2
- 235000007252 Lathyrus tuberosus Nutrition 0.000 description 2
- 244000237786 Lathyrus tuberosus Species 0.000 description 2
- 108091026898 Leader sequence (mRNA) Proteins 0.000 description 2
- 235000014647 Lens culinaris subsp culinaris Nutrition 0.000 description 2
- 244000043158 Lens esculenta Species 0.000 description 2
- 235000010649 Lupinus albus Nutrition 0.000 description 2
- 240000000894 Lupinus albus Species 0.000 description 2
- 235000008755 Lupinus mutabilis Nutrition 0.000 description 2
- 240000005265 Lupinus mutabilis Species 0.000 description 2
- 244000131099 Macrotyloma uniflorum Species 0.000 description 2
- 235000012549 Macrotyloma uniflorum Nutrition 0.000 description 2
- 241001608711 Melo Species 0.000 description 2
- 108700026244 Open Reading Frames Proteins 0.000 description 2
- 235000007164 Oryza sativa Nutrition 0.000 description 2
- 108091081548 Palindromic sequence Proteins 0.000 description 2
- 241000192001 Pediococcus Species 0.000 description 2
- 235000010617 Phaseolus lunatus Nutrition 0.000 description 2
- 244000302909 Piper aduncum Species 0.000 description 2
- 230000004570 RNA-binding Effects 0.000 description 2
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 2
- 241000220259 Raphanus Species 0.000 description 2
- 108010091086 Recombinases Proteins 0.000 description 2
- 102000018120 Recombinases Human genes 0.000 description 2
- 235000002595 Solanum tuberosum Nutrition 0.000 description 2
- 241000194017 Streptococcus Species 0.000 description 2
- 108700026226 TATA Box Proteins 0.000 description 2
- 108091023040 Transcription factor Proteins 0.000 description 2
- 102000040945 Transcription factor Human genes 0.000 description 2
- 108091026822 U6 spliceosomal RNA Proteins 0.000 description 2
- DRTQHJPVMGBUCF-XVFCMESISA-N Uridine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-XVFCMESISA-N 0.000 description 2
- 241001261005 Verrucomicrobia Species 0.000 description 2
- 235000002098 Vicia faba var. major Nutrition 0.000 description 2
- 240000004922 Vigna radiata Species 0.000 description 2
- 235000010721 Vigna radiata var radiata Nutrition 0.000 description 2
- 235000011469 Vigna radiata var sublobata Nutrition 0.000 description 2
- 235000010726 Vigna sinensis Nutrition 0.000 description 2
- 235000011453 Vigna umbellata Nutrition 0.000 description 2
- 240000001417 Vigna umbellata Species 0.000 description 2
- 241000202221 Weissella Species 0.000 description 2
- 210000005006 adaptive immune system Anatomy 0.000 description 2
- 229960005305 adenosine Drugs 0.000 description 2
- 230000004075 alteration Effects 0.000 description 2
- 150000001413 amino acids Chemical group 0.000 description 2
- 238000004458 analytical method Methods 0.000 description 2
- 238000013459 approach Methods 0.000 description 2
- 238000003556 assay Methods 0.000 description 2
- 230000027455 binding Effects 0.000 description 2
- 235000013614 black pepper Nutrition 0.000 description 2
- 230000022131 cell cycle Effects 0.000 description 2
- 239000003795 chemical substances by application Substances 0.000 description 2
- 238000010276 construction Methods 0.000 description 2
- 230000001419 dependent effect Effects 0.000 description 2
- 230000012361 double-strand break repair Effects 0.000 description 2
- 244000013123 dwarf bean Species 0.000 description 2
- 230000002616 endonucleolytic effect Effects 0.000 description 2
- 230000002255 enzymatic effect Effects 0.000 description 2
- 102000054766 genetic haplotypes Human genes 0.000 description 2
- 230000005017 genetic modification Effects 0.000 description 2
- 235000013617 genetically modified food Nutrition 0.000 description 2
- 230000002363 herbicidal effect Effects 0.000 description 2
- 238000000338 in vitro Methods 0.000 description 2
- 238000001727 in vivo Methods 0.000 description 2
- 238000011534 incubation Methods 0.000 description 2
- 230000006698 induction Effects 0.000 description 2
- 238000005304 joining Methods 0.000 description 2
- 229920005610 lignin Polymers 0.000 description 2
- 108020004999 messenger RNA Proteins 0.000 description 2
- 210000000056 organ Anatomy 0.000 description 2
- 230000001717 pathogenic effect Effects 0.000 description 2
- 230000037361 pathway Effects 0.000 description 2
- 230000008121 plant development Effects 0.000 description 2
- 230000023603 positive regulation of transcription initiation, DNA-dependent Effects 0.000 description 2
- 238000012545 processing Methods 0.000 description 2
- 235000015136 pumpkin Nutrition 0.000 description 2
- 230000002829 reductive effect Effects 0.000 description 2
- 230000004044 response Effects 0.000 description 2
- 108091092562 ribozyme Proteins 0.000 description 2
- YGSDEFSMJLZEOE-UHFFFAOYSA-N salicylic acid Chemical compound OC(=O)C1=CC=CC=C1O YGSDEFSMJLZEOE-UHFFFAOYSA-N 0.000 description 2
- 238000002864 sequence alignment Methods 0.000 description 2
- 239000000758 substrate Substances 0.000 description 2
- 238000011426 transformation method Methods 0.000 description 2
- 238000013519 translation Methods 0.000 description 2
- 238000011144 upstream manufacturing Methods 0.000 description 2
- 230000003612 virological effect Effects 0.000 description 2
- 230000000007 visual effect Effects 0.000 description 2
- UHDGCWIWMRVCDJ-UHFFFAOYSA-N 1-beta-D-Xylofuranosyl-NH-Cytosine Natural products O=C1N=C(N)C=CN1C1C(O)C(O)C(CO)O1 UHDGCWIWMRVCDJ-UHFFFAOYSA-N 0.000 description 1
- VGONTNSXDCQUGY-RRKCRQDMSA-N 2'-deoxyinosine Chemical compound C1[C@H](O)[C@@H](CO)O[C@H]1N1C(N=CNC2=O)=C2N=C1 VGONTNSXDCQUGY-RRKCRQDMSA-N 0.000 description 1
- MXHRCPNRJAMMIM-SHYZEUOFSA-N 2'-deoxyuridine Chemical compound C1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C=C1 MXHRCPNRJAMMIM-SHYZEUOFSA-N 0.000 description 1
- CKTSBUTUHBMZGZ-SHYZEUOFSA-N 2'-deoxycytidine Chemical compound O=C1N=C(N)C=CN1[C@@H]1O[C@H](CO)[C@@H](O)C1 CKTSBUTUHBMZGZ-SHYZEUOFSA-N 0.000 description 1
- 108020005345 3' Untranslated Regions Proteins 0.000 description 1
- 241001133760 Acoelorraphe Species 0.000 description 1
- 102100038740 Activator of RNA decay Human genes 0.000 description 1
- 229930024421 Adenine Natural products 0.000 description 1
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 1
- 108010052875 Adenine deaminase Proteins 0.000 description 1
- 240000007241 Agrostis stolonifera Species 0.000 description 1
- 229920000856 Amylose Polymers 0.000 description 1
- 241000192537 Anabaena cylindrica Species 0.000 description 1
- 241000219195 Arabidopsis thaliana Species 0.000 description 1
- 241000193830 Bacillus <bacterium> Species 0.000 description 1
- 241000194110 Bacillus sp. (in: Bacteria) Species 0.000 description 1
- 101100381862 Bacillus subtilis (strain 168) bmr3 gene Proteins 0.000 description 1
- 241000589171 Bradyrhizobium sp. Species 0.000 description 1
- 235000011303 Brassica alboglabra Nutrition 0.000 description 1
- 235000011457 Brassica balearica Nutrition 0.000 description 1
- 240000008687 Brassica balearica Species 0.000 description 1
- 235000014698 Brassica juncea var multisecta Nutrition 0.000 description 1
- 235000014750 Brassica kaber Nutrition 0.000 description 1
- 244000178924 Brassica napobrassica Species 0.000 description 1
- 235000006008 Brassica napus var napus Nutrition 0.000 description 1
- 235000011302 Brassica oleracea Nutrition 0.000 description 1
- 244000304217 Brassica oleracea var. gongylodes Species 0.000 description 1
- 244000233513 Brassica perviridis Species 0.000 description 1
- 241000499439 Brassica rapa subsp. rapa Species 0.000 description 1
- 235000010570 Brassica rapa var. rapa Nutrition 0.000 description 1
- 241000982104 Brassica rupestris Species 0.000 description 1
- 235000000883 Brassica tournefortii Nutrition 0.000 description 1
- 241000219193 Brassicaceae Species 0.000 description 1
- 101150018129 CSF2 gene Proteins 0.000 description 1
- 240000003049 Canavalia gladiata Species 0.000 description 1
- 241001297358 Candidatus Kerfeldbacteria Species 0.000 description 1
- 241001297364 Candidatus Komeilibacteria Species 0.000 description 1
- 241000243205 Candidatus Parcubacteria Species 0.000 description 1
- 241001297304 Candidatus Vogelbacteria Species 0.000 description 1
- 235000002567 Capsicum annuum Nutrition 0.000 description 1
- 235000002283 Capsicum annuum var aviculare Nutrition 0.000 description 1
- 240000008384 Capsicum annuum var. annuum Species 0.000 description 1
- 235000013303 Capsicum annuum var. frutescens Nutrition 0.000 description 1
- 235000007862 Capsicum baccatum Nutrition 0.000 description 1
- 240000001844 Capsicum baccatum Species 0.000 description 1
- 235000002284 Capsicum baccatum var baccatum Nutrition 0.000 description 1
- 235000018306 Capsicum chinense Nutrition 0.000 description 1
- 244000185501 Capsicum chinense Species 0.000 description 1
- 235000015855 Capsicum pubescens Nutrition 0.000 description 1
- 240000000533 Capsicum pubescens Species 0.000 description 1
- 235000003255 Carthamus tinctorius Nutrition 0.000 description 1
- 244000020518 Carthamus tinctorius Species 0.000 description 1
- 108090000994 Catalytic RNA Proteins 0.000 description 1
- 102000053642 Catalytic RNA Human genes 0.000 description 1
- 108020002739 Catechol O-methyltransferase Proteins 0.000 description 1
- 102100040999 Catechol O-methyltransferase Human genes 0.000 description 1
- 244000241235 Citrullus lanatus Species 0.000 description 1
- 235000009831 Citrullus lanatus Nutrition 0.000 description 1
- 241001478240 Coccus Species 0.000 description 1
- 108700010070 Codon Usage Proteins 0.000 description 1
- 101100329224 Coprinopsis cinerea (strain Okayama-7 / 130 / ATCC MYA-4618 / FGSC 9003) cpf1 gene Proteins 0.000 description 1
- 235000002361 Crambe hispanica Nutrition 0.000 description 1
- 235000005983 Crescentia cujete Nutrition 0.000 description 1
- 244000241257 Cucumis melo Species 0.000 description 1
- 235000009847 Cucumis melo var cantalupensis Nutrition 0.000 description 1
- 244000308746 Cucumis metuliferus Species 0.000 description 1
- 235000013554 Cucumis metuliferus Nutrition 0.000 description 1
- 235000009849 Cucumis sativus Nutrition 0.000 description 1
- 235000010799 Cucumis sativus var sativus Nutrition 0.000 description 1
- 241000219104 Cucurbitaceae Species 0.000 description 1
- 241000192700 Cyanobacteria Species 0.000 description 1
- UHDGCWIWMRVCDJ-PSQAKQOGSA-N Cytidine Natural products O=C1N=C(N)C=CN1[C@@H]1[C@@H](O)[C@@H](O)[C@H](CO)O1 UHDGCWIWMRVCDJ-PSQAKQOGSA-N 0.000 description 1
- 102100028717 Cytosolic 5'-nucleotidase 3A Human genes 0.000 description 1
- DSLZVSRJTYRBFB-LLEIAEIESA-N D-glucaric acid Chemical compound OC(=O)[C@@H](O)[C@@H](O)[C@H](O)[C@@H](O)C(O)=O DSLZVSRJTYRBFB-LLEIAEIESA-N 0.000 description 1
- 102000012410 DNA Ligases Human genes 0.000 description 1
- 108010061982 DNA Ligases Proteins 0.000 description 1
- 230000006820 DNA synthesis Effects 0.000 description 1
- 102000052510 DNA-Binding Proteins Human genes 0.000 description 1
- 230000004568 DNA-binding Effects 0.000 description 1
- 101710096438 DNA-binding protein Proteins 0.000 description 1
- 240000004585 Dactylis glomerata Species 0.000 description 1
- 241000252212 Danio rerio Species 0.000 description 1
- 235000002767 Daucus carota Nutrition 0.000 description 1
- 235000002243 Daucus carota subsp sativus Nutrition 0.000 description 1
- 241001338022 Daucus carota subsp. sativus Species 0.000 description 1
- CKTSBUTUHBMZGZ-UHFFFAOYSA-N Deoxycytidine Natural products O=C1N=C(N)C=CN1C1OC(CO)C(O)C1 CKTSBUTUHBMZGZ-UHFFFAOYSA-N 0.000 description 1
- 235000002673 Dioscorea communis Nutrition 0.000 description 1
- 241000544230 Dioscorea communis Species 0.000 description 1
- 241000255581 Drosophila <fruit fly, genus> Species 0.000 description 1
- 206010059866 Drug resistance Diseases 0.000 description 1
- 102000004190 Enzymes Human genes 0.000 description 1
- 108090000790 Enzymes Proteins 0.000 description 1
- 241000206602 Eukaryota Species 0.000 description 1
- 101710116650 FAD-dependent monooxygenase Proteins 0.000 description 1
- 108090000331 Firefly luciferases Proteins 0.000 description 1
- 241000589601 Francisella Species 0.000 description 1
- 241000589599 Francisella tularensis subsp. novicida Species 0.000 description 1
- 101000860092 Francisella tularensis subsp. novicida (strain U112) CRISPR-associated endonuclease Cas12a Proteins 0.000 description 1
- 230000010558 Gene Alterations Effects 0.000 description 1
- 235000009432 Gossypium hirsutum Nutrition 0.000 description 1
- 241000238631 Hexapoda Species 0.000 description 1
- 229930010555 Inosine Natural products 0.000 description 1
- UGQMRVRMYYASKQ-KQYNXXCUSA-N Inosine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C2=NC=NC(O)=C2N=C1 UGQMRVRMYYASKQ-KQYNXXCUSA-N 0.000 description 1
- 108091092195 Intron Proteins 0.000 description 1
- 108010025815 Kanamycin Kinase Proteins 0.000 description 1
- 241000256560 Kandleria Species 0.000 description 1
- 241000904817 Lachnospiraceae bacterium Species 0.000 description 1
- 241000219136 Lagenaria Species 0.000 description 1
- 240000007741 Lagenaria siceraria Species 0.000 description 1
- 235000009797 Lagenaria vulgaris Nutrition 0.000 description 1
- 241000192132 Leuconostoc Species 0.000 description 1
- 108060001084 Luciferase Proteins 0.000 description 1
- 241000219745 Lupinus Species 0.000 description 1
- 235000007688 Lycopersicon esculentum Nutrition 0.000 description 1
- 235000010624 Medicago sativa Nutrition 0.000 description 1
- 235000017587 Medicago sativa ssp. sativa Nutrition 0.000 description 1
- 108060004795 Methyltransferase Proteins 0.000 description 1
- 241000091577 Mexicana Species 0.000 description 1
- 108020005196 Mitochondrial DNA Proteins 0.000 description 1
- 241000122904 Mucuna Species 0.000 description 1
- 235000006161 Mucuna pruriens Nutrition 0.000 description 1
- 244000111261 Mucuna pruriens Species 0.000 description 1
- 101100219625 Mus musculus Casd1 gene Proteins 0.000 description 1
- 101100494762 Mus musculus Nedd9 gene Proteins 0.000 description 1
- 240000005561 Musa balbisiana Species 0.000 description 1
- 235000018290 Musa x paradisiaca Nutrition 0.000 description 1
- 241000244206 Nematoda Species 0.000 description 1
- 101100385413 Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) csm-3 gene Proteins 0.000 description 1
- 108010077850 Nuclear Localization Signals Proteins 0.000 description 1
- 101710128228 O-methyltransferase Proteins 0.000 description 1
- 240000007817 Olea europaea Species 0.000 description 1
- 229910019142 PO4 Inorganic materials 0.000 description 1
- 208000035753 Periorbital contusion Diseases 0.000 description 1
- 235000007848 Phaseolus acutifolius Nutrition 0.000 description 1
- 240000001956 Phaseolus acutifolius Species 0.000 description 1
- 235000010632 Phaseolus coccineus Nutrition 0.000 description 1
- 241000168681 Phaseolus dumosus Species 0.000 description 1
- 244000100170 Phaseolus lunatus Species 0.000 description 1
- 244000042209 Phaseolus multiflorus Species 0.000 description 1
- 235000008331 Pinus X rigitaeda Nutrition 0.000 description 1
- 235000011613 Pinus brutia Nutrition 0.000 description 1
- 241000018646 Pinus brutia Species 0.000 description 1
- 241000671722 Piper borbonense Species 0.000 description 1
- 235000002711 Piper cubeba Nutrition 0.000 description 1
- 240000003731 Piper cubeba Species 0.000 description 1
- 108700001094 Plant Genes Proteins 0.000 description 1
- 241000209049 Poa pratensis Species 0.000 description 1
- 241000605861 Prevotella Species 0.000 description 1
- 241000588769 Proteus <enterobacteria> Species 0.000 description 1
- 241000308169 Pseudocladosporium Species 0.000 description 1
- 241001523344 Pseudoroegneria Species 0.000 description 1
- 235000010580 Psophocarpus tetragonolobus Nutrition 0.000 description 1
- 244000046095 Psophocarpus tetragonolobus Species 0.000 description 1
- 244000184734 Pyrus japonica Species 0.000 description 1
- 102000017143 RNA Polymerase I Human genes 0.000 description 1
- 108010013845 RNA Polymerase I Proteins 0.000 description 1
- 102000009572 RNA Polymerase II Human genes 0.000 description 1
- 108010009460 RNA Polymerase II Proteins 0.000 description 1
- 244000286177 Raphanus raphanistrum Species 0.000 description 1
- 235000000241 Raphanus raphanistrum Nutrition 0.000 description 1
- 235000006140 Raphanus sativus var sativus Nutrition 0.000 description 1
- 108010052090 Renilla Luciferases Proteins 0.000 description 1
- 241000589180 Rhizobium Species 0.000 description 1
- 108090000621 Ribonuclease P Proteins 0.000 description 1
- 102000004167 Ribonuclease P Human genes 0.000 description 1
- 108010083644 Ribonucleases Proteins 0.000 description 1
- 102000006382 Ribonucleases Human genes 0.000 description 1
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 1
- 235000007201 Saccharum officinarum Nutrition 0.000 description 1
- 241000746444 Saccharum sp. Species 0.000 description 1
- 240000009132 Sagittaria sagittifolia Species 0.000 description 1
- 241001478233 Scytonema hofmannii Species 0.000 description 1
- 241001247145 Sebastes goodei Species 0.000 description 1
- 235000002560 Solanum lycopersicum Nutrition 0.000 description 1
- 235000007230 Sorghum bicolor Nutrition 0.000 description 1
- 235000011684 Sorghum saccharatum Nutrition 0.000 description 1
- 108091081024 Start codon Proteins 0.000 description 1
- 241000044578 Stenotaphrum secundatum Species 0.000 description 1
- 108091027544 Subgenomic mRNA Proteins 0.000 description 1
- 244000186561 Swietenia macrophylla Species 0.000 description 1
- 101100168692 Thermoproteus tenax (strain ATCC 35583 / DSM 2078 / JCM 9277 / NBRC 100435 / Kra 1) cas3' gene Proteins 0.000 description 1
- 235000021307 Triticum Nutrition 0.000 description 1
- 108090000848 Ubiquitin Proteins 0.000 description 1
- 108091023045 Untranslated Region Proteins 0.000 description 1
- 235000002096 Vicia faba var. equina Nutrition 0.000 description 1
- 235000010725 Vigna aconitifolia Nutrition 0.000 description 1
- 244000042325 Vigna aconitifolia Species 0.000 description 1
- 235000010500 Vigna catjang Nutrition 0.000 description 1
- 235000010722 Vigna unguiculata Nutrition 0.000 description 1
- 244000240068 Vigna unguiculata ssp sesquipedalis Species 0.000 description 1
- 235000005755 Vigna unguiculata ssp. sesquipedalis Nutrition 0.000 description 1
- 206010052428 Wound Diseases 0.000 description 1
- 208000027418 Wounds and injury Diseases 0.000 description 1
- 241000209149 Zea Species 0.000 description 1
- 235000011902 Zea mays var everta Nutrition 0.000 description 1
- 244000171502 Zea mays var. everta Species 0.000 description 1
- 230000036579 abiotic stress Effects 0.000 description 1
- 239000002253 acid Substances 0.000 description 1
- 229960000643 adenine Drugs 0.000 description 1
- 230000009418 agronomic effect Effects 0.000 description 1
- 239000003242 anti bacterial agent Substances 0.000 description 1
- 229940088710 antibiotic agent Drugs 0.000 description 1
- DRTQHJPVMGBUCF-PSQAKQOGSA-N beta-L-uridine Natural products O[C@H]1[C@@H](O)[C@H](CO)O[C@@H]1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-PSQAKQOGSA-N 0.000 description 1
- 239000012472 biological sample Substances 0.000 description 1
- 230000004790 biotic stress Effects 0.000 description 1
- 238000005251 capillar electrophoresis Methods 0.000 description 1
- 239000001511 capsicum annuum Substances 0.000 description 1
- 239000001728 capsicum frutescens Substances 0.000 description 1
- 235000021466 carotenoid Nutrition 0.000 description 1
- 150000001747 carotenoids Chemical class 0.000 description 1
- 101150098304 cas13a gene Proteins 0.000 description 1
- 101150055766 cat gene Proteins 0.000 description 1
- 210000000170 cell membrane Anatomy 0.000 description 1
- 230000010307 cell transformation Effects 0.000 description 1
- 230000001413 cellular effect Effects 0.000 description 1
- 238000012512 characterization method Methods 0.000 description 1
- 239000003153 chemical reaction reagent Substances 0.000 description 1
- 210000003763 chloroplast Anatomy 0.000 description 1
- 239000013611 chromosomal DNA Substances 0.000 description 1
- 230000027288 circadian rhythm Effects 0.000 description 1
- 150000001875 compounds Chemical class 0.000 description 1
- 101150055601 cops2 gene Proteins 0.000 description 1
- 235000011655 cotton Nutrition 0.000 description 1
- 244000038559 crop plants Species 0.000 description 1
- UHDGCWIWMRVCDJ-ZAKLUEHWSA-N cytidine Chemical compound O=C1N=C(N)C=CN1[C@H]1[C@H](O)[C@@H](O)[C@H](CO)O1 UHDGCWIWMRVCDJ-ZAKLUEHWSA-N 0.000 description 1
- 230000009089 cytolysis Effects 0.000 description 1
- VGONTNSXDCQUGY-UHFFFAOYSA-N desoxyinosine Natural products C1C(O)C(CO)OC1N1C(NC=NC2=O)=C2N=C1 VGONTNSXDCQUGY-UHFFFAOYSA-N 0.000 description 1
- MXHRCPNRJAMMIM-UHFFFAOYSA-N desoxyuridine Natural products C1C(O)C(CO)OC1N1C(=O)NC(=O)C=C1 MXHRCPNRJAMMIM-UHFFFAOYSA-N 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 235000019621 digestibility Nutrition 0.000 description 1
- 230000009977 dual effect Effects 0.000 description 1
- 238000004520 electroporation Methods 0.000 description 1
- 210000002257 embryonic structure Anatomy 0.000 description 1
- 239000003623 enhancer Substances 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 230000001747 exhibiting effect Effects 0.000 description 1
- 238000013467 fragmentation Methods 0.000 description 1
- 238000006062 fragmentation reaction Methods 0.000 description 1
- 231100000221 frame shift mutation induction Toxicity 0.000 description 1
- 230000037433 frameshift Effects 0.000 description 1
- 108020001507 fusion proteins Proteins 0.000 description 1
- 102000037865 fusion proteins Human genes 0.000 description 1
- 230000004077 genetic alteration Effects 0.000 description 1
- 231100000118 genetic alteration Toxicity 0.000 description 1
- 231100000024 genotoxic Toxicity 0.000 description 1
- 230000001738 genotoxic effect Effects 0.000 description 1
- 235000021384 green leafy vegetables Nutrition 0.000 description 1
- 230000003054 hormonal effect Effects 0.000 description 1
- 238000009396 hybridization Methods 0.000 description 1
- 230000001771 impaired effect Effects 0.000 description 1
- 238000010348 incorporation Methods 0.000 description 1
- 239000000411 inducer Substances 0.000 description 1
- 239000004615 ingredient Substances 0.000 description 1
- 230000000977 initiatory effect Effects 0.000 description 1
- 229960003786 inosine Drugs 0.000 description 1
- ZNJFBWYDHIGLCU-HWKXXFMVSA-N jasmonic acid Chemical compound CC\C=C/C[C@@H]1[C@@H](CC(O)=O)CCC1=O ZNJFBWYDHIGLCU-HWKXXFMVSA-N 0.000 description 1
- 229960000318 kanamycin Drugs 0.000 description 1
- 229930027917 kanamycin Natural products 0.000 description 1
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 1
- 229930182823 kanamycin A Natural products 0.000 description 1
- 229940039696 lactobacillus Drugs 0.000 description 1
- 230000033001 locomotion Effects 0.000 description 1
- 108010026228 mRNA guanylyltransferase Proteins 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 230000021121 meiosis Effects 0.000 description 1
- 230000004060 metabolic process Effects 0.000 description 1
- 210000003470 mitochondria Anatomy 0.000 description 1
- 125000000740 n-pentyl group Chemical group [H]C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])* 0.000 description 1
- 239000002105 nanoparticle Substances 0.000 description 1
- 238000010606 normalization Methods 0.000 description 1
- 239000001053 orange pigment Substances 0.000 description 1
- 210000003463 organelle Anatomy 0.000 description 1
- FJKROLUGYXJWQN-UHFFFAOYSA-N papa-hydroxy-benzoic acid Natural products OC(=O)C1=CC=C(O)C=C1 FJKROLUGYXJWQN-UHFFFAOYSA-N 0.000 description 1
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 1
- 239000010452 phosphate Substances 0.000 description 1
- 230000019612 pigmentation Effects 0.000 description 1
- 239000001931 piper nigrum l. white Substances 0.000 description 1
- 239000003375 plant hormone Substances 0.000 description 1
- 210000002706 plastid Anatomy 0.000 description 1
- 230000004952 protein activity Effects 0.000 description 1
- 238000000746 purification Methods 0.000 description 1
- 238000010188 recombinant method Methods 0.000 description 1
- 230000008929 regeneration Effects 0.000 description 1
- 238000011069 regeneration method Methods 0.000 description 1
- 230000000717 retained effect Effects 0.000 description 1
- 230000002441 reversible effect Effects 0.000 description 1
- 108090000589 ribonuclease E Proteins 0.000 description 1
- 235000009566 rice Nutrition 0.000 description 1
- 229960004889 salicylic acid Drugs 0.000 description 1
- 238000012216 screening Methods 0.000 description 1
- 239000004460 silage Substances 0.000 description 1
- 244000240103 southern pea Species 0.000 description 1
- 238000011895 specific detection Methods 0.000 description 1
- 230000010473 stable expression Effects 0.000 description 1
- 238000010561 standard procedure Methods 0.000 description 1
- 230000035882 stress Effects 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 230000001052 transient effect Effects 0.000 description 1
- 230000014621 translational initiation Effects 0.000 description 1
- 230000017105 transposition Effects 0.000 description 1
- 230000013819 transposition, DNA-mediated Effects 0.000 description 1
- DRTQHJPVMGBUCF-UHFFFAOYSA-N uracil arabinoside Natural products OC1C(O)C(CO)OC1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-UHFFFAOYSA-N 0.000 description 1
- 229940045145 uridine Drugs 0.000 description 1
- 235000013311 vegetables Nutrition 0.000 description 1
- 210000005253 yeast cell Anatomy 0.000 description 1
Landscapes
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
Abstract
The present application provides novel synthetic small nuclear RNA (snRNA) promoters that are useful for CRISPR-mediated targeted gene modification in plants. The application also provides methods and compositions in which an snRNA promoter drives the expression of RNA and non-coding RNA for the development of plants and plant cells comprising a modified genome.
Description
Cross-Reference to Related Applications
The present application claims the benefit of U.S. Provisional Application No. 63/182,288, filed in April 2021, and U.S. Provisional Application No. 63/295,061, filed in December 2021, both of which are incorporated herein by reference in their entirety.
Incorporation of the Sequence Listing
The sequence listing contained in the file named "MONS492WO-sequence_listing", which is 37 kilobytes in size (as measured on a Microsoft system) and dated April 28, 2022, is submitted herewith by electronic submission and is incorporated herein by reference.
Technical Field
The present disclosure relates to the field of biotechnology. More specifically, the present disclosure provides novel synthetic plant promoters useful for expressing non-protein-coding small RNAs, e.g., for CRISPR-mediated genome modification.
Background
Site-specific recombination has potential for application in a wide range of biotechnology-related fields. Meganucleases, zinc finger nucleases (ZFNs), and transcription activator-like effector nucleases (TALENs), which contain a DNA-binding domain and a DNA-cleavage domain, are capable of genome modification. While meganucleases, ZFNs, and TALENs are efficient and specific, these techniques require one or more components to be generated by protein engineering for each genomic locus selected for modification. Advances in the use of clustered regularly interspaced short palindromic repeats (CRISPR) have demonstrated a genome modification approach with the advantage of rapid engineering.
Clustered regularly interspaced short palindromic repeats (CRISPR) systems constitute an adaptive immune system in prokaryotes that targets invading phages for endonucleolytic cleavage. The system consists of a protein component (Cas) and a guide RNA (gRNA) that targets the Cas protein to a specific locus for endonucleolytic cleavage. This system has been successfully engineered to target specific loci for endonucleolytic cleavage in mammalian, zebrafish, Drosophila, nematode, bacterial, yeast, and plant genomes.
Preferably, the DNA sequence encoding the guide RNA is transcribed by RNA polymerase III, which transcribes small nuclear RNA (snRNA). Natural promoters (such as the U6 snRNA promoter) are commonly used to drive expression of gRNAs. Multiplex targeting experiments typically rely on the same promoter driving each gRNA. When a plasmid comprising a plurality of U6/gRNA cassettes is cloned or maintained, this can lead to technical problems such as recombination events or deletions caused by sequence redundancy among the cassettes. Having multiple snRNA promoters with different DNA sequences would help alleviate this technical problem. Thus, the inventors herein disclose novel synthetic snRNA promoters that have little sequence homology to known native U6 snRNA promoters and to each other. These novel synthetic snRNA promoters are capable of driving expression of RNA polymerase III transcripts (such as gRNAs) in plant cells.
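The sequence-redundancy concern described above can be illustrated with a minimal sketch. It assumes promoter sequences are available as plain strings and flags pairs that share a long identical run. The function names, the toy sequences, and the 5-nt cutoff are hypothetical and are not taken from the application, and the check covers only direct repeats on one strand (reverse-complement repeats are not considered).

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import Dict, List, Tuple


def longest_shared_run(seq_a: str, seq_b: str) -> int:
    """Length of the longest identical contiguous stretch shared by two sequences."""
    a, b = seq_a.upper(), seq_b.upper()
    # autojunk=False so that frequent bases are not discarded as "junk".
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return match.size


def redundant_pairs(promoters: Dict[str, str], max_run: int) -> List[Tuple[str, str, int]]:
    """Return (name_a, name_b, run_length) for every pair sharing a run longer than max_run."""
    flagged = []
    for (name_a, seq_a), (name_b, seq_b) in combinations(promoters.items(), 2):
        run = longest_shared_run(seq_a, seq_b)
        if run > max_run:
            flagged.append((name_a, name_b, run))
    return flagged


# Toy sequences for illustration only; these are NOT SEQ ID NOs 1-10.
TOY_PROMOTERS = {
    "promoter_A": "ACGTACGTTTGACC",
    "promoter_A_copy": "ACGTACGTTTGACC",  # same promoter reused in a second cassette
    "promoter_B": "GGCATTCAGGAT",
}

if __name__ == "__main__":
    for name_a, name_b, run in redundant_pairs(TOY_PROMOTERS, max_run=5):
        print(f"{name_a} and {name_b} share an identical run of {run} nt")
```

In this toy design only the reused promoter pair is flagged, which mirrors the motivation for driving each gRNA from a promoter with a different sequence. A realistic cutoff would depend on the host and cloning strategy and is not specified here.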
Disclosure of Invention
In one aspect, the invention provides a synthetic small nuclear RNA (snRNA) promoter comprising a DNA sequence selected from the group consisting of: (a) a sequence having at least 85% sequence identity to any one of SEQ ID NOs 1-10; (b) a sequence comprising any one of SEQ ID NOs 1-10; and (c) a fragment of any one of SEQ ID NOs 1-10. In one embodiment, the synthetic snRNA promoter sequence has at least 90% sequence identity to the DNA sequence of any one of SEQ ID NOs 1-10. In another embodiment, the synthetic snRNA promoter sequence has at least 95% sequence identity to the DNA sequence of any one of SEQ ID NOs 1-10. In yet another embodiment, the synthetic snRNA promoter fragment has gene regulatory activity.
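The identity thresholds recited above (85%, 90%, 95%) can be made concrete with a small sketch. It assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it assumes identity is computed as identical columns divided by total alignment columns; this passage does not fix that convention, so the denominator is an assumption, as are the function names and toy sequences.

```python
def percent_identity(aligned_ref: str, aligned_query: str) -> float:
    """Percent of alignment columns in which both sequences carry the same base.

    Assumption: inputs are pre-aligned, equal-length strings with '-' for gaps,
    and the denominator is the full alignment length.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_ref:
        raise ValueError("aligned sequences must not be empty")
    ref, query = aligned_ref.upper(), aligned_query.upper()
    # A column counts as a match only if the bases agree and it is not a gap.
    matches = sum(1 for r, q in zip(ref, query) if r == q and r != "-")
    return 100.0 * matches / len(ref)


def meets_identity_threshold(aligned_ref: str, aligned_query: str,
                             threshold: float = 85.0) -> bool:
    """True if the aligned pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_ref, aligned_query) >= threshold


if __name__ == "__main__":
    # Toy 10-column alignment with a single mismatch in the last column.
    ref = "ACGTACGTAC"
    query = "ACGTACGTAA"
    print(percent_identity(ref, query))
    print(meets_identity_threshold(ref, query, threshold=95.0))
```

Other conventions (for example, dividing by the length of the reference sequence rather than the alignment) give slightly different values near a threshold, so the convention in use should be stated whenever a candidate sequence is assessed.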
Another aspect of the invention provides a recombinant DNA construct comprising a synthetic snRNA promoter operably linked to a DNA sequence encoding one or more guide RNAs (gRNAs), wherein the sequence of the synthetic snRNA promoter is selected from the group consisting of: (a) a sequence having at least 85% sequence identity to any one of SEQ ID NOs 1-10; (b) a sequence comprising any one of SEQ ID NOs 1-10; and (c) a fragment of any one of SEQ ID NOs 1-10, wherein the synthetic snRNA promoter is capable of expressing a gRNA. In some embodiments, the recombinant DNA construct further comprises a transcription termination sequence. In some embodiments, the recombinant DNA construct may further comprise a DNA sequence encoding a promoter operably linked to a DNA sequence encoding a clustered regularly interspaced short palindromic repeats (CRISPR)-associated protein. In some embodiments, the CRISPR-associated protein is selected from a type I CRISPR-associated protein, a type II CRISPR-associated protein, a type III CRISPR-associated protein, a type IV CRISPR-associated protein, a type V CRISPR-associated protein, or a type VI CRISPR-associated protein. In some embodiments, the CRISPR-associated protein is a synthetic CRISPR-associated protein. In certain embodiments of the recombinant DNA construct, the nucleotide sequence encoding a CRISPR-associated protein may be further operably linked to at least one nuclear localization sequence (NLS).
Furthermore, in certain embodiments of the contemplated recombinant DNA construct, the CRISPR-associated protein is selected from the group consisting of: Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9 (also referred to as Csn1 and Csx12), Cas10, Cas12a (also referred to as Cpf1), Cas12b, Cas12d, Csy1, Csy2, Csy3, Cse1, Cse2, Csc1, Csc2, Csa5, Csn2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csx1, Csx15, Csf1, Csf2, Csf3, Csf4, CasX, CasY, and Mad7. In certain embodiments, the construct comprises flanking left and right homology arms (HA), each about 2 bp to about 1200 bp in length. In a particular embodiment, the homology arms are about 230 bp to about 1003 bp in length.
Another aspect of the invention provides a recombinant DNA construct comprising a first synthetic snRNA promoter operably linked to a DNA sequence encoding one or more guide RNAs (gRNAs), and a second synthetic snRNA promoter operably linked to a DNA sequence encoding one or more gRNAs, wherein the sequences of the first and second synthetic snRNA promoters are independently selected from the group consisting of: (a) a sequence having at least 85% sequence identity to any one of SEQ ID NOs 1-10; (b) a sequence comprising any one of SEQ ID NOs 1-10; and (c) a fragment of any one of SEQ ID NOs 1-10, wherein said fragment is capable of expressing a gRNA. In certain embodiments, the first synthetic snRNA promoter is different from the second synthetic snRNA promoter. In certain embodiments, the sequence encoding one or more gRNAs expressed by the first synthetic snRNA promoter is different from the sequence encoding one or more gRNAs expressed by the second synthetic snRNA promoter. In some embodiments, the sequence encoding the gRNA further comprises a sequence encoding one or more tRNAs as described in WO/2016/061481, which is incorporated herein by reference in its entirety. In certain embodiments, the construct comprises flanking left and right homology arms (HA), each about 2 bp to about 1200 bp in length. In a particular embodiment, the homology arms are about 230 bp to about 1003 bp in length. In some embodiments, the recombinant DNA construct further comprises a transcription termination sequence. In some embodiments, the recombinant DNA construct may further comprise a DNA sequence encoding a promoter operably linked to a DNA sequence encoding a clustered regularly interspaced short palindromic repeats (CRISPR)-associated protein.
In some embodiments, the CRISPR-associated protein is from a type I CRISPR-Cas system, a type II CRISPR-Cas system, a type III CRISPR-Cas system, a type IV CRISPR-Cas system, a type V CRISPR-Cas system, or a type VI CRISPR-Cas system. In some embodiments, the CRISPR-associated protein is a synthetic CRISPR-associated protein. In certain embodiments of the recombinant DNA construct, the nucleotide sequence encoding a CRISPR-associated protein may be further operably linked to at least one nuclear localization sequence (NLS). Furthermore, in certain embodiments of the contemplated recombinant DNA construct, the CRISPR-associated protein is selected from the group consisting of: Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9 (also referred to as Csn1 and Csx12), Cas10, Cas12a (also referred to as Cpf1), Cas12b, Cas12d, Csy1, Csy2, Csy3, Cse1, Cse2, Csc1, Csc2, Csa5, Csn2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csx1, Csx15, Csf1, Csf2, Csf3, Csf4, CasX, CasY, and Mad7.
Another aspect of the invention provides a recombinant DNA construct comprising a synthetic snRNA promoter operably linked to a sequence encoding a non-coding RNA, wherein the sequence of the synthetic snRNA promoter is selected from the group consisting of: (a) a sequence having at least 85% sequence identity to any one of SEQ ID NOs 1-10; (b) a sequence comprising any one of SEQ ID NOs 1-10; and (c) a fragment of any one of SEQ ID NOs 1-10, wherein the fragment has gene regulatory activity. In some embodiments, the non-coding RNA is selected from the group consisting of: guide RNAs (gRNAs), microRNAs (miRNAs), miRNA precursors, mature miRNAs, decoy miRNAs (as described in WO 2010/002984, incorporated herein by reference), small interfering RNAs (siRNAs), small RNAs (22-26 nt in length) and precursors encoding the same, heterochromatic siRNAs (hc-siRNAs), Piwi-interacting RNAs (piRNAs), hairpin double-stranded RNAs (hairpin dsRNAs), trans-acting siRNAs (ta-siRNAs), and naturally occurring antisense siRNAs (nat-siRNAs). In some embodiments, the recombinant DNA construct comprises a synthetic snRNA promoter operably linked to sequences encoding two or more non-coding RNAs. In some embodiments, the sequences encoding two or more non-coding RNAs further comprise sequences encoding one or more tRNAs.
Still another aspect of the invention includes a recombinant DNA construct comprising: a) a first synthetic snRNA promoter selected from the group consisting of: (a) a sequence having at least 85% sequence identity to any one of SEQ ID NOs 1-10; (b) a sequence comprising any one of SEQ ID NOs 1-10; and (c) a fragment of any one of SEQ ID NOs 1-10, wherein said fragment has gene regulatory activity and said first synthetic snRNA promoter is operably linked to a DNA sequence encoding a non-coding RNA; and b) a second synthetic snRNA promoter selected from the group consisting of: (a) a sequence having at least 85% sequence identity to any one of SEQ ID NOs 1-10; (b) a sequence comprising any one of SEQ ID NOs 1-10; and (c) a fragment of any one of SEQ ID NOs 1-10, wherein the fragment has gene regulatory activity and the second synthetic snRNA promoter is operably linked to a DNA sequence encoding a non-coding RNA, wherein the first synthetic snRNA promoter and the second synthetic snRNA promoter are different. In certain embodiments of the recombinant DNA construct, the sequence encoding the first synthetic snRNA promoter and the sequence encoding the second synthetic snRNA promoter each comprise any one of SEQ ID NOs 1-10 or a fragment thereof, wherein the fragment has gene regulatory activity. Embodiments are also contemplated wherein the recombinant DNA construct further comprises a sequence specifying one or more additional synthetic snRNA promoters selected from the group consisting of SEQ ID NOs 1-10 or a fragment thereof, wherein said fragment has gene regulatory activity, said sequence being operably linked to a DNA sequence encoding a non-coding RNA, wherein each of said first synthetic snRNA promoter, said second synthetic snRNA promoter, and said one or more additional snRNA promoters is different.
In certain embodiments, the sequence of the recombinant DNA construct specifying the one or more additional synthetic snRNA promoters is selected from the group consisting of SEQ ID NOs: 1-10 or a fragment thereof, wherein the fragment comprises gene regulatory activity. In other embodiments, the recombinant DNA construct comprises 3, 4, or 5 synthetic snRNA promoters. In some embodiments, the recombinant DNA construct encodes non-coding RNAs that are gRNAs, each targeting a different selected target site in a chromosome of the plant cell. In other contemplated embodiments, the recombinant DNA construct further comprises a promoter operably linked to a DNA sequence encoding an RNA-guided endonuclease. In another embodiment, the RNA-guided endonuclease is a Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-associated protein. In some embodiments, the CRISPR-associated protein is selected from the group consisting of a type I CRISPR-Cas protein, a type II CRISPR-Cas protein, a type III CRISPR-Cas protein, a type IV CRISPR-Cas protein, a type V CRISPR-Cas protein, and a type VI CRISPR-Cas protein. In some embodiments, the CRISPR-associated protein is a synthetic CRISPR-associated protein. In some embodiments, the CRISPR-associated protein is selected from Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9 (also known as Csn1 and Csx12), Cas10, Cas12a (also known as Cpf1), Cas12b, Cas12d, Csy1, Csy2, Csy3, Cse1, Cse2, Csc1, Csc2, Csa5, Csn2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csx1, Csx15, Csf1, Csf2, Csf3, Csf4, CasX, CasY and Mad7.
In another aspect, the invention provides a cell comprising any of the recombinant DNA constructs described above. In certain embodiments, the cell is a plant cell. In some embodiments, the plant cell is a monocotyledonous plant cell. In other embodiments, the plant cell is a dicotyledonous plant cell. In yet another embodiment, the plant cell is selected from the group consisting of: maize plant cells, soybean plant cells, cotton plant cells, peanut plant cells, barley plant cells, oat plant cells, tall fescue (Festuca arundinacea) plant cells, rice plant cells, sorghum plant cells, sugarcane plant cells, turf grass plant cells, wheat plant cells, alfalfa plant cells, rapeseed plant cells, cabbage plant cells, mustard plant cells, rutabaga plant cells, turnip plant cells, collard plant cells, broccoli plant cells, cauliflower plant cells, pepper plant cells, bean plant cells, cowpea plant cells, chickpea plant cells, cucurbit plant cells, lettuce plant cells, cucumber plant cells, melon plant cells, carrot plant cells, tomato plant cells, radish plant cells, potato plant cells, and ornamental plant cells.
Sequence description
SEQ ID NO. 1 is the DNA sequence of the synthetic snRNA promoter P-GSP2262.
SEQ ID NO. 2 is the DNA sequence of the synthetic snRNA promoter P-GSP2268.
SEQ ID NO. 3 is the DNA sequence of the synthetic snRNA promoter P-GSP2269.
SEQ ID NO. 4 is the DNA sequence of the synthetic snRNA promoter P-GSP2272.
SEQ ID NO. 5 is the DNA sequence of the synthetic snRNA promoter P-GSP2273.
SEQ ID NO. 6 is the DNA sequence of the truncated variant synthetic snRNA promoter P-GSP2262_TR, derived from P-GSP2262.
SEQ ID NO. 7 is the DNA sequence of the truncated variant synthetic snRNA promoter P-GSP2268_TR, derived from P-GSP2268.
SEQ ID NO. 8 is the DNA sequence of the truncated variant synthetic snRNA promoter P-GSP2269_TR, derived from P-GSP2269.
SEQ ID NO. 9 is the DNA sequence of the truncated variant synthetic snRNA promoter P-GSP2272_TR, derived from P-GSP2272.
SEQ ID NO. 10 is the DNA sequence of the truncated variant synthetic snRNA promoter P-GSP2273_TR, derived from P-GSP2273.
SEQ ID NO. 11 is the DNA sequence of the EXP, EXP-Zm.UbqM1:1:9, consisting of the promoter, leader sequence and intron from a ubiquitin gene of Mexican teosinte (Zea mays subsp. mexicana).
SEQ ID NO. 12 is a DNA sequence encoding a nuclear-targeted Cas12a protein Cas12a_NLS.
SEQ ID NO. 13 is the DNA sequence of the 3' UTR, T-Os.LTP:2.
SEQ ID NO. 14 is the DNA sequence of the guide RNA spacer NR-Zm.Bmr3_2691.
SEQ ID NO. 15 is the DNA sequence of the guide RNA gRNA-Zm.Bmr3_2691.
SEQ ID NO. 16 is the DNA sequence of the guide RNA spacer NR-Zm.Bmr3_3170.
SEQ ID NO. 17 is the DNA sequence of the guide RNA gRNA-Zm.Bmr3_3170.
SEQ ID NO. 18 is the DNA sequence of the maize brown midrib 3 (Bmr3) genomic region targeted for genome editing.
SEQ ID NO. 19 is the amino acid sequence of Cas12a_NLS encoded by SEQ ID NO. 12.
SEQ ID NO. 20 is the DNA sequence of the guide RNA spacer NR-Zm.Bmr3_90.
SEQ ID NO. 21 is the DNA sequence of the guide RNA spacer NR-Zm.Bmr3_227.
SEQ ID NO. 22 is the DNA sequence of the guide RNA spacer NR-Zm.Bmr3_3279.
SEQ ID NO. 23 is the DNA sequence of the guide RNA gRNA-Zm.Bmr3_90_3279.
SEQ ID NO. 24 is the DNA sequence of the guide RNA gRNA-Zm.Bmr3_227_3279.
SEQ ID NO. 25 is the DNA sequence of the guide RNA gRNA-Zm.Bmr3_2691_2.
SEQ ID NO. 26 is the DNA sequence of the guide RNA gRNA-Zm.Bmr3_3170_2.
SEQ ID NO. 27 is the DNA sequence of the guide RNA gRNA-Zm.Bmr3_2691_3170.
SEQ ID NO. 28 is the DNA sequence of the maize Zm7 genomic region targeted for genome editing.
SEQ ID NO. 29 is the DNA sequence of the guide RNA spacer NR-Zm.7.1b.
SEQ ID NO. 30 is the DNA sequence of the guide RNA gRNA-Zm.7.1b.
SEQ ID NO. 31 is the DNA sequence of the guide RNA spacer NR-Zm.7.1c.
SEQ ID NO. 32 is the DNA sequence of the guide RNA gRNA-Zm.7.1c.
SEQ ID NO. 33 is the DNA sequence of the guide RNA gRNA-7.1c_7.1b.
Detailed Description
Provided herein are novel synthetic snRNA (small nuclear RNA) promoters that are active in plants. The nucleotide sequences of these small nuclear RNA promoters are shown in SEQ ID NOs: 1-10. These small nuclear RNA promoters are capable of driving expression of a non-coding RNA (such as a guide RNA) in plant tissue, and thus of regulating expression of an operably linked sequence encoding a non-coding RNA in a plant. Methods of modifying, producing, and using recombinant DNA molecules containing the provided small nuclear RNA promoters are also provided. Also provided are compositions comprising transgenic plant cells, plants, plant parts and seeds comprising the small nuclear RNA promoters of the invention, and methods of making and using the same.
In some embodiments, variants of a small nuclear RNA promoter selected from SEQ ID NOs: 1-10 are provided. In some embodiments, variants are provided comprising a sequence that is at least about 85% identical, at least about 86% identical, at least about 87% identical, at least about 88% identical, at least about 89% identical, at least about 90% identical, at least about 91% identical, at least about 92% identical, at least about 93% identical, at least about 94% identical, at least about 95% identical, at least about 96% identical, at least about 97% identical, at least about 98% identical, or at least about 99% identical to a reference sequence provided herein as any of SEQ ID NOs: 1-10, and that has promoter activity as disclosed herein. A variant of any of SEQ ID NOs: 1-10 may have the activity of the base sequence from which it is derived, for example the promoter activity of that base sequence.
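The identity thresholds above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned by some external tool and that percent identity is computed over all alignment columns; that denominator convention, the function names and the toy sequences are assumptions made for illustration and are not taken from this disclosure.

```python
def percent_identity(aligned_ref: str, aligned_query: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Gaps are written as '-'. Columns where both sequences have a gap are ignored.
    The choice of denominator (all remaining columns) is an assumption.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (r, q)
        for r, q in zip(aligned_ref.upper(), aligned_query.upper())
        if not (r == "-" and q == "-")
    ]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for r, q in columns if r == q and r != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_ref: str, aligned_query: str,
                             threshold: float = 85.0) -> bool:
    """True if the aligned query reaches the given percent identity."""
    return percent_identity(aligned_ref, aligned_query) >= threshold


# Toy sequences only; these are not SEQ ID NOs: 1-10.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
print(meets_identity_threshold("ACGT-ACG", "ACGTTACG"))
```

A sequence passing such a check would still need to be tested for promoter activity, since identity alone does not establish function.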
In some embodiments, fragments of a small nuclear RNA promoter selected from SEQ ID NOs: 1-10 are provided, comprising at least about 50, at least about 75, at least about 95, at least about 100, at least about 125, at least about 150, at least about 175, at least about 200, at least about 225, at least about 250, at least about 275, at least about 300, at least about 325, at least about 350, at least about 375, at least about 400, at least about 425, at least about 450, at least about 475, or more contiguous nucleotides of a DNA molecule having promoter activity as disclosed herein. In certain embodiments, fragments of the small nuclear RNA promoters provided herein that have gene expression activity are provided. Methods for generating such fragments from a starting promoter molecule are well known in the art. A fragment of any one of SEQ ID NOs: 1-10 may have the activity of the base sequence from which it is derived, for example the promoter activity of that base sequence.
Compositions derived from any of the promoter elements contained in any of SEQ ID NOs: 1-10, such as internal or 5' deletions, may be produced using methods known in the art to improve or alter expression, including by removing elements that have a positive or negative effect on expression; duplicating elements that have a positive or negative effect on expression; and/or duplicating or removing elements that have a tissue-specific or cell-specific effect on expression. Compositions derived from any of the promoter elements contained in any of SEQ ID NOs: 1-10 may be used, for example, to prepare enhancer elements comprising 3' deletions in which the TATA box element or equivalent sequence and the downstream sequence are removed. These enhancer elements may be operably linked to other synthetic or natural snRNA promoters to enhance expression. Further deletions may be made to remove any elements that have a positive or negative effect on expression. Any promoter element contained in any of the sequences set forth in SEQ ID NOs: 1-10, and fragments or enhancers derived therefrom, may be used to prepare chimeric transcription regulatory element compositions.
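The deletion strategies described above can be sketched in silico. The following Python example is illustrative only: the step size, the minimum fragment length, the use of the literal motif TATAAA as a stand-in for "the TATA box element or equivalent sequence", and the toy promoter are all assumptions, and any fragment produced this way would still have to be tested for activity.

```python
from typing import Iterator, Optional


def five_prime_deletions(promoter: str, step: int, min_length: int) -> Iterator[str]:
    """Yield progressively shorter fragments that all keep the 3' end."""
    for start in range(0, len(promoter) - min_length + 1, step):
        yield promoter[start:]


def enhancer_candidate(promoter: str, tata_motif: str = "TATAAA") -> Optional[str]:
    """Return the sequence upstream of the last TATA-like motif (a 3' deletion).

    Returns None when no motif is found or when nothing lies upstream of it.
    The motif is a hypothetical placeholder for the real TATA box element.
    """
    idx = promoter.upper().rfind(tata_motif)
    return promoter[:idx] if idx > 0 else None


# Toy promoter: 20 nt upstream region, TATA-like motif, 10 nt downstream region.
toy = "G" * 20 + "TATAAA" + "C" * 10
print([len(f) for f in five_prime_deletions(toy, step=10, min_length=16)])
print(enhancer_candidate(toy))
```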
In some embodiments, the present disclosure provides novel synthetic snRNA (small nuclear RNA) promoters, and methods of their use, comprising expressing a guide RNA for targeted genetic modification of a plant genome by a Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) editing system. For example, in one embodiment, the present disclosure provides a DNA construct encoding at least one expression cassette comprising a synthetic snRNA promoter disclosed herein and a DNA sequence encoding one or more guide RNAs (gRNAs). Methods for causing a CRISPR system to modify a target genome are also provided, as well as the genomic complements of plants modified by use of such systems. Thus, the present disclosure provides tools and methods that allow for the insertion, removal or modification of genes, loci, linkage blocks, and chromosomes within a plant genome.
In another embodiment, the present disclosure provides a DNA construct encoding at least one expression cassette comprising a promoter disclosed herein and a DNA sequence encoding a non-protein coding small RNA (npcRNA). These constructs can be used to express npcRNA molecules.
The CRISPR system constitutes an adaptive immune system in prokaryotes that targets invading phage DNA and RNA for endonucleolytic cleavage (reviewed in Westra et al., Annu Rev Genet 46:311-39, 2012). There are six known types of CRISPR systems that rely on small RNAs for sequence-specific detection and targeting of foreign nucleic acids for destruction: type I, type II, type III, type IV, type V and type VI. The components of bacterial CRISPR systems are CRISPR-associated (Cas) proteins and CRISPR arrays comprising genome-targeting sequences (spacers) interspersed with short palindromic repeats. For the type II CRISPR system, the spacer/repeat elements are transcribed into a precursor CRISPR RNA (pre-crRNA) molecule, followed by enzymatic cleavage initiated by hybridization between a trans-activating CRISPR RNA (tracrRNA) molecule and the palindromic repeats of the pre-crRNA. The resulting crRNA-tracrRNA molecule comprises one copy of the spacer and one scaffold that can complex with the Cas nuclease. The CRISPR/Cas complex is then directed to a DNA sequence complementary to the crRNA spacer sequence (the protospacer), where the RNA-Cas protein complex silences the target DNA by enzymatic cleavage of both strands (a double-strand break; DSB).
The natural bacterial type II CRISPR system requires four molecular components for targeted cleavage of exogenous DNA: a Cas endonuclease (e.g., Cas9), the housekeeping RNase III, a CRISPR RNA (crRNA), and a trans-activating CRISPR RNA (tracrRNA). The latter two components form a dsRNA complex and bind Cas9, producing an RNA-guided DNA endonuclease complex. For targeted genomic modifications in eukaryotes, this system is simplified into two components: the Cas9 endonuclease and a guide RNA (gRNA). Experiments initially performed in eukaryotic systems determined that the RNase III component is not necessary to achieve targeted DNA cleavage. The minimal two-component system of Cas9 with the gRNA as the sole target-specific component makes this CRISPR system for targeted genome modification more cost-effective and flexible than other targeting platforms such as meganucleases, zinc finger nucleases or TALE nucleases, which require protein engineering for each targeted DNA site. In addition, the ease of design and production of gRNAs provides several advantages for the use of CRISPR systems in targeted genome modification. For example, CRISPR/Cas system components designed for one or more genomic target sites (a Cas endonuclease, gRNAs, and optionally exogenous DNA for integration into the genome) may be multiplexed in one transformation, or the introduction of the CRISPR/Cas system components may be spatially and/or temporally separated.
As used herein, "guide nucleic acid" or "guide RNA" or "gRNA" refers to a nucleic acid comprising a spacer sequence that is complementary to (and hybridizes to) a target DNA sequence and a scaffold sequence that binds to a Cas protein. In some embodiments, the scaffold sequence and spacer sequence are covalently linked and expressed as a single RNA transcript or molecule, referred to herein as a "single guide RNA" (or "sgRNA"). In some embodiments, the scaffold sequence and spacer sequence are expressed as separate transcripts or molecules, referred to herein as "dual guide RNAs" (or "dgRNAs"). The spacer sequence may be covalently or non-covalently linked to the 5' and/or 3' end of the scaffold sequence. In some embodiments, the guide RNA comprises a CRISPR RNA (crRNA) and a trans-activating crRNA (tracrRNA). In other embodiments, the guide RNA comprises a crRNA but not a tracrRNA. In some embodiments, the crRNA comprises both a spacer and a scaffold sequence. In some embodiments, the design of the gRNA can be based on a type I, type II, type III, type IV, type V, or type VI CRISPR-Cas system.
In some embodiments, the guide RNA array is expressed from a synthetic snRNA promoter described herein. In some embodiments, a synthetic snRNA promoter as described herein can be operably linked to more than one scaffold-spacer (and/or spacer-scaffold) sequence (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, or more scaffold-spacer (and/or spacer-scaffold) sequences), for example scaffold-spacer-scaffold, spacer-scaffold-spacer, scaffold-spacer-scaffold-spacer, spacer-scaffold-spacer-scaffold, etc. In some embodiments, the guide RNA array comprises one or more tRNAs as described in WO 2016/061481. In some embodiments, the guide RNA array comprises one or more tRNAs that separate the scaffold and spacer sequences (e.g., scaffold-spacer-tRNA-scaffold-spacer, spacer-scaffold-tRNA-spacer-scaffold, spacer-scaffold-tRNA-spacer-scaffold-spacer-tRNA-scaffold-spacer, etc.). In some embodiments, the scaffold sequence is selected from the group consisting of a repeat sequence, or a fragment thereof, of any one of the following CRISPR-Cas systems: Cas12a; Cas12b; Cas12c; Cas12d; Cas12e; Cas9; C2c1; C2c3; Cas13a; Cas13b; Cas13c; Cas13d; Cas1; Cas1B; Cas2; Cas3; Cas3'; Cas3"; Cas4; Cas5; Cas6; Cas7; Cas8; Cas10; Csy1; Csy2; Csy3; Cse1; Cse2; Csc1; Csc2; Csa5; Csn2; Csm2; Csm3; Csm5; Csm6; Cmr1; Cmr3; Cmr4; Cmr5; Cmr6; Csb1; Csb2; Csb3; Csx10; Csx14; Csx15; Csx16; Csx17; CsaX; Csx1; Csx3; Csf1; Csf2; Csf3; Csf4; and Csf5.
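The scaffold/spacer arrangements listed above can be made concrete with a small Python sketch that concatenates units into a single array, optionally separated by a tRNA sequence. The function name, its parameters and the four-letter placeholder sequences are hypothetical; real scaffold, spacer and tRNA sequences would be substituted, and nothing here is taken from the sequence listing.

```python
from typing import Optional, Sequence


def build_guide_array(spacers: Sequence[str], scaffold: str,
                      scaffold_first: bool = True,
                      separator: Optional[str] = None) -> str:
    """Concatenate scaffold/spacer units into one array sequence.

    scaffold_first=True gives scaffold-spacer units; False gives spacer-scaffold
    units. If a separator (for example a tRNA sequence) is given, it is placed
    between consecutive units.
    """
    if not spacers:
        raise ValueError("at least one spacer is required")
    units = [
        (scaffold + spacer) if scaffold_first else (spacer + scaffold)
        for spacer in spacers
    ]
    return (separator or "").join(units)


# Placeholder sequences chosen only so the layout is easy to read.
SCAFFOLD, TRNA = "GGGG", "TTTT"
print(build_guide_array(["AAAA", "CCCC"], SCAFFOLD))
print(build_guide_array(["AAAA", "CCCC"], SCAFFOLD, separator=TRNA))
print(build_guide_array(["AAAA", "CCCC"], SCAFFOLD, scaffold_first=False, separator=TRNA))
```

Which orientation applies depends on the Cas system; the sketch leaves that choice to the caller.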
In some embodiments, the guide RNA expressed by the synthetic snRNA promoters described herein can comprise more than one crRNA sequence (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, or more crRNA sequences). In some embodiments, the guide RNA comprises one or more tRNAs that separate the crRNA sequences (e.g., crRNA-tRNA-crRNA, crRNA-tRNA-crRNA-tRNA-crRNA, etc.).
In some embodiments, the guide RNA array expressed by the synthetic snRNA promoters described herein can comprise more than one tracrRNA sequence (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, or more tracrRNA sequences). In some embodiments, the guide RNA array comprises one or more tRNAs that separate the tracrRNA sequences (e.g., tracrRNA-tRNA-tracrRNA, tracrRNA-tRNA-tracrRNA-tRNA-tracrRNA, etc.).
In some embodiments, the array of guide RNAs expressed by a synthetic snRNA promoter as described herein may comprise more than one crRNA-tracrRNA (and/or tracrRNA-crRNA) sequence (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10 or more crRNA-tracrRNA (and/or tracrRNA-crRNA) sequences), for example crRNA-tracrRNA-crRNA, tracrRNA-crRNA-tracrRNA, crRNA-tracrRNA-crRNA-tracrRNA, tracrRNA-crRNA-tracrRNA-crRNA, etc. In some embodiments, the guide RNA array comprises one or more tRNAs that separate the crRNA and tracrRNA sequences (e.g., crRNA-tracrRNA-tRNA-crRNA-tracrRNA, tracrRNA-crRNA-tRNA-tracrRNA-crRNA, crRNA-tracrRNA-tRNA-crRNA-tracrRNA-tRNA-crRNA-tracrRNA, etc.).
In some embodiments, the guide RNA expressed by the synthetic snRNA promoters described herein can further comprise an aptamer sequence (e.g., an MS2 aptamer). In some embodiments, the aptamer sequence recruits a deaminase. In some embodiments, the aptamer sequence recruits a reverse transcriptase. In some embodiments, the guide RNA can comprise one or more aptamers (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more aptamers).
In some embodiments, the guide RNA expressed by the synthetic snRNA promoters described herein may further comprise an RNA template for a reverse transcriptase. In some embodiments, a synthetic snRNA promoter as described herein is operably linked to a sequence encoding a prime editing guide RNA ("pegRNA").
Cas9 is a class 2 CRISPR effector protein. Class 2 CRISPR-Cas systems rely on single-component effector proteins (such as Cas9), in which a single gRNA-bound Cas protein recognizes and cleaves a target sequence. Cas9 recognizes a G-rich Protospacer Adjacent Motif (PAM) located 3' of the site bound by its guide RNA. In some embodiments, the CRISPR Cas9 protein may be a Cas9 protein from, for example, Streptococcus spp. (e.g., Streptococcus pyogenes, Streptococcus thermophilus), Lactobacillus spp., Bifidobacterium spp., Kandleria spp., Leuconostoc spp., Oenococcus spp., Pediococcus spp., Weissella spp., and/or Olsenella spp. Additional class 2 Cas effector protein families have been found: Cpf1 (also known as Cas12a), C2c1, CasX and CasY (Burstein et al., Nature 542:237-241, 2017).
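As an illustration of the PAM requirement described above, the following Python sketch scans the forward strand of a DNA sequence for candidate Cas9 target sites. It assumes the NGG PAM commonly associated with Streptococcus pyogenes Cas9 and a 20-nt protospacer; both values, the function name and the toy sequence are assumptions made for the example, and other Cas9 orthologs recognize different PAMs. The reverse strand is not searched.

```python
import re
from typing import List, Tuple


def find_cas9_sites(dna: str, spacer_len: int = 20) -> List[Tuple[int, str, str]]:
    """Return (start, protospacer, PAM) for each forward-strand NGG site.

    A zero-width lookahead is used so that overlapping sites are all reported.
    """
    dna = dna.upper()
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


# Toy sequence: a 20-nt stretch followed by a TGG PAM and two extra bases.
toy_target = "A" * 20 + "TGG" + "CC"
for start, protospacer, pam in find_cas9_sites(toy_target):
    print(start, protospacer, pam)
```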
Cas12a belongs to the class 2, type V CRISPR systems and is a single RNA-guided endonuclease that does not require a tracrRNA. The Cas12a system recognizes a T-rich Protospacer Adjacent Motif (PAM). The T-rich PAM allows genome editing in organisms with particularly AT-rich genomes or with AT-rich regions of interest. CRISPR arrays are processed into short mature crRNAs of 42-44 nucleotides in length. Each mature crRNA starts with 19 nucleotides of the direct repeat scaffold, followed by 23-25 nucleotides of the spacer sequence. This crRNA arrangement is in contrast to the type II CRISPR-Cas system, in which the mature crRNA starts with 20-24 nucleotides of the spacer sequence followed by about 22 nucleotides of the direct repeat scaffold (Zetsche et al., Cell 163:759-771, 2015). When cleaving double-stranded DNA molecules, Cas12a produces staggered cuts, as opposed to blunt-ended cuts such as those produced by Cas9. An example of a Cas12a coding sequence comprising a nuclear localization signal for delivery to the nucleus is represented by SEQ ID NO. 12 and encodes the protein represented by SEQ ID NO. 19.
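The crRNA layout described above reduces to simple length arithmetic: 19 nt of direct repeat plus 23-25 nt of spacer gives 42-44 nt. The Python sketch below assembles a Cas12a-style crRNA (repeat first, then spacer) and checks those lengths. The function name and the placeholder sequences are hypothetical and are not drawn from the sequence listing.

```python
def cas12a_crrna(direct_repeat: str, spacer: str) -> str:
    """Assemble a mature Cas12a-style crRNA: direct repeat followed by spacer.

    Length limits follow the figures stated in the text: a 19-nt direct repeat
    and a 23-25 nt spacer, giving a 42-44 nt mature crRNA.
    """
    if len(direct_repeat) != 19:
        raise ValueError("expected a 19-nt direct repeat")
    if not 23 <= len(spacer) <= 25:
        raise ValueError("expected a 23-25 nt spacer")
    return direct_repeat + spacer


# Placeholder sequences (not real repeat or spacer sequences).
repeat = "U" * 19
print(len(cas12a_crrna(repeat, "A" * 23)))  # shortest allowed crRNA
print(len(cas12a_crrna(repeat, "A" * 25)))  # longest allowed crRNA
```

For a type II system the order would be reversed, with the spacer preceding the repeat-derived scaffold.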
CRISPR-Cas nucleases useful in the present invention can include, but are not limited to, Cas9, C2c1, C2c3, Cas12a (also known as Cpf1), Cas12b, Cas12c, Cas12d, Cas12e, Cas13a, Cas13b, Cas13c, Cas13d, Cas1, Cas1B, Cas2, Cas3', Cas3", Cas4, Cas5, Cas6, Cas7, Cas8, Cas9 (also known as Csn1 and Csx12), Cas10, Csy1, Csy2, Csy3, Cse1, Cse2, Csc1, Csc2, Csa5, Csn2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csx15, Csx1, Csf2, and/or other Csf nucleases. In some embodiments, the CRISPR-Cas nuclease can be a Cas9, Cas12a (Cpf1), Cas12b, Cas12c (C2c3), Cas12d (CasY), Cas12e (CasX), Cas12g, Cas12h, Cas12i, C2c4, C2c5, C2c8, C2c9, C2c10, Cas14a, Cas14b, and/or Cas14c effector protein. In some embodiments, CRISPR-Cas nucleases useful in the present disclosure can comprise mutations in their nuclease active sites (e.g., RuvC or HNH; for example, the RuvC site of a Cas12a nuclease domain, or the RuvC site and/or HNH site of a Cas9 nuclease domain). A CRISPR-Cas nuclease that has a mutation in its nuclease active site and therefore no longer has nuclease activity is commonly referred to as "dead", e.g., dCas, such as dCas9 or dCas12a. In some embodiments, a CRISPR-Cas nuclease domain or polypeptide having a mutation in its nuclease active site can have impaired or reduced activity compared to the same CRISPR-Cas nuclease without the mutation (e.g., a nickase, such as a Cas9 nickase or a Cas12a nickase). Recently, CRISPR-associated transposases (CAST) have been discovered and characterized. A CAST system consists of the Tn7-like transposase subunits TnsB, TnsC and TniQ together with the type V-K CRISPR effector Cas12k, and catalyzes site-directed DNA transposition.
Cas12k forms a complex with the partially complementary non-coding RNA species crRNA and tracrRNA, and the resulting three-part ribonucleoprotein (RNP) complex recognizes chromosomal sites for transposition based on the presence of a Protospacer Adjacent Motif (PAM) and complementarity between the variable part of the crRNA and the target DNA. The associated transposases TnsB, TnsC and TniQ recognize the transposon by its conserved 'left' (LE) and 'right' (RE) borders and insert the transposon into a chromosomal site near the target sequence recognized by Cas12k, preferably between TA dinucleotides. Two homologous CAST systems native to the cyanobacterial species Scytonema hofmanni (UTEX B 2349) and Anabaena cylindrica (PCC 7122) have been shown to be functional for transposition in Escherichia coli (E. coli) (Strecker et al., Science 365(6448):48-53, 2019).
gRNA expression strategy
In certain embodiments, the present disclosure provides novel combinations of synthetic snRNA promoters (and functional fragments thereof) with DNA sequences encoding one or more guide nucleic acid molecules. The guide nucleic acid molecules provided herein may be DNA, RNA, or a combination of DNA and RNA.
In one embodiment, the synthetic snRNA promoter is operably linked to one or more gRNA coding sequences so as to constitutively express the gRNA in the transformed cell. For example, in some embodiments, this may be desirable when the resulting gRNA transcript remains in the nucleus and is therefore optimally located within the cell to direct nuclear processes. This may also be desirable, for example, in some embodiments, when the CRISPR system activity is low or the frequency of finding and cleaving target sites is low. In some embodiments, it may also be desirable when the promoter of a particular cell type (such as germline) is unknown for a given species of interest.
In another embodiment, fragments of the synthetic snRNA promoter comprising the cis-elements necessary to drive transcription can be used to express one or more gRNAs. The disclosed full-length synthetic snRNA promoters (shown as SEQ ID NOs: 1-5) are each about 500 bp in length. Constructs comprising multiple synthetic snRNA promoters become larger as additional expression cassettes are cloned in tandem, which may cause problems with construct stability and transformation. Thus, in some cases, the synthetic snRNA promoter may be truncated to reduce the size of the construct, so long as the truncated synthetic snRNA promoter retains the ability to drive transcription of the gRNA. Examples of such truncated synthetic snRNA promoters are shown in SEQ ID NOs: 6-10.
Multiple synthetic snRNA promoters (or functional fragments thereof) with different sequences can be used to minimize problems in construct stability that are typically associated with sequence duplication and can be used to facilitate stacking of multiple gRNA cassettes in the same transformation construct.
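One simple way to screen a set of promoters for the kind of sequence duplication mentioned above is to look for exact subsequences (k-mers) shared between them. The Python sketch below is a minimal illustration under stated assumptions: the choice of k, the function name and the toy sequences are hypothetical, and exact k-mer sharing is only a rough proxy for the repeats that may affect construct stability.

```python
from typing import Set


def shared_kmers(seq_a: str, seq_b: str, k: int = 20) -> Set[str]:
    """Return the set of exact k-mers present in both sequences."""
    a, b = seq_a.upper(), seq_b.upper()
    kmers_a = {a[i:i + k] for i in range(len(a) - k + 1)}
    kmers_b = {b[i:i + k] for i in range(len(b) - k + 1)}
    return kmers_a & kmers_b


# Toy sequences with a short shared core; a small k is used so the overlap is visible.
promoter_1 = "ACGTACGTAA"
promoter_2 = "TTACGTACGG"
print(sorted(shared_kmers(promoter_1, promoter_2, k=6)))
```

Pairs of promoters with few or no shared k-mers at a chosen length would be the more attractive candidates for stacking in one construct.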
In some embodiments, a synthetic snRNA promoter (or functional fragment thereof) as described herein can drive expression of a single gRNA. In some embodiments, a synthetic snRNA promoter (or functional fragment thereof) as described herein can drive expression of an array of 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, or more gRNAs. Each individual guide sequence may target the same target sequence or a different target sequence. This configuration is suitable for multiplex gene manipulation (e.g., targeting multiple genes). Several strategies have been described in the art to facilitate the processing of individual guide RNAs from a single transcript. In some embodiments, a synthetic snRNA promoter (or functional fragment thereof) as described herein can be used to drive a gRNA array, wherein the expression cassette comprises two or more gRNAs separated by one or more tRNA cleavage sequences (US 20190330647). tRNA cleavage sequences include any sequence and/or structural motif that actively interacts with and is cleaved by the cell's endogenous tRNA processing system, such as RNase P, RNase Z and RNase E (bacteria). This may include structural recognition elements such as the acceptor stem, the D-loop arm and the TΨC loop, as well as specific sequence motifs. In another embodiment, a synthetic snRNA promoter (or functional fragment thereof) as described herein may be used to drive a gRNA array comprising two or more gRNAs separated by one or more ribozyme cleavage sites (Tang et al., Mol. Plant 9:1088-1091, 2016). In another embodiment, a synthetic snRNA promoter (or functional fragment thereof) as described herein may be used to drive a gRNA array comprising two or more gRNAs separated by one or more Csy4 ribonuclease recognition sites (Tsai et al., Nat. Biotechnol. 32(6):569-576, 2014).
In some embodiments, a synthetic snRNA promoter (or functional fragment thereof) as described herein can be used to drive expression of a prime editing gRNA (pegRNA). Prime editing is a genome editing method that uses a nucleic acid programmable DNA binding protein (napDNAbp) (e.g., Cas9) working in conjunction with a polymerase (e.g., in the form of a fusion protein, or otherwise provided in trans with the napDNAbp) to write new genetic information directly into a targeted DNA site. The prime editing system is programmed with a specialized prime editing (PE) guide RNA ("pegRNA") that both specifies the target site and provides the template for synthesis of the desired edit, in the form of a replacement DNA strand, by way of an extension (DNA or RNA) engineered onto the guide RNA (e.g., at the 5' or 3' end, or at an internal portion of the guide RNA) (WO 2020191248). In some embodiments, a synthetic snRNA promoter (or functional fragment thereof) as described herein is used to drive expression of a pegRNA comprising a guide RNA and at least one nucleic acid extension arm comprising a DNA synthesis template, wherein the nucleic acid extension arm is located at the 3' or 5' end of the guide RNA.
In another embodiment, a synthetic snRNA promoter (or functional fragment thereof) as described herein can be used to drive expression of an augmented gRNA that further comprises an RNA mobility sequence enabling movement of the RNA from cell to cell. The RNA mobility sequence may be a sequence derived from a plant gene, such as the Flowering Locus T (FT) gene, BEL5, GAI, a tRNA-like motif or LeT (WO 2021041001).
In other embodiments, a synthetic snRNA promoter (or functional fragment thereof) as described herein may be used to drive expression of a CRISPR RNA (crRNA), a mature crRNA, a precursor crRNA, a crRNA fragment, a trans-activating crRNA (tracrRNA), or a tracrRNA fragment.
In some embodiments, the synthetic snRNA promoter (or functional fragment thereof) as described herein can be used to drive gRNA compatible with other forms of CRISPR-mediated gene editing, such as base editing (Komor et al, nature 533,420-424,2016; gaudelli et al, nature 551:464-471,2017; komor et al, science Advances volume 3: stage 8, 2017; and Rees et al, nat Rev Genet.19 (12): 770-788, 2018).
In some embodiments, synthetic snRNA promoters (or functional fragments thereof) as described herein can be used to drive gRNAs compatible with CRISPR-related transposase systems (CAST), such as those derived from Pseudocladus suis (ShCAST) and Anabaena cylindrical (AcCAST) (Strecker et al, science 365 (6448): 48-53,2019).
In some embodiments, a synthetic snRNA promoter (or functional fragment thereof) as described herein can be used to drive expression of one or more non-protein coding RNAs (npcrnas). Non-limiting examples of non-protein-encoding RNAs include micrornas (mirnas), miRNA precursors, small interfering RNAs (sirnas), small RNAs (22-26 nt in length) and precursors encoding the same, heterochromatin sirnas (hc-sirnas), piwi-interacting RNAs (pirnas), hairpin double-stranded RNAs (hairpin dsRNA), trans-acting sirnas (ta-sirnas), naturally-occurring antisense sirnas (nat-sirnas), and trnas.
Expression strategies for CRISPR Class 2 type II or type V associated genes
The present disclosure provides novel synthetic snRNA promoters (and functional fragments thereof) for use in sequence-specific CRISPR-mediated cleavage for molecular breeding by providing transcription of, for example, a gRNA comprising a spacer sequence for targeting a site of endonuclease cleavage by at least one Cas protein. In certain embodiments, the target site is a genomic target site. In some embodiments, the genomic target site is native or transgenic. In addition, CRISPR systems can be tailored to catalyze cleavage at one or more genomic target sites.
One aspect of the disclosure is the introduction into a plant cell of an expression construct comprising one or more cassettes encoding a synthetic snRNA promoter (or functional fragment thereof) as described herein operably linked to a nucleotide sequence encoding one or more gRNAs, each including a copy of a spacer sequence complementary to a target site (e.g., a genomic target site), together with an expression construct encoding a type I, type II, type III, type IV, type V or type VI CRISPR-associated protein, in order to modify the plant cell such that the plant cell, or a plant consisting of such cells, subsequently exhibits a beneficial trait. In non-limiting examples, the trait is an improved trait such as increased yield, resistance to biotic or abiotic stress, herbicide tolerance, or improved agronomic performance. The ability to produce such plant cells depends on the use of the transformation constructs and cassettes described herein to introduce the CRISPR system.
The expression construct encoding a CRISPR-associated protein may comprise a promoter. In certain embodiments, the promoter is a constitutive promoter, a tissue-specific promoter, a developmentally regulated promoter, or a cell cycle regulated promoter. Some contemplated promoters include promoters that are expressed only in germ line or germ cells. Such developmentally regulated promoters have the advantage of limiting the activity of the CRISPR system to only those cells that express the CRISPR-associated protein. In some embodiments, CRISPR-mediated genetic modification (e.g., chromosomal or episomal dsDNA cleavage) is limited to cells involved in transferring their genome from one generation to the next. This may be useful if the broader expression of the CRISPR system is genotoxic or has other undesirable effects. Examples of such promoters include promoters of genes encoding DNA ligases, recombinases, replicases, and the like.
In some embodiments, a DNA construct as described herein contains one or more synthetic snRNA promoters or fragments thereof that drive high-level expression of DNA sequences encoding one or more gRNAs. DNA constructs expressing gRNAs that direct CRISPR Class 2 type II or type V associated proteins with endonuclease activity to specific genomic sequences can be particularly useful: the specific genomic sequences are cleaved, creating double-strand breaks that are repaired by a double-strand break repair pathway, which can include, for example, non-homologous end joining (NHEJ), micro-homology mediated end joining (MMEJ), homologous recombination, synthesis-dependent strand annealing (SDSA), single-strand annealing (SSA), or a combination thereof, thereby disrupting the native locus.
In one embodiment, the CRISPR system comprises at least one type I, II, III, IV, V or VI CRISPR-associated protein and a gRNA comprising a copy of a spacer sequence complementary to an endogenous target site.
In some embodiments, the CRISPR system can comprise a catalytically inactive CRISPR endonuclease. Such an endonuclease comprises a domain that retains the ability to bind its target nucleic acid but has a reduced or eliminated ability to cleave nucleic acid molecules compared to a control nuclease. In some embodiments, the catalytically inactive nuclease is catalytically inactive Cas9. In some embodiments, the catalytically inactive Cas9 creates a nick in one of the target DNA strands. In some embodiments, the catalytically inactive Cas9, known as dead Cas9 (dCas9), lacks all nuclease activity. In some embodiments, the catalytically inactive nuclease is catalytically inactive Cas12a. In some embodiments, the catalytically inactive Cas12a creates a nick in one of the target DNA strands. In some embodiments, the catalytically inactive Cas12a, referred to as dead Cas12a (dCas12a), lacks all DNase activity.
The present disclosure also provides for the use of CRISPR-mediated double stranded DNA cleavage to genetically alter expression and/or activity of a gene or gene product of interest in a tissue or cell type specific manner to increase productivity or to provide another beneficial trait, wherein the nucleic acid of interest may be endogenous or transgenic in nature. Thus, in one embodiment, the CRISPR system is engineered to mediate disruption at a specific site in a gene of interest. Genes of interest include genes that require altered expression levels/protein activity. These DNA cleavage events may be in the coding sequence or in regulatory elements within the gene.
The present disclosure provides for the introduction of components of a CRISPR system (e.g., CRISPR-associated proteins and their cognate gRNAs) into cells. Examples of CRISPR-associated proteins include natural and engineered (e.g., modified, including codon-redesigned) nucleotide sequences encoding polypeptides having nuclease activity, such as Cas9 from Streptococcus pyogenes, Streptococcus thermophilus, or Bradyrhizobium sp.; Cpf1 (also known as Cas12a) from Francisella novicida (FnCpf1), Prevotella sp., Acidaminococcus sp. BV3L6, and Lachnospiraceae bacterium ND2006 (LbCpf1); C2C1 from Alicyclobacillus acidoterrestris, Bacillus sp., Verrucomicrobia sp., α-proteobacteria, or δ-proteobacteria; CasX from Planctomycetes and Proteobacteria; or CasY from Candidatus Kerfeldbacteria, Candidatus Vogelbacteria, Candidatus Parcubacteria, or Candidatus Komeilibacteria.
In a particular embodiment, the codon redesigned FnCpf1 and LbCpf1 nucleotide sequences and expression cassettes comprise the recombinant nucleic acid sequences disclosed in U.S.2020/0080096, the contents and disclosure of which are incorporated herein by reference.
Catalytically active CRISPR-associated genes (e.g., cas9 endonuclease, C2C1 endonuclease, casX endonuclease, casY endonuclease, or Cpf1 endonuclease) may be introduced into or produced by a target cell. As disclosed herein, this can be accomplished using a variety of methods.
Transient expression of CRISPR
In some embodiments, one or more expression cassettes encoding the gRNA and/or CRISPR-associated protein components of a type I, II, III, IV, V, or VI CRISPR-Cas system are transiently introduced into a cell. In certain embodiments, the one or more expression cassettes encoding the gRNA and/or CRISPR-associated protein are provided in an amount sufficient to modify the cell, but do not persist after a desired period of time has elapsed or after one or more cell divisions. In such embodiments, no additional step is required to remove or isolate one or more expression cassettes encoding the gRNA and/or CRISPR-associated proteins from the modified cells. In still other embodiments of the present disclosure, double stranded DNA fragments are also transiently introduced into cells along with one or more expression cassettes encoding gRNA and/or CRISPR-associated proteins. In such embodiments, the introduced double stranded DNA fragment is provided in an amount sufficient to modify the cell, but does not persist after a desired period of time has elapsed or after one or more cell divisions.
In another embodiment, mRNA encoding a CRISPR-associated protein is introduced into a cell. In such embodiments, the mRNA is translated to produce an amount of CRISPR-associated protein sufficient to modify the cell in the presence of at least one gRNA (expression of the gRNA being driven by the synthetic snRNA promoter (or functional fragment thereof) as described herein), but does not persist after a desired period of time or after one or more cell divisions. In such embodiments, no further step is required to remove or isolate CRISPR-associated proteins from the modified cells.
In one embodiment of the present disclosure, a catalytically active CRISPR-associated protein is prepared in vitro prior to introduction into a plant cell comprising at least one gRNA whose expression is driven by a synthetic snRNA promoter (or a functional fragment thereof) as described herein. Methods of preparing CRISPR-associated proteins depend on their type and nature and are known to those skilled in the art. For example, if the CRISPR-associated protein is large and monomeric, the active form of the CRISPR-associated protein can be produced via bacterial expression, in vitro translation, in yeast cells, in insect cells, or by other protein production techniques known in the art. After expression, the CRISPR-associated protein is isolated, refolded (if desired), purified and optionally treated to remove any purification tags, such as His-tags. Once a crude, partially purified or more fully purified CRISPR-associated protein is obtained, the protein can be introduced into, for example, a plant cell by electroporation, by bombardment with particles coated with the CRISPR-associated protein, by chemical transfection or by some other means of transport across the cell membrane. Methods for introducing proteins and nucleic acids into plant cells are well known in the art. Nanoparticles, which may deliver a combination of active proteins and nucleic acids, may also be used to deliver proteins. Once a sufficient amount of CRISPR-associated protein is introduced, along with the appropriate gRNA, such that an effective amount of in vivo activity is present, the target sequence within the genome is cleaved. It is also recognized that one skilled in the art can produce CRISPR-associated proteins that are inactive but are activated in vivo by natural processing mechanisms; such CRISPR-associated proteins are also contemplated by the present disclosure.
In another embodiment, constructs that will transiently express the gRNA and/or CRISPR-associated protein are generated and introduced into a plant cell. In yet another embodiment, the construct will produce a sufficient amount of the gRNA and/or CRISPR-associated protein to effectively modify the desired episomal or genomic target site. For example, the present disclosure contemplates the preparation of constructs that can be bombarded, electroporated, chemically transfected, or transported into a plant cell by some other means. Such constructs may have several useful properties. For example, in one embodiment, the construct may replicate in a bacterial host, so that the construct can be produced and purified in an amount sufficient for transient expression. In another embodiment, the construct may encode a herbicide resistance gene to allow selection of the construct in a host, or the construct may further comprise an expression cassette to provide expression of the gRNA and/or CRISPR-associated protein in a plant. In another embodiment, the CRISPR-associated protein expression cassette can contain a promoter region, a 5' untranslated region, an optional intron to aid expression, a multiple cloning site to allow easy introduction of a DNA sequence encoding a CRISPR-associated protein, and a 3' UTR. In particular embodiments, the promoter of the CRISPR-associated protein expression cassette can be a constitutive promoter, a tissue-specific promoter, or another type of promoter expressed in a plant cell. In another embodiment, a gRNA expression cassette can contain a snRNA promoter (or functional fragment thereof) as described herein, a gRNA coding sequence, and a short poly-T region that terminates transcription. In some embodiments, the promoter in the gRNA expression cassette may be a synthetic snRNA promoter selected from the group consisting of SEQ ID NOS: 1-5.
In some embodiments, the promoter in the gRNA expression cassette may be a synthetic snRNA promoter selected from the group consisting of SEQ ID NOS: 6-10. In some embodiments, it may be beneficial to include a unique restriction site at one or each end of the expression cassette to allow for the production and isolation of linear expression cassettes, which may be free of other construct elements. In certain embodiments, the untranslated leader may be a plant-derived untranslated region. When the expression cassette is transformed or transfected into monocot or dicot cells, the use of introns, which may be of plant origin, is contemplated.
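The gRNA expression cassette layout described above (snRNA promoter, gRNA coding sequence, short poly-T terminator) can be sketched in Python as follows. This is an illustration under stated assumptions rather than a design from this disclosure. The function names, the seven-T terminator length, and the tolerance of up to four consecutive internal T residues are hypothetical defaults. The internal T-run check reflects the general design concern that a long T run inside the gRNA coding sequence could act as a premature terminator.

```python
import re


def longest_t_run(seq):
    """Length of the longest run of consecutive T residues in a DNA string."""
    runs = re.findall(r"T+", seq.upper())
    return max((len(run) for run in runs), default=0)


def assemble_grna_cassette(promoter, grna_coding, poly_t_len=7, max_internal_t_run=4):
    """Return promoter + gRNA coding sequence + poly-T terminator.

    Raises ValueError if the gRNA coding sequence contains a T run longer
    than max_internal_t_run. Both thresholds are assumed, adjustable values.
    """
    if poly_t_len <= max_internal_t_run:
        raise ValueError("terminator must be longer than the tolerated internal T run")
    if longest_t_run(grna_coding) > max_internal_t_run:
        raise ValueError("gRNA coding sequence contains a long internal T run")
    return promoter.upper() + grna_coding.upper() + "T" * poly_t_len
```

A call such as `assemble_grna_cassette(promoter_seq, spacer + scaffold)` returns the linear cassette sequence. Here `promoter_seq`, `spacer`, and `scaffold` stand for user-supplied sequences, for instance one of the synthetic snRNA promoters of SEQ ID NOS: 1-10.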
In other embodiments, one or more elements in the construct include a spacer complementary to a target site contained within an episome or genomic sequence. This facilitates CRISPR-mediated modification within the expression cassette enabling removal and/or insertion of elements such as promoters and transgenes.
In another approach, the transient expression construct may be introduced into a plant cell using a bacterial or viral host. For example, Agrobacterium is one such bacterial host that can be used to introduce a transient expression construct into a host plant cell. When a bacterial, viral or other host system is used, the transient expression construct is included in the host system. For example, if an Agrobacterium host system is used, the transient expression cassette will be flanked by one or more T-DNA borders and cloned into a binary construct. Many such construct systems have been identified in the art (reviewed in Hellens et al., 2000).
In some embodiments, methods of selecting modified plant cells may be employed whereby one or more of the gRNA and/or CRISPR-associated protein components of the CRISPR system are transiently introduced in an amount sufficient to modify the plant cells. In one such method, a second nucleic acid molecule comprising a selectable marker is co-introduced with the transient gRNA and/or CRISPR-associated protein. In this embodiment, the co-introduced marker may be part of a molecular strategy that introduces the marker at the target site. For example, a co-introduced marker can be used to disrupt a target gene by inserting between genomic target sites. In another embodiment, the co-introduced nucleic acid can be used to produce a visual marker protein, such that transfected cells can be cell sorted or isolated by some other method. In yet another embodiment, the co-introduced marker can be randomly integrated, or directed via a second gRNA:CRISPR-associated protein complex to integrate at a site independent of the primary genomic target site. In another embodiment, the co-introduced molecule can be targeted to a particular locus at the genomic target site through a double-strand break repair pathway, which can include, for example, non-homologous end joining (NHEJ), micro-homology mediated end joining (MMEJ), homologous recombination, synthesis-dependent strand annealing (SDSA), single-strand annealing (SSA), or a combination thereof. In the above embodiments, the co-introduced marker can be used to identify or select cells that may have been exposed to gRNA and/or CRISPR-associated proteins and thus may have been modified by CRISPR.
Stable expression of CRISPR
In another embodiment, one or more expression constructs encoding one or more components of a CRISPR system (e.g., CRISPR-associated proteins and their cognate gRNAs) are stably transformed into a plant cell. In this embodiment, the design of the transformation construct provides flexibility as to when and under what conditions the gRNA and/or CRISPR-associated protein is expressed. Furthermore, the transformation construct may be designed to comprise a selectable or visible marker that provides a means to isolate or efficiently select cell lines that contain one or more expression constructs encoding one or more components of the CRISPR system and/or that have been modified by the CRISPR system.
Cell transformation systems have been described in the art and include a variety of transformation constructs. For example, for plant transformation, the two major methods are Agrobacterium-mediated transformation and particle gun bombardment-mediated (e.g., biolistic) transformation. In both cases, the nucleotide sequence encoding a component of the CRISPR system is introduced via one or more expression cassettes. In another embodiment, the CRISPR-associated protein expression cassette can contain a promoter region, a 5' untranslated region, an optional intron to aid expression, a multiple cloning site to allow easy introduction of a DNA sequence encoding a CRISPR-associated protein, and a 3' UTR. In particular embodiments, the promoter of the CRISPR-associated protein expression cassette can be a constitutive promoter, a tissue-specific promoter, a developmentally regulated promoter, a cell cycle regulated promoter, or a germline specific promoter. In another embodiment, a gRNA expression cassette can contain a snRNA promoter (or functional fragment thereof) as described herein, a gRNA coding sequence, and a short poly-T region that terminates transcription. In a particular embodiment, the promoter in the gRNA expression cassette may be a synthetic snRNA promoter selected from the group consisting of SEQ ID NOS: 1-5. In some embodiments, the promoter in the gRNA expression cassette may be a synthetic snRNA promoter selected from the group consisting of SEQ ID NOS: 6-10.
For particle bombardment or protoplast transformation, the expression cassette may be an isolated linear fragment or may be part of a larger construct that may contain bacterial replication elements, bacterial selection markers, or other elements. The one or more gRNA and/or CRISPR-associated protein expression cassettes can be physically linked to the marker cassette or can be mixed with a second nucleic acid molecule encoding the marker cassette. In some embodiments, the marker cassette consists of the elements necessary to express a visual or selectable marker that allows for efficient selection of transformed cells. In the case of Agrobacterium-mediated transformation, one or more expression cassettes may be adjacent to or between flanking T-DNA borders and contained in a binary construct. In another embodiment, one or more expression cassettes may be outside of the T-DNA. The presence of one or more expression cassettes in a cell can be manipulated by positive or negative selection protocols. Furthermore, the selectable marker cassette may be within or adjacent to the same T-DNA borders, or may be elsewhere within a second T-DNA on the binary construct (e.g., a 2 T-DNA system).
In some embodiments, cells that have been transiently or stably modified by the CRISPR system are passaged with unmodified cells. The cells may be subdivided into lines of independent clonal origin or may be used to regenerate plants of independent origin. Individual plants or clonal populations regenerated from these cells can be used to generate lines of independent origin. At any of these stages, molecular assays can be used to screen for cells, plants or lines that have been modified. The modified cells, plants or lines continue to proliferate while the unmodified cells, plants or lines are discarded. In some embodiments, the presence of an active CRISPR system in a cell is essential to ensure the efficiency of the overall process.
Transformation methods
Methods for transforming or transfecting cells are well known in the art. Methods for plant transformation using Agrobacterium or DNA-coated particles are well known in the art and are incorporated herein. Suitable methods for transforming host cells for use in the present disclosure are believed to include virtually any method by which DNA can be introduced into the cell, such as Agrobacterium-mediated transformation (U.S. Pat. Nos. 5,563,055, 5,591,616, 5,693,512, 5,824,877, 5,981,840 and 6,384,301) and acceleration of DNA-coated particles (U.S. Pat. Nos. 5,015,580, 5,550,318, 5,538,880, 6,160,208, 6,399,861 and 6,403,865), and the like. By applying these techniques, almost any kind of cell can be stably transformed.
Various methods of selecting transformed cells have been described. For example, a drug resistance marker (such as neomycin phosphotransferase protein) may be utilized to confer resistance to kanamycin or 5-enolpyruvylshikimate phosphate synthase may be used to confer tolerance to glyphosate. In another embodiment, carotenoid synthases are used to produce an orange pigment that can be visually identified. Each of these three exemplary methods can be effectively used to isolate cells or plants or tissues thereof that have been transformed and/or modified by CRISPR.
When a nucleic acid sequence encoding a selectable or screenable marker is inserted into a genomic target site, the marker can be used to detect the presence or absence of CRISPR or its activity. This may be useful once a cell has been modified by CRISPR and it is desirable to recover genetically modified cells that no longer contain CRISPR, or plants regenerated from such modified cells. In other embodiments, the marker may be intentionally designed to integrate at the genomic target site such that it can be used to track modified cells independently of CRISPR. The marker may be a gene that provides a visually detectable phenotype, for example in a seed, to allow rapid identification of seeds carrying or lacking the CRISPR expression cassette.
The present disclosure provides means to regenerate plants from cells having a repaired double strand break within a genomic target site. Regeneration may then be used to propagate additional plants.
The present disclosure additionally provides novel plant transformation constructs and expression cassettes, including synthetic snRNA promoters and combinations thereof, as well as CRISPR-associated genes and gRNA expression cassettes. The present disclosure further provides methods of obtaining plant cells, whole plants, and seeds or embryos that have been specifically modified using CRISPR-mediated cleavage. The present disclosure also relates to a novel plant cell containing a CRISPR-associated Cas endonuclease expression construct and a gRNA expression cassette.
Targeting using blunt-ended oligonucleotides
In certain embodiments, a CRISPR system (e.g., a CRISPR/Cas9 system or a CRISPR/Cas12a system) can be used to target the 5' insertion of a blunt-ended double-stranded DNA fragment into a genomic target site of interest. In some embodiments, CRISPR-mediated endonuclease activity can introduce double-strand breaks (DSBs) in selected genomic target sites, and DNA repair (such as micro-homology driven non-homologous end joining DNA repair) results in insertion of the blunt-ended double-stranded DNA fragment into the DSB. In some embodiments, blunt-ended double-stranded DNA fragments may be designed to have 1-10 bp microhomologies on the 5' and 3' ends of the DNA fragment that correspond to the 5' and 3' flanking sequences at the cleavage site in the genomic target site.
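The microhomology design described above can be illustrated with a short Python sketch. It assumes a single-stranded, top-strand representation of the target and a known cut position. The function name `add_microhomology` and the coordinate convention (the cut lies between `target[cut_index - 1]` and `target[cut_index]`) are hypothetical choices made for illustration only.

```python
def add_microhomology(insert, target, cut_index, arm_len=5):
    """Flank an insert with short arms copied from either side of a cut site.

    arm_len is restricted to 1-10 bp, matching the range given in the text.
    The cut is assumed to fall between target[cut_index - 1] and target[cut_index].
    """
    if not 1 <= arm_len <= 10:
        raise ValueError("arm_len must be between 1 and 10 bp")
    if cut_index < arm_len or cut_index + arm_len > len(target):
        raise ValueError("cut site is too close to the end of the target sequence")
    left_arm = target[cut_index - arm_len:cut_index]   # 5' flanking bases
    right_arm = target[cut_index:cut_index + arm_len]  # 3' flanking bases
    return (left_arm + insert + right_arm).upper()
```

The returned string represents the top strand of the blunt-ended fragment; the complementary strand would be synthesized and annealed separately.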
Use of CRISPR system in molecular breeding
In some embodiments, genomic knowledge is used for targeted gene alterations of the genome. At least one gRNA can be designed to target at least one region of the genome to disrupt that region of the genome. This aspect of the disclosure may be particularly useful for genetic alterations. The resulting plants may have a modified phenotype or other trait, depending on the gene or genes that have been altered. The previously characterized mutant alleles or introduced transgenes can be targeted for CRISPR-mediated modification enabling the production of improved mutants or transgenic lines.
In another embodiment, the gene targeted for deletion or disruption may be a transgene that has been previously introduced into the target plant or plant cell. This has the advantage of allowing the introduction of improved forms of transgenes or allowing disruption of the selectable marker coding sequence. In yet another embodiment, the gene targeted for disruption by the CRISPR system is at least one transgene that is introduced on the same construct or expression cassette as one or more other transgenes of interest and resides at the same locus as another transgene. Those skilled in the art understand that this type of CRISPR-mediated modification can result in a deletion or insertion of additional sequences. Thus, in certain embodiments, it may be preferred to produce a plurality of plants or plant cells in which a deletion has occurred, and screen such plants or plant cells using standard techniques to identify a particular plant or plant cell that has minimal alteration in its genome following CRISPR-mediated modification. Such screening may utilize genotypic and/or phenotypic information. In such embodiments, a particular transgene may be disrupted leaving the remaining transgenes intact. This avoids having to create new transgenic lines containing the desired transgene without the undesired transgene.
In another aspect, the present disclosure includes a method for inserting a DNA fragment of interest into a specific locus of a plant genome, wherein the DNA fragment of interest is from the genome of the plant or is heterologous with respect to the plant. The present disclosure allows for selection or targeting of specific regions of a genome for nucleic acid (e.g., transgene) stacking (e.g., large loci). Thus, a targeted region of the genome may exhibit linkage of at least one transgene to at least one phenotype trait of interest, and may also result in the production of linkage blocks that facilitate transgene stacking and integration of the transgene trait, and/or the production of linkage blocks while also allowing for integration of conventional traits.
Use of CRISPR systems in trait integration
Targeted insertion of a DNA fragment of interest into at least one genomic target site by CRISPR-mediated cleavage allows multiple nucleic acids of interest (e.g., a trait stack) to be added to the plant genome by targeted integration, either at the same site or at different sites. The site of targeted integration may be selected based on knowledge of potential breeding values, transgene performance at the location, potential recombination rates at the location, transgenes present in the linkage block, or other factors. Once the stacked plant is assembled, it can be used as a trait donor for crossing with germplasm that is being advanced in a breeding pipeline, or it can be advanced directly in the breeding pipeline.
The present disclosure includes methods for inserting at least one nucleic acid of interest into at least one locus, wherein the nucleic acid of interest is from the genome of a plant, such as a QTL or allele, or is of transgenic origin. The targeted region of the genome may thus exhibit linkage of at least one transgene to a haplotype of interest associated with at least one phenotypic trait (as described in U.S. patent application publication No. 2006/0282911), generation of a linkage block that facilitates integration of the transgene stack and the transgene trait, generation of a linkage block that facilitates integration of the QTL or haplotype stack and the conventional trait, and the like.
In another embodiment of the present disclosure, multiple unique gRNAs can be used to modify multiple loci contained within a linkage block on a chromosome by exploiting knowledge of genomic sequence information and the ability to design custom gRNAs as described in the art. gRNAs specific for, or capable of being directed to, genomic target sites upstream of loci containing non-target alleles are designed or engineered as desired. A second gRNA specific for, or capable of being directed to, a genomic target site downstream of a target locus containing a non-target allele is also designed or engineered. The gRNAs can be designed such that they are complementary to genomic regions that have no homology to non-target loci that contain target alleles. The two gRNAs can be introduced into a cell using one of the methods described above.
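Choosing one target site upstream and one downstream of a locus can be sketched in Python as shown below. The sketch rests on assumptions not stated in this disclosure: a Cas9-type nuclease recognizing a 20-nt protospacer followed by an NGG PAM, a search limited to the forward strand, and no specificity (off-target) filtering. The function names `find_protospacers` and `flanking_pair` are hypothetical.

```python
import re


def find_protospacers(seq, start, end, spacer_len=20):
    """List (position, spacer) for forward-strand spacer + NGG sites in seq[start:end]."""
    region = seq[start:end].upper()
    # The lookahead allows overlapping candidate sites to be reported.
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}})[ACGT]GG)")
    return [(start + m.start(), m.group(1)) for m in pattern.finditer(region)]


def flanking_pair(seq, locus_start, locus_end, spacer_len=20):
    """Pick the nearest candidate site on each side of seq[locus_start:locus_end].

    Returns ((up_pos, up_spacer), (down_pos, down_spacer)), or None if either
    side has no candidate site.
    """
    upstream = find_protospacers(seq, 0, locus_start, spacer_len)
    downstream = find_protospacers(seq, locus_end, len(seq), spacer_len)
    if not upstream or not downstream:
        return None
    return upstream[-1], downstream[0]
```

In practice each candidate spacer would also be checked against the rest of the genome, consistent with the requirement above that the gRNAs have no homology to non-target loci.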
The ability to perform targeted integration depends on the role of the gRNA: CRISPR-associated protein. This advantage provides a method for engineering a plant of interest (including a plant or cell) comprising at least one genomic modification.
The custom gRNAs can be utilized in a CRISPR system to generate at least one trait donor carrying a custom genome modification event that is then crossed into at least one second plant of interest, wherein CRISPR-associated protein delivery can be combined with the gRNAs of interest for genome editing. In other aspects, one or more plants of interest are directly transformed with a CRISPR system and at least one double-stranded DNA fragment of interest for targeted insertion. It should be appreciated that this method can be performed in a variety of cells, tissues and developmental types, including gametes of plants. It is also contemplated that one or more of the elements described herein may be used in combination with a promoter specific to a particular cell, tissue, plant part, and/or developmental stage, such as a meiosis-specific promoter.
Furthermore, the present disclosure contemplates targeting transgene elements already present within the genome for deletion or disruption. This allows, for example, the introduction of improved forms of transgenes, or the removal of selection markers. In yet another embodiment, the gene targeted for disruption by CRISPR-mediated cleavage is at least one transgene that is introduced onto the same construct or expression cassette as another transgene or one or more other transgenes of interest and resides at the same locus as the other transgene.
In one aspect, the present disclosure provides a method for modifying a locus of interest in a cell comprising (a) identifying at least one locus of interest within a DNA sequence; (b) introducing into the cell an expression cassette comprising a synthetic snRNA promoter selected from the group consisting of SEQ ID NOs: 1-10 operably linked to a nucleotide sequence encoding a gRNA; and an expression cassette comprising a plant-expressible promoter operably linked to a nucleic acid sequence encoding a CRISPR-associated protein, wherein the gRNA and/or CRISPR-associated protein is transiently or stably expressed; (d) determining CRISPR-mediated modification in DNA constituting or flanking the locus of interest in the cell; and (e) identifying a cell, or a progeny cell thereof, comprising a modification in the locus of interest.
Another aspect provides a method for modifying a plurality of loci of interest in a cell comprising (a) identifying a plurality of loci of interest within a genome; (b) introducing into at least one cell a plurality of expression cassettes each comprising a synthetic snRNA promoter selected from the group consisting of SEQ ID NOs: 1-10 operably linked to a nucleotide sequence encoding a gRNA, wherein the synthetic snRNA promoters are independently selected; and at least one expression cassette comprising a plant-expressible promoter operably linked to a nucleic acid sequence encoding a CRISPR-associated protein according to the disclosure, wherein the cell comprises the genomic target sites, and the gRNAs and CRISPR-associated protein are transiently or stably expressed and produce one or more modified loci comprising at least one CRISPR-mediated cleavage event; (d) determining CRISPR-mediated modifications in DNA constituting or flanking each locus of interest in the cell; and (e) identifying cells comprising a modified nucleotide sequence in the locus of interest, or cells that are progeny thereof.
The present disclosure further contemplates sequential modification of one locus of interest by two or more grnas and CRISPR-associated proteins according to the present disclosure. Genes or other sequences added by such first CRISPR-mediated genome modification may be retained, further modified or removed by the second CRISPR-mediated genome modification.
Accordingly, the present invention includes compositions and methods for modifying loci of interest in crop plants such as maize (corn; Zea mays subsp. mays); maize varieties (flour corn (Zea mays var. amylacea), popcorn (Zea mays var. everta), dent corn (Zea mays var. indentata), flint corn (Zea mays var. indurata), sweet corn (Zea mays var. saccharata and Zea mays var. rugosa), waxy corn (Zea mays var. ceratina), amylomaize (Zea mays), pod corn (Zea mays var. tunicata Larrañaga ex A. St. Hil.), and striped maize (Zea mays var.)); soybean (Glycine max); cotton (Gossypium hirsutum; Gossypium sp.); peanut (Arachis hypogaea); barley (Hordeum vulgare); oat (Avena sativa); orchard grass (Dactylis glomerata); rice (Oryza sativa), including indica and japonica varieties; sorghum (Sorghum bicolor); sugarcane (Saccharum sp.); tall fescue (Festuca arundinacea); turfgrass species (e.g., Agrostis stolonifera, Poa pratensis, Stenotaphrum secundatum); wheat (Triticum aestivum); alfalfa (Medicago sativa); members of the genus Brassica, including but not limited to rapeseed (Brassica napus) and Brassica rapa, for example bok choy (B. rapa subsp. chinensis), bomdong (Brassica rapa var. glabra), choy sum, field mustard (Brassica rapa subsp. oleifera), komatsuna, rapini, tatsoi (Brassica rapa subsp. narinosa), turnip (Brassica rapa subsp. rapa), and yellow sarson; Chinese cabbage, turnip, rapini, and komatsuna (Brassica rapa, synonym Brassica campestris); Mallorca cabbage (Brassica balearica); Abyssinian mustard or Abyssinian cabbage (Brassica carinata), used to produce biodiesel; Indian mustard (Brassica juncea); broadbeaked mustard (Brassica narinosa); black mustard; kale, cabbage, collard greens, broccoli, cauliflower, kai-lan, Brussels sprouts, and kohlrabi (Brassica oleracea); tender green or mustard spinach (Brassica perviridis); brown mustard (Brassica rupestris); seventop turnip (Brassica septiceps); and Asian mustard; peppers (e.g., black pepper, white pepper, and green pepper (Piper nigrum), cubeb (Piper cubeba), Indian long pepper (Piper longum), Indonesian long pepper (Piper retrofractum), voatsiperifery (Piper borbonense), Ashanti pepper, and the Capsicum annuum cultivars banana pepper, bell pepper, cayenne pepper, jalapeño pepper, and Florina pepper; chili peppers (cultivars of Capsicum annuum, Capsicum frutescens, Capsicum chinense, Capsicum pubescens, and Capsicum baccatum); and Datil pepper (a Capsicum chinense cultivar)); bean species (e.g., broad bean (Vicia faba); common bean, including pinto bean, kidney bean, black bean, Appaloosa bean, and green beans, among others (Phaseolus vulgaris); tepary bean (Phaseolus acutifolius); runner bean (Phaseolus coccineus); lima bean (Phaseolus lunatus); Phaseolus dumosus, recognized as a separate species in 1995; moth bean (Vigna aconitifolia); adzuki bean (Vigna angularis); urad bean; mung bean; cowpea, also including black-eyed pea, yardlong bean, and others (Vigna unguiculata); chickpea or garbanzo bean (Cicer arietinum); pea (Pisum sativum); Indian pea (Lathyrus sativus); tuberous pea (Lathyrus tuberosus); lentil; winged bean (Psophocarpus tetragonolobus); pigeon pea (Cajanus cajan); velvet bean (Mucuna pruriens); guar (Cyamopsis tetragonoloba); jack bean (Canavalia ensiformis); sword bean (Canavalia gladiata); horse gram (Macrotyloma uniflorum); tarwi (Lupinus mutabilis); and lupini bean (Lupinus albus)); members of the Cucurbitaceae family (cucurbit family), such as squash, pumpkin, zucchini, and some gourds (Cucurbita); calabash (Lagenaria); watermelon (Citrullus lanatus) and colocynth (Citrullus colocynthis); cucumber (Cucumis sativus), various melons (Cucumis melo), and horned melon (Cucumis metuliferus); spinach (Spinacia oleracea); carrot (Daucus carota subsp. sativus); tomato (Solanum lycopersicum); onion (Allium cepa L.); radish (Raphanus raphanistrum subsp. sativus); potato (Solanum tuberosum); ornamental plants; and oilseed crops such as soybean, canola, oilseed rape, olive, cottonseed, sunflower, palm, and peanut.
Genomic modifications may include a modified linkage block, linking of two or more QTLs, disruption of the linkage of two or more QTLs, gene insertion, gene replacement, gene conversion, deletion or disruption of a gene, transgenic event selection, transgenic trait donor selection, transgene replacement, or targeted insertion of at least one nucleic acid of interest.
Definitions
The following definitions and methods are provided to better define the present disclosure and to guide those of ordinary skill in the art in practicing the present disclosure. Unless otherwise indicated, terms are to be understood according to conventional usage by those of ordinary skill in the relevant art. Definitions of common terms in molecular biology can also be found in Alberts et al., Molecular Biology of The Cell, 5th edition, Garland Science Publishing, Inc.: New York, 2007; Rieger et al., Glossary of Genetics: Classical and Molecular, 5th edition, Springer-Verlag: New York, 1991; King et al., A Dictionary of Genetics, 6th edition, Oxford University Press: New York, 2002; and Lewin, Genes IX, Oxford University Press: New York, 2007. The nomenclature for DNA bases as set forth at 37 CFR § 1.822 is used.
As used herein, a "synthetic nucleotide sequence" or "artificial nucleotide sequence" is a nucleotide sequence that is not known to occur in nature or that is not naturally occurring. The gene regulatory elements of the present invention comprise synthetic nucleotide sequences. Preferably, a synthetic nucleotide sequence shares little or no extended homology with natural sequences. Extended homology, as used herein, generally refers to 100% sequence identity extending beyond about 25 nucleotides of contiguous sequence.
The term "isolated DNA molecule", or an equivalent term or phrase, as used herein means that the DNA molecule is present alone or in combination with other compositions, but not in its natural environment. For example, nucleic acid elements naturally occurring in the DNA of the genome of an organism (such as coding sequences, intron sequences, untranslated leader sequences, promoter sequences, transcription termination sequences, etc.) are not considered "isolated" as long as the element is located within the genome of the organism and at the position within the genome in which it is naturally found. However, each of these elements, and sub-portions of these elements, will be "isolated" within the scope of this disclosure as long as the element is not within the genome of the organism and at the position within the genome in which it is naturally found. In one embodiment, the term "isolated" refers to a DNA molecule that is at least partially separated from the nucleic acids that normally flank the DNA molecule in its native or natural state. Thus, for example, DNA molecules fused, as a result of recombinant techniques, to regulatory or coding sequences with which they are not normally associated are considered isolated herein. Such molecules are considered isolated even when integrated into the chromosome of a host cell or present in a nucleic acid solution with other DNA molecules, because they are not in their native state. For the purposes of this disclosure, any transgenic nucleotide sequence (i.e., a nucleotide sequence of DNA inserted into the genome of a plant or bacterial cell or present in an extrachromosomal construct) will be considered an isolated nucleotide sequence, whether it is present in a plasmid or similar structure used to transform the cell, in the genome of a plant or bacterium, or in a detectable amount in a tissue, progeny, biological sample, or commodity product derived from a plant or bacterium.
"heterologous DNA molecule" refers to a DNA molecule that is heterologous to an operably linked polynucleotide sequence.
As used herein, the term "operably linked" refers to a first DNA molecule being linked to a second DNA molecule, wherein the first and second DNA molecules are arranged such that the first DNA molecule affects the function of the second DNA molecule. The two DNA molecules may or may not be part of a single, contiguous DNA molecule, and may or may not be contiguous. For example, a promoter is operably linked to a DNA molecule if it regulates the transcription of the DNA molecule of interest in a cell. For example, a leader sequence is operably linked to a DNA sequence when it is capable of affecting the transcription or translation of the DNA sequence.
As used herein, a "recombinant DNA molecule" is a DNA molecule comprising a combination of DNA molecules that would not naturally occur together without human intervention. For example, a recombinant DNA molecule may be a DNA molecule consisting of at least two DNA molecules that are heterologous with respect to each other, a DNA molecule comprising a DNA sequence that deviates from DNA sequences that exist in nature, a DNA molecule comprising a synthetic DNA sequence, or a DNA molecule that has been incorporated into the DNA of a host cell by genetic transformation or gene editing.
As used herein, the term "sequence identity" refers to the extent to which two optimally aligned polynucleotide sequences or two optimally aligned polypeptide sequences are identical. An optimal sequence alignment is created by manually aligning two sequences (e.g., a reference sequence and another sequence) to maximize the number of nucleotide matches in the alignment, with appropriate internal nucleotide insertions, deletions, or gaps. As used herein, the term "reference sequence" refers to a DNA sequence provided as any of SEQ ID NOs: 1-10.
As used herein, the term "percent sequence identity" or "percent identity" is the identity fraction multiplied by 100. The "identity fraction" of a sequence optimally aligned to a reference sequence is the number of nucleotide matches in the optimal alignment divided by the total number of nucleotides in the reference sequence, e.g., the total number of nucleotides in the full length of the entire reference sequence. Thus, in one embodiment of the invention there is provided a DNA molecule comprising a sequence that, when optimally aligned with a reference sequence provided herein as any of SEQ ID NOs: 1-10, is at least about 85% identical, at least about 86% identical, at least about 87% identical, at least about 88% identical, at least about 89% identical, at least about 90% identical, at least about 91% identical, at least about 92% identical, at least about 93% identical, at least about 94% identical, at least about 95% identical, at least about 96% identical, at least about 97% identical, at least about 98% identical, at least about 99% identical, or at least about 100% identical to the reference sequence. In a further embodiment, a sequence having a percent identity to any one of SEQ ID NOs: 1-10 may be defined as exhibiting the promoter activity possessed by the starting sequence from which it is derived. A sequence having a percent identity to any of SEQ ID NOs: 1-10 may further comprise a "minimal promoter" that provides a basal level of transcription and comprises a TATA box or equivalent sequence for recognition and binding of the RNA polymerase III complex to initiate transcription. According to the invention, promoters, promoter variants or promoter fragments can be analyzed for the presence of known promoter elements, i.e., DNA sequence features such as TATA boxes and other known transcription factor binding site motifs.
The skilled artisan can use the identification of such known promoter elements to design variants of promoters having similar expression patterns as the original promoter.
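The percent identity calculation defined above (nucleotide matches in the alignment divided by the number of nucleotides in the reference sequence, multiplied by 100) can be illustrated with a minimal sketch. The function name and the toy sequences are hypothetical, and the sketch assumes the two sequences have already been optimally aligned, with "-" marking gaps; it does not perform the alignment itself.

```python
def percent_identity(aligned_query: str, aligned_reference: str) -> float:
    """Matches in the alignment / nucleotides in the reference * 100.

    Both inputs are assumed to be pre-aligned strings of equal length,
    with '-' used for gap positions (an assumption of this sketch).
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    query = aligned_query.upper()
    reference = aligned_reference.upper()
    # A match is an aligned column where both strands carry the same base.
    matches = sum(1 for q, r in zip(query, reference) if q == r and r != "-")
    # The denominator is the full reference length, gaps excluded.
    reference_length = sum(1 for r in reference if r != "-")
    if reference_length == 0:
        raise ValueError("reference contains no nucleotides")
    return 100.0 * matches / reference_length


# Toy example: one substitution and one gap against a 10-nt reference.
print(percent_identity("ACGTTCG-AC", "ACGTACGTAC"))
```

Because the denominator is the reference length rather than the alignment length, a short fragment aligned to a long reference scores low even when every aligned base matches.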
The term "genome" encompasses not only chromosomal DNA found within the nucleus, but also organelle DNA found within subcellular components of the cell (e.g., mitochondria or plastids).
As used herein, the term "genome editing" or "editing" refers to any modification of a nucleotide sequence in a site-specific manner. In the present disclosure, genome editing techniques include the use of endonucleases, recombinases, transposases, helicases, and any combination thereof. In one aspect, "modification" includes the hydrolytic deamination of cytidine or deoxycytidine to uridine or deoxyuridine, respectively. In some embodiments, the sequence-specific editing system comprises an adenine deaminase. In one aspect, "modification" includes hydrolytic deamination of adenine or adenosine. In one aspect, "modification" includes the hydrolytic deamination of adenosine or deoxyadenosine to inosine or deoxyinosine, respectively. In one aspect, "modifying" includes inserting at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 25, at least 50, at least 100, at least 200, at least 300, at least 400, at least 500, at least 750, at least 1000, at least 1500, at least 2000, at least 3000, at least 4000, at least 5000, or at least 10,000 nucleotides. In another aspect, "modifying" includes deleting at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 25, at least 50, at least 100, at least 200, at least 300, at least 400, at least 500, at least 750, at least 1000, at least 1500, at least 2000, at least 3000, at least 4000, at least 5000, or at least 10,000 nucleotides. In another aspect, "modifying" includes inverting at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 25, at least 50, at least 100, at least 200, at least 300, at least 400, at least 500, at least 750, at least 1000, at least 1500, at least 2000, at least 3000, at least 4000, at least 5000, or at least 10,000 nucleotides. 
In yet another aspect, "modifying" includes substituting at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 25, at least 50, at least 100, at least 200, at least 300, at least 400, at least 500, at least 750, at least 1000, at least 1500, at least 2000, at least 3000, at least 4000, at least 5000, or at least 10,000 nucleotides. In yet another aspect, "modifying" includes replicating at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 25, at least 50, at least 100, at least 200, at least 300, at least 400, at least 500, at least 750, at least 1000, at least 1500, at least 2000, at least 3000, at least 4000, at least 5000, or at least 10,000 nucleotides. In some embodiments, "modifying" includes replacing "C", "G" or "T" with "a" in the nucleic acid sequence. In some embodiments, "modification" includes substitution of "a", "G" or "T" with "C" in the nucleic acid sequence. In some embodiments, "modifying" includes replacing "a", "C" or "T" with "G" in the nucleic acid sequence. In some embodiments, "modifying" includes replacing "a", "C" or "G" with "T" in the nucleic acid sequence. In some embodiments, "modifying" includes replacing "U" with "C" in the nucleic acid sequence. In some embodiments, "modifying" includes replacing "a" with "G" in the nucleic acid sequence. In some embodiments, "modifying" includes replacing "G" with "a" in the nucleic acid sequence. In some embodiments, "modifying" includes replacing "C" with "T" in the nucleic acid sequence.
As used herein, "target site" refers to a nucleotide sequence (e.g., a protospacer and protospacer adjacent motif (PAM)) at which the gRNA/CRISPR-associated protein system binds and/or is active, and which is located in a DNA sequence selected for targeted modification. The target site may be genic or non-genic. The target site may be in a chromosome, episome, locus, or any other DNA molecule in the genome of the cell (including chromosomal, chloroplast, or mitochondrial DNA, or plasmid DNA). The target site may be an endogenous site in the genome of the cell, or alternatively, the target site may be heterologous to the cell and thus not naturally occurring in the genome of the cell, or the target site may be found in a genomic location that is heterologous to the location in which it naturally occurs.
As used herein, "genomic target site" refers to a target site (e.g., protospacer and Protospacer Adjacent Motif (PAM)) located in a host genome selected for targeted modification.
As used herein, "protospacer" refers to a short DNA sequence (12 bp to 40 bp) that can be targeted by a CRISPR system, which is guided by complementary base pairing with a spacer sequence in a gRNA.
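As an illustration of how a protospacer relates to its PAM, the following sketch scans one strand of a DNA sequence for candidate target sites. It assumes a Cas12a-type system with a TTTV PAM (V = A, C or G) located immediately 5' of the protospacer; the PAM, the default spacer length of 23 nt and the function name are assumptions made for illustration and are not taken from the disclosure.

```python
import re


def find_protospacers(dna: str, spacer_len: int = 23):
    """Return (pam_start, pam, protospacer) tuples for one strand.

    Assumes a TTTV PAM directly 5' of the protospacer (illustrative
    Cas12a-style layout); only the given strand is scanned.
    """
    if not 12 <= spacer_len <= 40:
        raise ValueError("spacer_len is outside the 12-40 bp range")
    dna = dna.upper()
    hits = []
    # A lookahead is used so that overlapping PAMs are all reported.
    for match in re.finditer(r"(?=(TTT[ACG]))", dna):
        start = match.start()
        protospacer = dna[start + 4 : start + 4 + spacer_len]
        # Skip PAMs too close to the end to carry a full protospacer.
        if len(protospacer) == spacer_len:
            hits.append((start, match.group(1), protospacer))
    return hits
```

To cover both strands, the same function can be applied to the reverse complement of the input; mapping those coordinates back to the forward strand is left out of this sketch.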
As used herein, "microhomology" refers to the presence of the same short sequence of bases (1 bp to 10 bp) in different polynucleotide molecules.
As used herein, "codon optimized" refers to a polynucleotide sequence modified to take advantage of codon usage preferences of a particular plant. The modified polynucleotide sequence still encodes a polypeptide that is identical or substantially similar to the original sequence, but uses codon nucleotide triplets that are found at a higher frequency in a particular plant.
As used herein, "non-protein-coding RNA (npcRNA)" refers to a non-coding RNA (ncRNA), which may be a precursor small non-protein-coding RNA or a fully processed non-protein-coding RNA, and which is a functional RNA molecule that is not translated into a protein.
As used herein, "promoter" refers to a nucleic acid sequence located upstream of, or 5' to, the translation initiation codon of the open reading frame (or protein-coding region) of a gene, and that is involved in the recognition and binding of RNA polymerase I, II or III and other proteins (trans-acting transcription factors) to initiate transcription. A "plant promoter" is a native or non-native promoter that is functional in plant cells. Constitutive promoters are functional in most or all tissues of a plant throughout plant development. Tissue-, organ- or cell-specific promoters are expressed only or predominantly in a particular tissue, organ or cell type, respectively. Rather than being expressed "specifically" in a given tissue, plant part or cell type, a promoter may instead display "enhanced" expression, i.e., a higher level of expression, in one cell type, tissue or plant part compared with other parts of the plant. Temporally regulated promoters are functional only or predominantly during certain periods of plant development or at certain times of day, as in the case of genes associated with circadian rhythm, for example. Inducible promoters selectively express an operably linked DNA sequence in response to the presence of an endogenous or exogenous stimulus, for example a chemical compound (chemical inducer), or in response to environmental, hormonal, chemical and/or developmental signals. Inducible or regulated promoters include, for example, promoters regulated by light, heat, stress, flooding or drought, phytohormones, wounding, or chemicals such as ethanol, jasmonate, salicylic acid, or safeners.
As used herein, an "expression cassette" refers to a polynucleotide sequence comprising at least a first polynucleotide sequence capable of initiating transcription of an operably linked second polynucleotide sequence and optionally a transcription termination sequence operably linked to the second polynucleotide sequence.
Palindromic sequences are nucleic acid sequences that are identical whether read from 5 'to 3' on one strand or from 3 'to 5' on the complementary strand with which they form a duplex. A nucleotide sequence is said to be palindromic if it is identical to its reverse complement. Palindromic sequences may form hairpins.
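The reverse-complement test for a palindromic sequence stated above can be written as a short check. This is a minimal sketch that handles only the four unambiguous DNA bases; the function names are illustrative.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True if the sequence equals its own reverse complement."""
    seq = seq.upper()
    return len(seq) > 0 and seq == reverse_complement(seq)


print(is_palindromic("GAATTC"))   # a sequence equal to its reverse complement
print(is_palindromic("GATTACA"))  # a sequence that is not
```

Note that a sequence of odd length cannot satisfy this test, since no base is its own complement.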
In some embodiments, numbers expressing quantities of ingredients, properties such as molecular weight, reaction conditions, and so forth, used to describe and claim certain embodiments of the present disclosure are to be understood as being modified in some instances by the term "about". In some embodiments, the term "about" is used to indicate that a value includes the standard deviation of the mean of the device or method used to determine the value. In some embodiments, the numerical parameters set forth in the written specification and attached claims are approximations that may vary depending upon the desired properties sought to be obtained by the particular embodiment. In some implementations, numerical parameters should be construed in light of the number of reported significant digits and by applying ordinary rounding techniques. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of some embodiments of the disclosure are approximations, the numerical values set forth in the specific examples are reported as precisely as possible. The numerical values presented in some embodiments of the present disclosure may contain certain errors necessarily arising from the standard deviation found in their respective test measurements. Recitation of ranges of values herein are merely intended to serve as a shorthand method of referring individually to each separate value falling within the range. Unless otherwise indicated herein, each individual value is incorporated into the specification as if it were individually recited herein.
In some embodiments, the terms "a/an" and "the" and similar referents used in the context of describing particular embodiments (especially in the context of the following claims) are to be construed to cover both the singular and the plural, unless otherwise indicated. In some embodiments, the term "or" as used herein, including the claims, is used to mean "and/or" unless explicitly indicated to mean only alternatives or alternatives are mutually exclusive.
The terms "comprise", "have" and "include" are open-ended linking verbs. Any forms or tenses of one or more of these verbs, such as "comprises", "comprising", "has", "having", "includes" and "including", are also open-ended. For example, any method that "comprises", "has" or "includes" one or more steps is not limited to possessing only those one or more steps and may also cover other unlisted steps. Similarly, any composition or device that "comprises", "has" or "includes" one or more features is not limited to possessing only those one or more features and may cover other unlisted features.
All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., "such as") provided with respect to certain embodiments herein, is intended merely to better illuminate the disclosure and does not pose a limitation on the scope of the disclosure otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the disclosure.
The grouping of alternative elements or embodiments of the present disclosure disclosed herein should not be construed as limiting. Each group member may be referred to and claimed individually or in any combination with other members of the group or other elements found herein. For convenience or patentability reasons, one or more members of a group may be included in or deleted from the group.
Having described the present disclosure in detail, it will be apparent that modifications, variations, and equivalent embodiments are possible without departing from the scope of the disclosure defined in the appended claims. Further, it should be understood that all examples in this disclosure are provided as non-limiting examples.
Examples
The following examples are included to demonstrate embodiments of the present disclosure. It will be appreciated by those of skill in the art that many changes can be made to the specific embodiments disclosed and still obtain a like or similar result without departing from the concept, spirit and scope of the disclosure. More specifically, it will be apparent that certain agents that are both chemically and physiologically related may be substituted for the agents described herein while the same or similar results would be achieved. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope and concept of the disclosure as defined by the appended claims.
Example 1
Synthetic promoters to express gRNA
The novel synthetic transcriptional regulatory elements are synthetic expression elements designed by algorithmic methods. The synthetic promoter elements of the present invention provide for transcription of small nuclear RNA (snRNA) molecules, such as guide RNA (gRNA) molecules. The designed synthetic snRNA promoter elements have no extended homology to any known naturally occurring nucleic acid sequence, yet affect the transcription of an operably linked DNA sequence in the same manner as a naturally occurring snRNA promoter. The full-length synthetic snRNA promoters of the invention share little sequence identity with each other, ranging from about thirty-eight (38) percent identity to about forty-seven (47) percent identity. Truncated variants of the synthetic snRNA promoters were also generated. The truncated synthetic snRNA promoters likewise share little sequence identity with each other, ranging from about forty-one (41) percent identity to about fifty-one (51) percent identity. The low percent identity between the synthetic snRNA promoters reduces the likelihood of recombination between the promoters and makes them well suited to stacking multiple RNA expression cassettes, wherein each cassette comprises a different synthetic snRNA promoter. Both the full-length and the truncated synthetic snRNA promoters demonstrate the ability to drive expression of gRNAs, as further described in the examples below. Table 1 below shows the synthetic snRNA promoters, the corresponding truncated variants (indicated by "_TR"), and the length of each synthetic snRNA promoter.
Table 1. Full-length and truncated synthetic snRNA promoters.
Synthetic snRNA promoter | SEQ ID NO: | Length (bp) |
P-GSP2262 | 1 | 500 |
P-GSP2268 | 2 | 500 |
P-GSP2269 | 3 | 500 |
P-GSP2272 | 4 | 500 |
P-GSP2273 | 5 | 500 |
P-GSP2262_TR | 6 | 280 |
P-GSP2268_TR | 7 | 300 |
P-GSP2269_TR | 8 | 300 |
P-GSP2272_TR | 9 | 288 |
P-GSP2273_TR | 10 | 282 |
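For readers who want to work with Table 1 programmatically, the sketch below transcribes the table as data, computes how many base pairs each truncation removes, and assigns a different promoter to each gRNA cassette in a stack. The helper names and the assignment strategy (first available promoter per gRNA, in the order supplied) are hypothetical illustrations, not a procedure described in the disclosure.

```python
# Table 1 transcribed as data: promoter name -> (SEQ ID NO, length in bp).
TABLE_1 = {
    "P-GSP2262": (1, 500),
    "P-GSP2268": (2, 500),
    "P-GSP2269": (3, 500),
    "P-GSP2272": (4, 500),
    "P-GSP2273": (5, 500),
    "P-GSP2262_TR": (6, 280),
    "P-GSP2268_TR": (7, 300),
    "P-GSP2269_TR": (8, 300),
    "P-GSP2272_TR": (9, 288),
    "P-GSP2273_TR": (10, 282),
}


def bases_removed_by_truncation(table=TABLE_1):
    """Map each full-length promoter to the bp removed in its _TR variant."""
    removed = {}
    for name, (_, length) in table.items():
        truncated = table.get(name + "_TR")
        if truncated is not None:
            removed[name] = length - truncated[1]
    return removed


def assign_distinct_promoters(grna_names, promoter_names):
    """Give each gRNA cassette a different promoter (hypothetical helper)."""
    unique = list(dict.fromkeys(promoter_names))  # keep order, drop repeats
    if len(unique) < len(grna_names):
        raise ValueError("need at least one distinct promoter per gRNA cassette")
    return dict(zip(grna_names, unique))


print(bases_removed_by_truncation())
```

The assignment helper treats promoter names as the unit of distinctness; whether a full-length promoter and its own truncated variant should share a stack is not addressed in the text and is not enforced here.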
Example 2
Analysis of synthetic snRNA promoter in transfected maize leaf protoplasts
Maize leaf protoplasts were transfected with a plasmid construct comprising an expression cassette for expressing a Cas12a endonuclease driven by a constitutive promoter, and a second expression cassette for expressing a gRNA driven by a synthetic snRNA promoter.
A plasmid construct comprising two transgene cassettes was built using methods known in the art: a first transgene cassette for expressing a nuclear-targeted Cas12a protein, comprising an EXP, EXP-zm.ubqm1:1:9 (SEQ ID NO: 11), operably linked 5' to the coding sequence Cas12a_nls (SEQ ID NO: 12) encoding the nuclear-targeted Cas12a_nls protein (SEQ ID NO: 19), the coding sequence being operably linked 5' to a 3' UTR, t-os.ltp:2 (SEQ ID NO: 13); and a second transgene cassette comprising a synthetic snRNA promoter selected from the group consisting of SEQ ID NOs: 1-10, operably linked 5' to a guide RNA, gRNA-Zm.Bmr3_2691 (SEQ ID NO: 15), comprising the guide RNA spacer NR-Zm.Bmr3_2691 (SEQ ID NO: 14). The gRNA, gRNA-Zm.Bmr3_2691, was designed to guide Cas12a endonuclease cleavage within the brown midrib 3 (Bmr3) genomic sequence (SEQ ID NO: 18). The brown midrib mutations were first described in maize. Plants carrying a brown midrib mutation exhibit reddish-brown pigmentation of the leaf midrib, starting when the plant has four to six leaves. These mutations are known to alter the lignin composition and digestibility of plants and therefore constitute prime candidates in silage maize breeding. The Bmr3 gene encodes the enzyme caffeic acid O-methyltransferase (COMT), which is involved in lignin biosynthesis (Vignols et al., 1995, The Plant Cell, vol. 7, 407-416).
Maize leaf protoplasts were transfected using a PEG-based transfection method similar to those known in the art. To assess the effectiveness of each synthetic snRNA promoter, genomic DNA was isolated from the population of transfected protoplast cells, and primers were used to amplify DNA fragments comprising the cleavage site region, generating amplicon fragments. The sequences of the amplicon fragments were aligned to identify any fragment sequences comprising a mutation (such as a DNA deletion) in the cleavage site region. The presence of such mutations demonstrates the ability of the synthetic snRNA promoter to drive expression of the gRNA.
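The amplicon analysis described above can be sketched in simplified form. The example below flags an amplicon read as edited when the unmodified reference window around the expected cleavage site no longer appears verbatim in the read, and reports the edited fraction of a read population. This is an illustrative simplification, not the analysis used in the example: the function names, the window size, and the assumption that reads are free of sequencing errors are all hypothetical.

```python
def has_edit(read: str, reference: str, cut_index: int, flank: int = 10) -> bool:
    """True if the reference window around the cut site is absent from the read.

    cut_index is the 0-based position of the expected cleavage site in the
    reference amplicon (supplied by the user; an assumption of this sketch).
    """
    window = reference[max(0, cut_index - flank) : cut_index + flank].upper()
    return window not in read.upper()


def edit_fraction(reads, reference: str, cut_index: int, flank: int = 10) -> float:
    """Fraction of reads carrying a change in the cleavage site window."""
    reads = list(reads)
    if not reads:
        return 0.0
    edited = sum(has_edit(read, reference, cut_index, flank) for read in reads)
    return edited / len(reads)


# Toy data: a 40-nt reference and a read with a 5-nt deletion at the cut site.
reference = "ATGCCGTAACGGTTAGCTAGGATCCATTGACCGTAAGCTT"
deleted = reference[:18] + reference[23:]
print(edit_fraction([reference, deleted, reference, deleted], reference, 20, flank=5))
```

A production analysis would align each read to the reference and tolerate sequencing errors outside the window; the exact-substring test here only conveys the idea of scoring mutations at the cleavage site.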
Example 3
Analysis of two synthetic snRNA promoters in transfected maize leaf protoplasts
Maize leaf protoplasts are transfected with a plasmid construct comprising an expression cassette for expressing a Cas12a endonuclease driven by a constitutive promoter, and two expression cassettes for expressing two different gRNAs, each driven by a synthetic snRNA promoter.
A plasmid construct is constructed using methods known in the art, comprising three transgene cassettes: a first transgene cassette for expressing a nuclear-targeted Cas12a protein, comprising an EXP operably linked 5' to the coding sequence Cas12a_nls (SEQ ID NO: 12) encoding the nuclear-targeted Cas12a_nls protein (SEQ ID NO: 19), the coding sequence operably linked 5' to the 3' UTR, t-os.ltp:2 (SEQ ID NO: 13); a second transgene cassette comprising a synthetic snRNA promoter selected from the group consisting of SEQ ID NOs: 1-10, operably linked 5' to a guide RNA, gRNA-Zm.Bmr3_2691 (SEQ ID NO: 15), comprising guide RNA spacer NR-Zm.Bmr3_2691 (SEQ ID NO: 14); and a third transgene cassette comprising a second synthetic snRNA promoter selected from the group consisting of SEQ ID NOs: 1-10, said second synthetic snRNA promoter being different from the synthetic snRNA promoter used in the second transgene cassette and being operably linked 5' to a guide RNA, gRNA-Zm.Bmr3_3170 (SEQ ID NO: 17), comprising guide RNA spacer NR-Zm.Bmr3_3170 (SEQ ID NO: 16). The gRNAs, gRNA-Zm.Bmr3_2691 and gRNA-Zm.Bmr3_3170, were designed to guide Cas12a endonuclease cleavage within the brown midrib 3 (Bmr3) genomic sequence (denoted as SEQ ID NO: 18).
Maize leaf protoplasts were transfected using a PEG-based transfection method similar to those known in the art. To assess the effectiveness of each synthetic snRNA promoter in the construct stack, genomic DNA was isolated from the transfected protoplast cell population, and primers were used to amplify DNA fragments comprising the two cleavage site regions to generate amplicon fragments. Mutations detected at each cleavage site, or deletions of about 480 base pairs spanning the two sites, indicate the ability of each synthetic promoter to drive expression of its respective gRNA.
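The relationship between two cleavage sites and the size of the resulting deletion can be sketched as follows. This is a minimal illustration, not the method used in the example: it assumes that the numeric suffixes of the spacer names (2691 and 3170) can be read as approximate positions within the target sequence, which the text does not state, and the amplicon lengths and the tolerance value are hypothetical.

```python
def expected_deletion_size(cut_a: int, cut_b: int) -> int:
    """Length of the segment lost when the DNA between two cut sites is excised."""
    return abs(cut_b - cut_a)


def classify_amplicon(wild_type_len: int, observed_len: int,
                      deletion: int, tolerance: int = 20) -> str:
    """Classify an amplicon by how much shorter it is than the wild-type amplicon.

    tolerance (hypothetical) allows for small indels created at the repair junction.
    """
    lost = wild_type_len - observed_len
    if lost == 0:
        return "wild-type length"
    if abs(lost - deletion) <= tolerance:
        return "dual-cut deletion"
    return "other indel"


# Assumption: positions taken from the spacer names, for illustration only.
deletion = expected_deletion_size(2691, 3170)
```

Under that assumption the computed size is 479 base pairs, which agrees with the "about 480 base pairs" stated above.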
Example 4
Introduction of targeted double strand breaks into the genome of cells
This example illustrates the use of synthetic snRNA promoter sequences to drive gRNA expression to produce targeted double-strand breaks in the cell genome when provided with a Cas9 endonuclease, Cas12a endonuclease, or other CRISPR endonuclease.
The synthetic snRNA promoters and truncated variant synthetic snRNA promoters of the invention, as shown in SEQ ID NOs: 1-10, can be used to drive gRNA expression in plant cells. When the gRNA is provided to the nucleus together with a Cas9 endonuclease, a Cas12a endonuclease, or another CRISPR endonuclease, a DNA break will occur in a selected target region comprising a sequence complementary to the spacer of the gRNA.
There are a variety of methods for introducing the essential components into plant cells. The gRNA may be expressed from a DNA fragment comprising a synthetic snRNA promoter or a truncated synthetic snRNA promoter operably linked 5' to the nucleotide sequence encoding the gRNA, followed by a 3' poly-T fragment to terminate transcription. Alternatively, the sequence encoding the gRNA may be cloned into a plasmid construct. The plasmid construct may be a construct for transfecting plant-derived protoplasts, or the construct may be a binary plant transformation construct for stably transforming plant cells. The Cas9 endonuclease, Cas12a endonuclease, or other CRISPR endonuclease can be introduced into the plant cell as a protein or via heterologous DNA for expressing the endonuclease. The Cas9 endonuclease, Cas12a endonuclease, or other CRISPR endonuclease comprises at least one nuclear localization signal (NLS) so that endonuclease cleavage occurs more efficiently within the nucleus.
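The 5'-to-3' order of the gRNA expression fragment described above (promoter, gRNA coding sequence, poly-T terminator) can be sketched as a simple concatenation. The function name, the placeholder sequences, and the default poly-T length of 7 are assumptions made for illustration; they are not the sequences of SEQ ID NOs: 1-10 and the text does not specify a poly-T length.

```python
VALID_BASES = set("ACGT")


def build_grna_cassette(promoter: str, grna_coding: str, poly_t_len: int = 7) -> str:
    """Join promoter, gRNA coding sequence, and a poly-T terminator, 5' to 3'."""
    for name, seq in (("promoter", promoter), ("grna_coding", grna_coding)):
        if not seq or set(seq.upper()) - VALID_BASES:
            raise ValueError(f"{name} must be a non-empty DNA string")
    if poly_t_len < 1:
        raise ValueError("poly_t_len must be positive")
    return promoter.upper() + grna_coding.upper() + "T" * poly_t_len


# Placeholder sequences only; hypothetical, not taken from the sequence listing.
cassette = build_grna_cassette("GGCCAATTGG", "ACGTACGTAC", poly_t_len=7)
```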
Plant cells may be transfected by particle bombardment. In this case, the Cas9 endonuclease, Cas12a endonuclease, or other CRISPR endonuclease may be introduced as a protein, or alternatively as a DNA fragment comprising a plant-expressible promoter operably linked 5' to an optional intron, which is operably linked 5' to a coding sequence encoding a Cas9 endonuclease, Cas12a endonuclease, or other CRISPR endonuclease comprising at least one NLS, the coding sequence being operably linked 5' to a 3' UTR. The DNA encoding the gRNA may be introduced into the cell as part of a heterologous DNA fragment comprising a synthetic snRNA promoter or a truncated synthetic snRNA promoter (SEQ ID NOs: 1-10) operably linked 5' to the sequence encoding the gRNA, which further comprises a 3' poly-T fragment to terminate transcription. Protoplast cells can also be transfected with the same reagents as described above.
Protoplast cells can also be transfected with one or two plasmid constructs. In one such method, two constructs as described in Example 2 above are used, wherein the first construct comprises a transgene cassette for expressing a gRNA and the second construct comprises a transgene cassette for expressing a Cas9 endonuclease, a Cas12a endonuclease, or other CRISPR endonuclease. Alternatively, both the gRNA transgene cassette and the Cas9 endonuclease, Cas12a endonuclease, or other CRISPR endonuclease transgene cassette can be contained in one construct for transfection.
For stable transformation of plant cells, the gRNA expression cassette and the Cas9 endonuclease, Cas12a endonuclease, or other CRISPR endonuclease expression cassette may be contained in a binary plant transformation plasmid construct. Alternatively, two constructs can be used to co-transform plant cells: a first construct comprising a gRNA expression cassette; and a second construct comprising a Cas9 endonuclease, Cas12a endonuclease, or other CRISPR endonuclease expression cassette.
To induce a double-strand break in DNA without incorporating the transgene cassettes into the plant genome, the gRNA and the Cas9 endonuclease, Cas12a endonuclease, or other CRISPR endonuclease expression cassettes can be excised as linear fragments from one or more constructs comprising the cassettes. The expression cassettes and blunt-ended DNA fragments may be delivered into plant cells by particle bombardment. The bombarded cells are induced to form callus. The callus is then used to regenerate whole plants.
The resulting breaks introduced into the cell genome can be used to introduce oligomeric DNA fragments, or to alter or disrupt the sequence through error-prone non-homologous end joining.
Example 5
Genomic modification by blunt-ended double-stranded DNA fragment integration
This example illustrates the use of synthetic snRNA promoter sequences to drive the expression of gRNA to integrate blunt-ended double-stranded DNA fragments into selected target sites when provided with CRISPR endonucleases.
The complementary oligonucleotides are pre-annealed to form blunt-ended double-stranded DNA fragments. The DNA fragments and constructs comprising the gRNA and CRISPR endonuclease expression cassettes are co-transfected into plant protoplasts. The oligonucleotides may be designed to contain microhomology regions of about three base pairs matching the corresponding 5' and 3' flanking sequences at the cleavage site of the genomic target site, or to contain no microhomology regions. The microhomology regions can facilitate integration of the blunt-ended double-stranded DNA fragments into the genomic target site by a microhomology-driven non-homologous end joining mechanism.
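The oligonucleotide design described above can be sketched as follows. This is a simplified illustration under stated assumptions: the cut is modeled as a single index in a hypothetical genomic string, the function names and the example sequences are invented for illustration, and a microhomology length of zero yields a fragment with no microhomology regions.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def design_insert_oligos(genome: str, cut_index: int, payload: str, mh_len: int = 3):
    """Build the two complementary oligos of a blunt-ended insert.

    cut_index is the index of the first base 3' of the cut (an assumed, simplified
    model of the cleavage site). The top strand is left microhomology + payload +
    right microhomology; the bottom strand is its reverse complement.
    """
    if mh_len < 0 or cut_index - mh_len < 0 or cut_index + mh_len > len(genome):
        raise ValueError("microhomology region falls outside the genomic sequence")
    left_mh = genome[cut_index - mh_len:cut_index]
    right_mh = genome[cut_index:cut_index + mh_len]
    top = left_mh + payload + right_mh
    return top, reverse_complement(top)


# Hypothetical sequences for illustration only.
top, bottom = design_insert_oligos("AAACCCGGGTTT", cut_index=6, payload="GATTACA")
```

Annealing the two returned strands gives a blunt-ended fragment because both strands have the same length and are fully complementary.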
For expression of the gRNA and CRISPR endonuclease, one or two constructs can be used. For one construct, the gRNA and Cas9 expression cassettes are cloned into a single plasmid construct. If two constructs are required, the first construct will comprise a gRNA expression cassette and the second construct will comprise a cassette for expression of a CRISPR endonuclease. The gRNA expression cassette will comprise one of the synthetic snRNA promoters or truncated variant synthetic snRNA promoters of the invention as shown in SEQ ID NOs: 1-10.
For protoplast transfection, one or more constructs comprising a gRNA expression cassette and a CRISPR endonuclease expression cassette are co-transfected with a blunt-ended double-stranded DNA fragment. To detect integration of the blunt-ended double-stranded DNA fragment, the region surrounding the target site can be amplified, and the amplicon can be analyzed using high-resolution capillary electrophoresis and by direct sequencing.
To integrate blunt-ended DNA fragments into selected target sites in plants so as to produce stable alterations, the gRNA and CRISPR endonuclease expression cassettes can be excised as linear fragments from one or more constructs comprising the cassettes. The expression cassettes and blunt-ended DNA fragments may be delivered into plant cells by particle bombardment. The bombarded cells are induced to form callus. The callus is then used to regenerate whole plants. Regenerated plants are then assayed using methods known in the art, such as amplification and sequencing, to identify those plants in which the DNA fragment has been integrated into the plant genome.
Example 6
Targeting unique genomic loci by gRNA multiplexing
One key advantage of CRISPR systems compared to other genome engineering platforms is that multiple gRNAs for separate and unique genomic target sites can be delivered as separate components to achieve targeting. Alternatively, multiple gRNAs for separate and unique genomic target sites can be multiplexed in a single expression construct to achieve targeting. Examples of applications where multiple targeted endonuclease cleavages may be required include removal of a marker gene from a transgenic event. The CRISPR system can be used to remove the selectable marker from the transgenic insert, leaving the gene of interest.
Another example of an application in which multiple targeted endonuclease cleavages may be useful arises when identification of the causal gene underlying a quantitative trait is hampered by a lack of meiotic recombination in the QTL region that would separate the gene candidates from each other. This can be circumvented by transformation with several CRISPR constructs that target the candidate genes simultaneously. These constructs knock out the gene candidates by frameshift mutation or remove them by deletion. Such transformation may also produce random combinations of intact and mutated loci that allow identification of the causal gene.
The gRNA expression cassettes will comprise two or more synthetic snRNA promoters of the invention and/or truncated variant synthetic snRNA promoters, as shown in SEQ ID NOs: 1-10, each operably linked 5' to a unique gRNA coding sequence designed to direct CRISPR endonuclease activity to a specific site in a genomic region of a plant cell. The use of a truncated variant synthetic snRNA promoter (SEQ ID NOs: 6-10) may be advantageous because the smaller size of the truncated variant synthetic snRNA promoter allows for the construction of smaller constructs and reduces the likelihood of replication errors occurring in the bacterial host prior to transformation of plant cells.
The binary plant transformation construct is constructed similarly to the construct described in Example 3 above, but contains multiple gRNA expression cassettes, each with a unique synthetic snRNA promoter or truncated variant synthetic snRNA promoter operably linked 5' to a unique gRNA coding sequence. The binary plant transformation construct further comprises an expression cassette for expressing a CRISPR endonuclease. Plant cells are transformed using Agrobacterium-mediated transformation methods. After transformation, the gRNAs direct the CRISPR endonuclease to genomic regions comprising PAM sequences adjacent to sequences complementary to the spacer sequence of each gRNA, resulting in endonuclease cleavage of the genomic DNA within each respective target sequence. After cleavage, the genomic DNA between the target sites will be excised and the break repaired by non-homologous end joining. Excision of the genomic DNA fragment may be confirmed by various amplification or sequencing methods available in the art. Changes in phenotype, metabolism, or other characteristics may be observed, depending on the nature of the genomic region targeted for excision.
Example 7
Targeted integration by homologous recombination
Genomic modification by targeted integration of an introduced DNA sequence of interest occurs at the site of a double-strand break in the chromosome. Integration of the DNA sequence is mediated by non-homologous end joining or homologous recombination mechanisms using the DNA repair machinery of the host cell. Double-strand breaks in the genome of a cell can be achieved using a CRISPR endonuclease and a gRNA that directs the CRISPR endonuclease to a target region of genomic DNA. An example of an application where homologous recombination may be desired is the integration of an expression cassette into a specific region of the genome of a plant cell.
The integration of DNA fragments using homologous recombination requires regions of homology, referred to herein as "homology arms" (HA), that are identical to the region in which integration is desired after cleavage by the CRISPR endonuclease. The homology arms flank the 5' and 3' ends of the DNA fragment. The left HA is designed from the sequence flanking the 5' side of the double-strand break site for targeted integration. The right HA is designed from the sequence flanking the 3' side of the double-strand break site for targeted integration. The homology arms may be about two (2) to about one thousand two hundred (1,200) base pairs in length, although longer homology arms may function more effectively. The ideal size range for the homology arms may be two hundred thirty (230) to one thousand three (1,003) base pairs in length.
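The homology arm design described above can be sketched as follows. This is an illustrative sketch only: the break site is modeled as a single index in a hypothetical genomic string, the function names and example sequences are assumptions, and the default arm length of 230 base pairs is simply the lower end of the size range given in the text.

```python
def homology_arms(genome: str, break_index: int, arm_len: int = 230):
    """Return (left_arm, right_arm) taken from the sequence flanking a break site.

    break_index is the index of the first base 3' of the double-strand break
    (an assumed, simplified coordinate convention).
    """
    if arm_len < 1:
        raise ValueError("arm_len must be positive")
    if break_index - arm_len < 0 or break_index + arm_len > len(genome):
        raise ValueError("homology arm extends beyond the genomic sequence")
    left = genome[break_index - arm_len:break_index]
    right = genome[break_index:break_index + arm_len]
    return left, right


def donor_fragment(genome: str, break_index: int, cargo: str, arm_len: int = 230) -> str:
    """Assemble left arm + cargo + right arm, 5' to 3'."""
    left, right = homology_arms(genome, break_index, arm_len)
    return left + cargo + right


# Hypothetical 800 bp sequence and cargo, for illustration only.
genome = "ACGT" * 200
left_arm, right_arm = homology_arms(genome, break_index=400)
donor = donor_fragment(genome, 400, cargo="GGGCCC")
```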
For transfection of protoplasts, the constructs for transfection are as described in Example 5. The gRNA expression cassette will comprise one of the synthetic snRNA promoters or truncated variant synthetic snRNA promoters of the invention as shown in SEQ ID NOs: 1-10. One or more constructs may be co-transfected with a DNA fragment comprising homology arms. Alternatively, the expression cassettes may be excised from one or more plasmid constructs, and the linear expression cassette fragments may be co-transfected with a DNA fragment comprising homology arms.
To produce stably transformed plants in which the DNA fragment comprising the homology arms is stably integrated, the expression cassettes may be excised from one or more plasmid constructs and the linear expression cassette fragments co-transformed with the DNA fragment comprising the homology arms by particle bombardment. Alternatively, expression cassettes comprising synthetic snRNA promoters or truncated synthetic snRNA promoters may be co-transformed with DNA fragments comprising homology arms by particle bombardment. Transformed tissue is induced to form whole plants, and plants in which the DNA fragment has been inserted into the target site are selected and characterized using methods known in the art.
For Agrobacterium-mediated stable integration of a DNA fragment comprising homology arms to produce a stably transformed plant comprising the DNA fragment, a single binary transformation construct can be constructed using methods known in the art. The construct will comprise a first right T-DNA border region; a left homology arm flanking a first transgene cassette, for example, for selecting transformed plant cells using herbicides or antibiotics; a second transgene cassette comprising an expression cassette for expressing a gene of interest; a right homology arm; a third transgene cassette comprising a plant-expressible promoter operably linked 5' to a coding sequence encoding a nuclear-targeted CRISPR endonuclease, which is operably linked 5' to a 3' UTR; a fourth transgene cassette comprising a synthetic snRNA promoter of the invention or a truncated synthetic snRNA promoter as shown in SEQ ID NOs: 1-10 operably linked 5' to a gRNA coding sequence comprising a poly-T fragment at the 3' end to terminate transcription; and a second right T-DNA border region. It may also be preferred to flank the selectable marker cassette, the CRISPR endonuclease cassette, and the gRNA cassette with sites that allow excision of the selectable marker after selection and characterization of the transformant, such as lox sites cleaved by Cre recombinase.
The use of two right T-DNA border regions will result in the formation of double-stranded DNA during the Agrobacterium replication process. The selection and expression cassettes flanked by the homology arms will integrate into the target site. Any chromosomal integration of full-length T-DNA or partial T-DNA can be eliminated in subsequent generations by breeding and segregation, by selecting those segregants that have the selection and expression cassettes only at the target site. The removal of the selectable marker cassette may be accomplished by breeding plants comprising the selection and expression cassettes with transformed plants expressing the Cre recombinase. The Cre recombinase expression cassette can likewise be segregated away in the next generation.
Example 8
P-GSP2262_TR can drive gRNA expression
Maize plants were transformed with a plasmid construct comprising an expression cassette for expressing Cas12a driven by a plant-expressible promoter and an expression cassette for expressing a gRNA driven by the synthetic snRNA promoter P-GSP2262_TR, and editing within specific regions of the Bmr3 target sequence (SEQ ID NO: 18) was evaluated.
Two different plasmid constructs (construct-1 and construct-2) were used to transform maize plants. Each construct comprises an expression cassette for selecting transformed plant cells using glyphosate selection and an expression cassette for expressing Cas12a. Construct-1 also contained an expression cassette for expressing the gRNA, gRNA-Zm.Bmr3_90_3279 (SEQ ID NO: 23), driven by the synthetic snRNA promoter GSP2262_TR (SEQ ID NO: 6). The gRNA, gRNA-Zm.Bmr3_90_3279, comprises two spacer sequences, NR-Zm.Bmr3_90 (SEQ ID NO: 20) and NR-Zm.Bmr3_3279 (SEQ ID NO: 22), which direct Cas12a cleavage within the Bmr3 target sequence (SEQ ID NO: 18). Construct-2 also contained an expression cassette for expressing the gRNA, gRNA-Zm.Bmr3_227_3279 (SEQ ID NO: 24), driven by the synthetic snRNA promoter GSP2262_TR (SEQ ID NO: 6). The gRNA, gRNA-Zm.Bmr3_227_3279, comprises two spacer sequences, NR-Zm.Bmr3_227 (SEQ ID NO: 21) and NR-Zm.Bmr3_3279 (SEQ ID NO: 22), which direct Cas12a cleavage within the Bmr3 target sequence (SEQ ID NO: 18).
Maize plants were transformed with the two plasmid constructs described above using Agrobacterium-mediated transformation methods. The transformed cells were induced to form plants by methods known in the art. Leaf tissue samples were taken from the transformed R0 plants, and genomic DNA was extracted from each sample. The region spanning the target sites was sequenced. The percentage of plants containing at least 1 edited allele at each cleavage site was calculated. The percentage of target sites edited is shown in Table 2 below.
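The per-site calculation described above (percentage of plants with at least one edited allele) can be sketched as follows. The data structure, plant identifiers, site labels, and allele calls are all hypothetical and are not the values of Table 2.

```python
def percent_plants_edited(allele_calls: dict) -> dict:
    """Percentage of plants carrying at least one edited allele, per cleavage site.

    allele_calls maps plant id -> {site name -> list of booleans}, where each
    boolean marks whether one sequenced allele at that site is edited.
    """
    sites = sorted({site for calls in allele_calls.values() for site in calls})
    result = {}
    for site in sites:
        scored = [calls[site] for calls in allele_calls.values() if site in calls]
        edited = sum(1 for alleles in scored if any(alleles))
        result[site] = 100.0 * edited / len(scored)
    return result


# Hypothetical allele calls for four R0 plants at two cleavage sites.
calls = {
    "plant_1": {"site_A": [True, False], "site_B": [False, False]},
    "plant_2": {"site_A": [False, False], "site_B": [True, True]},
    "plant_3": {"site_A": [True, True], "site_B": [False, False]},
    "plant_4": {"site_A": [False, False], "site_B": [False, False]},
}
percentages = percent_plants_edited(calls)
```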
Table 2. Percentage of targeting sites edited.
As shown in Table 2 above, the synthetic snRNA promoter P-GSP2262_TR (SEQ ID NO: 6) was able to drive expression of the gRNAs, as demonstrated by the percentage of edited sites specific to each gRNA.
Example 9
Assay of synthetic snRNA promoters using transfected protoplasts to drive expression of gRNA targeting the Bmr3 genomic locus
Maize leaf protoplasts were transfected with the following constructs: a first construct comprising an expression cassette for expressing Cas12a driven by a plant-expressible promoter, and a second construct comprising an expression cassette for expressing a gRNA designed to target the Bmr3 genomic locus driven by a synthetic snRNA promoter; and the effectiveness of inducing editing within the Bmr3 target sequence (SEQ ID NO: 18) was evaluated.
Maize leaf protoplasts were transfected with various constructs to determine the ability of the synthetic snRNA promoters to drive expression of gRNA, resulting in editing of specific sequences within the Bmr3 target site (SEQ ID NO: 18). Each protoplast preparation was transfected with 4 different constructs. The first construct was used to drive expression of Cas12a (Cas12a_nls, SEQ ID NO: 12) in protoplast cells using a constitutive promoter. The second construct was used to drive expression of a gRNA targeting the Bmr3 locus driven by a synthetic snRNA promoter selected from the group consisting of SEQ ID NOs: 1-10. The third and fourth constructs were used to drive expression of Renilla and firefly luciferase genes, respectively, using constitutive promoters, to assess the success of protoplast transfection.
The second construct is used to drive expression of a gRNA driven by a synthetic snRNA promoter selected from the group consisting of SEQ ID NOs: 1-10, and includes one of three different gRNAs: (1) gRNA-Zm.Brm3_2691_2 (SEQ ID NO: 25), which contains the spacer NR-Zm.Bmr3_2691 (SEQ ID NO: 14) and directs Cas12a_NLS protein cleavage within the Bmr3 target sequence; (2) gRNA-Zm.Brm3_3170_2 (SEQ ID NO: 26), which contains the spacer NR-Zm.Bmr3_3170 (SEQ ID NO: 16) and directs Cas12a_NLS protein cleavage within the Bmr3 target sequence; and (3) gRNA-Zm.Brm3_2691_3170 (SEQ ID NO: 27), which directs Cas12a_NLS cleavage at two positions within the Bmr3 target sequence. A total of 30 constructs were prepared to provide all three gRNAs for each of the 10 synthetic snRNA promoters.
Maize leaf protoplasts were transfected with the 4 constructs described above (first, second, third and fourth) using a PEG-based transfection method similar to those known in the art. Genomic DNA was isolated from the protoplast cells after transfection and incubation. DNA sequencing was performed on the region surrounding the Bmr3 target site. Each transfection was repeated 4 times, and the average InDel % was calculated from the 4 replicates. The InDel % was calculated as follows: InDel % = 100 × [(In + Del) / (Total RC)], where "In" is the count of reads with an insertion; "Del" is the count of reads with a deletion; and "Total RC" is the count of all sequence reads from a given sample, including wild-type and mutant reads. Because the guide RNAs differed in their efficiency of inducing double-strand breaks, the InDel % of each gRNA driven by the 10 synthetic snRNA promoters was normalized by setting the replicate with the highest InDel % among the 10 snRNA promoters to 100%. Table 3 shows the average InDel % and average normalized InDel % for the two single-target gRNAs, gRNA-Zm.Brm3_2691_2 (SEQ ID NO: 25) and gRNA-Zm.Brm3_3170_2 (SEQ ID NO: 26). Table 4 shows the average InDel % and average normalized InDel % for the 2-target gRNA, gRNA-Zm.Brm3_2691_3170 (SEQ ID NO: 27).
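The InDel % formula and the normalization step described above can be sketched as follows. The function names and all read counts and percentages are hypothetical; the normalization shown is one reading of the text, in which the single replicate with the highest InDel % across all promoters for a given gRNA is set to 100%.

```python
def indel_percent(ins_reads: int, del_reads: int, total_reads: int) -> float:
    """InDel % = 100 x (In + Del) / Total RC."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * (ins_reads + del_reads) / total_reads


def mean(values) -> float:
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def normalize_indel(replicates_by_promoter: dict) -> dict:
    """Scale replicate InDel % values for one gRNA so the highest replicate is 100."""
    top = max(v for reps in replicates_by_promoter.values() for v in reps)
    if top <= 0:
        raise ValueError("no editing detected; cannot normalize")
    return {promoter: [100.0 * v / top for v in reps]
            for promoter, reps in replicates_by_promoter.items()}


# Hypothetical replicate InDel % values for two promoters driving the same gRNA.
raw = {"promoter_1": [10.0, 20.0], "promoter_2": [40.0, 30.0]}
normalized = normalize_indel(raw)
average_normalized = {p: mean(v) for p, v in normalized.items()}
```

Normalizing within each gRNA, rather than across gRNAs, keeps differences in guide efficiency from being mistaken for differences in promoter strength.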
Table 3. Average InDel% and average normalized InDel% of single target gRNAs driven by synthetic snRNA promoters.
Table 4. Average InDel% and average normalized InDel% of 2 target gRNAs driven by the synthetic snRNA promoter.
As shown in Tables 3 and 4, each synthetic snRNA promoter was able to drive gRNA expression to guide Cas12a editing at the target site. In these experiments, gRNA-Zm.Brm3_2691_2 appeared to be less efficient than the other 2 gRNAs, resulting in a low average InDel %, especially for promoters GSP2262, GSP2272 and GSP2272_TR. However, when driving the gRNAs gRNA-Zm.Brm3_3170_2 and gRNA-Zm.Brm3_2691_3170, these 3 synthetic snRNA promoters exhibited InDel % values similar to those of the other synthetic snRNA promoters.
Example 10
Determination of synthetic snRNA promoters using transfected protoplasts to drive expression of gRNA targeting the Zm7 genomic locus
Maize leaf protoplasts were transfected with the following constructs: a first construct comprising an expression cassette for expressing Cas12a driven by a plant-expressible promoter, and a second construct comprising an expression cassette for expressing a gRNA designed to target the Zm7 genomic locus driven by a synthetic snRNA promoter; and the effectiveness of inducing editing within the Zm7 target sequence (SEQ ID NO: 28) was evaluated.
Maize leaf protoplasts were transfected with various constructs to determine the ability of the synthetic snRNA promoters to drive expression of gRNA, resulting in editing of specific sequences within the Zm7 target site (SEQ ID NO: 28). Each protoplast preparation was transfected with 4 different constructs. The first construct was used to drive expression of Cas12a (Cas12a_nls, SEQ ID NO: 12) in protoplast cells using a constitutive promoter. The second construct was used to drive expression of a gRNA targeting the Zm7 locus driven by a synthetic snRNA promoter selected from the group consisting of SEQ ID NOs: 1-10. The third and fourth constructs were used to drive expression of the Renilla and firefly luciferase genes, respectively, using constitutive promoters, to assess the success of protoplast transfection.
The second construct is used to drive expression of a gRNA driven by a synthetic snRNA promoter selected from the group consisting of SEQ ID NOs: 1-10, and includes one of three different gRNAs: (1) gRNA-Zm.7.1b (SEQ ID NO: 30), which comprises the spacer NR-Zm.7.1b (SEQ ID NO: 29) and directs Cas12a_NLS protein cleavage within the Zm7 target sequence; (2) gRNA-Zm.7.1c (SEQ ID NO: 32), which comprises the spacer NR-Zm.7.1c (SEQ ID NO: 31) and directs Cas12a_NLS protein cleavage within the Zm7 target sequence; and (3) gRNA-7.1c_7.1b (SEQ ID NO: 33), which directs Cas12a_NLS cleavage at two positions within the Zm7 target sequence. A total of 30 constructs were prepared to provide all three gRNAs for each of the 10 synthetic snRNA promoters.
Maize leaf protoplasts were transfected with the 4 constructs described above (first, second, third and fourth) using a PEG-based transfection method similar to those known in the art. Genomic DNA was isolated from the protoplast cells after transfection and incubation. DNA sequencing was performed on the region surrounding the Zm7 target site. Each transfection was repeated 4 times, and the average InDel % was calculated from the 4 replicates. The InDel % and normalized InDel % were calculated as described in Example 9 above.
Table 5 shows the average InDel% and average normalized InDel% for two single target gRNAs, gRNA-Zm.7.1b (SEQ ID NO: 30) and gRNA-Zm.7.1c (SEQ ID NO: 32). Table 6 shows the average InDel% and average normalized InDel% for 2 target gRNAs, gRNA-7.1c_7.1b (SEQ ID NO: 33).
Table 5. Average InDel% and average normalized InDel% of single target gRNAs driven by synthetic snRNA promoters.
Table 6. Average InDel% and average normalized InDel% of 2 target gRNAs driven by the synthetic snRNA promoter.
As shown in Tables 5 and 6, each synthetic snRNA promoter was able to drive gRNA expression to guide Cas12a editing at the target site. Editing at the Zm.7.1b site was less efficient than at the Zm.7.1c site. Nevertheless, the synthetic snRNA promoters can drive expression of gRNAs to effect Cas12a editing.
Example 11
Determination of synthetic snRNA promoters driving expression of gRNA targeting the Bmr3 genomic locus in stably transformed maize plants
Maize plants were transformed with plasmid constructs comprising an expression cassette for expressing Cas12a driven by a plant-expressible promoter and an expression cassette for expressing a gRNA driven by a synthetic snRNA promoter shown in SEQ ID NOs: 6-10; and editing within specific regions of the Bmr3 target sequence (SEQ ID NO: 18) was evaluated.
Maize plants were transformed with 5 plasmid constructs, each comprising 3 expression cassettes: a first expression cassette for selecting transformed plant cells using glyphosate selection, a second expression cassette for expressing Cas12a using a plant-expressible promoter, and a third transgene cassette for expressing the gRNA, gRNA-Zm.Brm3_2691_3170 (SEQ ID NO: 27), driven by one of the synthetic snRNA promoters shown in SEQ ID NOs: 6-10; the gRNA directs Cas12a cleavage within two regions of the Bmr3 target sequence (SEQ ID NO: 18).
Maize plants were transformed with the plasmid constructs described above using Agrobacterium-mediated transformation methods. The transformed cells were induced to form plants by methods known in the art. Leaf tissue samples were taken from the transformed R0 plants, and genomic DNA was extracted from each sample. One- and two-copy events were selected, and the region spanning the target sites was sequenced. The average InDel percentage was calculated based on the number of insertions and deletions observed for each target site. Table 7 shows the average InDel percentages calculated for each of the two target sites within the Bmr3 target sequence.
Table 7. Average InDel percentages within the Bmr3 target sequence.
As shown in Table 7 above, each synthetic snRNA promoter was able to drive gRNA expression to guide Cas12a editing at the target sites of the Bmr3 target sequence.
Example 12
Determination of synthetic snRNA promoters driving expression of gRNA targeting the Zm7 genomic locus in stably transformed maize plants
Maize plants were transformed with plasmid constructs comprising an expression cassette for expressing Cas12a driven by a plant-expressible promoter and an expression cassette for expressing a gRNA driven by a synthetic snRNA promoter shown in SEQ ID NOs: 6-10; and editing within specific regions of the Zm7 target sequence (SEQ ID NO: 28) was evaluated.
Maize plants were transformed with 5 plasmid constructs, each comprising 3 expression cassettes: a first expression cassette for selecting transformed plant cells using glyphosate selection, a second expression cassette for expressing Cas12a using a plant-expressible promoter, and a third transgene cassette for expressing the gRNA, gRNA-7.1c_7.1b (SEQ ID NO: 33), driven by one of the synthetic snRNA promoters shown in SEQ ID NOs: 6-10; the gRNA directs Cas12a cleavage within two regions of the Zm7 target sequence (SEQ ID NO: 28).
Maize plants were transformed with the plasmid constructs described above using Agrobacterium-mediated transformation methods. The transformed cells were induced to form plants by methods known in the art. Leaf tissue samples were taken from the transformed R0 plants, and genomic DNA was extracted from each sample. One- and two-copy events were selected, and the region spanning the target sites was sequenced. The average InDel percentage was calculated based on the number of insertions and deletions observed for each target site. Table 8 shows the calculated average InDel percentages for each of the two target sites within the Zm7 target sequence (SEQ ID NO: 28).
Table 8. Average InDel percentages within the Zm7 target sequence.
As shown in Table 8 above, each synthetic snRNA promoter was able to drive gRNA expression to guide Cas12a editing at the target sites of the Zm7 target sequence.
*******
While the principles of the invention have been shown and described, it will be apparent to those skilled in the art that the invention may be modified in arrangement and detail without departing from such principles. We claim all modifications coming within the spirit and scope of the claims. All publications and published patent documents herein are incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.
Claims (29)
1. A synthetic small nuclear RNA (snRNA) promoter comprising a DNA sequence selected from the group consisting of:
a. a sequence having at least 85% sequence identity to any one of SEQ ID NOs 1 to 10;
b. a sequence comprising any one of SEQ ID NOs 1 to 10; and
c. a fragment of any one of SEQ ID NOs 1 to 10.
2. The synthetic snRNA promoter of claim 1, wherein the sequence has at least 90% sequence identity to the DNA sequence of any one of SEQ ID NOs 1-10.
3. The synthetic snRNA promoter of claim 1, wherein the sequence has at least 95% sequence identity to the DNA sequence of any one of SEQ ID NOs 1-10.
4. The synthetic snRNA promoter of claim 1, wherein the fragment comprises a gene regulatory activity.
5. A recombinant DNA construct comprising a synthetic snRNA promoter operably linked to a DNA sequence encoding a guide RNA (gRNA), wherein the sequence of the synthetic snRNA promoter is any one of SEQ ID NOs 1-10 or fragments thereof, wherein the fragments comprise a gene regulatory activity.
6. The recombinant DNA construct of claim 5, further comprising a transcription termination sequence.
7. The recombinant DNA construct of claim 5, further comprising a DNA sequence encoding a promoter operably linked to a type I CRISPR-associated protein, a type II CRISPR-associated protein, a type III CRISPR-associated protein, a type IV CRISPR-associated protein, a type V CRISPR-associated protein, or a type VI CRISPR-associated protein.
8. The recombinant DNA construct of claim 7, wherein the CRISPR-associated protein is further operably linked to at least one Nuclear Localization Sequence (NLS).
9. The recombinant DNA construct of claim 7, wherein the CRISPR-associated protein is selected from the group consisting of: Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9, Cas10, Cas12a, Csy1, Csy2, Csy3, Cse1, Cse2, Csc1, Csc2, Csa5, Csn2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csx1, Csx15, Csf1, Csf2, Csf3, Csf4, CasX, CasY and Mad7.
10. A recombinant DNA construct comprising a synthetic snRNA promoter operably linked to a sequence specifying a non-coding RNA, wherein the sequence of the synthetic snRNA promoter is any one of SEQ ID NOs 1-10 or fragments thereof, wherein the fragments comprise a gene regulatory activity.
11. The recombinant DNA construct of claim 10, wherein said non-coding RNA is selected from the group consisting of: guide RNA (gRNA), single guide RNA (sgRNA), crRNA, pre-crRNA, tracrRNA, PEgRNA, microRNA (miRNA), miRNA precursors, small interfering RNAs (siRNAs), small RNAs (22-26 nt in length) and precursors encoding the same, heterochromatin siRNAs (hc-siRNAs), piwi-interacting RNAs (piRNAs), hairpin double-stranded RNAs (hairpin dsRNAs), trans-acting siRNAs (ta-siRNAs), and naturally-occurring antisense siRNAs (nat-siRNAs).
12. A recombinant DNA construct comprising at least a first expression cassette comprising a synthetic snRNA promoter operably linked to a DNA sequence encoding a guide RNA (gRNA), wherein the sequence of the promoter comprises any one of SEQ ID NOs 1-10 or fragments thereof, wherein the fragments comprise a gene regulatory activity.
13. The recombinant DNA construct of claim 12, further comprising at least a second expression cassette, wherein the sequence encoding the first gRNA is different from the sequence encoding the second gRNA.
14. The recombinant DNA construct of claim 13, wherein the synthetic snRNA promoter operably linked to the sequence encoding a first gRNA is different from the synthetic snRNA promoter operably linked to the sequence encoding a second gRNA.
15. The construct of claim 14, comprising flanking left and right Homology Arms (HAs) each about 200-1200bp in length.
16. The construct of claim 15, wherein the homology arm is about 230 to 1003bp in length.
17. A recombinant DNA construct comprising:
a. a first synthetic snRNA promoter selected from the group consisting of: SEQ ID NOs 1-10 or a fragment thereof, wherein said fragment comprises gene regulatory activity and said first synthetic snRNA promoter is operably linked to a DNA sequence encoding a non-coding RNA; and
b. a second synthetic snRNA promoter selected from the group consisting of: SEQ ID NOs 1-10 or a fragment thereof, wherein said fragment comprises gene regulatory activity and said second synthetic snRNA promoter is operably linked to a DNA sequence encoding a non-coding RNA,
wherein the first synthetic snRNA promoter and the second synthetic snRNA promoter are different.
18. The recombinant DNA construct of claim 17, wherein the sequence encoding the first synthetic snRNA promoter and the sequence encoding the second synthetic snRNA promoter each comprise any one of SEQ ID NOs 1-10 or fragments thereof, wherein the fragments comprise gene regulatory activity.
19. The recombinant DNA construct of any one of claims 17 to 18, further comprising a sequence specifying one or more additional synthetic snRNA promoters selected from the group consisting of: SEQ ID NOs 1-10 or a fragment thereof, wherein said fragment comprises a gene regulatory activity, said sequence being operably linked to a DNA sequence encoding a non-coding RNA, wherein each of said first synthetic snRNA promoter, said second synthetic snRNA promoter, and said one or more additional synthetic snRNA promoters are different.
20. The recombinant DNA construct of claim 19, wherein the sequence specifying the one or more additional synthetic snRNA promoters is selected from the group consisting of: SEQ ID NOs 1-10 or a fragment thereof, wherein said fragment comprises gene regulatory activity.
21. The recombinant DNA construct of any one of claims 19 to 20, wherein the recombinant DNA construct comprises 3, 4, 5, 6, 7, or 8 synthetic snRNA promoters.
22. The recombinant DNA construct of any one of claims 17 to 21, wherein said non-coding RNA is a gRNA targeting a different selected target site in a plant cell chromosome.
23. The recombinant DNA construct of any one of claims 17 to 22, further comprising a DNA sequence encoding a promoter operably linked to a DNA sequence encoding a CRISPR-associated protein.
24. The CRISPR-associated protein of claim 23, wherein the CRISPR-associated protein is selected from the group consisting of: Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9, Cas10, Cas12a, Csy1, Csy2, Csy3, Cse1, Cse2, Csc1, Csc2, Csa5, Csn2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csx1, Csx15, Csf1, Csf2, Csf3, Csf4, CasX, CasY and Mad7.
25. A cell comprising a recombinant DNA construct selected from the group consisting of: claim 5, claim 10, claim 12 and claim 17.
26. The cell of claim 25, wherein the cell is a plant cell.
27. The plant cell of claim 26, wherein said plant cell is a monocot plant cell.
28. The plant cell of claim 26, wherein said plant cell is a dicot plant cell.
29. The plant cell of claim 26, wherein said plant cell is selected from the group consisting of: maize plant cells, soybean plant cells, cotton plant cells, peanut plant cells, barley plant cells, oat plant cells, festuca arundinacea plant cells, rice plant cells, sorghum plant cells, sugarcane plant cells, festuca arundinacea plant cells, turf grass plant cells, wheat plant cells, alfalfa plant cells, rape plant cells, cabbage plant cells, mustard plant cells, rutabaga plant cells, turnip plant cells, collard plant cells, broccoli plant cells, cauliflower plant cells, pepper plant cells, bean plant cells, cowpea plant cells, chickpea plant cells, cucurbit plant cells, lettuce plant cells, cucumber plant cells, melon plant cells, carrot plant cells, tomato plant cells, radish plant cells, potato plant cells, or ornamental plant cells.
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US63/182,288 | 2021-04-30 | ||
US202163295061P | 2021-12-30 | 2021-12-30 | |
US63/295,061 | 2021-12-30 | ||
PCT/US2022/026754 WO2022232407A2 (en) | 2021-04-30 | 2022-04-28 | Plant regulatory elements and uses thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117242181A (en) | 2023-12-15 |
Family
ID=89098908
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202280031323.XA (Pending; published as CN117242181A) | Plant regulatory element and use thereof | 2021-04-30 | 2022-04-28 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117242181A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||