CN115466319A - Sorghum SbMS1 protein, its coding gene, and applications
- Publication number: CN115466319A (application CN202110589746.0A)
- Authority: CN (China)
- Prior art keywords: protein, sbms1, sequence, plant, nucleic acid
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- LKDXINHHSWFFJC-SRVKXCTJSA-N Lys-Ser-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCCCN)N LKDXINHHSWFFJC-SRVKXCTJSA-N 0.000 description 1
- WINFHLHJTRGLCV-BZSNNMDCSA-N Lys-Tyr-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(O)=O)CC1=CC=C(O)C=C1 WINFHLHJTRGLCV-BZSNNMDCSA-N 0.000 description 1
- 108091022912 Mannose-6-Phosphate Isomerase Proteins 0.000 description 1
- WGBMNLCRYKSWAR-DCAQKATOSA-N Met-Asp-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN WGBMNLCRYKSWAR-DCAQKATOSA-N 0.000 description 1
- FWTBMGAKKPSTBT-GUBZILKMSA-N Met-Gln-Glu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O FWTBMGAKKPSTBT-GUBZILKMSA-N 0.000 description 1
- RZJOHSFAEZBWLK-CIUDSAMLSA-N Met-Gln-Ser Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N RZJOHSFAEZBWLK-CIUDSAMLSA-N 0.000 description 1
- NHXXGBXJTLRGJI-GUBZILKMSA-N Met-Pro-Ser Chemical compound [H]N[C@@H](CCSC)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O NHXXGBXJTLRGJI-GUBZILKMSA-N 0.000 description 1
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 1
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 1
- 108010066427 N-valyltryptophan Proteins 0.000 description 1
- 108010077850 Nuclear Localization Signals Proteins 0.000 description 1
- 244000115721 Pennisetum typhoides Species 0.000 description 1
- 235000007195 Pennisetum typhoides Nutrition 0.000 description 1
- CYZBFPYMSJGBRL-DRZSPHRISA-N Phe-Ala-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O CYZBFPYMSJGBRL-DRZSPHRISA-N 0.000 description 1
- HCTXJGRYAACKOB-SRVKXCTJSA-N Phe-Asn-Asp Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N HCTXJGRYAACKOB-SRVKXCTJSA-N 0.000 description 1
- SXJGROGVINAYSH-AVGNSLFASA-N Phe-Gln-Asp Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N SXJGROGVINAYSH-AVGNSLFASA-N 0.000 description 1
- UNLYPPYNDXHGDG-IHRRRGAJSA-N Phe-Gln-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 UNLYPPYNDXHGDG-IHRRRGAJSA-N 0.000 description 1
- IAJOBQBIJHVGMQ-UHFFFAOYSA-N Phosphinothricin Natural products CP(O)(=O)CCC(N)C(O)=O IAJOBQBIJHVGMQ-UHFFFAOYSA-N 0.000 description 1
- 108700001094 Plant Genes Proteins 0.000 description 1
- MTHRMUXESFIAMS-DCAQKATOSA-N Pro-Asn-Lys Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O MTHRMUXESFIAMS-DCAQKATOSA-N 0.000 description 1
- FUVBEZJCRMHWEM-FXQIFTODSA-N Pro-Asn-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O FUVBEZJCRMHWEM-FXQIFTODSA-N 0.000 description 1
- ODPIUQVTULPQEP-CIUDSAMLSA-N Pro-Gln-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@@H]1CCCN1 ODPIUQVTULPQEP-CIUDSAMLSA-N 0.000 description 1
- AQSMZTIEJMZQEC-DCAQKATOSA-N Pro-His-Ser Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CO)C(=O)O AQSMZTIEJMZQEC-DCAQKATOSA-N 0.000 description 1
- DSGSTPRKNYHGCL-JYJNAYRXSA-N Pro-Phe-Met Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(O)=O DSGSTPRKNYHGCL-JYJNAYRXSA-N 0.000 description 1
- FDMKYQQYJKYCLV-GUBZILKMSA-N Pro-Pro-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 FDMKYQQYJKYCLV-GUBZILKMSA-N 0.000 description 1
- LNICFEXCAHIJOR-DCAQKATOSA-N Pro-Ser-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O LNICFEXCAHIJOR-DCAQKATOSA-N 0.000 description 1
- SNGZLPOXVRTNMB-LPEHRKFASA-N Pro-Ser-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CO)C(=O)N2CCC[C@@H]2C(=O)O SNGZLPOXVRTNMB-LPEHRKFASA-N 0.000 description 1
- VBZXFFYOBDLLFE-HSHDSVGOSA-N Pro-Trp-Thr Chemical compound N([C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H]([C@H](O)C)C(O)=O)C(=O)[C@@H]1CCCN1 VBZXFFYOBDLLFE-HSHDSVGOSA-N 0.000 description 1
- 108091034057 RNA (poly(A)) Proteins 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 1
- ZUGXSSFMTXKHJS-ZLUOBGJFSA-N Ser-Ala-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O ZUGXSSFMTXKHJS-ZLUOBGJFSA-N 0.000 description 1
- BTKUIVBNGBFTTP-WHFBIAKZSA-N Ser-Ala-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)NCC(O)=O BTKUIVBNGBFTTP-WHFBIAKZSA-N 0.000 description 1
- HRNQLKCLPVKZNE-CIUDSAMLSA-N Ser-Ala-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O HRNQLKCLPVKZNE-CIUDSAMLSA-N 0.000 description 1
- OYEDZGNMSBZCIM-XGEHTFHBSA-N Ser-Arg-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OYEDZGNMSBZCIM-XGEHTFHBSA-N 0.000 description 1
- BTPAWKABYQMKKN-LKXGYXEUSA-N Ser-Asp-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BTPAWKABYQMKKN-LKXGYXEUSA-N 0.000 description 1
- ZOHGLPQGEHSLPD-FXQIFTODSA-N Ser-Gln-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZOHGLPQGEHSLPD-FXQIFTODSA-N 0.000 description 1
- LALNXSXEYFUUDD-GUBZILKMSA-N Ser-Glu-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LALNXSXEYFUUDD-GUBZILKMSA-N 0.000 description 1
- AEGUWTFAQQWVLC-BQBZGAKWSA-N Ser-Gly-Arg Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O AEGUWTFAQQWVLC-BQBZGAKWSA-N 0.000 description 1
- MUARUIBTKQJKFY-WHFBIAKZSA-N Ser-Gly-Asp Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MUARUIBTKQJKFY-WHFBIAKZSA-N 0.000 description 1
- UIGMAMGZOJVTDN-WHFBIAKZSA-N Ser-Gly-Ser Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O UIGMAMGZOJVTDN-WHFBIAKZSA-N 0.000 description 1
- SRKMDKACHDVPMD-SRVKXCTJSA-N Ser-Lys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CO)N SRKMDKACHDVPMD-SRVKXCTJSA-N 0.000 description 1
- LRWBCWGEUCKDTN-BJDJZHNGSA-N Ser-Lys-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LRWBCWGEUCKDTN-BJDJZHNGSA-N 0.000 description 1
- XUDRHBPSPAPDJP-SRVKXCTJSA-N Ser-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO XUDRHBPSPAPDJP-SRVKXCTJSA-N 0.000 description 1
- FPCGZYMRFFIYIH-CIUDSAMLSA-N Ser-Lys-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O FPCGZYMRFFIYIH-CIUDSAMLSA-N 0.000 description 1
- NUEHQDHDLDXCRU-GUBZILKMSA-N Ser-Pro-Arg Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O NUEHQDHDLDXCRU-GUBZILKMSA-N 0.000 description 1
- PURRNJBBXDDWLX-ZDLURKLDSA-N Ser-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CO)N)O PURRNJBBXDDWLX-ZDLURKLDSA-N 0.000 description 1
- 235000008515 Setaria glauca Nutrition 0.000 description 1
- 108020004688 Small Nuclear RNA Proteins 0.000 description 1
- 102000039471 Small Nuclear RNA Human genes 0.000 description 1
- 235000007230 Sorghum bicolor Nutrition 0.000 description 1
- 229920002472 Starch Polymers 0.000 description 1
- 241000193996 Streptococcus pyogenes Species 0.000 description 1
- TYVAWPFQYFPSBR-BFHQHQDPSA-N Thr-Ala-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)NCC(O)=O TYVAWPFQYFPSBR-BFHQHQDPSA-N 0.000 description 1
- JNQZPAWOPBZGIX-RCWTZXSCSA-N Thr-Arg-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)O)CCCN=C(N)N JNQZPAWOPBZGIX-RCWTZXSCSA-N 0.000 description 1
- YBXMGKCLOPDEKA-NUMRIWBASA-N Thr-Asp-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O YBXMGKCLOPDEKA-NUMRIWBASA-N 0.000 description 1
- GKWNLDNXMMLRMC-GLLZPBPUSA-N Thr-Glu-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O GKWNLDNXMMLRMC-GLLZPBPUSA-N 0.000 description 1
- XPNSAQMEAVSQRD-FBCQKBJTSA-N Thr-Gly-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)NCC(O)=O XPNSAQMEAVSQRD-FBCQKBJTSA-N 0.000 description 1
- RFKVQLIXNVEOMB-WEDXCCLWSA-N Thr-Leu-Gly Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)O)N)O RFKVQLIXNVEOMB-WEDXCCLWSA-N 0.000 description 1
- WTMPKZWHRCMMMT-KZVJFYERSA-N Thr-Pro-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O WTMPKZWHRCMMMT-KZVJFYERSA-N 0.000 description 1
- SGAOHNPSEPVAFP-ZDLURKLDSA-N Thr-Ser-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SGAOHNPSEPVAFP-ZDLURKLDSA-N 0.000 description 1
- FBQHKSPOIAFUEI-OWLDWWDNSA-N Thr-Trp-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C)C(O)=O FBQHKSPOIAFUEI-OWLDWWDNSA-N 0.000 description 1
- 108091036066 Three prime untranslated region Proteins 0.000 description 1
- 235000021307 Triticum Nutrition 0.000 description 1
- 244000098338 Triticum aestivum Species 0.000 description 1
- NIWAGRRZHCMPOY-GMVOTWDCSA-N Trp-Ala-Tyr Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N NIWAGRRZHCMPOY-GMVOTWDCSA-N 0.000 description 1
- XZLHHHYSWIYXHD-XIRDDKMYSA-N Trp-Gln-Arg Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O XZLHHHYSWIYXHD-XIRDDKMYSA-N 0.000 description 1
- WKQNLTQSCYXKQK-VFAJRCTISA-N Trp-Lys-Thr Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WKQNLTQSCYXKQK-VFAJRCTISA-N 0.000 description 1
- GEGYPBOPIGNZIF-CWRNSKLLSA-N Trp-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N)C(=O)O GEGYPBOPIGNZIF-CWRNSKLLSA-N 0.000 description 1
- ZOBLBMGJKVJVEV-BZSNNMDCSA-N Tyr-Lys-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N)O ZOBLBMGJKVJVEV-BZSNNMDCSA-N 0.000 description 1
- MJUTYRIMFIICKL-JYJNAYRXSA-N Tyr-Val-Arg Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O MJUTYRIMFIICKL-JYJNAYRXSA-N 0.000 description 1
- 108090000848 Ubiquitin Proteins 0.000 description 1
- 102000044159 Ubiquitin Human genes 0.000 description 1
- 108091023045 Untranslated Region Proteins 0.000 description 1
- RUCNAYOMFXRIKJ-DCAQKATOSA-N Val-Ala-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN RUCNAYOMFXRIKJ-DCAQKATOSA-N 0.000 description 1
- DNOOLPROHJWCSQ-RCWTZXSCSA-N Val-Arg-Thr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DNOOLPROHJWCSQ-RCWTZXSCSA-N 0.000 description 1
- MHAHQDBEIDPFQS-NHCYSSNCSA-N Val-Glu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)C(C)C MHAHQDBEIDPFQS-NHCYSSNCSA-N 0.000 description 1
- KZKMBGXCNLPYKD-YEPSODPASA-N Val-Gly-Thr Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O KZKMBGXCNLPYKD-YEPSODPASA-N 0.000 description 1
- ZZGPVSZDZQRJQY-ULQDDVLXSA-N Val-Leu-Phe Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)C(C)C)C(=O)N[C@@H](Cc1ccccc1)C(O)=O ZZGPVSZDZQRJQY-ULQDDVLXSA-N 0.000 description 1
- YMTOEGGOCHVGEH-IHRRRGAJSA-N Val-Lys-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O YMTOEGGOCHVGEH-IHRRRGAJSA-N 0.000 description 1
- 240000008042 Zea mays Species 0.000 description 1
- 235000016383 Zea mays subsp huehuetenangensis Nutrition 0.000 description 1
- 235000002017 Zea mays subsp mays Nutrition 0.000 description 1
- 108010076324 alanyl-glycyl-glycine Proteins 0.000 description 1
- 108010041407 alanylaspartic acid Proteins 0.000 description 1
- 108010005233 alanylglutamic acid Proteins 0.000 description 1
- 102000006646 aminoglycoside phosphotransferase Human genes 0.000 description 1
- 238000004458 analytical method Methods 0.000 description 1
- 238000000137 annealing Methods 0.000 description 1
- 108010008355 arginyl-glutamine Proteins 0.000 description 1
- 108010052670 arginyl-glutamyl-glutamic acid Proteins 0.000 description 1
- 108010062796 arginyllysine Proteins 0.000 description 1
- 101150103518 bar gene Proteins 0.000 description 1
- 230000003115 biocidal effect Effects 0.000 description 1
- 238000005119 centrifugation Methods 0.000 description 1
- 239000003153 chemical reaction reagent Substances 0.000 description 1
- 150000001875 compounds Chemical class 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 101150008672 csn-1 gene Proteins 0.000 description 1
- 239000008367 deionised water Substances 0.000 description 1
- 229910021641 deionized water Inorganic materials 0.000 description 1
- 108010054813 diprotin B Proteins 0.000 description 1
- 239000012153 distilled water Substances 0.000 description 1
- 210000005069 ears Anatomy 0.000 description 1
- 235000013399 edible fruits Nutrition 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 241001233957 eudicotyledons Species 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 230000037433 frameshift Effects 0.000 description 1
- 230000004927 fusion Effects 0.000 description 1
- 239000011521 glass Substances 0.000 description 1
- IAJOBQBIJHVGMQ-BYPYZUCNSA-N glufosinate-P Chemical compound CP(O)(=O)CC[C@H](N)C(O)=O IAJOBQBIJHVGMQ-BYPYZUCNSA-N 0.000 description 1
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 1
- 108010079547 glutamylmethionine Proteins 0.000 description 1
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 1
- 108010062266 glycyl-glycyl-argininal Proteins 0.000 description 1
- 108010033719 glycyl-histidyl-glycine Proteins 0.000 description 1
- 108010089804 glycyl-threonine Proteins 0.000 description 1
- 108010045126 glycyl-tyrosyl-glycine Proteins 0.000 description 1
- XDDAORKBJWWYJS-UHFFFAOYSA-N glyphosate Chemical compound OC(=O)CNCP(O)(O)=O XDDAORKBJWWYJS-UHFFFAOYSA-N 0.000 description 1
- 229940097068 glyphosate Drugs 0.000 description 1
- 238000000227 grinding Methods 0.000 description 1
- 230000012010 growth Effects 0.000 description 1
- 101150054900 gus gene Proteins 0.000 description 1
- 101150029559 hph gene Proteins 0.000 description 1
- 238000000338 in vitro Methods 0.000 description 1
- 208000000509 infertility Diseases 0.000 description 1
- 230000036512 infertility Effects 0.000 description 1
- 208000021267 infertility disease Diseases 0.000 description 1
- 238000003780 insertion Methods 0.000 description 1
- 230000037431 insertion Effects 0.000 description 1
- 229960000318 kanamycin Drugs 0.000 description 1
- 229930027917 kanamycin Natural products 0.000 description 1
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 1
- 229930182823 kanamycin A Natural products 0.000 description 1
- 108010034529 leucyl-lysine Proteins 0.000 description 1
- 108010047926 leucyl-lysyl-tyrosine Proteins 0.000 description 1
- 108010073472 leucyl-prolyl-proline Proteins 0.000 description 1
- 108010057821 leucylproline Proteins 0.000 description 1
- 239000007788 liquid Substances 0.000 description 1
- 108010003700 lysyl aspartic acid Proteins 0.000 description 1
- 108010064235 lysylglycine Proteins 0.000 description 1
- 235000009973 maize Nutrition 0.000 description 1
- 210000001161 mammalian embryo Anatomy 0.000 description 1
- 238000005360 mashing Methods 0.000 description 1
- 239000012528 membrane Substances 0.000 description 1
- 108010005942 methionylglycine Proteins 0.000 description 1
- 229960000485 methotrexate Drugs 0.000 description 1
- 235000019713 millet Nutrition 0.000 description 1
- 239000004570 mortar (masonry) Substances 0.000 description 1
- 229910052757 nitrogen Inorganic materials 0.000 description 1
- 230000031787 nutrient reservoir activity Effects 0.000 description 1
- 210000000056 organ Anatomy 0.000 description 1
- 108010018625 phenylalanylarginine Proteins 0.000 description 1
- 229920001184 polypeptide Polymers 0.000 description 1
- 239000002244 precipitate Substances 0.000 description 1
- 239000002243 precursor Substances 0.000 description 1
- 230000002028 premature Effects 0.000 description 1
- 108090000765 processed proteins & peptides Proteins 0.000 description 1
- 102000004196 processed proteins & peptides Human genes 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 230000001737 promoting effect Effects 0.000 description 1
- 238000000746 purification Methods 0.000 description 1
- 238000012113 quantitative test Methods 0.000 description 1
- 238000005215 recombination Methods 0.000 description 1
- 108091008146 restriction endonucleases Proteins 0.000 description 1
- 108010048397 seryl-lysyl-leucine Proteins 0.000 description 1
- 239000007787 solid Substances 0.000 description 1
- 229960000268 spectinomycin Drugs 0.000 description 1
- UNFWWIHTNXNPBV-WXKVUWSESA-N spectinomycin Chemical compound O([C@@H]1[C@@H](NC)[C@@H](O)[C@H]([C@@H]([C@H]1O1)O)NC)[C@]2(O)[C@H]1O[C@H](C)CC2=O UNFWWIHTNXNPBV-WXKVUWSESA-N 0.000 description 1
- 235000019698 starch Nutrition 0.000 description 1
- 239000008107 starch Substances 0.000 description 1
- 230000002194 synthesizing effect Effects 0.000 description 1
- 238000010998 test method Methods 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 230000005026 transcription initiation Effects 0.000 description 1
- 230000002103 transcriptional effect Effects 0.000 description 1
- 230000001131 transforming effect Effects 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- 230000014621 translational initiation Effects 0.000 description 1
- 108010080629 tryptophan-leucine Proteins 0.000 description 1
- 239000013603 viral vector Substances 0.000 description 1
- 238000012800 visualization Methods 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/415—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from plants
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8201—Methods for introducing genetic material into plant cells, e.g. DNA, RNA, stable or transient incorporation, tissue culture methods adapted for transformation
- C12N15/8202—Methods for introducing genetic material into plant cells, e.g. DNA, RNA, stable or transient incorporation, tissue culture methods adapted for transformation by biological means, e.g. cell mediated or natural vector
- C12N15/8205—Agrobacterium mediated transformation
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8216—Methods for controlling, regulating or enhancing expression of transgenes in plant cells
- C12N15/8218—Antisense, co-suppression, viral induced gene silencing [VIGS], post-transcriptional induced gene silencing [PTGS]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8241—Phenotypically and genetically modified plants via recombinant DNA technology
- C12N15/8261—Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield
- C12N15/8287—Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for fertility modification, e.g. apomixis
- C12N15/8289—Male sterility
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Genetics & Genomics (AREA)
- Engineering & Computer Science (AREA)
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Molecular Biology (AREA)
- Biomedical Technology (AREA)
- Zoology (AREA)
- Biotechnology (AREA)
- Wood Science & Technology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- Microbiology (AREA)
- Biophysics (AREA)
- Cell Biology (AREA)
- Physics & Mathematics (AREA)
- Plant Pathology (AREA)
- Medicinal Chemistry (AREA)
- Virology (AREA)
- Botany (AREA)
- Gastroenterology & Hepatology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Breeding Of Plants And Reproduction By Means Of Culturing (AREA)
Abstract
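The invention discloses the sorghum SbMS1 protein, its encoding gene, and applications thereof. The object of the invention is to overcome the deficiencies of the prior art by providing the use of the SbMS1 protein and its encoding gene in regulating plant fertility. Experiments demonstrate that loss of function of the SbMS1 gene causes a high degree of male sterility in sorghum. Compared with the prior art, the invention has the following beneficial effect: because loss of SbMS1 gene function specifically causes high male sterility in sorghum, highly male-sterile sorghum lines can be obtained by knocking out SbMS1 or specifically suppressing its expression, which is of great value in agricultural production.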
Description
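Technical Field
The invention belongs to the field of biotechnology and specifically relates to the sorghum SbMS1 protein, its encoding gene, and applications thereof.
Background Art
Heterosis (hybrid vigor) is a widespread phenomenon in biology: hybrids produced by crossing two parents of different genotypes surpass both parents in traits such as growth vigor, viability, fecundity, adaptability, yield, and quality. Exploiting heterosis is also an important way to raise crop yields, and cloning male-sterility genes and breeding male-sterile lines are the prerequisite and foundation for exploiting heterosis effectively in self-pollinating crops. In rice, the representative case, there are two main approaches: the "three-line" system based on cytoplasmic-nuclear interaction and the "two-line" system based on photoperiod- and thermo-sensitive sterility. In sorghum, heterosis has been exploited using cytoplasmic male-sterile lines. Both systems, however, have drawbacks. For example, breeding a cytoplasmic male-sterile line requires many generations of backcrossing, which is time-consuming and laborious; seed production requires a matched set of sterile, maintainer, and restorer lines, which makes the process cumbersome; genetic resources for restorer lines are limited and require dedicated breeding; and the fertility of photoperiod- and thermo-sensitive male-sterile lines is unstable and easily affected by environmental changes.
High male sterility in plants refers to the phenomenon in which the female gametophyte develops normally while most male gametophytes develop abnormally, so that the plant usually produces only a small amount of functional pollen. Such plants can be maintained by selfing with this small amount of normal pollen, particularly crops with a high propagation coefficient such as rice, wheat, sorghum, rapeseed, foxtail millet, and pearl millet. Breeding such highly male-sterile material into a sterile line and pairing it with a restorer line that carries a dominant herbicide-resistance trait makes it possible to produce hybrid seed. This greatly broadens the genetic basis for exploiting heterosis and simplifies seed production, and the fertility of the sterile line is relatively stable, giving clear advantages over the traditional three-line and two-line systems.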
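Summary of the Invention
The first object of the invention is to provide a new use of the SbMS1 protein.
The invention provides the use of the SbMS1 protein in regulating plant male fertility:
The SbMS1 protein is a1), a2), a3), or a4):
a1) a protein whose amino acid sequence is shown in Sequence 3;
a2) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of the protein shown in Sequence 3;
a3) a protein related to plant male fertility, obtained by substitution and/or deletion and/or addition of one or several amino acid residues in the amino acid sequence shown in Sequence 3;
a4) a protein that has at least 90% identity with the amino acid sequence shown in Sequence 3, is derived from sorghum, and is related to plant male fertility.
In the protein of a2), the tag is a polypeptide or protein that is expressed as a fusion with the target protein by means of in vitro DNA recombination, to facilitate expression, detection, tracing, and/or purification of the target protein. The tag may be a Flag tag, His tag, MBP tag, HA tag, myc tag, GST tag, and/or SUMO tag, among others.
In the protein of a3), the substitution and/or deletion and/or addition of one or several amino acid residues is a substitution and/or deletion and/or addition of no more than 10 amino acid residues.
In the protein of a4), "identity" covers amino acid sequences having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity with the amino acid sequence shown in Sequence 3 of the invention.
The protein of a1), a2), a3), or a4) may be synthesized artificially, or obtained by first synthesizing its encoding gene and then expressing it biologically.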
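The second object of the invention is to provide a new use of biological materials related to the SbMS1 protein.
The invention provides the use of a biological material related to the SbMS1 protein in regulating plant male fertility:
The biological material is any one of A1) to A8) below:
A1) a nucleic acid molecule encoding the SbMS1 protein;
A2) an expression cassette containing the nucleic acid molecule of A1);
A3) a recombinant vector containing the nucleic acid molecule of A1);
A4) a recombinant vector containing the expression cassette of A2);
A5) a recombinant microorganism containing the nucleic acid molecule of A1);
A6) a recombinant microorganism containing the expression cassette of A2);
A7) a recombinant microorganism containing the recombinant vector of A3);
A8) a recombinant microorganism containing the recombinant vector of A4).
In the above use, the nucleic acid molecule of A1) is a gene as set out in B1), B2), B3), or B4) below:
B1) the genomic DNA molecule shown in Sequence 1;
B2) the cDNA molecule shown in Sequence 2;
B3) a cDNA molecule or genomic DNA molecule that has 75% or more identity with the nucleotide sequence defined in B1) or B2) and encodes the SbMS1 protein;
B4) a cDNA molecule or genomic DNA molecule that hybridizes under stringent conditions with the nucleotide sequence defined in B1), B2), or B3) and encodes the SbMS1 protein.
The nucleic acid molecule may be DNA, such as cDNA, genomic DNA, or recombinant DNA; it may also be RNA, such as mRNA or hnRNA.
A person of ordinary skill in the art can readily mutate the nucleotide sequence encoding the SbMS1 protein of the invention using known methods such as directed evolution and point mutation. Artificially modified nucleotides that have 75% or higher identity with the nucleotide sequence encoding the SbMS1 protein are, provided they encode the SbMS1 protein and have the same function, derived from the nucleotide sequence of the invention and equivalent to the sequence of the invention.
The term "identity" as used here refers to sequence similarity to a native nucleic acid sequence. "Identity" covers nucleotide sequences that have 75% or more, 85% or more, 90% or more, or 95% or more identity with the nucleotide sequence of the invention that encodes the protein consisting of the amino acid sequence shown in Sequence 3. Identity can be evaluated by eye or with computer software. With computer software, the identity between two or more sequences can be expressed as a percentage (%), which can be used to evaluate the identity between related sequences.
The above identity of 75% or more may be an identity of 80%, 85%, 90%, or 95% or more.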
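As an illustration of how a percent-identity figure such as the 75% or 90% thresholds can be computed, the following is a minimal Python sketch. It assumes the two sequences have already been aligned to equal length with "-" marking gaps; the function names and the toy sequences are hypothetical, and the patent does not specify which software or identity definition was used.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of two pre-aligned sequences.

    Both strings must have the same length; '-' marks an alignment gap.
    Columns in which both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only when both residues are identical.
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold_percent: float) -> bool:
    """True if the pair reaches the given identity threshold (e.g. 75 or 90)."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


if __name__ == "__main__":
    # Toy sequences for illustration only, not taken from the sequence listing.
    print(percent_identity("ACGTACGT", "ACGTACGA"))
```

Other identity definitions (for example, dividing by the length of the shorter sequence) give different percentages, so the definition should be stated alongside any threshold.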
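In the above use, the stringent conditions are: hybridization and two membrane washes of 5 min each at 68 °C in a solution of 2×SSC, 0.1% SDS, followed by hybridization and two membrane washes of 15 min each at 68 °C in a solution of 0.5×SSC, 0.1% SDS; or hybridization and membrane washing at 65 °C in a solution of 0.1×SSPE (or 0.1×SSC), 0.1% SDS.
In the above use, the expression cassette of A2) containing the nucleic acid molecule encoding the SbMS1 protein (the SbMS1 gene expression cassette) refers to DNA capable of expressing the SbMS1 protein in a host cell. This DNA may include not only a promoter that initiates transcription of SbMS1 but also a terminator that terminates transcription of SbMS1. The expression cassette may further include an enhancer sequence. Promoters usable in the invention include, but are not limited to, constitutive promoters; tissue-, organ-, and development-specific promoters; and inducible promoters. Suitable transcription terminators include, but are not limited to, the Agrobacterium nopaline synthase terminator (NOS terminator), the cauliflower mosaic virus CaMV 35S terminator, the tml terminator, the pea rbcS E9 terminator, and the nopaline synthase and octopine synthase terminators.
A recombinant vector containing the SbMS1 gene expression cassette can be constructed from existing expression vectors. Plant expression vectors include binary Agrobacterium vectors and vectors usable for plant microprojectile bombardment, such as pAHC25, pBin438, pCAMBIA1302, pCAMBIA2301, pCAMBIA1301, pCAMBIA1300, pBI121, pCAMBIA1391-Xa, or pCAMBIA1391-Xb. The plant expression vector may also contain the 3′ untranslated region of a foreign gene, that is, a region containing a polyadenylation signal and any other DNA fragment involved in mRNA processing or gene expression. The polyadenylation signal directs the addition of polyadenylate to the 3′ end of the mRNA precursor; for example, the untranslated regions transcribed from the 3′ ends of Agrobacterium crown-gall-inducing (Ti) plasmid genes (such as the nopaline synthase gene Nos) and of plant genes (such as soybean storage protein genes) all have similar functions. When the gene of the invention is used to construct a plant expression vector, enhancers may also be used, including translational or transcriptional enhancers. These enhancer regions may be an ATG start codon or a start codon of an adjacent region, but must be in the same reading frame as the coding sequence to ensure correct translation of the whole sequence. The translation control signals and start codons may come from a wide range of sources, natural or synthetic. The translation initiation region may come from the transcription initiation region or from a structural gene. To facilitate identification and screening of transgenic plant cells or plants, the plant expression vector may be modified, for example by adding genes expressible in plants that encode enzymes producing a color change or luminescent compounds (the GUS gene, the luciferase gene, etc.), antibiotic marker genes (such as the nptII gene conferring resistance to kanamycin and related antibiotics, the bar gene conferring resistance to the herbicide phosphinothricin, the hph gene conferring resistance to the antibiotic hygromycin, the dhfr gene conferring resistance to methotrexate, and the EPSPS gene conferring resistance to glyphosate), chemical-resistance marker genes (such as herbicide-resistance genes), or the mannose-6-phosphate isomerase gene, which confers the ability to metabolize mannose. For the sake of transgenic plant safety, transformed plants may also be screened directly under stress without adding any selectable marker gene.
In the above use, the vector may be a plasmid, cosmid, phage, or viral vector.
In the above use, the microorganism may be a yeast, bacterium, alga, or fungus, for example Agrobacterium.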
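The third object of the invention is to provide a new use of the substance m1 or m2:
m1: a substance that inhibits or reduces the activity or content of the SbMS1 protein in a plant;
m2: a substance that inhibits or reduces expression of the nucleic acid encoding the SbMS1 protein in a plant, or a substance that knocks out the nucleic acid encoding the SbMS1 protein in a plant.
The invention provides the use of the substance m1 or m2 in 1) or 2) below:
1) regulating plant male fertility;
2) breeding male-sterile transgenic plants.
In the above use, regulating plant male fertility means rendering the plant male-sterile: when the nucleic acid encoding the SbMS1 protein is knocked out in a plant, the plant is male-sterile; more specifically, its pollen viability and/or seed-setting rate is significantly reduced.
The fourth object of the invention is to provide a method for breeding male-sterile transgenic plants.
The method provided by the invention for breeding male-sterile transgenic plants comprises the following step: reducing the content and/or activity of the SbMS1 protein in a recipient plant, or inhibiting or reducing expression of the nucleic acid encoding the SbMS1 protein in the recipient plant, or knocking out the nucleic acid encoding the SbMS1 protein in the recipient plant, to obtain a transgenic plant; the transgenic plant is male-sterile.
In the above method, the nucleic acid encoding the SbMS1 protein is the DNA molecule shown in Sequence 1 or Sequence 2.
Further, the male sterility of the transgenic plant is reflected in pollen viability lower than that of the recipient plant and/or a seed-setting rate lower than that of the recipient plant.
Still further, the substance that knocks out the nucleic acid encoding the SbMS1 protein in the recipient plant is a CRISPR/Cas9 system. The target sequence of the sgRNA in the CRISPR/Cas9 system is specifically the DNA molecule shown in Sequence 4.
In any of the above uses or methods, the plant is a dicotyledon or a monocotyledon; further, the monocotyledon is a gramineous plant (family Poaceae); still further, the gramineous plant is sorghum (for example sorghum Tx430).
The final object of the invention is to provide a specific sgRNA, or an expression cassette, vector, host cell, engineered bacterium, or transgenic plant cell line containing the gene encoding the sgRNA, wherein the target sequence of the sgRNA is the DNA molecule shown in Sequence 4.
The object of the invention is to overcome the deficiencies of the prior art by providing the use of the SbMS1 protein and its encoding gene in regulating plant fertility. Experiments demonstrate that loss of function of the SbMS1 gene causes a high degree of male sterility in sorghum. Compared with the prior art, the invention has the following beneficial effect: because loss of SbMS1 gene function specifically causes high male sterility in sorghum, highly male-sterile sorghum lines can be obtained by knocking out SbMS1 or specifically suppressing its expression, which is of great value in agricultural production.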
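Brief Description of the Drawings
Figure 1 shows the SbMS1 gene structure, the target site design, and the types of gene edits at the target site.
Figure 2 is a schematic alignment of the SbMS1 genomic sequences of the wild-type sorghum variety Tx430 and the homozygous gene-edited line sbms1#8.
Figure 3 shows the plant morphology of the wild-type sorghum variety Tx430 and the homozygous gene-edited line sbms1#8.
Figure 4 shows the panicle phenotypes of the wild-type sorghum variety Tx430 and the homozygous gene-edited line sbms1#8.
Figure 5 shows I2-KI staining of pollen from the wild-type sorghum variety Tx430 and the homozygous gene-edited line sbms1#8.
Figure 6 shows statistics of pollen fertility and seed-setting rate for the wild-type sorghum variety Tx430 and the homozygous gene-edited line sbms1#8.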
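Detailed Description of the Embodiments
The following examples facilitate a better understanding of the invention but do not limit it. Unless otherwise specified, the experimental methods in the following examples are conventional methods. Unless otherwise specified, the test materials used in the following examples were purchased from conventional biochemical reagent suppliers. All quantitative experiments in the following examples were performed in triplicate, and the results were averaged.
Sorghum DNA in the following examples was extracted by a modified CTAB method, as follows. Place 0.1-0.2 g of leaf tissue in a small mortar, add an appropriate amount of liquid nitrogen, and immediately grind to a powder. Transfer the powder to a 2 ml centrifuge tube, add 800 μl of CTAB solution preheated to 65 °C, mix gently, and incubate in a 65 °C water bath. After 20 minutes, remove the tube, add 800 μl of chloroform/isoamyl alcohol (24:1), mix vigorously, centrifuge at 12,000 rpm for 10 minutes, and collect the supernatant. Add another 800 μl of chloroform/isoamyl alcohol (24:1), centrifuge at 12,000 rpm for 10 minutes, and transfer the supernatant to a new centrifuge tube. Add 600 μl of isopropanol, mix, and hold at -20 °C for at least half an hour. Pellet the precipitated DNA by centrifugation at 12,000 rpm for 10 minutes. Discard the supernatant, wash the pellet twice with 500 μl of 70% ethanol, centrifuge and dry, dissolve in 100 μl of deionized water, and store at -20 °C.
In the following examples, the genomic sequence of the SbMS1 gene is shown in Sequence 1 of the sequence listing, the CDS of the SbMS1 gene is shown in Sequence 2, and the amino acid sequence of the SbMS1 protein is shown in Sequence 3.
Sorghum Tx430 used in the following examples is described in: Sato-Izawa K, Tokue K, Ezura H. Development of a stable Agrobacterium-mediated transformation protocol for Sorghum bicolor Tx430 [J]. Plant Biotechnology, 2018, 35(2).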
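Example 1. Use of the SbMS1 protein in regulating plant fertility
I. Generation of SbMS1 knockout sorghum
1. Design of the knockout target site
Based on the SbMS1 gene sequence, a knockout target sequence for the SbMS1 gene was designed using the online plant CRISPR/Cas9 target design site CRISPR-P 2.0 (http://crispr.hzau.edu.cn/CRISPR2/). The target sequence finally selected was 5'-CCGCACGTAGGGCGAATCCA-3' (Sequence 4).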
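A few basic properties of the selected target sequence can be checked with a short Python sketch. The checks shown (valid DNA alphabet, length, GC content) are generic sanity checks chosen here for illustration; they are an assumption, not selection criteria stated in the patent or a reproduction of the CRISPR-P 2.0 scoring.

```python
# Target sequence (Sequence 4) exactly as given in the text.
TARGET = "CCGCACGTAGGGCGAATCCA"


def is_dna(seq: str) -> bool:
    """True if the sequence is non-empty and contains only A, C, G, and T."""
    return bool(seq) and set(seq.upper()) <= set("ACGT")


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


if __name__ == "__main__":
    print("valid DNA:", is_dna(TARGET))
    print("length (nt):", len(TARGET))
    print("GC content (%):", gc_percent(TARGET))
```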
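2. Preparation of the target-containing fragment for In-Fusion ligation
In a 10 μl reaction, 5 μl each of the CRISPR (Sobic) forward primer 5'-AGATGATCCGTGGCACCGCACGTAGGGCGAATCCAGTTTTAGAGCTATGC-3' and the reverse primer 5'-GCATAGCTCTAAAACTGGATTCGCCCTACGTGCGGTGCCACGGATCATCT-3' were combined, held at 94 °C for 10 min, cooled to 15 °C at 0.1 °C/s, and held at 15 °C for 10 min to complete annealing, yielding the target-containing DNA fragment for In-Fusion ligation.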
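For the two oligonucleotides to anneal into a blunt double-stranded fragment, the reverse primer has to be the reverse complement of the forward primer, and the fragment has to carry the 20-nt target. The Python sketch below checks both properties using the sequences as printed in the text; the helper name `reverse_complement` is introduced here for illustration only.

```python
# Sequences copied from the text (5'->3').
FORWARD = "AGATGATCCGTGGCACCGCACGTAGGGCGAATCCAGTTTTAGAGCTATGC"
REVERSE = "GCATAGCTCTAAAACTGGATTCGCCCTACGTGCGGTGCCACGGATCATCT"
TARGET = "CCGCACGTAGGGCGAATCCA"

# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # The two oligos are fully complementary over their whole length.
    print("fully complementary:", reverse_complement(FORWARD) == REVERSE)
    # The target sits between two 15-nt flanking arms on the forward oligo.
    start = FORWARD.find(TARGET)
    print("left arm: ", FORWARD[:start])
    print("target:   ", FORWARD[start:start + len(TARGET)])
    print("right arm:", FORWARD[start + len(TARGET):])
```

The 15-nt arms on either side of the target are presumably the homology regions used by the In-Fusion reaction; that reading is an inference from the primer layout rather than a statement in the text.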
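3. Construction of the SbMS1 knockout vector
The PCas9 vector was digested with the restriction endonuclease AarI (Cat no. ER1581, Thermo Fisher Scientific, United States) at 37 °C for 5 h to give linearized PCas9 vector. The DNA fragment obtained in step 2 (5'-AGATGATCCGTGGCACCGCACGTAGGGCGAATCCAGTTTTAGAGCTATGC-3') was then ligated into the linearized PCas9 vector using the In-Fusion HD Cloning Kit (Cat no. 639648, Takara, Japan). The product was transformed into Escherichia coli Trans-T1 competent cells (Cat no. CD501, TransGen Biotech, China) and plated on solid LB medium containing spectinomycin (Spec). Single colonies were picked and verified by sequencing with the primers 5'-CCCTTCACCGTCAGATGCTACT-3' and 5'-TGGATAATGTGCAAGGGATCTTT-3' (expected product size 1090 bp).
The nucleotide sequence of the PCas9 vector is shown in Sequence 5 of the sequence listing, in which positions 1611-1863 are the nopaline synthase terminator, positions 1992-2789 are the coding sequence of aminoglycoside phosphotransferase, positions 2882-3061 are the nopaline synthase promoter, positions 3445-7545 are the coding sequence of the Cas9 (Csn1) endonuclease of the Streptococcus pyogenes type II CRISPR/Cas system, positions 7546-7566 are the nuclear localization signal, positions 7640-9632 are the maize ubiquitin promoter, positions 9707-10087 are the rice snRNA U3 promoter, and positions 10923-11013 are the coding sequence of the single guide RNA (sgRNA).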
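The feature coordinates listed for Sequence 5 can be kept in a small table and sanity-checked programmatically. The Python sketch below uses only the 1-based, inclusive coordinates quoted in the text; the dictionary layout and function names are illustrative assumptions, and the vector sequence itself is not needed.

```python
# Feature coordinates (1-based, inclusive) as listed for Sequence 5.
FEATURES = {
    "nopaline synthase terminator": (1611, 1863),
    "aminoglycoside phosphotransferase CDS": (1992, 2789),
    "nopaline synthase promoter": (2882, 3061),
    "Cas9 (Csn1) CDS": (3445, 7545),
    "nuclear localization signal": (7546, 7566),
    "maize ubiquitin promoter": (7640, 9632),
    "rice snRNA U3 promoter": (9707, 10087),
    "sgRNA coding sequence": (10923, 11013),
}


def feature_length(start: int, end: int) -> int:
    """Length in bp of a feature given 1-based inclusive coordinates."""
    return end - start + 1


def find_overlaps(features: dict) -> list:
    """Return pairs of neighbouring features whose coordinates overlap."""
    ordered = sorted(features.items(), key=lambda item: item[1][0])
    overlaps = []
    for (name_a, (_, end_a)), (name_b, (start_b, _)) in zip(ordered, ordered[1:]):
        if start_b <= end_a:
            overlaps.append((name_a, name_b))
    return overlaps


if __name__ == "__main__":
    for name, (start, end) in FEATURES.items():
        print(f"{name}: {start}-{end} ({feature_length(start, end)} bp)")
    print("overlapping features:", find_overlaps(FEATURES))
```

With these coordinates the Cas9 coding region spans 4101 bp, a multiple of three, and the nuclear localization signal spans 21 bp, which is consistent with both being in-frame coding segments.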
测序结果表明:SbMS1基因敲除载体为将靶序列5’-CCGCACGTAGGGCGAATCCA-3’插入PCas9载体第10087位和第10923位之间,且保持其它序列不变后得到的载体。
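The feature coordinates given for SEQ ID NO: 5 can be tabulated to check element sizes and to estimate the size of the finished construct. This is a rough sketch under one stated assumption: that "inserting between positions 10087 and 10923" replaces the intervening segment (positions 10088-10922) with the 20-nt target. The `Feature` type and the function names are illustrative only.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


VECTOR_LENGTH = 16179  # <211> of SEQ ID NO: 5

FEATURES = [
    Feature("nopaline synthase terminator", 1611, 1863),
    Feature("aminoglycoside phosphotransferase CDS", 1992, 2789),
    Feature("nopaline synthase promoter", 2882, 3061),
    Feature("Cas9 (Csn1) CDS", 3445, 7545),
    Feature("nuclear localization signal", 7546, 7566),
    Feature("maize ubiquitin promoter", 7640, 9632),
    Feature("rice snRNA U3 promoter", 9707, 10087),
    Feature("sgRNA coding sequence", 10923, 11013),
]


def feature_length(feature: Feature) -> int:
    """Length in bp of a feature with inclusive 1-based coordinates."""
    return feature.end - feature.start + 1


def edited_vector_length(vector_len: int, left_end: int,
                         right_start: int, insert_len: int) -> int:
    """Vector length after the segment strictly between left_end and
    right_start is replaced by an insert (assumed cloning outcome)."""
    removed = right_start - left_end - 1
    return vector_len - removed + insert_len


for f in FEATURES:
    print(f"{f.name}: {f.start}-{f.end} ({feature_length(f)} bp)")
print("estimated knockout vector length:",
      edited_vector_length(VECTOR_LENGTH, 10087, 10923, 20), "bp")
```

The 21-bp nuclear localization signal corresponds to 7 codons, and the Cas9 coding interval is a whole number of codons (4101 bp).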
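4. Generation and identification of knockout sorghum
The SbMS1 knockout vector constructed in step 3 was introduced into Agrobacterium EHA105 to give a recombinant strain, which was used for Agrobacterium-mediated transformation of immature-embryo callus of sorghum Tx430 to give T0 transgenic plants. The T0 transgenic plants were genotyped with the primers 5'-GTGGAGCCCTGCTGCTG-3' and 5'-AAGGGCAGGCTACGACTA-3' (expected product size: 315 bp) to determine the type of edit in each plant. T1 seed was harvested and sown in the greenhouse. The T1 transgenic plants were genotyped again, and the SbMS1 gene was amplified and sequenced with the primers in Table 1 to check whether the edit was stably inherited and whether the edited site was homozygous or heterozygous. This yielded highly male-sterile plants, all of which were shown by sequencing to be homozygous for the edit; they were named sbms1#8, sbms1#11, and sbms1#20.
Table 1. Primers for gene sequencing
Primer | Product length (bp) | Forward primer | Reverse primer |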
SbMS1-1 | 921 | GGGACGAGCCTACAGGAA | TAACAGTGCGGTTGTAAGGT |
SbMS1-2 | 1210 | TCCCTCTGAAGAAACCTC | GACTCATTCGCATTGGAC |
SbMS1-3 | 1254 | CTTTCGTGATCGGTGTCC | TGCACCAGCAGTAACCAT |
SbMS1-4 | 1369 | ATGCTTAGGGAACTTGAA | AGGAGTATGAATGGTGGG |
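The 315 bp product stated for the genotyping primers 5'-GTGGAGCCCTGCTGCTG-3' and 5'-AAGGGCAGGCTACGACTA-3' can be reproduced from SEQ ID NO: 1 with a simple in-silico PCR. The sketch below is illustrative rather than a primer-design tool: it uses exact string matching on positions 61-420 of SEQ ID NO: 1 (copied from the sequence listing), and the function name `amplicon` is hypothetical.

```python
# Sketch: exact-match in-silico PCR on positions 61-420 of SEQ ID NO: 1.
REGION_START = 61  # 1-based position in SEQ ID NO: 1 of the first base below
REGION = (
    "gcggcgaggt ggagccctgc tgctggtacg cccgcggccg cgggtggaca gcgctcgggc "
    "tcgggctacg ggacgccgcc tctgagcgcg ggttgcttcg gcacgcgcgt cacgccgccg "
    "acgagtgggg gcgcgcgcgt cacgccgccc tcgaccgggg gatgctcgtc gcggccgccc "
    "aggccgccgc cttccctgga ttcgccctac gtgcgggcca agcaggcgca ggtaattcat "
    "catcggcagt accatttcgg tcatgtttag gctcagaatt actcgagcat taccgggtca "
    "gcaattagtc gtagcctgcc cttcagctca cgcatgagac tggcgactgt taccattaaa"
).replace(" ", "").upper()

FWD = "GTGGAGCCCTGCTGCTG"
REV = "AAGGGCAGGCTACGACTA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def amplicon(template: str, fwd: str, rev: str, offset: int = 1):
    """Return (start, end, length) of the product in 1-based coordinates."""
    i = template.find(fwd)
    j = template.find(revcomp(rev))
    if i < 0 or j < 0 or j < i:
        raise ValueError("primers do not flank a product on this template")
    start = offset + i
    end = offset + j + len(rev) - 1
    return start, end, end - start + 1


start, end, length = amplicon(REGION, FWD, REV, REGION_START)
print(f"amplicon {start}-{end}, {length} bp")
```

The predicted product spans the region around positions 238-262 of SEQ ID NO: 1, where the edits in the three lines are located.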
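Compared with the genomic DNA of wild-type sorghum Tx430, sbms1#8 differs only by the insertion of a single base A in the gene sequence encoding the SbMS1 protein shown in SEQ ID NO: 1; the A is inserted between positions 259 and 260 of SEQ ID NO: 1 (Figure 1). The insertion causes a frameshift and premature termination, resulting in loss of SbMS1 protein function. An alignment of the SbMS1 genomic sequences of the wild-type sorghum variety Tx430 and the homozygous gene-edited line sbms1#8 is shown in Figure 2.
Compared with the genomic DNA of wild-type sorghum Tx430, sbms1#11 differs only by a 3-bp deletion in the gene sequence encoding the SbMS1 protein shown in SEQ ID NO: 1; the deleted bases are positions 260-262 of SEQ ID NO: 1 (ATT). As a result, in the SbMS1 amino acid sequence shown in SEQ ID NO: 3, Asp (D) at position 88 is changed to Ala (A) and Ser (S) at position 89 is deleted.
Compared with the genomic DNA of wild-type sorghum Tx430, sbms1#20 differs only by a 25-bp deletion in the gene sequence encoding the SbMS1 protein shown in SEQ ID NO: 1; the deleted bases are positions 238-262 of SEQ ID NO: 1. The deletion causes a frameshift and premature termination, resulting in loss of SbMS1 protein function.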
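The consequences of the three edits follow from whether the net length change is a multiple of three. As a hedged illustration, the sketch below classifies each edit and translates a short in-frame window of SEQ ID NO: 1 (positions 253-291, copied from the listing) before and after the 1-bp insertion and the 3-bp deletion. The helper names are hypothetical, and the window is too short to show the premature stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

WINDOW_START = 253  # 1-based position in SEQ ID NO: 1; first base of a codon
WINDOW = "tccctggattcgccctacgtgcgggccaagcaggcgcag".upper()


def translate(dna: str) -> str:
    """Translate complete codons from the start of dna; ignore leftover bases."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_frameshift(net_length_change: int) -> bool:
    """An indel shifts the reading frame unless its length is a multiple of 3."""
    return net_length_change % 3 != 0


def delete(window: str, first: int, last: int, offset: int = WINDOW_START) -> str:
    """Delete 1-based positions first..last (inclusive) from the window."""
    return window[:first - offset] + window[last - offset + 1:]


def insert_after(window: str, pos: int, bases: str, offset: int = WINDOW_START) -> str:
    """Insert bases immediately after 1-based position pos."""
    k = pos - offset + 1
    return window[:k] + bases + window[k:]


edits = {"sbms1#8": +1, "sbms1#11": -3, "sbms1#20": -25}
for line, change in edits.items():
    kind = "frameshift" if is_frameshift(change) else "in-frame"
    print(f"{line}: {change:+d} bp -> {kind}")

print("wild type:", translate(WINDOW))
print("sbms1#8  :", translate(insert_after(WINDOW, 259, "A")))
print("sbms1#11 :", translate(delete(WINDOW, 260, 262)))
```

In this window the 3-bp deletion merges the Asp and Ser codons into a single Ala codon and leaves the downstream residues unchanged, whereas the 1-bp insertion changes every residue after the insertion point.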
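II. Fertility analysis of SbMS1 knockout sorghum
1. Plant and panicle morphology
Plants of the wild-type sorghum variety Tx430 and the homozygous gene-edited line sbms1#8 were examined for whole-plant morphology at flowering, and their panicles were examined at maturity.
The whole-plant morphology of wild-type Tx430 and sbms1#8 is shown in Figure 3: the plant height of sbms1#8 did not differ appreciably from that of wild-type Tx430. The panicle phenotypes of wild-type Tx430 and sbms1#8 are shown in Figure 4: wild-type Tx430 set seed normally, whereas seed set in sbms1#8 was severely impaired and only a few seeds formed. Mutation of SbMS1 therefore severely impairs seed setting in sorghum.
2. I2-KI staining of pollen
Anthers of wild-type Tx430 and sbms1#8 at flowering were placed on a glass slide with 1 drop of distilled water and thoroughly crushed with forceps to release the pollen grains; 1-2 drops of I2-KI solution were then added, a coverslip was applied, and the sample was observed under a low-power microscope. Pollen grains stained black contain starch and are highly viable; pollen grains that appear yellowish-brown are poorly developed.
The results are shown in Figure 5. With I2-KI staining, pollen of wild-type Tx430 stained normally (Figure 5, left), whereas most pollen of sbms1#8 failed to stain normally (Figure 5, right). Pollen viability of sbms1#8 was significantly lower than that of wild-type Tx430 (Figure 6, left).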
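3. Seed-setting rate
Panicles of wild-type Tx430 and the homozygous gene-edited line sbms1#8 were collected at maturity. For each panicle, the numbers of florets and of seeds set were counted on 5 randomly chosen primary branches, and the seed-setting rate was calculated with the formula "seed-setting rate = (number of seeds set / number of florets) × 100%". At least 5 plants each of wild-type Tx430 and sbms1#8 were scored.
The results are shown in Figure 6. Wild-type Tx430 set seed normally, whereas the seed-setting rate of sbms1#8 was markedly reduced (Figure 6, right). The homozygous gene-edited line sbms1#8 thus shows a partial male-sterile phenotype; that is, mutation of SbMS1 can render sorghum highly male sterile, and SbMS1 functions in regulating plant male fertility.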
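The seed-setting calculation can be written out as a short sketch. The text gives only the formula and the sampling scheme (5 primary branches per panicle, at least 5 plants per line), so the choice to pool branch counts per panicle and then average across plants is an assumption, and the counts in the usage example are made-up placeholders, not data from the patent.

```python
from statistics import mean


def seed_setting_rate(seeds_set: int, florets: int) -> float:
    """Seed-setting rate (%) = (number of seeds set / number of florets) x 100."""
    if florets <= 0:
        raise ValueError("floret count must be positive")
    return 100 * seeds_set / florets


def panicle_rate(branches) -> float:
    """Pool (seeds_set, florets) counts over the sampled primary branches.

    Pooling before dividing is an assumption; the text does not say whether
    branches are pooled or averaged.
    """
    seeds = sum(s for s, _ in branches)
    florets = sum(f for _, f in branches)
    return seed_setting_rate(seeds, florets)


def line_mean_rate(plants) -> float:
    """Average the per-panicle rates across the plants scored for one line."""
    return mean(panicle_rate(branches) for branches in plants)


# Hypothetical counts for one panicle: 5 branches, (seeds set, florets).
example_panicle = [(18, 20), (17, 20), (19, 20), (18, 20), (18, 20)]
print(f"{panicle_rate(example_panicle):.1f}%")
```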
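The foregoing describes only preferred embodiments of the invention. It should be noted that those of ordinary skill in the art may make further improvements and refinements without departing from the technical principles of the invention, and such improvements and refinements shall also be regarded as falling within the scope of protection of the invention.
Sequence Listing
<110> Institute of Crop Sciences, Chinese Academy of Agricultural Sciences
<120> Sorghum SbMS1 protein, its coding gene, and applications thereof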
<160> 5
<170> PatentIn version 3.5
<210> 1
<211> 3789
<212> DNA
<213> Artificial Sequence
<400> 1
atgccgagcg gcgggaggcg gctgccgccg tggacgtcgc cgaggagcgc gggggcgggg 60
gcggcgaggt ggagccctgc tgctggtacg cccgcggccg cgggtggaca gcgctcgggc 120
tcgggctacg ggacgccgcc tctgagcgcg ggttgcttcg gcacgcgcgt cacgccgccg 180
acgagtgggg gcgcgcgcgt cacgccgccc tcgaccgggg gatgctcgtc gcggccgccc 240
aggccgccgc cttccctgga ttcgccctac gtgcgggcca agcaggcgca ggtaattcat 300
catcggcagt accatttcgg tcatgtttag gctcagaatt actcgagcat taccgggtca 360
gcaattagtc gtagcctgcc cttcagctca cgcatgagac tggcgactgt taccattaaa 420
tggtttaggc atttctcata cacaatgtta tgttcgatcg atacttgccg gattcaggcc 480
tttaacactt caattatgtc aagccttctc ccttgtctgt agcaaagatt ttcatttttc 540
gcatgtatgt gtacttaatt tcagaagaat gaattaaagt ttctaacatg cattgcatgt 600
actcccagtt taaacgcagg gggaaatgtg gcttccttga cttaagtggc aaagtttttt 660
ttccctctga agaaacctct tcgaagaggg cacggcagtc ctgcaattca cttaagaaga 720
aaaggattga cctggtttat aagggaaact aggcccaaaa accttacaac cgcactgtta 780
ttaggaggat tacgtgccac agtctactag gtttactcaa ttgttgagac tggcaatgca 840
atggtcctct gcagactgtg aaagtctttt aacatctagt tattagttaa agtatcaata 900
tatatgtgat gttgacctga ttaactgctt aatgaaacac gtgccattgc atgttctcga 960
aaaaaaaatg ttgacctgat taaattactc aaacatttct aatcctcagt taataggtac 1020
cctgcaagct gcattatgat ggcatttggt cctctgtgta tgcactatac tgcaggctcg 1080
ttctcatatg tgtttgcaga ctcgagggta acatattggg gggcaatgat gtattatcct 1140
ttaaggaaat caccattatt gttttagaat accatattac catgtagctg taacaaaatg 1200
ccaggaaatt tctctgcact aacaaaattg tagattgggt aagcattccc aggaagtgag 1260
gaattgaaac taaaatactg gtccttctga ttgtgcatct tattaccaga aacctgctgt 1320
accgagagat tggattgaca ctatttatgt gtactggcat caactttttt ttttggaatt 1380
gtggatgttc atttaaaaaa ttgcgtttga catcctgaac atatgaagta ccataagcta 1440
aatatgcaga tttttatgtg tccttatgtt gtttctgtta ttccagatag ttgaaaagga 1500
ccccaacaag gccgttccat tgttttgggc agctataaac agtggtgacc ggactgagag 1560
cgcattgaag gatatggcca atgtactgaa acaagcaaat cgggctgaag aagccattga 1620
ggcaataaga tcctttcgtg atcggtgtcc ctatgaagct caggagtccc ttgacaatat 1680
tcttctagat ctatacaagg tatattgttt gtcctttcaa agaaagaggt tcatttttat 1740
aagctaacag ttcctctgta gctccattgg tttggttatg cagatacatt tttttttctc 1800
gaaaacttgg ttaggcagat gcatattgaa cacatttaat taattatcac attgtccaat 1860
gcgaatgagt catcttaaaa taattagtta tcacatgaat tttggatgaa aatgtcacta 1920
acacatttct tctgaaacta ataattttca tgttaaacca gaaatgtggt aggacagatg 1980
agcagatcga aatgctgaca ataaagctgc gaattgttga tgaggagcta gcttctggtc 2040
ggtggaaaac aaaaatgtct aaatctcatg gaagggtagt ctacctgtct ctcagagatg 2100
aaaaagcaag gtactggaaa tctttatatt cccatttgag atttgacttt taaacactat 2160
tcaaactaag ttccttgttt cttgccaagg ctattgggga accttgcttg ggcctatatg 2220
cagtccgaaa attatgaggg agcagagatg ctctacaggt acatcttgag ttcattcttc 2280
tcccatttct aaatgtcacg agttttctca aatggcatat tgcagatgaa cctataatcc 2340
cgccacaata ctaagtgtgt gaatgacata cttgcatgtg gtgtgctgtg tgcacctacc 2400
ctttatctca aaagggaaac aaactaattc tggctactat tgacttggca caggcaagct 2460
cttgctatag aagctgacta caacaaagag tgtaacttgg ccatctgttt gatgaagact 2520
ggaaaggtgg ctgaagctaa atacctgatt caagctatac cttacaactg tgatgatgaa 2580
agtcatgtga agtctctttc ccgggctact gaaatgctta gggaacttga attgcaatca 2640
ctcccctctc ccataactca ggcgaagtcc aaagaatcac agatttttct tgctgatgat 2700
gtggagatgc ttgtagatct acagccacaa acactatcaa ctcctttgag tgaactgaaa 2760
tataaaagac cacatatttc agtttcacaa aatgcagaga agcatgagaa ttgcaattca 2820
tggcttccat ctcccataac tcagttgaga cgtgaagaac cacacattat ggttactgct 2880
ggtgcagaaa agaatgaaag ctttgcagag ttccaagatc tttctcgact gttcaatgat 2940
gctgctacac ctcattcaat acttgagaaa ctacgaaaga ggctagttaa agaggcacca 3000
aaaatcggta ttcatgatga tcagattcag actcctattc caactgaatg cttgccaaac 3060
tctgaaagaa acctagatgc tagtgagact cccatgcaag aagggaagct attgaccaaa 3120
ggtgttaaga aaacatgggc tgacatggtg gatgaagagg aacaacaatt gggtgatgat 3180
aaaccattgg ctgacatggt ggcgaaggat gaacagcaat tgggtgaaag caagtcaaca 3240
cttggtgtgg gaactactga acaaaaggag agcagtaagc atgcaagtaa gctggaatac 3300
agaacaccat tggcctctca agaaagcagg acccatcaaa gaccattcat gggtggtcaa 3360
ctgcaaggtt cttcagcagc ttcatggaga cagaatgact ccaaaatctc catggataag 3420
aacgtgaacc gggatcttgt gaggactgct ccgacatgga gcaagcataa ggcacaagac 3480
cacaacaatc gagtttggca aaggcttgac acagttcatc cccatgagag agcctcagac 3540
acgaaccaag taccacggag aagcaacaca tctcagcgcg ctctttttcc tgactggaaa 3600
tcaaagggtg aaggacatgg ccatggttgt gttctgtttg atgataacga acgcactcag 3660
tgttccagtc acgttgaggc cactcatcgc tggcataata atgaggcaag tacagggtca 3720
tggaggccac agaaccgtct gcgggtcttc caggaaatca caaatgagat caaccaaaat 3780
gttgtgtaa 3789
<210> 2
<211> 2058
<212> DNA
<213> Artificial Sequence
<400> 2
atgccgagcg gcgggaggcg gctgccgccg tggacgtcgc cgaggagcgc gggggcgggg 60
gcggcgaggt ggagccctgc tgctggtacg cccgcggccg cgggtggaca gcgctcgggc 120
tcgggctacg ggacgccgcc tctgagcgcg ggttgcttcg gcacgcgcgt cacgccgccg 180
acgagtgggg gcgcgcgcgt cacgccgccc tcgaccgggg gatgctcgtc gcggccgccc 240
aggccgccgc cttccctgga ttcgccctac gtgcgggcca agcaggcgca gatagttgaa 300
aaggacccca acaaggccgt tccattgttt tgggcagcta taaacagtgg tgaccggact 360
gagagcgcat tgaaggatat ggccaatgta ctgaaacaag caaatcgggc tgaagaagcc 420
attgaggcaa taagatcctt tcgtgatcgg tgtccctatg aagctcagga gtcccttgac 480
aatattcttc tagatctata caagaaatgt ggtaggacag atgagcagat cgaaatgctg 540
acaataaagc tgcgaattgt tgatgaggag ctagcttctg gtcggtggaa aacaaaaatg 600
tctaaatctc atggaagggt agtctacctg tctctcagag atgaaaaagc aaggctattg 660
gggaaccttg cttgggccta tatgcagtcc gaaaattatg agggagcaga gatgctctac 720
aggcaagctc ttgctataga agctgactac aacaaagagt gtaacttggc catctgtttg 780
atgaagactg gaaaggtggc tgaagctaaa tacctgattc aagctatacc ttacaactgt 840
gatgatgaaa gtcatgtgaa gtctctttcc cgggctactg aaatgcttag ggaacttgaa 900
ttgcaatcac tcccctctcc cataactcag gcgaagtcca aagaatcaca gatttttctt 960
gctgatgatg tggagatgct tgtagatcta cagccacaaa cactatcaac tcctttgagt 1020
gaactgaaat ataaaagacc acatatttca gtttcacaaa atgcagagaa gcatgagaat 1080
tgcaattcat ggcttccatc tcccataact cagttgagac gtgaagaacc acacattatg 1140
gttactgctg gtgcagaaaa gaatgaaagc tttgcagagt tccaagatct ttctcgactg 1200
ttcaatgatg ctgctacacc tcattcaata cttgagaaac tacgaaagag gctagttaaa 1260
gaggcaccaa aaatcggtat tcatgatgat cagattcaga ctcctattcc aactgaatgc 1320
ttgccaaact ctgaaagaaa cctagatgct agtgagactc ccatgcaaga agggaagcta 1380
ttgaccaaag gtgttaagaa aacatgggct gacatggtgg atgaagagga acaacaattg 1440
ggtgatgata aaccattggc tgacatggtg gcgaaggatg aacagcaatt gggtgaaagc 1500
aagtcaacac ttggtgtggg aactactgaa caaaaggaga gcagtaagca tgcaagtaag 1560
ctggaataca gaacaccatt ggcctctcaa gaaagcagga cccatcaaag accattcatg 1620
ggtggtcaac tgcaaggttc ttcagcagct tcatggagac agaatgactc caaaatctcc 1680
atggataaga acgtgaaccg ggatcttgtg aggactgctc cgacatggag caagcataag 1740
gcacaagacc acaacaatcg agtttggcaa aggcttgaca cagttcatcc ccatgagaga 1800
gcctcagaca cgaaccaagt accacggaga agcaacacat ctcagcgcgc tctttttcct 1860
gactggaaat caaagggtga aggacatggc catggttgtg ttctgtttga tgataacgaa 1920
cgcactcagt gttccagtca cgttgaggcc actcatcgct ggcataataa tgaggcaagt 1980
acagggtcat ggaggccaca gaaccgtctg cgggtcttcc aggaaatcac aaatgagatc 2040
aaccaaaatg ttgtgtaa 2058
<210> 3
<211> 685
<212> PRT
<213> Artificial Sequence
<400> 3
Met Pro Ser Gly Gly Arg Arg Leu Pro Pro Trp Thr Ser Pro Arg Ser
1 5 10 15
Ala Gly Ala Gly Ala Ala Arg Trp Ser Pro Ala Ala Gly Thr Pro Ala
20 25 30
Ala Ala Gly Gly Gln Arg Ser Gly Ser Gly Tyr Gly Thr Pro Pro Leu
35 40 45
Ser Ala Gly Cys Phe Gly Thr Arg Val Thr Pro Pro Thr Ser Gly Gly
50 55 60
Ala Arg Val Thr Pro Pro Ser Thr Gly Gly Cys Ser Ser Arg Pro Pro
65 70 75 80
Arg Pro Pro Pro Ser Leu Asp Ser Pro Tyr Val Arg Ala Lys Gln Ala
85 90 95
Gln Ile Val Glu Lys Asp Pro Asn Lys Ala Val Pro Leu Phe Trp Ala
100 105 110
Ala Ile Asn Ser Gly Asp Arg Thr Glu Ser Ala Leu Lys Asp Met Ala
115 120 125
Asn Val Leu Lys Gln Ala Asn Arg Ala Glu Glu Ala Ile Glu Ala Ile
130 135 140
Arg Ser Phe Arg Asp Arg Cys Pro Tyr Glu Ala Gln Glu Ser Leu Asp
145 150 155 160
Asn Ile Leu Leu Asp Leu Tyr Lys Lys Cys Gly Arg Thr Asp Glu Gln
165 170 175
Ile Glu Met Leu Thr Ile Lys Leu Arg Ile Val Asp Glu Glu Leu Ala
180 185 190
Ser Gly Arg Trp Lys Thr Lys Met Ser Lys Ser His Gly Arg Val Val
195 200 205
Tyr Leu Ser Leu Arg Asp Glu Lys Ala Arg Leu Leu Gly Asn Leu Ala
210 215 220
Trp Ala Tyr Met Gln Ser Glu Asn Tyr Glu Gly Ala Glu Met Leu Tyr
225 230 235 240
Arg Gln Ala Leu Ala Ile Glu Ala Asp Tyr Asn Lys Glu Cys Asn Leu
245 250 255
Ala Ile Cys Leu Met Lys Thr Gly Lys Val Ala Glu Ala Lys Tyr Leu
260 265 270
Ile Gln Ala Ile Pro Tyr Asn Cys Asp Asp Glu Ser His Val Lys Ser
275 280 285
Leu Ser Arg Ala Thr Glu Met Leu Arg Glu Leu Glu Leu Gln Ser Leu
290 295 300
Pro Ser Pro Ile Thr Gln Ala Lys Ser Lys Glu Ser Gln Ile Phe Leu
305 310 315 320
Ala Asp Asp Val Glu Met Leu Val Asp Leu Gln Pro Gln Thr Leu Ser
325 330 335
Thr Pro Leu Ser Glu Leu Lys Tyr Lys Arg Pro His Ile Ser Val Ser
340 345 350
Gln Asn Ala Glu Lys His Glu Asn Cys Asn Ser Trp Leu Pro Ser Pro
355 360 365
Ile Thr Gln Leu Arg Arg Glu Glu Pro His Ile Met Val Thr Ala Gly
370 375 380
Ala Glu Lys Asn Glu Ser Phe Ala Glu Phe Gln Asp Leu Ser Arg Leu
385 390 395 400
Phe Asn Asp Ala Ala Thr Pro His Ser Ile Leu Glu Lys Leu Arg Lys
405 410 415
Arg Leu Val Lys Glu Ala Pro Lys Ile Gly Ile His Asp Asp Gln Ile
420 425 430
Gln Thr Pro Ile Pro Thr Glu Cys Leu Pro Asn Ser Glu Arg Asn Leu
435 440 445
Asp Ala Ser Glu Thr Pro Met Gln Glu Gly Lys Leu Leu Thr Lys Gly
450 455 460
Val Lys Lys Thr Trp Ala Asp Met Val Asp Glu Glu Glu Gln Gln Leu
465 470 475 480
Gly Asp Asp Lys Pro Leu Ala Asp Met Val Ala Lys Asp Glu Gln Gln
485 490 495
Leu Gly Glu Ser Lys Ser Thr Leu Gly Val Gly Thr Thr Glu Gln Lys
500 505 510
Glu Ser Ser Lys His Ala Ser Lys Leu Glu Tyr Arg Thr Pro Leu Ala
515 520 525
Ser Gln Glu Ser Arg Thr His Gln Arg Pro Phe Met Gly Gly Gln Leu
530 535 540
Gln Gly Ser Ser Ala Ala Ser Trp Arg Gln Asn Asp Ser Lys Ile Ser
545 550 555 560
Met Asp Lys Asn Val Asn Arg Asp Leu Val Arg Thr Ala Pro Thr Trp
565 570 575
Ser Lys His Lys Ala Gln Asp His Asn Asn Arg Val Trp Gln Arg Leu
580 585 590
Asp Thr Val His Pro His Glu Arg Ala Ser Asp Thr Asn Gln Val Pro
595 600 605
Arg Arg Ser Asn Thr Ser Gln Arg Ala Leu Phe Pro Asp Trp Lys Ser
610 615 620
Lys Gly Glu Gly His Gly His Gly Cys Val Leu Phe Asp Asp Asn Glu
625 630 635 640
Arg Thr Gln Cys Ser Ser His Val Glu Ala Thr His Arg Trp His Asn
645 650 655
Asn Glu Ala Ser Thr Gly Ser Trp Arg Pro Gln Asn Arg Leu Arg Val
660 665 670
Phe Gln Glu Ile Thr Asn Glu Ile Asn Gln Asn Val Val
675 680 685
<210> 4
<211> 20
<212> DNA
<213> Artificial Sequence
<400> 4
ccgcacgtag ggcgaatcca 20
<210> 5
<211> 16179
<212> DNA
<213> Artificial Sequence
<400> 5
gtcatgcatg atatatctcc caatttgtgt agggcttatt atgcacgctt aaaaataata 60
aaagcagact tgacctgata gtttggctgt gagcaattat gtgcttagtg catctaatcg 120
cttgagttaa cgccggcgaa gcggcgtcgg cttgaacgaa tttctagcta gacattattt 180
gccgactacc ttggtgatct cgcctttcac gtagtggaca aattcttcca actgatctgc 240
gcgcgaggcc aagcgatctt cttcttgtcc aagataagcc tgtctagctt caagtatgac 300
gggctgatac tgggccggca ggcgctccat tgcccagtcg gcagcgacat ccttcggcgc 360
gattttgccg gttactgcgc tgtaccaaat gcgggacaac gtaagcacta catttcgctc 420
atcgccagcc cagtcgggcg gcgagttcca tagcgttaag gtttcattta gcgcctcaaa 480
tagatcctgt tcaggaaccg gatcaaagag ttcctccgcc gctggaccta ccaaggcaac 540
gctatgttct cttgcttttg tcagcaagat agccagatca atgtcgatcg tggctggctc 600
gaagatacct gcaagaatgt cattgcgctg ccattctcca aattgcagtt cgcgcttagc 660
tggataacgc cacggaatga tgtcgtcgtg cacaacaatg gtgacttcta cagcgcggag 720
aatctcgctc tctccagggg aagccgaagt ttccaaaagg tcgttgatca aagctcgccg 780
cgttgtttca tcaagcctta cggtcaccgt aaccagcaaa tcaatatcac tgtgtggctt 840
caggccgcca tccactgcgg agccgtacaa atgtacggcc agcaacgtcg gttcgagatg 900
gcgctcgatg acgccaacta cctctgatag ttgagtcgat acttcggcga tcaccgcttc 960
ccccatgatg tttaactttg ttttagggcg actgccctgc tgcgtaacat cgttgctgct 1020
ccataacatc aaacatcgac ccacggcgta acgcgcttgc tgcttggatg cccgaggcat 1080
agactgtacc ccaaaaaaac atgtcataac aagaagccat gaaaaccgcc actgcgccgt 1140
taccaccgct gcgttcggtc aaggttctgg accagttgcg tgacggcagt tacgctactt 1200
gcattacagc ttacgaaccg aacgaggctt atgtccactg ggttcgtgcc cgaattgatc 1260
acaggcagca acgctctgtc atcgttacaa tcaacatgct accctccgcg agatcatccg 1320
tgtttcaaac ccggcagctt agttgccgtt cttccgaata gcatcggtaa catgagcaaa 1380
gtctgccgcc ttacaacggc tctcccgctg acgccgtccc ggactgatgg gctgcctgta 1440
tcgagtggtg attttgtgcc gagctgccgg tcggggagct gttggctggc tggtggcagg 1500
atatattgtg gtgtaaacaa attgacgctt agacaactta ataacacatt gcggacgttt 1560
ttaatgtact gaattaacgc cgaattgaat tatcagcttg catgccggtc gatctagtaa 1620
catagatgac accgcgcgcg ataatttatc ctagtttgcg cgctatattt tgttttctat 1680
cgcgtattaa atgtataatt gcgggactct aatcataaaa acccatctca taaataacgt 1740
catgcattac atgttaatta ttacatgctt aacgtaattc aacagaaatt atatgataat 1800
catcgcaaga ccggcaacag gattcaatct taagaaactt tattgccaaa tgtttgaacg 1860
atctgcttga ctctagggaa ttaattcctg aatcactgcg accggccctc ccgcgaccca 1920
gccgagcgag cttagcgaac tgtggacgag aactgtgcca ccaagcgtaa ggccgttctc 1980
tcgcattccg ctcagaagaa ctcgtcaaga aggcgataga aggcgatgcg ctgcgaatcg 2040
ggagcggcga taccgtaaag cacgaggaag cggtcagccc attcgccgcc aagctcttca 2100
gcaatatcac gggtagccaa cgctatgtcc tgatagcggt ccgccacacc cagccggcca 2160
cagtcgatga atccagaaaa gcggccattt tccaccatga tattcggcaa gcaggcatcg 2220
ccatgtgtca cgacgagatc ctcgccgtcg ggcatgcgcg ccttgagcct ggcgaacagt 2280
tcggctggcg cgagcccctg atgctcttcg tccagatcat cctgatcgac aagaccggct 2340
tccatccgag tacgtgctcg ctcgatgcga tgtttcgctt ggtggtcgaa tgggcaggta 2400
gccggatcaa gcgtatgcag ccgccgcatt gcatcagcca tgatggatac tttctcggca 2460
ggagcaaggt gagatgacag gagatcctgc cccggcactt cgcccaatag cagccagtcc 2520
cttcccgctt cagtgacaac gtcgagcaca gctgcgcaag gaacgcccgt cgtggccagc 2580
cacgatagcc gcgctgcctc gtcctggagt tcattcaggg caccggacag gtcggtcttg 2640
acaaaaagaa ccgggcgccc ctgcgctgac agccgaaaca cggcggcatc agagcagccg 2700
attgtctgtt gtgcccagtc atagccgaat agcctctcca cccaagcggc cggagaacct 2760
gcgtgcaatc catcttgttc aatccccatg gtcgatcgac agatctgcga aagctcgaga 2820
gagatagatt tgtagagaga gactggtgag gggattcgag ttgagagtga atatgagact 2880
ctaattggat accgagggga atttatggaa cgtcagtgga gcatttttga caagaaatat 2940
ttgctagctg atagtgacct taggcgactt ttgaacgcgc aataatggtt tctgacgtat 3000
gtgcttagct cattaaactc cagaaacccg cggctcagtg gctccttcaa cgttgcggtt 3060
ctgtcagttc caaaggtacc cggggatcct ctagagggcc cgacgtcgca tgcctgcagg 3120
tcactggatt ttggttttag gaattagaaa ttttattgat agaagtattt tacaaataca 3180
aatacatact aagggtttct tatatgctca acacatgagc gaaaccctat aagaacccta 3240
attcccttat ctgggaacta ctcacacatt attctggaga aaaatagaga gagatagatt 3300
tgtagagaga gactggtgat ttttgcggac tctagcatgg ccgcggctag tcagttagat 3360
cgacgtcgca tgctcccggc cgccatggcc gcgggatatc accactttgt acaagaaagc 3420
tgggtcggcg cgcccaccct ttcaatcgcc gccgagttgt gagaggtcga tgcgtgtctc 3480
gtagaggcct gtgatagact ggtggatgag ggtggcgtcg agaacctcct tggtagaggt 3540
gtagcgcttg cggtcgatgg tggtgtcgaa gtacttgaag gcggctggag cgccgaggtt 3600
ggtgagggtg aagaggtgga tgatgttctc ggcctgctcg cgaattggct tatcgcggtg 3660
cttgttgtag gcgctgagca ccttatcgag gttggcatcg gcgaggatca cgcgcttgga 3720
gaactcggag atctgctcga tgatctcgtc gaggtagtgc ttgtgctgct cgacgaacag 3780
ctgcttttgc tcgttgtcct ctggggagcc cttgagcttc tcgtagtggg aggcgaggta 3840
gaggaagttc acgtacttgg acgggagagc aagctcgttg cccttctgaa gctcgccagc 3900
agaggcgagc attctcttgc ggccgttctc aagctcgaag aggctgtact tcgggagctt 3960
gatgatgagg tccttcttca cctccttgta gcccttggcc tcgaggaagt cgattgggtt 4020
cttctcgaag ctgctgcgct ccatgatcgt gatgcccagc agctccttga cggacttgag 4080
cttcttgctc ttgcccttct cgaccttggc aaccacgagc acagagtagg ccacggtcgg 4140
agaatcgaag ccgccatact tcttcgggtc ccagtccttc ttgcgggcga tcagcttgtc 4200
ggagttgcgc tttgggagga tggactcctt ggagaagccg ccggtctgaa cctcggtctt 4260
cttcacgatg ttcacttgcg gcatggagag caccttgcgc actgtggcga aatccctgcc 4320
cttgtcccac acgatctcgc ctgtctcgcc gtttgtctcg atgagcggcc tcttcctaat 4380
ctcgccgttg gcgagcgtga tctcggtctt gaagaaattc atgatgttgg agtagaagaa 4440
gtacttggcg gtcgccttgc cgatctcttg ctcggacttg gcgatcatct tgcgcacgtc 4500
gtacaccttg tagtcgccgt acacgaactc ggactcgagc tttgggtact tcttgatgag 4560
ggctgtgccc accacggcat tgaggtaggc gtcgtgggcg tggtggtagt tgttgatctc 4620
gcgcaccttg tagaactgga agtccttgcg gaagtcggac acgagcttgg acttgagggt 4680
gatgaccttc acctcgcgga tgagcttgtc gttctcgtcg tacttggtgt tcatgcggga 4740
gtcgaggatc tgggccacgt gctttgtgat ctggcgtgtc tcgacgagct ggcgcttgat 4800
gaagccggcc ttatcaagct cggaaaggcc gcctctctcg gccttggtga ggttgtcgaa 4860
cttcctctgg gtgatgagct tggcgttgag gagctggcgc cagtagttct tcatcttctt 4920
gacgacctct tcggacggca cgttatcgga cttgcccctg ttcttgtcgg agcgggtgag 4980
caccttgttg tcgatggagt cgtccttcag gaaggactgc ggcacaatat ggtccacgtc 5040
gtagtcggag aggcggttga tgtccagctc ttggtccacg tacatgtcgc ggccgttctg 5100
gaggtagtag aggtagagct tctcgttctg gagctgggtg ttctcgactg ggtgctcctt 5160
gaggatctgg gagcccagct ccttaatgcc ctcctcgatc ctcttcatgc gctcgcggga 5220
gttcttttgg cccttctgtg tggtctggtt ctcgcgggcc atctcgatca cgatgttctc 5280
tggcttgtgc ctgcccatca ccttcaccag ctcgtccacc accttcacgg tctggagaat 5340
gcccttcttg atagccgggg agccggcgag attggcgata tgctcatgga gggaatcgcc 5400
ttggccggac acctgggcct tttggatgtc ctccttgaag gtgagggagt cgtcgtggat 5460
gagctgcatg aagttgcggt tggcgaagcc gtcggacttg aggaagtcga ggatcgtctt 5520
gccggactgc ttgtcgcgga tgccgttgat gagcttccta gagagcctgc cccagccggt 5580
atagcgcctg cgcttcagct gcttcatcac cttgtcgtcg aagaggtggg cgtatgtctt 5640
gaggcgctcc tcgatcatct cgcggtcctc gaagagggtg agggtgagca cgatgtcctc 5700
gaggatgtcc tcgttctcct cgttgtcgag gaagtccttg tccttgataa tcttgaggag 5760
gtcgtggtag gtcccgaggg aggcattgaa cctatcctcg acgccggaga tctcgacgga 5820
gtcgaagcac tcgattttct tgaagtagtc ctccttgagc tgcttcacgg tcaccttgcg 5880
gttggtcttg aacagcaggt cgacgatggc cttcttttgc tcgccgctaa ggaaagctgg 5940
cttcctcatc ccctcggtca cgtacttcac cttggtcagc tcgttgtaca cggtgaagta 6000
ctcgtagagg agtgagtgct tcgggagcac cttctcgttc gggaggttct tgtcgaagtt 6060
ggtcatgcgc tcgatgaaag actgggcaga ggcgccctta tccaccacct cctcgaagtt 6120
ccagggggtg attgtctcct cggactttct ggtcatccag gcgaacctgg agttgcccct 6180
ggcgagcggg cccacgtagt acgggatgcg gaaggtgagg atcttctcaa tcttctcgcg 6240
gttgtccttg aggaacgggt agaagtcctc ttgcctgcgg aggatagcat gaagctcgcc 6300
gaggtggatc tggtgcggga tggagccatt atcgaaggtg cgctgcttgc ggaggaggtc 6360
ctctctattg agcttcacga gcagctcctc ggtgccgtcc atcttctcga ggatcggctt 6420
gatgaacttg tagaactcct cttgagaagc gccgccatcg atgtagccgg cgtagccgtt 6480
cttggactgg tcgaagaaga tctccttgta cttctctggg agctgctgtc tcacgagggc 6540
cttgaggagt gtgaggtcct ggtggtgctc gtcgtacctc ttgatcatgg aggcggagag 6600
tggggccttg gtgatctcgg tgttcaccct gaggatgtcg ctgaggagga tggcgtcgga 6660
gagattcttg gcggcgagga acagatcggc gtactgatcg ccaatctggg cgaggagatt 6720
gtcgaggtcg tcgtcgtagg tgtccttgga aagctggagc ttggcgtcct cggcgaggtc 6780
gaagttggac ttgaagttcg gggtgaggcc aagagagagg gcgatcaggt tgccgaagag 6840
gccattcttc ttctcgcccg gaagttgggc gatcagattc tcgagcctgc gggacttaga 6900
gagcctggca gagagaatag ccttggcgtc aacgccagag gcgttgatcg ggttctcctc 6960
gaacagctgg ttgtaggtct gcacgagctg gatgaacagc ttgtccacat cggagttgtc 7020
cgggttgagg tcgccctcga tgaggaagtg gcccctgaac ttgatcatgt gggcgagggc 7080
gaggtagatg agcctgaggt cggccttatc ggtggagtcg acgagcttct tgcggaggtg 7140
gtagatggtc gggtacttct cgtggtaggc cacctcatcc acgatgttgc cgaagatcgg 7200
atggcgctcg tgcttcttgt cctcctcgac gaggaagctc tcctcgagcc tgtggaagaa 7260
gctgtcgtcc accttggcca tctcgttgga gaagatctct tggaggtagc agatgcggtt 7320
cttgcgcctg gtgtacctgc gtctagcggt cctcttgagc cttgtagcct cggctgtctc 7380
gccagagtcg aacagcaggg cgccgatgag attcttcttg atggagtggc ggtcggtgtt 7440
gccgaggacc ttgaacttct tggacggcac cttgtactcg tcggtgatca cggcccagcc 7500
aacagaattg gtgccgatgt cgaggccgat ggagtacttc ttgtcgacct tgcgcttctt 7560
ctttggggcc atggtgaagg gggcggccgc ggagcctgct tttttgtaca aacttgcccc 7620
gggatcctct agagtcgacc tgcagaagta acaccaaaca acagggtgag catcgacaaa 7680
agaaacagta ccaagcaaat aaatagcgta tgaaggcagg gctaaaaaaa tccacatata 7740
gctgctgcat atgccatcat ccaagtatat caagatcaaa ataattataa aacatacttg 7800
tttattataa tagataggta ctcaaggtta gagcatatga atagatgctg catatgccat 7860
catgtatatg catcagtaaa acccacatca acatgtatac ctatcctaga tcgatatttc 7920
catccatctt aaactcgtaa ctatgaagat gtatgacaca cacatacagt tccaaaatta 7980
ataaatacac caggtagttt gaaacagtat tctactccga tctagaacga atgaacgacc 8040
gcccaaccac accacatcat cacaaccaag cgaacaaaaa gcatctctgt atatgcatca 8100
gtaaaacccg catcaacatg tatacctatc ctagatcgat atttccatcc atcatcttca 8160
attcgtaact atgaatatgt atggcacaca catacagatc caaaattaat aaatccacca 8220
ggtagtttga aacagaattc tactccgatc tagaacgacc gcccaaccag accacatcat 8280
cacaaccaag acaaaaaaaa gcatgaaaag atgacccgac aaacaagtgc acggcatata 8340
ttgaaataaa ggaaaagggc aaaccaaacc ctatgcaacg aaacaaaaaa aatcatgaaa 8400
tcgatcccgt ctgcggaacg gctagagcca tcccaggatt ccccaaagag aaacactggc 8460
aagttagcaa tcagaacgtg tctgacgtac aggtcgcatc cgtgtacgaa cgctagcagc 8520
acggatctaa cacaaacacg gatctaacac aaacatgaac agaagtagaa ctaccgggcc 8580
ctaaccatgg accggaacgc cgatctagag aaggtagaga gggggggggg gggaggacga 8640
gcggcgtacc ttgaagcgga ggtgccgacg ggtggatttg ggggagatct ggttgtgtgt 8700
gtgtgcgctc cgaacaacac gaggttgggg aaagagggtg tggagggggt gtctatttat 8760
tacggcgggc gaggaaggga aagcgaagga gcggtgggaa aggaatcccc cgtagctgcc 8820
ggtgccgtga gaggaggagg aggccgcctg ccgtgccggc tcacgtctgc cgctccgcca 8880
cgcaatttct ggatgccgac agcggagcaa gtccaacggt ggagcggaac tctcgagagg 8940
ggtccagagg cagcgacaga gatgccgtgc cgtctgcttc gcttggcccg acgcgacgct 9000
gctggttcgc tggttggtgt ccgttagact cgtcgacggc gtttaacagg ctggcattat 9060
ctactcgaaa caagaaaaat gtttccttag tttttttaat ttcttaaagg gtatttgttt 9120
aatttttagt cactttattt tattctattt tatatctaaa ttattaaata aaaaaactaa 9180
aatagagttt tagttttctt aatttagagg ctaaaataga ataaaataga tgtactaaaa 9240
aaattagtct ataaaaacca ttaaccctaa accctaaatg gatgtactaa taaaatggat 9300
gaagtattat ataggtgaag ctatttgcaa aaaaaaagga gaacacatgc acactaaaaa 9360
gataaaactg tagagtcctg ttgtcaaaat actcaattgt cctttagacc atgtctaact 9420
gttcatttat atgattctct aaaacactga tattattgta gtactataga ttatattatt 9480
cgtagagtaa agtttaaata tatgtataaa gatagataaa ctgcacttca aacaagtgtg 9540
acaaaaaaaa tatgtggtaa ttttttataa cttagacatg caatgctcat tatctctaga 9600
gaggggcacg accgggtcac gctgcactgc aggcatgcaa gcttgatctc tagaaccact 9660
ttgtacaaga aagctgggtc ggcgcgccca cccttggata atgtgcaagg gatctttaaa 9720
catacgaaca gatcacttaa agttcttctg aagcaactta aagttatcag gcatgcatgg 9780
atcttggagg aatcagatgt gcagtcaggg accatagcac aagacaggcg tcttctactg 9840
gtgctaccag caaatgctgg aagccgggaa cactgggtac gttggaaacc acgtgatgtg 9900
aagaagtaag ataaactgta ggagaaaagc atttcgtagt gggccatgaa gcctttcagg 9960
acatgtattg cagtatgggc cggcccatta cgcaattgga cgacaacaaa gactagtatt 10020
agtaccacct cggctatcca catagatcaa agctgattta aaagagttgt gcagatgatc 10080
cgtggcagct cgcaggtggc ggccgcatta ggcaccccag gctttacact ttatgcttcc 10140
ggctcgtata atgtgtggat tttgagttag gatccggcga gattttcagg agctaaggaa 10200
gctaaaatgg agaaaaaaat cactggatat accaccgttg atatatccca atggcatcgt 10260
aaagaacatt ttgaggcatt tcagtcagtt gctcaatgta cctataacca gaccgttcag 10320
ctggatatta cggccttttt aaagaccgta aagaaaaata agcacaagtt ttatccggcc 10380
tttattcaca ttcttgcccg cctgatgaat gctcatccgg aattccgtat ggcaatgaaa 10440
gacggtgagc tggtgatatg ggatagtgtt cacccttgtt acaccgtttt ccatgagcaa 10500
actgaaacgt tttcatcgct ctggagtgaa taccacgacg atttccggca gtttctacac 10560
atatattcgc aagatgtggc gtgttacggt gaaaacctgg cctatttccc taaagggttt 10620
attgagaata tgtttttcgt ctcagccaat ccctgggtga gtttcaccag ttttgattta 10680
aacgtggcca atatggacaa cttcttcgcc cccgttttca ccatgggcaa atattatacg 10740
caaggcgaca aggtgctgat gccgctggcg attcaggttc atcatgccgt ctgtgatggc 10800
ttccatgtcg gcagaatgct taatgaatta caacagtact gcgatgagtg gcagggcggg 10860
gcgtaaacgc gtggatccgg cttactaaaa gccagataac agtatgcgta tcacctgcac 10920
acgttttaga gctatgctga aaagcatagc aagttaaaat aaggctagtc cgttatcaac 10980
ttgaaaaagt ggcaccgagt cggtgctttt ttttagtagt agcatctgac ggtgaagggg 11040
gcggccgcgg agcctgcttt tttgtacaaa gttgtaagct tagcttgagc ttggatcaga 11100
ttgtcgtttc ccgccttcag tttaaactat cagtgtttga caggatatat tggcgggtaa 11160
acctaagaga aaagagcgtt tattagaata acggatattt aaaagggcgt gaaaaggttt 11220
atccgttcgt ccatttgtat gtgcatgcca accacagggt tcccctcggg atcaaagtac 11280
tttgatccaa cccctccgct gctatagtgc agtcggcttc tgacgttcag tgcagccgtc 11340
ttctgaaaac gacatgtcgc acaagtccta agttacgcga caggctgccg ccctgccctt 11400
ttcctggcgt tttcttgtcg cgtgttttag tcgcataaag tagaatactt gcgactagaa 11460
ccggagacat tacgccatga acaagagcgc cgccgctggc ctgctgggct atgcccgcgt 11520
cagcaccgac gaccaggact tgaccaacca acgggccgaa ctgcacgcgg ccggctgcac 11580
caagctgttt tccgagaaga tcaccggcac caggcgcgac cgcccggagc tggccaggat 11640
gcttgaccac ctacgccctg gcgacgttgt gacagtgacc aggctagacc gcctggcccg 11700
cagcacccgc gacctactgg acattgccga gcgcatccag gaggccggcg cgggcctgcg 11760
tagcctggca gagccgtggg ccgacaccac cacgccggcc ggccgcatgg tgttgaccgt 11820
gttcgccggc attgccgagt tcgagcgttc cctaatcatc gaccgcaccc ggagcgggcg 11880
cgaggccgcc aaggcccgag gcgtgaagtt tggcccccgc cctaccctca ccccggcaca 11940
gatcgcgcac gcccgcgagc tgatcgacca ggaaggccgc accgtgaaag aggcggctgc 12000
actgcttggc gtgcatcgct cgaccctgta ccgcgcactt gagcgcagcg aggaagtgac 12060
gcccaccgag gccaggcggc gcggtgcctt ccgtgaggac gcattgaccg aggccgacgc 12120
cctggcggcc gccgagaatg aacgccaaga ggaacaagca tgaaaccgca ccaggacggc 12180
caggacgaac cgtttttcat taccgaagag atcgaggcgg agatgatcgc ggccgggtac 12240
gtgttcgagc cgcccgcgca cgtctcaacc gtgcggctgc atgaaatcct ggccggtttg 12300
tctgatgcca agctggcggc ctggccggcc agcttggccg ctgaagaaac cgagcgccgc 12360
cgtctaaaaa ggtgatgtgt atttgagtaa aacagcttgc gtcatgcggt cgctgcgtat 12420
atgatgcgat gagtaaataa acaaatacgc aaggggaacg catgaaggtt atcgctgtac 12480
ttaaccagaa aggcgggtca ggcaagacga ccatcgcaac ccatctagcc cgcgccctgc 12540
aactcgccgg ggccgatgtt ctgttagtcg attccgatcc ccagggcagt gcccgcgatt 12600
gggcggccgt gcgggaagat caaccgctaa ccgttgtcgg catcgaccgc ccgacgattg 12660
accgcgacgt gaaggccatc ggccggcgcg acttcgtagt gatcgacgga gcgccccagg 12720
cggcggactt ggctgtgtcc gcgatcaagg cagccgactt cgtgctgatt ccggtgcagc 12780
caagccctta cgacatatgg gccaccgccg acctggtgga gctggttaag cagcgcattg 12840
aggtcacgga tggaaggcta caagcggcct ttgtcgtgtc gcgggcgatc aaaggcacgc 12900
gcatcggcgg tgaggttgcc gaggcgctgg ccgggtacga gctgcccatt cttgagtccc 12960
gtatcacgca gcgcgtgagc tacccaggca ctgccgccgc cggcacaacc gttcttgaat 13020
cagaacccga gggcgacgct gcccgcgagg tccaggcgct ggccgctgaa attaaatcaa 13080
aactcatttg agttaatgag gtaaagagaa aatgagcaaa agcacaaaca cgctaagtgc 13140
cggccgtccg agcgcacgca gcagcaaggc tgcaacgttg gccagcctgg cagacacgcc 13200
agccatgaag cgggtcaact ttcagttgcc ggcggaggat cacaccaagc tgaagatgta 13260
cgcggtacgc caaggcaaga ccattaccga gctgctatct gaatacatcg cgcagctacc 13320
agagtaaatg agcaaatgaa taaatgagta gatgaatttt agcggctaaa ggaggcggca 13380
tggaaaatca agaacaacca ggcaccgacg ccgtggaatg ccccatgtgt ggaggaacgg 13440
gcggttggcc aggcgtaagc ggctgggttg tctgccggcc ctgcaatggc actggaaccc 13500
ccaagcccga ggaatcggcg tgacggtcgc aaaccatccg gcccggtaca aatcggcgcg 13560
gcgctgggtg atgacctggt ggagaagttg aaggccgcgc aggccgccca gcggcaacgc 13620
atcgaggcag aagcacgccc cggtgaatcg tggcaagcgg ccgctgatcg aatccgcaaa 13680
gaatcccggc aaccgccggc agccggtgcg ccgtcgatta ggaagccgcc caagggcgac 13740
gagcaaccag attttttcgt tccgatgctc tatgacgtgg gcacccgcga tagtcgcagc 13800
atcatggacg tggccgtttt ccgtctgtcg aagcgtgacc gacgagctgg cgaggtgatc 13860
cgctacgagc ttccagacgg gcacgtagag gtttccgcag ggccggccgg catggccagt 13920
gtgtgggatt acgacctggt actgatggcg gtttcccatc taaccgaatc catgaaccga 13980
taccgggaag ggaagggaga caagcccggc cgcgtgttcc gtccacacgt tgcggacgta 14040
ctcaagttct gccggcgagc cgatggcgga aagcagaaag acgacctggt agaaacctgc 14100
attcggttaa acaccacgca cgttgccatg cagcgtacga agaaggccaa gaacggccgc 14160
ctggtgacgg tatccgaggg tgaagccttg attagccgct acaagatcgt aaagagcgaa 14220
accgggcggc cggagtacat cgagatcgag ctagctgatt ggatgtaccg cgagatcaca 14280
gaaggcaaga acccggacgt gctgacggtt caccccgatt actttttgat cgatcccggc 14340
atcggccgtt ttctctaccg cctggcacgc cgcgccgcag gcaaggcaga agccagatgg 14400
ttgttcaaga cgatctacga acgcagtggc agcgccggag agttcaagaa gttctgtttc 14460
accgtgcgca agctgatcgg gtcaaatgac ctgccggagt acgatttgaa ggaggaggcg 14520
gggcaggctg gcccgatcct agtcatgcgc taccgcaacc tgatcgaggg cgaagcatcc 14580
gccggttcct aatgtacgga gcagatgcta gggcaaattg ccctagcagg ggaaaaaggt 14640
cgaaaaggtc tctttcctgt ggatagcacg tacattggga acccaaagcc gtacattggg 14700
aaccggaacc cgtacattgg gaacccaaag ccgtacattg ggaaccggtc acacatgtaa 14760
gtgactgata taaaagagaa aaaaggcgat ttttccgcct aaaactcttt aaaacttatt 14820
aaaactctta aaacccgcct ggcctgtgca taactgtctg gccagcgcac agccgaagag 14880
ctgcaaaaag cgcctaccct tcggtcgctg cgctccctac gccccgccgc ttcgcgtcgg 14940
cctatcgcgg ccgctggccg ctcaaaaatg gctggcctac ggccaggcaa tctaccaggg 15000
cgcggacaag ccgcgccgtc gccactcgac cgccggcgcc cacatcaagg caccctgcct 15060
cgcgcgtttc ggtgatgacg gtgaaaacct ctgacacatg cagctcccgg agacggtcac 15120
agcttgtctg taagcggatg ccgggagcag acaagcccgt cagggcgcgt cagcgggtgt 15180
tggcgggtgt cggggcgcag ccatgaccca gtcacgtagc gatagcggag tgtatactgg 15240
cttaactatg cggcatcaga gcagattgta ctgagagtgc accatatgcg gtgtgaaata 15300
ccgcacagat gcgtaaggag aaaataccgc atcaggcgct cttccgcttc ctcgctcact 15360
gactcgctgc gctcggtcgt tcggctgcgg cgagcggtat cagctcactc aaaggcggta 15420
atacggttat ccacagaatc aggggataac gcaggaaaga acatgtgagc aaaaggccag 15480
caaaaggcca ggaaccgtaa aaaggccgcg ttgctggcgt ttttccatag gctccgcccc 15540
cctgacgagc atcacaaaaa tcgacgctca agtcagaggt ggcgaaaccc gacaggacta 15600
taaagatacc aggcgtttcc ccctggaagc tccctcgtgc gctctcctgt tccgaccctg 15660
ccgcttaccg gatacctgtc cgcctttctc ccttcgggaa gcgtggcgct ttctcatagc 15720
tcacgctgta ggtatctcag ttcggtgtag gtcgttcgct ccaagctggg ctgtgtgcac 15780
gaaccccccg ttcagcccga ccgctgcgcc ttatccggta actatcgtct tgagtccaac 15840
ccggtaagac acgacttatc gccactggca gcagccactg gtaacaggat tagcagagcg 15900
aggtatgtag gcggtgctac agagttcttg aagtggtggc ctaactacgg ctacactaga 15960
aggacagtat ttggtatctg cgctctgctg aagccagtta ccttcggaaa aagagttggt 16020
agctcttgat ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag 16080
cagattacgc gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc tacggggtct 16140
gacgctcagt ggaacgaaaa ctcacgttaa gggattttg 16179
Claims (10)
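1. Use of an SbMS1 protein in regulating plant male fertility:
the SbMS1 protein is a1), a2), a3), or a4):
a1) a protein whose amino acid sequence is shown in SEQ ID NO: 3;
a2) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of the protein shown in SEQ ID NO: 3;
a3) a protein related to plant male fertility, obtained by substitution and/or deletion and/or addition of one or several amino acid residues in the amino acid sequence shown in SEQ ID NO: 3;
a4) a protein that has 90% identity with the amino acid sequence shown in SEQ ID NO: 3, is derived from sorghum, and is related to plant male fertility.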
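2. Use of a biological material related to the SbMS1 protein in regulating plant male fertility:
the biological material is any one of A1) to A8) below:
A1) a nucleic acid molecule encoding the SbMS1 protein;
A2) an expression cassette containing the nucleic acid molecule of A1);
A3) a recombinant vector containing the nucleic acid molecule of A1);
A4) a recombinant vector containing the expression cassette of A2);
A5) a recombinant microorganism containing the nucleic acid molecule of A1);
A6) a recombinant microorganism containing the expression cassette of A2);
A7) a recombinant microorganism containing the recombinant vector of A3);
A8) a recombinant microorganism containing the recombinant vector of A4).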
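3. The use according to claim 2, characterized in that the nucleic acid molecule of A1) is a gene as shown in B1), B2), B3), or B4) below:
B1) the genomic DNA molecule shown in SEQ ID NO: 1;
B2) the cDNA molecule shown in SEQ ID NO: 2;
B3) a cDNA molecule or genomic DNA molecule that has 75% or more identity with the nucleotide sequence defined in B1) or B2) and encodes the SbMS1 protein of claim 1;
B4) a cDNA molecule or genomic DNA molecule that hybridizes under stringent conditions with the nucleotide sequence defined in B1), B2), or B3) and encodes the SbMS1 protein of claim 1.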
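4. Use of the substance shown in m1 or m2 in 1) or 2) below:
1) regulating plant male fertility;
2) breeding male-sterile transgenic plants;
m1: a substance that inhibits or reduces the activity or content of the SbMS1 protein in a plant;
m2: a substance that inhibits or reduces expression of the nucleic acid encoding the SbMS1 protein in a plant, or a substance that knocks out the nucleic acid encoding the SbMS1 protein in a plant;
the SbMS1 protein is a1), a2), a3), or a4):
a1) a protein whose amino acid sequence is shown in SEQ ID NO: 3;
a2) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of the protein shown in SEQ ID NO: 3;
a3) a protein associated with plant male fertility, obtained by substitution and/or deletion and/or addition of one or several amino acid residues in the amino acid sequence shown in SEQ ID NO: 3;
a4) a protein that has 90% identity with the amino acid sequence shown in SEQ ID NO: 3, is derived from sorghum, and is associated with plant male fertility.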
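5. A method for breeding a male-sterile transgenic plant, comprising the following step: reducing the content and/or activity of the SbMS1 protein of claim 1 in a recipient plant, or inhibiting or reducing expression of the nucleic acid encoding the SbMS1 protein of claim 1 in the recipient plant, or knocking out the nucleic acid encoding the SbMS1 protein of claim 1 in the recipient plant, to obtain a transgenic plant; the transgenic plant is male-sterile.
6. The method according to claim 5, characterized in that the sequence of the nucleic acid encoding the SbMS1 protein is the DNA molecule shown in SEQ ID NO: 1 or SEQ ID NO: 2.
7. The method according to claim 5 or 6, characterized in that the substance for knocking out the nucleic acid encoding the SbMS1 protein of claim 1 in the recipient plant is a CRISPR/Cas9 system.
8. The method according to claim 7, characterized in that the target sequence of the sgRNA in the CRISPR/Cas9 system is the DNA molecule shown in SEQ ID NO: 4.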
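9. The use according to any one of claims 1-4 or the method according to any one of claims 5-8, characterized in that the plant is a dicotyledonous plant or a monocotyledonous plant;
and/or, the monocotyledonous plant is a plant of the grass family (Poaceae);
and/or, the Poaceae plant is sorghum.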
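10. A specific sgRNA, or an expression cassette, vector, host cell, engineered bacterium, or transgenic plant cell line containing a gene encoding the sgRNA, wherein the target sequence of the sgRNA is the DNA molecule shown in SEQ ID NO: 4.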
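The two quantitative criteria in the claims, the percent-identity thresholds (claims 1 and 3) and the sgRNA target sequence (claims 8 and 10), can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names are hypothetical, identity is counted over the columns of a pairwise alignment that has already been computed (the claims do not state how identity is calculated), the Cas9 is assumed to recognize an NGG PAM immediately 3' of the target (the claims do not name a Cas9 variant), and the target is assumed not to include the PAM.

```python
from typing import List, Tuple


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of two pre-aligned sequences.

    Assumption: both strings come from an existing pairwise alignment,
    have equal length, and use '-' for gaps. Columns that are gaps in
    both sequences are ignored; a gap opposite a residue is a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (non-ACGT symbols are kept)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_target_with_pam(genome: str, target: str) -> List[Tuple[int, str]]:
    """Find occurrences of `target` immediately followed by an NGG PAM.

    Returns (start, strand) pairs. For the '-' strand, `start` is the
    0-based offset within the reverse-complemented sequence.
    """
    target = target.upper()
    hits = []
    for strand, seq in (("+", genome.upper()), ("-", reverse_complement(genome))):
        start = seq.find(target)
        while start != -1:
            pam = seq[start + len(target): start + len(target) + 3]
            if len(pam) == 3 and pam[1:] == "GG":
                hits.append((start, strand))
            start = seq.find(target, start + 1)
    return hits
```

With real data, `percent_identity` would be applied to an alignment of a candidate against SEQ ID NO: 1, 2, or 3 and compared with the 75% or 90% threshold, and `find_target_with_pam` would be called with SEQ ID NO: 4 as `target`. Different alignment tools define the identity denominator differently, so the result depends on the convention chosen.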
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110589746.0A CN115466319A (zh) | 2021-05-28 | 2021-05-28 | Sorghum SbMS1 protein, its coding gene, and applications |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110589746.0A CN115466319A (zh) | 2021-05-28 | 2021-05-28 | Sorghum SbMS1 protein, its coding gene, and applications |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115466319A (zh) | 2022-12-13 |
Family
ID=84364350
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110589746.0A, Pending, CN115466319A (zh) | Sorghum SbMS1 protein, its coding gene, and applications | 2021-05-28 | 2021-05-28 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115466319A (zh) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
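CN105518141A (zh) * | 2013-09-16 | 2016-04-20 | Xingwang Investment Co., Ltd. (兴旺投资有限公司) | Application of genic male sterility genes and their mutants in hybrid breeding |
CN106661589A (zh) * | 2014-06-02 | 2017-05-10 | National Institute of Agronomic Research (国家农艺研究所) | Dominant mutation in the TDM gene leading to diploid gamete production in plants |
CN110386967A (zh) * | 2018-03-26 | 2019-10-29 | Institute of Crop Sciences, Chinese Academy of Agricultural Sciences | Plant male fertility-related protein SiMS1, its coding gene, and applications |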
Non-Patent Citations (3)
Title |
---|
"GenBank Access No:XM_002444951.2", GENBANK * |
XIN Z. ET AL.: "Morphological Characterization of a New and Easily Recognizable Nuclear Male Sterile Mutant of Sorghum (Sorghum bicolor)", PLOS ONE, vol. 12, no. 1, pages 1-14 * |
LU Feng et al.: "Utilization of genic male sterility genes in sorghum (review)", 杂粮作物, vol. 21, no. 4, pages 16-17 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
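CN110551752B (zh) | xCas9n-epBE base editing system and its application in genome base substitution | |
CN108203714B (zh) | Method for editing cotton genes | |
CN110283840B (zh) | Precise and efficient editing method for the upland cotton genome | |
KR102056250B1 (ko) | Recombinant cell, and method for producing isoprene | |
CN107119063B (zh) | Method for increasing cordycepin content in Cordyceps militaris | |
CN110656114B (zh) | Gene related to tobacco pigment synthesis and its application | |
CN110760538B (zh) | Method for creating Fusarium wilt-resistant watermelon germplasm material | |
CN109321576A (zh) | Method for creating glandless, low-gossypol cotton germplasm | |
CN109232726B (zh) | Application of protein OsVPE2 in regulating the inorganic phosphate export capacity of plant vacuoles | |
KR101608078B1 (ko) | Strain for producing succinic acid from carbon dioxide and method for producing succinic acid from carbon dioxide using the same | |
CN109485707B (zh) | Application of protein OsVPE1 in regulating the inorganic phosphate export capacity of plant vacuoles | |
CN115058439A (zh) | Overexpression vector for the samt gene of Rhododendron fortunei, and its construction method and application | |
CN115466319A (zh) | Sorghum SbMS1 protein, its coding gene, and applications | |
CA2352504C (en) | Plastidic phosphoglucomutase genes | |
TW201210471A (en) | Method for enhancing thermotolerance of plant and applicability of transgenic plant | |
CN113122516B (zh) | Plant epsps mutant and its application in plants | |
CN107815435A (zh) | Gluconacetobacter with enhanced cellulose production capacity | |
CN113281521B (zh) | Gateway binary plasmid vector for rapid identification of plant stress granule-associated proteins, and its construction method and application | |
CN109337925B (zh) | Method for increasing artemisinin content in Artemisia annua by transforming the AaADS gene using an Artemisia annua suspension cell line as the recipient | |
KR20210137055A (ko) | Suppression of target gene expression through genome editing of native miRNA | |
CN106459161A (zh) | Constructs and methods involving genes encoding glutamate receptor polypeptides | |
CN108841862A (zh) | Plant expression plasmid vector containing an HA protein fusion tag and method for constructing the vector | |
CN114591996B (zh) | Expression vector for Bacillus coagulans H-1, and its construction method and application | |
CN112662672B (zh) | Promoter and preparation method thereof | |
KR20220114958A (ko) | Method for preparing a probe set for next-generation sequencing in transgenic plants | |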
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||