CN114761547A - Methods and compositions for DNA base editing - Google Patents
Methods and compositions for DNA base editing
- Publication number
- CN114761547A (application CN202080081866.3A)
- Authority
- CN
- China
- Prior art keywords
- lys
- leu
- glu
- ile
- ser
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 50
- 239000000203 mixture Substances 0.000 title abstract description 7
- 108020001507 fusion proteins Proteins 0.000 claims abstract description 133
- 102000037865 fusion proteins Human genes 0.000 claims abstract description 133
- 108020004414 DNA Proteins 0.000 claims description 137
- 239000002773 nucleotide Substances 0.000 claims description 120
- 125000003729 nucleotide group Chemical group 0.000 claims description 119
- 150000007523 nucleic acids Chemical group 0.000 claims description 116
- 210000004027 cell Anatomy 0.000 claims description 109
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 47
- 108700004991 Cas12a Proteins 0.000 claims description 40
- 108020005004 Guide RNA Proteins 0.000 claims description 38
- 108010031325 Cytidine deaminase Proteins 0.000 claims description 30
- 102000004190 Enzymes Human genes 0.000 claims description 27
- 108090000790 Enzymes Proteins 0.000 claims description 27
- 101710163270 Nuclease Proteins 0.000 claims description 23
- 230000002829 reductive effect Effects 0.000 claims description 17
- 108010079649 APOBEC-1 Deaminase Proteins 0.000 claims description 15
- 108010052875 Adenine deaminase Proteins 0.000 claims description 15
- 108010077850 Nuclear Localization Signals Proteins 0.000 claims description 14
- 229930024421 Adenine Natural products 0.000 claims description 12
- 102100040397 C->U-editing enzyme APOBEC-1 Human genes 0.000 claims description 12
- 229960000643 adenine Drugs 0.000 claims description 12
- OIRDTQYFTABQOQ-KQYNXXCUSA-N adenosine Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O OIRDTQYFTABQOQ-KQYNXXCUSA-N 0.000 claims description 12
- 230000001965 increasing effect Effects 0.000 claims description 11
- -1 relaxase Proteins 0.000 claims description 11
- 102000004316 Oxidoreductases Human genes 0.000 claims description 10
- 108090000854 Oxidoreductases Proteins 0.000 claims description 10
- 102000055027 Protein Methyltransferases Human genes 0.000 claims description 9
- 108700040121 Protein Methyltransferases Proteins 0.000 claims description 9
- 101710172430 Uracil-DNA glycosylase inhibitor Proteins 0.000 claims description 9
- 101710169336 5'-deoxyadenosine deaminase Proteins 0.000 claims description 7
- 102000055025 Adenosine deaminases Human genes 0.000 claims description 7
- 230000001131 transforming effect Effects 0.000 claims description 7
- 239000002126 C01EB10 - Adenosine Substances 0.000 claims description 6
- 229960005305 adenosine Drugs 0.000 claims description 6
- 210000004899 c-terminal region Anatomy 0.000 claims description 5
- 230000003247 decreasing effect Effects 0.000 claims description 5
- 102000002226 Alkyl and Aryl Transferases Human genes 0.000 claims description 4
- 108010014722 Alkyl and Aryl Transferases Proteins 0.000 claims description 4
- 101710095342 Apolipoprotein B Proteins 0.000 claims description 4
- 102100040202 Apolipoprotein B-100 Human genes 0.000 claims description 4
- 102100022433 Single-stranded DNA cytosine deaminase Human genes 0.000 claims description 4
- 101710143275 Single-stranded DNA cytosine deaminase Proteins 0.000 claims description 4
- 230000017156 mRNA modification Effects 0.000 claims description 4
- 230000003252 repetitive effect Effects 0.000 claims description 3
- 102100026846 Cytidine deaminase Human genes 0.000 claims 4
- 102000040430 polynucleotide Human genes 0.000 abstract description 34
- 108091033319 polynucleotide Proteins 0.000 abstract description 34
- 239000002157 polynucleotide Substances 0.000 abstract description 34
- 230000001976 improved effect Effects 0.000 abstract description 11
- 230000004568 DNA-binding Effects 0.000 abstract description 7
- 230000008836 DNA modification Effects 0.000 abstract description 2
- 241000196324 Embryophyta Species 0.000 description 137
- 102000053602 DNA Human genes 0.000 description 125
- 102000039446 nucleic acids Human genes 0.000 description 91
- 108020004707 nucleic acids Proteins 0.000 description 91
- 108090000623 proteins and genes Proteins 0.000 description 72
- 125000003275 alpha amino acid group Chemical group 0.000 description 61
- 108010054155 lysyllysine Proteins 0.000 description 49
- 230000000694 effects Effects 0.000 description 46
- 108090000765 processed proteins & peptides Proteins 0.000 description 46
- 230000009466 transformation Effects 0.000 description 41
- 102000004196 processed proteins & peptides Human genes 0.000 description 40
- 229920001184 polypeptide Polymers 0.000 description 39
- 229920002477 rna polymer Polymers 0.000 description 36
- 102000004169 proteins and genes Human genes 0.000 description 33
- 230000014509 gene expression Effects 0.000 description 32
- 238000003752 polymerase chain reaction Methods 0.000 description 32
- 108010009298 lysylglutamic acid Proteins 0.000 description 31
- 239000013615 primer Substances 0.000 description 31
- 235000018102 proteins Nutrition 0.000 description 31
- 108091033409 CRISPR Proteins 0.000 description 29
- 108010038633 aspartylglutamate Proteins 0.000 description 29
- 108010092854 aspartyllysine Proteins 0.000 description 29
- 108010034529 leucyl-lysine Proteins 0.000 description 28
- 239000000523 sample Substances 0.000 description 28
- 102000005381 Cytidine Deaminase Human genes 0.000 description 26
- 230000015572 biosynthetic process Effects 0.000 description 24
- 238000003786 synthesis reaction Methods 0.000 description 24
- 108010001064 glycyl-glycyl-glycyl-glycine Proteins 0.000 description 23
- 240000008042 Zea mays Species 0.000 description 22
- 108091026890 Coding region Proteins 0.000 description 21
- 108020004705 Codon Proteins 0.000 description 21
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 21
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 21
- SOEGEPHNZOISMT-BYPYZUCNSA-N Gly-Ser-Gly Chemical compound NCC(=O)N[C@@H](CO)C(=O)NCC(O)=O SOEGEPHNZOISMT-BYPYZUCNSA-N 0.000 description 20
- 238000009396 hybridization Methods 0.000 description 20
- 230000001404 mediated effect Effects 0.000 description 20
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 19
- 230000000295 complement effect Effects 0.000 description 19
- 230000001105 regulatory effect Effects 0.000 description 19
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 18
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 18
- 235000002017 Zea mays subsp mays Nutrition 0.000 description 18
- 108010077245 asparaginyl-proline Proteins 0.000 description 18
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 18
- 239000012634 fragment Substances 0.000 description 17
- 241000894007 species Species 0.000 description 17
- RVKIPWVMZANZLI-UHFFFAOYSA-N H-Lys-Trp-OH Natural products C1=CC=C2C(CC(NC(=O)C(N)CCCCN)C(O)=O)=CNC2=C1 RVKIPWVMZANZLI-UHFFFAOYSA-N 0.000 description 16
- 235000016383 Zea mays subsp huehuetenangensis Nutrition 0.000 description 16
- 235000009973 maize Nutrition 0.000 description 16
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 15
- DBMJMQXJHONAFJ-UHFFFAOYSA-M Sodium laurylsulphate Chemical compound [Na+].CCCCCCCCCCCCOS([O-])(=O)=O DBMJMQXJHONAFJ-UHFFFAOYSA-M 0.000 description 15
- 108010057952 lysyl-phenylalanyl-lysine Proteins 0.000 description 15
- 108010026333 seryl-proline Proteins 0.000 description 15
- 108010005652 splenotritin Proteins 0.000 description 15
- 108010061238 threonyl-glycine Proteins 0.000 description 15
- 210000001519 tissue Anatomy 0.000 description 15
- RCFDOSNHHZGBOY-UHFFFAOYSA-N L-isoleucyl-L-alanine Natural products CCC(C)C(N)C(=O)NC(C)C(O)=O RCFDOSNHHZGBOY-UHFFFAOYSA-N 0.000 description 14
- WBSCNDJQPKSPII-KKUMJFAQSA-N Lys-Lys-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O WBSCNDJQPKSPII-KKUMJFAQSA-N 0.000 description 14
- 235000001014 amino acid Nutrition 0.000 description 14
- 238000003776 cleavage reaction Methods 0.000 description 14
- 108010073472 leucyl-prolyl-proline Proteins 0.000 description 14
- 230000007017 scission Effects 0.000 description 14
- 241000589158 Agrobacterium Species 0.000 description 13
- QJCKNLPMTPXXEM-AUTRQRHGSA-N Glu-Glu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O QJCKNLPMTPXXEM-AUTRQRHGSA-N 0.000 description 13
- QARCDOCCDOLJSF-HJPIBITLSA-N Tyr-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N QARCDOCCDOLJSF-HJPIBITLSA-N 0.000 description 13
- 108010008685 alanyl-glutamyl-aspartic acid Proteins 0.000 description 13
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 13
- 108010013835 arginine glutamate Proteins 0.000 description 13
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 13
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 13
- 108010064235 lysylglycine Proteins 0.000 description 13
- 239000000047 product Substances 0.000 description 13
- 238000003753 real-time PCR Methods 0.000 description 13
- 108010069117 seryl-lysyl-aspartic acid Proteins 0.000 description 13
- 230000008685 targeting Effects 0.000 description 13
- 108010033670 threonyl-aspartyl-tyrosine Proteins 0.000 description 13
- HJVGMOYJDDXLMI-AVGNSLFASA-N Arg-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)CCCNC(N)=N HJVGMOYJDDXLMI-AVGNSLFASA-N 0.000 description 12
- CTWCFPWFIGRAEP-CIUDSAMLSA-N Asp-Lys-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O CTWCFPWFIGRAEP-CIUDSAMLSA-N 0.000 description 12
- HQUXQAMSWFIRET-AVGNSLFASA-N Leu-Glu-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN HQUXQAMSWFIRET-AVGNSLFASA-N 0.000 description 12
- WGLAORUKDGRINI-WDCWCFNPSA-N Lys-Glu-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WGLAORUKDGRINI-WDCWCFNPSA-N 0.000 description 12
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical compound O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 description 12
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 12
- 238000006243 chemical reaction Methods 0.000 description 12
- 108010015792 glycyllysine Proteins 0.000 description 12
- 108010044374 isoleucyl-tyrosine Proteins 0.000 description 12
- 108010057821 leucylproline Proteins 0.000 description 12
- 108010017391 lysylvaline Proteins 0.000 description 12
- 108010012581 phenylalanylglutamate Proteins 0.000 description 12
- HJOSVGCWOTYJFG-WDCWCFNPSA-N Thr-Glu-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N)O HJOSVGCWOTYJFG-WDCWCFNPSA-N 0.000 description 11
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 11
- 108010081551 glycylphenylalanine Proteins 0.000 description 11
- 108010003700 lysyl aspartic acid Proteins 0.000 description 11
- 238000003199 nucleic acid amplification method Methods 0.000 description 11
- 108010051242 phenylalanylserine Proteins 0.000 description 11
- 108010015840 seryl-prolyl-lysyl-lysine Proteins 0.000 description 11
- 238000012360 testing method Methods 0.000 description 11
- 108010003137 tyrosyltyrosine Proteins 0.000 description 11
- 239000013598 vector Substances 0.000 description 11
- UHDGCWIWMRVCDJ-UHFFFAOYSA-N 1-beta-D-Xylofuranosyl-NH-Cytosine Natural products O=C1N=C(N)C=CN1C1C(O)C(O)C(CO)O1 UHDGCWIWMRVCDJ-UHFFFAOYSA-N 0.000 description 10
- 241000894006 Bacteria Species 0.000 description 10
- UHDGCWIWMRVCDJ-PSQAKQOGSA-N Cytidine Natural products O=C1N=C(N)C=CN1[C@@H]1[C@@H](O)[C@@H](O)[C@H](CO)O1 UHDGCWIWMRVCDJ-PSQAKQOGSA-N 0.000 description 10
- 102100035102 E3 ubiquitin-protein ligase MYCBP2 Human genes 0.000 description 10
- ILWHFUZZCFYSKT-AVGNSLFASA-N Glu-Lys-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O ILWHFUZZCFYSKT-AVGNSLFASA-N 0.000 description 10
- AQNYKMCFCCZEEL-JYJNAYRXSA-N Glu-Lys-Tyr Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 AQNYKMCFCCZEEL-JYJNAYRXSA-N 0.000 description 10
- 241000880493 Leptailurus serval Species 0.000 description 10
- PJWOOBTYQNNRBF-BZSNNMDCSA-N Leu-Phe-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)O)N PJWOOBTYQNNRBF-BZSNNMDCSA-N 0.000 description 10
- DCRWPTBMWMGADO-AVGNSLFASA-N Lys-Glu-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O DCRWPTBMWMGADO-AVGNSLFASA-N 0.000 description 10
- VUTWYNQUSJWBHO-BZSNNMDCSA-N Lys-Leu-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VUTWYNQUSJWBHO-BZSNNMDCSA-N 0.000 description 10
- VSTNAUBHKQPVJX-IHRRRGAJSA-N Lys-Met-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O VSTNAUBHKQPVJX-IHRRRGAJSA-N 0.000 description 10
- PEFJUUYFEGBXFA-BZSNNMDCSA-N Phe-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 PEFJUUYFEGBXFA-BZSNNMDCSA-N 0.000 description 10
- 108700019146 Transgenes Proteins 0.000 description 10
- SZTTYWIUCGSURQ-AUTRQRHGSA-N Val-Glu-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SZTTYWIUCGSURQ-AUTRQRHGSA-N 0.000 description 10
- BCBFMJYTNKDALA-UFYCRDLUSA-N Val-Phe-Phe Chemical compound N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O BCBFMJYTNKDALA-UFYCRDLUSA-N 0.000 description 10
- SSKKGOWRPNIVDW-AVGNSLFASA-N Val-Val-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N SSKKGOWRPNIVDW-AVGNSLFASA-N 0.000 description 10
- 108010041407 alanylaspartic acid Proteins 0.000 description 10
- 230000003321 amplification Effects 0.000 description 10
- 238000004422 calculation algorithm Methods 0.000 description 10
- UHDGCWIWMRVCDJ-ZAKLUEHWSA-N cytidine Chemical compound O=C1N=C(N)C=CN1[C@H]1[C@H](O)[C@@H](O)[C@H](CO)O1 UHDGCWIWMRVCDJ-ZAKLUEHWSA-N 0.000 description 10
- 230000006870 function Effects 0.000 description 10
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 10
- XKUKSGPZAADMRA-UHFFFAOYSA-N glycyl-glycyl-glycine Chemical compound NCC(=O)NCC(=O)NCC(O)=O XKUKSGPZAADMRA-UHFFFAOYSA-N 0.000 description 10
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 10
- 108010038320 lysylphenylalanine Proteins 0.000 description 10
- 108020004999 messenger RNA Proteins 0.000 description 10
- 108010051110 tyrosyl-lysine Proteins 0.000 description 10
- 108010073969 valyllysine Proteins 0.000 description 10
- KHBLRHKVXICFMY-GUBZILKMSA-N Asp-Glu-Lys Chemical compound N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O KHBLRHKVXICFMY-GUBZILKMSA-N 0.000 description 9
- LXKNSJLSGPNHSK-KKUMJFAQSA-N Leu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N LXKNSJLSGPNHSK-KKUMJFAQSA-N 0.000 description 9
- WXUBSIDKNMFAGS-IHRRRGAJSA-N Ser-Arg-Tyr Chemical compound NC(N)=NCCC[C@H](NC(=O)[C@H](CO)N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 WXUBSIDKNMFAGS-IHRRRGAJSA-N 0.000 description 9
- JIPVNVNKXJLFJF-BJDJZHNGSA-N Ser-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N JIPVNVNKXJLFJF-BJDJZHNGSA-N 0.000 description 9
- CRJZZXMAADSBBQ-SRVKXCTJSA-N Ser-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO CRJZZXMAADSBBQ-SRVKXCTJSA-N 0.000 description 9
- IVDFVBVIVLJJHR-LKXGYXEUSA-N Thr-Ser-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O IVDFVBVIVLJJHR-LKXGYXEUSA-N 0.000 description 9
- 108010093581 aspartyl-proline Proteins 0.000 description 9
- 108010044056 leucyl-phenylalanine Proteins 0.000 description 9
- 108010045397 lysyl-tyrosyl-lysine Proteins 0.000 description 9
- 108010072986 threonyl-seryl-lysine Proteins 0.000 description 9
- 108010020532 tyrosyl-proline Proteins 0.000 description 9
- HKZAAJSTFUZYTO-LURJTMIESA-N (2s)-2-[[2-[[2-[[2-[(2-aminoacetyl)amino]acetyl]amino]acetyl]amino]acetyl]amino]-3-hydroxypropanoic acid Chemical compound NCC(=O)NCC(=O)NCC(=O)NCC(=O)N[C@@H](CO)C(O)=O HKZAAJSTFUZYTO-LURJTMIESA-N 0.000 description 8
- VNYMOTCMNHJGTG-JBDRJPRFSA-N Ala-Ile-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O VNYMOTCMNHJGTG-JBDRJPRFSA-N 0.000 description 8
- AWXDRZJQCVHCIT-DCAQKATOSA-N Asn-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC(N)=O AWXDRZJQCVHCIT-DCAQKATOSA-N 0.000 description 8
- ZKAOJVJQGVUIIU-GUBZILKMSA-N Asp-Pro-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(O)=O ZKAOJVJQGVUIIU-GUBZILKMSA-N 0.000 description 8
- XMPAXPSENRSOSV-RYUDHWBXSA-N Glu-Gly-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XMPAXPSENRSOSV-RYUDHWBXSA-N 0.000 description 8
- BXSZPACYCMNKLS-AVGNSLFASA-N Glu-Ser-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O BXSZPACYCMNKLS-AVGNSLFASA-N 0.000 description 8
- ZUWSVOYKBCHLRR-MGHWNKPDSA-N Ile-Tyr-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCCN)C(=O)O)N ZUWSVOYKBCHLRR-MGHWNKPDSA-N 0.000 description 8
- MYGQXVYRZMKRDB-SRVKXCTJSA-N Leu-Asp-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN MYGQXVYRZMKRDB-SRVKXCTJSA-N 0.000 description 8
- YRRCOJOXAJNSAX-IHRRRGAJSA-N Leu-Pro-Lys Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)O)N YRRCOJOXAJNSAX-IHRRRGAJSA-N 0.000 description 8
- DGWXCIORNLWGGG-CIUDSAMLSA-N Lys-Asn-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O DGWXCIORNLWGGG-CIUDSAMLSA-N 0.000 description 8
- IMAKMJCBYCSMHM-AVGNSLFASA-N Lys-Glu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN IMAKMJCBYCSMHM-AVGNSLFASA-N 0.000 description 8
- KYNNSEJZFVCDIV-ZPFDUUQYSA-N Lys-Ile-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O KYNNSEJZFVCDIV-ZPFDUUQYSA-N 0.000 description 8
- UDXSLGLHFUBRRM-OEAJRASXSA-N Lys-Phe-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CCCCN)N)O UDXSLGLHFUBRRM-OEAJRASXSA-N 0.000 description 8
- IOQWIOPSKJOEKI-SRVKXCTJSA-N Lys-Ser-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O IOQWIOPSKJOEKI-SRVKXCTJSA-N 0.000 description 8
- BONHGTUEEPIMPM-AVGNSLFASA-N Phe-Ser-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O BONHGTUEEPIMPM-AVGNSLFASA-N 0.000 description 8
- YMTLKLXDFCSCNX-BYPYZUCNSA-N Ser-Gly-Gly Chemical compound OC[C@H](N)C(=O)NCC(=O)NCC(O)=O YMTLKLXDFCSCNX-BYPYZUCNSA-N 0.000 description 8
- SFTZTYBXIXLRGQ-JBDRJPRFSA-N Ser-Ile-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O SFTZTYBXIXLRGQ-JBDRJPRFSA-N 0.000 description 8
- 108010011559 alanylphenylalanine Proteins 0.000 description 8
- 108010070783 alanyltyrosine Proteins 0.000 description 8
- 108010085325 histidylproline Proteins 0.000 description 8
- 108010018006 histidylserine Proteins 0.000 description 8
- 230000035772 mutation Effects 0.000 description 8
- 108010053725 prolylvaline Proteins 0.000 description 8
- 108010048818 seryl-histidine Proteins 0.000 description 8
- 238000006467 substitution reaction Methods 0.000 description 8
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 7
- TZDNWXDLYFIFPT-BJDJZHNGSA-N Ala-Ile-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O TZDNWXDLYFIFPT-BJDJZHNGSA-N 0.000 description 7
- UVTGNSWSRSCPLP-UHFFFAOYSA-N Arg-Tyr Natural products NC(CCNC(=N)N)C(=O)NC(Cc1ccc(O)cc1)C(=O)O UVTGNSWSRSCPLP-UHFFFAOYSA-N 0.000 description 7
- ZMUQQMGITUJQTI-CIUDSAMLSA-N Asn-Leu-Asn Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O ZMUQQMGITUJQTI-CIUDSAMLSA-N 0.000 description 7
- UCHSVZYJKJLPHF-BZSNNMDCSA-N Asp-Phe-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O UCHSVZYJKJLPHF-BZSNNMDCSA-N 0.000 description 7
- BJDHEININLSZOT-KKUMJFAQSA-N Asp-Tyr-Lys Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(O)=O BJDHEININLSZOT-KKUMJFAQSA-N 0.000 description 7
- AFODTOLGSZQDSL-PEFMBERDSA-N Glu-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N AFODTOLGSZQDSL-PEFMBERDSA-N 0.000 description 7
- RJONUNZIMUXUOI-GUBZILKMSA-N Glu-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N RJONUNZIMUXUOI-GUBZILKMSA-N 0.000 description 7
- QNJNPKSWAHPYGI-JYJNAYRXSA-N Glu-Phe-Leu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)CC1=CC=CC=C1 QNJNPKSWAHPYGI-JYJNAYRXSA-N 0.000 description 7
- RFTVTKBHDXCEEX-WDSKDSINSA-N Glu-Ser-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RFTVTKBHDXCEEX-WDSKDSINSA-N 0.000 description 7
- YWAQATDNEKZFFK-BYPYZUCNSA-N Gly-Gly-Ser Chemical compound NCC(=O)NCC(=O)N[C@@H](CO)C(O)=O YWAQATDNEKZFFK-BYPYZUCNSA-N 0.000 description 7
- VAXBXNPRXPHGHG-BJDJZHNGSA-N Ile-Ala-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)O)N VAXBXNPRXPHGHG-BJDJZHNGSA-N 0.000 description 7
- QSPLUJGYOPZINY-ZPFDUUQYSA-N Ile-Asp-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N QSPLUJGYOPZINY-ZPFDUUQYSA-N 0.000 description 7
- POJPZSMTTMLSTG-SRVKXCTJSA-N Leu-Asn-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N POJPZSMTTMLSTG-SRVKXCTJSA-N 0.000 description 7
- XVZCXCTYGHPNEM-UHFFFAOYSA-N Leu-Leu-Pro Natural products CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)N1CCCC1C(O)=O XVZCXCTYGHPNEM-UHFFFAOYSA-N 0.000 description 7
- IZPVWNSAVUQBGP-CIUDSAMLSA-N Leu-Ser-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O IZPVWNSAVUQBGP-CIUDSAMLSA-N 0.000 description 7
- XFIHDSBIPWEYJJ-YUMQZZPRSA-N Lys-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN XFIHDSBIPWEYJJ-YUMQZZPRSA-N 0.000 description 7
- YVSHZSUKQHNDHD-KKUMJFAQSA-N Lys-Asn-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N YVSHZSUKQHNDHD-KKUMJFAQSA-N 0.000 description 7
- LPAJOCKCPRZEAG-MNXVOIDGSA-N Lys-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCCCN LPAJOCKCPRZEAG-MNXVOIDGSA-N 0.000 description 7
- ZJSZPXISKMDJKQ-JYJNAYRXSA-N Lys-Phe-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCC(O)=O)C(O)=O)CC1=CC=CC=C1 ZJSZPXISKMDJKQ-JYJNAYRXSA-N 0.000 description 7
- DOXQMJCSSYZSNM-BZSNNMDCSA-N Phe-Lys-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O DOXQMJCSSYZSNM-BZSNNMDCSA-N 0.000 description 7
- KAJLHCWRWDSROH-BZSNNMDCSA-N Phe-Phe-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CC=CC=C1 KAJLHCWRWDSROH-BZSNNMDCSA-N 0.000 description 7
- DBALDZKOTNSBFM-FXQIFTODSA-N Pro-Ala-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O DBALDZKOTNSBFM-FXQIFTODSA-N 0.000 description 7
- XVWDJUROVRQKAE-KKUMJFAQSA-N Ser-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CC1=CC=CC=C1 XVWDJUROVRQKAE-KKUMJFAQSA-N 0.000 description 7
- NQQMWWVVGIXUOX-SVSWQMSJSA-N Thr-Ser-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NQQMWWVVGIXUOX-SVSWQMSJSA-N 0.000 description 7
- HSBZWINKRYZCSQ-KKUMJFAQSA-N Tyr-Lys-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O HSBZWINKRYZCSQ-KKUMJFAQSA-N 0.000 description 7
- PMHLLBKTDHQMCY-ULQDDVLXSA-N Tyr-Lys-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PMHLLBKTDHQMCY-ULQDDVLXSA-N 0.000 description 7
- BMGOFDMKDVVGJG-NHCYSSNCSA-N Val-Asp-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BMGOFDMKDVVGJG-NHCYSSNCSA-N 0.000 description 7
- 108010005233 alanylglutamic acid Proteins 0.000 description 7
- 108010062796 arginyllysine Proteins 0.000 description 7
- 238000003556 assay Methods 0.000 description 7
- 108010054812 diprotin A Proteins 0.000 description 7
- 230000002255 enzymatic effect Effects 0.000 description 7
- 108010073628 glutamyl-valyl-phenylalanine Proteins 0.000 description 7
- 108010000434 glycyl-alanyl-leucine Proteins 0.000 description 7
- 108010050848 glycylleucine Proteins 0.000 description 7
- 239000000463 material Substances 0.000 description 7
- 239000002245 particle Substances 0.000 description 7
- 108010073025 phenylalanylphenylalanine Proteins 0.000 description 7
- 108010048397 seryl-lysyl-leucine Proteins 0.000 description 7
- 108700029760 synthetic LTSP Proteins 0.000 description 7
- 238000013518 transcription Methods 0.000 description 7
- 230000035897 transcription Effects 0.000 description 7
- 230000009261 transgenic effect Effects 0.000 description 7
- 238000012384 transportation and delivery Methods 0.000 description 7
- 108010058119 tryptophyl-glycyl-glycine Proteins 0.000 description 7
- XQGIRPGAVLFKBJ-CIUDSAMLSA-N Ala-Asn-Lys Chemical compound N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)O XQGIRPGAVLFKBJ-CIUDSAMLSA-N 0.000 description 6
- NHCPCLJZRSIDHS-ZLUOBGJFSA-N Ala-Asp-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O NHCPCLJZRSIDHS-ZLUOBGJFSA-N 0.000 description 6
- QUIGLPSHIFPEOV-CIUDSAMLSA-N Ala-Lys-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O QUIGLPSHIFPEOV-CIUDSAMLSA-N 0.000 description 6
- VKKYFICVTYKFIO-CIUDSAMLSA-N Arg-Ala-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N VKKYFICVTYKFIO-CIUDSAMLSA-N 0.000 description 6
- PAPSMOYMQDWIOR-AVGNSLFASA-N Arg-Lys-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PAPSMOYMQDWIOR-AVGNSLFASA-N 0.000 description 6
- KZXPVYVSHUJCEO-ULQDDVLXSA-N Arg-Phe-Lys Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(O)=O)CC1=CC=CC=C1 KZXPVYVSHUJCEO-ULQDDVLXSA-N 0.000 description 6
- LSJQOMAZIKQMTJ-SRVKXCTJSA-N Asn-Phe-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O LSJQOMAZIKQMTJ-SRVKXCTJSA-N 0.000 description 6
- BEHQTVDBCLSCBY-CFMVVWHZSA-N Asn-Tyr-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BEHQTVDBCLSCBY-CFMVVWHZSA-N 0.000 description 6
- RGKKALNPOYURGE-ZKWXMUAHSA-N Asp-Ala-Val Chemical compound N[C@@H](CC(=O)O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)O RGKKALNPOYURGE-ZKWXMUAHSA-N 0.000 description 6
- ILJQISGMGXRZQQ-IHRRRGAJSA-N Asp-Arg-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ILJQISGMGXRZQQ-IHRRRGAJSA-N 0.000 description 6
- VILLWIDTHYPSLC-PEFMBERDSA-N Asp-Glu-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VILLWIDTHYPSLC-PEFMBERDSA-N 0.000 description 6
- NHSDEZURHWEZPN-SXTJYALSSA-N Asp-Ile-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)NC(=O)[C@H](CC(=O)O)N NHSDEZURHWEZPN-SXTJYALSSA-N 0.000 description 6
- NVFSJIXJZCDICF-SRVKXCTJSA-N Asp-Lys-Lys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N NVFSJIXJZCDICF-SRVKXCTJSA-N 0.000 description 6
- YWLDTBBUHZJQHW-KKUMJFAQSA-N Asp-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)O)N YWLDTBBUHZJQHW-KKUMJFAQSA-N 0.000 description 6
- RPUYTJJZXQBWDT-SRVKXCTJSA-N Asp-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC(=O)O)N RPUYTJJZXQBWDT-SRVKXCTJSA-N 0.000 description 6
- 241000588724 Escherichia coli Species 0.000 description 6
- LWDGZZGWDMHBOF-FXQIFTODSA-N Gln-Glu-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O LWDGZZGWDMHBOF-FXQIFTODSA-N 0.000 description 6
- YXQCLIVLWCKCRS-RYUDHWBXSA-N Gln-Gly-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)N)N)O YXQCLIVLWCKCRS-RYUDHWBXSA-N 0.000 description 6
- LURQDGKYBFWWJA-MNXVOIDGSA-N Gln-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)N)N LURQDGKYBFWWJA-MNXVOIDGSA-N 0.000 description 6
- ZOXBSICWUDAOHX-GUBZILKMSA-N Glu-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCC(O)=O ZOXBSICWUDAOHX-GUBZILKMSA-N 0.000 description 6
- CXRWMMRLEMVSEH-PEFMBERDSA-N Glu-Ile-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O CXRWMMRLEMVSEH-PEFMBERDSA-N 0.000 description 6
- ZCOJVESMNGBGLF-GRLWGSQLSA-N Glu-Ile-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ZCOJVESMNGBGLF-GRLWGSQLSA-N 0.000 description 6
- WNRZUESNGGDCJX-JYJNAYRXSA-N Glu-Leu-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WNRZUESNGGDCJX-JYJNAYRXSA-N 0.000 description 6
- GJBUAAAIZSRCDC-GVXVVHGQSA-N Glu-Leu-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O GJBUAAAIZSRCDC-GVXVVHGQSA-N 0.000 description 6
- YKBUCXNNBYZYAY-MNXVOIDGSA-N Glu-Lys-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YKBUCXNNBYZYAY-MNXVOIDGSA-N 0.000 description 6
- FNXSYBOHALPRHV-ONGXEEELSA-N Gly-Val-Lys Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN FNXSYBOHALPRHV-ONGXEEELSA-N 0.000 description 6
- MVADCDSCFTXCBT-CIUDSAMLSA-N His-Asp-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MVADCDSCFTXCBT-CIUDSAMLSA-N 0.000 description 6
- TZCGZYWNIDZZMR-UHFFFAOYSA-N Ile-Arg-Ala Natural products CCC(C)C(N)C(=O)NC(C(=O)NC(C)C(O)=O)CCCN=C(N)N TZCGZYWNIDZZMR-UHFFFAOYSA-N 0.000 description 6
- LKACSKJPTFSBHR-MNXVOIDGSA-N Ile-Gln-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N LKACSKJPTFSBHR-MNXVOIDGSA-N 0.000 description 6
- VZSDQFZFTCVEGF-ZEWNOJEFSA-N Ile-Phe-Tyr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](Cc1ccc(O)cc1)C(O)=O VZSDQFZFTCVEGF-ZEWNOJEFSA-N 0.000 description 6
- LHSGPCFBGJHPCY-UHFFFAOYSA-N L-leucine-L-tyrosine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LHSGPCFBGJHPCY-UHFFFAOYSA-N 0.000 description 6
- KFKWRHQBZQICHA-STQMWFEESA-N L-leucyl-L-phenylalanine Natural products CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 6
- KKXDHFKZWKLYGB-GUBZILKMSA-N Leu-Asn-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N KKXDHFKZWKLYGB-GUBZILKMSA-N 0.000 description 6
- HRTRLSRYZZKPCO-BJDJZHNGSA-N Leu-Ile-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O HRTRLSRYZZKPCO-BJDJZHNGSA-N 0.000 description 6
- ZAVCJRJOQKIOJW-KKUMJFAQSA-N Leu-Phe-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=CC=C1 ZAVCJRJOQKIOJW-KKUMJFAQSA-N 0.000 description 6
- UCBPDSYUVAAHCD-UWVGGRQHSA-N Leu-Pro-Gly Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O UCBPDSYUVAAHCD-UWVGGRQHSA-N 0.000 description 6
- DPURXCQCHSQPAN-AVGNSLFASA-N Leu-Pro-Pro Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 DPURXCQCHSQPAN-AVGNSLFASA-N 0.000 description 6
- CNWDWAMPKVYJJB-NUTKFTJISA-N Leu-Trp-Ala Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CC(C)C)C(=O)N[C@@H](C)C(O)=O)=CNC2=C1 CNWDWAMPKVYJJB-NUTKFTJISA-N 0.000 description 6
- AIMGJYMCTAABEN-GVXVVHGQSA-N Leu-Val-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O AIMGJYMCTAABEN-GVXVVHGQSA-N 0.000 description 6
- YIBOAHAOAWACDK-QEJZJMRPSA-N Lys-Ala-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 YIBOAHAOAWACDK-QEJZJMRPSA-N 0.000 description 6
- IWWMPCPLFXFBAF-SRVKXCTJSA-N Lys-Asp-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O IWWMPCPLFXFBAF-SRVKXCTJSA-N 0.000 description 6
- LMVOVCYVZBBWQB-SRVKXCTJSA-N Lys-Asp-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN LMVOVCYVZBBWQB-SRVKXCTJSA-N 0.000 description 6
- QZONCCHVHCOBSK-YUMQZZPRSA-N Lys-Gly-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O QZONCCHVHCOBSK-YUMQZZPRSA-N 0.000 description 6
- AIRZWUMAHCDDHR-KKUMJFAQSA-N Lys-Leu-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O AIRZWUMAHCDDHR-KKUMJFAQSA-N 0.000 description 6
- RBEATVHTWHTHTJ-KKUMJFAQSA-N Lys-Leu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O RBEATVHTWHTHTJ-KKUMJFAQSA-N 0.000 description 6
- LJADEBULDNKJNK-IHRRRGAJSA-N Lys-Leu-Val Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O LJADEBULDNKJNK-IHRRRGAJSA-N 0.000 description 6
- GILLQRYAWOMHED-DCAQKATOSA-N Lys-Val-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN GILLQRYAWOMHED-DCAQKATOSA-N 0.000 description 6
- WXXNVZMWHOLNRJ-AVGNSLFASA-N Met-Pro-Lys Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(O)=O WXXNVZMWHOLNRJ-AVGNSLFASA-N 0.000 description 6
- 108091034117 Oligonucleotide Proteins 0.000 description 6
- LDSOBEJVGGVWGD-DLOVCJGASA-N Phe-Asp-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 LDSOBEJVGGVWGD-DLOVCJGASA-N 0.000 description 6
- PSKRILMFHNIUAO-JYJNAYRXSA-N Phe-Glu-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N PSKRILMFHNIUAO-JYJNAYRXSA-N 0.000 description 6
- JWQWPTLEOFNCGX-AVGNSLFASA-N Phe-Glu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 JWQWPTLEOFNCGX-AVGNSLFASA-N 0.000 description 6
- ZLGQEBCCANLYRA-RYUDHWBXSA-N Phe-Gly-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O ZLGQEBCCANLYRA-RYUDHWBXSA-N 0.000 description 6
- KRYSMKKRRRWOCZ-QEWYBTABSA-N Phe-Ile-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O KRYSMKKRRRWOCZ-QEWYBTABSA-N 0.000 description 6
- JHSRGEODDALISP-XVSYOHENSA-N Phe-Thr-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O JHSRGEODDALISP-XVSYOHENSA-N 0.000 description 6
- 240000000111 Saccharum officinarum Species 0.000 description 6
- 235000007201 Saccharum officinarum Nutrition 0.000 description 6
- GZGFSPWOMUKKCV-NAKRPEOUSA-N Ser-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO GZGFSPWOMUKKCV-NAKRPEOUSA-N 0.000 description 6
- GYDFRTRSSXOZCR-ACZMJKKPSA-N Ser-Ser-Glu Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O GYDFRTRSSXOZCR-ACZMJKKPSA-N 0.000 description 6
- ILZAUMFXKSIUEF-SRVKXCTJSA-N Ser-Ser-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ILZAUMFXKSIUEF-SRVKXCTJSA-N 0.000 description 6
- JTEICXDKGWKRRV-HJGDQZAQSA-N Thr-Asn-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O JTEICXDKGWKRRV-HJGDQZAQSA-N 0.000 description 6
- VUVCRYXYUUPGSB-GLLZPBPUSA-N Thr-Gln-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N)O VUVCRYXYUUPGSB-GLLZPBPUSA-N 0.000 description 6
- XPNSAQMEAVSQRD-FBCQKBJTSA-N Thr-Gly-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)NCC(O)=O XPNSAQMEAVSQRD-FBCQKBJTSA-N 0.000 description 6
- HTHCZRWCFXMENJ-KKUMJFAQSA-N Tyr-Arg-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O HTHCZRWCFXMENJ-KKUMJFAQSA-N 0.000 description 6
- YGKVNUAKYPGORG-AVGNSLFASA-N Tyr-Asp-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O YGKVNUAKYPGORG-AVGNSLFASA-N 0.000 description 6
- UABYBEBXFFNCIR-YDHLFZDLSA-N Tyr-Asp-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O UABYBEBXFFNCIR-YDHLFZDLSA-N 0.000 description 6
- NSGZILIDHCIZAM-KKUMJFAQSA-N Tyr-Leu-Ser Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N NSGZILIDHCIZAM-KKUMJFAQSA-N 0.000 description 6
- WQOHKVRQDLNDIL-YJRXYDGGSA-N Tyr-Thr-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O WQOHKVRQDLNDIL-YJRXYDGGSA-N 0.000 description 6
- WNZSAUMKZQXHNC-UKJIMTQDSA-N Val-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N WNZSAUMKZQXHNC-UKJIMTQDSA-N 0.000 description 6
- SYSWVVCYSXBVJG-RHYQMDGZSA-N Val-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)N)O SYSWVVCYSXBVJG-RHYQMDGZSA-N 0.000 description 6
- 108010068265 aspartyltyrosine Proteins 0.000 description 6
- 230000008859 change Effects 0.000 description 6
- 210000002257 embryonic structure Anatomy 0.000 description 6
- 108010067216 glycyl-glycyl-glycine Proteins 0.000 description 6
- 108010074027 glycyl-seryl-phenylalanine Proteins 0.000 description 6
- 108010087823 glycyltyrosine Proteins 0.000 description 6
- 238000003780 insertion Methods 0.000 description 6
- 230000037431 insertion Effects 0.000 description 6
- 108010012058 leucyltyrosine Proteins 0.000 description 6
- 108010084572 phenylalanyl-valine Proteins 0.000 description 6
- 229940035893 uracil Drugs 0.000 description 6
- 238000005406 washing Methods 0.000 description 6
- IFKQPMZRDQZSHI-GHCJXIJMSA-N Ala-Ile-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O IFKQPMZRDQZSHI-GHCJXIJMSA-N 0.000 description 5
- LXAARTARZJJCMB-CIQUZCHMSA-N Ala-Ile-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LXAARTARZJJCMB-CIQUZCHMSA-N 0.000 description 5
- MMLHRUJLOUSRJX-CIUDSAMLSA-N Ala-Ser-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN MMLHRUJLOUSRJX-CIUDSAMLSA-N 0.000 description 5
- WNHNMKOFKCHKKD-BFHQHQDPSA-N Ala-Thr-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O WNHNMKOFKCHKKD-BFHQHQDPSA-N 0.000 description 5
- YCTIYBUTCKNOTI-UWJYBYFXSA-N Ala-Tyr-Asp Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YCTIYBUTCKNOTI-UWJYBYFXSA-N 0.000 description 5
- SQKPKIJVWHAWNF-DCAQKATOSA-N Arg-Asp-Lys Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(O)=O SQKPKIJVWHAWNF-DCAQKATOSA-N 0.000 description 5
- CRCCTGPNZUCAHE-DCAQKATOSA-N Arg-His-Ser Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CN=CN1 CRCCTGPNZUCAHE-DCAQKATOSA-N 0.000 description 5
- COXMUHNBYCVVRG-DCAQKATOSA-N Arg-Leu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O COXMUHNBYCVVRG-DCAQKATOSA-N 0.000 description 5
- CVXXSWQORBZAAA-SRVKXCTJSA-N Arg-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N CVXXSWQORBZAAA-SRVKXCTJSA-N 0.000 description 5
- QNJIRRVTOXNGMH-GUBZILKMSA-N Asn-Gln-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CC(N)=O QNJIRRVTOXNGMH-GUBZILKMSA-N 0.000 description 5
- JQSWHKKUZMTOIH-QWRGUYRKSA-N Asn-Gly-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N JQSWHKKUZMTOIH-QWRGUYRKSA-N 0.000 description 5
- GJFYPBDMUGGLFR-NKWVEPMBSA-N Asn-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CC(=O)N)N)C(=O)O GJFYPBDMUGGLFR-NKWVEPMBSA-N 0.000 description 5
- RAQMSGVCGSJKCL-FOHZUACHSA-N Asn-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(N)=O RAQMSGVCGSJKCL-FOHZUACHSA-N 0.000 description 5
- GQRDIVQPSMPQME-ZPFDUUQYSA-N Asn-Ile-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O GQRDIVQPSMPQME-ZPFDUUQYSA-N 0.000 description 5
- NLRJGXZWTKXRHP-DCAQKATOSA-N Asn-Leu-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NLRJGXZWTKXRHP-DCAQKATOSA-N 0.000 description 5
- TZFQICWZWFNIKU-KKUMJFAQSA-N Asn-Leu-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 TZFQICWZWFNIKU-KKUMJFAQSA-N 0.000 description 5
- ALHMNHZJBYBYHS-DCAQKATOSA-N Asn-Lys-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O ALHMNHZJBYBYHS-DCAQKATOSA-N 0.000 description 5
- VWADICJNCPFKJS-ZLUOBGJFSA-N Asn-Ser-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O VWADICJNCPFKJS-ZLUOBGJFSA-N 0.000 description 5
- GZXOUBTUAUAVHD-ACZMJKKPSA-N Asn-Ser-Glu Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O GZXOUBTUAUAVHD-ACZMJKKPSA-N 0.000 description 5
- ZUFPUBYQYWCMDB-NUMRIWBASA-N Asn-Thr-Glu Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZUFPUBYQYWCMDB-NUMRIWBASA-N 0.000 description 5
- JNCRAQVYJZGIOW-QSFUFRPTSA-N Asn-Val-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JNCRAQVYJZGIOW-QSFUFRPTSA-N 0.000 description 5
- WSWYMRLTJVKRCE-ZLUOBGJFSA-N Asp-Ala-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O WSWYMRLTJVKRCE-ZLUOBGJFSA-N 0.000 description 5
- XBQSLMACWDXWLJ-GHCJXIJMSA-N Asp-Ala-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XBQSLMACWDXWLJ-GHCJXIJMSA-N 0.000 description 5
- HMQDRBKQMLRCCG-GMOBBJLQSA-N Asp-Arg-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HMQDRBKQMLRCCG-GMOBBJLQSA-N 0.000 description 5
- IAMNNSSEBXDJMN-CIUDSAMLSA-N Asp-Cys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC(=O)O)N IAMNNSSEBXDJMN-CIUDSAMLSA-N 0.000 description 5
- HOBNTSHITVVNBN-ZPFDUUQYSA-N Asp-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC(=O)O)N HOBNTSHITVVNBN-ZPFDUUQYSA-N 0.000 description 5
- CLUMZOKVGUWUFD-CIUDSAMLSA-N Asp-Leu-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O CLUMZOKVGUWUFD-CIUDSAMLSA-N 0.000 description 5
- LIVXPXUVXFRWNY-CIUDSAMLSA-N Asp-Lys-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O LIVXPXUVXFRWNY-CIUDSAMLSA-N 0.000 description 5
- DPNWSMBUYCLEDG-CIUDSAMLSA-N Asp-Lys-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O DPNWSMBUYCLEDG-CIUDSAMLSA-N 0.000 description 5
- GPPIDDWYKJPRES-YDHLFZDLSA-N Asp-Phe-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O GPPIDDWYKJPRES-YDHLFZDLSA-N 0.000 description 5
- ZQFZEBRNAMXXJV-KKUMJFAQSA-N Asp-Tyr-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O ZQFZEBRNAMXXJV-KKUMJFAQSA-N 0.000 description 5
- XMKXONRMGJXCJV-LAEOZQHASA-N Asp-Val-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O XMKXONRMGJXCJV-LAEOZQHASA-N 0.000 description 5
- VRJZMZGGAKVSIQ-SRVKXCTJSA-N Cys-Tyr-Ser Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O VRJZMZGGAKVSIQ-SRVKXCTJSA-N 0.000 description 5
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 5
- KCJJFESQRXGTGC-BQBZGAKWSA-N Gln-Glu-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O KCJJFESQRXGTGC-BQBZGAKWSA-N 0.000 description 5
- QKWBEMCLYTYBNI-GVXVVHGQSA-N Gln-Lys-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(N)=O QKWBEMCLYTYBNI-GVXVVHGQSA-N 0.000 description 5
- VEYGCDYMOXHJLS-GVXVVHGQSA-N Gln-Val-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O VEYGCDYMOXHJLS-GVXVVHGQSA-N 0.000 description 5
- LTUVYLVIZHJCOQ-KKUMJFAQSA-N Glu-Arg-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LTUVYLVIZHJCOQ-KKUMJFAQSA-N 0.000 description 5
- JPHYJQHPILOKHC-ACZMJKKPSA-N Glu-Asp-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O JPHYJQHPILOKHC-ACZMJKKPSA-N 0.000 description 5
- WPLGNDORMXTMQS-FXQIFTODSA-N Glu-Gln-Ser Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O WPLGNDORMXTMQS-FXQIFTODSA-N 0.000 description 5
- CGOHAEBMDSEKFB-FXQIFTODSA-N Glu-Glu-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O CGOHAEBMDSEKFB-FXQIFTODSA-N 0.000 description 5
- SJPMNHCEWPTRBR-BQBZGAKWSA-N Glu-Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SJPMNHCEWPTRBR-BQBZGAKWSA-N 0.000 description 5
- BCYGDJXHAGZNPQ-DCAQKATOSA-N Glu-Lys-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O BCYGDJXHAGZNPQ-DCAQKATOSA-N 0.000 description 5
- MFNUFCFRAZPJFW-JYJNAYRXSA-N Glu-Lys-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MFNUFCFRAZPJFW-JYJNAYRXSA-N 0.000 description 5
- CCQOOWAONKGYKQ-BYPYZUCNSA-N Gly-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)CN CCQOOWAONKGYKQ-BYPYZUCNSA-N 0.000 description 5
- UPADCCSMVOQAGF-LBPRGKRZSA-N Gly-Gly-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)CNC(=O)CN)C(O)=O)=CNC2=C1 UPADCCSMVOQAGF-LBPRGKRZSA-N 0.000 description 5
- LDFWDDVELNOGII-MXAVVETBSA-N His-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CN=CN1)N LDFWDDVELNOGII-MXAVVETBSA-N 0.000 description 5
- DFJJAVZIHDFOGQ-MNXVOIDGSA-N Ile-Glu-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N DFJJAVZIHDFOGQ-MNXVOIDGSA-N 0.000 description 5
- GVNNAHIRSDRIII-AJNGGQMLSA-N Ile-Lys-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N GVNNAHIRSDRIII-AJNGGQMLSA-N 0.000 description 5
- IIWQTXMUALXGOV-PCBIJLKTSA-N Ile-Phe-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)O)C(=O)O)N IIWQTXMUALXGOV-PCBIJLKTSA-N 0.000 description 5
- FGBRXCZYVRFNKQ-MXAVVETBSA-N Ile-Phe-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)O)N FGBRXCZYVRFNKQ-MXAVVETBSA-N 0.000 description 5
- 108010065920 Insulin Lispro Proteins 0.000 description 5
- 108091092195 Intron Proteins 0.000 description 5
- HGCNKOLVKRAVHD-UHFFFAOYSA-N L-Met-L-Phe Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 HGCNKOLVKRAVHD-UHFFFAOYSA-N 0.000 description 5
- QPRQGENIBFLVEB-BJDJZHNGSA-N Leu-Ala-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O QPRQGENIBFLVEB-BJDJZHNGSA-N 0.000 description 5
- OGCQGUIWMSBHRZ-CIUDSAMLSA-N Leu-Asn-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O OGCQGUIWMSBHRZ-CIUDSAMLSA-N 0.000 description 5
- WXHFZJFZWNCDNB-KKUMJFAQSA-N Leu-Asn-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 WXHFZJFZWNCDNB-KKUMJFAQSA-N 0.000 description 5
- DLFAACQHIRSQGG-CIUDSAMLSA-N Leu-Asp-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O DLFAACQHIRSQGG-CIUDSAMLSA-N 0.000 description 5
- PVMPDMIKUVNOBD-CIUDSAMLSA-N Leu-Asp-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O PVMPDMIKUVNOBD-CIUDSAMLSA-N 0.000 description 5
- IFMPDNRWZZEZSL-SRVKXCTJSA-N Leu-Leu-Cys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(O)=O IFMPDNRWZZEZSL-SRVKXCTJSA-N 0.000 description 5
- WXUOJXIGOPMDJM-SRVKXCTJSA-N Leu-Lys-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O WXUOJXIGOPMDJM-SRVKXCTJSA-N 0.000 description 5
- AMSSKPUHBUQBOQ-SRVKXCTJSA-N Leu-Ser-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N AMSSKPUHBUQBOQ-SRVKXCTJSA-N 0.000 description 5
- ZDJQVSIPFLMNOX-RHYQMDGZSA-N Leu-Thr-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N ZDJQVSIPFLMNOX-RHYQMDGZSA-N 0.000 description 5
- ILDSIMPXNFWKLH-KATARQTJSA-N Leu-Thr-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O ILDSIMPXNFWKLH-KATARQTJSA-N 0.000 description 5
- BTEMNFBEAAOGBR-BZSNNMDCSA-N Leu-Tyr-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BTEMNFBEAAOGBR-BZSNNMDCSA-N 0.000 description 5
- 108010062166 Lys-Asn-Asp Proteins 0.000 description 5
- DEFGUIIUYAUEDU-ZPFDUUQYSA-N Lys-Asn-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DEFGUIIUYAUEDU-ZPFDUUQYSA-N 0.000 description 5
- OVIVOCSURJYCTM-GUBZILKMSA-N Lys-Asp-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(O)=O OVIVOCSURJYCTM-GUBZILKMSA-N 0.000 description 5
- WGCKDDHUFPQSMZ-ZPFDUUQYSA-N Lys-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCCCN WGCKDDHUFPQSMZ-ZPFDUUQYSA-N 0.000 description 5
- SSYOBDBNBQBSQE-SRVKXCTJSA-N Lys-Cys-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(O)=O SSYOBDBNBQBSQE-SRVKXCTJSA-N 0.000 description 5
- KSFQPRLZAUXXPT-GARJFASQSA-N Lys-Cys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CS)NC(=O)[C@H](CCCCN)N)C(=O)O KSFQPRLZAUXXPT-GARJFASQSA-N 0.000 description 5
- KKFVKBWCXXLKIK-AVGNSLFASA-N Lys-His-Glu Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCCCN)N KKFVKBWCXXLKIK-AVGNSLFASA-N 0.000 description 5
- IVFUVMSKSFSFBT-NHCYSSNCSA-N Lys-Ile-Gly Chemical compound OC(=O)CNC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN IVFUVMSKSFSFBT-NHCYSSNCSA-N 0.000 description 5
- QOJDBRUCOXQSSK-AJNGGQMLSA-N Lys-Ile-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCCN)C(O)=O QOJDBRUCOXQSSK-AJNGGQMLSA-N 0.000 description 5
- MIFFFXHMAHFACR-KATARQTJSA-N Lys-Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CCCCN MIFFFXHMAHFACR-KATARQTJSA-N 0.000 description 5
- TVHCDSBMFQYPNA-RHYQMDGZSA-N Lys-Thr-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TVHCDSBMFQYPNA-RHYQMDGZSA-N 0.000 description 5
- CAVRAQIDHUPECU-UVOCVTCTSA-N Lys-Thr-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAVRAQIDHUPECU-UVOCVTCTSA-N 0.000 description 5
- SUZVLFWOCKHWET-CQDKDKBSSA-N Lys-Tyr-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O SUZVLFWOCKHWET-CQDKDKBSSA-N 0.000 description 5
- WGBMNLCRYKSWAR-DCAQKATOSA-N Met-Asp-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN WGBMNLCRYKSWAR-DCAQKATOSA-N 0.000 description 5
- PESQCPHRXOFIPX-UHFFFAOYSA-N N-L-methionyl-L-tyrosine Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 PESQCPHRXOFIPX-UHFFFAOYSA-N 0.000 description 5
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 5
- AWAYOWOUGVZXOB-BZSNNMDCSA-N Phe-Asn-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 AWAYOWOUGVZXOB-BZSNNMDCSA-N 0.000 description 5
- YCCUXNNKXDGMAM-KKUMJFAQSA-N Phe-Leu-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YCCUXNNKXDGMAM-KKUMJFAQSA-N 0.000 description 5
- BSKMOCNNLNDIMU-CDMKHQONSA-N Phe-Thr-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O BSKMOCNNLNDIMU-CDMKHQONSA-N 0.000 description 5
- GOUWCZRDTWTODO-YDHLFZDLSA-N Phe-Val-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O GOUWCZRDTWTODO-YDHLFZDLSA-N 0.000 description 5
- MWQXFDIQXIXPMS-UNQGMJICSA-N Phe-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CC1=CC=CC=C1)N)O MWQXFDIQXIXPMS-UNQGMJICSA-N 0.000 description 5
- FRKBNXCFJBPJOL-GUBZILKMSA-N Pro-Glu-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O FRKBNXCFJBPJOL-GUBZILKMSA-N 0.000 description 5
- IQAGKQWXVHTPOT-FHWLQOOXSA-N Pro-Lys-Trp Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O IQAGKQWXVHTPOT-FHWLQOOXSA-N 0.000 description 5
- QPFJSHSJFIYDJZ-GHCJXIJMSA-N Ser-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CO QPFJSHSJFIYDJZ-GHCJXIJMSA-N 0.000 description 5
- HEQPKICPPDOSIN-SRVKXCTJSA-N Ser-Asp-Tyr Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HEQPKICPPDOSIN-SRVKXCTJSA-N 0.000 description 5
- HJEBZBMOTCQYDN-ACZMJKKPSA-N Ser-Glu-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HJEBZBMOTCQYDN-ACZMJKKPSA-N 0.000 description 5
- XXXAXOWMBOKTRN-XPUUQOCRSA-N Ser-Gly-Val Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O XXXAXOWMBOKTRN-XPUUQOCRSA-N 0.000 description 5
- KCNSGAMPBPYUAI-CIUDSAMLSA-N Ser-Leu-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O KCNSGAMPBPYUAI-CIUDSAMLSA-N 0.000 description 5
- QJKPECIAWNNKIT-KKUMJFAQSA-N Ser-Lys-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O QJKPECIAWNNKIT-KKUMJFAQSA-N 0.000 description 5
- DCLBXIWHLVEPMQ-JRQIVUDYSA-N Thr-Asp-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 DCLBXIWHLVEPMQ-JRQIVUDYSA-N 0.000 description 5
- ZTPXSEUVYNNZRB-CDMKHQONSA-N Thr-Gly-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZTPXSEUVYNNZRB-CDMKHQONSA-N 0.000 description 5
- AYPAIRCDLARHLM-KKUMJFAQSA-N Tyr-Asn-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O AYPAIRCDLARHLM-KKUMJFAQSA-N 0.000 description 5
- AKLNEFNQWLHIGY-QWRGUYRKSA-N Tyr-Gly-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)O)C(=O)O)N)O AKLNEFNQWLHIGY-QWRGUYRKSA-N 0.000 description 5
- WSFXJLFSJSXGMQ-MGHWNKPDSA-N Tyr-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N WSFXJLFSJSXGMQ-MGHWNKPDSA-N 0.000 description 5
- JLKVWTICWVWGSK-JYJNAYRXSA-N Tyr-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 JLKVWTICWVWGSK-JYJNAYRXSA-N 0.000 description 5
- FMXFHNSFABRVFZ-BZSNNMDCSA-N Tyr-Lys-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O FMXFHNSFABRVFZ-BZSNNMDCSA-N 0.000 description 5
- NHOVZGFNTGMYMI-KKUMJFAQSA-N Tyr-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NHOVZGFNTGMYMI-KKUMJFAQSA-N 0.000 description 5
- HZDQUVQEVVYDDA-ACRUOGEOSA-N Tyr-Tyr-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 HZDQUVQEVVYDDA-ACRUOGEOSA-N 0.000 description 5
- 102000006943 Uracil-DNA Glycosidase Human genes 0.000 description 5
- 108010072685 Uracil-DNA Glycosidase Proteins 0.000 description 5
- JFAWZADYPRMRCO-UBHSHLNASA-N Val-Ala-Phe Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 JFAWZADYPRMRCO-UBHSHLNASA-N 0.000 description 5
- UKEVLVBHRKWECS-LSJOCFKGSA-N Val-Ile-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](C(C)C)N UKEVLVBHRKWECS-LSJOCFKGSA-N 0.000 description 5
- LYERIXUFCYVFFX-GVXVVHGQSA-N Val-Leu-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N LYERIXUFCYVFFX-GVXVVHGQSA-N 0.000 description 5
- JAKHAONCJJZVHT-DCAQKATOSA-N Val-Lys-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)O)N JAKHAONCJJZVHT-DCAQKATOSA-N 0.000 description 5
- VPGCVZRRBYOGCD-AVGNSLFASA-N Val-Lys-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O VPGCVZRRBYOGCD-AVGNSLFASA-N 0.000 description 5
- WMRWZYSRQUORHJ-YDHLFZDLSA-N Val-Phe-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)O)C(=O)O)N WMRWZYSRQUORHJ-YDHLFZDLSA-N 0.000 description 5
- JVGDAEKKZKKZFO-RCWTZXSCSA-N Val-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C(C)C)N)O JVGDAEKKZKKZFO-RCWTZXSCSA-N 0.000 description 5
- 108010047495 alanylglycine Proteins 0.000 description 5
- 150000001413 amino acids Chemical class 0.000 description 5
- 108010008355 arginyl-glutamine Proteins 0.000 description 5
- 108010047857 aspartylglycine Proteins 0.000 description 5
- 230000001186 cumulative effect Effects 0.000 description 5
- 230000009615 deamination Effects 0.000 description 5
- 238000006481 deamination reaction Methods 0.000 description 5
- 238000012217 deletion Methods 0.000 description 5
- 230000037430 deletion Effects 0.000 description 5
- 210000003527 eukaryotic cell Anatomy 0.000 description 5
- 238000002474 experimental method Methods 0.000 description 5
- 230000004927 fusion Effects 0.000 description 5
- 108010057083 glutamyl-aspartyl-leucine Proteins 0.000 description 5
- 108010049041 glutamylalanine Proteins 0.000 description 5
- 108010059898 glycyl-tyrosyl-lysine Proteins 0.000 description 5
- 108010037850 glycylvaline Proteins 0.000 description 5
- 108010036413 histidylglycine Proteins 0.000 description 5
- 108010025306 histidylleucine Proteins 0.000 description 5
- 230000000670 limiting effect Effects 0.000 description 5
- 108010076718 lysyl-glutamyl-tryptophan Proteins 0.000 description 5
- 108010068488 methionylphenylalanine Proteins 0.000 description 5
- 108091040857 miR-604 stem-loop Proteins 0.000 description 5
- 239000002987 primer (paints) Substances 0.000 description 5
- 230000008569 process Effects 0.000 description 5
- 210000001938 protoplast Anatomy 0.000 description 5
- 230000008929 regeneration Effects 0.000 description 5
- 238000011069 regeneration method Methods 0.000 description 5
- 238000012546 transfer Methods 0.000 description 5
- 230000014616 translation Effects 0.000 description 5
- JNTMAZFVYNDPLB-PEDHHIEDSA-N (2S,3S)-2-[[[(2S)-1-[(2S,3S)-2-amino-3-methyl-1-oxopentyl]-2-pyrrolidinyl]-oxomethyl]amino]-3-methylpentanoic acid Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JNTMAZFVYNDPLB-PEDHHIEDSA-N 0.000 description 4
- PAHHYDSPOXDASW-VGWMRTNUSA-N (2s)-6-amino-2-[[(2s)-6-amino-2-[[(2s)-1-[(2s)-2-amino-3-hydroxypropanoyl]pyrrolidine-2-carbonyl]amino]hexanoyl]amino]hexanoic acid Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO PAHHYDSPOXDASW-VGWMRTNUSA-N 0.000 description 4
- FATXTKJILXPNJL-UHFFFAOYSA-N 2-[[2-[2-[(2-amino-3-methylpentanoyl)amino]propanoylamino]acetyl]amino]-3-phenylpropanoic acid Chemical compound CCC(C)C(N)C(=O)NC(C)C(=O)NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 FATXTKJILXPNJL-UHFFFAOYSA-N 0.000 description 4
- KXEVYGKATAMXJJ-ACZMJKKPSA-N Ala-Glu-Asp Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O KXEVYGKATAMXJJ-ACZMJKKPSA-N 0.000 description 4
- HXNNRBHASOSVPG-GUBZILKMSA-N Ala-Glu-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O HXNNRBHASOSVPG-GUBZILKMSA-N 0.000 description 4
- GRIFPSOFWFIICX-GOPGUHFVSA-N Ala-His-Trp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O GRIFPSOFWFIICX-GOPGUHFVSA-N 0.000 description 4
- DVJSJDDYCYSMFR-ZKWXMUAHSA-N Ala-Ile-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O DVJSJDDYCYSMFR-ZKWXMUAHSA-N 0.000 description 4
- XCZXVTHYGSMQGH-NAKRPEOUSA-N Ala-Ile-Met Chemical compound C[C@H]([NH3+])C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCSC)C([O-])=O XCZXVTHYGSMQGH-NAKRPEOUSA-N 0.000 description 4
- OKIKVSXTXVVFDV-MMWGEVLESA-N Ala-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C)N OKIKVSXTXVVFDV-MMWGEVLESA-N 0.000 description 4
- AWZKCUCQJNTBAD-SRVKXCTJSA-N Ala-Leu-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN AWZKCUCQJNTBAD-SRVKXCTJSA-N 0.000 description 4
- HOVPGJUNRLMIOZ-CIUDSAMLSA-N Ala-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](C)N HOVPGJUNRLMIOZ-CIUDSAMLSA-N 0.000 description 4
- OTUQSEPIIVBYEM-IHRRRGAJSA-N Arg-Asn-Tyr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O OTUQSEPIIVBYEM-IHRRRGAJSA-N 0.000 description 4
- NVUIWHJLPSZZQC-CYDGBPFRSA-N Arg-Ile-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O NVUIWHJLPSZZQC-CYDGBPFRSA-N 0.000 description 4
- NMRHDSAOIURTNT-RWMBFGLXSA-N Arg-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N NMRHDSAOIURTNT-RWMBFGLXSA-N 0.000 description 4
- DPLFNLDACGGBAK-KKUMJFAQSA-N Arg-Phe-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N DPLFNLDACGGBAK-KKUMJFAQSA-N 0.000 description 4
- VLIJAPRTSXSGFY-STQMWFEESA-N Arg-Tyr-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=C(O)C=C1 VLIJAPRTSXSGFY-STQMWFEESA-N 0.000 description 4
- CIBWFJFMOBIFTE-CIUDSAMLSA-N Asn-Arg-Gln Chemical compound C(C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N)CN=C(N)N CIBWFJFMOBIFTE-CIUDSAMLSA-N 0.000 description 4
- NVGWESORMHFISY-SRVKXCTJSA-N Asn-Asn-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O NVGWESORMHFISY-SRVKXCTJSA-N 0.000 description 4
- PBSQFBAJKPLRJY-BYULHYEWSA-N Asn-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N PBSQFBAJKPLRJY-BYULHYEWSA-N 0.000 description 4
- ANPFQTJEPONRPL-UGYAYLCHSA-N Asn-Ile-Asp Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O ANPFQTJEPONRPL-UGYAYLCHSA-N 0.000 description 4
- SPCONPVIDFMDJI-QSFUFRPTSA-N Asn-Ile-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O SPCONPVIDFMDJI-QSFUFRPTSA-N 0.000 description 4
- FBODFHMLALOPHP-GUBZILKMSA-N Asn-Lys-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O FBODFHMLALOPHP-GUBZILKMSA-N 0.000 description 4
- MDDXKBHIMYYJLW-FXQIFTODSA-N Asn-Met-Asp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N MDDXKBHIMYYJLW-FXQIFTODSA-N 0.000 description 4
- PPCORQFLAZWUNO-QWRGUYRKSA-N Asn-Phe-Gly Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC(=O)N)N PPCORQFLAZWUNO-QWRGUYRKSA-N 0.000 description 4
- YRTOMUMWSTUQAX-FXQIFTODSA-N Asn-Pro-Asp Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O YRTOMUMWSTUQAX-FXQIFTODSA-N 0.000 description 4
- OOXUBGLNDRGOKT-FXQIFTODSA-N Asn-Ser-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O OOXUBGLNDRGOKT-FXQIFTODSA-N 0.000 description 4
- JXMREEPBRANWBY-VEVYYDQMSA-N Asn-Thr-Arg Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JXMREEPBRANWBY-VEVYYDQMSA-N 0.000 description 4
- QYRMBFWDSFGSFC-OLHMAJIHSA-N Asn-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O QYRMBFWDSFGSFC-OLHMAJIHSA-N 0.000 description 4
- NSTBNYOKCZKOMI-AVGNSLFASA-N Asn-Tyr-Glu Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O NSTBNYOKCZKOMI-AVGNSLFASA-N 0.000 description 4
- XJQRWGXKUSDEFI-ACZMJKKPSA-N Asp-Glu-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O XJQRWGXKUSDEFI-ACZMJKKPSA-N 0.000 description 4
- GBSUGIXJAAKZOW-GMOBBJLQSA-N Asp-Ile-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O GBSUGIXJAAKZOW-GMOBBJLQSA-N 0.000 description 4
- PYXXJFRXIYAESU-PCBIJLKTSA-N Asp-Ile-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PYXXJFRXIYAESU-PCBIJLKTSA-N 0.000 description 4
- RQHLMGCXCZUOGT-ZPFDUUQYSA-N Asp-Leu-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O RQHLMGCXCZUOGT-ZPFDUUQYSA-N 0.000 description 4
- GKWFMNNNYZHJHV-SRVKXCTJSA-N Asp-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC(O)=O GKWFMNNNYZHJHV-SRVKXCTJSA-N 0.000 description 4
- LTCKTLYKRMCFOC-KKUMJFAQSA-N Asp-Phe-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O LTCKTLYKRMCFOC-KKUMJFAQSA-N 0.000 description 4
- FAUPLTGRUBTXNU-FXQIFTODSA-N Asp-Pro-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O FAUPLTGRUBTXNU-FXQIFTODSA-N 0.000 description 4
- XXAMCEGRCZQGEM-ZLUOBGJFSA-N Asp-Ser-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O XXAMCEGRCZQGEM-ZLUOBGJFSA-N 0.000 description 4
- DRCOAZZDQRCGGP-GHCJXIJMSA-N Asp-Ser-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DRCOAZZDQRCGGP-GHCJXIJMSA-N 0.000 description 4
- VNXQRBXEQXLERQ-CIUDSAMLSA-N Asp-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)O)N VNXQRBXEQXLERQ-CIUDSAMLSA-N 0.000 description 4
- PLOKOIJSGCISHE-BYULHYEWSA-N Asp-Val-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O PLOKOIJSGCISHE-BYULHYEWSA-N 0.000 description 4
- 238000010453 CRISPR/Cas method Methods 0.000 description 4
- DIUBVGXMXONJCF-KKUMJFAQSA-N Cys-His-Tyr Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O DIUBVGXMXONJCF-KKUMJFAQSA-N 0.000 description 4
- WVLZTXGTNGHPBO-SRVKXCTJSA-N Cys-Leu-Leu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O WVLZTXGTNGHPBO-SRVKXCTJSA-N 0.000 description 4
- SRZZZTMJARUVPI-JBDRJPRFSA-N Cys-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CS)N SRZZZTMJARUVPI-JBDRJPRFSA-N 0.000 description 4
- 108010008532 Deoxyribonuclease I Proteins 0.000 description 4
- 102000007260 Deoxyribonuclease I Human genes 0.000 description 4
- ZHNUHDYFZUAESO-UHFFFAOYSA-N Formamide Chemical compound NC=O ZHNUHDYFZUAESO-UHFFFAOYSA-N 0.000 description 4
- DXMPMSWUZVNBSG-QEJZJMRPSA-N Gln-Asn-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)N)N DXMPMSWUZVNBSG-QEJZJMRPSA-N 0.000 description 4
- NVEASDQHBRZPSU-BQBZGAKWSA-N Gln-Gln-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O NVEASDQHBRZPSU-BQBZGAKWSA-N 0.000 description 4
- SOEXCCGNHQBFPV-DLOVCJGASA-N Gln-Val-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(O)=O SOEXCCGNHQBFPV-DLOVCJGASA-N 0.000 description 4
- FHPXTPQBODWBIY-CIUDSAMLSA-N Glu-Ala-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FHPXTPQBODWBIY-CIUDSAMLSA-N 0.000 description 4
- GCYFUZJHAXJKKE-KKUMJFAQSA-N Glu-Arg-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O GCYFUZJHAXJKKE-KKUMJFAQSA-N 0.000 description 4
- GLWXKFRTOHKGIT-ACZMJKKPSA-N Glu-Asn-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O GLWXKFRTOHKGIT-ACZMJKKPSA-N 0.000 description 4
- PCBBLFVHTYNQGG-LAEOZQHASA-N Glu-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N PCBBLFVHTYNQGG-LAEOZQHASA-N 0.000 description 4
- IESFZVCAVACGPH-PEFMBERDSA-N Glu-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCC(O)=O IESFZVCAVACGPH-PEFMBERDSA-N 0.000 description 4
- KVBPDJIFRQUQFY-ACZMJKKPSA-N Glu-Cys-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(O)=O KVBPDJIFRQUQFY-ACZMJKKPSA-N 0.000 description 4
- XHUCVVHRLNPZSZ-CIUDSAMLSA-N Glu-Gln-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O XHUCVVHRLNPZSZ-CIUDSAMLSA-N 0.000 description 4
- VSRCAOIHMGCIJK-SRVKXCTJSA-N Glu-Leu-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O VSRCAOIHMGCIJK-SRVKXCTJSA-N 0.000 description 4
- NWOUBJNMZDDGDT-AVGNSLFASA-N Glu-Leu-His Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 NWOUBJNMZDDGDT-AVGNSLFASA-N 0.000 description 4
- IVGJYOOGJLFKQE-AVGNSLFASA-N Glu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N IVGJYOOGJLFKQE-AVGNSLFASA-N 0.000 description 4
- UJMNFCAHLYKWOZ-DCAQKATOSA-N Glu-Lys-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O UJMNFCAHLYKWOZ-DCAQKATOSA-N 0.000 description 4
- RGJKYNUINKGPJN-RWRJDSDZSA-N Glu-Thr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](CCC(=O)O)N RGJKYNUINKGPJN-RWRJDSDZSA-N 0.000 description 4
- VXEFAWJTFAUDJK-AVGNSLFASA-N Glu-Tyr-Ser Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O VXEFAWJTFAUDJK-AVGNSLFASA-N 0.000 description 4
- UZWUBBRJWFTHTD-LAEOZQHASA-N Glu-Val-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O UZWUBBRJWFTHTD-LAEOZQHASA-N 0.000 description 4
- VSVZIEVNUYDAFR-YUMQZZPRSA-N Gly-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN VSVZIEVNUYDAFR-YUMQZZPRSA-N 0.000 description 4
- DTPOVRRYXPJJAZ-FJXKBIBVSA-N Gly-Arg-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N DTPOVRRYXPJJAZ-FJXKBIBVSA-N 0.000 description 4
- HDNXXTBKOJKWNN-WDSKDSINSA-N Gly-Glu-Asn Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O HDNXXTBKOJKWNN-WDSKDSINSA-N 0.000 description 4
- SXJHOPPTOJACOA-QXEWZRGKSA-N Gly-Ile-Arg Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CCCN=C(N)N SXJHOPPTOJACOA-QXEWZRGKSA-N 0.000 description 4
- FCKPEGOCSVZPNC-WHOFXGATSA-N Gly-Ile-Phe Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 FCKPEGOCSVZPNC-WHOFXGATSA-N 0.000 description 4
- NSTUFLGQJCOCDL-UWVGGRQHSA-N Gly-Leu-Arg Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N NSTUFLGQJCOCDL-UWVGGRQHSA-N 0.000 description 4
- UUYBFNKHOCJCHT-VHSXEESVSA-N Gly-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN UUYBFNKHOCJCHT-VHSXEESVSA-N 0.000 description 4
- VDCRBJACQKOSMS-JSGCOSHPSA-N Gly-Phe-Val Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O VDCRBJACQKOSMS-JSGCOSHPSA-N 0.000 description 4
- BDFCIKANUNMFGB-PMVVWTBXSA-N His-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CN=CN1 BDFCIKANUNMFGB-PMVVWTBXSA-N 0.000 description 4
- KWBISLAEQZUYIC-UWJYBYFXSA-N His-His-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC2=CN=CN2)N KWBISLAEQZUYIC-UWJYBYFXSA-N 0.000 description 4
- PZAJPILZRFPYJJ-SRVKXCTJSA-N His-Ser-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O PZAJPILZRFPYJJ-SRVKXCTJSA-N 0.000 description 4
- UWSMZKRTOZEGDD-CUJWVEQBSA-N His-Thr-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O UWSMZKRTOZEGDD-CUJWVEQBSA-N 0.000 description 4
- 206010020649 Hyperkeratosis Diseases 0.000 description 4
- LEDRIAHEWDJRMF-CFMVVWHZSA-N Ile-Asn-Tyr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 LEDRIAHEWDJRMF-CFMVVWHZSA-N 0.000 description 4
- LLZLRXBTOOFODM-QSFUFRPTSA-N Ile-Asp-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](C(C)C)C(=O)O)N LLZLRXBTOOFODM-QSFUFRPTSA-N 0.000 description 4
- PNDMHTTXXPUQJH-RWRJDSDZSA-N Ile-Glu-Thr Chemical compound N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H]([C@H](O)C)C(=O)O PNDMHTTXXPUQJH-RWRJDSDZSA-N 0.000 description 4
- CMNMPCTVCWWYHY-MXAVVETBSA-N Ile-His-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC(C)C)C(=O)O)N CMNMPCTVCWWYHY-MXAVVETBSA-N 0.000 description 4
- TWPSALMCEHCIOY-YTFOTSKYSA-N Ile-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(=O)O)N TWPSALMCEHCIOY-YTFOTSKYSA-N 0.000 description 4
- CSQNHSGHAPRGPQ-YTFOTSKYSA-N Ile-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCCN)C(=O)O)N CSQNHSGHAPRGPQ-YTFOTSKYSA-N 0.000 description 4
- PMMMQRVUMVURGJ-XUXIUFHCSA-N Ile-Leu-Pro Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(O)=O PMMMQRVUMVURGJ-XUXIUFHCSA-N 0.000 description 4
- ADDYYRVQQZFIMW-MNXVOIDGSA-N Ile-Lys-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ADDYYRVQQZFIMW-MNXVOIDGSA-N 0.000 description 4
- CKRFDMPBSWYOBT-PPCPHDFISA-N Ile-Lys-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N CKRFDMPBSWYOBT-PPCPHDFISA-N 0.000 description 4
- BKPPWVSPSIUXHZ-OSUNSFLBSA-N Ile-Met-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N BKPPWVSPSIUXHZ-OSUNSFLBSA-N 0.000 description 4
- UYNXBNHVWFNVIN-HJWJTTGWSA-N Ile-Phe-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)CC)CC1=CC=CC=C1 UYNXBNHVWFNVIN-HJWJTTGWSA-N 0.000 description 4
- IITVUURPOYGCTD-NAKRPEOUSA-N Ile-Pro-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O IITVUURPOYGCTD-NAKRPEOUSA-N 0.000 description 4
- BLFXHAFTNYZEQE-VKOGCVSHSA-N Ile-Trp-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N BLFXHAFTNYZEQE-VKOGCVSHSA-N 0.000 description 4
- REXAUQBGSGDEJY-IGISWZIWSA-N Ile-Tyr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N REXAUQBGSGDEJY-IGISWZIWSA-N 0.000 description 4
- YWCJXQKATPNPOE-UKJIMTQDSA-N Ile-Val-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N YWCJXQKATPNPOE-UKJIMTQDSA-N 0.000 description 4
- PMGDADKJMCOXHX-UHFFFAOYSA-N L-Arginyl-L-glutamin-acetat Natural products NC(=N)NCCCC(N)C(=O)NC(CCC(N)=O)C(O)=O PMGDADKJMCOXHX-UHFFFAOYSA-N 0.000 description 4
- SUPVSFFZWVOEOI-UHFFFAOYSA-N Leu-Ala-Tyr Natural products CC(C)CC(N)C(=O)NC(C)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 SUPVSFFZWVOEOI-UHFFFAOYSA-N 0.000 description 4
- DBVWMYGBVFCRBE-CIUDSAMLSA-N Leu-Asn-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O DBVWMYGBVFCRBE-CIUDSAMLSA-N 0.000 description 4
- GPICTNQYKHHHTH-GUBZILKMSA-N Leu-Gln-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O GPICTNQYKHHHTH-GUBZILKMSA-N 0.000 description 4
- HVHRPWQEQHIQJF-AVGNSLFASA-N Leu-Lys-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HVHRPWQEQHIQJF-AVGNSLFASA-N 0.000 description 4
- KPYAOIVPJKPIOU-KKUMJFAQSA-N Leu-Lys-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O KPYAOIVPJKPIOU-KKUMJFAQSA-N 0.000 description 4
- OVZLLFONXILPDZ-VOAKCMCISA-N Leu-Lys-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OVZLLFONXILPDZ-VOAKCMCISA-N 0.000 description 4
- BIZNDKMFQHDOIE-KKUMJFAQSA-N Leu-Phe-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=CC=C1 BIZNDKMFQHDOIE-KKUMJFAQSA-N 0.000 description 4
- IWMJFLJQHIDZQW-KKUMJFAQSA-N Leu-Ser-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IWMJFLJQHIDZQW-KKUMJFAQSA-N 0.000 description 4
- VJGQRELPQWNURN-JYJNAYRXSA-N Leu-Tyr-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O VJGQRELPQWNURN-JYJNAYRXSA-N 0.000 description 4
- OZTZJMUZVAVJGY-BZSNNMDCSA-N Leu-Tyr-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N OZTZJMUZVAVJGY-BZSNNMDCSA-N 0.000 description 4
- VUBIPAHVHMZHCM-KKUMJFAQSA-N Leu-Tyr-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=C(O)C=C1 VUBIPAHVHMZHCM-KKUMJFAQSA-N 0.000 description 4
- ABHIXYDMILIUKV-CIUDSAMLSA-N Lys-Asn-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ABHIXYDMILIUKV-CIUDSAMLSA-N 0.000 description 4
- NDORZBUHCOJQDO-GVXVVHGQSA-N Lys-Gln-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O NDORZBUHCOJQDO-GVXVVHGQSA-N 0.000 description 4
- SPCHLZUWJTYZFC-IHRRRGAJSA-N Lys-His-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C(C)C)C(O)=O SPCHLZUWJTYZFC-IHRRRGAJSA-N 0.000 description 4
- IUWMQCZOTYRXPL-ZPFDUUQYSA-N Lys-Ile-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O IUWMQCZOTYRXPL-ZPFDUUQYSA-N 0.000 description 4
- XDPLZVNMYQOFQZ-BJDJZHNGSA-N Lys-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCCN)N XDPLZVNMYQOFQZ-BJDJZHNGSA-N 0.000 description 4
- WVJNGSFKBKOKRV-AJNGGQMLSA-N Lys-Leu-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WVJNGSFKBKOKRV-AJNGGQMLSA-N 0.000 description 4
- UQRZFMQQXXJTTF-AVGNSLFASA-N Lys-Lys-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O UQRZFMQQXXJTTF-AVGNSLFASA-N 0.000 description 4
- PLDJDCJLRCYPJB-VOAKCMCISA-N Lys-Lys-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PLDJDCJLRCYPJB-VOAKCMCISA-N 0.000 description 4
- LMGNWHDWJDIOPK-DKIMLUQUSA-N Lys-Phe-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LMGNWHDWJDIOPK-DKIMLUQUSA-N 0.000 description 4
- GHKXHCMRAUYLBS-CIUDSAMLSA-N Lys-Ser-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O GHKXHCMRAUYLBS-CIUDSAMLSA-N 0.000 description 4
- YKBSXQFZWFXFIB-VOAKCMCISA-N Lys-Thr-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CCCCN)C(O)=O YKBSXQFZWFXFIB-VOAKCMCISA-N 0.000 description 4
- YUTZYVTZDVZBJJ-IHPCNDPISA-N Lys-Trp-Lys Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O)=CNC2=C1 YUTZYVTZDVZBJJ-IHPCNDPISA-N 0.000 description 4
- RQILLQOQXLZTCK-KBPBESRZSA-N Lys-Tyr-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O RQILLQOQXLZTCK-KBPBESRZSA-N 0.000 description 4
- WINFHLHJTRGLCV-BZSNNMDCSA-N Lys-Tyr-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(O)=O)CC1=CC=C(O)C=C1 WINFHLHJTRGLCV-BZSNNMDCSA-N 0.000 description 4
- PPNCMJARTHYNEC-MEYUZBJRSA-N Lys-Tyr-Thr Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@H](O)C)C(O)=O)CC1=CC=C(O)C=C1 PPNCMJARTHYNEC-MEYUZBJRSA-N 0.000 description 4
- VWJFOUBDZIUXGA-AVGNSLFASA-N Lys-Val-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CCCCN)N VWJFOUBDZIUXGA-AVGNSLFASA-N 0.000 description 4
- JQEBITVYKUCBMC-SRVKXCTJSA-N Met-Arg-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JQEBITVYKUCBMC-SRVKXCTJSA-N 0.000 description 4
- OVTOTTGZBWXLFU-QXEWZRGKSA-N Met-Val-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(O)=O OVTOTTGZBWXLFU-QXEWZRGKSA-N 0.000 description 4
- 241000588622 Moraxella bovis Species 0.000 description 4
- WUGMRIBZSVSJNP-UHFFFAOYSA-N N-L-alanyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C)C(O)=O)=CNC2=C1 WUGMRIBZSVSJNP-UHFFFAOYSA-N 0.000 description 4
- 108010087066 N2-tryptophyllysine Proteins 0.000 description 4
- AYPMIIKUMNADSU-IHRRRGAJSA-N Phe-Arg-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O AYPMIIKUMNADSU-IHRRRGAJSA-N 0.000 description 4
- VJEZWOSKRCLHRP-MELADBBJSA-N Phe-Cys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CS)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O VJEZWOSKRCLHRP-MELADBBJSA-N 0.000 description 4
- WKTSCAXSYITIJJ-PCBIJLKTSA-N Phe-Ile-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O WKTSCAXSYITIJJ-PCBIJLKTSA-N 0.000 description 4
- ZUQACJLOHYRVPJ-DKIMLUQUSA-N Phe-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 ZUQACJLOHYRVPJ-DKIMLUQUSA-N 0.000 description 4
- IPFXYNKCXYGSSV-KKUMJFAQSA-N Phe-Ser-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N IPFXYNKCXYGSSV-KKUMJFAQSA-N 0.000 description 4
- BAONJAHBAUDJKA-BZSNNMDCSA-N Phe-Tyr-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CC=CC=C1 BAONJAHBAUDJKA-BZSNNMDCSA-N 0.000 description 4
- RGMLUHANLDVMPB-ULQDDVLXSA-N Phe-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N RGMLUHANLDVMPB-ULQDDVLXSA-N 0.000 description 4
- AMBLXEMWFARNNQ-DCAQKATOSA-N Pro-Asn-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@@H]1CCCN1 AMBLXEMWFARNNQ-DCAQKATOSA-N 0.000 description 4
- FKKHDBFNOLCYQM-FXQIFTODSA-N Pro-Cys-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CS)C(=O)N[C@@H](C)C(O)=O FKKHDBFNOLCYQM-FXQIFTODSA-N 0.000 description 4
- SZZBUDVXWZZPDH-BQBZGAKWSA-N Pro-Cys-Gly Chemical compound OC(=O)CNC(=O)[C@H](CS)NC(=O)[C@@H]1CCCN1 SZZBUDVXWZZPDH-BQBZGAKWSA-N 0.000 description 4
- NOXSEHJOXCWRHK-DCAQKATOSA-N Pro-Cys-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@@H]1CCCN1 NOXSEHJOXCWRHK-DCAQKATOSA-N 0.000 description 4
- HJSCRFZVGXAGNG-SRVKXCTJSA-N Pro-Gln-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H]1CCCN1 HJSCRFZVGXAGNG-SRVKXCTJSA-N 0.000 description 4
- BFXZQMWKTYWGCF-PYJNHQTQSA-N Pro-His-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BFXZQMWKTYWGCF-PYJNHQTQSA-N 0.000 description 4
- LPGSNRSLPHRNBW-AVGNSLFASA-N Pro-His-Val Chemical compound C([C@@H](C(=O)N[C@@H](C(C)C)C([O-])=O)NC(=O)[C@H]1[NH2+]CCC1)C1=CN=CN1 LPGSNRSLPHRNBW-AVGNSLFASA-N 0.000 description 4
- IIRBTQHFVNGPMQ-AVGNSLFASA-N Pro-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@@H]1CCCN1 IIRBTQHFVNGPMQ-AVGNSLFASA-N 0.000 description 4
- 108010076504 Protein Sorting Signals Proteins 0.000 description 4
- 230000004570 RNA-binding Effects 0.000 description 4
- 238000012300 Sequence Analysis Methods 0.000 description 4
- BTKUIVBNGBFTTP-WHFBIAKZSA-N Ser-Ala-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)NCC(O)=O BTKUIVBNGBFTTP-WHFBIAKZSA-N 0.000 description 4
- FCRMLGJMPXCAHD-FXQIFTODSA-N Ser-Arg-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O FCRMLGJMPXCAHD-FXQIFTODSA-N 0.000 description 4
- OYEDZGNMSBZCIM-XGEHTFHBSA-N Ser-Arg-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OYEDZGNMSBZCIM-XGEHTFHBSA-N 0.000 description 4
- XVAUJOAYHWWNQF-ZLUOBGJFSA-N Ser-Asn-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O XVAUJOAYHWWNQF-ZLUOBGJFSA-N 0.000 description 4
- ZXLUWXWISXIFIX-ACZMJKKPSA-N Ser-Asn-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZXLUWXWISXIFIX-ACZMJKKPSA-N 0.000 description 4
- CNIIKZQXBBQHCX-FXQIFTODSA-N Ser-Asp-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O CNIIKZQXBBQHCX-FXQIFTODSA-N 0.000 description 4
- BYIROAKULFFTEK-CIUDSAMLSA-N Ser-Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CO BYIROAKULFFTEK-CIUDSAMLSA-N 0.000 description 4
- OQPNSDWGAMFJNU-QWRGUYRKSA-N Ser-Gly-Tyr Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 OQPNSDWGAMFJNU-QWRGUYRKSA-N 0.000 description 4
- ZOPISOXXPQNOCO-SVSWQMSJSA-N Ser-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CO)N ZOPISOXXPQNOCO-SVSWQMSJSA-N 0.000 description 4
- UBRMZSHOOIVJPW-SRVKXCTJSA-N Ser-Leu-Lys Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O UBRMZSHOOIVJPW-SRVKXCTJSA-N 0.000 description 4
- YUJLIIRMIAGMCQ-CIUDSAMLSA-N Ser-Leu-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YUJLIIRMIAGMCQ-CIUDSAMLSA-N 0.000 description 4
- LRWBCWGEUCKDTN-BJDJZHNGSA-N Ser-Lys-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LRWBCWGEUCKDTN-BJDJZHNGSA-N 0.000 description 4
- VIIJCAQMJBHSJH-FXQIFTODSA-N Ser-Met-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(O)=O VIIJCAQMJBHSJH-FXQIFTODSA-N 0.000 description 4
- BUYHXYIUQUBEQP-AVGNSLFASA-N Ser-Phe-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CO)N BUYHXYIUQUBEQP-AVGNSLFASA-N 0.000 description 4
- MQUZANJDFOQOBX-SRVKXCTJSA-N Ser-Phe-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O MQUZANJDFOQOBX-SRVKXCTJSA-N 0.000 description 4
- SZRNDHWMVSFPSP-XKBZYTNZSA-N Ser-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CO)N)O SZRNDHWMVSFPSP-XKBZYTNZSA-N 0.000 description 4
- ZWSZBWAFDZRBNM-UBHSHLNASA-N Ser-Trp-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(O)=O ZWSZBWAFDZRBNM-UBHSHLNASA-N 0.000 description 4
- LGIMRDKGABDMBN-DCAQKATOSA-N Ser-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N LGIMRDKGABDMBN-DCAQKATOSA-N 0.000 description 4
- DGDCHPCRMWEOJR-FQPOAREZSA-N Thr-Ala-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 DGDCHPCRMWEOJR-FQPOAREZSA-N 0.000 description 4
- QQWNRERCGGZOKG-WEDXCCLWSA-N Thr-Gly-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O QQWNRERCGGZOKG-WEDXCCLWSA-N 0.000 description 4
- MSIYNSBKKVMGFO-BHNWBGBOSA-N Thr-Gly-Pro Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N1CCC[C@@H]1C(=O)O)N)O MSIYNSBKKVMGFO-BHNWBGBOSA-N 0.000 description 4
- XOWKUMFHEZLKLT-CIQUZCHMSA-N Thr-Ile-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O XOWKUMFHEZLKLT-CIQUZCHMSA-N 0.000 description 4
- ZBKDBZUTTXINIX-RWRJDSDZSA-N Thr-Ile-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZBKDBZUTTXINIX-RWRJDSDZSA-N 0.000 description 4
- ADPHPKGWVDHWML-PPCPHDFISA-N Thr-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N ADPHPKGWVDHWML-PPCPHDFISA-N 0.000 description 4
- FQPDRTDDEZXCEC-SVSWQMSJSA-N Thr-Ile-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O FQPDRTDDEZXCEC-SVSWQMSJSA-N 0.000 description 4
- PRNGXSILMXSWQQ-OEAJRASXSA-N Thr-Leu-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PRNGXSILMXSWQQ-OEAJRASXSA-N 0.000 description 4
- CJXURNZYNHCYFD-WDCWCFNPSA-N Thr-Lys-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O CJXURNZYNHCYFD-WDCWCFNPSA-N 0.000 description 4
- VEIKMWOMUYMMMK-FCLVOEFKSA-N Thr-Phe-Phe Chemical compound C([C@H](NC(=O)[C@@H](N)[C@H](O)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 VEIKMWOMUYMMMK-FCLVOEFKSA-N 0.000 description 4
- FWTFAZKJORVTIR-VZFHVOOUSA-N Thr-Ser-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O FWTFAZKJORVTIR-VZFHVOOUSA-N 0.000 description 4
- NDZYTIMDOZMECO-SHGPDSBTSA-N Thr-Thr-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O NDZYTIMDOZMECO-SHGPDSBTSA-N 0.000 description 4
- BBPCSGKKPJUYRB-UVOCVTCTSA-N Thr-Thr-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O BBPCSGKKPJUYRB-UVOCVTCTSA-N 0.000 description 4
- UMFLBPIPAJMNIM-LYARXQMPSA-N Thr-Trp-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC3=CC=CC=C3)C(=O)O)N)O UMFLBPIPAJMNIM-LYARXQMPSA-N 0.000 description 4
- PNKDNKGMEHJTJQ-BPUTZDHNSA-N Trp-Arg-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N PNKDNKGMEHJTJQ-BPUTZDHNSA-N 0.000 description 4
- RNFZZCMCRDFNAE-WFBYXXMGSA-N Trp-Asn-Ala Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O RNFZZCMCRDFNAE-WFBYXXMGSA-N 0.000 description 4
- CZWIHKFGHICAJX-BPUTZDHNSA-N Trp-Glu-Glu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O)=CNC2=C1 CZWIHKFGHICAJX-BPUTZDHNSA-N 0.000 description 4
- JVTHMUDOKPQBOT-NSHDSACASA-N Trp-Gly-Gly Chemical compound C1=CC=C2C(C[C@H]([NH3+])C(=O)NCC(=O)NCC([O-])=O)=CNC2=C1 JVTHMUDOKPQBOT-NSHDSACASA-N 0.000 description 4
- UKWSFUSPGPBJGU-VFAJRCTISA-N Trp-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O UKWSFUSPGPBJGU-VFAJRCTISA-N 0.000 description 4
- OFTGYORHQMSPAI-PJODQICGSA-N Trp-Met-Ala Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O OFTGYORHQMSPAI-PJODQICGSA-N 0.000 description 4
- DKKHULUSOSWGHS-UWJYBYFXSA-N Tyr-Asn-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N DKKHULUSOSWGHS-UWJYBYFXSA-N 0.000 description 4
- NQJDICVXXIMMMB-XDTLVQLUSA-N Tyr-Glu-Ala Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O NQJDICVXXIMMMB-XDTLVQLUSA-N 0.000 description 4
- CDHQEOXPWBDFPL-QWRGUYRKSA-N Tyr-Gly-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=C(O)C=C1 CDHQEOXPWBDFPL-QWRGUYRKSA-N 0.000 description 4
- SYFHQHYTNCQCCN-MELADBBJSA-N Tyr-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)C(=O)O SYFHQHYTNCQCCN-MELADBBJSA-N 0.000 description 4
- LNYOXPDEIZJDEI-NHCYSSNCSA-N Val-Asn-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](C(C)C)N LNYOXPDEIZJDEI-NHCYSSNCSA-N 0.000 description 4
- ISERLACIZUGCDX-ZKWXMUAHSA-N Val-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N ISERLACIZUGCDX-ZKWXMUAHSA-N 0.000 description 4
- VLOYGOZDPGYWFO-LAEOZQHASA-N Val-Asp-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O VLOYGOZDPGYWFO-LAEOZQHASA-N 0.000 description 4
- ZQGPWORGSNRQLN-NHCYSSNCSA-N Val-Asp-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N ZQGPWORGSNRQLN-NHCYSSNCSA-N 0.000 description 4
- LHADRQBREKTRLR-DCAQKATOSA-N Val-Cys-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](C(C)C)N LHADRQBREKTRLR-DCAQKATOSA-N 0.000 description 4
- QHFQQRKNGCXTHL-AUTRQRHGSA-N Val-Gln-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O QHFQQRKNGCXTHL-AUTRQRHGSA-N 0.000 description 4
- ZXAGTABZUOMUDO-GVXVVHGQSA-N Val-Glu-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N ZXAGTABZUOMUDO-GVXVVHGQSA-N 0.000 description 4
- SYOMXKPPFZRELL-ONGXEEELSA-N Val-Gly-Lys Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N SYOMXKPPFZRELL-ONGXEEELSA-N 0.000 description 4
- XTDDIVQWDXMRJL-IHRRRGAJSA-N Val-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](C(C)C)N XTDDIVQWDXMRJL-IHRRRGAJSA-N 0.000 description 4
- DOBHJKVVACOQTN-DZKIICNBSA-N Val-Tyr-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC1=CC=C(O)C=C1 DOBHJKVVACOQTN-DZKIICNBSA-N 0.000 description 4
- RLVTVHSDKHBFQP-ULQDDVLXSA-N Val-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC1=CC=C(O)C=C1 RLVTVHSDKHBFQP-ULQDDVLXSA-N 0.000 description 4
- 238000007792 addition Methods 0.000 description 4
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 4
- 108010070944 alanylhistidine Proteins 0.000 description 4
- 125000000539 amino acid group Chemical group 0.000 description 4
- 238000004458 analytical method Methods 0.000 description 4
- 230000003197 catalytic effect Effects 0.000 description 4
- 238000004113 cell culture Methods 0.000 description 4
- 239000003795 chemical substances by application Substances 0.000 description 4
- 210000000349 chromosome Anatomy 0.000 description 4
- 235000018417 cysteine Nutrition 0.000 description 4
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 4
- 239000000539 dimer Substances 0.000 description 4
- 230000002068 genetic effect Effects 0.000 description 4
- 108010078144 glutaminyl-glycine Proteins 0.000 description 4
- 108010089804 glycyl-threonine Proteins 0.000 description 4
- 108010010147 glycylglutamine Proteins 0.000 description 4
- 108010077515 glycylproline Proteins 0.000 description 4
- 239000003112 inhibitor Substances 0.000 description 4
- 108010027338 isoleucylcysteine Proteins 0.000 description 4
- 108010078274 isoleucylvaline Proteins 0.000 description 4
- 108010091871 leucylmethionine Proteins 0.000 description 4
- 239000002609 medium Substances 0.000 description 4
- 108010005942 methionylglycine Proteins 0.000 description 4
- 230000011987 methylation Effects 0.000 description 4
- 238000007069 methylation reaction Methods 0.000 description 4
- 230000004048 modification Effects 0.000 description 4
- 238000012986 modification Methods 0.000 description 4
- 238000005457 optimization Methods 0.000 description 4
- 239000013612 plasmid Substances 0.000 description 4
- 238000012163 sequencing technique Methods 0.000 description 4
- 239000000126 substance Substances 0.000 description 4
- 238000013519 translation Methods 0.000 description 4
- 108010084932 tryptophyl-proline Proteins 0.000 description 4
- XVZCXCTYGHPNEM-IHRRRGAJSA-N (2s)-1-[(2s)-2-[[(2s)-2-amino-4-methylpentanoyl]amino]-4-methylpentanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(O)=O XVZCXCTYGHPNEM-IHRRRGAJSA-N 0.000 description 3
- 102000012758 APOBEC-1 Deaminase Human genes 0.000 description 3
- NIZKGBJVCMRDKO-KWQFWETISA-N Ala-Gly-Tyr Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NIZKGBJVCMRDKO-KWQFWETISA-N 0.000 description 3
- NYDBKUNVSALYPX-NAKRPEOUSA-N Ala-Ile-Arg Chemical compound C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CCCN=C(N)N NYDBKUNVSALYPX-NAKRPEOUSA-N 0.000 description 3
- LBYMZCVBOKYZNS-CIUDSAMLSA-N Ala-Leu-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O LBYMZCVBOKYZNS-CIUDSAMLSA-N 0.000 description 3
- MDNAVFBZPROEHO-UHFFFAOYSA-N Ala-Lys-Val Natural products CC(C)C(C(O)=O)NC(=O)C(NC(=O)C(C)N)CCCCN MDNAVFBZPROEHO-UHFFFAOYSA-N 0.000 description 3
- IORKCNUBHNIMKY-CIUDSAMLSA-N Ala-Pro-Glu Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O IORKCNUBHNIMKY-CIUDSAMLSA-N 0.000 description 3
- GIVATXIGCXFQQA-FXQIFTODSA-N Arg-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N GIVATXIGCXFQQA-FXQIFTODSA-N 0.000 description 3
- KWTVWJPNHAOREN-IHRRRGAJSA-N Arg-Asn-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O KWTVWJPNHAOREN-IHRRRGAJSA-N 0.000 description 3
- IIABBYGHLYWVOS-FXQIFTODSA-N Arg-Asn-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O IIABBYGHLYWVOS-FXQIFTODSA-N 0.000 description 3
- ZTKHZAXGTFXUDD-VEVYYDQMSA-N Arg-Asn-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZTKHZAXGTFXUDD-VEVYYDQMSA-N 0.000 description 3
- OTCJMMRQBVDQRK-DCAQKATOSA-N Arg-Asp-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O OTCJMMRQBVDQRK-DCAQKATOSA-N 0.000 description 3
- AUFHLLPVPSMEOG-YUMQZZPRSA-N Arg-Gly-Glu Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AUFHLLPVPSMEOG-YUMQZZPRSA-N 0.000 description 3
- LKDHUGLXOHYINY-XUXIUFHCSA-N Arg-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N LKDHUGLXOHYINY-XUXIUFHCSA-N 0.000 description 3
- BTJVOUQWFXABOI-IHRRRGAJSA-N Arg-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCNC(N)=N BTJVOUQWFXABOI-IHRRRGAJSA-N 0.000 description 3
- WCZXPVPHUMYLMS-VEVYYDQMSA-N Arg-Thr-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O WCZXPVPHUMYLMS-VEVYYDQMSA-N 0.000 description 3
- LFWOQHSQNCKXRU-UFYCRDLUSA-N Arg-Tyr-Phe Chemical compound C([C@H](NC(=O)[C@H](CCCN=C(N)N)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 LFWOQHSQNCKXRU-UFYCRDLUSA-N 0.000 description 3
- QCTOLCVIGRLMQS-HRCADAONSA-N Arg-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O QCTOLCVIGRLMQS-HRCADAONSA-N 0.000 description 3
- CMLGVVWQQHUXOZ-GHCJXIJMSA-N Asn-Ala-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O CMLGVVWQQHUXOZ-GHCJXIJMSA-N 0.000 description 3
- HOIFSHOLNKQCSA-FXQIFTODSA-N Asn-Arg-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O HOIFSHOLNKQCSA-FXQIFTODSA-N 0.000 description 3
- BDMIFVIWCNLDCT-CIUDSAMLSA-N Asn-Arg-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O BDMIFVIWCNLDCT-CIUDSAMLSA-N 0.000 description 3
- WPOLSNAQGVHROR-GUBZILKMSA-N Asn-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)N)N WPOLSNAQGVHROR-GUBZILKMSA-N 0.000 description 3
- XVAPVJNJGLWGCS-ACZMJKKPSA-N Asn-Glu-Asn Chemical compound C(CC(=O)O)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N XVAPVJNJGLWGCS-ACZMJKKPSA-N 0.000 description 3
- CTQIOCMSIJATNX-WHFBIAKZSA-N Asn-Gly-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](C)C(O)=O CTQIOCMSIJATNX-WHFBIAKZSA-N 0.000 description 3
- OPEPUCYIGFEGSW-WDSKDSINSA-N Asn-Gly-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O OPEPUCYIGFEGSW-WDSKDSINSA-N 0.000 description 3
- MYCSPQIARXTUTP-SRVKXCTJSA-N Asn-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC(=O)N)N MYCSPQIARXTUTP-SRVKXCTJSA-N 0.000 description 3
- WXVGISRWSYGEDK-KKUMJFAQSA-N Asn-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)N)N WXVGISRWSYGEDK-KKUMJFAQSA-N 0.000 description 3
- RTFWCVDISAMGEQ-SRVKXCTJSA-N Asn-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N RTFWCVDISAMGEQ-SRVKXCTJSA-N 0.000 description 3
- MVXJBVVLACEGCG-PCBIJLKTSA-N Asn-Phe-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MVXJBVVLACEGCG-PCBIJLKTSA-N 0.000 description 3
- OSZBYGVKAFZWKC-FXQIFTODSA-N Asn-Pro-Cys Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CS)C(O)=O OSZBYGVKAFZWKC-FXQIFTODSA-N 0.000 description 3
- YUOXLJYVSZYPBJ-CIUDSAMLSA-N Asn-Pro-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O YUOXLJYVSZYPBJ-CIUDSAMLSA-N 0.000 description 3
- GMUOCGCDOYYWPD-FXQIFTODSA-N Asn-Pro-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O GMUOCGCDOYYWPD-FXQIFTODSA-N 0.000 description 3
- SUIJFTJDTJKSRK-IHRRRGAJSA-N Asn-Pro-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 SUIJFTJDTJKSRK-IHRRRGAJSA-N 0.000 description 3
- NJPLPRFQLBZAMH-IHRRRGAJSA-N Asn-Tyr-Met Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCSC)C(O)=O NJPLPRFQLBZAMH-IHRRRGAJSA-N 0.000 description 3
- CASGONAXMZPHCK-FXQIFTODSA-N Asp-Asn-Arg Chemical compound C(C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N)CN=C(N)N CASGONAXMZPHCK-FXQIFTODSA-N 0.000 description 3
- JDHOJQJMWBKHDB-CIUDSAMLSA-N Asp-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N JDHOJQJMWBKHDB-CIUDSAMLSA-N 0.000 description 3
- UGIBTKGQVWFTGX-BIIVOSGPSA-N Asp-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N)C(=O)O UGIBTKGQVWFTGX-BIIVOSGPSA-N 0.000 description 3
- QXHVOUSPVAWEMX-ZLUOBGJFSA-N Asp-Asp-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O QXHVOUSPVAWEMX-ZLUOBGJFSA-N 0.000 description 3
- UFAQGGZUXVLONR-AVGNSLFASA-N Asp-Gln-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)O)N)O UFAQGGZUXVLONR-AVGNSLFASA-N 0.000 description 3
- DGKCOYGQLNWNCJ-ACZMJKKPSA-N Asp-Glu-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O DGKCOYGQLNWNCJ-ACZMJKKPSA-N 0.000 description 3
- PSLSTUMPZILTAH-BYULHYEWSA-N Asp-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O PSLSTUMPZILTAH-BYULHYEWSA-N 0.000 description 3
- HKEZZWQWXWGASX-KKUMJFAQSA-N Asp-Leu-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 HKEZZWQWXWGASX-KKUMJFAQSA-N 0.000 description 3
- UMHUHHJMEXNSIV-CIUDSAMLSA-N Asp-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UMHUHHJMEXNSIV-CIUDSAMLSA-N 0.000 description 3
- UZFHNLYQWMGUHU-DCAQKATOSA-N Asp-Lys-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O UZFHNLYQWMGUHU-DCAQKATOSA-N 0.000 description 3
- VSMYBNPOHYAXSD-GUBZILKMSA-N Asp-Lys-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O VSMYBNPOHYAXSD-GUBZILKMSA-N 0.000 description 3
- MYLZFUMPZCPJCJ-NHCYSSNCSA-N Asp-Lys-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O MYLZFUMPZCPJCJ-NHCYSSNCSA-N 0.000 description 3
- SJLDOGLMVPHPLZ-IHRRRGAJSA-N Asp-Met-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 SJLDOGLMVPHPLZ-IHRRRGAJSA-N 0.000 description 3
- HCOQNGIHSXICCB-IHRRRGAJSA-N Asp-Tyr-Arg Chemical compound N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)O HCOQNGIHSXICCB-IHRRRGAJSA-N 0.000 description 3
- 241000182988 Assa Species 0.000 description 3
- KABHAOSDMIYXTR-GUBZILKMSA-N Cys-Glu-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CS)N KABHAOSDMIYXTR-GUBZILKMSA-N 0.000 description 3
- GCDLPNRHPWBKJJ-WDSKDSINSA-N Cys-Gly-Glu Chemical compound [H]N[C@@H](CS)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O GCDLPNRHPWBKJJ-WDSKDSINSA-N 0.000 description 3
- CAXGCBSRJLADPD-FXQIFTODSA-N Cys-Pro-Asn Chemical compound [H]N[C@@H](CS)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O CAXGCBSRJLADPD-FXQIFTODSA-N 0.000 description 3
- 238000001712 DNA sequencing Methods 0.000 description 3
- 101000860092 Francisella tularensis subsp. novicida (strain U112) CRISPR-associated endonuclease Cas12a Proteins 0.000 description 3
- ZPDVKYLJTOFQJV-WDSKDSINSA-N Gln-Asn-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O ZPDVKYLJTOFQJV-WDSKDSINSA-N 0.000 description 3
- ZNTDJIMJKNNSLR-RWRJDSDZSA-N Gln-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N ZNTDJIMJKNNSLR-RWRJDSDZSA-N 0.000 description 3
- LGIKBBLQVSWUGK-DCAQKATOSA-N Gln-Leu-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O LGIKBBLQVSWUGK-DCAQKATOSA-N 0.000 description 3
- XZLLTYBONVKGLO-SDDRHHMPSA-N Gln-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)N)N)C(=O)O XZLLTYBONVKGLO-SDDRHHMPSA-N 0.000 description 3
- JUUNNOLZGVYCJT-JYJNAYRXSA-N Gln-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N JUUNNOLZGVYCJT-JYJNAYRXSA-N 0.000 description 3
- DBNLXHGDGBUCDV-KKUMJFAQSA-N Gln-Phe-Met Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(O)=O DBNLXHGDGBUCDV-KKUMJFAQSA-N 0.000 description 3
- UBRQJXFDVZNYJP-AVGNSLFASA-N Gln-Tyr-Ser Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O UBRQJXFDVZNYJP-AVGNSLFASA-N 0.000 description 3
- OGMQXTXGLDNBSS-FXQIFTODSA-N Glu-Ala-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O OGMQXTXGLDNBSS-FXQIFTODSA-N 0.000 description 3
- AKJRHDMTEJXTPV-ACZMJKKPSA-N Glu-Asn-Ala Chemical compound C[C@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCC(O)=O)C(O)=O AKJRHDMTEJXTPV-ACZMJKKPSA-N 0.000 description 3
- PAQUJCSYVIBPLC-AVGNSLFASA-N Glu-Asp-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 PAQUJCSYVIBPLC-AVGNSLFASA-N 0.000 description 3
- HUFCEIHAFNVSNR-IHRRRGAJSA-N Glu-Gln-Tyr Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HUFCEIHAFNVSNR-IHRRRGAJSA-N 0.000 description 3
- HNVFSTLPVJWIDV-CIUDSAMLSA-N Glu-Glu-Gln Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O HNVFSTLPVJWIDV-CIUDSAMLSA-N 0.000 description 3
- MUSGDMDGNGXULI-DCAQKATOSA-N Glu-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O MUSGDMDGNGXULI-DCAQKATOSA-N 0.000 description 3
- OGNJZUXUTPQVBR-BQBZGAKWSA-N Glu-Gly-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O OGNJZUXUTPQVBR-BQBZGAKWSA-N 0.000 description 3
- IRXNJYPKBVERCW-DCAQKATOSA-N Glu-Leu-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IRXNJYPKBVERCW-DCAQKATOSA-N 0.000 description 3
- SUIAHERNFYRBDZ-GVXVVHGQSA-N Glu-Lys-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O SUIAHERNFYRBDZ-GVXVVHGQSA-N 0.000 description 3
- PMSMKNYRZCKVMC-DRZSPHRISA-N Glu-Phe-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CCC(=O)O)N PMSMKNYRZCKVMC-DRZSPHRISA-N 0.000 description 3
- BPLNJYHNAJVLRT-ACZMJKKPSA-N Glu-Ser-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O BPLNJYHNAJVLRT-ACZMJKKPSA-N 0.000 description 3
- IDEODOAVGCMUQV-GUBZILKMSA-N Glu-Ser-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O IDEODOAVGCMUQV-GUBZILKMSA-N 0.000 description 3
- GPSHCSTUYOQPAI-JHEQGTHGSA-N Glu-Thr-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O GPSHCSTUYOQPAI-JHEQGTHGSA-N 0.000 description 3
- HVKAAUOFFTUSAA-XDTLVQLUSA-N Glu-Tyr-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O HVKAAUOFFTUSAA-XDTLVQLUSA-N 0.000 description 3
- HHSKZJZWQFPSKN-AVGNSLFASA-N Glu-Tyr-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O HHSKZJZWQFPSKN-AVGNSLFASA-N 0.000 description 3
- HJTSRYLPAYGEEC-SIUGBPQLSA-N Glu-Tyr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CCC(=O)O)N HJTSRYLPAYGEEC-SIUGBPQLSA-N 0.000 description 3
- MFYLRRCYBBJYPI-JYJNAYRXSA-N Glu-Tyr-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O MFYLRRCYBBJYPI-JYJNAYRXSA-N 0.000 description 3
- YPHPEHMXOYTEQG-LAEOZQHASA-N Glu-Val-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O YPHPEHMXOYTEQG-LAEOZQHASA-N 0.000 description 3
- YQPFCZVKMUVZIN-AUTRQRHGSA-N Glu-Val-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O YQPFCZVKMUVZIN-AUTRQRHGSA-N 0.000 description 3
- LZEUDRYSAZAJIO-AUTRQRHGSA-N Glu-Val-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O LZEUDRYSAZAJIO-AUTRQRHGSA-N 0.000 description 3
- VIPDPMHGICREIS-GVXVVHGQSA-N Glu-Val-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O VIPDPMHGICREIS-GVXVVHGQSA-N 0.000 description 3
- OCDLPQDYTJPWNG-YUMQZZPRSA-N Gly-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)CN OCDLPQDYTJPWNG-YUMQZZPRSA-N 0.000 description 3
- LHRXAHLCRMQBGJ-RYUDHWBXSA-N Gly-Glu-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)CN LHRXAHLCRMQBGJ-RYUDHWBXSA-N 0.000 description 3
- JNGJGFMFXREJNF-KBPBESRZSA-N Gly-Glu-Trp Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O JNGJGFMFXREJNF-KBPBESRZSA-N 0.000 description 3
- HQRHFUYMGCHHJS-LURJTMIESA-N Gly-Gly-Arg Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N HQRHFUYMGCHHJS-LURJTMIESA-N 0.000 description 3
- IUZGUFAJDBHQQV-YUMQZZPRSA-N Gly-Leu-Asn Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IUZGUFAJDBHQQV-YUMQZZPRSA-N 0.000 description 3
- GMTXWRIDLGTVFC-IUCAKERBSA-N Gly-Lys-Glu Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O GMTXWRIDLGTVFC-IUCAKERBSA-N 0.000 description 3
- YLEIWGJJBFBFHC-KBPBESRZSA-N Gly-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 YLEIWGJJBFBFHC-KBPBESRZSA-N 0.000 description 3
- WNZOCXUOGVYYBJ-CDMKHQONSA-N Gly-Phe-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)CN)O WNZOCXUOGVYYBJ-CDMKHQONSA-N 0.000 description 3
- SCJJPCQUJYPHRZ-BQBZGAKWSA-N Gly-Pro-Asn Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O SCJJPCQUJYPHRZ-BQBZGAKWSA-N 0.000 description 3
- YABRDIBSPZONIY-BQBZGAKWSA-N Gly-Ser-Met Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O YABRDIBSPZONIY-BQBZGAKWSA-N 0.000 description 3
- ABPRMMYHROQBLY-NKWVEPMBSA-N Gly-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)CN)C(=O)O ABPRMMYHROQBLY-NKWVEPMBSA-N 0.000 description 3
- WCORRBXVISTKQL-WHFBIAKZSA-N Gly-Ser-Ser Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WCORRBXVISTKQL-WHFBIAKZSA-N 0.000 description 3
- LCRDMSSAKLTKBU-ZDLURKLDSA-N Gly-Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)CN LCRDMSSAKLTKBU-ZDLURKLDSA-N 0.000 description 3
- NWOSHVVPKDQKKT-RYUDHWBXSA-N Gly-Tyr-Gln Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O NWOSHVVPKDQKKT-RYUDHWBXSA-N 0.000 description 3
- PNUFMLXHOLFRLD-KBPBESRZSA-N Gly-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 PNUFMLXHOLFRLD-KBPBESRZSA-N 0.000 description 3
- AFMOTCMSEBITOE-YEPSODPASA-N Gly-Val-Thr Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O AFMOTCMSEBITOE-YEPSODPASA-N 0.000 description 3
- 244000068988 Glycine max Species 0.000 description 3
- 235000010469 Glycine max Nutrition 0.000 description 3
- 229940113491 Glycosylase inhibitor Drugs 0.000 description 3
- KZTLOHBDLMIFSH-XVYDVKMFSA-N His-Ala-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O KZTLOHBDLMIFSH-XVYDVKMFSA-N 0.000 description 3
- CHZRWFUGWRTUOD-IUCAKERBSA-N His-Gly-Gln Chemical compound C1=C(NC=N1)C[C@@H](C(=O)NCC(=O)N[C@@H](CCC(=O)N)C(=O)O)N CHZRWFUGWRTUOD-IUCAKERBSA-N 0.000 description 3
- MPXGJGBXCRQQJE-MXAVVETBSA-N His-Ile-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O MPXGJGBXCRQQJE-MXAVVETBSA-N 0.000 description 3
- UXSATKFPUVZVDK-KKUMJFAQSA-N His-Lys-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CN=CN1)N UXSATKFPUVZVDK-KKUMJFAQSA-N 0.000 description 3
- SYPULFZAGBBIOM-GVXVVHGQSA-N His-Val-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N SYPULFZAGBBIOM-GVXVVHGQSA-N 0.000 description 3
- GBMSSORHVHAYLU-QTKMDUPCSA-N His-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CC1=CN=CN1)N)O GBMSSORHVHAYLU-QTKMDUPCSA-N 0.000 description 3
- 108010033040 Histones Proteins 0.000 description 3
- 101000964378 Homo sapiens DNA dC->dU-editing enzyme APOBEC-3A Proteins 0.000 description 3
- VSZALHITQINTGC-GHCJXIJMSA-N Ile-Ala-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(=O)O)C(=O)O)N VSZALHITQINTGC-GHCJXIJMSA-N 0.000 description 3
- RWIKBYVJQAJYDP-BJDJZHNGSA-N Ile-Ala-Lys Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN RWIKBYVJQAJYDP-BJDJZHNGSA-N 0.000 description 3
- SACHLUOUHCVIKI-GMOBBJLQSA-N Ile-Arg-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N SACHLUOUHCVIKI-GMOBBJLQSA-N 0.000 description 3
- SCHZQZPYHBWYEQ-PEFMBERDSA-N Ile-Asn-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N SCHZQZPYHBWYEQ-PEFMBERDSA-N 0.000 description 3
- FJWYJQRCVNGEAQ-ZPFDUUQYSA-N Ile-Asn-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N FJWYJQRCVNGEAQ-ZPFDUUQYSA-N 0.000 description 3
- QYOGJYIRKACXEP-SLBDDTMCSA-N Ile-Asn-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N QYOGJYIRKACXEP-SLBDDTMCSA-N 0.000 description 3
- HGNUKGZQASSBKQ-PCBIJLKTSA-N Ile-Asp-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N HGNUKGZQASSBKQ-PCBIJLKTSA-N 0.000 description 3
- DMZOUKXXHJQPTL-GRLWGSQLSA-N Ile-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N DMZOUKXXHJQPTL-GRLWGSQLSA-N 0.000 description 3
- PHIXPNQDGGILMP-YVNDNENWSA-N Ile-Glu-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N PHIXPNQDGGILMP-YVNDNENWSA-N 0.000 description 3
- CDGLBYSAZFIIJO-RCOVLWMOSA-N Ile-Gly-Gly Chemical compound CC[C@H](C)[C@H]([NH3+])C(=O)NCC(=O)NCC([O-])=O CDGLBYSAZFIIJO-RCOVLWMOSA-N 0.000 description 3
- SVBAHOMTJRFSIC-SXTJYALSSA-N Ile-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(=O)N)C(=O)O)N SVBAHOMTJRFSIC-SXTJYALSSA-N 0.000 description 3
- SJLVSMMIFYTSGY-GRLWGSQLSA-N Ile-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N SJLVSMMIFYTSGY-GRLWGSQLSA-N 0.000 description 3
- BBQABUDWDUKJMB-LZXPERKUSA-N Ile-Ile-Ile Chemical compound CC[C@H](C)[C@H]([NH3+])C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C([O-])=O BBQABUDWDUKJMB-LZXPERKUSA-N 0.000 description 3
- PKGGWLOLRLOPGK-XUXIUFHCSA-N Ile-Leu-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PKGGWLOLRLOPGK-XUXIUFHCSA-N 0.000 description 3
- TWYOYAKMLHWMOJ-ZPFDUUQYSA-N Ile-Leu-Asn Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O TWYOYAKMLHWMOJ-ZPFDUUQYSA-N 0.000 description 3
- FZWVCYCYWCLQDH-NHCYSSNCSA-N Ile-Leu-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)O)N FZWVCYCYWCLQDH-NHCYSSNCSA-N 0.000 description 3
- PARSHQDZROHERM-NHCYSSNCSA-N Ile-Lys-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)NCC(=O)O)N PARSHQDZROHERM-NHCYSSNCSA-N 0.000 description 3
- SAVXZJYTTQQQDD-QEWYBTABSA-N Ile-Phe-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N SAVXZJYTTQQQDD-QEWYBTABSA-N 0.000 description 3
- XQLGNKLSPYCRMZ-HJWJTTGWSA-N Ile-Phe-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(=O)O)N XQLGNKLSPYCRMZ-HJWJTTGWSA-N 0.000 description 3
- PXKACEXYLPBMAD-JBDRJPRFSA-N Ile-Ser-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)O)N PXKACEXYLPBMAD-JBDRJPRFSA-N 0.000 description 3
- FXJLRZFMKGHYJP-CFMVVWHZSA-N Ile-Tyr-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N FXJLRZFMKGHYJP-CFMVVWHZSA-N 0.000 description 3
- OMDWJWGZGMCQND-CFMVVWHZSA-N Ile-Tyr-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N OMDWJWGZGMCQND-CFMVVWHZSA-N 0.000 description 3
- 229930010555 Inosine Natural products 0.000 description 3
- UGQMRVRMYYASKQ-KQYNXXCUSA-N Inosine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C2=NC=NC(O)=C2N=C1 UGQMRVRMYYASKQ-KQYNXXCUSA-N 0.000 description 3
- 241001112693 Lachnospiraceae Species 0.000 description 3
- ZRLUISBDKUWAIZ-CIUDSAMLSA-N Leu-Ala-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O ZRLUISBDKUWAIZ-CIUDSAMLSA-N 0.000 description 3
- SUPVSFFZWVOEOI-CQDKDKBSSA-N Leu-Ala-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 SUPVSFFZWVOEOI-CQDKDKBSSA-N 0.000 description 3
- YOZCKMXHBYKOMQ-IHRRRGAJSA-N Leu-Arg-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N YOZCKMXHBYKOMQ-IHRRRGAJSA-N 0.000 description 3
- QLQHWWCSCLZUMA-KKUMJFAQSA-N Leu-Asp-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 QLQHWWCSCLZUMA-KKUMJFAQSA-N 0.000 description 3
- ZTLGVASZOIKNIX-DCAQKATOSA-N Leu-Gln-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZTLGVASZOIKNIX-DCAQKATOSA-N 0.000 description 3
- YVKSMSDXKMSIRX-GUBZILKMSA-N Leu-Glu-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O YVKSMSDXKMSIRX-GUBZILKMSA-N 0.000 description 3
- KVMULWOHPPMHHE-DCAQKATOSA-N Leu-Glu-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O KVMULWOHPPMHHE-DCAQKATOSA-N 0.000 description 3
- HPBCTWSUJOGJSH-MNXVOIDGSA-N Leu-Glu-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HPBCTWSUJOGJSH-MNXVOIDGSA-N 0.000 description 3
- HVJVUYQWFYMGJS-GVXVVHGQSA-N Leu-Glu-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O HVJVUYQWFYMGJS-GVXVVHGQSA-N 0.000 description 3
- IAJFFZORSWOZPQ-SRVKXCTJSA-N Leu-Leu-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IAJFFZORSWOZPQ-SRVKXCTJSA-N 0.000 description 3
- VCHVSKNMTXWIIP-SRVKXCTJSA-N Leu-Lys-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O VCHVSKNMTXWIIP-SRVKXCTJSA-N 0.000 description 3
- AIRUUHAOKGVJAD-JYJNAYRXSA-N Leu-Phe-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O AIRUUHAOKGVJAD-JYJNAYRXSA-N 0.000 description 3
- VULJUQZPSOASBZ-SRVKXCTJSA-N Leu-Pro-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O VULJUQZPSOASBZ-SRVKXCTJSA-N 0.000 description 3
- RGUXWMDNCPMQFB-YUMQZZPRSA-N Leu-Ser-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RGUXWMDNCPMQFB-YUMQZZPRSA-N 0.000 description 3
- ODRREERHVHMIPT-OEAJRASXSA-N Leu-Thr-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ODRREERHVHMIPT-OEAJRASXSA-N 0.000 description 3
- HQBOMRTVKVKFMN-WDSOQIARSA-N Leu-Trp-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C(C)C)C(O)=O HQBOMRTVKVKFMN-WDSOQIARSA-N 0.000 description 3
- JGKHAFUAPZCCDU-BZSNNMDCSA-N Leu-Tyr-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=C(O)C=C1 JGKHAFUAPZCCDU-BZSNNMDCSA-N 0.000 description 3
- XZNJZXJZBMBGGS-NHCYSSNCSA-N Leu-Val-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O XZNJZXJZBMBGGS-NHCYSSNCSA-N 0.000 description 3
- PNPYKQFJGRFYJE-GUBZILKMSA-N Lys-Ala-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O PNPYKQFJGRFYJE-GUBZILKMSA-N 0.000 description 3
- NFLFJGGKOHYZJF-BJDJZHNGSA-N Lys-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN NFLFJGGKOHYZJF-BJDJZHNGSA-N 0.000 description 3
- IRNSXVOWSXSULE-DCAQKATOSA-N Lys-Ala-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN IRNSXVOWSXSULE-DCAQKATOSA-N 0.000 description 3
- RZHLIPMZXOEJTL-AVGNSLFASA-N Lys-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCCN)N RZHLIPMZXOEJTL-AVGNSLFASA-N 0.000 description 3
- QQUJSUFWEDZQQY-AVGNSLFASA-N Lys-Gln-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CCCCN QQUJSUFWEDZQQY-AVGNSLFASA-N 0.000 description 3
- ZXEUFAVXODIPHC-GUBZILKMSA-N Lys-Glu-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ZXEUFAVXODIPHC-GUBZILKMSA-N 0.000 description 3
- ULUQBUKAPDUKOC-GVXVVHGQSA-N Lys-Glu-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O ULUQBUKAPDUKOC-GVXVVHGQSA-N 0.000 description 3
- GPJGFSFYBJGYRX-YUMQZZPRSA-N Lys-Gly-Asp Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O GPJGFSFYBJGYRX-YUMQZZPRSA-N 0.000 description 3
- GQFDWEDHOQRNLC-QWRGUYRKSA-N Lys-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN GQFDWEDHOQRNLC-QWRGUYRKSA-N 0.000 description 3
- NNKLKUUGESXCBS-KBPBESRZSA-N Lys-Gly-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O NNKLKUUGESXCBS-KBPBESRZSA-N 0.000 description 3
- ZXFRGTAIIZHNHG-AJNGGQMLSA-N Lys-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CCCCN)N ZXFRGTAIIZHNHG-AJNGGQMLSA-N 0.000 description 3
- SKRGVGLIRUGANF-AVGNSLFASA-N Lys-Leu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SKRGVGLIRUGANF-AVGNSLFASA-N 0.000 description 3
- YPLVCBKEPJPBDQ-MELADBBJSA-N Lys-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N YPLVCBKEPJPBDQ-MELADBBJSA-N 0.000 description 3
- RIJCHEVHFWMDKD-SRVKXCTJSA-N Lys-Lys-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O RIJCHEVHFWMDKD-SRVKXCTJSA-N 0.000 description 3
- YUAXTFMFMOIMAM-QWRGUYRKSA-N Lys-Lys-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O YUAXTFMFMOIMAM-QWRGUYRKSA-N 0.000 description 3
- HVAUKHLDSDDROB-KKUMJFAQSA-N Lys-Lys-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O HVAUKHLDSDDROB-KKUMJFAQSA-N 0.000 description 3
- ATNKHRAIZCMCCN-BZSNNMDCSA-N Lys-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N ATNKHRAIZCMCCN-BZSNNMDCSA-N 0.000 description 3
- YDDDRTIPNTWGIG-SRVKXCTJSA-N Lys-Lys-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O YDDDRTIPNTWGIG-SRVKXCTJSA-N 0.000 description 3
- DYJOORGDQIGZAS-DCAQKATOSA-N Lys-Ser-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCCCN)N DYJOORGDQIGZAS-DCAQKATOSA-N 0.000 description 3
- RPWTZTBIFGENIA-VOAKCMCISA-N Lys-Thr-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O RPWTZTBIFGENIA-VOAKCMCISA-N 0.000 description 3
- UGCIQUYEJIEHKX-GVXVVHGQSA-N Lys-Val-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O UGCIQUYEJIEHKX-GVXVVHGQSA-N 0.000 description 3
- NYTDJEZBAAFLLG-IHRRRGAJSA-N Lys-Val-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(O)=O NYTDJEZBAAFLLG-IHRRRGAJSA-N 0.000 description 3
- OSZTUONKUMCWEP-XUXIUFHCSA-N Met-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CCSC OSZTUONKUMCWEP-XUXIUFHCSA-N 0.000 description 3
- SODXFJOPSCXOHE-IHRRRGAJSA-N Met-Leu-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O SODXFJOPSCXOHE-IHRRRGAJSA-N 0.000 description 3
- AWGBEIYZPAXXSX-RWMBFGLXSA-N Met-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCSC)N AWGBEIYZPAXXSX-RWMBFGLXSA-N 0.000 description 3
- YYEIFXZOBZVDPH-DCAQKATOSA-N Met-Lys-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O YYEIFXZOBZVDPH-DCAQKATOSA-N 0.000 description 3
- KBTQZYASLSUFJR-KKUMJFAQSA-N Met-Phe-Gln Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N KBTQZYASLSUFJR-KKUMJFAQSA-N 0.000 description 3
- JQHYVIKEFYETEW-IHRRRGAJSA-N Met-Phe-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=CC=C1 JQHYVIKEFYETEW-IHRRRGAJSA-N 0.000 description 3
- MUDYEFAKNSTFAI-JYJNAYRXSA-N Met-Tyr-Val Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O MUDYEFAKNSTFAI-JYJNAYRXSA-N 0.000 description 3
- 108060004795 Methyltransferase Proteins 0.000 description 3
- BQVUABVGYYSDCJ-UHFFFAOYSA-N Nalpha-L-Leucyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)CC(C)C)C(O)=O)=CNC2=C1 BQVUABVGYYSDCJ-UHFFFAOYSA-N 0.000 description 3
- 108020004711 Nucleic Acid Probes Proteins 0.000 description 3
- YMORXCKTSSGYIG-IHRRRGAJSA-N Phe-Arg-Cys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CS)C(=O)O)N YMORXCKTSSGYIG-IHRRRGAJSA-N 0.000 description 3
- YYRCPTVAPLQRNC-ULQDDVLXSA-N Phe-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)CC1=CC=CC=C1 YYRCPTVAPLQRNC-ULQDDVLXSA-N 0.000 description 3
- HHOOEUSPFGPZFP-QWRGUYRKSA-N Phe-Asn-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O HHOOEUSPFGPZFP-QWRGUYRKSA-N 0.000 description 3
- UMKYAYXCMYYNHI-AVGNSLFASA-N Phe-Gln-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N UMKYAYXCMYYNHI-AVGNSLFASA-N 0.000 description 3
- RJYBHZVWJPUSLB-QEWYBTABSA-N Phe-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=CC=C1)N RJYBHZVWJPUSLB-QEWYBTABSA-N 0.000 description 3
- FMMIYCMOVGXZIP-AVGNSLFASA-N Phe-Glu-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O FMMIYCMOVGXZIP-AVGNSLFASA-N 0.000 description 3
- FIRWJEJVFFGXSH-RYUDHWBXSA-N Phe-Glu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 FIRWJEJVFFGXSH-RYUDHWBXSA-N 0.000 description 3
- CWFGECHCRMGPPT-MXAVVETBSA-N Phe-Ile-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O CWFGECHCRMGPPT-MXAVVETBSA-N 0.000 description 3
- AUJWXNGCAQWLEI-KBPBESRZSA-N Phe-Lys-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O AUJWXNGCAQWLEI-KBPBESRZSA-N 0.000 description 3
- BNRFQGLWLQESBG-YESZJQIVSA-N Phe-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O BNRFQGLWLQESBG-YESZJQIVSA-N 0.000 description 3
- SCKXGHWQPPURGT-KKUMJFAQSA-N Phe-Lys-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O SCKXGHWQPPURGT-KKUMJFAQSA-N 0.000 description 3
- IWZRODDWOSIXPZ-IRXDYDNUSA-N Phe-Phe-Gly Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)NCC(O)=O)C1=CC=CC=C1 IWZRODDWOSIXPZ-IRXDYDNUSA-N 0.000 description 3
- RBRNEFJTEHPDSL-ACRUOGEOSA-N Phe-Phe-Lys Chemical compound C([C@@H](C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 RBRNEFJTEHPDSL-ACRUOGEOSA-N 0.000 description 3
- WEDZFLRYSIDIRX-IHRRRGAJSA-N Phe-Ser-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=CC=C1 WEDZFLRYSIDIRX-IHRRRGAJSA-N 0.000 description 3
- GCFNFKNPCMBHNT-IRXDYDNUSA-N Phe-Tyr-Gly Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)NCC(=O)O)N GCFNFKNPCMBHNT-IRXDYDNUSA-N 0.000 description 3
- APMXLWHMIVWLLR-BZSNNMDCSA-N Phe-Tyr-Ser Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CO)C(O)=O)C1=CC=CC=C1 APMXLWHMIVWLLR-BZSNNMDCSA-N 0.000 description 3
- FZHBZMDRDASUHN-NAKRPEOUSA-N Pro-Ala-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1)C(O)=O FZHBZMDRDASUHN-NAKRPEOUSA-N 0.000 description 3
- ONPFOYPPPOHMNH-UVBJJODRSA-N Pro-Ala-Trp Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@@H]3CCCN3 ONPFOYPPPOHMNH-UVBJJODRSA-N 0.000 description 3
- OCSACVPBMIYNJE-GUBZILKMSA-N Pro-Arg-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O OCSACVPBMIYNJE-GUBZILKMSA-N 0.000 description 3
- IHCXPSYCHXFXKT-DCAQKATOSA-N Pro-Arg-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O IHCXPSYCHXFXKT-DCAQKATOSA-N 0.000 description 3
- MTHRMUXESFIAMS-DCAQKATOSA-N Pro-Asn-Lys Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O MTHRMUXESFIAMS-DCAQKATOSA-N 0.000 description 3
- WFHYFCWBLSKEMS-KKUMJFAQSA-N Pro-Glu-Phe Chemical compound N([C@@H](CCC(=O)O)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C(=O)[C@@H]1CCCN1 WFHYFCWBLSKEMS-KKUMJFAQSA-N 0.000 description 3
- LXVLKXPFIDDHJG-CIUDSAMLSA-N Pro-Glu-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O LXVLKXPFIDDHJG-CIUDSAMLSA-N 0.000 description 3
- DWGFLKQSGRUQTI-IHRRRGAJSA-N Pro-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H]1CCCN1 DWGFLKQSGRUQTI-IHRRRGAJSA-N 0.000 description 3
- SMFQZMGHCODUPQ-ULQDDVLXSA-N Pro-Lys-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SMFQZMGHCODUPQ-ULQDDVLXSA-N 0.000 description 3
- FNGOXVQBBCMFKV-CIUDSAMLSA-N Pro-Ser-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O FNGOXVQBBCMFKV-CIUDSAMLSA-N 0.000 description 3
- YQHZVYJAGWMHES-ZLUOBGJFSA-N Ser-Ala-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O YQHZVYJAGWMHES-ZLUOBGJFSA-N 0.000 description 3
- JFWDJFULOLKQFY-QWRGUYRKSA-N Ser-Gly-Phe Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JFWDJFULOLKQFY-QWRGUYRKSA-N 0.000 description 3
- VMLONWHIORGALA-SRVKXCTJSA-N Ser-Leu-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]([NH3+])CO VMLONWHIORGALA-SRVKXCTJSA-N 0.000 description 3
- PPNPDKGQRFSCAC-CIUDSAMLSA-N Ser-Lys-Asp Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)CO)C(=O)N[C@@H](CC(O)=O)C(O)=O PPNPDKGQRFSCAC-CIUDSAMLSA-N 0.000 description 3
- XUDRHBPSPAPDJP-SRVKXCTJSA-N Ser-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO XUDRHBPSPAPDJP-SRVKXCTJSA-N 0.000 description 3
- LRZLZIUXQBIWTB-KATARQTJSA-N Ser-Lys-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LRZLZIUXQBIWTB-KATARQTJSA-N 0.000 description 3
- GDUZTEQRAOXYJS-SRVKXCTJSA-N Ser-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CO)N GDUZTEQRAOXYJS-SRVKXCTJSA-N 0.000 description 3
- UGTZYIPOBYXWRW-SRVKXCTJSA-N Ser-Phe-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O UGTZYIPOBYXWRW-SRVKXCTJSA-N 0.000 description 3
- HHJFMHQYEAAOBM-ZLUOBGJFSA-N Ser-Ser-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O HHJFMHQYEAAOBM-ZLUOBGJFSA-N 0.000 description 3
- WUXCHQZLUHBSDJ-LKXGYXEUSA-N Ser-Thr-Asp Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CC(O)=O)C(O)=O WUXCHQZLUHBSDJ-LKXGYXEUSA-N 0.000 description 3
- PURRNJBBXDDWLX-ZDLURKLDSA-N Ser-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CO)N)O PURRNJBBXDDWLX-ZDLURKLDSA-N 0.000 description 3
- GSCVDSBEYVGMJQ-SRVKXCTJSA-N Ser-Tyr-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CO)N)O GSCVDSBEYVGMJQ-SRVKXCTJSA-N 0.000 description 3
- 244000061456 Solanum tuberosum Species 0.000 description 3
- 235000002595 Solanum tuberosum Nutrition 0.000 description 3
- IRKWVRSEQFTGGV-VEVYYDQMSA-N Thr-Asn-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IRKWVRSEQFTGGV-VEVYYDQMSA-N 0.000 description 3
- BNGDYRRHRGOPHX-IFFSRLJSSA-N Thr-Glu-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)[C@@H](C)O)C(O)=O BNGDYRRHRGOPHX-IFFSRLJSSA-N 0.000 description 3
- YOOAQCZYZHGUAZ-KATARQTJSA-N Thr-Leu-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YOOAQCZYZHGUAZ-KATARQTJSA-N 0.000 description 3
- FDQXPJCLVPFKJW-KJEVXHAQSA-N Thr-Met-Tyr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N)O FDQXPJCLVPFKJW-KJEVXHAQSA-N 0.000 description 3
- XHWCDRUPDNSDAZ-XKBZYTNZSA-N Thr-Ser-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N)O XHWCDRUPDNSDAZ-XKBZYTNZSA-N 0.000 description 3
- IEZVHOULSUULHD-XGEHTFHBSA-N Thr-Ser-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O IEZVHOULSUULHD-XGEHTFHBSA-N 0.000 description 3
- VBMOVTMNHWPZJR-SUSMZKCASA-N Thr-Thr-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O VBMOVTMNHWPZJR-SUSMZKCASA-N 0.000 description 3
- VPRHDRKAPYZMHL-SZMVWBNQSA-N Trp-Leu-Glu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O)=CNC2=C1 VPRHDRKAPYZMHL-SZMVWBNQSA-N 0.000 description 3
- GEGYPBOPIGNZIF-CWRNSKLLSA-N Trp-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N)C(=O)O GEGYPBOPIGNZIF-CWRNSKLLSA-N 0.000 description 3
- XLMDWQNAOKLKCP-XDTLVQLUSA-N Tyr-Ala-Gln Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N XLMDWQNAOKLKCP-XDTLVQLUSA-N 0.000 description 3
- WPVGRKLNHJJCEN-BZSNNMDCSA-N Tyr-Asp-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 WPVGRKLNHJJCEN-BZSNNMDCSA-N 0.000 description 3
- AZGZDDNKFFUDEH-QWRGUYRKSA-N Tyr-Gly-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AZGZDDNKFFUDEH-QWRGUYRKSA-N 0.000 description 3
- GITNQBVCEQBDQC-KKUMJFAQSA-N Tyr-Lys-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O GITNQBVCEQBDQC-KKUMJFAQSA-N 0.000 description 3
- VTCKHZJKWQENKX-KBPBESRZSA-N Tyr-Lys-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O VTCKHZJKWQENKX-KBPBESRZSA-N 0.000 description 3
- QHONGSVIVOFKAC-ULQDDVLXSA-N Tyr-Pro-His Chemical compound N[C@@H](Cc1ccc(O)cc1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O QHONGSVIVOFKAC-ULQDDVLXSA-N 0.000 description 3
- GZWPQZDVTBZVEP-BZSNNMDCSA-N Tyr-Tyr-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O GZWPQZDVTBZVEP-BZSNNMDCSA-N 0.000 description 3
- ABSXSJZNRAQDDI-KJEVXHAQSA-N Tyr-Val-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ABSXSJZNRAQDDI-KJEVXHAQSA-N 0.000 description 3
- LTFLDDDGWOVIHY-NAKRPEOUSA-N Val-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C)NC(=O)[C@H](C(C)C)N LTFLDDDGWOVIHY-NAKRPEOUSA-N 0.000 description 3
- JLFKWDAZBRYCGX-ZKWXMUAHSA-N Val-Asn-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N JLFKWDAZBRYCGX-ZKWXMUAHSA-N 0.000 description 3
- IQQYYFPCWKWUHW-YDHLFZDLSA-N Val-Asn-Tyr Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N IQQYYFPCWKWUHW-YDHLFZDLSA-N 0.000 description 3
- QHDXUYOYTPWCSK-RCOVLWMOSA-N Val-Asp-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)NCC(=O)O)N QHDXUYOYTPWCSK-RCOVLWMOSA-N 0.000 description 3
- ZIGZPYJXIWLQFC-QTKMDUPCSA-N Val-His-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](C(C)C)N)O ZIGZPYJXIWLQFC-QTKMDUPCSA-N 0.000 description 3
- RFKJNTRMXGCKFE-FHWLQOOXSA-N Val-Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC(C)C)C(O)=O)=CNC2=C1 RFKJNTRMXGCKFE-FHWLQOOXSA-N 0.000 description 3
- XXWBHOWRARMUOC-NHCYSSNCSA-N Val-Lys-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)N)C(=O)O)N XXWBHOWRARMUOC-NHCYSSNCSA-N 0.000 description 3
- QRVPEKJBBRYISE-XUXIUFHCSA-N Val-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](C(C)C)N QRVPEKJBBRYISE-XUXIUFHCSA-N 0.000 description 3
- YMTOEGGOCHVGEH-IHRRRGAJSA-N Val-Lys-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O YMTOEGGOCHVGEH-IHRRRGAJSA-N 0.000 description 3
- VNGKMNPAENRGDC-JYJNAYRXSA-N Val-Phe-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC1=CC=CC=C1 VNGKMNPAENRGDC-JYJNAYRXSA-N 0.000 description 3
- MIAZWUMFUURQNP-YDHLFZDLSA-N Val-Tyr-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N MIAZWUMFUURQNP-YDHLFZDLSA-N 0.000 description 3
- GUIYPEKUEMQBIK-JSGCOSHPSA-N Val-Tyr-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)NCC(O)=O GUIYPEKUEMQBIK-JSGCOSHPSA-N 0.000 description 3
- 241000700605 Viruses Species 0.000 description 3
- 235000007244 Zea mays Nutrition 0.000 description 3
- 238000009825 accumulation Methods 0.000 description 3
- 230000004075 alteration Effects 0.000 description 3
- 230000000692 anti-sense effect Effects 0.000 description 3
- 108010009111 arginyl-glycyl-glutamic acid Proteins 0.000 description 3
- 230000001413 cellular effect Effects 0.000 description 3
- 230000002759 chromosomal effect Effects 0.000 description 3
- 239000002299 complementary DNA Substances 0.000 description 3
- 230000001419 dependent effect Effects 0.000 description 3
- 230000005782 double-strand break Effects 0.000 description 3
- 210000005069 ears Anatomy 0.000 description 3
- 239000003623 enhancer Substances 0.000 description 3
- 108010028295 histidylhistidine Proteins 0.000 description 3
- 229960003786 inosine Drugs 0.000 description 3
- 108010044311 leucyl-glycyl-glycine Proteins 0.000 description 3
- 108010090333 leucyl-lysyl-proline Proteins 0.000 description 3
- 238000007899 nucleic acid hybridization Methods 0.000 description 3
- 239000002853 nucleic acid probe Substances 0.000 description 3
- 210000004940 nucleus Anatomy 0.000 description 3
- 210000000056 organ Anatomy 0.000 description 3
- 108010074082 phenylalanyl-alanyl-lysine Proteins 0.000 description 3
- 108010018625 phenylalanylarginine Proteins 0.000 description 3
- 210000001236 prokaryotic cell Anatomy 0.000 description 3
- 108010070643 prolylglutamic acid Proteins 0.000 description 3
- 238000011002 quantification Methods 0.000 description 3
- 230000001172 regenerating effect Effects 0.000 description 3
- 230000002441 reversible effect Effects 0.000 description 3
- 108010071207 serylmethionine Proteins 0.000 description 3
- 108010031491 threonyl-lysyl-glutamic acid Proteins 0.000 description 3
- 230000001052 transient effect Effects 0.000 description 3
- 108010080629 tryptophan-leucine Proteins 0.000 description 3
- 238000011144 upstream manufacturing Methods 0.000 description 3
- VOUUHEHYSHWUHG-UWVGGRQHSA-N (2s)-2-[[2-[[2-[[2-[[(2s)-2-[[2-[[2-[(2-aminoacetyl)amino]acetyl]amino]acetyl]amino]-3-hydroxypropanoyl]amino]acetyl]amino]acetyl]amino]acetyl]amino]-3-hydroxypropanoic acid Chemical compound NCC(=O)NCC(=O)NCC(=O)N[C@@H](CO)C(=O)NCC(=O)NCC(=O)NCC(=O)N[C@@H](CO)C(O)=O VOUUHEHYSHWUHG-UWVGGRQHSA-N 0.000 description 2
- KDCGOANMDULRCW-UHFFFAOYSA-N 7H-purine Chemical compound N1=CNC2=NC=NC2=C1 KDCGOANMDULRCW-UHFFFAOYSA-N 0.000 description 2
- 241000589155 Agrobacterium tumefaciens Species 0.000 description 2
- QDRGPQWIVZNJQD-CIUDSAMLSA-N Ala-Arg-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O QDRGPQWIVZNJQD-CIUDSAMLSA-N 0.000 description 2
- LWUWMHIOBPTZBA-DCAQKATOSA-N Ala-Arg-Lys Chemical compound NC(=N)NCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCCN)C(O)=O LWUWMHIOBPTZBA-DCAQKATOSA-N 0.000 description 2
- NXSFUECZFORGOG-CIUDSAMLSA-N Ala-Asn-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NXSFUECZFORGOG-CIUDSAMLSA-N 0.000 description 2
- WXERCAHAIKMTKX-ZLUOBGJFSA-N Ala-Asp-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O WXERCAHAIKMTKX-ZLUOBGJFSA-N 0.000 description 2
- ZIWWTZWAKYBUOB-CIUDSAMLSA-N Ala-Asp-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O ZIWWTZWAKYBUOB-CIUDSAMLSA-N 0.000 description 2
- MKZCBYZBCINNJN-DLOVCJGASA-N Ala-Asp-Phe Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MKZCBYZBCINNJN-DLOVCJGASA-N 0.000 description 2
- KUDREHRZRIVKHS-UWJYBYFXSA-N Ala-Asp-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KUDREHRZRIVKHS-UWJYBYFXSA-N 0.000 description 2
- ZODMADSIQZZBSQ-FXQIFTODSA-N Ala-Gln-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZODMADSIQZZBSQ-FXQIFTODSA-N 0.000 description 2
- SFNFGFDRYJKZKN-XQXXSGGOSA-N Ala-Gln-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](C)N)O SFNFGFDRYJKZKN-XQXXSGGOSA-N 0.000 description 2
- FUSPCLTUKXQREV-ACZMJKKPSA-N Ala-Glu-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O FUSPCLTUKXQREV-ACZMJKKPSA-N 0.000 description 2
- HMRWQTHUDVXMGH-GUBZILKMSA-N Ala-Glu-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN HMRWQTHUDVXMGH-GUBZILKMSA-N 0.000 description 2
- ROLXPVQSRCPVGK-XDTLVQLUSA-N Ala-Glu-Tyr Chemical compound N[C@@H](C)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O ROLXPVQSRCPVGK-XDTLVQLUSA-N 0.000 description 2
- LMFXXZPPZDCPTA-ZKWXMUAHSA-N Ala-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N LMFXXZPPZDCPTA-ZKWXMUAHSA-N 0.000 description 2
- CFPQUJZTLUQUTJ-HTFCKZLJSA-N Ala-Ile-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@H](C)N CFPQUJZTLUQUTJ-HTFCKZLJSA-N 0.000 description 2
- SUMYEVXWCAYLLJ-GUBZILKMSA-N Ala-Leu-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O SUMYEVXWCAYLLJ-GUBZILKMSA-N 0.000 description 2
- CCDFBRZVTDDJNM-GUBZILKMSA-N Ala-Leu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O CCDFBRZVTDDJNM-GUBZILKMSA-N 0.000 description 2
- VCSABYLVNWQYQE-UHFFFAOYSA-N Ala-Lys-Lys Natural products NCCCCC(NC(=O)C(N)C)C(=O)NC(CCCCN)C(O)=O VCSABYLVNWQYQE-UHFFFAOYSA-N 0.000 description 2
- NINQYGGNRIBFSC-CIUDSAMLSA-N Ala-Lys-Ser Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CO)C(O)=O NINQYGGNRIBFSC-CIUDSAMLSA-N 0.000 description 2
- MDNAVFBZPROEHO-DCAQKATOSA-N Ala-Lys-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O MDNAVFBZPROEHO-DCAQKATOSA-N 0.000 description 2
- PEIBBAXIKUAYGN-UBHSHLNASA-N Ala-Phe-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 PEIBBAXIKUAYGN-UBHSHLNASA-N 0.000 description 2
- WEZNQZHACPSMEF-QEJZJMRPSA-N Ala-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 WEZNQZHACPSMEF-QEJZJMRPSA-N 0.000 description 2
- CYBJZLQSUJEMAS-LFSVMHDDSA-N Ala-Phe-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](C)N)O CYBJZLQSUJEMAS-LFSVMHDDSA-N 0.000 description 2
- DXTYEWAQOXYRHZ-KKXDTOCCSA-N Ala-Phe-Tyr Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N DXTYEWAQOXYRHZ-KKXDTOCCSA-N 0.000 description 2
- RTZCUEHYUQZIDE-WHFBIAKZSA-N Ala-Ser-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RTZCUEHYUQZIDE-WHFBIAKZSA-N 0.000 description 2
- VNFSAYFQLXPHPY-CIQUZCHMSA-N Ala-Thr-Ile Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VNFSAYFQLXPHPY-CIQUZCHMSA-N 0.000 description 2
- VYMJAWXRWHJIMS-LKTVYLICSA-N Ala-Tyr-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N VYMJAWXRWHJIMS-LKTVYLICSA-N 0.000 description 2
- MUGAESARFRGOTQ-IGNZVWTISA-N Ala-Tyr-Tyr Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N MUGAESARFRGOTQ-IGNZVWTISA-N 0.000 description 2
- 108700028369 Alleles Proteins 0.000 description 2
- 108020005544 Antisense RNA Proteins 0.000 description 2
- OTOXOKCIIQLMFH-KZVJFYERSA-N Arg-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N OTOXOKCIIQLMFH-KZVJFYERSA-N 0.000 description 2
- OZNSCVPYWZRQPY-CIUDSAMLSA-N Arg-Asp-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O OZNSCVPYWZRQPY-CIUDSAMLSA-N 0.000 description 2
- TTXYKSADPSNOIF-IHRRRGAJSA-N Arg-Asp-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O TTXYKSADPSNOIF-IHRRRGAJSA-N 0.000 description 2
- XTGGTAWGUFXJSV-NAKRPEOUSA-N Arg-Cys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCCN=C(N)N)N XTGGTAWGUFXJSV-NAKRPEOUSA-N 0.000 description 2
- JVMKBJNSRZWDBO-FXQIFTODSA-N Arg-Cys-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(O)=O JVMKBJNSRZWDBO-FXQIFTODSA-N 0.000 description 2
- QAODJPUKWNNNRP-DCAQKATOSA-N Arg-Glu-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O QAODJPUKWNNNRP-DCAQKATOSA-N 0.000 description 2
- RKRSYHCNPFGMTA-CIUDSAMLSA-N Arg-Glu-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O RKRSYHCNPFGMTA-CIUDSAMLSA-N 0.000 description 2
- UFBURHXMKFQVLM-CIUDSAMLSA-N Arg-Glu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O UFBURHXMKFQVLM-CIUDSAMLSA-N 0.000 description 2
- GOWZVQXTHUCNSQ-NHCYSSNCSA-N Arg-Glu-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O GOWZVQXTHUCNSQ-NHCYSSNCSA-N 0.000 description 2
- CVKOQHYVDVYJSI-QTKMDUPCSA-N Arg-His-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCCN=C(N)N)N)O CVKOQHYVDVYJSI-QTKMDUPCSA-N 0.000 description 2
- AGVNTAUPLWIQEN-ZPFDUUQYSA-N Arg-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N AGVNTAUPLWIQEN-ZPFDUUQYSA-N 0.000 description 2
- OKKMBOSPBDASEP-CYDGBPFRSA-N Arg-Ile-Met Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCSC)C(O)=O OKKMBOSPBDASEP-CYDGBPFRSA-N 0.000 description 2
- FSNVAJOPUDVQAR-AVGNSLFASA-N Arg-Lys-Arg Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FSNVAJOPUDVQAR-AVGNSLFASA-N 0.000 description 2
- NPAVRDPEFVKELR-DCAQKATOSA-N Arg-Lys-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O NPAVRDPEFVKELR-DCAQKATOSA-N 0.000 description 2
- AOHKLEBWKMKITA-IHRRRGAJSA-N Arg-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N AOHKLEBWKMKITA-IHRRRGAJSA-N 0.000 description 2
- AUZAXCPWMDBWEE-HJGDQZAQSA-N Arg-Thr-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O AUZAXCPWMDBWEE-HJGDQZAQSA-N 0.000 description 2
- QJWLLRZTJFPCHA-STECZYCISA-N Arg-Tyr-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O QJWLLRZTJFPCHA-STECZYCISA-N 0.000 description 2
- WOZDCBHUGJVJPL-AVGNSLFASA-N Arg-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N WOZDCBHUGJVJPL-AVGNSLFASA-N 0.000 description 2
- XYOVHPDDWCEUDY-CIUDSAMLSA-N Asn-Ala-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O XYOVHPDDWCEUDY-CIUDSAMLSA-N 0.000 description 2
- AKEBUSZTMQLNIX-UWJYBYFXSA-N Asn-Ala-Tyr Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N AKEBUSZTMQLNIX-UWJYBYFXSA-N 0.000 description 2
- MEFGKQUUYZOLHM-GMOBBJLQSA-N Asn-Arg-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MEFGKQUUYZOLHM-GMOBBJLQSA-N 0.000 description 2
- BVLIJXXSXBUGEC-SRVKXCTJSA-N Asn-Asn-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O BVLIJXXSXBUGEC-SRVKXCTJSA-N 0.000 description 2
- QHBMKQWOIYJYMI-BYULHYEWSA-N Asn-Asn-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O QHBMKQWOIYJYMI-BYULHYEWSA-N 0.000 description 2
- GMCOADLDNLGOFE-ZLUOBGJFSA-N Asn-Asp-Cys Chemical compound C([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CS)C(=O)O)N)C(=O)N GMCOADLDNLGOFE-ZLUOBGJFSA-N 0.000 description 2
- QISZHYWZHJRDAO-CIUDSAMLSA-N Asn-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N QISZHYWZHJRDAO-CIUDSAMLSA-N 0.000 description 2
- IYVSIZAXNLOKFQ-BYULHYEWSA-N Asn-Asp-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O IYVSIZAXNLOKFQ-BYULHYEWSA-N 0.000 description 2
- CZIXHXIJJZLYRJ-SRVKXCTJSA-N Asn-Cys-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 CZIXHXIJJZLYRJ-SRVKXCTJSA-N 0.000 description 2
- GNKVBRYFXYWXAB-WDSKDSINSA-N Asn-Glu-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O GNKVBRYFXYWXAB-WDSKDSINSA-N 0.000 description 2
- OLGCWMNDJTWQAG-GUBZILKMSA-N Asn-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC(N)=O OLGCWMNDJTWQAG-GUBZILKMSA-N 0.000 description 2
- JZDZLBJVYWIIQU-AVGNSLFASA-N Asn-Glu-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JZDZLBJVYWIIQU-AVGNSLFASA-N 0.000 description 2
- DXVMJJNAOVECBA-WHFBIAKZSA-N Asn-Gly-Asn Chemical compound NC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O DXVMJJNAOVECBA-WHFBIAKZSA-N 0.000 description 2
- UDSVWSUXKYXSTR-QWRGUYRKSA-N Asn-Gly-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O UDSVWSUXKYXSTR-QWRGUYRKSA-N 0.000 description 2
- VXLBDJWTONZHJN-YUMQZZPRSA-N Asn-His-Gly Chemical compound C1=C(NC=N1)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC(=O)N)N VXLBDJWTONZHJN-YUMQZZPRSA-N 0.000 description 2
- ACKNRKFVYUVWAC-ZPFDUUQYSA-N Asn-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N ACKNRKFVYUVWAC-ZPFDUUQYSA-N 0.000 description 2
- XLZCLJRGGMBKLR-PCBIJLKTSA-N Asn-Ile-Phe Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XLZCLJRGGMBKLR-PCBIJLKTSA-N 0.000 description 2
- HDHZCEDPLTVHFZ-GUBZILKMSA-N Asn-Leu-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O HDHZCEDPLTVHFZ-GUBZILKMSA-N 0.000 description 2
- BZWRLDPIWKOVKB-ZPFDUUQYSA-N Asn-Leu-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BZWRLDPIWKOVKB-ZPFDUUQYSA-N 0.000 description 2
- GLWFAWNYGWBMOC-SRVKXCTJSA-N Asn-Leu-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O GLWFAWNYGWBMOC-SRVKXCTJSA-N 0.000 description 2
- DJIMLSXHXKWADV-CIUDSAMLSA-N Asn-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(N)=O DJIMLSXHXKWADV-CIUDSAMLSA-N 0.000 description 2
- FTSAJSADJCMDHH-CIUDSAMLSA-N Asn-Lys-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N FTSAJSADJCMDHH-CIUDSAMLSA-N 0.000 description 2
- VOGCFWDZYYTEOY-DCAQKATOSA-N Asn-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)N)N VOGCFWDZYYTEOY-DCAQKATOSA-N 0.000 description 2
- GIQCDTKOIPUDSG-GARJFASQSA-N Asn-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)N)N)C(=O)O GIQCDTKOIPUDSG-GARJFASQSA-N 0.000 description 2
- RLHANKIRBONJBK-IHRRRGAJSA-N Asn-Met-Phe Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CC(=O)N)N RLHANKIRBONJBK-IHRRRGAJSA-N 0.000 description 2
- YXVAESUIQFDBHN-SRVKXCTJSA-N Asn-Phe-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O YXVAESUIQFDBHN-SRVKXCTJSA-N 0.000 description 2
- UYCPJVYQYARFGB-YDHLFZDLSA-N Asn-Phe-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O UYCPJVYQYARFGB-YDHLFZDLSA-N 0.000 description 2
- XMHFCUKJRCQXGI-CIUDSAMLSA-N Asn-Pro-Gln Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC(=O)N)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O XMHFCUKJRCQXGI-CIUDSAMLSA-N 0.000 description 2
- JBDLMLZNDRLDIX-HJGDQZAQSA-N Asn-Thr-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O JBDLMLZNDRLDIX-HJGDQZAQSA-N 0.000 description 2
- ULZOQOKFYMXHPZ-AQZXSJQPSA-N Asn-Trp-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ULZOQOKFYMXHPZ-AQZXSJQPSA-N 0.000 description 2
- KSZHWTRZPOTIGY-AVGNSLFASA-N Asn-Tyr-Gln Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O KSZHWTRZPOTIGY-AVGNSLFASA-N 0.000 description 2
- QNNBHTFDFFFHGC-KKUMJFAQSA-N Asn-Tyr-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O QNNBHTFDFFFHGC-KKUMJFAQSA-N 0.000 description 2
- DPSUVAPLRQDWAO-YDHLFZDLSA-N Asn-Tyr-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC(=O)N)N DPSUVAPLRQDWAO-YDHLFZDLSA-N 0.000 description 2
- SYZWMVSXBZCOBZ-QXEWZRGKSA-N Asn-Val-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC(=O)N)N SYZWMVSXBZCOBZ-QXEWZRGKSA-N 0.000 description 2
- XPGVTUBABLRGHY-BIIVOSGPSA-N Asp-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)O)N XPGVTUBABLRGHY-BIIVOSGPSA-N 0.000 description 2
- JGDBHIVECJGXJA-FXQIFTODSA-N Asp-Asp-Arg Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JGDBHIVECJGXJA-FXQIFTODSA-N 0.000 description 2
- TVVYVAUGRHNTGT-UGYAYLCHSA-N Asp-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O TVVYVAUGRHNTGT-UGYAYLCHSA-N 0.000 description 2
- PXLNPFOJZQMXAT-BYULHYEWSA-N Asp-Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O PXLNPFOJZQMXAT-BYULHYEWSA-N 0.000 description 2
- GHODABZPVZMWCE-FXQIFTODSA-N Asp-Glu-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O GHODABZPVZMWCE-FXQIFTODSA-N 0.000 description 2
- XDGBFDYXZCMYEX-NUMRIWBASA-N Asp-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)O)N)O XDGBFDYXZCMYEX-NUMRIWBASA-N 0.000 description 2
- RRKCPMGSRIDLNC-AVGNSLFASA-N Asp-Glu-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O RRKCPMGSRIDLNC-AVGNSLFASA-N 0.000 description 2
- KTTCQQNRRLCIBC-GHCJXIJMSA-N Asp-Ile-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O KTTCQQNRRLCIBC-GHCJXIJMSA-N 0.000 description 2
- CYCKJEFVFNRWEZ-UGYAYLCHSA-N Asp-Ile-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O CYCKJEFVFNRWEZ-UGYAYLCHSA-N 0.000 description 2
- MFTVXYMXSAQZNL-DJFWLOJKSA-N Asp-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC(=O)O)N MFTVXYMXSAQZNL-DJFWLOJKSA-N 0.000 description 2
- JNNVNVRBYUJYGS-CIUDSAMLSA-N Asp-Leu-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O JNNVNVRBYUJYGS-CIUDSAMLSA-N 0.000 description 2
- LBOVBQONZJRWPV-YUMQZZPRSA-N Asp-Lys-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LBOVBQONZJRWPV-YUMQZZPRSA-N 0.000 description 2
- AKKUDRZKFZWPBH-SRVKXCTJSA-N Asp-Lys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)O)N AKKUDRZKFZWPBH-SRVKXCTJSA-N 0.000 description 2
- IDDMGSKZQDEDGA-SRVKXCTJSA-N Asp-Phe-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=CC=C1 IDDMGSKZQDEDGA-SRVKXCTJSA-N 0.000 description 2
- OTKUAVXGMREHRX-CFMVVWHZSA-N Asp-Tyr-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(O)=O)CC1=CC=C(O)C=C1 OTKUAVXGMREHRX-CFMVVWHZSA-N 0.000 description 2
- GIKOVDMXBAFXDF-NHCYSSNCSA-N Asp-Val-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O GIKOVDMXBAFXDF-NHCYSSNCSA-N 0.000 description 2
- JGLWFWXGOINXEA-YDHLFZDLSA-N Asp-Val-Tyr Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 JGLWFWXGOINXEA-YDHLFZDLSA-N 0.000 description 2
- 244000063299 Bacillus subtilis Species 0.000 description 2
- 235000014469 Bacillus subtilis Nutrition 0.000 description 2
- 244000025254 Cannabis sativa Species 0.000 description 2
- 229920000858 Cyclodextrin Polymers 0.000 description 2
- DZIGZIIJIGGANI-FXQIFTODSA-N Cys-Glu-Gln Chemical compound SC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O DZIGZIIJIGGANI-FXQIFTODSA-N 0.000 description 2
- LKUCSUGWHYVYLP-GHCJXIJMSA-N Cys-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CS)N LKUCSUGWHYVYLP-GHCJXIJMSA-N 0.000 description 2
- XZKJEOMFLDVXJG-KATARQTJSA-N Cys-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CS)N)O XZKJEOMFLDVXJG-KATARQTJSA-N 0.000 description 2
- HMWBPUDETPKSSS-DCAQKATOSA-N Cys-Pro-Lys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CS)N)C(=O)N[C@@H](CCCCN)C(=O)O HMWBPUDETPKSSS-DCAQKATOSA-N 0.000 description 2
- 230000007018 DNA scission Effects 0.000 description 2
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 2
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 2
- 238000002965 ELISA Methods 0.000 description 2
- 108010042407 Endonucleases Proteins 0.000 description 2
- 108700024394 Exon Proteins 0.000 description 2
- JSYULGSPLTZDHM-NRPADANISA-N Gln-Ala-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O JSYULGSPLTZDHM-NRPADANISA-N 0.000 description 2
- JESJDAAGXULQOP-CIUDSAMLSA-N Gln-Arg-Ser Chemical compound C(C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)CN=C(N)N JESJDAAGXULQOP-CIUDSAMLSA-N 0.000 description 2
- CKNUKHBRCSMKMO-XHNCKOQMSA-N Gln-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)N)N)C(=O)O CKNUKHBRCSMKMO-XHNCKOQMSA-N 0.000 description 2
- KDXKFBSNIJYNNR-YVNDNENWSA-N Gln-Glu-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KDXKFBSNIJYNNR-YVNDNENWSA-N 0.000 description 2
- PNENQZWRFMUZOM-DCAQKATOSA-N Gln-Glu-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O PNENQZWRFMUZOM-DCAQKATOSA-N 0.000 description 2
- NSNUZSPSADIMJQ-WDSKDSINSA-N Gln-Gly-Asp Chemical compound NC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O NSNUZSPSADIMJQ-WDSKDSINSA-N 0.000 description 2
- FGYPOQPQTUNESW-IUCAKERBSA-N Gln-Gly-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)N)N FGYPOQPQTUNESW-IUCAKERBSA-N 0.000 description 2
- TWTWUBHEWQPMQW-ZPFDUUQYSA-N Gln-Ile-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TWTWUBHEWQPMQW-ZPFDUUQYSA-N 0.000 description 2
- JKGHMESJHRTHIC-SIUGBPQLSA-N Gln-Ile-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N JKGHMESJHRTHIC-SIUGBPQLSA-N 0.000 description 2
- FKXCBKCOSVIGCT-AVGNSLFASA-N Gln-Lys-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O FKXCBKCOSVIGCT-AVGNSLFASA-N 0.000 description 2
- PBYFVIQRFLNQCO-GUBZILKMSA-N Gln-Pro-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O PBYFVIQRFLNQCO-GUBZILKMSA-N 0.000 description 2
- WIMVKDYAKRAUCG-IHRRRGAJSA-N Gln-Tyr-Glu Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O WIMVKDYAKRAUCG-IHRRRGAJSA-N 0.000 description 2
- BPDVTFBJZNBHEU-HGNGGELXSA-N Glu-Ala-His Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 BPDVTFBJZNBHEU-HGNGGELXSA-N 0.000 description 2
- LJLPOZGRPLORTF-CIUDSAMLSA-N Glu-Asn-Met Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O LJLPOZGRPLORTF-CIUDSAMLSA-N 0.000 description 2
- VAIWPXWHWAPYDF-FXQIFTODSA-N Glu-Asp-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O VAIWPXWHWAPYDF-FXQIFTODSA-N 0.000 description 2
- JVSBYEDSSRZQGV-GUBZILKMSA-N Glu-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCC(O)=O JVSBYEDSSRZQGV-GUBZILKMSA-N 0.000 description 2
- PXHABOCPJVTGEK-BQBZGAKWSA-N Glu-Gln-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O PXHABOCPJVTGEK-BQBZGAKWSA-N 0.000 description 2
- NKLRYVLERDYDBI-FXQIFTODSA-N Glu-Glu-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O NKLRYVLERDYDBI-FXQIFTODSA-N 0.000 description 2
- KRGZZKWSBGPLKL-IUCAKERBSA-N Glu-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)O)N KRGZZKWSBGPLKL-IUCAKERBSA-N 0.000 description 2
- ZWABFSSWTSAMQN-KBIXCLLPSA-N Glu-Ile-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O ZWABFSSWTSAMQN-KBIXCLLPSA-N 0.000 description 2
- XTZDZAXYPDISRR-MNXVOIDGSA-N Glu-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N XTZDZAXYPDISRR-MNXVOIDGSA-N 0.000 description 2
- WTMZXOPHTIVFCP-QEWYBTABSA-N Glu-Ile-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 WTMZXOPHTIVFCP-QEWYBTABSA-N 0.000 description 2
- ATVYZJGOZLVXDK-IUCAKERBSA-N Glu-Leu-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O ATVYZJGOZLVXDK-IUCAKERBSA-N 0.000 description 2
- SJJHXJDSNQJMMW-SRVKXCTJSA-N Glu-Lys-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O SJJHXJDSNQJMMW-SRVKXCTJSA-N 0.000 description 2
- YTRBQAQSUDSIQE-FHWLQOOXSA-N Glu-Phe-Phe Chemical compound C([C@H](NC(=O)[C@H](CCC(O)=O)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 YTRBQAQSUDSIQE-FHWLQOOXSA-N 0.000 description 2
- MRWYPDWDZSLWJM-ACZMJKKPSA-N Glu-Ser-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O MRWYPDWDZSLWJM-ACZMJKKPSA-N 0.000 description 2
- SYAYROHMAIHWFB-KBIXCLLPSA-N Glu-Ser-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O SYAYROHMAIHWFB-KBIXCLLPSA-N 0.000 description 2
- JWNZHMSRZXXGTM-XKBZYTNZSA-N Glu-Ser-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JWNZHMSRZXXGTM-XKBZYTNZSA-N 0.000 description 2
- DMYACXMQUABZIQ-NRPADANISA-N Glu-Ser-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O DMYACXMQUABZIQ-NRPADANISA-N 0.000 description 2
- TWYSSILQABLLME-HJGDQZAQSA-N Glu-Thr-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TWYSSILQABLLME-HJGDQZAQSA-N 0.000 description 2
- JVYNYWXHZWVJEF-NUMRIWBASA-N Glu-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O JVYNYWXHZWVJEF-NUMRIWBASA-N 0.000 description 2
- UMZHHILWZBFPGL-LOKLDPHHSA-N Glu-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O UMZHHILWZBFPGL-LOKLDPHHSA-N 0.000 description 2
- RZMXBFUSQNLEQF-QEJZJMRPSA-N Glu-Trp-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N RZMXBFUSQNLEQF-QEJZJMRPSA-N 0.000 description 2
- FGGKGJHCVMYGCD-UKJIMTQDSA-N Glu-Val-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FGGKGJHCVMYGCD-UKJIMTQDSA-N 0.000 description 2
- ZYRXTRTUCAVNBQ-GVXVVHGQSA-N Glu-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N ZYRXTRTUCAVNBQ-GVXVVHGQSA-N 0.000 description 2
- FVGOGEGGQLNZGH-DZKIICNBSA-N Glu-Val-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 FVGOGEGGQLNZGH-DZKIICNBSA-N 0.000 description 2
- LJPIRKICOISLKN-WHFBIAKZSA-N Gly-Ala-Ser Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O LJPIRKICOISLKN-WHFBIAKZSA-N 0.000 description 2
- DWUKOTKSTDWGAE-BQBZGAKWSA-N Gly-Asn-Arg Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DWUKOTKSTDWGAE-BQBZGAKWSA-N 0.000 description 2
- NZAFOTBEULLEQB-WDSKDSINSA-N Gly-Asn-Glu Chemical compound C(CC(=O)O)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)CN NZAFOTBEULLEQB-WDSKDSINSA-N 0.000 description 2
- LURCIJSJAKFCRO-QWRGUYRKSA-N Gly-Asn-Tyr Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O LURCIJSJAKFCRO-QWRGUYRKSA-N 0.000 description 2
- ZRZILYKEJBMFHY-BQBZGAKWSA-N Gly-Asp-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN ZRZILYKEJBMFHY-BQBZGAKWSA-N 0.000 description 2
- BPQYBFAXRGMGGY-LAEOZQHASA-N Gly-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)CN BPQYBFAXRGMGGY-LAEOZQHASA-N 0.000 description 2
- ZQIMMEYPEXIYBB-IUCAKERBSA-N Gly-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN ZQIMMEYPEXIYBB-IUCAKERBSA-N 0.000 description 2
- KAJAOGBVWCYGHZ-JTQLQIEISA-N Gly-Gly-Phe Chemical compound [NH3+]CC(=O)NCC(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 KAJAOGBVWCYGHZ-JTQLQIEISA-N 0.000 description 2
- SWQALSGKVLYKDT-ZKWXMUAHSA-N Gly-Ile-Ala Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O SWQALSGKVLYKDT-ZKWXMUAHSA-N 0.000 description 2
- SWQALSGKVLYKDT-UHFFFAOYSA-N Gly-Ile-Ala Natural products NCC(=O)NC(C(C)CC)C(=O)NC(C)C(O)=O SWQALSGKVLYKDT-UHFFFAOYSA-N 0.000 description 2
- UTYGDAHJBBDPBA-BYULHYEWSA-N Gly-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)CN UTYGDAHJBBDPBA-BYULHYEWSA-N 0.000 description 2
- DENRBIYENOKSEX-PEXQALLHSA-N Gly-Ile-His Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 DENRBIYENOKSEX-PEXQALLHSA-N 0.000 description 2
- PDUHNKAFQXQNLH-ZETCQYMHSA-N Gly-Lys-Gly Chemical compound NCCCC[C@H](NC(=O)CN)C(=O)NCC(O)=O PDUHNKAFQXQNLH-ZETCQYMHSA-N 0.000 description 2
- MHXKHKWHPNETGG-QWRGUYRKSA-N Gly-Lys-Leu Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O MHXKHKWHPNETGG-QWRGUYRKSA-N 0.000 description 2
- FXGRXIATVXUAHO-WEDXCCLWSA-N Gly-Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCCN FXGRXIATVXUAHO-WEDXCCLWSA-N 0.000 description 2
- GAFKBWKVXNERFA-QWRGUYRKSA-N Gly-Phe-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 GAFKBWKVXNERFA-QWRGUYRKSA-N 0.000 description 2
- IBYOLNARKHMLBG-WHOFXGATSA-N Gly-Phe-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 IBYOLNARKHMLBG-WHOFXGATSA-N 0.000 description 2
- JPVGHHQGKPQYIL-KBPBESRZSA-N Gly-Phe-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 JPVGHHQGKPQYIL-KBPBESRZSA-N 0.000 description 2
- FEUPVVCGQLNXNP-IRXDYDNUSA-N Gly-Phe-Phe Chemical compound C([C@H](NC(=O)CN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 FEUPVVCGQLNXNP-IRXDYDNUSA-N 0.000 description 2
- QAMMIGULQSIRCD-IRXDYDNUSA-N Gly-Phe-Tyr Chemical compound C([C@H](NC(=O)C[NH3+])C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C([O-])=O)C1=CC=CC=C1 QAMMIGULQSIRCD-IRXDYDNUSA-N 0.000 description 2
- POJJAZJHBGXEGM-YUMQZZPRSA-N Gly-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)CN POJJAZJHBGXEGM-YUMQZZPRSA-N 0.000 description 2
- JSLVAHYTAJJEQH-QWRGUYRKSA-N Gly-Ser-Phe Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 JSLVAHYTAJJEQH-QWRGUYRKSA-N 0.000 description 2
- FFALDIDGPLUDKV-ZDLURKLDSA-N Gly-Thr-Ser Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O FFALDIDGPLUDKV-ZDLURKLDSA-N 0.000 description 2
- RIYIFUFFFBIOEU-KBPBESRZSA-N Gly-Tyr-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 RIYIFUFFFBIOEU-KBPBESRZSA-N 0.000 description 2
- YGHSQRJSHKYUJY-SCZZXKLOSA-N Gly-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN YGHSQRJSHKYUJY-SCZZXKLOSA-N 0.000 description 2
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 2
- TVRMJKNELJKNRS-GUBZILKMSA-N His-Glu-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N TVRMJKNELJKNRS-GUBZILKMSA-N 0.000 description 2
- FYTCLUIYTYFGPT-YUMQZZPRSA-N His-Gly-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)NCC(=O)N[C@@H](CO)C(O)=O FYTCLUIYTYFGPT-YUMQZZPRSA-N 0.000 description 2
- WTJBVCUCLWFGAH-JUKXBJQTSA-N His-Ile-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N WTJBVCUCLWFGAH-JUKXBJQTSA-N 0.000 description 2
- RNMNYMDTESKEAJ-KKUMJFAQSA-N His-Leu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CN=CN1 RNMNYMDTESKEAJ-KKUMJFAQSA-N 0.000 description 2
- BRQKGRLDDDQWQJ-MBLNEYKQSA-N His-Thr-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O BRQKGRLDDDQWQJ-MBLNEYKQSA-N 0.000 description 2
- AHEBIAHEZWQVHB-QTKMDUPCSA-N His-Thr-Met Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N)O AHEBIAHEZWQVHB-QTKMDUPCSA-N 0.000 description 2
- 101000615488 Homo sapiens Methyl-CpG-binding domain protein 2 Proteins 0.000 description 2
- JXUGDUWBMKIJDC-NAKRPEOUSA-N Ile-Ala-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O JXUGDUWBMKIJDC-NAKRPEOUSA-N 0.000 description 2
- TZCGZYWNIDZZMR-NAKRPEOUSA-N Ile-Arg-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](C)C(=O)O)N TZCGZYWNIDZZMR-NAKRPEOUSA-N 0.000 description 2
- ATXGFMOBVKSOMK-PEDHHIEDSA-N Ile-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N ATXGFMOBVKSOMK-PEDHHIEDSA-N 0.000 description 2
- WECYRWOMWSCWNX-XUXIUFHCSA-N Ile-Arg-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(C)C)C(O)=O WECYRWOMWSCWNX-XUXIUFHCSA-N 0.000 description 2
- HZMLFETXHFHGBB-UGYAYLCHSA-N Ile-Asn-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N HZMLFETXHFHGBB-UGYAYLCHSA-N 0.000 description 2
- NBJAAWYRLGCJOF-UGYAYLCHSA-N Ile-Asp-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N NBJAAWYRLGCJOF-UGYAYLCHSA-N 0.000 description 2
- RGSOCXHDOPQREB-ZPFDUUQYSA-N Ile-Asp-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N RGSOCXHDOPQREB-ZPFDUUQYSA-N 0.000 description 2
- GECLQMBTZCPAFY-PEFMBERDSA-N Ile-Gln-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N GECLQMBTZCPAFY-PEFMBERDSA-N 0.000 description 2
- LJKDGRWXYUTRSH-YVNDNENWSA-N Ile-Gln-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N LJKDGRWXYUTRSH-YVNDNENWSA-N 0.000 description 2
- SLQVFYWBGNNOTK-BYULHYEWSA-N Ile-Gly-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)N)C(=O)O)N SLQVFYWBGNNOTK-BYULHYEWSA-N 0.000 description 2
- IGJWJGIHUFQANP-LAEOZQHASA-N Ile-Gly-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CCC(=O)N)C(=O)O)N IGJWJGIHUFQANP-LAEOZQHASA-N 0.000 description 2
- RIVKTKFVWXRNSJ-GRLWGSQLSA-N Ile-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N RIVKTKFVWXRNSJ-GRLWGSQLSA-N 0.000 description 2
- YNMQUIVKEFRCPH-QSFUFRPTSA-N Ile-Ile-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(=O)O)N YNMQUIVKEFRCPH-QSFUFRPTSA-N 0.000 description 2
- OUUCIIJSBIBCHB-ZPFDUUQYSA-N Ile-Leu-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O OUUCIIJSBIBCHB-ZPFDUUQYSA-N 0.000 description 2
- PHRWFSFCNJPWRO-PPCPHDFISA-N Ile-Leu-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N PHRWFSFCNJPWRO-PPCPHDFISA-N 0.000 description 2
- DSDPLOODKXISDT-XUXIUFHCSA-N Ile-Leu-Val Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O DSDPLOODKXISDT-XUXIUFHCSA-N 0.000 description 2
- UIEZQYNXCYHMQS-BJDJZHNGSA-N Ile-Lys-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)O)N UIEZQYNXCYHMQS-BJDJZHNGSA-N 0.000 description 2
- WVUDHMBJNBWZBU-XUXIUFHCSA-N Ile-Lys-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)O)N WVUDHMBJNBWZBU-XUXIUFHCSA-N 0.000 description 2
- UAELWXJFLZBKQS-WHOFXGATSA-N Ile-Phe-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](Cc1ccccc1)C(=O)NCC(O)=O UAELWXJFLZBKQS-WHOFXGATSA-N 0.000 description 2
- MLSUZXHSNRBDCI-CYDGBPFRSA-N Ile-Pro-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)O)N MLSUZXHSNRBDCI-CYDGBPFRSA-N 0.000 description 2
- JODPUDMBQBIWCK-GHCJXIJMSA-N Ile-Ser-Asn Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O JODPUDMBQBIWCK-GHCJXIJMSA-N 0.000 description 2
- XMYURPUVJSKTMC-KBIXCLLPSA-N Ile-Ser-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N XMYURPUVJSKTMC-KBIXCLLPSA-N 0.000 description 2
- PELCGFMHLZXWBQ-BJDJZHNGSA-N Ile-Ser-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)O)N PELCGFMHLZXWBQ-BJDJZHNGSA-N 0.000 description 2
- SAEWJTCJQVZQNZ-IUKAMOBKSA-N Ile-Thr-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N SAEWJTCJQVZQNZ-IUKAMOBKSA-N 0.000 description 2
- COWHUQXTSYTKQC-RWRJDSDZSA-N Ile-Thr-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N COWHUQXTSYTKQC-RWRJDSDZSA-N 0.000 description 2
- YBKKLDBBPFIXBQ-MBLNEYKQSA-N Ile-Thr-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(=O)O)N YBKKLDBBPFIXBQ-MBLNEYKQSA-N 0.000 description 2
- QGXQHJQPAPMACW-PPCPHDFISA-N Ile-Thr-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)O)N QGXQHJQPAPMACW-PPCPHDFISA-N 0.000 description 2
- KWHFUMYCSPJCFQ-NGTWOADLSA-N Ile-Thr-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N KWHFUMYCSPJCFQ-NGTWOADLSA-N 0.000 description 2
- 102100034343 Integrase Human genes 0.000 description 2
- IBMVEYRWAWIOTN-UHFFFAOYSA-N L-Leucyl-L-Arginyl-L-Proline Natural products CC(C)CC(N)C(=O)NC(CCCN=C(N)N)C(=O)N1CCCC1C(O)=O IBMVEYRWAWIOTN-UHFFFAOYSA-N 0.000 description 2
- PBCHMHROGNUXMK-DLOVCJGASA-N Leu-Ala-His Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 PBCHMHROGNUXMK-DLOVCJGASA-N 0.000 description 2
- IBMVEYRWAWIOTN-RWMBFGLXSA-N Leu-Arg-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N1CCC[C@@H]1C(O)=O IBMVEYRWAWIOTN-RWMBFGLXSA-N 0.000 description 2
- OIARJGNVARWKFP-YUMQZZPRSA-N Leu-Asn-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O OIARJGNVARWKFP-YUMQZZPRSA-N 0.000 description 2
- OXKYZSRZKBTVEY-ZPFDUUQYSA-N Leu-Asn-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O OXKYZSRZKBTVEY-ZPFDUUQYSA-N 0.000 description 2
- WGNOPSQMIQERPK-GARJFASQSA-N Leu-Asn-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N WGNOPSQMIQERPK-GARJFASQSA-N 0.000 description 2
- WGNOPSQMIQERPK-UHFFFAOYSA-N Leu-Asn-Pro Natural products CC(C)CC(N)C(=O)NC(CC(=O)N)C(=O)N1CCCC1C(=O)O WGNOPSQMIQERPK-UHFFFAOYSA-N 0.000 description 2
- YKNBJXOJTURHCU-DCAQKATOSA-N Leu-Asp-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YKNBJXOJTURHCU-DCAQKATOSA-N 0.000 description 2
- GBDMISNMNXVTNV-XIRDDKMYSA-N Leu-Asp-Trp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O GBDMISNMNXVTNV-XIRDDKMYSA-N 0.000 description 2
- FQZPTCNSNPWHLJ-AVGNSLFASA-N Leu-Gln-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(O)=O FQZPTCNSNPWHLJ-AVGNSLFASA-N 0.000 description 2
- AXZGZMGRBDQTEY-SRVKXCTJSA-N Leu-Gln-Met Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O AXZGZMGRBDQTEY-SRVKXCTJSA-N 0.000 description 2
- HFBCHNRFRYLZNV-GUBZILKMSA-N Leu-Glu-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HFBCHNRFRYLZNV-GUBZILKMSA-N 0.000 description 2
- WIDZHJTYKYBLSR-DCAQKATOSA-N Leu-Glu-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WIDZHJTYKYBLSR-DCAQKATOSA-N 0.000 description 2
- QVFGXCVIXXBFHO-AVGNSLFASA-N Leu-Glu-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O QVFGXCVIXXBFHO-AVGNSLFASA-N 0.000 description 2
- LLBQJYDYOLIQAI-JYJNAYRXSA-N Leu-Glu-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O LLBQJYDYOLIQAI-JYJNAYRXSA-N 0.000 description 2
- XQXGNBFMAXWIGI-MXAVVETBSA-N Leu-His-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CN=CN1 XQXGNBFMAXWIGI-MXAVVETBSA-N 0.000 description 2
- OYQUOLRTJHWVSQ-SRVKXCTJSA-N Leu-His-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O OYQUOLRTJHWVSQ-SRVKXCTJSA-N 0.000 description 2
- OHZIZVWQXJPBJS-IXOXFDKPSA-N Leu-His-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OHZIZVWQXJPBJS-IXOXFDKPSA-N 0.000 description 2
- HNDWYLYAYNBWMP-AJNGGQMLSA-N Leu-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(C)C)N HNDWYLYAYNBWMP-AJNGGQMLSA-N 0.000 description 2
- UBZGNBKMIJHOHL-BZSNNMDCSA-N Leu-Leu-Phe Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 UBZGNBKMIJHOHL-BZSNNMDCSA-N 0.000 description 2
- RZXLZBIUTDQHJQ-SRVKXCTJSA-N Leu-Lys-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O RZXLZBIUTDQHJQ-SRVKXCTJSA-N 0.000 description 2
- LVTJJOJKDCVZGP-QWRGUYRKSA-N Leu-Lys-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LVTJJOJKDCVZGP-QWRGUYRKSA-N 0.000 description 2
- DDVHDMSBLRAKNV-IHRRRGAJSA-N Leu-Met-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O DDVHDMSBLRAKNV-IHRRRGAJSA-N 0.000 description 2
- GCXGCIYIHXSKAY-ULQDDVLXSA-N Leu-Phe-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GCXGCIYIHXSKAY-ULQDDVLXSA-N 0.000 description 2
- SYRTUBLKWNDSDK-DKIMLUQUSA-N Leu-Phe-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O SYRTUBLKWNDSDK-DKIMLUQUSA-N 0.000 description 2
- AEDWWMMHUGYIFD-HJGDQZAQSA-N Leu-Thr-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O AEDWWMMHUGYIFD-HJGDQZAQSA-N 0.000 description 2
- LCNASHSOFMRYFO-WDCWCFNPSA-N Leu-Thr-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCC(N)=O LCNASHSOFMRYFO-WDCWCFNPSA-N 0.000 description 2
- VQHUBNVKFFLWRP-ULQDDVLXSA-N Leu-Tyr-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=C(O)C=C1 VQHUBNVKFFLWRP-ULQDDVLXSA-N 0.000 description 2
- LMDVGHQPPPLYAR-IHRRRGAJSA-N Leu-Val-His Chemical compound N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)O LMDVGHQPPPLYAR-IHRRRGAJSA-N 0.000 description 2
- YQFZRHYZLARWDY-IHRRRGAJSA-N Leu-Val-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN YQFZRHYZLARWDY-IHRRRGAJSA-N 0.000 description 2
- 241000209510 Liliopsida Species 0.000 description 2
- YNNPKXBBRZVIRX-IHRRRGAJSA-N Lys-Arg-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O YNNPKXBBRZVIRX-IHRRRGAJSA-N 0.000 description 2
- SJNZALDHDUYDBU-IHRRRGAJSA-N Lys-Arg-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(O)=O SJNZALDHDUYDBU-IHRRRGAJSA-N 0.000 description 2
- WALVCOOOKULCQM-ULQDDVLXSA-N Lys-Arg-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WALVCOOOKULCQM-ULQDDVLXSA-N 0.000 description 2
- SWWCDAGDQHTKIE-RHYQMDGZSA-N Lys-Arg-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SWWCDAGDQHTKIE-RHYQMDGZSA-N 0.000 description 2
- BYPMOIFBQPEWOH-CIUDSAMLSA-N Lys-Asn-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N BYPMOIFBQPEWOH-CIUDSAMLSA-N 0.000 description 2
- HQVDJTYKCMIWJP-YUMQZZPRSA-N Lys-Asn-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O HQVDJTYKCMIWJP-YUMQZZPRSA-N 0.000 description 2
- LZWNAOIMTLNMDW-NHCYSSNCSA-N Lys-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N LZWNAOIMTLNMDW-NHCYSSNCSA-N 0.000 description 2
- NRQRKMYZONPCTM-CIUDSAMLSA-N Lys-Asp-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O NRQRKMYZONPCTM-CIUDSAMLSA-N 0.000 description 2
- LLSUNJYOSCOOEB-GUBZILKMSA-N Lys-Glu-Asp Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O LLSUNJYOSCOOEB-GUBZILKMSA-N 0.000 description 2
- PBIPLDMFHAICIP-DCAQKATOSA-N Lys-Glu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O PBIPLDMFHAICIP-DCAQKATOSA-N 0.000 description 2
- GHOIOYHDDKXIDX-SZMVWBNQSA-N Lys-Glu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCCCN)C(O)=O)=CNC2=C1 GHOIOYHDDKXIDX-SZMVWBNQSA-N 0.000 description 2
- XNKDCYABMBBEKN-IUCAKERBSA-N Lys-Gly-Gln Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O XNKDCYABMBBEKN-IUCAKERBSA-N 0.000 description 2
- ISHNZELVUVPCHY-ZETCQYMHSA-N Lys-Gly-Gly Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)NCC(O)=O ISHNZELVUVPCHY-ZETCQYMHSA-N 0.000 description 2
- DTUZCYRNEJDKSR-NHCYSSNCSA-N Lys-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN DTUZCYRNEJDKSR-NHCYSSNCSA-N 0.000 description 2
- VLMNBMFYRMGEMB-QWRGUYRKSA-N Lys-His-Gly Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CNC=N1 VLMNBMFYRMGEMB-QWRGUYRKSA-N 0.000 description 2
- PGLGNCVOWIORQE-SRVKXCTJSA-N Lys-His-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O PGLGNCVOWIORQE-SRVKXCTJSA-N 0.000 description 2
- SLQJJFAVWSZLBL-BJDJZHNGSA-N Lys-Ile-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN SLQJJFAVWSZLBL-BJDJZHNGSA-N 0.000 description 2
- XREQQOATSMMAJP-MGHWNKPDSA-N Lys-Ile-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XREQQOATSMMAJP-MGHWNKPDSA-N 0.000 description 2
- NJNRBRKHOWSGMN-SRVKXCTJSA-N Lys-Leu-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O NJNRBRKHOWSGMN-SRVKXCTJSA-N 0.000 description 2
- XIZQPFCRXLUNMK-BZSNNMDCSA-N Lys-Leu-Phe Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCCCN)N XIZQPFCRXLUNMK-BZSNNMDCSA-N 0.000 description 2
- XOQMURBBIXRRCR-SRVKXCTJSA-N Lys-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN XOQMURBBIXRRCR-SRVKXCTJSA-N 0.000 description 2
- GAHJXEMYXKLZRQ-AJNGGQMLSA-N Lys-Lys-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O GAHJXEMYXKLZRQ-AJNGGQMLSA-N 0.000 description 2
- AZOFEHCPMBRNFD-BZSNNMDCSA-N Lys-Phe-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(O)=O)CC1=CC=CC=C1 AZOFEHCPMBRNFD-BZSNNMDCSA-N 0.000 description 2
- CNGOEHJCLVCJHN-SRVKXCTJSA-N Lys-Pro-Glu Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O CNGOEHJCLVCJHN-SRVKXCTJSA-N 0.000 description 2
- QBHGXFQJFPWJIH-XUXIUFHCSA-N Lys-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCCN QBHGXFQJFPWJIH-XUXIUFHCSA-N 0.000 description 2
- HYSVGEAWTGPMOA-IHRRRGAJSA-N Lys-Pro-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O HYSVGEAWTGPMOA-IHRRRGAJSA-N 0.000 description 2
- MIROMRNASYKZNL-ULQDDVLXSA-N Lys-Pro-Tyr Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 MIROMRNASYKZNL-ULQDDVLXSA-N 0.000 description 2
- SBQDRNOLGSYHQA-YUMQZZPRSA-N Lys-Ser-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SBQDRNOLGSYHQA-YUMQZZPRSA-N 0.000 description 2
- UWHCKWNPWKTMBM-WDCWCFNPSA-N Lys-Thr-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O UWHCKWNPWKTMBM-WDCWCFNPSA-N 0.000 description 2
- BVRNWWHJYNPJDG-XIRDDKMYSA-N Lys-Trp-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N BVRNWWHJYNPJDG-XIRDDKMYSA-N 0.000 description 2
- KQAREVUPVXMNNP-WDSOQIARSA-N Lys-Trp-Met Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCSC)C(O)=O KQAREVUPVXMNNP-WDSOQIARSA-N 0.000 description 2
- XYLSGAWRCZECIQ-JYJNAYRXSA-N Lys-Tyr-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCC(O)=O)C(O)=O)CC1=CC=C(O)C=C1 XYLSGAWRCZECIQ-JYJNAYRXSA-N 0.000 description 2
- QLFAPXUXEBAWEK-NHCYSSNCSA-N Lys-Val-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O QLFAPXUXEBAWEK-NHCYSSNCSA-N 0.000 description 2
- VKCPHIOZDWUFSW-ONGXEEELSA-N Lys-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN VKCPHIOZDWUFSW-ONGXEEELSA-N 0.000 description 2
- DRRXXZBXDMLGFC-IHRRRGAJSA-N Lys-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN DRRXXZBXDMLGFC-IHRRRGAJSA-N 0.000 description 2
- QEVRUYFHWJJUHZ-DCAQKATOSA-N Met-Ala-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(C)C QEVRUYFHWJJUHZ-DCAQKATOSA-N 0.000 description 2
- ULNXMMYXQKGNPG-LPEHRKFASA-N Met-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCSC)N ULNXMMYXQKGNPG-LPEHRKFASA-N 0.000 description 2
- BLIPQDLSCFGUFA-GUBZILKMSA-N Met-Arg-Asn Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O BLIPQDLSCFGUFA-GUBZILKMSA-N 0.000 description 2
- ZIIMORLEZLVRIP-SRVKXCTJSA-N Met-Leu-Gln Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZIIMORLEZLVRIP-SRVKXCTJSA-N 0.000 description 2
- DBXMFHGGHMXYHY-DCAQKATOSA-N Met-Leu-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O DBXMFHGGHMXYHY-DCAQKATOSA-N 0.000 description 2
- CNAGWYQWQDMUGC-IHRRRGAJSA-N Met-Phe-Asn Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N CNAGWYQWQDMUGC-IHRRRGAJSA-N 0.000 description 2
- HLZORBMOISUNIV-DCAQKATOSA-N Met-Ser-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(C)C HLZORBMOISUNIV-DCAQKATOSA-N 0.000 description 2
- DBMLDOWSVHMQQN-XGEHTFHBSA-N Met-Ser-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DBMLDOWSVHMQQN-XGEHTFHBSA-N 0.000 description 2
- SPSSJSICDYYTQN-HJGDQZAQSA-N Met-Thr-Gln Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCC(N)=O SPSSJSICDYYTQN-HJGDQZAQSA-N 0.000 description 2
- ATBJCCFCJXCNGZ-UFYCRDLUSA-N Met-Tyr-Phe Chemical compound C([C@H](NC(=O)[C@@H](N)CCSC)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 ATBJCCFCJXCNGZ-UFYCRDLUSA-N 0.000 description 2
- 102100021299 Methyl-CpG-binding domain protein 2 Human genes 0.000 description 2
- 102000016397 Methyltransferase Human genes 0.000 description 2
- 241000588621 Moraxella Species 0.000 description 2
- AJHCSUXXECOXOY-UHFFFAOYSA-N N-glycyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)CN)C(O)=O)=CNC2=C1 AJHCSUXXECOXOY-UHFFFAOYSA-N 0.000 description 2
- 108010079364 N-glycylalanine Proteins 0.000 description 2
- 108010047562 NGR peptide Proteins 0.000 description 2
- 108700026244 Open Reading Frames Proteins 0.000 description 2
- ULECEJGNDHWSKD-QEJZJMRPSA-N Phe-Ala-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 ULECEJGNDHWSKD-QEJZJMRPSA-N 0.000 description 2
- KAHUBGWSIQNZQQ-KKUMJFAQSA-N Phe-Asn-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 KAHUBGWSIQNZQQ-KKUMJFAQSA-N 0.000 description 2
- WMGVYPPIMZPWPN-SRVKXCTJSA-N Phe-Asp-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N WMGVYPPIMZPWPN-SRVKXCTJSA-N 0.000 description 2
- OJUMUUXGSXUZJZ-SRVKXCTJSA-N Phe-Asp-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O OJUMUUXGSXUZJZ-SRVKXCTJSA-N 0.000 description 2
- QPQDWBAJWOGAMJ-IHPCNDPISA-N Phe-Asp-Trp Chemical compound C([C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CC=CC=C1 QPQDWBAJWOGAMJ-IHPCNDPISA-N 0.000 description 2
- LWPMGKSZPKFKJD-DZKIICNBSA-N Phe-Glu-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O LWPMGKSZPKFKJD-DZKIICNBSA-N 0.000 description 2
- JQLQUPIYYJXZLJ-ZEWNOJEFSA-N Phe-Ile-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 JQLQUPIYYJXZLJ-ZEWNOJEFSA-N 0.000 description 2
- KZRQONDKKJCAOL-DKIMLUQUSA-N Phe-Leu-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KZRQONDKKJCAOL-DKIMLUQUSA-N 0.000 description 2
- DNAXXTQSTKOHFO-QEJZJMRPSA-N Phe-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 DNAXXTQSTKOHFO-QEJZJMRPSA-N 0.000 description 2
- OQTDZEJJWWAGJT-KKUMJFAQSA-N Phe-Lys-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O OQTDZEJJWWAGJT-KKUMJFAQSA-N 0.000 description 2
- IEOHQGFKHXUALJ-JYJNAYRXSA-N Phe-Met-Arg Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IEOHQGFKHXUALJ-JYJNAYRXSA-N 0.000 description 2
- UXQFHEKRGHYJRA-STQMWFEESA-N Phe-Met-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(=O)NCC(O)=O UXQFHEKRGHYJRA-STQMWFEESA-N 0.000 description 2
- OWSLLRKCHLTUND-BZSNNMDCSA-N Phe-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)N[C@@H](CC(=O)N)C(=O)O)N OWSLLRKCHLTUND-BZSNNMDCSA-N 0.000 description 2
- FGWUALWGCZJQDJ-URLPEUOOSA-N Phe-Thr-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FGWUALWGCZJQDJ-URLPEUOOSA-N 0.000 description 2
- VFDRDMOMHBJGKD-UFYCRDLUSA-N Phe-Tyr-Arg Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N VFDRDMOMHBJGKD-UFYCRDLUSA-N 0.000 description 2
- MMPBPRXOFJNCCN-ZEWNOJEFSA-N Phe-Tyr-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MMPBPRXOFJNCCN-ZEWNOJEFSA-N 0.000 description 2
- ZOGICTVLQDWPER-UFYCRDLUSA-N Phe-Tyr-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O ZOGICTVLQDWPER-UFYCRDLUSA-N 0.000 description 2
- 101100504555 Pisum sativum SBEII gene Proteins 0.000 description 2
- LQZZPNDMYNZPFT-KKUMJFAQSA-N Pro-Gln-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LQZZPNDMYNZPFT-KKUMJFAQSA-N 0.000 description 2
- UEHYFUCOGHWASA-HJGDQZAQSA-N Pro-Glu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1 UEHYFUCOGHWASA-HJGDQZAQSA-N 0.000 description 2
- AJCRQOHDLCBHFA-SRVKXCTJSA-N Pro-His-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(O)=O AJCRQOHDLCBHFA-SRVKXCTJSA-N 0.000 description 2
- IBGCFJDLCYTKPW-NAKRPEOUSA-N Pro-Ile-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H]1CCCN1 IBGCFJDLCYTKPW-NAKRPEOUSA-N 0.000 description 2
- LNOWDSPAYBWJOR-PEDHHIEDSA-N Pro-Ile-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LNOWDSPAYBWJOR-PEDHHIEDSA-N 0.000 description 2
- KLSOMAFWRISSNI-OSUNSFLBSA-N Pro-Ile-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H]1CCCN1 KLSOMAFWRISSNI-OSUNSFLBSA-N 0.000 description 2
- SXMSEHDMNIUTSP-DCAQKATOSA-N Pro-Lys-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O SXMSEHDMNIUTSP-DCAQKATOSA-N 0.000 description 2
- WOIFYRZPIORBRY-AVGNSLFASA-N Pro-Lys-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O WOIFYRZPIORBRY-AVGNSLFASA-N 0.000 description 2
- OWQXAJQZLWHPBH-FXQIFTODSA-N Pro-Ser-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O OWQXAJQZLWHPBH-FXQIFTODSA-N 0.000 description 2
- KWMZPPWYBVZIER-XGEHTFHBSA-N Pro-Ser-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KWMZPPWYBVZIER-XGEHTFHBSA-N 0.000 description 2
- FDMCIBSQRKFSTJ-RHYQMDGZSA-N Pro-Thr-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O FDMCIBSQRKFSTJ-RHYQMDGZSA-N 0.000 description 2
- NBDHWLZEMKSVHH-UVBJJODRSA-N Pro-Trp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@@H]3CCCN3 NBDHWLZEMKSVHH-UVBJJODRSA-N 0.000 description 2
- FUOGXAQMNJMBFG-WPRPVWTQSA-N Pro-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 FUOGXAQMNJMBFG-WPRPVWTQSA-N 0.000 description 2
- CZPWVGJYEJSRLH-UHFFFAOYSA-N Pyrimidine Chemical compound C1=CN=CN=C1 CZPWVGJYEJSRLH-UHFFFAOYSA-N 0.000 description 2
- 241000700159 Rattus Species 0.000 description 2
- 241000700157 Rattus norvegicus Species 0.000 description 2
- PZZJMBYSYAKYPK-UWJYBYFXSA-N Ser-Ala-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O PZZJMBYSYAKYPK-UWJYBYFXSA-N 0.000 description 2
- GXXTUIUYTWGPMV-FXQIFTODSA-N Ser-Arg-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O GXXTUIUYTWGPMV-FXQIFTODSA-N 0.000 description 2
- RZUOXAKGNHXZTB-GUBZILKMSA-N Ser-Arg-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(O)=O RZUOXAKGNHXZTB-GUBZILKMSA-N 0.000 description 2
- VAUMZJHYZQXZBQ-WHFBIAKZSA-N Ser-Asn-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O VAUMZJHYZQXZBQ-WHFBIAKZSA-N 0.000 description 2
- COAHUSQNSVFYBW-FXQIFTODSA-N Ser-Asn-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O COAHUSQNSVFYBW-FXQIFTODSA-N 0.000 description 2
- COLJZWUVZIXSSS-CIUDSAMLSA-N Ser-Cys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CO)N COLJZWUVZIXSSS-CIUDSAMLSA-N 0.000 description 2
- CDVFZMOFNJPUDD-ACZMJKKPSA-N Ser-Gln-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CDVFZMOFNJPUDD-ACZMJKKPSA-N 0.000 description 2
- QKQDTEYDEIJPNK-GUBZILKMSA-N Ser-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CO QKQDTEYDEIJPNK-GUBZILKMSA-N 0.000 description 2
- GZBKRJVCRMZAST-XKBZYTNZSA-N Ser-Glu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GZBKRJVCRMZAST-XKBZYTNZSA-N 0.000 description 2
- QGAHMVHBORDHDC-YUMQZZPRSA-N Ser-His-Gly Chemical compound OC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CN=CN1 QGAHMVHBORDHDC-YUMQZZPRSA-N 0.000 description 2
- DJACUBDEDBZKLQ-KBIXCLLPSA-N Ser-Ile-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O DJACUBDEDBZKLQ-KBIXCLLPSA-N 0.000 description 2
- UIPXCLNLUUAMJU-JBDRJPRFSA-N Ser-Ile-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O UIPXCLNLUUAMJU-JBDRJPRFSA-N 0.000 description 2
- FKZSXTKZLPPHQU-GQGQLFGLSA-N Ser-Ile-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CO)N FKZSXTKZLPPHQU-GQGQLFGLSA-N 0.000 description 2
- ZIFYDQAFEMIZII-GUBZILKMSA-N Ser-Leu-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZIFYDQAFEMIZII-GUBZILKMSA-N 0.000 description 2
- XNCUYZKGQOCOQH-YUMQZZPRSA-N Ser-Leu-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O XNCUYZKGQOCOQH-YUMQZZPRSA-N 0.000 description 2
- XXNYYSXNXCJYKX-DCAQKATOSA-N Ser-Leu-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(O)=O XXNYYSXNXCJYKX-DCAQKATOSA-N 0.000 description 2
- VZQRNAYURWAEFE-KKUMJFAQSA-N Ser-Leu-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 VZQRNAYURWAEFE-KKUMJFAQSA-N 0.000 description 2
- KZPRPBLHYMZIMH-MXAVVETBSA-N Ser-Phe-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KZPRPBLHYMZIMH-MXAVVETBSA-N 0.000 description 2
- TVPQRPNBYCRRLL-IHRRRGAJSA-N Ser-Phe-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(O)=O TVPQRPNBYCRRLL-IHRRRGAJSA-N 0.000 description 2
- ZKBKUWQVDWWSRI-BZSNNMDCSA-N Ser-Phe-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZKBKUWQVDWWSRI-BZSNNMDCSA-N 0.000 description 2
- AZWNCEBQZXELEZ-FXQIFTODSA-N Ser-Pro-Ser Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O AZWNCEBQZXELEZ-FXQIFTODSA-N 0.000 description 2
- JCLAFVNDBJMLBC-JBDRJPRFSA-N Ser-Ser-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JCLAFVNDBJMLBC-JBDRJPRFSA-N 0.000 description 2
- FLMYSKVSDVHLEW-SVSWQMSJSA-N Ser-Thr-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FLMYSKVSDVHLEW-SVSWQMSJSA-N 0.000 description 2
- HSWXBJCBYSWBPT-GUBZILKMSA-N Ser-Val-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CO)C(C)C)C(O)=O HSWXBJCBYSWBPT-GUBZILKMSA-N 0.000 description 2
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 2
- 241000193996 Streptococcus pyogenes Species 0.000 description 2
- 238000010459 TALEN Methods 0.000 description 2
- JZRWCGZRTZMZEH-UHFFFAOYSA-N Thiamine Natural products CC1=C(CCO)SC=[N+]1CC1=CN=C(C)N=C1N JZRWCGZRTZMZEH-UHFFFAOYSA-N 0.000 description 2
- DWYAUVCQDTZIJI-VZFHVOOUSA-N Thr-Ala-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O DWYAUVCQDTZIJI-VZFHVOOUSA-N 0.000 description 2
- VFEHSAJCWWHDBH-RHYQMDGZSA-N Thr-Arg-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O VFEHSAJCWWHDBH-RHYQMDGZSA-N 0.000 description 2
- SKHPKKYKDYULDH-HJGDQZAQSA-N Thr-Asn-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O SKHPKKYKDYULDH-HJGDQZAQSA-N 0.000 description 2
- YBXMGKCLOPDEKA-NUMRIWBASA-N Thr-Asp-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O YBXMGKCLOPDEKA-NUMRIWBASA-N 0.000 description 2
- ASJDFGOPDCVXTG-KATARQTJSA-N Thr-Cys-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(O)=O ASJDFGOPDCVXTG-KATARQTJSA-N 0.000 description 2
- GCXFWAZRHBRYEM-NUMRIWBASA-N Thr-Gln-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)O GCXFWAZRHBRYEM-NUMRIWBASA-N 0.000 description 2
- LAFLAXHTDVNVEL-WDCWCFNPSA-N Thr-Gln-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O LAFLAXHTDVNVEL-WDCWCFNPSA-N 0.000 description 2
- GKWNLDNXMMLRMC-GLLZPBPUSA-N Thr-Glu-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O GKWNLDNXMMLRMC-GLLZPBPUSA-N 0.000 description 2
- ONNSECRQFSTMCC-XKBZYTNZSA-N Thr-Glu-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O ONNSECRQFSTMCC-XKBZYTNZSA-N 0.000 description 2
- AMXMBCAXAZUCFA-RHYQMDGZSA-N Thr-Leu-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AMXMBCAXAZUCFA-RHYQMDGZSA-N 0.000 description 2
- VGYVVSQFSSKZRJ-OEAJRASXSA-N Thr-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@H](O)C)CC1=CC=CC=C1 VGYVVSQFSSKZRJ-OEAJRASXSA-N 0.000 description 2
- NYQIZWROIMIQSL-VEVYYDQMSA-N Thr-Pro-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O NYQIZWROIMIQSL-VEVYYDQMSA-N 0.000 description 2
- XKWABWFMQXMUMT-HJGDQZAQSA-N Thr-Pro-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O XKWABWFMQXMUMT-HJGDQZAQSA-N 0.000 description 2
- 108010043645 Transcription Activator-Like Effector Nucleases Proteins 0.000 description 2
- NMCBVGFGWSIGSB-NUTKFTJISA-N Trp-Ala-Leu Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N NMCBVGFGWSIGSB-NUTKFTJISA-N 0.000 description 2
- HJTYJQVRIQXMHM-XIRDDKMYSA-N Trp-Asp-Lys Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N HJTYJQVRIQXMHM-XIRDDKMYSA-N 0.000 description 2
- NWQCKAPDGQMZQN-IHPCNDPISA-N Trp-Lys-Leu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O NWQCKAPDGQMZQN-IHPCNDPISA-N 0.000 description 2
- UHXOYRWHIQZAKV-SZMVWBNQSA-N Trp-Pro-Arg Chemical compound O=C([C@H](CC=1C2=CC=CC=C2NC=1)N)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O UHXOYRWHIQZAKV-SZMVWBNQSA-N 0.000 description 2
- KBKTUNYBNJWFRL-UBHSHLNASA-N Trp-Ser-Asn Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O)=CNC2=C1 KBKTUNYBNJWFRL-UBHSHLNASA-N 0.000 description 2
- ZWZOCUWOXSDYFZ-CQDKDKBSSA-N Tyr-Ala-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 ZWZOCUWOXSDYFZ-CQDKDKBSSA-N 0.000 description 2
- MBFJIHUHHCJBSN-AVGNSLFASA-N Tyr-Asn-Gln Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MBFJIHUHHCJBSN-AVGNSLFASA-N 0.000 description 2
- VTFWAGGJDRSQFG-MELADBBJSA-N Tyr-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)C(=O)O VTFWAGGJDRSQFG-MELADBBJSA-N 0.000 description 2
- GAYLGYUVTDMLKC-UWJYBYFXSA-N Tyr-Asp-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 GAYLGYUVTDMLKC-UWJYBYFXSA-N 0.000 description 2
- BEIGSKUPTIFYRZ-SRVKXCTJSA-N Tyr-Asp-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O BEIGSKUPTIFYRZ-SRVKXCTJSA-N 0.000 description 2
- VFJIWSJKZJTQII-SRVKXCTJSA-N Tyr-Asp-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O VFJIWSJKZJTQII-SRVKXCTJSA-N 0.000 description 2
- WEFIPBYPXZYPHD-HJPIBITLSA-N Tyr-Cys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC1=CC=C(C=C1)O)N WEFIPBYPXZYPHD-HJPIBITLSA-N 0.000 description 2
- YWXMGBUGMLJMIP-IHPCNDPISA-N Tyr-Cys-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC3=CC=C(C=C3)O)N YWXMGBUGMLJMIP-IHPCNDPISA-N 0.000 description 2
- ARPONUQDNWLXOZ-KKUMJFAQSA-N Tyr-Gln-Arg Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O ARPONUQDNWLXOZ-KKUMJFAQSA-N 0.000 description 2
- QUILOGWWLXMSAT-IHRRRGAJSA-N Tyr-Gln-Gln Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O QUILOGWWLXMSAT-IHRRRGAJSA-N 0.000 description 2
- WZQZUVWEPMGIMM-JYJNAYRXSA-N Tyr-Gln-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O WZQZUVWEPMGIMM-JYJNAYRXSA-N 0.000 description 2
- LOOCQRRBKZTPKO-AVGNSLFASA-N Tyr-Glu-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 LOOCQRRBKZTPKO-AVGNSLFASA-N 0.000 description 2
- SLCSPPCQWUHPPO-JYJNAYRXSA-N Tyr-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 SLCSPPCQWUHPPO-JYJNAYRXSA-N 0.000 description 2
- GIOBXJSONRQHKQ-RYUDHWBXSA-N Tyr-Gly-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O GIOBXJSONRQHKQ-RYUDHWBXSA-N 0.000 description 2
- FNWGDMZVYBVAGJ-XEGUGMAKSA-N Tyr-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC1=CC=C(C=C1)O)N FNWGDMZVYBVAGJ-XEGUGMAKSA-N 0.000 description 2
- WPXKRJVHBXYLDT-JUKXBJQTSA-N Tyr-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC2=CC=C(C=C2)O)N WPXKRJVHBXYLDT-JUKXBJQTSA-N 0.000 description 2
- NXRGXTBPMOGFID-CFMVVWHZSA-N Tyr-Ile-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O NXRGXTBPMOGFID-CFMVVWHZSA-N 0.000 description 2
- KGSDLCMCDFETHU-YESZJQIVSA-N Tyr-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)C(=O)O KGSDLCMCDFETHU-YESZJQIVSA-N 0.000 description 2
- FWOVTJKVUCGVND-UFYCRDLUSA-N Tyr-Met-Phe Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N FWOVTJKVUCGVND-UFYCRDLUSA-N 0.000 description 2
- FDKDGFGTHGJKNV-FHWLQOOXSA-N Tyr-Phe-Gln Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N FDKDGFGTHGJKNV-FHWLQOOXSA-N 0.000 description 2
- PSALWJCUIAQKFW-ACRUOGEOSA-N Tyr-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N PSALWJCUIAQKFW-ACRUOGEOSA-N 0.000 description 2
- MNWINJDPGBNOED-ULQDDVLXSA-N Tyr-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC1=CC=C(O)C=C1 MNWINJDPGBNOED-ULQDDVLXSA-N 0.000 description 2
- MQGGXGKQSVEQHR-KKUMJFAQSA-N Tyr-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 MQGGXGKQSVEQHR-KKUMJFAQSA-N 0.000 description 2
- UMSZZGTXGKHTFJ-SRVKXCTJSA-N Tyr-Ser-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 UMSZZGTXGKHTFJ-SRVKXCTJSA-N 0.000 description 2
- MDXLPNRXCFOBTL-BZSNNMDCSA-N Tyr-Ser-Tyr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MDXLPNRXCFOBTL-BZSNNMDCSA-N 0.000 description 2
- OBKOPLHSRDATFO-XHSDSOJGSA-N Tyr-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N OBKOPLHSRDATFO-XHSDSOJGSA-N 0.000 description 2
- DRTQHJPVMGBUCF-XVFCMESISA-N Uridine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-XVFCMESISA-N 0.000 description 2
- UDNYEPLJTRDMEJ-RCOVLWMOSA-N Val-Asn-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)NCC(=O)O)N UDNYEPLJTRDMEJ-RCOVLWMOSA-N 0.000 description 2
- GBESYURLQOYWLU-LAEOZQHASA-N Val-Glu-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N GBESYURLQOYWLU-LAEOZQHASA-N 0.000 description 2
- PIFJAFRUVWZRKR-QMMMGPOBSA-N Val-Gly-Gly Chemical compound CC(C)[C@H]([NH3+])C(=O)NCC(=O)NCC([O-])=O PIFJAFRUVWZRKR-QMMMGPOBSA-N 0.000 description 2
- CPGJELLYDQEDRK-NAKRPEOUSA-N Val-Ile-Ala Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](N)C(C)C)C(=O)N[C@@H](C)C(O)=O CPGJELLYDQEDRK-NAKRPEOUSA-N 0.000 description 2
- APQIVBCUIUDSMB-OSUNSFLBSA-N Val-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C(C)C)N APQIVBCUIUDSMB-OSUNSFLBSA-N 0.000 description 2
- RQOMPQGUGBILAG-AVGNSLFASA-N Val-Met-Leu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O RQOMPQGUGBILAG-AVGNSLFASA-N 0.000 description 2
- SSYBNWFXCFNRFN-GUBZILKMSA-N Val-Pro-Ser Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O SSYBNWFXCFNRFN-GUBZILKMSA-N 0.000 description 2
- UQMPYVLTQCGRSK-IFFSRLJSSA-N Val-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N)O UQMPYVLTQCGRSK-IFFSRLJSSA-N 0.000 description 2
- UVHFONIHVHLDDQ-IFFSRLJSSA-N Val-Thr-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O UVHFONIHVHLDDQ-IFFSRLJSSA-N 0.000 description 2
- VTIAEOKFUJJBTC-YDHLFZDLSA-N Val-Tyr-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N VTIAEOKFUJJBTC-YDHLFZDLSA-N 0.000 description 2
- NLNCNKIVJPEFBC-DLOVCJGASA-N Val-Val-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCC(O)=O NLNCNKIVJPEFBC-DLOVCJGASA-N 0.000 description 2
- 235000005824 Zea mays ssp. parviglumis Nutrition 0.000 description 2
- HCHKCACWOHOZIP-UHFFFAOYSA-N Zinc Chemical compound [Zn] HCHKCACWOHOZIP-UHFFFAOYSA-N 0.000 description 2
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 2
- 108010044940 alanylglutamine Proteins 0.000 description 2
- 108010050025 alpha-glutamyltryptophan Proteins 0.000 description 2
- 108010001271 arginyl-glutamyl-arginine Proteins 0.000 description 2
- 230000000975 bioactive effect Effects 0.000 description 2
- 230000003115 biocidal effect Effects 0.000 description 2
- 230000007321 biological mechanism Effects 0.000 description 2
- 239000001506 calcium phosphate Substances 0.000 description 2
- 229910000389 calcium phosphate Inorganic materials 0.000 description 2
- 235000011010 calcium phosphates Nutrition 0.000 description 2
- 210000003763 chloroplast Anatomy 0.000 description 2
- 239000003184 complementary RNA Substances 0.000 description 2
- 235000005822 corn Nutrition 0.000 description 2
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 description 2
- 150000001945 cysteines Chemical class 0.000 description 2
- 229940104302 cytosine Drugs 0.000 description 2
- 238000001514 detection method Methods 0.000 description 2
- 239000013024 dilution buffer Substances 0.000 description 2
- 238000004520 electroporation Methods 0.000 description 2
- 241001233957 eudicotyledons Species 0.000 description 2
- 239000013604 expression vector Substances 0.000 description 2
- 230000004720 fertilization Effects 0.000 description 2
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 2
- 108010066198 glycyl-leucyl-phenylalanine Proteins 0.000 description 2
- 230000012010 growth Effects 0.000 description 2
- 239000001963 growth medium Substances 0.000 description 2
- 230000002363 herbicidal effect Effects 0.000 description 2
- 239000004009 herbicide Substances 0.000 description 2
- 108010092114 histidylphenylalanine Proteins 0.000 description 2
- 102000048646 human APOBEC3A Human genes 0.000 description 2
- 238000001764 infiltration Methods 0.000 description 2
- 230000008595 infiltration Effects 0.000 description 2
- 230000010354 integration Effects 0.000 description 2
- 230000002427 irreversible effect Effects 0.000 description 2
- 238000011005 laboratory method Methods 0.000 description 2
- 108010030617 leucyl-phenylalanyl-valine Proteins 0.000 description 2
- 239000002502 liposome Substances 0.000 description 2
- 108010044348 lysyl-glutamyl-aspartic acid Proteins 0.000 description 2
- 238000004519 manufacturing process Methods 0.000 description 2
- 239000011159 matrix material Substances 0.000 description 2
- 238000005259 measurement Methods 0.000 description 2
- 230000021121 meiosis Effects 0.000 description 2
- 238000000520 microinjection Methods 0.000 description 2
- 238000010369 molecular cloning Methods 0.000 description 2
- 238000002703 mutagenesis Methods 0.000 description 2
- 231100000350 mutagenesis Toxicity 0.000 description 2
- 239000002105 nanoparticle Substances 0.000 description 2
- 230000030648 nucleus localization Effects 0.000 description 2
- 238000004161 plant tissue culture Methods 0.000 description 2
- 230000008488 polyadenylation Effects 0.000 description 2
- 229920001223 polyethylene glycol Polymers 0.000 description 2
- 108010029020 prolylglycine Proteins 0.000 description 2
- 108010090894 prolylleucine Proteins 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- 238000012552 review Methods 0.000 description 2
- 150000003839 salts Chemical class 0.000 description 2
- HFHDHCJBZVLPGP-UHFFFAOYSA-N schardinger α-dextrin Chemical compound O1C(C(C2O)O)C(CO)OC2OC(C(C2O)O)C(CO)OC2OC(C(C2O)O)C(CO)OC2OC(C(O)C2O)C(CO)OC2OC(C(C2O)O)C(CO)OC2OC2C(O)C(O)C1OC2CO HFHDHCJBZVLPGP-UHFFFAOYSA-N 0.000 description 2
- 230000005783 single-strand break Effects 0.000 description 2
- 229910001415 sodium ion Inorganic materials 0.000 description 2
- 239000000243 solution Substances 0.000 description 2
- 210000001082 somatic cell Anatomy 0.000 description 2
- 238000000527 sonication Methods 0.000 description 2
- 230000002269 spontaneous effect Effects 0.000 description 2
- 235000019157 thiamine Nutrition 0.000 description 2
- 229960003495 thiamine Drugs 0.000 description 2
- 239000011721 thiamine Substances 0.000 description 2
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 2
- 238000011426 transformation method Methods 0.000 description 2
- 230000007704 transition Effects 0.000 description 2
- QORWJWZARLRLPR-UHFFFAOYSA-H tricalcium bis(phosphate) Chemical compound [Ca+2].[Ca+2].[Ca+2].[O-]P([O-])([O-])=O.[O-]P([O-])([O-])=O QORWJWZARLRLPR-UHFFFAOYSA-H 0.000 description 2
- 239000011701 zinc Substances 0.000 description 2
- 229910052725 zinc Inorganic materials 0.000 description 2
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-Amino-3-hydroxypropanoic acid Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 1
- BRPMXFSTKXXNHF-IUCAKERBSA-N (2s)-1-[2-[[(2s)-pyrrolidine-2-carbonyl]amino]acetyl]pyrrolidine-2-carboxylic acid Chemical compound OC(=O)[C@@H]1CCCN1C(=O)CNC(=O)[C@H]1NCCC1 BRPMXFSTKXXNHF-IUCAKERBSA-N 0.000 description 1
- FQVLRGLGWNWPSS-BXBUPLCLSA-N (4r,7s,10s,13s,16r)-16-acetamido-13-(1h-imidazol-5-ylmethyl)-10-methyl-6,9,12,15-tetraoxo-7-propan-2-yl-1,2-dithia-5,8,11,14-tetrazacycloheptadecane-4-carboxamide Chemical compound N1C(=O)[C@@H](NC(C)=O)CSSC[C@@H](C(N)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C)NC(=O)[C@@H]1CC1=CN=CN1 FQVLRGLGWNWPSS-BXBUPLCLSA-N 0.000 description 1
- 101150084750 1 gene Proteins 0.000 description 1
- VGONTNSXDCQUGY-RRKCRQDMSA-N 2'-deoxyinosine Chemical group C1[C@H](O)[C@@H](CO)O[C@H]1N1C(N=CNC2=O)=C2N=C1 VGONTNSXDCQUGY-RRKCRQDMSA-N 0.000 description 1
- MXHRCPNRJAMMIM-SHYZEUOFSA-N 2'-deoxyuridine Chemical compound C1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C=C1 MXHRCPNRJAMMIM-SHYZEUOFSA-N 0.000 description 1
- CKTSBUTUHBMZGZ-SHYZEUOFSA-N 2'-deoxycytidine Chemical compound O=C1N=C(N)C=CN1[C@@H]1O[C@H](CO)[C@@H](O)C1 CKTSBUTUHBMZGZ-SHYZEUOFSA-N 0.000 description 1
- 108010004483 APOBEC-3G Deaminase Proteins 0.000 description 1
- 241000604450 Acidaminococcus fermentans Species 0.000 description 1
- HHGYNJRJIINWAK-FXQIFTODSA-N Ala-Ala-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N HHGYNJRJIINWAK-FXQIFTODSA-N 0.000 description 1
- AAQGRPOPTAUUBM-ZLUOBGJFSA-N Ala-Ala-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O AAQGRPOPTAUUBM-ZLUOBGJFSA-N 0.000 description 1
- RLMISHABBKUNFO-WHFBIAKZSA-N Ala-Ala-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O RLMISHABBKUNFO-WHFBIAKZSA-N 0.000 description 1
- DVWVZSJAYIJZFI-FXQIFTODSA-N Ala-Arg-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O DVWVZSJAYIJZFI-FXQIFTODSA-N 0.000 description 1
- SVBXIUDNTRTKHE-CIUDSAMLSA-N Ala-Arg-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O SVBXIUDNTRTKHE-CIUDSAMLSA-N 0.000 description 1
- JBGSZRYCXBPWGX-BQBZGAKWSA-N Ala-Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CCCN=C(N)N JBGSZRYCXBPWGX-BQBZGAKWSA-N 0.000 description 1
- XEXJJJRVTFGWIC-FXQIFTODSA-N Ala-Asn-Arg Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N XEXJJJRVTFGWIC-FXQIFTODSA-N 0.000 description 1
- CVGNCMIULZNYES-WHFBIAKZSA-N Ala-Asn-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O CVGNCMIULZNYES-WHFBIAKZSA-N 0.000 description 1
- YSMPVONNIWLJML-FXQIFTODSA-N Ala-Asp-Pro Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(O)=O YSMPVONNIWLJML-FXQIFTODSA-N 0.000 description 1
- BUDNAJYVCUHLSV-ZLUOBGJFSA-N Ala-Asp-Ser Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O BUDNAJYVCUHLSV-ZLUOBGJFSA-N 0.000 description 1
- MVBWLRJESQOQTM-ACZMJKKPSA-N Ala-Gln-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O MVBWLRJESQOQTM-ACZMJKKPSA-N 0.000 description 1
- NJPMYXWVWQWCSR-ACZMJKKPSA-N Ala-Glu-Asn Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O NJPMYXWVWQWCSR-ACZMJKKPSA-N 0.000 description 1
- FBHOPGDGELNWRH-DRZSPHRISA-N Ala-Glu-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O FBHOPGDGELNWRH-DRZSPHRISA-N 0.000 description 1
- BEMGNWZECGIJOI-WDSKDSINSA-N Ala-Gly-Glu Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O BEMGNWZECGIJOI-WDSKDSINSA-N 0.000 description 1
- BLIMFWGRQKRCGT-YUMQZZPRSA-N Ala-Gly-Lys Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN BLIMFWGRQKRCGT-YUMQZZPRSA-N 0.000 description 1
- QHASENCZLDHBGX-ONGXEEELSA-N Ala-Gly-Phe Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QHASENCZLDHBGX-ONGXEEELSA-N 0.000 description 1
- NBTGEURICRTMGL-WHFBIAKZSA-N Ala-Gly-Ser Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O NBTGEURICRTMGL-WHFBIAKZSA-N 0.000 description 1
- OBVSBEYOMDWLRJ-BFHQHQDPSA-N Ala-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N OBVSBEYOMDWLRJ-BFHQHQDPSA-N 0.000 description 1
- JEPNLGMEZMCFEX-QSFUFRPTSA-N Ala-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](C)N JEPNLGMEZMCFEX-QSFUFRPTSA-N 0.000 description 1
- PNALXAODQKTNLV-JBDRJPRFSA-N Ala-Ile-Ala Chemical compound C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O PNALXAODQKTNLV-JBDRJPRFSA-N 0.000 description 1
- HHRAXZAYZFFRAM-CIUDSAMLSA-N Ala-Leu-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O HHRAXZAYZFFRAM-CIUDSAMLSA-N 0.000 description 1
- OYJCVIGKMXUVKB-GARJFASQSA-N Ala-Leu-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N OYJCVIGKMXUVKB-GARJFASQSA-N 0.000 description 1
- SOBIAADAMRHGKH-CIUDSAMLSA-N Ala-Leu-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O SOBIAADAMRHGKH-CIUDSAMLSA-N 0.000 description 1
- RGQCNKIDEQJEBT-CQDKDKBSSA-N Ala-Leu-Tyr Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 RGQCNKIDEQJEBT-CQDKDKBSSA-N 0.000 description 1
- AJBVYEYZVYPFCF-CIUDSAMLSA-N Ala-Lys-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O AJBVYEYZVYPFCF-CIUDSAMLSA-N 0.000 description 1
- SUHLZMHFRALVSY-YUMQZZPRSA-N Ala-Lys-Gly Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)NCC(O)=O SUHLZMHFRALVSY-YUMQZZPRSA-N 0.000 description 1
- XHNLCGXYBXNRIS-BJDJZHNGSA-N Ala-Lys-Ile Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XHNLCGXYBXNRIS-BJDJZHNGSA-N 0.000 description 1
- PMQXMXAASGFUDX-SRVKXCTJSA-N Ala-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CCCCN PMQXMXAASGFUDX-SRVKXCTJSA-N 0.000 description 1
- VCSABYLVNWQYQE-SRVKXCTJSA-N Ala-Lys-Lys Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCCN)C(O)=O VCSABYLVNWQYQE-SRVKXCTJSA-N 0.000 description 1
- FUKFQILQFQKHLE-DCAQKATOSA-N Ala-Lys-Met Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(O)=O FUKFQILQFQKHLE-DCAQKATOSA-N 0.000 description 1
- OINVDEKBKBCPLX-JXUBOQSCSA-N Ala-Lys-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OINVDEKBKBCPLX-JXUBOQSCSA-N 0.000 description 1
- JAQNUEWEJWBVAY-WBAXXEDZSA-N Ala-Phe-Phe Chemical compound C([C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 JAQNUEWEJWBVAY-WBAXXEDZSA-N 0.000 description 1
- ADSGHMXEAZJJNF-DCAQKATOSA-N Ala-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](C)N ADSGHMXEAZJJNF-DCAQKATOSA-N 0.000 description 1
- NHWYNIZWLJYZAG-XVYDVKMFSA-N Ala-Ser-His Chemical compound C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N NHWYNIZWLJYZAG-XVYDVKMFSA-N 0.000 description 1
- SAHQGRZIQVEJPF-JXUBOQSCSA-N Ala-Thr-Lys Chemical compound C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CCCCN SAHQGRZIQVEJPF-JXUBOQSCSA-N 0.000 description 1
- CREYEAPXISDKSB-FQPOAREZSA-N Ala-Thr-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CREYEAPXISDKSB-FQPOAREZSA-N 0.000 description 1
- IETUUAHKCHOQHP-KZVJFYERSA-N Ala-Thr-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@H](C)N)[C@@H](C)O)C(O)=O IETUUAHKCHOQHP-KZVJFYERSA-N 0.000 description 1
- FSXDWQGEWZQBPJ-HERUPUMHSA-N Ala-Trp-Asp Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC(=O)O)C(=O)O)N FSXDWQGEWZQBPJ-HERUPUMHSA-N 0.000 description 1
- YXXPVUOMPSZURS-ZLIFDBKOSA-N Ala-Trp-Leu Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@H](C)N)=CNC2=C1 YXXPVUOMPSZURS-ZLIFDBKOSA-N 0.000 description 1
- GCTANJIJJROSLH-GVARAGBVSA-N Ala-Tyr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](C)N GCTANJIJJROSLH-GVARAGBVSA-N 0.000 description 1
- XAXMJQUMRJAFCH-CQDKDKBSSA-N Ala-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=C(O)C=C1 XAXMJQUMRJAFCH-CQDKDKBSSA-N 0.000 description 1
- LYILPUNCKACNGF-NAKRPEOUSA-N Ala-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C)N LYILPUNCKACNGF-NAKRPEOUSA-N 0.000 description 1
- 102100034035 Alcohol dehydrogenase 1A Human genes 0.000 description 1
- 108091093088 Amplicon Proteins 0.000 description 1
- KWKQGHSSNHPGOW-BQBZGAKWSA-N Arg-Ala-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(=O)NCC(O)=O KWKQGHSSNHPGOW-BQBZGAKWSA-N 0.000 description 1
- MUXONAMCEUBVGA-DCAQKATOSA-N Arg-Arg-Gln Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(N)=O)C(O)=O MUXONAMCEUBVGA-DCAQKATOSA-N 0.000 description 1
- DCGLNNVKIZXQOJ-FXQIFTODSA-N Arg-Asn-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N DCGLNNVKIZXQOJ-FXQIFTODSA-N 0.000 description 1
- NONSEUUPKITYQT-BQBZGAKWSA-N Arg-Asn-Gly Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)NCC(=O)O)N)CN=C(N)N NONSEUUPKITYQT-BQBZGAKWSA-N 0.000 description 1
- BVBKBQRPOJFCQM-DCAQKATOSA-N Arg-Asn-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O BVBKBQRPOJFCQM-DCAQKATOSA-N 0.000 description 1
- GHNDBBVSWOWYII-LPEHRKFASA-N Arg-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O GHNDBBVSWOWYII-LPEHRKFASA-N 0.000 description 1
- RRGPUNYIPJXJBU-GUBZILKMSA-N Arg-Asp-Met Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O RRGPUNYIPJXJBU-GUBZILKMSA-N 0.000 description 1
- SNBHMYQRNCJSOJ-CIUDSAMLSA-N Arg-Gln-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O SNBHMYQRNCJSOJ-CIUDSAMLSA-N 0.000 description 1
- JUWQNWXEGDYCIE-YUMQZZPRSA-N Arg-Gln-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O JUWQNWXEGDYCIE-YUMQZZPRSA-N 0.000 description 1
- JCAISGGAOQXEHJ-ZPFDUUQYSA-N Arg-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N JCAISGGAOQXEHJ-ZPFDUUQYSA-N 0.000 description 1
- VNFWDYWTSHFRRG-SRVKXCTJSA-N Arg-Gln-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O VNFWDYWTSHFRRG-SRVKXCTJSA-N 0.000 description 1
- OBFTYSPXDRROQO-SRVKXCTJSA-N Arg-Gln-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCCN=C(N)N OBFTYSPXDRROQO-SRVKXCTJSA-N 0.000 description 1
- LLZXKVAAEWBUPB-KKUMJFAQSA-N Arg-Gln-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LLZXKVAAEWBUPB-KKUMJFAQSA-N 0.000 description 1
- MZRBYBIQTIKERR-GUBZILKMSA-N Arg-Glu-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MZRBYBIQTIKERR-GUBZILKMSA-N 0.000 description 1
- OHYQKYUTLIPFOX-ZPFDUUQYSA-N Arg-Glu-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O OHYQKYUTLIPFOX-ZPFDUUQYSA-N 0.000 description 1
- OGUPCHKBOKJFMA-SRVKXCTJSA-N Arg-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N OGUPCHKBOKJFMA-SRVKXCTJSA-N 0.000 description 1
- DJAIOAKQIOGULM-DCAQKATOSA-N Arg-Glu-Met Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O DJAIOAKQIOGULM-DCAQKATOSA-N 0.000 description 1
- QKSAZKCRVQYYGS-UWVGGRQHSA-N Arg-Gly-His Chemical compound N[C@@H](CCCN=C(N)N)C(=O)NCC(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O QKSAZKCRVQYYGS-UWVGGRQHSA-N 0.000 description 1
- NVCIXQYNWYTLDO-IHRRRGAJSA-N Arg-His-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCCN=C(N)N)N NVCIXQYNWYTLDO-IHRRRGAJSA-N 0.000 description 1
- RKQRHMKFNBYOTN-IHRRRGAJSA-N Arg-His-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N RKQRHMKFNBYOTN-IHRRRGAJSA-N 0.000 description 1
- OFIYLHVAAJYRBC-HJWJTTGWSA-N Arg-Ile-Phe Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](N)CCCNC(N)=N)C(=O)N[C@@H](Cc1ccccc1)C(O)=O OFIYLHVAAJYRBC-HJWJTTGWSA-N 0.000 description 1
- UHFUZWSZQKMDSX-DCAQKATOSA-N Arg-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UHFUZWSZQKMDSX-DCAQKATOSA-N 0.000 description 1
- YBZMTKUDWXZLIX-UWVGGRQHSA-N Arg-Leu-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YBZMTKUDWXZLIX-UWVGGRQHSA-N 0.000 description 1
- UZGFHWIJWPUPOH-IHRRRGAJSA-N Arg-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UZGFHWIJWPUPOH-IHRRRGAJSA-N 0.000 description 1
- PZBSKYJGKNNYNK-ULQDDVLXSA-N Arg-Leu-Tyr Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CCCN=C(N)N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(O)=O PZBSKYJGKNNYNK-ULQDDVLXSA-N 0.000 description 1
- YVTHEZNOKSAWRW-DCAQKATOSA-N Arg-Lys-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O YVTHEZNOKSAWRW-DCAQKATOSA-N 0.000 description 1
- SSZGOKWBHLOCHK-DCAQKATOSA-N Arg-Lys-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N SSZGOKWBHLOCHK-DCAQKATOSA-N 0.000 description 1
- MJINRRBEMOLJAK-DCAQKATOSA-N Arg-Lys-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N MJINRRBEMOLJAK-DCAQKATOSA-N 0.000 description 1
- CLICCYPMVFGUOF-IHRRRGAJSA-N Arg-Lys-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O CLICCYPMVFGUOF-IHRRRGAJSA-N 0.000 description 1
- XUGATJVGQUGQKY-ULQDDVLXSA-N Arg-Lys-Phe Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XUGATJVGQUGQKY-ULQDDVLXSA-N 0.000 description 1
- UBEKKPOFLCVTEZ-UHFFFAOYSA-N Arg-Lys-Val-Ser Chemical compound OCC(C(O)=O)NC(=O)C(C(C)C)NC(=O)C(CCCCN)NC(=O)C(N)CCCN=C(N)N UBEKKPOFLCVTEZ-UHFFFAOYSA-N 0.000 description 1
- JOADBFCFJGNIKF-GUBZILKMSA-N Arg-Met-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O JOADBFCFJGNIKF-GUBZILKMSA-N 0.000 description 1
- NYDIVDKTULRINZ-AVGNSLFASA-N Arg-Met-Lys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N NYDIVDKTULRINZ-AVGNSLFASA-N 0.000 description 1
- KSUALAGYYLQSHJ-RCWTZXSCSA-N Arg-Met-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KSUALAGYYLQSHJ-RCWTZXSCSA-N 0.000 description 1
- VEAIMHJZTIDCIH-KKUMJFAQSA-N Arg-Phe-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O VEAIMHJZTIDCIH-KKUMJFAQSA-N 0.000 description 1
- NIELFHOLFTUZME-HJWJTTGWSA-N Arg-Phe-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NIELFHOLFTUZME-HJWJTTGWSA-N 0.000 description 1
- BSYKSCBTTQKOJG-GUBZILKMSA-N Arg-Pro-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O BSYKSCBTTQKOJG-GUBZILKMSA-N 0.000 description 1
- NGYHSXDNNOFHNE-AVGNSLFASA-N Arg-Pro-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O NGYHSXDNNOFHNE-AVGNSLFASA-N 0.000 description 1
- ADPACBMPYWJJCE-FXQIFTODSA-N Arg-Ser-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O ADPACBMPYWJJCE-FXQIFTODSA-N 0.000 description 1
- FRBAHXABMQXSJQ-FXQIFTODSA-N Arg-Ser-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O FRBAHXABMQXSJQ-FXQIFTODSA-N 0.000 description 1
- IZSMEUDYADKZTJ-KJEVXHAQSA-N Arg-Tyr-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IZSMEUDYADKZTJ-KJEVXHAQSA-N 0.000 description 1
- XMZZGVGKGXRIGJ-JYJNAYRXSA-N Arg-Tyr-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O XMZZGVGKGXRIGJ-JYJNAYRXSA-N 0.000 description 1
- VYZBPPBKFCHCIS-WPRPVWTQSA-N Arg-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCN=C(N)N VYZBPPBKFCHCIS-WPRPVWTQSA-N 0.000 description 1
- RZVVKNIACROXRM-ZLUOBGJFSA-N Asn-Ala-Asp Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N RZVVKNIACROXRM-ZLUOBGJFSA-N 0.000 description 1
- LEFKSBYHUGUWLP-ACZMJKKPSA-N Asn-Ala-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O LEFKSBYHUGUWLP-ACZMJKKPSA-N 0.000 description 1
- NUHQMYUWLUSRJX-BIIVOSGPSA-N Asn-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N NUHQMYUWLUSRJX-BIIVOSGPSA-N 0.000 description 1
- XWGJDUSDTRPQRK-ZLUOBGJFSA-N Asn-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC(N)=O XWGJDUSDTRPQRK-ZLUOBGJFSA-N 0.000 description 1
- XHFXZQHTLJVZBN-FXQIFTODSA-N Asn-Arg-Asn Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N)CN=C(N)N XHFXZQHTLJVZBN-FXQIFTODSA-N 0.000 description 1
- DQTIWTULBGLJBL-DCAQKATOSA-N Asn-Arg-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)N)N DQTIWTULBGLJBL-DCAQKATOSA-N 0.000 description 1
- PCKRJVZAQZWNKM-WHFBIAKZSA-N Asn-Asn-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O PCKRJVZAQZWNKM-WHFBIAKZSA-N 0.000 description 1
- AYZAWXAPBAYCHO-CIUDSAMLSA-N Asn-Asn-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)N)N AYZAWXAPBAYCHO-CIUDSAMLSA-N 0.000 description 1
- KXEGPPNPXOKKHK-ZLUOBGJFSA-N Asn-Asp-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O KXEGPPNPXOKKHK-ZLUOBGJFSA-N 0.000 description 1
- ZPMNECSEJXXNBE-CIUDSAMLSA-N Asn-Cys-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(O)=O ZPMNECSEJXXNBE-CIUDSAMLSA-N 0.000 description 1
- KUYKVGODHGHFDI-ACZMJKKPSA-N Asn-Gln-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O KUYKVGODHGHFDI-ACZMJKKPSA-N 0.000 description 1
- OKZOABJQOMAYEC-NUMRIWBASA-N Asn-Gln-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OKZOABJQOMAYEC-NUMRIWBASA-N 0.000 description 1
- ULRPXVNMIIYDDJ-ACZMJKKPSA-N Asn-Glu-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)N)N ULRPXVNMIIYDDJ-ACZMJKKPSA-N 0.000 description 1
- QYXNFROWLZPWPC-FXQIFTODSA-N Asn-Glu-Gln Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O QYXNFROWLZPWPC-FXQIFTODSA-N 0.000 description 1
- BZMWJLLUAKSIMH-FXQIFTODSA-N Asn-Glu-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O BZMWJLLUAKSIMH-FXQIFTODSA-N 0.000 description 1
- JREOBWLIZLXRIS-GUBZILKMSA-N Asn-Glu-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JREOBWLIZLXRIS-GUBZILKMSA-N 0.000 description 1
- ASCGFDYEKSRNPL-CIUDSAMLSA-N Asn-Glu-Met Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O ASCGFDYEKSRNPL-CIUDSAMLSA-N 0.000 description 1
- COUZKSSMBFADSB-AVGNSLFASA-N Asn-Glu-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)N)N COUZKSSMBFADSB-AVGNSLFASA-N 0.000 description 1
- GFFRWIJAFFMQGM-NUMRIWBASA-N Asn-Glu-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GFFRWIJAFFMQGM-NUMRIWBASA-N 0.000 description 1
- DMLSCRJBWUEALP-LAEOZQHASA-N Asn-Glu-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O DMLSCRJBWUEALP-LAEOZQHASA-N 0.000 description 1
- HYQYLOSCICEYTR-YUMQZZPRSA-N Asn-Gly-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O HYQYLOSCICEYTR-YUMQZZPRSA-N 0.000 description 1
- QEQVUHQQYDZUEN-GUBZILKMSA-N Asn-His-Glu Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N QEQVUHQQYDZUEN-GUBZILKMSA-N 0.000 description 1
- IKLAUGBIDCDFOY-SRVKXCTJSA-N Asn-His-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O IKLAUGBIDCDFOY-SRVKXCTJSA-N 0.000 description 1
- PNHQRQTVBRDIEF-CIUDSAMLSA-N Asn-Leu-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(=O)N)N PNHQRQTVBRDIEF-CIUDSAMLSA-N 0.000 description 1
- HFPXZWPUVFVNLL-GUBZILKMSA-N Asn-Leu-Gln Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O HFPXZWPUVFVNLL-GUBZILKMSA-N 0.000 description 1
- JLNFZLNDHONLND-GARJFASQSA-N Asn-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N JLNFZLNDHONLND-GARJFASQSA-N 0.000 description 1
- FHETWELNCBMRMG-HJGDQZAQSA-N Asn-Leu-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FHETWELNCBMRMG-HJGDQZAQSA-N 0.000 description 1
- JWKDQOORUCYUIW-ZPFDUUQYSA-N Asn-Lys-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JWKDQOORUCYUIW-ZPFDUUQYSA-N 0.000 description 1
- ZYPWIUFLYMQZBS-SRVKXCTJSA-N Asn-Lys-Lys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N ZYPWIUFLYMQZBS-SRVKXCTJSA-N 0.000 description 1
- NTWOPSIUJBMNRI-KKUMJFAQSA-N Asn-Lys-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NTWOPSIUJBMNRI-KKUMJFAQSA-N 0.000 description 1
- CDGHMJJJHYKMPA-DLOVCJGASA-N Asn-Phe-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CC(=O)N)N CDGHMJJJHYKMPA-DLOVCJGASA-N 0.000 description 1
- RVHGJNGNKGDCPX-KKUMJFAQSA-N Asn-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N RVHGJNGNKGDCPX-KKUMJFAQSA-N 0.000 description 1
- BKFXFUPYETWGGA-XVSYOHENSA-N Asn-Phe-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BKFXFUPYETWGGA-XVSYOHENSA-N 0.000 description 1
- RBOBTTLFPRSXKZ-BZSNNMDCSA-N Asn-Phe-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O RBOBTTLFPRSXKZ-BZSNNMDCSA-N 0.000 description 1
- NJSNXIOKBHPFMB-GMOBBJLQSA-N Asn-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC(=O)N)N NJSNXIOKBHPFMB-GMOBBJLQSA-N 0.000 description 1
- VHQSGALUSWIYOD-QXEWZRGKSA-N Asn-Pro-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O VHQSGALUSWIYOD-QXEWZRGKSA-N 0.000 description 1
- XTMZYFMTYJNABC-ZLUOBGJFSA-N Asn-Ser-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)N)N XTMZYFMTYJNABC-ZLUOBGJFSA-N 0.000 description 1
- NPZJLGMWMDNQDD-GHCJXIJMSA-N Asn-Ser-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NPZJLGMWMDNQDD-GHCJXIJMSA-N 0.000 description 1
- UGXYFDQFLVCDFC-CIUDSAMLSA-N Asn-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O UGXYFDQFLVCDFC-CIUDSAMLSA-N 0.000 description 1
- VLDRQOHCMKCXLY-SRVKXCTJSA-N Asn-Ser-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O VLDRQOHCMKCXLY-SRVKXCTJSA-N 0.000 description 1
- NCXTYSVDWLAQGZ-ZKWXMUAHSA-N Asn-Ser-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O NCXTYSVDWLAQGZ-ZKWXMUAHSA-N 0.000 description 1
- HPASIOLTWSNMFB-OLHMAJIHSA-N Asn-Thr-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O HPASIOLTWSNMFB-OLHMAJIHSA-N 0.000 description 1
- FMNBYVSGRCXWEK-FOHZUACHSA-N Asn-Thr-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O FMNBYVSGRCXWEK-FOHZUACHSA-N 0.000 description 1
- PUUPMDXIHCOPJU-HJGDQZAQSA-N Asn-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O PUUPMDXIHCOPJU-HJGDQZAQSA-N 0.000 description 1
- AMGQTNHANMRPOE-LKXGYXEUSA-N Asn-Thr-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O AMGQTNHANMRPOE-LKXGYXEUSA-N 0.000 description 1
- JPPLRQVZMZFOSX-UWJYBYFXSA-N Asn-Tyr-Ala Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=C(O)C=C1 JPPLRQVZMZFOSX-UWJYBYFXSA-N 0.000 description 1
- MJIJBEYEHBKTIM-BYULHYEWSA-N Asn-Val-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N MJIJBEYEHBKTIM-BYULHYEWSA-N 0.000 description 1
- KBQOUDLMWYWXNP-YDHLFZDLSA-N Asn-Val-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CC(=O)N)N KBQOUDLMWYWXNP-YDHLFZDLSA-N 0.000 description 1
- VPPXTHJNTYDNFJ-CIUDSAMLSA-N Asp-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N VPPXTHJNTYDNFJ-CIUDSAMLSA-N 0.000 description 1
- GVPSCJQLUGIKAM-GUBZILKMSA-N Asp-Arg-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O GVPSCJQLUGIKAM-GUBZILKMSA-N 0.000 description 1
- WSOKZUVWBXVJHX-CIUDSAMLSA-N Asp-Arg-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O WSOKZUVWBXVJHX-CIUDSAMLSA-N 0.000 description 1
- IXIWEFWRKIUMQX-DCAQKATOSA-N Asp-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CC(O)=O IXIWEFWRKIUMQX-DCAQKATOSA-N 0.000 description 1
- ATYWBXGNXZYZGI-ACZMJKKPSA-N Asp-Asn-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O ATYWBXGNXZYZGI-ACZMJKKPSA-N 0.000 description 1
- UGKZHCBLMLSANF-CIUDSAMLSA-N Asp-Asn-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O UGKZHCBLMLSANF-CIUDSAMLSA-N 0.000 description 1
- HOQGTAIGQSDCHR-SRVKXCTJSA-N Asp-Asn-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HOQGTAIGQSDCHR-SRVKXCTJSA-N 0.000 description 1
- KNMRXHIAVXHCLW-ZLUOBGJFSA-N Asp-Asn-Ser Chemical compound C([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N)C(=O)O KNMRXHIAVXHCLW-ZLUOBGJFSA-N 0.000 description 1
- RDRMWJBLOSRRAW-BYULHYEWSA-N Asp-Asn-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O RDRMWJBLOSRRAW-BYULHYEWSA-N 0.000 description 1
- SBHUBSDEZQFJHJ-CIUDSAMLSA-N Asp-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O SBHUBSDEZQFJHJ-CIUDSAMLSA-N 0.000 description 1
- PMEHKVHZQKJACS-PEFMBERDSA-N Asp-Gln-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PMEHKVHZQKJACS-PEFMBERDSA-N 0.000 description 1
- SNAWMGHSCHKSDK-GUBZILKMSA-N Asp-Gln-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)O)N SNAWMGHSCHKSDK-GUBZILKMSA-N 0.000 description 1
- DXQOQMCLWWADMU-ACZMJKKPSA-N Asp-Gln-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O DXQOQMCLWWADMU-ACZMJKKPSA-N 0.000 description 1
- XAJRHVUUVUPFQL-ACZMJKKPSA-N Asp-Glu-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O XAJRHVUUVUPFQL-ACZMJKKPSA-N 0.000 description 1
- VFUXXFVCYZPOQG-WDSKDSINSA-N Asp-Glu-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O VFUXXFVCYZPOQG-WDSKDSINSA-N 0.000 description 1
- PDECQIHABNQRHN-GUBZILKMSA-N Asp-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC(O)=O PDECQIHABNQRHN-GUBZILKMSA-N 0.000 description 1
- QCVXMEHGFUMKCO-YUMQZZPRSA-N Asp-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O QCVXMEHGFUMKCO-YUMQZZPRSA-N 0.000 description 1
- PZXPWHFYZXTFBI-YUMQZZPRSA-N Asp-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O PZXPWHFYZXTFBI-YUMQZZPRSA-N 0.000 description 1
- RQYMKRMRZWJGHC-BQBZGAKWSA-N Asp-Gly-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)O)N RQYMKRMRZWJGHC-BQBZGAKWSA-N 0.000 description 1
- KHGPWGKPYHPOIK-QWRGUYRKSA-N Asp-Gly-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O KHGPWGKPYHPOIK-QWRGUYRKSA-N 0.000 description 1
- SNDBKTFJWVEVPO-WHFBIAKZSA-N Asp-Gly-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CO)C(O)=O SNDBKTFJWVEVPO-WHFBIAKZSA-N 0.000 description 1
- QHHVSXGWLYEAGX-GUBZILKMSA-N Asp-His-Gln Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N QHHVSXGWLYEAGX-GUBZILKMSA-N 0.000 description 1
- SWTQDYFZVOJVLL-KKUMJFAQSA-N Asp-His-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CC(=O)O)N)O SWTQDYFZVOJVLL-KKUMJFAQSA-N 0.000 description 1
- LBFYTUPYYZENIR-GHCJXIJMSA-N Asp-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)O)N LBFYTUPYYZENIR-GHCJXIJMSA-N 0.000 description 1
- SPWXXPFDTMYTRI-IUKAMOBKSA-N Asp-Ile-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SPWXXPFDTMYTRI-IUKAMOBKSA-N 0.000 description 1
- UJGRZQYSNYTCAX-SRVKXCTJSA-N Asp-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UJGRZQYSNYTCAX-SRVKXCTJSA-N 0.000 description 1
- CJUKAWUWBZCTDQ-SRVKXCTJSA-N Asp-Leu-Lys Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O CJUKAWUWBZCTDQ-SRVKXCTJSA-N 0.000 description 1
- ORRJQLIATJDMQM-HJGDQZAQSA-N Asp-Leu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O ORRJQLIATJDMQM-HJGDQZAQSA-N 0.000 description 1
- RXBGWGRSWXOBGK-KKUMJFAQSA-N Asp-Lys-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O RXBGWGRSWXOBGK-KKUMJFAQSA-N 0.000 description 1
- GYWQGGUCMDCUJE-DLOVCJGASA-N Asp-Phe-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(O)=O GYWQGGUCMDCUJE-DLOVCJGASA-N 0.000 description 1
- PCJOFZYFFMBZKC-PCBIJLKTSA-N Asp-Phe-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PCJOFZYFFMBZKC-PCBIJLKTSA-N 0.000 description 1
- AHWRSSLYSGLBGD-CIUDSAMLSA-N Asp-Pro-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O AHWRSSLYSGLBGD-CIUDSAMLSA-N 0.000 description 1
- UAXIKORUDGGIGA-DCAQKATOSA-N Asp-Pro-Lys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC(=O)O)N)C(=O)N[C@@H](CCCCN)C(=O)O UAXIKORUDGGIGA-DCAQKATOSA-N 0.000 description 1
- XUVTWGPERWIERB-IHRRRGAJSA-N Asp-Pro-Phe Chemical compound N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1ccccc1)C(O)=O XUVTWGPERWIERB-IHRRRGAJSA-N 0.000 description 1
- ZBYLEBZCVKLPCY-FXQIFTODSA-N Asp-Ser-Arg Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O ZBYLEBZCVKLPCY-FXQIFTODSA-N 0.000 description 1
- QSFHZPQUAAQHAQ-CIUDSAMLSA-N Asp-Ser-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O QSFHZPQUAAQHAQ-CIUDSAMLSA-N 0.000 description 1
- XYPJXLLXNSAWHZ-SRVKXCTJSA-N Asp-Ser-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XYPJXLLXNSAWHZ-SRVKXCTJSA-N 0.000 description 1
- QOCFFCUFZGDHTP-NUMRIWBASA-N Asp-Thr-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O QOCFFCUFZGDHTP-NUMRIWBASA-N 0.000 description 1
- JSNWZMFSLIWAHS-HJGDQZAQSA-N Asp-Thr-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O JSNWZMFSLIWAHS-HJGDQZAQSA-N 0.000 description 1
- UEFODXNXUAVPTC-VEVYYDQMSA-N Asp-Thr-Met Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O UEFODXNXUAVPTC-VEVYYDQMSA-N 0.000 description 1
- MRYDJCIIVRXVGG-QEJZJMRPSA-N Asp-Trp-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(O)=O)C(O)=O MRYDJCIIVRXVGG-QEJZJMRPSA-N 0.000 description 1
- YUELDQUPTAYEGM-XIRDDKMYSA-N Asp-Trp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@H](CC(=O)O)N YUELDQUPTAYEGM-XIRDDKMYSA-N 0.000 description 1
- PLNJUJGNLDSFOP-UWJYBYFXSA-N Asp-Tyr-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O PLNJUJGNLDSFOP-UWJYBYFXSA-N 0.000 description 1
- USENATHVGFXRNO-SRVKXCTJSA-N Asp-Tyr-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=C(O)C=C1 USENATHVGFXRNO-SRVKXCTJSA-N 0.000 description 1
- SQIARYGNVQWOSB-BZSNNMDCSA-N Asp-Tyr-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SQIARYGNVQWOSB-BZSNNMDCSA-N 0.000 description 1
- VHUKCUHLFMRHOD-MELADBBJSA-N Asp-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CC(=O)O)N)C(=O)O VHUKCUHLFMRHOD-MELADBBJSA-N 0.000 description 1
- ALMIMUZAWTUNIO-BZSNNMDCSA-N Asp-Tyr-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ALMIMUZAWTUNIO-BZSNNMDCSA-N 0.000 description 1
- WAEDSQFVZJUHLI-BYULHYEWSA-N Asp-Val-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O WAEDSQFVZJUHLI-BYULHYEWSA-N 0.000 description 1
- 235000007319 Avena orientalis Nutrition 0.000 description 1
- 241000209763 Avena sativa Species 0.000 description 1
- 235000007558 Avena sp Nutrition 0.000 description 1
- 241001265879 Bacillus phage AR9 Species 0.000 description 1
- 241000702198 Bacillus virus PBS1 Species 0.000 description 1
- 241000606125 Bacteroides Species 0.000 description 1
- 241000219310 Beta vulgaris subsp. vulgaris Species 0.000 description 1
- 235000014698 Brassica juncea var multisecta Nutrition 0.000 description 1
- 235000006008 Brassica napus var napus Nutrition 0.000 description 1
- 240000000385 Brassica napus var. napus Species 0.000 description 1
- 235000006618 Brassica rapa subsp oleifera Nutrition 0.000 description 1
- 235000004977 Brassica sinapistrum Nutrition 0.000 description 1
- 102100040399 C->U-editing enzyme APOBEC-2 Human genes 0.000 description 1
- 102220484559 C-type lectin domain family 4 member A_H36L_mutation Human genes 0.000 description 1
- 235000002566 Capsicum Nutrition 0.000 description 1
- 108091092236 Chimeric RNA Proteins 0.000 description 1
- 108020004635 Complementary DNA Proteins 0.000 description 1
- 241000218631 Coniferophyta Species 0.000 description 1
- 102220584721 Coordinator of PRMT5 and differentiation stimulator_P48A_mutation Human genes 0.000 description 1
- 229920000742 Cotton Polymers 0.000 description 1
- WZZGXXNRSZIQFC-VGDYDELISA-N Cys-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CS)N WZZGXXNRSZIQFC-VGDYDELISA-N 0.000 description 1
- SSNJZBGOMNLSLA-CIUDSAMLSA-N Cys-Leu-Asn Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O SSNJZBGOMNLSLA-CIUDSAMLSA-N 0.000 description 1
- HKALUUKHYNEDRS-GUBZILKMSA-N Cys-Leu-Gln Chemical compound SC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O HKALUUKHYNEDRS-GUBZILKMSA-N 0.000 description 1
- LHJDLVVQRJIURS-SRVKXCTJSA-N Cys-Phe-Asp Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CS)N LHJDLVVQRJIURS-SRVKXCTJSA-N 0.000 description 1
- BCFXQBXXDSEHRS-FXQIFTODSA-N Cys-Ser-Arg Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O BCFXQBXXDSEHRS-FXQIFTODSA-N 0.000 description 1
- UEHCDNYDBBCQEL-CIUDSAMLSA-N Cys-Ser-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CS)N UEHCDNYDBBCQEL-CIUDSAMLSA-N 0.000 description 1
- 230000004544 DNA amplification Effects 0.000 description 1
- 102100040263 DNA dC->dU-editing enzyme APOBEC-3A Human genes 0.000 description 1
- 102100040262 DNA dC->dU-editing enzyme APOBEC-3B Human genes 0.000 description 1
- 102100040261 DNA dC->dU-editing enzyme APOBEC-3C Human genes 0.000 description 1
- 102100040264 DNA dC->dU-editing enzyme APOBEC-3D Human genes 0.000 description 1
- 102100040266 DNA dC->dU-editing enzyme APOBEC-3F Human genes 0.000 description 1
- 102100038076 DNA dC->dU-editing enzyme APOBEC-3G Human genes 0.000 description 1
- 102100038050 DNA dC->dU-editing enzyme APOBEC-3H Human genes 0.000 description 1
- 101710082737 DNA dC->dU-editing enzyme APOBEC-3H Proteins 0.000 description 1
- 230000005778 DNA damage Effects 0.000 description 1
- 231100000277 DNA damage Toxicity 0.000 description 1
- 238000007400 DNA extraction Methods 0.000 description 1
- 230000007067 DNA methylation Effects 0.000 description 1
- 239000003155 DNA primer Substances 0.000 description 1
- 230000033616 DNA repair Effects 0.000 description 1
- 230000008265 DNA repair mechanism Effects 0.000 description 1
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 1
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 1
- CKTSBUTUHBMZGZ-UHFFFAOYSA-N Deoxycytidine Natural products O=C1N=C(N)C=CN1C1OC(CO)C(O)C1 CKTSBUTUHBMZGZ-UHFFFAOYSA-N 0.000 description 1
- 108010046331 Deoxyribodipyrimidine photo-lyase Proteins 0.000 description 1
- 208000035240 Disease Resistance Diseases 0.000 description 1
- 102100031780 Endonuclease Human genes 0.000 description 1
- 102000004533 Endonucleases Human genes 0.000 description 1
- 241000206602 Eukaryota Species 0.000 description 1
- 108091029865 Exogenous DNA Proteins 0.000 description 1
- 108091092584 GDNA Proteins 0.000 description 1
- 101000892220 Geobacillus thermodenitrificans (strain NG80-2) Long-chain-alcohol dehydrogenase 1 Proteins 0.000 description 1
- HHWQMFIGMMOVFK-WDSKDSINSA-N Gln-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O HHWQMFIGMMOVFK-WDSKDSINSA-N 0.000 description 1
- MLZRSFQRBDNJON-GUBZILKMSA-N Gln-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N MLZRSFQRBDNJON-GUBZILKMSA-N 0.000 description 1
- OVQXQLWWJSNYFV-XEGUGMAKSA-N Gln-Ala-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CCC(N)=O)C)C(O)=O)=CNC2=C1 OVQXQLWWJSNYFV-XEGUGMAKSA-N 0.000 description 1
- PGPJSRSLQNXBDT-YUMQZZPRSA-N Gln-Arg-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O PGPJSRSLQNXBDT-YUMQZZPRSA-N 0.000 description 1
- MWLYSLMKFXWZPW-ZPFDUUQYSA-N Gln-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CCC(N)=O MWLYSLMKFXWZPW-ZPFDUUQYSA-N 0.000 description 1
- DTMLKCYOQKZXKZ-HJGDQZAQSA-N Gln-Arg-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DTMLKCYOQKZXKZ-HJGDQZAQSA-N 0.000 description 1
- LJEPDHWNQXPXMM-NHCYSSNCSA-N Gln-Arg-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O LJEPDHWNQXPXMM-NHCYSSNCSA-N 0.000 description 1
- OETQLUYCMBARHJ-CIUDSAMLSA-N Gln-Asn-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O OETQLUYCMBARHJ-CIUDSAMLSA-N 0.000 description 1
- SOBBAYVQSNXYPQ-ACZMJKKPSA-N Gln-Asn-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O SOBBAYVQSNXYPQ-ACZMJKKPSA-N 0.000 description 1
- ODBLJLZVLAWVMS-GUBZILKMSA-N Gln-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)N)N ODBLJLZVLAWVMS-GUBZILKMSA-N 0.000 description 1
- PONUFVLSGMQFAI-AVGNSLFASA-N Gln-Asn-Phe Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PONUFVLSGMQFAI-AVGNSLFASA-N 0.000 description 1
- WQWMZOIPXWSZNE-WDSKDSINSA-N Gln-Asp-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O WQWMZOIPXWSZNE-WDSKDSINSA-N 0.000 description 1
- JKPGHIQCHIIRMS-AVGNSLFASA-N Gln-Asp-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)N)N JKPGHIQCHIIRMS-AVGNSLFASA-N 0.000 description 1
- WLODHVXYKYHLJD-ACZMJKKPSA-N Gln-Asp-Ser Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CO)C(=O)O)N WLODHVXYKYHLJD-ACZMJKKPSA-N 0.000 description 1
- KVXVVDFOZNYYKZ-DCAQKATOSA-N Gln-Gln-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O KVXVVDFOZNYYKZ-DCAQKATOSA-N 0.000 description 1
- SNLOOPZHAQDMJG-CIUDSAMLSA-N Gln-Glu-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SNLOOPZHAQDMJG-CIUDSAMLSA-N 0.000 description 1
- ZNZPKVQURDQFFS-FXQIFTODSA-N Gln-Glu-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O ZNZPKVQURDQFFS-FXQIFTODSA-N 0.000 description 1
- JHPFPROFOAJRFN-IHRRRGAJSA-N Gln-Glu-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N)O JHPFPROFOAJRFN-IHRRRGAJSA-N 0.000 description 1
- XSBGUANSZDGULP-IUCAKERBSA-N Gln-Gly-Lys Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)NCC(=O)N[C@@H](CCCCN)C(O)=O XSBGUANSZDGULP-IUCAKERBSA-N 0.000 description 1
- NNXIQPMZGZUFJJ-AVGNSLFASA-N Gln-His-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N NNXIQPMZGZUFJJ-AVGNSLFASA-N 0.000 description 1
- FTIJVMLAGRAYMJ-MNXVOIDGSA-N Gln-Ile-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(N)=O FTIJVMLAGRAYMJ-MNXVOIDGSA-N 0.000 description 1
- CAXXTYYGFYTBPV-IUCAKERBSA-N Gln-Leu-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O CAXXTYYGFYTBPV-IUCAKERBSA-N 0.000 description 1
- XFAUJGNLHIGXET-AVGNSLFASA-N Gln-Leu-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O XFAUJGNLHIGXET-AVGNSLFASA-N 0.000 description 1
- MLSKFHLRFVGNLL-WDCWCFNPSA-N Gln-Leu-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MLSKFHLRFVGNLL-WDCWCFNPSA-N 0.000 description 1
- IOFDDSNZJDIGPB-GVXVVHGQSA-N Gln-Leu-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O IOFDDSNZJDIGPB-GVXVVHGQSA-N 0.000 description 1
- IHSGESFHTMFHRB-GUBZILKMSA-N Gln-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(N)=O IHSGESFHTMFHRB-GUBZILKMSA-N 0.000 description 1
- UWKPRVKWEKEMSY-DCAQKATOSA-N Gln-Lys-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O UWKPRVKWEKEMSY-DCAQKATOSA-N 0.000 description 1
- SXGMGNZEHFORAV-IUCAKERBSA-N Gln-Lys-Gly Chemical compound C(CCN)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N SXGMGNZEHFORAV-IUCAKERBSA-N 0.000 description 1
- JRHPEMVLTRADLJ-AVGNSLFASA-N Gln-Lys-Lys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N JRHPEMVLTRADLJ-AVGNSLFASA-N 0.000 description 1
- WEAVZFWWIPIANL-SRVKXCTJSA-N Gln-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)N)N WEAVZFWWIPIANL-SRVKXCTJSA-N 0.000 description 1
- LUGUNEGJNDEBLU-DCAQKATOSA-N Gln-Met-Arg Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N LUGUNEGJNDEBLU-DCAQKATOSA-N 0.000 description 1
- DRNMNLKUUKKPIA-HTUGSXCWSA-N Gln-Phe-Thr Chemical compound C[C@@H](O)[C@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@@H](N)CCC(N)=O)C(O)=O DRNMNLKUUKKPIA-HTUGSXCWSA-N 0.000 description 1
- PIUPHASDUFSHTF-CIUDSAMLSA-N Gln-Pro-Asn Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CCC(=O)N)N)C(=O)N[C@@H](CC(=O)N)C(=O)O PIUPHASDUFSHTF-CIUDSAMLSA-N 0.000 description 1
- MFHVAWMMKZBSRQ-ACZMJKKPSA-N Gln-Ser-Cys Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)O)N MFHVAWMMKZBSRQ-ACZMJKKPSA-N 0.000 description 1
- ZGHMRONFHDVXEF-AVGNSLFASA-N Gln-Ser-Phe Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZGHMRONFHDVXEF-AVGNSLFASA-N 0.000 description 1
- ARYKRXHBIPLULY-XKBZYTNZSA-N Gln-Thr-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O ARYKRXHBIPLULY-XKBZYTNZSA-N 0.000 description 1
- XKPACHRGOWQHFH-IRIUXVKKSA-N Gln-Thr-Tyr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XKPACHRGOWQHFH-IRIUXVKKSA-N 0.000 description 1
- MKRDNSWGJWTBKZ-GVXVVHGQSA-N Gln-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N MKRDNSWGJWTBKZ-GVXVVHGQSA-N 0.000 description 1
- FITIQFSXXBKFFM-NRPADANISA-N Gln-Val-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O FITIQFSXXBKFFM-NRPADANISA-N 0.000 description 1
- HNAUFGBKJLTWQE-IFFSRLJSSA-N Gln-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CCC(=O)N)N)O HNAUFGBKJLTWQE-IFFSRLJSSA-N 0.000 description 1
- CSMHMEATMDCQNY-DZKIICNBSA-N Gln-Val-Tyr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CSMHMEATMDCQNY-DZKIICNBSA-N 0.000 description 1
- JJKKWYQVHRUSDG-GUBZILKMSA-N Glu-Ala-Lys Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O JJKKWYQVHRUSDG-GUBZILKMSA-N 0.000 description 1
- ATRHMOJQJWPVBQ-DRZSPHRISA-N Glu-Ala-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ATRHMOJQJWPVBQ-DRZSPHRISA-N 0.000 description 1
- KBKGRMNVKPSQIF-XDTLVQLUSA-N Glu-Ala-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KBKGRMNVKPSQIF-XDTLVQLUSA-N 0.000 description 1
- NLKVNZUFDPWPNL-YUMQZZPRSA-N Glu-Arg-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O NLKVNZUFDPWPNL-YUMQZZPRSA-N 0.000 description 1
- CKRUHITYRFNUKW-WDSKDSINSA-N Glu-Asn-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O CKRUHITYRFNUKW-WDSKDSINSA-N 0.000 description 1
- VAZZOGXDUQSVQF-NUMRIWBASA-N Glu-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N)O VAZZOGXDUQSVQF-NUMRIWBASA-N 0.000 description 1
- QPRZKNOOOBWXSU-CIUDSAMLSA-N Glu-Asp-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N QPRZKNOOOBWXSU-CIUDSAMLSA-N 0.000 description 1
- NTBDVNJIWCKURJ-ACZMJKKPSA-N Glu-Asp-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O NTBDVNJIWCKURJ-ACZMJKKPSA-N 0.000 description 1
- XXCDTYBVGMPIOA-FXQIFTODSA-N Glu-Asp-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O XXCDTYBVGMPIOA-FXQIFTODSA-N 0.000 description 1
- HJIFPJUEOGZWRI-GUBZILKMSA-N Glu-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)O)N HJIFPJUEOGZWRI-GUBZILKMSA-N 0.000 description 1
- OXEMJGCAJFFREE-FXQIFTODSA-N Glu-Gln-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O OXEMJGCAJFFREE-FXQIFTODSA-N 0.000 description 1
- GFLQTABMFBXRIY-GUBZILKMSA-N Glu-Gln-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GFLQTABMFBXRIY-GUBZILKMSA-N 0.000 description 1
- PVBBEKPHARMPHX-DCAQKATOSA-N Glu-Gln-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCC(O)=O PVBBEKPHARMPHX-DCAQKATOSA-N 0.000 description 1
- QQLBPVKLJBAXBS-FXQIFTODSA-N Glu-Glu-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O QQLBPVKLJBAXBS-FXQIFTODSA-N 0.000 description 1
- LGYZYFFDELZWRS-DCAQKATOSA-N Glu-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O LGYZYFFDELZWRS-DCAQKATOSA-N 0.000 description 1
- YLJHCWNDBKKOEB-IHRRRGAJSA-N Glu-Glu-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O YLJHCWNDBKKOEB-IHRRRGAJSA-N 0.000 description 1
- MTAOBYXRYJZRGQ-WDSKDSINSA-N Glu-Gly-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MTAOBYXRYJZRGQ-WDSKDSINSA-N 0.000 description 1
- CUXJIASLBRJOFV-LAEOZQHASA-N Glu-Gly-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O CUXJIASLBRJOFV-LAEOZQHASA-N 0.000 description 1
- ZWQVYZXPYSYPJD-RYUDHWBXSA-N Glu-Gly-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ZWQVYZXPYSYPJD-RYUDHWBXSA-N 0.000 description 1
- VGUYMZGLJUJRBV-YVNDNENWSA-N Glu-Ile-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O VGUYMZGLJUJRBV-YVNDNENWSA-N 0.000 description 1
- ITBHUUMCJJQUSC-LAEOZQHASA-N Glu-Ile-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O ITBHUUMCJJQUSC-LAEOZQHASA-N 0.000 description 1
- QXDXIXFSFHUYAX-MNXVOIDGSA-N Glu-Ile-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(O)=O QXDXIXFSFHUYAX-MNXVOIDGSA-N 0.000 description 1
- GRHXUHCFENOCOS-ZPFDUUQYSA-N Glu-Ile-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CCC(=O)O)N GRHXUHCFENOCOS-ZPFDUUQYSA-N 0.000 description 1
- ZSWGJYOZWBHROQ-RWRJDSDZSA-N Glu-Ile-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZSWGJYOZWBHROQ-RWRJDSDZSA-N 0.000 description 1
- KRRFFAHEAOCBCQ-SIUGBPQLSA-N Glu-Ile-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KRRFFAHEAOCBCQ-SIUGBPQLSA-N 0.000 description 1
- PJBVXVBTTFZPHJ-GUBZILKMSA-N Glu-Leu-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)O)N PJBVXVBTTFZPHJ-GUBZILKMSA-N 0.000 description 1
- DNPCBMNFQVTHMA-DCAQKATOSA-N Glu-Leu-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O DNPCBMNFQVTHMA-DCAQKATOSA-N 0.000 description 1
- MWMJCGBSIORNCD-AVGNSLFASA-N Glu-Leu-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O MWMJCGBSIORNCD-AVGNSLFASA-N 0.000 description 1
- NJCALAAIGREHDR-WDCWCFNPSA-N Glu-Leu-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NJCALAAIGREHDR-WDCWCFNPSA-N 0.000 description 1
- SWRVAQHFBRZVNX-GUBZILKMSA-N Glu-Lys-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O SWRVAQHFBRZVNX-GUBZILKMSA-N 0.000 description 1
- CUPSDFQZTVVTSK-GUBZILKMSA-N Glu-Lys-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(O)=O CUPSDFQZTVVTSK-GUBZILKMSA-N 0.000 description 1
- OCJRHJZKGGSPRW-IUCAKERBSA-N Glu-Lys-Gly Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O OCJRHJZKGGSPRW-IUCAKERBSA-N 0.000 description 1
- HRBYTAIBKPNZKQ-AVGNSLFASA-N Glu-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(O)=O HRBYTAIBKPNZKQ-AVGNSLFASA-N 0.000 description 1
- ZGEJRLJEAMPEDV-SRVKXCTJSA-N Glu-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)O)N ZGEJRLJEAMPEDV-SRVKXCTJSA-N 0.000 description 1
- RBXSZQRSEGYDFG-GUBZILKMSA-N Glu-Lys-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O RBXSZQRSEGYDFG-GUBZILKMSA-N 0.000 description 1
- LGWUJBCIFGVBSJ-CIUDSAMLSA-N Glu-Met-Cys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)O)N LGWUJBCIFGVBSJ-CIUDSAMLSA-N 0.000 description 1
- JHSRJMUJOGLIHK-GUBZILKMSA-N Glu-Met-Glu Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)O)N JHSRJMUJOGLIHK-GUBZILKMSA-N 0.000 description 1
- UERORLSAFUHDGU-AVGNSLFASA-N Glu-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N UERORLSAFUHDGU-AVGNSLFASA-N 0.000 description 1
- SYWCGQOIIARSIX-SRVKXCTJSA-N Glu-Pro-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O SYWCGQOIIARSIX-SRVKXCTJSA-N 0.000 description 1
- BDISFWMLMNBTGP-NUMRIWBASA-N Glu-Thr-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O BDISFWMLMNBTGP-NUMRIWBASA-N 0.000 description 1
- LWYUQLZOIORFFJ-XKBZYTNZSA-N Glu-Thr-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O LWYUQLZOIORFFJ-XKBZYTNZSA-N 0.000 description 1
- MWTGQXBHVRTCOR-GLLZPBPUSA-N Glu-Thr-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MWTGQXBHVRTCOR-GLLZPBPUSA-N 0.000 description 1
- HGJREIGJLUQBTJ-SZMVWBNQSA-N Glu-Trp-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(C)C)C(O)=O HGJREIGJLUQBTJ-SZMVWBNQSA-N 0.000 description 1
- KCCNSVHJSMMGFS-NRPADANISA-N Glu-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)O)N KCCNSVHJSMMGFS-NRPADANISA-N 0.000 description 1
- SOYWRINXUSUWEQ-DLOVCJGASA-N Glu-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O SOYWRINXUSUWEQ-DLOVCJGASA-N 0.000 description 1
- PYTZFYUXZZHOAD-WHFBIAKZSA-N Gly-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)CN PYTZFYUXZZHOAD-WHFBIAKZSA-N 0.000 description 1
- KRRMJKMGWWXWDW-STQMWFEESA-N Gly-Arg-Phe Chemical compound NC(=N)NCCC[C@H](NC(=O)CN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KRRMJKMGWWXWDW-STQMWFEESA-N 0.000 description 1
- XUORRGAFUQIMLC-STQMWFEESA-N Gly-Arg-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)CN)O XUORRGAFUQIMLC-STQMWFEESA-N 0.000 description 1
- BGVYNAQWHSTTSP-BYULHYEWSA-N Gly-Asn-Ile Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BGVYNAQWHSTTSP-BYULHYEWSA-N 0.000 description 1
- XRTDOIOIBMAXCT-NKWVEPMBSA-N Gly-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)CN)C(=O)O XRTDOIOIBMAXCT-NKWVEPMBSA-N 0.000 description 1
- XBWMTPAIUQIWKA-BYULHYEWSA-N Gly-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)CN XBWMTPAIUQIWKA-BYULHYEWSA-N 0.000 description 1
- MHHUEAIBJZWDBH-YUMQZZPRSA-N Gly-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN MHHUEAIBJZWDBH-YUMQZZPRSA-N 0.000 description 1
- RPLLQZBOVIVGMX-QWRGUYRKSA-N Gly-Asp-Phe Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O RPLLQZBOVIVGMX-QWRGUYRKSA-N 0.000 description 1
- BYYNJRSNDARRBX-YFKPBYRVSA-N Gly-Gln-Gly Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O BYYNJRSNDARRBX-YFKPBYRVSA-N 0.000 description 1
- JUGQPPOVWXSPKJ-RYUDHWBXSA-N Gly-Gln-Phe Chemical compound [H]NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JUGQPPOVWXSPKJ-RYUDHWBXSA-N 0.000 description 1
- HFXJIZNEXNIZIJ-BQBZGAKWSA-N Gly-Glu-Gln Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O HFXJIZNEXNIZIJ-BQBZGAKWSA-N 0.000 description 1
- XTQFHTHIAKKCTM-YFKPBYRVSA-N Gly-Glu-Gly Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O XTQFHTHIAKKCTM-YFKPBYRVSA-N 0.000 description 1
- YYPFZVIXAVDHIK-IUCAKERBSA-N Gly-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN YYPFZVIXAVDHIK-IUCAKERBSA-N 0.000 description 1
- QSVCIFZPGLOZGH-WDSKDSINSA-N Gly-Glu-Ser Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O QSVCIFZPGLOZGH-WDSKDSINSA-N 0.000 description 1
- MBOAPAXLTUSMQI-JHEQGTHGSA-N Gly-Glu-Thr Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MBOAPAXLTUSMQI-JHEQGTHGSA-N 0.000 description 1
- CUYLIWAAAYJKJH-RYUDHWBXSA-N Gly-Glu-Tyr Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 CUYLIWAAAYJKJH-RYUDHWBXSA-N 0.000 description 1
- JSNNHGHYGYMVCK-XVKPBYJWSA-N Gly-Glu-Val Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O JSNNHGHYGYMVCK-XVKPBYJWSA-N 0.000 description 1
- UFPXDFOYHVEIPI-BYPYZUCNSA-N Gly-Gly-Asp Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O UFPXDFOYHVEIPI-BYPYZUCNSA-N 0.000 description 1
- XMPXVJIDADUOQB-RCOVLWMOSA-N Gly-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C([O-])=O)NC(=O)CNC(=O)C[NH3+] XMPXVJIDADUOQB-RCOVLWMOSA-N 0.000 description 1
- CQIIXEHDSZUSAG-QWRGUYRKSA-N Gly-His-His Chemical compound C([C@H](NC(=O)CN)C(=O)N[C@@H](CC=1NC=NC=1)C(O)=O)C1=CN=CN1 CQIIXEHDSZUSAG-QWRGUYRKSA-N 0.000 description 1
- DGKBSGNCMCLDSL-BYULHYEWSA-N Gly-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)CN DGKBSGNCMCLDSL-BYULHYEWSA-N 0.000 description 1
- ITZOBNKQDZEOCE-NHCYSSNCSA-N Gly-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)CN ITZOBNKQDZEOCE-NHCYSSNCSA-N 0.000 description 1
- SCWYHUQOOFRVHP-MBLNEYKQSA-N Gly-Ile-Thr Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SCWYHUQOOFRVHP-MBLNEYKQSA-N 0.000 description 1
- PAWIVEIWWYGBAM-YUMQZZPRSA-N Gly-Leu-Ala Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O PAWIVEIWWYGBAM-YUMQZZPRSA-N 0.000 description 1
- YTSVAIMKVLZUDU-YUMQZZPRSA-N Gly-Leu-Asp Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O YTSVAIMKVLZUDU-YUMQZZPRSA-N 0.000 description 1
- LLZXNUUIBOALNY-QWRGUYRKSA-N Gly-Leu-Lys Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN LLZXNUUIBOALNY-QWRGUYRKSA-N 0.000 description 1
- NNCSJUBVFBDDLC-YUMQZZPRSA-N Gly-Leu-Ser Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O NNCSJUBVFBDDLC-YUMQZZPRSA-N 0.000 description 1
- JPAACTMBBBGAAR-HOTGVXAUSA-N Gly-Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)CN)CC(C)C)C(O)=O)=CNC2=C1 JPAACTMBBBGAAR-HOTGVXAUSA-N 0.000 description 1
- AFWYPMDMDYCKMD-KBPBESRZSA-N Gly-Leu-Tyr Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 AFWYPMDMDYCKMD-KBPBESRZSA-N 0.000 description 1
- VBOBNHSVQKKTOT-YUMQZZPRSA-N Gly-Lys-Ala Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O VBOBNHSVQKKTOT-YUMQZZPRSA-N 0.000 description 1
- FHQRLHFYVZAQHU-IUCAKERBSA-N Gly-Lys-Gln Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O FHQRLHFYVZAQHU-IUCAKERBSA-N 0.000 description 1
- WDEHMRNSGHVNOH-VHSXEESVSA-N Gly-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)CN)C(=O)O WDEHMRNSGHVNOH-VHSXEESVSA-N 0.000 description 1
- NTBOEZICHOSJEE-YUMQZZPRSA-N Gly-Lys-Ser Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O NTBOEZICHOSJEE-YUMQZZPRSA-N 0.000 description 1
- BBTCXWTXOXUNFX-IUCAKERBSA-N Gly-Met-Arg Chemical compound CSCC[C@H](NC(=O)CN)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O BBTCXWTXOXUNFX-IUCAKERBSA-N 0.000 description 1
- HFPVRZWORNJRRC-UWVGGRQHSA-N Gly-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN HFPVRZWORNJRRC-UWVGGRQHSA-N 0.000 description 1
- VNNRLUNBJSWZPF-ZKWXMUAHSA-N Gly-Ser-Ile Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VNNRLUNBJSWZPF-ZKWXMUAHSA-N 0.000 description 1
- FKESCSGWBPUTPN-FOHZUACHSA-N Gly-Thr-Asn Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O FKESCSGWBPUTPN-FOHZUACHSA-N 0.000 description 1
- DBUNZBWUWCIELX-JHEQGTHGSA-N Gly-Thr-Glu Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O DBUNZBWUWCIELX-JHEQGTHGSA-N 0.000 description 1
- LLWQVJNHMYBLLK-CDMKHQONSA-N Gly-Thr-Phe Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LLWQVJNHMYBLLK-CDMKHQONSA-N 0.000 description 1
- MYXNLWDWWOTERK-BHNWBGBOSA-N Gly-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN)O MYXNLWDWWOTERK-BHNWBGBOSA-N 0.000 description 1
- RCHFYMASWAZQQZ-ZANVPECISA-N Gly-Trp-Ala Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](C)C(O)=O)NC(=O)CN)=CNC2=C1 RCHFYMASWAZQQZ-ZANVPECISA-N 0.000 description 1
- GULGDABMYTYMJZ-STQMWFEESA-N Gly-Trp-Asp Chemical compound [H]NCC(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(O)=O)C(O)=O GULGDABMYTYMJZ-STQMWFEESA-N 0.000 description 1
- PYFIQROSWQERAS-LBPRGKRZSA-N Gly-Trp-Gly Chemical compound C1=CC=C2C(C[C@H](NC(=O)CN)C(=O)NCC(O)=O)=CNC2=C1 PYFIQROSWQERAS-LBPRGKRZSA-N 0.000 description 1
- WRFOZIJRODPLIA-QWRGUYRKSA-N Gly-Tyr-Cys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)CN)O WRFOZIJRODPLIA-QWRGUYRKSA-N 0.000 description 1
- DNVDEMWIYLVIQU-RCOVLWMOSA-N Gly-Val-Asp Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O DNVDEMWIYLVIQU-RCOVLWMOSA-N 0.000 description 1
- DKJWUIYLMLUBDX-XPUUQOCRSA-N Gly-Val-Cys Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CS)C(=O)O DKJWUIYLMLUBDX-XPUUQOCRSA-N 0.000 description 1
- BAYQNCWLXIDLHX-ONGXEEELSA-N Gly-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)CN BAYQNCWLXIDLHX-ONGXEEELSA-N 0.000 description 1
- SBVMXEZQJVUARN-XPUUQOCRSA-N Gly-Val-Ser Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O SBVMXEZQJVUARN-XPUUQOCRSA-N 0.000 description 1
- 239000004471 Glycine Substances 0.000 description 1
- 241000219146 Gossypium Species 0.000 description 1
- 108010012029 Guanine Deaminase Proteins 0.000 description 1
- 102000013587 Guanine deaminase Human genes 0.000 description 1
- 102220576552 HLA class I histocompatibility antigen, A alpha chain_W23R_mutation Human genes 0.000 description 1
- 108050008753 HNH endonucleases Proteins 0.000 description 1
- 102000000310 HNH endonucleases Human genes 0.000 description 1
- 102220491568 Heat shock 70 kDa protein 1B_D10A_mutation Human genes 0.000 description 1
- 244000020551 Helianthus annuus Species 0.000 description 1
- 235000003222 Helianthus annuus Nutrition 0.000 description 1
- HTTJABKRGRZYRN-UHFFFAOYSA-N Heparin Chemical compound OC1C(NC(=O)C)C(O)OC(COS(O)(=O)=O)C1OC1C(OS(O)(=O)=O)C(O)C(OC2C(C(OS(O)(=O)=O)C(OC3C(C(O)C(O)C(O3)C(O)=O)OS(O)(=O)=O)C(CO)O2)NS(O)(=O)=O)C(C(O)=O)O1 HTTJABKRGRZYRN-UHFFFAOYSA-N 0.000 description 1
- 241000238631 Hexapoda Species 0.000 description 1
- AFPFGFUGETYOSY-HGNGGELXSA-N His-Ala-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O AFPFGFUGETYOSY-HGNGGELXSA-N 0.000 description 1
- IPIVXQQRZXEUGW-UWJYBYFXSA-N His-Ala-His Chemical compound C([C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC=1NC=NC=1)C(O)=O)C1=CN=CN1 IPIVXQQRZXEUGW-UWJYBYFXSA-N 0.000 description 1
- XINDHUAGVGCNSF-QSFUFRPTSA-N His-Ala-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XINDHUAGVGCNSF-QSFUFRPTSA-N 0.000 description 1
- VSLXGYMEHVAJBH-DLOVCJGASA-N His-Ala-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O VSLXGYMEHVAJBH-DLOVCJGASA-N 0.000 description 1
- SVHKVHBPTOMLTO-DCAQKATOSA-N His-Arg-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O SVHKVHBPTOMLTO-DCAQKATOSA-N 0.000 description 1
- AVQOSMRPITVTRB-CIUDSAMLSA-N His-Asn-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N AVQOSMRPITVTRB-CIUDSAMLSA-N 0.000 description 1
- FPNWKONEZAVQJF-GUBZILKMSA-N His-Asn-Gln Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N FPNWKONEZAVQJF-GUBZILKMSA-N 0.000 description 1
- IMCHNUANCIGUKS-SRVKXCTJSA-N His-Glu-Arg Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IMCHNUANCIGUKS-SRVKXCTJSA-N 0.000 description 1
- XMENRVZYPBKBIL-AVGNSLFASA-N His-Glu-His Chemical compound N[C@@H](Cc1cnc[nH]1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O XMENRVZYPBKBIL-AVGNSLFASA-N 0.000 description 1
- RGPWUJOMKFYFSR-QWRGUYRKSA-N His-Gly-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O RGPWUJOMKFYFSR-QWRGUYRKSA-N 0.000 description 1
- NQKRILCJYCASDV-QWRGUYRKSA-N His-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CN=CN1 NQKRILCJYCASDV-QWRGUYRKSA-N 0.000 description 1
- IDQNVIWPPWAFSY-AVGNSLFASA-N His-His-Gln Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(O)=O IDQNVIWPPWAFSY-AVGNSLFASA-N 0.000 description 1
- WJGSTIMGSIWHJX-HVTMNAMFSA-N His-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N WJGSTIMGSIWHJX-HVTMNAMFSA-N 0.000 description 1
- UQTKYYNHMVAOAA-HJPIBITLSA-N His-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N UQTKYYNHMVAOAA-HJPIBITLSA-N 0.000 description 1
- DYKZGTLPSNOFHU-DEQVHRJGSA-N His-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N DYKZGTLPSNOFHU-DEQVHRJGSA-N 0.000 description 1
- ZSKJIISDJXJQPV-BZSNNMDCSA-N His-Leu-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CN=CN1 ZSKJIISDJXJQPV-BZSNNMDCSA-N 0.000 description 1
- TWROVBNEHJSXDG-IHRRRGAJSA-N His-Leu-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O TWROVBNEHJSXDG-IHRRRGAJSA-N 0.000 description 1
- FHGVHXCQMJWQPK-SRVKXCTJSA-N His-Lys-Asn Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O FHGVHXCQMJWQPK-SRVKXCTJSA-N 0.000 description 1
- UMBKDWGQESDCTO-KKUMJFAQSA-N His-Lys-Lys Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O UMBKDWGQESDCTO-KKUMJFAQSA-N 0.000 description 1
- FBCURAVMSXNOLP-JYJNAYRXSA-N His-Phe-Gln Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N FBCURAVMSXNOLP-JYJNAYRXSA-N 0.000 description 1
- YAEKRYQASVCDLK-JYJNAYRXSA-N His-Phe-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N YAEKRYQASVCDLK-JYJNAYRXSA-N 0.000 description 1
- SVVULKPWDBIPCO-BZSNNMDCSA-N His-Phe-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O SVVULKPWDBIPCO-BZSNNMDCSA-N 0.000 description 1
- PYNPBMCLAKTHJL-SRVKXCTJSA-N His-Pro-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O PYNPBMCLAKTHJL-SRVKXCTJSA-N 0.000 description 1
- FLXCRBXJRJSDHX-AVGNSLFASA-N His-Pro-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O FLXCRBXJRJSDHX-AVGNSLFASA-N 0.000 description 1
- CUEQQFOGARVNHU-VGDYDELISA-N His-Ser-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O CUEQQFOGARVNHU-VGDYDELISA-N 0.000 description 1
- ILUVWFTXAUYOBW-CUJWVEQBSA-N His-Ser-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC1=CN=CN1)N)O ILUVWFTXAUYOBW-CUJWVEQBSA-N 0.000 description 1
- CCUSLCQWVMWTIS-IXOXFDKPSA-N His-Thr-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O CCUSLCQWVMWTIS-IXOXFDKPSA-N 0.000 description 1
- MKWFGXSFLYNTKC-XIRDDKMYSA-N His-Trp-Asp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC3=CN=CN3)N MKWFGXSFLYNTKC-XIRDDKMYSA-N 0.000 description 1
- PBJOQLUVSGXRSW-YTQUADARSA-N His-Trp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CNC3=CC=CC=C32)NC(=O)[C@H](CC4=CN=CN4)N)C(=O)O PBJOQLUVSGXRSW-YTQUADARSA-N 0.000 description 1
- LPBWRHRHEIYAIP-KKUMJFAQSA-N His-Tyr-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O LPBWRHRHEIYAIP-KKUMJFAQSA-N 0.000 description 1
- WSWAUVHXQREQQG-JYJNAYRXSA-N His-Tyr-Gln Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O WSWAUVHXQREQQG-JYJNAYRXSA-N 0.000 description 1
- DAKSMIWQZPHRIB-BZSNNMDCSA-N His-Tyr-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O DAKSMIWQZPHRIB-BZSNNMDCSA-N 0.000 description 1
- 101000780443 Homo sapiens Alcohol dehydrogenase 1A Proteins 0.000 description 1
- 101000964322 Homo sapiens C->U-editing enzyme APOBEC-2 Proteins 0.000 description 1
- 101000964385 Homo sapiens DNA dC->dU-editing enzyme APOBEC-3B Proteins 0.000 description 1
- 101000964383 Homo sapiens DNA dC->dU-editing enzyme APOBEC-3C Proteins 0.000 description 1
- 101000964382 Homo sapiens DNA dC->dU-editing enzyme APOBEC-3D Proteins 0.000 description 1
- 101000964377 Homo sapiens DNA dC->dU-editing enzyme APOBEC-3F Proteins 0.000 description 1
- 101000800426 Homo sapiens Putative C->U-editing enzyme APOBEC-4 Proteins 0.000 description 1
- 240000005979 Hordeum vulgare Species 0.000 description 1
- 235000007340 Hordeum vulgare Nutrition 0.000 description 1
- LQSBBHNVAVNZSX-GHCJXIJMSA-N Ile-Ala-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(=O)N)C(=O)O)N LQSBBHNVAVNZSX-GHCJXIJMSA-N 0.000 description 1
- YKRYHWJRQUSTKG-KBIXCLLPSA-N Ile-Ala-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N YKRYHWJRQUSTKG-KBIXCLLPSA-N 0.000 description 1
- QICVAHODWHIWIS-HTFCKZLJSA-N Ile-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N QICVAHODWHIWIS-HTFCKZLJSA-N 0.000 description 1
- DPTBVFUDCPINIP-JURCDPSOSA-N Ile-Ala-Phe Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 DPTBVFUDCPINIP-JURCDPSOSA-N 0.000 description 1
- HDOYNXLPTRQLAD-JBDRJPRFSA-N Ile-Ala-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(=O)O)N HDOYNXLPTRQLAD-JBDRJPRFSA-N 0.000 description 1
- BOTVMTSMOUSDRW-GMOBBJLQSA-N Ile-Arg-Asn Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(N)=O)C(O)=O BOTVMTSMOUSDRW-GMOBBJLQSA-N 0.000 description 1
- YOTNPRLPIPHQSB-XUXIUFHCSA-N Ile-Arg-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N YOTNPRLPIPHQSB-XUXIUFHCSA-N 0.000 description 1
- QADCTXFNLZBZAB-GHCJXIJMSA-N Ile-Asn-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](C)C(=O)O)N QADCTXFNLZBZAB-GHCJXIJMSA-N 0.000 description 1
- UAVQIQOOBXFKRC-BYULHYEWSA-N Ile-Asn-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O UAVQIQOOBXFKRC-BYULHYEWSA-N 0.000 description 1
- HDODQNPMSHDXJT-GHCJXIJMSA-N Ile-Asn-Ser Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O HDODQNPMSHDXJT-GHCJXIJMSA-N 0.000 description 1
- NKRJALPCDNXULF-BYULHYEWSA-N Ile-Asp-Gly Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O NKRJALPCDNXULF-BYULHYEWSA-N 0.000 description 1
- NPROWIBAWYMPAZ-GUDRVLHUSA-N Ile-Asp-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N NPROWIBAWYMPAZ-GUDRVLHUSA-N 0.000 description 1
- LLHYWBGDMBGNHA-VGDYDELISA-N Ile-Cys-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N LLHYWBGDMBGNHA-VGDYDELISA-N 0.000 description 1
- HOLOYAZCIHDQNS-YVNDNENWSA-N Ile-Gln-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N HOLOYAZCIHDQNS-YVNDNENWSA-N 0.000 description 1
- YBJWJQQBWRARLT-KBIXCLLPSA-N Ile-Gln-Ser Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O YBJWJQQBWRARLT-KBIXCLLPSA-N 0.000 description 1
- BEWFWZRGBDVXRP-PEFMBERDSA-N Ile-Glu-Asn Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O BEWFWZRGBDVXRP-PEFMBERDSA-N 0.000 description 1
- LGMUPVWZEYYUMU-YVNDNENWSA-N Ile-Glu-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N LGMUPVWZEYYUMU-YVNDNENWSA-N 0.000 description 1
- IXEFKXAGHRQFAF-HVTMNAMFSA-N Ile-Glu-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N IXEFKXAGHRQFAF-HVTMNAMFSA-N 0.000 description 1
- NHJKZMDIMMTVCK-QXEWZRGKSA-N Ile-Gly-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N NHJKZMDIMMTVCK-QXEWZRGKSA-N 0.000 description 1
- MQFGXJNSUJTXDT-QSFUFRPTSA-N Ile-Gly-Ile Chemical compound N[C@@H]([C@@H](C)CC)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)O MQFGXJNSUJTXDT-QSFUFRPTSA-N 0.000 description 1
- DFFTXLCCDFYRKD-MBLNEYKQSA-N Ile-Gly-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(=O)O)N DFFTXLCCDFYRKD-MBLNEYKQSA-N 0.000 description 1
- UQXADIGYEYBJEI-DJFWLOJKSA-N Ile-His-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC(=O)O)C(=O)O)N UQXADIGYEYBJEI-DJFWLOJKSA-N 0.000 description 1
- HUWYGQOISIJNMK-SIGLWIIPSA-N Ile-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N HUWYGQOISIJNMK-SIGLWIIPSA-N 0.000 description 1
- PFPUFNLHBXKPHY-HTFCKZLJSA-N Ile-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(=O)O)N PFPUFNLHBXKPHY-HTFCKZLJSA-N 0.000 description 1
- YGDWPQCLFJNMOL-MNXVOIDGSA-N Ile-Leu-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N YGDWPQCLFJNMOL-MNXVOIDGSA-N 0.000 description 1
- HUORUFRRJHELPD-MNXVOIDGSA-N Ile-Leu-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N HUORUFRRJHELPD-MNXVOIDGSA-N 0.000 description 1
- DBXXASNNDTXOLU-MXAVVETBSA-N Ile-Leu-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N DBXXASNNDTXOLU-MXAVVETBSA-N 0.000 description 1
- FCWFBHMAJZGWRY-XUXIUFHCSA-N Ile-Leu-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)O)N FCWFBHMAJZGWRY-XUXIUFHCSA-N 0.000 description 1
- GVKKVHNRTUFCCE-BJDJZHNGSA-N Ile-Leu-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)O)N GVKKVHNRTUFCCE-BJDJZHNGSA-N 0.000 description 1
- RMNMUUCYTMLWNA-ZPFDUUQYSA-N Ile-Lys-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N RMNMUUCYTMLWNA-ZPFDUUQYSA-N 0.000 description 1
- PNTWNAXGBOZMBO-MNXVOIDGSA-N Ile-Lys-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N PNTWNAXGBOZMBO-MNXVOIDGSA-N 0.000 description 1
- YSGBJIQXTIVBHZ-AJNGGQMLSA-N Ile-Lys-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O YSGBJIQXTIVBHZ-AJNGGQMLSA-N 0.000 description 1
- GLYJPWIRLBAIJH-FQUUOJAGSA-N Ile-Lys-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N GLYJPWIRLBAIJH-FQUUOJAGSA-N 0.000 description 1
- GLYJPWIRLBAIJH-UHFFFAOYSA-N Ile-Lys-Pro Natural products CCC(C)C(N)C(=O)NC(CCCCN)C(=O)N1CCCC1C(O)=O GLYJPWIRLBAIJH-UHFFFAOYSA-N 0.000 description 1
- RVNOXPZHMUWCLW-GMOBBJLQSA-N Ile-Met-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)N)C(=O)O)N RVNOXPZHMUWCLW-GMOBBJLQSA-N 0.000 description 1
- RCMNUBZKIIJCOI-ZPFDUUQYSA-N Ile-Met-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N RCMNUBZKIIJCOI-ZPFDUUQYSA-N 0.000 description 1
- FTUZWJVSNZMLPI-RVMXOQNASA-N Ile-Met-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N1CCC[C@@H]1C(=O)O)N FTUZWJVSNZMLPI-RVMXOQNASA-N 0.000 description 1
- UOPBQSJRBONRON-STECZYCISA-N Ile-Met-Tyr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 UOPBQSJRBONRON-STECZYCISA-N 0.000 description 1
- CIDLJWVDMNDKPT-FIRPJDEBSA-N Ile-Phe-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)O)N CIDLJWVDMNDKPT-FIRPJDEBSA-N 0.000 description 1
- BJECXJHLUJXPJQ-PYJNHQTQSA-N Ile-Pro-His Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N BJECXJHLUJXPJQ-PYJNHQTQSA-N 0.000 description 1
- NLZVTPYXYXMCIP-XUXIUFHCSA-N Ile-Pro-Lys Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(O)=O NLZVTPYXYXMCIP-XUXIUFHCSA-N 0.000 description 1
- ZNOBVZFCHNHKHA-KBIXCLLPSA-N Ile-Ser-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZNOBVZFCHNHKHA-KBIXCLLPSA-N 0.000 description 1
- VGSPNSSCMOHRRR-BJDJZHNGSA-N Ile-Ser-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N VGSPNSSCMOHRRR-BJDJZHNGSA-N 0.000 description 1
- ZDNNDIJTUHQCAM-MXAVVETBSA-N Ile-Ser-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N ZDNNDIJTUHQCAM-MXAVVETBSA-N 0.000 description 1
- JNLSTRPWUXOORL-MMWGEVLESA-N Ile-Ser-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N JNLSTRPWUXOORL-MMWGEVLESA-N 0.000 description 1
- CNMOKANDJMLAIF-CIQUZCHMSA-N Ile-Thr-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O CNMOKANDJMLAIF-CIQUZCHMSA-N 0.000 description 1
- YCKPUHHMCFSUMD-IUKAMOBKSA-N Ile-Thr-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YCKPUHHMCFSUMD-IUKAMOBKSA-N 0.000 description 1
- NAFIFZNBSPWYOO-RWRJDSDZSA-N Ile-Thr-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N NAFIFZNBSPWYOO-RWRJDSDZSA-N 0.000 description 1
- AUIYHFRUOOKTGX-UKJIMTQDSA-N Ile-Val-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N AUIYHFRUOOKTGX-UKJIMTQDSA-N 0.000 description 1
- UYODHPPSCXBNCS-XUXIUFHCSA-N Ile-Val-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(C)C UYODHPPSCXBNCS-XUXIUFHCSA-N 0.000 description 1
- WIYDLTIBHZSPKY-HJWJTTGWSA-N Ile-Val-Phe Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 WIYDLTIBHZSPKY-HJWJTTGWSA-N 0.000 description 1
- RQZFWBLDTBDEOF-RNJOBUHISA-N Ile-Val-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N RQZFWBLDTBDEOF-RNJOBUHISA-N 0.000 description 1
- SWNRZNLXMXRCJC-VKOGCVSHSA-N Ile-Val-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)[C@@H](C)CC)C(O)=O)=CNC2=C1 SWNRZNLXMXRCJC-VKOGCVSHSA-N 0.000 description 1
- YHFPHRUWZMEOIX-CYDGBPFRSA-N Ile-Val-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(=O)O)N YHFPHRUWZMEOIX-CYDGBPFRSA-N 0.000 description 1
- 108010061833 Integrases Proteins 0.000 description 1
- XUJNEKJLAYXESH-REOHCLBHSA-N L-Cysteine Chemical compound SC[C@H](N)C(O)=O XUJNEKJLAYXESH-REOHCLBHSA-N 0.000 description 1
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 1
- 108091026898 Leader sequence (mRNA) Proteins 0.000 description 1
- HBJZFCIVFIBNSV-DCAQKATOSA-N Leu-Arg-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(N)=O)C(O)=O HBJZFCIVFIBNSV-DCAQKATOSA-N 0.000 description 1
- REPPKAMYTOJTFC-DCAQKATOSA-N Leu-Arg-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O REPPKAMYTOJTFC-DCAQKATOSA-N 0.000 description 1
- GRZSCTXVCDUIPO-SRVKXCTJSA-N Leu-Arg-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O GRZSCTXVCDUIPO-SRVKXCTJSA-N 0.000 description 1
- HASRFYOMVPJRPU-SRVKXCTJSA-N Leu-Arg-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(O)=O)C(O)=O HASRFYOMVPJRPU-SRVKXCTJSA-N 0.000 description 1
- IGUOAYLTQJLPPD-DCAQKATOSA-N Leu-Asn-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IGUOAYLTQJLPPD-DCAQKATOSA-N 0.000 description 1
- MDVZJYGNAGLPGJ-KKUMJFAQSA-N Leu-Asn-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MDVZJYGNAGLPGJ-KKUMJFAQSA-N 0.000 description 1
- FIJMQLGQLBLBOL-HJGDQZAQSA-N Leu-Asn-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FIJMQLGQLBLBOL-HJGDQZAQSA-N 0.000 description 1
- ZURHXHNAEJJRNU-CIUDSAMLSA-N Leu-Asp-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ZURHXHNAEJJRNU-CIUDSAMLSA-N 0.000 description 1
- PJYSOYLLTJKZHC-GUBZILKMSA-N Leu-Asp-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(N)=O PJYSOYLLTJKZHC-GUBZILKMSA-N 0.000 description 1
- ULXYQAJWJGLCNR-YUMQZZPRSA-N Leu-Asp-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O ULXYQAJWJGLCNR-YUMQZZPRSA-N 0.000 description 1
- JQSXWJXBASFONF-KKUMJFAQSA-N Leu-Asp-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JQSXWJXBASFONF-KKUMJFAQSA-N 0.000 description 1
- RRSLQOLASISYTB-CIUDSAMLSA-N Leu-Cys-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(O)=O)C(O)=O RRSLQOLASISYTB-CIUDSAMLSA-N 0.000 description 1
- NHHKSOGJYNQENP-SRVKXCTJSA-N Leu-Cys-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)O)N NHHKSOGJYNQENP-SRVKXCTJSA-N 0.000 description 1
- LOLUPZNNADDTAA-AVGNSLFASA-N Leu-Gln-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LOLUPZNNADDTAA-AVGNSLFASA-N 0.000 description 1
- ZFNLIDNJUWNIJL-WDCWCFNPSA-N Leu-Glu-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZFNLIDNJUWNIJL-WDCWCFNPSA-N 0.000 description 1
- OXRLYTYUXAQTHP-YUMQZZPRSA-N Leu-Gly-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](C)C(O)=O OXRLYTYUXAQTHP-YUMQZZPRSA-N 0.000 description 1
- VWHGTYCRDRBSFI-ZETCQYMHSA-N Leu-Gly-Gly Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)NCC(O)=O VWHGTYCRDRBSFI-ZETCQYMHSA-N 0.000 description 1
- KVOFSTUWVSQMDK-KKUMJFAQSA-N Leu-His-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CN=CN1 KVOFSTUWVSQMDK-KKUMJFAQSA-N 0.000 description 1
- SGIIOQQGLUUMDQ-IHRRRGAJSA-N Leu-His-Val Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](C(C)C)C(=O)O)N SGIIOQQGLUUMDQ-IHRRRGAJSA-N 0.000 description 1
- DBSLVQBXKVKDKJ-BJDJZHNGSA-N Leu-Ile-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O DBSLVQBXKVKDKJ-BJDJZHNGSA-N 0.000 description 1
- AVEGDIAXTDVBJS-XUXIUFHCSA-N Leu-Ile-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AVEGDIAXTDVBJS-XUXIUFHCSA-N 0.000 description 1
- KOSWSHVQIVTVQF-ZPFDUUQYSA-N Leu-Ile-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O KOSWSHVQIVTVQF-ZPFDUUQYSA-N 0.000 description 1
- JFSGIJSCJFQGSZ-MXAVVETBSA-N Leu-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC(C)C)N JFSGIJSCJFQGSZ-MXAVVETBSA-N 0.000 description 1
- OMHLATXVNQSALM-FQUUOJAGSA-N Leu-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(C)C)N OMHLATXVNQSALM-FQUUOJAGSA-N 0.000 description 1
- DSFYPIUSAMSERP-IHRRRGAJSA-N Leu-Leu-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DSFYPIUSAMSERP-IHRRRGAJSA-N 0.000 description 1
- ZRHDPZAAWLXXIR-SRVKXCTJSA-N Leu-Lys-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O ZRHDPZAAWLXXIR-SRVKXCTJSA-N 0.000 description 1
- JLWZLIQRYCTYBD-IHRRRGAJSA-N Leu-Lys-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JLWZLIQRYCTYBD-IHRRRGAJSA-N 0.000 description 1
- ZGUMORRUBUCXEH-AVGNSLFASA-N Leu-Lys-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZGUMORRUBUCXEH-AVGNSLFASA-N 0.000 description 1
- REPBGZHJKYWFMJ-KKUMJFAQSA-N Leu-Lys-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N REPBGZHJKYWFMJ-KKUMJFAQSA-N 0.000 description 1
- QNTJIDXQHWUBKC-BZSNNMDCSA-N Leu-Lys-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QNTJIDXQHWUBKC-BZSNNMDCSA-N 0.000 description 1
- RTIRBWJPYJYTLO-MELADBBJSA-N Leu-Lys-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N RTIRBWJPYJYTLO-MELADBBJSA-N 0.000 description 1
- HDHQQEDVWQGBEE-DCAQKATOSA-N Leu-Met-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(O)=O HDHQQEDVWQGBEE-DCAQKATOSA-N 0.000 description 1
- KQFZKDITNUEVFJ-JYJNAYRXSA-N Leu-Phe-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CC=CC=C1 KQFZKDITNUEVFJ-JYJNAYRXSA-N 0.000 description 1
- PTRKPHUGYULXPU-KKUMJFAQSA-N Leu-Phe-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O PTRKPHUGYULXPU-KKUMJFAQSA-N 0.000 description 1
- YWKNKRAKOCLOLH-OEAJRASXSA-N Leu-Phe-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CC1=CC=CC=C1 YWKNKRAKOCLOLH-OEAJRASXSA-N 0.000 description 1
- MUCIDQMDOYQYBR-IHRRRGAJSA-N Leu-Pro-His Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N MUCIDQMDOYQYBR-IHRRRGAJSA-N 0.000 description 1
- KZZCOWMDDXDKSS-CIUDSAMLSA-N Leu-Ser-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O KZZCOWMDDXDKSS-CIUDSAMLSA-N 0.000 description 1
- AKVBOOKXVAMKSS-GUBZILKMSA-N Leu-Ser-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O AKVBOOKXVAMKSS-GUBZILKMSA-N 0.000 description 1
- GOFJOGXGMPHOGL-DCAQKATOSA-N Leu-Ser-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(C)C GOFJOGXGMPHOGL-DCAQKATOSA-N 0.000 description 1
- SBANPBVRHYIMRR-GARJFASQSA-N Leu-Ser-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N SBANPBVRHYIMRR-GARJFASQSA-N 0.000 description 1
- SBANPBVRHYIMRR-UHFFFAOYSA-N Leu-Ser-Pro Natural products CC(C)CC(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O SBANPBVRHYIMRR-UHFFFAOYSA-N 0.000 description 1
- BRTVHXHCUSXYRI-CIUDSAMLSA-N Leu-Ser-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O BRTVHXHCUSXYRI-CIUDSAMLSA-N 0.000 description 1
- SQUFDMCWMFOEBA-KKUMJFAQSA-N Leu-Ser-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 SQUFDMCWMFOEBA-KKUMJFAQSA-N 0.000 description 1
- ICYRCNICGBJLGM-HJGDQZAQSA-N Leu-Thr-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(O)=O ICYRCNICGBJLGM-HJGDQZAQSA-N 0.000 description 1
- VDIARPPNADFEAV-WEDXCCLWSA-N Leu-Thr-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O VDIARPPNADFEAV-WEDXCCLWSA-N 0.000 description 1
- KLSUAWUZBMAZCL-RHYQMDGZSA-N Leu-Thr-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(O)=O KLSUAWUZBMAZCL-RHYQMDGZSA-N 0.000 description 1
- WUHBLPVELFTPQK-KKUMJFAQSA-N Leu-Tyr-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O WUHBLPVELFTPQK-KKUMJFAQSA-N 0.000 description 1
- ISSAURVGLGAPDK-KKUMJFAQSA-N Leu-Tyr-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O ISSAURVGLGAPDK-KKUMJFAQSA-N 0.000 description 1
- SXOFUVGLPHCPRQ-KKUMJFAQSA-N Leu-Tyr-Cys Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CS)C(O)=O SXOFUVGLPHCPRQ-KKUMJFAQSA-N 0.000 description 1
- VHTIZYYHIUHMCA-JYJNAYRXSA-N Leu-Tyr-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O VHTIZYYHIUHMCA-JYJNAYRXSA-N 0.000 description 1
- ARNIBBOXIAWUOP-MGHWNKPDSA-N Leu-Tyr-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ARNIBBOXIAWUOP-MGHWNKPDSA-N 0.000 description 1
- UFPLDOKWDNTTRP-ULQDDVLXSA-N Leu-Tyr-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CC=C(O)C=C1 UFPLDOKWDNTTRP-ULQDDVLXSA-N 0.000 description 1
- RDFIVFHPOSOXMW-ACRUOGEOSA-N Leu-Tyr-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O RDFIVFHPOSOXMW-ACRUOGEOSA-N 0.000 description 1
- FBNPMTNBFFAMMH-AVGNSLFASA-N Leu-Val-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-AVGNSLFASA-N 0.000 description 1
- FBNPMTNBFFAMMH-UHFFFAOYSA-N Leu-Val-Arg Natural products CC(C)CC(N)C(=O)NC(C(C)C)C(=O)NC(C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-UHFFFAOYSA-N 0.000 description 1
- MVJRBCJCRYGCKV-GVXVVHGQSA-N Leu-Val-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O MVJRBCJCRYGCKV-GVXVVHGQSA-N 0.000 description 1
- QQXJROOJCMIHIV-AVGNSLFASA-N Leu-Val-Met Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCSC)C(O)=O QQXJROOJCMIHIV-AVGNSLFASA-N 0.000 description 1
- 102000003960 Ligases Human genes 0.000 description 1
- 108090000364 Ligases Proteins 0.000 description 1
- 235000004431 Linum usitatissimum Nutrition 0.000 description 1
- 240000006240 Linum usitatissimum Species 0.000 description 1
- RVOMPSJXSRPFJT-DCAQKATOSA-N Lys-Ala-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O RVOMPSJXSRPFJT-DCAQKATOSA-N 0.000 description 1
- GGAPIOORBXHMNY-ULQDDVLXSA-N Lys-Arg-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCCN)N)O GGAPIOORBXHMNY-ULQDDVLXSA-N 0.000 description 1
- DNEJSAIMVANNPA-DCAQKATOSA-N Lys-Asn-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O DNEJSAIMVANNPA-DCAQKATOSA-N 0.000 description 1
- NCTDKZKNBDZDOL-GARJFASQSA-N Lys-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N)C(=O)O NCTDKZKNBDZDOL-GARJFASQSA-N 0.000 description 1
- PXHCFKXNSBJSTQ-KKUMJFAQSA-N Lys-Asn-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N)O PXHCFKXNSBJSTQ-KKUMJFAQSA-N 0.000 description 1
- HKCCVDWHHTVVPN-CIUDSAMLSA-N Lys-Asp-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O HKCCVDWHHTVVPN-CIUDSAMLSA-N 0.000 description 1
- QUYCUALODHJQLK-CIUDSAMLSA-N Lys-Asp-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O QUYCUALODHJQLK-CIUDSAMLSA-N 0.000 description 1
- SQXUUGUCGJSWCK-CIUDSAMLSA-N Lys-Asp-Cys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CS)C(=O)O)N SQXUUGUCGJSWCK-CIUDSAMLSA-N 0.000 description 1
- PHHYNOUOUWYQRO-XIRDDKMYSA-N Lys-Asp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCCN)N PHHYNOUOUWYQRO-XIRDDKMYSA-N 0.000 description 1
- YVMQJGWLHRWMDF-MNXVOIDGSA-N Lys-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCCN)N YVMQJGWLHRWMDF-MNXVOIDGSA-N 0.000 description 1
- NNCDAORZCMPZPX-GUBZILKMSA-N Lys-Gln-Ser Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N NNCDAORZCMPZPX-GUBZILKMSA-N 0.000 description 1
- HEWWNLVEWBJBKA-WDCWCFNPSA-N Lys-Gln-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCCCN HEWWNLVEWBJBKA-WDCWCFNPSA-N 0.000 description 1
- DRCILAJNUJKAHC-SRVKXCTJSA-N Lys-Glu-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O DRCILAJNUJKAHC-SRVKXCTJSA-N 0.000 description 1
- KZOHPCYVORJBLG-AVGNSLFASA-N Lys-Glu-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCCN)N KZOHPCYVORJBLG-AVGNSLFASA-N 0.000 description 1
- DUTMKEAPLLUGNO-JYJNAYRXSA-N Lys-Glu-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O DUTMKEAPLLUGNO-JYJNAYRXSA-N 0.000 description 1
- PAMDBWYMLWOELY-SDDRHHMPSA-N Lys-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCCN)N)C(=O)O PAMDBWYMLWOELY-SDDRHHMPSA-N 0.000 description 1
- VEGLGAOVLFODGC-GUBZILKMSA-N Lys-Glu-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O VEGLGAOVLFODGC-GUBZILKMSA-N 0.000 description 1
- FHIAJWBDZVHLAH-YUMQZZPRSA-N Lys-Gly-Ser Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O FHIAJWBDZVHLAH-YUMQZZPRSA-N 0.000 description 1
- CANPXOLVTMKURR-WEDXCCLWSA-N Lys-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN CANPXOLVTMKURR-WEDXCCLWSA-N 0.000 description 1
- HAUUXTXKJNVIFY-ONGXEEELSA-N Lys-Gly-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O HAUUXTXKJNVIFY-ONGXEEELSA-N 0.000 description 1
- FGMHXLULNHTPID-KKUMJFAQSA-N Lys-His-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(O)=O)CC1=CN=CN1 FGMHXLULNHTPID-KKUMJFAQSA-N 0.000 description 1
- HQXSFFSLXFHWOX-IXOXFDKPSA-N Lys-His-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCCCN)N)O HQXSFFSLXFHWOX-IXOXFDKPSA-N 0.000 description 1
- QBEPTBMRQALPEV-MNXVOIDGSA-N Lys-Ile-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN QBEPTBMRQALPEV-MNXVOIDGSA-N 0.000 description 1
- MUXNCRWTWBMNHX-SRVKXCTJSA-N Lys-Leu-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O MUXNCRWTWBMNHX-SRVKXCTJSA-N 0.000 description 1
- ONPDTSFZAIWMDI-AVGNSLFASA-N Lys-Leu-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O ONPDTSFZAIWMDI-AVGNSLFASA-N 0.000 description 1
- VMTYLUGCXIEDMV-QWRGUYRKSA-N Lys-Leu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CCCCN VMTYLUGCXIEDMV-QWRGUYRKSA-N 0.000 description 1
- OIQSIMFSVLLWBX-VOAKCMCISA-N Lys-Leu-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OIQSIMFSVLLWBX-VOAKCMCISA-N 0.000 description 1
- ZJWIXBZTAAJERF-IHRRRGAJSA-N Lys-Lys-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CCCN=C(N)N ZJWIXBZTAAJERF-IHRRRGAJSA-N 0.000 description 1
- PYFNONMJYNJENN-AVGNSLFASA-N Lys-Lys-Gln Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N PYFNONMJYNJENN-AVGNSLFASA-N 0.000 description 1
- KJIXWRWPOCKYLD-IHRRRGAJSA-N Lys-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N KJIXWRWPOCKYLD-IHRRRGAJSA-N 0.000 description 1
- BEGQVWUZFXLNHZ-IHPCNDPISA-N Lys-Lys-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN)C(O)=O)=CNC2=C1 BEGQVWUZFXLNHZ-IHPCNDPISA-N 0.000 description 1
- GZGWILAQHOVXTD-DCAQKATOSA-N Lys-Met-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(O)=O GZGWILAQHOVXTD-DCAQKATOSA-N 0.000 description 1
- DAHQKYYIXPBESV-UWVGGRQHSA-N Lys-Met-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)NCC(O)=O DAHQKYYIXPBESV-UWVGGRQHSA-N 0.000 description 1
- ODTZHNZPINULEU-KKUMJFAQSA-N Lys-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N ODTZHNZPINULEU-KKUMJFAQSA-N 0.000 description 1
- TWPCWKVOZDUYAA-KKUMJFAQSA-N Lys-Phe-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O TWPCWKVOZDUYAA-KKUMJFAQSA-N 0.000 description 1
- CENKQZWVYMLRAX-ULQDDVLXSA-N Lys-Phe-Met Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(O)=O CENKQZWVYMLRAX-ULQDDVLXSA-N 0.000 description 1
- IPTUBUUIFRZMJK-ACRUOGEOSA-N Lys-Phe-Phe Chemical compound C([C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 IPTUBUUIFRZMJK-ACRUOGEOSA-N 0.000 description 1
- LUAJJLPHUXPQLH-KKUMJFAQSA-N Lys-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCCCN)N LUAJJLPHUXPQLH-KKUMJFAQSA-N 0.000 description 1
- WLXGMVVHTIUPHE-ULQDDVLXSA-N Lys-Phe-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O WLXGMVVHTIUPHE-ULQDDVLXSA-N 0.000 description 1
- BOJYMMBYBNOOGG-DCAQKATOSA-N Lys-Pro-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O BOJYMMBYBNOOGG-DCAQKATOSA-N 0.000 description 1
- AFLBTVGQCQLOFJ-AVGNSLFASA-N Lys-Pro-Arg Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O AFLBTVGQCQLOFJ-AVGNSLFASA-N 0.000 description 1
- LECIJRIRMVOFMH-ULQDDVLXSA-N Lys-Pro-Phe Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 LECIJRIRMVOFMH-ULQDDVLXSA-N 0.000 description 1
- XFANQCRHTMOEAP-WDSOQIARSA-N Lys-Pro-Trp Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O XFANQCRHTMOEAP-WDSOQIARSA-N 0.000 description 1
- MGKFCQFVPKOWOL-CIUDSAMLSA-N Lys-Ser-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)O)C(=O)O)N MGKFCQFVPKOWOL-CIUDSAMLSA-N 0.000 description 1
- JMNRXRPBHFGXQX-GUBZILKMSA-N Lys-Ser-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O JMNRXRPBHFGXQX-GUBZILKMSA-N 0.000 description 1
- ZUGVARDEGWMMLK-SRVKXCTJSA-N Lys-Ser-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN ZUGVARDEGWMMLK-SRVKXCTJSA-N 0.000 description 1
- QVTDVTONTRSQMF-WDCWCFNPSA-N Lys-Thr-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CCCCN QVTDVTONTRSQMF-WDCWCFNPSA-N 0.000 description 1
- DLCAXBGXGOVUCD-PPCPHDFISA-N Lys-Thr-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DLCAXBGXGOVUCD-PPCPHDFISA-N 0.000 description 1
- IEIHKHYMBIYQTH-YESZJQIVSA-N Lys-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CCCCN)N)C(=O)O IEIHKHYMBIYQTH-YESZJQIVSA-N 0.000 description 1
- TXTZMVNJIRZABH-ULQDDVLXSA-N Lys-Val-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 TXTZMVNJIRZABH-ULQDDVLXSA-N 0.000 description 1
- RIPJMCFGQHGHNP-RHYQMDGZSA-N Lys-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CCCCN)N)O RIPJMCFGQHGHNP-RHYQMDGZSA-N 0.000 description 1
- HMZPYMSEAALNAE-ULQDDVLXSA-N Lys-Val-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O HMZPYMSEAALNAE-ULQDDVLXSA-N 0.000 description 1
- 241000218922 Magnoliophyta Species 0.000 description 1
- VHGIWFGJIHTASW-FXQIFTODSA-N Met-Ala-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O VHGIWFGJIHTASW-FXQIFTODSA-N 0.000 description 1
- BVXXDMUMHMXFER-BPNCWPANSA-N Met-Ala-Tyr Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O BVXXDMUMHMXFER-BPNCWPANSA-N 0.000 description 1
- XMMWDTUFTZMQFD-GMOBBJLQSA-N Met-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCSC XMMWDTUFTZMQFD-GMOBBJLQSA-N 0.000 description 1
- YCUSPBPZVJDMII-YUMQZZPRSA-N Met-Gly-Glu Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O YCUSPBPZVJDMII-YUMQZZPRSA-N 0.000 description 1
- FZUNSVYYPYJYAP-NAKRPEOUSA-N Met-Ile-Ala Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O FZUNSVYYPYJYAP-NAKRPEOUSA-N 0.000 description 1
- RRIHXWPHQSXHAQ-XUXIUFHCSA-N Met-Ile-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCCN)C(O)=O RRIHXWPHQSXHAQ-XUXIUFHCSA-N 0.000 description 1
- AFFKUNVPPLQUGA-DCAQKATOSA-N Met-Leu-Ala Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O AFFKUNVPPLQUGA-DCAQKATOSA-N 0.000 description 1
- KMSMNUFBNCHMII-IHRRRGAJSA-N Met-Leu-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN KMSMNUFBNCHMII-IHRRRGAJSA-N 0.000 description 1
- CHDYFPCQVUOJEB-ULQDDVLXSA-N Met-Leu-Phe Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 CHDYFPCQVUOJEB-ULQDDVLXSA-N 0.000 description 1
- BEZJTLKUMFMITF-AVGNSLFASA-N Met-Lys-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CCCNC(N)=N BEZJTLKUMFMITF-AVGNSLFASA-N 0.000 description 1
- MSSJHBAKDDIRMJ-SRVKXCTJSA-N Met-Lys-Gln Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O MSSJHBAKDDIRMJ-SRVKXCTJSA-N 0.000 description 1
- DSZFTPCSFVWMKP-DCAQKATOSA-N Met-Ser-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN DSZFTPCSFVWMKP-DCAQKATOSA-N 0.000 description 1
- KSIPKXNIQOWMIC-RCWTZXSCSA-N Met-Thr-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCCNC(N)=N KSIPKXNIQOWMIC-RCWTZXSCSA-N 0.000 description 1
- FXBKQTOGURNXSL-HJGDQZAQSA-N Met-Thr-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCC(O)=O FXBKQTOGURNXSL-HJGDQZAQSA-N 0.000 description 1
- WXJLBSXNUHIGSS-OSUNSFLBSA-N Met-Thr-Ile Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WXJLBSXNUHIGSS-OSUNSFLBSA-N 0.000 description 1
- NBEFNGUZUOUGFG-KKUMJFAQSA-N Met-Tyr-Gln Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N NBEFNGUZUOUGFG-KKUMJFAQSA-N 0.000 description 1
- PNHRPOWKRRJATF-IHRRRGAJSA-N Met-Tyr-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=C(O)C=C1 PNHRPOWKRRJATF-IHRRRGAJSA-N 0.000 description 1
- GHQFLTYXGUETFD-UFYCRDLUSA-N Met-Tyr-Tyr Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N GHQFLTYXGUETFD-UFYCRDLUSA-N 0.000 description 1
- 241000588629 Moraxella lacunata Species 0.000 description 1
- 244000061176 Nicotiana tabacum Species 0.000 description 1
- 235000002637 Nicotiana tabacum Nutrition 0.000 description 1
- 238000000636 Northern blotting Methods 0.000 description 1
- 108091007494 Nucleic acid-binding domains Proteins 0.000 description 1
- 240000007594 Oryza sativa Species 0.000 description 1
- 235000007164 Oryza sativa Nutrition 0.000 description 1
- 230000010718 Oxidation Activity Effects 0.000 description 1
- 238000012408 PCR amplification Methods 0.000 description 1
- 229910019142 PO4 Inorganic materials 0.000 description 1
- 206010034133 Pathogen resistance Diseases 0.000 description 1
- 239000006002 Pepper Substances 0.000 description 1
- JVTMTFMMMHAPCR-UBHSHLNASA-N Phe-Ala-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 JVTMTFMMMHAPCR-UBHSHLNASA-N 0.000 description 1
- NEHSHYOUIWBYSA-DCPHZVHLSA-N Phe-Ala-Trp Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC3=CC=CC=C3)N NEHSHYOUIWBYSA-DCPHZVHLSA-N 0.000 description 1
- SEPNOAFMZLLCEW-UBHSHLNASA-N Phe-Ala-Val Chemical compound N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)O SEPNOAFMZLLCEW-UBHSHLNASA-N 0.000 description 1
- LGBVMDMZZFYSFW-HJWJTTGWSA-N Phe-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC1=CC=CC=C1)N LGBVMDMZZFYSFW-HJWJTTGWSA-N 0.000 description 1
- QCHNRQQVLJYDSI-DLOVCJGASA-N Phe-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 QCHNRQQVLJYDSI-DLOVCJGASA-N 0.000 description 1
- KIAWKQJTSGRCSA-AVGNSLFASA-N Phe-Asn-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N KIAWKQJTSGRCSA-AVGNSLFASA-N 0.000 description 1
- MECSIDWUTYRHRJ-KKUMJFAQSA-N Phe-Asn-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O MECSIDWUTYRHRJ-KKUMJFAQSA-N 0.000 description 1
- XMPUYNHKEPFERE-IHRRRGAJSA-N Phe-Asp-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 XMPUYNHKEPFERE-IHRRRGAJSA-N 0.000 description 1
- SWZKMTDPQXLQRD-XVSYOHENSA-N Phe-Asp-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SWZKMTDPQXLQRD-XVSYOHENSA-N 0.000 description 1
- CUMXHKAOHNWRFQ-BZSNNMDCSA-N Phe-Asp-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 CUMXHKAOHNWRFQ-BZSNNMDCSA-N 0.000 description 1
- FGXIJNMDRCZVDE-KKUMJFAQSA-N Phe-Cys-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)O)N FGXIJNMDRCZVDE-KKUMJFAQSA-N 0.000 description 1
- YEEFZOKPYOUXMX-KKUMJFAQSA-N Phe-Gln-Met Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O YEEFZOKPYOUXMX-KKUMJFAQSA-N 0.000 description 1
- GDBOREPXIRKSEQ-FHWLQOOXSA-N Phe-Gln-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GDBOREPXIRKSEQ-FHWLQOOXSA-N 0.000 description 1
- MGBRZXXGQBAULP-DRZSPHRISA-N Phe-Glu-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 MGBRZXXGQBAULP-DRZSPHRISA-N 0.000 description 1
- KYYMILWEGJYPQZ-IHRRRGAJSA-N Phe-Glu-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 KYYMILWEGJYPQZ-IHRRRGAJSA-N 0.000 description 1
- KJJROSNFBRWPHS-JYJNAYRXSA-N Phe-Glu-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O KJJROSNFBRWPHS-JYJNAYRXSA-N 0.000 description 1
- OYQBFWWQSVIHBN-FHWLQOOXSA-N Phe-Glu-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O OYQBFWWQSVIHBN-FHWLQOOXSA-N 0.000 description 1
- CSDMCMITJLKBAH-SOUVJXGZSA-N Phe-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O CSDMCMITJLKBAH-SOUVJXGZSA-N 0.000 description 1
- JJHVFCUWLSKADD-ONGXEEELSA-N Phe-Gly-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](C)C(O)=O JJHVFCUWLSKADD-ONGXEEELSA-N 0.000 description 1
- WPTYDQPGBMDUBI-QWRGUYRKSA-N Phe-Gly-Asn Chemical compound N[C@@H](Cc1ccccc1)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O WPTYDQPGBMDUBI-QWRGUYRKSA-N 0.000 description 1
- NHCKESBLOMHIIE-IRXDYDNUSA-N Phe-Gly-Phe Chemical compound C([C@H](N)C(=O)NCC(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 NHCKESBLOMHIIE-IRXDYDNUSA-N 0.000 description 1
- FXYXBEZMRACDDR-KKUMJFAQSA-N Phe-His-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(O)=O FXYXBEZMRACDDR-KKUMJFAQSA-N 0.000 description 1
- ZKSLXIGKRJMALF-MGHWNKPDSA-N Phe-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC2=CC=CC=C2)N ZKSLXIGKRJMALF-MGHWNKPDSA-N 0.000 description 1
- YZJKNDCEPDDIDA-BZSNNMDCSA-N Phe-His-Lys Chemical compound C([C@@H](C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC=CC=1)C1=CN=CN1 YZJKNDCEPDDIDA-BZSNNMDCSA-N 0.000 description 1
- SPXWRYVHOZVYBU-ULQDDVLXSA-N Phe-His-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC2=CC=CC=C2)N SPXWRYVHOZVYBU-ULQDDVLXSA-N 0.000 description 1
- GYEPCBNTTRORKW-PCBIJLKTSA-N Phe-Ile-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O GYEPCBNTTRORKW-PCBIJLKTSA-N 0.000 description 1
- WEMYTDDMDBLPMI-DKIMLUQUSA-N Phe-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N WEMYTDDMDBLPMI-DKIMLUQUSA-N 0.000 description 1
- BWTKUQPNOMMKMA-FIRPJDEBSA-N Phe-Ile-Phe Chemical compound C([C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 BWTKUQPNOMMKMA-FIRPJDEBSA-N 0.000 description 1
- YKUGPVXSDOOANW-KKUMJFAQSA-N Phe-Leu-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O YKUGPVXSDOOANW-KKUMJFAQSA-N 0.000 description 1
- KDYPMIZMXDECSU-JYJNAYRXSA-N Phe-Leu-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 KDYPMIZMXDECSU-JYJNAYRXSA-N 0.000 description 1
- OSBADCBXAMSPQD-YESZJQIVSA-N Phe-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N OSBADCBXAMSPQD-YESZJQIVSA-N 0.000 description 1
- DMEYUTSDVRCWRS-ULQDDVLXSA-N Phe-Lys-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 DMEYUTSDVRCWRS-ULQDDVLXSA-N 0.000 description 1
- PHJUFDQVVKVOPU-ULQDDVLXSA-N Phe-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CC=CC=C1)N PHJUFDQVVKVOPU-ULQDDVLXSA-N 0.000 description 1
- BSHMIVKDJQGLNT-ACRUOGEOSA-N Phe-Lys-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 BSHMIVKDJQGLNT-ACRUOGEOSA-N 0.000 description 1
- SZYBZVANEAOIPE-UBHSHLNASA-N Phe-Met-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O SZYBZVANEAOIPE-UBHSHLNASA-N 0.000 description 1
- PTLMYJOMJLTMCB-KKUMJFAQSA-N Phe-Met-Gln Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N PTLMYJOMJLTMCB-KKUMJFAQSA-N 0.000 description 1
- SRILZRSXIKRGBF-HRCADAONSA-N Phe-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N SRILZRSXIKRGBF-HRCADAONSA-N 0.000 description 1
- RYQWALWYQWBUKN-FHWLQOOXSA-N Phe-Phe-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O RYQWALWYQWBUKN-FHWLQOOXSA-N 0.000 description 1
- TXJJXEXCZBHDNA-ACRUOGEOSA-N Phe-Phe-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)N TXJJXEXCZBHDNA-ACRUOGEOSA-N 0.000 description 1
- MGLBSROLWAWCKN-FCLVOEFKSA-N Phe-Phe-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MGLBSROLWAWCKN-FCLVOEFKSA-N 0.000 description 1
- AFNJAQVMTIQTCB-DLOVCJGASA-N Phe-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=CC=C1 AFNJAQVMTIQTCB-DLOVCJGASA-N 0.000 description 1
- UNBFGVQVQGXXCK-KKUMJFAQSA-N Phe-Ser-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O UNBFGVQVQGXXCK-KKUMJFAQSA-N 0.000 description 1
- GLJZDMZJHFXJQG-BZSNNMDCSA-N Phe-Ser-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GLJZDMZJHFXJQG-BZSNNMDCSA-N 0.000 description 1
- QSWKNJAPHQDAAS-MELADBBJSA-N Phe-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O QSWKNJAPHQDAAS-MELADBBJSA-N 0.000 description 1
- MCIXMYKSPQUMJG-SRVKXCTJSA-N Phe-Ser-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O MCIXMYKSPQUMJG-SRVKXCTJSA-N 0.000 description 1
- LTAWNJXSRUCFAN-UNQGMJICSA-N Phe-Thr-Arg Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O LTAWNJXSRUCFAN-UNQGMJICSA-N 0.000 description 1
- YFXXRYFWJFQAFW-JHYOHUSXSA-N Phe-Thr-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N)O YFXXRYFWJFQAFW-JHYOHUSXSA-N 0.000 description 1
- GTMSCDVFQLNEOY-BZSNNMDCSA-N Phe-Tyr-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N GTMSCDVFQLNEOY-BZSNNMDCSA-N 0.000 description 1
- JSGWNFKWZNPDAV-YDHLFZDLSA-N Phe-Val-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 JSGWNFKWZNPDAV-YDHLFZDLSA-N 0.000 description 1
- YUPRIZTWANWWHK-DZKIICNBSA-N Phe-Val-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N YUPRIZTWANWWHK-DZKIICNBSA-N 0.000 description 1
- JTKGCYOOJLUETJ-ULQDDVLXSA-N Phe-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 JTKGCYOOJLUETJ-ULQDDVLXSA-N 0.000 description 1
- 102000004160 Phosphoric Monoester Hydrolases Human genes 0.000 description 1
- 108090000608 Phosphoric Monoester Hydrolases Proteins 0.000 description 1
- 108091000080 Phosphotransferase Proteins 0.000 description 1
- 235000016761 Piper aduncum Nutrition 0.000 description 1
- 240000003889 Piper guineense Species 0.000 description 1
- 235000017804 Piper guineense Nutrition 0.000 description 1
- 235000008184 Piper nigrum Nutrition 0.000 description 1
- 239000004952 Polyamide Substances 0.000 description 1
- 239000002202 Polyethylene glycol Substances 0.000 description 1
- DZZCICYRSZASNF-FXQIFTODSA-N Pro-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 DZZCICYRSZASNF-FXQIFTODSA-N 0.000 description 1
- CGBYDGAJHSOGFQ-LPEHRKFASA-N Pro-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 CGBYDGAJHSOGFQ-LPEHRKFASA-N 0.000 description 1
- UTAUEDINXUMHLG-FXQIFTODSA-N Pro-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H]1CCCN1 UTAUEDINXUMHLG-FXQIFTODSA-N 0.000 description 1
- CJZTUKSFZUSNCC-FXQIFTODSA-N Pro-Asp-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H]1CCCN1 CJZTUKSFZUSNCC-FXQIFTODSA-N 0.000 description 1
- ODPIUQVTULPQEP-CIUDSAMLSA-N Pro-Gln-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@@H]1CCCN1 ODPIUQVTULPQEP-CIUDSAMLSA-N 0.000 description 1
- UAYHMOIGIQZLFR-NHCYSSNCSA-N Pro-Gln-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O UAYHMOIGIQZLFR-NHCYSSNCSA-N 0.000 description 1
- MGDFPGCFVJFITQ-CIUDSAMLSA-N Pro-Glu-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MGDFPGCFVJFITQ-CIUDSAMLSA-N 0.000 description 1
- NXEYSLRNNPWCRN-SRVKXCTJSA-N Pro-Glu-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NXEYSLRNNPWCRN-SRVKXCTJSA-N 0.000 description 1
- VOZIBWWZSBIXQN-SRVKXCTJSA-N Pro-Glu-Lys Chemical compound NCCCC[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1)C(O)=O VOZIBWWZSBIXQN-SRVKXCTJSA-N 0.000 description 1
- SSWJYJHXQOYTSP-SRVKXCTJSA-N Pro-His-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(O)=O SSWJYJHXQOYTSP-SRVKXCTJSA-N 0.000 description 1
- BBFRBZYKHIKFBX-GMOBBJLQSA-N Pro-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@@H]1CCCN1 BBFRBZYKHIKFBX-GMOBBJLQSA-N 0.000 description 1
- VZKBJNBZMZHKRC-XUXIUFHCSA-N Pro-Ile-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O VZKBJNBZMZHKRC-XUXIUFHCSA-N 0.000 description 1
- BCNRNJWSRFDPTQ-HJWJTTGWSA-N Pro-Ile-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O BCNRNJWSRFDPTQ-HJWJTTGWSA-N 0.000 description 1
- FKVNLUZHSFCNGY-RVMXOQNASA-N Pro-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 FKVNLUZHSFCNGY-RVMXOQNASA-N 0.000 description 1
- MCWHYUWXVNRXFV-RWMBFGLXSA-N Pro-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 MCWHYUWXVNRXFV-RWMBFGLXSA-N 0.000 description 1
- VTFXTWDFPTWNJY-RHYQMDGZSA-N Pro-Leu-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VTFXTWDFPTWNJY-RHYQMDGZSA-N 0.000 description 1
- OFGUOWQVEGTVNU-DCAQKATOSA-N Pro-Lys-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O OFGUOWQVEGTVNU-DCAQKATOSA-N 0.000 description 1
- ZLXKLMHAMDENIO-DCAQKATOSA-N Pro-Lys-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O ZLXKLMHAMDENIO-DCAQKATOSA-N 0.000 description 1
- RMODQFBNDDENCP-IHRRRGAJSA-N Pro-Lys-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O RMODQFBNDDENCP-IHRRRGAJSA-N 0.000 description 1
- MHHQQZIFLWFZGR-DCAQKATOSA-N Pro-Lys-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O MHHQQZIFLWFZGR-DCAQKATOSA-N 0.000 description 1
- MHBSUKYVBZVQRW-HJWJTTGWSA-N Pro-Phe-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MHBSUKYVBZVQRW-HJWJTTGWSA-N 0.000 description 1
- HOTVCUAVDQHUDB-UFYCRDLUSA-N Pro-Phe-Tyr Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H](CC=1C=CC=CC=1)NC(=O)[C@H]1NCCC1)C1=CC=C(O)C=C1 HOTVCUAVDQHUDB-UFYCRDLUSA-N 0.000 description 1
- LNICFEXCAHIJOR-DCAQKATOSA-N Pro-Ser-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O LNICFEXCAHIJOR-DCAQKATOSA-N 0.000 description 1
- CHYAYDLYYIJCKY-OSUNSFLBSA-N Pro-Thr-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O CHYAYDLYYIJCKY-OSUNSFLBSA-N 0.000 description 1
- RSTWKJFWBKFOFC-JYJNAYRXSA-N Pro-Trp-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(N)=O)C(O)=O RSTWKJFWBKFOFC-JYJNAYRXSA-N 0.000 description 1
- BVRBCQBUNGAWFP-KKUMJFAQSA-N Pro-Tyr-Gln Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O BVRBCQBUNGAWFP-KKUMJFAQSA-N 0.000 description 1
- VEUACYMXJKXALX-IHRRRGAJSA-N Pro-Tyr-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O VEUACYMXJKXALX-IHRRRGAJSA-N 0.000 description 1
- QDDJNKWPTJHROJ-UFYCRDLUSA-N Pro-Tyr-Tyr Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H]1NCCC1)C1=CC=C(O)C=C1 QDDJNKWPTJHROJ-UFYCRDLUSA-N 0.000 description 1
- XDKKMRPRRCOELJ-GUBZILKMSA-N Pro-Val-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 XDKKMRPRRCOELJ-GUBZILKMSA-N 0.000 description 1
- WWXNZNWZNZPDIF-SRVKXCTJSA-N Pro-Val-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 WWXNZNWZNZPDIF-SRVKXCTJSA-N 0.000 description 1
- 102100033091 Putative C->U-editing enzyme APOBEC-4 Human genes 0.000 description 1
- 108091093078 Pyrimidine dimer Proteins 0.000 description 1
- 108010003201 RGH 0205 Proteins 0.000 description 1
- 238000010357 RNA editing Methods 0.000 description 1
- 230000026279 RNA modification Effects 0.000 description 1
- 102000044126 RNA-Binding Proteins Human genes 0.000 description 1
- 108700020471 RNA-Binding Proteins Proteins 0.000 description 1
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 1
- 238000011529 RT qPCR Methods 0.000 description 1
- 102000001218 Rec A Recombinases Human genes 0.000 description 1
- 108010055016 Rec A Recombinases Proteins 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 102000018120 Recombinases Human genes 0.000 description 1
- 108010091086 Recombinases Proteins 0.000 description 1
- ZUGXSSFMTXKHJS-ZLUOBGJFSA-N Ser-Ala-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O ZUGXSSFMTXKHJS-ZLUOBGJFSA-N 0.000 description 1
- FIXILCYTSAUERA-FXQIFTODSA-N Ser-Ala-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FIXILCYTSAUERA-FXQIFTODSA-N 0.000 description 1
- WTUJZHKANPDPIN-CIUDSAMLSA-N Ser-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N WTUJZHKANPDPIN-CIUDSAMLSA-N 0.000 description 1
- JPIDMRXXNMIVKY-VZFHVOOUSA-N Ser-Ala-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JPIDMRXXNMIVKY-VZFHVOOUSA-N 0.000 description 1
- YUSRGTQIPCJNHQ-CIUDSAMLSA-N Ser-Arg-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O YUSRGTQIPCJNHQ-CIUDSAMLSA-N 0.000 description 1
- QFBNNYNWKYKVJO-DCAQKATOSA-N Ser-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CCCN=C(N)N QFBNNYNWKYKVJO-DCAQKATOSA-N 0.000 description 1
- BCKYYTVFBXHPOG-ACZMJKKPSA-N Ser-Asn-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CO)N BCKYYTVFBXHPOG-ACZMJKKPSA-N 0.000 description 1
- FIDMVVBUOCMMJG-CIUDSAMLSA-N Ser-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CO FIDMVVBUOCMMJG-CIUDSAMLSA-N 0.000 description 1
- VGNYHOBZJKWRGI-CIUDSAMLSA-N Ser-Asn-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CO VGNYHOBZJKWRGI-CIUDSAMLSA-N 0.000 description 1
- KAAPNMOKUUPKOE-SRVKXCTJSA-N Ser-Asn-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KAAPNMOKUUPKOE-SRVKXCTJSA-N 0.000 description 1
- FTVRVZNYIYWJGB-ACZMJKKPSA-N Ser-Asp-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O FTVRVZNYIYWJGB-ACZMJKKPSA-N 0.000 description 1
- BNFVPSRLHHPQKS-WHFBIAKZSA-N Ser-Asp-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O BNFVPSRLHHPQKS-WHFBIAKZSA-N 0.000 description 1
- OLIJLNWFEQEFDM-SRVKXCTJSA-N Ser-Asp-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OLIJLNWFEQEFDM-SRVKXCTJSA-N 0.000 description 1
- CRZRTKAVUUGKEQ-ACZMJKKPSA-N Ser-Gln-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O CRZRTKAVUUGKEQ-ACZMJKKPSA-N 0.000 description 1
- YPUSXTWURJANKF-KBIXCLLPSA-N Ser-Gln-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YPUSXTWURJANKF-KBIXCLLPSA-N 0.000 description 1
- OJPHFSOMBZKQKQ-GUBZILKMSA-N Ser-Gln-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CO OJPHFSOMBZKQKQ-GUBZILKMSA-N 0.000 description 1
- HVKMTOIAYDOJPL-NRPADANISA-N Ser-Gln-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O HVKMTOIAYDOJPL-NRPADANISA-N 0.000 description 1
- PVDTYLHUWAEYGY-CIUDSAMLSA-N Ser-Glu-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O PVDTYLHUWAEYGY-CIUDSAMLSA-N 0.000 description 1
- UOLGINIHBRIECN-FXQIFTODSA-N Ser-Glu-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O UOLGINIHBRIECN-FXQIFTODSA-N 0.000 description 1
- DSGYZICNAMEJOC-AVGNSLFASA-N Ser-Glu-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O DSGYZICNAMEJOC-AVGNSLFASA-N 0.000 description 1
- WOUIMBGNEUWXQG-VKHMYHEASA-N Ser-Gly Chemical compound OC[C@H](N)C(=O)NCC(O)=O WOUIMBGNEUWXQG-VKHMYHEASA-N 0.000 description 1
- WSTIOCFMWXNOCX-YUMQZZPRSA-N Ser-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CO)N WSTIOCFMWXNOCX-YUMQZZPRSA-N 0.000 description 1
- UIGMAMGZOJVTDN-WHFBIAKZSA-N Ser-Gly-Ser Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O UIGMAMGZOJVTDN-WHFBIAKZSA-N 0.000 description 1
- SFTZWNJFZYOLBD-ZDLURKLDSA-N Ser-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CO SFTZWNJFZYOLBD-ZDLURKLDSA-N 0.000 description 1
- CXBFHZLODKPIJY-AAEUAGOBSA-N Ser-Gly-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CO)N CXBFHZLODKPIJY-AAEUAGOBSA-N 0.000 description 1
- HMRAQFJFTOLDKW-GUBZILKMSA-N Ser-His-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(O)=O HMRAQFJFTOLDKW-GUBZILKMSA-N 0.000 description 1
- WEQAYODCJHZSJZ-KKUMJFAQSA-N Ser-His-Tyr Chemical compound C([C@H](NC(=O)[C@H](CO)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CN=CN1 WEQAYODCJHZSJZ-KKUMJFAQSA-N 0.000 description 1
- ZUDXUJSYCCNZQJ-DCAQKATOSA-N Ser-His-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CO)N ZUDXUJSYCCNZQJ-DCAQKATOSA-N 0.000 description 1
- BKZYBLLIBOBOOW-GHCJXIJMSA-N Ser-Ile-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O BKZYBLLIBOBOOW-GHCJXIJMSA-N 0.000 description 1
- IFPBAGJBHSNYPR-ZKWXMUAHSA-N Ser-Ile-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O IFPBAGJBHSNYPR-ZKWXMUAHSA-N 0.000 description 1
- RIAKPZVSNBBNRE-BJDJZHNGSA-N Ser-Ile-Leu Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O RIAKPZVSNBBNRE-BJDJZHNGSA-N 0.000 description 1
- NLOAIFSWUUFQFR-CIUDSAMLSA-N Ser-Leu-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O NLOAIFSWUUFQFR-CIUDSAMLSA-N 0.000 description 1
- WGDYNRCOQRERLZ-KKUMJFAQSA-N Ser-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CO)N WGDYNRCOQRERLZ-KKUMJFAQSA-N 0.000 description 1
- NQZFFLBPNDLTPO-DLOVCJGASA-N Ser-Phe-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CO)N NQZFFLBPNDLTPO-DLOVCJGASA-N 0.000 description 1
- UPLYXVPQLJVWMM-KKUMJFAQSA-N Ser-Phe-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O UPLYXVPQLJVWMM-KKUMJFAQSA-N 0.000 description 1
- NMZXJDSKEGFDLJ-DCAQKATOSA-N Ser-Pro-Lys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CO)N)C(=O)N[C@@H](CCCCN)C(=O)O NMZXJDSKEGFDLJ-DCAQKATOSA-N 0.000 description 1
- CKDXFSPMIDSMGV-GUBZILKMSA-N Ser-Pro-Val Chemical compound [H]N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O CKDXFSPMIDSMGV-GUBZILKMSA-N 0.000 description 1
- FZXOPYUEQGDGMS-ACZMJKKPSA-N Ser-Ser-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O FZXOPYUEQGDGMS-ACZMJKKPSA-N 0.000 description 1
- SRSPTFBENMJHMR-WHFBIAKZSA-N Ser-Ser-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SRSPTFBENMJHMR-WHFBIAKZSA-N 0.000 description 1
- CUXJENOFJXOSOZ-BIIVOSGPSA-N Ser-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CO)N)C(=O)O CUXJENOFJXOSOZ-BIIVOSGPSA-N 0.000 description 1
- XJDMUQCLVSCRSJ-VZFHVOOUSA-N Ser-Thr-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O XJDMUQCLVSCRSJ-VZFHVOOUSA-N 0.000 description 1
- SQHKXWODKJDZRC-LKXGYXEUSA-N Ser-Thr-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O SQHKXWODKJDZRC-LKXGYXEUSA-N 0.000 description 1
- KKKVOZNCLALMPV-XKBZYTNZSA-N Ser-Thr-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O KKKVOZNCLALMPV-XKBZYTNZSA-N 0.000 description 1
- SNXUIBACCONSOH-BWBBJGPYSA-N Ser-Thr-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CO)C(O)=O SNXUIBACCONSOH-BWBBJGPYSA-N 0.000 description 1
- ZKOKTQPHFMRSJP-YJRXYDGGSA-N Ser-Thr-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZKOKTQPHFMRSJP-YJRXYDGGSA-N 0.000 description 1
- FHXGMDRKJHKLKW-QWRGUYRKSA-N Ser-Tyr-Gly Chemical compound OC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=C(O)C=C1 FHXGMDRKJHKLKW-QWRGUYRKSA-N 0.000 description 1
- HKHCTNFKZXAMIF-KKUMJFAQSA-N Ser-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CC1=CC=C(O)C=C1 HKHCTNFKZXAMIF-KKUMJFAQSA-N 0.000 description 1
- BEBVVQPDSHHWQL-NRPADANISA-N Ser-Val-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O BEBVVQPDSHHWQL-NRPADANISA-N 0.000 description 1
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 1
- 108020004688 Small Nuclear RNA Proteins 0.000 description 1
- 102000039471 Small Nuclear RNA Human genes 0.000 description 1
- 244000061458 Solanum melongena Species 0.000 description 1
- 235000002597 Solanum melongena Nutrition 0.000 description 1
- 240000006394 Sorghum bicolor Species 0.000 description 1
- 235000011684 Sorghum saccharatum Nutrition 0.000 description 1
- 238000002105 Southern blotting Methods 0.000 description 1
- 108091081024 Start codon Proteins 0.000 description 1
- 235000021536 Sugar beet Nutrition 0.000 description 1
- NINIDFKCEFEMDL-UHFFFAOYSA-N Sulfur Chemical group [S] NINIDFKCEFEMDL-UHFFFAOYSA-N 0.000 description 1
- 108010006785 Taq Polymerase Proteins 0.000 description 1
- 108020005038 Terminator Codon Proteins 0.000 description 1
- 241000605257 Thiomicrospira sp. Species 0.000 description 1
- 241001645838 Thiospira Species 0.000 description 1
- MQCPGOZXFSYJPS-KZVJFYERSA-N Thr-Ala-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O MQCPGOZXFSYJPS-KZVJFYERSA-N 0.000 description 1
- DFTCYYILCSQGIZ-GCJQMDKQSA-N Thr-Ala-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O DFTCYYILCSQGIZ-GCJQMDKQSA-N 0.000 description 1
- ZUXQFMVPAYGPFJ-JXUBOQSCSA-N Thr-Ala-Lys Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN ZUXQFMVPAYGPFJ-JXUBOQSCSA-N 0.000 description 1
- KEGBFULVYKYJRD-LFSVMHDDSA-N Thr-Ala-Phe Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KEGBFULVYKYJRD-LFSVMHDDSA-N 0.000 description 1
- GLQFKOVWXPPFTP-VEVYYDQMSA-N Thr-Arg-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O GLQFKOVWXPPFTP-VEVYYDQMSA-N 0.000 description 1
- JMQUAZXYFAEOIH-XGEHTFHBSA-N Thr-Arg-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CS)C(=O)O)N)O JMQUAZXYFAEOIH-XGEHTFHBSA-N 0.000 description 1
- VASYSJHSMSBTDU-LKXGYXEUSA-N Thr-Asn-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CS)C(=O)O)N)O VASYSJHSMSBTDU-LKXGYXEUSA-N 0.000 description 1
- LMMDEZPNUTZJAY-GCJQMDKQSA-N Thr-Asp-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O LMMDEZPNUTZJAY-GCJQMDKQSA-N 0.000 description 1
- MFEBUIFJVPNZLO-OLHMAJIHSA-N Thr-Asp-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O MFEBUIFJVPNZLO-OLHMAJIHSA-N 0.000 description 1
- GNHRVXYZKWSJTF-HJGDQZAQSA-N Thr-Asp-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N)O GNHRVXYZKWSJTF-HJGDQZAQSA-N 0.000 description 1
- ZUUDNCOCILSYAM-KKHAAJSZSA-N Thr-Asp-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O ZUUDNCOCILSYAM-KKHAAJSZSA-N 0.000 description 1
- WLDUCKSCDRIVLJ-NUMRIWBASA-N Thr-Gln-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O WLDUCKSCDRIVLJ-NUMRIWBASA-N 0.000 description 1
- GARULAKWZGFIKC-RWRJDSDZSA-N Thr-Gln-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O GARULAKWZGFIKC-RWRJDSDZSA-N 0.000 description 1
- LIXBDERDAGNVAV-XKBZYTNZSA-N Thr-Gln-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O LIXBDERDAGNVAV-XKBZYTNZSA-N 0.000 description 1
- JMGJDTNUMAZNLX-RWRJDSDZSA-N Thr-Glu-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JMGJDTNUMAZNLX-RWRJDSDZSA-N 0.000 description 1
- KCRQEJSKXAIULJ-FJXKBIBVSA-N Thr-Gly-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O KCRQEJSKXAIULJ-FJXKBIBVSA-N 0.000 description 1
- NIEWSKWFURSECR-FOHZUACHSA-N Thr-Gly-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O NIEWSKWFURSECR-FOHZUACHSA-N 0.000 description 1
- AQAMPXBRJJWPNI-JHEQGTHGSA-N Thr-Gly-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AQAMPXBRJJWPNI-JHEQGTHGSA-N 0.000 description 1
- MPUMPERGHHJGRP-WEDXCCLWSA-N Thr-Gly-Lys Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N)O MPUMPERGHHJGRP-WEDXCCLWSA-N 0.000 description 1
- JQAWYCUUFIMTHE-WLTAIBSBSA-N Thr-Gly-Tyr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JQAWYCUUFIMTHE-WLTAIBSBSA-N 0.000 description 1
- WBCCCPZIJIJTSD-TUBUOCAGSA-N Thr-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H]([C@@H](C)O)N WBCCCPZIJIJTSD-TUBUOCAGSA-N 0.000 description 1
- AYCQVUUPIJHJTA-IXOXFDKPSA-N Thr-His-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O AYCQVUUPIJHJTA-IXOXFDKPSA-N 0.000 description 1
- FKIGTIXHSRNKJU-IXOXFDKPSA-N Thr-His-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@H](O)C)CC1=CN=CN1 FKIGTIXHSRNKJU-IXOXFDKPSA-N 0.000 description 1
- SXAGUVRFGJSFKC-ZEILLAHLSA-N Thr-His-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SXAGUVRFGJSFKC-ZEILLAHLSA-N 0.000 description 1
- CRZNCABIJLRFKZ-IUKAMOBKSA-N Thr-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N CRZNCABIJLRFKZ-IUKAMOBKSA-N 0.000 description 1
- JRAUIKJSEAKTGD-TUBUOCAGSA-N Thr-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N JRAUIKJSEAKTGD-TUBUOCAGSA-N 0.000 description 1
- AHOLTQCAVBSUDP-PPCPHDFISA-N Thr-Ile-Lys Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](N)[C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O AHOLTQCAVBSUDP-PPCPHDFISA-N 0.000 description 1
- GXUWHVZYDAHFSV-FLBSBUHZSA-N Thr-Ile-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GXUWHVZYDAHFSV-FLBSBUHZSA-N 0.000 description 1
- BVOVIGCHYNFJBZ-JXUBOQSCSA-N Thr-Leu-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O BVOVIGCHYNFJBZ-JXUBOQSCSA-N 0.000 description 1
- VTVVYQOXJCZVEB-WDCWCFNPSA-N Thr-Leu-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O VTVVYQOXJCZVEB-WDCWCFNPSA-N 0.000 description 1
- MECLEFZMPPOEAC-VOAKCMCISA-N Thr-Leu-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MECLEFZMPPOEAC-VOAKCMCISA-N 0.000 description 1
- VRUFCJZQDACGLH-UVOCVTCTSA-N Thr-Leu-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VRUFCJZQDACGLH-UVOCVTCTSA-N 0.000 description 1
- IJVNLNRVDUTWDD-MEYUZBJRSA-N Thr-Leu-Tyr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O IJVNLNRVDUTWDD-MEYUZBJRSA-N 0.000 description 1
- KRDSCBLRHORMRK-JXUBOQSCSA-N Thr-Lys-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O KRDSCBLRHORMRK-JXUBOQSCSA-N 0.000 description 1
- SCSVNSNWUTYSFO-WDCWCFNPSA-N Thr-Lys-Glu Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O SCSVNSNWUTYSFO-WDCWCFNPSA-N 0.000 description 1
- ZXIHABSKUITPTN-IXOXFDKPSA-N Thr-Lys-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N)O ZXIHABSKUITPTN-IXOXFDKPSA-N 0.000 description 1
- SPVHQURZJCUDQC-VOAKCMCISA-N Thr-Lys-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O SPVHQURZJCUDQC-VOAKCMCISA-N 0.000 description 1
- MGJLBZFUXUGMML-VOAKCMCISA-N Thr-Lys-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MGJLBZFUXUGMML-VOAKCMCISA-N 0.000 description 1
- KKPOGALELPLJTL-MEYUZBJRSA-N Thr-Lys-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 KKPOGALELPLJTL-MEYUZBJRSA-N 0.000 description 1
- MXDOAJQRJBMGMO-FJXKBIBVSA-N Thr-Pro-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O MXDOAJQRJBMGMO-FJXKBIBVSA-N 0.000 description 1
- YGZWVPBHYABGLT-KJEVXHAQSA-N Thr-Pro-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 YGZWVPBHYABGLT-KJEVXHAQSA-N 0.000 description 1
- DOBIBIXIHJKVJF-XKBZYTNZSA-N Thr-Ser-Gln Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(N)=O DOBIBIXIHJKVJF-XKBZYTNZSA-N 0.000 description 1
- IQPWNQRRAJHOKV-KATARQTJSA-N Thr-Ser-Lys Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN IQPWNQRRAJHOKV-KATARQTJSA-N 0.000 description 1
- WPSKTVVMQCXPRO-BWBBJGPYSA-N Thr-Ser-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WPSKTVVMQCXPRO-BWBBJGPYSA-N 0.000 description 1
- NLWDSYKZUPRMBJ-IEGACIPQSA-N Thr-Trp-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC(C)C)C(=O)O)N)O NLWDSYKZUPRMBJ-IEGACIPQSA-N 0.000 description 1
- LXXCHJKHJYRMIY-FQPOAREZSA-N Thr-Tyr-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O LXXCHJKHJYRMIY-FQPOAREZSA-N 0.000 description 1
- BZTSQFWJNJYZSX-JRQIVUDYSA-N Thr-Tyr-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O BZTSQFWJNJYZSX-JRQIVUDYSA-N 0.000 description 1
- ABCLYRRGTZNIFU-BWAGICSOSA-N Thr-Tyr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O ABCLYRRGTZNIFU-BWAGICSOSA-N 0.000 description 1
- XVHAUVJXBFGUPC-RPTUDFQQSA-N Thr-Tyr-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O XVHAUVJXBFGUPC-RPTUDFQQSA-N 0.000 description 1
- ILUOMMDDGREELW-OSUNSFLBSA-N Thr-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)[C@@H](C)O ILUOMMDDGREELW-OSUNSFLBSA-N 0.000 description 1
- KZTLZZQTJMCGIP-ZJDVBMNYSA-N Thr-Val-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KZTLZZQTJMCGIP-ZJDVBMNYSA-N 0.000 description 1
- 108091036066 Three prime untranslated region Proteins 0.000 description 1
- 102000008579 Transposases Human genes 0.000 description 1
- 108010020764 Transposases Proteins 0.000 description 1
- 235000021307 Triticum Nutrition 0.000 description 1
- 244000098338 Triticum aestivum Species 0.000 description 1
- MJBBMTOGSOSAKJ-HJXMPXNTSA-N Trp-Ala-Ile Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MJBBMTOGSOSAKJ-HJXMPXNTSA-N 0.000 description 1
- PEYSVKMXSLPQRU-FJHTZYQYSA-N Trp-Ala-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O PEYSVKMXSLPQRU-FJHTZYQYSA-N 0.000 description 1
- RSUXQZNWAOTBQF-XIRDDKMYSA-N Trp-Arg-Gln Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N RSUXQZNWAOTBQF-XIRDDKMYSA-N 0.000 description 1
- KZIQDVNORJKTMO-WDSOQIARSA-N Trp-Arg-His Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)N KZIQDVNORJKTMO-WDSOQIARSA-N 0.000 description 1
- CPZTZWFFGVKHEA-SZMVWBNQSA-N Trp-Gln-His Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)N CPZTZWFFGVKHEA-SZMVWBNQSA-N 0.000 description 1
- DZIKVMCFXIIETR-JSGCOSHPSA-N Trp-Gly-Glu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O DZIKVMCFXIIETR-JSGCOSHPSA-N 0.000 description 1
- HLDFBNPSURDYEN-VHWLVUOQSA-N Trp-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N HLDFBNPSURDYEN-VHWLVUOQSA-N 0.000 description 1
- PWPJLBWYRTVYQS-PMVMPFDFSA-N Trp-Phe-Leu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O PWPJLBWYRTVYQS-PMVMPFDFSA-N 0.000 description 1
- XOSGQKFEIOCPIJ-SZMVWBNQSA-N Trp-Pro-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC2=CNC3=CC=CC=C32)N XOSGQKFEIOCPIJ-SZMVWBNQSA-N 0.000 description 1
- SEXRBCGSZRCIPE-LYSGOOTNSA-N Trp-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O SEXRBCGSZRCIPE-LYSGOOTNSA-N 0.000 description 1
- VMXLNDRJXVAJFT-JYBASQMISA-N Trp-Thr-Ser Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O VMXLNDRJXVAJFT-JYBASQMISA-N 0.000 description 1
- BURPTJBFWIOHEY-UWJYBYFXSA-N Tyr-Ala-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 BURPTJBFWIOHEY-UWJYBYFXSA-N 0.000 description 1
- IELISNUVHBKYBX-XDTLVQLUSA-N Tyr-Ala-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 IELISNUVHBKYBX-XDTLVQLUSA-N 0.000 description 1
- HSVPZJLMPLMPOX-BPNCWPANSA-N Tyr-Arg-Ala Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O HSVPZJLMPLMPOX-BPNCWPANSA-N 0.000 description 1
- MICSYKFECRFCTJ-IHRRRGAJSA-N Tyr-Arg-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O MICSYKFECRFCTJ-IHRRRGAJSA-N 0.000 description 1
- WDIJBEWLXLQQKD-ULQDDVLXSA-N Tyr-Arg-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O WDIJBEWLXLQQKD-ULQDDVLXSA-N 0.000 description 1
- ADBDQGBDNUTRDB-ULQDDVLXSA-N Tyr-Arg-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O ADBDQGBDNUTRDB-ULQDDVLXSA-N 0.000 description 1
- MTEQZJFSEMXXRK-CFMVVWHZSA-N Tyr-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N MTEQZJFSEMXXRK-CFMVVWHZSA-N 0.000 description 1
- ZNFPUOSTMUMUDR-JRQIVUDYSA-N Tyr-Asn-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZNFPUOSTMUMUDR-JRQIVUDYSA-N 0.000 description 1
- NLMXVDDEQFKQQU-CFMVVWHZSA-N Tyr-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NLMXVDDEQFKQQU-CFMVVWHZSA-N 0.000 description 1
- JFDGVHXRCKEBAU-KKUMJFAQSA-N Tyr-Asp-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N)O JFDGVHXRCKEBAU-KKUMJFAQSA-N 0.000 description 1
- IYHNBRUWVBIVJR-IHRRRGAJSA-N Tyr-Gln-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 IYHNBRUWVBIVJR-IHRRRGAJSA-N 0.000 description 1
- UXUFNBVCPAWACG-SIUGBPQLSA-N Tyr-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N UXUFNBVCPAWACG-SIUGBPQLSA-N 0.000 description 1
- NZFCWALTLNFHHC-JYJNAYRXSA-N Tyr-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NZFCWALTLNFHHC-JYJNAYRXSA-N 0.000 description 1
- FBHBVXUBTYVCRU-BZSNNMDCSA-N Tyr-His-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CN=CN1 FBHBVXUBTYVCRU-BZSNNMDCSA-N 0.000 description 1
- USYGMBIIUDLYHJ-GVARAGBVSA-N Tyr-Ile-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 USYGMBIIUDLYHJ-GVARAGBVSA-N 0.000 description 1
- OHOVFPKXPZODHS-SJWGOKEGSA-N Tyr-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N OHOVFPKXPZODHS-SJWGOKEGSA-N 0.000 description 1
- BXPOOVDVGWEXDU-WZLNRYEVSA-N Tyr-Ile-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BXPOOVDVGWEXDU-WZLNRYEVSA-N 0.000 description 1
- FJBCEFPCVPHPPM-STECZYCISA-N Tyr-Ile-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O FJBCEFPCVPHPPM-STECZYCISA-N 0.000 description 1
- GULIUBBXCYPDJU-CQDKDKBSSA-N Tyr-Leu-Ala Chemical compound [O-]C(=O)[C@H](C)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]([NH3+])CC1=CC=C(O)C=C1 GULIUBBXCYPDJU-CQDKDKBSSA-N 0.000 description 1
- DWAMXBFJNZIHMC-KBPBESRZSA-N Tyr-Leu-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O DWAMXBFJNZIHMC-KBPBESRZSA-N 0.000 description 1
- CWVHKVVKAQIJKY-ACRUOGEOSA-N Tyr-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC2=CC=C(C=C2)O)N CWVHKVVKAQIJKY-ACRUOGEOSA-N 0.000 description 1
- SINRIKQYQJRGDQ-MEYUZBJRSA-N Tyr-Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 SINRIKQYQJRGDQ-MEYUZBJRSA-N 0.000 description 1
- LRHBBGDMBLFYGL-FHWLQOOXSA-N Tyr-Phe-Glu Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CCC(O)=O)C(O)=O)C1=CC=C(O)C=C1 LRHBBGDMBLFYGL-FHWLQOOXSA-N 0.000 description 1
- FASACHWGQBNSRO-ZEWNOJEFSA-N Tyr-Phe-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CC2=CC=C(C=C2)O)N FASACHWGQBNSRO-ZEWNOJEFSA-N 0.000 description 1
- JXGUUJMPCRXMSO-HJOGWXRNSA-N Tyr-Phe-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 JXGUUJMPCRXMSO-HJOGWXRNSA-N 0.000 description 1
- VBFVQTPETKJCQW-RPTUDFQQSA-N Tyr-Phe-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VBFVQTPETKJCQW-RPTUDFQQSA-N 0.000 description 1
- CDBXVDXSLPLFMD-BPNCWPANSA-N Tyr-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC1=CC=C(O)C=C1 CDBXVDXSLPLFMD-BPNCWPANSA-N 0.000 description 1
- BIWVVOHTKDLRMP-ULQDDVLXSA-N Tyr-Pro-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O BIWVVOHTKDLRMP-ULQDDVLXSA-N 0.000 description 1
- RCMWNNJFKNDKQR-UFYCRDLUSA-N Tyr-Pro-Phe Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 RCMWNNJFKNDKQR-UFYCRDLUSA-N 0.000 description 1
- SOAUMCDLIUGXJJ-SRVKXCTJSA-N Tyr-Ser-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O SOAUMCDLIUGXJJ-SRVKXCTJSA-N 0.000 description 1
- QPOUERMDWKKZEG-HJPIBITLSA-N Tyr-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 QPOUERMDWKKZEG-HJPIBITLSA-N 0.000 description 1
- UUBKSZNKJUJQEJ-JRQIVUDYSA-N Tyr-Thr-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N)O UUBKSZNKJUJQEJ-JRQIVUDYSA-N 0.000 description 1
- AEOFMCAKYIQQFY-YDHLFZDLSA-N Tyr-Val-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AEOFMCAKYIQQFY-YDHLFZDLSA-N 0.000 description 1
- SQUMHUZLJDUROQ-YDHLFZDLSA-N Tyr-Val-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O SQUMHUZLJDUROQ-YDHLFZDLSA-N 0.000 description 1
- PQPWEALFTLKSEB-DZKIICNBSA-N Tyr-Val-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O PQPWEALFTLKSEB-DZKIICNBSA-N 0.000 description 1
- 102000006275 Ubiquitin-Protein Ligases Human genes 0.000 description 1
- 108010083111 Ubiquitin-Protein Ligases Proteins 0.000 description 1
- 108020004417 Untranslated RNA Proteins 0.000 description 1
- 102000039634 Untranslated RNA Human genes 0.000 description 1
- UEOOXDLMQZBPFR-ZKWXMUAHSA-N Val-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N UEOOXDLMQZBPFR-ZKWXMUAHSA-N 0.000 description 1
- IZFVRRYRMQFVGX-NRPADANISA-N Val-Ala-Gln Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N IZFVRRYRMQFVGX-NRPADANISA-N 0.000 description 1
- NMANTMWGQZASQN-QXEWZRGKSA-N Val-Arg-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N NMANTMWGQZASQN-QXEWZRGKSA-N 0.000 description 1
- PAPWZOJOLKZEFR-AVGNSLFASA-N Val-Arg-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N PAPWZOJOLKZEFR-AVGNSLFASA-N 0.000 description 1
- QGFPYRPIUXBYGR-YDHLFZDLSA-N Val-Asn-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N QGFPYRPIUXBYGR-YDHLFZDLSA-N 0.000 description 1
- IDKGBVZGNTYYCC-QXEWZRGKSA-N Val-Asn-Pro Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(O)=O IDKGBVZGNTYYCC-QXEWZRGKSA-N 0.000 description 1
- DBOXBUDEAJVKRE-LSJOCFKGSA-N Val-Asn-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](C(C)C)C(=O)O)N DBOXBUDEAJVKRE-LSJOCFKGSA-N 0.000 description 1
- TZVUSFMQWPWHON-NHCYSSNCSA-N Val-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N TZVUSFMQWPWHON-NHCYSSNCSA-N 0.000 description 1
- YODDULVCGFQRFZ-ZKWXMUAHSA-N Val-Asp-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O YODDULVCGFQRFZ-ZKWXMUAHSA-N 0.000 description 1
- SCBITHMBEJNRHC-LSJOCFKGSA-N Val-Asp-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](C(C)C)C(=O)O)N SCBITHMBEJNRHC-LSJOCFKGSA-N 0.000 description 1
- XEYUMGGWQCIWAR-XVKPBYJWSA-N Val-Gln-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)NCC(=O)O)N XEYUMGGWQCIWAR-XVKPBYJWSA-N 0.000 description 1
- PWRITNSESKQTPW-NRPADANISA-N Val-Gln-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N PWRITNSESKQTPW-NRPADANISA-N 0.000 description 1
- XWYUBUYQMOUFRQ-IFFSRLJSSA-N Val-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N)O XWYUBUYQMOUFRQ-IFFSRLJSSA-N 0.000 description 1
- KZKMBGXCNLPYKD-YEPSODPASA-N Val-Gly-Thr Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O KZKMBGXCNLPYKD-YEPSODPASA-N 0.000 description 1
- SDSCOOZQQGUQFC-GVXVVHGQSA-N Val-His-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N SDSCOOZQQGUQFC-GVXVVHGQSA-N 0.000 description 1
- LKUDRJSNRWVGMS-QSFUFRPTSA-N Val-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N LKUDRJSNRWVGMS-QSFUFRPTSA-N 0.000 description 1
- VXDSPJJQUQDCKH-UKJIMTQDSA-N Val-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N VXDSPJJQUQDCKH-UKJIMTQDSA-N 0.000 description 1
- FEXILLGKGGTLRI-NHCYSSNCSA-N Val-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N FEXILLGKGGTLRI-NHCYSSNCSA-N 0.000 description 1
- AEMPCGRFEZTWIF-IHRRRGAJSA-N Val-Leu-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O AEMPCGRFEZTWIF-IHRRRGAJSA-N 0.000 description 1
- ZHQWPWQNVRCXAX-XQQFMLRXSA-N Val-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C(C)C)N ZHQWPWQNVRCXAX-XQQFMLRXSA-N 0.000 description 1
- WBAJDGWKRIHOAC-GVXVVHGQSA-N Val-Lys-Gln Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O WBAJDGWKRIHOAC-GVXVVHGQSA-N 0.000 description 1
- MLADEWAIYAPAAU-IHRRRGAJSA-N Val-Lys-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N MLADEWAIYAPAAU-IHRRRGAJSA-N 0.000 description 1
- ZRSZTKTVPNSUNA-IHRRRGAJSA-N Val-Lys-Leu Chemical compound CC(C)C[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)C(C)C)C(O)=O ZRSZTKTVPNSUNA-IHRRRGAJSA-N 0.000 description 1
- XPKCFQZDQGVJCX-RHYQMDGZSA-N Val-Lys-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](C(C)C)N)O XPKCFQZDQGVJCX-RHYQMDGZSA-N 0.000 description 1
- JVGHIFMSFBZDHH-WPRPVWTQSA-N Val-Met-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)NCC(=O)O)N JVGHIFMSFBZDHH-WPRPVWTQSA-N 0.000 description 1
- RYQUMYBMOJYYDK-NHCYSSNCSA-N Val-Pro-Glu Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(=O)O)C(=O)O)N RYQUMYBMOJYYDK-NHCYSSNCSA-N 0.000 description 1
- UGFMVXRXULGLNO-XPUUQOCRSA-N Val-Ser-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O UGFMVXRXULGLNO-XPUUQOCRSA-N 0.000 description 1
- VHIZXDZMTDVFGX-DCAQKATOSA-N Val-Ser-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](C(C)C)N VHIZXDZMTDVFGX-DCAQKATOSA-N 0.000 description 1
- QTPQHINADBYBNA-DCAQKATOSA-N Val-Ser-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN QTPQHINADBYBNA-DCAQKATOSA-N 0.000 description 1
- PZTZYZUTCPZWJH-FXQIFTODSA-N Val-Ser-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)O)N PZTZYZUTCPZWJH-FXQIFTODSA-N 0.000 description 1
- SDHZOOIGIUEPDY-JYJNAYRXSA-N Val-Ser-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CO)NC(=O)[C@@H](N)C(C)C)C(O)=O)=CNC2=C1 SDHZOOIGIUEPDY-JYJNAYRXSA-N 0.000 description 1
- CEKSLIVSNNGOKH-KZVJFYERSA-N Val-Thr-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](C(C)C)N)O CEKSLIVSNNGOKH-KZVJFYERSA-N 0.000 description 1
- QTXGUIMEHKCPBH-FHWLQOOXSA-N Val-Trp-Lys Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)C(C)C)C(=O)N[C@@H](CCCCN)C(O)=O)=CNC2=C1 QTXGUIMEHKCPBH-FHWLQOOXSA-N 0.000 description 1
- 241000607598 Vibrio Species 0.000 description 1
- 108020005202 Viral DNA Proteins 0.000 description 1
- 241000607479 Yersinia pestis Species 0.000 description 1
- 108010081404 acein-2 Proteins 0.000 description 1
- 230000021736 acetylation Effects 0.000 description 1
- 238000006640 acetylation reaction Methods 0.000 description 1
- 102000005421 acetyltransferase Human genes 0.000 description 1
- 108020002494 acetyltransferase Proteins 0.000 description 1
- 239000002253 acid Substances 0.000 description 1
- 230000009471 action Effects 0.000 description 1
- 230000004913 activation Effects 0.000 description 1
- 230000006154 adenylylation Effects 0.000 description 1
- 238000000246 agarose gel electrophoresis Methods 0.000 description 1
- 230000009418 agronomic effect Effects 0.000 description 1
- 235000004279 alanine Nutrition 0.000 description 1
- 108010024078 alanyl-glycyl-serine Proteins 0.000 description 1
- 108010045350 alanyl-tyrosyl-alanine Proteins 0.000 description 1
- 108010087924 alanylproline Proteins 0.000 description 1
- 230000029936 alkylation Effects 0.000 description 1
- 238000005804 alkylation reaction Methods 0.000 description 1
- 230000006229 amino acid addition Effects 0.000 description 1
- 238000000137 annealing Methods 0.000 description 1
- 239000003242 anti bacterial agent Substances 0.000 description 1
- 229940088710 antibiotic agent Drugs 0.000 description 1
- 108010038850 arginyl-isoleucyl-tyrosine Proteins 0.000 description 1
- 108010043240 arginyl-leucyl-glycine Proteins 0.000 description 1
- 108010060035 arginylproline Proteins 0.000 description 1
- 108010027371 asparaginyl-leucyl-prolyl-arginine Proteins 0.000 description 1
- 230000010165 autogamy Effects 0.000 description 1
- 230000001580 bacterial effect Effects 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- DRTQHJPVMGBUCF-PSQAKQOGSA-N beta-L-uridine Natural products O[C@H]1[C@@H](O)[C@H](CO)O[C@@H]1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-PSQAKQOGSA-N 0.000 description 1
- 230000027455 binding Effects 0.000 description 1
- 230000000443 biocontrol Effects 0.000 description 1
- 238000009395 breeding Methods 0.000 description 1
- 230000001488 breeding effect Effects 0.000 description 1
- 239000000872 buffer Substances 0.000 description 1
- 230000015556 catabolic process Effects 0.000 description 1
- 210000002421 cell wall Anatomy 0.000 description 1
- 239000012707 chemical precursor Substances 0.000 description 1
- 239000003153 chemical reaction reagent Substances 0.000 description 1
- 239000005081 chemiluminescent agent Substances 0.000 description 1
- 239000011248 coating agent Substances 0.000 description 1
- 238000000576 coating method Methods 0.000 description 1
- 150000001875 compounds Chemical class 0.000 description 1
- 238000004590 computer program Methods 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 230000001276 controlling effect Effects 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 210000004748 cultured cell Anatomy 0.000 description 1
- 230000003013 cytotoxicity Effects 0.000 description 1
- 231100000135 cytotoxicity Toxicity 0.000 description 1
- SUYVUBYJARFZHO-RRKCRQDMSA-N dATP Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-RRKCRQDMSA-N 0.000 description 1
- SUYVUBYJARFZHO-UHFFFAOYSA-N dATP Natural products C1=NC=2C(N)=NC=NC=2N1C1CC(O)C(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-UHFFFAOYSA-N 0.000 description 1
- RGWHQCVHVJXOKC-SHYZEUOFSA-J dCTP(4-) Chemical compound O=C1N=C(N)C=CN1[C@@H]1O[C@H](COP([O-])(=O)OP([O-])(=O)OP([O-])([O-])=O)[C@@H](O)C1 RGWHQCVHVJXOKC-SHYZEUOFSA-J 0.000 description 1
- HAAZLUGHYHWQIW-KVQBGUIXSA-N dGTP Chemical compound C1=NC=2C(=O)NC(N)=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 HAAZLUGHYHWQIW-KVQBGUIXSA-N 0.000 description 1
- NHVNXKFIZYSCEB-XLPZGREQSA-N dTTP Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)C1 NHVNXKFIZYSCEB-XLPZGREQSA-N 0.000 description 1
- 230000007423 decrease Effects 0.000 description 1
- 230000002950 deficient Effects 0.000 description 1
- 238000006731 degradation reaction Methods 0.000 description 1
- 230000003111 delayed effect Effects 0.000 description 1
- 230000027832 depurination Effects 0.000 description 1
- MXHRCPNRJAMMIM-UHFFFAOYSA-N desoxyuridine Natural products C1C(O)C(CO)OC1N1C(=O)NC(=O)C=C1 MXHRCPNRJAMMIM-UHFFFAOYSA-N 0.000 description 1
- 230000000368 destabilizing effect Effects 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000018109 developmental process Effects 0.000 description 1
- 238000010790 dilution Methods 0.000 description 1
- 239000012895 dilution Substances 0.000 description 1
- XPPKVPWEQAFLFU-UHFFFAOYSA-J diphosphate(4-) Chemical compound [O-]P([O-])(=O)OP([O-])([O-])=O XPPKVPWEQAFLFU-UHFFFAOYSA-J 0.000 description 1
- 235000011180 diphosphates Nutrition 0.000 description 1
- 201000010099 disease Diseases 0.000 description 1
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 1
- 230000034431 double-strand break repair via homologous recombination Effects 0.000 description 1
- 230000011559 double-strand break repair via nonhomologous end joining Effects 0.000 description 1
- 235000013399 edible fruits Nutrition 0.000 description 1
- 230000000408 embryogenic effect Effects 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 230000005284 excitation Effects 0.000 description 1
- 230000001747 exhibiting effect Effects 0.000 description 1
- 238000013401 experimental design Methods 0.000 description 1
- 238000010195 expression analysis Methods 0.000 description 1
- 230000002349 favourable effect Effects 0.000 description 1
- 238000010353 genetic engineering Methods 0.000 description 1
- 238000010362 genome editing Methods 0.000 description 1
- 108010079547 glutamylmethionine Proteins 0.000 description 1
- 230000013595 glycosylation Effects 0.000 description 1
- 238000006206 glycosylation reaction Methods 0.000 description 1
- HPAIKDPJURGQLN-UHFFFAOYSA-N glycyl-L-histidyl-L-phenylalanine Natural products C=1C=CC=CC=1CC(C(O)=O)NC(=O)C(NC(=O)CN)CC1=CN=CN1 HPAIKDPJURGQLN-UHFFFAOYSA-N 0.000 description 1
- 108010090037 glycyl-alanyl-isoleucine Proteins 0.000 description 1
- 108010050475 glycyl-leucyl-tyrosine Proteins 0.000 description 1
- 108010048994 glycyl-tyrosyl-alanine Proteins 0.000 description 1
- 108010084389 glycyltryptophan Proteins 0.000 description 1
- 229960002897 heparin Drugs 0.000 description 1
- 229920000669 heparin Polymers 0.000 description 1
- 229920000140 heteropolymer Polymers 0.000 description 1
- 108010040030 histidinoalanine Proteins 0.000 description 1
- 230000006195 histone acetylation Effects 0.000 description 1
- 239000000710 homodimer Substances 0.000 description 1
- 238000002744 homologous recombination Methods 0.000 description 1
- 230000006801 homologous recombination Effects 0.000 description 1
- 239000010903 husk Substances 0.000 description 1
- 230000007062 hydrolysis Effects 0.000 description 1
- 238000006460 hydrolysis reaction Methods 0.000 description 1
- 230000003301 hydrolyzing effect Effects 0.000 description 1
- 210000000987 immune system Anatomy 0.000 description 1
- 230000001771 impaired effect Effects 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 238000001727 in vivo Methods 0.000 description 1
- 238000010348 incorporation Methods 0.000 description 1
- 230000001939 inductive effect Effects 0.000 description 1
- 230000002401 inhibitory effect Effects 0.000 description 1
- 230000005764 inhibitory process Effects 0.000 description 1
- 230000000977 initiatory effect Effects 0.000 description 1
- 239000012499 inoculation medium Substances 0.000 description 1
- 230000003834 intracellular effect Effects 0.000 description 1
- 230000001788 irregular Effects 0.000 description 1
- 108010009932 leucyl-alanyl-glycyl-valine Proteins 0.000 description 1
- 108010083708 leucyl-aspartyl-valine Proteins 0.000 description 1
- 108010073093 leucyl-glycyl-glycyl-glycine Proteins 0.000 description 1
- 108010047926 leucyl-lysyl-tyrosine Proteins 0.000 description 1
- 239000003446 ligand Substances 0.000 description 1
- 239000003550 marker Substances 0.000 description 1
- 230000013011 mating Effects 0.000 description 1
- 238000002844 melting Methods 0.000 description 1
- 230000008018 melting Effects 0.000 description 1
- 239000002207 metabolite Substances 0.000 description 1
- 108010063431 methionyl-aspartyl-glycine Proteins 0.000 description 1
- 108700023046 methionyl-leucyl-phenylalanine Proteins 0.000 description 1
- 230000000116 mitigating effect Effects 0.000 description 1
- 230000002438 mitochondrial effect Effects 0.000 description 1
- 230000036438 mutation frequency Effects 0.000 description 1
- 230000007498 myristoylation Effects 0.000 description 1
- 230000006780 non-homologous end joining Effects 0.000 description 1
- 238000010606 normalization Methods 0.000 description 1
- 235000016709 nutrition Nutrition 0.000 description 1
- 230000008520 organization Effects 0.000 description 1
- 238000009401 outcrossing Methods 0.000 description 1
- 108010064486 phenylalanyl-leucyl-valine Proteins 0.000 description 1
- 108010065135 phenylalanyl-phenylalanyl-phenylalanine Proteins 0.000 description 1
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 1
- 239000010452 phosphate Substances 0.000 description 1
- 150000004713 phosphodiesters Chemical class 0.000 description 1
- 102000020233 phosphotransferase Human genes 0.000 description 1
- 108010025488 pinealon Proteins 0.000 description 1
- 210000002706 plastid Anatomy 0.000 description 1
- 229920002647 polyamide Polymers 0.000 description 1
- 229920000642 polymer Polymers 0.000 description 1
- 238000006116 polymerization reaction Methods 0.000 description 1
- 238000002360 preparation method Methods 0.000 description 1
- 238000003825 pressing Methods 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 108010014614 prolyl-glycyl-proline Proteins 0.000 description 1
- 108020001580 protein domains Proteins 0.000 description 1
- 238000001243 protein synthesis Methods 0.000 description 1
- 238000000746 purification Methods 0.000 description 1
- 239000013635 pyrimidine dimer Substances 0.000 description 1
- 238000004445 quantitative analysis Methods 0.000 description 1
- 238000012207 quantitative assay Methods 0.000 description 1
- 230000000171 quenching effect Effects 0.000 description 1
- 239000011541 reaction mixture Substances 0.000 description 1
- 230000000306 recurrent effect Effects 0.000 description 1
- 230000022532 regulation of transcription, DNA-dependent Effects 0.000 description 1
- 230000008439 repair process Effects 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 230000001850 reproductive effect Effects 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 108091008146 restriction endonucleases Proteins 0.000 description 1
- 235000009566 rice Nutrition 0.000 description 1
- 239000012882 rooting medium Substances 0.000 description 1
- 102200012576 rs111033648 Human genes 0.000 description 1
- 102220089709 rs869320709 Human genes 0.000 description 1
- 238000009394 selective breeding Methods 0.000 description 1
- 230000035945 sensitivity Effects 0.000 description 1
- 238000002864 sequence alignment Methods 0.000 description 1
- 230000001568 sexual effect Effects 0.000 description 1
- HBMJWWWQQXIZIP-UHFFFAOYSA-N silicon carbide Chemical compound [Si+]#[C-] HBMJWWWQQXIZIP-UHFFFAOYSA-N 0.000 description 1
- 229910010271 silicon carbide Inorganic materials 0.000 description 1
- 230000003007 single stranded DNA break Effects 0.000 description 1
- 239000011780 sodium chloride Substances 0.000 description 1
- SUKJFIGYRHOWBL-UHFFFAOYSA-N sodium hypochlorite Chemical compound [Na+].Cl[O-] SUKJFIGYRHOWBL-UHFFFAOYSA-N 0.000 description 1
- 108010068698 spleen exonuclease Proteins 0.000 description 1
- 238000010561 standard procedure Methods 0.000 description 1
- 239000007858 starting material Substances 0.000 description 1
- 238000007619 statistical method Methods 0.000 description 1
- 239000008223 sterile water Substances 0.000 description 1
- 239000011593 sulfur Substances 0.000 description 1
- 229910052717 sulfur Inorganic materials 0.000 description 1
- 230000010741 sumoylation Effects 0.000 description 1
- 239000006228 supernatant Substances 0.000 description 1
- 239000000725 suspension Substances 0.000 description 1
- KYMBYSLLVAOCFI-UHFFFAOYSA-N thiamine Chemical compound CC1=C(CCO)SCN1CC1=CN=C(C)N=C1N KYMBYSLLVAOCFI-UHFFFAOYSA-N 0.000 description 1
- 125000001990 thiamine group Chemical group 0.000 description 1
- 229940113082 thymine Drugs 0.000 description 1
- 230000014621 translational initiation Effects 0.000 description 1
- 108700004896 tripeptide FEG Proteins 0.000 description 1
- 239000001226 triphosphate Substances 0.000 description 1
- 235000011178 triphosphate Nutrition 0.000 description 1
- 125000002264 triphosphate group Chemical class [H]OP(=O)(O[H])OP(=O)(O[H])OP(=O)(O[H])O* 0.000 description 1
- 108010015666 tryptophyl-leucyl-glutamic acid Proteins 0.000 description 1
- 108010035534 tyrosyl-leucyl-alanine Proteins 0.000 description 1
- 241001515965 unidentified phage Species 0.000 description 1
- DRTQHJPVMGBUCF-UHFFFAOYSA-N uracil arabinoside Natural products OC1C(O)C(CO)OC1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-UHFFFAOYSA-N 0.000 description 1
- 229940045145 uridine Drugs 0.000 description 1
- 108010009962 valyltyrosine Proteins 0.000 description 1
- 108700026220 vif Genes Proteins 0.000 description 1
- 230000003612 virological effect Effects 0.000 description 1
- 238000011179 visual inspection Methods 0.000 description 1
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Chemical compound O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 1
- 238000001262 western blot Methods 0.000 description 1
- 210000005253 yeast cell Anatomy 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/62—DNA sequences coding for fusion proteins
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/01—Preparation of mutants without inserting foreign genetic material therein; Screening processes therefor
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/102—Mutagenizing nucleic acids
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8201—Methods for introducing genetic material into plant cells, e.g. DNA, RNA, stable or transient incorporation, tissue culture methods adapted for transformation
- C12N15/8213—Targeted insertion of genes into the plant genome by homologous recombination
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8241—Phenotypically and genetically modified plants via recombinant DNA technology
- C12N15/8242—Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits
- C12N15/8243—Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits involving biosynthetic or metabolic pathways, i.e. metabolic engineering, e.g. nicotine, caffeine
- C12N15/8245—Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits involving biosynthetic or metabolic pathways, i.e. metabolic engineering, e.g. nicotine, caffeine involving modified carbohydrate or sugar alcohol metabolism, e.g. starch biosynthesis
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/78—Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y305/00—Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5)
- C12Y305/04—Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5) in cyclic amidines (3.5.4)
- C12Y305/04005—Cytidine deaminase (3.5.4.5)
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/01—Fusion polypeptide containing a localisation/targetting motif
- C07K2319/09—Fusion polypeptide containing a localisation/targetting motif containing a nuclear localisation signal
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/80—Fusion polypeptide containing a DNA binding domain, e.g. Lacl or Tet-repressor
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/20—Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Genetics & Genomics (AREA)
- Engineering & Computer Science (AREA)
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biomedical Technology (AREA)
- Biotechnology (AREA)
- General Engineering & Computer Science (AREA)
- Molecular Biology (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- Microbiology (AREA)
- Plant Pathology (AREA)
- Biophysics (AREA)
- Physics & Mathematics (AREA)
- Cell Biology (AREA)
- Medicinal Chemistry (AREA)
- Nutrition Science (AREA)
- Crystallography & Structural Chemistry (AREA)
- Peptides Or Proteins (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Breeding Of Plants And Reproduction By Means Of Culturing (AREA)
- Enzymes And Modification Thereof (AREA)
Abstract
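The present invention relates to methods and compositions for modifying a target site in the genome of a cell. Fusion proteins are provided that comprise one or more DNA-binding domains and one or more heterologous domains, such as a DNA-modifying domain, joined by an improved linker sequence. Codon-optimized polynucleotides are also provided that encode fusion proteins comprising one or more DNA-binding domains and one or more heterologous domains joined by an improved linker sequence.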
Description
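Technical Field
The present invention relates to methods and compositions for targeted nucleotide base editing in the genome of a cell.
Statement Regarding Electronic Submission of a Sequence Listing
A sequence listing in ASCII text format, submitted under 37 C.F.R. § 1.821, entitled "81945_ST25", created on September 18, 2020, and approximately 702 kilobytes in size, is attached and filed herewith and is incorporated herein by reference.
Background
The ability to edit plant genomes in order to create favorable alleles is highly desirable in agriculture, with the potential to increase yield or prevent disease. Genome editing is a new field in which progress in plants has lagged. In addition, changes to the genome other than the intended change are a problem that limits achieving the desired change. CRISPR-Cas9 acts by making a double-strand cut in DNA. Because such a break is repaired by non-homologous end joining or homology-dependent repair, insertions or deletions of DNA bases can occur. A strategy called base editing can alter DNA without requiring a cut and without producing insertions and deletions. In one version, an enzyme called cytidine deaminase is targeted to a specific base by a Cas9 (Shimatani et al., 2017. Nat. Biotechnol. 35, 441-443) or Cas12a (Li et al., 2018. Nat. Biotechnol. 36, 324-327) enzyme that has been modified so that it cannot cut DNA. The cytidine deaminase and the nuclease-deficient Cas9 or Cas12a are fused together through an amino acid linker. Improvements to the linker can improve the functionality of the fusion protein, for example by improving precision through a reduction in changes at bases other than the target base.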
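Summary of the Invention
To meet this need for improvement, we provide optimized and improved Cas12a enzymes and constructs. In particular, we provide fusion proteins comprising a heterologous domain, a first linker sequence, and a type V CRISPR-Cas enzyme. The first linker sequence comprises a repeated GGGGS sequence. The heterologous domain can be a deaminase, a polymerase, a nuclease, a relaxase, an alkyltransferase, a methyltransferase, an adenosine deaminase, a cytidine deaminase, an oxidase, a thymine alkyltransferase, an adenine oxidase, an adenosine methyltransferase, a glycosylase, or a nuclear localization signal. For base editing, the heterologous domain is a deaminase domain, such as a cytidine deaminase or an adenine deaminase. The cytidine deaminase domain can be an activation-induced cytidine deaminase ("AID") or an apolipoprotein B mRNA-editing complex ("APOBEC") domain, such as one from the APOBEC1 family of deaminases. In some contexts, the APOBEC domain comprises a sequence having at least 70% identity to SEQ ID NO:1. When an adenine deaminase is desired, the adenine deaminase can be a TadA domain comprising an amino acid sequence having at least 70% identity to SEQ ID NO:92.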
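The "at least 70% identity" criterion above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length by some external alignment tool, with '-' marking gaps, and it counts identity over aligned columns. The function names and the way gaps are counted are illustrative assumptions, not the definition of identity used in the patent.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences.

    Assumption: '-' marks an alignment gap; a column containing a gap in
    only one sequence counts as a mismatch, and columns that are gaps in
    both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 70.0) -> bool:
    """True when the aligned sequences share at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

In practice the alignment step matters as much as the count; a candidate domain would first be aligned against the reference sequence (for example SEQ ID NO:1 or SEQ ID NO:92) before a check of this kind is applied.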
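When the Type V CRISPR-Cas enzyme is a Type V-A ("Cas12a") enzyme, the Cas12a is selected from the group consisting of SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:22, SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47, and SEQ ID NO:48. The Cas12a domain can be catalytically inactive while still binding the target DNA and allowing the heterologous domain to act. When the Cas12a is inactive, its sequence is SEQ ID NO:3, SEQ ID NO:6, or SEQ ID NO:22.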
异源结构域和Cas12a酶之间的第一接头序列可以包含重复至少三次的GGGGS。在其他用途中,第一接头序列可以包含重复至少六次的GGGGS。
融合蛋白可以包含SEQ ID NO:11、12、13或44,并且其还可以包括尿嘧啶DNA糖基化酶抑制剂(“UGI”)结构域(如SEQ ID NO:8表示)。UGI结构域可以通过包含序列SGGS的第二接头与Cas12a酶连接。融合蛋白可以包含SEQ ID NO:17、SEQ ID NO:24、SEQ ID NO:35、SEQ ID NO:39、SEQ ID NO:43、SEQ ID NO:50、SEQ ID NO:52、SEQ ID NO:54、SEQ ID NO:56、SEQ ID NO:81、SEQ ID NO:83、SEQ ID NO:85、SEQ ID NO:87或SEQ ID NO:89。当与DNA接触时,与缺少重复的GGGGS序列的第一接头序列的现有技术融合蛋白相比,这些融合蛋白以增加的频率产生靶上编辑,并且以降低的频率产生脱靶编辑。
我们还提供了编辑植物基因组DNA的方法,该方法通过将植物基因组DNA与以下接触:(a)如通过以上方面之一所述的融合蛋白,并且该融合蛋白任选地包含UGI结构域;以及(b)将步骤(a)的融合蛋白靶向至植物基因组DNA的靶DNA序列的指导RNA(“gRNA”);其中与通过具有除重复的GGGGS序列外的第一接头的融合蛋白编辑的植物基因组DNA相比,经编辑的植物基因组DNA包含减少的脱靶编辑。
我们还提供了编辑具有减少的脱靶编辑的植物基因组DNA的方法,该方法通过将植物基因组DNA与以下接触:(a)如通过以上方面之一所述的融合蛋白,并且该融合蛋白任选地包含UGI结构域;以及(b)将步骤(a)的融合蛋白靶向至植物基因组DNA的靶DNA序列的指导RNA(“gRNA”);其中与通过具有除重复的GGGGS序列外的第一接头的融合蛋白编辑的植物基因组DNA相比,经编辑的植物基因组DNA包含减少的脱靶编辑。在一方面,融合蛋白包含SEQ ID NO:24。
我们还提供了通过以下来获得具有减少的脱靶编辑的经编辑的植物群体的方法:(a)获得包含待编辑的基因组DNA的植物细胞的群体;(b)获得编码如通过以上方面之一所述的融合蛋白、和任选地UGI结构域的核苷酸序列;(c)用步骤(b)的核苷酸序列转化植物细胞的群体,从而表达通过植物细胞的群体内的核酸序列编码的融合蛋白;(d)使转化的植物细胞的群体生长成植物,其中至少一种植物被编辑;以及(e)从步骤(d)的产物中选择至少一种经编辑的植物,从而获得经编辑的植物群体;其中与通过具有除重复的GGGGS序列外的第一接头的融合蛋白编辑的植物相比,经编辑的植物群体包含减少的脱靶编辑。在一方面,编码融合蛋白的核苷酸序列包含SEQ ID NO:17、SEQ ID NO:24、SEQ ID NO:35、SEQ ID NO:39、SEQ ID NO:43、SEQ ID NO:50、SEQ ID NO:52、SEQ ID NO:54、SEQ ID NO:56、SEQ IDNO:81、SEQ ID NO:83、SEQ ID NO:85、SEQ ID NO:87或SEQ ID NO:89。
附图说明
图1显示出三个版本的Cas12aBE的DNA构建体的示意性图示。(1)表示启动子;(2)为核定位信号;(3)为脱氨酶,例如,APOBEC脱氨酶;(4)为XTEN接头;(5)为LbCas12a;(6)为SGGS接头;(7)为尿嘧啶糖基化酶抑制剂;(8)为长接头,例如(G4S)6接头;(9)为Mb2Cas12a;(10)为编码指导RNA的元件。图1A显示出以5'至3'方向的LbCas12aBE加上指导RNA构建体,其中脱氨酶(3)通过XTEN接头(4)可操作地与LbCas12a(5)连接。图1B显示出以5'至3'方向的LbCas12aBE加上指导RNA构建体,其中脱氨酶(3)通过(G4S)6接头(8)可操作地与LbCas12a(5)连接。图1C显示以5'至3'方向的Mb2Cas12aBE加上指导RNA构建体,其中脱氨酶(3)通过(G4S)6接头(8)可操作地与Mb2Cas12a(9)连接。
图2显示出以5'至3'方向的DNA构建体的示意性图示,该DNA构建体包含Cas12aBE和多重指导RNA。(1)表示启动子;(2)为核定位信号;(3)为脱氨酶,例如,APOBEC脱氨酶;(6)为SGGS接头;(7)为尿嘧啶糖基化酶抑制剂;(8)为长接头,例如(G4S)6接头;(9)为Cas12a;(10)为第一编码指导RNA的元件;(11)为第二编码指导RNA的元件;以及,(12)为第三编码指导RNA的元件。每个编码指导RNA的元件包含crRNA区段和能够与基因组靶DNA序列杂交的靶序列区段。
对序列表中的序列的简述
SEQ ID NO:1为Apobec1的氨基酸序列。
SEQ ID NO:2为Apobec1的核苷酸序列。
SEQ ID NO:3为无催化活性的Mb2Cas12a的氨基酸序列。
SEQ ID NO:4为无催化活性的Mb2Cas12a的核苷酸序列。
SEQ ID NO:5为无催化活性的cLbCas12aBE的核苷酸序列。
SEQ ID NO:6为无催化活性的cLbCas12aBE的氨基酸序列。
SEQ ID NO:7为尿嘧啶DNA糖基化酶抑制剂(UGI)的核苷酸序列。
SEQ ID NO:8为尿嘧啶DNA糖基化酶抑制剂(UGI)的氨基酸序列。
SEQ ID NO:9为核苷酸序列,该核苷酸序列包含表达盒prSoUbi4:SV40NLS:cLbCas12aBE:GS6接头:SV40NLS:SGGS接头:UGI:SGGS接头:SV40NLS:tNOS。
SEQ ID NO:10为优化的(G4S)x6接头的核苷酸序列。
SEQ ID NO:11为优化的(G4S)x6接头的氨基酸序列。
SEQ ID NO:12 is the amino acid sequence of the 18-amino-acid linker SX.
SEQ ID NO:13 is the amino acid sequence of the 15-amino-acid linker (G4S)x3.
SEQ ID NO:14为核苷酸序列,该核苷酸序列包含来自构建体25057的融合蛋白cLBCas12aBE-07。
SEQ ID NO:15为氨基酸序列,该氨基酸序列包含来自构建体25057的融合蛋白cLBCas12aBE-07。
SEQ ID NO:16为核苷酸序列,该核苷酸序列包含来自构建体25058的融合蛋白cLBCas12aBE-08。
SEQ ID NO:17为氨基酸序列,该氨基酸序列包含来自构建体25058的融合蛋白cLBCas12aBE-08。
SEQ ID NO:18为核苷酸序列,该核苷酸序列包含来自构建体24524的融合蛋白cLBCas12aBE-01。
SEQ ID NO:19为氨基酸序列,该氨基酸序列包含来自构建体24524的融合蛋白cLBCas12aBE-01。
SEQ ID NO:20为cCas9BE-02的核苷酸序列。
SEQ ID NO:21为cCas9BE-02的氨基酸序列。
SEQ ID NO:22为无催化活性的AsCas12a的氨基酸序列。
SEQ ID NO:23为核苷酸序列,该核苷酸序列包含来自构建体24904的融合蛋白cLBCas12aBE-06。
SEQ ID NO:24为氨基酸序列,该氨基酸序列包含来自构建体24904的融合蛋白cLBCas12aBE-06。
SEQ ID NO:25为包含启动子prSoUbi4-02的核苷酸序列。
SEQ ID NO:26为包含Cas12a gRNA waxy1靶序列的核苷酸序列。
SEQ ID NO:27为包含Cas9 gRNA waxy1靶序列的核苷酸序列。
SEQ ID NO:28为包含ZmWaxy1基因外显子4的核苷酸序列。
SEQ ID NO:29为ZmWaxy1的正向引物。
SEQ ID NO:30为ZmWaxy1的反向引物。
SEQ ID NO:31为ZmWaxy1的测序引物。
SEQ ID NO:32为核苷酸序列,该核苷酸序列包含来自构建体24523的融合蛋白cLbCpf1-02。
SEQ ID NO:33为氨基酸序列,该氨基酸序列包含来自构建体24523的融合蛋白cLbCpf1-02。
SEQ ID NO:34为核苷酸序列,该核苷酸序列包含来自构建体25181的融合蛋白cLbCas12a-05。
SEQ ID NO:35为氨基酸序列,该氨基酸序列包含来自构建体25181的融合蛋白cLbCas12a-05。
SEQ ID NO:36为核苷酸序列,该核苷酸序列包含来自构建体25205的融合蛋白cLbCas12a-02。
SEQ ID NO:37为氨基酸序列,该氨基酸序列包含来自构建体25205的融合蛋白cLbCas12a-02。
SEQ ID NO:38为核苷酸序列,该核苷酸序列包含来自构建体25513的融合蛋白cLbCas12a-25。
SEQ ID NO:39为氨基酸序列,该氨基酸序列包含来自构建体25513的融合蛋白cLbCas12a-25。
SEQ ID NO:40为核苷酸序列,该核苷酸序列包含来自构建体25220的融合蛋白cMb2Cas12a-01。
SEQ ID NO:41为氨基酸序列,该氨基酸序列包含来自构建体25220的融合蛋白cMb2Cas12a-01。
SEQ ID NO:42为核苷酸序列,该核苷酸序列包含来自构建体25382的融合蛋白cMb2Cas12a-02。
SEQ ID NO:43为氨基酸序列,该氨基酸序列包含来自构建体25382的融合蛋白cMb2Cas12a-02。
SEQ ID NO:44为优化的(G4SG)x6接头的氨基酸序列。
SEQ ID NO:45为活性LbCas12a的氨基酸序列。
SEQ ID NO:46为活性Mb2Cas12a的氨基酸序列。
SEQ ID NO:47为活性AsCas12a的氨基酸序列。
SEQ ID NO:48为活性FnCas12a的氨基酸序列。
SEQ ID NO:49为核苷酸序列,该核苷酸序列包含来自构建体25457的融合蛋白cMb2Cas12a-BE-01。
SEQ ID NO:50为氨基酸序列,该氨基酸序列包含来自构建体25457的融合蛋白cMb2Cas12a-BE-01。
SEQ ID NO:51为核苷酸序列,该核苷酸序列包含来自构建体25268的融合蛋白cLbCas12a-BE-08。
SEQ ID NO:52为氨基酸序列,该氨基酸序列包含来自构建体25268的融合蛋白cLbCas12a-BE-08。
SEQ ID NO:53为核苷酸序列,该核苷酸序列包含来自构建体25173的融合蛋白cLbCas12a-05。
SEQ ID NO:54为氨基酸序列,该氨基酸序列包含来自构建体25173的融合蛋白cLbCas12a-05。
SEQ ID NO:55为核苷酸序列,该核苷酸序列包含来自构建体25175的融合蛋白cLbCas12a-05。
SEQ ID NO:56为氨基酸序列,该氨基酸序列包含来自构建体25175的融合蛋白cLbCas12a-05。
SEQ ID NO:57为具有优化的(G4SG)6接头的无催化活性的LbCas12a的氨基酸序列。
SEQ ID NO:58为具有优化的(G4S)6接头的活性Mb2Cas12a的氨基酸序列。
SEQ ID NO:59为具有XTEN接头的无催化活性的Mb2Cas12a的氨基酸序列。
SEQ ID NO:60为具有XTEN接头的活性AsCas12a的氨基酸序列。
SEQ ID NO:61为具有XTEN接头的无催化活性的AsCas12a的氨基酸序列。
SEQ ID NO:62为具有XTEN接头的活性FnCas12a的氨基酸序列。
SEQ ID NO:63为具有优化的(G4S)6接头的活性AsCas12a的氨基酸序列。
SEQ ID NO:64为具有优化的(G4S)6接头的无催化活性的AsCas12a的氨基酸序列。
SEQ ID NO:65为具有优化的(G4S)6接头的活性FnCas12a的氨基酸序列。
SEQ ID NO:66为具有优化的(G4SG)6接头的无催化活性的Mb2Cas12a的氨基酸序列。
SEQ ID NO:67为具有优化的(G4SG)6接头的活性AsCas12a的氨基酸序列。
SEQ ID NO:68为具有优化的(G4SG)6接头的无催化活性的AsCas12a的氨基酸序列。
SEQ ID NO:69为具有优化的(G4SG)6接头的活性FnCas12a的氨基酸序列。
SEQ ID NO:70为XTEN接头的氨基酸序列。
SEQ ID NO:71为包含Cas12a gRNA SBEII靶序列的核苷酸序列。
SEQ ID NO:72为包含Cas12a gRNA GL2靶序列的核苷酸序列。
SEQ ID NO:73为包含Cas12a gRNA Fad2靶序列的核苷酸序列。
SEQ ID NO:74为核苷酸序列,该核苷酸序列包含与waxy1、SBEII和Fad2靶序列一起使用的Cas12a crRNA序列。
SEQ ID NO:75 is a nucleotide sequence comprising the Cas12a crRNA sequence used with the GL2 target sequence.
SEQ ID NO:76为核苷酸序列,该核苷酸序列包含来自构建体24785的融合蛋白cCas9ABE-01。
SEQ ID NO:77为氨基酸序列,该氨基酸序列包含来自构建体24785的融合蛋白cCas9ABE-01。
SEQ ID NO:78为核苷酸序列,该核苷酸序列包含来自构建体25459的融合蛋白cLbCas1aABE-01。
SEQ ID NO:79为氨基酸序列,该氨基酸序列包含来自构建体25459的融合蛋白cLbCas1aABE-01。
SEQ ID NO:80为核苷酸序列,该核苷酸序列包含来自构建体25504的融合蛋白cLbCas12aABE-02。
SEQ ID NO:81为氨基酸序列,该氨基酸序列包含来自构建体25504的融合蛋白cLbCas12aABE-02。
SEQ ID NO:82为核苷酸序列,该核苷酸序列包含来自构建体25289的融合蛋白cLbCas12aBE-09。
SEQ ID NO:83为氨基酸序列,该氨基酸序列包含来自构建体25289的融合蛋白cLbCas12aBE-09。
SEQ ID NO:84为核苷酸序列,该核苷酸序列包含来自构建体25658的融合蛋白cdLbCas12a-ABE-CBE-01。
SEQ ID NO:85为氨基酸序列,该氨基酸序列包含来自构建体25658的融合蛋白cdLbCas12a-ABE-CBE-01。
SEQ ID NO:86为核苷酸序列,该核苷酸序列包含来自构建体25701的融合蛋白cdLbCas12a-ABE-CBE-02。
SEQ ID NO:87为氨基酸序列,该氨基酸序列包含来自构建体25701的融合蛋白cdLbCas12a-ABE-CBE-02。
SEQ ID NO:88为核苷酸序列,该核苷酸序列包含来自构建体25702的融合蛋白cdLbCas12a-ABE-CBE-03。
SEQ ID NO:89为氨基酸序列,该氨基酸序列包含来自构建体25702的融合蛋白cdLbCas12a-ABE-CBE-03。
SEQ ID NO:90为包含Cas12a gRNA ADH1靶序列的核苷酸序列。
SEQ ID NO:91为包含TadA二聚体的核苷酸序列。
SEQ ID NO:92为包含TadA二聚体的氨基酸序列。
具体实施方式
本说明不旨在是可以实施本发明的所有不同方式,或可以添加到本发明中的所有特征的详细目录。例如,关于一个实施例所说明的特征可以并入其他实施例中,并且关于一个特定实施例所说明的特征可以从那个实施例删除。此外,鉴于本披露,本文建议的不同实施例的多种变化以及添加物对于本领域技术人员是显而易见的,这不脱离本发明。因此,以下说明旨在阐述本发明的一些特定实施例,并且并没有穷尽地叙述其所有排列、组合和变化。
定义
除非另外定义,本文所使用的所有技术和科学术语均具有与本发明所属领域的普通技术人员通常所理解的相同的含义。在本文的发明的说明中使用的术语是仅出于描述特定实施例的目的,且并不旨在限制本发明。本文提及的所有出版物、专利申请、专利以及其他参考文献通过引用以其全文并入本文。
提供以下的定义和方法以更好地定义本发明并且在本发明的实践中指导本领域的普通技术人员。除非另外说明,本文使用的术语应该根据相关领域的那些一般技术人员的常规用法来理解。分子生物学中的一般术语的定义也可在Rieger等人,Glossary of Genetics: Classical and Molecular[遗传学词汇表:标准和分子],第5版,Springer-Verlag,New York[施普林格出版社:纽约],1994中找到。
如本文使用的,术语“长接头”是指用于将异源结构域与目的蛋白连接的至少10个氨基酸的多肽链。通过举例而非限制,长接头可以包含序列GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS(SEQ ID NO:11),另外表示为(G4S)6或(G4S)x6或(G4S)*6。长接头可以包含GGGGSGGGGGSGGGGGSGGGGGSGGGGGSGGGGGSG(SEQ ID NO:44),另外表示为(G4SG)6或(G4SG)x6或(G4SG)*6。通过长接头与蛋白质连接的异源结构域包括胞苷脱氨酶、鸟嘌呤脱氨酶、尿嘧啶糖基化酶抑制剂(“UGI”)、核酸酶和可以以异源方式与目的蛋白可操作连接的任何其他蛋白质结构域。此类目的蛋白包括但不限于定点核酸酶(例如,Cas9、Cas12a、Cas12b、Cas12i、Cas12j或其他CRISPR核酸酶)、锌指、归巢核酸内切酶、转录激活子样效应子核酸酶(“TALEN”)等。
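The two long linkers named in this definition are simple repeats, so they can be rebuilt and checked mechanically. The following is a minimal sketch, assuming only the repeat units and the 10-amino-acid minimum stated above; the helper names `repeat_linker` and `is_long_linker` are hypothetical and are not part of the patent.

```python
# Rebuild the (G4S)x6 and (G4SG)x6 linkers from their repeat units and
# compare them with the definition of a "long linker" (at least 10 residues).

def repeat_linker(unit: str, n: int) -> str:
    """Return a linker made of `unit` repeated `n` times."""
    return unit * n

def is_long_linker(seq: str, min_len: int = 10) -> bool:
    """True if the linker meets the 10-amino-acid minimum from the definition."""
    return len(seq) >= min_len

G4S_X6 = repeat_linker("GGGGS", 6)    # corresponds to SEQ ID NO:11 in the text
G4SG_X6 = repeat_linker("GGGGSG", 6)  # corresponds to SEQ ID NO:44 in the text

print(len(G4S_X6), len(G4SG_X6))
print(is_long_linker(G4S_X6), is_long_linker("SGGS"))
```

By this definition the four-residue SGGS linker used elsewhere in the document does not count as a long linker, while both repeat linkers do.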
如在本发明的实施例的说明和所附权利要求中使用的,单数形式“一个/一种(a/an)”和“该/所述(the)”旨在也包括复数形式,除非上下文清楚地另外指明。
如本文使用的,“和/或”是指并且涵盖相关列出项目中的一个或多个的任何和所有可能的组合。
如本文使用的术语“约”当指代可测量的值如化合物的量、剂量、时间、温度等时意指涵盖指定量的20%、10%、5%、1%、0.5%、或甚至0.1%的变化。
术语“包含(comprise、comprises和/或comprising)”当在本说明书中使用时,指明所列举特征、整体、步骤、操作、元件、和/或组分的存在,但是不排除一种或多种其他特征、整体、步骤、操作、元件、组分、和/或其组的存在或添加。
如本文使用的,过渡短语“基本上由……组成”意指权利要求的范围将被解释为涵盖该权利要求中所提到的指定材料或步骤以及不实质上影响要求保护的发明的一个或多个基本特征和新特征的那些材料或步骤。因此,当用于本发明的权利要求中时,术语“基本上由……组成”并不旨在被解释为等同于“包含(comprising)”。
如本文使用的,术语“扩增的”意指使用至少一种核酸分子作为模板,构建核酸分子的多个拷贝或与该核酸分子互补的多个拷贝。参见,例如,Diagnostic MolecularMicrobiology:Principles and Applications[诊断分子微生物学:原理与应用],D.H.PERSING等人编辑,American Society for Microbiology[美国微生物学会],华盛顿(Washington,D.C.),(1993)。扩增产物被称为扩增子。
“编码序列”是转录成RNA(如mRNA、rRNA、tRNA、snRNA、正义RNA或反义RNA)的核酸序列。在一些实施例中,RNA随后在生物体内被翻译以产生蛋白质。
如本文使用的术语转基因的“事件”是指通过用异源DNA(例如,包括一个或多个目的基因(例如,转基因)的表达盒)转化和再生单个植物细胞而产生的重组植物。术语“事件”是指包括异源DNA的原始转化体和/或该转化体的子代。术语“事件”也是指通过转化体和另一种品系之间进行有性远交(outcross)而产生的子代。即使在重复回交至轮回亲本后,来自转化的亲本的插入DNA和侧翼DNA存在于在杂交子代的同样的染色体位置。通常,植物组织的转化产生多个事件,每个上述事件代表DNA构建体插入至植物细胞的基因组中的不同位置中。基于转基因或其他期望的特征的表达,选择特定的事件。因此,如本文使用的“事件MIR604”、“MIR604”或“MIR604事件”意指原始的MIR604转化体和/或MIR604转化体的子代(美国专利号7,361,813;7,897,748;8,354,519和8,884,102,通过引用结合在此)。
如本文使用的“表达盒”意指能够在适当的宿主细胞中指导特定的核苷酸序列表达的核酸分子,该核酸分子包含与目的核苷酸序列(典型地是编码区)可操作地连接的启动子,该核苷酸序列与终止信号可操作地连接。它还典型地包含适当翻译核苷酸序列所需要的序列。编码区通常对目的蛋白进行编码,但是还可以在正义或反义方向上对目的功能性RNA(例如反义RNA或非翻译RNA)进行编码。表达盒还可以包含在指导目的核苷酸序列表达中不需要的序列,但是其因为用于将表达盒从表达载体去除的方便的限制性位点而存在。包含目的核苷酸序列的表达盒可以是嵌合的,意味着至少一个它的组分相对于至少一个它的其他组分是异源的。表达盒还可以是天然存在的但已经是以对于异源表达有用的重组形式而获得的表达盒。然而,通常表达盒相对于宿主来说是异源的,即表达盒的特定核酸序列在宿主细胞中不是天然存在的,并且必须已经通过本领域已知的转化方法引入至宿主细胞或宿主细胞的祖先中。在表达盒中核苷酸序列的表达可以是在组成型启动子或诱导型启动子的控制之下,启动子只有当宿主细胞暴露于一些特定的外界刺激时才启动转录。在多细胞生物体(如植物)的情况下,启动子对于特定组织、或器官、或者发育阶段也可以是特异的。当被转化进植物中时,表达盒或其片段也可被称为“插入的序列”或者“插入序列”。
“基因”是位于基因组内的限定区域,并且除了前述的编码核酸序列之外,它还包含其他负责控制编码部分的表达(也就是转录和翻译)的主要调节性核酸序列。基因可以包括编码区和非编码区(例如,内含子、调节元件、启动子、增强子、终止序列和5'和3'非翻译区)二者。基因典型地表达mRNA、功能性RNA、或特异性蛋白,包括调节序列。基因可能或可能不能用于产生功能性蛋白质。在一些实施例中,基因仅指编码区。术语“天然基因”是指如在自然界中发现的基因。术语“嵌合基因”是指包含以下各项的任何基因:1)DNA序列,包括在自然界中未一起发现的调节序列和编码序列,或2)编码不天然邻接的蛋白的部分的序列,或3)不天然邻接的启动子的部分。因此,嵌合基因可以包含从不同来源得到的调节序列和编码序列,或包含从相同来源得到的、但以与在自然界中所发现的不同的方式进行安排的调节序列和编码序列。基因可以是“分离的”,分离的基因意为核酸分子,基本上(substantially或essentially)不含通常发现与其天然状态时的核酸分子相关的组分。此类组分包括其他细胞材料、来自重组产物的培养基、和/或在化学合成核酸分子中所使用的多种化学品。
关于多核苷酸编码序列的术语“表达(express或expression)”,意指该序列被转录,并且任选被翻译。
“目的基因”或“目的核苷酸序列”是指当转移至植物时,在植物上赋予所期望的特征(如抗生素抗性、病毒抗性、昆虫抗性、疾病抗性、或对其他有害生物的抗性、除草剂耐受性、改进的营养价值、改进的工业过程的性能或者改变的繁育能力)的任何基因。“目的基因”还可以是被转移至植物中用于在植物中产生商业上有价值的酶或代谢物的基因。
如本文使用的,“异源的”是指与其引入的宿主细胞天然不相关的核酸分子或核苷酸序列,该序列来源于另一种物种或来自相同物种或生物体,但是从其原始形式或主要在细胞中表达的形式进行了修饰,包括天然存在的核酸序列的非天然存在的多个拷贝。因此,源自与将其引入的细胞所属的生物体或物种不同的生物体或物种的核苷酸序列相对于那个细胞或细胞的子代而言是异源的。另外,异源核苷酸序列包括核苷酸序列,该核苷酸序列源自并插入相同的天然原始细胞类型,但是以非天然状态存在,例如,以不同拷贝数目存在,和/或处于与在核酸分子的天然状态中发现的那些不同的调节序列的控制下。核酸序列还可以异源于与其相关的其他核酸序列,例如在核酸构建体中,如,例如表达载体。作为一个非限制性实例,启动子可以与一种或多种调节元件和/或编码序列组合存在于核酸构建体中,这些调节元件和/或编码序列不与那个特定启动子相关地天然存在,即它们与启动子是异源的。
“同源”核酸序列是与其被引入的宿主细胞天然相关联的核酸序列。同源核酸序列还可以与其他核酸序列天然相关的核酸序列,这些其他核酸序列可以例如存在于核酸构建体中。作为一个非限制性实例,启动子可以与一种或多种调节元件和/或编码序列组合存在于核酸构建体中,这些调节元件和/或编码序列与那个特定启动子相关地天然存在,即它们与启动子是同源的。
“可操作地连接”是指在单个核酸序列上核酸序列的关联,这样使得一个的功能影响另一个的功能。例如,当启动子能够影响编码序列或者功能RNA的表达时(即编码序列或功能RNA处于启动子的转录控制之下),则启动子与编码序列或者功能RNA是可操作地连接的。正义方向或者反义方向的编码序列能够与调节序列可操作地连接。因此,可操作地与核苷酸序列相关的调节或控制序列(例如,启动子)能够影响核苷酸序列的表达。例如,与编码GFP的核苷酸序列可操作地连接的启动子将能够影响GFP核苷酸序列的表达。
控制序列不需要与目的核苷酸序列相邻,只要它们起到指导其表达的作用。因此,例如,介入未翻译的、已转录的序列可以在启动子与编码序列之间存在,并且启动子序列仍可以被认为与编码序列“可操作地连接”。
如本文使用的“引物”是分离的核酸,这些核酸通过核酸杂交被退火为互补靶DNA链,以在引物与靶DNA链之间形成杂交,然后通过聚合酶(如DNA聚合酶)沿着靶DNA链延伸。引物对或引物组可以用于核酸分子的扩增,例如通过聚合酶链式反应(PCR)或者其他核酸扩增方法。
“探针”是互补于靶核酸分子的一部分的分离的核酸分子,并且典型地用于检测和/或定量靶核酸分子。因此,在一些实施例中,探针可以是可检测部分或报道基因附接到的分离的核酸分子,如放射性同位素、配体、化学发光剂、荧光剂或酶。根据本发明的探针不仅可以包括脱氧核糖核酸或核糖核酸,还包括与靶核酸序列特异性结合并且可以用于检测靶核酸序列的存在或定量靶核酸序列的量的聚酰胺类以及其他探针材料。
设计TaqMan探针,使得其在由特定引物组扩增的DNA区域内退火。由于Taq聚合酶延伸引物并从互补链的3'至5'的单链模板合成新生链,所以聚合酶的5'至3'外切核酸酶通过探针延伸新生链,并且因此降解已经退火到模板的探针。探针的降解从其中释放荧光团,并打破了与淬灭剂的紧密相接,从而减轻了淬灭效应并允许荧光团的荧光。因此,在定量PCR热循环仪中检测到的荧光与释放的荧光团和PCR中存在的DNA模板的量成正比。
引物和探针的长度一般在5和100个核苷酸或更多核苷酸之间。在一些实施例中,引物和探针的长度可以为至少20个核苷酸或更多,或至少25个核苷酸或更多,或长度至少30个核苷酸或更多。这些引物和探针在本领域已知的最佳杂交条件下与靶序列特异性杂交。根据本发明的引物和探针可以具有与靶序列互补的完整序列,虽然与靶序列不同并保留与靶序列杂交的能力的探针可通过根据本发明的常规方法进行设计。
用于制备和使用探针和引物的方法描述于,例如,Molecular Cloning:A Laboratory Manual[分子克隆:实验室手册],第2版,第1-3卷,Sambrook等人,Cold SpringHarbor Laboratory Press,Cold Spring Harbor[冷泉港实验室出版社],冷泉港,纽约,1989中。PCR引物对可以源自已知序列,例如通过使用旨在用于该目的的计算机程序。
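The polymerase chain reaction (PCR) is a technique for "amplifying" a particular DNA fragment. To perform PCR, at least part of the nucleotide sequence of the DNA molecule to be copied must be known. In general, primers, or short oligonucleotides, are used that are complementary (e.g., substantially or fully complementary) to the known nucleotide sequence at the 3' end of each strand of the DNA to be amplified. The DNA sample is heated to separate its strands and mixed with the primers. The primers hybridize to their complementary sequences in the DNA sample. Synthesis begins (in the 5' to 3' direction) using the original DNA strand as the template. The reaction mixture must contain all four deoxynucleotide triphosphates (dATP, dCTP, dGTP, and dTTP) and a DNA polymerase. Polymerization continues until each newly synthesized strand has proceeded far enough to contain the sequence recognized by the other primer. Once this occurs, there are two DNA molecules identical to the original molecule. These two molecules are heated to separate their strands and the process is repeated. Each cycle doubles the number of DNA molecules. Using automated equipment, each cycle of replication can be completed in less than 5 minutes. After 30 cycles, an amplification that began with a single DNA molecule has yielded more than 1 billion copies (2^30 ≈ 1.07 × 10^9).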
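The doubling arithmetic in this paragraph can be checked directly. This is a minimal sketch of the idealized case only, assuming perfect doubling in every cycle; the function name `copies_after` is hypothetical.

```python
def copies_after(cycles: int, start_copies: int = 1) -> int:
    """Idealized PCR: every cycle doubles the number of template molecules."""
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return start_copies * 2 ** cycles

# Thirty cycles starting from a single molecule gives 2**30 copies,
# i.e. a little over one billion (about 1.07e9).
print(copies_after(30))
```

Real reactions fall short of this figure once reagents become limiting, which is the plateau effect discussed in the qPCR paragraph.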
寡核苷酸引物对的寡核苷酸互补于位于相对DNA链上和待扩增区域侧翼的DNA序列。退火引物与新合成的DNA链杂交。第一个扩增循环将导致两条新的DNA链,其5'端通过寡核苷酸引物的位置固定,但其3'端是可变的(‘不规则的’3'端)。两条新链可以依次充当用于合成所期望的长度的互补链的模板(5'端由引物定义并且3'端是固定的,因为合成不能超过相反引物的末端)。几个循环后,所期望的固定长度产品开始占主导地位。
定量聚合酶链式反应(qPCR)(也称为实时聚合酶链式反应)实时监测来自PCR反应的DNA产物的积累。qPCR是基于聚合酶链式反应(PCR)的分子生物学实验室技术,用于扩增并且同时定量靶DNA分子。可以在PCR中扩增和检测特定序列的甚至一个拷贝。PCR反应以指数方式生成DNA模板的拷贝。这导致起始靶序列的量和在任何特定循环下累积的PCR产物的量之间的定量关系。由于与模板、试剂限制或焦磷酸盐分子的积累一起发现的聚合酶反应的抑制剂,所以PCR反应最终停止以指数速率生成模板(即平台期),使得PCR产物的终点定量不可靠。因此,重复的反应可以生成可变量的PCR产物。只有在PCR反应的指数期期间才有可能回推以便确定模板序列的起始量。PCR产物积累时的测量(即实时定量PCR)允许在反应的指数期进行定量,并且因此消除与常规PCR相关的变异性。在实时PCR测定中,通过荧光信号积累来检测阳性反应。对于DNA样品中的一个或多个特异性序列,定量PCR能够进行检测和定量二者。数量可以是拷贝的绝对数量或是当归一化到DNA输入或额外的归一化基因时的相对量。从实时PCR的第一次记录以来,它已被用于越来越多的并且不同数量的应用,包括mRNA表达研究、基因组或病毒DNA中的DNA拷贝数测量、等位基因辨别测定、基因的特异性剪接变体的表达分析和石蜡包埋组织中的基因表达以及激光捕获的显微切割细胞。
如本文使用的,短语“Ct值”是指“循环阈值”,其被定义为“扩增靶标的量达到固定阈值的分数循环数”。在一些实施例中,其表示扩增曲线和阈值线之间的交点。扩增曲线典型地处于“S”形,这表示在给定循环(X轴)处的每个反应(Y轴)的相对荧光的变化,该变化在一些实施例中通过实时PCR仪器在PCR期间记录。在一些实施例中,阈值线是反应达到高于背景的荧光强度处的检测水平。参见Livak和Schmittgen(2001)25Methods[方法]402-408。它是PCR中靶标浓度的相对量度。通常,在一些实施例中,对于给定的参考基因,定量测定如qPCR的良好Ct值在10-40的范围内。Ct水平与样品中的靶核酸量成反比(即Ct水平越低,样品中的可检测的靶核酸量越高)。此外,定量测定如qPCR的良好Ct值显示出在成比例稀释靶gDNA的情况下的线性响应范围。
在一些实施例中,在其中可以实时收集Ct值进行定量分析的条件下进行qPCR。例如,在典型的qPCR实验中,在延伸期期间的PCR的每个循环处监测DNA扩增。当DNA处于扩增的对数线性期时,荧光的量通常增加到背景以上。在一些实施例中,在该时间点收集Ct值。
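The Ct value is defined above as the fractional cycle number at which the amplification curve crosses a fixed threshold. The following is a minimal sketch of one way to compute it, assuming one fluorescence reading per cycle and linear interpolation between the two readings that bracket the threshold. Both are assumptions made for illustration; instrument software may use a different method, and the name `ct_value` is hypothetical.

```python
from typing import Optional, Sequence

def ct_value(fluorescence: Sequence[float], threshold: float) -> Optional[float]:
    """Fractional cycle at which the curve first rises through `threshold`.

    fluorescence[i] is the reading after cycle i + 1. Returns None if the
    curve never crosses the threshold from below.
    """
    for i in range(1, len(fluorescence)):
        low, high = fluorescence[i - 1], fluorescence[i]
        if low < threshold <= high:
            # `low` was read at cycle i and `high` at cycle i + 1.
            return i + (threshold - low) / (high - low)
    return None

# More starting template shifts the curve to the left and lowers the Ct.
print(ct_value([1, 2, 4, 8, 16], threshold=6))
print(ct_value([2, 4, 8, 16, 32], threshold=6))
```

The second curve starts with twice the signal and crosses the threshold one cycle earlier, which matches the inverse relationship between Ct and target amount described above.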
如本文使用的,术语“细胞”是指任何活细胞。细胞可以是原核细胞或真核细胞。细胞可以是分离的。细胞可能能够或可能不能够再生成生物体。细胞可以是在组织、愈伤组织、培养物、器官、或部分的上下文中。在一些实施例中,细胞可以是植物细胞。本发明的植物细胞可以处于分离的单细胞形式,或者可以是培养的细胞,或者可以是作为较高级的组织单位(如,例如,植物组织或植物器官)的一部分。植物细胞可以源自被子植物或裸子植物或是它们的一部分。在另外的实施例中,植物细胞可以是单子叶植物细胞、双子叶植物细胞。单子叶植物细胞可以是例如玉米、水稻、高粱、甘蔗、大麦、小麦、燕麦、草皮草、或观赏草细胞。双子叶植物细胞可以是例如烟草、胡椒、茄子、向日葵、十字花科植物、亚麻、马铃薯、棉花、大豆、甜菜、或油菜细胞。
如本文使用的术语“植物部分”包括但不限于:胚、花粉、胚珠、种子、叶、茎、芽、花、枝、果实、果仁、穗、穗轴、果壳、茎杆、根、根尖、花药、植物细胞(包括在植物和/或植物的部分中完整的植物细胞)、植物原生质体、植物组织、植物细胞组织培养物、植物愈伤组织、植物团等。如本文使用的,“芽”是指包括叶和茎的地上部分。此外,如本文使用的,“植物细胞”是指植物的结构和生理单位,包含细胞壁并且也可以指原生质体。
在细胞、原核细胞、细菌细胞、真核细胞、植物细胞、植物和/或植物部分的上下文中,术语“引入”(introducing或introduce)意指将核酸分子与细胞、真核细胞、植物、植物部分和/或植物细胞以这种方式相接触,使得核酸分子得以进入细胞、真核细胞、植物细胞和/或植物和/或植物部分的细胞的内部。当引入多于一种核酸分子,这些核酸分子可以被装配成单个多核苷酸或核酸构建体的一部分,或装配成分开的多核苷酸或核酸构建体,并且可以位于相同或不同的核酸构建体上。因此,可以在单个的转化事件中、在分开的转化事件中、或者例如作为育种方案的一部分,将这些多核苷酸引入到植物细胞中。
如本文使用的,术语“转化”和“转基因”是指含有至少一种重组(例如,异源)多核苷酸的全部或部分的任何细胞、原核细胞、真核细胞、植物、植物细胞、愈伤组织、植物组织、或植物部分。在一些实施例中,将重组多核苷酸的全部或部分稳定地整合到染色体或稳定的染色体外元件中,以便使得其传递到连续世代。出于本发明的目的,术语“重组多核苷酸”是指已经通过基因工程改变、重排或修饰的多核苷酸。实例包括任何克隆的多核苷酸,或与异源序列连接或接合的多核苷酸。术语“重组”不指因天然存在的事件(如自发突变)或因非自发诱变随后选择性育种而产生的多核苷酸改变。
如本文使用的术语“转化”是指将异源核酸引入细胞中。细胞的转化可以是稳定或瞬时的。因此,本发明的转基因细胞、植物细胞、植物和/或植物部分可以被稳定转化或瞬时转化。术语“转化”可以指将核酸分子转移到宿主细胞的基因组中,导致基因上稳定的遗传。在一些实施例中,引入植物、植物部分和/或植物细胞中是经由细菌介导的转化、粒子轰击转化、磷酸钙介导的转化、环糊精介导的转化、电穿孔、脂质体介导的转化、纳米粒子介导的转化、聚合物介导的转化、病毒介导的核酸递送、晶须介导的核酸递送、微量注射、超声波处理法、浸润法、聚乙二醇介导的转化、原生质体转化或导致向植物、植物部分和/或其细胞引入核酸的任何其他电学、化学、物理和/或生物学机制,或其任何组合进行的。
用于转化植物的程序在本领域中是熟知且常规的并且普遍描述于文献中。用于植物转化的方法的非限制性实例包括经由以下方式转化:细菌介导的核酸递送(例如,经由来自农杆菌属的细菌)、病毒介导的核酸递送、碳化硅或核酸须晶介导的核酸递送、脂质体介导的核酸递送、微注射、微粒轰击、磷酸钙介导的转化、环糊精介导的转化、电穿孔、纳米粒子介导的转化、超声处理、渗入、PEG介导的核酸吸收、以及使得核酸引入到植物细胞中的任何其他电学、化学、物理(机械)和/或生物学机制,包括其任何组合。本领域中已知的多种植物转化方法的一般指南包括Miki等人,(“Procedures for Introducing Foreign DNAinto Plants[将外源DNA引入植物中的程序]”在Plant Molecular Biology andBiotechnology[植物分子生物学和生物技术]的方法中,Glick,B.R.和Thompson,J.E.编辑(CRC出版有限公司(CRC Press,Inc.),波卡拉顿(Boca Raton),1993),第67-88页)和Rakowoczy-Trojanowska(Cell.Mol.Biol.Lett.[细胞分子生物学快报]7:849-858(2002))。
农杆菌介导的转化是用于转化植物的常用方法,因为其高转化效率以及因为它与许多不同物种的广泛实用性。农杆菌介导的转化典型地涉及将携带目的外源DNA的二元载体转移至适当的农杆菌菌株,这可能取决于由宿主农杆菌菌株在共同存在的Ti质粒上或染色体地携带的vir基因的互补体(Uknes等人,1993,Plant Cell[植物细胞]5:159-169)。可以使用携带重组二元载体的大肠杆菌,辅助大肠杆菌菌株(该辅助大肠杆菌菌株携带能够将重组二元载体移动到靶农杆菌菌株中的质粒)通过三亲本交配程序,来实现将重组二元载体转移至农杆菌。可替代地,可以通过核酸转化将重组二元载体转移至农杆菌中(和Willmitzer,1988,Nucleic Acids Res.[核酸研究]16:9877)。
通过重组农杆菌进行的植物转化通常涉及农杆菌与来自植物的外植体的共培养,并且遵循本领域熟知的方法。典型地在携带位于这些二元质粒T-DNA边界之间的抗生素或除草剂抗性标记的选择培养基上对转化的组织进行再生。
另一种用于转化植物、植物部分以及植物细胞的方法涉及在植物组织和细胞上推进惰性或生物学活性的粒子。参见例如,美国专利号4,945,050;5,036,006和5,100,792。通常,这种方法涉及在有效于穿透细胞的外表面并提供掺入在其内部中的条件下在植物细胞处推进惰性或生物活性的粒子。当使用惰性粒子时,可以通过用含有目的核酸的载体包被这些粒子而将载体引入细胞中。可替代地,一个或多个细胞可以被载体围绕以使得载体通过粒子的激发而被带入细胞中。也可以将生物活性粒子(例如,干燥的酵母细胞、干燥的细菌或噬菌体,各自包含一个或多个试图被引入的核酸)推进到植物组织中。
在多核苷酸的上下文中,“瞬时转化”意指将多核苷酸引入细胞中并且没有整合到细胞的基因组中。
如本文使用的,在被引入细胞中的多核苷酸的上下文中,“稳定引入(stablyintroducing、stably introduced)”、“稳定转化(stable transformation或stablytransformed)”意指引入的多核苷酸被稳定地整合到该细胞的基因组中,并且因此该细胞用该多核苷酸稳定地转化。因此,整合的多核苷酸能够由其子代继承,更具体地说,由多个连续世代的子代继承。如本文使用的“基因组”包括核和/或质体基因组,并且因此包括多核苷酸至例如叶绿体基因组中的整合。如本文使用的稳定转化还可以是指被保持在染色体外,例如,作为微染色体的多核苷酸。
瞬时转化可以通过例如酶联免疫测定(ELISA)或蛋白质印迹来进行检测,这两种方法可以检测由引入生物体的一个或多个核酸分子编码的肽或多肽的存在。细胞的稳定转化可以通过例如细胞基因组DNA与核酸序列(这些序列与引入生物体(例如,植物)中的核酸分子的核苷酸序列特异性地杂交)的DNA印迹杂交测定来进行检测。细胞的稳定转化可以通过例如细胞的RNA与核酸序列(这些序列与引入植物或其他生物体的核酸分子的核苷酸序列特异性地杂交)的RNA印记杂交测定来进行检测。细胞的稳定转化还可以通过例如聚合酶链式反应(PCR)或本领域内熟知的其他扩增反应来进行检测,该反应采用与核酸分子的一个或多个靶序列进行杂交的特异性引物序列,导致一个或多个靶序列的扩增,这种扩增可以根据标准方法进行检测。转化还可以通过本领域熟知的直接测序和/或杂交方案进行检测。
因此,在本发明的具体实施例中,植物细胞可以通过本领域内已知的任何方法并且如本文描述进行转化并且可以使用多种已知技术中的任一种来从这些经转化的细胞再生出完整的植物。来自植物细胞、植物组织培养和/或培养的原生质体的植物再生描述于例如Evans等人(Handbook of Plant Cell Cultures[植物细胞培养手册],第1卷,麦克米兰出版公司(MacMilan Publishing Co.)纽约(1983));和Vasil I.R.(编辑)(Cell Culture and Somatic Cell Genetics of Plants[植物的细胞培养和体细胞遗传学],学术出版社,奥兰多,第I卷(Acad.Press,Orlando)(1984)和第II卷(1986))中。选择转化的转基因植物、植物细胞和/或植物组织培养物的方法在本领域中是常规的,并且可以用于在此提供的本发明的方法中。
“转化和再生过程”是指将转基因稳定地引入植物细胞并从转基因植物细胞再生植物的过程。如本文使用的,转化和再生包括选择过程,其中转基因包含选择性标记,并且转化的细胞已经并入并表达转基因,使得转化的细胞将在选择剂存在下存活并发育繁盛。“再生”是指从植物细胞、植物细胞组、或植物片(如来自原生质体、愈伤组织、或组织部分的)长成整个植物。
术语“核苷酸序列”、“核酸”、“核酸序列”、“核酸分子”“寡核苷酸”以及“多核苷酸”在本文可互换地使用来指核苷酸的杂聚物并且涵盖RNA和DNA二者,包括cDNA、基因组DNA、mRNA、合成的(例如,化学合成的)DNA或RNA以及RNA和DNA的嵌合体。术语核酸分子是指核苷酸链,而不考虑链的长度。这些核苷酸包含糖、磷酸和碱,该碱是嘌呤或嘧啶。核酸分子可以是双链或单链的。在单链时,核酸分子可以是正义链或反义链。可以使用寡核苷酸类似物或衍生物(例如,肌苷或硫代磷酸核苷酸)合成核酸分子。此类寡核苷酸可以例如用于制备具有改变的碱基配对能力或对核酸酶的增强的抗性的核酸分子。本文提供的核酸序列在本文以5'至3'方向从左至右表示,并且使用代表核苷酸字符的标准代码表示,如美国序列规则,37CFR§§1.821-1.825和世界知识产权组织(WIPO)标准ST.25中所述。
“核酸片段”是给定的核酸分子的一部分。“RNA片段”是给定的RNA分子的一部分。“DNA片段”是给定的DNA分子的一部分。“核酸区段”是给定的核酸分子的一部分并且并不是从该分子分离的。“RNA区段”是给定的RNA分子的一部分并且并不是从该分子分离的。“DNA区段”是给定的DNA分子的一部分并且并不是从该分子分离的。多核苷酸的区段可以是任何长度,例如长度为至少5、10、15、20、25、30、40、50、75、100、150、200、300或500或更多个核苷酸。指导序列的区段或一部分可以是指导序列的约50%、40%、30%、20%、10%,例如指导序列的三分之一或更短,例如长度为7、6、5、4、3、或2个核苷酸。
在分子的上下文中,术语“源自”是指使用亲本分子或来自该亲本分子的信息,分离或制造的分子。例如,Cas9单突变体切口酶和Cas9双突变体无效核酸酶源自野生型Cas9蛋白。
在高等植物中,脱氧核糖核酸(DNA)是遗传物质,而核糖核酸(RNA)涉及将DNA中包含的信息到蛋白中的转移。“基因组”是在生物体的每个细胞中所包含的遗传物质的整体。除非另外表明,本发明的特定的核酸序列还暗示性地涵盖其保守地修饰的变体(例如,简并密码子取代)以及互补序列、以及连同明确地指明的序列。具体地,简并密码子取代可以通过产生如下序列而获得,在这些序列中,一个或多个所选的(或全部)密码子的第三位被混合碱基和/或脱氧肌苷残基取代(Batzer等人,Nucleic Acid Res.[核酸研究]19:5081(1991);Ohtsuka等人,J.Biol.Chem.[生物化学杂志]260:2605-2608(1985);和Rossolini等人,Mol.Cell.Probes[分子与细胞探针]8:91-98(1994))。术语核酸分子与基因、cDNA和由基因编码的mRNA可互换地使用。
如本文使用的“序列同一性”是指两个最佳比对的多核苷酸或肽序列在组分(例如,核苷酸或氨基酸)的整个比对窗口内不变的程度。“同一性”可以通过已知方法容易地计算出,这些方法包括但不限于以下文献中描述的那些:Computational Molecular Biology[计算分子生物学](Lesk,A.M.,编辑)Oxford University Press[牛津大学出版社],纽约(1988);Biocomputing:Informatics and Genome Projects[生物计算:信息学和基因组项目](Smith,D.W.,编辑)Academic Press[学术出版社],纽约(1993);Computer Analysisof Sequence Data[序列数据的计算机分析],第I部分(Griffin,A.M.和Griffin,H.G.编辑)Humana Press[胡马纳出版社],新泽西(1994);Sequence Analysis in Molecular Biology[分子生物学的序列分析](von Heinje,G.编辑)学术出版社(1987);和Sequence Analysis Primer[序列分析引物](Gribskov,M.和Devereux,J.编辑)斯托克顿出版社,纽约(1991)。
如本文使用的,术语“序列同一性百分比”或“同一性百分比”是指在最佳比对两个序列时,与测试(“主题”)多核苷酸分子(或其互补链)相比,参考(“查询”)多核苷酸分子(或其互补链)的线性多核苷酸序列中的同一核苷酸的百分比。在一些实施例中,“同一性百分比”可以是指氨基酸序列中同一氨基酸的百分比。
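As used herein, the phrase "substantially identical," in the context of two nucleic acid molecules, nucleotide sequences, or protein sequences, refers to two or more sequences or subsequences that have at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% nucleotide or amino acid residue identity when compared and aligned for maximum correspondence, as measured using one of the following sequence comparison algorithms or by visual inspection. In some embodiments of the invention, substantial identity exists over a region of the sequences that is at least about 50 residues to about 150 residues in length. Thus, in some embodiments of the invention, substantial identity exists over a region of the sequences that is at least about 50, about 60, about 70, about 80, about 90, about 100, about 110, about 120, about 130, about 140, about 150, or more residues in length. In some particular embodiments, the sequences are substantially identical over at least about 150 residues. In further embodiments, the sequences are substantially identical over the entire length of the coding regions. Furthermore, in representative embodiments, substantially identical nucleotide or protein sequences perform substantially the same function (e.g., guidance to a specific genomic target site, or endonuclease cleavage of a specific genomic target site).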
对于序列比较,典型地,一个序列充当与测试序列进行比较的参考序列。当使用序列比较算法时,将测试序列和参考序列输入到计算机中(若有必要,则指定子序列坐标),并且指定序列算法程序的参数。然后,序列比较算法基于所指定的程序参数来计算这个或这些测试序列相对于参考序列的序列同一性百分比。
用于比对比较窗口的最佳序列比对是本领域技术人员所熟知的并且可以由以下工具实施:如Smith和Waterman的局部同源性算法、Needleman和Wunsch的同源性比对算法、Pearson和Lipman的相似性搜索方法,并且任选地由这些算法的计算机化实现方式来实施,如作为Wisconsin(材料科学软件公司(Accelrys Inc.),圣地亚哥,加利福尼亚州)的部分可获得的GAP、BESTFIT、FASTA和TFASTA。测试序列和参考序列的已比对区段的“同一性分数”是由两个已比对序列所共有的同一组分的数目除以参考序列区段(即,完整的参考序列或参考序列的更小限定部分)中组分的总数目。序列同一性百分比被表示为同一性分数乘以100。一个或多个多核苷酸序列的比较可以是相对于全长多核苷酸序列或其部分,或相对于较长的多核苷酸序列。出于本发明的目的,也可以使用针对翻译的核苷酸序列的2.0版BLASTX和针对多核苷酸序列的2.0版BLASTN确定“同一性百分比”。
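The "identity fraction" defined in this paragraph is the number of identical components shared by the two aligned sequences divided by the total number of components in the reference segment, and percent identity is that fraction multiplied by 100. The following is a minimal sketch, assuming the two sequences have already been aligned and padded with `-` for gaps; the alignment step itself (GAP, BESTFIT, BLAST, and so on) is outside this sketch, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_test: str, aligned_ref: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences, relative to the reference."""
    if len(aligned_test) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    # Count positions where both sequences carry the same non-gap component.
    identical = sum(
        1 for t, r in zip(aligned_test, aligned_ref) if t == r and r != gap
    )
    # The denominator is the number of components in the reference segment.
    ref_components = sum(1 for r in aligned_ref if r != gap)
    if ref_components == 0:
        raise ValueError("reference segment is empty")
    return 100.0 * identical / ref_components

print(percent_identity("GGGAS", "GGGGS"))  # one mismatch in five positions
print(percent_identity("GG-GS", "GGGGS"))  # one gap in the test sequence
```

A threshold such as the "at least 70% identity" used elsewhere in the document would then be a comparison of this value against 70.0.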
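Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information. This algorithm involves first identifying high-scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence that either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., 1990). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as long as the cumulative alignment score can be increased. For nucleotide sequences, cumulative scores are calculated using the parameters M (reward score for a pair of matching residues; always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when the cumulative alignment score falls off by the quantity X from its maximum achieved value; when the cumulative score goes to zero or below because of the accumulation of one or more negative-scoring residue alignments; or when the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a word length (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a word length (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)).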
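The extension step described in this paragraph can be illustrated with a small ungapped extension in one direction. This is a simplified sketch and not the BLAST implementation: it uses the M=5 and N=-4 defaults quoted above, but the X value of 20 is an assumption chosen for illustration, word seeding is omitted, and the function name `extend_right` is hypothetical.

```python
from typing import Tuple

def extend_right(query: str, subject: str, q_start: int, s_start: int,
                 match: int = 5, mismatch: int = -4,
                 x_drop: int = 20) -> Tuple[int, int]:
    """Ungapped rightward extension from a seed position.

    Returns (best_score, length) for the best-scoring extension found.
    Extension stops when the score falls x_drop below its maximum, when the
    score reaches zero or below, or when either sequence ends.
    """
    score = best = best_len = 0
    i = 0
    while q_start + i < len(query) and s_start + i < len(subject):
        same = query[q_start + i] == subject[s_start + i]
        score += match if same else mismatch
        i += 1
        if score > best:
            best, best_len = score, i
        elif best - score >= x_drop or score <= 0:
            break
    return best, best_len

print(extend_right("ACGTACGT", "ACGTACGT", 0, 0))
print(extend_right("ACGTTTTT", "ACGTAAAA", 0, 0, x_drop=8))
```

In the second call the four matching bases score 20, and two consecutive mismatches then drop the running score by 8, which triggers the X cutoff and leaves the four-base match as the reported extension.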
除了计算序列同一性百分比之外,BLAST算法还进行两个序列之间相似性的统计分析(参见,例如Karlin和Altschul,Proc.Natl.Acad.Sci.USA[美国国家科学院院刊]90:5873-5787(1993))。由BLAST算法提供的相似性的一种量度是最小总和概率(P(N)),提供了在两个核苷酸或氨基酸序列之间会偶然发生匹配的概率的指示。例如,如果在测试核苷酸序列与参考核苷酸序列的比较中的最小总和概率小于约0.1至小于约0.001,则测试核酸序列被认为与参考序列相似。因此,在本发明的一些实施例中,在测试核苷酸序列与参考核苷酸序列的比较中的最小总和概率小于约0.001。
当两个核苷酸序列在严格条件下彼此杂交时,这两个核苷酸序列也可以被认为是基本上同一的。在一些代表性实施例中,被认为基本上同一的两个核苷酸序列在高严格条件下彼此杂交。
在核酸杂交实验(如DNA杂交和RNA杂交)的上下文中,“严格杂交条件”和“严格杂交洗涤条件”是序列依赖性的,并且在不同的环境参数下是不同的。对核酸杂交的全面指南见于Tijssen Laboratory Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic Acid[生物化学和分子生物学实验室技术-使用核酸探针的杂交]第2章第I部分“Overview of principles of hybridization and the strategy ofnucleic acid probe assays[杂交原理和核酸探针测定策略综述]”Elsevier[爱思唯尔],纽约(1993)。通常,高严格杂交和洗涤条件在限定的离子强度和pH下被选定为比特定序列的热熔点(Tm)低约5℃。
Tm是50%的靶序列与完全匹配的探针进行杂交时的温度(在限定的离子强度和pH下)。非常严格条件被选定为等于特定探针的Tm。用于互补核苷酸序列(它们在DNA或RNA印迹中在滤器上具有超过100个互补残基)的杂交的严格杂交条件的一个实例是在42℃下具有1mg肝素的50%甲酰胺,其中杂交是过夜进行的。高严格洗涤条件的一个实例是0.15MNaCl,在72℃持续约15分钟。严格洗涤条件的实例是在65℃以0.2x SSC洗涤持续15分钟(参见Sambrook,下文,针对SSC缓冲液的描述)。通常,高严格洗涤之前会先进行低严格洗涤,以去除背景探针信号。对于例如多于100个核苷酸的双链体的中严格洗涤的实例是在45℃以1x SSC持续15分钟。对于例如多于100个核苷酸的双链体的低严格性洗涤的一个实例是在40℃下以4-6xSSC进行15分钟。对于短探针(例如,约10至50个核苷酸),严格条件典型地涉及小于约1.0M的Na离子的盐浓度,典型地在pH 7.0至8.3下约0.01至1.0M的Na离子浓度(或其他盐),并且温度典型地是至少约30℃。还可以通过添加去稳定剂(如甲酰胺)来达到严格条件。一般而言,在特定的杂交测定中相比于不相关的探针观察到的高出2x(或更高)的信噪比表明检测到特异性杂交。如果在严格条件下彼此不杂交的核苷酸序列所编码的蛋白质是基本上同一的,则这些核苷酸序列仍然是基本上同一的。例如,当使用遗传密码所允许的最大密码子简并性来生成核苷酸序列的拷贝时,这种情况可能发生。
以下是可以用来克隆同源核苷酸序列(这些序列是与本发明的参考核苷酸序列基本上同一的)的杂交/洗涤条件的设置的实例。在一个实施例中,参考核苷酸序列在50℃下、在7%十二烷基硫酸钠(SDS)、0.5M NaPO4、1mM EDTA中与“测试”核苷酸序列杂交,同时在50℃下、在2X SSC、0.1%SDS中洗涤。在另一个实施例中,参考核苷酸序列在50℃下、在7%十二烷基硫酸钠(SDS)、0.5M NaPO4、1mM EDTA中与“测试”核苷酸序列杂交,同时在50℃下、在1XSSC、0.1%SDS中洗涤;或者在50℃下、在7%十二烷基硫酸钠(SDS)、0.5M NaPO4、1mMEDTA中杂交,同时在50℃下、在0.5X SSC、0.1%SDS中洗涤。在仍另外的实施例中,参考核苷酸序列在50℃下、在7%十二烷基硫酸钠(SDS)、0.5M NaPO4、1mM EDTA中与“测试”核苷酸序列杂交,同时在50℃下、在0.1X SSC、0.1%SDS中洗涤;或者在50℃下、在7%十二烷基硫酸钠(SDS)、0.5M NaPO4、1mM EDTA中杂交,同时在65℃下、在0.1X SSC、0.1%SDS中洗涤。
“分离的”核酸分子或核苷酸序列或“分离的”多肽是通过人工脱离其天然环境存在的和/或当与其在其天然环境中的功能相比时具有不同的、修饰的、调节的和/或改变的功能的并且因此不是天然的产物的核酸分子、核苷酸序列或多肽。分离的核酸分子或分离的多肽可以以纯化形式存在或可以存在于非天然环境(如,例如重组宿主细胞)中。因此,例如,相对于多核苷酸而言,术语分离的意指将多核苷酸从其天然存在于其中的染色体和/或细胞中分离。如果将一种多核苷酸从其天然存在于其中的染色体和/或细胞中分离,并且然后将其插入并非其天然存在于其中的遗传背景、染色体、染色体位置、和/或细胞中,则多核苷酸也是被分离的。本发明的重组核酸分子和核苷酸序列可以被认为是如上文所定义的“分离的”。
因此,“分离的核酸分子”或“分离的核苷酸序列”是核酸分子或核苷酸序列,该核酸分子或核苷酸序列不与在其衍生而来的生物体的天然存在的基因组中的与其邻近的核苷酸序列(位于5'端的序列或位于3'端的序列)相邻。因此,在一个实施例中,分离的核酸包括一些或全部的5'非编码(例如,启动子)序列,这些序列与编码序列紧密相邻。因此,术语包括,例如,重组核酸,该重组核酸并入载体、并入自我复制的质粒或病毒、或并入原核生物或真核生物的基因组DNA,或者作为独立于其他序列的单独分子(例如,cDNA或通过PCR或限制性内切核酸酶处理而得到的基因组DNA片段)而存在。它也包括作为编码另外的多肽或肽序列的杂合核酸分子的部分的重组核酸。“分离的核酸分子”或“分离的核苷酸序列”还可以包括核苷酸序列,该核苷酸序列源自并插入相同的天然原始细胞类型,但是却以非天然状态存在,例如,以不同拷贝数目存在,和/或处于与在核酸分子的天然状态中发现的那些不同的调节序列的控制下。
术语“分离的”可以进一步指核酸分子、核苷酸序列、多肽、肽或片段,它们实质上不含细胞材料、病毒材料、和/或培养基(例如,当通过重组DNA技术生产时)、或化学前体或其他化学品(例如,当进行化学合成时)。另外,“分离的片段”是不作为片段天然存在并且不会在天然状态下如此存在的核酸分子、核苷酸序列或多肽的片段。“分离的”不必须意味着制备是工业纯的(同质的),但是它是足够纯的以提供处于可以用于预期目的形式的多肽或核酸。
在本发明的代表性实施例中,“分离的”核酸分子、核苷酸序列和/或多肽是至少约5%、10%、15%、20%、25%、30%、40%、50%、60%、70%、75%、80%、85%、90%、95%、97%、98%、99%纯(w/w)或更纯。在其他实施例中,“分离的”核酸、核苷酸序列和/或多肽表示与起始材料相比,实现核酸的至少约5倍、10倍、25倍、100倍、1000倍、10,000倍、100,000倍或更大富集(w/w)。
“野生型”核苷酸序列或氨基酸序列是指天然存在(“天然”)或内源核苷酸序列或氨基酸序列。因此,例如,“野生型mRNA”是天然存在于生物体中的或对生物体来说是内源性的mRNA。“同源”核苷酸序列是与它被引入的宿主细胞天然相关的核苷酸序列。
术语“开放阅读框”和“ORF”是指在编码序列的翻译起始和终止密码子之间编码的氨基酸序列。术语“起始密码子”和“终止密码子”是指在编码序列中三个相邻的核苷酸(‘密码子’)的单位,分别指明蛋白合成(mRNA翻译)的起始和链终止。
“启动子”是指核苷酸序列,通常在它的编码序列的上游(5'),它通过提供对适当的转录所需的RNA聚合酶以及其他因子的识别来控制编码序列的表达。“启动子调节序列”由近端和更远端上游元件组成。启动子调节序列影响相关编码序列的转录、RNA加工或稳定性、或翻译。调节序列包括增强子、启动子、非翻译的前导序列、内含子、以及聚腺苷酸化信号序列。它们包括天然序列以及合成序列、以及可能是合成序列与天然序列的组合的序列。“增强子”是DNA序列,它可以刺激启动子的活性并且可以是该启动子的固有元件或插入的异源元件以增强启动子的水平或组织特异性。它能够在两个方向(正常或翻转)上进行操作,并且甚至当移动到该启动子的上游或下游时还能够发挥作用。术语“启动子”的含义包括“启动子调节序列”。
“初级转化株”以及“T0世代”是指与最初转化(即,自从转化起未经历减数分裂以及受精)的组织具有相同遗传世代的转基因植物。“次级转化株”以及“T1、T2、T3等世代”是指经由一个或多个减数分裂以及受精循环而源自初级转化株的转基因植物。它们可以通过初级或次级转化株的自体受精或初级或次级转化株与其他转化或未转化植物的杂交衍生的。
“转基因”是指核酸分子,该核酸分子已经通过转化被引入该基因组中并且被稳定地保持。转基因可以包含至少一个表达盒,典型地包含至少两个表达盒,并且可以包含十个或更多个表达盒。转基因可以包括,例如对于待转化的特定植物的基因而言是异源的或者是同源的基因。此外,转基因可以包含被插入非天然生物体中的天然基因,或嵌合基因。术语“内源基因”是指在生物体的基因组中在其天然位置中的天然基因。“外源”基因是指正常在宿主生物体中未发现但通过基因转移被引入该生物体中的基因。
“内含子”是指几乎唯一地在真核基因中发生的DNA的内插区段,但该内插区段在该基因产物中没有被翻译成氨基酸序列。通过称为剪接的过程从未成熟的mRNA中去除这些内含子,该剪接使外显子未被触及,从而形成mRNA。出于本发明的目的,术语“内含子”的定义包括对源自靶基因的内含子的核苷酸序列进行修饰,条件是该经修饰的内含子没有显著地减少其关联的5'调节序列的活性。
“外显子”是指携带蛋白或其一部分的编码序列的DNA的区段。外显子通过内插的、非编码序列(内含子)分离。出于本发明的目的,术语“外显子”的定义包括对源自靶基因的外显子的核苷酸序列进行修饰,条件是该经修饰的外显子没有显著地减少其关联的5'调节序列的活性。
术语“切割(cleavage或cleaving)”是指多核苷酸的核糖基磷酸二酯主链中的共价磷酸二酯键联的断裂。术语“切割(cleavage或cleaving)”涵盖单链断裂和双链断裂二者。作为两次不同的单链切割事件的结果,可以发生双链切割。切割可以导致产生平末端或交错末端。“核酸酶切割位点”或“基因组核酸酶切割位点”是包括核酸酶切割序列的核苷酸区域,该核酸酶区域由特异性核酸酶识别,该核酸酶用于切割一条或两条链中基因组DNA的核苷酸序列。由核酸酶的这种切割引发了细胞内的DNA修复机制,该修复机制建立了同源重组发生的环境。
本发明提供了融合蛋白,该融合蛋白具有脱氨酶结构域和定点DNA-结合结构域之间的改进的接头,提供了增加的编辑效率和降低的突变频率。在本发明的一些实施例中,脱氨酶结构域为胞苷脱氨酶。在本发明的其他实施例中,脱氨酶结构域为腺嘌呤脱氨酶。在一些实施例中,胞苷脱氨酶结构域为激活诱导的胞苷脱氨酶(“AID”)。在本发明的一些实施例中,胞苷脱氨酶结构域为载脂蛋白B mRNA-编辑复合物(“APOBEC”)结构域。在一些实施例中,APOBEC结构域为APOBEC1家族脱氨酶。
“胞苷脱氨酶”是指对胞苷和脱氧胞苷的不可逆水解脱氨基作用进行催化,分别产生尿苷和脱氧尿苷的酶。胞苷脱氨酶维持细胞嘧啶库。胞苷脱氨酶的家族为APOBEC(“载脂蛋白B mRNA编辑酶,催化多肽样”)。该家族的成员是C至U编辑酶。APOBEC样蛋白的N末端结构域为催化结构域,而C末端结构域是假催化结构域(pseudocatalytic domain)。更具体地,催化结构域为锌依赖性胞苷脱氨酶结构域,并且对胞苷脱氨基很重要。APOBEC1的RNA编辑需要同源二聚体作用,并且该复合物与RNA结合蛋白相互作用以形成编辑体。APOBEC蛋白的非限制实例包括APOBEC1、APOBEC2、APOBEC3A、APOBEC3B、APOBEC3C、APOBEC3D、APOBEC3F、APOBEC3G、APOBEC3H、APOBEC4以及激活诱导的(胞苷)脱氨酶。APOBEC蛋白的多种突变体也是已知的,这些突变体带来了碱基编辑器的不同编辑特征。例如,对于人APOBEC3A,根据编辑效率,某些突变体(例如,Y130F、Y132D、W104A和D131Y)甚至优于野生型人APOBEC3A。因此,术语APOBEC及其每个家族成员还涵盖与相应野生型APOBEC蛋白具有一定水平(例如,70%、75%、80%、85%、90%、95%、98%、99%)的序列同一性,并且保留胞苷脱氨基活性的变体和突变体。变体和突变体可以用氨基酸添加、缺失和/或取代来得到。在一些实施例中,此类取代是保守取代。
“胞嘧啶碱基编辑器”(“CBE”)将C·G碱基对转化为T·A碱基对。
“腺嘌呤脱氨酶”是指对腺苷水解性脱氨基作用进行催化,产生肌苷的酶。肌苷与C配对,并且因此作为G读取或复制。示例酶为来自大肠杆菌(E.coli)的TadA,作为同源二聚体起作用。
“腺嘌呤碱基编辑器”(“ABE”)将A·T碱基对转化为G·C碱基对。
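The net outcomes of the two editor types defined above can be written as string operations on one DNA strand, with the partner strand following by complementarity. This is a minimal sketch of the end result only, assuming the edit has already been resolved by replication or repair; it does not model the deaminated intermediates (uracil or inosine), and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(strand: str) -> str:
    """Base-by-base complement (not reversed) of a DNA strand."""
    return strand.translate(_COMPLEMENT)

def cbe_edit(strand: str, pos: int) -> str:
    """Cytosine base editor outcome: C at 0-based `pos` becomes T."""
    if strand[pos] != "C":
        raise ValueError("a CBE acts on cytosine")
    return strand[:pos] + "T" + strand[pos + 1:]

def abe_edit(strand: str, pos: int) -> str:
    """Adenine base editor outcome: A at 0-based `pos` becomes G."""
    if strand[pos] != "A":
        raise ValueError("an ABE acts on adenine")
    return strand[:pos] + "G" + strand[pos + 1:]

edited = cbe_edit("GACT", 2)
print(edited, complement(edited))   # the C·G pair at index 2 is now T·A
edited = abe_edit("GACT", 1)
print(edited, complement(edited))   # the A·T pair at index 1 is now G·C
```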
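Lachnospiraceae bacterium Cpf1 (LbCpf1) is one of a large group of Cpf1 proteins. The terms "Cpf1" and "Cas12a" are used interchangeably throughout. Cpf1 is a Cas protein. The term "Cas protein" or "clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) protein" refers to an RNA-guided DNA endonuclease associated with the CRISPR adaptive immune system found in, for example, Streptococcus pyogenes and other bacteria. Cas proteins include Cas9, Cas12a, Cas12b, Cas12i, Cas12j, and the like. In some embodiments of the invention, the site-directed DNA-binding domain is a catalytically inactive Cas12a from Lachnospiraceae bacterium ("dLbCas12a"). In other embodiments, the site-directed DNA-binding domain is catalytically active and is from Lachnospiraceae bacterium ("LbCas12a") or Moraxella bovoculi AAX08_00205 ("Mb2Cas12a"). In some embodiments of the invention, a Cas12a protein from Lachnospiraceae bacterium, Acidaminococcus sp., Moraxella bovoculi, Thiomicrospira sp., Moraxella lacunata, Methanomethylophilus alvus, Butyrivibrio sp., or Bacteroidetes oral sp. is provided as the site-directed DNA-binding domain of the fusion protein.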
融合蛋白可以包括其他片段,如尿嘧啶DNA糖基化酶抑制剂(UGI)和核定位序列(NLS)。
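The "uracil glycosylase inhibitor" (UGI), which can be prepared from Bacillus subtilis bacteriophage PBS1, is a small protein (9.5 kDa) that inhibits E. coli uracil-DNA glycosylase (UDG) as well as UDG from other species. Inhibition of UDG occurs through reversible protein binding at a 1:1 UDG:UGI stoichiometry. UGI is able to dissociate UDG-DNA complexes. A non-limiting example of a UGI is found in Bacillus phage AR9 (YP_009283008.1). In some embodiments, the UGI comprises the amino acid sequence of SEQ ID NO:8, or has at least 70%, 75%, 80%, 85%, 90%, or 95% sequence identity to SEQ ID NO:8 and retains uracil glycosylase inhibition activity.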
在一些实施例中,UGI位于胞苷脱氨酶-Cpf1部分的C末端侧。在一些实施例中,融合蛋白包含至少两个UGI。
在一些实施例中,至少一种核定位信号(“NLS”)位于第一片段和第二片段(胞苷脱氨酶-Cpf1部分)的C末端,例如第二片段(包括Cpf1)和UGI之间。在一些实施例中,至少两种NLS位于第二片段和UGI之间。在一些实施例中,至少三种NLS位于第二片段和UGI之间。在一些实施例中,至少一种NLS位于第一片段和第二片段(胞苷脱氨酶-Cpf1部分)的N末端。
融合蛋白中的组分的非限制性示例排列从N-末端至C-末端包括(a)NLS、胞苷脱氨酶、Cas12a、NLS、UGI、NLS、2A和UGI;(b)NLS、胞苷脱氨酶、Cas12a、NLS、NLS、UGI、NLS、2A和UGI;(c)NLS、胞苷脱氨酶、Cas12a、NLS、UGI、NLS、2A、UGI、2A和UGI;(d)NLS、胞苷脱氨酶、Cas12a、NLS、UGI、NLS、2A、UGI、2A、UGI、2A和UGI。
在一些实施例中,任选地在融合蛋白中的每个片段之间提供肽接头。在一些实施例中,肽接头具有1至100个氨基酸残基(或3-20个、4-15个,不受限制)。在一些实施例中,至少10%、20%、30%、40%、50%、60%、70%、80%或90%的肽接头的氨基酸残基是选自由以下组成的组的氨基酸残基:丙氨酸、甘氨酸、半胱氨酸和丝氨酸。
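The two linker criteria in this paragraph, a length of 1 to 100 residues and a minimum share of alanine, glycine, cysteine, and serine, are easy to check for any candidate linker. The following is a minimal sketch under those two criteria only; the function names are hypothetical, and the example sequences are the GGGGS repeat, the SGGS linker, and the 18-residue SX linker quoted elsewhere in this document.

```python
LISTED_RESIDUES = set("AGCS")  # alanine, glycine, cysteine, serine

def listed_fraction(linker: str) -> float:
    """Fraction of linker residues that are A, G, C, or S."""
    if not linker:
        raise ValueError("linker must not be empty")
    return sum(aa in LISTED_RESIDUES for aa in linker.upper()) / len(linker)

def meets_criteria(linker: str, min_fraction: float = 0.9) -> bool:
    """True if the linker has 1-100 residues and enough A/G/C/S content."""
    return 1 <= len(linker) <= 100 and listed_fraction(linker) >= min_fraction

for name, seq in [("(G4S)x6", "GGGGS" * 6),
                  ("SGGS", "SGGS"),
                  ("SX", "GGSTGGGSGGGSGGGSSG")]:
    print(name, len(seq), round(listed_fraction(seq), 3), meets_criteria(seq))
```

The SX linker contains a single threonine, so 17 of its 18 residues fall in the listed set and it still clears a 90% threshold.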
本发明还提供了包含编码本发明的指导RNA的核酸序列的核酸分子。核酸分子可以是DNA或RNA分子。在一些实施例中,核酸分子是环化的。在其他实施例中,核酸分子是直链的。在一些实施例中,核酸分子是单链的、部分双链的、或双链的。在一些实施例中,核酸分子与至少一个多肽复合。多肽可以具有核酸识别结构域或核酸结合结构域。在一些实施例中,多肽用于介导例如,本发明的嵌合RNA和任选地核酸酶的递送的穿梭物。在一些实施例中,多肽是Feldan穿梭物(美国专利公开号20160298078,通过引用并入本文)。
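An "on-target edit" is a substitution of thymine for cytosine in the region following the PAM site that is targeted by the gRNA. The main editing window is bases 8 to 13 after the PAM site. An "off-target edit" is an insertion or deletion (indel) or a base change other than C to T within the gRNA target region, or a base change or indel outside the gRNA target region.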
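The on-target and off-target definitions above amount to a small decision rule. The following is a minimal sketch, assuming positions are counted from 1 at the first base after the PAM and that the gRNA target region is 23 bases long. The 23-base length is an assumption made for illustration and is not stated in this definition; the function name `classify_edit` is hypothetical.

```python
def classify_edit(position: int, ref: str, alt: str, target_len: int = 23) -> str:
    """Classify one observed change relative to the gRNA target region.

    position: 1-based offset of the change downstream of the PAM.
    ref, alt: reference and observed alleles; alleles of unequal length
              (including an empty string) represent an indel.
    """
    in_target = 1 <= position <= target_len
    is_c_to_t = ref == "C" and alt == "T"
    if in_target and is_c_to_t:
        if 8 <= position <= 13:
            return "on-target (main window)"
        return "on-target"
    # Any indel or non-C-to-T change inside the target region, and any
    # change at all outside it, counts as off-target.
    return "off-target"

print(classify_edit(10, "C", "T"))
print(classify_edit(3, "C", "T"))
print(classify_edit(10, "C", "G"))
print(classify_edit(40, "C", "T"))
```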
“定点修饰多肽”修饰靶DNA(例如靶DNA的切割或甲基化)和/或与靶DNA缔合的多肽(例如组蛋白尾的甲基化或乙酰化)。定点修饰多肽在本文也称为“定点多肽”或“RNA结合定点修饰多肽”。由于定点修饰多肽与指导RNA的缔合,定点修饰多肽与指导RNA相互作用(该指导RNA是单个RNA分子或至少两个RNA分子的RNA双链体),并被引导至DNA序列(例如染色体序列或染色体外序列,例如游离体序列、微环序列、线粒体序列、叶绿体序列等)。
在一些情况下,定点修饰多肽是天然存在的修饰多肽。在其他情况下,定点修饰多肽不是天然存在的修饰多肽(例如嵌合多肽或被修饰的(例如突变、缺失、插入)的天然存在的多肽)。示例性天然存在的定点修饰多肽是本领域已知的(参见例如Makarova等人,2017,Cell[细胞]168:328-328.e1,和Shmakov等人,2017,Nat Rev Microbiol[自然微生物学综述]15(3):169-182,这两篇文献均通过引用并入本文)。这些天然存在的多肽结合DNA靶向RNA,并且由此被指导至靶DNA内的特定序列,并且切割靶DNA,从而产生双链断裂。
定点修饰多肽包含两个部分,即RNA结合部分和活性部分。在一些实施例中,定点修饰多肽包含:(i)与DNA靶向RNA相互作用的RNA结合部分,其中该DNA靶向RNA包含与靶DNA中的序列互补的核苷酸序列;和(ii)展现出定点酶活性(例如DNA甲基化活性、DNA切割活性、组蛋白乙酰化活性、组蛋白甲基化活性等)的活性部分,其中由DNA靶向RNA决定酶活性的位点。在其他实施例中,定点修饰多肽包含:(i)与DNA靶向RNA相互作用的RNA结合部分,其中该DNA靶向RNA包含与靶DNA中的序列互补的核苷酸序列;和(ii)调节靶DNA内的转录(例如增加或减少转录)的活性部分,其中由DNA靶向RNA决定靶DNA内的调节的转录的位点。
在一些情况下,定点修饰多肽具有可操作地连接的异源结构域。异源结构域可以是酶或信号肽。在异源结构域是酶结构域的方面中,该结构域具有修饰靶核酸的酶活性(例如核酸酶活性、甲基转移酶活性、脱甲基酶活性、DNA修复活性、DNA损伤活性、脱氨基活性、逆转录酶活性、歧化酶活性、烷基化活性、甲基化活性、脱嘌呤活性、氧化活性、嘧啶二聚体形成活性、整合酶活性、转座酶活性、重组酶活性、聚合酶活性、连接酶活性、螺旋酶活性、光解酶活性或糖基化酶活性)。在其他情况下,定点修饰多肽具有可操作地连接的酶结构域,该酶结构域的酶活性修饰与靶DNA缔合的多肽(例如,组蛋白)(例如甲基转移酶活性、脱甲基酶活性、乙酰转移酶活性、脱乙酰基酶活性、激酶活性、磷酸酶活性、泛素连接酶活性、去泛素化活性、腺苷酸化活性、脱腺苷酸化活性、SUMO化活性、去SUMO化活性、核糖基化活性、去核糖基化活性、豆蔻酰化活性或去豆蔻酰化活性)。示例性酶结构域包括腺苷脱氨酶、氧化酶、胸腺嘧啶烷基转移酶、腺嘌呤氧化酶、腺苷甲基转移酶、腺苷脱氨酶、糖基化酶,无论单独或与其他酶结构域组合。在异源结构域为信号肽的方面中,信号肽可以是核定位信号(“NLS”),如SV40 NLS。
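In some cases, different site-directed modifying polypeptides, e.g., different Cas9 proteins (i.e., Cas9 proteins from various species), may be advantageous to use in the various methods provided by the invention in order to capitalize on the various enzymatic characteristics of the different Cas9 proteins (e.g., for different PAM sequence preferences; for increased or decreased enzymatic activity; for an increased or decreased level of cellular toxicity; to change the balance between NHEJ, homology-directed repair, single-strand breaks, double-strand breaks, etc.). Cas9 proteins from various species (e.g., those disclosed in Shmakov et al., 2017, or polypeptides derived therefrom) may require different PAM sequences in the target DNA. Thus, for a particular Cas9 enzyme of choice, the PAM sequence requirement may differ from the 5'-NGG-3' sequence (where N is A, T, C, or G) known to be required for Cas9 activity. Many Cas9 orthologs from a wide variety of species have been identified, and these proteins share only a few identical amino acids. All identified Cas9 orthologs have the same domain architecture, with a central HNH endonuclease domain and a split RuvC/RNase H domain. Cas9 proteins share 4 key motifs with a conserved architecture; motifs 1, 2, and 4 are RuvC-like motifs, while motif 3 is an HNH motif. In contrast, Cas12a proteins from different species may have PAM sequence requirements that differ from the canonical LbCas12a PAM of TTTV.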
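The two PAM patterns mentioned in this paragraph, NGG for Cas9 and TTTV for LbCas12a, use IUPAC degenerate codes (N is any base; V is A, C, or G). The following is a minimal sketch of locating candidate PAM sites on one strand of a sequence. It scans the given strand only, the example sequence is made up for illustration, and the function names are hypothetical.

```python
import re
from typing import List

# Only the IUPAC codes needed for the two PAMs discussed in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "V": "[ACG]"}

def find_pams(seq: str, pam: str) -> List[int]:
    """Return 0-based start positions of every PAM match on the given strand."""
    pattern = "".join(IUPAC[base] for base in pam.upper())
    # A lookahead lets overlapping matches be reported as well.
    return [m.start() for m in re.finditer("(?=" + pattern + ")", seq.upper())]

example = "GGTTTACCTTTTGG"
print(find_pams(example, "TTTV"))  # candidate Cas12a-style PAM sites
print(find_pams(example, "NGG"))   # candidate Cas9-style PAM sites
```

In the example, TTTT at position 8 is not reported for TTTV because V excludes T, while TTTG starting one base later is.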
定点修饰多肽还可以是嵌合的和修饰的CRISPR/Cas核酸酶。例如,它可以是修饰的Cas9“碱基编辑器”。碱基编辑使得能够以可编程方式将一个靶DNA碱基直接不可逆转变为另一个碱基,而不需要DNA切割或供体DNA分子。例如,Komor等人(2016,Nature[自然],533:420-424)教导了Cas9-胞苷脱氨酶融合,其中也已经将Cas9工程化为无活性的,并且不诱导双链DNA断裂。此外,Gaudelli等人(2017,Nature[自然],doi:10.1038/nature24644)教导了融合至tRNA腺苷脱氨酶的催化受损的Cas9,它可以介导靶DNA序列中A/T至G/C的转变。可以充当本发明的方法和组合物中的定点修饰多肽的另一类工程化的Cas9核酸酶是可以识别广范围的PAM序列(包括NG、GAA和GAT)的变体(Hu等人,2018,Nature[自然],doi:10.1038/nature26155)。
实施例
在一个实施例中,我们提供了融合蛋白,该融合蛋白以N-末端至C-末端的方向包含异源结构域、第一接头序列以及V型CRISPR-Cas酶,其中该第一接头序列包含重复的GGGGS序列。在一方面,异源结构域为脱氨酶、聚合酶、核酸酶、松弛酶、烷基转移酶、甲基转移酶、腺苷脱氨酶、胞苷脱氨酶、氧化酶、胸腺嘧啶烷基转移酶、腺嘌呤氧化酶、腺苷甲基转移酶、糖基化酶或核定位信号。在另一方面,异源结构域为脱氨酶结构域。在又另一方面,脱氨酶结构域为胞苷脱氨酶。在另一方面,胞苷脱氨酶结构域为激活诱导的胞苷脱氨酶(“AID”)。在又另一方面,胞苷脱氨酶结构域为载脂蛋白B mRNA-编辑复合物(“APOBEC”)结构域。在另一方面,APOBEC结构域为APOBEC1家族脱氨酶。在又另一方面,APOBEC结构域包含与SEQ ID NO:1具有至少70%同一性的序列。在另一方面,脱氨酶结构域为腺嘌呤脱氨酶。在又另一方面,腺嘌呤脱氨酶为包含与SEQ ID NO:92具有至少70%同一性的序列的TadA结构域。
在一方面,V型CRISPR-Cas酶为V-A型(“Cas12a”)酶。在另一方面,Cas12a结构域选自由以下组成的组:SEQ ID NO:3、SEQ ID NO:6、SEQ ID NO:22、SEQ ID NO:45、SEQ ID NO:46、SEQ ID NO:47以及SEQ ID NO:48。在又另一方面,Cas12a结构域为无催化活性的,并且选自由以下组成的组:SEQ ID NO:3、SEQ ID NO:6以及SEQ ID NO:22。
在一方面,第一接头序列包含重复至少三次的GGGGS。在另一方面,第一接头序列包含重复至少六次的GGGGS。
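In one aspect, the fusion protein comprises a sequence selected from the group consisting of SEQ ID NO:11, 12, 13, and 44. In another aspect, the fusion protein further comprises a uracil DNA glycosylase inhibitor ("UGI") domain. In yet another aspect, the UGI domain comprises SEQ ID NO:8. In another aspect, the UGI domain is linked to the Cas12a enzyme by a second linker comprising the sequence SGGS. In yet another aspect, the fusion protein comprises a sequence selected from the group consisting of SEQ ID NO:17, SEQ ID NO:24, SEQ ID NO:35, SEQ ID NO:39, SEQ ID NO:43, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:81, SEQ ID NO:83, SEQ ID NO:85, SEQ ID NO:87, and SEQ ID NO:89. In another aspect, when contacted with DNA, the fusion protein produces on-target edits at an increased frequency and off-target edits at a decreased frequency compared with a fusion protein having a first linker sequence other than a repeated GGGGS sequence.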
在另一个实施例中,我们提供了编辑植物基因组DNA的方法,该方法包括将植物基因组DNA与以下接触:(a)以上方面的融合蛋白,并且该融合蛋白任选地包含UGI结构域;以及(b)将步骤(a)的融合蛋白靶向至植物基因组DNA的靶DNA序列的指导RNA(“gRNA”);其中与通过具有除重复的GGGGS序列外的第一接头的融合蛋白编辑的植物基因组DNA相比,所述经编辑的植物基因组DNA包含减少的脱靶编辑。
在另一个实施例中,我们提供了编辑具有减少的脱靶编辑的植物基因组DNA的方法,该方法包括将植物基因组DNA与以下接触:(a)以上方面的融合蛋白,并且该融合蛋白任选地包含UGI结构域;以及(b)将步骤(a)的融合蛋白靶向至植物基因组DNA的靶DNA序列的指导RNA(“gRNA”);其中与通过具有除重复的GGGGS序列外的第一接头的融合蛋白编辑的植物基因组DNA相比,所述经编辑的植物基因组DNA包含减少的脱靶编辑。在一方面,融合蛋白包含SEQ ID NO:24。
在另一个实施例中,我们提供了获得具有减少的脱靶编辑的经编辑的植物群体的方法,该方法包括:(a)获得包含待编辑的基因组DNA的植物细胞的群体;(b)获得编码以上方面的融合蛋白、和任选地UGI结构域的核苷酸序列;(c)用步骤(b)的核苷酸序列转化植物细胞的群体,从而表达通过植物细胞的群体内的核酸序列编码的融合蛋白;(d)使转化的植物细胞的群体生长成植物,其中至少一种植物被编辑;以及(e)从步骤(d)的产物中选择至少一种经编辑的植物,从而获得经编辑的植物群体;其中与通过具有除重复的GGGGS序列外的第一接头的融合蛋白编辑的植物相比,经编辑的植物群体包含减少的脱靶编辑。在一方面,编码融合蛋白的核苷酸序列包含SEQ ID NO:17、SEQ ID NO:24、SEQ ID NO:35、SEQ IDNO:39、SEQ ID NO:43、SEQ ID NO:50、SEQ ID NO:52、SEQ ID NO:54、SEQ ID NO:56、SEQ IDNO:81、SEQ ID NO:83、SEQ ID NO:85、SEQ ID NO:87以及SEQ ID NO:89。在一些实施例中,提供了密码子优化的多核苷酸,这些多核苷酸编码包括通过改进的接头序列连接的一种或多种DNA结合结构域和一种或多种DNA修饰结构域的融合蛋白。
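Examples
The following examples provide a number of illustrative embodiments. In light of the present disclosure and the general level of skill in the art, a person of ordinary skill will appreciate that the following examples are intended to be exemplary only and that numerous changes, modifications, and alterations can be employed without departing from the scope of the subject matter of this disclosure.
Example 1. Vector construction and guide RNA expression for dLbCas12a-BE
We fused a catalytically inactive Lachnospiraceae bacterium Cas12a carrying the D832A/E925A/D1148A mutations (hereinafter "dLbCas12a", formerly known as dLbCpf1), rat cytidine deaminase (APOBEC1), and uracil DNA glycosylase inhibitor (UGI), joined by amino acid linkers into a single protein, to exploit the beneficial properties of base editing in plants. The fusion construct was codon-optimized for maize (Zea mays), synthesized commercially (GenScript, Nanjing, China), and cloned under the sugarcane ubiquitin-4 (SoUbi4) gene promoter to produce dLbCas12a-BE constitutively.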
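In the dLbCas12a-BE of construct 24524, a nuclear localization signal (SV40-NLS) precedes APOBEC1, which is linked to dLbCas12a through an XTEN protein linker, followed by an SV40-NLS linked to UGI through an SGGS linker. An SV40-NLS was also incorporated at the C-terminus of UGI through an SGGS linker to improve targeting of the fusion protein to the nucleus. The synthetic sequence of dLbCas12a-BE made with maize-optimized codons is listed in SEQ ID NO:18.
In the dLbCas12a-BE of construct 24904, the SV40-NLS precedes APOBEC1, which is linked to dLbCas12a through a 30-amino-acid linker with six GGGGS amino acid repeats (referred to as (G4S)x6), GGGGS GGGGS GGGGS GGGGS GGGGS GGGGS (SEQ ID NO:11), followed by an SV40-NLS linked to UGI through an SGGS linker. An SV40-NLS was also incorporated at the C-terminus of UGI through an SGGS linker to improve targeting of the fusion protein to the nucleus. The synthetic sequence of dLbCas12a-BE made with maize-optimized codons is listed in SEQ ID NO:23.
In the dLbCas12a-BE of construct 25057, the SV40-NLS precedes APOBEC1, which is linked to dLbCas12a through an XTEN protein linker, followed by an SV40-NLS linked to UGI through the 18-amino-acid linker GGSTG GGSGG GSGGG SSG (SEQ ID NO:12) (referred to as SX). An SV40-NLS was also incorporated at the C-terminus of UGI through the 15-amino-acid linker GGGGS GGGGS (referred to as (G4S)x3) to improve targeting of dLbCas12a-BE to the nucleus. The synthetic sequence of dLbCas12a-BE made with maize-optimized codons is listed in SEQ ID NO:14.
In the dLbCas12a-BE of construct 25058, the SV40-NLS precedes APOBEC1, which is linked to dLbCas12a through the 30-amino-acid linker ((G4S)x6), followed by an SV40-NLS linked to UGI through the SX linker. An SV40-NLS was also incorporated at the C-terminus of UGI through (G4S)x3 to improve targeting of dLbCas12a-BE to the nucleus. The synthetic sequence of dLbCas12a-BE made with maize-optimized codons is listed in SEQ ID NO:16.
In the dLbCas12a-BE constructs, the CRISPR/Cas12a guide RNA transcript is expressed under the control of the SoUbi4 promoter and targets the exon 4 region of maize Waxy1, following a PAM sequence in exon 4, to change C9, C10, or C22 to T. It also includes the direct repeat of the LbCrRNA as a scaffold. The synthetic sequence of the guide RNA is listed in SEQ ID NO:26.
In construct 24784, a nuclear localization signal (xSV40NLS-06) precedes a cytidine deaminase (xAPOBEC1-01) linked through xXTEN-02 to a maize-optimized Cas9 gene (cCas9BE-02), followed by a nuclear localization signal (xSV40NLS-04) linked through xSGGS linker-02 to the uracil DNA glycosylase inhibitor xUGI-02, which in turn is linked through xSGGS linker-02 to the nuclear localization signal xSV40NLS-07. Expression of the fusion protein is driven by the sugarcane ubiquitin-4 promoter (prSoUbi4-02) and terminated by the NOS terminator (tNOS-05-01). The Cas9 protein is a nickase Cas9 mutant (D10A) fused to rat APOBEC1 and uracil DNA glycosylase inhibitor (UGI). A nuclear localization signal was also incorporated at the C-terminus of Cas9 to improve its targeting to the nucleus. The synthetic sequence of cCas9BE-02 is listed in SEQ ID NO:20.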
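Example 2. Agrobacterium-mediated transformation of maize embryos
To generate potential events edited in maize Wx1, the elite maize transformation variety NP2222 was chosen for all of the experiments described (WO 16106121, incorporated herein by reference).
Maize variety NP2222 was used for maize transformation. Ears were harvested from the GH when the immature embryos were about 1.2 mm, sterilized in 20% Clorox solution for 20 minutes, and rinsed 3 times with sterile water.
Agrobacterium tumefaciens strain LBA4404 17740RecA-, carrying the vector introduced by electroporation, was streaked on YP medium containing the antibiotics Gent (25 μg/ml) and Spec (100 μg/ml) and grown at 28 °C for 2 days. Before transformation, a single colony was picked, streaked onto a fresh YP plate, and grown at 28 °C for 1 day. The Agrobacterium was resuspended in inoculation medium, and the OD660 was adjusted to 0.25.
We removed the endosperm, then isolated and collected the immature embryos with a sterile scalpel and placed them in the Agrobacterium suspension for two to three minutes. The infected immature embryos were transferred to co-cultivation medium and kept at 22 °C for two to four days.
After the co-cultivation stage, the embryos were transferred to medium containing the selection agent and kept at 28 °C in the dark for four weeks. Resistant embryogenic calli were transferred to regeneration medium and cultured at 28 °C under a 16/8 light cycle. After about three weeks, the regenerated plantlets were transferred to growth containers with rooting medium under the same temperature and light conditions.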
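Example 3. Analysis of edited bases in the target region
Using Phire Plant Direct PCR Master Mix (Thermo Fisher, F160L), we amplified a DNA fragment of approximately 410 bp containing the target region directly from maize leaf samples. No DNA purification was needed before PCR. The amplified DNA fragments were subjected to Sanger DNA sequencing to analyze mutations at the target site.
DNA extraction and PCR amplification followed the manufacturer's recommendations. A piece of young leaf (e.g., a punch about 2 mm in diameter) was placed in 30 μL of dilution buffer. The leaf sample was crushed by pressing it briefly against the tube wall with a 100 μL pipette tip, with addition of 20 μL of dilution buffer. After crushing, the solution was green. The plant material was spun down in a centrifuge, and 1 μL of the supernatant was used as the template for a 20 μL PCR reaction.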
The PCR system consisted of the following:
PCR primers for ZmWaxy1:
Forward primer: 5'-AGATGGGAGACGGGTACGAGACGG-3' (SEQ ID NO:29)
Reverse primer: 5'-GTATGGGTTGTTGTTGAGGCTCAGG-3' (SEQ ID NO:30)
DNA sequencing primer: 5'-GACCACCCACTGTTCCTGGAGAGGG-3' (SEQ ID NO:31)
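For a quick sanity check of the three oligonucleotides listed above, the following sketch reports the length and GC fraction of each. The dictionary keys and the helper name are hypothetical; the sequences are copied from the text. No melting temperature is estimated, because that would depend on a model and buffer conditions not given here.

```python
# Primer sequences copied from the text (SEQ ID NO:29, 30, and 31).
PRIMERS = {
    "forward": "AGATGGGAGACGGGTACGAGACGG",
    "reverse": "GTATGGGTTGTTGTTGAGGCTCAGG",
    "sequencing": "GACCACCCACTGTTCCTGGAGAGGG",
}


def gc_fraction(seq):
    """Fraction of bases in `seq` that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC fraction {gc_fraction(seq):.2f}")
```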
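PCR conditions:
98 °C for 5 minutes;
35 cycles of: 98 °C for 5 seconds, then 60 °C for 5 seconds, then 72 °C for 20 seconds;
72 °C for 1 minute; and
hold at 4 °C until ready for analysis.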
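The cycling program above can be written down as data, which makes the total programmed hold time easy to compute. This is a rough sketch under one stated assumption: the 72 °C, 20 second step is treated as the extension step inside each of the 35 cycles. Ramp times between temperatures and the final 4 °C hold are ignored, and all names are hypothetical.

```python
# Each step is (temperature in degrees C, hold time in seconds).
INITIAL_DENATURATION = (98, 5 * 60)
# Assumption: the 72 C / 20 s extension belongs inside the cycle.
CYCLE_STEPS = [(98, 5), (60, 5), (72, 20)]
CYCLES = 35
FINAL_EXTENSION = (72, 60)


def total_hold_seconds():
    """Sum of programmed hold times, excluding ramping and the 4 C hold."""
    per_cycle = sum(seconds for _, seconds in CYCLE_STEPS)
    return INITIAL_DENATURATION[1] + CYCLES * per_cycle + FINAL_EXTENSION[1]


print(f"Programmed hold time: {total_hold_seconds() / 60:.1f} minutes")
```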
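Sequencing:
PCR products were separated by agarose gel electrophoresis and purified before Sanger DNA sequencing with the specific primer. For heterozygous mutations, double peaks were observed at the target nucleotide position, whereas a single peak differing from the control was scored as a homozygous mutation. Transgenic events of constructs 24524, 24904, and 24784 were used to amplify the ZmWaxy1 exon 4 region for sequencing, in order to assess base editing.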
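The scoring rule described above (a double peak indicates a heterozygous mutation; a single peak that differs from the control indicates a homozygous mutation) can be sketched as a small classifier. The input format (a mapping from base to peak height at one position), the function name, and the 25% minor-peak threshold are assumptions made for illustration; the text does not specify how peaks were called.

```python
def classify_position(peaks, control_base, minor_fraction=0.25):
    """Classify one chromatogram position from per-base peak heights.

    peaks: dict such as {"C": 480, "T": 510}; bases that are absent count as zero.
    minor_fraction: hypothetical cutoff for treating a peak as real.
    """
    total = sum(peaks.values())
    if total <= 0:
        raise ValueError("no signal at this position")
    called = sorted(b for b, h in peaks.items() if h / total >= minor_fraction)
    if called == [control_base]:
        return "unedited"
    if len(called) == 1:
        return "homozygous"   # single peak that differs from the control
    return "heterozygous"     # two (or more) peaks at the same position


print(classify_position({"C": 1000}, "C"))
print(classify_position({"C": 480, "T": 510}, "C"))
print(classify_position({"T": 990, "C": 20}, "C"))
```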
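Table 1. CRISPR/Cas cytidine base editor ("CBE") comprising an XTEN linker between the cytidine deaminase APOBEC and nickase Cas9 ("nCas9-CBE")
Edited nucleotides are shaded in gray. As shown above, this version of the Cas12a base editor, which comprises an XTEN linker between the APOBEC domain and the site-directed nuclease, most efficiently edits cytosine to thymine at positions 5 and 6. However, there are instances of guanine-to-adenine edits at positions -2, 7, and 49. Positions are given as the number of nucleotides from the start of the PAM site.
Table 2. CRISPR/Cas cytidine base editor comprising an XTEN linker between the APOBEC deaminase and dLbCas12a
Edited nucleotides are shaded in gray. In this version, the Cas12a base editor comprising an XTEN linker between the APOBEC domain and the inactivated site-directed nuclease edits cytosine to thymine at positions 9, 10, and 22, and guanine to adenine at positions 39, 44, 52, and particularly 53. A guanine edited to adenine indicates that the edit occurred on the complementary strand.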
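The reasoning that a guanine-to-adenine change on the sequenced strand reflects cytosine deamination on the complementary strand can be checked with a short sketch. The function names and the five-base toy sequence are hypothetical; the code simply converts the C that pairs with a chosen G to T on the opposite strand and reads the result back on the original strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def deaminate_complementary_strand(top, top_index):
    """Apply a C-to-T edit to the base that pairs with top[top_index] (a G),
    then return how the edited duplex reads on the top strand."""
    if top[top_index] != "G":
        raise ValueError("the chosen top-strand base must be G")
    bottom = reverse_complement(top)
    j = len(top) - 1 - top_index        # position of the paired C on the bottom strand
    edited_bottom = bottom[:j] + "T" + bottom[j + 1:]
    return reverse_complement(edited_bottom)


top = "ATGCA"  # toy sequence
print(top, "->", deaminate_complementary_strand(top, 2))  # the G at index 2 reads as A
```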
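Table 3. CRISPR/Cas cytidine base editor comprising a long linker between the deaminase and dLbCas12a
Edited nucleotides are shaded in gray. In this version, the Cas12a base editor comprising a long (G4S)6 linker between the APOBEC domain and the inactivated site-directed nuclease edits cytosine to thymine at positions 9 and 10, and guanine to adenine at positions 19 and 53. A guanine edited to adenine indicates that the edit occurred on the complementary strand.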
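Example 4. Measuring editing efficiency.
Table 4. Base editing efficiency of the dLbCas12a-CBE system at maize Waxy1.
Table 4 shows that the base editing efficiency of Cas12a with a long linker is comparable to that of Cas9. Without optimization, Cas12aBE has a poor editing efficiency of approximately 5%, far below that of Cas9 (87%). However, adding a long linker to operably link the deaminase to the catalytically inactive Cas12a increased editing efficiency 12-fold.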
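To make the arithmetic in the comparison above explicit, the sketch below computes an editing efficiency as edited events over total events and a fold change between two efficiencies. It assumes this simple ratio definition, which the text does not spell out, and uses only the rounded figures quoted in the paragraph (approximately 5% and 12-fold); the event counts in the example call are made up.

```python
def editing_efficiency(edited_events, total_events):
    """Percentage of analyzed events that carry the intended edit."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return 100.0 * edited_events / total_events


def fold_change(improved_pct, baseline_pct):
    """Ratio of two efficiencies expressed as percentages."""
    if baseline_pct <= 0:
        raise ValueError("baseline must be positive")
    return improved_pct / baseline_pct


baseline = 5.0             # approximate XTEN-linker Cas12aBE efficiency from the text
projected = baseline * 12  # a 12-fold increase implies roughly this efficiency
print(f"12-fold over {baseline}% is about {projected}%")
print(f"Example: 3 edited of 60 events = {editing_efficiency(3, 60)}%")
```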
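Table 5. Editing efficiency of LbCas12a with a long linker at SBEIIb.
Table 5 shows a direct comparison of the editing efficiency of the LbCas12a base editor when operably linked through an XTEN linker or through a long linker. When the deaminase is operably linked to the site-directed nuclease through a long linker (such as (G4S)6), editing efficiency at a difficult target increased nearly 5-fold.
Table 6. Editing efficiency of LbCas12a with a long linker at Waxy1.
Table 7. Multiplex editing of SBEIIb, Waxy1, and Glossy2 by LbCas12a with a long linker.
Simultaneous multiplex editing ("multiplexing" or "multiplex editing") using several guide RNA molecules within the same construct, together with a long linker between the nuclear localization signal and an active Cas12a, achieved high editing efficiency. Even a challenging target such as SBEIIb reached acceptable editing efficiency as part of a multiplex editing experimental design.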
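Example 5. Improved editing in soybean.
Combining a long linker with Cas12a also greatly improved editing in soybean. Compared with standard Cas12a, long-linker Cas12a improved GmFAD2 editing nearly 7-fold.
Table 8. GmFAD2 editing
Example 6. A long linker improves Mb2Cas12a editing in maize.
A long linker also increased the editing efficiency of additional Cas12 enzymes, such as Mb2Cas12a.
Table 9. Editing by Mb2Cas12a with a long linker.
Without a long linker, Mb2Cas12a did not edit the target sequence. With a long linker, however, editing efficiency improved markedly.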
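Example 7. Other heterologous domains operably linked to Cas12a through a long linker.
Linking heterologous domains other than an APOBEC deaminase to Cas12a through a long linker is also within the scope of the invention. Such heterologous domains include, but are not limited to, a deaminase, polymerase, nuclease, relaxase, alkyltransferase, methyltransferase, adenosine deaminase, cytidine deaminase, oxidase, thymine alkyltransferase, adenine oxidase, adenosine methyltransferase, glycosylase, or nuclear localization signal.
We operably linked an adenine deaminase to Cas12a to create a Cas12a adenine base editor ("Cas12a-ABE"). We fused catalytically inactive LbCas12a (containing the D832A, E925A, and D1148A mutations) to E. coli wild-type adenine deaminase (the engineered "TadA" contains the W23R, H36L, P48A, R51L, L84F, A106V, D108N, H123Y, S146C, D147Y, R152P, E155V, I156F, and K157N amino acid substitutions), operably linked through an amino acid linker. The fusion construct was codon-optimized for maize (Zea mays), synthesized commercially (GenScript, Nanjing, China), and cloned under the sugarcane ubiquitin-4 (SoUbi4) gene promoter to produce dLbCas12a-ABE constitutively.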
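In the dLbCas12a-ABE of construct 25459, a 189 bp potato intron was inserted into the TadA coding sequence, which is preceded by a TadA variant linked through an XTEN protein linker, to produce a TadA dimer. This was fused to dLbCas12a, and an SV40-NLS was incorporated at the C-terminus of dLbCas12a through a GS linker to improve targeting of the fusion protein to the nucleus. The synthetic sequence of dLbCas12a-ABE made with maize-optimized codons is listed in SEQ ID NO:79.
In the dLbCas12a-ABE of construct 25504, a 189 bp potato intron was inserted into the TadA coding sequence, which is preceded by a TadA variant, to produce a TadA dimer. This was linked to dLbCas12a through the 30-amino-acid (G4S)x6 protein linker, and an SV40-NLS was incorporated at the C-terminus of dLbCas12a through a GS linker to improve targeting of the fusion protein to the nucleus. The synthetic sequence of dLbCas12a-ABE made with maize-optimized codons is listed in SEQ ID NO:81.
In the dLbCas12a-ABE constructs, the CRISPR/Cas12a guide RNA transcript is expressed under the control of the SoUbi4 promoter and targets the maize Waxy1 gene. It also includes the direct repeat of the LbCrRNA as a scaffold. The synthetic sequence of the guide RNA is listed in SEQ ID NO:74.
In experiments with construct 25459, in which the adenine deaminase is linked to dLbCas12a through an XTEN linker, no detectable editing was produced in maize plants. In experiments with construct 25504, in which the adenine deaminase is linked to dLbCas12a through the (G4S)x6 long linker, an editing efficiency of 7% was obtained, about half that of the Cas9 ABE control (construct 24785). See Table 10.
Table 10. dLbCas12aABE
We believe this is the first demonstration that a Cas12a ABE is effective in plants. Using a long linker to operably link the adenine deaminase to Cas12a is believed to be the reason for the success of this technique.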
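Example 8. Dual base editors in maize
A dual base editor comprises a cytidine deaminase domain and an adenine deaminase domain fused to a Cas enzyme. Under this concept, targeted saturation mutagenesis of crop genes can be applied to generate genetic variants with improved agronomic performance, for example C:G>T:A and A:T>G:C substitutions in the same target region. We multiplexed four guide RNAs: one targeting the ZmWaxy1 gene and three different guide RNAs targeting the ZmADH gene.
Table 11. Editing frequency of dual CBE-ABE Cas12a in maize.
Table 12. Editing by dual CBE-ABE Cas12a in maize.
In total, the dLbCas12a-based CBE-ABE yielded 1% C-to-T and A-to-G mutations. Adding the intron increases vector stability but may reduce enzyme activity because of inefficient splicing. This is believed to be the first example of dual CBE-ABE editing in plants using Cas12a.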
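Summary table
In the table above, most Cas12aBE constructs follow the pattern heterologous enzyme domain-linker-Cas enzyme. The exceptions to this pattern are: 25702 [TadA dimer-linker-PmCDA-linker-Cas enzyme], 25701 [PmCDA-linker-TadA dimer-linker-Cas enzyme], and 25658 [TadA dimer-linker-Cas enzyme-PmCDA]. Additional nuclear localization sequences, uracil glycosylase inhibitors, and other components may be present but are not shown in this table. Such details appear in the sequences provided in the accompanying sequence listing.
The examples and embodiments provided herein are non-limiting illustrations of the claims and should not be construed as the only available examples. Those skilled in the art can make additional variations.
Sequence Listing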
<110> Syngenta
Li, Jiang
Xu, Jianping
<120> Methods and compositions for DNA base editing
<130> Attorney Docket No. 81945
<160> 92
<170> PatentIn version 3.5
<210> 1
<211> 229
<212> PRT
<213> Rattus norvegicus
<400> 1
Met Ser Ser Glu Thr Gly Pro Val Ala Val Asp Pro Thr Leu Arg Arg
1 5 10 15
Arg Ile Glu Pro His Glu Phe Glu Val Phe Phe Asp Pro Arg Glu Leu
20 25 30
Arg Lys Glu Thr Cys Leu Leu Tyr Glu Ile Asn Trp Gly Gly Arg His
35 40 45
Ser Ile Trp Arg His Thr Ser Gln Asn Thr Asn Lys His Val Glu Val
50 55 60
Asn Phe Ile Glu Lys Phe Thr Thr Glu Arg Tyr Phe Cys Pro Asn Thr
65 70 75 80
Arg Cys Ser Ile Thr Trp Phe Leu Ser Trp Ser Pro Cys Gly Glu Cys
85 90 95
Ser Arg Ala Ile Thr Glu Phe Leu Ser Arg Tyr Pro His Val Thr Leu
100 105 110
Phe Ile Tyr Ile Ala Arg Leu Tyr His His Ala Asp Pro Arg Asn Arg
115 120 125
Gln Gly Leu Arg Asp Leu Ile Ser Ser Gly Val Thr Ile Gln Ile Met
130 135 140
Thr Glu Gln Glu Ser Gly Tyr Cys Trp Arg Asn Phe Val Asn Tyr Ser
145 150 155 160
Pro Ser Asn Glu Ala His Trp Pro Arg Tyr Pro His Leu Trp Val Arg
165 170 175
Leu Tyr Val Leu Glu Leu Tyr Cys Ile Ile Leu Gly Leu Pro Pro Cys
180 185 190
Leu Asn Ile Leu Arg Arg Lys Gln Pro Gln Leu Thr Phe Phe Thr Ile
195 200 205
Ala Leu Gln Ser Cys His Tyr Gln Arg Leu Pro Pro His Ile Leu Trp
210 215 220
Ala Thr Gly Leu Lys
225
<210> 2
<211> 687
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic
<400> 2
atgtccagcg agaccggccc cgtggcggtg gaccccaccc tgcgcaggcg catcgagccg 60
cacgagttcg aggtgttctt cgaccccagg gagctccgca aggagacctg cctcctgtac 120
gagatcaact ggggcggcag gcactccatc tggaggcaca cgagccagaa caccaacaag 180
cacgtcgagg tgaacttcat cgagaagttc accacggaga ggtacttctg cccgaacacg 240
cgctgctcca tcacgtggtt cctctcgtgg agcccatgcg gcgagtgctc cagggcgatc 300
acggagttcc tcagccgcta cccgcacgtg accctgttca tctacatcgc taggctctac 360
caccacgcgg accccaggaa caggcagggc ctcagggacc tgatctccag cggcgtcacg 420
atccagatca tgaccgagca ggagtccggc tactgctgga ggaacttcgt gaactactcc 480
ccgagcaacg aggcccactg gccccgctac ccgcacctct gggtccgcct ctacgtgctc 540
gagctgtact gcatcatcct cggcctgccg ccctgcctca acatcctgag gcgcaagcag 600
ccccagctga cgttcttcac catcgccctg cagagctgcc actaccagag gctcccgccc 660
cacatcctgt gggcgaccgg gctcaag 687
<210> 3
<211> 1251
<212> PRT
<213> Moraxella bovis
<400> 3
Met Leu Phe Gln Asp Phe Thr His Leu Tyr Pro Leu Ser Lys Thr Val
1 5 10 15
Arg Phe Glu Leu Lys Pro Ile Gly Arg Thr Leu Glu His Ile His Ala
20 25 30
Lys Asn Phe Leu Ser Gln Asp Glu Thr Met Ala Asp Met Tyr Gln Lys
35 40 45
Val Lys Val Ile Leu Asp Asp Tyr His Arg Asp Phe Ile Ala Asp Met
50 55 60
Met Gly Glu Val Lys Leu Thr Lys Leu Ala Glu Phe Tyr Asp Val Tyr
65 70 75 80
Leu Lys Phe Arg Lys Asn Pro Lys Asp Asp Gly Leu Gln Lys Gln Leu
85 90 95
Lys Asp Leu Gln Ala Val Leu Arg Lys Glu Ser Val Lys Pro Ile Gly
100 105 110
Ser Gly Gly Lys Tyr Lys Thr Gly Tyr Asp Arg Leu Phe Gly Ala Lys
115 120 125
Leu Phe Lys Asp Gly Lys Glu Leu Gly Asp Leu Ala Lys Phe Val Ile
130 135 140
Ala Gln Glu Gly Glu Ser Ser Pro Lys Leu Ala His Leu Ala His Phe
145 150 155 160
Glu Lys Phe Ser Thr Tyr Phe Thr Gly Phe His Asp Asn Arg Lys Asn
165 170 175
Met Tyr Ser Asp Glu Asp Lys His Thr Ala Ile Ala Tyr Arg Leu Ile
180 185 190
His Glu Asn Leu Pro Arg Phe Ile Asp Asn Leu Gln Ile Leu Thr Thr
195 200 205
Ile Lys Gln Lys His Ser Ala Leu Tyr Asp Gln Ile Ile Asn Glu Leu
210 215 220
Thr Ala Ser Gly Leu Asp Val Ser Leu Ala Ser His Leu Asp Gly Tyr
225 230 235 240
His Lys Leu Leu Thr Gln Glu Gly Ile Thr Ala Tyr Asn Arg Ile Ile
245 250 255
Gly Glu Val Asn Gly Tyr Thr Asn Lys His Asn Gln Ile Cys His Lys
260 265 270
Ser Glu Arg Ile Ala Lys Leu Arg Pro Leu His Lys Gln Ile Leu Ser
275 280 285
Asp Gly Met Gly Val Ser Phe Leu Pro Ser Lys Phe Ala Asp Asp Ser
290 295 300
Glu Met Cys Gln Ala Val Asn Glu Phe Tyr Arg His Tyr Thr Asp Val
305 310 315 320
Phe Ala Lys Val Gln Ser Leu Phe Asp Gly Phe Asp Asp His Gln Lys
325 330 335
Asp Gly Ile Tyr Val Glu His Lys Asn Leu Asn Glu Leu Ser Lys Gln
340 345 350
Ala Phe Gly Asp Phe Ala Leu Leu Gly Arg Val Leu Asp Gly Tyr Tyr
355 360 365
Val Asp Val Val Asn Pro Glu Phe Asn Glu Arg Phe Ala Lys Ala Lys
370 375 380
Thr Asp Asn Ala Lys Ala Lys Leu Thr Lys Glu Lys Asp Lys Phe Ile
385 390 395 400
Lys Gly Val His Ser Leu Ala Ser Leu Glu Gln Ala Ile Glu His His
405 410 415
Thr Ala Arg His Asp Asp Glu Ser Val Gln Ala Gly Lys Leu Gly Gln
420 425 430
Tyr Phe Lys His Gly Leu Ala Gly Val Asp Asn Pro Ile Gln Lys Ile
435 440 445
His Asn Asn His Ser Thr Ile Lys Gly Phe Leu Glu Arg Glu Arg Pro
450 455 460
Ala Gly Glu Arg Ala Leu Pro Lys Ile Lys Ser Gly Lys Asn Pro Glu
465 470 475 480
Met Thr Gln Leu Arg Gln Leu Lys Glu Leu Leu Asp Asn Ala Leu Asn
485 490 495
Val Ala His Phe Ala Lys Leu Leu Thr Thr Lys Thr Thr Leu Asp Asn
500 505 510
Gln Asp Gly Asn Phe Tyr Gly Glu Phe Gly Val Leu Tyr Asp Glu Leu
515 520 525
Ala Lys Ile Pro Thr Leu Tyr Asn Lys Val Arg Asp Tyr Leu Ser Gln
530 535 540
Lys Pro Phe Ser Thr Glu Lys Tyr Lys Leu Asn Phe Gly Asn Pro Thr
545 550 555 560
Leu Leu Asn Gly Trp Asp Leu Asn Lys Glu Lys Asp Asn Phe Gly Val
565 570 575
Ile Leu Gln Lys Asp Gly Cys Tyr Tyr Leu Ala Leu Leu Asp Lys Ala
580 585 590
His Lys Lys Val Phe Asp Asn Ala Pro Asn Thr Gly Lys Asn Val Tyr
595 600 605
Gln Lys Met Val Tyr Lys Leu Leu Pro Gly Pro Asn Lys Met Leu Pro
610 615 620
Lys Val Phe Phe Ala Lys Ser Asn Leu Asp Tyr Tyr Asn Pro Ser Ala
625 630 635 640
Glu Leu Leu Asp Lys Tyr Ala Lys Gly Thr His Lys Lys Gly Asp Asn
645 650 655
Phe Asn Leu Lys Asp Cys His Ala Leu Ile Asp Phe Phe Lys Ala Gly
660 665 670
Ile Asn Lys His Pro Glu Trp Gln His Phe Gly Phe Lys Phe Ser Pro
675 680 685
Thr Ser Ser Tyr Arg Asp Leu Ser Asp Phe Tyr Arg Glu Val Glu Pro
690 695 700
Gln Gly Tyr Gln Val Lys Phe Val Asp Ile Asn Ala Asp Tyr Ile Asp
705 710 715 720
Glu Leu Val Glu Gln Gly Lys Leu Tyr Leu Phe Gln Ile Tyr Asn Lys
725 730 735
Asp Phe Ser Pro Lys Ala His Gly Lys Pro Asn Leu His Thr Leu Tyr
740 745 750
Phe Lys Ala Leu Phe Ser Glu Asp Asn Leu Ala Asp Pro Ile Tyr Lys
755 760 765
Leu Asn Gly Glu Ala Gln Ile Phe Tyr Arg Lys Ala Ser Leu Asp Met
770 775 780
Asn Glu Thr Thr Ile His Arg Ala Gly Glu Val Leu Glu Asn Lys Asn
785 790 795 800
Pro Asp Asn Pro Lys Lys Arg Gln Phe Val Tyr Asp Ile Ile Lys Asp
805 810 815
Lys Arg Tyr Thr Gln Asp Lys Phe Met Leu His Val Pro Ile Thr Met
820 825 830
Asn Phe Gly Val Gln Gly Met Thr Ile Lys Glu Phe Asn Lys Lys Val
835 840 845
Asn Gln Ser Ile Gln Gln Tyr Asp Glu Val Asn Val Ile Gly Ile Asp
850 855 860
Arg Gly Glu Arg His Leu Leu Tyr Leu Thr Val Ile Asn Ser Lys Gly
865 870 875 880
Glu Ile Leu Glu Gln Arg Ser Leu Asn Asp Ile Thr Thr Ala Ser Ala
885 890 895
Asn Gly Thr Gln Val Thr Thr Pro Tyr His Lys Ile Leu Asp Lys Arg
900 905 910
Glu Ile Glu Arg Leu Asn Ala Arg Val Gly Trp Gly Glu Ile Glu Thr
915 920 925
Ile Lys Glu Leu Lys Ser Gly Tyr Leu Ser His Val Val His Gln Ile
930 935 940
Asn Gln Leu Met Leu Lys Tyr Asn Ala Ile Val Val Leu Glu Asp Leu
945 950 955 960
Asn Phe Gly Phe Lys Arg Gly Arg Phe Lys Val Glu Lys Gln Ile Tyr
965 970 975
Gln Asn Phe Glu Asn Ala Leu Ile Lys Lys Leu Asn His Leu Val Leu
980 985 990
Lys Asp Lys Ala Asp Asp Glu Ile Gly Ser Tyr Lys Asn Ala Leu Gln
995 1000 1005
Leu Thr Asn Asn Phe Thr Asp Leu Lys Ser Ile Gly Lys Gln Thr
1010 1015 1020
Gly Phe Leu Phe Tyr Val Pro Ala Trp Asn Thr Ser Lys Ile Asp
1025 1030 1035
Pro Glu Thr Gly Phe Val Asp Leu Leu Lys Pro Arg Tyr Glu Asn
1040 1045 1050
Ile Ala Gln Ser Gln Ala Phe Phe Gly Lys Phe Asp Lys Ile Cys
1055 1060 1065
Tyr Asn Thr Asp Lys Gly Tyr Phe Glu Phe His Ile Asp Tyr Ala
1070 1075 1080
Lys Phe Thr Asp Lys Ala Lys Asn Ser Arg Gln Lys Trp Ala Ile
1085 1090 1095
Cys Ser His Gly Asp Lys Arg Tyr Val Tyr Asp Lys Thr Ala Asn
1100 1105 1110
Gln Asn Lys Gly Ala Ala Lys Gly Ile Asn Val Asn Asp Glu Leu
1115 1120 1125
Lys Ser Leu Phe Ala Arg Tyr His Ile Asn Asp Lys Gln Pro Asn
1130 1135 1140
Leu Val Met Asp Ile Cys Gln Asn Asn Asp Lys Glu Phe His Lys
1145 1150 1155
Ser Leu Met Cys Leu Leu Lys Thr Leu Leu Ala Leu Arg Tyr Ser
1160 1165 1170
Asn Ala Ser Ser Asp Glu Asp Phe Ile Leu Ser Pro Val Ala Asn
1175 1180 1185
Asp Glu Gly Val Phe Phe Asn Ser Ala Leu Ala Asp Asp Thr Gln
1190 1195 1200
Pro Gln Asn Ala Asp Ala Asn Gly Ala Tyr His Ile Ala Leu Lys
1205 1210 1215
Gly Leu Trp Leu Leu Asn Glu Leu Lys Asn Ser Asp Asp Leu Asn
1220 1225 1230
Lys Val Lys Leu Ala Ile Asp Asn Gln Thr Trp Leu Asn Phe Ala
1235 1240 1245
Gln Asn Arg
1250
<210> 4
<211> 3753
<212> DNA
<213> Artificial sequence
<220>
<223> Codon optimized
<400> 4
gctctgtttc aagattttac acatctgtac ccgctgagta aaacagtgcg gttcgagctg 60
aaacccatag gaaggaccct cgagcacatc cacgcgaaga attttctgag ccaggatgaa 120
actatggctg atatgtatca aaaagttaag gtcattttgg acgactatca tcgcgatttt 180
attgccgaca tgatgggaga ggtgaaactc acgaagcttg ctgaatttta cgacgtctat 240
ctgaagttca ggaaaaatcc taaggacgat gggctgcaaa aacagcttaa agaccttcaa 300
gctgtccttc ggaaggaatc ggtgaagcct atagggtcag gtgggaagta caaaacaggc 360
tacgatagac tctttggggc aaaactcttc aaagatggaa aagagttggg tgacctcgca 420
aaattcgtta tagcccaaga aggtgagtct tctccgaagc tggctcatct tgctcatttt 480
gagaagttca gcacgtattt tactggattt cacgataatc ggaagaatat gtactcggat 540
gaagacaagc atactgcaat agcgtacagg ctcatccatg agaatttgcc gagattcatc 600
gacaatctgc aaatcttgac aacaatcaaa caaaagcata gcgccctcta tgatcagata 660
atcaacgagc tcacggcctc cgggctcgac gtctccttgg cttctcatct tgacgggtat 720
cacaagctcc ttacacaaga ggggatcacg gcatacaaca ggatcatagg agaggtgaat 780
ggatatacaa ataagcataa ccagatatgc cacaagagcg agcgcatagc gaaacttaga 840
cccttgcaca agcaaatcct ttctgacgga atgggagtgt cattccttcc gtctaagttc 900
gcggatgata gtgagatgtg ccaagcggtc aacgaatttt atcgccatta tactgacgtg 960
ttcgcaaagg tgcaaagtct ctttgacgga tttgatgatc accagaaaga cgggatctat 1020
gttgaacaca aaaaccttaa tgaactgagc aaacaggcgt tcggcgactt tgctttgctg 1080
gggagggtcc ttgatggata ctacgtggac gttgtcaatc cggagttcaa tgagcggttc 1140
gcaaaggcca agactgacaa tgcgaaagcc aagcttacaa aagaaaagga caaattcatt 1200
aaaggagtcc actcactggc ttccctcgaa caagcaatag aacaccatac agctagacac 1260
gacgatgaga gtgttcaagc cggaaaactt ggccagtact tcaaacacgg tttggcgggg 1320
gttgacaacc cgattcagaa aattcacaat aaccattcga cgattaaagg gtttctggaa 1380
agggaaaggc ctgctgggga acgggcgctc ccgaagatca agtcaggaaa aaacccagaa 1440
atgacacagc tcaggcagct gaaggaactt ttggacaacg cattgaatgt ggcgcacttc 1500
gctaagctgc tgacaactaa aacaaccttg gacaaccagg atggaaattt ttacggggag 1560
tttggggtgc tttacgacga gctggctaaa attccaactc tctacaataa ggttagagat 1620
tatctctctc aaaagccctt ttctaccgaa aagtataagc tcaacttcgg caatccgacc 1680
cttctcaatg ggtgggacct gaacaaagag aaagataact ttggggttat acttcagaag 1740
gatggatgct attacttggc gcttcttgat aaggctcata aaaaagtttt cgacaacgcc 1800
cctaacactg gtaagaacgt ctaccaaaag atggtctaca aactgttgcc cggccccaac 1860
aaaatgcttc ctaaagtgtt tttcgcaaaa tcgaatctcg actattataa tccatctgcc 1920
gagctccttg acaaatatgc taaggggacc cataaaaagg gtgataattt caacctgaag 1980
gactgccacg cgcttatcga ctttttcaaa gccgggataa ataagcatcc ggagtggcaa 2040
cattttggtt ttaaattttc gccaacgtcg tcctatcgcg acctttccga tttctatagg 2100
gaagttgaac ctcaggggta ccaggtcaaa tttgttgaca ttaatgcgga ctacattgat 2160
gaattggtgg agcaagggaa gctctacctc tttcaaatat ataacaaaga tttctcgcca 2220
aaagcgcatg gtaaaccgaa tcttcatacc ttgtacttta aagcactttt ttcagaagat 2280
aacttggcgg acccgatcta caagctgaat ggggaagctc agatcttcta caggaaagct 2340
tcgttggaca tgaacgagac taccatacat cgcgcgggag aggtgcttga gaacaaaaat 2400
cccgacaacc cgaaaaagcg gcaattcgtt tacgacatca tcaaagacaa acggtacacg 2460
caggacaaat ttatgctcca cgtccccatt accatgaatt ttggagtcca aggcatgacc 2520
attaaggaat tcaacaaaaa ggtcaaccaa agtattcagc aatacgatga agtcaatgtc 2580
ataggcatag atcggggaga aaggcatctg ttgtatctta ccgtgattaa ctctaagggt 2640
gaaatactgg agcaacggtc acttaacgat ataaccacgg cgtccgcgaa cggtacacaa 2700
gtgaccactc cctaccacaa aatattggat aaaagggaga tagaacgctt gaatgcccgc 2760
gttggctggg gtgagattga gaccatcaaa gagcttaaat cgggatattt gtctcacgtc 2820
gttcatcaaa ttaaccaact catgcttaag tacaatgcaa tcgttgtgct cgaggacctg 2880
aactttggtt tcaaaagagg gaggttcaag gtggaaaaac aaatttacca gaactttgaa 2940
aacgcgctta tcaagaaatt gaatcacctt gttttgaaag ataaggcaga tgacgaaatc 3000
gggtcgtata aaaatgcact ccagttgaca aataatttca cggatttgaa gtcgatcggc 3060
aagcaaacag ggttcctctt ttatgtgcca gcgtggaata catcaaaaat tgatccggag 3120
acgggatttg tcgacttgct gaagcctagg tatgagaaca ttgcccaatc tcaggccttt 3180
ttcggcaaat tcgataaaat atgctacaac acagacaaag gttattttga atttcacatt 3240
gattacgcca aatttacaga taaggcgaaa aacagcagac agaaatgggc tatctgttct 3300
catggggaca aacgctatgt ctacgataag acggctaatc aaaataaagg cgccgcaaaa 3360
ggtattaatg tgaatgatga gctgaaaagc ttgtttgccc gctaccatat caatgataaa 3420
caaccaaact tggtgatgga catatgccag aacaatgaca aagaattcca caagtcactc 3480
atgtgcctgc ttaaaaccct tttggcgctg cggtatagca atgcatctag cgatgaagac 3540
tttattttga gtcccgtggc caacgacgag ggcgtgtttt ttaattcagc cttggcggac 3600
gatacgcagc cccagaatgc ggacgcaaac ggcgcgtacc acattgcact gaagggactg 3660
tggcttctga acgagctgaa aaatagcgac gacctgaata aagtcaagtt ggccattgac 3720
aatcaaacct ggttgaattt cgctcaaaat aga 3753
<210> 5
<211> 4367
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized fusion protein
<400> 5
atgtccagcg agaccggccc cgtggcggtg gaccccaccc tgcgcaggcg catcgagccg 60
cacgagttcg aggtgttctt cgaccccagg gagctccgca aggagacctg cctcctgtac 120
gagatcaact ggggcggcag gcactccatc tggaggcaca cgagccagaa caccaacaag 180
cacgtcgagg tgaacttcat cgagaagttc accacggaga ggtacttctg cccgaacacg 240
cgctgctcca tcacgtggtt cctctcgtgg agcccatgcg gcgagtgctc cagggcgatc 300
acggagttcc tcagccgcta cccgcacgtg accctgttca tctacatcgc taggctctac 360
caccacgcgg accccaggaa caggcagggc ctcagggacc tgatctccag cggcgtcacg 420
atccagatca tgaccgagca ggagtccggc tactgctgga ggaacttcgt gaactactcc 480
ccgagcaacg aggcccactg gccccgctac ccgcacctct gggtccgcct ctacgtgctc 540
gagctgtact gcatcatcct cggcctgccg ccctgcctca acatcctgag gcgcaagcag 600
ccccagctga cgttcttcac catcgccctg cagagctgcc actaccagag gctcccgccc 660
cacatcctgt gggcgaccgg gctcaagggg ggcgggggct caggcggggg cgggagcggc 720
ggcgggggct ctgggggcgg cggcagcggc gggggcggca gcgggggcgg cgggtcgatg 780
agcaagctgg agaagttcac gaactgctac tccctcagca agaccctgag gttcaaggcg 840
atcccggtcg gcaagaccca ggagaacatc gacaacaagc ggctgctggt ggaggacgag 900
aagagggctg aggactacaa gggcgtgaag aagctcctgg accgctacta cctgtccttc 960
atcaacgacg tgctccacag catcaagctc aagaacctga acaactacat cagcctcttc 1020
aggaagaaga cgcgcaccga gaaggagaac aaggagctcg agaacctgga gatcaacctg 1080
aggaaggaga tcgccaaggc gttcaagggc aacgagggct acaagtccct cttcaagaag 1140
gacatcatcg agacgatcct cccggagttc ctggacgaca aggacgagat cgccctggtc 1200
aactccttca acggcttcac cacggcgttc accggcttct tcgacaaccg cgagaacatg 1260
ttcagcgagg aggccaagtc cacgagcatc gcgttcaggt gcatcaacga gaacctcacc 1320
cgctacatct ccaacatgga catcttcgag aaggtcgacg cgatcttcga caagcacgag 1380
gtgcaggaga tcaaggagaa gatcctgaac agcgactacg acgtcgagga cttcttcgag 1440
ggcgagttct tcaacttcgt cctcacgcag gagggcatcg acgtgtacaa cgccatcatc 1500
ggtggcttcg tgaccgagtc cggcgagaag atcaagggcc tgaacgagta catcaacctc 1560
tacaaccaga agaccaagca gaagctgccg aagttcaagc ccctgtacaa gcaggtgctc 1620
tccgacaggg agtccctcag cttctacggc gagggctaca cgagcgacga ggaggtcctg 1680
gaggtgttcc gcaacaccct caacaagaac agcgagatct tctccagcat caagaagctc 1740
gagaagctgt tcaagaactt cgacgagtac tccagcgccg gcatcttcgt caagaacggc 1800
ccggcgatct ccacgatcag caaggacatc ttcggcgagt ggaacgtgat ccgcgacaag 1860
tggaacgccg agtacgacga catccacctc aagaagaagg cggtggtcac cgagaagtac 1920
gaggacgaca ggcgcaagtc cttcaagaag atcggctcct tcagcctcga gcagctgcag 1980
gagtacgccg acgcggacct gagcgtggtc gagaagctca aggagatcat catccagaag 2040
gtcgacgaga tctacaaggt gtacggctcc agcgagaagc tcttcgacgc ggacttcgtc 2100
ctcgagaagt ccctgaagaa gaacgacgcc gtggtcgcga tcatgaagga cctcctggac 2160
tccgtgaaga gcttcgagaa ttacatcaag gccttcttcg gcgagggcaa ggagacgaac 2220
agggacgagt ccttctacgg cgacttcgtc ctggcctacg acatcctcct gaaggtggac 2280
cacatctacg acgcgatccg caactacgtg acccagaagc cgtacagcaa ggacaagttc 2340
aagctctact tccagaaccc ccagttcatg ggcggctggg acaaggacaa ggagacggac 2400
tacagggcga ccatcctgcg ctacggcagc aagtactacc tcgccatcat ggacaagaag 2460
tacgcgaagt gcctgcagaa gatcgacaag gacgacgtca acggcaacta cgagaagatc 2520
aactacaagc tcctgccggg ccccaacaag atgctcccga aggtgttctt ctccaagaag 2580
tggatggcct actacaaccc cagcgaggac atccagaaga tctacaagaa cggcacgttc 2640
aagaagggcg acatgttcaa cctgaacgac tgccacaagc tcatcgactt cttcaaggac 2700
tccatcagcc gctacccgaa gtggtccaac gcctacgact tcaacttcag cgagaccgag 2760
aagtacaagg acatcgcggg cttctaccgc gaggtcgagg agcagggcta caaggtgtcc 2820
ttcgagtccg ccagcaagaa ggaggtcgac aagctggtgg aggagggcaa gctctacatg 2880
ttccagatct acaacaagga cttctccgac aagagccacg gcacgcccaa cctgcacacc 2940
atgtacttca agctcctgtt cgacgagaac aaccacggcc agatcaggct gtccggcggc 3000
gccgagctct tcatgaggag ggcgagcctg aagaaggagg agctggtggt ccaccccgct 3060
aacagcccaa tcgcgaacaa gaacccggac aaccccaaga agaccacgac cctgtcctac 3120
gacgtgtaca aggacaagag gttcagcgag gaccagtacg agctccacat cccgatcgcg 3180
atcaacaagt gccccaagaa catcttcaag atcaacaccg aggtccgcgt gctcctgaag 3240
cacgacgaca acccctacgt gatcggcatc gctaggggcg agaggaacct cctgtacatc 3300
gtggtcgtgg acggcaaggg caacatcgtg gagcagtact ccctcaacga gatcatcaac 3360
aacttcaacg gcatcaggat caagacggac taccacagcc tcctggacaa gaaggagaag 3420
gagaggttcg aggcccgcca gaactggacc tccatcgaga acatcaagga gctgaaggcg 3480
ggctacatca gccaggtcgt gcacaagatc tgcgagctcg tcgagaagta cgacgccgtg 3540
atcgccctcg cggacctgaa ctccggcttc aagaacagcc gcgtcaaggt ggagaagcag 3600
gtctaccaga agttcgagaa gatgctcatc gacaagctga actacatggt ggacaagaag 3660
tccaacccct gcgctacggg cggcgcgctg aagggctacc agatcaccaa caagttcgag 3720
agcttcaagt ccatgagcac tcagaacggc ttcatcttct acatcccggc gtggctcacg 3780
tccaagatcg accccagcac cggcttcgtc aacctcctga agacgaagta cacctccatc 3840
gccgacagca agaagttcat ctccagcttc gaccgcatca tgtatgtgcc ggaggaggac 3900
ctgttcgagt tcgccctcga ctacaagaac ttctcccgca cggacgcgga ctacatcaag 3960
aagtggaagc tgtacagcta cggcaaccgc atccgcatct tcaggaaccc caagaagaac 4020
aacgtcttcg actgggagga ggtgtgcctg acctccgcgt acaaggagct cttcaacaag 4080
tacggcatca actaccagca gggcgacatc agggctctcc tgtgcgagca gagcgacaag 4140
gccttctact ccagcttcat ggcgctgatg tccctcatgc tgcagatgag gaactcgatc 4200
accggcagga cggacgtggc cttcctcatc tccccggtga agaacagcga cggcatcttc 4260
tacgactcca ggaactacga ggcccaggag aacgcgatcc tcccaaagaa cgcggacgcc 4320
aacggcgcct acaacatcgc caggaaggtc ctctgggcta tcggcca 4367
<210> 6
<211> 1455
<212> PRT
<213> Artificial sequence
<220>
<223> Fusion protein
<400> 6
Met Ser Ser Glu Thr Gly Pro Val Ala Val Asp Pro Thr Leu Arg Arg
1 5 10 15
Arg Ile Glu Pro His Glu Phe Glu Val Phe Phe Asp Pro Arg Glu Leu
20 25 30
Arg Lys Glu Thr Cys Leu Leu Tyr Glu Ile Asn Trp Gly Gly Arg His
35 40 45
Ser Ile Trp Arg His Thr Ser Gln Asn Thr Asn Lys His Val Glu Val
50 55 60
Asn Phe Ile Glu Lys Phe Thr Thr Glu Arg Tyr Phe Cys Pro Asn Thr
65 70 75 80
Arg Cys Ser Ile Thr Trp Phe Leu Ser Trp Ser Pro Cys Gly Glu Cys
85 90 95
Ser Arg Ala Ile Thr Glu Phe Leu Ser Arg Tyr Pro His Val Thr Leu
100 105 110
Phe Ile Tyr Ile Ala Arg Leu Tyr His His Ala Asp Pro Arg Asn Arg
115 120 125
Gln Gly Leu Arg Asp Leu Ile Ser Ser Gly Val Thr Ile Gln Ile Met
130 135 140
Thr Glu Gln Glu Ser Gly Tyr Cys Trp Arg Asn Phe Val Asn Tyr Ser
145 150 155 160
Pro Ser Asn Glu Ala His Trp Pro Arg Tyr Pro His Leu Trp Val Arg
165 170 175
Leu Tyr Val Leu Glu Leu Tyr Cys Ile Ile Leu Gly Leu Pro Pro Cys
180 185 190
Leu Asn Ile Leu Arg Arg Lys Gln Pro Gln Leu Thr Phe Phe Thr Ile
195 200 205
Ala Leu Gln Ser Cys His Tyr Gln Arg Leu Pro Pro His Ile Leu Trp
210 215 220
Ala Thr Gly Leu Lys Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly
225 230 235 240
Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly
245 250 255
Gly Gly Ser Met Ser Lys Leu Glu Lys Phe Thr Asn Cys Tyr Ser Leu
260 265 270
Ser Lys Thr Leu Arg Phe Lys Ala Ile Pro Val Gly Lys Thr Gln Glu
275 280 285
Asn Ile Asp Asn Lys Arg Leu Leu Val Glu Asp Glu Lys Arg Ala Glu
290 295 300
Asp Tyr Lys Gly Val Lys Lys Leu Leu Asp Arg Tyr Tyr Leu Ser Phe
305 310 315 320
Ile Asn Asp Val Leu His Ser Ile Lys Leu Lys Asn Leu Asn Asn Tyr
325 330 335
Ile Ser Leu Phe Arg Lys Lys Thr Arg Thr Glu Lys Glu Asn Lys Glu
340 345 350
Leu Glu Asn Leu Glu Ile Asn Leu Arg Lys Glu Ile Ala Lys Ala Phe
355 360 365
Lys Gly Asn Glu Gly Tyr Lys Ser Leu Phe Lys Lys Asp Ile Ile Glu
370 375 380
Thr Ile Leu Pro Glu Phe Leu Asp Asp Lys Asp Glu Ile Ala Leu Val
385 390 395 400
Asn Ser Phe Asn Gly Phe Thr Thr Ala Phe Thr Gly Phe Phe Asp Asn
405 410 415
Arg Glu Asn Met Phe Ser Glu Glu Ala Lys Ser Thr Ser Ile Ala Phe
420 425 430
Arg Cys Ile Asn Glu Asn Leu Thr Arg Tyr Ile Ser Asn Met Asp Ile
435 440 445
Phe Glu Lys Val Asp Ala Ile Phe Asp Lys His Glu Val Gln Glu Ile
450 455 460
Lys Glu Lys Ile Leu Asn Ser Asp Tyr Asp Val Glu Asp Phe Phe Glu
465 470 475 480
Gly Glu Phe Phe Asn Phe Val Leu Thr Gln Glu Gly Ile Asp Val Tyr
485 490 495
Asn Ala Ile Ile Gly Gly Phe Val Thr Glu Ser Gly Glu Lys Ile Lys
500 505 510
Gly Leu Asn Glu Tyr Ile Asn Leu Tyr Asn Gln Lys Thr Lys Gln Lys
515 520 525
Leu Pro Lys Phe Lys Pro Leu Tyr Lys Gln Val Leu Ser Asp Arg Glu
530 535 540
Ser Leu Ser Phe Tyr Gly Glu Gly Tyr Thr Ser Asp Glu Glu Val Leu
545 550 555 560
Glu Val Phe Arg Asn Thr Leu Asn Lys Asn Ser Glu Ile Phe Ser Ser
565 570 575
Ile Lys Lys Leu Glu Lys Leu Phe Lys Asn Phe Asp Glu Tyr Ser Ser
580 585 590
Ala Gly Ile Phe Val Lys Asn Gly Pro Ala Ile Ser Thr Ile Ser Lys
595 600 605
Asp Ile Phe Gly Glu Trp Asn Val Ile Arg Asp Lys Trp Asn Ala Glu
610 615 620
Tyr Asp Asp Ile His Leu Lys Lys Lys Ala Val Val Thr Glu Lys Tyr
625 630 635 640
Glu Asp Asp Arg Arg Lys Ser Phe Lys Lys Ile Gly Ser Phe Ser Leu
645 650 655
Glu Gln Leu Gln Glu Tyr Ala Asp Ala Asp Leu Ser Val Val Glu Lys
660 665 670
Leu Lys Glu Ile Ile Ile Gln Lys Val Asp Glu Ile Tyr Lys Val Tyr
675 680 685
Gly Ser Ser Glu Lys Leu Phe Asp Ala Asp Phe Val Leu Glu Lys Ser
690 695 700
Leu Lys Lys Asn Asp Ala Val Val Ala Ile Met Lys Asp Leu Leu Asp
705 710 715 720
Ser Val Lys Ser Phe Glu Asn Tyr Ile Lys Ala Phe Phe Gly Glu Gly
725 730 735
Lys Glu Thr Asn Arg Asp Glu Ser Phe Tyr Gly Asp Phe Val Leu Ala
740 745 750
Tyr Asp Ile Leu Leu Lys Val Asp His Ile Tyr Asp Ala Ile Arg Asn
755 760 765
Tyr Val Thr Gln Lys Pro Tyr Ser Lys Asp Lys Phe Lys Leu Tyr Phe
770 775 780
Gln Asn Pro Gln Phe Met Gly Gly Trp Asp Lys Asp Lys Glu Thr Asp
785 790 795 800
Tyr Arg Ala Thr Ile Leu Arg Tyr Gly Ser Lys Tyr Tyr Leu Ala Ile
805 810 815
Met Asp Lys Lys Tyr Ala Lys Cys Leu Gln Lys Ile Asp Lys Asp Asp
820 825 830
Val Asn Gly Asn Tyr Glu Lys Ile Asn Tyr Lys Leu Leu Pro Gly Pro
835 840 845
Asn Lys Met Leu Pro Lys Val Phe Phe Ser Lys Lys Trp Met Ala Tyr
850 855 860
Tyr Asn Pro Ser Glu Asp Ile Gln Lys Ile Tyr Lys Asn Gly Thr Phe
865 870 875 880
Lys Lys Gly Asp Met Phe Asn Leu Asn Asp Cys His Lys Leu Ile Asp
885 890 895
Phe Phe Lys Asp Ser Ile Ser Arg Tyr Pro Lys Trp Ser Asn Ala Tyr
900 905 910
Asp Phe Asn Phe Ser Glu Thr Glu Lys Tyr Lys Asp Ile Ala Gly Phe
915 920 925
Tyr Arg Glu Val Glu Glu Gln Gly Tyr Lys Val Ser Phe Glu Ser Ala
930 935 940
Ser Lys Lys Glu Val Asp Lys Leu Val Glu Glu Gly Lys Leu Tyr Met
945 950 955 960
Phe Gln Ile Tyr Asn Lys Asp Phe Ser Asp Lys Ser His Gly Thr Pro
965 970 975
Asn Leu His Thr Met Tyr Phe Lys Leu Leu Phe Asp Glu Asn Asn His
980 985 990
Gly Gln Ile Arg Leu Ser Gly Gly Ala Glu Leu Phe Met Arg Arg Ala
995 1000 1005
Ser Leu Lys Lys Glu Glu Leu Val Val His Pro Ala Asn Ser Pro
1010 1015 1020
Ile Ala Asn Lys Asn Pro Asp Asn Pro Lys Lys Thr Thr Thr Leu
1025 1030 1035
Ser Tyr Asp Val Tyr Lys Asp Lys Arg Phe Ser Glu Asp Gln Tyr
1040 1045 1050
Glu Leu His Ile Pro Ile Ala Ile Asn Lys Cys Pro Lys Asn Ile
1055 1060 1065
Phe Lys Ile Asn Thr Glu Val Arg Val Leu Leu Lys His Asp Asp
1070 1075 1080
Asn Pro Tyr Val Ile Gly Ile Ala Arg Gly Glu Arg Asn Leu Leu
1085 1090 1095
Tyr Ile Val Val Val Asp Gly Lys Gly Asn Ile Val Glu Gln Tyr
1100 1105 1110
Ser Leu Asn Glu Ile Ile Asn Asn Phe Asn Gly Ile Arg Ile Lys
1115 1120 1125
Thr Asp Tyr His Ser Leu Leu Asp Lys Lys Glu Lys Glu Arg Phe
1130 1135 1140
Glu Ala Arg Gln Asn Trp Thr Ser Ile Glu Asn Ile Lys Glu Leu
1145 1150 1155
Lys Ala Gly Tyr Ile Ser Gln Val Val His Lys Ile Cys Glu Leu
1160 1165 1170
Val Glu Lys Tyr Asp Ala Val Ile Ala Leu Ala Asp Leu Asn Ser
1175 1180 1185
Gly Phe Lys Asn Ser Arg Val Lys Val Glu Lys Gln Val Tyr Gln
1190 1195 1200
Lys Phe Glu Lys Met Leu Ile Asp Lys Leu Asn Tyr Met Val Asp
1205 1210 1215
Lys Lys Ser Asn Pro Cys Ala Thr Gly Gly Ala Leu Lys Gly Tyr
1220 1225 1230
Gln Ile Thr Asn Lys Phe Glu Ser Phe Lys Ser Met Ser Thr Gln
1235 1240 1245
Asn Gly Phe Ile Phe Tyr Ile Pro Ala Trp Leu Thr Ser Lys Ile
1250 1255 1260
Asp Pro Ser Thr Gly Phe Val Asn Leu Leu Lys Thr Lys Tyr Thr
1265 1270 1275
Ser Ile Ala Asp Ser Lys Lys Phe Ile Ser Ser Phe Asp Arg Ile
1280 1285 1290
Met Tyr Val Pro Glu Glu Asp Leu Phe Glu Phe Ala Leu Asp Tyr
1295 1300 1305
Lys Asn Phe Ser Arg Thr Asp Ala Asp Tyr Ile Lys Lys Trp Lys
1310 1315 1320
Leu Tyr Ser Tyr Gly Asn Arg Ile Arg Ile Phe Arg Asn Pro Lys
1325 1330 1335
Lys Asn Asn Val Phe Asp Trp Glu Glu Val Cys Leu Thr Ser Ala
1340 1345 1350
Tyr Lys Glu Leu Phe Asn Lys Tyr Gly Ile Asn Tyr Gln Gln Gly
1355 1360 1365
Asp Ile Arg Ala Leu Leu Cys Glu Gln Ser Asp Lys Ala Phe Tyr
1370 1375 1380
Ser Ser Phe Met Ala Leu Met Ser Leu Met Leu Gln Met Arg Asn
1385 1390 1395
Ser Ile Thr Gly Arg Thr Asp Val Ala Phe Leu Ile Ser Pro Val
1400 1405 1410
Lys Asn Ser Asp Gly Ile Phe Tyr Asp Ser Arg Asn Tyr Glu Ala
1415 1420 1425
Gln Glu Asn Ala Ile Leu Pro Lys Asn Ala Asp Ala Asn Gly Ala
1430 1435 1440
Tyr Asn Ile Ala Arg Lys Val Leu Trp Ala Ile Gly
1445 1450 1455
<210> 7
<211> 249
<212> DNA
<213> Artificial Sequence
<220>
<223> Codon optimized
<400> 7
acgaacctgt ccgacatcat cgagaaggag accggcaagc agctcgtgat ccaggagagc 60
atcctcatgc tgccggagga ggtcgaggag gtcatcggca acaagcccga gtccgacatc 120
ctcgtccaca cggcctacga cgagtccacc gacgagaacg tgatgctcct gacctcggac 180
gctcccgagt acaagccatg ggccctggtc atccaggaca gcaacggcga gaacaagatc 240
aagatgctc 249
<210> 8
<211> 83
<212> PRT
<213> Bacillus subtilis phage PBSX
<400> 8
Thr Asn Leu Ser Asp Ile Ile Glu Lys Glu Thr Gly Lys Gln Leu Val
1 5 10 15
Ile Gln Glu Ser Ile Leu Met Leu Pro Glu Glu Val Glu Glu Val Ile
20 25 30
Gly Asn Lys Pro Glu Ser Asp Ile Leu Val His Thr Ala Tyr Asp Glu
35 40 45
Ser Thr Asp Glu Asn Val Met Leu Leu Thr Ser Asp Ala Pro Glu Tyr
50 55 60
Lys Pro Trp Ala Leu Val Ile Gln Asp Ser Asn Gly Glu Asn Lys Ile
65 70 75 80
Lys Met Leu
<210> 9
<211> 6882
<212> DNA
<213> Artificial Sequence
<220>
<223> Codon-optimized fusion
<400> 9
gaattcatta tgtggtctag gtaggttcta tatataagaa aacttgaaat gttctaaaaa 60
aaaattcaag cccatgcatg attgaagcaa acggtatagc aacggtgtta acctgatcta 120
gtgatctctt gcaatcctta acggccacct accgcaggta gcaaacggcg tccccctcct 180
cgatatctcc gcggcgacct ctggcttttt ccgcggaatt gcgcggtggg gacggattcc 240
acgagaccgc gacgcaaccg cctctcgccg ctgggcccca caccgctcgg tgccgtagcc 300
tcacgggact ctttctccct cctcccccgt tataaattgg cttcatcccc tccttgcctc 360
atccatccaa atcccagtcc ccaatcccat cccttcgtag gagaaattca tcgaagctaa 420
gcgaatcctc gcgatcctct caaggtactg cgagttttcg atccccctct cgacccctcg 480
tatgtttgtg tttgtcgtag cgtttgatta ggtatgcttt ccctgtttgt gttcgtcgta 540
gcgtttgatt aggtatgctt tccctgttcg tgttcatcgt agtgtttgat taggtcgtgt 600
gaggcgatgg cctgctcgcg tccttcgatc tgtagtcgat ttgcgggtcg tggtgtagat 660
ctgcgggctg tgatgaagtt atttggtgtg atctgctcgc ctgattctgc gggttggctc 720
gagtagatat gatggttgga ccggttggtt cgtttaccgc gctagggttg ggctgggatg 780
atgttgcatg cgccgttgcg cgtgatcccg cagcaggact tgcgtttgat tgccagatct 840
cgttacgatt atgtgatttg gtttggactt tttagatctg tagcttctgc ttatgtgcca 900
gatgcgccta ctgctcatat gcctgatgat aatcataaat ggctgtggaa ctaactagtt 960
gattgcggag tcatgtatca gctacaggtg tagggactag ctacaggtgt agggacttgc 1020
gtctaattgt ttggtccttt actcatgttg caattatgca atttagttta gattgtttgt 1080
tccactcatc taggctgtaa aagggacact gcttagattg ctgtttaatc tttttagtag 1140
attatattat attggtaact tattacccct attacatgcc atacgtgact tctgctcatg 1200
cctgatgata atcatagatc actgtggaat taattagttg attgttgaat catgtttcat 1260
gtacatacca cggcacaatt gcttagttcc ttaacaaatg caaattttac tgatccatgt 1320
atgatttgcg tggttctcta atgtgaaata ctatagctac ttgttagtaa gaatcaggtt 1380
cgtatgctta atgctgtatg tgccttctgc tcatgcctga tgataatcat atatcactgg 1440
aattaattag ttgatcgttt aatcatatat caagtacata ccatgccaca atttttagtc 1500
acttaaccca tgcagattga actggtccct gcatgttttg ctaaattgtt ctattctgat 1560
tagaccatat atcatgtatt tttttttggt aatggttctc ttattttaaa tgctatatag 1620
ttctggtact tgttagaaag atctgcttca tagtttagtt gcctatccct cgaattagga 1680
tgctgagcag ctgatcctat agctttgttt catgtatcaa ttcttttgtg ttcaacagtc 1740
agtttttgtt agattcattg taacttatgg tcgcttactc ttctggtcct caatgcttgc 1800
agggatcccc taaatagacc atgccgaaga agaagcgcaa ggtcatgtcc agcgagaccg 1860
gccccgtggc ggtggacccc accctgcgca ggcgcatcga gccgcacgag ttcgaggtgt 1920
tcttcgaccc cagggagctc cgcaaggaga cctgcctcct gtacgagatc aactggggcg 1980
gcaggcactc catctggagg cacacgagcc agaacaccaa caagcacgtc gaggtgaact 2040
tcatcgagaa gttcaccacg gagaggtact tctgcccgaa cacgcgctgc tccatcacgt 2100
ggttcctctc gtggagccca tgcggcgagt gctccagggc gatcacggag ttcctcagcc 2160
gctacccgca cgtgaccctg ttcatctaca tcgctaggct ctaccaccac gcggacccca 2220
ggaacaggca gggcctcagg gacctgatct ccagcggcgt cacgatccag atcatgaccg 2280
agcaggagtc cggctactgc tggaggaact tcgtgaacta ctccccgagc aacgaggccc 2340
actggccccg ctacccgcac ctctgggtcc gcctctacgt gctcgagctg tactgcatca 2400
tcctcggcct gccgccctgc ctcaacatcc tgaggcgcaa gcagccccag ctgacgttct 2460
tcaccatcgc cctgcagagc tgccactacc agaggctccc gccccacatc ctgtgggcga 2520
ccgggctcaa ggggggcggg ggctcaggcg ggggcgggag cggcggcggg ggctctgggg 2580
gcggcggcag cggcgggggc ggcagcgggg gcggcgggtc gatgagcaag ctggagaagt 2640
tcacgaactg ctactccctc agcaagaccc tgaggttcaa ggcgatcccg gtcggcaaga 2700
cccaggagaa catcgacaac aagcggctgc tggtggagga cgagaagagg gctgaggact 2760
acaagggcgt gaagaagctc ctggaccgct actacctgtc cttcatcaac gacgtgctcc 2820
acagcatcaa gctcaagaac ctgaacaact acatcagcct cttcaggaag aagacgcgca 2880
ccgagaagga gaacaaggag ctcgagaacc tggagatcaa cctgaggaag gagatcgcca 2940
aggcgttcaa gggcaacgag ggctacaagt ccctcttcaa gaaggacatc atcgagacga 3000
tcctcccgga gttcctggac gacaaggacg agatcgccct ggtcaactcc ttcaacggct 3060
tcaccacggc gttcaccggc ttcttcgaca accgcgagaa catgttcagc gaggaggcca 3120
agtccacgag catcgcgttc aggtgcatca acgagaacct cacccgctac atctccaaca 3180
tggacatctt cgagaaggtc gacgcgatct tcgacaagca cgaggtgcag gagatcaagg 3240
agaagatcct gaacagcgac tacgacgtcg aggacttctt cgagggcgag ttcttcaact 3300
tcgtcctcac gcaggagggc atcgacgtgt acaacgccat catcggtggc ttcgtgaccg 3360
agtccggcga gaagatcaag ggcctgaacg agtacatcaa cctctacaac cagaagacca 3420
agcagaagct gccgaagttc aagcccctgt acaagcaggt gctctccgac agggagtccc 3480
tcagcttcta cggcgagggc tacacgagcg acgaggaggt cctggaggtg ttccgcaaca 3540
ccctcaacaa gaacagcgag atcttctcca gcatcaagaa gctcgagaag ctgttcaaga 3600
acttcgacga gtactccagc gccggcatct tcgtcaagaa cggcccggcg atctccacga 3660
tcagcaagga catcttcggc gagtggaacg tgatccgcga caagtggaac gccgagtacg 3720
acgacatcca cctcaagaag aaggcggtgg tcaccgagaa gtacgaggac gacaggcgca 3780
agtccttcaa gaagatcggc tccttcagcc tcgagcagct gcaggagtac gccgacgcgg 3840
acctgagcgt ggtcgagaag ctcaaggaga tcatcatcca gaaggtcgac gagatctaca 3900
aggtgtacgg ctccagcgag aagctcttcg acgcggactt cgtcctcgag aagtccctga 3960
agaagaacga cgccgtggtc gcgatcatga aggacctcct ggactccgtg aagagcttcg 4020
agaattacat caaggccttc ttcggcgagg gcaaggagac gaacagggac gagtccttct 4080
acggcgactt cgtcctggcc tacgacatcc tcctgaaggt ggaccacatc tacgacgcga 4140
tccgcaacta cgtgacccag aagccgtaca gcaaggacaa gttcaagctc tacttccaga 4200
acccccagtt catgggcggc tgggacaagg acaaggagac ggactacagg gcgaccatcc 4260
tgcgctacgg cagcaagtac tacctcgcca tcatggacaa gaagtacgcg aagtgcctgc 4320
agaagatcga caaggacgac gtcaacggca actacgagaa gatcaactac aagctcctgc 4380
cgggccccaa caagatgctc ccgaaggtgt tcttctccaa gaagtggatg gcctactaca 4440
accccagcga ggacatccag aagatctaca agaacggcac gttcaagaag ggcgacatgt 4500
tcaacctgaa cgactgccac aagctcatcg acttcttcaa ggactccatc agccgctacc 4560
cgaagtggtc caacgcctac gacttcaact tcagcgagac cgagaagtac aaggacatcg 4620
cgggcttcta ccgcgaggtc gaggagcagg gctacaaggt gtccttcgag tccgccagca 4680
agaaggaggt cgacaagctg gtggaggagg gcaagctcta catgttccag atctacaaca 4740
aggacttctc cgacaagagc cacggcacgc ccaacctgca caccatgtac ttcaagctcc 4800
tgttcgacga gaacaaccac ggccagatca ggctgtccgg cggcgccgag ctcttcatga 4860
ggagggcgag cctgaagaag gaggagctgg tggtccaccc cgctaacagc ccaatcgcga 4920
acaagaaccc ggacaacccc aagaagacca cgaccctgtc ctacgacgtg tacaaggaca 4980
agaggttcag cgaggaccag tacgagctcc acatcccgat cgcgatcaac aagtgcccca 5040
agaacatctt caagatcaac accgaggtcc gcgtgctcct gaagcacgac gacaacccct 5100
acgtgatcgg catcgctagg ggcgagagga acctcctgta catcgtggtc gtggacggca 5160
agggcaacat cgtggagcag tactccctca acgagatcat caacaacttc aacggcatca 5220
ggatcaagac ggactaccac agcctcctgg acaagaagga gaaggagagg ttcgaggccc 5280
gccagaactg gacctccatc gagaacatca aggagctgaa ggcgggctac atcagccagg 5340
tcgtgcacaa gatctgcgag ctcgtcgaga agtacgacgc cgtgatcgcc ctcgcggacc 5400
tgaactccgg cttcaagaac agccgcgtca aggtggagaa gcaggtctac cagaagttcg 5460
agaagatgct catcgacaag ctgaactaca tggtggacaa gaagtccaac ccctgcgcta 5520
cgggcggcgc gctgaagggc taccagatca ccaacaagtt cgagagcttc aagtccatga 5580
gcactcagaa cggcttcatc ttctacatcc cggcgtggct cacgtccaag atcgacccca 5640
gcaccggctt cgtcaacctc ctgaagacga agtacacctc catcgccgac agcaagaagt 5700
tcatctccag cttcgaccgc atcatgtatg tgccggagga ggacctgttc gagttcgccc 5760
tcgactacaa gaacttctcc cgcacggacg cggactacat caagaagtgg aagctgtaca 5820
gctacggcaa ccgcatccgc atcttcagga accccaagaa gaacaacgtc ttcgactggg 5880
aggaggtgtg cctgacctcc gcgtacaagg agctcttcaa caagtacggc atcaactacc 5940
agcagggcga catcagggct ctcctgtgcg agcagagcga caaggccttc tactccagct 6000
tcatggcgct gatgtccctc atgctgcaga tgaggaactc gatcaccggc aggacggacg 6060
tggccttcct catctccccg gtgaagaaca gcgacggcat cttctacgac tccaggaact 6120
acgaggccca ggagaacgcg atcctcccaa agaacgcgga cgccaacggc gcctacaaca 6180
tcgccaggaa ggtcctctgg gctatcggcc agttcaagaa ggcggaggac gagaagctgg 6240
acaaggtgaa gatcgccatc agcaacaagg agtggctcga gtacgcccag acctcggtca 6300
agcacggcag cccgaagaag aagcgcaagg tgtccggcgg cagcacgaac ctgtccgaca 6360
tcatcgagaa ggagaccggc aagcagctcg tgatccagga gagcatcctc atgctgccgg 6420
aggaggtcga ggaggtcatc ggcaacaagc ccgagtccga catcctcgtc cacacggcct 6480
acgacgagtc caccgacgag aacgtgatgc tcctgacctc ggacgctccc gagtacaagc 6540
catgggccct ggtcatccag gacagcaacg gcgagaacaa gatcaagatg ctctccggcg 6600
gcagcccgaa gaagaagcgc aaagtgtgag atcgttcaaa catttggcaa taaagtttct 6660
taagattgaa tcctgttgcc ggtcttgcga tgattatcat ataatttctg ttgaattacg 6720
ttaagcatgt aataattaac atgtaatgca tgacgttatt tatgagatgg gtttttatga 6780
ttagagtccc gcaattatac atttaatacg cgatagaaaa caaaatatag cgcgcaaact 6840
aggataaatt atcgcgcgcg gtgtcatcta tgttactaga tc 6882
<210> 10
<211> 90
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 10
gggggcgggg gctcaggcgg gggcgggagc ggcggcgggg gctctggggg cggcggcagc 60
ggcgggggcg gcagcggggg cggcgggtcg 90
<210> 11
<211> 30
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 11
Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly
1 5 10 15
Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser
20 25 30
<210> 12
<211> 18
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 12
Gly Gly Ser Thr Gly Gly Gly Ser Gly Gly Gly Ser Gly Gly Gly Ser
1 5 10 15
Ser Gly
<210> 13
<211> 15
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 13
Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser
1 5 10 15
<210> 14
<211> 4842
<212> DNA
<213> Artificial Sequence
<220>
<223> Codon-optimized fusion
<400> 14
atgccgaaga agaagcgcaa ggtcatgtcc agcgagaccg gccccgtggc ggtggacccc 60
accctgcgca ggcgcatcga gccgcacgag ttcgaggtgt tcttcgaccc cagggagctc 120
cgcaaggaga cctgcctcct gtacgagatc aactggggcg gcaggcactc catctggagg 180
cacacgagcc agaacaccaa caagcacgtc gaggtgaact tcatcgagaa gttcaccacg 240
gagaggtact tctgcccgaa cacgcgctgc tccatcacgt ggttcctctc gtggagccca 300
tgcggcgagt gctccagggc gatcacggag ttcctcagcc gctacccgca cgtgaccctg 360
ttcatctaca tcgctaggct ctaccaccac gcggacccca ggaacaggca gggcctcagg 420
gacctgatct ccagcggcgt cacgatccag atcatgaccg agcaggagtc cggctactgc 480
tggaggaact tcgtgaacta ctccccgagc aacgaggccc actggccccg ctacccgcac 540
ctctgggtcc gcctctacgt gctcgagctg tactgcatca tcctcggcct gccgccctgc 600
ctcaacatcc tgaggcgcaa gcagccccag ctgacgttct tcaccatcgc cctgcagagc 660
tgccactacc agaggctccc gccccacatc ctgtgggcga ccgggctcaa gtcgggcagc 720
gagacccccg gcacctccga gtcggctacc ccagagtcca tgagcaagct ggagaagttc 780
acgaactgct actccctcag caagaccctg aggttcaagg cgatcccggt cggcaagacc 840
caggagaaca tcgacaacaa gcggctgctg gtggaggacg agaagagggc tgaggactac 900
aagggcgtga agaagctcct ggaccgctac tacctgtcct tcatcaacga cgtgctccac 960
agcatcaagc tcaagaacct gaacaactac atcagcctct tcaggaagaa gacgcgcacc 1020
gagaaggaga acaaggagct cgagaacctg gagatcaacc tgaggaagga gatcgccaag 1080
gcgttcaagg gcaacgaggg ctacaagtcc ctcttcaaga aggacatcat cgagacgatc 1140
ctcccggagt tcctggacga caaggacgag atcgccctgg tcaactcctt caacggcttc 1200
accacggcgt tcaccggctt cttcgacaac cgcgagaaca tgttcagcga ggaggccaag 1260
tccacgagca tcgcgttcag gtgcatcaac gagaacctca cccgctacat ctccaacatg 1320
gacatcttcg agaaggtcga cgcgatcttc gacaagcacg aggtgcagga gatcaaggag 1380
aagatcctga acagcgacta cgacgtcgag gacttcttcg agggcgagtt cttcaacttc 1440
gtcctcacgc aggagggcat cgacgtgtac aacgccatca tcggtggctt cgtgaccgag 1500
tccggcgaga agatcaaggg cctgaacgag tacatcaacc tctacaacca gaagaccaag 1560
cagaagctgc cgaagttcaa gcccctgtac aagcaggtgc tctccgacag ggagtccctc 1620
agcttctacg gcgagggcta cacgagcgac gaggaggtcc tggaggtgtt ccgcaacacc 1680
ctcaacaaga acagcgagat cttctccagc atcaagaagc tcgagaagct gttcaagaac 1740
ttcgacgagt actccagcgc cggcatcttc gtcaagaacg gcccggcgat ctccacgatc 1800
agcaaggaca tcttcggcga gtggaacgtg atccgcgaca agtggaacgc cgagtacgac 1860
gacatccacc tcaagaagaa ggcggtggtc accgagaagt acgaggacga caggcgcaag 1920
tccttcaaga agatcggctc cttcagcctc gagcagctgc aggagtacgc cgacgcggac 1980
ctgagcgtgg tcgagaagct caaggagatc atcatccaga aggtcgacga gatctacaag 2040
gtgtacggct ccagcgagaa gctcttcgac gcggacttcg tcctcgagaa gtccctgaag 2100
aagaacgacg ccgtggtcgc gatcatgaag gacctcctgg actccgtgaa gagcttcgag 2160
aattacatca aggccttctt cggcgagggc aaggagacga acagggacga gtccttctac 2220
ggcgacttcg tcctggccta cgacatcctc ctgaaggtgg accacatcta cgacgcgatc 2280
cgcaactacg tgacccagaa gccgtacagc aaggacaagt tcaagctcta cttccagaac 2340
ccccagttca tgggcggctg ggacaaggac aaggagacgg actacagggc gaccatcctg 2400
cgctacggca gcaagtacta cctcgccatc atggacaaga agtacgcgaa gtgcctgcag 2460
aagatcgaca aggacgacgt caacggcaac tacgagaaga tcaactacaa gctcctgccg 2520
ggccccaaca agatgctccc gaaggtgttc ttctccaaga agtggatggc ctactacaac 2580
cccagcgagg acatccagaa gatctacaag aacggcacgt tcaagaaggg cgacatgttc 2640
aacctgaacg actgccacaa gctcatcgac ttcttcaagg actccatcag ccgctacccg 2700
aagtggtcca acgcctacga cttcaacttc agcgagaccg agaagtacaa ggacatcgcg 2760
ggcttctacc gcgaggtcga ggagcagggc tacaaggtgt ccttcgagtc cgccagcaag 2820
aaggaggtcg acaagctggt ggaggagggc aagctctaca tgttccagat ctacaacaag 2880
gacttctccg acaagagcca cggcacgccc aacctgcaca ccatgtactt caagctcctg 2940
ttcgacgaga acaaccacgg ccagatcagg ctgtccggcg gcgccgagct cttcatgagg 3000
agggcgagcc tgaagaagga ggagctggtg gtccaccccg ctaacagccc aatcgcgaac 3060
aagaacccgg acaaccccaa gaagaccacg accctgtcct acgacgtgta caaggacaag 3120
aggttcagcg aggaccagta cgagctccac atcccgatcg cgatcaacaa gtgccccaag 3180
aacatcttca agatcaacac cgaggtccgc gtgctcctga agcacgacga caacccctac 3240
gtgatcggca tcgctagggg cgagaggaac ctcctgtaca tcgtggtcgt ggacggcaag 3300
ggcaacatcg tggagcagta ctccctcaac gagatcatca acaacttcaa cggcatcagg 3360
atcaagacgg actaccacag cctcctggac aagaaggaga aggagaggtt cgaggcccgc 3420
cagaactgga cctccatcga gaacatcaag gagctgaagg cgggctacat cagccaggtc 3480
gtgcacaaga tctgcgagct cgtcgagaag tacgacgccg tgatcgccct cgcggacctg 3540
aactccggct tcaagaacag ccgcgtcaag gtggagaagc aggtctacca gaagttcgag 3600
aagatgctca tcgacaagct gaactacatg gtggacaaga agtccaaccc ctgcgctacg 3660
ggcggcgcgc tgaagggcta ccagatcacc aacaagttcg agagcttcaa gtccatgagc 3720
actcagaacg gcttcatctt ctacatcccg gcgtggctca cgtccaagat cgaccccagc 3780
accggcttcg tcaacctcct gaagacgaag tacacctcca tcgccgacag caagaagttc 3840
atctccagct tcgaccgcat catgtatgtg ccggaggagg acctgttcga gttcgccctc 3900
gactacaaga acttctcccg cacggacgcg gactacatca agaagtggaa gctgtacagc 3960
tacggcaacc gcatccgcat cttcaggaac cccaagaaga acaacgtctt cgactgggag 4020
gaggtgtgcc tgacctccgc gtacaaggag ctcttcaaca agtacggcat caactaccag 4080
cagggcgaca tcagggctct cctgtgcgag cagagcgaca aggccttcta ctccagcttc 4140
atggcgctga tgtccctcat gctgcagatg aggaactcga tcaccggcag gacggacgtg 4200
gccttcctca tctccccggt gaagaacagc gacggcatct tctacgactc caggaactac 4260
gaggcccagg agaacgcgat cctcccaaag aacgcggacg ccaacggcgc ctacaacatc 4320
gccaggaagg tcctctgggc tatcggccag ttcaagaagg cggaggacga gaagctggac 4380
aaggtgaaga tcgccatcag caacaaggag tggctcgagt acgcccagac ctcggtcaag 4440
cacggcagcc cgaagaagaa gcgcaaggtg ggagggtcga caggaggcgg ttctggcgga 4500
ggttcaggtg gaggctcgag tggtacgaac ctgtccgaca tcatcgagaa ggagaccggc 4560
aagcagctcg tgatccagga gagcatcctc atgctgccgg aggaggtcga ggaggtcatc 4620
ggcaacaagc ccgagtccga catcctcgtc cacacggcct acgacgagtc caccgacgag 4680
aacgtgatgc tcctgacctc ggacgctccc gagtacaagc catgggccct ggtcatccag 4740
gacagcaacg gcgagaacaa gatcaagatg ctcggtggag gcggttcagg cggaggtggc 4800
tctggcggtg gcggatcgcc gaagaagaag cgcaaagtgt ga 4842
<210> 15
<211> 1613
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 15
Met Pro Lys Lys Lys Arg Lys Val Met Ser Ser Glu Thr Gly Pro Val
1 5 10 15
Ala Val Asp Pro Thr Leu Arg Arg Arg Ile Glu Pro His Glu Phe Glu
20 25 30
Val Phe Phe Asp Pro Arg Glu Leu Arg Lys Glu Thr Cys Leu Leu Tyr
35 40 45
Glu Ile Asn Trp Gly Gly Arg His Ser Ile Trp Arg His Thr Ser Gln
50 55 60
Asn Thr Asn Lys His Val Glu Val Asn Phe Ile Glu Lys Phe Thr Thr
65 70 75 80
Glu Arg Tyr Phe Cys Pro Asn Thr Arg Cys Ser Ile Thr Trp Phe Leu
85 90 95
Ser Trp Ser Pro Cys Gly Glu Cys Ser Arg Ala Ile Thr Glu Phe Leu
100 105 110
Ser Arg Tyr Pro His Val Thr Leu Phe Ile Tyr Ile Ala Arg Leu Tyr
115 120 125
His His Ala Asp Pro Arg Asn Arg Gln Gly Leu Arg Asp Leu Ile Ser
130 135 140
Ser Gly Val Thr Ile Gln Ile Met Thr Glu Gln Glu Ser Gly Tyr Cys
145 150 155 160
Trp Arg Asn Phe Val Asn Tyr Ser Pro Ser Asn Glu Ala His Trp Pro
165 170 175
Arg Tyr Pro His Leu Trp Val Arg Leu Tyr Val Leu Glu Leu Tyr Cys
180 185 190
Ile Ile Leu Gly Leu Pro Pro Cys Leu Asn Ile Leu Arg Arg Lys Gln
195 200 205
Pro Gln Leu Thr Phe Phe Thr Ile Ala Leu Gln Ser Cys His Tyr Gln
210 215 220
Arg Leu Pro Pro His Ile Leu Trp Ala Thr Gly Leu Lys Ser Gly Ser
225 230 235 240
Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Met Ser Lys
245 250 255
Leu Glu Lys Phe Thr Asn Cys Tyr Ser Leu Ser Lys Thr Leu Arg Phe
260 265 270
Lys Ala Ile Pro Val Gly Lys Thr Gln Glu Asn Ile Asp Asn Lys Arg
275 280 285
Leu Leu Val Glu Asp Glu Lys Arg Ala Glu Asp Tyr Lys Gly Val Lys
290 295 300
Lys Leu Leu Asp Arg Tyr Tyr Leu Ser Phe Ile Asn Asp Val Leu His
305 310 315 320
Ser Ile Lys Leu Lys Asn Leu Asn Asn Tyr Ile Ser Leu Phe Arg Lys
325 330 335
Lys Thr Arg Thr Glu Lys Glu Asn Lys Glu Leu Glu Asn Leu Glu Ile
340 345 350
Asn Leu Arg Lys Glu Ile Ala Lys Ala Phe Lys Gly Asn Glu Gly Tyr
355 360 365
Lys Ser Leu Phe Lys Lys Asp Ile Ile Glu Thr Ile Leu Pro Glu Phe
370 375 380
Leu Asp Asp Lys Asp Glu Ile Ala Leu Val Asn Ser Phe Asn Gly Phe
385 390 395 400
Thr Thr Ala Phe Thr Gly Phe Phe Asp Asn Arg Glu Asn Met Phe Ser
405 410 415
Glu Glu Ala Lys Ser Thr Ser Ile Ala Phe Arg Cys Ile Asn Glu Asn
420 425 430
Leu Thr Arg Tyr Ile Ser Asn Met Asp Ile Phe Glu Lys Val Asp Ala
435 440 445
Ile Phe Asp Lys His Glu Val Gln Glu Ile Lys Glu Lys Ile Leu Asn
450 455 460
Ser Asp Tyr Asp Val Glu Asp Phe Phe Glu Gly Glu Phe Phe Asn Phe
465 470 475 480
Val Leu Thr Gln Glu Gly Ile Asp Val Tyr Asn Ala Ile Ile Gly Gly
485 490 495
Phe Val Thr Glu Ser Gly Glu Lys Ile Lys Gly Leu Asn Glu Tyr Ile
500 505 510
Asn Leu Tyr Asn Gln Lys Thr Lys Gln Lys Leu Pro Lys Phe Lys Pro
515 520 525
Leu Tyr Lys Gln Val Leu Ser Asp Arg Glu Ser Leu Ser Phe Tyr Gly
530 535 540
Glu Gly Tyr Thr Ser Asp Glu Glu Val Leu Glu Val Phe Arg Asn Thr
545 550 555 560
Leu Asn Lys Asn Ser Glu Ile Phe Ser Ser Ile Lys Lys Leu Glu Lys
565 570 575
Leu Phe Lys Asn Phe Asp Glu Tyr Ser Ser Ala Gly Ile Phe Val Lys
580 585 590
Asn Gly Pro Ala Ile Ser Thr Ile Ser Lys Asp Ile Phe Gly Glu Trp
595 600 605
Asn Val Ile Arg Asp Lys Trp Asn Ala Glu Tyr Asp Asp Ile His Leu
610 615 620
Lys Lys Lys Ala Val Val Thr Glu Lys Tyr Glu Asp Asp Arg Arg Lys
625 630 635 640
Ser Phe Lys Lys Ile Gly Ser Phe Ser Leu Glu Gln Leu Gln Glu Tyr
645 650 655
Ala Asp Ala Asp Leu Ser Val Val Glu Lys Leu Lys Glu Ile Ile Ile
660 665 670
Gln Lys Val Asp Glu Ile Tyr Lys Val Tyr Gly Ser Ser Glu Lys Leu
675 680 685
Phe Asp Ala Asp Phe Val Leu Glu Lys Ser Leu Lys Lys Asn Asp Ala
690 695 700
Val Val Ala Ile Met Lys Asp Leu Leu Asp Ser Val Lys Ser Phe Glu
705 710 715 720
Asn Tyr Ile Lys Ala Phe Phe Gly Glu Gly Lys Glu Thr Asn Arg Asp
725 730 735
Glu Ser Phe Tyr Gly Asp Phe Val Leu Ala Tyr Asp Ile Leu Leu Lys
740 745 750
Val Asp His Ile Tyr Asp Ala Ile Arg Asn Tyr Val Thr Gln Lys Pro
755 760 765
Tyr Ser Lys Asp Lys Phe Lys Leu Tyr Phe Gln Asn Pro Gln Phe Met
770 775 780
Gly Gly Trp Asp Lys Asp Lys Glu Thr Asp Tyr Arg Ala Thr Ile Leu
785 790 795 800
Arg Tyr Gly Ser Lys Tyr Tyr Leu Ala Ile Met Asp Lys Lys Tyr Ala
805 810 815
Lys Cys Leu Gln Lys Ile Asp Lys Asp Asp Val Asn Gly Asn Tyr Glu
820 825 830
Lys Ile Asn Tyr Lys Leu Leu Pro Gly Pro Asn Lys Met Leu Pro Lys
835 840 845
Val Phe Phe Ser Lys Lys Trp Met Ala Tyr Tyr Asn Pro Ser Glu Asp
850 855 860
Ile Gln Lys Ile Tyr Lys Asn Gly Thr Phe Lys Lys Gly Asp Met Phe
865 870 875 880
Asn Leu Asn Asp Cys His Lys Leu Ile Asp Phe Phe Lys Asp Ser Ile
885 890 895
Ser Arg Tyr Pro Lys Trp Ser Asn Ala Tyr Asp Phe Asn Phe Ser Glu
900 905 910
Thr Glu Lys Tyr Lys Asp Ile Ala Gly Phe Tyr Arg Glu Val Glu Glu
915 920 925
Gln Gly Tyr Lys Val Ser Phe Glu Ser Ala Ser Lys Lys Glu Val Asp
930 935 940
Lys Leu Val Glu Glu Gly Lys Leu Tyr Met Phe Gln Ile Tyr Asn Lys
945 950 955 960
Asp Phe Ser Asp Lys Ser His Gly Thr Pro Asn Leu His Thr Met Tyr
965 970 975
Phe Lys Leu Leu Phe Asp Glu Asn Asn His Gly Gln Ile Arg Leu Ser
980 985 990
Gly Gly Ala Glu Leu Phe Met Arg Arg Ala Ser Leu Lys Lys Glu Glu
995 1000 1005
Leu Val Val His Pro Ala Asn Ser Pro Ile Ala Asn Lys Asn Pro
1010 1015 1020
Asp Asn Pro Lys Lys Thr Thr Thr Leu Ser Tyr Asp Val Tyr Lys
1025 1030 1035
Asp Lys Arg Phe Ser Glu Asp Gln Tyr Glu Leu His Ile Pro Ile
1040 1045 1050
Ala Ile Asn Lys Cys Pro Lys Asn Ile Phe Lys Ile Asn Thr Glu
1055 1060 1065
Val Arg Val Leu Leu Lys His Asp Asp Asn Pro Tyr Val Ile Gly
1070 1075 1080
Ile Ala Arg Gly Glu Arg Asn Leu Leu Tyr Ile Val Val Val Asp
1085 1090 1095
Gly Lys Gly Asn Ile Val Glu Gln Tyr Ser Leu Asn Glu Ile Ile
1100 1105 1110
Asn Asn Phe Asn Gly Ile Arg Ile Lys Thr Asp Tyr His Ser Leu
1115 1120 1125
Leu Asp Lys Lys Glu Lys Glu Arg Phe Glu Ala Arg Gln Asn Trp
1130 1135 1140
Thr Ser Ile Glu Asn Ile Lys Glu Leu Lys Ala Gly Tyr Ile Ser
1145 1150 1155
Gln Val Val His Lys Ile Cys Glu Leu Val Glu Lys Tyr Asp Ala
1160 1165 1170
Val Ile Ala Leu Ala Asp Leu Asn Ser Gly Phe Lys Asn Ser Arg
1175 1180 1185
Val Lys Val Glu Lys Gln Val Tyr Gln Lys Phe Glu Lys Met Leu
1190 1195 1200
Ile Asp Lys Leu Asn Tyr Met Val Asp Lys Lys Ser Asn Pro Cys
1205 1210 1215
Ala Thr Gly Gly Ala Leu Lys Gly Tyr Gln Ile Thr Asn Lys Phe
1220 1225 1230
Glu Ser Phe Lys Ser Met Ser Thr Gln Asn Gly Phe Ile Phe Tyr
1235 1240 1245
Ile Pro Ala Trp Leu Thr Ser Lys Ile Asp Pro Ser Thr Gly Phe
1250 1255 1260
Val Asn Leu Leu Lys Thr Lys Tyr Thr Ser Ile Ala Asp Ser Lys
1265 1270 1275
Lys Phe Ile Ser Ser Phe Asp Arg Ile Met Tyr Val Pro Glu Glu
1280 1285 1290
Asp Leu Phe Glu Phe Ala Leu Asp Tyr Lys Asn Phe Ser Arg Thr
1295 1300 1305
Asp Ala Asp Tyr Ile Lys Lys Trp Lys Leu Tyr Ser Tyr Gly Asn
1310 1315 1320
Arg Ile Arg Ile Phe Arg Asn Pro Lys Lys Asn Asn Val Phe Asp
1325 1330 1335
Trp Glu Glu Val Cys Leu Thr Ser Ala Tyr Lys Glu Leu Phe Asn
1340 1345 1350
Lys Tyr Gly Ile Asn Tyr Gln Gln Gly Asp Ile Arg Ala Leu Leu
1355 1360 1365
Cys Glu Gln Ser Asp Lys Ala Phe Tyr Ser Ser Phe Met Ala Leu
1370 1375 1380
Met Ser Leu Met Leu Gln Met Arg Asn Ser Ile Thr Gly Arg Thr
1385 1390 1395
Asp Val Ala Phe Leu Ile Ser Pro Val Lys Asn Ser Asp Gly Ile
1400 1405 1410
Phe Tyr Asp Ser Arg Asn Tyr Glu Ala Gln Glu Asn Ala Ile Leu
1415 1420 1425
Pro Lys Asn Ala Asp Ala Asn Gly Ala Tyr Asn Ile Ala Arg Lys
1430 1435 1440
Val Leu Trp Ala Ile Gly Gln Phe Lys Lys Ala Glu Asp Glu Lys
1445 1450 1455
Leu Asp Lys Val Lys Ile Ala Ile Ser Asn Lys Glu Trp Leu Glu
1460 1465 1470
Tyr Ala Gln Thr Ser Val Lys His Gly Ser Pro Lys Lys Lys Arg
1475 1480 1485
Lys Val Gly Gly Ser Thr Gly Gly Gly Ser Gly Gly Gly Ser Gly
1490 1495 1500
Gly Gly Ser Ser Gly Thr Asn Leu Ser Asp Ile Ile Glu Lys Glu
1505 1510 1515
Thr Gly Lys Gln Leu Val Ile Gln Glu Ser Ile Leu Met Leu Pro
1520 1525 1530
Glu Glu Val Glu Glu Val Ile Gly Asn Lys Pro Glu Ser Asp Ile
1535 1540 1545
Leu Val His Thr Ala Tyr Asp Glu Ser Thr Asp Glu Asn Val Met
1550 1555 1560
Leu Leu Thr Ser Asp Ala Pro Glu Tyr Lys Pro Trp Ala Leu Val
1565 1570 1575
Ile Gln Asp Ser Asn Gly Glu Asn Lys Ile Lys Met Leu Gly Gly
1580 1585 1590
Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Pro Lys
1595 1600 1605
Lys Lys Arg Lys Val
1610
<210> 16
<211> 5145
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 16
atgccgaaga agaagcgcaa ggtcatgtcc agcgagaccg gccccgtggc ggtggacccc 60
accctgcgca ggcgcatcga gccgcacgag ttcgaggtgt tcttcgaccc cagggagctc 120
cgcaaggaga cctgcctcct gtacgagatc aactggggcg gcaggcactc catctggagg 180
cacacgagcc agaacaccaa caagcacgtc gaggtgaact tcatcgagaa gttcaccacg 240
gagaggtact tctgcccgaa cacgcgctgc tccatcacgt ggttcctctc gtggagccca 300
tgcggcgagt gctccagggc gatcacggag ttcctcagcc gctacccgca cgtgaccctg 360
ttcatctaca tcgctaggct ctaccaccac gcggacccca ggaacaggca gggcctcagg 420
gacctgatct ccagcggcgt cacgatccag atcatgaccg agcaggagtc cggctactgc 480
tggaggaact tcgtgaacta ctccccgagc aacgaggccc actggccccg ctacccgcac 540
ctctgggtcc gcctctacgt gctcgagctg tactgcatca tcctcggcct gccgccctgc 600
ctcaacatcc tgaggcgcaa gcagccccag ctgacgttct tcaccatcgc cctgcagagc 660
tgccactacc agaggctccc gccccacatc ctgtgggcga ccgggctcaa ggggggcggg 720
ggctcaggcg ggggcgggag cggcggcggg ggctctgggg gcggcggcag cggcgggggc 780
ggcagcgggg gcggcgggtc gatgagcaag ctggagaagt tcacgaactg ctactccctc 840
agcaagaccc tgaggttcaa ggcgatcccg gtcggcaaga cccaggagaa catcgacaac 900
aagcggctgc tggtggagga cgagaagagg gctgaggact acaagggcgt gaagaagctc 960
ctggaccgct actacctgtc cttcatcaac gacgtgctcc acagcatcaa gctcaagaac 1020
ctgaacaact acatcagcct cttcaggaag aagacgcgca ccgagaagga gaacaaggag 1080
ctcgagaacc tggagatcaa cctgaggaag gagatcgcca aggcgttcaa gggcaacgag 1140
ggctacaagt ccctcttcaa gaaggacatc atcgagacga tcctcccgga gttcctggac 1200
gacaaggacg agatcgccct ggtcaactcc ttcaacggct tcaccacggc gttcaccggc 1260
ttcttcgaca accgcgagaa catgttcagc gaggaggcca agtccacgag catcgcgttc 1320
aggtgcatca acgagaacct cacccgctac atctccaaca tggacatctt cgagaaggtc 1380
gacgcgatct tcgacaagca cgaggtgcag gagatcaagg agaagatcct gaacagcgac 1440
tacgacgtcg aggacttctt cgagggcgag ttcttcaact tcgtcctcac gcaggagggc 1500
atcgacgtgt acaacgccat catcggtggc ttcgtgaccg agtccggcga gaagatcaag 1560
ggcctgaacg agtacatcaa cctctacaac cagaagacca agcagaagct gccgaagttc 1620
aagcccctgt acaagcaggt gctctccgac agggagtccc tcagcttcta cggcgagggc 1680
tacacgagcg acgaggaggt cctggaggtg ttccgcaaca ccctcaacaa gaacagcgag 1740
atcttctcca gcatcaagaa gctcgagaag ctgttcaaga acttcgacga gtactccagc 1800
gccggcatct tcgtcaagaa cggcccggcg atctccacga tcagcaagga catcttcggc 1860
gagtggaacg tgatccgcga caagtggaac gccgagtacg acgacatcca cctcaagaag 1920
aaggcggtgg tcaccgagaa gtacgaggac gacaggcgca agtccttcaa gaagatcggc 1980
tccttcagcc tcgagcagct gcaggagtac gccgacgcgg acctgagcgt ggtcgagaag 2040
ctcaaggaga tcatcatcca gaaggtcgac gagatctaca aggtgtacgg ctccagcgag 2100
aagctcttcg acgcggactt cgtcctcgag aagtccctga agaagaacga cgccgtggtc 2160
gcgatcatga aggacctcct ggactccgtg aagagcttcg agaattacat caaggccttc 2220
ttcggcgagg gcaaggagac gaacagggac gagtccttct acggcgactt cgtcctggcc 2280
tacgacatcc tcctgaaggt ggaccacatc tacgacgcga tccgcaacta cgtgacccag 2340
aagccgtaca gcaaggacaa gttcaagctc tacttccaga acccccagtt catgggcggc 2400
tgggacaagg acaaggagac ggactacagg gcgaccatcc tgcgctacgg cagcaagtac 2460
tacctcgcca tcatggacaa gaagtacgcg aagtgcctgc agaagatcga caaggacgac 2520
gtcaacggca actacgagaa gatcaactac aagctcctgc cgggccccaa caagatgctc 2580
ccgaaggtgt tcttctccaa gaagtggatg gcctactaca accccagcga ggacatccag 2640
aagatctaca agaacggcac gttcaagaag ggcgacatgt tcaacctgaa cgactgccac 2700
aagctcatcg acttcttcaa ggactccatc agccgctacc cgaagtggtc caacgcctac 2760
gacttcaact tcagcgagac cgagaagtac aaggacatcg cgggcttcta ccgcgaggtc 2820
gaggagcagg gctacaaggt gtccttcgag tccgccagca agaaggaggt cgacaagctg 2880
gtggaggagg gcaagctcta catgttccag atctacaaca aggacttctc cgacaagagc 2940
cacggcacgc ccaacctgca caccatgtac ttcaagctcc tgttcgacga gaacaaccac 3000
ggccagatca ggctgtccgg cggcgccgag ctcttcatga ggagggcgag cctgaagaag 3060
gaggagctgg tggtccaccc cgctaacagc ccaatcgcga acaagaaccc ggacaacccc 3120
aagaagacca cgaccctgtc ctacgacgtg tacaaggaca agaggttcag cgaggaccag 3180
tacgagctcc acatcccgat cgcgatcaac aagtgcccca agaacatctt caagatcaac 3240
accgaggtcc gcgtgctcct gaagcacgac gacaacccct acgtgatcgg catcgctagg 3300
ggcgagagga acctcctgta catcgtggtc gtggacggca agggcaacat cgtggagcag 3360
tactccctca acgagatcat caacaacttc aacggcatca ggatcaagac ggactaccac 3420
agcctcctgg acaagaagga gaaggagagg ttcgaggccc gccagaactg gacctccatc 3480
gagaacatca aggagctgaa ggcgggctac atcagccagg tcgtgcacaa gatctgcgag 3540
ctcgtcgaga agtacgacgc cgtgatcgcc ctcgcggacc tgaactccgg cttcaagaac 3600
agccgcgtca aggtggagaa gcaggtctac cagaagttcg agaagatgct catcgacaag 3660
ctgaactaca tggtggacaa gaagtccaac ccctgcgcta cgggcggcgc gctgaagggc 3720
taccagatca ccaacaagtt cgagagcttc aagtccatga gcactcagaa cggcttcatc 3780
ttctacatcc cggcgtggct cacgtccaag atcgacccca gcaccggctt cgtcaacctc 3840
ctgaagacga agtacacctc catcgccgac agcaagaagt tcatctccag cttcgaccgc 3900
atcatgtatg tgccggagga ggacctgttc gagttcgccc tcgactacaa gaacttctcc 3960
cgcacggacg cggactacat caagaagtgg aagctgtaca gctacggcaa ccgcatccgc 4020
atcttcagga accccaagaa gaacaacgtc ttcgactggg aggaggtgtg cctgacctcc 4080
gcgtacaagg agctcttcaa caagtacggc atcaactacc agcagggcga catcagggct 4140
ctcctgtgcg agcagagcga caaggccttc tactccagct tcatggcgct gatgtccctc 4200
atgctgcaga tgaggaactc gatcaccggc aggacggacg tggccttcct catctccccg 4260
gtgaagaaca gcgacggcat cttctacgac tccaggaact acgaggccca ggagaacgcg 4320
atcctcccaa agaacgcgga cgccaacggc gcctacaaca tcgccaggaa ggtcctctgg 4380
gctatcggcc agttcaagaa ggcggaggac gagaagctgg acaaggtgaa gatcgccatc 4440
agcaacaagg agtggctcga gtacgcccag acctcggtca agcacggcag cccgaagaag 4500
aagcgcaagg tgggagggtc gacaggaggc ggttctggcg gaggttcagg tggaggctcg 4560
agtggtacga acctgtccga catcatcgag aaggagaccg gcaagcagct cgtgatccag 4620
gagagcatcc tcatgctgcc ggaggaggtc gaggaggtca tcggcaacaa gcccgagtcc 4680
gacatcctcg tccacacggc ctacgacgag tccaccgacg agaacgtgat gctcctgacc 4740
tcggacgctc ccgagtacaa gccatgggcc ctggtcatcc aggacagcaa cggcgagaac 4800
aagatcaaga tgctcggtgg aggcggttca ggcggaggtg gctctggcgg tggcggatcg 4860
acgaacctgt ccgacatcat cgagaaggag accggcaagc agctcgtgat ccaggagagc 4920
atcctcatgc tgccggagga ggtcgaggag gtcatcggca acaagcccga gtccgacatc 4980
ctcgtccaca cggcctacga cgagtccacc gacgagaacg tgatgctcct gacctcggac 5040
gctcccgagt acaagccatg ggccctggtc atccaggaca gcaacggcga gaacaagatc 5100
aagatgctct ccggcggcag cccgaagaag aagcgcaaag tgtga 5145
<210> 17
<211> 1714
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 17
Met Pro Lys Lys Lys Arg Lys Val Met Ser Ser Glu Thr Gly Pro Val
1 5 10 15
Ala Val Asp Pro Thr Leu Arg Arg Arg Ile Glu Pro His Glu Phe Glu
20 25 30
Val Phe Phe Asp Pro Arg Glu Leu Arg Lys Glu Thr Cys Leu Leu Tyr
35 40 45
Glu Ile Asn Trp Gly Gly Arg His Ser Ile Trp Arg His Thr Ser Gln
50 55 60
Asn Thr Asn Lys His Val Glu Val Asn Phe Ile Glu Lys Phe Thr Thr
65 70 75 80
Glu Arg Tyr Phe Cys Pro Asn Thr Arg Cys Ser Ile Thr Trp Phe Leu
85 90 95
Ser Trp Ser Pro Cys Gly Glu Cys Ser Arg Ala Ile Thr Glu Phe Leu
100 105 110
Ser Arg Tyr Pro His Val Thr Leu Phe Ile Tyr Ile Ala Arg Leu Tyr
115 120 125
His His Ala Asp Pro Arg Asn Arg Gln Gly Leu Arg Asp Leu Ile Ser
130 135 140
Ser Gly Val Thr Ile Gln Ile Met Thr Glu Gln Glu Ser Gly Tyr Cys
145 150 155 160
Trp Arg Asn Phe Val Asn Tyr Ser Pro Ser Asn Glu Ala His Trp Pro
165 170 175
Arg Tyr Pro His Leu Trp Val Arg Leu Tyr Val Leu Glu Leu Tyr Cys
180 185 190
Ile Ile Leu Gly Leu Pro Pro Cys Leu Asn Ile Leu Arg Arg Lys Gln
195 200 205
Pro Gln Leu Thr Phe Phe Thr Ile Ala Leu Gln Ser Cys His Tyr Gln
210 215 220
Arg Leu Pro Pro His Ile Leu Trp Ala Thr Gly Leu Lys Gly Gly Gly
225 230 235 240
Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly
245 250 255
Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Met Ser Lys Leu Glu
260 265 270
Lys Phe Thr Asn Cys Tyr Ser Leu Ser Lys Thr Leu Arg Phe Lys Ala
275 280 285
Ile Pro Val Gly Lys Thr Gln Glu Asn Ile Asp Asn Lys Arg Leu Leu
290 295 300
Val Glu Asp Glu Lys Arg Ala Glu Asp Tyr Lys Gly Val Lys Lys Leu
305 310 315 320
Leu Asp Arg Tyr Tyr Leu Ser Phe Ile Asn Asp Val Leu His Ser Ile
325 330 335
Lys Leu Lys Asn Leu Asn Asn Tyr Ile Ser Leu Phe Arg Lys Lys Thr
340 345 350
Arg Thr Glu Lys Glu Asn Lys Glu Leu Glu Asn Leu Glu Ile Asn Leu
355 360 365
Arg Lys Glu Ile Ala Lys Ala Phe Lys Gly Asn Glu Gly Tyr Lys Ser
370 375 380
Leu Phe Lys Lys Asp Ile Ile Glu Thr Ile Leu Pro Glu Phe Leu Asp
385 390 395 400
Asp Lys Asp Glu Ile Ala Leu Val Asn Ser Phe Asn Gly Phe Thr Thr
405 410 415
Ala Phe Thr Gly Phe Phe Asp Asn Arg Glu Asn Met Phe Ser Glu Glu
420 425 430
Ala Lys Ser Thr Ser Ile Ala Phe Arg Cys Ile Asn Glu Asn Leu Thr
435 440 445
Arg Tyr Ile Ser Asn Met Asp Ile Phe Glu Lys Val Asp Ala Ile Phe
450 455 460
Asp Lys His Glu Val Gln Glu Ile Lys Glu Lys Ile Leu Asn Ser Asp
465 470 475 480
Tyr Asp Val Glu Asp Phe Phe Glu Gly Glu Phe Phe Asn Phe Val Leu
485 490 495
Thr Gln Glu Gly Ile Asp Val Tyr Asn Ala Ile Ile Gly Gly Phe Val
500 505 510
Thr Glu Ser Gly Glu Lys Ile Lys Gly Leu Asn Glu Tyr Ile Asn Leu
515 520 525
Tyr Asn Gln Lys Thr Lys Gln Lys Leu Pro Lys Phe Lys Pro Leu Tyr
530 535 540
Lys Gln Val Leu Ser Asp Arg Glu Ser Leu Ser Phe Tyr Gly Glu Gly
545 550 555 560
Tyr Thr Ser Asp Glu Glu Val Leu Glu Val Phe Arg Asn Thr Leu Asn
565 570 575
Lys Asn Ser Glu Ile Phe Ser Ser Ile Lys Lys Leu Glu Lys Leu Phe
580 585 590
Lys Asn Phe Asp Glu Tyr Ser Ser Ala Gly Ile Phe Val Lys Asn Gly
595 600 605
Pro Ala Ile Ser Thr Ile Ser Lys Asp Ile Phe Gly Glu Trp Asn Val
610 615 620
Ile Arg Asp Lys Trp Asn Ala Glu Tyr Asp Asp Ile His Leu Lys Lys
625 630 635 640
Lys Ala Val Val Thr Glu Lys Tyr Glu Asp Asp Arg Arg Lys Ser Phe
645 650 655
Lys Lys Ile Gly Ser Phe Ser Leu Glu Gln Leu Gln Glu Tyr Ala Asp
660 665 670
Ala Asp Leu Ser Val Val Glu Lys Leu Lys Glu Ile Ile Ile Gln Lys
675 680 685
Val Asp Glu Ile Tyr Lys Val Tyr Gly Ser Ser Glu Lys Leu Phe Asp
690 695 700
Ala Asp Phe Val Leu Glu Lys Ser Leu Lys Lys Asn Asp Ala Val Val
705 710 715 720
Ala Ile Met Lys Asp Leu Leu Asp Ser Val Lys Ser Phe Glu Asn Tyr
725 730 735
Ile Lys Ala Phe Phe Gly Glu Gly Lys Glu Thr Asn Arg Asp Glu Ser
740 745 750
Phe Tyr Gly Asp Phe Val Leu Ala Tyr Asp Ile Leu Leu Lys Val Asp
755 760 765
His Ile Tyr Asp Ala Ile Arg Asn Tyr Val Thr Gln Lys Pro Tyr Ser
770 775 780
Lys Asp Lys Phe Lys Leu Tyr Phe Gln Asn Pro Gln Phe Met Gly Gly
785 790 795 800
Trp Asp Lys Asp Lys Glu Thr Asp Tyr Arg Ala Thr Ile Leu Arg Tyr
805 810 815
Gly Ser Lys Tyr Tyr Leu Ala Ile Met Asp Lys Lys Tyr Ala Lys Cys
820 825 830
Leu Gln Lys Ile Asp Lys Asp Asp Val Asn Gly Asn Tyr Glu Lys Ile
835 840 845
Asn Tyr Lys Leu Leu Pro Gly Pro Asn Lys Met Leu Pro Lys Val Phe
850 855 860
Phe Ser Lys Lys Trp Met Ala Tyr Tyr Asn Pro Ser Glu Asp Ile Gln
865 870 875 880
Lys Ile Tyr Lys Asn Gly Thr Phe Lys Lys Gly Asp Met Phe Asn Leu
885 890 895
Asn Asp Cys His Lys Leu Ile Asp Phe Phe Lys Asp Ser Ile Ser Arg
900 905 910
Tyr Pro Lys Trp Ser Asn Ala Tyr Asp Phe Asn Phe Ser Glu Thr Glu
915 920 925
Lys Tyr Lys Asp Ile Ala Gly Phe Tyr Arg Glu Val Glu Glu Gln Gly
930 935 940
Tyr Lys Val Ser Phe Glu Ser Ala Ser Lys Lys Glu Val Asp Lys Leu
945 950 955 960
Val Glu Glu Gly Lys Leu Tyr Met Phe Gln Ile Tyr Asn Lys Asp Phe
965 970 975
Ser Asp Lys Ser His Gly Thr Pro Asn Leu His Thr Met Tyr Phe Lys
980 985 990
Leu Leu Phe Asp Glu Asn Asn His Gly Gln Ile Arg Leu Ser Gly Gly
995 1000 1005
Ala Glu Leu Phe Met Arg Arg Ala Ser Leu Lys Lys Glu Glu Leu
1010 1015 1020
Val Val His Pro Ala Asn Ser Pro Ile Ala Asn Lys Asn Pro Asp
1025 1030 1035
Asn Pro Lys Lys Thr Thr Thr Leu Ser Tyr Asp Val Tyr Lys Asp
1040 1045 1050
Lys Arg Phe Ser Glu Asp Gln Tyr Glu Leu His Ile Pro Ile Ala
1055 1060 1065
Ile Asn Lys Cys Pro Lys Asn Ile Phe Lys Ile Asn Thr Glu Val
1070 1075 1080
Arg Val Leu Leu Lys His Asp Asp Asn Pro Tyr Val Ile Gly Ile
1085 1090 1095
Ala Arg Gly Glu Arg Asn Leu Leu Tyr Ile Val Val Val Asp Gly
1100 1105 1110
Lys Gly Asn Ile Val Glu Gln Tyr Ser Leu Asn Glu Ile Ile Asn
1115 1120 1125
Asn Phe Asn Gly Ile Arg Ile Lys Thr Asp Tyr His Ser Leu Leu
1130 1135 1140
Asp Lys Lys Glu Lys Glu Arg Phe Glu Ala Arg Gln Asn Trp Thr
1145 1150 1155
Ser Ile Glu Asn Ile Lys Glu Leu Lys Ala Gly Tyr Ile Ser Gln
1160 1165 1170
Val Val His Lys Ile Cys Glu Leu Val Glu Lys Tyr Asp Ala Val
1175 1180 1185
Ile Ala Leu Ala Asp Leu Asn Ser Gly Phe Lys Asn Ser Arg Val
1190 1195 1200
Lys Val Glu Lys Gln Val Tyr Gln Lys Phe Glu Lys Met Leu Ile
1205 1210 1215
Asp Lys Leu Asn Tyr Met Val Asp Lys Lys Ser Asn Pro Cys Ala
1220 1225 1230
Thr Gly Gly Ala Leu Lys Gly Tyr Gln Ile Thr Asn Lys Phe Glu
1235 1240 1245
Ser Phe Lys Ser Met Ser Thr Gln Asn Gly Phe Ile Phe Tyr Ile
1250 1255 1260
Pro Ala Trp Leu Thr Ser Lys Ile Asp Pro Ser Thr Gly Phe Val
1265 1270 1275
Asn Leu Leu Lys Thr Lys Tyr Thr Ser Ile Ala Asp Ser Lys Lys
1280 1285 1290
Phe Ile Ser Ser Phe Asp Arg Ile Met Tyr Val Pro Glu Glu Asp
1295 1300 1305
Leu Phe Glu Phe Ala Leu Asp Tyr Lys Asn Phe Ser Arg Thr Asp
1310 1315 1320
Ala Asp Tyr Ile Lys Lys Trp Lys Leu Tyr Ser Tyr Gly Asn Arg
1325 1330 1335
Ile Arg Ile Phe Arg Asn Pro Lys Lys Asn Asn Val Phe Asp Trp
1340 1345 1350
Glu Glu Val Cys Leu Thr Ser Ala Tyr Lys Glu Leu Phe Asn Lys
1355 1360 1365
Tyr Gly Ile Asn Tyr Gln Gln Gly Asp Ile Arg Ala Leu Leu Cys
1370 1375 1380
Glu Gln Ser Asp Lys Ala Phe Tyr Ser Ser Phe Met Ala Leu Met
1385 1390 1395
Ser Leu Met Leu Gln Met Arg Asn Ser Ile Thr Gly Arg Thr Asp
1400 1405 1410
Val Ala Phe Leu Ile Ser Pro Val Lys Asn Ser Asp Gly Ile Phe
1415 1420 1425
Tyr Asp Ser Arg Asn Tyr Glu Ala Gln Glu Asn Ala Ile Leu Pro
1430 1435 1440
Lys Asn Ala Asp Ala Asn Gly Ala Tyr Asn Ile Ala Arg Lys Val
1445 1450 1455
Leu Trp Ala Ile Gly Gln Phe Lys Lys Ala Glu Asp Glu Lys Leu
1460 1465 1470
Asp Lys Val Lys Ile Ala Ile Ser Asn Lys Glu Trp Leu Glu Tyr
1475 1480 1485
Ala Gln Thr Ser Val Lys His Gly Ser Pro Lys Lys Lys Arg Lys
1490 1495 1500
Val Gly Gly Ser Thr Gly Gly Gly Ser Gly Gly Gly Ser Gly Gly
1505 1510 1515
Gly Ser Ser Gly Thr Asn Leu Ser Asp Ile Ile Glu Lys Glu Thr
1520 1525 1530
Gly Lys Gln Leu Val Ile Gln Glu Ser Ile Leu Met Leu Pro Glu
1535 1540 1545
Glu Val Glu Glu Val Ile Gly Asn Lys Pro Glu Ser Asp Ile Leu
1550 1555 1560
Val His Thr Ala Tyr Asp Glu Ser Thr Asp Glu Asn Val Met Leu
1565 1570 1575
Leu Thr Ser Asp Ala Pro Glu Tyr Lys Pro Trp Ala Leu Val Ile
1580 1585 1590
Gln Asp Ser Asn Gly Glu Asn Lys Ile Lys Met Leu Gly Gly Gly
1595 1600 1605
Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Thr Asn Leu
1610 1615 1620
Ser Asp Ile Ile Glu Lys Glu Thr Gly Lys Gln Leu Val Ile Gln
1625 1630 1635
Glu Ser Ile Leu Met Leu Pro Glu Glu Val Glu Glu Val Ile Gly
1640 1645 1650
Asn Lys Pro Glu Ser Asp Ile Leu Val His Thr Ala Tyr Asp Glu
1655 1660 1665
Ser Thr Asp Glu Asn Val Met Leu Leu Thr Ser Asp Ala Pro Glu
1670 1675 1680
Tyr Lys Pro Trp Ala Leu Val Ile Gln Asp Ser Asn Gly Glu Asn
1685 1690 1695
Lys Ile Lys Met Leu Ser Gly Gly Ser Pro Lys Lys Lys Arg Lys
1700 1705 1710
Val
<210> 18
<211> 4767
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 18
atgccgaaga agaagcgcaa ggtcatgtcc agcgagaccg gccccgtggc ggtggacccc 60
accctgcgca ggcgcatcga gccgcacgag ttcgaggtgt tcttcgaccc cagggagctc 120
cgcaaggaga cctgcctcct gtacgagatc aactggggcg gcaggcactc catctggagg 180
cacacgagcc agaacaccaa caagcacgtc gaggtgaact tcatcgagaa gttcaccacg 240
gagaggtact tctgcccgaa cacgcgctgc tccatcacgt ggttcctctc gtggagccca 300
tgcggcgagt gctccagggc gatcacggag ttcctcagcc gctacccgca cgtgaccctg 360
ttcatctaca tcgctaggct ctaccaccac gcggacccca ggaacaggca gggcctcagg 420
gacctgatct ccagcggcgt cacgatccag atcatgaccg agcaggagtc cggctactgc 480
tggaggaact tcgtgaacta ctccccgagc aacgaggccc actggccccg ctacccgcac 540
ctctgggtcc gcctctacgt gctcgagctg tactgcatca tcctcggcct gccgccctgc 600
ctcaacatcc tgaggcgcaa gcagccccag ctgacgttct tcaccatcgc cctgcagagc 660
tgccactacc agaggctccc gccccacatc ctgtgggcga ccgggctcaa gtcgggcagc 720
gagacccccg gcacctccga gtcggctacc ccagagtcca tgagcaagct ggagaagttc 780
acgaactgct actccctcag caagaccctg aggttcaagg cgatcccggt cggcaagacc 840
caggagaaca tcgacaacaa gcggctgctg gtggaggacg agaagagggc tgaggactac 900
aagggcgtga agaagctcct ggaccgctac tacctgtcct tcatcaacga cgtgctccac 960
agcatcaagc tcaagaacct gaacaactac atcagcctct tcaggaagaa gacgcgcacc 1020
gagaaggaga acaaggagct cgagaacctg gagatcaacc tgaggaagga gatcgccaag 1080
gcgttcaagg gcaacgaggg ctacaagtcc ctcttcaaga aggacatcat cgagacgatc 1140
ctcccggagt tcctggacga caaggacgag atcgccctgg tcaactcctt caacggcttc 1200
accacggcgt tcaccggctt cttcgacaac cgcgagaaca tgttcagcga ggaggccaag 1260
tccacgagca tcgcgttcag gtgcatcaac gagaacctca cccgctacat ctccaacatg 1320
gacatcttcg agaaggtcga cgcgatcttc gacaagcacg aggtgcagga gatcaaggag 1380
aagatcctga acagcgacta cgacgtcgag gacttcttcg agggcgagtt cttcaacttc 1440
gtcctcacgc aggagggcat cgacgtgtac aacgccatca tcggtggctt cgtgaccgag 1500
tccggcgaga agatcaaggg cctgaacgag tacatcaacc tctacaacca gaagaccaag 1560
cagaagctgc cgaagttcaa gcccctgtac aagcaggtgc tctccgacag ggagtccctc 1620
agcttctacg gcgagggcta cacgagcgac gaggaggtcc tggaggtgtt ccgcaacacc 1680
ctcaacaaga acagcgagat cttctccagc atcaagaagc tcgagaagct gttcaagaac 1740
ttcgacgagt actccagcgc cggcatcttc gtcaagaacg gcccggcgat ctccacgatc 1800
agcaaggaca tcttcggcga gtggaacgtg atccgcgaca agtggaacgc cgagtacgac 1860
gacatccacc tcaagaagaa ggcggtggtc accgagaagt acgaggacga caggcgcaag 1920
tccttcaaga agatcggctc cttcagcctc gagcagctgc aggagtacgc cgacgcggac 1980
ctgagcgtgg tcgagaagct caaggagatc atcatccaga aggtcgacga gatctacaag 2040
gtgtacggct ccagcgagaa gctcttcgac gcggacttcg tcctcgagaa gtccctgaag 2100
aagaacgacg ccgtggtcgc gatcatgaag gacctcctgg actccgtgaa gagcttcgag 2160
aattacatca aggccttctt cggcgagggc aaggagacga acagggacga gtccttctac 2220
ggcgacttcg tcctggccta cgacatcctc ctgaaggtgg accacatcta cgacgcgatc 2280
cgcaactacg tgacccagaa gccgtacagc aaggacaagt tcaagctcta cttccagaac 2340
ccccagttca tgggcggctg ggacaaggac aaggagacgg actacagggc gaccatcctg 2400
cgctacggca gcaagtacta cctcgccatc atggacaaga agtacgcgaa gtgcctgcag 2460
aagatcgaca aggacgacgt caacggcaac tacgagaaga tcaactacaa gctcctgccg 2520
ggccccaaca agatgctccc gaaggtgttc ttctccaaga agtggatggc ctactacaac 2580
cccagcgagg acatccagaa gatctacaag aacggcacgt tcaagaaggg cgacatgttc 2640
aacctgaacg actgccacaa gctcatcgac ttcttcaagg actccatcag ccgctacccg 2700
aagtggtcca acgcctacga cttcaacttc agcgagaccg agaagtacaa ggacatcgcg 2760
ggcttctacc gcgaggtcga ggagcagggc tacaaggtgt ccttcgagtc cgccagcaag 2820
aaggaggtcg acaagctggt ggaggagggc aagctctaca tgttccagat ctacaacaag 2880
gacttctccg acaagagcca cggcacgccc aacctgcaca ccatgtactt caagctcctg 2940
ttcgacgaga acaaccacgg ccagatcagg ctgtccggcg gcgccgagct cttcatgagg 3000
agggcgagcc tgaagaagga ggagctggtg gtccaccccg ctaacagccc aatcgcgaac 3060
aagaacccgg acaaccccaa gaagaccacg accctgtcct acgacgtgta caaggacaag 3120
aggttcagcg aggaccagta cgagctccac atcccgatcg cgatcaacaa gtgccccaag 3180
aacatcttca agatcaacac cgaggtccgc gtgctcctga agcacgacga caacccctac 3240
gtgatcggca tcgctagggg cgagaggaac ctcctgtaca tcgtggtcgt ggacggcaag 3300
ggcaacatcg tggagcagta ctccctcaac gagatcatca acaacttcaa cggcatcagg 3360
atcaagacgg actaccacag cctcctggac aagaaggaga aggagaggtt cgaggcccgc 3420
cagaactgga cctccatcga gaacatcaag gagctgaagg cgggctacat cagccaggtc 3480
gtgcacaaga tctgcgagct cgtcgagaag tacgacgccg tgatcgccct cgcggacctg 3540
aactccggct tcaagaacag ccgcgtcaag gtggagaagc aggtctacca gaagttcgag 3600
aagatgctca tcgacaagct gaactacatg gtggacaaga agtccaaccc ctgcgctacg 3660
ggcggcgcgc tgaagggcta ccagatcacc aacaagttcg agagcttcaa gtccatgagc 3720
actcagaacg gcttcatctt ctacatcccg gcgtggctca cgtccaagat cgaccccagc 3780
accggcttcg tcaacctcct gaagacgaag tacacctcca tcgccgacag caagaagttc 3840
atctccagct tcgaccgcat catgtatgtg ccggaggagg acctgttcga gttcgccctc 3900
gactacaaga acttctcccg cacggacgcg gactacatca agaagtggaa gctgtacagc 3960
tacggcaacc gcatccgcat cttcaggaac cccaagaaga acaacgtctt cgactgggag 4020
gaggtgtgcc tgacctccgc gtacaaggag ctcttcaaca agtacggcat caactaccag 4080
cagggcgaca tcagggctct cctgtgcgag cagagcgaca aggccttcta ctccagcttc 4140
atggcgctga tgtccctcat gctgcagatg aggaactcga tcaccggcag gacggacgtg 4200
gccttcctca tctccccggt gaagaacagc gacggcatct tctacgactc caggaactac 4260
gaggcccagg agaacgcgat cctcccaaag aacgcggacg ccaacggcgc ctacaacatc 4320
gccaggaagg tcctctgggc tatcggccag ttcaagaagg cggaggacga gaagctggac 4380
aaggtgaaga tcgccatcag caacaaggag tggctcgagt acgcccagac ctcggtcaag 4440
cacggcagcc cgaagaagaa gcgcaaggtg tccggcggca gcacgaacct gtccgacatc 4500
atcgagaagg agaccggcaa gcagctcgtg atccaggaga gcatcctcat gctgccggag 4560
gaggtcgagg aggtcatcgg caacaagccc gagtccgaca tcctcgtcca cacggcctac 4620
gacgagtcca ccgacgagaa cgtgatgctc ctgacctcgg acgctcccga gtacaagcca 4680
tgggccctgg tcatccagga cagcaacggc gagaacaaga tcaagatgct ctccggcggc 4740
agcccgaaga agaagcgcaa agtgtga 4767
<210> 19
<211> 1588
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 19
Met Pro Lys Lys Lys Arg Lys Val Met Ser Ser Glu Thr Gly Pro Val
1 5 10 15
Ala Val Asp Pro Thr Leu Arg Arg Arg Ile Glu Pro His Glu Phe Glu
20 25 30
Val Phe Phe Asp Pro Arg Glu Leu Arg Lys Glu Thr Cys Leu Leu Tyr
35 40 45
Glu Ile Asn Trp Gly Gly Arg His Ser Ile Trp Arg His Thr Ser Gln
50 55 60
Asn Thr Asn Lys His Val Glu Val Asn Phe Ile Glu Lys Phe Thr Thr
65 70 75 80
Glu Arg Tyr Phe Cys Pro Asn Thr Arg Cys Ser Ile Thr Trp Phe Leu
85 90 95
Ser Trp Ser Pro Cys Gly Glu Cys Ser Arg Ala Ile Thr Glu Phe Leu
100 105 110
Ser Arg Tyr Pro His Val Thr Leu Phe Ile Tyr Ile Ala Arg Leu Tyr
115 120 125
His His Ala Asp Pro Arg Asn Arg Gln Gly Leu Arg Asp Leu Ile Ser
130 135 140
Ser Gly Val Thr Ile Gln Ile Met Thr Glu Gln Glu Ser Gly Tyr Cys
145 150 155 160
Trp Arg Asn Phe Val Asn Tyr Ser Pro Ser Asn Glu Ala His Trp Pro
165 170 175
Arg Tyr Pro His Leu Trp Val Arg Leu Tyr Val Leu Glu Leu Tyr Cys
180 185 190
Ile Ile Leu Gly Leu Pro Pro Cys Leu Asn Ile Leu Arg Arg Lys Gln
195 200 205
Pro Gln Leu Thr Phe Phe Thr Ile Ala Leu Gln Ser Cys His Tyr Gln
210 215 220
Arg Leu Pro Pro His Ile Leu Trp Ala Thr Gly Leu Lys Ser Gly Ser
225 230 235 240
Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Met Ser Lys
245 250 255
Leu Glu Lys Phe Thr Asn Cys Tyr Ser Leu Ser Lys Thr Leu Arg Phe
260 265 270
Lys Ala Ile Pro Val Gly Lys Thr Gln Glu Asn Ile Asp Asn Lys Arg
275 280 285
Leu Leu Val Glu Asp Glu Lys Arg Ala Glu Asp Tyr Lys Gly Val Lys
290 295 300
Lys Leu Leu Asp Arg Tyr Tyr Leu Ser Phe Ile Asn Asp Val Leu His
305 310 315 320
Ser Ile Lys Leu Lys Asn Leu Asn Asn Tyr Ile Ser Leu Phe Arg Lys
325 330 335
Lys Thr Arg Thr Glu Lys Glu Asn Lys Glu Leu Glu Asn Leu Glu Ile
340 345 350
Asn Leu Arg Lys Glu Ile Ala Lys Ala Phe Lys Gly Asn Glu Gly Tyr
355 360 365
Lys Ser Leu Phe Lys Lys Asp Ile Ile Glu Thr Ile Leu Pro Glu Phe
370 375 380
Leu Asp Asp Lys Asp Glu Ile Ala Leu Val Asn Ser Phe Asn Gly Phe
385 390 395 400
Thr Thr Ala Phe Thr Gly Phe Phe Asp Asn Arg Glu Asn Met Phe Ser
405 410 415
Glu Glu Ala Lys Ser Thr Ser Ile Ala Phe Arg Cys Ile Asn Glu Asn
420 425 430
Leu Thr Arg Tyr Ile Ser Asn Met Asp Ile Phe Glu Lys Val Asp Ala
435 440 445
Ile Phe Asp Lys His Glu Val Gln Glu Ile Lys Glu Lys Ile Leu Asn
450 455 460
Ser Asp Tyr Asp Val Glu Asp Phe Phe Glu Gly Glu Phe Phe Asn Phe
465 470 475 480
Val Leu Thr Gln Glu Gly Ile Asp Val Tyr Asn Ala Ile Ile Gly Gly
485 490 495
Phe Val Thr Glu Ser Gly Glu Lys Ile Lys Gly Leu Asn Glu Tyr Ile
500 505 510
Asn Leu Tyr Asn Gln Lys Thr Lys Gln Lys Leu Pro Lys Phe Lys Pro
515 520 525
Leu Tyr Lys Gln Val Leu Ser Asp Arg Glu Ser Leu Ser Phe Tyr Gly
530 535 540
Glu Gly Tyr Thr Ser Asp Glu Glu Val Leu Glu Val Phe Arg Asn Thr
545 550 555 560
Leu Asn Lys Asn Ser Glu Ile Phe Ser Ser Ile Lys Lys Leu Glu Lys
565 570 575
Leu Phe Lys Asn Phe Asp Glu Tyr Ser Ser Ala Gly Ile Phe Val Lys
580 585 590
Asn Gly Pro Ala Ile Ser Thr Ile Ser Lys Asp Ile Phe Gly Glu Trp
595 600 605
Asn Val Ile Arg Asp Lys Trp Asn Ala Glu Tyr Asp Asp Ile His Leu
610 615 620
Lys Lys Lys Ala Val Val Thr Glu Lys Tyr Glu Asp Asp Arg Arg Lys
625 630 635 640
Ser Phe Lys Lys Ile Gly Ser Phe Ser Leu Glu Gln Leu Gln Glu Tyr
645 650 655
Ala Asp Ala Asp Leu Ser Val Val Glu Lys Leu Lys Glu Ile Ile Ile
660 665 670
Gln Lys Val Asp Glu Ile Tyr Lys Val Tyr Gly Ser Ser Glu Lys Leu
675 680 685
Phe Asp Ala Asp Phe Val Leu Glu Lys Ser Leu Lys Lys Asn Asp Ala
690 695 700
Val Val Ala Ile Met Lys Asp Leu Leu Asp Ser Val Lys Ser Phe Glu
705 710 715 720
Asn Tyr Ile Lys Ala Phe Phe Gly Glu Gly Lys Glu Thr Asn Arg Asp
725 730 735
Glu Ser Phe Tyr Gly Asp Phe Val Leu Ala Tyr Asp Ile Leu Leu Lys
740 745 750
Val Asp His Ile Tyr Asp Ala Ile Arg Asn Tyr Val Thr Gln Lys Pro
755 760 765
Tyr Ser Lys Asp Lys Phe Lys Leu Tyr Phe Gln Asn Pro Gln Phe Met
770 775 780
Gly Gly Trp Asp Lys Asp Lys Glu Thr Asp Tyr Arg Ala Thr Ile Leu
785 790 795 800
Arg Tyr Gly Ser Lys Tyr Tyr Leu Ala Ile Met Asp Lys Lys Tyr Ala
805 810 815
Lys Cys Leu Gln Lys Ile Asp Lys Asp Asp Val Asn Gly Asn Tyr Glu
820 825 830
Lys Ile Asn Tyr Lys Leu Leu Pro Gly Pro Asn Lys Met Leu Pro Lys
835 840 845
Val Phe Phe Ser Lys Lys Trp Met Ala Tyr Tyr Asn Pro Ser Glu Asp
850 855 860
Ile Gln Lys Ile Tyr Lys Asn Gly Thr Phe Lys Lys Gly Asp Met Phe
865 870 875 880
Asn Leu Asn Asp Cys His Lys Leu Ile Asp Phe Phe Lys Asp Ser Ile
885 890 895
Ser Arg Tyr Pro Lys Trp Ser Asn Ala Tyr Asp Phe Asn Phe Ser Glu
900 905 910
Thr Glu Lys Tyr Lys Asp Ile Ala Gly Phe Tyr Arg Glu Val Glu Glu
915 920 925
Gln Gly Tyr Lys Val Ser Phe Glu Ser Ala Ser Lys Lys Glu Val Asp
930 935 940
Lys Leu Val Glu Glu Gly Lys Leu Tyr Met Phe Gln Ile Tyr Asn Lys
945 950 955 960
Asp Phe Ser Asp Lys Ser His Gly Thr Pro Asn Leu His Thr Met Tyr
965 970 975
Phe Lys Leu Leu Phe Asp Glu Asn Asn His Gly Gln Ile Arg Leu Ser
980 985 990
Gly Gly Ala Glu Leu Phe Met Arg Arg Ala Ser Leu Lys Lys Glu Glu
995 1000 1005
Leu Val Val His Pro Ala Asn Ser Pro Ile Ala Asn Lys Asn Pro
1010 1015 1020
Asp Asn Pro Lys Lys Thr Thr Thr Leu Ser Tyr Asp Val Tyr Lys
1025 1030 1035
Asp Lys Arg Phe Ser Glu Asp Gln Tyr Glu Leu His Ile Pro Ile
1040 1045 1050
Ala Ile Asn Lys Cys Pro Lys Asn Ile Phe Lys Ile Asn Thr Glu
1055 1060 1065
Val Arg Val Leu Leu Lys His Asp Asp Asn Pro Tyr Val Ile Gly
1070 1075 1080
Ile Ala Arg Gly Glu Arg Asn Leu Leu Tyr Ile Val Val Val Asp
1085 1090 1095
Gly Lys Gly Asn Ile Val Glu Gln Tyr Ser Leu Asn Glu Ile Ile
1100 1105 1110
Asn Asn Phe Asn Gly Ile Arg Ile Lys Thr Asp Tyr His Ser Leu
1115 1120 1125
Leu Asp Lys Lys Glu Lys Glu Arg Phe Glu Ala Arg Gln Asn Trp
1130 1135 1140
Thr Ser Ile Glu Asn Ile Lys Glu Leu Lys Ala Gly Tyr Ile Ser
1145 1150 1155
Gln Val Val His Lys Ile Cys Glu Leu Val Glu Lys Tyr Asp Ala
1160 1165 1170
Val Ile Ala Leu Ala Asp Leu Asn Ser Gly Phe Lys Asn Ser Arg
1175 1180 1185
Val Lys Val Glu Lys Gln Val Tyr Gln Lys Phe Glu Lys Met Leu
1190 1195 1200
Ile Asp Lys Leu Asn Tyr Met Val Asp Lys Lys Ser Asn Pro Cys
1205 1210 1215
Ala Thr Gly Gly Ala Leu Lys Gly Tyr Gln Ile Thr Asn Lys Phe
1220 1225 1230
Glu Ser Phe Lys Ser Met Ser Thr Gln Asn Gly Phe Ile Phe Tyr
1235 1240 1245
Ile Pro Ala Trp Leu Thr Ser Lys Ile Asp Pro Ser Thr Gly Phe
1250 1255 1260
Val Asn Leu Leu Lys Thr Lys Tyr Thr Ser Ile Ala Asp Ser Lys
1265 1270 1275
Lys Phe Ile Ser Ser Phe Asp Arg Ile Met Tyr Val Pro Glu Glu
1280 1285 1290
Asp Leu Phe Glu Phe Ala Leu Asp Tyr Lys Asn Phe Ser Arg Thr
1295 1300 1305
Asp Ala Asp Tyr Ile Lys Lys Trp Lys Leu Tyr Ser Tyr Gly Asn
1310 1315 1320
Arg Ile Arg Ile Phe Arg Asn Pro Lys Lys Asn Asn Val Phe Asp
1325 1330 1335
Trp Glu Glu Val Cys Leu Thr Ser Ala Tyr Lys Glu Leu Phe Asn
1340 1345 1350
Lys Tyr Gly Ile Asn Tyr Gln Gln Gly Asp Ile Arg Ala Leu Leu
1355 1360 1365
Cys Glu Gln Ser Asp Lys Ala Phe Tyr Ser Ser Phe Met Ala Leu
1370 1375 1380
Met Ser Leu Met Leu Gln Met Arg Asn Ser Ile Thr Gly Arg Thr
1385 1390 1395
Asp Val Ala Phe Leu Ile Ser Pro Val Lys Asn Ser Asp Gly Ile
1400 1405 1410
Phe Tyr Asp Ser Arg Asn Tyr Glu Ala Gln Glu Asn Ala Ile Leu
1415 1420 1425
Pro Lys Asn Ala Asp Ala Asn Gly Ala Tyr Asn Ile Ala Arg Lys
1430 1435 1440
Val Leu Trp Ala Ile Gly Gln Phe Lys Lys Ala Glu Asp Glu Lys
1445 1450 1455
Leu Asp Lys Val Lys Ile Ala Ile Ser Asn Lys Glu Trp Leu Glu
1460 1465 1470
Tyr Ala Gln Thr Ser Val Lys His Gly Ser Pro Lys Lys Lys Arg
1475 1480 1485
Lys Val Ser Gly Gly Ser Thr Asn Leu Ser Asp Ile Ile Glu Lys
1490 1495 1500
Glu Thr Gly Lys Gln Leu Val Ile Gln Glu Ser Ile Leu Met Leu
1505 1510 1515
Pro Glu Glu Val Glu Glu Val Ile Gly Asn Lys Pro Glu Ser Asp
1520 1525 1530
Ile Leu Val His Thr Ala Tyr Asp Glu Ser Thr Asp Glu Asn Val
1535 1540 1545
Met Leu Leu Thr Ser Asp Ala Pro Glu Tyr Lys Pro Trp Ala Leu
1550 1555 1560
Val Ile Gln Asp Ser Asn Gly Glu Asn Lys Ile Lys Met Leu Ser
1565 1570 1575
Gly Gly Ser Pro Lys Lys Lys Arg Lys Val
1580 1585
<210> 20
<211> 5229
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 20
atgccgaaga agaagcgcaa ggtgtccagc gagaccggcc ccgtggcggt cgaccccacc 60
ctgcgcaggc gcatcgagcc gcacgagttc gaggtcttct tcgaccccag ggagctccgc 120
aaggagacct gcctcctgta cgagatcaac tggggcggca ggcactccat ctggaggcac 180
accagccaga acacgaacaa gcacgtggag gtcaacttca tcgagaagtt caccacggag 240
aggtacttct gcccgaacac ccgctgctcc atcacctggt tcctctcgtg gagcccatgc 300
ggcgagtgct ccagggcgat cacggagttc ctcagccgct acccgcacgt gaccctcttc 360
atctacatcg ctaggctgta ccaccacgcg gaccccagga acaggcaggg gctcagggac 420
ctgatctcca gcggcgtgac catccagatc atgacggagc aggagtccgg ctactgctgg 480
cgcaacttcg tcaactactc cccgagcaac gaggcccact ggccccgcta cccgcacctg 540
tgggtgcgcc tctacgtcct cgagctgtac tgcatcatcc tcggcctgcc gccctgcctc 600
aacatcctga ggcgcaagca gccccagctc accttcttca cgatcgccct gcagagctgc 660
cactaccagc ggctgccgcc ccacatcctc tgggccaccg gcctgaagtc gggcagcgag 720
acgcccggca cgtccgagtc ggctacccca gagctcaagg acaagaagta cagcatcggc 780
ctggcaatcg gcaccaacag cgtgggctgg gccgtgatca ccgacgagta caaggtgccg 840
agcaagaagt tcaaggtgct gggcaacacc gacaggcaca gcatcaagaa gaacctgatc 900
ggcgccctgc tgttcgacag cggcgagacc gccgaggcca ccaggctgaa gaggaccgcc 960
aggaggaggt acaccaggag gaagaacagg atctgctacc tgcaggagat cttcagcaac 1020
gagatggcca aggtggacga cagcttcttc cacaggctgg aggagagctt cctggtggag 1080
gaggacaaga agcacgagag gcacccgatc ttcggcaaca tcgtggacga ggtggcctac 1140
cacgagaagt acccgaccat ctaccacctg aggaagaagc tggtggacag caccgacaag 1200
gccgacctga ggctgatcta cctggccctg gcccacatga tcaagttcag gggccacttc 1260
ctgatcgagg gcgacctgaa cccggacaac agcgacgtgg acaagctgtt catccagctg 1320
gtgcagacct acaaccagct gttcgaggag aacccgatca acgccagcgg cgtggacgcc 1380
aaggccatcc tgagcgccag gctgagcaag agcaggaggc tggagaacct gatcgcccag 1440
ctgccgggcg agaagaagaa cggcctgttc ggcaacctga tcgccctgag cctgggcctg 1500
accccgaact tcaagagcaa cttcgacctg gccgaggacg ccaagctgca gctgagcaag 1560
gacacctacg acgacgacct ggacaacctg ctggcccaga tcggcgacca gtacgccgac 1620
ctgttcctgg ccgccaagaa cctgagcgac gccatcctgc tgagcgacat cctgagggtg 1680
aacaccgaga tcaccaaggc cccgctgagc gccagcatga tcaagaggta cgacgagcac 1740
caccaggacc tgaccctgct gaaggccctg gtgaggcagc agctgccgga gaagtacaag 1800
gagatcttct tcgaccagag caagaacggc tacgccggct acatcgacgg cggcgccagc 1860
caggaggagt tctacaagtt catcaagccg atcctggaga agatggacgg caccgaggag 1920
ctgctggtga agctgaacag ggaggacctg ctgaggaagc agaggacctt cgacaacggc 1980
agcatcccgc accagatcca cctgggcgag ctgcacgcca tcctgaggag gcaggaggac 2040
ttctacccgt tcctgaagga caacagggag aagatcgaga agatcctgac cttccgcatc 2100
ccgtactacg tgggcccgct ggccaggggc aacagcaggt tcgcctggat gaccaggaag 2160
agcgaggaga ccatcacccc gtggaacttc gaggaggtgg tggacaaggg cgccagcgcc 2220
cagagcttca tcgagaggat gaccaacttc gacaagaacc tgccgaacga gaaggtgctg 2280
ccgaagcaca gcctgctgta cgagtacttc accgtgtaca acgagctgac caaggtgaag 2340
tacgtgaccg agggcatgag gaagccggcc ttcctgagcg gcgagcagaa gaaggccatc 2400
gtggacctgc tgttcaagac caacaggaag gtgaccgtga agcagctgaa ggaggactac 2460
ttcaagaaga tcgagtgctt cgacagcgtg gagatcagcg gcgtggagga caggttcaac 2520
gccagcctgg gcacctacca cgacctgctg aagatcatca aggacaagga cttcctggac 2580
aacgaggaga acgaggacat cctggaggac atcgtgctga ccctgaccct gttcgaggac 2640
agggagatga tcgaggagag gctgaagacc tacgcccacc tgttcgacga caaggtgatg 2700
aagcagctga agaggaggag gtacaccggc tggggcaggc tgagcaggaa gctgatcaac 2760
ggcatcaggg acaagcagag cggcaagacc atcctggact tcctgaagag cgacggcttc 2820
gccaacagga acttcatgca gctgatccac gacgacagcc tgaccttcaa ggaggacatc 2880
cagaaggccc aggtgagcgg ccagggcgac agcctgcacg agcacatcgc caacctggcc 2940
ggcagcccgg ccatcaagaa gggcatcctg cagaccgtga aggtggtgga cgagctggtg 3000
aaggtgatgg gcaggcacaa gccggagaac atcgtgatcg agatggccag ggagaaccag 3060
accacccaga agggccagaa gaacagcagg gagaggatga agaggatcga ggagggcatc 3120
aaggagctgg gcagccagat cctgaaggag cacccggtgg agaacaccca gctgcagaac 3180
gagaagctgt acctgtacta cctgcagaac ggcagggaca tgtacgtgga ccaggagctg 3240
gacatcaaca ggctgagcga ctacgacgtg gaccacatcg tgccgcagag cttcctgaag 3300
gacgacagca tcgacaacaa ggtgctgacc aggagcgaca agaacagggg caagagcgac 3360
aacgtgccga gcgaggaggt ggtgaagaag atgaaaaact actggaggca gctgctgaac 3420
gccaagctga tcacccagag gaagttcgac aacctgacca aggccgagag gggcggcctg 3480
agcgagctgg acaaggccgg cttcattaaa aggcagctgg tggagaccag gcagatcacc 3540
aagcacgtgg cccagatcct ggacagcagg atgaacacca agtacgacga gaacgacaag 3600
ctgatcaggg aggtgaaggt gatcaccctg aagagcaagc tggtgagcga cttcaggaag 3660
gacttccagt tctacaaggt gagggagatc aataattacc accacgccca cgacgcctac 3720
ctgaacgccg tggtgggcac cgccctgatt aaaaagtacc cgaagctgga gagcgagttc 3780
gtgtacggcg actacaaggt gtacgacgtg aggaagatga tcgccaagag cgagcaggag 3840
atcggcaagg ccaccgccaa gtacttcttc tacagcaaca tcatgaactt cttcaagacc 3900
gagatcaccc tggccaacgg cgagatcagg aagaggccgc tgatcgagac caacggcgag 3960
accggcgaga tcgtgtggga caagggcagg gacttcgcca ccgtgaggaa ggtgctgtcc 4020
atgccgcagg tgaacatcgt gaagaagacc gaggtgcaga ccggcggctt cagcaaggag 4080
agcatcctgc cgaagaggaa cagcgacaag ctgatcgcca ggaagaagga ctgggatccg 4140
aagaagtacg gcggcttcga cagcccgacc gtggcctaca gcgtgctggt ggtggccaag 4200
gtggagaagg gcaagagcaa gaagctgaag agcgtgaagg agctggtggg catcaccatc 4260
atggagagga gcagcttcga gaagaaccca gtggacttcc tggaggccaa gggctacaag 4320
gaggtgaaga aggacctgat cattaaactg ccgaagtaca gcctgttcga gctggagaac 4380
ggcaggaaga ggatgctggc cagcgccggc gagctgcaga agggcaacga gctggccctg 4440
ccgagcaagt acgtgaactt cctgtacctg gccagccact acgagaagct gaagggcagc 4500
ccggaggaca acgagcagaa gcagctgttc gtggagcagc acaagcacta cctggacgag 4560
atcatcgagc agatcagcga gttcagcaag agggtgatcc tggccgacgc caacctggac 4620
aaggtgctga gcgcctacaa caagcacagg gacaagccga tcagggagca ggccgagaac 4680
atcatccacc tgttcaccct gaccaacctg ggcgccccgg ccgccttcaa gtacttcgac 4740
accaccatcg acaggaagag gtacaccagc accaaggagg tgctggacgc caccctgatc 4800
caccagagca tcaccggcct gtacgagacc aggatcgacc tgagccagct gggcggcgac 4860
agcagcccgc cgaagaagaa gaggaaggtg agctggaagg acgccagcgg ctggagcagg 4920
atgaccaggg actccggcgg cagcaccaac ctctccgaca tcatcgagaa ggagacgggc 4980
aagcagctcg tgatccagga gagcatcctc atgctgccgg aggaggtgga ggaggtcatc 5040
ggcaacaagc ccgagtccga catcctcgtg cacacggcct acgacgagtc caccgacgag 5100
aacgtcatgc tcctgacctc ggacgctccc gagtacaagc catgggccct cgtgatccag 5160
gacagcaacg gcgagaacaa gatcaagatg ctctccggcg gcagcccgaa gaagaagcgc 5220
aaagtctga 5229
<210> 21
<211> 1742
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 21
Met Pro Lys Lys Lys Arg Lys Val Ser Ser Glu Thr Gly Pro Val Ala
1 5 10 15
Val Asp Pro Thr Leu Arg Arg Arg Ile Glu Pro His Glu Phe Glu Val
20 25 30
Phe Phe Asp Pro Arg Glu Leu Arg Lys Glu Thr Cys Leu Leu Tyr Glu
35 40 45
Ile Asn Trp Gly Gly Arg His Ser Ile Trp Arg His Thr Ser Gln Asn
50 55 60
Thr Asn Lys His Val Glu Val Asn Phe Ile Glu Lys Phe Thr Thr Glu
65 70 75 80
Arg Tyr Phe Cys Pro Asn Thr Arg Cys Ser Ile Thr Trp Phe Leu Ser
85 90 95
Trp Ser Pro Cys Gly Glu Cys Ser Arg Ala Ile Thr Glu Phe Leu Ser
100 105 110
Arg Tyr Pro His Val Thr Leu Phe Ile Tyr Ile Ala Arg Leu Tyr His
115 120 125
His Ala Asp Pro Arg Asn Arg Gln Gly Leu Arg Asp Leu Ile Ser Ser
130 135 140
Gly Val Thr Ile Gln Ile Met Thr Glu Gln Glu Ser Gly Tyr Cys Trp
145 150 155 160
Arg Asn Phe Val Asn Tyr Ser Pro Ser Asn Glu Ala His Trp Pro Arg
165 170 175
Tyr Pro His Leu Trp Val Arg Leu Tyr Val Leu Glu Leu Tyr Cys Ile
180 185 190
Ile Leu Gly Leu Pro Pro Cys Leu Asn Ile Leu Arg Arg Lys Gln Pro
195 200 205
Gln Leu Thr Phe Phe Thr Ile Ala Leu Gln Ser Cys His Tyr Gln Arg
210 215 220
Leu Pro Pro His Ile Leu Trp Ala Thr Gly Leu Lys Ser Gly Ser Glu
225 230 235 240
Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Leu Lys Asp Lys Lys
245 250 255
Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly Trp Ala Val
260 265 270
Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys Val Leu Gly
275 280 285
Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu
290 295 300
Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala
305 310 315 320
Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu
325 330 335
Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe Phe His Arg
340 345 350
Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His Glu Arg His
355 360 365
Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His Glu Lys Tyr
370 375 380
Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser Thr Asp Lys
385 390 395 400
Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met Ile Lys Phe
405 410 415
Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp
420 425 430
Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe
435 440 445
Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys Ala Ile Leu
450 455 460
Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln
465 470 475 480
Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu
485 490 495
Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp Leu Ala Glu
500 505 510
Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp Asp Leu Asp
515 520 525
Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu Phe Leu Ala
530 535 540
Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile Leu Arg Val
545 550 555 560
Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met Ile Lys Arg
565 570 575
Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys Ala Leu Val Arg
580 585 590
Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys
595 600 605
Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe
610 615 620
Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly Thr Glu Glu
625 630 635 640
Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr
645 650 655
Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu Gly Glu Leu His
660 665 670
Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn
675 680 685
Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val
690 695 700
Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met Thr Arg Lys
705 710 715 720
Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val Val Asp Lys
725 730 735
Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn Phe Asp Lys
740 745 750
Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu Leu Tyr Glu
755 760 765
Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr Val Thr Glu
770 775 780
Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile
785 790 795 800
Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val Lys Gln Leu
805 810 815
Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser Val Glu Ile
820 825 830
Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr Tyr His Asp
835 840 845
Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn
850 855 860
Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu Phe Glu Asp
865 870 875 880
Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His Leu Phe Asp
885 890 895
Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly
900 905 910
Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly
915 920 925
Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn
930 935 940
Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile
945 950 955 960
Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His Glu His Ile
965 970 975
Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile Leu Gln Thr
980 985 990
Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg His Lys Pro
995 1000 1005
Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr Thr Gln
1010 1015 1020
Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile Glu Glu
1025 1030 1035
Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro Val
1040 1045 1050
Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
1055 1060 1065
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn
1070 1075 1080
Arg Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe
1085 1090 1095
Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp
1100 1105 1110
Lys Asn Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val
1115 1120 1125
Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu
1130 1135 1140
Ile Thr Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly
1145 1150 1155
Gly Leu Ser Glu Leu Asp Lys Ala Gly Phe Ile Lys Arg Gln Leu
1160 1165 1170
Val Glu Thr Arg Gln Ile Thr Lys His Val Ala Gln Ile Leu Asp
1175 1180 1185
Ser Arg Met Asn Thr Lys Tyr Asp Glu Asn Asp Lys Leu Ile Arg
1190 1195 1200
Glu Val Lys Val Ile Thr Leu Lys Ser Lys Leu Val Ser Asp Phe
1205 1210 1215
Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu Ile Asn Asn Tyr
1220 1225 1230
His His Ala His Asp Ala Tyr Leu Asn Ala Val Val Gly Thr Ala
1235 1240 1245
Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val Tyr Gly
1250 1255 1260
Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys Ser Glu
1265 1270 1275
Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr Ser Asn
1280 1285 1290
Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn Gly Glu
1295 1300 1305
Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr Gly Glu
1310 1315 1320
Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg Lys Val
1325 1330 1335
Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu Val Gln
1340 1345 1350
Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg Asn Ser
1355 1360 1365
Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys Lys Tyr
1370 1375 1380
Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu Val Val
1385 1390 1395
Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser Val Lys
1400 1405 1410
Glu Leu Val Gly Ile Thr Ile Met Glu Arg Ser Ser Phe Glu Lys
1415 1420 1425
Asn Pro Val Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu Val Lys
1430 1435 1440
Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe Glu Leu
1445 1450 1455
Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly Glu Leu Gln
1460 1465 1470
Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn Phe Leu
1475 1480 1485
Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro Glu Asp
1490 1495 1500
Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His Tyr Leu
1505 1510 1515
Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg Val Ile
1520 1525 1530
Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr Asn Lys
1535 1540 1545
His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile Ile His
1550 1555 1560
Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe Lys Tyr
1565 1570 1575
Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr Lys Glu
1580 1585 1590
Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly Leu Tyr
1595 1600 1605
Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Ser Ser Pro
1610 1615 1620
Pro Lys Lys Lys Arg Lys Val Ser Trp Lys Asp Ala Ser Gly Trp
1625 1630 1635
Ser Arg Met Thr Arg Asp Ser Gly Gly Ser Thr Asn Leu Ser Asp
1640 1645 1650
Ile Ile Glu Lys Glu Thr Gly Lys Gln Leu Val Ile Gln Glu Ser
1655 1660 1665
Ile Leu Met Leu Pro Glu Glu Val Glu Glu Val Ile Gly Asn Lys
1670 1675 1680
Pro Glu Ser Asp Ile Leu Val His Thr Ala Tyr Asp Glu Ser Thr
1685 1690 1695
Asp Glu Asn Val Met Leu Leu Thr Ser Asp Ala Pro Glu Tyr Lys
1700 1705 1710
Pro Trp Ala Leu Val Ile Gln Asp Ser Asn Gly Glu Asn Lys Ile
1715 1720 1725
Lys Met Leu Ser Gly Gly Ser Pro Lys Lys Lys Arg Lys Val
1730 1735 1740
<210> 22
<211> 1316
<212> PRT
<213> Acidaminococcus fermentans
<400> 22
Met Thr Gln Phe Glu Gly Phe Thr Asn Leu Tyr Gln Val Ser Lys Thr
1 5 10 15
Leu Arg Phe Glu Leu Ile Pro Gln Gly Lys Thr Leu Lys His Ile Gln
20 25 30
Glu Gln Gly Phe Ile Glu Glu Asp Lys Ala Arg Asn Asp His Tyr Lys
35 40 45
Glu Leu Lys Pro Ile Ile Asp Arg Ile Tyr Lys Thr Tyr Ala Asp Gln
50 55 60
Cys Leu Gln Leu Val Gln Leu Asp Trp Glu Asn Leu Ser Ala Ala Ile
65 70 75 80
Asp Ser Tyr Arg Lys Glu Lys Thr Glu Glu Thr Arg Asn Ala Leu Ile
85 90 95
Glu Glu Gln Ala Thr Tyr Arg Asn Ala Ile His Asp Tyr Phe Ile Gly
100 105 110
Arg Thr Asp Asn Leu Thr Asp Ala Ile Asn Lys Arg His Ala Glu Ile
115 120 125
Tyr Lys Gly Leu Phe Lys Ala Glu Leu Phe Asn Gly Lys Val Leu Lys
130 135 140
Gln Leu Gly Thr Val Thr Thr Thr Glu His Glu Asn Ala Leu Leu Arg
145 150 155 160
Ser Phe Asp Lys Phe Thr Thr Tyr Phe Ser Gly Phe Tyr Glu Asn Arg
165 170 175
Lys Asn Val Phe Ser Ala Glu Asp Ile Ser Thr Ala Ile Pro His Arg
180 185 190
Ile Val Gln Asp Asn Phe Pro Lys Phe Lys Glu Asn Cys His Ile Phe
195 200 205
Thr Arg Leu Ile Thr Ala Val Pro Ser Leu Arg Glu His Phe Glu Asn
210 215 220
Val Lys Lys Ala Ile Gly Ile Phe Val Ser Thr Ser Ile Glu Glu Val
225 230 235 240
Phe Ser Phe Pro Phe Tyr Asn Gln Leu Leu Thr Gln Thr Gln Ile Asp
245 250 255
Leu Tyr Asn Gln Leu Leu Gly Gly Ile Ser Arg Glu Ala Gly Thr Glu
260 265 270
Lys Ile Lys Gly Leu Asn Glu Val Leu Asn Leu Ala Ile Gln Lys Asn
275 280 285
Asp Glu Thr Ala His Ile Ile Ala Ser Leu Pro His Arg Phe Ile Pro
290 295 300
Leu Phe Lys Gln Ile Leu Ser Asp Arg Asn Thr Leu Ser Phe Ile Leu
305 310 315 320
Glu Glu Phe Lys Ser Asp Glu Glu Val Ile Gln Ser Phe Cys Lys Tyr
325 330 335
Lys Thr Leu Leu Arg Asn Glu Asn Val Leu Glu Thr Ala Glu Ala Leu
340 345 350
Phe Asn Glu Leu Asn Ser Ile Asp Leu Thr His Ile Phe Ile Ser His
355 360 365
Lys Lys Leu Glu Thr Ile Ser Ser Ala Leu Cys Asp His Trp Asp Thr
370 375 380
Leu Arg Asn Ala Leu Tyr Glu Arg Arg Ile Ser Glu Leu Thr Gly Lys
385 390 395 400
Ile Thr Lys Ser Ala Lys Glu Lys Val Gln Arg Ser Leu Lys His Glu
405 410 415
Asp Ile Asn Leu Gln Glu Ile Ile Ser Ala Ala Gly Lys Glu Leu Ser
420 425 430
Glu Ala Phe Lys Gln Lys Thr Ser Glu Ile Leu Ser His Ala His Ala
435 440 445
Ala Leu Asp Gln Pro Leu Pro Thr Thr Leu Lys Lys Gln Glu Glu Lys
450 455 460
Glu Ile Leu Lys Ser Gln Leu Asp Ser Leu Leu Gly Leu Tyr His Leu
465 470 475 480
Leu Asp Trp Phe Ala Val Asp Glu Ser Asn Glu Val Asp Pro Glu Phe
485 490 495
Ser Ala Arg Leu Thr Gly Ile Lys Leu Glu Met Glu Pro Ser Leu Ser
500 505 510
Phe Tyr Asn Lys Ala Arg Asn Tyr Ala Thr Lys Lys Pro Tyr Ser Val
515 520 525
Glu Lys Phe Lys Leu Asn Phe Gln Met Pro Thr Leu Ala Ser Gly Trp
530 535 540
Asp Val Asn Lys Glu Lys Asn Asn Gly Ala Ile Leu Phe Val Lys Asn
545 550 555 560
Gly Leu Tyr Tyr Leu Gly Ile Met Pro Lys Gln Lys Gly Arg Tyr Lys
565 570 575
Ala Leu Ser Phe Glu Pro Thr Glu Lys Thr Ser Glu Gly Phe Asp Lys
580 585 590
Met Tyr Tyr Asp Tyr Phe Pro Asp Ala Ala Lys Met Ile Pro Lys Cys
595 600 605
Ser Thr Gln Leu Lys Ala Val Thr Ala His Phe Gln Thr His Thr Thr
610 615 620
Pro Ile Leu Leu Ser Asn Asn Phe Ile Glu Pro Leu Glu Ile Thr Lys
625 630 635 640
Glu Ile Tyr Asp Leu Asn Asn Pro Glu Lys Glu Pro Lys Lys Phe Gln
645 650 655
Thr Ala Tyr Ala Lys Lys Thr Gly Asp Gln Lys Gly Tyr Arg Glu Ala
660 665 670
Leu Cys Lys Trp Ile Asp Phe Thr Arg Asp Phe Leu Ser Lys Tyr Thr
675 680 685
Lys Thr Thr Ser Ile Asp Leu Ser Ser Leu Arg Pro Ser Ser Gln Tyr
690 695 700
Lys Asp Leu Gly Glu Tyr Tyr Ala Glu Leu Asn Pro Leu Leu Tyr His
705 710 715 720
Ile Ser Phe Gln Arg Ile Ala Glu Lys Glu Ile Met Asp Ala Val Glu
725 730 735
Thr Gly Lys Leu Tyr Leu Phe Gln Ile Tyr Asn Lys Asp Phe Ala Lys
740 745 750
Gly His His Gly Lys Pro Asn Leu His Thr Leu Tyr Trp Thr Gly Leu
755 760 765
Phe Ser Pro Glu Asn Leu Ala Lys Thr Ser Ile Lys Leu Asn Gly Gln
770 775 780
Ala Glu Leu Phe Tyr Arg Pro Lys Ser Arg Met Lys Arg Met Ala His
785 790 795 800
Arg Leu Gly Glu Lys Met Leu Asn Lys Lys Leu Lys Asp Gln Lys Thr
805 810 815
Pro Ile Pro Asp Thr Leu Tyr Gln Glu Leu Tyr Asp Tyr Val Asn His
820 825 830
Arg Leu Ser His Asp Leu Ser Asp Glu Ala Arg Ala Leu Leu Pro Asn
835 840 845
Val Ile Thr Lys Glu Val Ser His Glu Ile Ile Lys Asp Arg Arg Phe
850 855 860
Thr Ser Asp Lys Phe Phe Phe His Val Pro Ile Thr Leu Asn Tyr Gln
865 870 875 880
Ala Ala Asn Ser Pro Ser Lys Phe Asn Gln Arg Val Asn Ala Tyr Leu
885 890 895
Lys Glu His Pro Glu Thr Pro Ile Ile Gly Ile Ala Arg Gly Glu Arg
900 905 910
Asn Leu Ile Tyr Ile Thr Val Ile Asp Ser Thr Gly Lys Ile Leu Glu
915 920 925
Gln Arg Ser Leu Asn Thr Ile Gln Gln Phe Asp Tyr Gln Lys Lys Leu
930 935 940
Asp Asn Arg Glu Lys Glu Arg Val Ala Ala Arg Gln Ala Trp Ser Val
945 950 955 960
Val Gly Thr Ile Lys Asp Leu Lys Gln Gly Tyr Leu Ser Gln Val Ile
965 970 975
His Glu Ile Val Asp Leu Met Ile His Tyr Gln Ala Val Val Val Leu
980 985 990
Ala Asn Leu Asn Phe Gly Phe Lys Ser Lys Arg Thr Gly Ile Ala Glu
995 1000 1005
Lys Ala Val Tyr Gln Gln Phe Glu Lys Met Leu Ile Asp Lys Leu
1010 1015 1020
Asn Cys Leu Val Leu Lys Asp Tyr Pro Ala Glu Lys Val Gly Gly
1025 1030 1035
Val Leu Asn Pro Tyr Gln Leu Thr Asp Gln Phe Thr Ser Phe Ala
1040 1045 1050
Lys Met Gly Thr Gln Ser Gly Phe Leu Phe Tyr Val Pro Ala Pro
1055 1060 1065
Tyr Thr Ser Lys Ile Asp Pro Leu Thr Gly Phe Val Asp Pro Phe
1070 1075 1080
Val Trp Lys Thr Ile Lys Asn His Glu Ser Arg Lys His Phe Leu
1085 1090 1095
Glu Gly Phe Asp Phe Leu His Tyr Asp Val Lys Thr Gly Asp Phe
1100 1105 1110
Ile Leu His Phe Lys Met Asn Arg Asn Leu Ser Phe Gln Arg Gly
1115 1120 1125
Leu Pro Gly Phe Met Pro Ala Trp Asp Ile Val Phe Glu Lys Asn
1130 1135 1140
Glu Thr Gln Phe Asp Ala Lys Gly Thr Pro Phe Ile Ala Gly Lys
1145 1150 1155
Arg Ile Val Pro Val Ile Glu Asn His Arg Phe Thr Gly Arg Tyr
1160 1165 1170
Arg Asp Leu Tyr Pro Ala Asn Glu Leu Ile Ala Leu Leu Glu Glu
1175 1180 1185
Lys Gly Ile Val Phe Arg Asp Gly Ser Asn Ile Leu Pro Lys Leu
1190 1195 1200
Leu Glu Asn Asp Asp Ser His Ala Ile Asp Thr Met Val Ala Leu
1205 1210 1215
Ile Arg Ser Val Leu Gln Met Arg Asn Ser Asn Ala Ala Thr Gly
1220 1225 1230
Glu Ala Tyr Ile Asn Ser Pro Val Arg Asp Leu Asn Gly Val Cys
1235 1240 1245
Phe Asp Ser Arg Phe Gln Asn Pro Glu Trp Pro Met Asp Ala Asp
1250 1255 1260
Ala Asn Gly Ala Tyr His Ile Ala Leu Lys Gly Gln Leu Leu Leu
1265 1270 1275
Asn His Leu Lys Glu Ser Lys Asp Leu Lys Leu Gln Asn Gly Ile
1280 1285 1290
Ser Asn Gln Asp Trp Leu Ala Tyr Ile Gln Glu Leu Arg Asn Gly
1295 1300 1305
Ser Pro Lys Lys Lys Arg Lys Val
1310 1315
<210> 23
<211> 4809
<212> DNA
<213> Artificial Sequence
<220>
<223> Codon optimized
<400> 23
atgccgaaga agaagcgcaa ggtcatgtcc agcgagaccg gccccgtggc ggtggacccc 60
accctgcgca ggcgcatcga gccgcacgag ttcgaggtgt tcttcgaccc cagggagctc 120
cgcaaggaga cctgcctcct gtacgagatc aactggggcg gcaggcactc catctggagg 180
cacacgagcc agaacaccaa caagcacgtc gaggtgaact tcatcgagaa gttcaccacg 240
gagaggtact tctgcccgaa cacgcgctgc tccatcacgt ggttcctctc gtggagccca 300
tgcggcgagt gctccagggc gatcacggag ttcctcagcc gctacccgca cgtgaccctg 360
ttcatctaca tcgctaggct ctaccaccac gcggacccca ggaacaggca gggcctcagg 420
gacctgatct ccagcggcgt cacgatccag atcatgaccg agcaggagtc cggctactgc 480
tggaggaact tcgtgaacta ctccccgagc aacgaggccc actggccccg ctacccgcac 540
ctctgggtcc gcctctacgt gctcgagctg tactgcatca tcctcggcct gccgccctgc 600
ctcaacatcc tgaggcgcaa gcagccccag ctgacgttct tcaccatcgc cctgcagagc 660
tgccactacc agaggctccc gccccacatc ctgtgggcga ccgggctcaa ggggggcggg 720
ggctcaggcg ggggcgggag cggcggcggg ggctctgggg gcggcggcag cggcgggggc 780
ggcagcgggg gcggcgggtc gatgagcaag ctggagaagt tcacgaactg ctactccctc 840
agcaagaccc tgaggttcaa ggcgatcccg gtcggcaaga cccaggagaa catcgacaac 900
aagcggctgc tggtggagga cgagaagagg gctgaggact acaagggcgt gaagaagctc 960
ctggaccgct actacctgtc cttcatcaac gacgtgctcc acagcatcaa gctcaagaac 1020
ctgaacaact acatcagcct cttcaggaag aagacgcgca ccgagaagga gaacaaggag 1080
ctcgagaacc tggagatcaa cctgaggaag gagatcgcca aggcgttcaa gggcaacgag 1140
ggctacaagt ccctcttcaa gaaggacatc atcgagacga tcctcccgga gttcctggac 1200
gacaaggacg agatcgccct ggtcaactcc ttcaacggct tcaccacggc gttcaccggc 1260
ttcttcgaca accgcgagaa catgttcagc gaggaggcca agtccacgag catcgcgttc 1320
aggtgcatca acgagaacct cacccgctac atctccaaca tggacatctt cgagaaggtc 1380
gacgcgatct tcgacaagca cgaggtgcag gagatcaagg agaagatcct gaacagcgac 1440
tacgacgtcg aggacttctt cgagggcgag ttcttcaact tcgtcctcac gcaggagggc 1500
atcgacgtgt acaacgccat catcggtggc ttcgtgaccg agtccggcga gaagatcaag 1560
ggcctgaacg agtacatcaa cctctacaac cagaagacca agcagaagct gccgaagttc 1620
aagcccctgt acaagcaggt gctctccgac agggagtccc tcagcttcta cggcgagggc 1680
tacacgagcg acgaggaggt cctggaggtg ttccgcaaca ccctcaacaa gaacagcgag 1740
atcttctcca gcatcaagaa gctcgagaag ctgttcaaga acttcgacga gtactccagc 1800
gccggcatct tcgtcaagaa cggcccggcg atctccacga tcagcaagga catcttcggc 1860
gagtggaacg tgatccgcga caagtggaac gccgagtacg acgacatcca cctcaagaag 1920
aaggcggtgg tcaccgagaa gtacgaggac gacaggcgca agtccttcaa gaagatcggc 1980
tccttcagcc tcgagcagct gcaggagtac gccgacgcgg acctgagcgt ggtcgagaag 2040
ctcaaggaga tcatcatcca gaaggtcgac gagatctaca aggtgtacgg ctccagcgag 2100
aagctcttcg acgcggactt cgtcctcgag aagtccctga agaagaacga cgccgtggtc 2160
gcgatcatga aggacctcct ggactccgtg aagagcttcg agaattacat caaggccttc 2220
ttcggcgagg gcaaggagac gaacagggac gagtccttct acggcgactt cgtcctggcc 2280
tacgacatcc tcctgaaggt ggaccacatc tacgacgcga tccgcaacta cgtgacccag 2340
aagccgtaca gcaaggacaa gttcaagctc tacttccaga acccccagtt catgggcggc 2400
tgggacaagg acaaggagac ggactacagg gcgaccatcc tgcgctacgg cagcaagtac 2460
tacctcgcca tcatggacaa gaagtacgcg aagtgcctgc agaagatcga caaggacgac 2520
gtcaacggca actacgagaa gatcaactac aagctcctgc cgggccccaa caagatgctc 2580
ccgaaggtgt tcttctccaa gaagtggatg gcctactaca accccagcga ggacatccag 2640
aagatctaca agaacggcac gttcaagaag ggcgacatgt tcaacctgaa cgactgccac 2700
aagctcatcg acttcttcaa ggactccatc agccgctacc cgaagtggtc caacgcctac 2760
gacttcaact tcagcgagac cgagaagtac aaggacatcg cgggcttcta ccgcgaggtc 2820
gaggagcagg gctacaaggt gtccttcgag tccgccagca agaaggaggt cgacaagctg 2880
gtggaggagg gcaagctcta catgttccag atctacaaca aggacttctc cgacaagagc 2940
cacggcacgc ccaacctgca caccatgtac ttcaagctcc tgttcgacga gaacaaccac 3000
ggccagatca ggctgtccgg cggcgccgag ctcttcatga ggagggcgag cctgaagaag 3060
gaggagctgg tggtccaccc cgctaacagc ccaatcgcga acaagaaccc ggacaacccc 3120
aagaagacca cgaccctgtc ctacgacgtg tacaaggaca agaggttcag cgaggaccag 3180
tacgagctcc acatcccgat cgcgatcaac aagtgcccca agaacatctt caagatcaac 3240
accgaggtcc gcgtgctcct gaagcacgac gacaacccct acgtgatcgg catcgctagg 3300
ggcgagagga acctcctgta catcgtggtc gtggacggca agggcaacat cgtggagcag 3360
tactccctca acgagatcat caacaacttc aacggcatca ggatcaagac ggactaccac 3420
agcctcctgg acaagaagga gaaggagagg ttcgaggccc gccagaactg gacctccatc 3480
gagaacatca aggagctgaa ggcgggctac atcagccagg tcgtgcacaa gatctgcgag 3540
ctcgtcgaga agtacgacgc cgtgatcgcc ctcgcggacc tgaactccgg cttcaagaac 3600
agccgcgtca aggtggagaa gcaggtctac cagaagttcg agaagatgct catcgacaag 3660
ctgaactaca tggtggacaa gaagtccaac ccctgcgcta cgggcggcgc gctgaagggc 3720
taccagatca ccaacaagtt cgagagcttc aagtccatga gcactcagaa cggcttcatc 3780
ttctacatcc cggcgtggct cacgtccaag atcgacccca gcaccggctt cgtcaacctc 3840
ctgaagacga agtacacctc catcgccgac agcaagaagt tcatctccag cttcgaccgc 3900
atcatgtatg tgccggagga ggacctgttc gagttcgccc tcgactacaa gaacttctcc 3960
cgcacggacg cggactacat caagaagtgg aagctgtaca gctacggcaa ccgcatccgc 4020
atcttcagga accccaagaa gaacaacgtc ttcgactggg aggaggtgtg cctgacctcc 4080
gcgtacaagg agctcttcaa caagtacggc atcaactacc agcagggcga catcagggct 4140
ctcctgtgcg agcagagcga caaggccttc tactccagct tcatggcgct gatgtccctc 4200
atgctgcaga tgaggaactc gatcaccggc aggacggacg tggccttcct catctccccg 4260
gtgaagaaca gcgacggcat cttctacgac tccaggaact acgaggccca ggagaacgcg 4320
atcctcccaa agaacgcgga cgccaacggc gcctacaaca tcgccaggaa ggtcctctgg 4380
gctatcggcc agttcaagaa ggcggaggac gagaagctgg acaaggtgaa gatcgccatc 4440
agcaacaagg agtggctcga gtacgcccag acctcggtca agcacggcag cccgaagaag 4500
aagcgcaagg tgtccggcgg cagcacgaac ctgtccgaca tcatcgagaa ggagaccggc 4560
aagcagctcg tgatccagga gagcatcctc atgctgccgg aggaggtcga ggaggtcatc 4620
ggcaacaagc ccgagtccga catcctcgtc cacacggcct acgacgagtc caccgacgag 4680
aacgtgatgc tcctgacctc ggacgctccc gagtacaagc catgggccct ggtcatccag 4740
gacagcaacg gcgagaacaa gatcaagatg ctctccggcg gcagcccgaa gaagaagcgc 4800
aaagtgtga 4809
<210> 24
<211> 1602
<212> PRT
<213> Artificial Sequence
<220>
<223> Fusion protein
<400> 24
Met Pro Lys Lys Lys Arg Lys Val Met Ser Ser Glu Thr Gly Pro Val
1 5 10 15
Ala Val Asp Pro Thr Leu Arg Arg Arg Ile Glu Pro His Glu Phe Glu
20 25 30
Val Phe Phe Asp Pro Arg Glu Leu Arg Lys Glu Thr Cys Leu Leu Tyr
35 40 45
Glu Ile Asn Trp Gly Gly Arg His Ser Ile Trp Arg His Thr Ser Gln
50 55 60
Asn Thr Asn Lys His Val Glu Val Asn Phe Ile Glu Lys Phe Thr Thr
65 70 75 80
Glu Arg Tyr Phe Cys Pro Asn Thr Arg Cys Ser Ile Thr Trp Phe Leu
85 90 95
Ser Trp Ser Pro Cys Gly Glu Cys Ser Arg Ala Ile Thr Glu Phe Leu
100 105 110
Ser Arg Tyr Pro His Val Thr Leu Phe Ile Tyr Ile Ala Arg Leu Tyr
115 120 125
His His Ala Asp Pro Arg Asn Arg Gln Gly Leu Arg Asp Leu Ile Ser
130 135 140
Ser Gly Val Thr Ile Gln Ile Met Thr Glu Gln Glu Ser Gly Tyr Cys
145 150 155 160
Trp Arg Asn Phe Val Asn Tyr Ser Pro Ser Asn Glu Ala His Trp Pro
165 170 175
Arg Tyr Pro His Leu Trp Val Arg Leu Tyr Val Leu Glu Leu Tyr Cys
180 185 190
Ile Ile Leu Gly Leu Pro Pro Cys Leu Asn Ile Leu Arg Arg Lys Gln
195 200 205
Pro Gln Leu Thr Phe Phe Thr Ile Ala Leu Gln Ser Cys His Tyr Gln
210 215 220
Arg Leu Pro Pro His Ile Leu Trp Ala Thr Gly Leu Lys Gly Gly Gly
225 230 235 240
Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly
245 250 255
Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Met Ser Lys Leu Glu
260 265 270
Lys Phe Thr Asn Cys Tyr Ser Leu Ser Lys Thr Leu Arg Phe Lys Ala
275 280 285
Ile Pro Val Gly Lys Thr Gln Glu Asn Ile Asp Asn Lys Arg Leu Leu
290 295 300
Val Glu Asp Glu Lys Arg Ala Glu Asp Tyr Lys Gly Val Lys Lys Leu
305 310 315 320
Leu Asp Arg Tyr Tyr Leu Ser Phe Ile Asn Asp Val Leu His Ser Ile
325 330 335
Lys Leu Lys Asn Leu Asn Asn Tyr Ile Ser Leu Phe Arg Lys Lys Thr
340 345 350
Arg Thr Glu Lys Glu Asn Lys Glu Leu Glu Asn Leu Glu Ile Asn Leu
355 360 365
Arg Lys Glu Ile Ala Lys Ala Phe Lys Gly Asn Glu Gly Tyr Lys Ser
370 375 380
Leu Phe Lys Lys Asp Ile Ile Glu Thr Ile Leu Pro Glu Phe Leu Asp
385 390 395 400
Asp Lys Asp Glu Ile Ala Leu Val Asn Ser Phe Asn Gly Phe Thr Thr
405 410 415
Ala Phe Thr Gly Phe Phe Asp Asn Arg Glu Asn Met Phe Ser Glu Glu
420 425 430
Ala Lys Ser Thr Ser Ile Ala Phe Arg Cys Ile Asn Glu Asn Leu Thr
435 440 445
Arg Tyr Ile Ser Asn Met Asp Ile Phe Glu Lys Val Asp Ala Ile Phe
450 455 460
Asp Lys His Glu Val Gln Glu Ile Lys Glu Lys Ile Leu Asn Ser Asp
465 470 475 480
Tyr Asp Val Glu Asp Phe Phe Glu Gly Glu Phe Phe Asn Phe Val Leu
485 490 495
Thr Gln Glu Gly Ile Asp Val Tyr Asn Ala Ile Ile Gly Gly Phe Val
500 505 510
Thr Glu Ser Gly Glu Lys Ile Lys Gly Leu Asn Glu Tyr Ile Asn Leu
515 520 525
Tyr Asn Gln Lys Thr Lys Gln Lys Leu Pro Lys Phe Lys Pro Leu Tyr
530 535 540
Lys Gln Val Leu Ser Asp Arg Glu Ser Leu Ser Phe Tyr Gly Glu Gly
545 550 555 560
Tyr Thr Ser Asp Glu Glu Val Leu Glu Val Phe Arg Asn Thr Leu Asn
565 570 575
Lys Asn Ser Glu Ile Phe Ser Ser Ile Lys Lys Leu Glu Lys Leu Phe
580 585 590
Lys Asn Phe Asp Glu Tyr Ser Ser Ala Gly Ile Phe Val Lys Asn Gly
595 600 605
Pro Ala Ile Ser Thr Ile Ser Lys Asp Ile Phe Gly Glu Trp Asn Val
610 615 620
Ile Arg Asp Lys Trp Asn Ala Glu Tyr Asp Asp Ile His Leu Lys Lys
625 630 635 640
Lys Ala Val Val Thr Glu Lys Tyr Glu Asp Asp Arg Arg Lys Ser Phe
645 650 655
Lys Lys Ile Gly Ser Phe Ser Leu Glu Gln Leu Gln Glu Tyr Ala Asp
660 665 670
Ala Asp Leu Ser Val Val Glu Lys Leu Lys Glu Ile Ile Ile Gln Lys
675 680 685
Val Asp Glu Ile Tyr Lys Val Tyr Gly Ser Ser Glu Lys Leu Phe Asp
690 695 700
Ala Asp Phe Val Leu Glu Lys Ser Leu Lys Lys Asn Asp Ala Val Val
705 710 715 720
Ala Ile Met Lys Asp Leu Leu Asp Ser Val Lys Ser Phe Glu Asn Tyr
725 730 735
Ile Lys Ala Phe Phe Gly Glu Gly Lys Glu Thr Asn Arg Asp Glu Ser
740 745 750
Phe Tyr Gly Asp Phe Val Leu Ala Tyr Asp Ile Leu Leu Lys Val Asp
755 760 765
His Ile Tyr Asp Ala Ile Arg Asn Tyr Val Thr Gln Lys Pro Tyr Ser
770 775 780
Lys Asp Lys Phe Lys Leu Tyr Phe Gln Asn Pro Gln Phe Met Gly Gly
785 790 795 800
Trp Asp Lys Asp Lys Glu Thr Asp Tyr Arg Ala Thr Ile Leu Arg Tyr
805 810 815
Gly Ser Lys Tyr Tyr Leu Ala Ile Met Asp Lys Lys Tyr Ala Lys Cys
820 825 830
Leu Gln Lys Ile Asp Lys Asp Asp Val Asn Gly Asn Tyr Glu Lys Ile
835 840 845
Asn Tyr Lys Leu Leu Pro Gly Pro Asn Lys Met Leu Pro Lys Val Phe
850 855 860
Phe Ser Lys Lys Trp Met Ala Tyr Tyr Asn Pro Ser Glu Asp Ile Gln
865 870 875 880
Lys Ile Tyr Lys Asn Gly Thr Phe Lys Lys Gly Asp Met Phe Asn Leu
885 890 895
Asn Asp Cys His Lys Leu Ile Asp Phe Phe Lys Asp Ser Ile Ser Arg
900 905 910
Tyr Pro Lys Trp Ser Asn Ala Tyr Asp Phe Asn Phe Ser Glu Thr Glu
915 920 925
Lys Tyr Lys Asp Ile Ala Gly Phe Tyr Arg Glu Val Glu Glu Gln Gly
930 935 940
Tyr Lys Val Ser Phe Glu Ser Ala Ser Lys Lys Glu Val Asp Lys Leu
945 950 955 960
Val Glu Glu Gly Lys Leu Tyr Met Phe Gln Ile Tyr Asn Lys Asp Phe
965 970 975
Ser Asp Lys Ser His Gly Thr Pro Asn Leu His Thr Met Tyr Phe Lys
980 985 990
Leu Leu Phe Asp Glu Asn Asn His Gly Gln Ile Arg Leu Ser Gly Gly
995 1000 1005
Ala Glu Leu Phe Met Arg Arg Ala Ser Leu Lys Lys Glu Glu Leu
1010 1015 1020
Val Val His Pro Ala Asn Ser Pro Ile Ala Asn Lys Asn Pro Asp
1025 1030 1035
Asn Pro Lys Lys Thr Thr Thr Leu Ser Tyr Asp Val Tyr Lys Asp
1040 1045 1050
Lys Arg Phe Ser Glu Asp Gln Tyr Glu Leu His Ile Pro Ile Ala
1055 1060 1065
Ile Asn Lys Cys Pro Lys Asn Ile Phe Lys Ile Asn Thr Glu Val
1070 1075 1080
Arg Val Leu Leu Lys His Asp Asp Asn Pro Tyr Val Ile Gly Ile
1085 1090 1095
Ala Arg Gly Glu Arg Asn Leu Leu Tyr Ile Val Val Val Asp Gly
1100 1105 1110
Lys Gly Asn Ile Val Glu Gln Tyr Ser Leu Asn Glu Ile Ile Asn
1115 1120 1125
Asn Phe Asn Gly Ile Arg Ile Lys Thr Asp Tyr His Ser Leu Leu
1130 1135 1140
Asp Lys Lys Glu Lys Glu Arg Phe Glu Ala Arg Gln Asn Trp Thr
1145 1150 1155
Ser Ile Glu Asn Ile Lys Glu Leu Lys Ala Gly Tyr Ile Ser Gln
1160 1165 1170
Val Val His Lys Ile Cys Glu Leu Val Glu Lys Tyr Asp Ala Val
1175 1180 1185
Ile Ala Leu Ala Asp Leu Asn Ser Gly Phe Lys Asn Ser Arg Val
1190 1195 1200
Lys Val Glu Lys Gln Val Tyr Gln Lys Phe Glu Lys Met Leu Ile
1205 1210 1215
Asp Lys Leu Asn Tyr Met Val Asp Lys Lys Ser Asn Pro Cys Ala
1220 1225 1230
Thr Gly Gly Ala Leu Lys Gly Tyr Gln Ile Thr Asn Lys Phe Glu
1235 1240 1245
Ser Phe Lys Ser Met Ser Thr Gln Asn Gly Phe Ile Phe Tyr Ile
1250 1255 1260
Pro Ala Trp Leu Thr Ser Lys Ile Asp Pro Ser Thr Gly Phe Val
1265 1270 1275
Asn Leu Leu Lys Thr Lys Tyr Thr Ser Ile Ala Asp Ser Lys Lys
1280 1285 1290
Phe Ile Ser Ser Phe Asp Arg Ile Met Tyr Val Pro Glu Glu Asp
1295 1300 1305
Leu Phe Glu Phe Ala Leu Asp Tyr Lys Asn Phe Ser Arg Thr Asp
1310 1315 1320
Ala Asp Tyr Ile Lys Lys Trp Lys Leu Tyr Ser Tyr Gly Asn Arg
1325 1330 1335
Ile Arg Ile Phe Arg Asn Pro Lys Lys Asn Asn Val Phe Asp Trp
1340 1345 1350
Glu Glu Val Cys Leu Thr Ser Ala Tyr Lys Glu Leu Phe Asn Lys
1355 1360 1365
Tyr Gly Ile Asn Tyr Gln Gln Gly Asp Ile Arg Ala Leu Leu Cys
1370 1375 1380
Glu Gln Ser Asp Lys Ala Phe Tyr Ser Ser Phe Met Ala Leu Met
1385 1390 1395
Ser Leu Met Leu Gln Met Arg Asn Ser Ile Thr Gly Arg Thr Asp
1400 1405 1410
Val Ala Phe Leu Ile Ser Pro Val Lys Asn Ser Asp Gly Ile Phe
1415 1420 1425
Tyr Asp Ser Arg Asn Tyr Glu Ala Gln Glu Asn Ala Ile Leu Pro
1430 1435 1440
Lys Asn Ala Asp Ala Asn Gly Ala Tyr Asn Ile Ala Arg Lys Val
1445 1450 1455
Leu Trp Ala Ile Gly Gln Phe Lys Lys Ala Glu Asp Glu Lys Leu
1460 1465 1470
Asp Lys Val Lys Ile Ala Ile Ser Asn Lys Glu Trp Leu Glu Tyr
1475 1480 1485
Ala Gln Thr Ser Val Lys His Gly Ser Pro Lys Lys Lys Arg Lys
1490 1495 1500
Val Ser Gly Gly Ser Thr Asn Leu Ser Asp Ile Ile Glu Lys Glu
1505 1510 1515
Thr Gly Lys Gln Leu Val Ile Gln Glu Ser Ile Leu Met Leu Pro
1520 1525 1530
Glu Glu Val Glu Glu Val Ile Gly Asn Lys Pro Glu Ser Asp Ile
1535 1540 1545
Leu Val His Thr Ala Tyr Asp Glu Ser Thr Asp Glu Asn Val Met
1550 1555 1560
Leu Leu Thr Ser Asp Ala Pro Glu Tyr Lys Pro Trp Ala Leu Val
1565 1570 1575
Ile Gln Asp Ser Asn Gly Glu Asn Lys Ile Lys Met Leu Ser Gly
1580 1585 1590
Gly Ser Pro Lys Lys Lys Arg Lys Val
1595 1600
<210> 25
<211> 1802
<212> DNA
<213> Sugarcane (Saccharum officinarum)
<400> 25
gaattcatta tgtggtctag gtaggttcta tatataagaa aacttgaaat gttctaaaaa 60
aaaattcaag cccatgcatg attgaagcaa acggtatagc aacggtgtta acctgatcta 120
gtgatctctt gcaatcctta acggccacct accgcaggta gcaaacggcg tccccctcct 180
cgatatctcc gcggcgacct ctggcttttt ccgcggaatt gcgcggtggg gacggattcc 240
acgagaccgc gacgcaaccg cctctcgccg ctgggcccca caccgctcgg tgccgtagcc 300
tcacgggact ctttctccct cctcccccgt tataaattgg cttcatcccc tccttgcctc 360
atccatccaa atcccagtcc ccaatcccat cccttcgtag gagaaattca tcgaagctaa 420
gcgaatcctc gcgatcctct caaggtactg cgagttttcg atccccctct cgacccctcg 480
tatgtttgtg tttgtcgtag cgtttgatta ggtatgcttt ccctgtttgt gttcgtcgta 540
gcgtttgatt aggtatgctt tccctgttcg tgttcatcgt agtgtttgat taggtcgtgt 600
gaggcgatgg cctgctcgcg tccttcgatc tgtagtcgat ttgcgggtcg tggtgtagat 660
ctgcgggctg tgatgaagtt atttggtgtg atctgctcgc ctgattctgc gggttggctc 720
gagtagatat gatggttgga ccggttggtt cgtttaccgc gctagggttg ggctgggatg 780
atgttgcatg cgccgttgcg cgtgatcccg cagcaggact tgcgtttgat tgccagatct 840
cgttacgatt atgtgatttg gtttggactt tttagatctg tagcttctgc ttatgtgcca 900
gatgcgccta ctgctcatat gcctgatgat aatcataaat ggctgtggaa ctaactagtt 960
gattgcggag tcatgtatca gctacaggtg tagggactag ctacaggtgt agggacttgc 1020
gtctaattgt ttggtccttt actcatgttg caattatgca atttagttta gattgtttgt 1080
tccactcatc taggctgtaa aagggacact gcttagattg ctgtttaatc tttttagtag 1140
attatattat attggtaact tattacccct attacatgcc atacgtgact tctgctcatg 1200
cctgatgata atcatagatc actgtggaat taattagttg attgttgaat catgtttcat 1260
gtacatacca cggcacaatt gcttagttcc ttaacaaatg caaattttac tgatccatgt 1320
atgatttgcg tggttctcta atgtgaaata ctatagctac ttgttagtaa gaatcaggtt 1380
cgtatgctta atgctgtatg tgccttctgc tcatgcctga tgataatcat atatcactgg 1440
aattaattag ttgatcgttt aatcatatat caagtacata ccatgccaca atttttagtc 1500
acttaaccca tgcagattga actggtccct gcatgttttg ctaaattgtt ctattctgat 1560
tagaccatat atcatgtatt tttttttggt aatggttctc ttattttaaa tgctatatag 1620
ttctggtact tgttagaaag atctgcttca tagtttagtt gcctatccct cgaattagga 1680
tgctgagcag ctgatcctat agctttgttt catgtatcaa ttcttttgtg ttcaacagtc 1740
agtttttgtt agattcattg taacttatgg tcgcttactc ttctggtcct caatgcttgc 1800
ag 1802
<210> 26
<211> 23
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 26
gggaaagacc gaggagaaga tct 23
<210> 27
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 27
aagaccgagg agaagatcta 20
<210> 28
<211> 90
<212> DNA
<213> Maize (Zea mays)
<400> 28
gtttggggaa agaccgagga gaagatctac gggcctgtcg ctggaacgga ctacagggac 60
aaccagctgc ggttcagcct gctatgccag 90
<210> 29
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 29
agatgggaga cgggtacgag acgg 24
<210> 30
<211> 25
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 30
gtatgggttg ttgttgaggc tcagg 25
<210> 31
<211> 25
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 31
gaccacccac tgttcctgga gaggg 25
<210> 32
<211> 3783
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 32
atggctccta agaagaagcg gaaggttggt attcacgggg tgcctgcggc ttcaaagctc 60
gagaaattca ccaactgtta ttcgttgagc aaaacactgc ggtttaaagc gattccagtc 120
ggcaagactc aagagaatat agacaataag cggctgttgg tggaagatga aaagcgcgcg 180
gaagactaca aaggggtgaa gaagttgttg gacagatact acctctcttt tatcaatgat 240
gtcttgcact caatcaaatt gaagaatctg aacaactaca tctccctctt cagaaagaaa 300
acaaggacag aaaaggagaa taaggaactt gaaaatttgg agatcaatct gaggaaagag 360
atcgcgaaag cctttaaagg caacgaagga tacaaaagtc tgttcaagaa ggatataatt 420
gagacaattt tgccagagtt cctcgatgac aaggacgaga ttgcgctggt caattcgttc 480
aacggattca caacagcatt cacaggcttc tttgataatc gggaaaatat gttctctgag 540
gaggcaaagt ccacttctat tgcgttcagg tgtatcaatg agaatctcac taggtacatt 600
tccaacatgg atatctttga gaaggttgac gcaatttttg acaagcacga agttcaggag 660
attaaggaga agatcctcaa ttccgattat gacgttgagg acttcttcga gggtgagttt 720
tttaatttcg tgctcactca agagggtatc gacgtgtata atgcgatcat cggtgggttc 780
gtgactgagt ccggtgaaaa gattaaggga ttgaacgagt atatcaacct ttacaaccaa 840
aagacgaaac agaagctgcc aaagttcaag cctctttaca aacaggttct ttcagaccgc 900
gagtcactct cgttctatgg ggagggctac acttcggatg aggaagtcct ggaggtgttc 960
aggaatactc tcaataagaa ttcggagatt ttctcttcta taaaaaaact ggaaaagttg 1020
tttaagaatt ttgacgaata ctctagcgcc ggcatatttg tgaaaaacgg cccggccata 1080
tcaacgataa gtaaagatat cttcggcgaa tggaacgtga tcagagacaa atggaacgcg 1140
gagtatgacg atattcacct gaagaagaag gctgtcgtaa cggagaagta cgaggatgat 1200
cgcaggaaaa gcttcaaaaa gatcggaagt ttcagcctgg aacagttgca ggagtatgct 1260
gacgccgatc ttagcgtcgt cgagaagttg aaggagataa tcatccaaaa ggtcgacgag 1320
atatataaag tctatggatc aagtgaaaaa ctgttcgacg ccgacttcgt tttggagaag 1380
tccctgaaga agaacgacgc tgttgttgcc attatgaagg atctgctcga cagcgtgaag 1440
agtttcgaga actatattaa ggcttttttc ggggagggga aggagactaa cagagatgag 1500
tccttctacg gagacttcgt cctcgcgtac gatatactcc ttaaggtaga ccacatctac 1560
gacgcaatca gaaattacgt gacacaaaag ccgtacagca aggacaagtt caaactctac 1620
ttccagaacc cccagttcat gggcggctgg gacaaggaca aggaaacgga ttacagggct 1680
acgatcctga ggtatggttc aaaatactac ttggcgatta tggacaagaa gtacgccaag 1740
tgtctccaga agattgacaa agacgatgtc aatggcaatt atgagaagat caactacaag 1800
ctgcttccgg gtccgaacaa gatgctccca aaggttttct tcagcaagaa atggatggcc 1860
tactataacc caagcgagga catccagaag atttataaga acggtacgtt caagaagggc 1920
gacatgttca atcttaacga ctgtcacaag ctgatcgact tcttcaaaga ctcaattagc 1980
cggtacccaa agtggtctaa cgcctatgac ttcaactttt cggaaaccga gaagtacaag 2040
gatatagccg gattttatag agaggtggaa gagcagggct acaaggtgtc attcgagtcc 2100
gccagcaaga aggaagtgga caagctcgtg gaagagggta agctctacat gttccagatt 2160
tataataaag actttagcga taagagccac gggacaccta atctccacac aatgtatttc 2220
aagctgctct tcgacgagaa taaccacggc caaatcaggt tgtcaggagg ggctgaactc 2280
ttcatgcggc gcgctagcct taagaaggag gagcttgtag tccaccctgc gaatagtcca 2340
attgcgaata agaacccgga caatcctaaa aagactacaa cattgagcta cgacgtgtac 2400
aaggataaga ggttttccga ggatcagtac gagctccaca tcccgattgc gatcaacaag 2460
tgcccaaaga atattttcaa gataaacaca gaggtgcgtg tactcctgaa gcatgacgac 2520
aatccttacg tcattgggat tgatcggggc gagaggaacc tcctctatat tgtggtggtg 2580
gacgggaagg ggaacatagt cgaacagtac tcccttaacg aaataattaa caatttcaac 2640
ggcatccgta tcaagaccga ctaccattcg ttgctggaca agaaggagaa ggagagattt 2700
gaggcgcggc aaaattggac aagtatcgag aacatcaagg aactcaaagc aggttatatc 2760
tctcaagttg tgcataagat atgcgagctg gttgagaagt atgacgcagt gatcgctctt 2820
gaggacctca actcgggctt taagaattct agagttaaag tggagaagca ggtctatcaa 2880
aagttcgaga agatgcttat agataagctc aactacatgg tcgataagaa atcgaaccca 2940
tgtgccaccg gcggcgcact caaaggttac caaataacaa acaaattcga gtccttcaaa 3000
tcgatgagta ctcagaatgg gttcatattt tatataccgg cgtggcttac gtctaagatc 3060
gacccgtcaa ctggttttgt caacctgttg aagacgaaat acacgtccat tgccgattca 3120
aaaaagttca tatctagttt tgatcgtatt atgtacgtcc cagaggaaga tcttttcgag 3180
tttgctctcg actacaaaaa cttttcgcgc accgatgcgg attacattaa aaaatggaaa 3240
ctctattcgt acggcaacag aatcaggatt tttcgcaacc ctaagaagaa taacgtcttt 3300
gattgggagg aagtttgctt gactagcgcg tacaaggagc tctttaataa gtatggcatt 3360
aactaccaac agggtgatat cagagcactg ctttgcgaac aatctgacaa ggctttctac 3420
tcatccttca tggctttgat gagcctgatg ctccagatga gaaattcaat tacaggcaga 3480
accgacgtgg atttcttgat ctccccggtt aaaaattctg atggcatctt ttacgatagc 3540
aggaactatg aagcgcaaga gaatgcgatt ctgccaaaaa atgcagacgc caacggtgcc 3600
tataacatcg ccaggaaagt cctgtgggcg atcggccagt tcaaaaaggc cgaagacgaa 3660
aaattggaca aggtcaaaat cgctatcagc aacaaagagt ggctggagta tgctcagaca 3720
tccgtaaagc ataagcgtcc tgctgccacc aaaaaggccg gacaggctaa gaaaaagaag 3780
tga 3783
<210> 33
<211> 1260
<212> PRT
<213> Artificial Sequence
<220>
<223> Fusion protein
<400> 33
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala Ser Lys Leu Glu Lys Phe Thr Asn Cys Tyr Ser Leu Ser Lys Thr
20 25 30
Leu Arg Phe Lys Ala Ile Pro Val Gly Lys Thr Gln Glu Asn Ile Asp
35 40 45
Asn Lys Arg Leu Leu Val Glu Asp Glu Lys Arg Ala Glu Asp Tyr Lys
50 55 60
Gly Val Lys Lys Leu Leu Asp Arg Tyr Tyr Leu Ser Phe Ile Asn Asp
65 70 75 80
Val Leu His Ser Ile Lys Leu Lys Asn Leu Asn Asn Tyr Ile Ser Leu
85 90 95
Phe Arg Lys Lys Thr Arg Thr Glu Lys Glu Asn Lys Glu Leu Glu Asn
100 105 110
Leu Glu Ile Asn Leu Arg Lys Glu Ile Ala Lys Ala Phe Lys Gly Asn
115 120 125
Glu Gly Tyr Lys Ser Leu Phe Lys Lys Asp Ile Ile Glu Thr Ile Leu
130 135 140
Pro Glu Phe Leu Asp Asp Lys Asp Glu Ile Ala Leu Val Asn Ser Phe
145 150 155 160
Asn Gly Phe Thr Thr Ala Phe Thr Gly Phe Phe Asp Asn Arg Glu Asn
165 170 175
Met Phe Ser Glu Glu Ala Lys Ser Thr Ser Ile Ala Phe Arg Cys Ile
180 185 190
Asn Glu Asn Leu Thr Arg Tyr Ile Ser Asn Met Asp Ile Phe Glu Lys
195 200 205
Val Asp Ala Ile Phe Asp Lys His Glu Val Gln Glu Ile Lys Glu Lys
210 215 220
Ile Leu Asn Ser Asp Tyr Asp Val Glu Asp Phe Phe Glu Gly Glu Phe
225 230 235 240
Phe Asn Phe Val Leu Thr Gln Glu Gly Ile Asp Val Tyr Asn Ala Ile
245 250 255
Ile Gly Gly Phe Val Thr Glu Ser Gly Glu Lys Ile Lys Gly Leu Asn
260 265 270
Glu Tyr Ile Asn Leu Tyr Asn Gln Lys Thr Lys Gln Lys Leu Pro Lys
275 280 285
Phe Lys Pro Leu Tyr Lys Gln Val Leu Ser Asp Arg Glu Ser Leu Ser
290 295 300
Phe Tyr Gly Glu Gly Tyr Thr Ser Asp Glu Glu Val Leu Glu Val Phe
305 310 315 320
Arg Asn Thr Leu Asn Lys Asn Ser Glu Ile Phe Ser Ser Ile Lys Lys
325 330 335
Leu Glu Lys Leu Phe Lys Asn Phe Asp Glu Tyr Ser Ser Ala Gly Ile
340 345 350
Phe Val Lys Asn Gly Pro Ala Ile Ser Thr Ile Ser Lys Asp Ile Phe
355 360 365
Gly Glu Trp Asn Val Ile Arg Asp Lys Trp Asn Ala Glu Tyr Asp Asp
370 375 380
Ile His Leu Lys Lys Lys Ala Val Val Thr Glu Lys Tyr Glu Asp Asp
385 390 395 400
Arg Arg Lys Ser Phe Lys Lys Ile Gly Ser Phe Ser Leu Glu Gln Leu
405 410 415
Gln Glu Tyr Ala Asp Ala Asp Leu Ser Val Val Glu Lys Leu Lys Glu
420 425 430
Ile Ile Ile Gln Lys Val Asp Glu Ile Tyr Lys Val Tyr Gly Ser Ser
435 440 445
Glu Lys Leu Phe Asp Ala Asp Phe Val Leu Glu Lys Ser Leu Lys Lys
450 455 460
Asn Asp Ala Val Val Ala Ile Met Lys Asp Leu Leu Asp Ser Val Lys
465 470 475 480
Ser Phe Glu Asn Tyr Ile Lys Ala Phe Phe Gly Glu Gly Lys Glu Thr
485 490 495
Asn Arg Asp Glu Ser Phe Tyr Gly Asp Phe Val Leu Ala Tyr Asp Ile
500 505 510
Leu Leu Lys Val Asp His Ile Tyr Asp Ala Ile Arg Asn Tyr Val Thr
515 520 525
Gln Lys Pro Tyr Ser Lys Asp Lys Phe Lys Leu Tyr Phe Gln Asn Pro
530 535 540
Gln Phe Met Gly Gly Trp Asp Lys Asp Lys Glu Thr Asp Tyr Arg Ala
545 550 555 560
Thr Ile Leu Arg Tyr Gly Ser Lys Tyr Tyr Leu Ala Ile Met Asp Lys
565 570 575
Lys Tyr Ala Lys Cys Leu Gln Lys Ile Asp Lys Asp Asp Val Asn Gly
580 585 590
Asn Tyr Glu Lys Ile Asn Tyr Lys Leu Leu Pro Gly Pro Asn Lys Met
595 600 605
Leu Pro Lys Val Phe Phe Ser Lys Lys Trp Met Ala Tyr Tyr Asn Pro
610 615 620
Ser Glu Asp Ile Gln Lys Ile Tyr Lys Asn Gly Thr Phe Lys Lys Gly
625 630 635 640
Asp Met Phe Asn Leu Asn Asp Cys His Lys Leu Ile Asp Phe Phe Lys
645 650 655
Asp Ser Ile Ser Arg Tyr Pro Lys Trp Ser Asn Ala Tyr Asp Phe Asn
660 665 670
Phe Ser Glu Thr Glu Lys Tyr Lys Asp Ile Ala Gly Phe Tyr Arg Glu
675 680 685
Val Glu Glu Gln Gly Tyr Lys Val Ser Phe Glu Ser Ala Ser Lys Lys
690 695 700
Glu Val Asp Lys Leu Val Glu Glu Gly Lys Leu Tyr Met Phe Gln Ile
705 710 715 720
Tyr Asn Lys Asp Phe Ser Asp Lys Ser His Gly Thr Pro Asn Leu His
725 730 735
Thr Met Tyr Phe Lys Leu Leu Phe Asp Glu Asn Asn His Gly Gln Ile
740 745 750
Arg Leu Ser Gly Gly Ala Glu Leu Phe Met Arg Arg Ala Ser Leu Lys
755 760 765
Lys Glu Glu Leu Val Val His Pro Ala Asn Ser Pro Ile Ala Asn Lys
770 775 780
Asn Pro Asp Asn Pro Lys Lys Thr Thr Thr Leu Ser Tyr Asp Val Tyr
785 790 795 800
Lys Asp Lys Arg Phe Ser Glu Asp Gln Tyr Glu Leu His Ile Pro Ile
805 810 815
Ala Ile Asn Lys Cys Pro Lys Asn Ile Phe Lys Ile Asn Thr Glu Val
820 825 830
Arg Val Leu Leu Lys His Asp Asp Asn Pro Tyr Val Ile Gly Ile Asp
835 840 845
Arg Gly Glu Arg Asn Leu Leu Tyr Ile Val Val Val Asp Gly Lys Gly
850 855 860
Asn Ile Val Glu Gln Tyr Ser Leu Asn Glu Ile Ile Asn Asn Phe Asn
865 870 875 880
Gly Ile Arg Ile Lys Thr Asp Tyr His Ser Leu Leu Asp Lys Lys Glu
885 890 895
Lys Glu Arg Phe Glu Ala Arg Gln Asn Trp Thr Ser Ile Glu Asn Ile
900 905 910
Lys Glu Leu Lys Ala Gly Tyr Ile Ser Gln Val Val His Lys Ile Cys
915 920 925
Glu Leu Val Glu Lys Tyr Asp Ala Val Ile Ala Leu Glu Asp Leu Asn
930 935 940
Ser Gly Phe Lys Asn Ser Arg Val Lys Val Glu Lys Gln Val Tyr Gln
945 950 955 960
Lys Phe Glu Lys Met Leu Ile Asp Lys Leu Asn Tyr Met Val Asp Lys
965 970 975
Lys Ser Asn Pro Cys Ala Thr Gly Gly Ala Leu Lys Gly Tyr Gln Ile
980 985 990
Thr Asn Lys Phe Glu Ser Phe Lys Ser Met Ser Thr Gln Asn Gly Phe
995 1000 1005
Ile Phe Tyr Ile Pro Ala Trp Leu Thr Ser Lys Ile Asp Pro Ser
1010 1015 1020
Thr Gly Phe Val Asn Leu Leu Lys Thr Lys Tyr Thr Ser Ile Ala
1025 1030 1035
Asp Ser Lys Lys Phe Ile Ser Ser Phe Asp Arg Ile Met Tyr Val
1040 1045 1050
Pro Glu Glu Asp Leu Phe Glu Phe Ala Leu Asp Tyr Lys Asn Phe
1055 1060 1065
Ser Arg Thr Asp Ala Asp Tyr Ile Lys Lys Trp Lys Leu Tyr Ser
1070 1075 1080
Tyr Gly Asn Arg Ile Arg Ile Phe Arg Asn Pro Lys Lys Asn Asn
1085 1090 1095
Val Phe Asp Trp Glu Glu Val Cys Leu Thr Ser Ala Tyr Lys Glu
1100 1105 1110
Leu Phe Asn Lys Tyr Gly Ile Asn Tyr Gln Gln Gly Asp Ile Arg
1115 1120 1125
Ala Leu Leu Cys Glu Gln Ser Asp Lys Ala Phe Tyr Ser Ser Phe
1130 1135 1140
Met Ala Leu Met Ser Leu Met Leu Gln Met Arg Asn Ser Ile Thr
1145 1150 1155
Gly Arg Thr Asp Val Asp Phe Leu Ile Ser Pro Val Lys Asn Ser
1160 1165 1170
Asp Gly Ile Phe Tyr Asp Ser Arg Asn Tyr Glu Ala Gln Glu Asn
1175 1180 1185
Ala Ile Leu Pro Lys Asn Ala Asp Ala Asn Gly Ala Tyr Asn Ile
1190 1195 1200
Ala Arg Lys Val Leu Trp Ala Ile Gly Gln Phe Lys Lys Ala Glu
1205 1210 1215
Asp Glu Lys Leu Asp Lys Val Lys Ile Ala Ile Ser Asn Lys Glu
1220 1225 1230
Trp Leu Glu Tyr Ala Gln Thr Ser Val Lys His Lys Arg Pro Ala
1235 1240 1245
Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys Lys
1250 1255 1260
<210> 34
<211> 3873
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 34
atgccgaaga agaagcgcaa ggtcgggggc gggggctcag gcgggggcgg gagcggcggc 60
gggggctctg ggggcggcgg cagcggcggg ggcggcagcg ggggcggcgg gtcgatgagc 120
aagctggaga agttcacgaa ctgctactcc ctcagcaaga ccctgaggtt caaggcgatc 180
ccggtcggca agacccagga gaacatcgac aacaagcggc tgctggtgga ggacgagaag 240
agggctgagg actacaaggg cgtgaagaag ctcctggacc gctactacct gtccttcatc 300
aacgacgtgc tccacagcat caagctcaag aacctgaaca actacatcag cctcttcagg 360
aagaagacgc gcaccgagaa ggagaacaag gagctcgaga acctggagat caacctgagg 420
aaggagatcg ccaaggcgtt caagggcaac gagggctaca agtccctctt caagaaggac 480
atcatcgaga cgatcctccc ggagttcctg gacgacaagg acgagatcgc cctggtcaac 540
tccttcaacg gcttcaccac ggcgttcacc ggcttcttcg acaaccgcga gaacatgttc 600
agcgaggagg ccaagtccac gagcatcgcg ttcaggtgca tcaacgagaa cctcacccgc 660
tacatctcca acatggacat cttcgagaag gtcgacgcga tcttcgacaa gcacgaggtg 720
caggagatca aggagaagat cctgaacagc gactacgacg tcgaggactt cttcgagggc 780
gagttcttca acttcgtcct cacgcaggag ggcatcgacg tgtacaacgc catcatcggt 840
ggcttcgtga ccgagtccgg cgagaagatc aagggcctga acgagtacat caacctctac 900
aaccagaaga ccaagcagaa gctgccgaag ttcaagcccc tgtacaagca ggtgctctcc 960
gacagggagt ccctcagctt ctacggcgag ggctacacga gcgacgagga ggtcctggag 1020
gtgttccgca acaccctcaa caagaacagc gagatcttct ccagcatcaa gaagctcgag 1080
aagctgttca agaacttcga cgagtactcc agcgccggca tcttcgtcaa gaacggcccg 1140
gcgatctcca cgatcagcaa ggacatcttc ggcgagtgga acgtgatccg cgacaagtgg 1200
aacgccgagt acgacgacat ccacctcaag aagaaggcgg tggtcaccga gaagtacgag 1260
gacgacaggc gcaagtcctt caagaagatc ggctccttca gcctcgagca gctgcaggag 1320
tacgccgacg cggacctgag cgtggtcgag aagctcaagg agatcatcat ccagaaggtc 1380
gacgagatct acaaggtgta cggctccagc gagaagctct tcgacgcgga cttcgtcctc 1440
gagaagtccc tgaagaagaa cgacgccgtg gtcgcgatca tgaaggacct cctggactcc 1500
gtgaagagct tcgagaatta catcaaggcc ttcttcggcg agggcaagga gacgaacagg 1560
gacgagtcct tctacggcga cttcgtcctg gcctacgaca tcctcctgaa ggtggaccac 1620
atctacgacg cgatccgcaa ctacgtgacc cagaagccgt acagcaagga caagttcaag 1680
ctctacttcc agaaccccca gttcatgggc ggctgggaca aggacaagga gacggactac 1740
agggcgacca tcctgcgcta cggcagcaag tactacctcg ccatcatgga caagaagtac 1800
gcgaagtgcc tgcagaagat cgacaaggac gacgtcaacg gcaactacga gaagatcaac 1860
tacaagctcc tgccgggccc caacaagatg ctcccgaagg tgttcttctc caagaagtgg 1920
atggcctact acaaccccag cgaggacatc cagaagatct acaagaacgg cacgttcaag 1980
aagggcgaca tgttcaacct gaacgactgc cacaagctca tcgacttctt caaggactcc 2040
atcagccgct acccgaagtg gtccaacgcc tacgacttca acttcagcga gaccgagaag 2100
tacaaggaca tcgcgggctt ctaccgcgag gtcgaggagc agggctacaa ggtgtccttc 2160
gagtccgcca gcaagaagga ggtcgacaag ctggtggagg agggcaagct ctacatgttc 2220
cagatctaca acaaggactt ctccgacaag agccacggca cgcccaacct gcacaccatg 2280
tacttcaagc tcctgttcga cgagaacaac cacggccaga tcaggctgtc cggcggcgcc 2340
gagctcttca tgaggagggc gagcctgaag aaggaggagc tggtggtcca ccccgctaac 2400
agcccaatcg cgaacaagaa cccggacaac cccaagaaga ccacgaccct gtcctacgac 2460
gtgtacaagg acaagaggtt cagcgaggac cagtacgagc tccacatccc gatcgcgatc 2520
aacaagtgcc ccaagaacat cttcaagatc aacaccgagg tccgcgtgct cctgaagcac 2580
gacgacaacc cctacgtgat cggcatcgac aggggcgaga ggaacctcct gtacatcgtg 2640
gtcgtggacg gcaagggcaa catcgtggag cagtactccc tcaacgagat catcaacaac 2700
ttcaacggca tcaggatcaa gacggactac cacagcctcc tggacaagaa ggagaaggag 2760
aggttcgagg cccgccagaa ctggacctcc atcgagaaca tcaaggagct gaaggcgggc 2820
tacatcagcc aggtcgtgca caagatctgc gagctcgtcg agaagtacga cgccgtgatc 2880
gccctcgagg acctgaactc cggcttcaag aacagccgcg tcaaggtgga gaagcaggtc 2940
taccagaagt tcgagaagat gctcatcgac aagctgaact acatggtgga caagaagtcc 3000
aacccctgcg ctacgggcgg cgcgctgaag ggctaccaga tcaccaacaa gttcgagagc 3060
ttcaagtcca tgagcactca gaacggcttc atcttctaca tcccggcgtg gctcacgtcc 3120
aagatcgacc ccagcaccgg cttcgtcaac ctcctgaaga cgaagtacac ctccatcgcc 3180
gacagcaaga agttcatctc cagcttcgac cgcatcatgt atgtgccgga ggaggacctg 3240
ttcgagttcg ccctcgacta caagaacttc tcccgcacgg acgcggacta catcaagaag 3300
tggaagctgt acagctacgg caaccgcatc cgcatcttca ggaaccccaa gaagaacaac 3360
gtcttcgact gggaggaggt gtgcctgacc tccgcgtaca aggagctctt caacaagtac 3420
ggcatcaact accagcaggg cgacatcagg gctctcctgt gcgagcagag cgacaaggcc 3480
ttctactcca gcttcatggc gctgatgtcc ctcatgctgc agatgaggaa ctcgatcacc 3540
ggcaggacgg acgtggactt cctcatctcc ccggtgaaga acagcgacgg catcttctac 3600
gactccagga actacgaggc ccaggagaac gcgatcctcc caaagaacgc ggacgccaac 3660
ggcgcctaca acatcgccag gaaggtcctc tgggctatcg gccagttcaa gaaggcggag 3720
gacgagaagc tggacaaggt gaagatcgcc atcagcaaca aggagtggct cgagtacgcc 3780
cagacctcgg tcaagcacgg cagcccgaag aagaagcgca aggtgtccgg cggcagctcc 3840
ggcggcagcc cgaagaagaa gcgcaaagtg tga 3873
<210> 35
<211> 1290
<212> PRT
<213> Artificial Sequence
<220>
<223> Fusion protein
<400> 35
Met Pro Lys Lys Lys Arg Lys Val Gly Gly Gly Gly Ser Gly Gly Gly
1 5 10 15
Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly
20 25 30
Ser Gly Gly Gly Gly Ser Met Ser Lys Leu Glu Lys Phe Thr Asn Cys
35 40 45
Tyr Ser Leu Ser Lys Thr Leu Arg Phe Lys Ala Ile Pro Val Gly Lys
50 55 60
Thr Gln Glu Asn Ile Asp Asn Lys Arg Leu Leu Val Glu Asp Glu Lys
65 70 75 80
Arg Ala Glu Asp Tyr Lys Gly Val Lys Lys Leu Leu Asp Arg Tyr Tyr
85 90 95
Leu Ser Phe Ile Asn Asp Val Leu His Ser Ile Lys Leu Lys Asn Leu
100 105 110
Asn Asn Tyr Ile Ser Leu Phe Arg Lys Lys Thr Arg Thr Glu Lys Glu
115 120 125
Asn Lys Glu Leu Glu Asn Leu Glu Ile Asn Leu Arg Lys Glu Ile Ala
130 135 140
Lys Ala Phe Lys Gly Asn Glu Gly Tyr Lys Ser Leu Phe Lys Lys Asp
145 150 155 160
Ile Ile Glu Thr Ile Leu Pro Glu Phe Leu Asp Asp Lys Asp Glu Ile
165 170 175
Ala Leu Val Asn Ser Phe Asn Gly Phe Thr Thr Ala Phe Thr Gly Phe
180 185 190
Phe Asp Asn Arg Glu Asn Met Phe Ser Glu Glu Ala Lys Ser Thr Ser
195 200 205
Ile Ala Phe Arg Cys Ile Asn Glu Asn Leu Thr Arg Tyr Ile Ser Asn
210 215 220
Met Asp Ile Phe Glu Lys Val Asp Ala Ile Phe Asp Lys His Glu Val
225 230 235 240
Gln Glu Ile Lys Glu Lys Ile Leu Asn Ser Asp Tyr Asp Val Glu Asp
245 250 255
Phe Phe Glu Gly Glu Phe Phe Asn Phe Val Leu Thr Gln Glu Gly Ile
260 265 270
Asp Val Tyr Asn Ala Ile Ile Gly Gly Phe Val Thr Glu Ser Gly Glu
275 280 285
Lys Ile Lys Gly Leu Asn Glu Tyr Ile Asn Leu Tyr Asn Gln Lys Thr
290 295 300
Lys Gln Lys Leu Pro Lys Phe Lys Pro Leu Tyr Lys Gln Val Leu Ser
305 310 315 320
Asp Arg Glu Ser Leu Ser Phe Tyr Gly Glu Gly Tyr Thr Ser Asp Glu
325 330 335
Glu Val Leu Glu Val Phe Arg Asn Thr Leu Asn Lys Asn Ser Glu Ile
340 345 350
Phe Ser Ser Ile Lys Lys Leu Glu Lys Leu Phe Lys Asn Phe Asp Glu
355 360 365
Tyr Ser Ser Ala Gly Ile Phe Val Lys Asn Gly Pro Ala Ile Ser Thr
370 375 380
Ile Ser Lys Asp Ile Phe Gly Glu Trp Asn Val Ile Arg Asp Lys Trp
385 390 395 400
Asn Ala Glu Tyr Asp Asp Ile His Leu Lys Lys Lys Ala Val Val Thr
405 410 415
Glu Lys Tyr Glu Asp Asp Arg Arg Lys Ser Phe Lys Lys Ile Gly Ser
420 425 430
Phe Ser Leu Glu Gln Leu Gln Glu Tyr Ala Asp Ala Asp Leu Ser Val
435 440 445
Val Glu Lys Leu Lys Glu Ile Ile Ile Gln Lys Val Asp Glu Ile Tyr
450 455 460
Lys Val Tyr Gly Ser Ser Glu Lys Leu Phe Asp Ala Asp Phe Val Leu
465 470 475 480
Glu Lys Ser Leu Lys Lys Asn Asp Ala Val Val Ala Ile Met Lys Asp
485 490 495
Leu Leu Asp Ser Val Lys Ser Phe Glu Asn Tyr Ile Lys Ala Phe Phe
500 505 510
Gly Glu Gly Lys Glu Thr Asn Arg Asp Glu Ser Phe Tyr Gly Asp Phe
515 520 525
Val Leu Ala Tyr Asp Ile Leu Leu Lys Val Asp His Ile Tyr Asp Ala
530 535 540
Ile Arg Asn Tyr Val Thr Gln Lys Pro Tyr Ser Lys Asp Lys Phe Lys
545 550 555 560
Leu Tyr Phe Gln Asn Pro Gln Phe Met Gly Gly Trp Asp Lys Asp Lys
565 570 575
Glu Thr Asp Tyr Arg Ala Thr Ile Leu Arg Tyr Gly Ser Lys Tyr Tyr
580 585 590
Leu Ala Ile Met Asp Lys Lys Tyr Ala Lys Cys Leu Gln Lys Ile Asp
595 600 605
Lys Asp Asp Val Asn Gly Asn Tyr Glu Lys Ile Asn Tyr Lys Leu Leu
610 615 620
Pro Gly Pro Asn Lys Met Leu Pro Lys Val Phe Phe Ser Lys Lys Trp
625 630 635 640
Met Ala Tyr Tyr Asn Pro Ser Glu Asp Ile Gln Lys Ile Tyr Lys Asn
645 650 655
Gly Thr Phe Lys Lys Gly Asp Met Phe Asn Leu Asn Asp Cys His Lys
660 665 670
Leu Ile Asp Phe Phe Lys Asp Ser Ile Ser Arg Tyr Pro Lys Trp Ser
675 680 685
Asn Ala Tyr Asp Phe Asn Phe Ser Glu Thr Glu Lys Tyr Lys Asp Ile
690 695 700
Ala Gly Phe Tyr Arg Glu Val Glu Glu Gln Gly Tyr Lys Val Ser Phe
705 710 715 720
Glu Ser Ala Ser Lys Lys Glu Val Asp Lys Leu Val Glu Glu Gly Lys
725 730 735
Leu Tyr Met Phe Gln Ile Tyr Asn Lys Asp Phe Ser Asp Lys Ser His
740 745 750
Gly Thr Pro Asn Leu His Thr Met Tyr Phe Lys Leu Leu Phe Asp Glu
755 760 765
Asn Asn His Gly Gln Ile Arg Leu Ser Gly Gly Ala Glu Leu Phe Met
770 775 780
Arg Arg Ala Ser Leu Lys Lys Glu Glu Leu Val Val His Pro Ala Asn
785 790 795 800
Ser Pro Ile Ala Asn Lys Asn Pro Asp Asn Pro Lys Lys Thr Thr Thr
805 810 815
Leu Ser Tyr Asp Val Tyr Lys Asp Lys Arg Phe Ser Glu Asp Gln Tyr
820 825 830
Glu Leu His Ile Pro Ile Ala Ile Asn Lys Cys Pro Lys Asn Ile Phe
835 840 845
Lys Ile Asn Thr Glu Val Arg Val Leu Leu Lys His Asp Asp Asn Pro
850 855 860
Tyr Val Ile Gly Ile Asp Arg Gly Glu Arg Asn Leu Leu Tyr Ile Val
865 870 875 880
Val Val Asp Gly Lys Gly Asn Ile Val Glu Gln Tyr Ser Leu Asn Glu
885 890 895
Ile Ile Asn Asn Phe Asn Gly Ile Arg Ile Lys Thr Asp Tyr His Ser
900 905 910
Leu Leu Asp Lys Lys Glu Lys Glu Arg Phe Glu Ala Arg Gln Asn Trp
915 920 925
Thr Ser Ile Glu Asn Ile Lys Glu Leu Lys Ala Gly Tyr Ile Ser Gln
930 935 940
Val Val His Lys Ile Cys Glu Leu Val Glu Lys Tyr Asp Ala Val Ile
945 950 955 960
Ala Leu Glu Asp Leu Asn Ser Gly Phe Lys Asn Ser Arg Val Lys Val
965 970 975
Glu Lys Gln Val Tyr Gln Lys Phe Glu Lys Met Leu Ile Asp Lys Leu
980 985 990
Asn Tyr Met Val Asp Lys Lys Ser Asn Pro Cys Ala Thr Gly Gly Ala
995 1000 1005
Leu Lys Gly Tyr Gln Ile Thr Asn Lys Phe Glu Ser Phe Lys Ser
1010 1015 1020
Met Ser Thr Gln Asn Gly Phe Ile Phe Tyr Ile Pro Ala Trp Leu
1025 1030 1035
Thr Ser Lys Ile Asp Pro Ser Thr Gly Phe Val Asn Leu Leu Lys
1040 1045 1050
Thr Lys Tyr Thr Ser Ile Ala Asp Ser Lys Lys Phe Ile Ser Ser
1055 1060 1065
Phe Asp Arg Ile Met Tyr Val Pro Glu Glu Asp Leu Phe Glu Phe
1070 1075 1080
Ala Leu Asp Tyr Lys Asn Phe Ser Arg Thr Asp Ala Asp Tyr Ile
1085 1090 1095
Lys Lys Trp Lys Leu Tyr Ser Tyr Gly Asn Arg Ile Arg Ile Phe
1100 1105 1110
Arg Asn Pro Lys Lys Asn Asn Val Phe Asp Trp Glu Glu Val Cys
1115 1120 1125
Leu Thr Ser Ala Tyr Lys Glu Leu Phe Asn Lys Tyr Gly Ile Asn
1130 1135 1140
Tyr Gln Gln Gly Asp Ile Arg Ala Leu Leu Cys Glu Gln Ser Asp
1145 1150 1155
Lys Ala Phe Tyr Ser Ser Phe Met Ala Leu Met Ser Leu Met Leu
1160 1165 1170
Gln Met Arg Asn Ser Ile Thr Gly Arg Thr Asp Val Asp Phe Leu
1175 1180 1185
Ile Ser Pro Val Lys Asn Ser Asp Gly Ile Phe Tyr Asp Ser Arg
1190 1195 1200
Asn Tyr Glu Ala Gln Glu Asn Ala Ile Leu Pro Lys Asn Ala Asp
1205 1210 1215
Ala Asn Gly Ala Tyr Asn Ile Ala Arg Lys Val Leu Trp Ala Ile
1220 1225 1230
Gly Gln Phe Lys Lys Ala Glu Asp Glu Lys Leu Asp Lys Val Lys
1235 1240 1245
Ile Ala Ile Ser Asn Lys Glu Trp Leu Glu Tyr Ala Gln Thr Ser
1250 1255 1260
Val Lys His Gly Ser Pro Lys Lys Lys Arg Lys Val Ser Gly Gly
1265 1270 1275
Ser Ser Gly Gly Ser Pro Lys Lys Lys Arg Lys Val
1280 1285 1290
<210> 36
<211> 3783
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 36
atggctccta agaagaagcg gaaggttggt attcacgggg tgcctgcggc ttcaaagctc 60
gagaaattca ccaactgtta ttcgttgagc aaaacactgc ggtttaaagc gattccagtc 120
ggcaagactc aagagaatat agacaataag cggctgttgg tggaagatga aaagcgcgcg 180
gaagactaca aaggggtgaa gaagttgttg gacagatact acctctcttt tatcaatgat 240
gtcttgcact caatcaaatt gaagaatctg aacaactaca tctccctctt cagaaagaaa 300
acaaggacag aaaaggagaa taaggaactt gaaaatttgg agatcaatct gaggaaagag 360
atcgcgaaag cctttaaagg caacgaagga tacaaaagtc tgttcaagaa ggatataatt 420
gagacaattt tgccagagtt cctcgatgac aaggacgaga ttgcgctggt caattcgttc 480
aacggattca caacagcatt cacaggcttc tttgataatc gggaaaatat gttctctgag 540
gaggcaaagt ccacttctat tgcgttcagg tgtatcaatg agaatctcac taggtacatt 600
tccaacatgg atatctttga gaaggttgac gcaatttttg acaagcacga agttcaggag 660
attaaggaga agatcctcaa ttccgattat gacgttgagg acttcttcga gggtgagttt 720
tttaatttcg tgctcactca agagggtatc gacgtgtata atgcgatcat cggtgggttc 780
gtgactgagt ccggtgaaaa gattaaggga ttgaacgagt atatcaacct ttacaaccaa 840
aagacgaaac agaagctgcc aaagttcaag cctctttaca aacaggttct ttcagaccgc 900
gagtcactct cgttctatgg ggagggctac acttcggatg aggaagtcct ggaggtgttc 960
aggaatactc tcaataagaa ttcggagatt ttctcttcta taaaaaaact ggaaaagttg 1020
tttaagaatt ttgacgaata ctctagcgcc ggcatatttg tgaaaaacgg cccggccata 1080
tcaacgataa gtaaagatat cttcggcgaa tggaacgtga tcagagacaa atggaacgcg 1140
gagtatgacg atattcacct gaagaagaag gctgtcgtaa cggagaagta cgaggatgat 1200
cgcaggaaaa gcttcaaaaa gatcggaagt ttcagcctgg aacagttgca ggagtatgct 1260
gacgccgatc ttagcgtcgt cgagaagttg aaggagataa tcatccaaaa ggtcgacgag 1320
atatataaag tctatggatc aagtgaaaaa ctgttcgacg ccgacttcgt tttggagaag 1380
tccctgaaga agaacgacgc tgttgttgcc attatgaagg atctgctcga cagcgtgaag 1440
agtttcgaga actatattaa ggcttttttc ggggagggga aggagactaa cagagatgag 1500
tccttctacg gagacttcgt cctcgcgtac gatatactcc ttaaggtaga ccacatctac 1560
gacgcaatca gaaattacgt gacacaaaag ccgtacagca aggacaagtt caaactctac 1620
ttccagaacc cccagttcat gggcggctgg gacaaggaca aggaaacgga ttacagggct 1680
acgatcctga ggtatggttc aaaatactac ttggcgatta tggacaagaa gtacgccaag 1740
tgtctccaga agattgacaa agacgatgtc aatggcaatt atgagaagat caactacaag 1800
ctgcttccgg gtccgaacaa gatgctccca aaggttttct tcagcaagaa atggatggcc 1860
tactataacc caagcgagga catccagaag atttataaga acggtacgtt caagaagggc 1920
gacatgttca atcttaacga ctgtcacaag ctgatcgact tcttcaaaga ctcaattagc 1980
cggtacccaa agtggtctaa cgcctatgac ttcaactttt cggaaaccga gaagtacaag 2040
gatatagccg gattttatag agaggtggaa gagcagggct acaaggtgtc attcgagtcc 2100
gccagcaaga aggaagtgga caagctcgtg gaagagggta agctctacat gttccagatt 2160
tataataaag actttagcga taagagccac gggacaccta atctccacac aatgtatttc 2220
aagctgctct tcgacgagaa taaccacggc caaatcaggt tgtcaggagg ggctgaactc 2280
ttcatgcggc gcgctagcct taagaaggag gagcttgtag tccaccctgc gaatagtcca 2340
attgcgaata agaacccgga caatcctaaa aagactacaa cattgagcta cgacgtgtac 2400
aaggataaga ggttttccga ggatcagtac gagctccaca tcccgattgc gatcaacaag 2460
tgcccaaaga atattttcaa gataaacaca gaggtgcgtg tactcctgaa gcatgacgac 2520
aatccttacg tcattgggat tgatcggggc gagaggaacc tcctctatat tgtggtggtg 2580
gacgggaagg ggaacatagt cgaacagtac tcccttaacg aaataattaa caatttcaac 2640
ggcatccgta tcaagaccga ctaccattcg ttgctggaca agaaggagaa ggagagattt 2700
gaggcgcggc aaaattggac aagtatcgag aacatcaagg aactcaaagc aggttatatc 2760
tctcaagttg tgcataagat atgcgagctg gttgagaagt atgacgcagt gatcgctctt 2820
gaggacctca actcgggctt taagaattct agagttaaag tggagaagca ggtctatcaa 2880
aagttcgaga agatgcttat agataagctc aactacatgg tcgataagaa atcgaaccca 2940
tgtgccaccg gcggcgcact caaaggttac caaataacaa acaaattcga gtccttcaaa 3000
tcgatgagta ctcagaatgg gttcatattt tatataccgg cgtggcttac gtctaagatc 3060
gacccgtcaa ctggttttgt caacctgttg aagacgaaat acacgtccat tgccgattca 3120
aaaaagttca tatctagttt tgatcgtatt atgtacgtcc cagaggaaga tcttttcgag 3180
tttgctctcg actacaaaaa cttttcgcgc accgatgcgg attacattaa aaaatggaaa 3240
ctctattcgt acggcaacag aatcaggatt tttcgcaacc ctaagaagaa taacgtcttt 3300
gattgggagg aagtttgctt gactagcgcg tacaaggagc tctttaataa gtatggcatt 3360
aactaccaac agggtgatat cagagcactg ctttgcgaac aatctgacaa ggctttctac 3420
tcatccttca tggctttgat gagcctgatg ctccagatga gaaattcaat tacaggcaga 3480
accgacgtgg atttcttgat ctccccggtt aaaaattctg atggcatctt ttacgatagc 3540
aggaactatg aagcgcaaga gaatgcgatt ctgccaaaaa atgcagacgc caacggtgcc 3600
tataacatcg ccaggaaagt cctgtgggcg atcggccagt tcaaaaaggc cgaagacgaa 3660
aaattggaca aggtcaaaat cgctatcagc aacaaagagt ggctggagta tgctcagaca 3720
tccgtaaagc ataagcgtcc tgctgccacc aaaaaggccg gacaggctaa gaaaaagaag 3780
tga 3783
<210> 37
<211> 1260
<212> PRT
<213> Artificial Sequence
<220>
<223> Fusion protein
<400> 37
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala Ser Lys Leu Glu Lys Phe Thr Asn Cys Tyr Ser Leu Ser Lys Thr
20 25 30
Leu Arg Phe Lys Ala Ile Pro Val Gly Lys Thr Gln Glu Asn Ile Asp
35 40 45
Asn Lys Arg Leu Leu Val Glu Asp Glu Lys Arg Ala Glu Asp Tyr Lys
50 55 60
Gly Val Lys Lys Leu Leu Asp Arg Tyr Tyr Leu Ser Phe Ile Asn Asp
65 70 75 80
Val Leu His Ser Ile Lys Leu Lys Asn Leu Asn Asn Tyr Ile Ser Leu
85 90 95
Phe Arg Lys Lys Thr Arg Thr Glu Lys Glu Asn Lys Glu Leu Glu Asn
100 105 110
Leu Glu Ile Asn Leu Arg Lys Glu Ile Ala Lys Ala Phe Lys Gly Asn
115 120 125
Glu Gly Tyr Lys Ser Leu Phe Lys Lys Asp Ile Ile Glu Thr Ile Leu
130 135 140
Pro Glu Phe Leu Asp Asp Lys Asp Glu Ile Ala Leu Val Asn Ser Phe
145 150 155 160
Asn Gly Phe Thr Thr Ala Phe Thr Gly Phe Phe Asp Asn Arg Glu Asn
165 170 175
Met Phe Ser Glu Glu Ala Lys Ser Thr Ser Ile Ala Phe Arg Cys Ile
180 185 190
Asn Glu Asn Leu Thr Arg Tyr Ile Ser Asn Met Asp Ile Phe Glu Lys
195 200 205
Val Asp Ala Ile Phe Asp Lys His Glu Val Gln Glu Ile Lys Glu Lys
210 215 220
Ile Leu Asn Ser Asp Tyr Asp Val Glu Asp Phe Phe Glu Gly Glu Phe
225 230 235 240
Phe Asn Phe Val Leu Thr Gln Glu Gly Ile Asp Val Tyr Asn Ala Ile
245 250 255
Ile Gly Gly Phe Val Thr Glu Ser Gly Glu Lys Ile Lys Gly Leu Asn
260 265 270
Glu Tyr Ile Asn Leu Tyr Asn Gln Lys Thr Lys Gln Lys Leu Pro Lys
275 280 285
Phe Lys Pro Leu Tyr Lys Gln Val Leu Ser Asp Arg Glu Ser Leu Ser
290 295 300
Phe Tyr Gly Glu Gly Tyr Thr Ser Asp Glu Glu Val Leu Glu Val Phe
305 310 315 320
Arg Asn Thr Leu Asn Lys Asn Ser Glu Ile Phe Ser Ser Ile Lys Lys
325 330 335
Leu Glu Lys Leu Phe Lys Asn Phe Asp Glu Tyr Ser Ser Ala Gly Ile
340 345 350
Phe Val Lys Asn Gly Pro Ala Ile Ser Thr Ile Ser Lys Asp Ile Phe
355 360 365
Gly Glu Trp Asn Val Ile Arg Asp Lys Trp Asn Ala Glu Tyr Asp Asp
370 375 380
Ile His Leu Lys Lys Lys Ala Val Val Thr Glu Lys Tyr Glu Asp Asp
385 390 395 400
Arg Arg Lys Ser Phe Lys Lys Ile Gly Ser Phe Ser Leu Glu Gln Leu
405 410 415
Gln Glu Tyr Ala Asp Ala Asp Leu Ser Val Val Glu Lys Leu Lys Glu
420 425 430
Ile Ile Ile Gln Lys Val Asp Glu Ile Tyr Lys Val Tyr Gly Ser Ser
435 440 445
Glu Lys Leu Phe Asp Ala Asp Phe Val Leu Glu Lys Ser Leu Lys Lys
450 455 460
Asn Asp Ala Val Val Ala Ile Met Lys Asp Leu Leu Asp Ser Val Lys
465 470 475 480
Ser Phe Glu Asn Tyr Ile Lys Ala Phe Phe Gly Glu Gly Lys Glu Thr
485 490 495
Asn Arg Asp Glu Ser Phe Tyr Gly Asp Phe Val Leu Ala Tyr Asp Ile
500 505 510
Leu Leu Lys Val Asp His Ile Tyr Asp Ala Ile Arg Asn Tyr Val Thr
515 520 525
Gln Lys Pro Tyr Ser Lys Asp Lys Phe Lys Leu Tyr Phe Gln Asn Pro
530 535 540
Gln Phe Met Gly Gly Trp Asp Lys Asp Lys Glu Thr Asp Tyr Arg Ala
545 550 555 560
Thr Ile Leu Arg Tyr Gly Ser Lys Tyr Tyr Leu Ala Ile Met Asp Lys
565 570 575
Lys Tyr Ala Lys Cys Leu Gln Lys Ile Asp Lys Asp Asp Val Asn Gly
580 585 590
Asn Tyr Glu Lys Ile Asn Tyr Lys Leu Leu Pro Gly Pro Asn Lys Met
595 600 605
Leu Pro Lys Val Phe Phe Ser Lys Lys Trp Met Ala Tyr Tyr Asn Pro
610 615 620
Ser Glu Asp Ile Gln Lys Ile Tyr Lys Asn Gly Thr Phe Lys Lys Gly
625 630 635 640
Asp Met Phe Asn Leu Asn Asp Cys His Lys Leu Ile Asp Phe Phe Lys
645 650 655
Asp Ser Ile Ser Arg Tyr Pro Lys Trp Ser Asn Ala Tyr Asp Phe Asn
660 665 670
Phe Ser Glu Thr Glu Lys Tyr Lys Asp Ile Ala Gly Phe Tyr Arg Glu
675 680 685
Val Glu Glu Gln Gly Tyr Lys Val Ser Phe Glu Ser Ala Ser Lys Lys
690 695 700
Glu Val Asp Lys Leu Val Glu Glu Gly Lys Leu Tyr Met Phe Gln Ile
705 710 715 720
Tyr Asn Lys Asp Phe Ser Asp Lys Ser His Gly Thr Pro Asn Leu His
725 730 735
Thr Met Tyr Phe Lys Leu Leu Phe Asp Glu Asn Asn His Gly Gln Ile
740 745 750
Arg Leu Ser Gly Gly Ala Glu Leu Phe Met Arg Arg Ala Ser Leu Lys
755 760 765
Lys Glu Glu Leu Val Val His Pro Ala Asn Ser Pro Ile Ala Asn Lys
770 775 780
Asn Pro Asp Asn Pro Lys Lys Thr Thr Thr Leu Ser Tyr Asp Val Tyr
785 790 795 800
Lys Asp Lys Arg Phe Ser Glu Asp Gln Tyr Glu Leu His Ile Pro Ile
805 810 815
Ala Ile Asn Lys Cys Pro Lys Asn Ile Phe Lys Ile Asn Thr Glu Val
820 825 830
Arg Val Leu Leu Lys His Asp Asp Asn Pro Tyr Val Ile Gly Ile Asp
835 840 845
Arg Gly Glu Arg Asn Leu Leu Tyr Ile Val Val Val Asp Gly Lys Gly
850 855 860
Asn Ile Val Glu Gln Tyr Ser Leu Asn Glu Ile Ile Asn Asn Phe Asn
865 870 875 880
Gly Ile Arg Ile Lys Thr Asp Tyr His Ser Leu Leu Asp Lys Lys Glu
885 890 895
Lys Glu Arg Phe Glu Ala Arg Gln Asn Trp Thr Ser Ile Glu Asn Ile
900 905 910
Lys Glu Leu Lys Ala Gly Tyr Ile Ser Gln Val Val His Lys Ile Cys
915 920 925
Glu Leu Val Glu Lys Tyr Asp Ala Val Ile Ala Leu Glu Asp Leu Asn
930 935 940
Ser Gly Phe Lys Asn Ser Arg Val Lys Val Glu Lys Gln Val Tyr Gln
945 950 955 960
Lys Phe Glu Lys Met Leu Ile Asp Lys Leu Asn Tyr Met Val Asp Lys
965 970 975
Lys Ser Asn Pro Cys Ala Thr Gly Gly Ala Leu Lys Gly Tyr Gln Ile
980 985 990
Thr Asn Lys Phe Glu Ser Phe Lys Ser Met Ser Thr Gln Asn Gly Phe
995 1000 1005
Ile Phe Tyr Ile Pro Ala Trp Leu Thr Ser Lys Ile Asp Pro Ser
1010 1015 1020
Thr Gly Phe Val Asn Leu Leu Lys Thr Lys Tyr Thr Ser Ile Ala
1025 1030 1035
Asp Ser Lys Lys Phe Ile Ser Ser Phe Asp Arg Ile Met Tyr Val
1040 1045 1050
Pro Glu Glu Asp Leu Phe Glu Phe Ala Leu Asp Tyr Lys Asn Phe
1055 1060 1065
Ser Arg Thr Asp Ala Asp Tyr Ile Lys Lys Trp Lys Leu Tyr Ser
1070 1075 1080
Tyr Gly Asn Arg Ile Arg Ile Phe Arg Asn Pro Lys Lys Asn Asn
1085 1090 1095
Val Phe Asp Trp Glu Glu Val Cys Leu Thr Ser Ala Tyr Lys Glu
1100 1105 1110
Leu Phe Asn Lys Tyr Gly Ile Asn Tyr Gln Gln Gly Asp Ile Arg
1115 1120 1125
Ala Leu Leu Cys Glu Gln Ser Asp Lys Ala Phe Tyr Ser Ser Phe
1130 1135 1140
Met Ala Leu Met Ser Leu Met Leu Gln Met Arg Asn Ser Ile Thr
1145 1150 1155
Gly Arg Thr Asp Val Asp Phe Leu Ile Ser Pro Val Lys Asn Ser
1160 1165 1170
Asp Gly Ile Phe Tyr Asp Ser Arg Asn Tyr Glu Ala Gln Glu Asn
1175 1180 1185
Ala Ile Leu Pro Lys Asn Ala Asp Ala Asn Gly Ala Tyr Asn Ile
1190 1195 1200
Ala Arg Lys Val Leu Trp Ala Ile Gly Gln Phe Lys Lys Ala Glu
1205 1210 1215
Asp Glu Lys Leu Asp Lys Val Lys Ile Ala Ile Ser Asn Lys Glu
1220 1225 1230
Trp Leu Glu Tyr Ala Gln Thr Ser Val Lys His Lys Arg Pro Ala
1235 1240 1245
Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys Lys
1250 1255 1260
<210> 38
<211> 3873
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 38
atgccgaaga agaagcgcaa ggtcgggggc gggggctcag gcgggggcgg gagcggcggc 60
gggggctctg ggggcggcgg cagcggcggg ggcggcagcg ggggcggcgg gtcgatgtca 120
aagctcgaga aattcaccaa ctgttattcg ttgagcaaaa cactgcggtt taaagcgatt 180
ccagtcggca agactcaaga gaatatagac aataagcggc tgttggtgga agatgaaaag 240
cgcgcggaag actacaaagg ggtgaagaag ttgttggaca gatactacct ctcttttatc 300
aatgatgtct tgcactcaat caaattgaag aatctgaaca actacatctc cctcttcaga 360
aagaaaacaa ggacagaaaa ggagaataag gaacttgaaa atttggagat caatctgagg 420
aaagagatcg cgaaagcctt taaaggcaac gaaggataca aaagtctgtt caagaaggat 480
ataattgaga caattttgcc agagttcctc gatgacaagg acgagattgc gctggtcaat 540
tcgttcaacg gattcacaac agcattcaca ggcttctttg ataatcggga aaatatgttc 600
tctgaggagg caaagtccac ttctattgcg ttcaggtgta tcaatgagaa tctcactagg 660
tacatttcca acatggatat ctttgagaag gttgacgcaa tttttgacaa gcacgaagtt 720
caggagatta aggagaagat cctcaattcc gattatgacg ttgaggactt cttcgagggt 780
gagtttttta atttcgtgct cactcaagag ggtatcgacg tgtataatgc gatcatcggt 840
gggttcgtga ctgagtccgg tgaaaagatt aagggattga acgagtatat caacctttac 900
aaccaaaaga cgaaacagaa gctgccaaag ttcaagcctc tttacaaaca ggttctttca 960
gaccgcgagt cactctcgtt ctatggggag ggctacactt cggatgagga agtcctggag 1020
gtgttcagga atactctcaa taagaattcg gagattttct cttctataaa aaaactggaa 1080
aagttgttta agaattttga cgaatactct agcgccggca tatttgtgaa aaacggcccg 1140
gccatatcaa cgataagtaa agatatcttc ggcgaatgga acgtgatcag agacaaatgg 1200
aacgcggagt atgacgatat tcacctgaag aagaaggctg tcgtaacgga gaagtacgag 1260
gatgatcgca ggaaaagctt caaaaagatc ggaagtttca gcctggaaca gttgcaggag 1320
tatgctgacg ccgatcttag cgtcgtcgag aagttgaagg agataatcat ccaaaaggtc 1380
gacgagatat ataaagtcta tggatcaagt gaaaaactgt tcgacgccga cttcgttttg 1440
gagaagtccc tgaagaagaa cgacgctgtt gttgccatta tgaaggatct gctcgacagc 1500
gtgaagagtt tcgagaacta tattaaggct tttttcgggg aggggaagga gactaacaga 1560
gatgagtcct tctacggaga cttcgtcctc gcgtacgata tactccttaa ggtagaccac 1620
atctacgacg caatcagaaa ttacgtgaca caaaagccgt acagcaagga caagttcaaa 1680
ctctacttcc agaaccccca gttcatgggc ggctgggaca aggacaagga aacggattac 1740
agggctacga tcctgaggta tggttcaaaa tactacttgg cgattatgga caagaagtac 1800
gccaagtgtc tccagaagat tgacaaagac gatgtcaatg gcaattatga gaagatcaac 1860
tacaagctgc ttccgggtcc gaacaagatg ctcccaaagg ttttcttcag caagaaatgg 1920
atggcctact ataacccaag cgaggacatc cagaagattt ataagaacgg tacgttcaag 1980
aagggcgaca tgttcaatct taacgactgt cacaagctga tcgacttctt caaagactca 2040
attagccggt acccaaagtg gtctaacgcc tatgacttca acttttcgga aaccgagaag 2100
tacaaggata tagccggatt ttatagagag gtggaagagc agggctacaa ggtgtcattc 2160
gagtccgcca gcaagaagga agtggacaag ctcgtggaag agggtaagct ctacatgttc 2220
cagatttata ataaagactt tagcgataag agccacggga cacctaatct ccacacaatg 2280
tatttcaagc tgctcttcga cgagaataac cacggccaaa tcaggttgtc aggaggggct 2340
gaactcttca tgcggcgcgc tagccttaag aaggaggagc ttgtagtcca ccctgcgaat 2400
agtccaattg cgaataagaa cccggacaat cctaaaaaga ctacaacatt gagctacgac 2460
gtgtacaagg ataagaggtt ttccgaggat cagtacgagc tccacatccc gattgcgatc 2520
aacaagtgcc caaagaatat tttcaagata aacacagagg tgcgtgtact cctgaagcat 2580
gacgacaatc cttacgtcat tgggattgat cggggcgaga ggaacctcct ctatattgtg 2640
gtggtggacg ggaaggggaa catagtcgaa cagtactccc ttaacgaaat aattaacaat 2700
ttcaacggca tccgtatcaa gaccgactac cattcgttgc tggacaagaa ggagaaggag 2760
agatttgagg cgcggcaaaa ttggacaagt atcgagaaca tcaaggaact caaagcaggt 2820
tatatctctc aagttgtgca taagatatgc gagctggttg agaagtatga cgcagtgatc 2880
gctcttgagg acctcaactc gggctttaag aattctagag ttaaagtgga gaagcaggtc 2940
tatcaaaagt tcgagaagat gcttatagat aagctcaact acatggtcga taagaaatcg 3000
aacccatgtg ccaccggcgg cgcactcaaa ggttaccaaa taacaaacaa attcgagtcc 3060
ttcaaatcga tgagtactca gaatgggttc atattttata taccggcgtg gcttacgtct 3120
aagatcgacc cgtcaactgg ttttgtcaac ctgttgaaga cgaaatacac gtccattgcc 3180
gattcaaaaa agttcatatc tagttttgat cgtattatgt acgtcccaga ggaagatctt 3240
ttcgagtttg ctctcgacta caaaaacttt tcgcgcaccg atgcggatta cattaaaaaa 3300
tggaaactct attcgtacgg caacagaatc aggatttttc gcaaccctaa gaagaataac 3360
gtctttgatt gggaggaagt ttgcttgact agcgcgtaca aggagctctt taataagtat 3420
ggcattaact accaacaggg tgatatcaga gcactgcttt gcgaacaatc tgacaaggct 3480
ttctactcat ccttcatggc tttgatgagc ctgatgctcc agatgagaaa ttcaattaca 3540
ggcagaaccg acgtggattt cttgatctcc ccggttaaaa attctgatgg catcttttac 3600
gatagcagga actatgaagc gcaagagaat gcgattctgc caaaaaatgc agacgccaac 3660
ggtgcctata acatcgccag gaaagtcctg tgggcgatcg gccagttcaa aaaggccgaa 3720
gacgaaaaat tggacaaggt caaaatcgct atcagcaaca aagagtggct ggagtatgct 3780
cagacatccg taaagcatgg cagcccgaag aagaagcgca aggtgtccgg cggcagctcc 3840
ggcggcagcc cgaagaagaa gcgcaaagtg tga 3873
<210> 39
<211> 1290
<212> PRT
<213> Artificial Sequence
<220>
<223> Fusion protein
<400> 39
Met Pro Lys Lys Lys Arg Lys Val Gly Gly Gly Gly Ser Gly Gly Gly
1 5 10 15
Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly
20 25 30
Ser Gly Gly Gly Gly Ser Met Ser Lys Leu Glu Lys Phe Thr Asn Cys
35 40 45
Tyr Ser Leu Ser Lys Thr Leu Arg Phe Lys Ala Ile Pro Val Gly Lys
50 55 60
Thr Gln Glu Asn Ile Asp Asn Lys Arg Leu Leu Val Glu Asp Glu Lys
65 70 75 80
Arg Ala Glu Asp Tyr Lys Gly Val Lys Lys Leu Leu Asp Arg Tyr Tyr
85 90 95
Leu Ser Phe Ile Asn Asp Val Leu His Ser Ile Lys Leu Lys Asn Leu
100 105 110
Asn Asn Tyr Ile Ser Leu Phe Arg Lys Lys Thr Arg Thr Glu Lys Glu
115 120 125
Asn Lys Glu Leu Glu Asn Leu Glu Ile Asn Leu Arg Lys Glu Ile Ala
130 135 140
Lys Ala Phe Lys Gly Asn Glu Gly Tyr Lys Ser Leu Phe Lys Lys Asp
145 150 155 160
Ile Ile Glu Thr Ile Leu Pro Glu Phe Leu Asp Asp Lys Asp Glu Ile
165 170 175
Ala Leu Val Asn Ser Phe Asn Gly Phe Thr Thr Ala Phe Thr Gly Phe
180 185 190
Phe Asp Asn Arg Glu Asn Met Phe Ser Glu Glu Ala Lys Ser Thr Ser
195 200 205
Ile Ala Phe Arg Cys Ile Asn Glu Asn Leu Thr Arg Tyr Ile Ser Asn
210 215 220
Met Asp Ile Phe Glu Lys Val Asp Ala Ile Phe Asp Lys His Glu Val
225 230 235 240
Gln Glu Ile Lys Glu Lys Ile Leu Asn Ser Asp Tyr Asp Val Glu Asp
245 250 255
Phe Phe Glu Gly Glu Phe Phe Asn Phe Val Leu Thr Gln Glu Gly Ile
260 265 270
Asp Val Tyr Asn Ala Ile Ile Gly Gly Phe Val Thr Glu Ser Gly Glu
275 280 285
Lys Ile Lys Gly Leu Asn Glu Tyr Ile Asn Leu Tyr Asn Gln Lys Thr
290 295 300
Lys Gln Lys Leu Pro Lys Phe Lys Pro Leu Tyr Lys Gln Val Leu Ser
305 310 315 320
Asp Arg Glu Ser Leu Ser Phe Tyr Gly Glu Gly Tyr Thr Ser Asp Glu
325 330 335
Glu Val Leu Glu Val Phe Arg Asn Thr Leu Asn Lys Asn Ser Glu Ile
340 345 350
Phe Ser Ser Ile Lys Lys Leu Glu Lys Leu Phe Lys Asn Phe Asp Glu
355 360 365
Tyr Ser Ser Ala Gly Ile Phe Val Lys Asn Gly Pro Ala Ile Ser Thr
370 375 380
Ile Ser Lys Asp Ile Phe Gly Glu Trp Asn Val Ile Arg Asp Lys Trp
385 390 395 400
Asn Ala Glu Tyr Asp Asp Ile His Leu Lys Lys Lys Ala Val Val Thr
405 410 415
Glu Lys Tyr Glu Asp Asp Arg Arg Lys Ser Phe Lys Lys Ile Gly Ser
420 425 430
Phe Ser Leu Glu Gln Leu Gln Glu Tyr Ala Asp Ala Asp Leu Ser Val
435 440 445
Val Glu Lys Leu Lys Glu Ile Ile Ile Gln Lys Val Asp Glu Ile Tyr
450 455 460
Lys Val Tyr Gly Ser Ser Glu Lys Leu Phe Asp Ala Asp Phe Val Leu
465 470 475 480
Glu Lys Ser Leu Lys Lys Asn Asp Ala Val Val Ala Ile Met Lys Asp
485 490 495
Leu Leu Asp Ser Val Lys Ser Phe Glu Asn Tyr Ile Lys Ala Phe Phe
500 505 510
Gly Glu Gly Lys Glu Thr Asn Arg Asp Glu Ser Phe Tyr Gly Asp Phe
515 520 525
Val Leu Ala Tyr Asp Ile Leu Leu Lys Val Asp His Ile Tyr Asp Ala
530 535 540
Ile Arg Asn Tyr Val Thr Gln Lys Pro Tyr Ser Lys Asp Lys Phe Lys
545 550 555 560
Leu Tyr Phe Gln Asn Pro Gln Phe Met Gly Gly Trp Asp Lys Asp Lys
565 570 575
Glu Thr Asp Tyr Arg Ala Thr Ile Leu Arg Tyr Gly Ser Lys Tyr Tyr
580 585 590
Leu Ala Ile Met Asp Lys Lys Tyr Ala Lys Cys Leu Gln Lys Ile Asp
595 600 605
Lys Asp Asp Val Asn Gly Asn Tyr Glu Lys Ile Asn Tyr Lys Leu Leu
610 615 620
Pro Gly Pro Asn Lys Met Leu Pro Lys Val Phe Phe Ser Lys Lys Trp
625 630 635 640
Met Ala Tyr Tyr Asn Pro Ser Glu Asp Ile Gln Lys Ile Tyr Lys Asn
645 650 655
Gly Thr Phe Lys Lys Gly Asp Met Phe Asn Leu Asn Asp Cys His Lys
660 665 670
Leu Ile Asp Phe Phe Lys Asp Ser Ile Ser Arg Tyr Pro Lys Trp Ser
675 680 685
Asn Ala Tyr Asp Phe Asn Phe Ser Glu Thr Glu Lys Tyr Lys Asp Ile
690 695 700
Ala Gly Phe Tyr Arg Glu Val Glu Glu Gln Gly Tyr Lys Val Ser Phe
705 710 715 720
Glu Ser Ala Ser Lys Lys Glu Val Asp Lys Leu Val Glu Glu Gly Lys
725 730 735
Leu Tyr Met Phe Gln Ile Tyr Asn Lys Asp Phe Ser Asp Lys Ser His
740 745 750
Gly Thr Pro Asn Leu His Thr Met Tyr Phe Lys Leu Leu Phe Asp Glu
755 760 765
Asn Asn His Gly Gln Ile Arg Leu Ser Gly Gly Ala Glu Leu Phe Met
770 775 780
Arg Arg Ala Ser Leu Lys Lys Glu Glu Leu Val Val His Pro Ala Asn
785 790 795 800
Ser Pro Ile Ala Asn Lys Asn Pro Asp Asn Pro Lys Lys Thr Thr Thr
805 810 815
Leu Ser Tyr Asp Val Tyr Lys Asp Lys Arg Phe Ser Glu Asp Gln Tyr
820 825 830
Glu Leu His Ile Pro Ile Ala Ile Asn Lys Cys Pro Lys Asn Ile Phe
835 840 845
Lys Ile Asn Thr Glu Val Arg Val Leu Leu Lys His Asp Asp Asn Pro
850 855 860
Tyr Val Ile Gly Ile Asp Arg Gly Glu Arg Asn Leu Leu Tyr Ile Val
865 870 875 880
Val Val Asp Gly Lys Gly Asn Ile Val Glu Gln Tyr Ser Leu Asn Glu
885 890 895
Ile Ile Asn Asn Phe Asn Gly Ile Arg Ile Lys Thr Asp Tyr His Ser
900 905 910
Leu Leu Asp Lys Lys Glu Lys Glu Arg Phe Glu Ala Arg Gln Asn Trp
915 920 925
Thr Ser Ile Glu Asn Ile Lys Glu Leu Lys Ala Gly Tyr Ile Ser Gln
930 935 940
Val Val His Lys Ile Cys Glu Leu Val Glu Lys Tyr Asp Ala Val Ile
945 950 955 960
Ala Leu Glu Asp Leu Asn Ser Gly Phe Lys Asn Ser Arg Val Lys Val
965 970 975
Glu Lys Gln Val Tyr Gln Lys Phe Glu Lys Met Leu Ile Asp Lys Leu
980 985 990
Asn Tyr Met Val Asp Lys Lys Ser Asn Pro Cys Ala Thr Gly Gly Ala
995 1000 1005
Leu Lys Gly Tyr Gln Ile Thr Asn Lys Phe Glu Ser Phe Lys Ser
1010 1015 1020
Met Ser Thr Gln Asn Gly Phe Ile Phe Tyr Ile Pro Ala Trp Leu
1025 1030 1035
Thr Ser Lys Ile Asp Pro Ser Thr Gly Phe Val Asn Leu Leu Lys
1040 1045 1050
Thr Lys Tyr Thr Ser Ile Ala Asp Ser Lys Lys Phe Ile Ser Ser
1055 1060 1065
Phe Asp Arg Ile Met Tyr Val Pro Glu Glu Asp Leu Phe Glu Phe
1070 1075 1080
Ala Leu Asp Tyr Lys Asn Phe Ser Arg Thr Asp Ala Asp Tyr Ile
1085 1090 1095
Lys Lys Trp Lys Leu Tyr Ser Tyr Gly Asn Arg Ile Arg Ile Phe
1100 1105 1110
Arg Asn Pro Lys Lys Asn Asn Val Phe Asp Trp Glu Glu Val Cys
1115 1120 1125
Leu Thr Ser Ala Tyr Lys Glu Leu Phe Asn Lys Tyr Gly Ile Asn
1130 1135 1140
Tyr Gln Gln Gly Asp Ile Arg Ala Leu Leu Cys Glu Gln Ser Asp
1145 1150 1155
Lys Ala Phe Tyr Ser Ser Phe Met Ala Leu Met Ser Leu Met Leu
1160 1165 1170
Gln Met Arg Asn Ser Ile Thr Gly Arg Thr Asp Val Asp Phe Leu
1175 1180 1185
Ile Ser Pro Val Lys Asn Ser Asp Gly Ile Phe Tyr Asp Ser Arg
1190 1195 1200
Asn Tyr Glu Ala Gln Glu Asn Ala Ile Leu Pro Lys Asn Ala Asp
1205 1210 1215
Ala Asn Gly Ala Tyr Asn Ile Ala Arg Lys Val Leu Trp Ala Ile
1220 1225 1230
Gly Gln Phe Lys Lys Ala Glu Asp Glu Lys Leu Asp Lys Val Lys
1235 1240 1245
Ile Ala Ile Ser Asn Lys Glu Trp Leu Glu Tyr Ala Gln Thr Ser
1250 1255 1260
Val Lys His Gly Ser Pro Lys Lys Lys Arg Lys Val Ser Gly Gly
1265 1270 1275
Ser Ser Gly Gly Ser Pro Lys Lys Lys Arg Lys Val
1280 1285 1290
<210> 40
<211> 3852
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 40
atggctccta agaagaagcg gaaggttggt attcacgggg tgcctgcggc tctgtttcaa 60
gattttacac atctgtaccc gctgagtaaa acagtgcggt tcgagctgaa acccatagga 120
aggaccctcg agcacatcca cgcgaagaat tttctgagcc aggatgaaac tatggctgat 180
atgtatcaaa aagttaaggt cattttggac gactatcatc gcgattttat tgccgacatg 240
atgggagagg tgaaactcac gaagcttgct gaattttacg acgtctatct gaagttcagg 300
aaaaatccta aggacgatgg gctgcaaaaa cagcttaaag accttcaagc tgtccttcgg 360
aaggaatcgg tgaagcctat agggtcaggt gggaagtaca aaacaggcta cgatagactc 420
tttggggcaa aactcttcaa agatggaaaa gagttgggtg acctcgcaaa attcgttata 480
gcccaagaag gtgagtcttc tccgaagctg gctcatcttg ctcattttga gaagttcagc 540
acgtatttta ctggatttca cgataatcgg aagaatatgt actcggatga agacaagcat 600
actgcaatag cgtacaggct catccatgag aatttgccga gattcatcga caatctgcaa 660
atcttgacaa caatcaaaca aaagcatagc gccctctatg atcagataat caacgagctc 720
acggcctccg ggctcgacgt ctccttggct tctcatcttg acgggtatca caagctcctt 780
acacaagagg ggatcacggc atacaacagg atcataggag aggtgaatgg atatacaaat 840
aagcataacc agatatgcca caagagcgag cgcatagcga aacttagacc cttgcacaag 900
caaatccttt ctgacggaat gggagtgtca ttccttccgt ctaagttcgc ggatgatagt 960
gagatgtgcc aagcggtcaa cgaattttat cgccattata ctgacgtgtt cgcaaaggtg 1020
caaagtctct ttgacggatt tgatgatcac cagaaagacg ggatctatgt tgaacacaaa 1080
aaccttaatg aactgagcaa acaggcgttc ggcgactttg ctttgctggg gagggtcctt 1140
gatggatact acgtggacgt tgtcaatccg gagttcaatg agcggttcgc aaaggccaag 1200
actgacaatg cgaaagccaa gcttacaaaa gaaaaggaca aattcattaa aggagtccac 1260
tcactggctt ccctcgaaca agcaatagaa caccatacag ctagacacga cgatgagagt 1320
gttcaagccg gaaaacttgg ccagtacttc aaacacggtt tggcgggggt tgacaacccg 1380
attcagaaaa ttcacaataa ccattcgacg attaaagggt ttctggaaag ggaaaggcct 1440
gctggggaac gggcgctccc gaagatcaag tcaggaaaaa acccagaaat gacacagctc 1500
aggcagctga aggaactttt ggacaacgca ttgaatgtgg cgcacttcgc taagctgctg 1560
acaactaaaa caaccttgga caaccaggat ggaaattttt acggggagtt tggggtgctt 1620
tacgacgagc tggctaaaat tccaactctc tacaataagg ttagagatta tctctctcaa 1680
aagccctttt ctaccgaaaa gtataagctc aacttcggca atccgaccct tctcaatggg 1740
tgggacctga acaaagagaa agataacttt ggggttatac ttcagaagga tggatgctat 1800
tacttggcgc ttcttgataa ggctcataaa aaagttttcg acaacgcccc taacactggt 1860
aagaacgtct accaaaagat ggtctacaaa ctgttgcccg gccccaacaa aatgcttcct 1920
aaagtgtttt tcgcaaaatc gaatctcgac tattataatc catctgccga gctccttgac 1980
aaatatgcta aggggaccca taaaaagggt gataatttca acctgaagga ctgccacgcg 2040
cttatcgact ttttcaaagc cgggataaat aagcatccgg agtggcaaca ttttggtttt 2100
aaattttcgc caacgtcgtc ctatcgcgac ctttccgatt tctataggga agttgaacct 2160
caggggtacc aggtcaaatt tgttgacatt aatgcggact acattgatga attggtggag 2220
caagggaagc tctacctctt tcaaatatat aacaaagatt tctcgccaaa agcgcatggt 2280
aaaccgaatc ttcatacctt gtactttaaa gcactttttt cagaagataa cttggcggac 2340
ccgatctaca agctgaatgg ggaagctcag atcttctaca ggaaagcttc gttggacatg 2400
aacgagacta ccatacatcg cgcgggagag gtgcttgaga acaaaaatcc cgacaacccg 2460
aaaaagcggc aattcgttta cgacatcatc aaagacaaac ggtacacgca ggacaaattt 2520
atgctccacg tccccattac catgaatttt ggagtccaag gcatgaccat taaggaattc 2580
aacaaaaagg tcaaccaaag tattcagcaa tacgatgaag tcaatgtcat aggcatagat 2640
cggggagaaa ggcatctgtt gtatcttacc gtgattaact ctaagggtga aatactggag 2700
caacggtcac ttaacgatat aaccacggcg tccgcgaacg gtacacaagt gaccactccc 2760
taccacaaaa tattggataa aagggagata gaacgcttga atgcccgcgt tggctggggt 2820
gagattgaga ccatcaaaga gcttaaatcg ggatatttgt ctcacgtcgt tcatcaaatt 2880
aaccaactca tgcttaagta caatgcaatc gttgtgctcg aggacctgaa ctttggtttc 2940
aaaagaggga ggttcaaggt ggaaaaacaa atttaccaga actttgaaaa cgcgcttatc 3000
aagaaattga atcaccttgt tttgaaagat aaggcagatg acgaaatcgg gtcgtataaa 3060
aatgcactcc agttgacaaa taatttcacg gatttgaagt cgatcggcaa gcaaacaggg 3120
ttcctctttt atgtgccagc gtggaataca tcaaaaattg atccggagac gggatttgtc 3180
gacttgctga agcctaggta tgagaacatt gcccaatctc aggccttttt cggcaaattc 3240
gataaaatat gctacaacac agacaaaggt tattttgaat ttcacattga ttacgccaaa 3300
tttacagata aggcgaaaaa cagcagacag aaatgggcta tctgttctca tggggacaaa 3360
cgctatgtct acgataagac ggctaatcaa aataaaggcg ccgcaaaagg tattaatgtg 3420
aatgatgagc tgaaaagctt gtttgcccgc taccatatca atgataaaca accaaacttg 3480
gtgatggaca tatgccagaa caatgacaaa gaattccaca agtcactcat gtgcctgctt 3540
aaaacccttt tggcgctgcg gtatagcaat gcatctagcg atgaagactt tattttgagt 3600
cccgtggcca acgacgaggg cgtgtttttt aattcagcct tggcggacga tacgcagccc 3660
cagaatgcgg acgcaaacgg cgcgtaccac attgcactga agggactgtg gcttctgaac 3720
gagctgaaaa atagcgacga cctgaataaa gtcaagttgg ccattgacaa tcaaacctgg 3780
ttgaatttcg ctcaaaatag aaagcgtcct gctgccacca aaaaggccgg acaggctaag 3840
aaaaagaagt ga 3852
<210> 41
<211> 1283
<212> PRT
<213> Artificial Sequence
<220>
<223> Fusion protein
<400> 41
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala Leu Phe Gln Asp Phe Thr His Leu Tyr Pro Leu Ser Lys Thr Val
20 25 30
Arg Phe Glu Leu Lys Pro Ile Gly Arg Thr Leu Glu His Ile His Ala
35 40 45
Lys Asn Phe Leu Ser Gln Asp Glu Thr Met Ala Asp Met Tyr Gln Lys
50 55 60
Val Lys Val Ile Leu Asp Asp Tyr His Arg Asp Phe Ile Ala Asp Met
65 70 75 80
Met Gly Glu Val Lys Leu Thr Lys Leu Ala Glu Phe Tyr Asp Val Tyr
85 90 95
Leu Lys Phe Arg Lys Asn Pro Lys Asp Asp Gly Leu Gln Lys Gln Leu
100 105 110
Lys Asp Leu Gln Ala Val Leu Arg Lys Glu Ser Val Lys Pro Ile Gly
115 120 125
Ser Gly Gly Lys Tyr Lys Thr Gly Tyr Asp Arg Leu Phe Gly Ala Lys
130 135 140
Leu Phe Lys Asp Gly Lys Glu Leu Gly Asp Leu Ala Lys Phe Val Ile
145 150 155 160
Ala Gln Glu Gly Glu Ser Ser Pro Lys Leu Ala His Leu Ala His Phe
165 170 175
Glu Lys Phe Ser Thr Tyr Phe Thr Gly Phe His Asp Asn Arg Lys Asn
180 185 190
Met Tyr Ser Asp Glu Asp Lys His Thr Ala Ile Ala Tyr Arg Leu Ile
195 200 205
His Glu Asn Leu Pro Arg Phe Ile Asp Asn Leu Gln Ile Leu Thr Thr
210 215 220
Ile Lys Gln Lys His Ser Ala Leu Tyr Asp Gln Ile Ile Asn Glu Leu
225 230 235 240
Thr Ala Ser Gly Leu Asp Val Ser Leu Ala Ser His Leu Asp Gly Tyr
245 250 255
His Lys Leu Leu Thr Gln Glu Gly Ile Thr Ala Tyr Asn Arg Ile Ile
260 265 270
Gly Glu Val Asn Gly Tyr Thr Asn Lys His Asn Gln Ile Cys His Lys
275 280 285
Ser Glu Arg Ile Ala Lys Leu Arg Pro Leu His Lys Gln Ile Leu Ser
290 295 300
Asp Gly Met Gly Val Ser Phe Leu Pro Ser Lys Phe Ala Asp Asp Ser
305 310 315 320
Glu Met Cys Gln Ala Val Asn Glu Phe Tyr Arg His Tyr Thr Asp Val
325 330 335
Phe Ala Lys Val Gln Ser Leu Phe Asp Gly Phe Asp Asp His Gln Lys
340 345 350
Asp Gly Ile Tyr Val Glu His Lys Asn Leu Asn Glu Leu Ser Lys Gln
355 360 365
Ala Phe Gly Asp Phe Ala Leu Leu Gly Arg Val Leu Asp Gly Tyr Tyr
370 375 380
Val Asp Val Val Asn Pro Glu Phe Asn Glu Arg Phe Ala Lys Ala Lys
385 390 395 400
Thr Asp Asn Ala Lys Ala Lys Leu Thr Lys Glu Lys Asp Lys Phe Ile
405 410 415
Lys Gly Val His Ser Leu Ala Ser Leu Glu Gln Ala Ile Glu His His
420 425 430
Thr Ala Arg His Asp Asp Glu Ser Val Gln Ala Gly Lys Leu Gly Gln
435 440 445
Tyr Phe Lys His Gly Leu Ala Gly Val Asp Asn Pro Ile Gln Lys Ile
450 455 460
His Asn Asn His Ser Thr Ile Lys Gly Phe Leu Glu Arg Glu Arg Pro
465 470 475 480
Ala Gly Glu Arg Ala Leu Pro Lys Ile Lys Ser Gly Lys Asn Pro Glu
485 490 495
Met Thr Gln Leu Arg Gln Leu Lys Glu Leu Leu Asp Asn Ala Leu Asn
500 505 510
Val Ala His Phe Ala Lys Leu Leu Thr Thr Lys Thr Thr Leu Asp Asn
515 520 525
Gln Asp Gly Asn Phe Tyr Gly Glu Phe Gly Val Leu Tyr Asp Glu Leu
530 535 540
Ala Lys Ile Pro Thr Leu Tyr Asn Lys Val Arg Asp Tyr Leu Ser Gln
545 550 555 560
Lys Pro Phe Ser Thr Glu Lys Tyr Lys Leu Asn Phe Gly Asn Pro Thr
565 570 575
Leu Leu Asn Gly Trp Asp Leu Asn Lys Glu Lys Asp Asn Phe Gly Val
580 585 590
Ile Leu Gln Lys Asp Gly Cys Tyr Tyr Leu Ala Leu Leu Asp Lys Ala
595 600 605
His Lys Lys Val Phe Asp Asn Ala Pro Asn Thr Gly Lys Asn Val Tyr
610 615 620
Gln Lys Met Val Tyr Lys Leu Leu Pro Gly Pro Asn Lys Met Leu Pro
625 630 635 640
Lys Val Phe Phe Ala Lys Ser Asn Leu Asp Tyr Tyr Asn Pro Ser Ala
645 650 655
Glu Leu Leu Asp Lys Tyr Ala Lys Gly Thr His Lys Lys Gly Asp Asn
660 665 670
Phe Asn Leu Lys Asp Cys His Ala Leu Ile Asp Phe Phe Lys Ala Gly
675 680 685
Ile Asn Lys His Pro Glu Trp Gln His Phe Gly Phe Lys Phe Ser Pro
690 695 700
Thr Ser Ser Tyr Arg Asp Leu Ser Asp Phe Tyr Arg Glu Val Glu Pro
705 710 715 720
Gln Gly Tyr Gln Val Lys Phe Val Asp Ile Asn Ala Asp Tyr Ile Asp
725 730 735
Glu Leu Val Glu Gln Gly Lys Leu Tyr Leu Phe Gln Ile Tyr Asn Lys
740 745 750
Asp Phe Ser Pro Lys Ala His Gly Lys Pro Asn Leu His Thr Leu Tyr
755 760 765
Phe Lys Ala Leu Phe Ser Glu Asp Asn Leu Ala Asp Pro Ile Tyr Lys
770 775 780
Leu Asn Gly Glu Ala Gln Ile Phe Tyr Arg Lys Ala Ser Leu Asp Met
785 790 795 800
Asn Glu Thr Thr Ile His Arg Ala Gly Glu Val Leu Glu Asn Lys Asn
805 810 815
Pro Asp Asn Pro Lys Lys Arg Gln Phe Val Tyr Asp Ile Ile Lys Asp
820 825 830
Lys Arg Tyr Thr Gln Asp Lys Phe Met Leu His Val Pro Ile Thr Met
835 840 845
Asn Phe Gly Val Gln Gly Met Thr Ile Lys Glu Phe Asn Lys Lys Val
850 855 860
Asn Gln Ser Ile Gln Gln Tyr Asp Glu Val Asn Val Ile Gly Ile Asp
865 870 875 880
Arg Gly Glu Arg His Leu Leu Tyr Leu Thr Val Ile Asn Ser Lys Gly
885 890 895
Glu Ile Leu Glu Gln Arg Ser Leu Asn Asp Ile Thr Thr Ala Ser Ala
900 905 910
Asn Gly Thr Gln Val Thr Thr Pro Tyr His Lys Ile Leu Asp Lys Arg
915 920 925
Glu Ile Glu Arg Leu Asn Ala Arg Val Gly Trp Gly Glu Ile Glu Thr
930 935 940
Ile Lys Glu Leu Lys Ser Gly Tyr Leu Ser His Val Val His Gln Ile
945 950 955 960
Asn Gln Leu Met Leu Lys Tyr Asn Ala Ile Val Val Leu Glu Asp Leu
965 970 975
Asn Phe Gly Phe Lys Arg Gly Arg Phe Lys Val Glu Lys Gln Ile Tyr
980 985 990
Gln Asn Phe Glu Asn Ala Leu Ile Lys Lys Leu Asn His Leu Val Leu
995 1000 1005
Lys Asp Lys Ala Asp Asp Glu Ile Gly Ser Tyr Lys Asn Ala Leu
1010 1015 1020
Gln Leu Thr Asn Asn Phe Thr Asp Leu Lys Ser Ile Gly Lys Gln
1025 1030 1035
Thr Gly Phe Leu Phe Tyr Val Pro Ala Trp Asn Thr Ser Lys Ile
1040 1045 1050
Asp Pro Glu Thr Gly Phe Val Asp Leu Leu Lys Pro Arg Tyr Glu
1055 1060 1065
Asn Ile Ala Gln Ser Gln Ala Phe Phe Gly Lys Phe Asp Lys Ile
1070 1075 1080
Cys Tyr Asn Thr Asp Lys Gly Tyr Phe Glu Phe His Ile Asp Tyr
1085 1090 1095
Ala Lys Phe Thr Asp Lys Ala Lys Asn Ser Arg Gln Lys Trp Ala
1100 1105 1110
Ile Cys Ser His Gly Asp Lys Arg Tyr Val Tyr Asp Lys Thr Ala
1115 1120 1125
Asn Gln Asn Lys Gly Ala Ala Lys Gly Ile Asn Val Asn Asp Glu
1130 1135 1140
Leu Lys Ser Leu Phe Ala Arg Tyr His Ile Asn Asp Lys Gln Pro
1145 1150 1155
Asn Leu Val Met Asp Ile Cys Gln Asn Asn Asp Lys Glu Phe His
1160 1165 1170
Lys Ser Leu Met Cys Leu Leu Lys Thr Leu Leu Ala Leu Arg Tyr
1175 1180 1185
Ser Asn Ala Ser Ser Asp Glu Asp Phe Ile Leu Ser Pro Val Ala
1190 1195 1200
Asn Asp Glu Gly Val Phe Phe Asn Ser Ala Leu Ala Asp Asp Thr
1205 1210 1215
Gln Pro Gln Asn Ala Asp Ala Asn Gly Ala Tyr His Ile Ala Leu
1220 1225 1230
Lys Gly Leu Trp Leu Leu Asn Glu Leu Lys Asn Ser Asp Asp Leu
1235 1240 1245
Asn Lys Val Lys Leu Ala Ile Asp Asn Gln Thr Trp Leu Asn Phe
1250 1255 1260
Ala Gln Asn Arg Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln
1265 1270 1275
Ala Lys Lys Lys Lys
1280
<210> 42
<211> 3936
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 42
atgccgaaga agaagcgcaa ggtcgggggc gggggctcag gcgggggcgg gagcggcggc 60
gggggctctg ggggcggcgg cagcggcggg ggcggcagcg ggggcggcgg gtcgatgctg 120
ttccaggatt tcactcatct gtaccctctc tcaaagactg ttcggttcga gctcaagcct 180
attgggcgga ctctggagca catccacgcg aagaacttcc tcagccagga cgaaaccatg 240
gccgacatgt accagaaggt caaggtcatc ctcgacgact accacaggga cttcatcgcg 300
gacatgatgg gcgaggtgaa gctgacgaag ctcgccgagt tctacgacgt ctacctcaag 360
ttccgcaaga acccgaagga cgacggcctc cagaagcagc tcaaggacct gcaggccgtc 420
ctgaggaagg agtccgtcaa gcccatcggc agcggcggca agtacaagac cggctacgac 480
aggctgttcg gcgccaagct gttcaaggac ggcaaggagc tcggcgacct ggcgaagttc 540
gtgatcgcgc aggagggcga gagctccccc aagctggccc acctggccca cttcgagaag 600
ttcagcacgt acttcaccgg cttccacgac aacaggaaga acatgtacag cgacgaggac 660
aagcacacgg ccatcgccta ccgcctcatc cacgagaacc tgccccgctt catcgacaac 720
ctgcagatcc tgacgaccat caagcagaag cactccgccc tgtacgacca gatcatcaac 780
gagctcaccg cgagcggcct cgacgtgtcc ctcgccagcc acctcgacgg ctaccacaag 840
ctcctgaccc aggagggcat caccgcctac aaccgcatca tcggcgaggt gaacggctac 900
accaacaagc acaaccagat ctgccacaag tccgagagga tcgccaagct caggcccctg 960
cacaagcaga tcctcagcga cggcatgggc gtgagcttcc tcccgtccaa gttcgccgac 1020
gactccgaga tgtgccaggc cgtgaacgag ttctacaggc actacaccga cgtgttcgcc 1080
aaggtgcagt ccctgttcga cggcttcgac gaccaccaga aggacggcat ctacgtggag 1140
cacaagaacc tgaacgagct gtccaagcag gccttcggcg acttcgccct cctgggccgc 1200
gtgctggacg gctactacgt ggacgtcgtg aacccggagt tcaacgagcg cttcgcgaag 1260
gcgaagacgg acaacgcgaa ggccaagctc accaaggaga aggacaagtt catcaagggc 1320
gtccacagcc tcgcgtccct ggagcaggcg atcgagcacc acaccgcgcg ccacgacgac 1380
gagtccgtgc aggccggcaa gctcggccag tacttcaagc acggcctggc cggcgtcgac 1440
aacccgatcc agaagatcca caacaaccac tccaccatca agggcttcct ggagagggag 1500
cgcccggcgg gcgagcgcgc gctccccaag atcaagtccg gcaagaaccc cgagatgacg 1560
cagctcaggc agctgaagga gctgctcgac aacgcgctca acgtggcgca cttcgccaag 1620
ctgctcacga ccaagaccac gctggacaac caggacggca acttctacgg cgagttcggc 1680
gtcctgtacg acgagctggc gaagatcccg accctgtaca acaaggtccg cgactacctg 1740
agccagaagc ccttctccac cgagaagtac aagctcaact tcggcaaccc gaccctcctg 1800
aacggctggg acctcaacaa ggagaaggac aacttcggcg tgatcctcca gaaggacggc 1860
tgctactacc tcgccctgct ggacaaggcg cacaagaagg tcttcgacaa cgccccgaac 1920
accggcaaga acgtgtacca gaagatggtg tacaagctgc tccccggccc caacaagatg 1980
ctgccgaagg tgttcttcgc gaagtccaac ctcgactact acaaccccag cgccgagctc 2040
ctggacaagt acgccaaggg cacgcacaag aagggcgaca acttcaacct caaggactgc 2100
cacgcgctga tcgacttctt caaggcgggc atcaacaagc accccgagtg gcagcacttc 2160
ggcttcaagt tcagcccgac ctccagctac agggacctca gcgacttcta ccgcgaggtg 2220
gagccccagg gctaccaggt gaagttcgtc gacatcaacg ccgactacat cgacgagctc 2280
gtcgagcagg gcaagctcta cctgttccag atctacaaca aggacttctc cccgaaggcc 2340
cacggcaagc cgaacctcca cacgctctac ttcaaggccc tcttcagcga ggacaacctg 2400
gccgacccga tctacaagct caacggcgag gcgcagatct tctaccgcaa ggccagcctg 2460
gacatgaacg aaacgaccat ccacagggcc ggcgaggtcc tggagaacaa gaacccggac 2520
aacccgaaga agaggcagtt cgtctacgac atcatcaagg acaagaggta cacccaggac 2580
aagttcatgc tccacgtccc gatcaccatg aacttcggcg tccagggcat gaccatcaag 2640
gagttcaaca agaaggtcaa ccagagcatc cagcagtacg acgaggtgaa cgtcatcggc 2700
atcgaccgcg gcgagaggca cctgctctac ctgacggtca tcaactccaa gggcgagatc 2760
ctcgagcagc gcagcctgaa cgacatcacg accgcgagcg ccaacggcac gcaggtcacc 2820
acgccgtacc acaagatcct cgacaagcgc gagatcgaga ggctgaacgc gcgcgtcggc 2880
tggggcgaga tcgaaacgat caaggagctc aagtccggct acctcagcca cgtcgtgcac 2940
cagatcaacc agctcatgct gaagtacaac gcgatcgtgg tcctcgagga cctgaacttc 3000
ggcttcaaga ggggccgctt caaggtggag aagcagatct accagaactt cgagaacgcc 3060
ctgatcaaga agctcaacca cctggtcctc aaggacaagg cggacgacga gatcggcagc 3120
tacaagaacg cgctccagct gaccaacaac ttcacggacc tcaagtccat cggcaagcag 3180
acgggcttcc tgttctacgt gccggcgtgg aacacctcca agatcgaccc ggaaaccggc 3240
ttcgtcgacc tgctcaagcc gcgctacgag aacatcgcgc agtcccaggc gttcttcggc 3300
aagttcgaca agatctgcta caacaccgac aagggctact tcgagttcca catcgactac 3360
gcgaagttca ccgacaaggc caagaactcc aggcagaagt gggccatctg cagccacggc 3420
gacaagcgct acgtgtacga caagacggcg aaccagaaca agggcgcggc caagggcatc 3480
aacgtgaacg acgagctgaa gtccctcttc gcgcgctacc acatcaacga caagcagccg 3540
aacctcgtca tggacatctg ccagaacaac gacaaggagt tccacaagag cctgatgtgc 3600
ctgctcaaga ccctgctcgc cctccgctac tccaacgcga gctccgacga ggacttcatc 3660
ctcagccccg tcgcgaacga cgagggcgtg ttcttcaact ccgccctcgc ggacgacacg 3720
cagccgcaga acgccgacgc gaacggcgcc taccacatcg ccctcaaggg cctgtggctg 3780
ctcaacgagc tcaagaactc cgacgacctg aacaaggtga agctcgccat tgacaaccag 3840
acgtggctga atttcgctca gaataggccg aagaagaagc gcaaggtgtc cggcggcagc 3900
tccggcggca gcccgaagaa gaagcgcaaa gtgtga 3936
<210> 43
<211> 1311
<212> PRT
<213> Artificial Sequence
<220>
<223> Fusion protein
<400> 43
Met Pro Lys Lys Lys Arg Lys Val Gly Gly Gly Gly Ser Gly Gly Gly
1 5 10 15
Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly
20 25 30
Ser Gly Gly Gly Gly Ser Met Leu Phe Gln Asp Phe Thr His Leu Tyr
35 40 45
Pro Leu Ser Lys Thr Val Arg Phe Glu Leu Lys Pro Ile Gly Arg Thr
50 55 60
Leu Glu His Ile His Ala Lys Asn Phe Leu Ser Gln Asp Glu Thr Met
65 70 75 80
Ala Asp Met Tyr Gln Lys Val Lys Val Ile Leu Asp Asp Tyr His Arg
85 90 95
Asp Phe Ile Ala Asp Met Met Gly Glu Val Lys Leu Thr Lys Leu Ala
100 105 110
Glu Phe Tyr Asp Val Tyr Leu Lys Phe Arg Lys Asn Pro Lys Asp Asp
115 120 125
Gly Leu Gln Lys Gln Leu Lys Asp Leu Gln Ala Val Leu Arg Lys Glu
130 135 140
Ser Val Lys Pro Ile Gly Ser Gly Gly Lys Tyr Lys Thr Gly Tyr Asp
145 150 155 160
Arg Leu Phe Gly Ala Lys Leu Phe Lys Asp Gly Lys Glu Leu Gly Asp
165 170 175
Leu Ala Lys Phe Val Ile Ala Gln Glu Gly Glu Ser Ser Pro Lys Leu
180 185 190
Ala His Leu Ala His Phe Glu Lys Phe Ser Thr Tyr Phe Thr Gly Phe
195 200 205
His Asp Asn Arg Lys Asn Met Tyr Ser Asp Glu Asp Lys His Thr Ala
210 215 220
Ile Ala Tyr Arg Leu Ile His Glu Asn Leu Pro Arg Phe Ile Asp Asn
225 230 235 240
Leu Gln Ile Leu Thr Thr Ile Lys Gln Lys His Ser Ala Leu Tyr Asp
245 250 255
Gln Ile Ile Asn Glu Leu Thr Ala Ser Gly Leu Asp Val Ser Leu Ala
260 265 270
Ser His Leu Asp Gly Tyr His Lys Leu Leu Thr Gln Glu Gly Ile Thr
275 280 285
Ala Tyr Asn Arg Ile Ile Gly Glu Val Asn Gly Tyr Thr Asn Lys His
290 295 300
Asn Gln Ile Cys His Lys Ser Glu Arg Ile Ala Lys Leu Arg Pro Leu
305 310 315 320
His Lys Gln Ile Leu Ser Asp Gly Met Gly Val Ser Phe Leu Pro Ser
325 330 335
Lys Phe Ala Asp Asp Ser Glu Met Cys Gln Ala Val Asn Glu Phe Tyr
340 345 350
Arg His Tyr Thr Asp Val Phe Ala Lys Val Gln Ser Leu Phe Asp Gly
355 360 365
Phe Asp Asp His Gln Lys Asp Gly Ile Tyr Val Glu His Lys Asn Leu
370 375 380
Asn Glu Leu Ser Lys Gln Ala Phe Gly Asp Phe Ala Leu Leu Gly Arg
385 390 395 400
Val Leu Asp Gly Tyr Tyr Val Asp Val Val Asn Pro Glu Phe Asn Glu
405 410 415
Arg Phe Ala Lys Ala Lys Thr Asp Asn Ala Lys Ala Lys Leu Thr Lys
420 425 430
Glu Lys Asp Lys Phe Ile Lys Gly Val His Ser Leu Ala Ser Leu Glu
435 440 445
Gln Ala Ile Glu His His Thr Ala Arg His Asp Asp Glu Ser Val Gln
450 455 460
Ala Gly Lys Leu Gly Gln Tyr Phe Lys His Gly Leu Ala Gly Val Asp
465 470 475 480
Asn Pro Ile Gln Lys Ile His Asn Asn His Ser Thr Ile Lys Gly Phe
485 490 495
Leu Glu Arg Glu Arg Pro Ala Gly Glu Arg Ala Leu Pro Lys Ile Lys
500 505 510
Ser Gly Lys Asn Pro Glu Met Thr Gln Leu Arg Gln Leu Lys Glu Leu
515 520 525
Leu Asp Asn Ala Leu Asn Val Ala His Phe Ala Lys Leu Leu Thr Thr
530 535 540
Lys Thr Thr Leu Asp Asn Gln Asp Gly Asn Phe Tyr Gly Glu Phe Gly
545 550 555 560
Val Leu Tyr Asp Glu Leu Ala Lys Ile Pro Thr Leu Tyr Asn Lys Val
565 570 575
Arg Asp Tyr Leu Ser Gln Lys Pro Phe Ser Thr Glu Lys Tyr Lys Leu
580 585 590
Asn Phe Gly Asn Pro Thr Leu Leu Asn Gly Trp Asp Leu Asn Lys Glu
595 600 605
Lys Asp Asn Phe Gly Val Ile Leu Gln Lys Asp Gly Cys Tyr Tyr Leu
610 615 620
Ala Leu Leu Asp Lys Ala His Lys Lys Val Phe Asp Asn Ala Pro Asn
625 630 635 640
Thr Gly Lys Asn Val Tyr Gln Lys Met Val Tyr Lys Leu Leu Pro Gly
645 650 655
Pro Asn Lys Met Leu Pro Lys Val Phe Phe Ala Lys Ser Asn Leu Asp
660 665 670
Tyr Tyr Asn Pro Ser Ala Glu Leu Leu Asp Lys Tyr Ala Lys Gly Thr
675 680 685
His Lys Lys Gly Asp Asn Phe Asn Leu Lys Asp Cys His Ala Leu Ile
690 695 700
Asp Phe Phe Lys Ala Gly Ile Asn Lys His Pro Glu Trp Gln His Phe
705 710 715 720
Gly Phe Lys Phe Ser Pro Thr Ser Ser Tyr Arg Asp Leu Ser Asp Phe
725 730 735
Tyr Arg Glu Val Glu Pro Gln Gly Tyr Gln Val Lys Phe Val Asp Ile
740 745 750
Asn Ala Asp Tyr Ile Asp Glu Leu Val Glu Gln Gly Lys Leu Tyr Leu
755 760 765
Phe Gln Ile Tyr Asn Lys Asp Phe Ser Pro Lys Ala His Gly Lys Pro
770 775 780
Asn Leu His Thr Leu Tyr Phe Lys Ala Leu Phe Ser Glu Asp Asn Leu
785 790 795 800
Ala Asp Pro Ile Tyr Lys Leu Asn Gly Glu Ala Gln Ile Phe Tyr Arg
805 810 815
Lys Ala Ser Leu Asp Met Asn Glu Thr Thr Ile His Arg Ala Gly Glu
820 825 830
Val Leu Glu Asn Lys Asn Pro Asp Asn Pro Lys Lys Arg Gln Phe Val
835 840 845
Tyr Asp Ile Ile Lys Asp Lys Arg Tyr Thr Gln Asp Lys Phe Met Leu
850 855 860
His Val Pro Ile Thr Met Asn Phe Gly Val Gln Gly Met Thr Ile Lys
865 870 875 880
Glu Phe Asn Lys Lys Val Asn Gln Ser Ile Gln Gln Tyr Asp Glu Val
885 890 895
Asn Val Ile Gly Ile Asp Arg Gly Glu Arg His Leu Leu Tyr Leu Thr
900 905 910
Val Ile Asn Ser Lys Gly Glu Ile Leu Glu Gln Arg Ser Leu Asn Asp
915 920 925
Ile Thr Thr Ala Ser Ala Asn Gly Thr Gln Val Thr Thr Pro Tyr His
930 935 940
Lys Ile Leu Asp Lys Arg Glu Ile Glu Arg Leu Asn Ala Arg Val Gly
945 950 955 960
Trp Gly Glu Ile Glu Thr Ile Lys Glu Leu Lys Ser Gly Tyr Leu Ser
965 970 975
His Val Val His Gln Ile Asn Gln Leu Met Leu Lys Tyr Asn Ala Ile
980 985 990
Val Val Leu Glu Asp Leu Asn Phe Gly Phe Lys Arg Gly Arg Phe Lys
995 1000 1005
Val Glu Lys Gln Ile Tyr Gln Asn Phe Glu Asn Ala Leu Ile Lys
1010 1015 1020
Lys Leu Asn His Leu Val Leu Lys Asp Lys Ala Asp Asp Glu Ile
1025 1030 1035
Gly Ser Tyr Lys Asn Ala Leu Gln Leu Thr Asn Asn Phe Thr Asp
1040 1045 1050
Leu Lys Ser Ile Gly Lys Gln Thr Gly Phe Leu Phe Tyr Val Pro
1055 1060 1065
Ala Trp Asn Thr Ser Lys Ile Asp Pro Glu Thr Gly Phe Val Asp
1070 1075 1080
Leu Leu Lys Pro Arg Tyr Glu Asn Ile Ala Gln Ser Gln Ala Phe
1085 1090 1095
Phe Gly Lys Phe Asp Lys Ile Cys Tyr Asn Thr Asp Lys Gly Tyr
1100 1105 1110
Phe Glu Phe His Ile Asp Tyr Ala Lys Phe Thr Asp Lys Ala Lys
1115 1120 1125
Asn Ser Arg Gln Lys Trp Ala Ile Cys Ser His Gly Asp Lys Arg
1130 1135 1140
Tyr Val Tyr Asp Lys Thr Ala Asn Gln Asn Lys Gly Ala Ala Lys
1145 1150 1155
Gly Ile Asn Val Asn Asp Glu Leu Lys Ser Leu Phe Ala Arg Tyr
1160 1165 1170
His Ile Asn Asp Lys Gln Pro Asn Leu Val Met Asp Ile Cys Gln
1175 1180 1185
Asn Asn Asp Lys Glu Phe His Lys Ser Leu Met Cys Leu Leu Lys
1190 1195 1200
Thr Leu Leu Ala Leu Arg Tyr Ser Asn Ala Ser Ser Asp Glu Asp
1205 1210 1215
Phe Ile Leu Ser Pro Val Ala Asn Asp Glu Gly Val Phe Phe Asn
1220 1225 1230
Ser Ala Leu Ala Asp Asp Thr Gln Pro Gln Asn Ala Asp Ala Asn
1235 1240 1245
Gly Ala Tyr His Ile Ala Leu Lys Gly Leu Trp Leu Leu Asn Glu
1250 1255 1260
Leu Lys Asn Ser Asp Asp Leu Asn Lys Val Lys Leu Ala Ile Asp
1265 1270 1275
Asn Gln Thr Trp Leu Asn Phe Ala Gln Asn Arg Pro Lys Lys Lys
1280 1285 1290
Arg Lys Val Ser Gly Gly Ser Ser Gly Gly Ser Pro Lys Lys Lys
1295 1300 1305
Arg Lys Val
1310
<210> 44
<211> 36
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 44
Gly Gly Gly Gly Ser Gly Gly Gly Gly Gly Ser Gly Gly Gly Gly Gly
1 5 10 15
Ser Gly Gly Gly Gly Gly Ser Gly Gly Gly Gly Gly Ser Gly Gly Gly
20 25 30
Gly Gly Ser Gly
35
<210> 45
<211> 1260
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<220>
<221> xNLS-03
<222> (1)..(9)
<220>
<221> xEpitope-03
<222> (10)..(17)
<220>
<221> xNLS-04
<222> (1245)..(1260)
<400> 45
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala Ser Lys Leu Glu Lys Phe Thr Asn Cys Tyr Ser Leu Ser Lys Thr
20 25 30
Leu Arg Phe Lys Ala Ile Pro Val Gly Lys Thr Gln Glu Asn Ile Asp
35 40 45
Asn Lys Arg Leu Leu Val Glu Asp Glu Lys Arg Ala Glu Asp Tyr Lys
50 55 60
Gly Val Lys Lys Leu Leu Asp Arg Tyr Tyr Leu Ser Phe Ile Asn Asp
65 70 75 80
Val Leu His Ser Ile Lys Leu Lys Asn Leu Asn Asn Tyr Ile Ser Leu
85 90 95
Phe Arg Lys Lys Thr Arg Thr Glu Lys Glu Asn Lys Glu Leu Glu Asn
100 105 110
Leu Glu Ile Asn Leu Arg Lys Glu Ile Ala Lys Ala Phe Lys Gly Asn
115 120 125
Glu Gly Tyr Lys Ser Leu Phe Lys Lys Asp Ile Ile Glu Thr Ile Leu
130 135 140
Pro Glu Phe Leu Asp Asp Lys Asp Glu Ile Ala Leu Val Asn Ser Phe
145 150 155 160
Asn Gly Phe Thr Thr Ala Phe Thr Gly Phe Phe Asp Asn Arg Glu Asn
165 170 175
Met Phe Ser Glu Glu Ala Lys Ser Thr Ser Ile Ala Phe Arg Cys Ile
180 185 190
Asn Glu Asn Leu Thr Arg Tyr Ile Ser Asn Met Asp Ile Phe Glu Lys
195 200 205
Val Asp Ala Ile Phe Asp Lys His Glu Val Gln Glu Ile Lys Glu Lys
210 215 220
Ile Leu Asn Ser Asp Tyr Asp Val Glu Asp Phe Phe Glu Gly Glu Phe
225 230 235 240
Phe Asn Phe Val Leu Thr Gln Glu Gly Ile Asp Val Tyr Asn Ala Ile
245 250 255
Ile Gly Gly Phe Val Thr Glu Ser Gly Glu Lys Ile Lys Gly Leu Asn
260 265 270
Glu Tyr Ile Asn Leu Tyr Asn Gln Lys Thr Lys Gln Lys Leu Pro Lys
275 280 285
Phe Lys Pro Leu Tyr Lys Gln Val Leu Ser Asp Arg Glu Ser Leu Ser
290 295 300
Phe Tyr Gly Glu Gly Tyr Thr Ser Asp Glu Glu Val Leu Glu Val Phe
305 310 315 320
Arg Asn Thr Leu Asn Lys Asn Ser Glu Ile Phe Ser Ser Ile Lys Lys
325 330 335
Leu Glu Lys Leu Phe Lys Asn Phe Asp Glu Tyr Ser Ser Ala Gly Ile
340 345 350
Phe Val Lys Asn Gly Pro Ala Ile Ser Thr Ile Ser Lys Asp Ile Phe
355 360 365
Gly Glu Trp Asn Val Ile Arg Asp Lys Trp Asn Ala Glu Tyr Asp Asp
370 375 380
Ile His Leu Lys Lys Lys Ala Val Val Thr Glu Lys Tyr Glu Asp Asp
385 390 395 400
Arg Arg Lys Ser Phe Lys Lys Ile Gly Ser Phe Ser Leu Glu Gln Leu
405 410 415
Gln Glu Tyr Ala Asp Ala Asp Leu Ser Val Val Glu Lys Leu Lys Glu
420 425 430
Ile Ile Ile Gln Lys Val Asp Glu Ile Tyr Lys Val Tyr Gly Ser Ser
435 440 445
Glu Lys Leu Phe Asp Ala Asp Phe Val Leu Glu Lys Ser Leu Lys Lys
450 455 460
Asn Asp Ala Val Val Ala Ile Met Lys Asp Leu Leu Asp Ser Val Lys
465 470 475 480
Ser Phe Glu Asn Tyr Ile Lys Ala Phe Phe Gly Glu Gly Lys Glu Thr
485 490 495
Asn Arg Asp Glu Ser Phe Tyr Gly Asp Phe Val Leu Ala Tyr Asp Ile
500 505 510
Leu Leu Lys Val Asp His Ile Tyr Asp Ala Ile Arg Asn Tyr Val Thr
515 520 525
Gln Lys Pro Tyr Ser Lys Asp Lys Phe Lys Leu Tyr Phe Gln Asn Pro
530 535 540
Gln Phe Met Gly Gly Trp Asp Lys Asp Lys Glu Thr Asp Tyr Arg Ala
545 550 555 560
Thr Ile Leu Arg Tyr Gly Ser Lys Tyr Tyr Leu Ala Ile Met Asp Lys
565 570 575
Lys Tyr Ala Lys Cys Leu Gln Lys Ile Asp Lys Asp Asp Val Asn Gly
580 585 590
Asn Tyr Glu Lys Ile Asn Tyr Lys Leu Leu Pro Gly Pro Asn Lys Met
595 600 605
Leu Pro Lys Val Phe Phe Ser Lys Lys Trp Met Ala Tyr Tyr Asn Pro
610 615 620
Ser Glu Asp Ile Gln Lys Ile Tyr Lys Asn Gly Thr Phe Lys Lys Gly
625 630 635 640
Asp Met Phe Asn Leu Asn Asp Cys His Lys Leu Ile Asp Phe Phe Lys
645 650 655
Asp Ser Ile Ser Arg Tyr Pro Lys Trp Ser Asn Ala Tyr Asp Phe Asn
660 665 670
Phe Ser Glu Thr Glu Lys Tyr Lys Asp Ile Ala Gly Phe Tyr Arg Glu
675 680 685
Val Glu Glu Gln Gly Tyr Lys Val Ser Phe Glu Ser Ala Ser Lys Lys
690 695 700
Glu Val Asp Lys Leu Val Glu Glu Gly Lys Leu Tyr Met Phe Gln Ile
705 710 715 720
Tyr Asn Lys Asp Phe Ser Asp Lys Ser His Gly Thr Pro Asn Leu His
725 730 735
Thr Met Tyr Phe Lys Leu Leu Phe Asp Glu Asn Asn His Gly Gln Ile
740 745 750
Arg Leu Ser Gly Gly Ala Glu Leu Phe Met Arg Arg Ala Ser Leu Lys
755 760 765
Lys Glu Glu Leu Val Val His Pro Ala Asn Ser Pro Ile Ala Asn Lys
770 775 780
Asn Pro Asp Asn Pro Lys Lys Thr Thr Thr Leu Ser Tyr Asp Val Tyr
785 790 795 800
Lys Asp Lys Arg Phe Ser Glu Asp Gln Tyr Glu Leu His Ile Pro Ile
805 810 815
Ala Ile Asn Lys Cys Pro Lys Asn Ile Phe Lys Ile Asn Thr Glu Val
820 825 830
Arg Val Leu Leu Lys His Asp Asp Asn Pro Tyr Val Ile Gly Ile Asp
835 840 845
Arg Gly Glu Arg Asn Leu Leu Tyr Ile Val Val Val Asp Gly Lys Gly
850 855 860
Asn Ile Val Glu Gln Tyr Ser Leu Asn Glu Ile Ile Asn Asn Phe Asn
865 870 875 880
Gly Ile Arg Ile Lys Thr Asp Tyr His Ser Leu Leu Asp Lys Lys Glu
885 890 895
Lys Glu Arg Phe Glu Ala Arg Gln Asn Trp Thr Ser Ile Glu Asn Ile
900 905 910
Lys Glu Leu Lys Ala Gly Tyr Ile Ser Gln Val Val His Lys Ile Cys
915 920 925
Glu Leu Val Glu Lys Tyr Asp Ala Val Ile Ala Leu Glu Asp Leu Asn
930 935 940
Ser Gly Phe Lys Asn Ser Arg Val Lys Val Glu Lys Gln Val Tyr Gln
945 950 955 960
Lys Phe Glu Lys Met Leu Ile Asp Lys Leu Asn Tyr Met Val Asp Lys
965 970 975
Lys Ser Asn Pro Cys Ala Thr Gly Gly Ala Leu Lys Gly Tyr Gln Ile
980 985 990
Thr Asn Lys Phe Glu Ser Phe Lys Ser Met Ser Thr Gln Asn Gly Phe
995 1000 1005
Ile Phe Tyr Ile Pro Ala Trp Leu Thr Ser Lys Ile Asp Pro Ser
1010 1015 1020
Thr Gly Phe Val Asn Leu Leu Lys Thr Lys Tyr Thr Ser Ile Ala
1025 1030 1035
Asp Ser Lys Lys Phe Ile Ser Ser Phe Asp Arg Ile Met Tyr Val
1040 1045 1050
Pro Glu Glu Asp Leu Phe Glu Phe Ala Leu Asp Tyr Lys Asn Phe
1055 1060 1065
Ser Arg Thr Asp Ala Asp Tyr Ile Lys Lys Trp Lys Leu Tyr Ser
1070 1075 1080
Tyr Gly Asn Arg Ile Arg Ile Phe Arg Asn Pro Lys Lys Asn Asn
1085 1090 1095
Val Phe Asp Trp Glu Glu Val Cys Leu Thr Ser Ala Tyr Lys Glu
1100 1105 1110
Leu Phe Asn Lys Tyr Gly Ile Asn Tyr Gln Gln Gly Asp Ile Arg
1115 1120 1125
Ala Leu Leu Cys Glu Gln Ser Asp Lys Ala Phe Tyr Ser Ser Phe
1130 1135 1140
Met Ala Leu Met Ser Leu Met Leu Gln Met Arg Asn Ser Ile Thr
1145 1150 1155
Gly Arg Thr Asp Val Asp Phe Leu Ile Ser Pro Val Lys Asn Ser
1160 1165 1170
Asp Gly Ile Phe Tyr Asp Ser Arg Asn Tyr Glu Ala Gln Glu Asn
1175 1180 1185
Ala Ile Leu Pro Lys Asn Ala Asp Ala Asn Gly Ala Tyr Asn Ile
1190 1195 1200
Ala Arg Lys Val Leu Trp Ala Ile Gly Gln Phe Lys Lys Ala Glu
1205 1210 1215
Asp Glu Lys Leu Asp Lys Val Lys Ile Ala Ile Ser Asn Lys Glu
1220 1225 1230
Trp Leu Glu Tyr Ala Gln Thr Ser Val Lys His Lys Arg Pro Ala
1235 1240 1245
Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys Lys
1250 1255 1260
<210> 46
<211> 1283
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<220>
<221> xNLS-03
<222> (1)..(9)
<220>
<221> xEpitope-03
<222> (10)..(17)
<220>
<221> xNLS-04
<222> (1268)..(1283)
<400> 46
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala Leu Phe Gln Asp Phe Thr His Leu Tyr Pro Leu Ser Lys Thr Val
20 25 30
Arg Phe Glu Leu Lys Pro Ile Gly Arg Thr Leu Glu His Ile His Ala
35 40 45
Lys Asn Phe Leu Ser Gln Asp Glu Thr Met Ala Asp Met Tyr Gln Lys
50 55 60
Val Lys Val Ile Leu Asp Asp Tyr His Arg Asp Phe Ile Ala Asp Met
65 70 75 80
Met Gly Glu Val Lys Leu Thr Lys Leu Ala Glu Phe Tyr Asp Val Tyr
85 90 95
Leu Lys Phe Arg Lys Asn Pro Lys Asp Asp Gly Leu Gln Lys Gln Leu
100 105 110
Lys Asp Leu Gln Ala Val Leu Arg Lys Glu Ser Val Lys Pro Ile Gly
115 120 125
Ser Gly Gly Lys Tyr Lys Thr Gly Tyr Asp Arg Leu Phe Gly Ala Lys
130 135 140
Leu Phe Lys Asp Gly Lys Glu Leu Gly Asp Leu Ala Lys Phe Val Ile
145 150 155 160
Ala Gln Glu Gly Glu Ser Ser Pro Lys Leu Ala His Leu Ala His Phe
165 170 175
Glu Lys Phe Ser Thr Tyr Phe Thr Gly Phe His Asp Asn Arg Lys Asn
180 185 190
Met Tyr Ser Asp Glu Asp Lys His Thr Ala Ile Ala Tyr Arg Leu Ile
195 200 205
His Glu Asn Leu Pro Arg Phe Ile Asp Asn Leu Gln Ile Leu Thr Thr
210 215 220
Ile Lys Gln Lys His Ser Ala Leu Tyr Asp Gln Ile Ile Asn Glu Leu
225 230 235 240
Thr Ala Ser Gly Leu Asp Val Ser Leu Ala Ser His Leu Asp Gly Tyr
245 250 255
His Lys Leu Leu Thr Gln Glu Gly Ile Thr Ala Tyr Asn Arg Ile Ile
260 265 270
Gly Glu Val Asn Gly Tyr Thr Asn Lys His Asn Gln Ile Cys His Lys
275 280 285
Ser Glu Arg Ile Ala Lys Leu Arg Pro Leu His Lys Gln Ile Leu Ser
290 295 300
Asp Gly Met Gly Val Ser Phe Leu Pro Ser Lys Phe Ala Asp Asp Ser
305 310 315 320
Glu Met Cys Gln Ala Val Asn Glu Phe Tyr Arg His Tyr Thr Asp Val
325 330 335
Phe Ala Lys Val Gln Ser Leu Phe Asp Gly Phe Asp Asp His Gln Lys
340 345 350
Asp Gly Ile Tyr Val Glu His Lys Asn Leu Asn Glu Leu Ser Lys Gln
355 360 365
Ala Phe Gly Asp Phe Ala Leu Leu Gly Arg Val Leu Asp Gly Tyr Tyr
370 375 380
Val Asp Val Val Asn Pro Glu Phe Asn Glu Arg Phe Ala Lys Ala Lys
385 390 395 400
Thr Asp Asn Ala Lys Ala Lys Leu Thr Lys Glu Lys Asp Lys Phe Ile
405 410 415
Lys Gly Val His Ser Leu Ala Ser Leu Glu Gln Ala Ile Glu His His
420 425 430
Thr Ala Arg His Asp Asp Glu Ser Val Gln Ala Gly Lys Leu Gly Gln
435 440 445
Tyr Phe Lys His Gly Leu Ala Gly Val Asp Asn Pro Ile Gln Lys Ile
450 455 460
His Asn Asn His Ser Thr Ile Lys Gly Phe Leu Glu Arg Glu Arg Pro
465 470 475 480
Ala Gly Glu Arg Ala Leu Pro Lys Ile Lys Ser Gly Lys Asn Pro Glu
485 490 495
Met Thr Gln Leu Arg Gln Leu Lys Glu Leu Leu Asp Asn Ala Leu Asn
500 505 510
Val Ala His Phe Ala Lys Leu Leu Thr Thr Lys Thr Thr Leu Asp Asn
515 520 525
Gln Asp Gly Asn Phe Tyr Gly Glu Phe Gly Val Leu Tyr Asp Glu Leu
530 535 540
Ala Lys Ile Pro Thr Leu Tyr Asn Lys Val Arg Asp Tyr Leu Ser Gln
545 550 555 560
Lys Pro Phe Ser Thr Glu Lys Tyr Lys Leu Asn Phe Gly Asn Pro Thr
565 570 575
Leu Leu Asn Gly Trp Asp Leu Asn Lys Glu Lys Asp Asn Phe Gly Val
580 585 590
Ile Leu Gln Lys Asp Gly Cys Tyr Tyr Leu Ala Leu Leu Asp Lys Ala
595 600 605
His Lys Lys Val Phe Asp Asn Ala Pro Asn Thr Gly Lys Asn Val Tyr
610 615 620
Gln Lys Met Val Tyr Lys Leu Leu Pro Gly Pro Asn Lys Met Leu Pro
625 630 635 640
Lys Val Phe Phe Ala Lys Ser Asn Leu Asp Tyr Tyr Asn Pro Ser Ala
645 650 655
Glu Leu Leu Asp Lys Tyr Ala Lys Gly Thr His Lys Lys Gly Asp Asn
660 665 670
Phe Asn Leu Lys Asp Cys His Ala Leu Ile Asp Phe Phe Lys Ala Gly
675 680 685
Ile Asn Lys His Pro Glu Trp Gln His Phe Gly Phe Lys Phe Ser Pro
690 695 700
Thr Ser Ser Tyr Arg Asp Leu Ser Asp Phe Tyr Arg Glu Val Glu Pro
705 710 715 720
Gln Gly Tyr Gln Val Lys Phe Val Asp Ile Asn Ala Asp Tyr Ile Asp
725 730 735
Glu Leu Val Glu Gln Gly Lys Leu Tyr Leu Phe Gln Ile Tyr Asn Lys
740 745 750
Asp Phe Ser Pro Lys Ala His Gly Lys Pro Asn Leu His Thr Leu Tyr
755 760 765
Phe Lys Ala Leu Phe Ser Glu Asp Asn Leu Ala Asp Pro Ile Tyr Lys
770 775 780
Leu Asn Gly Glu Ala Gln Ile Phe Tyr Arg Lys Ala Ser Leu Asp Met
785 790 795 800
Asn Glu Thr Thr Ile His Arg Ala Gly Glu Val Leu Glu Asn Lys Asn
805 810 815
Pro Asp Asn Pro Lys Lys Arg Gln Phe Val Tyr Asp Ile Ile Lys Asp
820 825 830
Lys Arg Tyr Thr Gln Asp Lys Phe Met Leu His Val Pro Ile Thr Met
835 840 845
Asn Phe Gly Val Gln Gly Met Thr Ile Lys Glu Phe Asn Lys Lys Val
850 855 860
Asn Gln Ser Ile Gln Gln Tyr Asp Glu Val Asn Val Ile Gly Ile Asp
865 870 875 880
Arg Gly Glu Arg His Leu Leu Tyr Leu Thr Val Ile Asn Ser Lys Gly
885 890 895
Glu Ile Leu Glu Gln Arg Ser Leu Asn Asp Ile Thr Thr Ala Ser Ala
900 905 910
Asn Gly Thr Gln Val Thr Thr Pro Tyr His Lys Ile Leu Asp Lys Arg
915 920 925
Glu Ile Glu Arg Leu Asn Ala Arg Val Gly Trp Gly Glu Ile Glu Thr
930 935 940
Ile Lys Glu Leu Lys Ser Gly Tyr Leu Ser His Val Val His Gln Ile
945 950 955 960
Asn Gln Leu Met Leu Lys Tyr Asn Ala Ile Val Val Leu Glu Asp Leu
965 970 975
Asn Phe Gly Phe Lys Arg Gly Arg Phe Lys Val Glu Lys Gln Ile Tyr
980 985 990
Gln Asn Phe Glu Asn Ala Leu Ile Lys Lys Leu Asn His Leu Val Leu
995 1000 1005
Lys Asp Lys Ala Asp Asp Glu Ile Gly Ser Tyr Lys Asn Ala Leu
1010 1015 1020
Gln Leu Thr Asn Asn Phe Thr Asp Leu Lys Ser Ile Gly Lys Gln
1025 1030 1035
Thr Gly Phe Leu Phe Tyr Val Pro Ala Trp Asn Thr Ser Lys Ile
1040 1045 1050
Asp Pro Glu Thr Gly Phe Val Asp Leu Leu Lys Pro Arg Tyr Glu
1055 1060 1065
Asn Ile Ala Gln Ser Gln Ala Phe Phe Gly Lys Phe Asp Lys Ile
1070 1075 1080
Cys Tyr Asn Thr Asp Lys Gly Tyr Phe Glu Phe His Ile Asp Tyr
1085 1090 1095
Ala Lys Phe Thr Asp Lys Ala Lys Asn Ser Arg Gln Lys Trp Ala
1100 1105 1110
Ile Cys Ser His Gly Asp Lys Arg Tyr Val Tyr Asp Lys Thr Ala
1115 1120 1125
Asn Gln Asn Lys Gly Ala Ala Lys Gly Ile Asn Val Asn Asp Glu
1130 1135 1140
Leu Lys Ser Leu Phe Ala Arg Tyr His Ile Asn Asp Lys Gln Pro
1145 1150 1155
Asn Leu Val Met Asp Ile Cys Gln Asn Asn Asp Lys Glu Phe His
1160 1165 1170
Lys Ser Leu Met Cys Leu Leu Lys Thr Leu Leu Ala Leu Arg Tyr
1175 1180 1185
Ser Asn Ala Ser Ser Asp Glu Asp Phe Ile Leu Ser Pro Val Ala
1190 1195 1200
Asn Asp Glu Gly Val Phe Phe Asn Ser Ala Leu Ala Asp Asp Thr
1205 1210 1215
Gln Pro Gln Asn Ala Asp Ala Asn Gly Ala Tyr His Ile Ala Leu
1220 1225 1230
Lys Gly Leu Trp Leu Leu Asn Glu Leu Lys Asn Ser Asp Asp Leu
1235 1240 1245
Asn Lys Val Lys Leu Ala Ile Asp Asn Gln Thr Trp Leu Asn Phe
1250 1255 1260
Ala Gln Asn Arg Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln
1265 1270 1275
Ala Lys Lys Lys Lys
1280
<210> 47
<211> 1352
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 47
Met Thr Gln Phe Glu Gly Phe Thr Asn Leu Tyr Gln Val Ser Lys Thr
1 5 10 15
Leu Arg Phe Glu Leu Ile Pro Gln Gly Lys Thr Leu Lys His Ile Gln
20 25 30
Glu Gln Gly Phe Ile Glu Glu Asp Lys Ala Arg Asn Asp His Tyr Lys
35 40 45
Glu Leu Lys Pro Ile Ile Asp Arg Ile Tyr Lys Thr Tyr Ala Asp Gln
50 55 60
Cys Leu Gln Leu Val Gln Leu Asp Trp Glu Asn Leu Ser Ala Ala Ile
65 70 75 80
Asp Ser Tyr Arg Lys Glu Lys Thr Glu Glu Thr Arg Asn Ala Leu Ile
85 90 95
Glu Glu Gln Ala Thr Tyr Arg Asn Ala Ile His Asp Tyr Phe Ile Gly
100 105 110
Arg Thr Asp Asn Leu Thr Asp Ala Ile Asn Lys Arg His Ala Glu Ile
115 120 125
Tyr Lys Gly Leu Phe Lys Ala Glu Leu Phe Asn Gly Lys Val Leu Lys
130 135 140
Gln Leu Gly Thr Val Thr Thr Thr Glu His Glu Asn Ala Leu Leu Arg
145 150 155 160
Ser Phe Asp Lys Phe Thr Thr Tyr Phe Ser Gly Phe Tyr Glu Asn Arg
165 170 175
Lys Asn Val Phe Ser Ala Glu Asp Ile Ser Thr Ala Ile Pro His Arg
180 185 190
Ile Val Gln Asp Asn Phe Pro Lys Phe Lys Glu Asn Cys His Ile Phe
195 200 205
Thr Arg Leu Ile Thr Ala Val Pro Ser Leu Arg Glu His Phe Glu Asn
210 215 220
Val Lys Lys Ala Ile Gly Ile Phe Val Ser Thr Ser Ile Glu Glu Val
225 230 235 240
Phe Ser Phe Pro Phe Tyr Asn Gln Leu Leu Thr Gln Thr Gln Ile Asp
245 250 255
Leu Tyr Asn Gln Leu Leu Gly Gly Ile Ser Arg Glu Ala Gly Thr Glu
260 265 270
Lys Ile Lys Gly Leu Asn Glu Val Leu Asn Leu Ala Ile Gln Lys Asn
275 280 285
Asp Glu Thr Ala His Ile Ile Ala Ser Leu Pro His Arg Phe Ile Pro
290 295 300
Leu Phe Lys Gln Ile Leu Ser Asp Arg Asn Thr Leu Ser Phe Ile Leu
305 310 315 320
Glu Glu Phe Lys Ser Asp Glu Glu Val Ile Gln Ser Phe Cys Lys Tyr
325 330 335
Lys Thr Leu Leu Arg Asn Glu Asn Val Leu Glu Thr Ala Glu Ala Leu
340 345 350
Phe Asn Glu Leu Asn Ser Ile Asp Leu Thr His Ile Phe Ile Ser His
355 360 365
Lys Lys Leu Glu Thr Ile Ser Ser Ala Leu Cys Asp His Trp Asp Thr
370 375 380
Leu Arg Asn Ala Leu Tyr Glu Arg Arg Ile Ser Glu Leu Thr Gly Lys
385 390 395 400
Ile Thr Lys Ser Ala Lys Glu Lys Val Gln Arg Ser Leu Lys His Glu
405 410 415
Asp Ile Asn Leu Gln Glu Ile Ile Ser Ala Ala Gly Lys Glu Leu Ser
420 425 430
Glu Ala Phe Lys Gln Lys Thr Ser Glu Ile Leu Ser His Ala His Ala
435 440 445
Ala Leu Asp Gln Pro Leu Pro Thr Thr Leu Lys Lys Gln Glu Glu Lys
450 455 460
Glu Ile Leu Lys Ser Gln Leu Asp Ser Leu Leu Gly Leu Tyr His Leu
465 470 475 480
Leu Asp Trp Phe Ala Val Asp Glu Ser Asn Glu Val Asp Pro Glu Phe
485 490 495
Ser Ala Arg Leu Thr Gly Ile Lys Leu Glu Met Glu Pro Ser Leu Ser
500 505 510
Phe Tyr Asn Lys Ala Arg Asn Tyr Ala Thr Lys Lys Pro Tyr Ser Val
515 520 525
Glu Lys Phe Lys Leu Asn Phe Gln Met Pro Thr Leu Ala Ser Gly Trp
530 535 540
Asp Val Asn Lys Glu Lys Asn Asn Gly Ala Ile Leu Phe Val Lys Asn
545 550 555 560
Gly Leu Tyr Tyr Leu Gly Ile Met Pro Lys Gln Lys Gly Arg Tyr Lys
565 570 575
Ala Leu Ser Phe Glu Pro Thr Glu Lys Thr Ser Glu Gly Phe Asp Lys
580 585 590
Met Tyr Tyr Asp Tyr Phe Pro Asp Ala Ala Lys Met Ile Pro Lys Cys
595 600 605
Ser Thr Gln Leu Lys Ala Val Thr Ala His Phe Gln Thr His Thr Thr
610 615 620
Pro Ile Leu Leu Ser Asn Asn Phe Ile Glu Pro Leu Glu Ile Thr Lys
625 630 635 640
Glu Ile Tyr Asp Leu Asn Asn Pro Glu Lys Glu Pro Lys Lys Phe Gln
645 650 655
Thr Ala Tyr Ala Lys Lys Thr Gly Asp Gln Lys Gly Tyr Arg Glu Ala
660 665 670
Leu Cys Lys Trp Ile Asp Phe Thr Arg Asp Phe Leu Ser Lys Tyr Thr
675 680 685
Lys Thr Thr Ser Ile Asp Leu Ser Ser Leu Arg Pro Ser Ser Gln Tyr
690 695 700
Lys Asp Leu Gly Glu Tyr Tyr Ala Glu Leu Asn Pro Leu Leu Tyr His
705 710 715 720
Ile Ser Phe Gln Arg Ile Ala Glu Lys Glu Ile Met Asp Ala Val Glu
725 730 735
Thr Gly Lys Leu Tyr Leu Phe Gln Ile Tyr Asn Lys Asp Phe Ala Lys
740 745 750
Gly His His Gly Lys Pro Asn Leu His Thr Leu Tyr Trp Thr Gly Leu
755 760 765
Phe Ser Pro Glu Asn Leu Ala Lys Thr Ser Ile Lys Leu Asn Gly Gln
770 775 780
Ala Glu Leu Phe Tyr Arg Pro Lys Ser Arg Met Lys Arg Met Ala His
785 790 795 800
Arg Leu Gly Glu Lys Met Leu Asn Lys Lys Leu Lys Asp Gln Lys Thr
805 810 815
Pro Ile Pro Asp Thr Leu Tyr Gln Glu Leu Tyr Asp Tyr Val Asn His
820 825 830
Arg Leu Ser His Asp Leu Ser Asp Glu Ala Arg Ala Leu Leu Pro Asn
835 840 845
Val Ile Thr Lys Glu Val Ser His Glu Ile Ile Lys Asp Arg Arg Phe
850 855 860
Thr Ser Asp Lys Phe Phe Phe His Val Pro Ile Thr Leu Asn Tyr Gln
865 870 875 880
Ala Ala Asn Ser Pro Ser Lys Phe Asn Gln Arg Val Asn Ala Tyr Leu
885 890 895
Lys Glu His Pro Glu Thr Pro Ile Ile Gly Ile Asp Arg Gly Glu Arg
900 905 910
Asn Leu Ile Tyr Ile Thr Val Ile Asp Ser Thr Gly Lys Ile Leu Glu
915 920 925
Gln Arg Ser Leu Asn Thr Ile Gln Gln Phe Asp Tyr Gln Lys Lys Leu
930 935 940
Asp Asn Arg Glu Lys Glu Arg Val Ala Ala Arg Gln Ala Trp Ser Val
945 950 955 960
Val Gly Thr Ile Lys Asp Leu Lys Gln Gly Tyr Leu Ser Gln Val Ile
965 970 975
His Glu Ile Val Asp Leu Met Ile His Tyr Gln Ala Val Val Val Leu
980 985 990
Glu Asn Leu Asn Phe Gly Phe Lys Ser Lys Arg Thr Gly Ile Ala Glu
995 1000 1005
Lys Ala Val Tyr Gln Gln Phe Glu Lys Met Leu Ile Asp Lys Leu
1010 1015 1020
Asn Cys Leu Val Leu Lys Asp Tyr Pro Ala Glu Lys Val Gly Gly
1025 1030 1035
Val Leu Asn Pro Tyr Gln Leu Thr Asp Gln Phe Thr Ser Phe Ala
1040 1045 1050
Lys Met Gly Thr Gln Ser Gly Phe Leu Phe Tyr Val Pro Ala Pro
1055 1060 1065
Tyr Thr Ser Lys Ile Asp Pro Leu Thr Gly Phe Val Asp Pro Phe
1070 1075 1080
Val Trp Lys Thr Ile Lys Asn His Glu Ser Arg Lys His Phe Leu
1085 1090 1095
Glu Gly Phe Asp Phe Leu His Tyr Asp Val Lys Thr Gly Asp Phe
1100 1105 1110
Ile Leu His Phe Lys Met Asn Arg Asn Leu Ser Phe Gln Arg Gly
1115 1120 1125
Leu Pro Gly Phe Met Pro Ala Trp Asp Ile Val Phe Glu Lys Asn
1130 1135 1140
Glu Thr Gln Phe Asp Ala Lys Gly Thr Pro Phe Ile Ala Gly Lys
1145 1150 1155
Arg Ile Val Pro Val Ile Glu Asn His Arg Phe Thr Gly Arg Tyr
1160 1165 1170
Arg Asp Leu Tyr Pro Ala Asn Glu Leu Ile Ala Leu Leu Glu Glu
1175 1180 1185
Lys Gly Ile Val Phe Arg Asp Gly Ser Asn Ile Leu Pro Lys Leu
1190 1195 1200
Leu Glu Asn Asp Asp Ser His Ala Ile Asp Thr Met Val Ala Leu
1205 1210 1215
Ile Arg Ser Val Leu Gln Met Arg Asn Ser Asn Ala Ala Thr Gly
1220 1225 1230
Glu Asp Tyr Ile Asn Ser Pro Val Arg Asp Leu Asn Gly Val Cys
1235 1240 1245
Phe Asp Ser Arg Phe Gln Asn Pro Glu Trp Pro Met Asp Ala Asp
1250 1255 1260
Ala Asn Gly Ala Tyr His Ile Ala Leu Lys Gly Gln Leu Leu Leu
1265 1270 1275
Asn His Leu Lys Glu Ser Lys Asp Leu Lys Leu Gln Asn Gly Ile
1280 1285 1290
Ser Asn Gln Asp Trp Leu Ala Tyr Ile Gln Glu Leu Arg Asn Lys
1295 1300 1305
Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys Lys
1310 1315 1320
Gly Ser Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Tyr Pro Tyr Asp
1325 1330 1335
Val Pro Asp Tyr Ala Tyr Pro Tyr Asp Val Pro Asp Tyr Ala
1340 1345 1350
<210> 48
<211> 1387
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<220>
<221> xSV40NLS-06
<222> (2)..(8)
<220>
<221> xLinker-06
<222> (9)..(39)
<220>
<221> xSV40NLS-04
<222> (1339)..(1345)
<220>
<221> xsGGS linker-02
<222> (1346)..(1349)
<220>
<221> xsGGS linker-02
<222> (1350)..(1353)
<220>
<221> xSV40NLS-07
<222> (1354)..(1360)
<220>
<221> tag3XHA
<222> (1361)..(1387)
<400> 48
Met Pro Lys Lys Lys Arg Lys Val Gly Gly Gly Gly Ser Gly Gly Gly
1 5 10 15
Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly
20 25 30
Ser Gly Gly Gly Gly Ser Met Ser Ile Tyr Gln Glu Phe Val Asn Lys
35 40 45
Tyr Ser Leu Ser Lys Thr Leu Arg Phe Glu Leu Ile Pro Gln Gly Lys
50 55 60
Thr Leu Glu Asn Ile Lys Ala Arg Gly Leu Ile Leu Asp Asp Glu Lys
65 70 75 80
Arg Ala Lys Asp Tyr Lys Lys Ala Lys Gln Ile Ile Asp Lys Tyr His
85 90 95
Gln Phe Phe Ile Glu Glu Ile Leu Ser Ser Val Cys Ile Ser Glu Asp
100 105 110
Leu Leu Gln Asn Tyr Ser Asp Val Tyr Phe Lys Leu Lys Lys Ser Asp
115 120 125
Asp Asp Asn Leu Gln Lys Asp Phe Lys Ser Ala Lys Asp Thr Ile Lys
130 135 140
Lys Gln Ile Ser Glu Tyr Ile Lys Asp Ser Glu Lys Phe Lys Asn Leu
145 150 155 160
Phe Asn Gln Asn Leu Ile Asp Ala Lys Lys Gly Gln Glu Ser Asp Leu
165 170 175
Ile Leu Trp Leu Lys Gln Ser Lys Asp Asn Gly Ile Glu Leu Phe Lys
180 185 190
Ala Asn Ser Asp Ile Thr Asp Ile Asp Glu Ala Leu Glu Ile Ile Lys
195 200 205
Ser Phe Lys Gly Trp Thr Thr Tyr Phe Lys Gly Phe His Glu Asn Arg
210 215 220
Lys Asn Val Tyr Ser Ser Asn Asp Ile Pro Thr Ser Ile Ile Tyr Arg
225 230 235 240
Ile Val Asp Asp Asn Leu Pro Lys Phe Leu Glu Asn Lys Ala Lys Tyr
245 250 255
Glu Ser Leu Lys Asp Lys Ala Pro Glu Ala Ile Asn Tyr Glu Gln Ile
260 265 270
Lys Lys Asp Leu Ala Glu Glu Leu Thr Phe Asp Ile Asp Tyr Lys Thr
275 280 285
Ser Glu Val Asn Gln Arg Val Phe Ser Leu Asp Glu Val Phe Glu Ile
290 295 300
Ala Asn Phe Asn Asn Tyr Leu Asn Gln Ser Gly Ile Thr Lys Phe Asn
305 310 315 320
Thr Ile Ile Gly Gly Lys Phe Val Asn Gly Glu Asn Thr Lys Arg Lys
325 330 335
Gly Ile Asn Glu Tyr Ile Asn Leu Tyr Ser Gln Gln Ile Asn Asp Lys
340 345 350
Thr Leu Lys Lys Tyr Lys Met Ser Val Leu Phe Lys Gln Ile Leu Ser
355 360 365
Asp Thr Glu Ser Lys Ser Phe Val Ile Asp Lys Leu Glu Asp Asp Ser
370 375 380
Asp Val Val Thr Thr Met Gln Ser Phe Tyr Glu Gln Ile Ala Ala Phe
385 390 395 400
Lys Thr Val Glu Glu Lys Ser Ile Lys Glu Thr Leu Ser Leu Leu Phe
405 410 415
Asp Asp Leu Lys Ala Gln Lys Leu Asp Leu Ser Lys Ile Tyr Phe Lys
420 425 430
Asn Asp Lys Ser Leu Thr Asp Leu Ser Gln Gln Val Phe Asp Asp Tyr
435 440 445
Ser Val Ile Gly Thr Ala Val Leu Glu Tyr Ile Thr Gln Gln Ile Ala
450 455 460
Pro Lys Asn Leu Asp Asn Pro Ser Lys Lys Glu Gln Glu Leu Ile Ala
465 470 475 480
Lys Lys Thr Glu Lys Ala Lys Tyr Leu Ser Leu Glu Thr Ile Lys Leu
485 490 495
Ala Leu Glu Glu Phe Asn Lys His Arg Asp Ile Asp Lys Gln Cys Arg
500 505 510
Phe Glu Glu Ile Leu Ala Asn Phe Ala Ala Ile Pro Met Ile Phe Asp
515 520 525
Glu Ile Ala Gln Asn Lys Asp Asn Leu Ala Gln Ile Ser Ile Lys Tyr
530 535 540
Gln Asn Gln Gly Lys Lys Asp Leu Leu Gln Ala Ser Ala Glu Asp Asp
545 550 555 560
Val Lys Ala Ile Lys Asp Leu Leu Asp Gln Thr Asn Asn Leu Leu His
565 570 575
Lys Leu Lys Ile Phe His Ile Ser Gln Ser Glu Asp Lys Ala Asn Ile
580 585 590
Leu Asp Lys Asp Glu His Phe Tyr Leu Val Phe Glu Glu Cys Tyr Phe
595 600 605
Glu Leu Ala Asn Ile Val Pro Leu Tyr Asn Lys Ile Arg Asn Tyr Ile
610 615 620
Thr Gln Lys Pro Tyr Ser Asp Glu Lys Phe Lys Leu Asn Phe Glu Asn
625 630 635 640
Ser Thr Leu Ala Asn Gly Trp Asp Lys Asn Lys Glu Pro Asp Asn Thr
645 650 655
Ala Ile Leu Phe Ile Lys Asp Asp Lys Tyr Tyr Leu Gly Val Met Asn
660 665 670
Lys Lys Asn Asn Lys Ile Phe Asp Asp Lys Ala Ile Lys Glu Asn Lys
675 680 685
Gly Glu Gly Tyr Lys Lys Ile Val Tyr Lys Leu Leu Pro Gly Ala Asn
690 695 700
Lys Met Leu Pro Lys Val Phe Phe Ser Ala Lys Ser Ile Lys Phe Tyr
705 710 715 720
Asn Pro Ser Glu Asp Ile Leu Arg Ile Arg Asn His Ser Thr His Thr
725 730 735
Lys Asn Gly Ser Pro Gln Lys Gly Tyr Glu Lys Phe Glu Phe Asn Ile
740 745 750
Glu Asp Cys Arg Lys Phe Ile Asp Phe Tyr Lys Gln Ser Ile Ser Lys
755 760 765
His Pro Glu Trp Lys Asp Phe Gly Phe Arg Phe Ser Asp Thr Gln Arg
770 775 780
Tyr Asn Ser Ile Asp Glu Phe Tyr Arg Glu Val Glu Asn Gln Gly Tyr
785 790 795 800
Lys Leu Thr Phe Glu Asn Ile Ser Glu Ser Tyr Ile Asp Ser Val Val
805 810 815
Asn Gln Gly Lys Leu Tyr Leu Phe Gln Ile Tyr Asn Lys Asp Phe Ser
820 825 830
Ala Tyr Ser Lys Gly Arg Pro Asn Leu His Thr Leu Tyr Trp Lys Ala
835 840 845
Leu Phe Asp Glu Arg Asn Leu Gln Asp Val Val Tyr Lys Leu Asn Gly
850 855 860
Glu Ala Glu Leu Phe Tyr Arg Lys Gln Ser Ile Pro Lys Lys Ile Thr
865 870 875 880
His Pro Ala Lys Glu Ala Ile Ala Asn Lys Asn Lys Asp Asn Pro Lys
885 890 895
Lys Glu Ser Val Phe Glu Tyr Asp Leu Ile Lys Asp Lys Arg Phe Thr
900 905 910
Glu Asp Lys Phe Phe Phe His Cys Pro Ile Thr Ile Asn Phe Lys Ser
915 920 925
Ser Gly Ala Asn Lys Phe Asn Asp Glu Ile Asn Leu Leu Leu Lys Glu
930 935 940
Lys Ala Asn Asp Val His Ile Leu Ser Ile Asp Arg Gly Glu Arg His
945 950 955 960
Leu Ala Tyr Tyr Thr Leu Val Asp Gly Lys Gly Asn Ile Ile Lys Gln
965 970 975
Asp Thr Phe Asn Ile Ile Gly Asn Asp Arg Met Lys Thr Asn Tyr His
980 985 990
Asp Lys Leu Ala Ala Ile Glu Lys Asp Arg Asp Ser Ala Arg Lys Asp
995 1000 1005
Trp Lys Lys Ile Asn Asn Ile Lys Glu Met Lys Glu Gly Tyr Leu
1010 1015 1020
Ser Gln Val Val His Glu Ile Ala Lys Leu Val Ile Glu Tyr Asn
1025 1030 1035
Ala Ile Val Val Phe Glu Asp Leu Asn Phe Gly Phe Lys Arg Gly
1040 1045 1050
Arg Phe Lys Val Glu Lys Gln Val Tyr Gln Lys Leu Glu Lys Met
1055 1060 1065
Leu Ile Glu Lys Leu Asn Tyr Leu Val Phe Lys Asp Asn Glu Phe
1070 1075 1080
Asp Lys Thr Gly Gly Val Leu Arg Ala Tyr Gln Leu Thr Ala Pro
1085 1090 1095
Phe Glu Thr Phe Lys Lys Met Gly Lys Gln Thr Gly Ile Ile Tyr
1100 1105 1110
Tyr Val Pro Ala Gly Phe Thr Ser Lys Ile Cys Pro Val Thr Gly
1115 1120 1125
Phe Val Asn Gln Leu Tyr Pro Lys Tyr Glu Ser Val Ser Lys Ser
1130 1135 1140
Gln Glu Phe Phe Ser Lys Phe Asp Lys Ile Cys Tyr Asn Leu Asp
1145 1150 1155
Lys Gly Tyr Phe Glu Phe Ser Phe Asp Tyr Lys Asn Phe Gly Asp
1160 1165 1170
Lys Ala Ala Lys Gly Lys Trp Thr Ile Ala Ser Phe Gly Ser Arg
1175 1180 1185
Leu Ile Asn Phe Arg Asn Ser Asp Lys Asn His Asn Trp Asp Thr
1190 1195 1200
Arg Glu Val Tyr Pro Thr Lys Glu Leu Glu Lys Leu Leu Lys Asp
1205 1210 1215
Tyr Ser Ile Glu Tyr Gly His Gly Glu Cys Ile Lys Ala Ala Ile
1220 1225 1230
Cys Gly Glu Ser Asp Lys Lys Phe Phe Ala Lys Leu Thr Ser Val
1235 1240 1245
Leu Asn Thr Ile Leu Gln Met Arg Asn Ser Lys Thr Gly Thr Glu
1250 1255 1260
Leu Asp Tyr Leu Ile Ser Pro Val Ala Asp Val Asn Gly Asn Phe
1265 1270 1275
Phe Asp Ser Arg Gln Ala Pro Lys Asn Met Pro Gln Asp Ala Asp
1280 1285 1290
Ala Asn Gly Ala Tyr His Ile Gly Leu Lys Gly Leu Met Leu Leu
1295 1300 1305
Gly Arg Ile Lys Asn Asn Gln Glu Gly Lys Lys Leu Asn Leu Val
1310 1315 1320
Ile Lys Asn Glu Glu Tyr Phe Glu Phe Val Gln Asn Arg Asn Asn
1325 1330 1335
Pro Lys Lys Lys Arg Lys Val Ser Gly Gly Ser Ser Gly Gly Ser
1340 1345 1350
Pro Lys Lys Lys Arg Lys Val Tyr Pro Tyr Asp Val Pro Asp Tyr
1355 1360 1365
Ala Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Tyr Pro Tyr Asp Val
1370 1375 1380
Pro Asp Tyr Ala
1385
<210> 49
<211> 4899
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 49
atgccgaaga agaagcgcaa ggtcatgtcc agcgagaccg gccccgtggc ggtggacccc 60
accctgcgca ggcgcatcga gccgcacgag ttcgaggtgt tcttcgaccc cagggagctc 120
cgcaaggaga cctgcctcct gtacgagatc aactggggcg gcaggcactc catctggagg 180
cacacgagcc agaacaccaa caagcacgtc gaggtgaact tcatcgagaa gttcaccacg 240
gagaggtact tctgcccgaa cacgcgctgc tccatcacgt ggttcctctc gtggagccca 300
tgcggcgagt gctccagggc gatcacggag ttcctcagcc gctacccgca cgtgaccctg 360
ttcatctaca tcgctaggct ctaccaccac gcggacccca ggaacaggca gggcctcagg 420
gacctgatct ccagcggcgt cacgatccag atcatgaccg agcaggagtc cggctactgc 480
tggaggaact tcgtgaacta ctccccgagc aacgaggccc actggccccg ctacccgcac 540
ctctgggtcc gcctctacgt gctcgagctg tactgcatca tcctcggcct gccgccctgc 600
ctcaacatcc tgaggcgcaa gcagccccag ctgacgttct tcaccatcgc cctgcagagc 660
tgccactacc agaggctccc gccccacatc ctgtgggcga ccgggctcaa ggggggcggg 720
ggctcaggcg ggggcgggag cggcggcggg ggctctgggg gcggcggcag cggcgggggc 780
ggcagcgggg gcggcgggtc gatgctcttc caggacttca cccacctcta cccgctgtcc 840
aagacggtga ggttcgagct gaagcccatc ggccgcaccc tcgagcacat ccacgccaag 900
aacttcctca gccaggacga gacgatggcg gacatgtacc agaaggtgaa ggtcatcctg 960
gacgactacc acagggactt catcgccgac atgatgggcg aggtgaagct caccaagctg 1020
gcggagttct acgacgtcta cctgaagttc cgcaagaacc cgaaggacga cggcctccag 1080
aagcagctca aggacctgca ggccgtgctg aggaaggagt cggtcaagcc aatcggcagc 1140
ggcggcaagt acaagaccgg ctacgacagg ctgttcggcg ccaagctctt caaggacggc 1200
aaggagctcg gcgacctggc caagttcgtg atcgcccagg agggcgagtc ctcgcccaag 1260
ctcgctcacc tggcccactt cgagaagttc tccacctact tcacgggctt ccacgacaac 1320
cgcaagaaca tgtacagcga cgaggacaag cacaccgcca tcgcgtacag gctgatccac 1380
gagaacctcc cccgcttcat cgacaacctc cagatcctga ccacgatcaa gcagaagcac 1440
tccgccctgt acgaccagat catcaacgag ctgacggctt cgggcctgga cgtgtccctg 1500
gccagccacc tcgacggcta ccacaagctc ctgacccagg agggcatcac ggcctacaac 1560
aggatcatcg gcgaggtcaa cggctacacg aacaagcaca accagatctg ccacaagtcg 1620
gagaggatcg ccaagctcag gcccctgcac aagcagatcc tgagcgacgg catgggcgtg 1680
tccttcctcc ccagcaagtt cgccgacgac tccgagatgt gccaggcggt caacgagttc 1740
taccgccact acaccgacgt gttcgccaag gtccagagcc tgttcgacgg cttcgacgac 1800
caccagaagg acggcatcta cgtggagcac aagaacctca acgagctgtc caagcaggcc 1860
ttcggcgact tcgccctcct gggcagggtg ctggacggct actacgtcga cgtggtcaac 1920
ccggagttca acgagcgctt cgccaaggcg aagaccgaca acgccaaggc gaagctgacg 1980
aaggagaagg acaagttcat caagggcgtc cactcgctgg ccagcctgga gcaggccatc 2040
gagcaccaca cggctaggca cgacgacgag tcggtgcagg ccggcaagct gggccagtac 2100
ttcaagcacg gcctggcggg cgtggacaac ccgatccaga agatccacaa caaccactcc 2160
accatcaagg gcttcctcga gagggagagg cccgcgggcg agagggcgct gcccaagatc 2220
aagagcggca agaaccccga gatgacgcag ctccgccagc tgaaggagct cctggacaac 2280
gccctcaacg tggcccactt cgcgaagctc ctgaccacga agaccacgct ggacaaccag 2340
gacggcaact tctacggcga gttcggcgtg ctgtacgacg agctcgcgaa gatcccgacc 2400
ctgtacaaca aggtccgcga ctacctctcc cagaagccgt tcagcaccga gaagtacaag 2460
ctcaacttcg gcaaccccac gctcctgaac ggctgggacc tgaacaagga gaaggacaac 2520
ttcggcgtga tcctgcagaa ggacggctgc tactacctcg ccctcctgga caaggcgcac 2580
aagaaggtct tcgacaacgc ccccaacacg ggcaagaacg tgtaccagaa gatggtctac 2640
aagctcctgc cgggccccaa caagatgctg ccgaaggtgt tcttcgcgaa gtccaacctc 2700
gactactaca accccagcgc cgagctcctg gacaagtacg cgaagggcac ccacaagaag 2760
ggcgacaact tcaacctgaa ggactgccac gccctcatcg acttcttcaa ggcgggcatc 2820
aacaagcacc cggagtggca gcacttcggc ttcaagttct cccccacgtc cagctacagg 2880
gacctcagcg acttctacag ggaggtggag ccccagggct accaggtgaa gttcgtcgac 2940
atcaacgccg actacatcga cgagctggtc gagcagggca agctctacct gttccagatc 3000
tacaacaagg acttctcgcc caaggcccac ggcaagccaa acctccacac cctgtacttc 3060
aaggccctgt tcagcgagga caacctcgcg gaccccatct acaagctcaa cggcgaggcc 3120
cagatcttct acaggaaggc gtccctggac atgaacgaga cgaccatcca cagggcgggc 3180
gaggtgctcg agaacaagaa cccggacaac cccaagaaga ggcagttcgt ctacgacatc 3240
atcaaggaca agcgctacac gcaggacaag ttcatgctgc acgtgccgat caccatgaac 3300
ttcggcgtcc agggcatgac gatcaaggag ttcaacaaga aggtgaacca gtccatccag 3360
cagtacgacg aggtgaacgt catcggcatc gctcgcggcg agaggcacct cctgtacctc 3420
accgtcatca acagcaaggg cgagatcctg gagcagaggt ccctcaacga catcacgacc 3480
gcttcggcca acggcacgca ggtgaccacg ccctaccaca agatcctgga caagcgcgag 3540
atcgagaggc tcaacgctag ggtgggctgg ggcgagatcg agaccatcaa ggagctgaag 3600
tccggctacc tcagccacgt ggtccaccag atcaaccagc tcatgctgaa gtacaacgcc 3660
atcgtggtcc tcgcggacct gaatttcggc ttcaagcgcg gcaggttcaa ggtggagaag 3720
cagatctacc agaacttcga gaacgccctg atcaagaagc tcaaccacct cgtcctgaag 3780
gacaaggccg acgacgagat cggctcctac aagaacgcgc tccagctgac caacaacttc 3840
acggacctga agagcatcgg caagcagacc ggcttcctct tctacgtgcc ggcgtggaac 3900
acctccaaga tcgaccccga gacgggcttc gtcgacctcc tgaagccgag gtacgagaac 3960
atcgcccaga gccaggcgtt cttcggcaag ttcgacaaga tctgctacaa caccgacaag 4020
ggctacttcg agttccacat cgactacgcc aagttcacgg acaaggcgaa gaactccagg 4080
cagaagtggg ccatctgcag ccacggcgac aagcgctacg tgtacgacaa gaccgcgaac 4140
cagaacaagg gcgccgcgaa gggcatcaac gtcaacgacg agctcaagtc cctgttcgcc 4200
cgctaccaca tcaacgacaa gcagccgaac ctcgtgatgg acatctgcca gaacaacgac 4260
aaggagttcc acaagagcct gatgtgcctc ctgaagaccc tcctggccct ccgctactcc 4320
aacgcctcca gcgacgaggc gttcatcctg agccccgtgg ccaacgacga gggcgtcttc 4380
ttcaactcgg ctctggccga cgacacccag ccacagaacg cggacgccaa cggcgcttac 4440
cacatcgcgc tcaagggcct gtggctcctg aacgagctca agaacagcga cgacctgaac 4500
aaggtcaagc tcgccatcga caaccagacc tggctgaact tcgcccagaa ccgcaagagg 4560
cccgcggcca cgaagaaggc gggccaggcc aagaagaaga agtccggcgg cagcacgaac 4620
ctgtccgaca tcatcgagaa ggagaccggc aagcagctcg tgatccagga gagcatcctc 4680
atgctgccgg aggaggtcga ggaggtcatc ggcaacaagc ccgagtccga catcctcgtc 4740
cacacggcct acgacgagtc caccgacgag aacgtgatgc tcctgacctc ggacgctccc 4800
gagtacaagc catgggccct ggtcatccag gacagcaacg gcgagaacaa gatcaagatg 4860
ctctccggcg gcagcccgaa gaagaagcgc aaagtgtga 4899
<210> 50
<211> 1632
<212> PRT
<213> Artificial Sequence
<220>
<223> Fusion protein
<400> 50
Met Pro Lys Lys Lys Arg Lys Val Met Ser Ser Glu Thr Gly Pro Val
1 5 10 15
Ala Val Asp Pro Thr Leu Arg Arg Arg Ile Glu Pro His Glu Phe Glu
20 25 30
Val Phe Phe Asp Pro Arg Glu Leu Arg Lys Glu Thr Cys Leu Leu Tyr
35 40 45
Glu Ile Asn Trp Gly Gly Arg His Ser Ile Trp Arg His Thr Ser Gln
50 55 60
Asn Thr Asn Lys His Val Glu Val Asn Phe Ile Glu Lys Phe Thr Thr
65 70 75 80
Glu Arg Tyr Phe Cys Pro Asn Thr Arg Cys Ser Ile Thr Trp Phe Leu
85 90 95
Ser Trp Ser Pro Cys Gly Glu Cys Ser Arg Ala Ile Thr Glu Phe Leu
100 105 110
Ser Arg Tyr Pro His Val Thr Leu Phe Ile Tyr Ile Ala Arg Leu Tyr
115 120 125
His His Ala Asp Pro Arg Asn Arg Gln Gly Leu Arg Asp Leu Ile Ser
130 135 140
Ser Gly Val Thr Ile Gln Ile Met Thr Glu Gln Glu Ser Gly Tyr Cys
145 150 155 160
Trp Arg Asn Phe Val Asn Tyr Ser Pro Ser Asn Glu Ala His Trp Pro
165 170 175
Arg Tyr Pro His Leu Trp Val Arg Leu Tyr Val Leu Glu Leu Tyr Cys
180 185 190
Ile Ile Leu Gly Leu Pro Pro Cys Leu Asn Ile Leu Arg Arg Lys Gln
195 200 205
Pro Gln Leu Thr Phe Phe Thr Ile Ala Leu Gln Ser Cys His Tyr Gln
210 215 220
Arg Leu Pro Pro His Ile Leu Trp Ala Thr Gly Leu Lys Gly Gly Gly
225 230 235 240
Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly
245 250 255
Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Met Leu Phe Gln Asp
260 265 270
Phe Thr His Leu Tyr Pro Leu Ser Lys Thr Val Arg Phe Glu Leu Lys
275 280 285
Pro Ile Gly Arg Thr Leu Glu His Ile His Ala Lys Asn Phe Leu Ser
290 295 300
Gln Asp Glu Thr Met Ala Asp Met Tyr Gln Lys Val Lys Val Ile Leu
305 310 315 320
Asp Asp Tyr His Arg Asp Phe Ile Ala Asp Met Met Gly Glu Val Lys
325 330 335
Leu Thr Lys Leu Ala Glu Phe Tyr Asp Val Tyr Leu Lys Phe Arg Lys
340 345 350
Asn Pro Lys Asp Asp Gly Leu Gln Lys Gln Leu Lys Asp Leu Gln Ala
355 360 365
Val Leu Arg Lys Glu Ser Val Lys Pro Ile Gly Ser Gly Gly Lys Tyr
370 375 380
Lys Thr Gly Tyr Asp Arg Leu Phe Gly Ala Lys Leu Phe Lys Asp Gly
385 390 395 400
Lys Glu Leu Gly Asp Leu Ala Lys Phe Val Ile Ala Gln Glu Gly Glu
405 410 415
Ser Ser Pro Lys Leu Ala His Leu Ala His Phe Glu Lys Phe Ser Thr
420 425 430
Tyr Phe Thr Gly Phe His Asp Asn Arg Lys Asn Met Tyr Ser Asp Glu
435 440 445
Asp Lys His Thr Ala Ile Ala Tyr Arg Leu Ile His Glu Asn Leu Pro
450 455 460
Arg Phe Ile Asp Asn Leu Gln Ile Leu Thr Thr Ile Lys Gln Lys His
465 470 475 480
Ser Ala Leu Tyr Asp Gln Ile Ile Asn Glu Leu Thr Ala Ser Gly Leu
485 490 495
Asp Val Ser Leu Ala Ser His Leu Asp Gly Tyr His Lys Leu Leu Thr
500 505 510
Gln Glu Gly Ile Thr Ala Tyr Asn Arg Ile Ile Gly Glu Val Asn Gly
515 520 525
Tyr Thr Asn Lys His Asn Gln Ile Cys His Lys Ser Glu Arg Ile Ala
530 535 540
Lys Leu Arg Pro Leu His Lys Gln Ile Leu Ser Asp Gly Met Gly Val
545 550 555 560
Ser Phe Leu Pro Ser Lys Phe Ala Asp Asp Ser Glu Met Cys Gln Ala
565 570 575
Val Asn Glu Phe Tyr Arg His Tyr Thr Asp Val Phe Ala Lys Val Gln
580 585 590
Ser Leu Phe Asp Gly Phe Asp Asp His Gln Lys Asp Gly Ile Tyr Val
595 600 605
Glu His Lys Asn Leu Asn Glu Leu Ser Lys Gln Ala Phe Gly Asp Phe
610 615 620
Ala Leu Leu Gly Arg Val Leu Asp Gly Tyr Tyr Val Asp Val Val Asn
625 630 635 640
Pro Glu Phe Asn Glu Arg Phe Ala Lys Ala Lys Thr Asp Asn Ala Lys
645 650 655
Ala Lys Leu Thr Lys Glu Lys Asp Lys Phe Ile Lys Gly Val His Ser
660 665 670
Leu Ala Ser Leu Glu Gln Ala Ile Glu His His Thr Ala Arg His Asp
675 680 685
Asp Glu Ser Val Gln Ala Gly Lys Leu Gly Gln Tyr Phe Lys His Gly
690 695 700
Leu Ala Gly Val Asp Asn Pro Ile Gln Lys Ile His Asn Asn His Ser
705 710 715 720
Thr Ile Lys Gly Phe Leu Glu Arg Glu Arg Pro Ala Gly Glu Arg Ala
725 730 735
Leu Pro Lys Ile Lys Ser Gly Lys Asn Pro Glu Met Thr Gln Leu Arg
740 745 750
Gln Leu Lys Glu Leu Leu Asp Asn Ala Leu Asn Val Ala His Phe Ala
755 760 765
Lys Leu Leu Thr Thr Lys Thr Thr Leu Asp Asn Gln Asp Gly Asn Phe
770 775 780
Tyr Gly Glu Phe Gly Val Leu Tyr Asp Glu Leu Ala Lys Ile Pro Thr
785 790 795 800
Leu Tyr Asn Lys Val Arg Asp Tyr Leu Ser Gln Lys Pro Phe Ser Thr
805 810 815
Glu Lys Tyr Lys Leu Asn Phe Gly Asn Pro Thr Leu Leu Asn Gly Trp
820 825 830
Asp Leu Asn Lys Glu Lys Asp Asn Phe Gly Val Ile Leu Gln Lys Asp
835 840 845
Gly Cys Tyr Tyr Leu Ala Leu Leu Asp Lys Ala His Lys Lys Val Phe
850 855 860
Asp Asn Ala Pro Asn Thr Gly Lys Asn Val Tyr Gln Lys Met Val Tyr
865 870 875 880
Lys Leu Leu Pro Gly Pro Asn Lys Met Leu Pro Lys Val Phe Phe Ala
885 890 895
Lys Ser Asn Leu Asp Tyr Tyr Asn Pro Ser Ala Glu Leu Leu Asp Lys
900 905 910
Tyr Ala Lys Gly Thr His Lys Lys Gly Asp Asn Phe Asn Leu Lys Asp
915 920 925
Cys His Ala Leu Ile Asp Phe Phe Lys Ala Gly Ile Asn Lys His Pro
930 935 940
Glu Trp Gln His Phe Gly Phe Lys Phe Ser Pro Thr Ser Ser Tyr Arg
945 950 955 960
Asp Leu Ser Asp Phe Tyr Arg Glu Val Glu Pro Gln Gly Tyr Gln Val
965 970 975
Lys Phe Val Asp Ile Asn Ala Asp Tyr Ile Asp Glu Leu Val Glu Gln
980 985 990
Gly Lys Leu Tyr Leu Phe Gln Ile Tyr Asn Lys Asp Phe Ser Pro Lys
995 1000 1005
Ala His Gly Lys Pro Asn Leu His Thr Leu Tyr Phe Lys Ala Leu
1010 1015 1020
Phe Ser Glu Asp Asn Leu Ala Asp Pro Ile Tyr Lys Leu Asn Gly
1025 1030 1035
Glu Ala Gln Ile Phe Tyr Arg Lys Ala Ser Leu Asp Met Asn Glu
1040 1045 1050
Thr Thr Ile His Arg Ala Gly Glu Val Leu Glu Asn Lys Asn Pro
1055 1060 1065
Asp Asn Pro Lys Lys Arg Gln Phe Val Tyr Asp Ile Ile Lys Asp
1070 1075 1080
Lys Arg Tyr Thr Gln Asp Lys Phe Met Leu His Val Pro Ile Thr
1085 1090 1095
Met Asn Phe Gly Val Gln Gly Met Thr Ile Lys Glu Phe Asn Lys
1100 1105 1110
Lys Val Asn Gln Ser Ile Gln Gln Tyr Asp Glu Val Asn Val Ile
1115 1120 1125
Gly Ile Ala Arg Gly Glu Arg His Leu Leu Tyr Leu Thr Val Ile
1130 1135 1140
Asn Ser Lys Gly Glu Ile Leu Glu Gln Arg Ser Leu Asn Asp Ile
1145 1150 1155
Thr Thr Ala Ser Ala Asn Gly Thr Gln Val Thr Thr Pro Tyr His
1160 1165 1170
Lys Ile Leu Asp Lys Arg Glu Ile Glu Arg Leu Asn Ala Arg Val
1175 1180 1185
Gly Trp Gly Glu Ile Glu Thr Ile Lys Glu Leu Lys Ser Gly Tyr
1190 1195 1200
Leu Ser His Val Val His Gln Ile Asn Gln Leu Met Leu Lys Tyr
1205 1210 1215
Asn Ala Ile Val Val Leu Ala Asp Leu Asn Phe Gly Phe Lys Arg
1220 1225 1230
Gly Arg Phe Lys Val Glu Lys Gln Ile Tyr Gln Asn Phe Glu Asn
1235 1240 1245
Ala Leu Ile Lys Lys Leu Asn His Leu Val Leu Lys Asp Lys Ala
1250 1255 1260
Asp Asp Glu Ile Gly Ser Tyr Lys Asn Ala Leu Gln Leu Thr Asn
1265 1270 1275
Asn Phe Thr Asp Leu Lys Ser Ile Gly Lys Gln Thr Gly Phe Leu
1280 1285 1290
Phe Tyr Val Pro Ala Trp Asn Thr Ser Lys Ile Asp Pro Glu Thr
1295 1300 1305
Gly Phe Val Asp Leu Leu Lys Pro Arg Tyr Glu Asn Ile Ala Gln
1310 1315 1320
Ser Gln Ala Phe Phe Gly Lys Phe Asp Lys Ile Cys Tyr Asn Thr
1325 1330 1335
Asp Lys Gly Tyr Phe Glu Phe His Ile Asp Tyr Ala Lys Phe Thr
1340 1345 1350
Asp Lys Ala Lys Asn Ser Arg Gln Lys Trp Ala Ile Cys Ser His
1355 1360 1365
Gly Asp Lys Arg Tyr Val Tyr Asp Lys Thr Ala Asn Gln Asn Lys
1370 1375 1380
Gly Ala Ala Lys Gly Ile Asn Val Asn Asp Glu Leu Lys Ser Leu
1385 1390 1395
Phe Ala Arg Tyr His Ile Asn Asp Lys Gln Pro Asn Leu Val Met
1400 1405 1410
Asp Ile Cys Gln Asn Asn Asp Lys Glu Phe His Lys Ser Leu Met
1415 1420 1425
Cys Leu Leu Lys Thr Leu Leu Ala Leu Arg Tyr Ser Asn Ala Ser
1430 1435 1440
Ser Asp Glu Ala Phe Ile Leu Ser Pro Val Ala Asn Asp Glu Gly
1445 1450 1455
Val Phe Phe Asn Ser Ala Leu Ala Asp Asp Thr Gln Pro Gln Asn
1460 1465 1470
Ala Asp Ala Asn Gly Ala Tyr His Ile Ala Leu Lys Gly Leu Trp
1475 1480 1485
Leu Leu Asn Glu Leu Lys Asn Ser Asp Asp Leu Asn Lys Val Lys
1490 1495 1500
Leu Ala Ile Asp Asn Gln Thr Trp Leu Asn Phe Ala Gln Asn Arg
1505 1510 1515
Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys
1520 1525 1530
Lys Ser Gly Gly Ser Thr Asn Leu Ser Asp Ile Ile Glu Lys Glu
1535 1540 1545
Thr Gly Lys Gln Leu Val Ile Gln Glu Ser Ile Leu Met Leu Pro
1550 1555 1560
Glu Glu Val Glu Glu Val Ile Gly Asn Lys Pro Glu Ser Asp Ile
1565 1570 1575
Leu Val His Thr Ala Tyr Asp Glu Ser Thr Asp Glu Asn Val Met
1580 1585 1590
Leu Leu Thr Ser Asp Ala Pro Glu Tyr Lys Pro Trp Ala Leu Val
1595 1600 1605
Ile Gln Asp Ser Asn Gly Glu Asn Lys Ile Lys Met Leu Ser Gly
1610 1615 1620
Gly Ser Pro Lys Lys Lys Arg Lys Val
1625 1630
<210> 51
<211> 4809
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 51
atgccgaaga agaagcgcaa ggtcatgtcc agcgagaccg gccccgtggc ggtggacccc 60
accctgcgca ggcgcatcga gccgcacgag ttcgaggtgt tcttcgaccc cagggagctc 120
cgcaaggaga cctgcctcct gtacgagatc aactggggcg gcaggcactc catctggagg 180
cacacgagcc agaacaccaa caagcacgtc gaggtgaact tcatcgagaa gttcaccacg 240
gagaggtact tctgcccgaa cacgcgctgc tccatcacgt ggttcctctc gtggagccca 300
tgcggcgagt gctccagggc gatcacggag ttcctcagcc gctacccgca cgtgaccctg 360
ttcatctaca tcgctaggct ctaccaccac gcggacccca ggaacaggca gggcctcagg 420
gacctgatct ccagcggcgt cacgatccag atcatgaccg agcaggagtc cggctactgc 480
tggaggaact tcgtgaacta ctccccgagc aacgaggccc actggccccg ctacccgcac 540
ctctgggtcc gcctctacgt gctcgagctg tactgcatca tcctcggcct gccgccctgc 600
ctcaacatcc tgaggcgcaa gcagccccag ctgacgttct tcaccatcgc cctgcagagc 660
tgccactacc agaggctccc gccccacatc ctgtgggcga ccgggctcaa ggggggcggg 720
ggctcaggcg ggggcgggag cggcggcggg ggctctgggg gcggcggcag cggcgggggc 780
ggcagcgggg gcggcgggtc gatgagcaag ctggagaagt tcacgaactg ctactccctc 840
agcaagaccc tgaggttcaa ggcgatcccg gtcggcaaga cccaggagaa catcgacaac 900
aagcggctgc tggtggagga cgagaagagg gctgaggact acaagggcgt gaagaagctc 960
ctggaccgct actacctgtc cttcatcaac gacgtgctcc acagcatcaa gctcaagaac 1020
ctgaacaact acatcagcct cttcaggaag aagacgcgca ccgagaagga gaacaaggag 1080
ctcgagaacc tggagatcaa cctgaggaag gagatcgcca aggcgttcaa gggcaacgag 1140
ggctacaagt ccctcttcaa gaaggacatc atcgagacga tcctcccgga gttcctggac 1200
gacaaggacg agatcgccct ggtcaactcc ttcaacggct tcaccacggc gttcaccggc 1260
ttcttcgaca accgcgagaa catgttcagc gaggaggcca agtccacgag catcgcgttc 1320
aggtgcatca acgagaacct cacccgctac atctccaaca tggacatctt cgagaaggtc 1380
gacgcgatct tcgacaagca cgaggtgcag gagatcaagg agaagatcct gaacagcgac 1440
tacgacgtcg aggacttctt cgagggcgag ttcttcaact tcgtcctcac gcaggagggc 1500
atcgacgtgt acaacgccat catcggtggc ttcgtgaccg agtccggcga gaagatcaag 1560
ggcctgaacg agtacatcaa cctctacaac cagaagacca agcagaagct gccgaagttc 1620
aagcccctgt acaagcaggt gctctccgac agggagtccc tcagcttcta cggcgagggc 1680
tacacgagcg acgaggaggt cctggaggtg ttccgcaaca ccctcaacaa gaacagcgag 1740
atcttctcca gcatcaagaa gctcgagaag ctgttcaaga acttcgacga gtactccagc 1800
gccggcatct tcgtcaagaa cggcccggcg atctccacga tcagcaagga catcttcggc 1860
gagtggaacg tgatccgcga caagtggaac gccgagtacg acgacatcca cctcaagaag 1920
aaggcggtgg tcaccgagaa gtacgaggac gacaggcgca agtccttcaa gaagatcggc 1980
tccttcagcc tcgagcagct gcaggagtac gccgacgcgg acctgagcgt ggtcgagaag 2040
ctcaaggaga tcatcatcca gaaggtcgac gagatctaca aggtgtacgg ctccagcgag 2100
aagctcttcg acgcggactt cgtcctcgag aagtccctga agaagaacga cgccgtggtc 2160
gcgatcatga aggacctcct ggactccgtg aagagcttcg agaattacat caaggccttc 2220
ttcggcgagg gcaaggagac gaacagggac gagtccttct acggcgactt cgtcctggcc 2280
tacgacatcc tcctgaaggt ggaccacatc tacgacgcga tccgcaacta cgtgacccag 2340
aagccgtaca gcaaggacaa gttcaagctc tacttccaga acccccagtt catgggcggc 2400
tgggacaagg acaaggagac ggactacagg gcgaccatcc tgcgctacgg cagcaagtac 2460
tacctcgcca tcatggacaa gaagtacgcg aagtgcctgc agaagatcga caaggacgac 2520
gtcaacggca actacgagaa gatcaactac aagctcctgc cgggccccaa caagatgctc 2580
ccgaaggtgt tcttctccaa gaagtggatg gcctactaca accccagcga ggacatccag 2640
aagatctaca agaacggcac gttcaagaag ggcgacatgt tcaacctgaa cgactgccac 2700
aagctcatcg acttcttcaa ggactccatc agccgctacc cgaagtggtc caacgcctac 2760
gacttcaact tcagcgagac cgagaagtac aaggacatcg cgggcttcta ccgcgaggtc 2820
gaggagcagg gctacaaggt gtccttcgag tccgccagca agaaggaggt cgacaagctg 2880
gtggaggagg gcaagctcta catgttccag atctacaaca aggacttctc cgacaagagc 2940
cacggcacgc ccaacctgca caccatgtac ttcaagctcc tgttcgacga gaacaaccac 3000
ggccagatca ggctgtccgg cggcgccgag ctcttcatga ggagggcgag cctgaagaag 3060
gaggagctgg tggtccaccc cgctaacagc ccaatcgcga acaagaaccc ggacaacccc 3120
aagaagacca cgaccctgtc ctacgacgtg tacaaggaca agaggttcag cgaggaccag 3180
tacgagctcc acatcccgat cgcgatcaac aagtgcccca agaacatctt caagatcaac 3240
accgaggtcc gcgtgctcct gaagcacgac gacaacccct acgtgatcgg catcgacagg 3300
ggcgagagga acctcctgta catcgtggtc gtggacggca agggcaacat cgtggagcag 3360
tactccctca acgagatcat caacaacttc aacggcatca ggatcaagac ggactaccac 3420
agcctcctgg acaagaagga gaaggagagg ttcgaggccc gccagaactg gacctccatc 3480
gagaacatca aggagctgaa ggcgggctac atcagccagg tcgtgcacaa gatctgcgag 3540
ctcgtcgaga agtacgacgc cgtgatcgcc ctcgaggacc tgaactccgg cttcaagaac 3600
agccgcgtca aggtggagaa gcaggtctac cagaagttcg agaagatgct catcgacaag 3660
ctgaactaca tggtggacaa gaagtccaac ccctgcgcta cgggcggcgc gctgaagggc 3720
taccagatca ccaacaagtt cgagagcttc aagtccatga gcactcagaa cggcttcatc 3780
ttctacatcc cggcgtggct cacgtccaag atcgacccca gcaccggctt cgtcaacctc 3840
ctgaagacga agtacacctc catcgccgac agcaagaagt tcatctccag cttcgaccgc 3900
atcatgtatg tgccggagga ggacctgttc gagttcgccc tcgactacaa gaacttctcc 3960
cgcacggacg cggactacat caagaagtgg aagctgtaca gctacggcaa ccgcatccgc 4020
atcttcagga accccaagaa gaacaacgtc ttcgactggg aggaggtgtg cctgacctcc 4080
gcgtacaagg agctcttcaa caagtacggc atcaactacc agcagggcga catcagggct 4140
ctcctgtgcg agcagagcga caaggccttc tactccagct tcatggcgct gatgtccctc 4200
atgctgcaga tgaggaactc gatcaccggc aggacggacg tggacttcct catctccccg 4260
gtgaagaaca gcgacggcat cttctacgac tccaggaact acgaggccca ggagaacgcg 4320
atcctcccaa agaacgcgga cgccaacggc gcctacaaca tcgccaggaa ggtcctctgg 4380
gctatcggcc agttcaagaa ggcggaggac gagaagctgg acaaggtgaa gatcgccatc 4440
agcaacaagg agtggctcga gtacgcccag acctcggtca agcacggcag cccgaagaag 4500
aagcgcaagg tgtccggcgg cagcacgaac ctgtccgaca tcatcgagaa ggagaccggc 4560
aagcagctcg tgatccagga gagcatcctc atgctgccgg aggaggtcga ggaggtcatc 4620
ggcaacaagc ccgagtccga catcctcgtc cacacggcct acgacgagtc caccgacgag 4680
aacgtgatgc tcctgacctc ggacgctccc gagtacaagc catgggccct ggtcatccag 4740
gacagcaacg gcgagaacaa gatcaagatg ctctccggcg gcagcccgaa gaagaagcgc 4800
aaagtgtga 4809
<210> 52
<211> 1602
<212> PRT
<213> Artificial Sequence
<220>
<223> Fusion protein
<400> 52
Met Pro Lys Lys Lys Arg Lys Val Met Ser Ser Glu Thr Gly Pro Val
1 5 10 15
Ala Val Asp Pro Thr Leu Arg Arg Arg Ile Glu Pro His Glu Phe Glu
20 25 30
Val Phe Phe Asp Pro Arg Glu Leu Arg Lys Glu Thr Cys Leu Leu Tyr
35 40 45
Glu Ile Asn Trp Gly Gly Arg His Ser Ile Trp Arg His Thr Ser Gln
50 55 60
Asn Thr Asn Lys His Val Glu Val Asn Phe Ile Glu Lys Phe Thr Thr
65 70 75 80
Glu Arg Tyr Phe Cys Pro Asn Thr Arg Cys Ser Ile Thr Trp Phe Leu
85 90 95
Ser Trp Ser Pro Cys Gly Glu Cys Ser Arg Ala Ile Thr Glu Phe Leu
100 105 110
Ser Arg Tyr Pro His Val Thr Leu Phe Ile Tyr Ile Ala Arg Leu Tyr
115 120 125
His His Ala Asp Pro Arg Asn Arg Gln Gly Leu Arg Asp Leu Ile Ser
130 135 140
Ser Gly Val Thr Ile Gln Ile Met Thr Glu Gln Glu Ser Gly Tyr Cys
145 150 155 160
Trp Arg Asn Phe Val Asn Tyr Ser Pro Ser Asn Glu Ala His Trp Pro
165 170 175
Arg Tyr Pro His Leu Trp Val Arg Leu Tyr Val Leu Glu Leu Tyr Cys
180 185 190
Ile Ile Leu Gly Leu Pro Pro Cys Leu Asn Ile Leu Arg Arg Lys Gln
195 200 205
Pro Gln Leu Thr Phe Phe Thr Ile Ala Leu Gln Ser Cys His Tyr Gln
210 215 220
Arg Leu Pro Pro His Ile Leu Trp Ala Thr Gly Leu Lys Gly Gly Gly
225 230 235 240
Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly
245 250 255
Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Met Ser Lys Leu Glu
260 265 270
Lys Phe Thr Asn Cys Tyr Ser Leu Ser Lys Thr Leu Arg Phe Lys Ala
275 280 285
Ile Pro Val Gly Lys Thr Gln Glu Asn Ile Asp Asn Lys Arg Leu Leu
290 295 300
Val Glu Asp Glu Lys Arg Ala Glu Asp Tyr Lys Gly Val Lys Lys Leu
305 310 315 320
Leu Asp Arg Tyr Tyr Leu Ser Phe Ile Asn Asp Val Leu His Ser Ile
325 330 335
Lys Leu Lys Asn Leu Asn Asn Tyr Ile Ser Leu Phe Arg Lys Lys Thr
340 345 350
Arg Thr Glu Lys Glu Asn Lys Glu Leu Glu Asn Leu Glu Ile Asn Leu
355 360 365
Arg Lys Glu Ile Ala Lys Ala Phe Lys Gly Asn Glu Gly Tyr Lys Ser
370 375 380
Leu Phe Lys Lys Asp Ile Ile Glu Thr Ile Leu Pro Glu Phe Leu Asp
385 390 395 400
Asp Lys Asp Glu Ile Ala Leu Val Asn Ser Phe Asn Gly Phe Thr Thr
405 410 415
Ala Phe Thr Gly Phe Phe Asp Asn Arg Glu Asn Met Phe Ser Glu Glu
420 425 430
Ala Lys Ser Thr Ser Ile Ala Phe Arg Cys Ile Asn Glu Asn Leu Thr
435 440 445
Arg Tyr Ile Ser Asn Met Asp Ile Phe Glu Lys Val Asp Ala Ile Phe
450 455 460
Asp Lys His Glu Val Gln Glu Ile Lys Glu Lys Ile Leu Asn Ser Asp
465 470 475 480
Tyr Asp Val Glu Asp Phe Phe Glu Gly Glu Phe Phe Asn Phe Val Leu
485 490 495
Thr Gln Glu Gly Ile Asp Val Tyr Asn Ala Ile Ile Gly Gly Phe Val
500 505 510
Thr Glu Ser Gly Glu Lys Ile Lys Gly Leu Asn Glu Tyr Ile Asn Leu
515 520 525
Tyr Asn Gln Lys Thr Lys Gln Lys Leu Pro Lys Phe Lys Pro Leu Tyr
530 535 540
Lys Gln Val Leu Ser Asp Arg Glu Ser Leu Ser Phe Tyr Gly Glu Gly
545 550 555 560
Tyr Thr Ser Asp Glu Glu Val Leu Glu Val Phe Arg Asn Thr Leu Asn
565 570 575
Lys Asn Ser Glu Ile Phe Ser Ser Ile Lys Lys Leu Glu Lys Leu Phe
580 585 590
Lys Asn Phe Asp Glu Tyr Ser Ser Ala Gly Ile Phe Val Lys Asn Gly
595 600 605
Pro Ala Ile Ser Thr Ile Ser Lys Asp Ile Phe Gly Glu Trp Asn Val
610 615 620
Ile Arg Asp Lys Trp Asn Ala Glu Tyr Asp Asp Ile His Leu Lys Lys
625 630 635 640
Lys Ala Val Val Thr Glu Lys Tyr Glu Asp Asp Arg Arg Lys Ser Phe
645 650 655
Lys Lys Ile Gly Ser Phe Ser Leu Glu Gln Leu Gln Glu Tyr Ala Asp
660 665 670
Ala Asp Leu Ser Val Val Glu Lys Leu Lys Glu Ile Ile Ile Gln Lys
675 680 685
Val Asp Glu Ile Tyr Lys Val Tyr Gly Ser Ser Glu Lys Leu Phe Asp
690 695 700
Ala Asp Phe Val Leu Glu Lys Ser Leu Lys Lys Asn Asp Ala Val Val
705 710 715 720
Ala Ile Met Lys Asp Leu Leu Asp Ser Val Lys Ser Phe Glu Asn Tyr
725 730 735
Ile Lys Ala Phe Phe Gly Glu Gly Lys Glu Thr Asn Arg Asp Glu Ser
740 745 750
Phe Tyr Gly Asp Phe Val Leu Ala Tyr Asp Ile Leu Leu Lys Val Asp
755 760 765
His Ile Tyr Asp Ala Ile Arg Asn Tyr Val Thr Gln Lys Pro Tyr Ser
770 775 780
Lys Asp Lys Phe Lys Leu Tyr Phe Gln Asn Pro Gln Phe Met Gly Gly
785 790 795 800
Trp Asp Lys Asp Lys Glu Thr Asp Tyr Arg Ala Thr Ile Leu Arg Tyr
805 810 815
Gly Ser Lys Tyr Tyr Leu Ala Ile Met Asp Lys Lys Tyr Ala Lys Cys
820 825 830
Leu Gln Lys Ile Asp Lys Asp Asp Val Asn Gly Asn Tyr Glu Lys Ile
835 840 845
Asn Tyr Lys Leu Leu Pro Gly Pro Asn Lys Met Leu Pro Lys Val Phe
850 855 860
Phe Ser Lys Lys Trp Met Ala Tyr Tyr Asn Pro Ser Glu Asp Ile Gln
865 870 875 880
Lys Ile Tyr Lys Asn Gly Thr Phe Lys Lys Gly Asp Met Phe Asn Leu
885 890 895
Asn Asp Cys His Lys Leu Ile Asp Phe Phe Lys Asp Ser Ile Ser Arg
900 905 910
Tyr Pro Lys Trp Ser Asn Ala Tyr Asp Phe Asn Phe Ser Glu Thr Glu
915 920 925
Lys Tyr Lys Asp Ile Ala Gly Phe Tyr Arg Glu Val Glu Glu Gln Gly
930 935 940
Tyr Lys Val Ser Phe Glu Ser Ala Ser Lys Lys Glu Val Asp Lys Leu
945 950 955 960
Val Glu Glu Gly Lys Leu Tyr Met Phe Gln Ile Tyr Asn Lys Asp Phe
965 970 975
Ser Asp Lys Ser His Gly Thr Pro Asn Leu His Thr Met Tyr Phe Lys
980 985 990
Leu Leu Phe Asp Glu Asn Asn His Gly Gln Ile Arg Leu Ser Gly Gly
995 1000 1005
Ala Glu Leu Phe Met Arg Arg Ala Ser Leu Lys Lys Glu Glu Leu
1010 1015 1020
Val Val His Pro Ala Asn Ser Pro Ile Ala Asn Lys Asn Pro Asp
1025 1030 1035
Asn Pro Lys Lys Thr Thr Thr Leu Ser Tyr Asp Val Tyr Lys Asp
1040 1045 1050
Lys Arg Phe Ser Glu Asp Gln Tyr Glu Leu His Ile Pro Ile Ala
1055 1060 1065
Ile Asn Lys Cys Pro Lys Asn Ile Phe Lys Ile Asn Thr Glu Val
1070 1075 1080
Arg Val Leu Leu Lys His Asp Asp Asn Pro Tyr Val Ile Gly Ile
1085 1090 1095
Asp Arg Gly Glu Arg Asn Leu Leu Tyr Ile Val Val Val Asp Gly
1100 1105 1110
Lys Gly Asn Ile Val Glu Gln Tyr Ser Leu Asn Glu Ile Ile Asn
1115 1120 1125
Asn Phe Asn Gly Ile Arg Ile Lys Thr Asp Tyr His Ser Leu Leu
1130 1135 1140
Asp Lys Lys Glu Lys Glu Arg Phe Glu Ala Arg Gln Asn Trp Thr
1145 1150 1155
Ser Ile Glu Asn Ile Lys Glu Leu Lys Ala Gly Tyr Ile Ser Gln
1160 1165 1170
Val Val His Lys Ile Cys Glu Leu Val Glu Lys Tyr Asp Ala Val
1175 1180 1185
Ile Ala Leu Glu Asp Leu Asn Ser Gly Phe Lys Asn Ser Arg Val
1190 1195 1200
Lys Val Glu Lys Gln Val Tyr Gln Lys Phe Glu Lys Met Leu Ile
1205 1210 1215
Asp Lys Leu Asn Tyr Met Val Asp Lys Lys Ser Asn Pro Cys Ala
1220 1225 1230
Thr Gly Gly Ala Leu Lys Gly Tyr Gln Ile Thr Asn Lys Phe Glu
1235 1240 1245
Ser Phe Lys Ser Met Ser Thr Gln Asn Gly Phe Ile Phe Tyr Ile
1250 1255 1260
Pro Ala Trp Leu Thr Ser Lys Ile Asp Pro Ser Thr Gly Phe Val
1265 1270 1275
Asn Leu Leu Lys Thr Lys Tyr Thr Ser Ile Ala Asp Ser Lys Lys
1280 1285 1290
Phe Ile Ser Ser Phe Asp Arg Ile Met Tyr Val Pro Glu Glu Asp
1295 1300 1305
Leu Phe Glu Phe Ala Leu Asp Tyr Lys Asn Phe Ser Arg Thr Asp
1310 1315 1320
Ala Asp Tyr Ile Lys Lys Trp Lys Leu Tyr Ser Tyr Gly Asn Arg
1325 1330 1335
Ile Arg Ile Phe Arg Asn Pro Lys Lys Asn Asn Val Phe Asp Trp
1340 1345 1350
Glu Glu Val Cys Leu Thr Ser Ala Tyr Lys Glu Leu Phe Asn Lys
1355 1360 1365
Tyr Gly Ile Asn Tyr Gln Gln Gly Asp Ile Arg Ala Leu Leu Cys
1370 1375 1380
Glu Gln Ser Asp Lys Ala Phe Tyr Ser Ser Phe Met Ala Leu Met
1385 1390 1395
Ser Leu Met Leu Gln Met Arg Asn Ser Ile Thr Gly Arg Thr Asp
1400 1405 1410
Val Asp Phe Leu Ile Ser Pro Val Lys Asn Ser Asp Gly Ile Phe
1415 1420 1425
Tyr Asp Ser Arg Asn Tyr Glu Ala Gln Glu Asn Ala Ile Leu Pro
1430 1435 1440
Lys Asn Ala Asp Ala Asn Gly Ala Tyr Asn Ile Ala Arg Lys Val
1445 1450 1455
Leu Trp Ala Ile Gly Gln Phe Lys Lys Ala Glu Asp Glu Lys Leu
1460 1465 1470
Asp Lys Val Lys Ile Ala Ile Ser Asn Lys Glu Trp Leu Glu Tyr
1475 1480 1485
Ala Gln Thr Ser Val Lys His Gly Ser Pro Lys Lys Lys Arg Lys
1490 1495 1500
Val Ser Gly Gly Ser Thr Asn Leu Ser Asp Ile Ile Glu Lys Glu
1505 1510 1515
Thr Gly Lys Gln Leu Val Ile Gln Glu Ser Ile Leu Met Leu Pro
1520 1525 1530
Glu Glu Val Glu Glu Val Ile Gly Asn Lys Pro Glu Ser Asp Ile
1535 1540 1545
Leu Val His Thr Ala Tyr Asp Glu Ser Thr Asp Glu Asn Val Met
1550 1555 1560
Leu Leu Thr Ser Asp Ala Pro Glu Tyr Lys Pro Trp Ala Leu Val
1565 1570 1575
Ile Gln Asp Ser Asn Gly Glu Asn Lys Ile Lys Met Leu Ser Gly
1580 1585 1590
Gly Ser Pro Lys Lys Lys Arg Lys Val
1595 1600
<210> 53
<211> 3873
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 53
atgccgaaga agaagcgcaa ggtcgggggc gggggctcag gcgggggcgg gagcggcggc 60
gggggctctg ggggcggcgg cagcggcggg ggcggcagcg ggggcggcgg gtcgatgagc 120
aagctggaga agttcacgaa ctgctactcc ctcagcaaga ccctgaggtt caaggcgatc 180
ccggtcggca agacccagga gaacatcgac aacaagcggc tgctggtgga ggacgagaag 240
agggctgagg actacaaggg cgtgaagaag ctcctggacc gctactacct gtccttcatc 300
aacgacgtgc tccacagcat caagctcaag aacctgaaca actacatcag cctcttcagg 360
aagaagacgc gcaccgagaa ggagaacaag gagctcgaga acctggagat caacctgagg 420
aaggagatcg ccaaggcgtt caagggcaac gagggctaca agtccctctt caagaaggac 480
atcatcgaga cgatcctccc ggagttcctg gacgacaagg acgagatcgc cctggtcaac 540
tccttcaacg gcttcaccac ggcgttcacc ggcttcttcg acaaccgcga gaacatgttc 600
agcgaggagg ccaagtccac gagcatcgcg ttcaggtgca tcaacgagaa cctcacccgc 660
tacatctcca acatggacat cttcgagaag gtcgacgcga tcttcgacaa gcacgaggtg 720
caggagatca aggagaagat cctgaacagc gactacgacg tcgaggactt cttcgagggc 780
gagttcttca acttcgtcct cacgcaggag ggcatcgacg tgtacaacgc catcatcggt 840
ggcttcgtga ccgagtccgg cgagaagatc aagggcctga acgagtacat caacctctac 900
aaccagaaga ccaagcagaa gctgccgaag ttcaagcccc tgtacaagca ggtgctctcc 960
gacagggagt ccctcagctt ctacggcgag ggctacacga gcgacgagga ggtcctggag 1020
gtgttccgca acaccctcaa caagaacagc gagatcttct ccagcatcaa gaagctcgag 1080
aagctgttca agaacttcga cgagtactcc agcgccggca tcttcgtcaa gaacggcccg 1140
gcgatctcca cgatcagcaa ggacatcttc ggcgagtgga acgtgatccg cgacaagtgg 1200
aacgccgagt acgacgacat ccacctcaag aagaaggcgg tggtcaccga gaagtacgag 1260
gacgacaggc gcaagtcctt caagaagatc ggctccttca gcctcgagca gctgcaggag 1320
tacgccgacg cggacctgag cgtggtcgag aagctcaagg agatcatcat ccagaaggtc 1380
gacgagatct acaaggtgta cggctccagc gagaagctct tcgacgcgga cttcgtcctc 1440
gagaagtccc tgaagaagaa cgacgccgtg gtcgcgatca tgaaggacct cctggactcc 1500
gtgaagagct tcgagaatta catcaaggcc ttcttcggcg agggcaagga gacgaacagg 1560
gacgagtcct tctacggcga cttcgtcctg gcctacgaca tcctcctgaa ggtggaccac 1620
atctacgacg cgatccgcaa ctacgtgacc cagaagccgt acagcaagga caagttcaag 1680
ctctacttcc agaaccccca gttcatgggc ggctgggaca aggacaagga gacggactac 1740
agggcgacca tcctgcgcta cggcagcaag tactacctcg ccatcatgga caagaagtac 1800
gcgaagtgcc tgcagaagat cgacaaggac gacgtcaacg gcaactacga gaagatcaac 1860
tacaagctcc tgccgggccc caacaagatg ctcccgaagg tgttcttctc caagaagtgg 1920
atggcctact acaaccccag cgaggacatc cagaagatct acaagaacgg cacgttcaag 1980
aagggcgaca tgttcaacct gaacgactgc cacaagctca tcgacttctt caaggactcc 2040
atcagccgct acccgaagtg gtccaacgcc tacgacttca acttcagcga gaccgagaag 2100
tacaaggaca tcgcgggctt ctaccgcgag gtcgaggagc agggctacaa ggtgtccttc 2160
gagtccgcca gcaagaagga ggtcgacaag ctggtggagg agggcaagct ctacatgttc 2220
cagatctaca acaaggactt ctccgacaag agccacggca cgcccaacct gcacaccatg 2280
tacttcaagc tcctgttcga cgagaacaac cacggccaga tcaggctgtc cggcggcgcc 2340
gagctcttca tgaggagggc gagcctgaag aaggaggagc tggtggtcca ccccgctaac 2400
agcccaatcg cgaacaagaa cccggacaac cccaagaaga ccacgaccct gtcctacgac 2460
gtgtacaagg acaagaggtt cagcgaggac cagtacgagc tccacatccc gatcgcgatc 2520
aacaagtgcc ccaagaacat cttcaagatc aacaccgagg tccgcgtgct cctgaagcac 2580
gacgacaacc cctacgtgat cggcatcgac aggggcgaga ggaacctcct gtacatcgtg 2640
gtcgtggacg gcaagggcaa catcgtggag cagtactccc tcaacgagat catcaacaac 2700
ttcaacggca tcaggatcaa gacggactac cacagcctcc tggacaagaa ggagaaggag 2760
aggttcgagg cccgccagaa ctggacctcc atcgagaaca tcaaggagct gaaggcgggc 2820
tacatcagcc aggtcgtgca caagatctgc gagctcgtcg agaagtacga cgccgtgatc 2880
gccctcgagg acctgaactc cggcttcaag aacagccgcg tcaaggtgga gaagcaggtc 2940
taccagaagt tcgagaagat gctcatcgac aagctgaact acatggtgga caagaagtcc 3000
aacccctgcg ctacgggcgg cgcgctgaag ggctaccaga tcaccaacaa gttcgagagc 3060
ttcaagtcca tgagcactca gaacggcttc atcttctaca tcccggcgtg gctcacgtcc 3120
aagatcgacc ccagcaccgg cttcgtcaac ctcctgaaga cgaagtacac ctccatcgcc 3180
gacagcaaga agttcatctc cagcttcgac cgcatcatgt atgtgccgga ggaggacctg 3240
ttcgagttcg ccctcgacta caagaacttc tcccgcacgg acgcggacta catcaagaag 3300
tggaagctgt acagctacgg caaccgcatc cgcatcttca ggaaccccaa gaagaacaac 3360
gtcttcgact gggaggaggt gtgcctgacc tccgcgtaca aggagctctt caacaagtac 3420
ggcatcaact accagcaggg cgacatcagg gctctcctgt gcgagcagag cgacaaggcc 3480
ttctactcca gcttcatggc gctgatgtcc ctcatgctgc agatgaggaa ctcgatcacc 3540
ggcaggacgg acgtggactt cctcatctcc ccggtgaaga acagcgacgg catcttctac 3600
gactccagga actacgaggc ccaggagaac gcgatcctcc caaagaacgc ggacgccaac 3660
ggcgcctaca acatcgccag gaaggtcctc tgggctatcg gccagttcaa gaaggcggag 3720
gacgagaagc tggacaaggt gaagatcgcc atcagcaaca aggagtggct cgagtacgcc 3780
cagacctcgg tcaagcacgg cagcccgaag aagaagcgca aggtgtccgg cggcagctcc 3840
ggcggcagcc cgaagaagaa gcgcaaagtg tga 3873
<210> 54
<211> 1290
<212> PRT
<213> Artificial Sequence
<220>
<223> Fusion protein
<400> 54
Met Pro Lys Lys Lys Arg Lys Val Gly Gly Gly Gly Ser Gly Gly Gly
1 5 10 15
Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly
20 25 30
Ser Gly Gly Gly Gly Ser Met Ser Lys Leu Glu Lys Phe Thr Asn Cys
35 40 45
Tyr Ser Leu Ser Lys Thr Leu Arg Phe Lys Ala Ile Pro Val Gly Lys
50 55 60
Thr Gln Glu Asn Ile Asp Asn Lys Arg Leu Leu Val Glu Asp Glu Lys
65 70 75 80
Arg Ala Glu Asp Tyr Lys Gly Val Lys Lys Leu Leu Asp Arg Tyr Tyr
85 90 95
Leu Ser Phe Ile Asn Asp Val Leu His Ser Ile Lys Leu Lys Asn Leu
100 105 110
Asn Asn Tyr Ile Ser Leu Phe Arg Lys Lys Thr Arg Thr Glu Lys Glu
115 120 125
Asn Lys Glu Leu Glu Asn Leu Glu Ile Asn Leu Arg Lys Glu Ile Ala
130 135 140
Lys Ala Phe Lys Gly Asn Glu Gly Tyr Lys Ser Leu Phe Lys Lys Asp
145 150 155 160
Ile Ile Glu Thr Ile Leu Pro Glu Phe Leu Asp Asp Lys Asp Glu Ile
165 170 175
Ala Leu Val Asn Ser Phe Asn Gly Phe Thr Thr Ala Phe Thr Gly Phe
180 185 190
Phe Asp Asn Arg Glu Asn Met Phe Ser Glu Glu Ala Lys Ser Thr Ser
195 200 205
Ile Ala Phe Arg Cys Ile Asn Glu Asn Leu Thr Arg Tyr Ile Ser Asn
210 215 220
Met Asp Ile Phe Glu Lys Val Asp Ala Ile Phe Asp Lys His Glu Val
225 230 235 240
Gln Glu Ile Lys Glu Lys Ile Leu Asn Ser Asp Tyr Asp Val Glu Asp
245 250 255
Phe Phe Glu Gly Glu Phe Phe Asn Phe Val Leu Thr Gln Glu Gly Ile
260 265 270
Asp Val Tyr Asn Ala Ile Ile Gly Gly Phe Val Thr Glu Ser Gly Glu
275 280 285
Lys Ile Lys Gly Leu Asn Glu Tyr Ile Asn Leu Tyr Asn Gln Lys Thr
290 295 300
Lys Gln Lys Leu Pro Lys Phe Lys Pro Leu Tyr Lys Gln Val Leu Ser
305 310 315 320
Asp Arg Glu Ser Leu Ser Phe Tyr Gly Glu Gly Tyr Thr Ser Asp Glu
325 330 335
Glu Val Leu Glu Val Phe Arg Asn Thr Leu Asn Lys Asn Ser Glu Ile
340 345 350
Phe Ser Ser Ile Lys Lys Leu Glu Lys Leu Phe Lys Asn Phe Asp Glu
355 360 365
Tyr Ser Ser Ala Gly Ile Phe Val Lys Asn Gly Pro Ala Ile Ser Thr
370 375 380
Ile Ser Lys Asp Ile Phe Gly Glu Trp Asn Val Ile Arg Asp Lys Trp
385 390 395 400
Asn Ala Glu Tyr Asp Asp Ile His Leu Lys Lys Lys Ala Val Val Thr
405 410 415
Glu Lys Tyr Glu Asp Asp Arg Arg Lys Ser Phe Lys Lys Ile Gly Ser
420 425 430
Phe Ser Leu Glu Gln Leu Gln Glu Tyr Ala Asp Ala Asp Leu Ser Val
435 440 445
Val Glu Lys Leu Lys Glu Ile Ile Ile Gln Lys Val Asp Glu Ile Tyr
450 455 460
Lys Val Tyr Gly Ser Ser Glu Lys Leu Phe Asp Ala Asp Phe Val Leu
465 470 475 480
Glu Lys Ser Leu Lys Lys Asn Asp Ala Val Val Ala Ile Met Lys Asp
485 490 495
Leu Leu Asp Ser Val Lys Ser Phe Glu Asn Tyr Ile Lys Ala Phe Phe
500 505 510
Gly Glu Gly Lys Glu Thr Asn Arg Asp Glu Ser Phe Tyr Gly Asp Phe
515 520 525
Val Leu Ala Tyr Asp Ile Leu Leu Lys Val Asp His Ile Tyr Asp Ala
530 535 540
Ile Arg Asn Tyr Val Thr Gln Lys Pro Tyr Ser Lys Asp Lys Phe Lys
545 550 555 560
Leu Tyr Phe Gln Asn Pro Gln Phe Met Gly Gly Trp Asp Lys Asp Lys
565 570 575
Glu Thr Asp Tyr Arg Ala Thr Ile Leu Arg Tyr Gly Ser Lys Tyr Tyr
580 585 590
Leu Ala Ile Met Asp Lys Lys Tyr Ala Lys Cys Leu Gln Lys Ile Asp
595 600 605
Lys Asp Asp Val Asn Gly Asn Tyr Glu Lys Ile Asn Tyr Lys Leu Leu
610 615 620
Pro Gly Pro Asn Lys Met Leu Pro Lys Val Phe Phe Ser Lys Lys Trp
625 630 635 640
Met Ala Tyr Tyr Asn Pro Ser Glu Asp Ile Gln Lys Ile Tyr Lys Asn
645 650 655
Gly Thr Phe Lys Lys Gly Asp Met Phe Asn Leu Asn Asp Cys His Lys
660 665 670
Leu Ile Asp Phe Phe Lys Asp Ser Ile Ser Arg Tyr Pro Lys Trp Ser
675 680 685
Asn Ala Tyr Asp Phe Asn Phe Ser Glu Thr Glu Lys Tyr Lys Asp Ile
690 695 700
Ala Gly Phe Tyr Arg Glu Val Glu Glu Gln Gly Tyr Lys Val Ser Phe
705 710 715 720
Glu Ser Ala Ser Lys Lys Glu Val Asp Lys Leu Val Glu Glu Gly Lys
725 730 735
Leu Tyr Met Phe Gln Ile Tyr Asn Lys Asp Phe Ser Asp Lys Ser His
740 745 750
Gly Thr Pro Asn Leu His Thr Met Tyr Phe Lys Leu Leu Phe Asp Glu
755 760 765
Asn Asn His Gly Gln Ile Arg Leu Ser Gly Gly Ala Glu Leu Phe Met
770 775 780
Arg Arg Ala Ser Leu Lys Lys Glu Glu Leu Val Val His Pro Ala Asn
785 790 795 800
Ser Pro Ile Ala Asn Lys Asn Pro Asp Asn Pro Lys Lys Thr Thr Thr
805 810 815
Leu Ser Tyr Asp Val Tyr Lys Asp Lys Arg Phe Ser Glu Asp Gln Tyr
820 825 830
Glu Leu His Ile Pro Ile Ala Ile Asn Lys Cys Pro Lys Asn Ile Phe
835 840 845
Lys Ile Asn Thr Glu Val Arg Val Leu Leu Lys His Asp Asp Asn Pro
850 855 860
Tyr Val Ile Gly Ile Asp Arg Gly Glu Arg Asn Leu Leu Tyr Ile Val
865 870 875 880
Val Val Asp Gly Lys Gly Asn Ile Val Glu Gln Tyr Ser Leu Asn Glu
885 890 895
Ile Ile Asn Asn Phe Asn Gly Ile Arg Ile Lys Thr Asp Tyr His Ser
900 905 910
Leu Leu Asp Lys Lys Glu Lys Glu Arg Phe Glu Ala Arg Gln Asn Trp
915 920 925
Thr Ser Ile Glu Asn Ile Lys Glu Leu Lys Ala Gly Tyr Ile Ser Gln
930 935 940
Val Val His Lys Ile Cys Glu Leu Val Glu Lys Tyr Asp Ala Val Ile
945 950 955 960
Ala Leu Glu Asp Leu Asn Ser Gly Phe Lys Asn Ser Arg Val Lys Val
965 970 975
Glu Lys Gln Val Tyr Gln Lys Phe Glu Lys Met Leu Ile Asp Lys Leu
980 985 990
Asn Tyr Met Val Asp Lys Lys Ser Asn Pro Cys Ala Thr Gly Gly Ala
995 1000 1005
Leu Lys Gly Tyr Gln Ile Thr Asn Lys Phe Glu Ser Phe Lys Ser
1010 1015 1020
Met Ser Thr Gln Asn Gly Phe Ile Phe Tyr Ile Pro Ala Trp Leu
1025 1030 1035
Thr Ser Lys Ile Asp Pro Ser Thr Gly Phe Val Asn Leu Leu Lys
1040 1045 1050
Thr Lys Tyr Thr Ser Ile Ala Asp Ser Lys Lys Phe Ile Ser Ser
1055 1060 1065
Phe Asp Arg Ile Met Tyr Val Pro Glu Glu Asp Leu Phe Glu Phe
1070 1075 1080
Ala Leu Asp Tyr Lys Asn Phe Ser Arg Thr Asp Ala Asp Tyr Ile
1085 1090 1095
Lys Lys Trp Lys Leu Tyr Ser Tyr Gly Asn Arg Ile Arg Ile Phe
1100 1105 1110
Arg Asn Pro Lys Lys Asn Asn Val Phe Asp Trp Glu Glu Val Cys
1115 1120 1125
Leu Thr Ser Ala Tyr Lys Glu Leu Phe Asn Lys Tyr Gly Ile Asn
1130 1135 1140
Tyr Gln Gln Gly Asp Ile Arg Ala Leu Leu Cys Glu Gln Ser Asp
1145 1150 1155
Lys Ala Phe Tyr Ser Ser Phe Met Ala Leu Met Ser Leu Met Leu
1160 1165 1170
Gln Met Arg Asn Ser Ile Thr Gly Arg Thr Asp Val Asp Phe Leu
1175 1180 1185
Ile Ser Pro Val Lys Asn Ser Asp Gly Ile Phe Tyr Asp Ser Arg
1190 1195 1200
Asn Tyr Glu Ala Gln Glu Asn Ala Ile Leu Pro Lys Asn Ala Asp
1205 1210 1215
Ala Asn Gly Ala Tyr Asn Ile Ala Arg Lys Val Leu Trp Ala Ile
1220 1225 1230
Gly Gln Phe Lys Lys Ala Glu Asp Glu Lys Leu Asp Lys Val Lys
1235 1240 1245
Ile Ala Ile Ser Asn Lys Glu Trp Leu Glu Tyr Ala Gln Thr Ser
1250 1255 1260
Val Lys His Gly Ser Pro Lys Lys Lys Arg Lys Val Ser Gly Gly
1265 1270 1275
Ser Ser Gly Gly Ser Pro Lys Lys Lys Arg Lys Val
1280 1285 1290
<210> 55
<211> 3873
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 55
atgccgaaga agaagcgcaa ggtcgggggc gggggctcag gcgggggcgg gagcggcggc 60
gggggctctg ggggcggcgg cagcggcggg ggcggcagcg ggggcggcgg gtcgatgagc 120
aagctggaga agttcacgaa ctgctactcc ctcagcaaga ccctgaggtt caaggcgatc 180
ccggtcggca agacccagga gaacatcgac aacaagcggc tgctggtgga ggacgagaag 240
agggctgagg actacaaggg cgtgaagaag ctcctggacc gctactacct gtccttcatc 300
aacgacgtgc tccacagcat caagctcaag aacctgaaca actacatcag cctcttcagg 360
aagaagacgc gcaccgagaa ggagaacaag gagctcgaga acctggagat caacctgagg 420
aaggagatcg ccaaggcgtt caagggcaac gagggctaca agtccctctt caagaaggac 480
atcatcgaga cgatcctccc ggagttcctg gacgacaagg acgagatcgc cctggtcaac 540
tccttcaacg gcttcaccac ggcgttcacc ggcttcttcg acaaccgcga gaacatgttc 600
agcgaggagg ccaagtccac gagcatcgcg ttcaggtgca tcaacgagaa cctcacccgc 660
tacatctcca acatggacat cttcgagaag gtcgacgcga tcttcgacaa gcacgaggtg 720
caggagatca aggagaagat cctgaacagc gactacgacg tcgaggactt cttcgagggc 780
gagttcttca acttcgtcct cacgcaggag ggcatcgacg tgtacaacgc catcatcggt 840
ggcttcgtga ccgagtccgg cgagaagatc aagggcctga acgagtacat caacctctac 900
aaccagaaga ccaagcagaa gctgccgaag ttcaagcccc tgtacaagca ggtgctctcc 960
gacagggagt ccctcagctt ctacggcgag ggctacacga gcgacgagga ggtcctggag 1020
gtgttccgca acaccctcaa caagaacagc gagatcttct ccagcatcaa gaagctcgag 1080
aagctgttca agaacttcga cgagtactcc agcgccggca tcttcgtcaa gaacggcccg 1140
gcgatctcca cgatcagcaa ggacatcttc ggcgagtgga acgtgatccg cgacaagtgg 1200
aacgccgagt acgacgacat ccacctcaag aagaaggcgg tggtcaccga gaagtacgag 1260
gacgacaggc gcaagtcctt caagaagatc ggctccttca gcctcgagca gctgcaggag 1320
tacgccgacg cggacctgag cgtggtcgag aagctcaagg agatcatcat ccagaaggtc 1380
gacgagatct acaaggtgta cggctccagc gagaagctct tcgacgcgga cttcgtcctc 1440
gagaagtccc tgaagaagaa cgacgccgtg gtcgcgatca tgaaggacct cctggactcc 1500
gtgaagagct tcgagaatta catcaaggcc ttcttcggcg agggcaagga gacgaacagg 1560
gacgagtcct tctacggcga cttcgtcctg gcctacgaca tcctcctgaa ggtggaccac 1620
atctacgacg cgatccgcaa ctacgtgacc cagaagccgt acagcaagga caagttcaag 1680
ctctacttcc agaaccccca gttcatgggc ggctgggaca aggacaagga gacggactac 1740
agggcgacca tcctgcgcta cggcagcaag tactacctcg ccatcatgga caagaagtac 1800
gcgaagtgcc tgcagaagat cgacaaggac gacgtcaacg gcaactacga gaagatcaac 1860
tacaagctcc tgccgggccc caacaagatg ctcccgaagg tgttcttctc caagaagtgg 1920
atggcctact acaaccccag cgaggacatc cagaagatct acaagaacgg cacgttcaag 1980
aagggcgaca tgttcaacct gaacgactgc cacaagctca tcgacttctt caaggactcc 2040
atcagccgct acccgaagtg gtccaacgcc tacgacttca acttcagcga gaccgagaag 2100
tacaaggaca tcgcgggctt ctaccgcgag gtcgaggagc agggctacaa ggtgtccttc 2160
gagtccgcca gcaagaagga ggtcgacaag ctggtggagg agggcaagct ctacatgttc 2220
cagatctaca acaaggactt ctccgacaag agccacggca cgcccaacct gcacaccatg 2280
tacttcaagc tcctgttcga cgagaacaac cacggccaga tcaggctgtc cggcggcgcc 2340
gagctcttca tgaggagggc gagcctgaag aaggaggagc tggtggtcca ccccgctaac 2400
agcccaatcg cgaacaagaa cccggacaac cccaagaaga ccacgaccct gtcctacgac 2460
gtgtacaagg acaagaggtt cagcgaggac cagtacgagc tccacatccc gatcgcgatc 2520
aacaagtgcc ccaagaacat cttcaagatc aacaccgagg tccgcgtgct cctgaagcac 2580
gacgacaacc cctacgtgat cggcatcgac aggggcgaga ggaacctcct gtacatcgtg 2640
gtcgtggacg gcaagggcaa catcgtggag cagtactccc tcaacgagat catcaacaac 2700
ttcaacggca tcaggatcaa gacggactac cacagcctcc tggacaagaa ggagaaggag 2760
aggttcgagg cccgccagaa ctggacctcc atcgagaaca tcaaggagct gaaggcgggc 2820
tacatcagcc aggtcgtgca caagatctgc gagctcgtcg agaagtacga cgccgtgatc 2880
gccctcgagg acctgaactc cggcttcaag aacagccgcg tcaaggtgga gaagcaggtc 2940
taccagaagt tcgagaagat gctcatcgac aagctgaact acatggtgga caagaagtcc 3000
aacccctgcg ctacgggcgg cgcgctgaag ggctaccaga tcaccaacaa gttcgagagc 3060
ttcaagtcca tgagcactca gaacggcttc atcttctaca tcccggcgtg gctcacgtcc 3120
aagatcgacc ccagcaccgg cttcgtcaac ctcctgaaga cgaagtacac ctccatcgcc 3180
gacagcaaga agttcatctc cagcttcgac cgcatcatgt atgtgccgga ggaggacctg 3240
ttcgagttcg ccctcgacta caagaacttc tcccgcacgg acgcggacta catcaagaag 3300
tggaagctgt acagctacgg caaccgcatc cgcatcttca ggaaccccaa gaagaacaac 3360
gtcttcgact gggaggaggt gtgcctgacc tccgcgtaca aggagctctt caacaagtac 3420
ggcatcaact accagcaggg cgacatcagg gctctcctgt gcgagcagag cgacaaggcc 3480
ttctactcca gcttcatggc gctgatgtcc ctcatgctgc agatgaggaa ctcgatcacc 3540
ggcaggacgg acgtggactt cctcatctcc ccggtgaaga acagcgacgg catcttctac 3600
gactccagga actacgaggc ccaggagaac gcgatcctcc caaagaacgc ggacgccaac 3660
ggcgcctaca acatcgccag gaaggtcctc tgggctatcg gccagttcaa gaaggcggag 3720
gacgagaagc tggacaaggt gaagatcgcc atcagcaaca aggagtggct cgagtacgcc 3780
cagacctcgg tcaagcacgg cagcccgaag aagaagcgca aggtgtccgg cggcagctcc 3840
ggcggcagcc cgaagaagaa gcgcaaagtg tga 3873
<210> 56
<211> 1290
<212> PRT
<213> Artificial Sequence
<220>
<223> Fusion protein
<400> 56
Met Pro Lys Lys Lys Arg Lys Val Gly Gly Gly Gly Ser Gly Gly Gly
1 5 10 15
Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly
20 25 30
Ser Gly Gly Gly Gly Ser Met Ser Lys Leu Glu Lys Phe Thr Asn Cys
35 40 45
Tyr Ser Leu Ser Lys Thr Leu Arg Phe Lys Ala Ile Pro Val Gly Lys
50 55 60
Thr Gln Glu Asn Ile Asp Asn Lys Arg Leu Leu Val Glu Asp Glu Lys
65 70 75 80
Arg Ala Glu Asp Tyr Lys Gly Val Lys Lys Leu Leu Asp Arg Tyr Tyr
85 90 95
Leu Ser Phe Ile Asn Asp Val Leu His Ser Ile Lys Leu Lys Asn Leu
100 105 110
Asn Asn Tyr Ile Ser Leu Phe Arg Lys Lys Thr Arg Thr Glu Lys Glu
115 120 125
Asn Lys Glu Leu Glu Asn Leu Glu Ile Asn Leu Arg Lys Glu Ile Ala
130 135 140
Lys Ala Phe Lys Gly Asn Glu Gly Tyr Lys Ser Leu Phe Lys Lys Asp
145 150 155 160
Ile Ile Glu Thr Ile Leu Pro Glu Phe Leu Asp Asp Lys Asp Glu Ile
165 170 175
Ala Leu Val Asn Ser Phe Asn Gly Phe Thr Thr Ala Phe Thr Gly Phe
180 185 190
Phe Asp Asn Arg Glu Asn Met Phe Ser Glu Glu Ala Lys Ser Thr Ser
195 200 205
Ile Ala Phe Arg Cys Ile Asn Glu Asn Leu Thr Arg Tyr Ile Ser Asn
210 215 220
Met Asp Ile Phe Glu Lys Val Asp Ala Ile Phe Asp Lys His Glu Val
225 230 235 240
Gln Glu Ile Lys Glu Lys Ile Leu Asn Ser Asp Tyr Asp Val Glu Asp
245 250 255
Phe Phe Glu Gly Glu Phe Phe Asn Phe Val Leu Thr Gln Glu Gly Ile
260 265 270
Asp Val Tyr Asn Ala Ile Ile Gly Gly Phe Val Thr Glu Ser Gly Glu
275 280 285
Lys Ile Lys Gly Leu Asn Glu Tyr Ile Asn Leu Tyr Asn Gln Lys Thr
290 295 300
Lys Gln Lys Leu Pro Lys Phe Lys Pro Leu Tyr Lys Gln Val Leu Ser
305 310 315 320
Asp Arg Glu Ser Leu Ser Phe Tyr Gly Glu Gly Tyr Thr Ser Asp Glu
325 330 335
Glu Val Leu Glu Val Phe Arg Asn Thr Leu Asn Lys Asn Ser Glu Ile
340 345 350
Phe Ser Ser Ile Lys Lys Leu Glu Lys Leu Phe Lys Asn Phe Asp Glu
355 360 365
Tyr Ser Ser Ala Gly Ile Phe Val Lys Asn Gly Pro Ala Ile Ser Thr
370 375 380
Ile Ser Lys Asp Ile Phe Gly Glu Trp Asn Val Ile Arg Asp Lys Trp
385 390 395 400
Asn Ala Glu Tyr Asp Asp Ile His Leu Lys Lys Lys Ala Val Val Thr
405 410 415
Glu Lys Tyr Glu Asp Asp Arg Arg Lys Ser Phe Lys Lys Ile Gly Ser
420 425 430
Phe Ser Leu Glu Gln Leu Gln Glu Tyr Ala Asp Ala Asp Leu Ser Val
435 440 445
Val Glu Lys Leu Lys Glu Ile Ile Ile Gln Lys Val Asp Glu Ile Tyr
450 455 460
Lys Val Tyr Gly Ser Ser Glu Lys Leu Phe Asp Ala Asp Phe Val Leu
465 470 475 480
Glu Lys Ser Leu Lys Lys Asn Asp Ala Val Val Ala Ile Met Lys Asp
485 490 495
Leu Leu Asp Ser Val Lys Ser Phe Glu Asn Tyr Ile Lys Ala Phe Phe
500 505 510
Gly Glu Gly Lys Glu Thr Asn Arg Asp Glu Ser Phe Tyr Gly Asp Phe
515 520 525
Val Leu Ala Tyr Asp Ile Leu Leu Lys Val Asp His Ile Tyr Asp Ala
530 535 540
Ile Arg Asn Tyr Val Thr Gln Lys Pro Tyr Ser Lys Asp Lys Phe Lys
545 550 555 560
Leu Tyr Phe Gln Asn Pro Gln Phe Met Gly Gly Trp Asp Lys Asp Lys
565 570 575
Glu Thr Asp Tyr Arg Ala Thr Ile Leu Arg Tyr Gly Ser Lys Tyr Tyr
580 585 590
Leu Ala Ile Met Asp Lys Lys Tyr Ala Lys Cys Leu Gln Lys Ile Asp
595 600 605
Lys Asp Asp Val Asn Gly Asn Tyr Glu Lys Ile Asn Tyr Lys Leu Leu
610 615 620
Pro Gly Pro Asn Lys Met Leu Pro Lys Val Phe Phe Ser Lys Lys Trp
625 630 635 640
Met Ala Tyr Tyr Asn Pro Ser Glu Asp Ile Gln Lys Ile Tyr Lys Asn
645 650 655
Gly Thr Phe Lys Lys Gly Asp Met Phe Asn Leu Asn Asp Cys His Lys
660 665 670
Leu Ile Asp Phe Phe Lys Asp Ser Ile Ser Arg Tyr Pro Lys Trp Ser
675 680 685
Asn Ala Tyr Asp Phe Asn Phe Ser Glu Thr Glu Lys Tyr Lys Asp Ile
690 695 700
Ala Gly Phe Tyr Arg Glu Val Glu Glu Gln Gly Tyr Lys Val Ser Phe
705 710 715 720
Glu Ser Ala Ser Lys Lys Glu Val Asp Lys Leu Val Glu Glu Gly Lys
725 730 735
Leu Tyr Met Phe Gln Ile Tyr Asn Lys Asp Phe Ser Asp Lys Ser His
740 745 750
Gly Thr Pro Asn Leu His Thr Met Tyr Phe Lys Leu Leu Phe Asp Glu
755 760 765
Asn Asn His Gly Gln Ile Arg Leu Ser Gly Gly Ala Glu Leu Phe Met
770 775 780
Arg Arg Ala Ser Leu Lys Lys Glu Glu Leu Val Val His Pro Ala Asn
785 790 795 800
Ser Pro Ile Ala Asn Lys Asn Pro Asp Asn Pro Lys Lys Thr Thr Thr
805 810 815
Leu Ser Tyr Asp Val Tyr Lys Asp Lys Arg Phe Ser Glu Asp Gln Tyr
820 825 830
Glu Leu His Ile Pro Ile Ala Ile Asn Lys Cys Pro Lys Asn Ile Phe
835 840 845
Lys Ile Asn Thr Glu Val Arg Val Leu Leu Lys His Asp Asp Asn Pro
850 855 860
Tyr Val Ile Gly Ile Asp Arg Gly Glu Arg Asn Leu Leu Tyr Ile Val
865 870 875 880
Val Val Asp Gly Lys Gly Asn Ile Val Glu Gln Tyr Ser Leu Asn Glu
885 890 895
Ile Ile Asn Asn Phe Asn Gly Ile Arg Ile Lys Thr Asp Tyr His Ser
900 905 910
Leu Leu Asp Lys Lys Glu Lys Glu Arg Phe Glu Ala Arg Gln Asn Trp
915 920 925
Thr Ser Ile Glu Asn Ile Lys Glu Leu Lys Ala Gly Tyr Ile Ser Gln
930 935 940
Val Val His Lys Ile Cys Glu Leu Val Glu Lys Tyr Asp Ala Val Ile
945 950 955 960
Ala Leu Glu Asp Leu Asn Ser Gly Phe Lys Asn Ser Arg Val Lys Val
965 970 975
Glu Lys Gln Val Tyr Gln Lys Phe Glu Lys Met Leu Ile Asp Lys Leu
980 985 990
Asn Tyr Met Val Asp Lys Lys Ser Asn Pro Cys Ala Thr Gly Gly Ala
995 1000 1005
Leu Lys Gly Tyr Gln Ile Thr Asn Lys Phe Glu Ser Phe Lys Ser
1010 1015 1020
Met Ser Thr Gln Asn Gly Phe Ile Phe Tyr Ile Pro Ala Trp Leu
1025 1030 1035
Thr Ser Lys Ile Asp Pro Ser Thr Gly Phe Val Asn Leu Leu Lys
1040 1045 1050
Thr Lys Tyr Thr Ser Ile Ala Asp Ser Lys Lys Phe Ile Ser Ser
1055 1060 1065
Phe Asp Arg Ile Met Tyr Val Pro Glu Glu Asp Leu Phe Glu Phe
1070 1075 1080
Ala Leu Asp Tyr Lys Asn Phe Ser Arg Thr Asp Ala Asp Tyr Ile
1085 1090 1095
Lys Lys Trp Lys Leu Tyr Ser Tyr Gly Asn Arg Ile Arg Ile Phe
1100 1105 1110
Arg Asn Pro Lys Lys Asn Asn Val Phe Asp Trp Glu Glu Val Cys
1115 1120 1125
Leu Thr Ser Ala Tyr Lys Glu Leu Phe Asn Lys Tyr Gly Ile Asn
1130 1135 1140
Tyr Gln Gln Gly Asp Ile Arg Ala Leu Leu Cys Glu Gln Ser Asp
1145 1150 1155
Lys Ala Phe Tyr Ser Ser Phe Met Ala Leu Met Ser Leu Met Leu
1160 1165 1170
Gln Met Arg Asn Ser Ile Thr Gly Arg Thr Asp Val Asp Phe Leu
1175 1180 1185
Ile Ser Pro Val Lys Asn Ser Asp Gly Ile Phe Tyr Asp Ser Arg
1190 1195 1200
Asn Tyr Glu Ala Gln Glu Asn Ala Ile Leu Pro Lys Asn Ala Asp
1205 1210 1215
Ala Asn Gly Ala Tyr Asn Ile Ala Arg Lys Val Leu Trp Ala Ile
1220 1225 1230
Gly Gln Phe Lys Lys Ala Glu Asp Glu Lys Leu Asp Lys Val Lys
1235 1240 1245
Ile Ala Ile Ser Asn Lys Glu Trp Leu Glu Tyr Ala Gln Thr Ser
1250 1255 1260
Val Lys His Gly Ser Pro Lys Lys Lys Arg Lys Val Ser Gly Gly
1265 1270 1275
Ser Ser Gly Gly Ser Pro Lys Lys Lys Arg Lys Val
1280 1285 1290
<210> 57
<211> 1491
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 57
Gly Gly Gly Gly Ser Gly Gly Gly Gly Gly Ser Gly Gly Gly Gly Gly
1 5 10 15
Ser Gly Gly Gly Gly Gly Ser Gly Gly Gly Gly Gly Ser Gly Gly Gly
20 25 30
Gly Gly Ser Gly Met Ser Ser Glu Thr Gly Pro Val Ala Val Asp Pro
35 40 45
Thr Leu Arg Arg Arg Ile Glu Pro His Glu Phe Glu Val Phe Phe Asp
50 55 60
Pro Arg Glu Leu Arg Lys Glu Thr Cys Leu Leu Tyr Glu Ile Asn Trp
65 70 75 80
Gly Gly Arg His Ser Ile Trp Arg His Thr Ser Gln Asn Thr Asn Lys
85 90 95
His Val Glu Val Asn Phe Ile Glu Lys Phe Thr Thr Glu Arg Tyr Phe
100 105 110
Cys Pro Asn Thr Arg Cys Ser Ile Thr Trp Phe Leu Ser Trp Ser Pro
115 120 125
Cys Gly Glu Cys Ser Arg Ala Ile Thr Glu Phe Leu Ser Arg Tyr Pro
130 135 140
His Val Thr Leu Phe Ile Tyr Ile Ala Arg Leu Tyr His His Ala Asp
145 150 155 160
Pro Arg Asn Arg Gln Gly Leu Arg Asp Leu Ile Ser Ser Gly Val Thr
165 170 175
Ile Gln Ile Met Thr Glu Gln Glu Ser Gly Tyr Cys Trp Arg Asn Phe
180 185 190
Val Asn Tyr Ser Pro Ser Asn Glu Ala His Trp Pro Arg Tyr Pro His
195 200 205
Leu Trp Val Arg Leu Tyr Val Leu Glu Leu Tyr Cys Ile Ile Leu Gly
210 215 220
Leu Pro Pro Cys Leu Asn Ile Leu Arg Arg Lys Gln Pro Gln Leu Thr
225 230 235 240
Phe Phe Thr Ile Ala Leu Gln Ser Cys His Tyr Gln Arg Leu Pro Pro
245 250 255
His Ile Leu Trp Ala Thr Gly Leu Lys Gly Gly Gly Gly Ser Gly Gly
260 265 270
Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly
275 280 285
Gly Ser Gly Gly Gly Gly Ser Met Ser Lys Leu Glu Lys Phe Thr Asn
290 295 300
Cys Tyr Ser Leu Ser Lys Thr Leu Arg Phe Lys Ala Ile Pro Val Gly
305 310 315 320
Lys Thr Gln Glu Asn Ile Asp Asn Lys Arg Leu Leu Val Glu Asp Glu
325 330 335
Lys Arg Ala Glu Asp Tyr Lys Gly Val Lys Lys Leu Leu Asp Arg Tyr
340 345 350
Tyr Leu Ser Phe Ile Asn Asp Val Leu His Ser Ile Lys Leu Lys Asn
355 360 365
Leu Asn Asn Tyr Ile Ser Leu Phe Arg Lys Lys Thr Arg Thr Glu Lys
370 375 380
Glu Asn Lys Glu Leu Glu Asn Leu Glu Ile Asn Leu Arg Lys Glu Ile
385 390 395 400
Ala Lys Ala Phe Lys Gly Asn Glu Gly Tyr Lys Ser Leu Phe Lys Lys
405 410 415
Asp Ile Ile Glu Thr Ile Leu Pro Glu Phe Leu Asp Asp Lys Asp Glu
420 425 430
Ile Ala Leu Val Asn Ser Phe Asn Gly Phe Thr Thr Ala Phe Thr Gly
435 440 445
Phe Phe Asp Asn Arg Glu Asn Met Phe Ser Glu Glu Ala Lys Ser Thr
450 455 460
Ser Ile Ala Phe Arg Cys Ile Asn Glu Asn Leu Thr Arg Tyr Ile Ser
465 470 475 480
Asn Met Asp Ile Phe Glu Lys Val Asp Ala Ile Phe Asp Lys His Glu
485 490 495
Val Gln Glu Ile Lys Glu Lys Ile Leu Asn Ser Asp Tyr Asp Val Glu
500 505 510
Asp Phe Phe Glu Gly Glu Phe Phe Asn Phe Val Leu Thr Gln Glu Gly
515 520 525
Ile Asp Val Tyr Asn Ala Ile Ile Gly Gly Phe Val Thr Glu Ser Gly
530 535 540
Glu Lys Ile Lys Gly Leu Asn Glu Tyr Ile Asn Leu Tyr Asn Gln Lys
545 550 555 560
Thr Lys Gln Lys Leu Pro Lys Phe Lys Pro Leu Tyr Lys Gln Val Leu
565 570 575
Ser Asp Arg Glu Ser Leu Ser Phe Tyr Gly Glu Gly Tyr Thr Ser Asp
580 585 590
Glu Glu Val Leu Glu Val Phe Arg Asn Thr Leu Asn Lys Asn Ser Glu
595 600 605
Ile Phe Ser Ser Ile Lys Lys Leu Glu Lys Leu Phe Lys Asn Phe Asp
610 615 620
Glu Tyr Ser Ser Ala Gly Ile Phe Val Lys Asn Gly Pro Ala Ile Ser
625 630 635 640
Thr Ile Ser Lys Asp Ile Phe Gly Glu Trp Asn Val Ile Arg Asp Lys
645 650 655
Trp Asn Ala Glu Tyr Asp Asp Ile His Leu Lys Lys Lys Ala Val Val
660 665 670
Thr Glu Lys Tyr Glu Asp Asp Arg Arg Lys Ser Phe Lys Lys Ile Gly
675 680 685
Ser Phe Ser Leu Glu Gln Leu Gln Glu Tyr Ala Asp Ala Asp Leu Ser
690 695 700
Val Val Glu Lys Leu Lys Glu Ile Ile Ile Gln Lys Val Asp Glu Ile
705 710 715 720
Tyr Lys Val Tyr Gly Ser Ser Glu Lys Leu Phe Asp Ala Asp Phe Val
725 730 735
Leu Glu Lys Ser Leu Lys Lys Asn Asp Ala Val Val Ala Ile Met Lys
740 745 750
Asp Leu Leu Asp Ser Val Lys Ser Phe Glu Asn Tyr Ile Lys Ala Phe
755 760 765
Phe Gly Glu Gly Lys Glu Thr Asn Arg Asp Glu Ser Phe Tyr Gly Asp
770 775 780
Phe Val Leu Ala Tyr Asp Ile Leu Leu Lys Val Asp His Ile Tyr Asp
785 790 795 800
Ala Ile Arg Asn Tyr Val Thr Gln Lys Pro Tyr Ser Lys Asp Lys Phe
805 810 815
Lys Leu Tyr Phe Gln Asn Pro Gln Phe Met Gly Gly Trp Asp Lys Asp
820 825 830
Lys Glu Thr Asp Tyr Arg Ala Thr Ile Leu Arg Tyr Gly Ser Lys Tyr
835 840 845
Tyr Leu Ala Ile Met Asp Lys Lys Tyr Ala Lys Cys Leu Gln Lys Ile
850 855 860
Asp Lys Asp Asp Val Asn Gly Asn Tyr Glu Lys Ile Asn Tyr Lys Leu
865 870 875 880
Leu Pro Gly Pro Asn Lys Met Leu Pro Lys Val Phe Phe Ser Lys Lys
885 890 895
Trp Met Ala Tyr Tyr Asn Pro Ser Glu Asp Ile Gln Lys Ile Tyr Lys
900 905 910
Asn Gly Thr Phe Lys Lys Gly Asp Met Phe Asn Leu Asn Asp Cys His
915 920 925
Lys Leu Ile Asp Phe Phe Lys Asp Ser Ile Ser Arg Tyr Pro Lys Trp
930 935 940
Ser Asn Ala Tyr Asp Phe Asn Phe Ser Glu Thr Glu Lys Tyr Lys Asp
945 950 955 960
Ile Ala Gly Phe Tyr Arg Glu Val Glu Glu Gln Gly Tyr Lys Val Ser
965 970 975
Phe Glu Ser Ala Ser Lys Lys Glu Val Asp Lys Leu Val Glu Glu Gly
980 985 990
Lys Leu Tyr Met Phe Gln Ile Tyr Asn Lys Asp Phe Ser Asp Lys Ser
995 1000 1005
His Gly Thr Pro Asn Leu His Thr Met Tyr Phe Lys Leu Leu Phe
1010 1015 1020
Asp Glu Asn Asn His Gly Gln Ile Arg Leu Ser Gly Gly Ala Glu
1025 1030 1035
Leu Phe Met Arg Arg Ala Ser Leu Lys Lys Glu Glu Leu Val Val
1040 1045 1050
His Pro Ala Asn Ser Pro Ile Ala Asn Lys Asn Pro Asp Asn Pro
1055 1060 1065
Lys Lys Thr Thr Thr Leu Ser Tyr Asp Val Tyr Lys Asp Lys Arg
1070 1075 1080
Phe Ser Glu Asp Gln Tyr Glu Leu His Ile Pro Ile Ala Ile Asn
1085 1090 1095
Lys Cys Pro Lys Asn Ile Phe Lys Ile Asn Thr Glu Val Arg Val
1100 1105 1110
Leu Leu Lys His Asp Asp Asn Pro Tyr Val Ile Gly Ile Ala Arg
1115 1120 1125
Gly Glu Arg Asn Leu Leu Tyr Ile Val Val Val Asp Gly Lys Gly
1130 1135 1140
Asn Ile Val Glu Gln Tyr Ser Leu Asn Glu Ile Ile Asn Asn Phe
1145 1150 1155
Asn Gly Ile Arg Ile Lys Thr Asp Tyr His Ser Leu Leu Asp Lys
1160 1165 1170
Lys Glu Lys Glu Arg Phe Glu Ala Arg Gln Asn Trp Thr Ser Ile
1175 1180 1185
Glu Asn Ile Lys Glu Leu Lys Ala Gly Tyr Ile Ser Gln Val Val
1190 1195 1200
His Lys Ile Cys Glu Leu Val Glu Lys Tyr Asp Ala Val Ile Ala
1205 1210 1215
Leu Ala Asp Leu Asn Ser Gly Phe Lys Asn Ser Arg Val Lys Val
1220 1225 1230
Glu Lys Gln Val Tyr Gln Lys Phe Glu Lys Met Leu Ile Asp Lys
1235 1240 1245
Leu Asn Tyr Met Val Asp Lys Lys Ser Asn Pro Cys Ala Thr Gly
1250 1255 1260
Gly Ala Leu Lys Gly Tyr Gln Ile Thr Asn Lys Phe Glu Ser Phe
1265 1270 1275
Lys Ser Met Ser Thr Gln Asn Gly Phe Ile Phe Tyr Ile Pro Ala
1280 1285 1290
Trp Leu Thr Ser Lys Ile Asp Pro Ser Thr Gly Phe Val Asn Leu
1295 1300 1305
Leu Lys Thr Lys Tyr Thr Ser Ile Ala Asp Ser Lys Lys Phe Ile
1310 1315 1320
Ser Ser Phe Asp Arg Ile Met Tyr Val Pro Glu Glu Asp Leu Phe
1325 1330 1335
Glu Phe Ala Leu Asp Tyr Lys Asn Phe Ser Arg Thr Asp Ala Asp
1340 1345 1350
Tyr Ile Lys Lys Trp Lys Leu Tyr Ser Tyr Gly Asn Arg Ile Arg
1355 1360 1365
Ile Phe Arg Asn Pro Lys Lys Asn Asn Val Phe Asp Trp Glu Glu
1370 1375 1380
Val Cys Leu Thr Ser Ala Tyr Lys Glu Leu Phe Asn Lys Tyr Gly
1385 1390 1395
Ile Asn Tyr Gln Gln Gly Asp Ile Arg Ala Leu Leu Cys Glu Gln
1400 1405 1410
Ser Asp Lys Ala Phe Tyr Ser Ser Phe Met Ala Leu Met Ser Leu
1415 1420 1425
Met Leu Gln Met Arg Asn Ser Ile Thr Gly Arg Thr Asp Val Ala
1430 1435 1440
Phe Leu Ile Ser Pro Val Lys Asn Ser Asp Gly Ile Phe Tyr Asp
1445 1450 1455
Ser Arg Asn Tyr Glu Ala Gln Glu Asn Ala Ile Leu Pro Lys Asn
1460 1465 1470
Ala Asp Ala Asn Gly Ala Tyr Asn Ile Ala Arg Lys Val Leu Trp
1475 1480 1485
Ala Ile Gly
1490
<210> 58
<211> 1662
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 58
Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly
1 5 10 15
Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Met Pro
20 25 30
Lys Lys Lys Arg Lys Val Met Ser Ser Glu Thr Gly Pro Val Ala Val
35 40 45
Asp Pro Thr Leu Arg Arg Arg Ile Glu Pro His Glu Phe Glu Val Phe
50 55 60
Phe Asp Pro Arg Glu Leu Arg Lys Glu Thr Cys Leu Leu Tyr Glu Ile
65 70 75 80
Asn Trp Gly Gly Arg His Ser Ile Trp Arg His Thr Ser Gln Asn Thr
85 90 95
Asn Lys His Val Glu Val Asn Phe Ile Glu Lys Phe Thr Thr Glu Arg
100 105 110
Tyr Phe Cys Pro Asn Thr Arg Cys Ser Ile Thr Trp Phe Leu Ser Trp
115 120 125
Ser Pro Cys Gly Glu Cys Ser Arg Ala Ile Thr Glu Phe Leu Ser Arg
130 135 140
Tyr Pro His Val Thr Leu Phe Ile Tyr Ile Ala Arg Leu Tyr His His
145 150 155 160
Ala Asp Pro Arg Asn Arg Gln Gly Leu Arg Asp Leu Ile Ser Ser Gly
165 170 175
Val Thr Ile Gln Ile Met Thr Glu Gln Glu Ser Gly Tyr Cys Trp Arg
180 185 190
Asn Phe Val Asn Tyr Ser Pro Ser Asn Glu Ala His Trp Pro Arg Tyr
195 200 205
Pro His Leu Trp Val Arg Leu Tyr Val Leu Glu Leu Tyr Cys Ile Ile
210 215 220
Leu Gly Leu Pro Pro Cys Leu Asn Ile Leu Arg Arg Lys Gln Pro Gln
225 230 235 240
Leu Thr Phe Phe Thr Ile Ala Leu Gln Ser Cys His Tyr Gln Arg Leu
245 250 255
Pro Pro His Ile Leu Trp Ala Thr Gly Leu Lys Gly Gly Gly Gly Ser
260 265 270
Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly
275 280 285
Gly Gly Gly Ser Gly Gly Gly Gly Ser Met Leu Phe Gln Asp Phe Thr
290 295 300
His Leu Tyr Pro Leu Ser Lys Thr Val Arg Phe Glu Leu Lys Pro Ile
305 310 315 320
Gly Arg Thr Leu Glu His Ile His Ala Lys Asn Phe Leu Ser Gln Asp
325 330 335
Glu Thr Met Ala Asp Met Tyr Gln Lys Val Lys Val Ile Leu Asp Asp
340 345 350
Tyr His Arg Asp Phe Ile Ala Asp Met Met Gly Glu Val Lys Leu Thr
355 360 365
Lys Leu Ala Glu Phe Tyr Asp Val Tyr Leu Lys Phe Arg Lys Asn Pro
370 375 380
Lys Asp Asp Gly Leu Gln Lys Gln Leu Lys Asp Leu Gln Ala Val Leu
385 390 395 400
Arg Lys Glu Ser Val Lys Pro Ile Gly Ser Gly Gly Lys Tyr Lys Thr
405 410 415
Gly Tyr Asp Arg Leu Phe Gly Ala Lys Leu Phe Lys Asp Gly Lys Glu
420 425 430
Leu Gly Asp Leu Ala Lys Phe Val Ile Ala Gln Glu Gly Glu Ser Ser
435 440 445
Pro Lys Leu Ala His Leu Ala His Phe Glu Lys Phe Ser Thr Tyr Phe
450 455 460
Thr Gly Phe His Asp Asn Arg Lys Asn Met Tyr Ser Asp Glu Asp Lys
465 470 475 480
His Thr Ala Ile Ala Tyr Arg Leu Ile His Glu Asn Leu Pro Arg Phe
485 490 495
Ile Asp Asn Leu Gln Ile Leu Thr Thr Ile Lys Gln Lys His Ser Ala
500 505 510
Leu Tyr Asp Gln Ile Ile Asn Glu Leu Thr Ala Ser Gly Leu Asp Val
515 520 525
Ser Leu Ala Ser His Leu Asp Gly Tyr His Lys Leu Leu Thr Gln Glu
530 535 540
Gly Ile Thr Ala Tyr Asn Arg Ile Ile Gly Glu Val Asn Gly Tyr Thr
545 550 555 560
Asn Lys His Asn Gln Ile Cys His Lys Ser Glu Arg Ile Ala Lys Leu
565 570 575
Arg Pro Leu His Lys Gln Ile Leu Ser Asp Gly Met Gly Val Ser Phe
580 585 590
Leu Pro Ser Lys Phe Ala Asp Asp Ser Glu Met Cys Gln Ala Val Asn
595 600 605
Glu Phe Tyr Arg His Tyr Thr Asp Val Phe Ala Lys Val Gln Ser Leu
610 615 620
Phe Asp Gly Phe Asp Asp His Gln Lys Asp Gly Ile Tyr Val Glu His
625 630 635 640
Lys Asn Leu Asn Glu Leu Ser Lys Gln Ala Phe Gly Asp Phe Ala Leu
645 650 655
Leu Gly Arg Val Leu Asp Gly Tyr Tyr Val Asp Val Val Asn Pro Glu
660 665 670
Phe Asn Glu Arg Phe Ala Lys Ala Lys Thr Asp Asn Ala Lys Ala Lys
675 680 685
Leu Thr Lys Glu Lys Asp Lys Phe Ile Lys Gly Val His Ser Leu Ala
690 695 700
Ser Leu Glu Gln Ala Ile Glu His His Thr Ala Arg His Asp Asp Glu
705 710 715 720
Ser Val Gln Ala Gly Lys Leu Gly Gln Tyr Phe Lys His Gly Leu Ala
725 730 735
Gly Val Asp Asn Pro Ile Gln Lys Ile His Asn Asn His Ser Thr Ile
740 745 750
Lys Gly Phe Leu Glu Arg Glu Arg Pro Ala Gly Glu Arg Ala Leu Pro
755 760 765
Lys Ile Lys Ser Gly Lys Asn Pro Glu Met Thr Gln Leu Arg Gln Leu
770 775 780
Lys Glu Leu Leu Asp Asn Ala Leu Asn Val Ala His Phe Ala Lys Leu
785 790 795 800
Leu Thr Thr Lys Thr Thr Leu Asp Asn Gln Asp Gly Asn Phe Tyr Gly
805 810 815
Glu Phe Gly Val Leu Tyr Asp Glu Leu Ala Lys Ile Pro Thr Leu Tyr
820 825 830
Asn Lys Val Arg Asp Tyr Leu Ser Gln Lys Pro Phe Ser Thr Glu Lys
835 840 845
Tyr Lys Leu Asn Phe Gly Asn Pro Thr Leu Leu Asn Gly Trp Asp Leu
850 855 860
Asn Lys Glu Lys Asp Asn Phe Gly Val Ile Leu Gln Lys Asp Gly Cys
865 870 875 880
Tyr Tyr Leu Ala Leu Leu Asp Lys Ala His Lys Lys Val Phe Asp Asn
885 890 895
Ala Pro Asn Thr Gly Lys Asn Val Tyr Gln Lys Met Val Tyr Lys Leu
900 905 910
Leu Pro Gly Pro Asn Lys Met Leu Pro Lys Val Phe Phe Ala Lys Ser
915 920 925
Asn Leu Asp Tyr Tyr Asn Pro Ser Ala Glu Leu Leu Asp Lys Tyr Ala
930 935 940
Lys Gly Thr His Lys Lys Gly Asp Asn Phe Asn Leu Lys Asp Cys His
945 950 955 960
Ala Leu Ile Asp Phe Phe Lys Ala Gly Ile Asn Lys His Pro Glu Trp
965 970 975
Gln His Phe Gly Phe Lys Phe Ser Pro Thr Ser Ser Tyr Arg Asp Leu
980 985 990
Ser Asp Phe Tyr Arg Glu Val Glu Pro Gln Gly Tyr Gln Val Lys Phe
995 1000 1005
Val Asp Ile Asn Ala Asp Tyr Ile Asp Glu Leu Val Glu Gln Gly
1010 1015 1020
Lys Leu Tyr Leu Phe Gln Ile Tyr Asn Lys Asp Phe Ser Pro Lys
1025 1030 1035
Ala His Gly Lys Pro Asn Leu His Thr Leu Tyr Phe Lys Ala Leu
1040 1045 1050
Phe Ser Glu Asp Asn Leu Ala Asp Pro Ile Tyr Lys Leu Asn Gly
1055 1060 1065
Glu Ala Gln Ile Phe Tyr Arg Lys Ala Ser Leu Asp Met Asn Glu
1070 1075 1080
Thr Thr Ile His Arg Ala Gly Glu Val Leu Glu Asn Lys Asn Pro
1085 1090 1095
Asp Asn Pro Lys Lys Arg Gln Phe Val Tyr Asp Ile Ile Lys Asp
1100 1105 1110
Lys Arg Tyr Thr Gln Asp Lys Phe Met Leu His Val Pro Ile Thr
1115 1120 1125
Met Asn Phe Gly Val Gln Gly Met Thr Ile Lys Glu Phe Asn Lys
1130 1135 1140
Lys Val Asn Gln Ser Ile Gln Gln Tyr Asp Glu Val Asn Val Ile
1145 1150 1155
Gly Ile Ala Arg Gly Glu Arg His Leu Leu Tyr Leu Thr Val Ile
1160 1165 1170
Asn Ser Lys Gly Glu Ile Leu Glu Gln Arg Ser Leu Asn Asp Ile
1175 1180 1185
Thr Thr Ala Ser Ala Asn Gly Thr Gln Val Thr Thr Pro Tyr His
1190 1195 1200
Lys Ile Leu Asp Lys Arg Glu Ile Glu Arg Leu Asn Ala Arg Val
1205 1210 1215
Gly Trp Gly Glu Ile Glu Thr Ile Lys Glu Leu Lys Ser Gly Tyr
1220 1225 1230
Leu Ser His Val Val His Gln Ile Asn Gln Leu Met Leu Lys Tyr
1235 1240 1245
Asn Ala Ile Val Val Leu Ala Asp Leu Asn Phe Gly Phe Lys Arg
1250 1255 1260
Gly Arg Phe Lys Val Glu Lys Gln Ile Tyr Gln Asn Phe Glu Asn
1265 1270 1275
Ala Leu Ile Lys Lys Leu Asn His Leu Val Leu Lys Asp Lys Ala
1280 1285 1290
Asp Asp Glu Ile Gly Ser Tyr Lys Asn Ala Leu Gln Leu Thr Asn
1295 1300 1305
Asn Phe Thr Asp Leu Lys Ser Ile Gly Lys Gln Thr Gly Phe Leu
1310 1315 1320
Phe Tyr Val Pro Ala Trp Asn Thr Ser Lys Ile Asp Pro Glu Thr
1325 1330 1335
Gly Phe Val Asp Leu Leu Lys Pro Arg Tyr Glu Asn Ile Ala Gln
1340 1345 1350
Ser Gln Ala Phe Phe Gly Lys Phe Asp Lys Ile Cys Tyr Asn Thr
1355 1360 1365
Asp Lys Gly Tyr Phe Glu Phe His Ile Asp Tyr Ala Lys Phe Thr
1370 1375 1380
Asp Lys Ala Lys Asn Ser Arg Gln Lys Trp Ala Ile Cys Ser His
1385 1390 1395
Gly Asp Lys Arg Tyr Val Tyr Asp Lys Thr Ala Asn Gln Asn Lys
1400 1405 1410
Gly Ala Ala Lys Gly Ile Asn Val Asn Asp Glu Leu Lys Ser Leu
1415 1420 1425
Phe Ala Arg Tyr His Ile Asn Asp Lys Gln Pro Asn Leu Val Met
1430 1435 1440
Asp Ile Cys Gln Asn Asn Asp Lys Glu Phe His Lys Ser Leu Met
1445 1450 1455
Cys Leu Leu Lys Thr Leu Leu Ala Leu Arg Tyr Ser Asn Ala Ser
1460 1465 1470
Ser Asp Glu Ala Phe Ile Leu Ser Pro Val Ala Asn Asp Glu Gly
1475 1480 1485
Val Phe Phe Asn Ser Ala Leu Ala Asp Asp Thr Gln Pro Gln Asn
1490 1495 1500
Ala Asp Ala Asn Gly Ala Tyr His Ile Ala Leu Lys Gly Leu Trp
1505 1510 1515
Leu Leu Asn Glu Leu Lys Asn Ser Asp Asp Leu Asn Lys Val Lys
1520 1525 1530
Leu Ala Ile Asp Asn Gln Thr Trp Leu Asn Phe Ala Gln Asn Arg
1535 1540 1545
Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys
1550 1555 1560
Lys Ser Gly Gly Ser Thr Asn Leu Ser Asp Ile Ile Glu Lys Glu
1565 1570 1575
Thr Gly Lys Gln Leu Val Ile Gln Glu Ser Ile Leu Met Leu Pro
1580 1585 1590
Glu Glu Val Glu Glu Val Ile Gly Asn Lys Pro Glu Ser Asp Ile
1595 1600 1605
Leu Val His Thr Ala Tyr Asp Glu Ser Thr Asp Glu Asn Val Met
1610 1615 1620
Leu Leu Thr Ser Asp Ala Pro Glu Tyr Lys Pro Trp Ala Leu Val
1625 1630 1635
Ile Gln Asp Ser Asn Gly Glu Asn Lys Ile Lys Met Leu Ser Gly
1640 1645 1650
Gly Ser Pro Lys Lys Lys Arg Lys Val
1655 1660
<210> 59
<211> 1267
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 59
Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser
1 5 10 15
Met Leu Phe Gln Asp Phe Thr His Leu Tyr Pro Leu Ser Lys Thr Val
20 25 30
Arg Phe Glu Leu Lys Pro Ile Gly Arg Thr Leu Glu His Ile His Ala
35 40 45
Lys Asn Phe Leu Ser Gln Asp Glu Thr Met Ala Asp Met Tyr Gln Lys
50 55 60
Val Lys Val Ile Leu Asp Asp Tyr His Arg Asp Phe Ile Ala Asp Met
65 70 75 80
Met Gly Glu Val Lys Leu Thr Lys Leu Ala Glu Phe Tyr Asp Val Tyr
85 90 95
Leu Lys Phe Arg Lys Asn Pro Lys Asp Asp Gly Leu Gln Lys Gln Leu
100 105 110
Lys Asp Leu Gln Ala Val Leu Arg Lys Glu Ser Val Lys Pro Ile Gly
115 120 125
Ser Gly Gly Lys Tyr Lys Thr Gly Tyr Asp Arg Leu Phe Gly Ala Lys
130 135 140
Leu Phe Lys Asp Gly Lys Glu Leu Gly Asp Leu Ala Lys Phe Val Ile
145 150 155 160
Ala Gln Glu Gly Glu Ser Ser Pro Lys Leu Ala His Leu Ala His Phe
165 170 175
Glu Lys Phe Ser Thr Tyr Phe Thr Gly Phe His Asp Asn Arg Lys Asn
180 185 190
Met Tyr Ser Asp Glu Asp Lys His Thr Ala Ile Ala Tyr Arg Leu Ile
195 200 205
His Glu Asn Leu Pro Arg Phe Ile Asp Asn Leu Gln Ile Leu Thr Thr
210 215 220
Ile Lys Gln Lys His Ser Ala Leu Tyr Asp Gln Ile Ile Asn Glu Leu
225 230 235 240
Thr Ala Ser Gly Leu Asp Val Ser Leu Ala Ser His Leu Asp Gly Tyr
245 250 255
His Lys Leu Leu Thr Gln Glu Gly Ile Thr Ala Tyr Asn Arg Ile Ile
260 265 270
Gly Glu Val Asn Gly Tyr Thr Asn Lys His Asn Gln Ile Cys His Lys
275 280 285
Ser Glu Arg Ile Ala Lys Leu Arg Pro Leu His Lys Gln Ile Leu Ser
290 295 300
Asp Gly Met Gly Val Ser Phe Leu Pro Ser Lys Phe Ala Asp Asp Ser
305 310 315 320
Glu Met Cys Gln Ala Val Asn Glu Phe Tyr Arg His Tyr Thr Asp Val
325 330 335
Phe Ala Lys Val Gln Ser Leu Phe Asp Gly Phe Asp Asp His Gln Lys
340 345 350
Asp Gly Ile Tyr Val Glu His Lys Asn Leu Asn Glu Leu Ser Lys Gln
355 360 365
Ala Phe Gly Asp Phe Ala Leu Leu Gly Arg Val Leu Asp Gly Tyr Tyr
370 375 380
Val Asp Val Val Asn Pro Glu Phe Asn Glu Arg Phe Ala Lys Ala Lys
385 390 395 400
Thr Asp Asn Ala Lys Ala Lys Leu Thr Lys Glu Lys Asp Lys Phe Ile
405 410 415
Lys Gly Val His Ser Leu Ala Ser Leu Glu Gln Ala Ile Glu His His
420 425 430
Thr Ala Arg His Asp Asp Glu Ser Val Gln Ala Gly Lys Leu Gly Gln
435 440 445
Tyr Phe Lys His Gly Leu Ala Gly Val Asp Asn Pro Ile Gln Lys Ile
450 455 460
His Asn Asn His Ser Thr Ile Lys Gly Phe Leu Glu Arg Glu Arg Pro
465 470 475 480
Ala Gly Glu Arg Ala Leu Pro Lys Ile Lys Ser Gly Lys Asn Pro Glu
485 490 495
Met Thr Gln Leu Arg Gln Leu Lys Glu Leu Leu Asp Asn Ala Leu Asn
500 505 510
Val Ala His Phe Ala Lys Leu Leu Thr Thr Lys Thr Thr Leu Asp Asn
515 520 525
Gln Asp Gly Asn Phe Tyr Gly Glu Phe Gly Val Leu Tyr Asp Glu Leu
530 535 540
Ala Lys Ile Pro Thr Leu Tyr Asn Lys Val Arg Asp Tyr Leu Ser Gln
545 550 555 560
Lys Pro Phe Ser Thr Glu Lys Tyr Lys Leu Asn Phe Gly Asn Pro Thr
565 570 575
Leu Leu Asn Gly Trp Asp Leu Asn Lys Glu Lys Asp Asn Phe Gly Val
580 585 590
Ile Leu Gln Lys Asp Gly Cys Tyr Tyr Leu Ala Leu Leu Asp Lys Ala
595 600 605
His Lys Lys Val Phe Asp Asn Ala Pro Asn Thr Gly Lys Asn Val Tyr
610 615 620
Gln Lys Met Val Tyr Lys Leu Leu Pro Gly Pro Asn Lys Met Leu Pro
625 630 635 640
Lys Val Phe Phe Ala Lys Ser Asn Leu Asp Tyr Tyr Asn Pro Ser Ala
645 650 655
Glu Leu Leu Asp Lys Tyr Ala Lys Gly Thr His Lys Lys Gly Asp Asn
660 665 670
Phe Asn Leu Lys Asp Cys His Ala Leu Ile Asp Phe Phe Lys Ala Gly
675 680 685
Ile Asn Lys His Pro Glu Trp Gln His Phe Gly Phe Lys Phe Ser Pro
690 695 700
Thr Ser Ser Tyr Arg Asp Leu Ser Asp Phe Tyr Arg Glu Val Glu Pro
705 710 715 720
Gln Gly Tyr Gln Val Lys Phe Val Asp Ile Asn Ala Asp Tyr Ile Asp
725 730 735
Glu Leu Val Glu Gln Gly Lys Leu Tyr Leu Phe Gln Ile Tyr Asn Lys
740 745 750
Asp Phe Ser Pro Lys Ala His Gly Lys Pro Asn Leu His Thr Leu Tyr
755 760 765
Phe Lys Ala Leu Phe Ser Glu Asp Asn Leu Ala Asp Pro Ile Tyr Lys
770 775 780
Leu Asn Gly Glu Ala Gln Ile Phe Tyr Arg Lys Ala Ser Leu Asp Met
785 790 795 800
Asn Glu Thr Thr Ile His Arg Ala Gly Glu Val Leu Glu Asn Lys Asn
805 810 815
Pro Asp Asn Pro Lys Lys Arg Gln Phe Val Tyr Asp Ile Ile Lys Asp
820 825 830
Lys Arg Tyr Thr Gln Asp Lys Phe Met Leu His Val Pro Ile Thr Met
835 840 845
Asn Phe Gly Val Gln Gly Met Thr Ile Lys Glu Phe Asn Lys Lys Val
850 855 860
Asn Gln Ser Ile Gln Gln Tyr Asp Glu Val Asn Val Ile Gly Ile Asp
865 870 875 880
Arg Gly Glu Arg His Leu Leu Tyr Leu Thr Val Ile Asn Ser Lys Gly
885 890 895
Glu Ile Leu Glu Gln Arg Ser Leu Asn Asp Ile Thr Thr Ala Ser Ala
900 905 910
Asn Gly Thr Gln Val Thr Thr Pro Tyr His Lys Ile Leu Asp Lys Arg
915 920 925
Glu Ile Glu Arg Leu Asn Ala Arg Val Gly Trp Gly Glu Ile Glu Thr
930 935 940
Ile Lys Glu Leu Lys Ser Gly Tyr Leu Ser His Val Val His Gln Ile
945 950 955 960
Asn Gln Leu Met Leu Lys Tyr Asn Ala Ile Val Val Leu Glu Asp Leu
965 970 975
Asn Phe Gly Phe Lys Arg Gly Arg Phe Lys Val Glu Lys Gln Ile Tyr
980 985 990
Gln Asn Phe Glu Asn Ala Leu Ile Lys Lys Leu Asn His Leu Val Leu
995 1000 1005
Lys Asp Lys Ala Asp Asp Glu Ile Gly Ser Tyr Lys Asn Ala Leu
1010 1015 1020
Gln Leu Thr Asn Asn Phe Thr Asp Leu Lys Ser Ile Gly Lys Gln
1025 1030 1035
Thr Gly Phe Leu Phe Tyr Val Pro Ala Trp Asn Thr Ser Lys Ile
1040 1045 1050
Asp Pro Glu Thr Gly Phe Val Asp Leu Leu Lys Pro Arg Tyr Glu
1055 1060 1065
Asn Ile Ala Gln Ser Gln Ala Phe Phe Gly Lys Phe Asp Lys Ile
1070 1075 1080
Cys Tyr Asn Thr Asp Lys Gly Tyr Phe Glu Phe His Ile Asp Tyr
1085 1090 1095
Ala Lys Phe Thr Asp Lys Ala Lys Asn Ser Arg Gln Lys Trp Ala
1100 1105 1110
Ile Cys Ser His Gly Asp Lys Arg Tyr Val Tyr Asp Lys Thr Ala
1115 1120 1125
Asn Gln Asn Lys Gly Ala Ala Lys Gly Ile Asn Val Asn Asp Glu
1130 1135 1140
Leu Lys Ser Leu Phe Ala Arg Tyr His Ile Asn Asp Lys Gln Pro
1145 1150 1155
Asn Leu Val Met Asp Ile Cys Gln Asn Asn Asp Lys Glu Phe His
1160 1165 1170
Lys Ser Leu Met Cys Leu Leu Lys Thr Leu Leu Ala Leu Arg Tyr
1175 1180 1185
Ser Asn Ala Ser Ser Asp Glu Asp Phe Ile Leu Ser Pro Val Ala
1190 1195 1200
Asn Asp Glu Gly Val Phe Phe Asn Ser Ala Leu Ala Asp Asp Thr
1205 1210 1215
Gln Pro Gln Asn Ala Asp Ala Asn Gly Ala Tyr His Ile Ala Leu
1220 1225 1230
Lys Gly Leu Trp Leu Leu Asn Glu Leu Lys Asn Ser Asp Asp Leu
1235 1240 1245
Asn Lys Val Lys Leu Ala Ile Asp Asn Gln Thr Trp Leu Asn Phe
1250 1255 1260
Ala Gln Asn Arg
1265
<210> 60
<211> 1368
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 60
Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser
1 5 10 15
Met Thr Gln Phe Glu Gly Phe Thr Asn Leu Tyr Gln Val Ser Lys Thr
20 25 30
Leu Arg Phe Glu Leu Ile Pro Gln Gly Lys Thr Leu Lys His Ile Gln
35 40 45
Glu Gln Gly Phe Ile Glu Glu Asp Lys Ala Arg Asn Asp His Tyr Lys
50 55 60
Glu Leu Lys Pro Ile Ile Asp Arg Ile Tyr Lys Thr Tyr Ala Asp Gln
65 70 75 80
Cys Leu Gln Leu Val Gln Leu Asp Trp Glu Asn Leu Ser Ala Ala Ile
85 90 95
Asp Ser Tyr Arg Lys Glu Lys Thr Glu Glu Thr Arg Asn Ala Leu Ile
100 105 110
Glu Glu Gln Ala Thr Tyr Arg Asn Ala Ile His Asp Tyr Phe Ile Gly
115 120 125
Arg Thr Asp Asn Leu Thr Asp Ala Ile Asn Lys Arg His Ala Glu Ile
130 135 140
Tyr Lys Gly Leu Phe Lys Ala Glu Leu Phe Asn Gly Lys Val Leu Lys
145 150 155 160
Gln Leu Gly Thr Val Thr Thr Thr Glu His Glu Asn Ala Leu Leu Arg
165 170 175
Ser Phe Asp Lys Phe Thr Thr Tyr Phe Ser Gly Phe Tyr Glu Asn Arg
180 185 190
Lys Asn Val Phe Ser Ala Glu Asp Ile Ser Thr Ala Ile Pro His Arg
195 200 205
Ile Val Gln Asp Asn Phe Pro Lys Phe Lys Glu Asn Cys His Ile Phe
210 215 220
Thr Arg Leu Ile Thr Ala Val Pro Ser Leu Arg Glu His Phe Glu Asn
225 230 235 240
Val Lys Lys Ala Ile Gly Ile Phe Val Ser Thr Ser Ile Glu Glu Val
245 250 255
Phe Ser Phe Pro Phe Tyr Asn Gln Leu Leu Thr Gln Thr Gln Ile Asp
260 265 270
Leu Tyr Asn Gln Leu Leu Gly Gly Ile Ser Arg Glu Ala Gly Thr Glu
275 280 285
Lys Ile Lys Gly Leu Asn Glu Val Leu Asn Leu Ala Ile Gln Lys Asn
290 295 300
Asp Glu Thr Ala His Ile Ile Ala Ser Leu Pro His Arg Phe Ile Pro
305 310 315 320
Leu Phe Lys Gln Ile Leu Ser Asp Arg Asn Thr Leu Ser Phe Ile Leu
325 330 335
Glu Glu Phe Lys Ser Asp Glu Glu Val Ile Gln Ser Phe Cys Lys Tyr
340 345 350
Lys Thr Leu Leu Arg Asn Glu Asn Val Leu Glu Thr Ala Glu Ala Leu
355 360 365
Phe Asn Glu Leu Asn Ser Ile Asp Leu Thr His Ile Phe Ile Ser His
370 375 380
Lys Lys Leu Glu Thr Ile Ser Ser Ala Leu Cys Asp His Trp Asp Thr
385 390 395 400
Leu Arg Asn Ala Leu Tyr Glu Arg Arg Ile Ser Glu Leu Thr Gly Lys
405 410 415
Ile Thr Lys Ser Ala Lys Glu Lys Val Gln Arg Ser Leu Lys His Glu
420 425 430
Asp Ile Asn Leu Gln Glu Ile Ile Ser Ala Ala Gly Lys Glu Leu Ser
435 440 445
Glu Ala Phe Lys Gln Lys Thr Ser Glu Ile Leu Ser His Ala His Ala
450 455 460
Ala Leu Asp Gln Pro Leu Pro Thr Thr Leu Lys Lys Gln Glu Glu Lys
465 470 475 480
Glu Ile Leu Lys Ser Gln Leu Asp Ser Leu Leu Gly Leu Tyr His Leu
485 490 495
Leu Asp Trp Phe Ala Val Asp Glu Ser Asn Glu Val Asp Pro Glu Phe
500 505 510
Ser Ala Arg Leu Thr Gly Ile Lys Leu Glu Met Glu Pro Ser Leu Ser
515 520 525
Phe Tyr Asn Lys Ala Arg Asn Tyr Ala Thr Lys Lys Pro Tyr Ser Val
530 535 540
Glu Lys Phe Lys Leu Asn Phe Gln Met Pro Thr Leu Ala Ser Gly Trp
545 550 555 560
Asp Val Asn Lys Glu Lys Asn Asn Gly Ala Ile Leu Phe Val Lys Asn
565 570 575
Gly Leu Tyr Tyr Leu Gly Ile Met Pro Lys Gln Lys Gly Arg Tyr Lys
580 585 590
Ala Leu Ser Phe Glu Pro Thr Glu Lys Thr Ser Glu Gly Phe Asp Lys
595 600 605
Met Tyr Tyr Asp Tyr Phe Pro Asp Ala Ala Lys Met Ile Pro Lys Cys
610 615 620
Ser Thr Gln Leu Lys Ala Val Thr Ala His Phe Gln Thr His Thr Thr
625 630 635 640
Pro Ile Leu Leu Ser Asn Asn Phe Ile Glu Pro Leu Glu Ile Thr Lys
645 650 655
Glu Ile Tyr Asp Leu Asn Asn Pro Glu Lys Glu Pro Lys Lys Phe Gln
660 665 670
Thr Ala Tyr Ala Lys Lys Thr Gly Asp Gln Lys Gly Tyr Arg Glu Ala
675 680 685
Leu Cys Lys Trp Ile Asp Phe Thr Arg Asp Phe Leu Ser Lys Tyr Thr
690 695 700
Lys Thr Thr Ser Ile Asp Leu Ser Ser Leu Arg Pro Ser Ser Gln Tyr
705 710 715 720
Lys Asp Leu Gly Glu Tyr Tyr Ala Glu Leu Asn Pro Leu Leu Tyr His
725 730 735
Ile Ser Phe Gln Arg Ile Ala Glu Lys Glu Ile Met Asp Ala Val Glu
740 745 750
Thr Gly Lys Leu Tyr Leu Phe Gln Ile Tyr Asn Lys Asp Phe Ala Lys
755 760 765
Gly His His Gly Lys Pro Asn Leu His Thr Leu Tyr Trp Thr Gly Leu
770 775 780
Phe Ser Pro Glu Asn Leu Ala Lys Thr Ser Ile Lys Leu Asn Gly Gln
785 790 795 800
Ala Glu Leu Phe Tyr Arg Pro Lys Ser Arg Met Lys Arg Met Ala His
805 810 815
Arg Leu Gly Glu Lys Met Leu Asn Lys Lys Leu Lys Asp Gln Lys Thr
820 825 830
Pro Ile Pro Asp Thr Leu Tyr Gln Glu Leu Tyr Asp Tyr Val Asn His
835 840 845
Arg Leu Ser His Asp Leu Ser Asp Glu Ala Arg Ala Leu Leu Pro Asn
850 855 860
Val Ile Thr Lys Glu Val Ser His Glu Ile Ile Lys Asp Arg Arg Phe
865 870 875 880
Thr Ser Asp Lys Phe Phe Phe His Val Pro Ile Thr Leu Asn Tyr Gln
885 890 895
Ala Ala Asn Ser Pro Ser Lys Phe Asn Gln Arg Val Asn Ala Tyr Leu
900 905 910
Lys Glu His Pro Glu Thr Pro Ile Ile Gly Ile Asp Arg Gly Glu Arg
915 920 925
Asn Leu Ile Tyr Ile Thr Val Ile Asp Ser Thr Gly Lys Ile Leu Glu
930 935 940
Gln Arg Ser Leu Asn Thr Ile Gln Gln Phe Asp Tyr Gln Lys Lys Leu
945 950 955 960
Asp Asn Arg Glu Lys Glu Arg Val Ala Ala Arg Gln Ala Trp Ser Val
965 970 975
Val Gly Thr Ile Lys Asp Leu Lys Gln Gly Tyr Leu Ser Gln Val Ile
980 985 990
His Glu Ile Val Asp Leu Met Ile His Tyr Gln Ala Val Val Val Leu
995 1000 1005
Glu Asn Leu Asn Phe Gly Phe Lys Ser Lys Arg Thr Gly Ile Ala
1010 1015 1020
Glu Lys Ala Val Tyr Gln Gln Phe Glu Lys Met Leu Ile Asp Lys
1025 1030 1035
Leu Asn Cys Leu Val Leu Lys Asp Tyr Pro Ala Glu Lys Val Gly
1040 1045 1050
Gly Val Leu Asn Pro Tyr Gln Leu Thr Asp Gln Phe Thr Ser Phe
1055 1060 1065
Ala Lys Met Gly Thr Gln Ser Gly Phe Leu Phe Tyr Val Pro Ala
1070 1075 1080
Pro Tyr Thr Ser Lys Ile Asp Pro Leu Thr Gly Phe Val Asp Pro
1085 1090 1095
Phe Val Trp Lys Thr Ile Lys Asn His Glu Ser Arg Lys His Phe
1100 1105 1110
Leu Glu Gly Phe Asp Phe Leu His Tyr Asp Val Lys Thr Gly Asp
1115 1120 1125
Phe Ile Leu His Phe Lys Met Asn Arg Asn Leu Ser Phe Gln Arg
1130 1135 1140
Gly Leu Pro Gly Phe Met Pro Ala Trp Asp Ile Val Phe Glu Lys
1145 1150 1155
Asn Glu Thr Gln Phe Asp Ala Lys Gly Thr Pro Phe Ile Ala Gly
1160 1165 1170
Lys Arg Ile Val Pro Val Ile Glu Asn His Arg Phe Thr Gly Arg
1175 1180 1185
Tyr Arg Asp Leu Tyr Pro Ala Asn Glu Leu Ile Ala Leu Leu Glu
1190 1195 1200
Glu Lys Gly Ile Val Phe Arg Asp Gly Ser Asn Ile Leu Pro Lys
1205 1210 1215
Leu Leu Glu Asn Asp Asp Ser His Ala Ile Asp Thr Met Val Ala
1220 1225 1230
Leu Ile Arg Ser Val Leu Gln Met Arg Asn Ser Asn Ala Ala Thr
1235 1240 1245
Gly Glu Asp Tyr Ile Asn Ser Pro Val Arg Asp Leu Asn Gly Val
1250 1255 1260
Cys Phe Asp Ser Arg Phe Gln Asn Pro Glu Trp Pro Met Asp Ala
1265 1270 1275
Asp Ala Asn Gly Ala Tyr His Ile Ala Leu Lys Gly Gln Leu Leu
1280 1285 1290
Leu Asn His Leu Lys Glu Ser Lys Asp Leu Lys Leu Gln Asn Gly
1295 1300 1305
Ile Ser Asn Gln Asp Trp Leu Ala Tyr Ile Gln Glu Leu Arg Asn
1310 1315 1320
Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys
1325 1330 1335
Lys Gly Ser Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Tyr Pro Tyr
1340 1345 1350
Asp Val Pro Asp Tyr Ala Tyr Pro Tyr Asp Val Pro Asp Tyr Ala
1355 1360 1365
<210> 61
<211> 1332
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 61
Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser
1 5 10 15
Met Thr Gln Phe Glu Gly Phe Thr Asn Leu Tyr Gln Val Ser Lys Thr
20 25 30
Leu Arg Phe Glu Leu Ile Pro Gln Gly Lys Thr Leu Lys His Ile Gln
35 40 45
Glu Gln Gly Phe Ile Glu Glu Asp Lys Ala Arg Asn Asp His Tyr Lys
50 55 60
Glu Leu Lys Pro Ile Ile Asp Arg Ile Tyr Lys Thr Tyr Ala Asp Gln
65 70 75 80
Cys Leu Gln Leu Val Gln Leu Asp Trp Glu Asn Leu Ser Ala Ala Ile
85 90 95
Asp Ser Tyr Arg Lys Glu Lys Thr Glu Glu Thr Arg Asn Ala Leu Ile
100 105 110
Glu Glu Gln Ala Thr Tyr Arg Asn Ala Ile His Asp Tyr Phe Ile Gly
115 120 125
Arg Thr Asp Asn Leu Thr Asp Ala Ile Asn Lys Arg His Ala Glu Ile
130 135 140
Tyr Lys Gly Leu Phe Lys Ala Glu Leu Phe Asn Gly Lys Val Leu Lys
145 150 155 160
Gln Leu Gly Thr Val Thr Thr Thr Glu His Glu Asn Ala Leu Leu Arg
165 170 175
Ser Phe Asp Lys Phe Thr Thr Tyr Phe Ser Gly Phe Tyr Glu Asn Arg
180 185 190
Lys Asn Val Phe Ser Ala Glu Asp Ile Ser Thr Ala Ile Pro His Arg
195 200 205
Ile Val Gln Asp Asn Phe Pro Lys Phe Lys Glu Asn Cys His Ile Phe
210 215 220
Thr Arg Leu Ile Thr Ala Val Pro Ser Leu Arg Glu His Phe Glu Asn
225 230 235 240
Val Lys Lys Ala Ile Gly Ile Phe Val Ser Thr Ser Ile Glu Glu Val
245 250 255
Phe Ser Phe Pro Phe Tyr Asn Gln Leu Leu Thr Gln Thr Gln Ile Asp
260 265 270
Leu Tyr Asn Gln Leu Leu Gly Gly Ile Ser Arg Glu Ala Gly Thr Glu
275 280 285
Lys Ile Lys Gly Leu Asn Glu Val Leu Asn Leu Ala Ile Gln Lys Asn
290 295 300
Asp Glu Thr Ala His Ile Ile Ala Ser Leu Pro His Arg Phe Ile Pro
305 310 315 320
Leu Phe Lys Gln Ile Leu Ser Asp Arg Asn Thr Leu Ser Phe Ile Leu
325 330 335
Glu Glu Phe Lys Ser Asp Glu Glu Val Ile Gln Ser Phe Cys Lys Tyr
340 345 350
Lys Thr Leu Leu Arg Asn Glu Asn Val Leu Glu Thr Ala Glu Ala Leu
355 360 365
Phe Asn Glu Leu Asn Ser Ile Asp Leu Thr His Ile Phe Ile Ser His
370 375 380
Lys Lys Leu Glu Thr Ile Ser Ser Ala Leu Cys Asp His Trp Asp Thr
385 390 395 400
Leu Arg Asn Ala Leu Tyr Glu Arg Arg Ile Ser Glu Leu Thr Gly Lys
405 410 415
Ile Thr Lys Ser Ala Lys Glu Lys Val Gln Arg Ser Leu Lys His Glu
420 425 430
Asp Ile Asn Leu Gln Glu Ile Ile Ser Ala Ala Gly Lys Glu Leu Ser
435 440 445
Glu Ala Phe Lys Gln Lys Thr Ser Glu Ile Leu Ser His Ala His Ala
450 455 460
Ala Leu Asp Gln Pro Leu Pro Thr Thr Leu Lys Lys Gln Glu Glu Lys
465 470 475 480
Glu Ile Leu Lys Ser Gln Leu Asp Ser Leu Leu Gly Leu Tyr His Leu
485 490 495
Leu Asp Trp Phe Ala Val Asp Glu Ser Asn Glu Val Asp Pro Glu Phe
500 505 510
Ser Ala Arg Leu Thr Gly Ile Lys Leu Glu Met Glu Pro Ser Leu Ser
515 520 525
Phe Tyr Asn Lys Ala Arg Asn Tyr Ala Thr Lys Lys Pro Tyr Ser Val
530 535 540
Glu Lys Phe Lys Leu Asn Phe Gln Met Pro Thr Leu Ala Ser Gly Trp
545 550 555 560
Asp Val Asn Lys Glu Lys Asn Asn Gly Ala Ile Leu Phe Val Lys Asn
565 570 575
Gly Leu Tyr Tyr Leu Gly Ile Met Pro Lys Gln Lys Gly Arg Tyr Lys
580 585 590
Ala Leu Ser Phe Glu Pro Thr Glu Lys Thr Ser Glu Gly Phe Asp Lys
595 600 605
Met Tyr Tyr Asp Tyr Phe Pro Asp Ala Ala Lys Met Ile Pro Lys Cys
610 615 620
Ser Thr Gln Leu Lys Ala Val Thr Ala His Phe Gln Thr His Thr Thr
625 630 635 640
Pro Ile Leu Leu Ser Asn Asn Phe Ile Glu Pro Leu Glu Ile Thr Lys
645 650 655
Glu Ile Tyr Asp Leu Asn Asn Pro Glu Lys Glu Pro Lys Lys Phe Gln
660 665 670
Thr Ala Tyr Ala Lys Lys Thr Gly Asp Gln Lys Gly Tyr Arg Glu Ala
675 680 685
Leu Cys Lys Trp Ile Asp Phe Thr Arg Asp Phe Leu Ser Lys Tyr Thr
690 695 700
Lys Thr Thr Ser Ile Asp Leu Ser Ser Leu Arg Pro Ser Ser Gln Tyr
705 710 715 720
Lys Asp Leu Gly Glu Tyr Tyr Ala Glu Leu Asn Pro Leu Leu Tyr His
725 730 735
Ile Ser Phe Gln Arg Ile Ala Glu Lys Glu Ile Met Asp Ala Val Glu
740 745 750
Thr Gly Lys Leu Tyr Leu Phe Gln Ile Tyr Asn Lys Asp Phe Ala Lys
755 760 765
Gly His His Gly Lys Pro Asn Leu His Thr Leu Tyr Trp Thr Gly Leu
770 775 780
Phe Ser Pro Glu Asn Leu Ala Lys Thr Ser Ile Lys Leu Asn Gly Gln
785 790 795 800
Ala Glu Leu Phe Tyr Arg Pro Lys Ser Arg Met Lys Arg Met Ala His
805 810 815
Arg Leu Gly Glu Lys Met Leu Asn Lys Lys Leu Lys Asp Gln Lys Thr
820 825 830
Pro Ile Pro Asp Thr Leu Tyr Gln Glu Leu Tyr Asp Tyr Val Asn His
835 840 845
Arg Leu Ser His Asp Leu Ser Asp Glu Ala Arg Ala Leu Leu Pro Asn
850 855 860
Val Ile Thr Lys Glu Val Ser His Glu Ile Ile Lys Asp Arg Arg Phe
865 870 875 880
Thr Ser Asp Lys Phe Phe Phe His Val Pro Ile Thr Leu Asn Tyr Gln
885 890 895
Ala Ala Asn Ser Pro Ser Lys Phe Asn Gln Arg Val Asn Ala Tyr Leu
900 905 910
Lys Glu His Pro Glu Thr Pro Ile Ile Gly Ile Ala Arg Gly Glu Arg
915 920 925
Asn Leu Ile Tyr Ile Thr Val Ile Asp Ser Thr Gly Lys Ile Leu Glu
930 935 940
Gln Arg Ser Leu Asn Thr Ile Gln Gln Phe Asp Tyr Gln Lys Lys Leu
945 950 955 960
Asp Asn Arg Glu Lys Glu Arg Val Ala Ala Arg Gln Ala Trp Ser Val
965 970 975
Val Gly Thr Ile Lys Asp Leu Lys Gln Gly Tyr Leu Ser Gln Val Ile
980 985 990
His Glu Ile Val Asp Leu Met Ile His Tyr Gln Ala Val Val Val Leu
995 1000 1005
Ala Asn Leu Asn Phe Gly Phe Lys Ser Lys Arg Thr Gly Ile Ala
1010 1015 1020
Glu Lys Ala Val Tyr Gln Gln Phe Glu Lys Met Leu Ile Asp Lys
1025 1030 1035
Leu Asn Cys Leu Val Leu Lys Asp Tyr Pro Ala Glu Lys Val Gly
1040 1045 1050
Gly Val Leu Asn Pro Tyr Gln Leu Thr Asp Gln Phe Thr Ser Phe
1055 1060 1065
Ala Lys Met Gly Thr Gln Ser Gly Phe Leu Phe Tyr Val Pro Ala
1070 1075 1080
Pro Tyr Thr Ser Lys Ile Asp Pro Leu Thr Gly Phe Val Asp Pro
1085 1090 1095
Phe Val Trp Lys Thr Ile Lys Asn His Glu Ser Arg Lys His Phe
1100 1105 1110
Leu Glu Gly Phe Asp Phe Leu His Tyr Asp Val Lys Thr Gly Asp
1115 1120 1125
Phe Ile Leu His Phe Lys Met Asn Arg Asn Leu Ser Phe Gln Arg
1130 1135 1140
Gly Leu Pro Gly Phe Met Pro Ala Trp Asp Ile Val Phe Glu Lys
1145 1150 1155
Asn Glu Thr Gln Phe Asp Ala Lys Gly Thr Pro Phe Ile Ala Gly
1160 1165 1170
Lys Arg Ile Val Pro Val Ile Glu Asn His Arg Phe Thr Gly Arg
1175 1180 1185
Tyr Arg Asp Leu Tyr Pro Ala Asn Glu Leu Ile Ala Leu Leu Glu
1190 1195 1200
Glu Lys Gly Ile Val Phe Arg Asp Gly Ser Asn Ile Leu Pro Lys
1205 1210 1215
Leu Leu Glu Asn Asp Asp Ser His Ala Ile Asp Thr Met Val Ala
1220 1225 1230
Leu Ile Arg Ser Val Leu Gln Met Arg Asn Ser Asn Ala Ala Thr
1235 1240 1245
Gly Glu Ala Tyr Ile Asn Ser Pro Val Arg Asp Leu Asn Gly Val
1250 1255 1260
Cys Phe Asp Ser Arg Phe Gln Asn Pro Glu Trp Pro Met Asp Ala
1265 1270 1275
Asp Ala Asn Gly Ala Tyr His Ile Ala Leu Lys Gly Gln Leu Leu
1280 1285 1290
Leu Asn His Leu Lys Glu Ser Lys Asp Leu Lys Leu Gln Asn Gly
1295 1300 1305
Ile Ser Asn Gln Asp Trp Leu Ala Tyr Ile Gln Glu Leu Arg Asn
1310 1315 1320
Gly Ser Pro Lys Lys Lys Arg Lys Val
1325 1330
<210> 62
<211> 1403
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 62
Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser
1 5 10 15
Met Pro Lys Lys Lys Arg Lys Val Gly Gly Gly Gly Ser Gly Gly Gly
20 25 30
Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly
35 40 45
Ser Gly Gly Gly Gly Ser Met Ser Ile Tyr Gln Glu Phe Val Asn Lys
50 55 60
Tyr Ser Leu Ser Lys Thr Leu Arg Phe Glu Leu Ile Pro Gln Gly Lys
65 70 75 80
Thr Leu Glu Asn Ile Lys Ala Arg Gly Leu Ile Leu Asp Asp Glu Lys
85 90 95
Arg Ala Lys Asp Tyr Lys Lys Ala Lys Gln Ile Ile Asp Lys Tyr His
100 105 110
Gln Phe Phe Ile Glu Glu Ile Leu Ser Ser Val Cys Ile Ser Glu Asp
115 120 125
Leu Leu Gln Asn Tyr Ser Asp Val Tyr Phe Lys Leu Lys Lys Ser Asp
130 135 140
Asp Asp Asn Leu Gln Lys Asp Phe Lys Ser Ala Lys Asp Thr Ile Lys
145 150 155 160
Lys Gln Ile Ser Glu Tyr Ile Lys Asp Ser Glu Lys Phe Lys Asn Leu
165 170 175
Phe Asn Gln Asn Leu Ile Asp Ala Lys Lys Gly Gln Glu Ser Asp Leu
180 185 190
Ile Leu Trp Leu Lys Gln Ser Lys Asp Asn Gly Ile Glu Leu Phe Lys
195 200 205
Ala Asn Ser Asp Ile Thr Asp Ile Asp Glu Ala Leu Glu Ile Ile Lys
210 215 220
Ser Phe Lys Gly Trp Thr Thr Tyr Phe Lys Gly Phe His Glu Asn Arg
225 230 235 240
Lys Asn Val Tyr Ser Ser Asn Asp Ile Pro Thr Ser Ile Ile Tyr Arg
245 250 255
Ile Val Asp Asp Asn Leu Pro Lys Phe Leu Glu Asn Lys Ala Lys Tyr
260 265 270
Glu Ser Leu Lys Asp Lys Ala Pro Glu Ala Ile Asn Tyr Glu Gln Ile
275 280 285
Lys Lys Asp Leu Ala Glu Glu Leu Thr Phe Asp Ile Asp Tyr Lys Thr
290 295 300
Ser Glu Val Asn Gln Arg Val Phe Ser Leu Asp Glu Val Phe Glu Ile
305 310 315 320
Ala Asn Phe Asn Asn Tyr Leu Asn Gln Ser Gly Ile Thr Lys Phe Asn
325 330 335
Thr Ile Ile Gly Gly Lys Phe Val Asn Gly Glu Asn Thr Lys Arg Lys
340 345 350
Gly Ile Asn Glu Tyr Ile Asn Leu Tyr Ser Gln Gln Ile Asn Asp Lys
355 360 365
Thr Leu Lys Lys Tyr Lys Met Ser Val Leu Phe Lys Gln Ile Leu Ser
370 375 380
Asp Thr Glu Ser Lys Ser Phe Val Ile Asp Lys Leu Glu Asp Asp Ser
385 390 395 400
Asp Val Val Thr Thr Met Gln Ser Phe Tyr Glu Gln Ile Ala Ala Phe
405 410 415
Lys Thr Val Glu Glu Lys Ser Ile Lys Glu Thr Leu Ser Leu Leu Phe
420 425 430
Asp Asp Leu Lys Ala Gln Lys Leu Asp Leu Ser Lys Ile Tyr Phe Lys
435 440 445
Asn Asp Lys Ser Leu Thr Asp Leu Ser Gln Gln Val Phe Asp Asp Tyr
450 455 460
Ser Val Ile Gly Thr Ala Val Leu Glu Tyr Ile Thr Gln Gln Ile Ala
465 470 475 480
Pro Lys Asn Leu Asp Asn Pro Ser Lys Lys Glu Gln Glu Leu Ile Ala
485 490 495
Lys Lys Thr Glu Lys Ala Lys Tyr Leu Ser Leu Glu Thr Ile Lys Leu
500 505 510
Ala Leu Glu Glu Phe Asn Lys His Arg Asp Ile Asp Lys Gln Cys Arg
515 520 525
Phe Glu Glu Ile Leu Ala Asn Phe Ala Ala Ile Pro Met Ile Phe Asp
530 535 540
Glu Ile Ala Gln Asn Lys Asp Asn Leu Ala Gln Ile Ser Ile Lys Tyr
545 550 555 560
Gln Asn Gln Gly Lys Lys Asp Leu Leu Gln Ala Ser Ala Glu Asp Asp
565 570 575
Val Lys Ala Ile Lys Asp Leu Leu Asp Gln Thr Asn Asn Leu Leu His
580 585 590
Lys Leu Lys Ile Phe His Ile Ser Gln Ser Glu Asp Lys Ala Asn Ile
595 600 605
Leu Asp Lys Asp Glu His Phe Tyr Leu Val Phe Glu Glu Cys Tyr Phe
610 615 620
Glu Leu Ala Asn Ile Val Pro Leu Tyr Asn Lys Ile Arg Asn Tyr Ile
625 630 635 640
Thr Gln Lys Pro Tyr Ser Asp Glu Lys Phe Lys Leu Asn Phe Glu Asn
645 650 655
Ser Thr Leu Ala Asn Gly Trp Asp Lys Asn Lys Glu Pro Asp Asn Thr
660 665 670
Ala Ile Leu Phe Ile Lys Asp Asp Lys Tyr Tyr Leu Gly Val Met Asn
675 680 685
Lys Lys Asn Asn Lys Ile Phe Asp Asp Lys Ala Ile Lys Glu Asn Lys
690 695 700
Gly Glu Gly Tyr Lys Lys Ile Val Tyr Lys Leu Leu Pro Gly Ala Asn
705 710 715 720
Lys Met Leu Pro Lys Val Phe Phe Ser Ala Lys Ser Ile Lys Phe Tyr
725 730 735
Asn Pro Ser Glu Asp Ile Leu Arg Ile Arg Asn His Ser Thr His Thr
740 745 750
Lys Asn Gly Ser Pro Gln Lys Gly Tyr Glu Lys Phe Glu Phe Asn Ile
755 760 765
Glu Asp Cys Arg Lys Phe Ile Asp Phe Tyr Lys Gln Ser Ile Ser Lys
770 775 780
His Pro Glu Trp Lys Asp Phe Gly Phe Arg Phe Ser Asp Thr Gln Arg
785 790 795 800
Tyr Asn Ser Ile Asp Glu Phe Tyr Arg Glu Val Glu Asn Gln Gly Tyr
805 810 815
Lys Leu Thr Phe Glu Asn Ile Ser Glu Ser Tyr Ile Asp Ser Val Val
820 825 830
Asn Gln Gly Lys Leu Tyr Leu Phe Gln Ile Tyr Asn Lys Asp Phe Ser
835 840 845
Ala Tyr Ser Lys Gly Arg Pro Asn Leu His Thr Leu Tyr Trp Lys Ala
850 855 860
Leu Phe Asp Glu Arg Asn Leu Gln Asp Val Val Tyr Lys Leu Asn Gly
865 870 875 880
Glu Ala Glu Leu Phe Tyr Arg Lys Gln Ser Ile Pro Lys Lys Ile Thr
885 890 895
His Pro Ala Lys Glu Ala Ile Ala Asn Lys Asn Lys Asp Asn Pro Lys
900 905 910
Lys Glu Ser Val Phe Glu Tyr Asp Leu Ile Lys Asp Lys Arg Phe Thr
915 920 925
Glu Asp Lys Phe Phe Phe His Cys Pro Ile Thr Ile Asn Phe Lys Ser
930 935 940
Ser Gly Ala Asn Lys Phe Asn Asp Glu Ile Asn Leu Leu Leu Lys Glu
945 950 955 960
Lys Ala Asn Asp Val His Ile Leu Ser Ile Asp Arg Gly Glu Arg His
965 970 975
Leu Ala Tyr Tyr Thr Leu Val Asp Gly Lys Gly Asn Ile Ile Lys Gln
980 985 990
Asp Thr Phe Asn Ile Ile Gly Asn Asp Arg Met Lys Thr Asn Tyr His
995 1000 1005
Asp Lys Leu Ala Ala Ile Glu Lys Asp Arg Asp Ser Ala Arg Lys
1010 1015 1020
Asp Trp Lys Lys Ile Asn Asn Ile Lys Glu Met Lys Glu Gly Tyr
1025 1030 1035
Leu Ser Gln Val Val His Glu Ile Ala Lys Leu Val Ile Glu Tyr
1040 1045 1050
Asn Ala Ile Val Val Phe Glu Asp Leu Asn Phe Gly Phe Lys Arg
1055 1060 1065
Gly Arg Phe Lys Val Glu Lys Gln Val Tyr Gln Lys Leu Glu Lys
1070 1075 1080
Met Leu Ile Glu Lys Leu Asn Tyr Leu Val Phe Lys Asp Asn Glu
1085 1090 1095
Phe Asp Lys Thr Gly Gly Val Leu Arg Ala Tyr Gln Leu Thr Ala
1100 1105 1110
Pro Phe Glu Thr Phe Lys Lys Met Gly Lys Gln Thr Gly Ile Ile
1115 1120 1125
Tyr Tyr Val Pro Ala Gly Phe Thr Ser Lys Ile Cys Pro Val Thr
1130 1135 1140
Gly Phe Val Asn Gln Leu Tyr Pro Lys Tyr Glu Ser Val Ser Lys
1145 1150 1155
Ser Gln Glu Phe Phe Ser Lys Phe Asp Lys Ile Cys Tyr Asn Leu
1160 1165 1170
Asp Lys Gly Tyr Phe Glu Phe Ser Phe Asp Tyr Lys Asn Phe Gly
1175 1180 1185
Asp Lys Ala Ala Lys Gly Lys Trp Thr Ile Ala Ser Phe Gly Ser
1190 1195 1200
Arg Leu Ile Asn Phe Arg Asn Ser Asp Lys Asn His Asn Trp Asp
1205 1210 1215
Thr Arg Glu Val Tyr Pro Thr Lys Glu Leu Glu Lys Leu Leu Lys
1220 1225 1230
Asp Tyr Ser Ile Glu Tyr Gly His Gly Glu Cys Ile Lys Ala Ala
1235 1240 1245
Ile Cys Gly Glu Ser Asp Lys Lys Phe Phe Ala Lys Leu Thr Ser
1250 1255 1260
Val Leu Asn Thr Ile Leu Gln Met Arg Asn Ser Lys Thr Gly Thr
1265 1270 1275
Glu Leu Asp Tyr Leu Ile Ser Pro Val Ala Asp Val Asn Gly Asn
1280 1285 1290
Phe Phe Asp Ser Arg Gln Ala Pro Lys Asn Met Pro Gln Asp Ala
1295 1300 1305
Asp Ala Asn Gly Ala Tyr His Ile Gly Leu Lys Gly Leu Met Leu
1310 1315 1320
Leu Gly Arg Ile Lys Asn Asn Gln Glu Gly Lys Lys Leu Asn Leu
1325 1330 1335
Val Ile Lys Asn Glu Glu Tyr Phe Glu Phe Val Gln Asn Arg Asn
1340 1345 1350
Asn Pro Lys Lys Lys Arg Lys Val Ser Gly Gly Ser Ser Gly Gly
1355 1360 1365
Ser Pro Lys Lys Lys Arg Lys Val Tyr Pro Tyr Asp Val Pro Asp
1370 1375 1380
Tyr Ala Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Tyr Pro Tyr Asp
1385 1390 1395
Val Pro Asp Tyr Ala
1400
<210> 63
<211> 1382
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 63
Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly
1 5 10 15
Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Met Thr
20 25 30
Gln Phe Glu Gly Phe Thr Asn Leu Tyr Gln Val Ser Lys Thr Leu Arg
35 40 45
Phe Glu Leu Ile Pro Gln Gly Lys Thr Leu Lys His Ile Gln Glu Gln
50 55 60
Gly Phe Ile Glu Glu Asp Lys Ala Arg Asn Asp His Tyr Lys Glu Leu
65 70 75 80
Lys Pro Ile Ile Asp Arg Ile Tyr Lys Thr Tyr Ala Asp Gln Cys Leu
85 90 95
Gln Leu Val Gln Leu Asp Trp Glu Asn Leu Ser Ala Ala Ile Asp Ser
100 105 110
Tyr Arg Lys Glu Lys Thr Glu Glu Thr Arg Asn Ala Leu Ile Glu Glu
115 120 125
Gln Ala Thr Tyr Arg Asn Ala Ile His Asp Tyr Phe Ile Gly Arg Thr
130 135 140
Asp Asn Leu Thr Asp Ala Ile Asn Lys Arg His Ala Glu Ile Tyr Lys
145 150 155 160
Gly Leu Phe Lys Ala Glu Leu Phe Asn Gly Lys Val Leu Lys Gln Leu
165 170 175
Gly Thr Val Thr Thr Thr Glu His Glu Asn Ala Leu Leu Arg Ser Phe
180 185 190
Asp Lys Phe Thr Thr Tyr Phe Ser Gly Phe Tyr Glu Asn Arg Lys Asn
195 200 205
Val Phe Ser Ala Glu Asp Ile Ser Thr Ala Ile Pro His Arg Ile Val
210 215 220
Gln Asp Asn Phe Pro Lys Phe Lys Glu Asn Cys His Ile Phe Thr Arg
225 230 235 240
Leu Ile Thr Ala Val Pro Ser Leu Arg Glu His Phe Glu Asn Val Lys
245 250 255
Lys Ala Ile Gly Ile Phe Val Ser Thr Ser Ile Glu Glu Val Phe Ser
260 265 270
Phe Pro Phe Tyr Asn Gln Leu Leu Thr Gln Thr Gln Ile Asp Leu Tyr
275 280 285
Asn Gln Leu Leu Gly Gly Ile Ser Arg Glu Ala Gly Thr Glu Lys Ile
290 295 300
Lys Gly Leu Asn Glu Val Leu Asn Leu Ala Ile Gln Lys Asn Asp Glu
305 310 315 320
Thr Ala His Ile Ile Ala Ser Leu Pro His Arg Phe Ile Pro Leu Phe
325 330 335
Lys Gln Ile Leu Ser Asp Arg Asn Thr Leu Ser Phe Ile Leu Glu Glu
340 345 350
Phe Lys Ser Asp Glu Glu Val Ile Gln Ser Phe Cys Lys Tyr Lys Thr
355 360 365
Leu Leu Arg Asn Glu Asn Val Leu Glu Thr Ala Glu Ala Leu Phe Asn
370 375 380
Glu Leu Asn Ser Ile Asp Leu Thr His Ile Phe Ile Ser His Lys Lys
385 390 395 400
Leu Glu Thr Ile Ser Ser Ala Leu Cys Asp His Trp Asp Thr Leu Arg
405 410 415
Asn Ala Leu Tyr Glu Arg Arg Ile Ser Glu Leu Thr Gly Lys Ile Thr
420 425 430
Lys Ser Ala Lys Glu Lys Val Gln Arg Ser Leu Lys His Glu Asp Ile
435 440 445
Asn Leu Gln Glu Ile Ile Ser Ala Ala Gly Lys Glu Leu Ser Glu Ala
450 455 460
Phe Lys Gln Lys Thr Ser Glu Ile Leu Ser His Ala His Ala Ala Leu
465 470 475 480
Asp Gln Pro Leu Pro Thr Thr Leu Lys Lys Gln Glu Glu Lys Glu Ile
485 490 495
Leu Lys Ser Gln Leu Asp Ser Leu Leu Gly Leu Tyr His Leu Leu Asp
500 505 510
Trp Phe Ala Val Asp Glu Ser Asn Glu Val Asp Pro Glu Phe Ser Ala
515 520 525
Arg Leu Thr Gly Ile Lys Leu Glu Met Glu Pro Ser Leu Ser Phe Tyr
530 535 540
Asn Lys Ala Arg Asn Tyr Ala Thr Lys Lys Pro Tyr Ser Val Glu Lys
545 550 555 560
Phe Lys Leu Asn Phe Gln Met Pro Thr Leu Ala Ser Gly Trp Asp Val
565 570 575
Asn Lys Glu Lys Asn Asn Gly Ala Ile Leu Phe Val Lys Asn Gly Leu
580 585 590
Tyr Tyr Leu Gly Ile Met Pro Lys Gln Lys Gly Arg Tyr Lys Ala Leu
595 600 605
Ser Phe Glu Pro Thr Glu Lys Thr Ser Glu Gly Phe Asp Lys Met Tyr
610 615 620
Tyr Asp Tyr Phe Pro Asp Ala Ala Lys Met Ile Pro Lys Cys Ser Thr
625 630 635 640
Gln Leu Lys Ala Val Thr Ala His Phe Gln Thr His Thr Thr Pro Ile
645 650 655
Leu Leu Ser Asn Asn Phe Ile Glu Pro Leu Glu Ile Thr Lys Glu Ile
660 665 670
Tyr Asp Leu Asn Asn Pro Glu Lys Glu Pro Lys Lys Phe Gln Thr Ala
675 680 685
Tyr Ala Lys Lys Thr Gly Asp Gln Lys Gly Tyr Arg Glu Ala Leu Cys
690 695 700
Lys Trp Ile Asp Phe Thr Arg Asp Phe Leu Ser Lys Tyr Thr Lys Thr
705 710 715 720
Thr Ser Ile Asp Leu Ser Ser Leu Arg Pro Ser Ser Gln Tyr Lys Asp
725 730 735
Leu Gly Glu Tyr Tyr Ala Glu Leu Asn Pro Leu Leu Tyr His Ile Ser
740 745 750
Phe Gln Arg Ile Ala Glu Lys Glu Ile Met Asp Ala Val Glu Thr Gly
755 760 765
Lys Leu Tyr Leu Phe Gln Ile Tyr Asn Lys Asp Phe Ala Lys Gly His
770 775 780
His Gly Lys Pro Asn Leu His Thr Leu Tyr Trp Thr Gly Leu Phe Ser
785 790 795 800
Pro Glu Asn Leu Ala Lys Thr Ser Ile Lys Leu Asn Gly Gln Ala Glu
805 810 815
Leu Phe Tyr Arg Pro Lys Ser Arg Met Lys Arg Met Ala His Arg Leu
820 825 830
Gly Glu Lys Met Leu Asn Lys Lys Leu Lys Asp Gln Lys Thr Pro Ile
835 840 845
Pro Asp Thr Leu Tyr Gln Glu Leu Tyr Asp Tyr Val Asn His Arg Leu
850 855 860
Ser His Asp Leu Ser Asp Glu Ala Arg Ala Leu Leu Pro Asn Val Ile
865 870 875 880
Thr Lys Glu Val Ser His Glu Ile Ile Lys Asp Arg Arg Phe Thr Ser
885 890 895
Asp Lys Phe Phe Phe His Val Pro Ile Thr Leu Asn Tyr Gln Ala Ala
900 905 910
Asn Ser Pro Ser Lys Phe Asn Gln Arg Val Asn Ala Tyr Leu Lys Glu
915 920 925
His Pro Glu Thr Pro Ile Ile Gly Ile Asp Arg Gly Glu Arg Asn Leu
930 935 940
Ile Tyr Ile Thr Val Ile Asp Ser Thr Gly Lys Ile Leu Glu Gln Arg
945 950 955 960
Ser Leu Asn Thr Ile Gln Gln Phe Asp Tyr Gln Lys Lys Leu Asp Asn
965 970 975
Arg Glu Lys Glu Arg Val Ala Ala Arg Gln Ala Trp Ser Val Val Gly
980 985 990
Thr Ile Lys Asp Leu Lys Gln Gly Tyr Leu Ser Gln Val Ile His Glu
995 1000 1005
Ile Val Asp Leu Met Ile His Tyr Gln Ala Val Val Val Leu Glu
1010 1015 1020
Asn Leu Asn Phe Gly Phe Lys Ser Lys Arg Thr Gly Ile Ala Glu
1025 1030 1035
Lys Ala Val Tyr Gln Gln Phe Glu Lys Met Leu Ile Asp Lys Leu
1040 1045 1050
Asn Cys Leu Val Leu Lys Asp Tyr Pro Ala Glu Lys Val Gly Gly
1055 1060 1065
Val Leu Asn Pro Tyr Gln Leu Thr Asp Gln Phe Thr Ser Phe Ala
1070 1075 1080
Lys Met Gly Thr Gln Ser Gly Phe Leu Phe Tyr Val Pro Ala Pro
1085 1090 1095
Tyr Thr Ser Lys Ile Asp Pro Leu Thr Gly Phe Val Asp Pro Phe
1100 1105 1110
Val Trp Lys Thr Ile Lys Asn His Glu Ser Arg Lys His Phe Leu
1115 1120 1125
Glu Gly Phe Asp Phe Leu His Tyr Asp Val Lys Thr Gly Asp Phe
1130 1135 1140
Ile Leu His Phe Lys Met Asn Arg Asn Leu Ser Phe Gln Arg Gly
1145 1150 1155
Leu Pro Gly Phe Met Pro Ala Trp Asp Ile Val Phe Glu Lys Asn
1160 1165 1170
Glu Thr Gln Phe Asp Ala Lys Gly Thr Pro Phe Ile Ala Gly Lys
1175 1180 1185
Arg Ile Val Pro Val Ile Glu Asn His Arg Phe Thr Gly Arg Tyr
1190 1195 1200
Arg Asp Leu Tyr Pro Ala Asn Glu Leu Ile Ala Leu Leu Glu Glu
1205 1210 1215
Lys Gly Ile Val Phe Arg Asp Gly Ser Asn Ile Leu Pro Lys Leu
1220 1225 1230
Leu Glu Asn Asp Asp Ser His Ala Ile Asp Thr Met Val Ala Leu
1235 1240 1245
Ile Arg Ser Val Leu Gln Met Arg Asn Ser Asn Ala Ala Thr Gly
1250 1255 1260
Glu Asp Tyr Ile Asn Ser Pro Val Arg Asp Leu Asn Gly Val Cys
1265 1270 1275
Phe Asp Ser Arg Phe Gln Asn Pro Glu Trp Pro Met Asp Ala Asp
1280 1285 1290
Ala Asn Gly Ala Tyr His Ile Ala Leu Lys Gly Gln Leu Leu Leu
1295 1300 1305
Asn His Leu Lys Glu Ser Lys Asp Leu Lys Leu Gln Asn Gly Ile
1310 1315 1320
Ser Asn Gln Asp Trp Leu Ala Tyr Ile Gln Glu Leu Arg Asn Lys
1325 1330 1335
Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys Lys
1340 1345 1350
Gly Ser Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Tyr Pro Tyr Asp
1355 1360 1365
Val Pro Asp Tyr Ala Tyr Pro Tyr Asp Val Pro Asp Tyr Ala
1370 1375 1380
<210> 64
<211> 1346
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 64
Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly
1 5 10 15
Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Met Thr
20 25 30
Gln Phe Glu Gly Phe Thr Asn Leu Tyr Gln Val Ser Lys Thr Leu Arg
35 40 45
Phe Glu Leu Ile Pro Gln Gly Lys Thr Leu Lys His Ile Gln Glu Gln
50 55 60
Gly Phe Ile Glu Glu Asp Lys Ala Arg Asn Asp His Tyr Lys Glu Leu
65 70 75 80
Lys Pro Ile Ile Asp Arg Ile Tyr Lys Thr Tyr Ala Asp Gln Cys Leu
85 90 95
Gln Leu Val Gln Leu Asp Trp Glu Asn Leu Ser Ala Ala Ile Asp Ser
100 105 110
Tyr Arg Lys Glu Lys Thr Glu Glu Thr Arg Asn Ala Leu Ile Glu Glu
115 120 125
Gln Ala Thr Tyr Arg Asn Ala Ile His Asp Tyr Phe Ile Gly Arg Thr
130 135 140
Asp Asn Leu Thr Asp Ala Ile Asn Lys Arg His Ala Glu Ile Tyr Lys
145 150 155 160
Gly Leu Phe Lys Ala Glu Leu Phe Asn Gly Lys Val Leu Lys Gln Leu
165 170 175
Gly Thr Val Thr Thr Thr Glu His Glu Asn Ala Leu Leu Arg Ser Phe
180 185 190
Asp Lys Phe Thr Thr Tyr Phe Ser Gly Phe Tyr Glu Asn Arg Lys Asn
195 200 205
Val Phe Ser Ala Glu Asp Ile Ser Thr Ala Ile Pro His Arg Ile Val
210 215 220
Gln Asp Asn Phe Pro Lys Phe Lys Glu Asn Cys His Ile Phe Thr Arg
225 230 235 240
Leu Ile Thr Ala Val Pro Ser Leu Arg Glu His Phe Glu Asn Val Lys
245 250 255
Lys Ala Ile Gly Ile Phe Val Ser Thr Ser Ile Glu Glu Val Phe Ser
260 265 270
Phe Pro Phe Tyr Asn Gln Leu Leu Thr Gln Thr Gln Ile Asp Leu Tyr
275 280 285
Asn Gln Leu Leu Gly Gly Ile Ser Arg Glu Ala Gly Thr Glu Lys Ile
290 295 300
Lys Gly Leu Asn Glu Val Leu Asn Leu Ala Ile Gln Lys Asn Asp Glu
305 310 315 320
Thr Ala His Ile Ile Ala Ser Leu Pro His Arg Phe Ile Pro Leu Phe
325 330 335
Lys Gln Ile Leu Ser Asp Arg Asn Thr Leu Ser Phe Ile Leu Glu Glu
340 345 350
Phe Lys Ser Asp Glu Glu Val Ile Gln Ser Phe Cys Lys Tyr Lys Thr
355 360 365
Leu Leu Arg Asn Glu Asn Val Leu Glu Thr Ala Glu Ala Leu Phe Asn
370 375 380
Glu Leu Asn Ser Ile Asp Leu Thr His Ile Phe Ile Ser His Lys Lys
385 390 395 400
Leu Glu Thr Ile Ser Ser Ala Leu Cys Asp His Trp Asp Thr Leu Arg
405 410 415
Asn Ala Leu Tyr Glu Arg Arg Ile Ser Glu Leu Thr Gly Lys Ile Thr
420 425 430
Lys Ser Ala Lys Glu Lys Val Gln Arg Ser Leu Lys His Glu Asp Ile
435 440 445
Asn Leu Gln Glu Ile Ile Ser Ala Ala Gly Lys Glu Leu Ser Glu Ala
450 455 460
Phe Lys Gln Lys Thr Ser Glu Ile Leu Ser His Ala His Ala Ala Leu
465 470 475 480
Asp Gln Pro Leu Pro Thr Thr Leu Lys Lys Gln Glu Glu Lys Glu Ile
485 490 495
Leu Lys Ser Gln Leu Asp Ser Leu Leu Gly Leu Tyr His Leu Leu Asp
500 505 510
Trp Phe Ala Val Asp Glu Ser Asn Glu Val Asp Pro Glu Phe Ser Ala
515 520 525
Arg Leu Thr Gly Ile Lys Leu Glu Met Glu Pro Ser Leu Ser Phe Tyr
530 535 540
Asn Lys Ala Arg Asn Tyr Ala Thr Lys Lys Pro Tyr Ser Val Glu Lys
545 550 555 560
Phe Lys Leu Asn Phe Gln Met Pro Thr Leu Ala Ser Gly Trp Asp Val
565 570 575
Asn Lys Glu Lys Asn Asn Gly Ala Ile Leu Phe Val Lys Asn Gly Leu
580 585 590
Tyr Tyr Leu Gly Ile Met Pro Lys Gln Lys Gly Arg Tyr Lys Ala Leu
595 600 605
Ser Phe Glu Pro Thr Glu Lys Thr Ser Glu Gly Phe Asp Lys Met Tyr
610 615 620
Tyr Asp Tyr Phe Pro Asp Ala Ala Lys Met Ile Pro Lys Cys Ser Thr
625 630 635 640
Gln Leu Lys Ala Val Thr Ala His Phe Gln Thr His Thr Thr Pro Ile
645 650 655
Leu Leu Ser Asn Asn Phe Ile Glu Pro Leu Glu Ile Thr Lys Glu Ile
660 665 670
Tyr Asp Leu Asn Asn Pro Glu Lys Glu Pro Lys Lys Phe Gln Thr Ala
675 680 685
Tyr Ala Lys Lys Thr Gly Asp Gln Lys Gly Tyr Arg Glu Ala Leu Cys
690 695 700
Lys Trp Ile Asp Phe Thr Arg Asp Phe Leu Ser Lys Tyr Thr Lys Thr
705 710 715 720
Thr Ser Ile Asp Leu Ser Ser Leu Arg Pro Ser Ser Gln Tyr Lys Asp
725 730 735
Leu Gly Glu Tyr Tyr Ala Glu Leu Asn Pro Leu Leu Tyr His Ile Ser
740 745 750
Phe Gln Arg Ile Ala Glu Lys Glu Ile Met Asp Ala Val Glu Thr Gly
755 760 765
Lys Leu Tyr Leu Phe Gln Ile Tyr Asn Lys Asp Phe Ala Lys Gly His
770 775 780
His Gly Lys Pro Asn Leu His Thr Leu Tyr Trp Thr Gly Leu Phe Ser
785 790 795 800
Pro Glu Asn Leu Ala Lys Thr Ser Ile Lys Leu Asn Gly Gln Ala Glu
805 810 815
Leu Phe Tyr Arg Pro Lys Ser Arg Met Lys Arg Met Ala His Arg Leu
820 825 830
Gly Glu Lys Met Leu Asn Lys Lys Leu Lys Asp Gln Lys Thr Pro Ile
835 840 845
Pro Asp Thr Leu Tyr Gln Glu Leu Tyr Asp Tyr Val Asn His Arg Leu
850 855 860
Ser His Asp Leu Ser Asp Glu Ala Arg Ala Leu Leu Pro Asn Val Ile
865 870 875 880
Thr Lys Glu Val Ser His Glu Ile Ile Lys Asp Arg Arg Phe Thr Ser
885 890 895
Asp Lys Phe Phe Phe His Val Pro Ile Thr Leu Asn Tyr Gln Ala Ala
900 905 910
Asn Ser Pro Ser Lys Phe Asn Gln Arg Val Asn Ala Tyr Leu Lys Glu
915 920 925
His Pro Glu Thr Pro Ile Ile Gly Ile Ala Arg Gly Glu Arg Asn Leu
930 935 940
Ile Tyr Ile Thr Val Ile Asp Ser Thr Gly Lys Ile Leu Glu Gln Arg
945 950 955 960
Ser Leu Asn Thr Ile Gln Gln Phe Asp Tyr Gln Lys Lys Leu Asp Asn
965 970 975
Arg Glu Lys Glu Arg Val Ala Ala Arg Gln Ala Trp Ser Val Val Gly
980 985 990
Thr Ile Lys Asp Leu Lys Gln Gly Tyr Leu Ser Gln Val Ile His Glu
995 1000 1005
Ile Val Asp Leu Met Ile His Tyr Gln Ala Val Val Val Leu Ala
1010 1015 1020
Asn Leu Asn Phe Gly Phe Lys Ser Lys Arg Thr Gly Ile Ala Glu
1025 1030 1035
Lys Ala Val Tyr Gln Gln Phe Glu Lys Met Leu Ile Asp Lys Leu
1040 1045 1050
Asn Cys Leu Val Leu Lys Asp Tyr Pro Ala Glu Lys Val Gly Gly
1055 1060 1065
Val Leu Asn Pro Tyr Gln Leu Thr Asp Gln Phe Thr Ser Phe Ala
1070 1075 1080
Lys Met Gly Thr Gln Ser Gly Phe Leu Phe Tyr Val Pro Ala Pro
1085 1090 1095
Tyr Thr Ser Lys Ile Asp Pro Leu Thr Gly Phe Val Asp Pro Phe
1100 1105 1110
Val Trp Lys Thr Ile Lys Asn His Glu Ser Arg Lys His Phe Leu
1115 1120 1125
Glu Gly Phe Asp Phe Leu His Tyr Asp Val Lys Thr Gly Asp Phe
1130 1135 1140
Ile Leu His Phe Lys Met Asn Arg Asn Leu Ser Phe Gln Arg Gly
1145 1150 1155
Leu Pro Gly Phe Met Pro Ala Trp Asp Ile Val Phe Glu Lys Asn
1160 1165 1170
Glu Thr Gln Phe Asp Ala Lys Gly Thr Pro Phe Ile Ala Gly Lys
1175 1180 1185
Arg Ile Val Pro Val Ile Glu Asn His Arg Phe Thr Gly Arg Tyr
1190 1195 1200
Arg Asp Leu Tyr Pro Ala Asn Glu Leu Ile Ala Leu Leu Glu Glu
1205 1210 1215
Lys Gly Ile Val Phe Arg Asp Gly Ser Asn Ile Leu Pro Lys Leu
1220 1225 1230
Leu Glu Asn Asp Asp Ser His Ala Ile Asp Thr Met Val Ala Leu
1235 1240 1245
Ile Arg Ser Val Leu Gln Met Arg Asn Ser Asn Ala Ala Thr Gly
1250 1255 1260
Glu Ala Tyr Ile Asn Ser Pro Val Arg Asp Leu Asn Gly Val Cys
1265 1270 1275
Phe Asp Ser Arg Phe Gln Asn Pro Glu Trp Pro Met Asp Ala Asp
1280 1285 1290
Ala Asn Gly Ala Tyr His Ile Ala Leu Lys Gly Gln Leu Leu Leu
1295 1300 1305
Asn His Leu Lys Glu Ser Lys Asp Leu Lys Leu Gln Asn Gly Ile
1310 1315 1320
Ser Asn Gln Asp Trp Leu Ala Tyr Ile Gln Glu Leu Arg Asn Gly
1325 1330 1335
Ser Pro Lys Lys Lys Arg Lys Val
1340 1345
<210> 65
<211> 1417
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 65
Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly
1 5 10 15
Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Met Pro
20 25 30
Lys Lys Lys Arg Lys Val Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser
35 40 45
Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly
50 55 60
Gly Gly Gly Ser Met Ser Ile Tyr Gln Glu Phe Val Asn Lys Tyr Ser
65 70 75 80
Leu Ser Lys Thr Leu Arg Phe Glu Leu Ile Pro Gln Gly Lys Thr Leu
85 90 95
Glu Asn Ile Lys Ala Arg Gly Leu Ile Leu Asp Asp Glu Lys Arg Ala
100 105 110
Lys Asp Tyr Lys Lys Ala Lys Gln Ile Ile Asp Lys Tyr His Gln Phe
115 120 125
Phe Ile Glu Glu Ile Leu Ser Ser Val Cys Ile Ser Glu Asp Leu Leu
130 135 140
Gln Asn Tyr Ser Asp Val Tyr Phe Lys Leu Lys Lys Ser Asp Asp Asp
145 150 155 160
Asn Leu Gln Lys Asp Phe Lys Ser Ala Lys Asp Thr Ile Lys Lys Gln
165 170 175
Ile Ser Glu Tyr Ile Lys Asp Ser Glu Lys Phe Lys Asn Leu Phe Asn
180 185 190
Gln Asn Leu Ile Asp Ala Lys Lys Gly Gln Glu Ser Asp Leu Ile Leu
195 200 205
Trp Leu Lys Gln Ser Lys Asp Asn Gly Ile Glu Leu Phe Lys Ala Asn
210 215 220
Ser Asp Ile Thr Asp Ile Asp Glu Ala Leu Glu Ile Ile Lys Ser Phe
225 230 235 240
Lys Gly Trp Thr Thr Tyr Phe Lys Gly Phe His Glu Asn Arg Lys Asn
245 250 255
Val Tyr Ser Ser Asn Asp Ile Pro Thr Ser Ile Ile Tyr Arg Ile Val
260 265 270
Asp Asp Asn Leu Pro Lys Phe Leu Glu Asn Lys Ala Lys Tyr Glu Ser
275 280 285
Leu Lys Asp Lys Ala Pro Glu Ala Ile Asn Tyr Glu Gln Ile Lys Lys
290 295 300
Asp Leu Ala Glu Glu Leu Thr Phe Asp Ile Asp Tyr Lys Thr Ser Glu
305 310 315 320
Val Asn Gln Arg Val Phe Ser Leu Asp Glu Val Phe Glu Ile Ala Asn
325 330 335
Phe Asn Asn Tyr Leu Asn Gln Ser Gly Ile Thr Lys Phe Asn Thr Ile
340 345 350
Ile Gly Gly Lys Phe Val Asn Gly Glu Asn Thr Lys Arg Lys Gly Ile
355 360 365
Asn Glu Tyr Ile Asn Leu Tyr Ser Gln Gln Ile Asn Asp Lys Thr Leu
370 375 380
Lys Lys Tyr Lys Met Ser Val Leu Phe Lys Gln Ile Leu Ser Asp Thr
385 390 395 400
Glu Ser Lys Ser Phe Val Ile Asp Lys Leu Glu Asp Asp Ser Asp Val
405 410 415
Val Thr Thr Met Gln Ser Phe Tyr Glu Gln Ile Ala Ala Phe Lys Thr
420 425 430
Val Glu Glu Lys Ser Ile Lys Glu Thr Leu Ser Leu Leu Phe Asp Asp
435 440 445
Leu Lys Ala Gln Lys Leu Asp Leu Ser Lys Ile Tyr Phe Lys Asn Asp
450 455 460
Lys Ser Leu Thr Asp Leu Ser Gln Gln Val Phe Asp Asp Tyr Ser Val
465 470 475 480
Ile Gly Thr Ala Val Leu Glu Tyr Ile Thr Gln Gln Ile Ala Pro Lys
485 490 495
Asn Leu Asp Asn Pro Ser Lys Lys Glu Gln Glu Leu Ile Ala Lys Lys
500 505 510
Thr Glu Lys Ala Lys Tyr Leu Ser Leu Glu Thr Ile Lys Leu Ala Leu
515 520 525
Glu Glu Phe Asn Lys His Arg Asp Ile Asp Lys Gln Cys Arg Phe Glu
530 535 540
Glu Ile Leu Ala Asn Phe Ala Ala Ile Pro Met Ile Phe Asp Glu Ile
545 550 555 560
Ala Gln Asn Lys Asp Asn Leu Ala Gln Ile Ser Ile Lys Tyr Gln Asn
565 570 575
Gln Gly Lys Lys Asp Leu Leu Gln Ala Ser Ala Glu Asp Asp Val Lys
580 585 590
Ala Ile Lys Asp Leu Leu Asp Gln Thr Asn Asn Leu Leu His Lys Leu
595 600 605
Lys Ile Phe His Ile Ser Gln Ser Glu Asp Lys Ala Asn Ile Leu Asp
610 615 620
Lys Asp Glu His Phe Tyr Leu Val Phe Glu Glu Cys Tyr Phe Glu Leu
625 630 635 640
Ala Asn Ile Val Pro Leu Tyr Asn Lys Ile Arg Asn Tyr Ile Thr Gln
645 650 655
Lys Pro Tyr Ser Asp Glu Lys Phe Lys Leu Asn Phe Glu Asn Ser Thr
660 665 670
Leu Ala Asn Gly Trp Asp Lys Asn Lys Glu Pro Asp Asn Thr Ala Ile
675 680 685
Leu Phe Ile Lys Asp Asp Lys Tyr Tyr Leu Gly Val Met Asn Lys Lys
690 695 700
Asn Asn Lys Ile Phe Asp Asp Lys Ala Ile Lys Glu Asn Lys Gly Glu
705 710 715 720
Gly Tyr Lys Lys Ile Val Tyr Lys Leu Leu Pro Gly Ala Asn Lys Met
725 730 735
Leu Pro Lys Val Phe Phe Ser Ala Lys Ser Ile Lys Phe Tyr Asn Pro
740 745 750
Ser Glu Asp Ile Leu Arg Ile Arg Asn His Ser Thr His Thr Lys Asn
755 760 765
Gly Ser Pro Gln Lys Gly Tyr Glu Lys Phe Glu Phe Asn Ile Glu Asp
770 775 780
Cys Arg Lys Phe Ile Asp Phe Tyr Lys Gln Ser Ile Ser Lys His Pro
785 790 795 800
Glu Trp Lys Asp Phe Gly Phe Arg Phe Ser Asp Thr Gln Arg Tyr Asn
805 810 815
Ser Ile Asp Glu Phe Tyr Arg Glu Val Glu Asn Gln Gly Tyr Lys Leu
820 825 830
Thr Phe Glu Asn Ile Ser Glu Ser Tyr Ile Asp Ser Val Val Asn Gln
835 840 845
Gly Lys Leu Tyr Leu Phe Gln Ile Tyr Asn Lys Asp Phe Ser Ala Tyr
850 855 860
Ser Lys Gly Arg Pro Asn Leu His Thr Leu Tyr Trp Lys Ala Leu Phe
865 870 875 880
Asp Glu Arg Asn Leu Gln Asp Val Val Tyr Lys Leu Asn Gly Glu Ala
885 890 895
Glu Leu Phe Tyr Arg Lys Gln Ser Ile Pro Lys Lys Ile Thr His Pro
900 905 910
Ala Lys Glu Ala Ile Ala Asn Lys Asn Lys Asp Asn Pro Lys Lys Glu
915 920 925
Ser Val Phe Glu Tyr Asp Leu Ile Lys Asp Lys Arg Phe Thr Glu Asp
930 935 940
Lys Phe Phe Phe His Cys Pro Ile Thr Ile Asn Phe Lys Ser Ser Gly
945 950 955 960
Ala Asn Lys Phe Asn Asp Glu Ile Asn Leu Leu Leu Lys Glu Lys Ala
965 970 975
Asn Asp Val His Ile Leu Ser Ile Asp Arg Gly Glu Arg His Leu Ala
980 985 990
Tyr Tyr Thr Leu Val Asp Gly Lys Gly Asn Ile Ile Lys Gln Asp Thr
995 1000 1005
Phe Asn Ile Ile Gly Asn Asp Arg Met Lys Thr Asn Tyr His Asp
1010 1015 1020
Lys Leu Ala Ala Ile Glu Lys Asp Arg Asp Ser Ala Arg Lys Asp
1025 1030 1035
Trp Lys Lys Ile Asn Asn Ile Lys Glu Met Lys Glu Gly Tyr Leu
1040 1045 1050
Ser Gln Val Val His Glu Ile Ala Lys Leu Val Ile Glu Tyr Asn
1055 1060 1065
Ala Ile Val Val Phe Glu Asp Leu Asn Phe Gly Phe Lys Arg Gly
1070 1075 1080
Arg Phe Lys Val Glu Lys Gln Val Tyr Gln Lys Leu Glu Lys Met
1085 1090 1095
Leu Ile Glu Lys Leu Asn Tyr Leu Val Phe Lys Asp Asn Glu Phe
1100 1105 1110
Asp Lys Thr Gly Gly Val Leu Arg Ala Tyr Gln Leu Thr Ala Pro
1115 1120 1125
Phe Glu Thr Phe Lys Lys Met Gly Lys Gln Thr Gly Ile Ile Tyr
1130 1135 1140
Tyr Val Pro Ala Gly Phe Thr Ser Lys Ile Cys Pro Val Thr Gly
1145 1150 1155
Phe Val Asn Gln Leu Tyr Pro Lys Tyr Glu Ser Val Ser Lys Ser
1160 1165 1170
Gln Glu Phe Phe Ser Lys Phe Asp Lys Ile Cys Tyr Asn Leu Asp
1175 1180 1185
Lys Gly Tyr Phe Glu Phe Ser Phe Asp Tyr Lys Asn Phe Gly Asp
1190 1195 1200
Lys Ala Ala Lys Gly Lys Trp Thr Ile Ala Ser Phe Gly Ser Arg
1205 1210 1215
Leu Ile Asn Phe Arg Asn Ser Asp Lys Asn His Asn Trp Asp Thr
1220 1225 1230
Arg Glu Val Tyr Pro Thr Lys Glu Leu Glu Lys Leu Leu Lys Asp
1235 1240 1245
Tyr Ser Ile Glu Tyr Gly His Gly Glu Cys Ile Lys Ala Ala Ile
1250 1255 1260
Cys Gly Glu Ser Asp Lys Lys Phe Phe Ala Lys Leu Thr Ser Val
1265 1270 1275
Leu Asn Thr Ile Leu Gln Met Arg Asn Ser Lys Thr Gly Thr Glu
1280 1285 1290
Leu Asp Tyr Leu Ile Ser Pro Val Ala Asp Val Asn Gly Asn Phe
1295 1300 1305
Phe Asp Ser Arg Gln Ala Pro Lys Asn Met Pro Gln Asp Ala Asp
1310 1315 1320
Ala Asn Gly Ala Tyr His Ile Gly Leu Lys Gly Leu Met Leu Leu
1325 1330 1335
Gly Arg Ile Lys Asn Asn Gln Glu Gly Lys Lys Leu Asn Leu Val
1340 1345 1350
Ile Lys Asn Glu Glu Tyr Phe Glu Phe Val Gln Asn Arg Asn Asn
1355 1360 1365
Pro Lys Lys Lys Arg Lys Val Ser Gly Gly Ser Ser Gly Gly Ser
1370 1375 1380
Pro Lys Lys Lys Arg Lys Val Tyr Pro Tyr Asp Val Pro Asp Tyr
1385 1390 1395
Ala Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Tyr Pro Tyr Asp Val
1400 1405 1410
Pro Asp Tyr Ala
1415
<210> 66
<211> 1287
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 66
Gly Gly Gly Gly Ser Gly Gly Gly Gly Gly Ser Gly Gly Gly Gly Gly
1 5 10 15
Ser Gly Gly Gly Gly Gly Ser Gly Gly Gly Gly Gly Ser Gly Gly Gly
20 25 30
Gly Gly Ser Gly Met Leu Phe Gln Asp Phe Thr His Leu Tyr Pro Leu
35 40 45
Ser Lys Thr Val Arg Phe Glu Leu Lys Pro Ile Gly Arg Thr Leu Glu
50 55 60
His Ile His Ala Lys Asn Phe Leu Ser Gln Asp Glu Thr Met Ala Asp
65 70 75 80
Met Tyr Gln Lys Val Lys Val Ile Leu Asp Asp Tyr His Arg Asp Phe
85 90 95
Ile Ala Asp Met Met Gly Glu Val Lys Leu Thr Lys Leu Ala Glu Phe
100 105 110
Tyr Asp Val Tyr Leu Lys Phe Arg Lys Asn Pro Lys Asp Asp Gly Leu
115 120 125
Gln Lys Gln Leu Lys Asp Leu Gln Ala Val Leu Arg Lys Glu Ser Val
130 135 140
Lys Pro Ile Gly Ser Gly Gly Lys Tyr Lys Thr Gly Tyr Asp Arg Leu
145 150 155 160
Phe Gly Ala Lys Leu Phe Lys Asp Gly Lys Glu Leu Gly Asp Leu Ala
165 170 175
Lys Phe Val Ile Ala Gln Glu Gly Glu Ser Ser Pro Lys Leu Ala His
180 185 190
Leu Ala His Phe Glu Lys Phe Ser Thr Tyr Phe Thr Gly Phe His Asp
195 200 205
Asn Arg Lys Asn Met Tyr Ser Asp Glu Asp Lys His Thr Ala Ile Ala
210 215 220
Tyr Arg Leu Ile His Glu Asn Leu Pro Arg Phe Ile Asp Asn Leu Gln
225 230 235 240
Ile Leu Thr Thr Ile Lys Gln Lys His Ser Ala Leu Tyr Asp Gln Ile
245 250 255
Ile Asn Glu Leu Thr Ala Ser Gly Leu Asp Val Ser Leu Ala Ser His
260 265 270
Leu Asp Gly Tyr His Lys Leu Leu Thr Gln Glu Gly Ile Thr Ala Tyr
275 280 285
Asn Arg Ile Ile Gly Glu Val Asn Gly Tyr Thr Asn Lys His Asn Gln
290 295 300
Ile Cys His Lys Ser Glu Arg Ile Ala Lys Leu Arg Pro Leu His Lys
305 310 315 320
Gln Ile Leu Ser Asp Gly Met Gly Val Ser Phe Leu Pro Ser Lys Phe
325 330 335
Ala Asp Asp Ser Glu Met Cys Gln Ala Val Asn Glu Phe Tyr Arg His
340 345 350
Tyr Thr Asp Val Phe Ala Lys Val Gln Ser Leu Phe Asp Gly Phe Asp
355 360 365
Asp His Gln Lys Asp Gly Ile Tyr Val Glu His Lys Asn Leu Asn Glu
370 375 380
Leu Ser Lys Gln Ala Phe Gly Asp Phe Ala Leu Leu Gly Arg Val Leu
385 390 395 400
Asp Gly Tyr Tyr Val Asp Val Val Asn Pro Glu Phe Asn Glu Arg Phe
405 410 415
Ala Lys Ala Lys Thr Asp Asn Ala Lys Ala Lys Leu Thr Lys Glu Lys
420 425 430
Asp Lys Phe Ile Lys Gly Val His Ser Leu Ala Ser Leu Glu Gln Ala
435 440 445
Ile Glu His His Thr Ala Arg His Asp Asp Glu Ser Val Gln Ala Gly
450 455 460
Lys Leu Gly Gln Tyr Phe Lys His Gly Leu Ala Gly Val Asp Asn Pro
465 470 475 480
Ile Gln Lys Ile His Asn Asn His Ser Thr Ile Lys Gly Phe Leu Glu
485 490 495
Arg Glu Arg Pro Ala Gly Glu Arg Ala Leu Pro Lys Ile Lys Ser Gly
500 505 510
Lys Asn Pro Glu Met Thr Gln Leu Arg Gln Leu Lys Glu Leu Leu Asp
515 520 525
Asn Ala Leu Asn Val Ala His Phe Ala Lys Leu Leu Thr Thr Lys Thr
530 535 540
Thr Leu Asp Asn Gln Asp Gly Asn Phe Tyr Gly Glu Phe Gly Val Leu
545 550 555 560
Tyr Asp Glu Leu Ala Lys Ile Pro Thr Leu Tyr Asn Lys Val Arg Asp
565 570 575
Tyr Leu Ser Gln Lys Pro Phe Ser Thr Glu Lys Tyr Lys Leu Asn Phe
580 585 590
Gly Asn Pro Thr Leu Leu Asn Gly Trp Asp Leu Asn Lys Glu Lys Asp
595 600 605
Asn Phe Gly Val Ile Leu Gln Lys Asp Gly Cys Tyr Tyr Leu Ala Leu
610 615 620
Leu Asp Lys Ala His Lys Lys Val Phe Asp Asn Ala Pro Asn Thr Gly
625 630 635 640
Lys Asn Val Tyr Gln Lys Met Val Tyr Lys Leu Leu Pro Gly Pro Asn
645 650 655
Lys Met Leu Pro Lys Val Phe Phe Ala Lys Ser Asn Leu Asp Tyr Tyr
660 665 670
Asn Pro Ser Ala Glu Leu Leu Asp Lys Tyr Ala Lys Gly Thr His Lys
675 680 685
Lys Gly Asp Asn Phe Asn Leu Lys Asp Cys His Ala Leu Ile Asp Phe
690 695 700
Phe Lys Ala Gly Ile Asn Lys His Pro Glu Trp Gln His Phe Gly Phe
705 710 715 720
Lys Phe Ser Pro Thr Ser Ser Tyr Arg Asp Leu Ser Asp Phe Tyr Arg
725 730 735
Glu Val Glu Pro Gln Gly Tyr Gln Val Lys Phe Val Asp Ile Asn Ala
740 745 750
Asp Tyr Ile Asp Glu Leu Val Glu Gln Gly Lys Leu Tyr Leu Phe Gln
755 760 765
Ile Tyr Asn Lys Asp Phe Ser Pro Lys Ala His Gly Lys Pro Asn Leu
770 775 780
His Thr Leu Tyr Phe Lys Ala Leu Phe Ser Glu Asp Asn Leu Ala Asp
785 790 795 800
Pro Ile Tyr Lys Leu Asn Gly Glu Ala Gln Ile Phe Tyr Arg Lys Ala
805 810 815
Ser Leu Asp Met Asn Glu Thr Thr Ile His Arg Ala Gly Glu Val Leu
820 825 830
Glu Asn Lys Asn Pro Asp Asn Pro Lys Lys Arg Gln Phe Val Tyr Asp
835 840 845
Ile Ile Lys Asp Lys Arg Tyr Thr Gln Asp Lys Phe Met Leu His Val
850 855 860
Pro Ile Thr Met Asn Phe Gly Val Gln Gly Met Thr Ile Lys Glu Phe
865 870 875 880
Asn Lys Lys Val Asn Gln Ser Ile Gln Gln Tyr Asp Glu Val Asn Val
885 890 895
Ile Gly Ile Asp Arg Gly Glu Arg His Leu Leu Tyr Leu Thr Val Ile
900 905 910
Asn Ser Lys Gly Glu Ile Leu Glu Gln Arg Ser Leu Asn Asp Ile Thr
915 920 925
Thr Ala Ser Ala Asn Gly Thr Gln Val Thr Thr Pro Tyr His Lys Ile
930 935 940
Leu Asp Lys Arg Glu Ile Glu Arg Leu Asn Ala Arg Val Gly Trp Gly
945 950 955 960
Glu Ile Glu Thr Ile Lys Glu Leu Lys Ser Gly Tyr Leu Ser His Val
965 970 975
Val His Gln Ile Asn Gln Leu Met Leu Lys Tyr Asn Ala Ile Val Val
980 985 990
Leu Glu Asp Leu Asn Phe Gly Phe Lys Arg Gly Arg Phe Lys Val Glu
995 1000 1005
Lys Gln Ile Tyr Gln Asn Phe Glu Asn Ala Leu Ile Lys Lys Leu
1010 1015 1020
Asn His Leu Val Leu Lys Asp Lys Ala Asp Asp Glu Ile Gly Ser
1025 1030 1035
Tyr Lys Asn Ala Leu Gln Leu Thr Asn Asn Phe Thr Asp Leu Lys
1040 1045 1050
Ser Ile Gly Lys Gln Thr Gly Phe Leu Phe Tyr Val Pro Ala Trp
1055 1060 1065
Asn Thr Ser Lys Ile Asp Pro Glu Thr Gly Phe Val Asp Leu Leu
1070 1075 1080
Lys Pro Arg Tyr Glu Asn Ile Ala Gln Ser Gln Ala Phe Phe Gly
1085 1090 1095
Lys Phe Asp Lys Ile Cys Tyr Asn Thr Asp Lys Gly Tyr Phe Glu
1100 1105 1110
Phe His Ile Asp Tyr Ala Lys Phe Thr Asp Lys Ala Lys Asn Ser
1115 1120 1125
Arg Gln Lys Trp Ala Ile Cys Ser His Gly Asp Lys Arg Tyr Val
1130 1135 1140
Tyr Asp Lys Thr Ala Asn Gln Asn Lys Gly Ala Ala Lys Gly Ile
1145 1150 1155
Asn Val Asn Asp Glu Leu Lys Ser Leu Phe Ala Arg Tyr His Ile
1160 1165 1170
Asn Asp Lys Gln Pro Asn Leu Val Met Asp Ile Cys Gln Asn Asn
1175 1180 1185
Asp Lys Glu Phe His Lys Ser Leu Met Cys Leu Leu Lys Thr Leu
1190 1195 1200
Leu Ala Leu Arg Tyr Ser Asn Ala Ser Ser Asp Glu Asp Phe Ile
1205 1210 1215
Leu Ser Pro Val Ala Asn Asp Glu Gly Val Phe Phe Asn Ser Ala
1220 1225 1230
Leu Ala Asp Asp Thr Gln Pro Gln Asn Ala Asp Ala Asn Gly Ala
1235 1240 1245
Tyr His Ile Ala Leu Lys Gly Leu Trp Leu Leu Asn Glu Leu Lys
1250 1255 1260
Asn Ser Asp Asp Leu Asn Lys Val Lys Leu Ala Ile Asp Asn Gln
1265 1270 1275
Thr Trp Leu Asn Phe Ala Gln Asn Arg
1280 1285
<210> 67
<211> 1388
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 67
Gly Gly Gly Gly Ser Gly Gly Gly Gly Gly Ser Gly Gly Gly Gly Gly
1 5 10 15
Ser Gly Gly Gly Gly Gly Ser Gly Gly Gly Gly Gly Ser Gly Gly Gly
20 25 30
Gly Gly Ser Gly Met Thr Gln Phe Glu Gly Phe Thr Asn Leu Tyr Gln
35 40 45
Val Ser Lys Thr Leu Arg Phe Glu Leu Ile Pro Gln Gly Lys Thr Leu
50 55 60
Lys His Ile Gln Glu Gln Gly Phe Ile Glu Glu Asp Lys Ala Arg Asn
65 70 75 80
Asp His Tyr Lys Glu Leu Lys Pro Ile Ile Asp Arg Ile Tyr Lys Thr
85 90 95
Tyr Ala Asp Gln Cys Leu Gln Leu Val Gln Leu Asp Trp Glu Asn Leu
100 105 110
Ser Ala Ala Ile Asp Ser Tyr Arg Lys Glu Lys Thr Glu Glu Thr Arg
115 120 125
Asn Ala Leu Ile Glu Glu Gln Ala Thr Tyr Arg Asn Ala Ile His Asp
130 135 140
Tyr Phe Ile Gly Arg Thr Asp Asn Leu Thr Asp Ala Ile Asn Lys Arg
145 150 155 160
His Ala Glu Ile Tyr Lys Gly Leu Phe Lys Ala Glu Leu Phe Asn Gly
165 170 175
Lys Val Leu Lys Gln Leu Gly Thr Val Thr Thr Thr Glu His Glu Asn
180 185 190
Ala Leu Leu Arg Ser Phe Asp Lys Phe Thr Thr Tyr Phe Ser Gly Phe
195 200 205
Tyr Glu Asn Arg Lys Asn Val Phe Ser Ala Glu Asp Ile Ser Thr Ala
210 215 220
Ile Pro His Arg Ile Val Gln Asp Asn Phe Pro Lys Phe Lys Glu Asn
225 230 235 240
Cys His Ile Phe Thr Arg Leu Ile Thr Ala Val Pro Ser Leu Arg Glu
245 250 255
His Phe Glu Asn Val Lys Lys Ala Ile Gly Ile Phe Val Ser Thr Ser
260 265 270
Ile Glu Glu Val Phe Ser Phe Pro Phe Tyr Asn Gln Leu Leu Thr Gln
275 280 285
Thr Gln Ile Asp Leu Tyr Asn Gln Leu Leu Gly Gly Ile Ser Arg Glu
290 295 300
Ala Gly Thr Glu Lys Ile Lys Gly Leu Asn Glu Val Leu Asn Leu Ala
305 310 315 320
Ile Gln Lys Asn Asp Glu Thr Ala His Ile Ile Ala Ser Leu Pro His
325 330 335
Arg Phe Ile Pro Leu Phe Lys Gln Ile Leu Ser Asp Arg Asn Thr Leu
340 345 350
Ser Phe Ile Leu Glu Glu Phe Lys Ser Asp Glu Glu Val Ile Gln Ser
355 360 365
Phe Cys Lys Tyr Lys Thr Leu Leu Arg Asn Glu Asn Val Leu Glu Thr
370 375 380
Ala Glu Ala Leu Phe Asn Glu Leu Asn Ser Ile Asp Leu Thr His Ile
385 390 395 400
Phe Ile Ser His Lys Lys Leu Glu Thr Ile Ser Ser Ala Leu Cys Asp
405 410 415
His Trp Asp Thr Leu Arg Asn Ala Leu Tyr Glu Arg Arg Ile Ser Glu
420 425 430
Leu Thr Gly Lys Ile Thr Lys Ser Ala Lys Glu Lys Val Gln Arg Ser
435 440 445
Leu Lys His Glu Asp Ile Asn Leu Gln Glu Ile Ile Ser Ala Ala Gly
450 455 460
Lys Glu Leu Ser Glu Ala Phe Lys Gln Lys Thr Ser Glu Ile Leu Ser
465 470 475 480
His Ala His Ala Ala Leu Asp Gln Pro Leu Pro Thr Thr Leu Lys Lys
485 490 495
Gln Glu Glu Lys Glu Ile Leu Lys Ser Gln Leu Asp Ser Leu Leu Gly
500 505 510
Leu Tyr His Leu Leu Asp Trp Phe Ala Val Asp Glu Ser Asn Glu Val
515 520 525
Asp Pro Glu Phe Ser Ala Arg Leu Thr Gly Ile Lys Leu Glu Met Glu
530 535 540
Pro Ser Leu Ser Phe Tyr Asn Lys Ala Arg Asn Tyr Ala Thr Lys Lys
545 550 555 560
Pro Tyr Ser Val Glu Lys Phe Lys Leu Asn Phe Gln Met Pro Thr Leu
565 570 575
Ala Ser Gly Trp Asp Val Asn Lys Glu Lys Asn Asn Gly Ala Ile Leu
580 585 590
Phe Val Lys Asn Gly Leu Tyr Tyr Leu Gly Ile Met Pro Lys Gln Lys
595 600 605
Gly Arg Tyr Lys Ala Leu Ser Phe Glu Pro Thr Glu Lys Thr Ser Glu
610 615 620
Gly Phe Asp Lys Met Tyr Tyr Asp Tyr Phe Pro Asp Ala Ala Lys Met
625 630 635 640
Ile Pro Lys Cys Ser Thr Gln Leu Lys Ala Val Thr Ala His Phe Gln
645 650 655
Thr His Thr Thr Pro Ile Leu Leu Ser Asn Asn Phe Ile Glu Pro Leu
660 665 670
Glu Ile Thr Lys Glu Ile Tyr Asp Leu Asn Asn Pro Glu Lys Glu Pro
675 680 685
Lys Lys Phe Gln Thr Ala Tyr Ala Lys Lys Thr Gly Asp Gln Lys Gly
690 695 700
Tyr Arg Glu Ala Leu Cys Lys Trp Ile Asp Phe Thr Arg Asp Phe Leu
705 710 715 720
Ser Lys Tyr Thr Lys Thr Thr Ser Ile Asp Leu Ser Ser Leu Arg Pro
725 730 735
Ser Ser Gln Tyr Lys Asp Leu Gly Glu Tyr Tyr Ala Glu Leu Asn Pro
740 745 750
Leu Leu Tyr His Ile Ser Phe Gln Arg Ile Ala Glu Lys Glu Ile Met
755 760 765
Asp Ala Val Glu Thr Gly Lys Leu Tyr Leu Phe Gln Ile Tyr Asn Lys
770 775 780
Asp Phe Ala Lys Gly His His Gly Lys Pro Asn Leu His Thr Leu Tyr
785 790 795 800
Trp Thr Gly Leu Phe Ser Pro Glu Asn Leu Ala Lys Thr Ser Ile Lys
805 810 815
Leu Asn Gly Gln Ala Glu Leu Phe Tyr Arg Pro Lys Ser Arg Met Lys
820 825 830
Arg Met Ala His Arg Leu Gly Glu Lys Met Leu Asn Lys Lys Leu Lys
835 840 845
Asp Gln Lys Thr Pro Ile Pro Asp Thr Leu Tyr Gln Glu Leu Tyr Asp
850 855 860
Tyr Val Asn His Arg Leu Ser His Asp Leu Ser Asp Glu Ala Arg Ala
865 870 875 880
Leu Leu Pro Asn Val Ile Thr Lys Glu Val Ser His Glu Ile Ile Lys
885 890 895
Asp Arg Arg Phe Thr Ser Asp Lys Phe Phe Phe His Val Pro Ile Thr
900 905 910
Leu Asn Tyr Gln Ala Ala Asn Ser Pro Ser Lys Phe Asn Gln Arg Val
915 920 925
Asn Ala Tyr Leu Lys Glu His Pro Glu Thr Pro Ile Ile Gly Ile Asp
930 935 940
Arg Gly Glu Arg Asn Leu Ile Tyr Ile Thr Val Ile Asp Ser Thr Gly
945 950 955 960
Lys Ile Leu Glu Gln Arg Ser Leu Asn Thr Ile Gln Gln Phe Asp Tyr
965 970 975
Gln Lys Lys Leu Asp Asn Arg Glu Lys Glu Arg Val Ala Ala Arg Gln
980 985 990
Ala Trp Ser Val Val Gly Thr Ile Lys Asp Leu Lys Gln Gly Tyr Leu
995 1000 1005
Ser Gln Val Ile His Glu Ile Val Asp Leu Met Ile His Tyr Gln
1010 1015 1020
Ala Val Val Val Leu Glu Asn Leu Asn Phe Gly Phe Lys Ser Lys
1025 1030 1035
Arg Thr Gly Ile Ala Glu Lys Ala Val Tyr Gln Gln Phe Glu Lys
1040 1045 1050
Met Leu Ile Asp Lys Leu Asn Cys Leu Val Leu Lys Asp Tyr Pro
1055 1060 1065
Ala Glu Lys Val Gly Gly Val Leu Asn Pro Tyr Gln Leu Thr Asp
1070 1075 1080
Gln Phe Thr Ser Phe Ala Lys Met Gly Thr Gln Ser Gly Phe Leu
1085 1090 1095
Phe Tyr Val Pro Ala Pro Tyr Thr Ser Lys Ile Asp Pro Leu Thr
1100 1105 1110
Gly Phe Val Asp Pro Phe Val Trp Lys Thr Ile Lys Asn His Glu
1115 1120 1125
Ser Arg Lys His Phe Leu Glu Gly Phe Asp Phe Leu His Tyr Asp
1130 1135 1140
Val Lys Thr Gly Asp Phe Ile Leu His Phe Lys Met Asn Arg Asn
1145 1150 1155
Leu Ser Phe Gln Arg Gly Leu Pro Gly Phe Met Pro Ala Trp Asp
1160 1165 1170
Ile Val Phe Glu Lys Asn Glu Thr Gln Phe Asp Ala Lys Gly Thr
1175 1180 1185
Pro Phe Ile Ala Gly Lys Arg Ile Val Pro Val Ile Glu Asn His
1190 1195 1200
Arg Phe Thr Gly Arg Tyr Arg Asp Leu Tyr Pro Ala Asn Glu Leu
1205 1210 1215
Ile Ala Leu Leu Glu Glu Lys Gly Ile Val Phe Arg Asp Gly Ser
1220 1225 1230
Asn Ile Leu Pro Lys Leu Leu Glu Asn Asp Asp Ser His Ala Ile
1235 1240 1245
Asp Thr Met Val Ala Leu Ile Arg Ser Val Leu Gln Met Arg Asn
1250 1255 1260
Ser Asn Ala Ala Thr Gly Glu Asp Tyr Ile Asn Ser Pro Val Arg
1265 1270 1275
Asp Leu Asn Gly Val Cys Phe Asp Ser Arg Phe Gln Asn Pro Glu
1280 1285 1290
Trp Pro Met Asp Ala Asp Ala Asn Gly Ala Tyr His Ile Ala Leu
1295 1300 1305
Lys Gly Gln Leu Leu Leu Asn His Leu Lys Glu Ser Lys Asp Leu
1310 1315 1320
Lys Leu Gln Asn Gly Ile Ser Asn Gln Asp Trp Leu Ala Tyr Ile
1325 1330 1335
Gln Glu Leu Arg Asn Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly
1340 1345 1350
Gln Ala Lys Lys Lys Lys Gly Ser Tyr Pro Tyr Asp Val Pro Asp
1355 1360 1365
Tyr Ala Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Tyr Pro Tyr Asp
1370 1375 1380
Val Pro Asp Tyr Ala
1385
<210> 68
<211> 1352
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 68
Gly Gly Gly Gly Ser Gly Gly Gly Gly Gly Ser Gly Gly Gly Gly Gly
1 5 10 15
Ser Gly Gly Gly Gly Gly Ser Gly Gly Gly Gly Gly Ser Gly Gly Gly
20 25 30
Gly Gly Ser Gly Met Thr Gln Phe Glu Gly Phe Thr Asn Leu Tyr Gln
35 40 45
Val Ser Lys Thr Leu Arg Phe Glu Leu Ile Pro Gln Gly Lys Thr Leu
50 55 60
Lys His Ile Gln Glu Gln Gly Phe Ile Glu Glu Asp Lys Ala Arg Asn
65 70 75 80
Asp His Tyr Lys Glu Leu Lys Pro Ile Ile Asp Arg Ile Tyr Lys Thr
85 90 95
Tyr Ala Asp Gln Cys Leu Gln Leu Val Gln Leu Asp Trp Glu Asn Leu
100 105 110
Ser Ala Ala Ile Asp Ser Tyr Arg Lys Glu Lys Thr Glu Glu Thr Arg
115 120 125
Asn Ala Leu Ile Glu Glu Gln Ala Thr Tyr Arg Asn Ala Ile His Asp
130 135 140
Tyr Phe Ile Gly Arg Thr Asp Asn Leu Thr Asp Ala Ile Asn Lys Arg
145 150 155 160
His Ala Glu Ile Tyr Lys Gly Leu Phe Lys Ala Glu Leu Phe Asn Gly
165 170 175
Lys Val Leu Lys Gln Leu Gly Thr Val Thr Thr Thr Glu His Glu Asn
180 185 190
Ala Leu Leu Arg Ser Phe Asp Lys Phe Thr Thr Tyr Phe Ser Gly Phe
195 200 205
Tyr Glu Asn Arg Lys Asn Val Phe Ser Ala Glu Asp Ile Ser Thr Ala
210 215 220
Ile Pro His Arg Ile Val Gln Asp Asn Phe Pro Lys Phe Lys Glu Asn
225 230 235 240
Cys His Ile Phe Thr Arg Leu Ile Thr Ala Val Pro Ser Leu Arg Glu
245 250 255
His Phe Glu Asn Val Lys Lys Ala Ile Gly Ile Phe Val Ser Thr Ser
260 265 270
Ile Glu Glu Val Phe Ser Phe Pro Phe Tyr Asn Gln Leu Leu Thr Gln
275 280 285
Thr Gln Ile Asp Leu Tyr Asn Gln Leu Leu Gly Gly Ile Ser Arg Glu
290 295 300
Ala Gly Thr Glu Lys Ile Lys Gly Leu Asn Glu Val Leu Asn Leu Ala
305 310 315 320
Ile Gln Lys Asn Asp Glu Thr Ala His Ile Ile Ala Ser Leu Pro His
325 330 335
Arg Phe Ile Pro Leu Phe Lys Gln Ile Leu Ser Asp Arg Asn Thr Leu
340 345 350
Ser Phe Ile Leu Glu Glu Phe Lys Ser Asp Glu Glu Val Ile Gln Ser
355 360 365
Phe Cys Lys Tyr Lys Thr Leu Leu Arg Asn Glu Asn Val Leu Glu Thr
370 375 380
Ala Glu Ala Leu Phe Asn Glu Leu Asn Ser Ile Asp Leu Thr His Ile
385 390 395 400
Phe Ile Ser His Lys Lys Leu Glu Thr Ile Ser Ser Ala Leu Cys Asp
405 410 415
His Trp Asp Thr Leu Arg Asn Ala Leu Tyr Glu Arg Arg Ile Ser Glu
420 425 430
Leu Thr Gly Lys Ile Thr Lys Ser Ala Lys Glu Lys Val Gln Arg Ser
435 440 445
Leu Lys His Glu Asp Ile Asn Leu Gln Glu Ile Ile Ser Ala Ala Gly
450 455 460
Lys Glu Leu Ser Glu Ala Phe Lys Gln Lys Thr Ser Glu Ile Leu Ser
465 470 475 480
His Ala His Ala Ala Leu Asp Gln Pro Leu Pro Thr Thr Leu Lys Lys
485 490 495
Gln Glu Glu Lys Glu Ile Leu Lys Ser Gln Leu Asp Ser Leu Leu Gly
500 505 510
Leu Tyr His Leu Leu Asp Trp Phe Ala Val Asp Glu Ser Asn Glu Val
515 520 525
Asp Pro Glu Phe Ser Ala Arg Leu Thr Gly Ile Lys Leu Glu Met Glu
530 535 540
Pro Ser Leu Ser Phe Tyr Asn Lys Ala Arg Asn Tyr Ala Thr Lys Lys
545 550 555 560
Pro Tyr Ser Val Glu Lys Phe Lys Leu Asn Phe Gln Met Pro Thr Leu
565 570 575
Ala Ser Gly Trp Asp Val Asn Lys Glu Lys Asn Asn Gly Ala Ile Leu
580 585 590
Phe Val Lys Asn Gly Leu Tyr Tyr Leu Gly Ile Met Pro Lys Gln Lys
595 600 605
Gly Arg Tyr Lys Ala Leu Ser Phe Glu Pro Thr Glu Lys Thr Ser Glu
610 615 620
Gly Phe Asp Lys Met Tyr Tyr Asp Tyr Phe Pro Asp Ala Ala Lys Met
625 630 635 640
Ile Pro Lys Cys Ser Thr Gln Leu Lys Ala Val Thr Ala His Phe Gln
645 650 655
Thr His Thr Thr Pro Ile Leu Leu Ser Asn Asn Phe Ile Glu Pro Leu
660 665 670
Glu Ile Thr Lys Glu Ile Tyr Asp Leu Asn Asn Pro Glu Lys Glu Pro
675 680 685
Lys Lys Phe Gln Thr Ala Tyr Ala Lys Lys Thr Gly Asp Gln Lys Gly
690 695 700
Tyr Arg Glu Ala Leu Cys Lys Trp Ile Asp Phe Thr Arg Asp Phe Leu
705 710 715 720
Ser Lys Tyr Thr Lys Thr Thr Ser Ile Asp Leu Ser Ser Leu Arg Pro
725 730 735
Ser Ser Gln Tyr Lys Asp Leu Gly Glu Tyr Tyr Ala Glu Leu Asn Pro
740 745 750
Leu Leu Tyr His Ile Ser Phe Gln Arg Ile Ala Glu Lys Glu Ile Met
755 760 765
Asp Ala Val Glu Thr Gly Lys Leu Tyr Leu Phe Gln Ile Tyr Asn Lys
770 775 780
Asp Phe Ala Lys Gly His His Gly Lys Pro Asn Leu His Thr Leu Tyr
785 790 795 800
Trp Thr Gly Leu Phe Ser Pro Glu Asn Leu Ala Lys Thr Ser Ile Lys
805 810 815
Leu Asn Gly Gln Ala Glu Leu Phe Tyr Arg Pro Lys Ser Arg Met Lys
820 825 830
Arg Met Ala His Arg Leu Gly Glu Lys Met Leu Asn Lys Lys Leu Lys
835 840 845
Asp Gln Lys Thr Pro Ile Pro Asp Thr Leu Tyr Gln Glu Leu Tyr Asp
850 855 860
Tyr Val Asn His Arg Leu Ser His Asp Leu Ser Asp Glu Ala Arg Ala
865 870 875 880
Leu Leu Pro Asn Val Ile Thr Lys Glu Val Ser His Glu Ile Ile Lys
885 890 895
Asp Arg Arg Phe Thr Ser Asp Lys Phe Phe Phe His Val Pro Ile Thr
900 905 910
Leu Asn Tyr Gln Ala Ala Asn Ser Pro Ser Lys Phe Asn Gln Arg Val
915 920 925
Asn Ala Tyr Leu Lys Glu His Pro Glu Thr Pro Ile Ile Gly Ile Ala
930 935 940
Arg Gly Glu Arg Asn Leu Ile Tyr Ile Thr Val Ile Asp Ser Thr Gly
945 950 955 960
Lys Ile Leu Glu Gln Arg Ser Leu Asn Thr Ile Gln Gln Phe Asp Tyr
965 970 975
Gln Lys Lys Leu Asp Asn Arg Glu Lys Glu Arg Val Ala Ala Arg Gln
980 985 990
Ala Trp Ser Val Val Gly Thr Ile Lys Asp Leu Lys Gln Gly Tyr Leu
995 1000 1005
Ser Gln Val Ile His Glu Ile Val Asp Leu Met Ile His Tyr Gln
1010 1015 1020
Ala Val Val Val Leu Ala Asn Leu Asn Phe Gly Phe Lys Ser Lys
1025 1030 1035
Arg Thr Gly Ile Ala Glu Lys Ala Val Tyr Gln Gln Phe Glu Lys
1040 1045 1050
Met Leu Ile Asp Lys Leu Asn Cys Leu Val Leu Lys Asp Tyr Pro
1055 1060 1065
Ala Glu Lys Val Gly Gly Val Leu Asn Pro Tyr Gln Leu Thr Asp
1070 1075 1080
Gln Phe Thr Ser Phe Ala Lys Met Gly Thr Gln Ser Gly Phe Leu
1085 1090 1095
Phe Tyr Val Pro Ala Pro Tyr Thr Ser Lys Ile Asp Pro Leu Thr
1100 1105 1110
Gly Phe Val Asp Pro Phe Val Trp Lys Thr Ile Lys Asn His Glu
1115 1120 1125
Ser Arg Lys His Phe Leu Glu Gly Phe Asp Phe Leu His Tyr Asp
1130 1135 1140
Val Lys Thr Gly Asp Phe Ile Leu His Phe Lys Met Asn Arg Asn
1145 1150 1155
Leu Ser Phe Gln Arg Gly Leu Pro Gly Phe Met Pro Ala Trp Asp
1160 1165 1170
Ile Val Phe Glu Lys Asn Glu Thr Gln Phe Asp Ala Lys Gly Thr
1175 1180 1185
Pro Phe Ile Ala Gly Lys Arg Ile Val Pro Val Ile Glu Asn His
1190 1195 1200
Arg Phe Thr Gly Arg Tyr Arg Asp Leu Tyr Pro Ala Asn Glu Leu
1205 1210 1215
Ile Ala Leu Leu Glu Glu Lys Gly Ile Val Phe Arg Asp Gly Ser
1220 1225 1230
Asn Ile Leu Pro Lys Leu Leu Glu Asn Asp Asp Ser His Ala Ile
1235 1240 1245
Asp Thr Met Val Ala Leu Ile Arg Ser Val Leu Gln Met Arg Asn
1250 1255 1260
Ser Asn Ala Ala Thr Gly Glu Ala Tyr Ile Asn Ser Pro Val Arg
1265 1270 1275
Asp Leu Asn Gly Val Cys Phe Asp Ser Arg Phe Gln Asn Pro Glu
1280 1285 1290
Trp Pro Met Asp Ala Asp Ala Asn Gly Ala Tyr His Ile Ala Leu
1295 1300 1305
Lys Gly Gln Leu Leu Leu Asn His Leu Lys Glu Ser Lys Asp Leu
1310 1315 1320
Lys Leu Gln Asn Gly Ile Ser Asn Gln Asp Trp Leu Ala Tyr Ile
1325 1330 1335
Gln Glu Leu Arg Asn Gly Ser Pro Lys Lys Lys Arg Lys Val
1340 1345 1350
<210> 69
<211> 1423
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 69
Gly Gly Gly Gly Ser Gly Gly Gly Gly Gly Ser Gly Gly Gly Gly Gly
1 5 10 15
Ser Gly Gly Gly Gly Gly Ser Gly Gly Gly Gly Gly Ser Gly Gly Gly
20 25 30
Gly Gly Ser Gly Met Pro Lys Lys Lys Arg Lys Val Gly Gly Gly Gly
35 40 45
Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser
50 55 60
Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Met Ser Ile Tyr Gln Glu
65 70 75 80
Phe Val Asn Lys Tyr Ser Leu Ser Lys Thr Leu Arg Phe Glu Leu Ile
85 90 95
Pro Gln Gly Lys Thr Leu Glu Asn Ile Lys Ala Arg Gly Leu Ile Leu
100 105 110
Asp Asp Glu Lys Arg Ala Lys Asp Tyr Lys Lys Ala Lys Gln Ile Ile
115 120 125
Asp Lys Tyr His Gln Phe Phe Ile Glu Glu Ile Leu Ser Ser Val Cys
130 135 140
Ile Ser Glu Asp Leu Leu Gln Asn Tyr Ser Asp Val Tyr Phe Lys Leu
145 150 155 160
Lys Lys Ser Asp Asp Asp Asn Leu Gln Lys Asp Phe Lys Ser Ala Lys
165 170 175
Asp Thr Ile Lys Lys Gln Ile Ser Glu Tyr Ile Lys Asp Ser Glu Lys
180 185 190
Phe Lys Asn Leu Phe Asn Gln Asn Leu Ile Asp Ala Lys Lys Gly Gln
195 200 205
Glu Ser Asp Leu Ile Leu Trp Leu Lys Gln Ser Lys Asp Asn Gly Ile
210 215 220
Glu Leu Phe Lys Ala Asn Ser Asp Ile Thr Asp Ile Asp Glu Ala Leu
225 230 235 240
Glu Ile Ile Lys Ser Phe Lys Gly Trp Thr Thr Tyr Phe Lys Gly Phe
245 250 255
His Glu Asn Arg Lys Asn Val Tyr Ser Ser Asn Asp Ile Pro Thr Ser
260 265 270
Ile Ile Tyr Arg Ile Val Asp Asp Asn Leu Pro Lys Phe Leu Glu Asn
275 280 285
Lys Ala Lys Tyr Glu Ser Leu Lys Asp Lys Ala Pro Glu Ala Ile Asn
290 295 300
Tyr Glu Gln Ile Lys Lys Asp Leu Ala Glu Glu Leu Thr Phe Asp Ile
305 310 315 320
Asp Tyr Lys Thr Ser Glu Val Asn Gln Arg Val Phe Ser Leu Asp Glu
325 330 335
Val Phe Glu Ile Ala Asn Phe Asn Asn Tyr Leu Asn Gln Ser Gly Ile
340 345 350
Thr Lys Phe Asn Thr Ile Ile Gly Gly Lys Phe Val Asn Gly Glu Asn
355 360 365
Thr Lys Arg Lys Gly Ile Asn Glu Tyr Ile Asn Leu Tyr Ser Gln Gln
370 375 380
Ile Asn Asp Lys Thr Leu Lys Lys Tyr Lys Met Ser Val Leu Phe Lys
385 390 395 400
Gln Ile Leu Ser Asp Thr Glu Ser Lys Ser Phe Val Ile Asp Lys Leu
405 410 415
Glu Asp Asp Ser Asp Val Val Thr Thr Met Gln Ser Phe Tyr Glu Gln
420 425 430
Ile Ala Ala Phe Lys Thr Val Glu Glu Lys Ser Ile Lys Glu Thr Leu
435 440 445
Ser Leu Leu Phe Asp Asp Leu Lys Ala Gln Lys Leu Asp Leu Ser Lys
450 455 460
Ile Tyr Phe Lys Asn Asp Lys Ser Leu Thr Asp Leu Ser Gln Gln Val
465 470 475 480
Phe Asp Asp Tyr Ser Val Ile Gly Thr Ala Val Leu Glu Tyr Ile Thr
485 490 495
Gln Gln Ile Ala Pro Lys Asn Leu Asp Asn Pro Ser Lys Lys Glu Gln
500 505 510
Glu Leu Ile Ala Lys Lys Thr Glu Lys Ala Lys Tyr Leu Ser Leu Glu
515 520 525
Thr Ile Lys Leu Ala Leu Glu Glu Phe Asn Lys His Arg Asp Ile Asp
530 535 540
Lys Gln Cys Arg Phe Glu Glu Ile Leu Ala Asn Phe Ala Ala Ile Pro
545 550 555 560
Met Ile Phe Asp Glu Ile Ala Gln Asn Lys Asp Asn Leu Ala Gln Ile
565 570 575
Ser Ile Lys Tyr Gln Asn Gln Gly Lys Lys Asp Leu Leu Gln Ala Ser
580 585 590
Ala Glu Asp Asp Val Lys Ala Ile Lys Asp Leu Leu Asp Gln Thr Asn
595 600 605
Asn Leu Leu His Lys Leu Lys Ile Phe His Ile Ser Gln Ser Glu Asp
610 615 620
Lys Ala Asn Ile Leu Asp Lys Asp Glu His Phe Tyr Leu Val Phe Glu
625 630 635 640
Glu Cys Tyr Phe Glu Leu Ala Asn Ile Val Pro Leu Tyr Asn Lys Ile
645 650 655
Arg Asn Tyr Ile Thr Gln Lys Pro Tyr Ser Asp Glu Lys Phe Lys Leu
660 665 670
Asn Phe Glu Asn Ser Thr Leu Ala Asn Gly Trp Asp Lys Asn Lys Glu
675 680 685
Pro Asp Asn Thr Ala Ile Leu Phe Ile Lys Asp Asp Lys Tyr Tyr Leu
690 695 700
Gly Val Met Asn Lys Lys Asn Asn Lys Ile Phe Asp Asp Lys Ala Ile
705 710 715 720
Lys Glu Asn Lys Gly Glu Gly Tyr Lys Lys Ile Val Tyr Lys Leu Leu
725 730 735
Pro Gly Ala Asn Lys Met Leu Pro Lys Val Phe Phe Ser Ala Lys Ser
740 745 750
Ile Lys Phe Tyr Asn Pro Ser Glu Asp Ile Leu Arg Ile Arg Asn His
755 760 765
Ser Thr His Thr Lys Asn Gly Ser Pro Gln Lys Gly Tyr Glu Lys Phe
770 775 780
Glu Phe Asn Ile Glu Asp Cys Arg Lys Phe Ile Asp Phe Tyr Lys Gln
785 790 795 800
Ser Ile Ser Lys His Pro Glu Trp Lys Asp Phe Gly Phe Arg Phe Ser
805 810 815
Asp Thr Gln Arg Tyr Asn Ser Ile Asp Glu Phe Tyr Arg Glu Val Glu
820 825 830
Asn Gln Gly Tyr Lys Leu Thr Phe Glu Asn Ile Ser Glu Ser Tyr Ile
835 840 845
Asp Ser Val Val Asn Gln Gly Lys Leu Tyr Leu Phe Gln Ile Tyr Asn
850 855 860
Lys Asp Phe Ser Ala Tyr Ser Lys Gly Arg Pro Asn Leu His Thr Leu
865 870 875 880
Tyr Trp Lys Ala Leu Phe Asp Glu Arg Asn Leu Gln Asp Val Val Tyr
885 890 895
Lys Leu Asn Gly Glu Ala Glu Leu Phe Tyr Arg Lys Gln Ser Ile Pro
900 905 910
Lys Lys Ile Thr His Pro Ala Lys Glu Ala Ile Ala Asn Lys Asn Lys
915 920 925
Asp Asn Pro Lys Lys Glu Ser Val Phe Glu Tyr Asp Leu Ile Lys Asp
930 935 940
Lys Arg Phe Thr Glu Asp Lys Phe Phe Phe His Cys Pro Ile Thr Ile
945 950 955 960
Asn Phe Lys Ser Ser Gly Ala Asn Lys Phe Asn Asp Glu Ile Asn Leu
965 970 975
Leu Leu Lys Glu Lys Ala Asn Asp Val His Ile Leu Ser Ile Asp Arg
980 985 990
Gly Glu Arg His Leu Ala Tyr Tyr Thr Leu Val Asp Gly Lys Gly Asn
995 1000 1005
Ile Ile Lys Gln Asp Thr Phe Asn Ile Ile Gly Asn Asp Arg Met
1010 1015 1020
Lys Thr Asn Tyr His Asp Lys Leu Ala Ala Ile Glu Lys Asp Arg
1025 1030 1035
Asp Ser Ala Arg Lys Asp Trp Lys Lys Ile Asn Asn Ile Lys Glu
1040 1045 1050
Met Lys Glu Gly Tyr Leu Ser Gln Val Val His Glu Ile Ala Lys
1055 1060 1065
Leu Val Ile Glu Tyr Asn Ala Ile Val Val Phe Glu Asp Leu Asn
1070 1075 1080
Phe Gly Phe Lys Arg Gly Arg Phe Lys Val Glu Lys Gln Val Tyr
1085 1090 1095
Gln Lys Leu Glu Lys Met Leu Ile Glu Lys Leu Asn Tyr Leu Val
1100 1105 1110
Phe Lys Asp Asn Glu Phe Asp Lys Thr Gly Gly Val Leu Arg Ala
1115 1120 1125
Tyr Gln Leu Thr Ala Pro Phe Glu Thr Phe Lys Lys Met Gly Lys
1130 1135 1140
Gln Thr Gly Ile Ile Tyr Tyr Val Pro Ala Gly Phe Thr Ser Lys
1145 1150 1155
Ile Cys Pro Val Thr Gly Phe Val Asn Gln Leu Tyr Pro Lys Tyr
1160 1165 1170
Glu Ser Val Ser Lys Ser Gln Glu Phe Phe Ser Lys Phe Asp Lys
1175 1180 1185
Ile Cys Tyr Asn Leu Asp Lys Gly Tyr Phe Glu Phe Ser Phe Asp
1190 1195 1200
Tyr Lys Asn Phe Gly Asp Lys Ala Ala Lys Gly Lys Trp Thr Ile
1205 1210 1215
Ala Ser Phe Gly Ser Arg Leu Ile Asn Phe Arg Asn Ser Asp Lys
1220 1225 1230
Asn His Asn Trp Asp Thr Arg Glu Val Tyr Pro Thr Lys Glu Leu
1235 1240 1245
Glu Lys Leu Leu Lys Asp Tyr Ser Ile Glu Tyr Gly His Gly Glu
1250 1255 1260
Cys Ile Lys Ala Ala Ile Cys Gly Glu Ser Asp Lys Lys Phe Phe
1265 1270 1275
Ala Lys Leu Thr Ser Val Leu Asn Thr Ile Leu Gln Met Arg Asn
1280 1285 1290
Ser Lys Thr Gly Thr Glu Leu Asp Tyr Leu Ile Ser Pro Val Ala
1295 1300 1305
Asp Val Asn Gly Asn Phe Phe Asp Ser Arg Gln Ala Pro Lys Asn
1310 1315 1320
Met Pro Gln Asp Ala Asp Ala Asn Gly Ala Tyr His Ile Gly Leu
1325 1330 1335
Lys Gly Leu Met Leu Leu Gly Arg Ile Lys Asn Asn Gln Glu Gly
1340 1345 1350
Lys Lys Leu Asn Leu Val Ile Lys Asn Glu Glu Tyr Phe Glu Phe
1355 1360 1365
Val Gln Asn Arg Asn Asn Pro Lys Lys Lys Arg Lys Val Ser Gly
1370 1375 1380
Gly Ser Ser Gly Gly Ser Pro Lys Lys Lys Arg Lys Val Tyr Pro
1385 1390 1395
Tyr Asp Val Pro Asp Tyr Ala Tyr Pro Tyr Asp Val Pro Asp Tyr
1400 1405 1410
Ala Tyr Pro Tyr Asp Val Pro Asp Tyr Ala
1415 1420
<210> 70
<211> 16
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 70
Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser
1 5 10 15
<210> 71
<211> 23
<212> DNA
<213> Maize (Zea mays)
<400> 71
attgatagag cacatgagct tgg 23
<210> 72
<211> 23
<212> DNA
<213> Maize (Zea mays)
<400> 72
gtcacagatc acaaacttca aat 23
<210> 73
<211> 23
<212> DNA
<213> Soybean (Glycine max)
<400> 73
gaacccttga gagaggcttc ttc 23
<210> 74
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> crRNA
<400> 74
taatttctac taagtgtaga t 21
<210> 75
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> crRNA
<400> 75
taatttctac tgttgtagat 20
<210> 76
<211> 5358
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 76
atgtccgagg tggagttcag ccacgagtac tggatgaggc acgctctcac cctggctaag 60
agggcgtggg acgagaggga ggtgccggtg ggcgccgtgc tcgtccacaa caaccgcgtg 120
atcggcgagg gctggaacag gcccatcggc aggcacgacc caaccgctca cgccgagatc 180
atggctctca ggcagggcgg cctggtcatg cagaactaca ggctgatcga cgcgaccctc 240
tacgtgaccc tcgagccctg cgtcatgtgc gcgggcgcca tgatccactc caggatcggc 300
agggtggtct tcggcgctag ggacgccaag acgggcgctg cgggcagcct catggacgtg 360
ctgcaccacc ccggcatgaa ccaccgcgtc gagatcaccg agggcatcct cgcggacgag 420
tgcgctgcgc tcctgtccga cttcttcagg atgcgcaggc aggagatcaa ggcccagaag 480
aaggcgcagt ccagcaccga ctccggcggc tccagcggcg gctccagcgg cagcgagacc 540
ccgggcacgt ccgagagcgc gacgcccgag agcagcggcg gctccagcgg cggctcctcg 600
gaggtcgagt tcagccatga gtactggatg aggcatgccc tgactctcgc taagagggcg 660
cgggatgagc gcgaggtgcc ggtgggggcc gtgctcgtcc tgaacaaccg cgtgatcggg 720
gagggctgga accgggctat cggcctccac gacccaacgg cccatgccga gatcatggcc 780
ctgaggcagg gcggcctggt catgcaaaac tacaggctca tcgacgccac cctctacgtg 840
accttcgagc catgcgtgat gtgcgcgggg gccatgatcc actcgaggat tgggagggtg 900
gtcttcggcg tgcgcaacgc taagacgggg gccgccggca gcctcatgga cgtcctgcac 960
tacccgggca tgaaccacag ggtggagatt accgagggca tcctggccga tgagtgcgcc 1020
gcgctcctgt gctacttctt ccgcatgccc aggcaggtct tcaacgcgca gaagaaggcc 1080
cagagctcca ctgattccgg cggctccagc ggcggctcca gtggcagcga gactcctggc 1140
acgtccgaga gcgccacgcc cgagtctagc ggcggctcca gcggcggctc cgacaagaag 1200
tacagcatcg gcctggcaat cggcaccaac agcgtgggct gggccgtgat caccgacgag 1260
tacaaggtgc cgagcaagaa gttcaaggtg ctgggcaaca ccgacaggca cagcatcaag 1320
aagaacctga tcggcgccct gctgttcgac agcggcgaga ccgccgaggc caccaggctg 1380
aagaggaccg ccaggaggag gtacaccagg aggaagaaca ggatctgcta cctgcaggag 1440
atcttcagca acgagatggc caaggtggac gacagcttct tccacaggct ggaggagagc 1500
ttcctggtgg aggaggacaa gaagcacgag aggcacccga tcttcggcaa catcgtggac 1560
gaggtggcct accacgagaa gtacccgacc atctaccacc tgaggaagaa gctggtggac 1620
agcaccgaca aggccgacct gaggctgatc tacctggccc tggcccacat gatcaagttc 1680
aggggccact tcctgatcga gggcgacctg aacccggaca acagcgacgt ggacaagctg 1740
ttcatccagc tggtgcagac ctacaaccag ctgttcgagg agaacccgat caacgccagc 1800
ggcgtggacg ccaaggccat cctgagcgcc aggctgagca agagcaggag gctggagaac 1860
ctgatcgccc agctgccggg cgagaagaag aacggcctgt tcggcaacct gatcgccctg 1920
agcctgggcc tgaccccgaa cttcaagagc aacttcgacc tggccgagga cgccaagctg 1980
cagctgagca aggacaccta cgacgacgac ctggacaacc tgctggccca gatcggcgac 2040
cagtacgccg acctgttcct ggccgccaag aacctgagcg acgccatcct gctgagcgac 2100
atcctgaggg tgaacaccga gatcaccaag gccccgctga gcgccagcat gatcaagagg 2160
tacgacgagc accaccagga cctgaccctg ctgaaggccc tggtgaggca gcagctgccg 2220
gagaagtaca aggagatctt cttcgaccag agcaagaacg gctacgccgg ctacatcgac 2280
ggcggcgcca gccaggagga gttctacaag ttcatcaagc cgatcctgga gaagatggac 2340
ggcaccgagg agctgctggt gaagctgaac agggaggacc tgctgaggaa gcagaggacc 2400
ttcgacaacg gcagcatccc gcaccagatc cacctgggcg agctgcacgc catcctgagg 2460
aggcaggagg acttctaccc gttcctgaag gacaacaggg agaagatcga gaagatcctg 2520
accttccgca tcccgtacta cgtgggcccg ctggccaggg gcaacagcag gttcgcctgg 2580
atgaccagga agagcgagga gaccatcacc ccgtggaact tcgaggaggt ggtggacaag 2640
ggcgccagcg cccagagctt catcgagagg atgaccaact tcgacaagaa cctgccgaac 2700
gagaaggtgc tgccgaagca cagcctgctg tacgagtact tcaccgtgta caacgagctg 2760
accaaggtga agtacgtgac cgagggcatg aggaagccgg ccttcctgag cggcgagcag 2820
aagaaggcca tcgtggacct gctgttcaag accaacagga aggtgaccgt gaagcagctg 2880
aaggaggact acttcaagaa gatcgagtgc ttcgacagcg tggagatcag cggcgtggag 2940
gacaggttca acgccagcct gggcacctac cacgacctgc tgaagatcat caaggacaag 3000
gacttcctgg acaacgagga gaacgaggac atcctggagg acatcgtgct gaccctgacc 3060
ctgttcgagg acagggagat gatcgaggag aggctgaaga cctacgccca cctgttcgac 3120
gacaaggtga tgaagcagct gaagaggagg aggtacaccg gctggggcag gctgagcagg 3180
aagctgatca acggcatcag ggacaagcag agcggcaaga ccatcctgga cttcctgaag 3240
agcgacggct tcgccaacag gaacttcatg cagctgatcc acgacgacag cctgaccttc 3300
aaggaggaca tccagaaggc ccaggtgagc ggccagggcg acagcctgca cgagcacatc 3360
gccaacctgg ccggcagccc ggccatcaag aagggcatcc tgcagaccgt gaaggtggtg 3420
gacgagctgg tgaaggtgat gggcaggcac aagccggaga acatcgtgat cgagatggcc 3480
agggagaacc agaccaccca gaagggccag aagaacagca gggagaggat gaagaggatc 3540
gaggagggca tcaaggagct gggcagccag atcctgaagg agcacccggt ggagaacacc 3600
cagctgcaga acgagaagct gtacctgtac tacctgcaga acggcaggga catgtacgtg 3660
gaccaggagc tggacatcaa caggctgagc gactacgacg tggaccacat cgtgccgcag 3720
agcttcctga aggacgacag catcgacaac aaggtgctga ccaggagcga caagaacagg 3780
ggcaagagcg acaacgtgcc gagcgaggag gtggtgaaga agatgaaaaa ctactggagg 3840
cagctgctga acgccaagct gatcacccag aggaagttcg acaacctgac caaggccgag 3900
aggggcggcc tgagcgagct ggacaaggcc ggcttcatta aaaggcagct ggtggagacc 3960
aggcagatca ccaagcacgt ggcccagatc ctggacagca ggatgaacac caagtacgac 4020
gagaacgaca agctgatcag ggaggtgaag gtgatcaccc tgaagagcaa gctggtgagc 4080
gacttcagga aggacttcca gttctacaag gtgagggaga tcaataatta ccaccacgcc 4140
cacgacgcct acctgaacgc cgtggtgggc accgccctga ttaaaaagta cccgaagctg 4200
gagagcgagt tcgtgtacgg cgactacaag gtgtacgacg tgaggaagat gatcgccaag 4260
agcgagcagg agatcggcaa ggccaccgcc aagtacttct tctacagcaa catcatgaac 4320
ttcttcaaga ccgagatcac cctggccaac ggcgagatca ggaagaggcc gctgatcgag 4380
accaacggcg agaccggcga gatcgtgtgg gacaagggca gggacttcgc caccgtgagg 4440
aaggtgctgt ccatgccgca ggtgaacatc gtgaagaaga ccgaggtgca gaccggcggc 4500
ttcagcaagg agagcatcct gccgaagagg aacagcgaca agctgatcgc caggaagaag 4560
gactgggatc cgaagaagta cggcggcttc gacagcccga ccgtggccta cagcgtgctg 4620
gtggtggcca aggtggagaa gggcaagagc aagaagctga agagcgtgaa ggagctggtg 4680
ggcatcacca tcatggagag gagcagcttc gagaagaacc cagtggactt cctggaggcc 4740
aagggctaca aggaggtgaa gaaggacctg atcattaaac tgccgaagta cagcctgttc 4800
gagctggaga acggcaggaa gaggatgctg gccagcgccg gcgagctgca gaagggcaac 4860
gagctggccc tgccgagcaa gtacgtgaac ttcctgtacc tggccagcca ctacgagaag 4920
ctgaagggca gcccggagga caacgagcag aagcagctgt tcgtggagca gcacaagcac 4980
tacctggacg agatcatcga gcagatcagc gagttcagca agagggtgat cctggccgac 5040
gccaacctgg acaaggtgct gagcgcctac aacaagcaca gggacaagcc gatcagggag 5100
caggccgaga acatcatcca cctgttcacc ctgaccaacc tgggcgcccc ggccgccttc 5160
aagtacttcg acaccaccat cgacaggaag aggtacacca gcaccaagga ggtgctggac 5220
gccaccctga tccaccagag catcaccggc ctgtacgaga ccaggatcga cctgagccag 5280
ctgggcggcg acagcagccc gccgaagaag aagaggaagg tgagctggaa ggacgccagc 5340
ggctggagca ggatgtga 5358
<210> 77
<211> 1785
<212> PRT
<213> Artificial Sequence
<220>
<223> Fusion protein
<400> 77
Met Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu
1 5 10 15
Thr Leu Ala Lys Arg Ala Trp Asp Glu Arg Glu Val Pro Val Gly Ala
20 25 30
Val Leu Val His Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Pro
35 40 45
Ile Gly Arg His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg
50 55 60
Gln Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu
65 70 75 80
Tyr Val Thr Leu Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His
85 90 95
Ser Arg Ile Gly Arg Val Val Phe Gly Ala Arg Asp Ala Lys Thr Gly
100 105 110
Ala Ala Gly Ser Leu Met Asp Val Leu His His Pro Gly Met Asn His
115 120 125
Arg Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu
130 135 140
Leu Ser Asp Phe Phe Arg Met Arg Arg Gln Glu Ile Lys Ala Gln Lys
145 150 155 160
Lys Ala Gln Ser Ser Thr Asp Ser Gly Gly Ser Ser Gly Gly Ser Ser
165 170 175
Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Ser
180 185 190
Gly Gly Ser Ser Gly Gly Ser Ser Glu Val Glu Phe Ser His Glu Tyr
195 200 205
Trp Met Arg His Ala Leu Thr Leu Ala Lys Arg Ala Arg Asp Glu Arg
210 215 220
Glu Val Pro Val Gly Ala Val Leu Val Leu Asn Asn Arg Val Ile Gly
225 230 235 240
Glu Gly Trp Asn Arg Ala Ile Gly Leu His Asp Pro Thr Ala His Ala
245 250 255
Glu Ile Met Ala Leu Arg Gln Gly Gly Leu Val Met Gln Asn Tyr Arg
260 265 270
Leu Ile Asp Ala Thr Leu Tyr Val Thr Phe Glu Pro Cys Val Met Cys
275 280 285
Ala Gly Ala Met Ile His Ser Arg Ile Gly Arg Val Val Phe Gly Val
290 295 300
Arg Asn Ala Lys Thr Gly Ala Ala Gly Ser Leu Met Asp Val Leu His
305 310 315 320
Tyr Pro Gly Met Asn His Arg Val Glu Ile Thr Glu Gly Ile Leu Ala
325 330 335
Asp Glu Cys Ala Ala Leu Leu Cys Tyr Phe Phe Arg Met Pro Arg Gln
340 345 350
Val Phe Asn Ala Gln Lys Lys Ala Gln Ser Ser Thr Asp Ser Gly Gly
355 360 365
Ser Ser Gly Gly Ser Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser
370 375 380
Ala Thr Pro Glu Ser Ser Gly Gly Ser Ser Gly Gly Ser Asp Lys Lys
385 390 395 400
Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly Trp Ala Val
405 410 415
Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys Val Leu Gly
420 425 430
Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu
435 440 445
Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala
450 455 460
Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu
465 470 475 480
Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe Phe His Arg
485 490 495
Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His Glu Arg His
500 505 510
Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His Glu Lys Tyr
515 520 525
Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser Thr Asp Lys
530 535 540
Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met Ile Lys Phe
545 550 555 560
Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp
565 570 575
Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe
580 585 590
Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys Ala Ile Leu
595 600 605
Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln
610 615 620
Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu
625 630 635 640
Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp Leu Ala Glu
645 650 655
Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp Asp Leu Asp
660 665 670
Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu Phe Leu Ala
675 680 685
Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile Leu Arg Val
690 695 700
Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met Ile Lys Arg
705 710 715 720
Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys Ala Leu Val Arg
725 730 735
Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys
740 745 750
Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe
755 760 765
Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly Thr Glu Glu
770 775 780
Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr
785 790 795 800
Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu Gly Glu Leu His
805 810 815
Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn
820 825 830
Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val
835 840 845
Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met Thr Arg Lys
850 855 860
Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val Val Asp Lys
865 870 875 880
Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn Phe Asp Lys
885 890 895
Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu Leu Tyr Glu
900 905 910
Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr Val Thr Glu
915 920 925
Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile
930 935 940
Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val Lys Gln Leu
945 950 955 960
Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser Val Glu Ile
965 970 975
Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr Tyr His Asp
980 985 990
Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn
995 1000 1005
Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu Phe Glu
1010 1015 1020
Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His Leu
1025 1030 1035
Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr
1040 1045 1050
Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
1055 1060 1065
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly
1070 1075 1080
Phe Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu
1085 1090 1095
Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly
1100 1105 1110
Asp Ser Leu His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala
1115 1120 1125
Ile Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu
1130 1135 1140
Val Lys Val Met Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu
1145 1150 1155
Met Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser
1160 1165 1170
Arg Glu Arg Met Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly
1175 1180 1185
Ser Gln Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln
1190 1195 1200
Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met
1205 1210 1215
Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg Leu Ser Asp Tyr Asp
1220 1225 1230
Val Asp His Ile Val Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile
1235 1240 1245
Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg Gly Lys Ser
1250 1255 1260
Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn Tyr
1265 1270 1275
Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe
1280 1285 1290
Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
1295 1300 1305
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile
1310 1315 1320
Thr Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys
1325 1330 1335
Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr
1340 1345 1350
Leu Lys Ser Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe
1355 1360 1365
Tyr Lys Val Arg Glu Ile Asn Asn Tyr His His Ala His Asp Ala
1370 1375 1380
Tyr Leu Asn Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro
1385 1390 1395
Lys Leu Glu Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp
1400 1405 1410
Val Arg Lys Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala
1415 1420 1425
Thr Ala Lys Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys
1430 1435 1440
Thr Glu Ile Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu
1445 1450 1455
Ile Glu Thr Asn Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly
1460 1465 1470
Arg Asp Phe Ala Thr Val Arg Lys Val Leu Ser Met Pro Gln Val
1475 1480 1485
Asn Ile Val Lys Lys Thr Glu Val Gln Thr Gly Gly Phe Ser Lys
1490 1495 1500
Glu Ser Ile Leu Pro Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg
1505 1510 1515
Lys Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro
1520 1525 1530
Thr Val Ala Tyr Ser Val Leu Val Val Ala Lys Val Glu Lys Gly
1535 1540 1545
Lys Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Val Gly Ile Thr
1550 1555 1560
Ile Met Glu Arg Ser Ser Phe Glu Lys Asn Pro Val Asp Phe Leu
1565 1570 1575
Glu Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys
1580 1585 1590
Leu Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg
1595 1600 1605
Met Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala
1610 1615 1620
Leu Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His Tyr
1625 1630 1635
Glu Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu
1640 1645 1650
Phe Val Glu Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln
1655 1660 1665
Ile Ser Glu Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu
1670 1675 1680
Asp Lys Val Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile
1685 1690 1695
Arg Glu Gln Ala Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn
1700 1705 1710
Leu Gly Ala Pro Ala Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp
1715 1720 1725
Arg Lys Arg Tyr Thr Ser Thr Lys Glu Val Leu Asp Ala Thr Leu
1730 1735 1740
Ile His Gln Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu
1745 1750 1755
Ser Gln Leu Gly Gly Asp Ser Ser Pro Pro Lys Lys Lys Arg Lys
1760 1765 1770
Val Ser Trp Lys Asp Ala Ser Gly Trp Ser Arg Met
1775 1780 1785
<210> 78
<211> 5094
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 78
atgtccgagg tggagttcag ccacgagtac tggatgaggc acgctctcac cctggctaag 60
agggcgtggg acgagaggga ggtgccggtg ggcgccgtgc tcgtccacaa caaccgcgtg 120
atcggcgagg gctggaacag gcccatcggc aggcacgacc caaccgctca cgccgagatc 180
atggctctca ggcagggcgg cctggtcatg cagaactaca ggctgatcga cgcgaccctc 240
tacgtgaccc tcgagccctg cgtcatggta agtttctgct tctacctttg atatatatat 300
aataattatc attaattagt agtaatataa tatttcaaat atttttttca aaataaaaga 360
atgtagtata tagcaattgc ttttctgtag tttataagtg tgtatatttt aatttataac 420
ttttctaata tatgaccaaa atttgttgat gtgcagtgcg cgggcgccat gatccactcc 480
aggatcggca gggtggtctt cggcgctagg gacgccaaga cgggcgctgc gggcagcctc 540
atggacgtgc tgcaccaccc cggcatgaac caccgcgtcg agatcaccga gggcatcctc 600
gcggacgagt gcgctgcgct cctgtccgac ttcttcagga tgcgcaggca ggagatcaag 660
gcccagaaga aggcgcagtc cagcaccgac tccggcggct ccagcggcgg ctccagcggc 720
agcgagaccc cgggcacgtc cgagagcgcg acgcccgaga gcagcggcgg ctccagcggc 780
ggctcctcgg aggtcgagtt cagccatgag tactggatga ggcatgccct gactctcgct 840
aagagggcgc gggatgagcg cgaggtgccg gtgggggccg tgctcgtcct gaacaaccgc 900
gtgatcgggg agggctggaa ccgggctatc ggcctccacg acccaacggc ccatgccgag 960
atcatggccc tgaggcaggg cggcctggtc atgcaaaact acaggctcat cgacgccacc 1020
ctctacgtga ccttcgagcc atgcgtgatg tgcgcggggg ccatgatcca ctcgaggatt 1080
gggagggtgg tcttcggcgt gcgcaacgct aagacggggg ccgccggcag cctcatggac 1140
gtcctgcact acccgggcat gaaccacagg gtggagatta ccgagggcat cctggccgat 1200
gagtgcgccg cgctcctgtg ctacttcttc cgcatgccca ggcaggtctt caacgcgcag 1260
aagaaggccc agagctccac tgattccggc ggctccagcg gcggctccag tggcagcgag 1320
actcctggca cgtccgagag cgccacgccc gagtctagcg gcggctccag cggcggctcc 1380
atgagcaagc tggagaagtt cacgaactgc tactccctca gcaagaccct gaggttcaag 1440
gcgatcccgg tcggcaagac ccaggagaac atcgacaaca agcggctgct ggtggaggac 1500
gagaagaggg ctgaggacta caagggcgtg aagaagctcc tggaccgcta ctacctgtcc 1560
ttcatcaacg acgtgctcca cagcatcaag ctcaagaacc tgaacaacta catcagcctc 1620
ttcaggaaga agacgcgcac cgagaaggag aacaaggagc tcgagaacct ggagatcaac 1680
ctgaggaagg agatcgccaa ggcgttcaag ggcaacgagg gctacaagtc cctcttcaag 1740
aaggacatca tcgagacgat cctcccggag ttcctggacg acaaggacga gatcgccctg 1800
gtcaactcct tcaacggctt caccacggcg ttcaccggct tcttcgacaa ccgcgagaac 1860
atgttcagcg aggaggccaa gtccacgagc atcgcgttca ggtgcatcaa cgagaacctc 1920
acccgctaca tctccaacat ggacatcttc gagaaggtcg acgcgatctt cgacaagcac 1980
gaggtgcagg agatcaagga gaagatcctg aacagcgact acgacgtcga ggacttcttc 2040
gagggcgagt tcttcaactt cgtcctcacg caggagggca tcgacgtgta caacgccatc 2100
atcggtggct tcgtgaccga gtccggcgag aagatcaagg gcctgaacga gtacatcaac 2160
ctctacaacc agaagaccaa gcagaagctg ccgaagttca agcccctgta caagcaggtg 2220
ctctccgaca gggagtccct cagcttctac ggcgagggct acacgagcga cgaggaggtc 2280
ctggaggtgt tccgcaacac cctcaacaag aacagcgaga tcttctccag catcaagaag 2340
ctcgagaagc tgttcaagaa cttcgacgag tactccagcg ccggcatctt cgtcaagaac 2400
ggcccggcga tctccacgat cagcaaggac atcttcggcg agtggaacgt gatccgcgac 2460
aagtggaacg ccgagtacga cgacatccac ctcaagaaga aggcggtggt caccgagaag 2520
tacgaggacg acaggcgcaa gtccttcaag aagatcggct ccttcagcct cgagcagctg 2580
caggagtacg ccgacgcgga cctgagcgtg gtcgagaagc tcaaggagat catcatccag 2640
aaggtcgacg agatctacaa ggtgtacggc tccagcgaga agctcttcga cgcggacttc 2700
gtcctcgaga agtccctgaa gaagaacgac gccgtggtcg cgatcatgaa ggacctcctg 2760
gactccgtga agagcttcga gaattacatc aaggccttct tcggcgaggg caaggagacg 2820
aacagggacg agtccttcta cggcgacttc gtcctggcct acgacatcct cctgaaggtg 2880
gaccacatct acgacgcgat ccgcaactac gtgacccaga agccgtacag caaggacaag 2940
ttcaagctct acttccagaa cccccagttc atgggcggct gggacaagga caaggagacg 3000
gactacaggg cgaccatcct gcgctacggc agcaagtact acctcgccat catggacaag 3060
aagtacgcga agtgcctgca gaagatcgac aaggacgacg tcaacggcaa ctacgagaag 3120
atcaactaca agctcctgcc gggccccaac aagatgctcc cgaaggtgtt cttctccaag 3180
aagtggatgg cctactacaa ccccagcgag gacatccaga agatctacaa gaacggcacg 3240
ttcaagaagg gcgacatgtt caacctgaac gactgccaca agctcatcga cttcttcaag 3300
gactccatca gccgctaccc gaagtggtcc aacgcctacg acttcaactt cagcgagacc 3360
gagaagtaca aggacatcgc gggcttctac cgcgaggtcg aggagcaggg ctacaaggtg 3420
tccttcgagt ccgccagcaa gaaggaggtc gacaagctgg tggaggaggg caagctctac 3480
atgttccaga tctacaacaa ggacttctcc gacaagagcc acggcacgcc caacctgcac 3540
accatgtact tcaagctcct gttcgacgag aacaaccacg gccagatcag gctgtccggc 3600
ggcgccgagc tcttcatgag gagggcgagc ctgaagaagg aggagctggt ggtccacccc 3660
gctaacagcc caatcgcgaa caagaacccg gacaacccca agaagaccac gaccctgtcc 3720
tacgacgtgt acaaggacaa gaggttcagc gaggaccagt acgagctcca catcccgatc 3780
gcgatcaaca agtgccccaa gaacatcttc aagatcaaca ccgaggtccg cgtgctcctg 3840
aagcacgacg acaaccccta cgtgatcggc atcgctaggg gcgagaggaa cctcctgtac 3900
atcgtggtcg tggacggcaa gggcaacatc gtggagcagt actccctcaa cgagatcatc 3960
aacaacttca acggcatcag gatcaagacg gactaccaca gcctcctgga caagaaggag 4020
aaggagaggt tcgaggcccg ccagaactgg acctccatcg agaacatcaa ggagctgaag 4080
gcgggctaca tcagccaggt cgtgcacaag atctgcgagc tcgtcgagaa gtacgacgcc 4140
gtgatcgccc tcgcggacct gaactccggc ttcaagaaca gccgcgtcaa ggtggagaag 4200
caggtctacc agaagttcga gaagatgctc atcgacaagc tgaactacat ggtggacaag 4260
aagtccaacc cctgcgctac gggcggcgcg ctgaagggct accagatcac caacaagttc 4320
gagagcttca agtccatgag cactcagaac ggcttcatct tctacatccc ggcgtggctc 4380
acgtccaaga tcgaccccag caccggcttc gtcaacctcc tgaagacgaa gtacacctcc 4440
atcgccgaca gcaagaagtt catctccagc ttcgaccgca tcatgtatgt gccggaggag 4500
gacctgttcg agttcgccct cgactacaag aacttctccc gcacggacgc ggactacatc 4560
aagaagtgga agctgtacag ctacggcaac cgcatccgca tcttcaggaa ccccaagaag 4620
aacaacgtct tcgactggga ggaggtgtgc ctgacctccg cgtacaagga gctcttcaac 4680
aagtacggca tcaactacca gcagggcgac atcagggctc tcctgtgcga gcagagcgac 4740
aaggccttct actccagctt catggcgctg atgtccctca tgctgcagat gaggaactcg 4800
atcaccggca ggacggacgt ggccttcctc atctccccgg tgaagaacag cgacggcatc 4860
ttctacgact ccaggaacta cgaggcccag gagaacgcga tcctcccaaa gaacgcggac 4920
gccaacggcg cctacaacat cgccaggaag gtcctctggg ctatcggcca gttcaagaag 4980
gcggaggacg agaagctgga caaggtgaag atcgccatca gcaacaagga gtggctcgag 5040
tacgcccaga cctcggtcaa gcacggcagc ccgaagaaga agcgcaaggt gtga 5094
<210> 79
<211> 1691
<212> PRT
<213> Artificial Sequence
<220>
<223> Fusion protein
<400> 79
Met Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu
1 5 10 15
Thr Leu Ala Lys Arg Ala Trp Asp Glu Arg Glu Val Pro Val Gly Ala
20 25 30
Val Leu Val His Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Pro
35 40 45
Ile Gly Arg His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg
50 55 60
Gln Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu
65 70 75 80
Tyr Val Thr Leu Glu Pro Cys Val Met Val Ser Phe Cys Phe Tyr Leu
85 90 95
Tyr Ile Tyr Asn Asn Tyr His Leu Val Val Ile Tyr Phe Lys Tyr Phe
100 105 110
Phe Gln Asn Lys Arg Met Tyr Ile Ala Ile Ala Phe Leu Phe Ile Ser
115 120 125
Val Tyr Ile Leu Ile Tyr Asn Phe Ser Asn Ile Pro Lys Phe Val Asp
130 135 140
Val Gln Cys Ala Gly Ala Met Ile His Ser Arg Ile Gly Arg Val Val
145 150 155 160
Phe Gly Ala Arg Asp Ala Lys Thr Gly Ala Ala Gly Ser Leu Met Asp
165 170 175
Val Leu His His Pro Gly Met Asn His Arg Val Glu Ile Thr Glu Gly
180 185 190
Ile Leu Ala Asp Glu Cys Ala Ala Leu Leu Ser Asp Phe Phe Arg Met
195 200 205
Arg Arg Gln Glu Ile Lys Ala Gln Lys Lys Ala Gln Ser Ser Thr Asp
210 215 220
Ser Gly Gly Ser Ser Gly Gly Ser Ser Gly Ser Glu Thr Pro Gly Thr
225 230 235 240
Ser Glu Ser Ala Thr Pro Glu Ser Ser Gly Gly Ser Ser Gly Gly Ser
245 250 255
Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu Thr
260 265 270
Leu Ala Lys Arg Ala Arg Asp Glu Arg Glu Val Pro Val Gly Ala Val
275 280 285
Leu Val Leu Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Ala Ile
290 295 300
Gly Leu His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg Gln
305 310 315 320
Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu Tyr
325 330 335
Val Thr Phe Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His Ser
340 345 350
Arg Ile Gly Arg Val Val Phe Gly Val Arg Asn Ala Lys Thr Gly Ala
355 360 365
Ala Gly Ser Leu Met Asp Val Leu His Tyr Pro Gly Met Asn His Arg
370 375 380
Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu Leu
385 390 395 400
Cys Tyr Phe Phe Arg Met Pro Arg Gln Val Phe Asn Ala Gln Lys Lys
405 410 415
Ala Gln Ser Ser Thr Asp Ser Gly Gly Ser Ser Gly Gly Ser Ser Gly
420 425 430
Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Ser Gly
435 440 445
Gly Ser Ser Gly Gly Ser Met Ser Lys Leu Glu Lys Phe Thr Asn Cys
450 455 460
Tyr Ser Leu Ser Lys Thr Leu Arg Phe Lys Ala Ile Pro Val Gly Lys
465 470 475 480
Thr Gln Glu Asn Ile Asp Asn Lys Arg Leu Leu Val Glu Asp Glu Lys
485 490 495
Arg Ala Glu Asp Tyr Lys Gly Val Lys Lys Leu Leu Asp Arg Tyr Tyr
500 505 510
Leu Ser Phe Ile Asn Asp Val Leu His Ser Ile Lys Leu Lys Asn Leu
515 520 525
Asn Asn Tyr Ile Ser Leu Phe Arg Lys Lys Thr Arg Thr Glu Lys Glu
530 535 540
Asn Lys Glu Leu Glu Asn Leu Glu Ile Asn Leu Arg Lys Glu Ile Ala
545 550 555 560
Lys Ala Phe Lys Gly Asn Glu Gly Tyr Lys Ser Leu Phe Lys Lys Asp
565 570 575
Ile Ile Glu Thr Ile Leu Pro Glu Phe Leu Asp Asp Lys Asp Glu Ile
580 585 590
Ala Leu Val Asn Ser Phe Asn Gly Phe Thr Thr Ala Phe Thr Gly Phe
595 600 605
Phe Asp Asn Arg Glu Asn Met Phe Ser Glu Glu Ala Lys Ser Thr Ser
610 615 620
Ile Ala Phe Arg Cys Ile Asn Glu Asn Leu Thr Arg Tyr Ile Ser Asn
625 630 635 640
Met Asp Ile Phe Glu Lys Val Asp Ala Ile Phe Asp Lys His Glu Val
645 650 655
Gln Glu Ile Lys Glu Lys Ile Leu Asn Ser Asp Tyr Asp Val Glu Asp
660 665 670
Phe Phe Glu Gly Glu Phe Phe Asn Phe Val Leu Thr Gln Glu Gly Ile
675 680 685
Asp Val Tyr Asn Ala Ile Ile Gly Gly Phe Val Thr Glu Ser Gly Glu
690 695 700
Lys Ile Lys Gly Leu Asn Glu Tyr Ile Asn Leu Tyr Asn Gln Lys Thr
705 710 715 720
Lys Gln Lys Leu Pro Lys Phe Lys Pro Leu Tyr Lys Gln Val Leu Ser
725 730 735
Asp Arg Glu Ser Leu Ser Phe Tyr Gly Glu Gly Tyr Thr Ser Asp Glu
740 745 750
Glu Val Leu Glu Val Phe Arg Asn Thr Leu Asn Lys Asn Ser Glu Ile
755 760 765
Phe Ser Ser Ile Lys Lys Leu Glu Lys Leu Phe Lys Asn Phe Asp Glu
770 775 780
Tyr Ser Ser Ala Gly Ile Phe Val Lys Asn Gly Pro Ala Ile Ser Thr
785 790 795 800
Ile Ser Lys Asp Ile Phe Gly Glu Trp Asn Val Ile Arg Asp Lys Trp
805 810 815
Asn Ala Glu Tyr Asp Asp Ile His Leu Lys Lys Lys Ala Val Val Thr
820 825 830
Glu Lys Tyr Glu Asp Asp Arg Arg Lys Ser Phe Lys Lys Ile Gly Ser
835 840 845
Phe Ser Leu Glu Gln Leu Gln Glu Tyr Ala Asp Ala Asp Leu Ser Val
850 855 860
Val Glu Lys Leu Lys Glu Ile Ile Ile Gln Lys Val Asp Glu Ile Tyr
865 870 875 880
Lys Val Tyr Gly Ser Ser Glu Lys Leu Phe Asp Ala Asp Phe Val Leu
885 890 895
Glu Lys Ser Leu Lys Lys Asn Asp Ala Val Val Ala Ile Met Lys Asp
900 905 910
Leu Leu Asp Ser Val Lys Ser Phe Glu Asn Tyr Ile Lys Ala Phe Phe
915 920 925
Gly Glu Gly Lys Glu Thr Asn Arg Asp Glu Ser Phe Tyr Gly Asp Phe
930 935 940
Val Leu Ala Tyr Asp Ile Leu Leu Lys Val Asp His Ile Tyr Asp Ala
945 950 955 960
Ile Arg Asn Tyr Val Thr Gln Lys Pro Tyr Ser Lys Asp Lys Phe Lys
965 970 975
Leu Tyr Phe Gln Asn Pro Gln Phe Met Gly Gly Trp Asp Lys Asp Lys
980 985 990
Glu Thr Asp Tyr Arg Ala Thr Ile Leu Arg Tyr Gly Ser Lys Tyr Tyr
995 1000 1005
Leu Ala Ile Met Asp Lys Lys Tyr Ala Lys Cys Leu Gln Lys Ile
1010 1015 1020
Asp Lys Asp Asp Val Asn Gly Asn Tyr Glu Lys Ile Asn Tyr Lys
1025 1030 1035
Leu Leu Pro Gly Pro Asn Lys Met Leu Pro Lys Val Phe Phe Ser
1040 1045 1050
Lys Lys Trp Met Ala Tyr Tyr Asn Pro Ser Glu Asp Ile Gln Lys
1055 1060 1065
Ile Tyr Lys Asn Gly Thr Phe Lys Lys Gly Asp Met Phe Asn Leu
1070 1075 1080
Asn Asp Cys His Lys Leu Ile Asp Phe Phe Lys Asp Ser Ile Ser
1085 1090 1095
Arg Tyr Pro Lys Trp Ser Asn Ala Tyr Asp Phe Asn Phe Ser Glu
1100 1105 1110
Thr Glu Lys Tyr Lys Asp Ile Ala Gly Phe Tyr Arg Glu Val Glu
1115 1120 1125
Glu Gln Gly Tyr Lys Val Ser Phe Glu Ser Ala Ser Lys Lys Glu
1130 1135 1140
Val Asp Lys Leu Val Glu Glu Gly Lys Leu Tyr Met Phe Gln Ile
1145 1150 1155
Tyr Asn Lys Asp Phe Ser Asp Lys Ser His Gly Thr Pro Asn Leu
1160 1165 1170
His Thr Met Tyr Phe Lys Leu Leu Phe Asp Glu Asn Asn His Gly
1175 1180 1185
Gln Ile Arg Leu Ser Gly Gly Ala Glu Leu Phe Met Arg Arg Ala
1190 1195 1200
Ser Leu Lys Lys Glu Glu Leu Val Val His Pro Ala Asn Ser Pro
1205 1210 1215
Ile Ala Asn Lys Asn Pro Asp Asn Pro Lys Lys Thr Thr Thr Leu
1220 1225 1230
Ser Tyr Asp Val Tyr Lys Asp Lys Arg Phe Ser Glu Asp Gln Tyr
1235 1240 1245
Glu Leu His Ile Pro Ile Ala Ile Asn Lys Cys Pro Lys Asn Ile
1250 1255 1260
Phe Lys Ile Asn Thr Glu Val Arg Val Leu Leu Lys His Asp Asp
1265 1270 1275
Asn Pro Tyr Val Ile Gly Ile Ala Arg Gly Glu Arg Asn Leu Leu
1280 1285 1290
Tyr Ile Val Val Val Asp Gly Lys Gly Asn Ile Val Glu Gln Tyr
1295 1300 1305
Ser Leu Asn Glu Ile Ile Asn Asn Phe Asn Gly Ile Arg Ile Lys
1310 1315 1320
Thr Asp Tyr His Ser Leu Leu Asp Lys Lys Glu Lys Glu Arg Phe
1325 1330 1335
Glu Ala Arg Gln Asn Trp Thr Ser Ile Glu Asn Ile Lys Glu Leu
1340 1345 1350
Lys Ala Gly Tyr Ile Ser Gln Val Val His Lys Ile Cys Glu Leu
1355 1360 1365
Val Glu Lys Tyr Asp Ala Val Ile Ala Leu Ala Asp Leu Asn Ser
1370 1375 1380
Gly Phe Lys Asn Ser Arg Val Lys Val Glu Lys Gln Val Tyr Gln
1385 1390 1395
Lys Phe Glu Lys Met Leu Ile Asp Lys Leu Asn Tyr Met Val Asp
1400 1405 1410
Lys Lys Ser Asn Pro Cys Ala Thr Gly Gly Ala Leu Lys Gly Tyr
1415 1420 1425
Gln Ile Thr Asn Lys Phe Glu Ser Phe Lys Ser Met Ser Thr Gln
1430 1435 1440
Asn Gly Phe Ile Phe Tyr Ile Pro Ala Trp Leu Thr Ser Lys Ile
1445 1450 1455
Asp Pro Ser Thr Gly Phe Val Asn Leu Leu Lys Thr Lys Tyr Thr
1460 1465 1470
Ser Ile Ala Asp Ser Lys Lys Phe Ile Ser Ser Phe Asp Arg Ile
1475 1480 1485
Met Tyr Val Pro Glu Glu Asp Leu Phe Glu Phe Ala Leu Asp Tyr
1490 1495 1500
Lys Asn Phe Ser Arg Thr Asp Ala Asp Tyr Ile Lys Lys Trp Lys
1505 1510 1515
Leu Tyr Ser Tyr Gly Asn Arg Ile Arg Ile Phe Arg Asn Pro Lys
1520 1525 1530
Lys Asn Asn Val Phe Asp Trp Glu Glu Val Cys Leu Thr Ser Ala
1535 1540 1545
Tyr Lys Glu Leu Phe Asn Lys Tyr Gly Ile Asn Tyr Gln Gln Gly
1550 1555 1560
Asp Ile Arg Ala Leu Leu Cys Glu Gln Ser Asp Lys Ala Phe Tyr
1565 1570 1575
Ser Ser Phe Met Ala Leu Met Ser Leu Met Leu Gln Met Arg Asn
1580 1585 1590
Ser Ile Thr Gly Arg Thr Asp Val Ala Phe Leu Ile Ser Pro Val
1595 1600 1605
Lys Asn Ser Asp Gly Ile Phe Tyr Asp Ser Arg Asn Tyr Glu Ala
1610 1615 1620
Gln Glu Asn Ala Ile Leu Pro Lys Asn Ala Asp Ala Asn Gly Ala
1625 1630 1635
Tyr Asn Ile Ala Arg Lys Val Leu Trp Ala Ile Gly Gln Phe Lys
1640 1645 1650
Lys Ala Glu Asp Glu Lys Leu Asp Lys Val Lys Ile Ala Ile Ser
1655 1660 1665
Asn Lys Glu Trp Leu Glu Tyr Ala Gln Thr Ser Val Lys His Gly
1670 1675 1680
Ser Pro Lys Lys Lys Arg Lys Val
1685 1690
<210> 80
<211> 5088
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 80
atgtccgagg tggagttcag ccacgagtac tggatgaggc acgctctcac cctggctaag 60
agggcgtggg acgagaggga ggtgccggtg ggcgccgtgc tcgtccacaa caaccgcgtg 120
atcggcgagg gctggaacag gcccatcggc aggcacgacc caaccgctca cgccgagatc 180
atggctctca ggcagggcgg cctggtcatg cagaactaca ggctgatcga cgcgaccctc 240
tacgtgaccc tcgagccctg cgtcatggta agtttctgct tctacctttg atatatatat 300
aataattatc attaattagt agtaatataa tatttcaaat atttttttca aaataaaaga 360
atgtagtata tagcaattgc ttttctgtag tttataagtg tgtatatttt aatttataac 420
ttttctaata tatgaccaaa atttgttgat gtgcagtgcg cgggcgccat gatccactcc 480
aggatcggca gggtggtctt cggcgctagg gacgccaaga cgggcgctgc gggcagcctc 540
atggacgtgc tgcaccaccc cggcatgaac caccgcgtcg agatcaccga gggcatcctc 600
gcggacgagt gcgctgcgct cctgtccgac ttcttcagga tgcgcaggca ggagatcaag 660
gcccagaaga aggcgcagtc cagcaccgac tccggcggct ccagcggcgg ctccagcggc 720
agcgagaccc cgggcacgtc cgagagcgcg acgcccgaga gcagcggcgg ctccagcggc 780
ggctcctcgg aggtcgagtt cagccatgag tactggatga ggcatgccct gactctcgct 840
aagagggcgc gggatgagcg cgaggtgccg gtgggggccg tgctcgtcct gaacaaccgc 900
gtgatcgggg agggctggaa ccgggctatc ggcctccacg acccaacggc ccatgccgag 960
atcatggccc tgaggcaggg cggcctggtc atgcaaaact acaggctcat cgacgccacc 1020
ctctacgtga ccttcgagcc atgcgtgatg tgcgcggggg ccatgatcca ctcgaggatt 1080
gggagggtgg tcttcggcgt gcgcaacgct aagacggggg ccgccggcag cctcatggac 1140
gtcctgcact acccgggcat gaaccacagg gtggagatta ccgagggcat cctggccgat 1200
gagtgcgccg cgctcctgtg ctacttcttc cgcatgccca ggcaggtctt caacgcgcag 1260
aagaaggccc agagctccac tgatgggggc gggggctcag gcgggggcgg gagcggcggc 1320
gggggctctg ggggcggcgg cagcggcggg ggcggcagcg ggggcggcgg gtcgatgagc 1380
aagctggaga agttcacgaa ctgctactcc ctcagcaaga ccctgaggtt caaggcgatc 1440
ccggtcggca agacccagga gaacatcgac aacaagcggc tgctggtgga ggacgagaag 1500
agggctgagg actacaaggg cgtgaagaag ctcctggacc gctactacct gtccttcatc 1560
aacgacgtgc tccacagcat caagctcaag aacctgaaca actacatcag cctcttcagg 1620
aagaagacgc gcaccgagaa ggagaacaag gagctcgaga acctggagat caacctgagg 1680
aaggagatcg ccaaggcgtt caagggcaac gagggctaca agtccctctt caagaaggac 1740
atcatcgaga cgatcctccc ggagttcctg gacgacaagg acgagatcgc cctggtcaac 1800
tccttcaacg gcttcaccac ggcgttcacc ggcttcttcg acaaccgcga gaacatgttc 1860
agcgaggagg ccaagtccac gagcatcgcg ttcaggtgca tcaacgagaa cctcacccgc 1920
tacatctcca acatggacat cttcgagaag gtcgacgcga tcttcgacaa gcacgaggtg 1980
caggagatca aggagaagat cctgaacagc gactacgacg tcgaggactt cttcgagggc 2040
gagttcttca acttcgtcct cacgcaggag ggcatcgacg tgtacaacgc catcatcggt 2100
ggcttcgtga ccgagtccgg cgagaagatc aagggcctga acgagtacat caacctctac 2160
aaccagaaga ccaagcagaa gctgccgaag ttcaagcccc tgtacaagca ggtgctctcc 2220
gacagggagt ccctcagctt ctacggcgag ggctacacga gcgacgagga ggtcctggag 2280
gtgttccgca acaccctcaa caagaacagc gagatcttct ccagcatcaa gaagctcgag 2340
aagctgttca agaacttcga cgagtactcc agcgccggca tcttcgtcaa gaacggcccg 2400
gcgatctcca cgatcagcaa ggacatcttc ggcgagtgga acgtgatccg cgacaagtgg 2460
aacgccgagt acgacgacat ccacctcaag aagaaggcgg tggtcaccga gaagtacgag 2520
gacgacaggc gcaagtcctt caagaagatc ggctccttca gcctcgagca gctgcaggag 2580
tacgccgacg cggacctgag cgtggtcgag aagctcaagg agatcatcat ccagaaggtc 2640
gacgagatct acaaggtgta cggctccagc gagaagctct tcgacgcgga cttcgtcctc 2700
gagaagtccc tgaagaagaa cgacgccgtg gtcgcgatca tgaaggacct cctggactcc 2760
gtgaagagct tcgagaatta catcaaggcc ttcttcggcg agggcaagga gacgaacagg 2820
gacgagtcct tctacggcga cttcgtcctg gcctacgaca tcctcctgaa ggtggaccac 2880
atctacgacg cgatccgcaa ctacgtgacc cagaagccgt acagcaagga caagttcaag 2940
ctctacttcc agaaccccca gttcatgggc ggctgggaca aggacaagga gacggactac 3000
agggcgacca tcctgcgcta cggcagcaag tactacctcg ccatcatgga caagaagtac 3060
gcgaagtgcc tgcagaagat cgacaaggac gacgtcaacg gcaactacga gaagatcaac 3120
tacaagctcc tgccgggccc caacaagatg ctcccgaagg tgttcttctc caagaagtgg 3180
atggcctact acaaccccag cgaggacatc cagaagatct acaagaacgg cacgttcaag 3240
aagggcgaca tgttcaacct gaacgactgc cacaagctca tcgacttctt caaggactcc 3300
atcagccgct acccgaagtg gtccaacgcc tacgacttca acttcagcga gaccgagaag 3360
tacaaggaca tcgcgggctt ctaccgcgag gtcgaggagc agggctacaa ggtgtccttc 3420
gagtccgcca gcaagaagga ggtcgacaag ctggtggagg agggcaagct ctacatgttc 3480
cagatctaca acaaggactt ctccgacaag agccacggca cgcccaacct gcacaccatg 3540
tacttcaagc tcctgttcga cgagaacaac cacggccaga tcaggctgtc cggcggcgcc 3600
gagctcttca tgaggagggc gagcctgaag aaggaggagc tggtggtcca ccccgctaac 3660
agcccaatcg cgaacaagaa cccggacaac cccaagaaga ccacgaccct gtcctacgac 3720
gtgtacaagg acaagaggtt cagcgaggac cagtacgagc tccacatccc gatcgcgatc 3780
aacaagtgcc ccaagaacat cttcaagatc aacaccgagg tccgcgtgct cctgaagcac 3840
gacgacaacc cctacgtgat cggcatcgct aggggcgaga ggaacctcct gtacatcgtg 3900
gtcgtggacg gcaagggcaa catcgtggag cagtactccc tcaacgagat catcaacaac 3960
ttcaacggca tcaggatcaa gacggactac cacagcctcc tggacaagaa ggagaaggag 4020
aggttcgagg cccgccagaa ctggacctcc atcgagaaca tcaaggagct gaaggcgggc 4080
tacatcagcc aggtcgtgca caagatctgc gagctcgtcg agaagtacga cgccgtgatc 4140
gccctcgcgg acctgaactc cggcttcaag aacagccgcg tcaaggtgga gaagcaggtc 4200
taccagaagt tcgagaagat gctcatcgac aagctgaact acatggtgga caagaagtcc 4260
aacccctgcg ctacgggcgg cgcgctgaag ggctaccaga tcaccaacaa gttcgagagc 4320
ttcaagtcca tgagcactca gaacggcttc atcttctaca tcccggcgtg gctcacgtcc 4380
aagatcgacc ccagcaccgg cttcgtcaac ctcctgaaga cgaagtacac ctccatcgcc 4440
gacagcaaga agttcatctc cagcttcgac cgcatcatgt atgtgccgga ggaggacctg 4500
ttcgagttcg ccctcgacta caagaacttc tcccgcacgg acgcggacta catcaagaag 4560
tggaagctgt acagctacgg caaccgcatc cgcatcttca ggaaccccaa gaagaacaac 4620
gtcttcgact gggaggaggt gtgcctgacc tccgcgtaca aggagctctt caacaagtac 4680
ggcatcaact accagcaggg cgacatcagg gctctcctgt gcgagcagag cgacaaggcc 4740
ttctactcca gcttcatggc gctgatgtcc ctcatgctgc agatgaggaa ctcgatcacc 4800
ggcaggacgg acgtggcctt cctcatctcc ccggtgaaga acagcgacgg catcttctac 4860
gactccagga actacgaggc ccaggagaac gcgatcctcc caaagaacgc ggacgccaac 4920
ggcgcctaca acatcgccag gaaggtcctc tgggctatcg gccagttcaa gaaggcggag 4980
gacgagaagc tggacaaggt gaagatcgcc atcagcaaca aggagtggct cgagtacgcc 5040
cagacctcgg tcaagcacgg cagcccgaag aagaagcgca aggtgtga 5088
<210> 81
<211> 1689
<212> PRT
<213> Artificial Sequence
<220>
<223> Fusion protein
<400> 81
Met Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu
1 5 10 15
Thr Leu Ala Lys Arg Ala Trp Asp Glu Arg Glu Val Pro Val Gly Ala
20 25 30
Val Leu Val His Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Pro
35 40 45
Ile Gly Arg His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg
50 55 60
Gln Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu
65 70 75 80
Tyr Val Thr Leu Glu Pro Cys Val Met Val Ser Phe Cys Phe Tyr Leu
85 90 95
Tyr Ile Tyr Asn Asn Tyr His Leu Val Val Ile Tyr Phe Lys Tyr Phe
100 105 110
Phe Gln Asn Lys Arg Met Tyr Ile Ala Ile Ala Phe Leu Phe Ile Ser
115 120 125
Val Tyr Ile Leu Ile Tyr Asn Phe Ser Asn Ile Pro Lys Phe Val Asp
130 135 140
Val Gln Cys Ala Gly Ala Met Ile His Ser Arg Ile Gly Arg Val Val
145 150 155 160
Phe Gly Ala Arg Asp Ala Lys Thr Gly Ala Ala Gly Ser Leu Met Asp
165 170 175
Val Leu His His Pro Gly Met Asn His Arg Val Glu Ile Thr Glu Gly
180 185 190
Ile Leu Ala Asp Glu Cys Ala Ala Leu Leu Ser Asp Phe Phe Arg Met
195 200 205
Arg Arg Gln Glu Ile Lys Ala Gln Lys Lys Ala Gln Ser Ser Thr Asp
210 215 220
Ser Gly Gly Ser Ser Gly Gly Ser Ser Gly Ser Glu Thr Pro Gly Thr
225 230 235 240
Ser Glu Ser Ala Thr Pro Glu Ser Ser Gly Gly Ser Ser Gly Gly Ser
245 250 255
Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu Thr
260 265 270
Leu Ala Lys Arg Ala Arg Asp Glu Arg Glu Val Pro Val Gly Ala Val
275 280 285
Leu Val Leu Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Ala Ile
290 295 300
Gly Leu His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg Gln
305 310 315 320
Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu Tyr
325 330 335
Val Thr Phe Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His Ser
340 345 350
Arg Ile Gly Arg Val Val Phe Gly Val Arg Asn Ala Lys Thr Gly Ala
355 360 365
Ala Gly Ser Leu Met Asp Val Leu His Tyr Pro Gly Met Asn His Arg
370 375 380
Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu Leu
385 390 395 400
Cys Tyr Phe Phe Arg Met Pro Arg Gln Val Phe Asn Ala Gln Lys Lys
405 410 415
Ala Gln Ser Ser Thr Asp Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser
420 425 430
Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly
435 440 445
Gly Gly Gly Ser Met Ser Lys Leu Glu Lys Phe Thr Asn Cys Tyr Ser
450 455 460
Leu Ser Lys Thr Leu Arg Phe Lys Ala Ile Pro Val Gly Lys Thr Gln
465 470 475 480
Glu Asn Ile Asp Asn Lys Arg Leu Leu Val Glu Asp Glu Lys Arg Ala
485 490 495
Glu Asp Tyr Lys Gly Val Lys Lys Leu Leu Asp Arg Tyr Tyr Leu Ser
500 505 510
Phe Ile Asn Asp Val Leu His Ser Ile Lys Leu Lys Asn Leu Asn Asn
515 520 525
Tyr Ile Ser Leu Phe Arg Lys Lys Thr Arg Thr Glu Lys Glu Asn Lys
530 535 540
Glu Leu Glu Asn Leu Glu Ile Asn Leu Arg Lys Glu Ile Ala Lys Ala
545 550 555 560
Phe Lys Gly Asn Glu Gly Tyr Lys Ser Leu Phe Lys Lys Asp Ile Ile
565 570 575
Glu Thr Ile Leu Pro Glu Phe Leu Asp Asp Lys Asp Glu Ile Ala Leu
580 585 590
Val Asn Ser Phe Asn Gly Phe Thr Thr Ala Phe Thr Gly Phe Phe Asp
595 600 605
Asn Arg Glu Asn Met Phe Ser Glu Glu Ala Lys Ser Thr Ser Ile Ala
610 615 620
Phe Arg Cys Ile Asn Glu Asn Leu Thr Arg Tyr Ile Ser Asn Met Asp
625 630 635 640
Ile Phe Glu Lys Val Asp Ala Ile Phe Asp Lys His Glu Val Gln Glu
645 650 655
Ile Lys Glu Lys Ile Leu Asn Ser Asp Tyr Asp Val Glu Asp Phe Phe
660 665 670
Glu Gly Glu Phe Phe Asn Phe Val Leu Thr Gln Glu Gly Ile Asp Val
675 680 685
Tyr Asn Ala Ile Ile Gly Gly Phe Val Thr Glu Ser Gly Glu Lys Ile
690 695 700
Lys Gly Leu Asn Glu Tyr Ile Asn Leu Tyr Asn Gln Lys Thr Lys Gln
705 710 715 720
Lys Leu Pro Lys Phe Lys Pro Leu Tyr Lys Gln Val Leu Ser Asp Arg
725 730 735
Glu Ser Leu Ser Phe Tyr Gly Glu Gly Tyr Thr Ser Asp Glu Glu Val
740 745 750
Leu Glu Val Phe Arg Asn Thr Leu Asn Lys Asn Ser Glu Ile Phe Ser
755 760 765
Ser Ile Lys Lys Leu Glu Lys Leu Phe Lys Asn Phe Asp Glu Tyr Ser
770 775 780
Ser Ala Gly Ile Phe Val Lys Asn Gly Pro Ala Ile Ser Thr Ile Ser
785 790 795 800
Lys Asp Ile Phe Gly Glu Trp Asn Val Ile Arg Asp Lys Trp Asn Ala
805 810 815
Glu Tyr Asp Asp Ile His Leu Lys Lys Lys Ala Val Val Thr Glu Lys
820 825 830
Tyr Glu Asp Asp Arg Arg Lys Ser Phe Lys Lys Ile Gly Ser Phe Ser
835 840 845
Leu Glu Gln Leu Gln Glu Tyr Ala Asp Ala Asp Leu Ser Val Val Glu
850 855 860
Lys Leu Lys Glu Ile Ile Ile Gln Lys Val Asp Glu Ile Tyr Lys Val
865 870 875 880
Tyr Gly Ser Ser Glu Lys Leu Phe Asp Ala Asp Phe Val Leu Glu Lys
885 890 895
Ser Leu Lys Lys Asn Asp Ala Val Val Ala Ile Met Lys Asp Leu Leu
900 905 910
Asp Ser Val Lys Ser Phe Glu Asn Tyr Ile Lys Ala Phe Phe Gly Glu
915 920 925
Gly Lys Glu Thr Asn Arg Asp Glu Ser Phe Tyr Gly Asp Phe Val Leu
930 935 940
Ala Tyr Asp Ile Leu Leu Lys Val Asp His Ile Tyr Asp Ala Ile Arg
945 950 955 960
Asn Tyr Val Thr Gln Lys Pro Tyr Ser Lys Asp Lys Phe Lys Leu Tyr
965 970 975
Phe Gln Asn Pro Gln Phe Met Gly Gly Trp Asp Lys Asp Lys Glu Thr
980 985 990
Asp Tyr Arg Ala Thr Ile Leu Arg Tyr Gly Ser Lys Tyr Tyr Leu Ala
995 1000 1005
Ile Met Asp Lys Lys Tyr Ala Lys Cys Leu Gln Lys Ile Asp Lys
1010 1015 1020
Asp Asp Val Asn Gly Asn Tyr Glu Lys Ile Asn Tyr Lys Leu Leu
1025 1030 1035
Pro Gly Pro Asn Lys Met Leu Pro Lys Val Phe Phe Ser Lys Lys
1040 1045 1050
Trp Met Ala Tyr Tyr Asn Pro Ser Glu Asp Ile Gln Lys Ile Tyr
1055 1060 1065
Lys Asn Gly Thr Phe Lys Lys Gly Asp Met Phe Asn Leu Asn Asp
1070 1075 1080
Cys His Lys Leu Ile Asp Phe Phe Lys Asp Ser Ile Ser Arg Tyr
1085 1090 1095
Pro Lys Trp Ser Asn Ala Tyr Asp Phe Asn Phe Ser Glu Thr Glu
1100 1105 1110
Lys Tyr Lys Asp Ile Ala Gly Phe Tyr Arg Glu Val Glu Glu Gln
1115 1120 1125
Gly Tyr Lys Val Ser Phe Glu Ser Ala Ser Lys Lys Glu Val Asp
1130 1135 1140
Lys Leu Val Glu Glu Gly Lys Leu Tyr Met Phe Gln Ile Tyr Asn
1145 1150 1155
Lys Asp Phe Ser Asp Lys Ser His Gly Thr Pro Asn Leu His Thr
1160 1165 1170
Met Tyr Phe Lys Leu Leu Phe Asp Glu Asn Asn His Gly Gln Ile
1175 1180 1185
Arg Leu Ser Gly Gly Ala Glu Leu Phe Met Arg Arg Ala Ser Leu
1190 1195 1200
Lys Lys Glu Glu Leu Val Val His Pro Ala Asn Ser Pro Ile Ala
1205 1210 1215
Asn Lys Asn Pro Asp Asn Pro Lys Lys Thr Thr Thr Leu Ser Tyr
1220 1225 1230
Asp Val Tyr Lys Asp Lys Arg Phe Ser Glu Asp Gln Tyr Glu Leu
1235 1240 1245
His Ile Pro Ile Ala Ile Asn Lys Cys Pro Lys Asn Ile Phe Lys
1250 1255 1260
Ile Asn Thr Glu Val Arg Val Leu Leu Lys His Asp Asp Asn Pro
1265 1270 1275
Tyr Val Ile Gly Ile Ala Arg Gly Glu Arg Asn Leu Leu Tyr Ile
1280 1285 1290
Val Val Val Asp Gly Lys Gly Asn Ile Val Glu Gln Tyr Ser Leu
1295 1300 1305
Asn Glu Ile Ile Asn Asn Phe Asn Gly Ile Arg Ile Lys Thr Asp
1310 1315 1320
Tyr His Ser Leu Leu Asp Lys Lys Glu Lys Glu Arg Phe Glu Ala
1325 1330 1335
Arg Gln Asn Trp Thr Ser Ile Glu Asn Ile Lys Glu Leu Lys Ala
1340 1345 1350
Gly Tyr Ile Ser Gln Val Val His Lys Ile Cys Glu Leu Val Glu
1355 1360 1365
Lys Tyr Asp Ala Val Ile Ala Leu Ala Asp Leu Asn Ser Gly Phe
1370 1375 1380
Lys Asn Ser Arg Val Lys Val Glu Lys Gln Val Tyr Gln Lys Phe
1385 1390 1395
Glu Lys Met Leu Ile Asp Lys Leu Asn Tyr Met Val Asp Lys Lys
1400 1405 1410
Ser Asn Pro Cys Ala Thr Gly Gly Ala Leu Lys Gly Tyr Gln Ile
1415 1420 1425
Thr Asn Lys Phe Glu Ser Phe Lys Ser Met Ser Thr Gln Asn Gly
1430 1435 1440
Phe Ile Phe Tyr Ile Pro Ala Trp Leu Thr Ser Lys Ile Asp Pro
1445 1450 1455
Ser Thr Gly Phe Val Asn Leu Leu Lys Thr Lys Tyr Thr Ser Ile
1460 1465 1470
Ala Asp Ser Lys Lys Phe Ile Ser Ser Phe Asp Arg Ile Met Tyr
1475 1480 1485
Val Pro Glu Glu Asp Leu Phe Glu Phe Ala Leu Asp Tyr Lys Asn
1490 1495 1500
Phe Ser Arg Thr Asp Ala Asp Tyr Ile Lys Lys Trp Lys Leu Tyr
1505 1510 1515
Ser Tyr Gly Asn Arg Ile Arg Ile Phe Arg Asn Pro Lys Lys Asn
1520 1525 1530
Asn Val Phe Asp Trp Glu Glu Val Cys Leu Thr Ser Ala Tyr Lys
1535 1540 1545
Glu Leu Phe Asn Lys Tyr Gly Ile Asn Tyr Gln Gln Gly Asp Ile
1550 1555 1560
Arg Ala Leu Leu Cys Glu Gln Ser Asp Lys Ala Phe Tyr Ser Ser
1565 1570 1575
Phe Met Ala Leu Met Ser Leu Met Leu Gln Met Arg Asn Ser Ile
1580 1585 1590
Thr Gly Arg Thr Asp Val Ala Phe Leu Ile Ser Pro Val Lys Asn
1595 1600 1605
Ser Asp Gly Ile Phe Tyr Asp Ser Arg Asn Tyr Glu Ala Gln Glu
1610 1615 1620
Asn Ala Ile Leu Pro Lys Asn Ala Asp Ala Asn Gly Ala Tyr Asn
1625 1630 1635
Ile Ala Arg Lys Val Leu Trp Ala Ile Gly Gln Phe Lys Lys Ala
1640 1645 1650
Glu Asp Glu Lys Leu Asp Lys Val Lys Ile Ala Ile Ser Asn Lys
1655 1660 1665
Glu Trp Leu Glu Tyr Ala Gln Thr Ser Val Lys His Gly Ser Pro
1670 1675 1680
Lys Lys Lys Arg Lys Val
1685
<210> 82
<211> 4936
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 82
atgccgaaga agaagcgcaa ggtcatgacc gacgcggagt atgtgcgcat ccacgagaag 60
ctcgacatct acaccttcaa gaagcagttc ttcaacaaca agaagagcgt ctcccacagg 120
tgctacgtgc tcttcgagct gaagaggcgc ggcgagcgca gggcctgctt ctggggctac 180
gccgtgaaca agccgcagag cggcaccgag cgcggcatcc acgccgagat cttcagcatc 240
cgcaaggtgg aggagtacct cagggacaac ccgggccagt tcaccatcaa ctggtacagc 300
tcctggagcc cggtaagttt ctgcttctac ctttgatata tatataataa ttatcattaa 360
ttagtagtaa tataatattt caaatatttt tttcaaaata aaagaatgta gtatatagca 420
attgcttttc tgtagtttat aagtgtgtat attttaattt ataacttttc taatatatga 480
ccaaaatttg ttgatgtgca gtgcgcggac tgcgccgaga agatcctgga gtggtacaac 540
caggagctga ggggcaacgg ccacaccctg aagatctggg cctgcaagct ctactacgag 600
aagaacgcga ggaaccagat cggcctgtgg aacctccgcg acaacggcgt cggcctcaac 660
gtgatggtct ccgagcacta ccagtgctgc cgcaagatct tcatccagtc cagccacaac 720
cagctcaacg agaacaggtg gctggagaag accctgaaga gggccgagaa gcgcaggtcc 780
gagctcagca tcatgatcca ggtgaagatc ctccacacca cgaagtcccc cgccgtgggg 840
gggcgggggc tcaggcgggg gcgggagcgg cggcgggggc tctgggggcg gcggcagcgg 900
cgggggcggc agcgggggcg gcgggtcgat gagcaagctg gagaagttca cgaactgcta 960
ctccctcagc aagaccctga ggttcaaggc gatcccggtc ggcaagaccc aggagaacat 1020
cgacaacaag cggctgctgg tggaggacga gaagagggct gaggactaca agggcgtgaa 1080
gaagctcctg gaccgctact acctgtcctt catcaacgac gtgctccaca gcatcaagct 1140
caagaacctg aacaactaca tcagcctctt caggaagaag acgcgcaccg agaaggagaa 1200
caaggagctc gagaacctgg agatcaacct gaggaaggag atcgccaagg cgttcaaggg 1260
caacgagggc tacaagtccc tcttcaagaa ggacatcatc gagacgatcc tcccggagtt 1320
cctggacgac aaggacgaga tcgccctggt caactccttc aacggcttca ccacggcgtt 1380
caccggcttc ttcgacaacc gcgagaacat gttcagcgag gaggccaagt ccacgagcat 1440
cgcgttcagg tgcatcaacg agaacctcac ccgctacatc tccaacatgg acatcttcga 1500
gaaggtcgac gcgatcttcg acaagcacga ggtgcaggag atcaaggaga agatcctgaa 1560
cagcgactac gacgtcgagg acttcttcga gggcgagttc ttcaacttcg tcctcacgca 1620
ggagggcatc gacgtgtaca acgccatcat cggtggcttc gtgaccgagt ccggcgagaa 1680
gatcaagggc ctgaacgagt acatcaacct ctacaaccag aagaccaagc agaagctgcc 1740
gaagttcaag cccctgtaca agcaggtgct ctccgacagg gagtccctca gcttctacgg 1800
cgagggctac acgagcgacg aggaggtcct ggaggtgttc cgcaacaccc tcaacaagaa 1860
cagcgagatc ttctccagca tcaagaagct cgagaagctg ttcaagaact tcgacgagta 1920
ctccagcgcc ggcatcttcg tcaagaacgg cccggcgatc tccacgatca gcaaggacat 1980
cttcggcgag tggaacgtga tccgcgacaa gtggaacgcc gagtacgacg acatccacct 2040
caagaagaag gcggtggtca ccgagaagta cgaggacgac aggcgcaagt ccttcaagaa 2100
gatcggctcc ttcagcctcg agcagctgca ggagtacgcc gacgcggacc tgagcgtggt 2160
cgagaagctc aaggagatca tcatccagaa ggtcgacgag atctacaagg tgtacggctc 2220
cagcgagaag ctcttcgacg cggacttcgt cctcgagaag tccctgaaga agaacgacgc 2280
cgtggtcgcg atcatgaagg acctcctgga ctccgtgaag agcttcgaga attacatcaa 2340
ggccttcttc ggcgagggca aggagacgaa cagggacgag tccttctacg gcgacttcgt 2400
cctggcctac gacatcctcc tgaaggtgga ccacatctac gacgcgatcc gcaactacgt 2460
gacccagaag ccgtacagca aggacaagtt caagctctac ttccagaacc cccagttcat 2520
gggcggctgg gacaaggaca aggagacgga ctacagggcg accatcctgc gctacggcag 2580
caagtactac ctcgccatca tggacaagaa gtacgcgaag tgcctgcaga agatcgacaa 2640
ggacgacgtc aacggcaact acgagaagat caactacaag ctcctgccgg gccccaacaa 2700
gatgctcccg aaggtgttct tctccaagaa gtggatggcc tactacaacc ccagcgagga 2760
catccagaag atctacaaga acggcacgtt caagaagggc gacatgttca acctgaacga 2820
ctgccacaag ctcatcgact tcttcaagga ctccatcagc cgctacccga agtggtccaa 2880
cgcctacgac ttcaacttca gcgagaccga gaagtacaag gacatcgcgg gcttctaccg 2940
cgaggtcgag gagcagggct acaaggtgtc cttcgagtcc gccagcaaga aggaggtcga 3000
caagctggtg gaggagggca agctctacat gttccagatc tacaacaagg acttctccga 3060
caagagccac ggcacgccca acctgcacac catgtacttc aagctcctgt tcgacgagaa 3120
caaccacggc cagatcaggc tgtccggcgg cgccgagctc ttcatgagga gggcgagcct 3180
gaagaaggag gagctggtgg tccaccccgc taacagccca atcgcgaaca agaacccgga 3240
caaccccaag aagaccacga ccctgtccta cgacgtgtac aaggacaaga ggttcagcga 3300
ggaccagtac gagctccaca tcccgatcgc gatcaacaag tgccccaaga acatcttcaa 3360
gatcaacacc gaggtccgcg tgctcctgaa gcacgacgac aacccctacg tgatcggcat 3420
cgctaggggc gagaggaacc tcctgtacat cgtggtcgtg gacggcaagg gcaacatcgt 3480
ggagcagtac tccctcaacg agatcatcaa caacttcaac ggcatcagga tcaagacgga 3540
ctaccacagc ctcctggaca agaaggagaa ggagaggttc gaggcccgcc agaactggac 3600
ctccatcgag aacatcaagg agctgaaggc gggctacatc agccaggtcg tgcacaagat 3660
ctgcgagctc gtcgagaagt acgacgccgt gatcgccctc gcggacctga actccggctt 3720
caagaacagc cgcgtcaagg tggagaagca ggtctaccag aagttcgaga agatgctcat 3780
cgacaagctg aactacatgg tggacaagaa gtccaacccc tgcgctacgg gcggcgcgct 3840
gaagggctac cagatcacca acaagttcga gagcttcaag tccatgagca ctcagaacgg 3900
cttcatcttc tacatcccgg cgtggctcac gtccaagatc gaccccagca ccggcttcgt 3960
caacctcctg aagacgaagt acacctccat cgccgacagc aagaagttca tctccagctt 4020
cgaccgcatc atgtatgtgc cggaggagga cctgttcgag ttcgccctcg actacaagaa 4080
cttctcccgc acggacgcgg actacatcaa gaagtggaag ctgtacagct acggcaaccg 4140
catccgcatc ttcaggaacc ccaagaagaa caacgtcttc gactgggagg aggtgtgcct 4200
gacctccgcg tacaaggagc tcttcaacaa gtacggcatc aactaccagc agggcgacat 4260
cagggctctc ctgtgcgagc agagcgacaa ggccttctac tccagcttca tggcgctgat 4320
gtccctcatg ctgcagatga ggaactcgat caccggcagg acggacgtgg ccttcctcat 4380
ctccccggtg aagaacagcg acggcatctt ctacgactcc aggaactacg aggcccagga 4440
gaacgcgatc ctcccaaaga acgcggacgc caacggcgcc tacaacatcg ccaggaaggt 4500
cctctgggct atcggccagt tcaagaaggc ggaggacgag aagctggaca aggtgaagat 4560
cgccatcagc aacaaggagt ggctcgagta cgcccagacc tcggtcaagc acggcagccc 4620
gaagaagaag cgcaaggtgt ccggcggcag cacgaacctg tccgacatca tcgagaagga 4680
gaccggcaag cagctcgtga tccaggagag catcctcatg ctgccggagg aggtcgagga 4740
ggtcatcggc aacaagcccg agtccgacat cctcgtccac acggcctacg acgagtccac 4800
cgacgagaac gtgatgctcc tgacctcgga cgctcccgag tacaagccat gggccctggt 4860
catccaggac agcaacggcg agaacaagat caagatgctc tccggcggca gcccgaagaa 4920
gaagcgcaaa gtgtga 4936
<210> 83
<211> 1636
<212> PRT
<213> Artificial Sequence
<220>
<223> Fusion protein
<400> 83
Met Pro Lys Lys Lys Arg Lys Val Met Thr Asp Ala Glu Tyr Val Arg
1 5 10 15
Ile His Glu Lys Leu Asp Ile Tyr Thr Phe Lys Lys Gln Phe Phe Asn
20 25 30
Asn Lys Lys Ser Val Ser His Arg Cys Tyr Val Leu Phe Glu Leu Lys
35 40 45
Arg Arg Gly Glu Arg Arg Ala Cys Phe Trp Gly Tyr Ala Val Asn Lys
50 55 60
Pro Gln Ser Gly Thr Glu Arg Gly Ile His Ala Glu Ile Phe Ser Ile
65 70 75 80
Arg Lys Val Glu Glu Tyr Leu Arg Asp Asn Pro Gly Gln Phe Thr Ile
85 90 95
Asn Trp Tyr Ser Ser Trp Ser Pro Val Ser Phe Cys Phe Tyr Leu Tyr
100 105 110
Ile Tyr Asn Asn Tyr His Leu Val Val Ile Tyr Phe Lys Tyr Phe Phe
115 120 125
Gln Asn Lys Arg Met Tyr Ile Ala Ile Ala Phe Leu Phe Ile Ser Val
130 135 140
Tyr Ile Leu Ile Tyr Asn Phe Ser Asn Ile Pro Lys Phe Val Asp Val
145 150 155 160
Gln Cys Ala Asp Cys Ala Glu Lys Ile Leu Glu Trp Tyr Asn Gln Glu
165 170 175
Leu Arg Gly Asn Gly His Thr Leu Lys Ile Trp Ala Cys Lys Leu Tyr
180 185 190
Tyr Glu Lys Asn Ala Arg Asn Gln Ile Gly Leu Trp Asn Leu Arg Asp
195 200 205
Asn Gly Val Gly Leu Asn Val Met Val Ser Glu His Tyr Gln Cys Cys
210 215 220
Arg Lys Ile Phe Ile Gln Ser Ser His Asn Gln Leu Asn Glu Asn Arg
225 230 235 240
Trp Leu Glu Lys Thr Leu Lys Arg Ala Glu Lys Arg Arg Ser Glu Leu
245 250 255
Ser Ile Met Ile Gln Val Lys Ile Leu His Thr Thr Lys Ser Pro Ala
260 265 270
Val Gly Gly Arg Gly Leu Arg Arg Gly Arg Glu Arg Arg Arg Gly Leu
275 280 285
Trp Gly Arg Arg Gln Arg Arg Gly Arg Gln Arg Gly Arg Arg Val Asp
290 295 300
Glu Gln Ala Gly Glu Val His Glu Leu Leu Leu Pro Gln Gln Asp Pro
305 310 315 320
Glu Val Gln Gly Asp Pro Gly Arg Gln Asp Pro Gly Glu His Arg Gln
325 330 335
Gln Ala Ala Ala Gly Gly Gly Arg Glu Glu Gly Gly Leu Gln Gly Arg
340 345 350
Glu Glu Ala Pro Gly Pro Leu Leu Pro Val Leu His Gln Arg Arg Ala
355 360 365
Pro Gln His Gln Ala Gln Glu Pro Glu Gln Leu His Gln Pro Leu Gln
370 375 380
Glu Glu Asp Ala His Arg Glu Gly Glu Gln Gly Ala Arg Glu Pro Gly
385 390 395 400
Asp Gln Pro Glu Glu Gly Asp Arg Gln Gly Val Gln Gly Gln Arg Gly
405 410 415
Leu Gln Val Pro Leu Gln Glu Gly His His Arg Asp Asp Pro Pro Gly
420 425 430
Val Pro Gly Arg Gln Gly Arg Asp Arg Pro Gly Gln Leu Leu Gln Arg
435 440 445
Leu His His Gly Val His Arg Leu Leu Arg Gln Pro Arg Glu His Val
450 455 460
Gln Arg Gly Gly Gln Val His Glu His Arg Val Gln Val His Gln Arg
465 470 475 480
Glu Pro His Pro Leu His Leu Gln His Gly His Leu Arg Glu Gly Arg
485 490 495
Arg Asp Leu Arg Gln Ala Arg Gly Ala Gly Asp Gln Gly Glu Asp Pro
500 505 510
Glu Gln Arg Leu Arg Arg Arg Gly Leu Leu Arg Gly Arg Val Leu Gln
515 520 525
Leu Arg Pro His Ala Gly Gly His Arg Arg Val Gln Arg His His Arg
530 535 540
Trp Leu Arg Asp Arg Val Arg Arg Glu Asp Gln Gly Pro Glu Arg Val
545 550 555 560
His Gln Pro Leu Gln Pro Glu Asp Gln Ala Glu Ala Ala Glu Val Gln
565 570 575
Ala Pro Val Gln Ala Gly Ala Leu Arg Gln Gly Val Pro Gln Leu Leu
580 585 590
Arg Arg Gly Leu His Glu Arg Arg Gly Gly Pro Gly Gly Val Pro Gln
595 600 605
His Pro Gln Gln Glu Gln Arg Asp Leu Leu Gln His Gln Glu Ala Arg
610 615 620
Glu Ala Val Gln Glu Leu Arg Arg Val Leu Gln Arg Arg His Leu Arg
625 630 635 640
Gln Glu Arg Pro Gly Asp Leu His Asp Gln Gln Gly His Leu Arg Arg
645 650 655
Val Glu Arg Asp Pro Arg Gln Val Glu Arg Arg Val Arg Arg His Pro
660 665 670
Pro Gln Glu Glu Gly Gly Gly His Arg Glu Val Arg Gly Arg Gln Ala
675 680 685
Gln Val Leu Gln Glu Asp Arg Leu Leu Gln Pro Arg Ala Ala Ala Gly
690 695 700
Val Arg Arg Arg Gly Pro Glu Arg Gly Arg Glu Ala Gln Gly Asp His
705 710 715 720
His Pro Glu Gly Arg Arg Asp Leu Gln Gly Val Arg Leu Gln Arg Glu
725 730 735
Ala Leu Arg Arg Gly Leu Arg Pro Arg Glu Val Pro Glu Glu Glu Arg
740 745 750
Arg Arg Gly Arg Asp His Glu Gly Pro Pro Gly Leu Arg Glu Glu Leu
755 760 765
Arg Glu Leu His Gln Gly Leu Leu Arg Arg Gly Gln Gly Asp Glu Gln
770 775 780
Gly Arg Val Leu Leu Arg Arg Leu Arg Pro Gly Leu Arg His Pro Pro
785 790 795 800
Glu Gly Gly Pro His Leu Arg Arg Asp Pro Gln Leu Arg Asp Pro Glu
805 810 815
Ala Val Gln Gln Gly Gln Val Gln Ala Leu Leu Pro Glu Pro Pro Val
820 825 830
His Gly Arg Leu Gly Gln Gly Gln Gly Asp Gly Leu Gln Gly Asp His
835 840 845
Pro Ala Leu Arg Gln Gln Val Leu Pro Arg His His Gly Gln Glu Val
850 855 860
Arg Glu Val Pro Ala Glu Asp Arg Gln Gly Arg Arg Gln Arg Gln Leu
865 870 875 880
Arg Glu Asp Gln Leu Gln Ala Pro Ala Gly Pro Gln Gln Asp Ala Pro
885 890 895
Glu Gly Val Leu Leu Gln Glu Val Asp Gly Leu Leu Gln Pro Gln Arg
900 905 910
Gly His Pro Glu Asp Leu Gln Glu Arg His Val Gln Glu Gly Arg His
915 920 925
Val Gln Pro Glu Arg Leu Pro Gln Ala His Arg Leu Leu Gln Gly Leu
930 935 940
His Gln Pro Leu Pro Glu Val Val Gln Arg Leu Arg Leu Gln Leu Gln
945 950 955 960
Arg Asp Arg Glu Val Gln Gly His Arg Gly Leu Leu Pro Arg Gly Arg
965 970 975
Gly Ala Gly Leu Gln Gly Val Leu Arg Val Arg Gln Gln Glu Gly Gly
980 985 990
Arg Gln Ala Gly Gly Gly Gly Gln Ala Leu His Val Pro Asp Leu Gln
995 1000 1005
Gln Gly Leu Leu Arg Gln Glu Pro Arg His Ala Gln Pro Ala His
1010 1015 1020
His Val Leu Gln Ala Pro Val Arg Arg Glu Gln Pro Arg Pro Asp
1025 1030 1035
Gln Ala Val Arg Arg Arg Arg Ala Leu His Glu Glu Gly Glu Pro
1040 1045 1050
Glu Glu Gly Gly Ala Gly Gly Pro Pro Arg Gln Pro Asn Arg Glu
1055 1060 1065
Gln Glu Pro Gly Gln Pro Gln Glu Asp His Asp Pro Val Leu Arg
1070 1075 1080
Arg Val Gln Gly Gln Glu Val Gln Arg Gly Pro Val Arg Ala Pro
1085 1090 1095
His Pro Asp Arg Asp Gln Gln Val Pro Gln Glu His Leu Gln Asp
1100 1105 1110
Gln His Arg Gly Pro Arg Ala Pro Glu Ala Arg Arg Gln Pro Leu
1115 1120 1125
Arg Asp Arg His Arg Gly Arg Glu Glu Pro Pro Val His Arg Gly
1130 1135 1140
Arg Gly Arg Gln Gly Gln His Arg Gly Ala Val Leu Pro Gln Arg
1145 1150 1155
Asp His Gln Gln Leu Gln Arg His Gln Asp Gln Asp Gly Leu Pro
1160 1165 1170
Gln Pro Pro Gly Gln Glu Gly Glu Gly Glu Val Arg Gly Pro Pro
1175 1180 1185
Glu Leu Asp Leu His Arg Glu His Gln Gly Ala Glu Gly Gly Leu
1190 1195 1200
His Gln Pro Gly Arg Ala Gln Asp Leu Arg Ala Arg Arg Glu Val
1205 1210 1215
Arg Arg Arg Asp Arg Pro Arg Gly Pro Glu Leu Arg Leu Gln Glu
1220 1225 1230
Gln Pro Arg Gln Gly Gly Glu Ala Gly Leu Pro Glu Val Arg Glu
1235 1240 1245
Asp Ala His Arg Gln Ala Glu Leu His Gly Gly Gln Glu Val Gln
1250 1255 1260
Pro Leu Arg Tyr Gly Arg Arg Ala Glu Gly Leu Pro Asp His Gln
1265 1270 1275
Gln Val Arg Glu Leu Gln Val His Glu His Ser Glu Arg Leu His
1280 1285 1290
Leu Leu His Pro Gly Val Ala His Val Gln Asp Arg Pro Gln His
1295 1300 1305
Arg Leu Arg Gln Pro Pro Glu Asp Glu Val His Leu His Arg Arg
1310 1315 1320
Gln Gln Glu Val His Leu Gln Leu Arg Pro His His Val Cys Ala
1325 1330 1335
Gly Gly Gly Pro Val Arg Val Arg Pro Arg Leu Gln Glu Leu Leu
1340 1345 1350
Pro His Gly Arg Gly Leu His Gln Glu Val Glu Ala Val Gln Leu
1355 1360 1365
Arg Gln Pro His Pro His Leu Gln Glu Pro Gln Glu Glu Gln Arg
1370 1375 1380
Leu Arg Leu Gly Gly Gly Val Pro Asp Leu Arg Val Gln Gly Ala
1385 1390 1395
Leu Gln Gln Val Arg His Gln Leu Pro Ala Gly Arg His Gln Gly
1400 1405 1410
Ser Pro Val Arg Ala Glu Arg Gln Gly Leu Leu Leu Gln Leu His
1415 1420 1425
Gly Ala Asp Val Pro His Ala Ala Asp Glu Glu Leu Asp His Arg
1430 1435 1440
Gln Asp Gly Arg Gly Leu Pro His Leu Pro Gly Glu Glu Gln Arg
1445 1450 1455
Arg His Leu Leu Arg Leu Gln Glu Leu Arg Gly Pro Gly Glu Arg
1460 1465 1470
Asp Pro Pro Lys Glu Arg Gly Arg Gln Arg Arg Leu Gln His Arg
1475 1480 1485
Gln Glu Gly Pro Leu Gly Tyr Arg Pro Val Gln Glu Gly Gly Gly
1490 1495 1500
Arg Glu Ala Gly Gln Gly Glu Asp Arg His Gln Gln Gln Gly Val
1505 1510 1515
Ala Arg Val Arg Pro Asp Leu Gly Gln Ala Arg Gln Pro Glu Glu
1520 1525 1530
Glu Ala Gln Gly Val Arg Arg Gln His Glu Pro Val Arg His His
1535 1540 1545
Arg Glu Gly Asp Arg Gln Ala Ala Arg Asp Pro Gly Glu His Pro
1550 1555 1560
His Ala Ala Gly Gly Gly Arg Gly Gly His Arg Gln Gln Ala Arg
1565 1570 1575
Val Arg His Pro Arg Pro His Gly Leu Arg Arg Val His Arg Arg
1580 1585 1590
Glu Arg Asp Ala Pro Asp Leu Gly Arg Ser Arg Val Gln Ala Met
1595 1600 1605
Gly Pro Gly His Pro Gly Gln Gln Arg Arg Glu Gln Asp Gln Asp
1610 1615 1620
Ala Leu Arg Arg Gln Pro Glu Glu Glu Ala Gln Ser Val
1625 1630 1635
<210> 84
<211> 6210
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 84
atgtccgagg tggagttcag ccacgagtac tggatgaggc acgctctcac cctggctaag 60
agggcgtggg acgagaggga ggtgccggtg ggcgccgtgc tcgtccacaa caaccgcgtg 120
atcggcgagg gctggaacag gcccatcggc aggcacgacc caaccgctca cgccgagatc 180
atggctctca ggcagggcgg cctggtcatg cagaactaca ggctgatcga cgcgaccctc 240
tacgtgaccc tcgagccctg cgtcatggta agtttctgct tctacctttg atatatatat 300
aataattatc attaattagt agtaatataa tatttcaaat atttttttca aaataaaaga 360
atgtagtata tagcaattgc ttttctgtag tttataagtg tgtatatttt aatttataac 420
ttttctaata tatgaccaaa atttgttgat gtgcagtgcg cgggcgccat gatccactcc 480
aggatcggca gggtggtctt cggcgctagg gacgccaaga cgggcgctgc gggcagcctc 540
atggacgtgc tgcaccaccc cggcatgaac caccgcgtcg agatcaccga gggcatcctc 600
gcggacgagt gcgctgcgct cctgtccgac ttcttcagga tgcgcaggca ggagatcaag 660
gcccagaaga aggcgcagtc cagcaccgac tccggcggct ccagcggcgg ctccagcggc 720
agcgagaccc cgggcacgtc cgagagcgcg acgcccgaga gcagcggcgg ctccagcggc 780
ggctcctcgg aggtcgagtt cagccatgag tactggatga ggcatgccct gactctcgct 840
aagagggcgc gggatgagcg cgaggtgccg gtgggggccg tgctcgtcct gaacaaccgc 900
gtgatcgggg agggctggaa ccgggctatc ggcctccacg acccaacggc ccatgccgag 960
atcatggccc tgaggcaggg cggcctggtc atgcaaaact acaggctcat cgacgccacc 1020
ctctacgtga ccttcgagcc atgcgtgatg tgcgcggggg ccatgatcca ctcgaggatt 1080
gggagggtgg tcttcggcgt gcgcaacgct aagacggggg ccgccggcag cctcatggac 1140
gtcctgcact acccgggcat gaaccacagg gtggagatta ccgagggcat cctggccgat 1200
gagtgcgccg cgctcctgtg ctacttcttc cgcatgccca ggcaggtctt caacgcgcag 1260
aagaaggccc agagctccac tgatgggggc gggggctcag gcgggggcgg gagcggcggc 1320
gggggctctg ggggcggcgg cagcggcggg ggcggcagcg ggggcggcgg gtcgatgagc 1380
aagctggaga agttcacgaa ctgctactcc ctcagcaaga ccctgaggtt caaggcgatc 1440
ccggtcggca agacccagga gaacatcgac aacaagcggc tgctggtgga ggacgagaag 1500
agggctgagg actacaaggg cgtgaagaag ctcctggacc gctactacct gtccttcatc 1560
aacgacgtgc tccacagcat caagctcaag aacctgaaca actacatcag cctcttcagg 1620
aagaagacgc gcaccgagaa ggagaacaag gagctcgaga acctggagat caacctgagg 1680
aaggagatcg ccaaggcgtt caagggcaac gagggctaca agtccctctt caagaaggac 1740
atcatcgaga cgatcctccc ggagttcctg gacgacaagg acgagatcgc cctggtcaac 1800
tccttcaacg gcttcaccac ggcgttcacc ggcttcttcg acaaccgcga gaacatgttc 1860
agcgaggagg ccaagtccac gagcatcgcg ttcaggtgca tcaacgagaa cctcacccgc 1920
tacatctcca acatggacat cttcgagaag gtcgacgcga tcttcgacaa gcacgaggtg 1980
caggagatca aggagaagat cctgaacagc gactacgacg tcgaggactt cttcgagggc 2040
gagttcttca acttcgtcct cacgcaggag ggcatcgacg tgtacaacgc catcatcggt 2100
ggcttcgtga ccgagtccgg cgagaagatc aagggcctga acgagtacat caacctctac 2160
aaccagaaga ccaagcagaa gctgccgaag ttcaagcccc tgtacaagca ggtgctctcc 2220
gacagggagt ccctcagctt ctacggcgag ggctacacga gcgacgagga ggtcctggag 2280
gtgttccgca acaccctcaa caagaacagc gagatcttct ccagcatcaa gaagctcgag 2340
aagctgttca agaacttcga cgagtactcc agcgccggca tcttcgtcaa gaacggcccg 2400
gcgatctcca cgatcagcaa ggacatcttc ggcgagtgga acgtgatccg cgacaagtgg 2460
aacgccgagt acgacgacat ccacctcaag aagaaggcgg tggtcaccga gaagtacgag 2520
gacgacaggc gcaagtcctt caagaagatc ggctccttca gcctcgagca gctgcaggag 2580
tacgccgacg cggacctgag cgtggtcgag aagctcaagg agatcatcat ccagaaggtc 2640
gacgagatct acaaggtgta cggctccagc gagaagctct tcgacgcgga cttcgtcctc 2700
gagaagtccc tgaagaagaa cgacgccgtg gtcgcgatca tgaaggacct cctggactcc 2760
gtgaagagct tcgagaatta catcaaggcc ttcttcggcg agggcaagga gacgaacagg 2820
gacgagtcct tctacggcga cttcgtcctg gcctacgaca tcctcctgaa ggtggaccac 2880
atctacgacg cgatccgcaa ctacgtgacc cagaagccgt acagcaagga caagttcaag 2940
ctctacttcc agaaccccca gttcatgggc ggctgggaca aggacaagga gacggactac 3000
agggcgacca tcctgcgcta cggcagcaag tactacctcg ccatcatgga caagaagtac 3060
gcgaagtgcc tgcagaagat cgacaaggac gacgtcaacg gcaactacga gaagatcaac 3120
tacaagctcc tgccgggccc caacaagatg ctcccgaagg tgttcttctc caagaagtgg 3180
atggcctact acaaccccag cgaggacatc cagaagatct acaagaacgg cacgttcaag 3240
aagggcgaca tgttcaacct gaacgactgc cacaagctca tcgacttctt caaggactcc 3300
atcagccgct acccgaagtg gtccaacgcc tacgacttca acttcagcga gaccgagaag 3360
tacaaggaca tcgcgggctt ctaccgcgag gtcgaggagc agggctacaa ggtgtccttc 3420
gagtccgcca gcaagaagga ggtcgacaag ctggtggagg agggcaagct ctacatgttc 3480
cagatctaca acaaggactt ctccgacaag agccacggca cgcccaacct gcacaccatg 3540
tacttcaagc tcctgttcga cgagaacaac cacggccaga tcaggctgtc cggcggcgcc 3600
gagctcttca tgaggagggc gagcctgaag aaggaggagc tggtggtcca ccccgctaac 3660
agcccaatcg cgaacaagaa cccggacaac cccaagaaga ccacgaccct gtcctacgac 3720
gtgtacaagg acaagaggtt cagcgaggac cagtacgagc tccacatccc gatcgcgatc 3780
aacaagtgcc ccaagaacat cttcaagatc aacaccgagg tccgcgtgct cctgaagcac 3840
gacgacaacc cctacgtgat cggcatcgct aggggcgaga ggaacctcct gtacatcgtg 3900
gtcgtggacg gcaagggcaa catcgtggag cagtactccc tcaacgagat catcaacaac 3960
ttcaacggca tcaggatcaa gacggactac cacagcctcc tggacaagaa ggagaaggag 4020
aggttcgagg cccgccagaa ctggacctcc atcgagaaca tcaaggagct gaaggcgggc 4080
tacatcagcc aggtcgtgca caagatctgc gagctcgtcg agaagtacga cgccgtgatc 4140
gccctcgcgg acctgaactc cggcttcaag aacagccgcg tcaaggtgga gaagcaggtc 4200
taccagaagt tcgagaagat gctcatcgac aagctgaact acatggtgga caagaagtcc 4260
aacccctgcg ctacgggcgg cgcgctgaag ggctaccaga tcaccaacaa gttcgagagc 4320
ttcaagtcca tgagcactca gaacggcttc atcttctaca tcccggcgtg gctcacgtcc 4380
aagatcgacc ccagcaccgg cttcgtcaac ctcctgaaga cgaagtacac ctccatcgcc 4440
gacagcaaga agttcatctc cagcttcgac cgcatcatgt atgtgccgga ggaggacctg 4500
ttcgagttcg ccctcgacta caagaacttc tcccgcacgg acgcggacta catcaagaag 4560
tggaagctgt acagctacgg caaccgcatc cgcatcttca ggaaccccaa gaagaacaac 4620
gtcttcgact gggaggaggt gtgcctgacc tccgcgtaca aggagctctt caacaagtac 4680
ggcatcaact accagcaggg cgacatcagg gctctcctgt gcgagcagag cgacaaggcc 4740
ttctactcca gcttcatggc gctgatgtcc ctcatgctgc agatgaggaa ctcgatcacc 4800
ggcaggacgg acgtggcctt cctcatctcc ccggtgaaga acagcgacgg catcttctac 4860
gactccagga actacgaggc ccaggagaac gcgatcctcc caaagaacgc ggacgccaac 4920
ggcgcctaca acatcgccag gaaggtcctc tgggctatcg gccagttcaa gaaggcggag 4980
gacgagaagc tggacaaggt gaagatcgcc atcagcaaca aggagtggct cgagtacgcc 5040
cagacctcgg tcaagcacgg cagcccgaag aagaagcgca aggtgggcag cgcggagtac 5100
gttcgggctc tgttcgactt caacggcaac gacgaggagg acctcccgtt caagaagggc 5160
gacatcctgc gcatcaggga caagccggag gagcagtggt ggaacgccga ggactccgag 5220
ggcaagaggg gcatgatccc ggtcccctac gtggagaagt acatgaccga cgcggagtat 5280
gtgcgcatcc acgagaagct cgacatctac accttcaaga agcagttctt caacaacaag 5340
aagagcgtct cccacaggtg ctacgtgctc ttcgagctga agaggcgcgg cgagcgcagg 5400
gcctgcttct ggggctacgc cgtgaacaag ccgcagagcg gcaccgagcg cggcatccac 5460
gccgagatct tcagcatccg caaggtggag gagtacctca gggacaaccc gggccagttc 5520
accatcaact ggtacagctc ctggagcccg tgcgcggact gcgccgagaa gatcctggag 5580
tggtacaacc aggagctgag gggcaacggc cacaccctga agatctgggc ctgcaagctc 5640
tactacgaga agaacgcgag gaaccagatc ggcctgtgga acctccgcga caacggcgtc 5700
ggcctcaacg tgatggtctc cgagcactac cagtgctgcc gcaagatctt catccagtcc 5760
agccacaacc agctcaacga gaacaggtgg ctggagaaga ccctgaagag ggccgagaag 5820
cgcaggtccg agctcagcat catgatccag gtgaagatcc tccacaccac gaagtccccc 5880
gccgtgggca gcccgaagaa gaagcgcaag gtgtccggcg gcagcacgaa cctgtccgac 5940
atcatcgaga aggagaccgg caagcagctc gtgatccagg agagcatcct catgctgccg 6000
gaggaggtcg aggaggtcat cggcaacaag cccgagtccg acatcctcgt ccacacggcc 6060
tacgacgagt ccaccgacga gaacgtgatg ctcctgacct cggacgctcc cgagtacaag 6120
ccatgggccc tggtcatcca ggacagcaac ggcgagaaca agatcaagat gctctccggc 6180
ggcagcccga agaagaagcg caaagtgtga 6210
<210> 85
<211> 2063
<212> PRT
<213> Artificial Sequence
<220>
<223> Fusion protein
<400> 85
Met Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu
1 5 10 15
Thr Leu Ala Lys Arg Ala Trp Asp Glu Arg Glu Val Pro Val Gly Ala
20 25 30
Val Leu Val His Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Pro
35 40 45
Ile Gly Arg His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg
50 55 60
Gln Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu
65 70 75 80
Tyr Val Thr Leu Glu Pro Cys Val Met Val Ser Phe Cys Phe Tyr Leu
85 90 95
Tyr Ile Tyr Asn Asn Tyr His Leu Val Val Ile Tyr Phe Lys Tyr Phe
100 105 110
Phe Gln Asn Lys Arg Met Tyr Ile Ala Ile Ala Phe Leu Phe Ile Ser
115 120 125
Val Tyr Ile Leu Ile Tyr Asn Phe Ser Asn Ile Pro Lys Phe Val Asp
130 135 140
Val Gln Cys Ala Gly Ala Met Ile His Ser Arg Ile Gly Arg Val Val
145 150 155 160
Phe Gly Ala Arg Asp Ala Lys Thr Gly Ala Ala Gly Ser Leu Met Asp
165 170 175
Val Leu His His Pro Gly Met Asn His Arg Val Glu Ile Thr Glu Gly
180 185 190
Ile Leu Ala Asp Glu Cys Ala Ala Leu Leu Ser Asp Phe Phe Arg Met
195 200 205
Arg Arg Gln Glu Ile Lys Ala Gln Lys Lys Ala Gln Ser Ser Thr Asp
210 215 220
Ser Gly Gly Ser Ser Gly Gly Ser Ser Gly Ser Glu Thr Pro Gly Thr
225 230 235 240
Ser Glu Ser Ala Thr Pro Glu Ser Ser Gly Gly Ser Ser Gly Gly Ser
245 250 255
Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu Thr
260 265 270
Leu Ala Lys Arg Ala Arg Asp Glu Arg Glu Val Pro Val Gly Ala Val
275 280 285
Leu Val Leu Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Ala Ile
290 295 300
Gly Leu His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg Gln
305 310 315 320
Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu Tyr
325 330 335
Val Thr Phe Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His Ser
340 345 350
Arg Ile Gly Arg Val Val Phe Gly Val Arg Asn Ala Lys Thr Gly Ala
355 360 365
Ala Gly Ser Leu Met Asp Val Leu His Tyr Pro Gly Met Asn His Arg
370 375 380
Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu Leu
385 390 395 400
Cys Tyr Phe Phe Arg Met Pro Arg Gln Val Phe Asn Ala Gln Lys Lys
405 410 415
Ala Gln Ser Ser Thr Asp Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser
420 425 430
Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly
435 440 445
Gly Gly Gly Ser Met Ser Lys Leu Glu Lys Phe Thr Asn Cys Tyr Ser
450 455 460
Leu Ser Lys Thr Leu Arg Phe Lys Ala Ile Pro Val Gly Lys Thr Gln
465 470 475 480
Glu Asn Ile Asp Asn Lys Arg Leu Leu Val Glu Asp Glu Lys Arg Ala
485 490 495
Glu Asp Tyr Lys Gly Val Lys Lys Leu Leu Asp Arg Tyr Tyr Leu Ser
500 505 510
Phe Ile Asn Asp Val Leu His Ser Ile Lys Leu Lys Asn Leu Asn Asn
515 520 525
Tyr Ile Ser Leu Phe Arg Lys Lys Thr Arg Thr Glu Lys Glu Asn Lys
530 535 540
Glu Leu Glu Asn Leu Glu Ile Asn Leu Arg Lys Glu Ile Ala Lys Ala
545 550 555 560
Phe Lys Gly Asn Glu Gly Tyr Lys Ser Leu Phe Lys Lys Asp Ile Ile
565 570 575
Glu Thr Ile Leu Pro Glu Phe Leu Asp Asp Lys Asp Glu Ile Ala Leu
580 585 590
Val Asn Ser Phe Asn Gly Phe Thr Thr Ala Phe Thr Gly Phe Phe Asp
595 600 605
Asn Arg Glu Asn Met Phe Ser Glu Glu Ala Lys Ser Thr Ser Ile Ala
610 615 620
Phe Arg Cys Ile Asn Glu Asn Leu Thr Arg Tyr Ile Ser Asn Met Asp
625 630 635 640
Ile Phe Glu Lys Val Asp Ala Ile Phe Asp Lys His Glu Val Gln Glu
645 650 655
Ile Lys Glu Lys Ile Leu Asn Ser Asp Tyr Asp Val Glu Asp Phe Phe
660 665 670
Glu Gly Glu Phe Phe Asn Phe Val Leu Thr Gln Glu Gly Ile Asp Val
675 680 685
Tyr Asn Ala Ile Ile Gly Gly Phe Val Thr Glu Ser Gly Glu Lys Ile
690 695 700
Lys Gly Leu Asn Glu Tyr Ile Asn Leu Tyr Asn Gln Lys Thr Lys Gln
705 710 715 720
Lys Leu Pro Lys Phe Lys Pro Leu Tyr Lys Gln Val Leu Ser Asp Arg
725 730 735
Glu Ser Leu Ser Phe Tyr Gly Glu Gly Tyr Thr Ser Asp Glu Glu Val
740 745 750
Leu Glu Val Phe Arg Asn Thr Leu Asn Lys Asn Ser Glu Ile Phe Ser
755 760 765
Ser Ile Lys Lys Leu Glu Lys Leu Phe Lys Asn Phe Asp Glu Tyr Ser
770 775 780
Ser Ala Gly Ile Phe Val Lys Asn Gly Pro Ala Ile Ser Thr Ile Ser
785 790 795 800
Lys Asp Ile Phe Gly Glu Trp Asn Val Ile Arg Asp Lys Trp Asn Ala
805 810 815
Glu Tyr Asp Asp Ile His Leu Lys Lys Lys Ala Val Val Thr Glu Lys
820 825 830
Tyr Glu Asp Asp Arg Arg Lys Ser Phe Lys Lys Ile Gly Ser Phe Ser
835 840 845
Leu Glu Gln Leu Gln Glu Tyr Ala Asp Ala Asp Leu Ser Val Val Glu
850 855 860
Lys Leu Lys Glu Ile Ile Ile Gln Lys Val Asp Glu Ile Tyr Lys Val
865 870 875 880
Tyr Gly Ser Ser Glu Lys Leu Phe Asp Ala Asp Phe Val Leu Glu Lys
885 890 895
Ser Leu Lys Lys Asn Asp Ala Val Val Ala Ile Met Lys Asp Leu Leu
900 905 910
Asp Ser Val Lys Ser Phe Glu Asn Tyr Ile Lys Ala Phe Phe Gly Glu
915 920 925
Gly Lys Glu Thr Asn Arg Asp Glu Ser Phe Tyr Gly Asp Phe Val Leu
930 935 940
Ala Tyr Asp Ile Leu Leu Lys Val Asp His Ile Tyr Asp Ala Ile Arg
945 950 955 960
Asn Tyr Val Thr Gln Lys Pro Tyr Ser Lys Asp Lys Phe Lys Leu Tyr
965 970 975
Phe Gln Asn Pro Gln Phe Met Gly Gly Trp Asp Lys Asp Lys Glu Thr
980 985 990
Asp Tyr Arg Ala Thr Ile Leu Arg Tyr Gly Ser Lys Tyr Tyr Leu Ala
995 1000 1005
Ile Met Asp Lys Lys Tyr Ala Lys Cys Leu Gln Lys Ile Asp Lys
1010 1015 1020
Asp Asp Val Asn Gly Asn Tyr Glu Lys Ile Asn Tyr Lys Leu Leu
1025 1030 1035
Pro Gly Pro Asn Lys Met Leu Pro Lys Val Phe Phe Ser Lys Lys
1040 1045 1050
Trp Met Ala Tyr Tyr Asn Pro Ser Glu Asp Ile Gln Lys Ile Tyr
1055 1060 1065
Lys Asn Gly Thr Phe Lys Lys Gly Asp Met Phe Asn Leu Asn Asp
1070 1075 1080
Cys His Lys Leu Ile Asp Phe Phe Lys Asp Ser Ile Ser Arg Tyr
1085 1090 1095
Pro Lys Trp Ser Asn Ala Tyr Asp Phe Asn Phe Ser Glu Thr Glu
1100 1105 1110
Lys Tyr Lys Asp Ile Ala Gly Phe Tyr Arg Glu Val Glu Glu Gln
1115 1120 1125
Gly Tyr Lys Val Ser Phe Glu Ser Ala Ser Lys Lys Glu Val Asp
1130 1135 1140
Lys Leu Val Glu Glu Gly Lys Leu Tyr Met Phe Gln Ile Tyr Asn
1145 1150 1155
Lys Asp Phe Ser Asp Lys Ser His Gly Thr Pro Asn Leu His Thr
1160 1165 1170
Met Tyr Phe Lys Leu Leu Phe Asp Glu Asn Asn His Gly Gln Ile
1175 1180 1185
Arg Leu Ser Gly Gly Ala Glu Leu Phe Met Arg Arg Ala Ser Leu
1190 1195 1200
Lys Lys Glu Glu Leu Val Val His Pro Ala Asn Ser Pro Ile Ala
1205 1210 1215
Asn Lys Asn Pro Asp Asn Pro Lys Lys Thr Thr Thr Leu Ser Tyr
1220 1225 1230
Asp Val Tyr Lys Asp Lys Arg Phe Ser Glu Asp Gln Tyr Glu Leu
1235 1240 1245
His Ile Pro Ile Ala Ile Asn Lys Cys Pro Lys Asn Ile Phe Lys
1250 1255 1260
Ile Asn Thr Glu Val Arg Val Leu Leu Lys His Asp Asp Asn Pro
1265 1270 1275
Tyr Val Ile Gly Ile Ala Arg Gly Glu Arg Asn Leu Leu Tyr Ile
1280 1285 1290
Val Val Val Asp Gly Lys Gly Asn Ile Val Glu Gln Tyr Ser Leu
1295 1300 1305
Asn Glu Ile Ile Asn Asn Phe Asn Gly Ile Arg Ile Lys Thr Asp
1310 1315 1320
Tyr His Ser Leu Leu Asp Lys Lys Glu Lys Glu Arg Phe Glu Ala
1325 1330 1335
Arg Gln Asn Trp Thr Ser Ile Glu Asn Ile Lys Glu Leu Lys Ala
1340 1345 1350
Gly Tyr Ile Ser Gln Val Val His Lys Ile Cys Glu Leu Val Glu
1355 1360 1365
Lys Tyr Asp Ala Val Ile Ala Leu Ala Asp Leu Asn Ser Gly Phe
1370 1375 1380
Lys Asn Ser Arg Val Lys Val Glu Lys Gln Val Tyr Gln Lys Phe
1385 1390 1395
Glu Lys Met Leu Ile Asp Lys Leu Asn Tyr Met Val Asp Lys Lys
1400 1405 1410
Ser Asn Pro Cys Ala Thr Gly Gly Ala Leu Lys Gly Tyr Gln Ile
1415 1420 1425
Thr Asn Lys Phe Glu Ser Phe Lys Ser Met Ser Thr Gln Asn Gly
1430 1435 1440
Phe Ile Phe Tyr Ile Pro Ala Trp Leu Thr Ser Lys Ile Asp Pro
1445 1450 1455
Ser Thr Gly Phe Val Asn Leu Leu Lys Thr Lys Tyr Thr Ser Ile
1460 1465 1470
Ala Asp Ser Lys Lys Phe Ile Ser Ser Phe Asp Arg Ile Met Tyr
1475 1480 1485
Val Pro Glu Glu Asp Leu Phe Glu Phe Ala Leu Asp Tyr Lys Asn
1490 1495 1500
Phe Ser Arg Thr Asp Ala Asp Tyr Ile Lys Lys Trp Lys Leu Tyr
1505 1510 1515
Ser Tyr Gly Asn Arg Ile Arg Ile Phe Arg Asn Pro Lys Lys Asn
1520 1525 1530
Asn Val Phe Asp Trp Glu Glu Val Cys Leu Thr Ser Ala Tyr Lys
1535 1540 1545
Glu Leu Phe Asn Lys Tyr Gly Ile Asn Tyr Gln Gln Gly Asp Ile
1550 1555 1560
Arg Ala Leu Leu Cys Glu Gln Ser Asp Lys Ala Phe Tyr Ser Ser
1565 1570 1575
Phe Met Ala Leu Met Ser Leu Met Leu Gln Met Arg Asn Ser Ile
1580 1585 1590
Thr Gly Arg Thr Asp Val Ala Phe Leu Ile Ser Pro Val Lys Asn
1595 1600 1605
Ser Asp Gly Ile Phe Tyr Asp Ser Arg Asn Tyr Glu Ala Gln Glu
1610 1615 1620
Asn Ala Ile Leu Pro Lys Asn Ala Asp Ala Asn Gly Ala Tyr Asn
1625 1630 1635
Ile Ala Arg Lys Val Leu Trp Ala Ile Gly Gln Phe Lys Lys Ala
1640 1645 1650
Glu Asp Glu Lys Leu Asp Lys Val Lys Ile Ala Ile Ser Asn Lys
1655 1660 1665
Glu Trp Leu Glu Tyr Ala Gln Thr Ser Val Lys His Gly Ser Pro
1670 1675 1680
Lys Lys Lys Arg Lys Val Gly Ser Ala Glu Tyr Val Arg Ala Leu
1685 1690 1695
Phe Asp Phe Asn Gly Asn Asp Glu Glu Asp Leu Pro Phe Lys Lys
1700 1705 1710
Gly Asp Ile Leu Arg Ile Arg Asp Lys Pro Glu Glu Gln Trp Trp
1715 1720 1725
Asn Ala Glu Asp Ser Glu Gly Lys Arg Gly Met Ile Pro Val Pro
1730 1735 1740
Tyr Val Glu Lys Tyr Met Thr Asp Ala Glu Tyr Val Arg Ile His
1745 1750 1755
Glu Lys Leu Asp Ile Tyr Thr Phe Lys Lys Gln Phe Phe Asn Asn
1760 1765 1770
Lys Lys Ser Val Ser His Arg Cys Tyr Val Leu Phe Glu Leu Lys
1775 1780 1785
Arg Arg Gly Glu Arg Arg Ala Cys Phe Trp Gly Tyr Ala Val Asn
1790 1795 1800
Lys Pro Gln Ser Gly Thr Glu Arg Gly Ile His Ala Glu Ile Phe
1805 1810 1815
Ser Ile Arg Lys Val Glu Glu Tyr Leu Arg Asp Asn Pro Gly Gln
1820 1825 1830
Phe Thr Ile Asn Trp Tyr Ser Ser Trp Ser Pro Cys Ala Asp Cys
1835 1840 1845
Ala Glu Lys Ile Leu Glu Trp Tyr Asn Gln Glu Leu Arg Gly Asn
1850 1855 1860
Gly His Thr Leu Lys Ile Trp Ala Cys Lys Leu Tyr Tyr Glu Lys
1865 1870 1875
Asn Ala Arg Asn Gln Ile Gly Leu Trp Asn Leu Arg Asp Asn Gly
1880 1885 1890
Val Gly Leu Asn Val Met Val Ser Glu His Tyr Gln Cys Cys Arg
1895 1900 1905
Lys Ile Phe Ile Gln Ser Ser His Asn Gln Leu Asn Glu Asn Arg
1910 1915 1920
Trp Leu Glu Lys Thr Leu Lys Arg Ala Glu Lys Arg Arg Ser Glu
1925 1930 1935
Leu Ser Ile Met Ile Gln Val Lys Ile Leu His Thr Thr Lys Ser
1940 1945 1950
Pro Ala Val Gly Ser Pro Lys Lys Lys Arg Lys Val Ser Gly Gly
1955 1960 1965
Ser Thr Asn Leu Ser Asp Ile Ile Glu Lys Glu Thr Gly Lys Gln
1970 1975 1980
Leu Val Ile Gln Glu Ser Ile Leu Met Leu Pro Glu Glu Val Glu
1985 1990 1995
Glu Val Ile Gly Asn Lys Pro Glu Ser Asp Ile Leu Val His Thr
2000 2005 2010
Ala Tyr Asp Glu Ser Thr Asp Glu Asn Val Met Leu Leu Thr Ser
2015 2020 2025
Asp Ala Pro Glu Tyr Lys Pro Trp Ala Leu Val Ile Gln Asp Ser
2030 2035 2040
Asn Gly Glu Asn Lys Ile Lys Met Leu Ser Gly Gly Ser Pro Lys
2045 2050 2055
Lys Lys Arg Lys Val
2060
<210> 86
<211> 6201
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 86
atgccgaaga agaagcgcaa ggtcatgacc gacgcggagt atgtgcgcat ccacgagaag 60
ctcgacatct acaccttcaa gaagcagttc ttcaacaaca agaagagcgt ctcccacagg 120
tgctacgtgc tcttcgagct gaagaggcgc ggcgagcgca gggcctgctt ctggggctac 180
gccgtgaaca agccgcagag cggcaccgag cgcggcatcc acgccgagat cttcagcatc 240
cgcaaggtgg aggagtacct cagggacaac ccgggccagt tcaccatcaa ctggtacagc 300
tcctggagcc cggtaagttt ctgcttctac ctttgatata tatataataa ttatcattaa 360
ttagtagtaa tataatattt caaatatttt tttcaaaata aaagaatgta gtatatagca 420
attgcttttc tgtagtttat aagtgtgtat attttaattt ataacttttc taatatatga 480
ccaaaatttg ttgatgtgca gtgcgcggac tgcgccgaga agatcctgga gtggtacaac 540
caggagctga ggggcaacgg ccacaccctg aagatctggg cctgcaagct ctactacgag 600
aagaacgcga ggaaccagat cggcctgtgg aacctccgcg acaacggcgt cggcctcaac 660
gtgatggtct ccgagcacta ccagtgctgc cgcaagatct tcatccagtc cagccacaac 720
cagctcaacg agaacaggtg gctggagaag accctgaaga gggccgagaa gcgcaggtcc 780
gagctcagca tcatgatcca ggtgaagatc ctccacacca cgaagtcccc cgccgtggcg 840
gagtacgttc gggctctgtt cgacttcaac ggcaacgacg aggaggacct cccgttcaag 900
aagggcgaca tcctgcgcat cagggacaag ccggaggagc agtggtggaa cgccgaggac 960
tccgagggca agaggggcat gatcccggtc ccctacgtgg agaagtacat gtccgaggtg 1020
gagttcagcc acgagtactg gatgaggcac gctctcaccc tggctaagag ggcgtgggac 1080
gagagggagg tgccggtggg cgccgtgctc gtccacaaca accgcgtgat cggcgagggc 1140
tggaacaggc ccatcggcag gcacgaccca accgctcacg ccgagatcat ggctctcagg 1200
cagggcggcc tggtcatgca gaactacagg ctgatcgacg cgaccctcta cgtgaccctc 1260
gagccctgcg tcatgtgcgc gggcgccatg atccactcca ggatcggcag ggtggtcttc 1320
ggcgctaggg acgccaagac gggcgctgcg ggcagcctca tggacgtgct gcaccacccc 1380
ggcatgaacc accgcgtcga gatcaccgag ggcatcctcg cggacgagtg cgctgcgctc 1440
ctgtccgact tcttcaggat gcgcaggcag gagatcaagg cccagaagaa ggcgcagtcc 1500
agcaccgact ccggcggctc cagcggcggc tccagcggca gcgagacccc gggcacgtcc 1560
gagagcgcga cgcccgagag cagcggcggc tccagcggcg gctcctcgga ggtcgagttc 1620
agccatgagt actggatgag gcatgccctg actctcgcta agagggcgcg ggatgagcgc 1680
gaggtgccgg tgggggccgt gctcgtcctg aacaaccgcg tgatcgggga gggctggaac 1740
cgggctatcg gcctccacga cccaacggcc catgccgaga tcatggccct gaggcagggc 1800
ggcctggtca tgcaaaacta caggctcatc gacgccaccc tctacgtgac cttcgagcca 1860
tgcgtgatgt gcgcgggggc catgatccac tcgaggattg ggagggtggt cttcggcgtg 1920
cgcaacgcta agacgggggc cgccggcagc ctcatggacg tcctgcacta cccgggcatg 1980
aaccacaggg tggagattac cgagggcatc ctggccgatg agtgcgccgc gctcctgtgc 2040
tacttcttcc gcatgcccag gcaggtcttc aacgcgcaga agaaggccca gagctccact 2100
gatgggggcg ggggctcagg cgggggcggg agcggcggcg ggggctctgg gggcggcggc 2160
agcggcgggg gcggcagcgg gggcggcggg tcgatgagca agctggagaa gttcacgaac 2220
tgctactccc tcagcaagac cctgaggttc aaggcgatcc cggtcggcaa gacccaggag 2280
aacatcgaca acaagcggct gctggtggag gacgagaaga gggctgagga ctacaagggc 2340
gtgaagaagc tcctggaccg ctactacctg tccttcatca acgacgtgct ccacagcatc 2400
aagctcaaga acctgaacaa ctacatcagc ctcttcagga agaagacgcg caccgagaag 2460
gagaacaagg agctcgagaa cctggagatc aacctgagga aggagatcgc caaggcgttc 2520
aagggcaacg agggctacaa gtccctcttc aagaaggaca tcatcgagac gatcctcccg 2580
gagttcctgg acgacaagga cgagatcgcc ctggtcaact ccttcaacgg cttcaccacg 2640
gcgttcaccg gcttcttcga caaccgcgag aacatgttca gcgaggaggc caagtccacg 2700
agcatcgcgt tcaggtgcat caacgagaac ctcacccgct acatctccaa catggacatc 2760
ttcgagaagg tcgacgcgat cttcgacaag cacgaggtgc aggagatcaa ggagaagatc 2820
ctgaacagcg actacgacgt cgaggacttc ttcgagggcg agttcttcaa cttcgtcctc 2880
acgcaggagg gcatcgacgt gtacaacgcc atcatcggtg gcttcgtgac cgagtccggc 2940
gagaagatca agggcctgaa cgagtacatc aacctctaca accagaagac caagcagaag 3000
ctgccgaagt tcaagcccct gtacaagcag gtgctctccg acagggagtc cctcagcttc 3060
tacggcgagg gctacacgag cgacgaggag gtcctggagg tgttccgcaa caccctcaac 3120
aagaacagcg agatcttctc cagcatcaag aagctcgaga agctgttcaa gaacttcgac 3180
gagtactcca gcgccggcat cttcgtcaag aacggcccgg cgatctccac gatcagcaag 3240
gacatcttcg gcgagtggaa cgtgatccgc gacaagtgga acgccgagta cgacgacatc 3300
cacctcaaga agaaggcggt ggtcaccgag aagtacgagg acgacaggcg caagtccttc 3360
aagaagatcg gctccttcag cctcgagcag ctgcaggagt acgccgacgc ggacctgagc 3420
gtggtcgaga agctcaagga gatcatcatc cagaaggtcg acgagatcta caaggtgtac 3480
ggctccagcg agaagctctt cgacgcggac ttcgtcctcg agaagtccct gaagaagaac 3540
gacgccgtgg tcgcgatcat gaaggacctc ctggactccg tgaagagctt cgagaattac 3600
atcaaggcct tcttcggcga gggcaaggag acgaacaggg acgagtcctt ctacggcgac 3660
ttcgtcctgg cctacgacat cctcctgaag gtggaccaca tctacgacgc gatccgcaac 3720
tacgtgaccc agaagccgta cagcaaggac aagttcaagc tctacttcca gaacccccag 3780
ttcatgggcg gctgggacaa ggacaaggag acggactaca gggcgaccat cctgcgctac 3840
ggcagcaagt actacctcgc catcatggac aagaagtacg cgaagtgcct gcagaagatc 3900
gacaaggacg acgtcaacgg caactacgag aagatcaact acaagctcct gccgggcccc 3960
aacaagatgc tcccgaaggt gttcttctcc aagaagtgga tggcctacta caaccccagc 4020
gaggacatcc agaagatcta caagaacggc acgttcaaga agggcgacat gttcaacctg 4080
aacgactgcc acaagctcat cgacttcttc aaggactcca tcagccgcta cccgaagtgg 4140
tccaacgcct acgacttcaa cttcagcgag accgagaagt acaaggacat cgcgggcttc 4200
taccgcgagg tcgaggagca gggctacaag gtgtccttcg agtccgccag caagaaggag 4260
gtcgacaagc tggtggagga gggcaagctc tacatgttcc agatctacaa caaggacttc 4320
tccgacaaga gccacggcac gcccaacctg cacaccatgt acttcaagct cctgttcgac 4380
gagaacaacc acggccagat caggctgtcc ggcggcgccg agctcttcat gaggagggcg 4440
agcctgaaga aggaggagct ggtggtccac cccgctaaca gcccaatcgc gaacaagaac 4500
ccggacaacc ccaagaagac cacgaccctg tcctacgacg tgtacaagga caagaggttc 4560
agcgaggacc agtacgagct ccacatcccg atcgcgatca acaagtgccc caagaacatc 4620
ttcaagatca acaccgaggt ccgcgtgctc ctgaagcacg acgacaaccc ctacgtgatc 4680
ggcatcgcta ggggcgagag gaacctcctg tacatcgtgg tcgtggacgg caagggcaac 4740
atcgtggagc agtactccct caacgagatc atcaacaact tcaacggcat caggatcaag 4800
acggactacc acagcctcct ggacaagaag gagaaggaga ggttcgaggc ccgccagaac 4860
tggacctcca tcgagaacat caaggagctg aaggcgggct acatcagcca ggtcgtgcac 4920
aagatctgcg agctcgtcga gaagtacgac gccgtgatcg ccctcgcgga cctgaactcc 4980
ggcttcaaga acagccgcgt caaggtggag aagcaggtct accagaagtt cgagaagatg 5040
ctcatcgaca agctgaacta catggtggac aagaagtcca acccctgcgc tacgggcggc 5100
gcgctgaagg gctaccagat caccaacaag ttcgagagct tcaagtccat gagcactcag 5160
aacggcttca tcttctacat cccggcgtgg ctcacgtcca agatcgaccc cagcaccggc 5220
ttcgtcaacc tcctgaagac gaagtacacc tccatcgccg acagcaagaa gttcatctcc 5280
agcttcgacc gcatcatgta tgtgccggag gaggacctgt tcgagttcgc cctcgactac 5340
aagaacttct cccgcacgga cgcggactac atcaagaagt ggaagctgta cagctacggc 5400
aaccgcatcc gcatcttcag gaaccccaag aagaacaacg tcttcgactg ggaggaggtg 5460
tgcctgacct ccgcgtacaa ggagctcttc aacaagtacg gcatcaacta ccagcagggc 5520
gacatcaggg ctctcctgtg cgagcagagc gacaaggcct tctactccag cttcatggcg 5580
ctgatgtccc tcatgctgca gatgaggaac tcgatcaccg gcaggacgga cgtggccttc 5640
ctcatctccc cggtgaagaa cagcgacggc atcttctacg actccaggaa ctacgaggcc 5700
caggagaacg cgatcctccc aaagaacgcg gacgccaacg gcgcctacaa catcgccagg 5760
aaggtcctct gggctatcgg ccagttcaag aaggcggagg acgagaagct ggacaaggtg 5820
aagatcgcca tcagcaacaa ggagtggctc gagtacgccc agacctcggt caagcacggc 5880
agcccgaaga agaagcgcaa ggtgtccggc ggcagcacga acctgtccga catcatcgag 5940
aaggagaccg gcaagcagct cgtgatccag gagagcatcc tcatgctgcc ggaggaggtc 6000
gaggaggtca tcggcaacaa gcccgagtcc gacatcctcg tccacacggc ctacgacgag 6060
tccaccgacg agaacgtgat gctcctgacc tcggacgctc ccgagtacaa gccatgggcc 6120
ctggtcatcc aggacagcaa cggcgagaac aagatcaaga tgctctccgg cggcagcccg 6180
aagaagaagc gcaaagtgtg a 6201
<210> 87
<211> 2060
<212> PRT
<213> Artificial Sequence
<220>
<223> Fusion protein
<400> 87
Met Pro Lys Lys Lys Arg Lys Val Met Thr Asp Ala Glu Tyr Val Arg
1 5 10 15
Ile His Glu Lys Leu Asp Ile Tyr Thr Phe Lys Lys Gln Phe Phe Asn
20 25 30
Asn Lys Lys Ser Val Ser His Arg Cys Tyr Val Leu Phe Glu Leu Lys
35 40 45
Arg Arg Gly Glu Arg Arg Ala Cys Phe Trp Gly Tyr Ala Val Asn Lys
50 55 60
Pro Gln Ser Gly Thr Glu Arg Gly Ile His Ala Glu Ile Phe Ser Ile
65 70 75 80
Arg Lys Val Glu Glu Tyr Leu Arg Asp Asn Pro Gly Gln Phe Thr Ile
85 90 95
Asn Trp Tyr Ser Ser Trp Ser Pro Val Ser Phe Cys Phe Tyr Leu Tyr
100 105 110
Ile Tyr Asn Asn Tyr His Leu Val Val Ile Tyr Phe Lys Tyr Phe Phe
115 120 125
Gln Asn Lys Arg Met Tyr Ile Ala Ile Ala Phe Leu Phe Ile Ser Val
130 135 140
Tyr Ile Leu Ile Tyr Asn Phe Ser Asn Ile Pro Lys Phe Val Asp Val
145 150 155 160
Gln Cys Ala Asp Cys Ala Glu Lys Ile Leu Glu Trp Tyr Asn Gln Glu
165 170 175
Leu Arg Gly Asn Gly His Thr Leu Lys Ile Trp Ala Cys Lys Leu Tyr
180 185 190
Tyr Glu Lys Asn Ala Arg Asn Gln Ile Gly Leu Trp Asn Leu Arg Asp
195 200 205
Asn Gly Val Gly Leu Asn Val Met Val Ser Glu His Tyr Gln Cys Cys
210 215 220
Arg Lys Ile Phe Ile Gln Ser Ser His Asn Gln Leu Asn Glu Asn Arg
225 230 235 240
Trp Leu Glu Lys Thr Leu Lys Arg Ala Glu Lys Arg Arg Ser Glu Leu
245 250 255
Ser Ile Met Ile Gln Val Lys Ile Leu His Thr Thr Lys Ser Pro Ala
260 265 270
Val Ala Glu Tyr Val Arg Ala Leu Phe Asp Phe Asn Gly Asn Asp Glu
275 280 285
Glu Asp Leu Pro Phe Lys Lys Gly Asp Ile Leu Arg Ile Arg Asp Lys
290 295 300
Pro Glu Glu Gln Trp Trp Asn Ala Glu Asp Ser Glu Gly Lys Arg Gly
305 310 315 320
Met Ile Pro Val Pro Tyr Val Glu Lys Tyr Met Ser Glu Val Glu Phe
325 330 335
Ser His Glu Tyr Trp Met Arg His Ala Leu Thr Leu Ala Lys Arg Ala
340 345 350
Trp Asp Glu Arg Glu Val Pro Val Gly Ala Val Leu Val His Asn Asn
355 360 365
Arg Val Ile Gly Glu Gly Trp Asn Arg Pro Ile Gly Arg His Asp Pro
370 375 380
Thr Ala His Ala Glu Ile Met Ala Leu Arg Gln Gly Gly Leu Val Met
385 390 395 400
Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu Tyr Val Thr Leu Glu Pro
405 410 415
Cys Val Met Cys Ala Gly Ala Met Ile His Ser Arg Ile Gly Arg Val
420 425 430
Val Phe Gly Ala Arg Asp Ala Lys Thr Gly Ala Ala Gly Ser Leu Met
435 440 445
Asp Val Leu His His Pro Gly Met Asn His Arg Val Glu Ile Thr Glu
450 455 460
Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu Leu Ser Asp Phe Phe Arg
465 470 475 480
Met Arg Arg Gln Glu Ile Lys Ala Gln Lys Lys Ala Gln Ser Ser Thr
485 490 495
Asp Ser Gly Gly Ser Ser Gly Gly Ser Ser Gly Ser Glu Thr Pro Gly
500 505 510
Thr Ser Glu Ser Ala Thr Pro Glu Ser Ser Gly Gly Ser Ser Gly Gly
515 520 525
Ser Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu
530 535 540
Thr Leu Ala Lys Arg Ala Arg Asp Glu Arg Glu Val Pro Val Gly Ala
545 550 555 560
Val Leu Val Leu Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Ala
565 570 575
Ile Gly Leu His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg
580 585 590
Gln Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu
595 600 605
Tyr Val Thr Phe Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His
610 615 620
Ser Arg Ile Gly Arg Val Val Phe Gly Val Arg Asn Ala Lys Thr Gly
625 630 635 640
Ala Ala Gly Ser Leu Met Asp Val Leu His Tyr Pro Gly Met Asn His
645 650 655
Arg Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu
660 665 670
Leu Cys Tyr Phe Phe Arg Met Pro Arg Gln Val Phe Asn Ala Gln Lys
675 680 685
Lys Ala Gln Ser Ser Thr Asp Gly Gly Gly Gly Ser Gly Gly Gly Gly
690 695 700
Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser
705 710 715 720
Gly Gly Gly Gly Ser Met Ser Lys Leu Glu Lys Phe Thr Asn Cys Tyr
725 730 735
Ser Leu Ser Lys Thr Leu Arg Phe Lys Ala Ile Pro Val Gly Lys Thr
740 745 750
Gln Glu Asn Ile Asp Asn Lys Arg Leu Leu Val Glu Asp Glu Lys Arg
755 760 765
Ala Glu Asp Tyr Lys Gly Val Lys Lys Leu Leu Asp Arg Tyr Tyr Leu
770 775 780
Ser Phe Ile Asn Asp Val Leu His Ser Ile Lys Leu Lys Asn Leu Asn
785 790 795 800
Asn Tyr Ile Ser Leu Phe Arg Lys Lys Thr Arg Thr Glu Lys Glu Asn
805 810 815
Lys Glu Leu Glu Asn Leu Glu Ile Asn Leu Arg Lys Glu Ile Ala Lys
820 825 830
Ala Phe Lys Gly Asn Glu Gly Tyr Lys Ser Leu Phe Lys Lys Asp Ile
835 840 845
Ile Glu Thr Ile Leu Pro Glu Phe Leu Asp Asp Lys Asp Glu Ile Ala
850 855 860
Leu Val Asn Ser Phe Asn Gly Phe Thr Thr Ala Phe Thr Gly Phe Phe
865 870 875 880
Asp Asn Arg Glu Asn Met Phe Ser Glu Glu Ala Lys Ser Thr Ser Ile
885 890 895
Ala Phe Arg Cys Ile Asn Glu Asn Leu Thr Arg Tyr Ile Ser Asn Met
900 905 910
Asp Ile Phe Glu Lys Val Asp Ala Ile Phe Asp Lys His Glu Val Gln
915 920 925
Glu Ile Lys Glu Lys Ile Leu Asn Ser Asp Tyr Asp Val Glu Asp Phe
930 935 940
Phe Glu Gly Glu Phe Phe Asn Phe Val Leu Thr Gln Glu Gly Ile Asp
945 950 955 960
Val Tyr Asn Ala Ile Ile Gly Gly Phe Val Thr Glu Ser Gly Glu Lys
965 970 975
Ile Lys Gly Leu Asn Glu Tyr Ile Asn Leu Tyr Asn Gln Lys Thr Lys
980 985 990
Gln Lys Leu Pro Lys Phe Lys Pro Leu Tyr Lys Gln Val Leu Ser Asp
995 1000 1005
Arg Glu Ser Leu Ser Phe Tyr Gly Glu Gly Tyr Thr Ser Asp Glu
1010 1015 1020
Glu Val Leu Glu Val Phe Arg Asn Thr Leu Asn Lys Asn Ser Glu
1025 1030 1035
Ile Phe Ser Ser Ile Lys Lys Leu Glu Lys Leu Phe Lys Asn Phe
1040 1045 1050
Asp Glu Tyr Ser Ser Ala Gly Ile Phe Val Lys Asn Gly Pro Ala
1055 1060 1065
Ile Ser Thr Ile Ser Lys Asp Ile Phe Gly Glu Trp Asn Val Ile
1070 1075 1080
Arg Asp Lys Trp Asn Ala Glu Tyr Asp Asp Ile His Leu Lys Lys
1085 1090 1095
Lys Ala Val Val Thr Glu Lys Tyr Glu Asp Asp Arg Arg Lys Ser
1100 1105 1110
Phe Lys Lys Ile Gly Ser Phe Ser Leu Glu Gln Leu Gln Glu Tyr
1115 1120 1125
Ala Asp Ala Asp Leu Ser Val Val Glu Lys Leu Lys Glu Ile Ile
1130 1135 1140
Ile Gln Lys Val Asp Glu Ile Tyr Lys Val Tyr Gly Ser Ser Glu
1145 1150 1155
Lys Leu Phe Asp Ala Asp Phe Val Leu Glu Lys Ser Leu Lys Lys
1160 1165 1170
Asn Asp Ala Val Val Ala Ile Met Lys Asp Leu Leu Asp Ser Val
1175 1180 1185
Lys Ser Phe Glu Asn Tyr Ile Lys Ala Phe Phe Gly Glu Gly Lys
1190 1195 1200
Glu Thr Asn Arg Asp Glu Ser Phe Tyr Gly Asp Phe Val Leu Ala
1205 1210 1215
Tyr Asp Ile Leu Leu Lys Val Asp His Ile Tyr Asp Ala Ile Arg
1220 1225 1230
Asn Tyr Val Thr Gln Lys Pro Tyr Ser Lys Asp Lys Phe Lys Leu
1235 1240 1245
Tyr Phe Gln Asn Pro Gln Phe Met Gly Gly Trp Asp Lys Asp Lys
1250 1255 1260
Glu Thr Asp Tyr Arg Ala Thr Ile Leu Arg Tyr Gly Ser Lys Tyr
1265 1270 1275
Tyr Leu Ala Ile Met Asp Lys Lys Tyr Ala Lys Cys Leu Gln Lys
1280 1285 1290
Ile Asp Lys Asp Asp Val Asn Gly Asn Tyr Glu Lys Ile Asn Tyr
1295 1300 1305
Lys Leu Leu Pro Gly Pro Asn Lys Met Leu Pro Lys Val Phe Phe
1310 1315 1320
Ser Lys Lys Trp Met Ala Tyr Tyr Asn Pro Ser Glu Asp Ile Gln
1325 1330 1335
Lys Ile Tyr Lys Asn Gly Thr Phe Lys Lys Gly Asp Met Phe Asn
1340 1345 1350
Leu Asn Asp Cys His Lys Leu Ile Asp Phe Phe Lys Asp Ser Ile
1355 1360 1365
Ser Arg Tyr Pro Lys Trp Ser Asn Ala Tyr Asp Phe Asn Phe Ser
1370 1375 1380
Glu Thr Glu Lys Tyr Lys Asp Ile Ala Gly Phe Tyr Arg Glu Val
1385 1390 1395
Glu Glu Gln Gly Tyr Lys Val Ser Phe Glu Ser Ala Ser Lys Lys
1400 1405 1410
Glu Val Asp Lys Leu Val Glu Glu Gly Lys Leu Tyr Met Phe Gln
1415 1420 1425
Ile Tyr Asn Lys Asp Phe Ser Asp Lys Ser His Gly Thr Pro Asn
1430 1435 1440
Leu His Thr Met Tyr Phe Lys Leu Leu Phe Asp Glu Asn Asn His
1445 1450 1455
Gly Gln Ile Arg Leu Ser Gly Gly Ala Glu Leu Phe Met Arg Arg
1460 1465 1470
Ala Ser Leu Lys Lys Glu Glu Leu Val Val His Pro Ala Asn Ser
1475 1480 1485
Pro Ile Ala Asn Lys Asn Pro Asp Asn Pro Lys Lys Thr Thr Thr
1490 1495 1500
Leu Ser Tyr Asp Val Tyr Lys Asp Lys Arg Phe Ser Glu Asp Gln
1505 1510 1515
Tyr Glu Leu His Ile Pro Ile Ala Ile Asn Lys Cys Pro Lys Asn
1520 1525 1530
Ile Phe Lys Ile Asn Thr Glu Val Arg Val Leu Leu Lys His Asp
1535 1540 1545
Asp Asn Pro Tyr Val Ile Gly Ile Ala Arg Gly Glu Arg Asn Leu
1550 1555 1560
Leu Tyr Ile Val Val Val Asp Gly Lys Gly Asn Ile Val Glu Gln
1565 1570 1575
Tyr Ser Leu Asn Glu Ile Ile Asn Asn Phe Asn Gly Ile Arg Ile
1580 1585 1590
Lys Thr Asp Tyr His Ser Leu Leu Asp Lys Lys Glu Lys Glu Arg
1595 1600 1605
Phe Glu Ala Arg Gln Asn Trp Thr Ser Ile Glu Asn Ile Lys Glu
1610 1615 1620
Leu Lys Ala Gly Tyr Ile Ser Gln Val Val His Lys Ile Cys Glu
1625 1630 1635
Leu Val Glu Lys Tyr Asp Ala Val Ile Ala Leu Ala Asp Leu Asn
1640 1645 1650
Ser Gly Phe Lys Asn Ser Arg Val Lys Val Glu Lys Gln Val Tyr
1655 1660 1665
Gln Lys Phe Glu Lys Met Leu Ile Asp Lys Leu Asn Tyr Met Val
1670 1675 1680
Asp Lys Lys Ser Asn Pro Cys Ala Thr Gly Gly Ala Leu Lys Gly
1685 1690 1695
Tyr Gln Ile Thr Asn Lys Phe Glu Ser Phe Lys Ser Met Ser Thr
1700 1705 1710
Gln Asn Gly Phe Ile Phe Tyr Ile Pro Ala Trp Leu Thr Ser Lys
1715 1720 1725
Ile Asp Pro Ser Thr Gly Phe Val Asn Leu Leu Lys Thr Lys Tyr
1730 1735 1740
Thr Ser Ile Ala Asp Ser Lys Lys Phe Ile Ser Ser Phe Asp Arg
1745 1750 1755
Ile Met Tyr Val Pro Glu Glu Asp Leu Phe Glu Phe Ala Leu Asp
1760 1765 1770
Tyr Lys Asn Phe Ser Arg Thr Asp Ala Asp Tyr Ile Lys Lys Trp
1775 1780 1785
Lys Leu Tyr Ser Tyr Gly Asn Arg Ile Arg Ile Phe Arg Asn Pro
1790 1795 1800
Lys Lys Asn Asn Val Phe Asp Trp Glu Glu Val Cys Leu Thr Ser
1805 1810 1815
Ala Tyr Lys Glu Leu Phe Asn Lys Tyr Gly Ile Asn Tyr Gln Gln
1820 1825 1830
Gly Asp Ile Arg Ala Leu Leu Cys Glu Gln Ser Asp Lys Ala Phe
1835 1840 1845
Tyr Ser Ser Phe Met Ala Leu Met Ser Leu Met Leu Gln Met Arg
1850 1855 1860
Asn Ser Ile Thr Gly Arg Thr Asp Val Ala Phe Leu Ile Ser Pro
1865 1870 1875
Val Lys Asn Ser Asp Gly Ile Phe Tyr Asp Ser Arg Asn Tyr Glu
1880 1885 1890
Ala Gln Glu Asn Ala Ile Leu Pro Lys Asn Ala Asp Ala Asn Gly
1895 1900 1905
Ala Tyr Asn Ile Ala Arg Lys Val Leu Trp Ala Ile Gly Gln Phe
1910 1915 1920
Lys Lys Ala Glu Asp Glu Lys Leu Asp Lys Val Lys Ile Ala Ile
1925 1930 1935
Ser Asn Lys Glu Trp Leu Glu Tyr Ala Gln Thr Ser Val Lys His
1940 1945 1950
Gly Ser Pro Lys Lys Lys Arg Lys Val Ser Gly Gly Ser Thr Asn
1955 1960 1965
Leu Ser Asp Ile Ile Glu Lys Glu Thr Gly Lys Gln Leu Val Ile
1970 1975 1980
Gln Glu Ser Ile Leu Met Leu Pro Glu Glu Val Glu Glu Val Ile
1985 1990 1995
Gly Asn Lys Pro Glu Ser Asp Ile Leu Val His Thr Ala Tyr Asp
2000 2005 2010
Glu Ser Thr Asp Glu Asn Val Met Leu Leu Thr Ser Asp Ala Pro
2015 2020 2025
Glu Tyr Lys Pro Trp Ala Leu Val Ile Gln Asp Ser Asn Gly Glu
2030 2035 2040
Asn Lys Ile Lys Met Leu Ser Gly Gly Ser Pro Lys Lys Lys Arg
2045 2050 2055
Lys Val
2060
<210> 88
<211> 6183
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 88
atgtccgagg tggagttcag ccacgagtac tggatgaggc acgctctcac cctggctaag 60
agggcgtggg acgagaggga ggtgccggtg ggcgccgtgc tcgtccacaa caaccgcgtg 120
atcggcgagg gctggaacag gcccatcggc aggcacgacc caaccgctca cgccgagatc 180
atggctctca ggcagggcgg cctggtcatg cagaactaca ggctgatcga cgcgaccctc 240
tacgtgaccc tcgagccctg cgtcatggta agtttctgct tctacctttg atatatatat 300
aataattatc attaattagt agtaatataa tatttcaaat atttttttca aaataaaaga 360
atgtagtata tagcaattgc ttttctgtag tttataagtg tgtatatttt aatttataac 420
ttttctaata tatgaccaaa atttgttgat gtgcagtgcg cgggcgccat gatccactcc 480
aggatcggca gggtggtctt cggcgctagg gacgccaaga cgggcgctgc gggcagcctc 540
atggacgtgc tgcaccaccc cggcatgaac caccgcgtcg agatcaccga gggcatcctc 600
gcggacgagt gcgctgcgct cctgtccgac ttcttcagga tgcgcaggca ggagatcaag 660
gcccagaaga aggcgcagtc cagcaccgac tccggcggct ccagcggcgg ctccagcggc 720
agcgagaccc cgggcacgtc cgagagcgcg acgcccgaga gcagcggcgg ctccagcggc 780
ggctcctcgg aggtcgagtt cagccatgag tactggatga ggcatgccct gactctcgct 840
aagagggcgc gggatgagcg cgaggtgccg gtgggggccg tgctcgtcct gaacaaccgc 900
gtgatcgggg agggctggaa ccgggctatc ggcctccacg acccaacggc ccatgccgag 960
atcatggccc tgaggcaggg cggcctggtc atgcaaaact acaggctcat cgacgccacc 1020
ctctacgtga ccttcgagcc atgcgtgatg tgcgcggggg ccatgatcca ctcgaggatt 1080
gggagggtgg tcttcggcgt gcgcaacgct aagacggggg ccgccggcag cctcatggac 1140
gtcctgcact acccgggcat gaaccacagg gtggagatta ccgagggcat cctggccgat 1200
gagtgcgccg cgctcctgtg ctacttcttc cgcatgccca ggcaggtctt caacgcgcag 1260
aagaaggccc agagctccac tgatggcagc gcggagtacg ttcgggctct gttcgacttc 1320
aacggcaacg acgaggagga cctcccgttc aagaagggcg acatcctgcg catcagggac 1380
aagccggagg agcagtggtg gaacgccgag gactccgagg gcaagagggg catgatcccg 1440
gtcccctacg tggagaagta catgaccgac gcggagtatg tgcgcatcca cgagaagctc 1500
gacatctaca ccttcaagaa gcagttcttc aacaacaaga agagcgtctc ccacaggtgc 1560
tacgtgctct tcgagctgaa gaggcgcggc gagcgcaggg cctgcttctg gggctacgcc 1620
gtgaacaagc cgcagagcgg caccgagcgc ggcatccacg ccgagatctt cagcatccgc 1680
aaggtggagg agtacctcag ggacaacccg ggccagttca ccatcaactg gtacagctcc 1740
tggagcccgt gcgcggactg cgccgagaag atcctggagt ggtacaacca ggagctgagg 1800
ggcaacggcc acaccctgaa gatctgggcc tgcaagctct actacgagaa gaacgcgagg 1860
aaccagatcg gcctgtggaa cctccgcgac aacggcgtcg gcctcaacgt gatggtctcc 1920
gagcactacc agtgctgccg caagatcttc atccagtcca gccacaacca gctcaacgag 1980
aacaggtggc tggagaagac cctgaagagg gccgagaagc gcaggtccga gctcagcatc 2040
atgatccagg tgaagatcct ccacaccacg aagtcccccg ccgtgggggg cgggggctca 2100
ggcgggggcg ggagcggcgg cgggggctct gggggcggcg gcagcggcgg gggcggcagc 2160
gggggcggcg ggtcgatgag caagctggag aagttcacga actgctactc cctcagcaag 2220
accctgaggt tcaaggcgat cccggtcggc aagacccagg agaacatcga caacaagcgg 2280
ctgctggtgg aggacgagaa gagggctgag gactacaagg gcgtgaagaa gctcctggac 2340
cgctactacc tgtccttcat caacgacgtg ctccacagca tcaagctcaa gaacctgaac 2400
aactacatca gcctcttcag gaagaagacg cgcaccgaga aggagaacaa ggagctcgag 2460
aacctggaga tcaacctgag gaaggagatc gccaaggcgt tcaagggcaa cgagggctac 2520
aagtccctct tcaagaagga catcatcgag acgatcctcc cggagttcct ggacgacaag 2580
gacgagatcg ccctggtcaa ctccttcaac ggcttcacca cggcgttcac cggcttcttc 2640
gacaaccgcg agaacatgtt cagcgaggag gccaagtcca cgagcatcgc gttcaggtgc 2700
atcaacgaga acctcacccg ctacatctcc aacatggaca tcttcgagaa ggtcgacgcg 2760
atcttcgaca agcacgaggt gcaggagatc aaggagaaga tcctgaacag cgactacgac 2820
gtcgaggact tcttcgaggg cgagttcttc aacttcgtcc tcacgcagga gggcatcgac 2880
gtgtacaacg ccatcatcgg tggcttcgtg accgagtccg gcgagaagat caagggcctg 2940
aacgagtaca tcaacctcta caaccagaag accaagcaga agctgccgaa gttcaagccc 3000
ctgtacaagc aggtgctctc cgacagggag tccctcagct tctacggcga gggctacacg 3060
agcgacgagg aggtcctgga ggtgttccgc aacaccctca acaagaacag cgagatcttc 3120
tccagcatca agaagctcga gaagctgttc aagaacttcg acgagtactc cagcgccggc 3180
atcttcgtca agaacggccc ggcgatctcc acgatcagca aggacatctt cggcgagtgg 3240
aacgtgatcc gcgacaagtg gaacgccgag tacgacgaca tccacctcaa gaagaaggcg 3300
gtggtcaccg agaagtacga ggacgacagg cgcaagtcct tcaagaagat cggctccttc 3360
agcctcgagc agctgcagga gtacgccgac gcggacctga gcgtggtcga gaagctcaag 3420
gagatcatca tccagaaggt cgacgagatc tacaaggtgt acggctccag cgagaagctc 3480
ttcgacgcgg acttcgtcct cgagaagtcc ctgaagaaga acgacgccgt ggtcgcgatc 3540
atgaaggacc tcctggactc cgtgaagagc ttcgagaatt acatcaaggc cttcttcggc 3600
gagggcaagg agacgaacag ggacgagtcc ttctacggcg acttcgtcct ggcctacgac 3660
atcctcctga aggtggacca catctacgac gcgatccgca actacgtgac ccagaagccg 3720
tacagcaagg acaagttcaa gctctacttc cagaaccccc agttcatggg cggctgggac 3780
aaggacaagg agacggacta cagggcgacc atcctgcgct acggcagcaa gtactacctc 3840
gccatcatgg acaagaagta cgcgaagtgc ctgcagaaga tcgacaagga cgacgtcaac 3900
ggcaactacg agaagatcaa ctacaagctc ctgccgggcc ccaacaagat gctcccgaag 3960
gtgttcttct ccaagaagtg gatggcctac tacaacccca gcgaggacat ccagaagatc 4020
tacaagaacg gcacgttcaa gaagggcgac atgttcaacc tgaacgactg ccacaagctc 4080
atcgacttct tcaaggactc catcagccgc tacccgaagt ggtccaacgc ctacgacttc 4140
aacttcagcg agaccgagaa gtacaaggac atcgcgggct tctaccgcga ggtcgaggag 4200
cagggctaca aggtgtcctt cgagtccgcc agcaagaagg aggtcgacaa gctggtggag 4260
gagggcaagc tctacatgtt ccagatctac aacaaggact tctccgacaa gagccacggc 4320
acgcccaacc tgcacaccat gtacttcaag ctcctgttcg acgagaacaa ccacggccag 4380
atcaggctgt ccggcggcgc cgagctcttc atgaggaggg cgagcctgaa gaaggaggag 4440
ctggtggtcc accccgctaa cagcccaatc gcgaacaaga acccggacaa ccccaagaag 4500
accacgaccc tgtcctacga cgtgtacaag gacaagaggt tcagcgagga ccagtacgag 4560
ctccacatcc cgatcgcgat caacaagtgc cccaagaaca tcttcaagat caacaccgag 4620
gtccgcgtgc tcctgaagca cgacgacaac ccctacgtga tcggcatcgc taggggcgag 4680
aggaacctcc tgtacatcgt ggtcgtggac ggcaagggca acatcgtgga gcagtactcc 4740
ctcaacgaga tcatcaacaa cttcaacggc atcaggatca agacggacta ccacagcctc 4800
ctggacaaga aggagaagga gaggttcgag gcccgccaga actggacctc catcgagaac 4860
atcaaggagc tgaaggcggg ctacatcagc caggtcgtgc acaagatctg cgagctcgtc 4920
gagaagtacg acgccgtgat cgccctcgcg gacctgaact ccggcttcaa gaacagccgc 4980
gtcaaggtgg agaagcaggt ctaccagaag ttcgagaaga tgctcatcga caagctgaac 5040
tacatggtgg acaagaagtc caacccctgc gctacgggcg gcgcgctgaa gggctaccag 5100
atcaccaaca agttcgagag cttcaagtcc atgagcactc agaacggctt catcttctac 5160
atcccggcgt ggctcacgtc caagatcgac cccagcaccg gcttcgtcaa cctcctgaag 5220
acgaagtaca cctccatcgc cgacagcaag aagttcatct ccagcttcga ccgcatcatg 5280
tatgtgccgg aggaggacct gttcgagttc gccctcgact acaagaactt ctcccgcacg 5340
gacgcggact acatcaagaa gtggaagctg tacagctacg gcaaccgcat ccgcatcttc 5400
aggaacccca agaagaacaa cgtcttcgac tgggaggagg tgtgcctgac ctccgcgtac 5460
aaggagctct tcaacaagta cggcatcaac taccagcagg gcgacatcag ggctctcctg 5520
tgcgagcaga gcgacaaggc cttctactcc agcttcatgg cgctgatgtc cctcatgctg 5580
cagatgagga actcgatcac cggcaggacg gacgtggcct tcctcatctc cccggtgaag 5640
aacagcgacg gcatcttcta cgactccagg aactacgagg cccaggagaa cgcgatcctc 5700
ccaaagaacg cggacgccaa cggcgcctac aacatcgcca ggaaggtcct ctgggctatc 5760
ggccagttca agaaggcgga ggacgagaag ctggacaagg tgaagatcgc catcagcaac 5820
aaggagtggc tcgagtacgc ccagacctcg gtcaagcacg gcagcccgaa gaagaagcgc 5880
aaggtgtccg gcggcagcac gaacctgtcc gacatcatcg agaaggagac cggcaagcag 5940
ctcgtgatcc aggagagcat cctcatgctg ccggaggagg tcgaggaggt catcggcaac 6000
aagcccgagt ccgacatcct cgtccacacg gcctacgacg agtccaccga cgagaacgtg 6060
atgctcctga cctcggacgc tcccgagtac aagccatggg ccctggtcat ccaggacagc 6120
aacggcgaga acaagatcaa gatgctctcc ggcggcagcc cgaagaagaa gcgcaaagtg 6180
tga 6183
<210> 89
<211> 2054
<212> PRT
<213> Artificial sequence
<220>
<223> Fusion protein
<400> 89
Met Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu
1 5 10 15
Thr Leu Ala Lys Arg Ala Trp Asp Glu Arg Glu Val Pro Val Gly Ala
20 25 30
Val Leu Val His Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Pro
35 40 45
Ile Gly Arg His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg
50 55 60
Gln Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu
65 70 75 80
Tyr Val Thr Leu Glu Pro Cys Val Met Val Ser Phe Cys Phe Tyr Leu
85 90 95
Tyr Ile Tyr Asn Asn Tyr His Leu Val Val Ile Tyr Phe Lys Tyr Phe
100 105 110
Phe Gln Asn Lys Arg Met Tyr Ile Ala Ile Ala Phe Leu Phe Ile Ser
115 120 125
Val Tyr Ile Leu Ile Tyr Asn Phe Ser Asn Ile Pro Lys Phe Val Asp
130 135 140
Val Gln Cys Ala Gly Ala Met Ile His Ser Arg Ile Gly Arg Val Val
145 150 155 160
Phe Gly Ala Arg Asp Ala Lys Thr Gly Ala Ala Gly Ser Leu Met Asp
165 170 175
Val Leu His His Pro Gly Met Asn His Arg Val Glu Ile Thr Glu Gly
180 185 190
Ile Leu Ala Asp Glu Cys Ala Ala Leu Leu Ser Asp Phe Phe Arg Met
195 200 205
Arg Arg Gln Glu Ile Lys Ala Gln Lys Lys Ala Gln Ser Ser Thr Asp
210 215 220
Ser Gly Gly Ser Ser Gly Gly Ser Ser Gly Ser Glu Thr Pro Gly Thr
225 230 235 240
Ser Glu Ser Ala Thr Pro Glu Ser Ser Gly Gly Ser Ser Gly Gly Ser
245 250 255
Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu Thr
260 265 270
Leu Ala Lys Arg Ala Arg Asp Glu Arg Glu Val Pro Val Gly Ala Val
275 280 285
Leu Val Leu Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Ala Ile
290 295 300
Gly Leu His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg Gln
305 310 315 320
Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu Tyr
325 330 335
Val Thr Phe Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His Ser
340 345 350
Arg Ile Gly Arg Val Val Phe Gly Val Arg Asn Ala Lys Thr Gly Ala
355 360 365
Ala Gly Ser Leu Met Asp Val Leu His Tyr Pro Gly Met Asn His Arg
370 375 380
Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu Leu
385 390 395 400
Cys Tyr Phe Phe Arg Met Pro Arg Gln Val Phe Asn Ala Gln Lys Lys
405 410 415
Ala Gln Ser Ser Thr Asp Gly Ser Ala Glu Tyr Val Arg Ala Leu Phe
420 425 430
Asp Phe Asn Gly Asn Asp Glu Glu Asp Leu Pro Phe Lys Lys Gly Asp
435 440 445
Ile Leu Arg Ile Arg Asp Lys Pro Glu Glu Gln Trp Trp Asn Ala Glu
450 455 460
Asp Ser Glu Gly Lys Arg Gly Met Ile Pro Val Pro Tyr Val Glu Lys
465 470 475 480
Tyr Met Thr Asp Ala Glu Tyr Val Arg Ile His Glu Lys Leu Asp Ile
485 490 495
Tyr Thr Phe Lys Lys Gln Phe Phe Asn Asn Lys Lys Ser Val Ser His
500 505 510
Arg Cys Tyr Val Leu Phe Glu Leu Lys Arg Arg Gly Glu Arg Arg Ala
515 520 525
Cys Phe Trp Gly Tyr Ala Val Asn Lys Pro Gln Ser Gly Thr Glu Arg
530 535 540
Gly Ile His Ala Glu Ile Phe Ser Ile Arg Lys Val Glu Glu Tyr Leu
545 550 555 560
Arg Asp Asn Pro Gly Gln Phe Thr Ile Asn Trp Tyr Ser Ser Trp Ser
565 570 575
Pro Cys Ala Asp Cys Ala Glu Lys Ile Leu Glu Trp Tyr Asn Gln Glu
580 585 590
Leu Arg Gly Asn Gly His Thr Leu Lys Ile Trp Ala Cys Lys Leu Tyr
595 600 605
Tyr Glu Lys Asn Ala Arg Asn Gln Ile Gly Leu Trp Asn Leu Arg Asp
610 615 620
Asn Gly Val Gly Leu Asn Val Met Val Ser Glu His Tyr Gln Cys Cys
625 630 635 640
Arg Lys Ile Phe Ile Gln Ser Ser His Asn Gln Leu Asn Glu Asn Arg
645 650 655
Trp Leu Glu Lys Thr Leu Lys Arg Ala Glu Lys Arg Arg Ser Glu Leu
660 665 670
Ser Ile Met Ile Gln Val Lys Ile Leu His Thr Thr Lys Ser Pro Ala
675 680 685
Val Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser
690 695 700
Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Met
705 710 715 720
Ser Lys Leu Glu Lys Phe Thr Asn Cys Tyr Ser Leu Ser Lys Thr Leu
725 730 735
Arg Phe Lys Ala Ile Pro Val Gly Lys Thr Gln Glu Asn Ile Asp Asn
740 745 750
Lys Arg Leu Leu Val Glu Asp Glu Lys Arg Ala Glu Asp Tyr Lys Gly
755 760 765
Val Lys Lys Leu Leu Asp Arg Tyr Tyr Leu Ser Phe Ile Asn Asp Val
770 775 780
Leu His Ser Ile Lys Leu Lys Asn Leu Asn Asn Tyr Ile Ser Leu Phe
785 790 795 800
Arg Lys Lys Thr Arg Thr Glu Lys Glu Asn Lys Glu Leu Glu Asn Leu
805 810 815
Glu Ile Asn Leu Arg Lys Glu Ile Ala Lys Ala Phe Lys Gly Asn Glu
820 825 830
Gly Tyr Lys Ser Leu Phe Lys Lys Asp Ile Ile Glu Thr Ile Leu Pro
835 840 845
Glu Phe Leu Asp Asp Lys Asp Glu Ile Ala Leu Val Asn Ser Phe Asn
850 855 860
Gly Phe Thr Thr Ala Phe Thr Gly Phe Phe Asp Asn Arg Glu Asn Met
865 870 875 880
Phe Ser Glu Glu Ala Lys Ser Thr Ser Ile Ala Phe Arg Cys Ile Asn
885 890 895
Glu Asn Leu Thr Arg Tyr Ile Ser Asn Met Asp Ile Phe Glu Lys Val
900 905 910
Asp Ala Ile Phe Asp Lys His Glu Val Gln Glu Ile Lys Glu Lys Ile
915 920 925
Leu Asn Ser Asp Tyr Asp Val Glu Asp Phe Phe Glu Gly Glu Phe Phe
930 935 940
Asn Phe Val Leu Thr Gln Glu Gly Ile Asp Val Tyr Asn Ala Ile Ile
945 950 955 960
Gly Gly Phe Val Thr Glu Ser Gly Glu Lys Ile Lys Gly Leu Asn Glu
965 970 975
Tyr Ile Asn Leu Tyr Asn Gln Lys Thr Lys Gln Lys Leu Pro Lys Phe
980 985 990
Lys Pro Leu Tyr Lys Gln Val Leu Ser Asp Arg Glu Ser Leu Ser Phe
995 1000 1005
Tyr Gly Glu Gly Tyr Thr Ser Asp Glu Glu Val Leu Glu Val Phe
1010 1015 1020
Arg Asn Thr Leu Asn Lys Asn Ser Glu Ile Phe Ser Ser Ile Lys
1025 1030 1035
Lys Leu Glu Lys Leu Phe Lys Asn Phe Asp Glu Tyr Ser Ser Ala
1040 1045 1050
Gly Ile Phe Val Lys Asn Gly Pro Ala Ile Ser Thr Ile Ser Lys
1055 1060 1065
Asp Ile Phe Gly Glu Trp Asn Val Ile Arg Asp Lys Trp Asn Ala
1070 1075 1080
Glu Tyr Asp Asp Ile His Leu Lys Lys Lys Ala Val Val Thr Glu
1085 1090 1095
Lys Tyr Glu Asp Asp Arg Arg Lys Ser Phe Lys Lys Ile Gly Ser
1100 1105 1110
Phe Ser Leu Glu Gln Leu Gln Glu Tyr Ala Asp Ala Asp Leu Ser
1115 1120 1125
Val Val Glu Lys Leu Lys Glu Ile Ile Ile Gln Lys Val Asp Glu
1130 1135 1140
Ile Tyr Lys Val Tyr Gly Ser Ser Glu Lys Leu Phe Asp Ala Asp
1145 1150 1155
Phe Val Leu Glu Lys Ser Leu Lys Lys Asn Asp Ala Val Val Ala
1160 1165 1170
Ile Met Lys Asp Leu Leu Asp Ser Val Lys Ser Phe Glu Asn Tyr
1175 1180 1185
Ile Lys Ala Phe Phe Gly Glu Gly Lys Glu Thr Asn Arg Asp Glu
1190 1195 1200
Ser Phe Tyr Gly Asp Phe Val Leu Ala Tyr Asp Ile Leu Leu Lys
1205 1210 1215
Val Asp His Ile Tyr Asp Ala Ile Arg Asn Tyr Val Thr Gln Lys
1220 1225 1230
Pro Tyr Ser Lys Asp Lys Phe Lys Leu Tyr Phe Gln Asn Pro Gln
1235 1240 1245
Phe Met Gly Gly Trp Asp Lys Asp Lys Glu Thr Asp Tyr Arg Ala
1250 1255 1260
Thr Ile Leu Arg Tyr Gly Ser Lys Tyr Tyr Leu Ala Ile Met Asp
1265 1270 1275
Lys Lys Tyr Ala Lys Cys Leu Gln Lys Ile Asp Lys Asp Asp Val
1280 1285 1290
Asn Gly Asn Tyr Glu Lys Ile Asn Tyr Lys Leu Leu Pro Gly Pro
1295 1300 1305
Asn Lys Met Leu Pro Lys Val Phe Phe Ser Lys Lys Trp Met Ala
1310 1315 1320
Tyr Tyr Asn Pro Ser Glu Asp Ile Gln Lys Ile Tyr Lys Asn Gly
1325 1330 1335
Thr Phe Lys Lys Gly Asp Met Phe Asn Leu Asn Asp Cys His Lys
1340 1345 1350
Leu Ile Asp Phe Phe Lys Asp Ser Ile Ser Arg Tyr Pro Lys Trp
1355 1360 1365
Ser Asn Ala Tyr Asp Phe Asn Phe Ser Glu Thr Glu Lys Tyr Lys
1370 1375 1380
Asp Ile Ala Gly Phe Tyr Arg Glu Val Glu Glu Gln Gly Tyr Lys
1385 1390 1395
Val Ser Phe Glu Ser Ala Ser Lys Lys Glu Val Asp Lys Leu Val
1400 1405 1410
Glu Glu Gly Lys Leu Tyr Met Phe Gln Ile Tyr Asn Lys Asp Phe
1415 1420 1425
Ser Asp Lys Ser His Gly Thr Pro Asn Leu His Thr Met Tyr Phe
1430 1435 1440
Lys Leu Leu Phe Asp Glu Asn Asn His Gly Gln Ile Arg Leu Ser
1445 1450 1455
Gly Gly Ala Glu Leu Phe Met Arg Arg Ala Ser Leu Lys Lys Glu
1460 1465 1470
Glu Leu Val Val His Pro Ala Asn Ser Pro Ile Ala Asn Lys Asn
1475 1480 1485
Pro Asp Asn Pro Lys Lys Thr Thr Thr Leu Ser Tyr Asp Val Tyr
1490 1495 1500
Lys Asp Lys Arg Phe Ser Glu Asp Gln Tyr Glu Leu His Ile Pro
1505 1510 1515
Ile Ala Ile Asn Lys Cys Pro Lys Asn Ile Phe Lys Ile Asn Thr
1520 1525 1530
Glu Val Arg Val Leu Leu Lys His Asp Asp Asn Pro Tyr Val Ile
1535 1540 1545
Gly Ile Ala Arg Gly Glu Arg Asn Leu Leu Tyr Ile Val Val Val
1550 1555 1560
Asp Gly Lys Gly Asn Ile Val Glu Gln Tyr Ser Leu Asn Glu Ile
1565 1570 1575
Ile Asn Asn Phe Asn Gly Ile Arg Ile Lys Thr Asp Tyr His Ser
1580 1585 1590
Leu Leu Asp Lys Lys Glu Lys Glu Arg Phe Glu Ala Arg Gln Asn
1595 1600 1605
Trp Thr Ser Ile Glu Asn Ile Lys Glu Leu Lys Ala Gly Tyr Ile
1610 1615 1620
Ser Gln Val Val His Lys Ile Cys Glu Leu Val Glu Lys Tyr Asp
1625 1630 1635
Ala Val Ile Ala Leu Ala Asp Leu Asn Ser Gly Phe Lys Asn Ser
1640 1645 1650
Arg Val Lys Val Glu Lys Gln Val Tyr Gln Lys Phe Glu Lys Met
1655 1660 1665
Leu Ile Asp Lys Leu Asn Tyr Met Val Asp Lys Lys Ser Asn Pro
1670 1675 1680
Cys Ala Thr Gly Gly Ala Leu Lys Gly Tyr Gln Ile Thr Asn Lys
1685 1690 1695
Phe Glu Ser Phe Lys Ser Met Ser Thr Gln Asn Gly Phe Ile Phe
1700 1705 1710
Tyr Ile Pro Ala Trp Leu Thr Ser Lys Ile Asp Pro Ser Thr Gly
1715 1720 1725
Phe Val Asn Leu Leu Lys Thr Lys Tyr Thr Ser Ile Ala Asp Ser
1730 1735 1740
Lys Lys Phe Ile Ser Ser Phe Asp Arg Ile Met Tyr Val Pro Glu
1745 1750 1755
Glu Asp Leu Phe Glu Phe Ala Leu Asp Tyr Lys Asn Phe Ser Arg
1760 1765 1770
Thr Asp Ala Asp Tyr Ile Lys Lys Trp Lys Leu Tyr Ser Tyr Gly
1775 1780 1785
Asn Arg Ile Arg Ile Phe Arg Asn Pro Lys Lys Asn Asn Val Phe
1790 1795 1800
Asp Trp Glu Glu Val Cys Leu Thr Ser Ala Tyr Lys Glu Leu Phe
1805 1810 1815
Asn Lys Tyr Gly Ile Asn Tyr Gln Gln Gly Asp Ile Arg Ala Leu
1820 1825 1830
Leu Cys Glu Gln Ser Asp Lys Ala Phe Tyr Ser Ser Phe Met Ala
1835 1840 1845
Leu Met Ser Leu Met Leu Gln Met Arg Asn Ser Ile Thr Gly Arg
1850 1855 1860
Thr Asp Val Ala Phe Leu Ile Ser Pro Val Lys Asn Ser Asp Gly
1865 1870 1875
Ile Phe Tyr Asp Ser Arg Asn Tyr Glu Ala Gln Glu Asn Ala Ile
1880 1885 1890
Leu Pro Lys Asn Ala Asp Ala Asn Gly Ala Tyr Asn Ile Ala Arg
1895 1900 1905
Lys Val Leu Trp Ala Ile Gly Gln Phe Lys Lys Ala Glu Asp Glu
1910 1915 1920
Lys Leu Asp Lys Val Lys Ile Ala Ile Ser Asn Lys Glu Trp Leu
1925 1930 1935
Glu Tyr Ala Gln Thr Ser Val Lys His Gly Ser Pro Lys Lys Lys
1940 1945 1950
Arg Lys Val Ser Gly Gly Ser Thr Asn Leu Ser Asp Ile Ile Glu
1955 1960 1965
Lys Glu Thr Gly Lys Gln Leu Val Ile Gln Glu Ser Ile Leu Met
1970 1975 1980
Leu Pro Glu Glu Val Glu Glu Val Ile Gly Asn Lys Pro Glu Ser
1985 1990 1995
Asp Ile Leu Val His Thr Ala Tyr Asp Glu Ser Thr Asp Glu Asn
2000 2005 2010
Val Met Leu Leu Thr Ser Asp Ala Pro Glu Tyr Lys Pro Trp Ala
2015 2020 2025
Leu Val Ile Gln Asp Ser Asn Gly Glu Asn Lys Ile Lys Met Leu
2030 2035 2040
Ser Gly Gly Ser Pro Lys Lys Lys Arg Lys Val
2045 2050
<210> 90
<211> 23
<212> DNA
<213> Zea mays (maize)
<400> 90
aatcaatggg aagcctatct acc 23
<210> 91
<211> 1284
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic
<400> 91
atgtccgagg tggagttcag ccacgagtac tggatgaggc acgctctcac cctggctaag 60
agggcgtggg acgagaggga ggtgccggtg ggcgccgtgc tcgtccacaa caaccgcgtg 120
atcggcgagg gctggaacag gcccatcggc aggcacgacc caaccgctca cgccgagatc 180
atggctctca ggcagggcgg cctggtcatg cagaactaca ggctgatcga cgcgaccctc 240
tacgtgaccc tcgagccctg cgtcatggta agtttctgct tctacctttg atatatatat 300
aataattatc attaattagt agtaatataa tatttcaaat atttttttca aaataaaaga 360
atgtagtata tagcaattgc ttttctgtag tttataagtg tgtatatttt aatttataac 420
ttttctaata tatgaccaaa atttgttgat gtgcagtgcg cgggcgccat gatccactcc 480
aggatcggca gggtggtctt cggcgctagg gacgccaaga cgggcgctgc gggcagcctc 540
atggacgtgc tgcaccaccc cggcatgaac caccgcgtcg agatcaccga gggcatcctc 600
gcggacgagt gcgctgcgct cctgtccgac ttcttcagga tgcgcaggca ggagatcaag 660
gcccagaaga aggcgcagtc cagcaccgac tccggcggct ccagcggcgg ctccagcggc 720
agcgagaccc cgggcacgtc cgagagcgcg acgcccgaga gcagcggcgg ctccagcggc 780
ggctcctcgg aggtcgagtt cagccatgag tactggatga ggcatgccct gactctcgct 840
aagagggcgc gggatgagcg cgaggtgccg gtgggggccg tgctcgtcct gaacaaccgc 900
gtgatcgggg agggctggaa ccgggctatc ggcctccacg acccaacggc ccatgccgag 960
atcatggccc tgaggcaggg cggcctggtc atgcaaaact acaggctcat cgacgccacc 1020
ctctacgtga ccttcgagcc atgcgtgatg tgcgcggggg ccatgatcca ctcgaggatt 1080
gggagggtgg tcttcggcgt gcgcaacgct aagacggggg ccgccggcag cctcatggac 1140
gtcctgcact acccgggcat gaaccacagg gtggagatta ccgagggcat cctggccgat 1200
gagtgcgccg cgctcctgtg ctacttcttc cgcatgccca ggcaggtctt caacgcgcag 1260
aagaaggccc agagctccac tgat 1284
<210> 92
<211> 422
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic
<400> 92
Met Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu
1 5 10 15
Thr Leu Ala Lys Arg Ala Trp Asp Glu Arg Glu Val Pro Val Gly Ala
20 25 30
Val Leu Val His Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Pro
35 40 45
Ile Gly Arg His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg
50 55 60
Gln Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu
65 70 75 80
Tyr Val Thr Leu Glu Pro Cys Val Met Val Ser Phe Cys Phe Tyr Leu
85 90 95
Tyr Ile Tyr Asn Asn Tyr His Leu Val Val Ile Tyr Phe Lys Tyr Phe
100 105 110
Phe Gln Asn Lys Arg Met Tyr Ile Ala Ile Ala Phe Leu Phe Ile Ser
115 120 125
Val Tyr Ile Leu Ile Tyr Asn Phe Ser Asn Ile Pro Lys Phe Val Asp
130 135 140
Val Gln Cys Ala Gly Ala Met Ile His Ser Arg Ile Gly Arg Val Val
145 150 155 160
Phe Gly Ala Arg Asp Ala Lys Thr Gly Ala Ala Gly Ser Leu Met Asp
165 170 175
Val Leu His His Pro Gly Met Asn His Arg Val Glu Ile Thr Glu Gly
180 185 190
Ile Leu Ala Asp Glu Cys Ala Ala Leu Leu Ser Asp Phe Phe Arg Met
195 200 205
Arg Arg Gln Glu Ile Lys Ala Gln Lys Lys Ala Gln Ser Ser Thr Asp
210 215 220
Ser Gly Gly Ser Ser Gly Gly Ser Ser Gly Ser Glu Thr Pro Gly Thr
225 230 235 240
Ser Glu Ser Ala Thr Pro Glu Ser Ser Gly Gly Ser Ser Gly Gly Ser
245 250 255
Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu Thr
260 265 270
Leu Ala Lys Arg Ala Arg Asp Glu Arg Glu Val Pro Val Gly Ala Val
275 280 285
Leu Val Leu Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Ala Ile
290 295 300
Gly Leu His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg Gln
305 310 315 320
Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu Tyr
325 330 335
Val Thr Phe Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His Ser
340 345 350
Arg Ile Gly Arg Val Val Phe Gly Val Arg Asn Ala Lys Thr Gly Ala
355 360 365
Ala Gly Ser Leu Met Asp Val Leu His Tyr Pro Gly Met Asn His Arg
370 375 380
Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu Leu
385 390 395 400
Cys Tyr Phe Phe Arg Met Pro Arg Gln Val Phe Asn Ala Gln Lys Lys
405 410 415
Ala Gln Ser Ser Thr Asp
420
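The protein entries in this listing use three-letter residue codes with position rulers on alternating lines. As a minimal sketch, assuming the 20 standard residue codes only, the notation can be converted to a one-letter string for downstream comparison. The helper name `three_to_one` and the mapping table are illustrative and are not part of the application.

```python
# Hypothetical helper (not from the application): convert the three-letter
# residue notation of a sequence listing into a one-letter string.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def three_to_one(listing_text: str) -> str:
    """Return the one-letter sequence for a block of listing text.

    Purely numeric tokens are treated as position rulers and skipped.
    An unknown residue code raises KeyError rather than being guessed.
    """
    residues = []
    for token in listing_text.split():
        if token.isdigit():
            continue  # ruler line such as "405 410 415"
        residues.append(THREE_TO_ONE[token])
    return "".join(residues)


# Final residue line and ruler of SEQ ID NO: 92 as listed above.
tail = "Ala Gln Ser Ser Thr Asp\n420"
print(three_to_one(tail))  # AQSSTD
```

Applied to a whole entry, the length of the returned string can be compared with the value declared in the entry's `<211>` field.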
Claims (27)
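1. A fusion protein comprising, in the N-terminal to C-terminal direction, a heterologous domain, a first linker sequence, and a type V CRISPR-Cas enzyme, wherein the first linker sequence comprises a repeated GGGGS sequence.
2. The fusion protein of claim 1, wherein the heterologous domain is a deaminase, a polymerase, a nuclease, a relaxase, an alkyltransferase, a methyltransferase, an adenosine deaminase, a cytidine deaminase, an oxidase, a thymine alkyltransferase, an adenine oxidase, an adenosine methyltransferase, a glycosylase, or a nuclear localization signal.
3. The fusion protein of claim 2, wherein the heterologous domain is a deaminase domain.
4. The fusion protein of claim 3, wherein the deaminase domain is a cytidine deaminase.
5. The fusion protein of claim 4, wherein the cytidine deaminase domain is an activation-induced cytidine deaminase ("AID").
6. The fusion protein of claim 4, wherein the cytidine deaminase domain is an apolipoprotein B mRNA-editing complex ("APOBEC") domain.
7. The fusion protein of claim 6, wherein the APOBEC domain is an APOBEC1 family deaminase.
8. The fusion protein of claim 7, wherein the APOBEC domain comprises a sequence having at least 70% identity to SEQ ID NO: 1.
9. The fusion protein of claim 3, wherein the deaminase domain is an adenine deaminase.
10. The fusion protein of claim 9, wherein the adenine deaminase is a TadA domain.
11. The fusion protein of claim 10, wherein the TadA domain comprises a sequence having at least 70% identity to SEQ ID NO: 92.
12. The fusion protein of claim 1, wherein the type V CRISPR-Cas enzyme is a type V-A (Cas12a) enzyme.
13. The fusion protein of claim 12, wherein the Cas12a domain is selected from the group consisting of: SEQ ID NO: 3, SEQ ID NO: 6, SEQ ID NO: 22, SEQ ID NO: 45, SEQ ID NO: 46, SEQ ID NO: 47, and SEQ ID NO: 48.
14. The fusion protein of claim 13, wherein the Cas12a domain is catalytically inactive and is selected from the group consisting of: SEQ ID NO: 3, SEQ ID NO: 6, and SEQ ID NO: 22.
15. The fusion protein of claim 1, wherein the first linker sequence comprises GGGGS repeated at least three times.
16. The fusion protein of claim 15, wherein the first linker sequence comprises GGGGS repeated at least six times.
17. The fusion protein of claims 1-16, wherein the fusion protein comprises a sequence selected from the group consisting of: SEQ ID NO: 11, 12, 13, and 44.
18. The fusion protein of claims 1-17, further comprising a uracil DNA glycosylase inhibitor ("UGI") domain.
19. The fusion protein of claim 18, wherein the UGI domain comprises SEQ ID NO: 8.
20. The fusion protein of claim 19, wherein the UGI domain is linked to the Cas12a enzyme by a second linker comprising the sequence SGGS.
21. The fusion protein of claim 1, wherein the fusion protein comprises a sequence selected from the group consisting of: SEQ ID NO: 17, SEQ ID NO: 24, SEQ ID NO: 35, SEQ ID NO: 39, SEQ ID NO: 43, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 81, SEQ ID NO: 83, SEQ ID NO: 85, SEQ ID NO: 87, and SEQ ID NO: 89.
22. The fusion protein of claim 1, wherein, when contacted with DNA, the fusion protein produces on-target edits at an increased frequency and off-target edits at a reduced frequency, compared with a fusion protein having a first linker sequence other than a repeated GGGGS sequence.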
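23. A method of editing plant genomic DNA, the method comprising contacting plant genomic DNA with:
(a) the fusion protein of claims 1-17, the fusion protein optionally comprising a UGI domain; and
(b) a guide RNA ("gRNA") that targets the fusion protein of step (a) to a target DNA sequence of the plant genomic DNA;
wherein the edited plant genomic DNA comprises reduced off-target editing compared with plant genomic DNA edited by a fusion protein having a first linker other than a repeated GGGGS sequence.
24. A method of editing plant genomic DNA with reduced off-target editing, the method comprising contacting plant genomic DNA with:
(a) the fusion protein of claims 1-17, the fusion protein optionally comprising a UGI domain; and
(b) a guide RNA ("gRNA") that targets the fusion protein of step (a) to a target DNA sequence of the plant genomic DNA;
wherein the edited plant genomic DNA comprises reduced off-target editing compared with plant genomic DNA edited by a fusion protein having a first linker other than a repeated GGGGS sequence.
25. The method of claim 24, wherein the fusion protein comprises SEQ ID NO: 24.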
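26. A method of obtaining a population of edited plants with reduced off-target editing, the method comprising:
(a) obtaining a population of plant cells comprising genomic DNA to be edited;
(b) obtaining a nucleotide sequence encoding the fusion protein of claims 1-16 and, optionally, a UGI domain;
(c) transforming the population of plant cells with the nucleotide sequence of step (b), thereby expressing the fusion protein encoded by the nucleic acid sequence within the population of plant cells;
(d) growing the population of transformed plant cells into plants, wherein at least one plant is edited; and
(e) selecting the at least one edited plant from the product of step (d), thereby obtaining a population of edited plants;
wherein the population of edited plants comprises reduced off-target editing compared with plants edited by a fusion protein having a first linker other than a repeated GGGGS sequence.
27. The method of claim 26, wherein the nucleotide sequence encodes a fusion protein selected from the group consisting of: SEQ ID NO: 17, SEQ ID NO: 24, SEQ ID NO: 35, SEQ ID NO: 39, SEQ ID NO: 43, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 81, SEQ ID NO: 83, SEQ ID NO: 85, SEQ ID NO: 87, and SEQ ID NO: 89.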
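Claims 15 and 16 turn on how many times the GGGGS unit is repeated in the first linker. The following is a minimal sketch, not the applicants' method, of how one might count consecutive GGGGS units in a one-letter protein string. The function name `max_ggggs_repeats` is illustrative, and the example string is assembled from residues 689 to 720 of SEQ ID NO: 89 as given in the sequence listing (Val, six GGGGS units, Met).

```python
import re


def max_ggggs_repeats(protein: str) -> int:
    """Return the largest number of back-to-back GGGGS units in `protein`.

    Hypothetical helper: it looks only for exact, uninterrupted GGGGS
    units and returns 0 if none are present.
    """
    runs = re.findall(r"(?:GGGGS)+", protein)
    return max((len(run) // 5 for run in runs), default=0)


# Linker region of SEQ ID NO: 89 in one-letter code, with one flanking
# residue on each side (Val 689 and Met 720 in the listing).
linker_region = "V" + "GGGGS" * 6 + "M"

repeats = max_ggggs_repeats(linker_region)
print(repeats)       # 6
print(repeats >= 3)  # True: meets the "at least three times" threshold
print(repeats >= 6)  # True: meets the "at least six times" threshold
```

A count of this kind only checks the literal repeat; it says nothing about editing frequency, which the claims define by comparison with fusion proteins carrying a different first linker.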
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNPCT/CN2019/108026 | 2019-09-26 | ||
PCT/CN2019/108026 WO2021056302A1 (en) | 2019-09-26 | 2019-09-26 | Methods and compositions for dna base editing |
PCT/US2020/051383 WO2021061507A1 (en) | 2019-09-26 | 2020-09-18 | Methods and compositions for dna base editing |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114761547A (zh) | 2022-07-15 |
Family
ID=75166246
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202080081866.3A Pending CN114761547A (zh) | 2020-09-18 | Methods and compositions for DNA base editing |
Country Status (11)
Country | Link |
---|---|
US (1) | US20220403396A1 (zh) |
EP (1) | EP4034648A4 (zh) |
JP (1) | JP2022549430A (zh) |
KR (1) | KR20220066111A (zh) |
CN (1) | CN114761547A (zh) |
AU (1) | AU2020354372A1 (zh) |
CA (1) | CA3149273A1 (zh) |
CL (2) | CL2022000745A1 (zh) |
IL (1) | IL290572A (zh) |
MX (1) | MX2022003577A (zh) |
WO (2) | WO2021056302A1 (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2021253959A1 (en) | 2020-04-09 | 2022-11-17 | Verve Therapeutics, Inc. | Base editing of PCSK9 and methods of using same for treatment of disease |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018213708A1 (en) * | 2017-05-18 | 2018-11-22 | The Broad Institute, Inc. | Systems, methods, and compositions for targeted nucleic acid editing |
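CN109321593A (zh) * | 2018-11-07 | 2019-02-12 | Institute of Plant Protection, Chinese Academy of Agricultural Sciences | A set of artificial gene editing systems for rice |
WO2019041296A1 (zh) * | 2017-09-01 | 2019-03-07 | ShanghaiTech University | Base editing system and method |
WO2019126762A2 (en) * | 2017-12-22 | 2019-06-27 | The Broad Institute, Inc. | Cas12a systems, methods, and compositions for targeted rna base editing |
CN109957569A (zh) * | 2017-12-22 | 2019-07-02 | Institute of Genetics and Developmental Biology, Chinese Academy of Sciences | Base editing system and method based on Cpf1 protein |
CN110157727A (zh) * | 2017-12-21 | 2019-08-23 | Institute of Genetics and Developmental Biology, Chinese Academy of Sciences | Plant base editing method |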
WO2019161783A1 (en) * | 2018-02-23 | 2019-08-29 | Shanghaitech University | Fusion proteins for base editing |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2439472A1 (en) * | 2001-02-27 | 2002-09-06 | University Of Rochester | Methods and compositions for modifying apolipoprotein b mrna editing |
IL294014B2 (en) * | 2015-10-23 | 2024-07-01 | Harvard College | Nucleobase editors and their uses |
IL308426A (en) * | 2016-08-03 | 2024-01-01 | Harvard College | Adenosine nuclear base editors and their uses |
EP3592777A1 (en) * | 2017-03-10 | 2020-01-15 | President and Fellows of Harvard College | Cytosine to guanine base editor |
JP7191388B2 (ja) * | 2017-03-23 | 2022-12-19 | プレジデント アンド フェローズ オブ ハーバード カレッジ | 核酸によってプログラム可能なdna結合蛋白質を含む核酸塩基編集因子 |
WO2018213726A1 (en) * | 2017-05-18 | 2018-11-22 | The Broad Institute, Inc. | Systems, methods, and compositions for targeted nucleic acid editing |
CN111801345A (zh) * | 2017-07-28 | 2020-10-20 | 哈佛大学的校长及成员们 | 使用噬菌体辅助连续进化(pace)的进化碱基编辑器的方法和组合物 |
CN111757937A (zh) * | 2017-10-16 | 2020-10-09 | 布罗德研究所股份有限公司 | 腺苷碱基编辑器的用途 |
EP3728613A4 (en) * | 2017-12-22 | 2022-01-12 | The Broad Institute, Inc. | MULTIPLEX DIAGNOSTICS BASED ON CRISPR EFFECTOR SYSTEM |
2019
- 2019-09-26 WO PCT/CN2019/108026 patent/WO2021056302A1/en active Application Filing
2020
- 2020-09-18 JP JP2022518227A patent/JP2022549430A/ja active Pending
- 2020-09-18 CN CN202080081866.3A patent/CN114761547A/zh active Pending
- 2020-09-18 EP EP20868850.7A patent/EP4034648A4/en active Pending
- 2020-09-18 CA CA3149273A patent/CA3149273A1/en active Pending
- 2020-09-18 KR KR1020227012388A patent/KR20220066111A/ko unknown
- 2020-09-18 AU AU2020354372A patent/AU2020354372A1/en active Pending
- 2020-09-18 MX MX2022003577A patent/MX2022003577A/es unknown
- 2020-09-18 US US17/763,384 patent/US20220403396A1/en active Pending
- 2020-09-18 WO PCT/US2020/051383 patent/WO2021061507A1/en unknown
2022
- 2022-02-13 IL IL290572A patent/IL290572A/en unknown
- 2022-03-24 CL CL2022000745A patent/CL2022000745A1/es unknown
2023
- 2023-11-17 CL CL2023003425A patent/CL2023003425A1/es unknown
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018213708A1 (en) * | 2017-05-18 | 2018-11-22 | The Broad Institute, Inc. | Systems, methods, and compositions for targeted nucleic acid editing |
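WO2019041296A1 (zh) * | 2017-09-01 | 2019-03-07 | ShanghaiTech University | Base editing system and method |
CN110157727A (zh) * | 2017-12-21 | 2019-08-23 | Institute of Genetics and Developmental Biology, Chinese Academy of Sciences | Plant base editing method |
WO2019126762A2 (en) * | 2017-12-22 | 2019-06-27 | The Broad Institute, Inc. | Cas12a systems, methods, and compositions for targeted rna base editing |
CN109957569A (zh) * | 2017-12-22 | 2019-07-02 | Institute of Genetics and Developmental Biology, Chinese Academy of Sciences | Base editing system and method based on Cpf1 protein |
WO2019161783A1 (en) * | 2018-02-23 | 2019-08-29 | Shanghaitech University | Fusion proteins for base editing |
CN109321593A (zh) * | 2018-11-07 | 2019-02-12 | Institute of Plant Protection, Chinese Academy of Agricultural Sciences | A set of artificial gene editing systems for rice |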
Also Published As
Publication number | Publication date |
---|---|
US20220403396A1 (en) | 2022-12-22 |
WO2021061507A1 (en) | 2021-04-01 |
EP4034648A4 (en) | 2023-11-01 |
JP2022549430A (ja) | 2022-11-25 |
IL290572A (en) | 2022-04-01 |
CA3149273A1 (en) | 2021-04-01 |
AU2020354372A1 (en) | 2022-03-31 |
KR20220066111A (ko) | 2022-05-23 |
MX2022003577A (es) | 2022-04-25 |
EP4034648A1 (en) | 2022-08-03 |
CL2022000745A1 (es) | 2022-10-28 |
WO2021056302A1 (en) | 2021-04-01 |
CL2023003425A1 (es) | 2024-06-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107109427B (zh) | Methods and compositions for identifying and enriching cells comprising site-specific genomic modifications | |
JP2023145691A (ja) | Nuclease systems for genetic engineering | |
AU2021225152A1 (en) | Isolated polypeptides and polynucleotides useful for increasing nitrogen use efficiency, abiotic stress tolerance, yield and biomass in plants | |
KR102253223B1 (ko) | Reduction of tobacco-specific nitrosamines in plants | |
CN101784667B (zh) | Secondary wall forming genes from maize and uses thereof | |
AU2016380351A1 (en) | Novel CRISPR-associated transposases and uses thereof | |
KR20210142586A (ko) | CRISPR/Cas12j enzymes and systems | |
KR20210152597A (ko) | Chimeric genome engineering molecules and methods | |
CN114641568A (zh) | RNA-guided nucleases and active fragments and variants thereof and methods of use | |
TW201144442A (en) | Production of DHA and other LC-PUFAs in plants | |
CN113473845A (zh) | Gene silencing via genome editing | |
CN110892074A (zh) | Compositions and methods for increasing the shelf life of bananas | |
CN115103590A (zh) | Mutations of growth regulating factor family transcription factors for promoting plant growth | |
CN109788738A (zh) | Wheat | |
WO2019027789A1 (en) | METHODS AND COMPOSITIONS FOR TARGETED GENOMIC INSERTION | |
CN114761547A (zh) | Methods and compositions for DNA base editing | |
US20020152497A1 (en) | Nucleic acid fragments encoding proteins involved in stress response | |
JP2022522823A (ja) | Suppression of target gene expression through genome editing of native miRNAs | |
CN114302963A (zh) | Flowering time genes and methods of use thereof | |
US11459577B2 (en) | Targeted insertion sites in the maize genome | |
US20230114951A1 (en) | Targeted insertion sites in the maize genome | |
CA2503837A1 (fr) | PPR peptide sequences capable of restoring male fertility of plants carrying a male sterility-inducing cytoplasm | |
CN114502733A (zh) | Flowering time genes and methods of use thereof | |
CN117255859A (zh) | Removable plant transgenic loci with cognate guide RNA recognition sites | |
CN117425402A (zh) | Accelerating breeding of transgenic crops by genome editing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||