CN102086455B - Flocculation gene of flocculating yeast, its expression product, and uses thereof - Google Patents
- Publication number
- CN102086455B CN 200910200097 CN200910200097A
- Authority
- CN
- China
- Prior art keywords
- thr
- ser
- gly
- val
- glu
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
- 240000004808 Saccharomyces cerevisiae Species 0.000 title claims abstract description 145
- 230000003311 flocculating effect Effects 0.000 title claims abstract description 138
- 108090000623 proteins and genes Proteins 0.000 title claims abstract description 120
- 238000005189 flocculation Methods 0.000 title claims abstract description 63
- 230000016615 flocculation Effects 0.000 title claims abstract description 63
- 238000000034 method Methods 0.000 claims abstract description 44
- 239000013604 expression vector Substances 0.000 claims abstract description 34
- 102000004169 proteins and genes Human genes 0.000 claims abstract description 27
- 235000014680 Saccharomyces cerevisiae Nutrition 0.000 claims description 136
- 239000012634 fragment Substances 0.000 claims description 24
- 235000018102 proteins Nutrition 0.000 claims description 24
- 150000007523 nucleic acids Chemical class 0.000 claims description 19
- 108020004707 nucleic acids Proteins 0.000 claims description 18
- 102000039446 nucleic acids Human genes 0.000 claims description 18
- 238000012408 PCR amplification Methods 0.000 claims description 16
- 241000894006 Bacteria Species 0.000 claims description 14
- 239000013598 vector Substances 0.000 claims description 13
- 238000010276 construction Methods 0.000 claims description 12
- 238000006243 chemical reaction Methods 0.000 claims description 11
- 238000000926 separation method Methods 0.000 claims description 8
- 238000001890 transfection Methods 0.000 claims description 8
- 239000000284 extract Substances 0.000 claims description 7
- 238000011156 evaluation Methods 0.000 claims description 5
- 239000011248 coating agent Substances 0.000 claims description 4
- 238000000576 coating method Methods 0.000 claims description 4
- 125000003275 alpha amino acid group Chemical group 0.000 claims 2
- 101000579123 Homo sapiens Phosphoglycerate kinase 1 Proteins 0.000 claims 1
- 101000795074 Homo sapiens Tryptase alpha/beta-1 Proteins 0.000 claims 1
- KJWZYMMLVHIVSU-IYCNHOCDSA-N PGK1 Chemical compound CCCCC[C@H](O)\C=C\[C@@H]1[C@@H](CCCCCCC(O)=O)C(=O)CC1=O KJWZYMMLVHIVSU-IYCNHOCDSA-N 0.000 claims 1
- 102100028251 Phosphoglycerate kinase 1 Human genes 0.000 claims 1
- 101000662819 Physarum polycephalum Terpene synthase 1 Proteins 0.000 claims 1
- 102100029639 Tryptase alpha/beta-1 Human genes 0.000 claims 1
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 65
- KBBRNEDOYWMIJP-KYNKHSRBSA-N Thr-Gly-Thr Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(=O)O)N)O KBBRNEDOYWMIJP-KYNKHSRBSA-N 0.000 description 44
- SNXUIBACCONSOH-BWBBJGPYSA-N Ser-Thr-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CO)C(O)=O SNXUIBACCONSOH-BWBBJGPYSA-N 0.000 description 41
- 108020004414 DNA Proteins 0.000 description 39
- QUBVFEANYYWBTM-VEVYYDQMSA-N Pro-Thr-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O QUBVFEANYYWBTM-VEVYYDQMSA-N 0.000 description 37
- MXNAOGFNFNKUPD-JHYOHUSXSA-N Thr-Phe-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MXNAOGFNFNKUPD-JHYOHUSXSA-N 0.000 description 37
- 108010038633 aspartylglutamate Proteins 0.000 description 37
- DLISPGXMKZTWQG-IFFSRLJSSA-N Glu-Thr-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O DLISPGXMKZTWQG-IFFSRLJSSA-N 0.000 description 36
- KBLYJPQSNGTDIU-LOKLDPHHSA-N Thr-Glu-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N)O KBLYJPQSNGTDIU-LOKLDPHHSA-N 0.000 description 36
- COYHRQWNJDJCNA-NUJDXYNKSA-N Thr-Thr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O COYHRQWNJDJCNA-NUJDXYNKSA-N 0.000 description 36
- WONGRTVAMHFGBE-WDSKDSINSA-N Asn-Gly-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N WONGRTVAMHFGBE-WDSKDSINSA-N 0.000 description 35
- XHWCDRUPDNSDAZ-XKBZYTNZSA-N Thr-Ser-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N)O XHWCDRUPDNSDAZ-XKBZYTNZSA-N 0.000 description 33
- SEXRBCGSZRCIPE-LYSGOOTNSA-N Trp-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O SEXRBCGSZRCIPE-LYSGOOTNSA-N 0.000 description 32
- ZJBUILVYSXQNSW-YTWAJWBKSA-N Arg-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O ZJBUILVYSXQNSW-YTWAJWBKSA-N 0.000 description 31
- DLEBSGAVWRPTIX-PEDHHIEDSA-N Ile-Val-Ile Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)[C@@H](C)CC DLEBSGAVWRPTIX-PEDHHIEDSA-N 0.000 description 31
- RVMNUBQWPVOUKH-HEIBUPTGSA-N Thr-Ser-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RVMNUBQWPVOUKH-HEIBUPTGSA-N 0.000 description 28
- BIENEHRYNODTLP-HJGDQZAQSA-N Thr-Glu-Met Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCSC)C(=O)O)N)O BIENEHRYNODTLP-HJGDQZAQSA-N 0.000 description 27
- LECUEEHKUFYOOV-ZJDVBMNYSA-N Thr-Thr-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@@H](N)[C@@H](C)O LECUEEHKUFYOOV-ZJDVBMNYSA-N 0.000 description 27
- 108010061238 threonyl-glycine Proteins 0.000 description 26
- 238000000855 fermentation Methods 0.000 description 25
- 230000004151 fermentation Effects 0.000 description 25
- LRQXRHGQEVWGPV-NHCYSSNCSA-N Gly-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)CN LRQXRHGQEVWGPV-NHCYSSNCSA-N 0.000 description 23
- 150000001413 amino acids Chemical group 0.000 description 21
- VBMOVTMNHWPZJR-SUSMZKCASA-N Thr-Thr-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O VBMOVTMNHWPZJR-SUSMZKCASA-N 0.000 description 19
- 239000000047 product Substances 0.000 description 18
- LLWQVJNHMYBLLK-CDMKHQONSA-N Gly-Thr-Phe Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LLWQVJNHMYBLLK-CDMKHQONSA-N 0.000 description 17
- 230000006698 induction Effects 0.000 description 16
- XDGBFDYXZCMYEX-NUMRIWBASA-N Asp-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)O)N)O XDGBFDYXZCMYEX-NUMRIWBASA-N 0.000 description 15
- KKKVOZNCLALMPV-XKBZYTNZSA-N Ser-Thr-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O KKKVOZNCLALMPV-XKBZYTNZSA-N 0.000 description 15
- DJQIUOKSNRBTSV-CYDGBPFRSA-N Val-Ile-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](C(C)C)N DJQIUOKSNRBTSV-CYDGBPFRSA-N 0.000 description 15
- UWMDGPFFTKDUIY-HJGDQZAQSA-N Gln-Pro-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O UWMDGPFFTKDUIY-HJGDQZAQSA-N 0.000 description 14
- NURNJECQNNCRBK-FLBSBUHZSA-N Ile-Thr-Thr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NURNJECQNNCRBK-FLBSBUHZSA-N 0.000 description 14
- QYIGOFGUOVTAHK-ZJDVBMNYSA-N Met-Thr-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QYIGOFGUOVTAHK-ZJDVBMNYSA-N 0.000 description 14
- VBZXFFYOBDLLFE-HSHDSVGOSA-N Pro-Trp-Thr Chemical compound N([C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H]([C@H](O)C)C(O)=O)C(=O)[C@@H]1CCCN1 VBZXFFYOBDLLFE-HSHDSVGOSA-N 0.000 description 14
- TZKPNGDGUVREEB-FOHZUACHSA-N Thr-Asn-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O TZKPNGDGUVREEB-FOHZUACHSA-N 0.000 description 14
- YQYFYUSYEDNLSD-YEPSODPASA-N Val-Thr-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O YQYFYUSYEDNLSD-YEPSODPASA-N 0.000 description 14
- 229940024606 amino acid Drugs 0.000 description 14
- 235000001014 amino acid Nutrition 0.000 description 14
- 210000004027 cell Anatomy 0.000 description 14
- 108010079547 glutamylmethionine Proteins 0.000 description 14
- GVMXJJAJLIEASL-ZJDVBMNYSA-N Thr-Pro-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O GVMXJJAJLIEASL-ZJDVBMNYSA-N 0.000 description 13
- 239000002773 nucleotide Substances 0.000 description 13
- LRPXYSGPOBVBEH-IUCAKERBSA-N Glu-Gly-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O LRPXYSGPOBVBEH-IUCAKERBSA-N 0.000 description 12
- RMJZWERKFFNNNS-XGEHTFHBSA-N Pro-Thr-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O RMJZWERKFFNNNS-XGEHTFHBSA-N 0.000 description 12
- 125000003729 nucleotide group Chemical group 0.000 description 12
- NHQVWACSJZJCGJ-FLBSBUHZSA-N Thr-Thr-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NHQVWACSJZJCGJ-FLBSBUHZSA-N 0.000 description 11
- ILUOMMDDGREELW-OSUNSFLBSA-N Thr-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)[C@@H](C)O ILUOMMDDGREELW-OSUNSFLBSA-N 0.000 description 11
- KZTLZZQTJMCGIP-ZJDVBMNYSA-N Thr-Val-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KZTLZZQTJMCGIP-ZJDVBMNYSA-N 0.000 description 11
- 108090000765 processed proteins & peptides Proteins 0.000 description 11
- FKESCSGWBPUTPN-FOHZUACHSA-N Gly-Thr-Asn Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O FKESCSGWBPUTPN-FOHZUACHSA-N 0.000 description 10
- QTUSJASXLGLJSR-OSUNSFLBSA-N Ile-Arg-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N QTUSJASXLGLJSR-OSUNSFLBSA-N 0.000 description 10
- YBXMGKCLOPDEKA-NUMRIWBASA-N Thr-Asp-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O YBXMGKCLOPDEKA-NUMRIWBASA-N 0.000 description 10
- 239000013612 plasmid Substances 0.000 description 10
- YHOJJFFTSMWVGR-HJGDQZAQSA-N Glu-Met-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O YHOJJFFTSMWVGR-HJGDQZAQSA-N 0.000 description 9
- JPUNZXVHHRZMNL-XIRDDKMYSA-N Glu-Pro-Trp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O JPUNZXVHHRZMNL-XIRDDKMYSA-N 0.000 description 9
- PABFFPWEJMEVEC-JGVFFNPUSA-N Gly-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)CN)C(=O)O PABFFPWEJMEVEC-JGVFFNPUSA-N 0.000 description 9
- BNGDYRRHRGOPHX-IFFSRLJSSA-N Thr-Glu-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)[C@@H](C)O)C(O)=O BNGDYRRHRGOPHX-IFFSRLJSSA-N 0.000 description 9
- 230000003321 amplification Effects 0.000 description 9
- 108010050848 glycylleucine Proteins 0.000 description 9
- 108010073472 leucyl-prolyl-proline Proteins 0.000 description 9
- 239000007788 liquid Substances 0.000 description 9
- 238000004519 manufacturing process Methods 0.000 description 9
- 238000003199 nucleic acid amplification method Methods 0.000 description 9
- MIIVFRCYJABHTQ-ONGXEEELSA-N Gly-Leu-Val Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O MIIVFRCYJABHTQ-ONGXEEELSA-N 0.000 description 8
- GNRMAQSIROFNMI-IXOXFDKPSA-N Phe-Thr-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O GNRMAQSIROFNMI-IXOXFDKPSA-N 0.000 description 8
- WPSKTVVMQCXPRO-BWBBJGPYSA-N Thr-Ser-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WPSKTVVMQCXPRO-BWBBJGPYSA-N 0.000 description 8
- KDKLLPMFFGYQJD-CYDGBPFRSA-N Val-Ile-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](C(C)C)N KDKLLPMFFGYQJD-CYDGBPFRSA-N 0.000 description 8
- 230000001580 bacterial effect Effects 0.000 description 8
- 108010089804 glycyl-threonine Proteins 0.000 description 8
- 108010078274 isoleucylvaline Proteins 0.000 description 8
- 229920001184 polypeptide Polymers 0.000 description 8
- 102000004196 processed proteins & peptides Human genes 0.000 description 8
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 7
- PXKACEXYLPBMAD-JBDRJPRFSA-N Ile-Ser-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)O)N PXKACEXYLPBMAD-JBDRJPRFSA-N 0.000 description 7
- VLMIUSLQONKLDV-HEIBUPTGSA-N Ser-Thr-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VLMIUSLQONKLDV-HEIBUPTGSA-N 0.000 description 7
- 238000012216 screening Methods 0.000 description 7
- KUFVXLQLDHJVOG-SHGPDSBTSA-N Ala-Thr-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C)N)O KUFVXLQLDHJVOG-SHGPDSBTSA-N 0.000 description 6
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 6
- 241000880493 Leptailurus serval Species 0.000 description 6
- YRBGKVIWMNEVCZ-WDSKDSINSA-N Ser-Glu-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O YRBGKVIWMNEVCZ-WDSKDSINSA-N 0.000 description 6
- 108010010147 glycylglutamine Proteins 0.000 description 6
- 230000001939 inductive effect Effects 0.000 description 6
- 238000003780 insertion Methods 0.000 description 6
- 230000037431 insertion Effects 0.000 description 6
- 108091008146 restriction endonucleases Proteins 0.000 description 6
- 102000004190 Enzymes Human genes 0.000 description 5
- 108090000790 Enzymes Proteins 0.000 description 5
- BDFHWFUAQLIMJO-KXNHARMFSA-N Lys-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N)O BDFHWFUAQLIMJO-KXNHARMFSA-N 0.000 description 5
- GZNYIXWOIUFLGO-ZJDVBMNYSA-N Pro-Thr-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GZNYIXWOIUFLGO-ZJDVBMNYSA-N 0.000 description 5
- BMKNXTJLHFIAAH-CIUDSAMLSA-N Ser-Ser-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O BMKNXTJLHFIAAH-CIUDSAMLSA-N 0.000 description 5
- NDZYTIMDOZMECO-SHGPDSBTSA-N Thr-Thr-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O NDZYTIMDOZMECO-SHGPDSBTSA-N 0.000 description 5
- MHNHRNHJMXAVHZ-AAEUAGOBSA-N Trp-Asn-Gly Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)NCC(=O)O)N MHNHRNHJMXAVHZ-AAEUAGOBSA-N 0.000 description 5
- XPKCFQZDQGVJCX-RHYQMDGZSA-N Val-Lys-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](C(C)C)N)O XPKCFQZDQGVJCX-RHYQMDGZSA-N 0.000 description 5
- 230000008859 change Effects 0.000 description 5
- 239000008103 glucose Substances 0.000 description 5
- 244000005700 microbiome Species 0.000 description 5
- 230000008569 process Effects 0.000 description 5
- 238000003906 pulsed field gel electrophoresis Methods 0.000 description 5
- 238000012360 testing method Methods 0.000 description 5
- MEIRRNXMZYDVDW-MQQKCMAXSA-N (2E,4E)-2,4-hexadien-1-ol Chemical compound C\C=C\C=C\CO MEIRRNXMZYDVDW-MQQKCMAXSA-N 0.000 description 4
- YLTKNGYYPIWKHZ-ACZMJKKPSA-N Ala-Ala-Glu Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O YLTKNGYYPIWKHZ-ACZMJKKPSA-N 0.000 description 4
- IETUUAHKCHOQHP-KZVJFYERSA-N Ala-Thr-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@H](C)N)[C@@H](C)O)C(O)=O IETUUAHKCHOQHP-KZVJFYERSA-N 0.000 description 4
- 241000588724 Escherichia coli Species 0.000 description 4
- 101150054379 FLO1 gene Proteins 0.000 description 4
- STHSGOZLFLFGSS-SUSMZKCASA-N Gln-Thr-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O STHSGOZLFLFGSS-SUSMZKCASA-N 0.000 description 4
- YHFPHRUWZMEOIX-CYDGBPFRSA-N Ile-Val-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(=O)O)N YHFPHRUWZMEOIX-CYDGBPFRSA-N 0.000 description 4
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 4
- QESXLSQLQHHTIX-RHYQMDGZSA-N Leu-Val-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QESXLSQLQHHTIX-RHYQMDGZSA-N 0.000 description 4
- KCGIREHVWRXNDH-GARJFASQSA-N Ser-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N KCGIREHVWRXNDH-GARJFASQSA-N 0.000 description 4
- XQJCEKXQUJQNNK-ZLUOBGJFSA-N Ser-Ser-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O XQJCEKXQUJQNNK-ZLUOBGJFSA-N 0.000 description 4
- MFQMZDPAZRZAPV-NAKRPEOUSA-N Ser-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CO)N MFQMZDPAZRZAPV-NAKRPEOUSA-N 0.000 description 4
- DFTCYYILCSQGIZ-GCJQMDKQSA-N Thr-Ala-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O DFTCYYILCSQGIZ-GCJQMDKQSA-N 0.000 description 4
- OFTXTCGQJXTNQS-XGEHTFHBSA-N Val-Thr-Ser Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](C(C)C)N)O OFTXTCGQJXTNQS-XGEHTFHBSA-N 0.000 description 4
- 238000009395 breeding Methods 0.000 description 4
- 230000001488 breeding effect Effects 0.000 description 4
- 238000004422 calculation algorithm Methods 0.000 description 4
- 230000000295 complement effect Effects 0.000 description 4
- 230000005611 electricity Effects 0.000 description 4
- 238000005516 engineering process Methods 0.000 description 4
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 4
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 4
- 239000001963 growth medium Substances 0.000 description 4
- 230000000968 intestinal effect Effects 0.000 description 4
- 238000012856 packing Methods 0.000 description 4
- 239000002245 particle Substances 0.000 description 4
- 238000002360 preparation method Methods 0.000 description 4
- 238000007670 refining Methods 0.000 description 4
- 239000000523 sample Substances 0.000 description 4
- 239000000126 substance Substances 0.000 description 4
- 235000000346 sugar Nutrition 0.000 description 4
- 108010073969 valyllysine Proteins 0.000 description 4
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Chemical compound O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 4
- GORKKVHIBWAQHM-GCJQMDKQSA-N Ala-Asn-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GORKKVHIBWAQHM-GCJQMDKQSA-N 0.000 description 3
- LNNSWWRRYJLGNI-NAKRPEOUSA-N Ala-Ile-Val Chemical compound C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O LNNSWWRRYJLGNI-NAKRPEOUSA-N 0.000 description 3
- NCQMBSJGJMYKCK-ZLUOBGJFSA-N Ala-Ser-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O NCQMBSJGJMYKCK-ZLUOBGJFSA-N 0.000 description 3
- HCZQKHSRYHCPSD-IUKAMOBKSA-N Asn-Thr-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HCZQKHSRYHCPSD-IUKAMOBKSA-N 0.000 description 3
- ZEXHDOQQYZKOIB-ACZMJKKPSA-N Cys-Glu-Ser Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O ZEXHDOQQYZKOIB-ACZMJKKPSA-N 0.000 description 3
- 108700039691 Genetic Promoter Regions Proteins 0.000 description 3
- CAQXJMUDOLSBPF-SUSMZKCASA-N Glu-Thr-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAQXJMUDOLSBPF-SUSMZKCASA-N 0.000 description 3
- AFMOTCMSEBITOE-YEPSODPASA-N Gly-Val-Thr Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O AFMOTCMSEBITOE-YEPSODPASA-N 0.000 description 3
- 101150043522 HO gene Proteins 0.000 description 3
- AXNGDPAKKCEKGY-QPHKQPEJSA-N Ile-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N AXNGDPAKKCEKGY-QPHKQPEJSA-N 0.000 description 3
- WCNWGAUZWWSYDG-SVSWQMSJSA-N Ile-Thr-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)O)N WCNWGAUZWWSYDG-SVSWQMSJSA-N 0.000 description 3
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 3
- LIINDKYIGYTDLG-PPCPHDFISA-N Leu-Ile-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LIINDKYIGYTDLG-PPCPHDFISA-N 0.000 description 3
- DPURXCQCHSQPAN-AVGNSLFASA-N Leu-Pro-Pro Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 DPURXCQCHSQPAN-AVGNSLFASA-N 0.000 description 3
- 108091000080 Phosphotransferase Proteins 0.000 description 3
- UREQLMJCKFLLHM-NAKRPEOUSA-N Pro-Ile-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O UREQLMJCKFLLHM-NAKRPEOUSA-N 0.000 description 3
- 101100066910 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) FLO1 gene Proteins 0.000 description 3
- YQHZVYJAGWMHES-ZLUOBGJFSA-N Ser-Ala-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O YQHZVYJAGWMHES-ZLUOBGJFSA-N 0.000 description 3
- SRSPTFBENMJHMR-WHFBIAKZSA-N Ser-Ser-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SRSPTFBENMJHMR-WHFBIAKZSA-N 0.000 description 3
- JURQXQBJKUHGJS-UHFFFAOYSA-N Ser-Ser-Ser-Ser Chemical compound OCC(N)C(=O)NC(CO)C(=O)NC(CO)C(=O)NC(CO)C(O)=O JURQXQBJKUHGJS-UHFFFAOYSA-N 0.000 description 3
- VGQVAVQWKJLIRM-FXQIFTODSA-N Ser-Ser-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O VGQVAVQWKJLIRM-FXQIFTODSA-N 0.000 description 3
- ZKOKTQPHFMRSJP-YJRXYDGGSA-N Ser-Thr-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZKOKTQPHFMRSJP-YJRXYDGGSA-N 0.000 description 3
- DWYAUVCQDTZIJI-VZFHVOOUSA-N Thr-Ala-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O DWYAUVCQDTZIJI-VZFHVOOUSA-N 0.000 description 3
- CAJFZCICSVBOJK-SHGPDSBTSA-N Thr-Ala-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAJFZCICSVBOJK-SHGPDSBTSA-N 0.000 description 3
- GMXIJHCBTZDAPD-QPHKQPEJSA-N Thr-Ile-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N GMXIJHCBTZDAPD-QPHKQPEJSA-N 0.000 description 3
- FWTFAZKJORVTIR-VZFHVOOUSA-N Thr-Ser-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O FWTFAZKJORVTIR-VZFHVOOUSA-N 0.000 description 3
- SLLKXDSRVAOREO-KZVJFYERSA-N Val-Ala-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C)NC(=O)[C@H](C(C)C)N)O SLLKXDSRVAOREO-KZVJFYERSA-N 0.000 description 3
- JAIZPWVHPQRYOU-ZJDVBMNYSA-N Val-Thr-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O JAIZPWVHPQRYOU-ZJDVBMNYSA-N 0.000 description 3
- AOILQMZPNLUXCM-AVGNSLFASA-N Val-Val-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN AOILQMZPNLUXCM-AVGNSLFASA-N 0.000 description 3
- JVGDAEKKZKKZFO-RCWTZXSCSA-N Val-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C(C)C)N)O JVGDAEKKZKKZFO-RCWTZXSCSA-N 0.000 description 3
- 238000004458 analytical method Methods 0.000 description 3
- 238000000137 annealing Methods 0.000 description 3
- WIIZWVCIJKGZOK-RKDXNWHRSA-N chloramphenicol Chemical compound ClC(Cl)C(=O)N[C@H](CO)[C@H](O)C1=CC=C([N+]([O-])=O)C=C1 WIIZWVCIJKGZOK-RKDXNWHRSA-N 0.000 description 3
- 230000000694 effects Effects 0.000 description 3
- 238000002474 experimental method Methods 0.000 description 3
- 238000000605 extraction Methods 0.000 description 3
- 238000011049 filling Methods 0.000 description 3
- 238000013467 fragmentation Methods 0.000 description 3
- 238000006062 fragmentation reaction Methods 0.000 description 3
- 238000010353 genetic engineering Methods 0.000 description 3
- 238000009396 hybridization Methods 0.000 description 3
- 108010064235 lysylglycine Proteins 0.000 description 3
- 239000000463 material Substances 0.000 description 3
- 239000002609 medium Substances 0.000 description 3
- 230000035772 mutation Effects 0.000 description 3
- 235000015097 nutrients Nutrition 0.000 description 3
- 229940049547 paraxin Drugs 0.000 description 3
- 102000020233 phosphotransferase Human genes 0.000 description 3
- 210000001938 protoplast Anatomy 0.000 description 3
- 238000000746 purification Methods 0.000 description 3
- 238000012163 sequencing technique Methods 0.000 description 3
- AWUCVROLDVIAJX-GSVOUGTGSA-N sn-glycerol 3-phosphate Chemical compound OC[C@@H](O)COP(O)(O)=O AWUCVROLDVIAJX-GSVOUGTGSA-N 0.000 description 3
- FRXSZNDVFUDTIR-UHFFFAOYSA-N 6-methoxy-1,2,3,4-tetrahydroquinoline Chemical compound N1CCCC2=CC(OC)=CC=C21 FRXSZNDVFUDTIR-UHFFFAOYSA-N 0.000 description 2
- QHASENCZLDHBGX-ONGXEEELSA-N Ala-Gly-Phe Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QHASENCZLDHBGX-ONGXEEELSA-N 0.000 description 2
- MMLHRUJLOUSRJX-CIUDSAMLSA-N Ala-Ser-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN MMLHRUJLOUSRJX-CIUDSAMLSA-N 0.000 description 2
- NZGRHTKZFSVPAN-BIIVOSGPSA-N Ala-Ser-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N NZGRHTKZFSVPAN-BIIVOSGPSA-N 0.000 description 2
- OEVCHROQUIVQFZ-YTLHQDLWSA-N Ala-Thr-Ala Chemical compound C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](C)C(O)=O OEVCHROQUIVQFZ-YTLHQDLWSA-N 0.000 description 2
- PGNNQOJOEGFAOR-KWQFWETISA-N Ala-Tyr-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=C(O)C=C1 PGNNQOJOEGFAOR-KWQFWETISA-N 0.000 description 2
- XKXAZPSREVUCRT-BPNCWPANSA-N Ala-Tyr-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CC=C(O)C=C1 XKXAZPSREVUCRT-BPNCWPANSA-N 0.000 description 2
- REWSWYIDQIELBE-FXQIFTODSA-N Ala-Val-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O REWSWYIDQIELBE-FXQIFTODSA-N 0.000 description 2
- 244000153158 Ammi visnaga Species 0.000 description 2
- 235000010585 Ammi visnaga Nutrition 0.000 description 2
- UVTGNSWSRSCPLP-UHFFFAOYSA-N Arg-Tyr Natural products NC(CCNC(=N)N)C(=O)NC(Cc1ccc(O)cc1)C(=O)O UVTGNSWSRSCPLP-UHFFFAOYSA-N 0.000 description 2
- HYQYLOSCICEYTR-YUMQZZPRSA-N Asn-Gly-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O HYQYLOSCICEYTR-YUMQZZPRSA-N 0.000 description 2
- RAQMSGVCGSJKCL-FOHZUACHSA-N Asn-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(N)=O RAQMSGVCGSJKCL-FOHZUACHSA-N 0.000 description 2
- LTZIRYMWOJHRCH-GUDRVLHUSA-N Asn-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N LTZIRYMWOJHRCH-GUDRVLHUSA-N 0.000 description 2
- RBOBTTLFPRSXKZ-BZSNNMDCSA-N Asn-Phe-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O RBOBTTLFPRSXKZ-BZSNNMDCSA-N 0.000 description 2
- XOQYDFCQPWAMSA-KKHAAJSZSA-N Asn-Val-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XOQYDFCQPWAMSA-KKHAAJSZSA-N 0.000 description 2
- HKEZZWQWXWGASX-KKUMJFAQSA-N Asp-Leu-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 HKEZZWQWXWGASX-KKUMJFAQSA-N 0.000 description 2
- WMLFFCRUSPNENW-ZLUOBGJFSA-N Asp-Ser-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O WMLFFCRUSPNENW-ZLUOBGJFSA-N 0.000 description 2
- XYPJXLLXNSAWHZ-SRVKXCTJSA-N Asp-Ser-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XYPJXLLXNSAWHZ-SRVKXCTJSA-N 0.000 description 2
- 101100083069 Candida albicans (strain SC5314 / ATCC MYA-2876) PGA62 gene Proteins 0.000 description 2
- 101100106993 Candida albicans (strain SC5314 / ATCC MYA-2876) YWP1 gene Proteins 0.000 description 2
- KVCJEMHFLGVINV-ZLUOBGJFSA-N Cys-Ser-Asn Chemical compound SC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(N)=O KVCJEMHFLGVINV-ZLUOBGJFSA-N 0.000 description 2
- JAHCWGSVNZXHRR-SVSWQMSJSA-N Cys-Thr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](CS)N JAHCWGSVNZXHRR-SVSWQMSJSA-N 0.000 description 2
- LPBUBIHAVKXUOT-FXQIFTODSA-N Cys-Val-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CS)N LPBUBIHAVKXUOT-FXQIFTODSA-N 0.000 description 2
- 102000004594 DNA Polymerase I Human genes 0.000 description 2
- 108010017826 DNA Polymerase I Proteins 0.000 description 2
- 241000233866 Fungi Species 0.000 description 2
- WLRYGVYQFXRJDA-DCAQKATOSA-N Gln-Pro-Pro Chemical compound NC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 WLRYGVYQFXRJDA-DCAQKATOSA-N 0.000 description 2
- ININBLZFFVOQIO-JHEQGTHGSA-N Gln-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N)O ININBLZFFVOQIO-JHEQGTHGSA-N 0.000 description 2
- UBRQJXFDVZNYJP-AVGNSLFASA-N Gln-Tyr-Ser Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O UBRQJXFDVZNYJP-AVGNSLFASA-N 0.000 description 2
- UTKICHUQEQBDGC-ACZMJKKPSA-N Glu-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)O)N UTKICHUQEQBDGC-ACZMJKKPSA-N 0.000 description 2
- VFZIDQZAEBORGY-GLLZPBPUSA-N Glu-Gln-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VFZIDQZAEBORGY-GLLZPBPUSA-N 0.000 description 2
- HZISRJBYZAODRV-XQXXSGGOSA-N Glu-Thr-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O HZISRJBYZAODRV-XQXXSGGOSA-N 0.000 description 2
- BKMOHWJHXQLFEX-IRIUXVKKSA-N Glu-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CCC(=O)O)N)O BKMOHWJHXQLFEX-IRIUXVKKSA-N 0.000 description 2
- PYTZFYUXZZHOAD-WHFBIAKZSA-N Gly-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)CN PYTZFYUXZZHOAD-WHFBIAKZSA-N 0.000 description 2
- QSDKBRMVXSWAQE-BFHQHQDPSA-N Gly-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN QSDKBRMVXSWAQE-BFHQHQDPSA-N 0.000 description 2
- GNBMOZPQUXTCRW-STQMWFEESA-N Gly-Asn-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CC(N)=O)NC(=O)CN)C(O)=O)=CNC2=C1 GNBMOZPQUXTCRW-STQMWFEESA-N 0.000 description 2
- LGQZOQRDEUIZJY-YUMQZZPRSA-N Gly-Cys-Lys Chemical compound NCCCC[C@H](NC(=O)[C@H](CS)NC(=O)CN)C(O)=O LGQZOQRDEUIZJY-YUMQZZPRSA-N 0.000 description 2
- DTRUBYPMMVPQPD-YUMQZZPRSA-N Gly-Gln-Arg Chemical compound [H]NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O DTRUBYPMMVPQPD-YUMQZZPRSA-N 0.000 description 2
- QPTNELDXWKRIFX-YFKPBYRVSA-N Gly-Gly-Gln Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O QPTNELDXWKRIFX-YFKPBYRVSA-N 0.000 description 2
- SWQALSGKVLYKDT-UHFFFAOYSA-N Gly-Ile-Ala Natural products NCC(=O)NC(C(C)CC)C(=O)NC(C)C(O)=O SWQALSGKVLYKDT-UHFFFAOYSA-N 0.000 description 2
- HHRODZSXDXMUHS-LURJTMIESA-N Gly-Met-Gly Chemical compound CSCC[C@H](NC(=O)C[NH3+])C(=O)NCC([O-])=O HHRODZSXDXMUHS-LURJTMIESA-N 0.000 description 2
- WCORRBXVISTKQL-WHFBIAKZSA-N Gly-Ser-Ser Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WCORRBXVISTKQL-WHFBIAKZSA-N 0.000 description 2
- ZLCLYFGMKFCDCN-XPUUQOCRSA-N Gly-Ser-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CO)NC(=O)CN)C(O)=O ZLCLYFGMKFCDCN-XPUUQOCRSA-N 0.000 description 2
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 2
- HERITAGIPLEJMT-GVARAGBVSA-N Ile-Ala-Tyr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HERITAGIPLEJMT-GVARAGBVSA-N 0.000 description 2
- AQTWDZDISVGCAC-CFMVVWHZSA-N Ile-Asp-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N AQTWDZDISVGCAC-CFMVVWHZSA-N 0.000 description 2
- GVKKVHNRTUFCCE-BJDJZHNGSA-N Ile-Leu-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)O)N GVKKVHNRTUFCCE-BJDJZHNGSA-N 0.000 description 2
- GLYJPWIRLBAIJH-FQUUOJAGSA-N Ile-Lys-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N GLYJPWIRLBAIJH-FQUUOJAGSA-N 0.000 description 2
- GLYJPWIRLBAIJH-UHFFFAOYSA-N Ile-Lys-Pro Natural products CCC(C)C(N)C(=O)NC(CCCCN)C(=O)N1CCCC1C(O)=O GLYJPWIRLBAIJH-UHFFFAOYSA-N 0.000 description 2
- YBKKLDBBPFIXBQ-MBLNEYKQSA-N Ile-Thr-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(=O)O)N YBKKLDBBPFIXBQ-MBLNEYKQSA-N 0.000 description 2
- JZBVBOKASHNXAD-NAKRPEOUSA-N Ile-Val-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(=O)O)N JZBVBOKASHNXAD-NAKRPEOUSA-N 0.000 description 2
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 2
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 2
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 2
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 2
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 2
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 2
- PRZVBIAOPFGAQF-SRVKXCTJSA-N Leu-Glu-Met Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O PRZVBIAOPFGAQF-SRVKXCTJSA-N 0.000 description 2
- USLNHQZCDQJBOV-ZPFDUUQYSA-N Leu-Ile-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O USLNHQZCDQJBOV-ZPFDUUQYSA-N 0.000 description 2
- WMIOEVKKYIMVKI-DCAQKATOSA-N Leu-Pro-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O WMIOEVKKYIMVKI-DCAQKATOSA-N 0.000 description 2
- YUTNOGOMBNYPFH-XUXIUFHCSA-N Leu-Pro-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YUTNOGOMBNYPFH-XUXIUFHCSA-N 0.000 description 2
- PWPBLZXWFXJFHE-RHYQMDGZSA-N Leu-Pro-Thr Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O PWPBLZXWFXJFHE-RHYQMDGZSA-N 0.000 description 2
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 2
- 102000003960 Ligases Human genes 0.000 description 2
- 108090000364 Ligases Proteins 0.000 description 2
- NRQRKMYZONPCTM-CIUDSAMLSA-N Lys-Asp-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O NRQRKMYZONPCTM-CIUDSAMLSA-N 0.000 description 2
- SBQDRNOLGSYHQA-YUMQZZPRSA-N Lys-Ser-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SBQDRNOLGSYHQA-YUMQZZPRSA-N 0.000 description 2
- QXEVZBXTDTVPCP-GMOBBJLQSA-N Met-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCSC)N QXEVZBXTDTVPCP-GMOBBJLQSA-N 0.000 description 2
- KYJHWKAMFISDJE-RCWTZXSCSA-N Met-Thr-Met Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCSC KYJHWKAMFISDJE-RCWTZXSCSA-N 0.000 description 2
- 108091028043 Nucleic acid sequence Proteins 0.000 description 2
- FIRWJEJVFFGXSH-RYUDHWBXSA-N Phe-Glu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 FIRWJEJVFFGXSH-RYUDHWBXSA-N 0.000 description 2
- MHNBYYFXWDUGBW-RPTUDFQQSA-N Phe-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC2=CC=CC=C2)N)O MHNBYYFXWDUGBW-RPTUDFQQSA-N 0.000 description 2
- YFNOUBWUIIJQHF-LPEHRKFASA-N Pro-Asp-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)O)C(=O)N2CCC[C@@H]2C(=O)O YFNOUBWUIIJQHF-LPEHRKFASA-N 0.000 description 2
- UPJGUQPLYWTISV-GUBZILKMSA-N Pro-Gln-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O UPJGUQPLYWTISV-GUBZILKMSA-N 0.000 description 2
- LCUOTSLIVGSGAU-AVGNSLFASA-N Pro-His-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O LCUOTSLIVGSGAU-AVGNSLFASA-N 0.000 description 2
- KBUAPZAZPWNYSW-SRVKXCTJSA-N Pro-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 KBUAPZAZPWNYSW-SRVKXCTJSA-N 0.000 description 2
- RSTWKJFWBKFOFC-JYJNAYRXSA-N Pro-Trp-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(N)=O)C(O)=O RSTWKJFWBKFOFC-JYJNAYRXSA-N 0.000 description 2
- JPIDMRXXNMIVKY-VZFHVOOUSA-N Ser-Ala-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JPIDMRXXNMIVKY-VZFHVOOUSA-N 0.000 description 2
- XVAUJOAYHWWNQF-ZLUOBGJFSA-N Ser-Asn-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O XVAUJOAYHWWNQF-ZLUOBGJFSA-N 0.000 description 2
- ICHZYBVODUVUKN-SRVKXCTJSA-N Ser-Asn-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ICHZYBVODUVUKN-SRVKXCTJSA-N 0.000 description 2
- MESDJCNHLZBMEP-ZLUOBGJFSA-N Ser-Asp-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MESDJCNHLZBMEP-ZLUOBGJFSA-N 0.000 description 2
- ZOHGLPQGEHSLPD-FXQIFTODSA-N Ser-Gln-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZOHGLPQGEHSLPD-FXQIFTODSA-N 0.000 description 2
- XWCYBVBLJRWOFR-WDSKDSINSA-N Ser-Gln-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O XWCYBVBLJRWOFR-WDSKDSINSA-N 0.000 description 2
- VQBCMLMPEWPUTB-ACZMJKKPSA-N Ser-Glu-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O VQBCMLMPEWPUTB-ACZMJKKPSA-N 0.000 description 2
- GZBKRJVCRMZAST-XKBZYTNZSA-N Ser-Glu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GZBKRJVCRMZAST-XKBZYTNZSA-N 0.000 description 2
- ADJDNJCSPNFFPI-FXQIFTODSA-N Ser-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO ADJDNJCSPNFFPI-FXQIFTODSA-N 0.000 description 2
- JCLAFVNDBJMLBC-JBDRJPRFSA-N Ser-Ser-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JCLAFVNDBJMLBC-JBDRJPRFSA-N 0.000 description 2
- PYTKULIABVRXSC-BWBBJGPYSA-N Ser-Ser-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PYTKULIABVRXSC-BWBBJGPYSA-N 0.000 description 2
- XJDMUQCLVSCRSJ-VZFHVOOUSA-N Ser-Thr-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O XJDMUQCLVSCRSJ-VZFHVOOUSA-N 0.000 description 2
- PURRNJBBXDDWLX-ZDLURKLDSA-N Ser-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CO)N)O PURRNJBBXDDWLX-ZDLURKLDSA-N 0.000 description 2
- OQSQCUWQOIHECT-YJRXYDGGSA-N Ser-Tyr-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OQSQCUWQOIHECT-YJRXYDGGSA-N 0.000 description 2
- JGUWRQWULDWNCM-FXQIFTODSA-N Ser-Val-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O JGUWRQWULDWNCM-FXQIFTODSA-N 0.000 description 2
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 2
- JBHMLZSKIXMVFS-XVSYOHENSA-N Thr-Asn-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JBHMLZSKIXMVFS-XVSYOHENSA-N 0.000 description 2
- OJRNZRROAIAHDL-LKXGYXEUSA-N Thr-Asn-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O OJRNZRROAIAHDL-LKXGYXEUSA-N 0.000 description 2
- GKMYGVQDGVYCPC-IUKAMOBKSA-N Thr-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H]([C@@H](C)O)N GKMYGVQDGVYCPC-IUKAMOBKSA-N 0.000 description 2
- ONNSECRQFSTMCC-XKBZYTNZSA-N Thr-Glu-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O ONNSECRQFSTMCC-XKBZYTNZSA-N 0.000 description 2
- JQAWYCUUFIMTHE-WLTAIBSBSA-N Thr-Gly-Tyr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JQAWYCUUFIMTHE-WLTAIBSBSA-N 0.000 description 2
- PAXANSWUSVPFNK-IUKAMOBKSA-N Thr-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N PAXANSWUSVPFNK-IUKAMOBKSA-N 0.000 description 2
- SPVHQURZJCUDQC-VOAKCMCISA-N Thr-Lys-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O SPVHQURZJCUDQC-VOAKCMCISA-N 0.000 description 2
- JMBRNXUOLJFURW-BEAPCOKYSA-N Thr-Phe-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N2CCC[C@@H]2C(=O)O)N)O JMBRNXUOLJFURW-BEAPCOKYSA-N 0.000 description 2
- CSNBWOJOEOPYIJ-UVOCVTCTSA-N Thr-Thr-Lys Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O CSNBWOJOEOPYIJ-UVOCVTCTSA-N 0.000 description 2
- DKNYWNPPSZCWCJ-GBALPHGKSA-N Thr-Trp-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CS)C(=O)O)N)O DKNYWNPPSZCWCJ-GBALPHGKSA-N 0.000 description 2
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 2
- 239000004473 Threonine Substances 0.000 description 2
- OGXQLUCMJZSJPW-LYSGOOTNSA-N Trp-Gly-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O OGXQLUCMJZSJPW-LYSGOOTNSA-N 0.000 description 2
- GSCPHMSPGQSZJT-JYBASQMISA-N Trp-Ser-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O GSCPHMSPGQSZJT-JYBASQMISA-N 0.000 description 2
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 2
- 239000006035 Tryptophane Substances 0.000 description 2
- FWOVTJKVUCGVND-UFYCRDLUSA-N Tyr-Met-Phe Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N FWOVTJKVUCGVND-UFYCRDLUSA-N 0.000 description 2
- KZOZXAYPVKKDIO-UFYCRDLUSA-N Tyr-Met-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=C(O)C=C1 KZOZXAYPVKKDIO-UFYCRDLUSA-N 0.000 description 2
- KHPLUFDSWGDRHD-SLFFLAALSA-N Tyr-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CC3=CC=C(C=C3)O)N)C(=O)O KHPLUFDSWGDRHD-SLFFLAALSA-N 0.000 description 2
- YKBUNNNRNZZUID-UFYCRDLUSA-N Tyr-Val-Tyr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YKBUNNNRNZZUID-UFYCRDLUSA-N 0.000 description 2
- AZSHAZJLOZQYAY-FXQIFTODSA-N Val-Ala-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O AZSHAZJLOZQYAY-FXQIFTODSA-N 0.000 description 2
- PIFJAFRUVWZRKR-QMMMGPOBSA-N Val-Gly-Gly Chemical compound CC(C)[C@H]([NH3+])C(=O)NCC(=O)NCC([O-])=O PIFJAFRUVWZRKR-QMMMGPOBSA-N 0.000 description 2
- OVBMCNDKCWAXMZ-NAKRPEOUSA-N Val-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](C(C)C)N OVBMCNDKCWAXMZ-NAKRPEOUSA-N 0.000 description 2
- YKNOJPJWNVHORX-UNQGMJICSA-N Val-Phe-Thr Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CC1=CC=CC=C1 YKNOJPJWNVHORX-UNQGMJICSA-N 0.000 description 2
- HTONZBWRYUKUKC-RCWTZXSCSA-N Val-Thr-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O HTONZBWRYUKUKC-RCWTZXSCSA-N 0.000 description 2
- IECQJCJNPJVUSB-IHRRRGAJSA-N Val-Tyr-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CO)C(O)=O IECQJCJNPJVUSB-IHRRRGAJSA-N 0.000 description 2
- 239000002253 acid Substances 0.000 description 2
- 108010024078 alanyl-glycyl-serine Proteins 0.000 description 2
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 2
- 108010005233 alanylglutamic acid Proteins 0.000 description 2
- 108010062796 arginyllysine Proteins 0.000 description 2
- 108010060035 arginylproline Proteins 0.000 description 2
- 230000033228 biological regulation Effects 0.000 description 2
- 239000003153 chemical reaction reagent Substances 0.000 description 2
- YTRQFSDWAXHJCC-UHFFFAOYSA-N chloroform;phenol Chemical compound ClC(Cl)Cl.OC1=CC=CC=C1 YTRQFSDWAXHJCC-UHFFFAOYSA-N 0.000 description 2
- 238000005520 cutting process Methods 0.000 description 2
- 108010004073 cysteinylcysteine Proteins 0.000 description 2
- 230000009089 cytolysis Effects 0.000 description 2
- 238000001962 electrophoresis Methods 0.000 description 2
- 238000001976 enzyme digestion Methods 0.000 description 2
- 238000012869 ethanol precipitation Methods 0.000 description 2
- 230000004927 fusion Effects 0.000 description 2
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 2
- 229960002989 glutamic acid Drugs 0.000 description 2
- 108010028188 glycyl-histidyl-serine Proteins 0.000 description 2
- 108010087823 glycyltyrosine Proteins 0.000 description 2
- 108010037850 glycylvaline Proteins 0.000 description 2
- 229910001385 heavy metal Inorganic materials 0.000 description 2
- 238000002744 homologous recombination Methods 0.000 description 2
- 230000006801 homologous recombination Effects 0.000 description 2
- 238000000338 in vitro Methods 0.000 description 2
- 230000005764 inhibitory process Effects 0.000 description 2
- 238000011081 inoculation Methods 0.000 description 2
- 230000010354 integration Effects 0.000 description 2
- 229960000310 isoleucine Drugs 0.000 description 2
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 2
- 238000002156 mixing Methods 0.000 description 2
- 239000000203 mixture Substances 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000004806 packaging method and process Methods 0.000 description 2
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 description 2
- 108010020432 prolyl-prolylisoleucine Proteins 0.000 description 2
- 230000006798 recombination Effects 0.000 description 2
- 238000005215 recombination Methods 0.000 description 2
- 238000004062 sedimentation Methods 0.000 description 2
- 239000001509 sodium citrate Substances 0.000 description 2
- 108700004896 tripeptide FEG Proteins 0.000 description 2
- HRXKRNGNAMMEHJ-UHFFFAOYSA-K trisodium citrate Chemical compound [Na+].[Na+].[Na+].[O-]C(=O)CC(O)(CC([O-])=O)C([O-])=O HRXKRNGNAMMEHJ-UHFFFAOYSA-K 0.000 description 2
- 229940038773 trisodium citrate Drugs 0.000 description 2
- 229960004799 tryptophan Drugs 0.000 description 2
- 108010038745 tryptophylglycine Proteins 0.000 description 2
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 2
- 108010005834 tyrosyl-alanyl-glycine Proteins 0.000 description 2
- 238000011144 upstream manufacturing Methods 0.000 description 2
- 238000012795 verification Methods 0.000 description 2
- 210000005253 yeast cell Anatomy 0.000 description 2
- HDTRYLNUVZCQOY-UHFFFAOYSA-N α-D-glucopyranosyl-α-D-glucopyranoside Natural products OC1C(O)C(O)C(CO)OC1OC1C(O)C(O)C(O)C(CO)O1 HDTRYLNUVZCQOY-UHFFFAOYSA-N 0.000 description 1
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-Amino-3-hydroxypropanoic acid Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 1
- OSJPPGNTCRNQQC-UWTATZPHSA-N 3-phospho-D-glyceric acid Chemical compound OC(=O)[C@H](O)COP(O)(O)=O OSJPPGNTCRNQQC-UWTATZPHSA-N 0.000 description 1
- 229920000936 Agarose Polymers 0.000 description 1
- FXKNPWNXPQZLES-ZLUOBGJFSA-N Ala-Asn-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O FXKNPWNXPQZLES-ZLUOBGJFSA-N 0.000 description 1
- RXTBLQVXNIECFP-FXQIFTODSA-N Ala-Gln-Gln Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O RXTBLQVXNIECFP-FXQIFTODSA-N 0.000 description 1
- ZODMADSIQZZBSQ-FXQIFTODSA-N Ala-Gln-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZODMADSIQZZBSQ-FXQIFTODSA-N 0.000 description 1
- XYTNPQNAZREREP-XQXXSGGOSA-N Ala-Glu-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XYTNPQNAZREREP-XQXXSGGOSA-N 0.000 description 1
- PCIFXPRIFWKWLK-YUMQZZPRSA-N Ala-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N PCIFXPRIFWKWLK-YUMQZZPRSA-N 0.000 description 1
- NBTGEURICRTMGL-WHFBIAKZSA-N Ala-Gly-Ser Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O NBTGEURICRTMGL-WHFBIAKZSA-N 0.000 description 1
- OBVSBEYOMDWLRJ-BFHQHQDPSA-N Ala-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N OBVSBEYOMDWLRJ-BFHQHQDPSA-N 0.000 description 1
- VNYMOTCMNHJGTG-JBDRJPRFSA-N Ala-Ile-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O VNYMOTCMNHJGTG-JBDRJPRFSA-N 0.000 description 1
- YYAVDNKUWLAFCV-ACZMJKKPSA-N Ala-Ser-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O YYAVDNKUWLAFCV-ACZMJKKPSA-N 0.000 description 1
- HOVPGJUNRLMIOZ-CIUDSAMLSA-N Ala-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](C)N HOVPGJUNRLMIOZ-CIUDSAMLSA-N 0.000 description 1
- WQKAQKZRDIZYNV-VZFHVOOUSA-N Ala-Ser-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WQKAQKZRDIZYNV-VZFHVOOUSA-N 0.000 description 1
- YNOCMHZSWJMGBB-GCJQMDKQSA-N Ala-Thr-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O YNOCMHZSWJMGBB-GCJQMDKQSA-N 0.000 description 1
- LSMDIAAALJJLRO-XQXXSGGOSA-N Ala-Thr-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O LSMDIAAALJJLRO-XQXXSGGOSA-N 0.000 description 1
- VNFSAYFQLXPHPY-CIQUZCHMSA-N Ala-Thr-Ile Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VNFSAYFQLXPHPY-CIQUZCHMSA-N 0.000 description 1
- KTXKIYXZQFWJKB-VZFHVOOUSA-N Ala-Thr-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O KTXKIYXZQFWJKB-VZFHVOOUSA-N 0.000 description 1
- YHQGEARSFILVHL-HJGDQZAQSA-N Arg-Gln-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N)O YHQGEARSFILVHL-HJGDQZAQSA-N 0.000 description 1
- UULLJGQFCDXVTQ-CYDGBPFRSA-N Arg-Pro-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UULLJGQFCDXVTQ-CYDGBPFRSA-N 0.000 description 1
- LRPZJPMQGKGHSG-XGEHTFHBSA-N Arg-Ser-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCCN=C(N)N)N)O LRPZJPMQGKGHSG-XGEHTFHBSA-N 0.000 description 1
- 239000004475 Arginine Substances 0.000 description 1
- 102000003916 Arrestin Human genes 0.000 description 1
- 108090000328 Arrestin Proteins 0.000 description 1
- IYVSIZAXNLOKFQ-BYULHYEWSA-N Asn-Asp-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O IYVSIZAXNLOKFQ-BYULHYEWSA-N 0.000 description 1
- BZMWJLLUAKSIMH-FXQIFTODSA-N Asn-Glu-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O BZMWJLLUAKSIMH-FXQIFTODSA-N 0.000 description 1
- RAKKBBHMTJSXOY-XVYDVKMFSA-N Asn-His-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(O)=O RAKKBBHMTJSXOY-XVYDVKMFSA-N 0.000 description 1
- DJIMLSXHXKWADV-CIUDSAMLSA-N Asn-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(N)=O DJIMLSXHXKWADV-CIUDSAMLSA-N 0.000 description 1
- XTMZYFMTYJNABC-ZLUOBGJFSA-N Asn-Ser-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)N)N XTMZYFMTYJNABC-ZLUOBGJFSA-N 0.000 description 1
- FMNBYVSGRCXWEK-FOHZUACHSA-N Asn-Thr-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O FMNBYVSGRCXWEK-FOHZUACHSA-N 0.000 description 1
- PUUPMDXIHCOPJU-HJGDQZAQSA-N Asn-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O PUUPMDXIHCOPJU-HJGDQZAQSA-N 0.000 description 1
- BCADFFUQHIMQAA-KKHAAJSZSA-N Asn-Thr-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O BCADFFUQHIMQAA-KKHAAJSZSA-N 0.000 description 1
- BUVNWKQBMZLCDW-UGYAYLCHSA-N Asp-Asn-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BUVNWKQBMZLCDW-UGYAYLCHSA-N 0.000 description 1
- VAWNQIGQPUOPQW-ACZMJKKPSA-N Asp-Glu-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O VAWNQIGQPUOPQW-ACZMJKKPSA-N 0.000 description 1
- LBFYTUPYYZENIR-GHCJXIJMSA-N Asp-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)O)N LBFYTUPYYZENIR-GHCJXIJMSA-N 0.000 description 1
- ORRJQLIATJDMQM-HJGDQZAQSA-N Asp-Leu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O ORRJQLIATJDMQM-HJGDQZAQSA-N 0.000 description 1
- UXRVDHVARNBOIO-QSFUFRPTSA-N Asp-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CC(=O)O)N UXRVDHVARNBOIO-QSFUFRPTSA-N 0.000 description 1
- 229920000742 Cotton Polymers 0.000 description 1
- XCDDSPYIMNXECQ-NAKRPEOUSA-N Cys-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CS XCDDSPYIMNXECQ-NAKRPEOUSA-N 0.000 description 1
- JLZCAZJGWNRXCI-XKBZYTNZSA-N Cys-Thr-Glu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O JLZCAZJGWNRXCI-XKBZYTNZSA-N 0.000 description 1
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 1
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 1
- 239000003109 Disodium ethylene diamine tetraacetate Substances 0.000 description 1
- ZGTMUACCHSMWAC-UHFFFAOYSA-L EDTA disodium salt (anhydrous) Chemical compound [Na+].[Na+].OC(=O)CN(CC([O-])=O)CCN(CC(O)=O)CC([O-])=O ZGTMUACCHSMWAC-UHFFFAOYSA-L 0.000 description 1
- 108010042407 Endonucleases Proteins 0.000 description 1
- 102000004533 Endonucleases Human genes 0.000 description 1
- IVCOYUURLWQDJQ-LPEHRKFASA-N Gln-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCC(=O)N)N)C(=O)O IVCOYUURLWQDJQ-LPEHRKFASA-N 0.000 description 1
- KDXKFBSNIJYNNR-YVNDNENWSA-N Gln-Glu-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KDXKFBSNIJYNNR-YVNDNENWSA-N 0.000 description 1
- XUMFMAVDHQDATI-DCAQKATOSA-N Gln-Pro-Arg Chemical compound NC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O XUMFMAVDHQDATI-DCAQKATOSA-N 0.000 description 1
- OKARHJKJTKFQBM-ACZMJKKPSA-N Gln-Ser-Asn Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)N)C(=O)O)N OKARHJKJTKFQBM-ACZMJKKPSA-N 0.000 description 1
- UTOQQOMEJDPDMX-ACZMJKKPSA-N Gln-Ser-Asp Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O UTOQQOMEJDPDMX-ACZMJKKPSA-N 0.000 description 1
- PAOHIZNRJNIXQY-XQXXSGGOSA-N Gln-Thr-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O PAOHIZNRJNIXQY-XQXXSGGOSA-N 0.000 description 1
- OUBUHIODTNUUTC-WDCWCFNPSA-N Gln-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O OUBUHIODTNUUTC-WDCWCFNPSA-N 0.000 description 1
- NCWOMXABNYEPLY-NRPADANISA-N Glu-Ala-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O NCWOMXABNYEPLY-NRPADANISA-N 0.000 description 1
- HPJLZFTUUJKWAJ-JHEQGTHGSA-N Glu-Gly-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O HPJLZFTUUJKWAJ-JHEQGTHGSA-N 0.000 description 1
- ZHNHJYYFCGUZNQ-KBIXCLLPSA-N Glu-Ile-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(O)=O ZHNHJYYFCGUZNQ-KBIXCLLPSA-N 0.000 description 1
- MRWYPDWDZSLWJM-ACZMJKKPSA-N Glu-Ser-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O MRWYPDWDZSLWJM-ACZMJKKPSA-N 0.000 description 1
- GMVCSRBOSIUTFC-FXQIFTODSA-N Glu-Ser-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O GMVCSRBOSIUTFC-FXQIFTODSA-N 0.000 description 1
- RFTVTKBHDXCEEX-WDSKDSINSA-N Glu-Ser-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RFTVTKBHDXCEEX-WDSKDSINSA-N 0.000 description 1
- HMJULNMJWOZNFI-XHNCKOQMSA-N Glu-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)O)N)C(=O)O HMJULNMJWOZNFI-XHNCKOQMSA-N 0.000 description 1
- VNCNWQPIQYAMAK-ACZMJKKPSA-N Glu-Ser-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O VNCNWQPIQYAMAK-ACZMJKKPSA-N 0.000 description 1
- DMYACXMQUABZIQ-NRPADANISA-N Glu-Ser-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O DMYACXMQUABZIQ-NRPADANISA-N 0.000 description 1
- MWTGQXBHVRTCOR-GLLZPBPUSA-N Glu-Thr-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MWTGQXBHVRTCOR-GLLZPBPUSA-N 0.000 description 1
- GPSHCSTUYOQPAI-JHEQGTHGSA-N Glu-Thr-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O GPSHCSTUYOQPAI-JHEQGTHGSA-N 0.000 description 1
- DTLLNDVORUEOTM-WDCWCFNPSA-N Glu-Thr-Lys Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O DTLLNDVORUEOTM-WDCWCFNPSA-N 0.000 description 1
- MXJYXYDREQWUMS-XKBZYTNZSA-N Glu-Thr-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O MXJYXYDREQWUMS-XKBZYTNZSA-N 0.000 description 1
- WGYHAAXZWPEBDQ-IFFSRLJSSA-N Glu-Val-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WGYHAAXZWPEBDQ-IFFSRLJSSA-N 0.000 description 1
- FMNHBTKMRFVGRO-FOHZUACHSA-N Gly-Asn-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)CN FMNHBTKMRFVGRO-FOHZUACHSA-N 0.000 description 1
- BPQYBFAXRGMGGY-LAEOZQHASA-N Gly-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)CN BPQYBFAXRGMGGY-LAEOZQHASA-N 0.000 description 1
- LPCKHUXOGVNZRS-YUMQZZPRSA-N Gly-His-Ser Chemical compound [H]NCC(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O LPCKHUXOGVNZRS-YUMQZZPRSA-N 0.000 description 1
- UUYBFNKHOCJCHT-VHSXEESVSA-N Gly-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN UUYBFNKHOCJCHT-VHSXEESVSA-N 0.000 description 1
- NNCSJUBVFBDDLC-YUMQZZPRSA-N Gly-Leu-Ser Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O NNCSJUBVFBDDLC-YUMQZZPRSA-N 0.000 description 1
- SSFWXSNOKDZNHY-QXEWZRGKSA-N Gly-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN SSFWXSNOKDZNHY-QXEWZRGKSA-N 0.000 description 1
- TVTZEOHWHUVYCG-KYNKHSRBSA-N Gly-Thr-Thr Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O TVTZEOHWHUVYCG-KYNKHSRBSA-N 0.000 description 1
- DKJWUIYLMLUBDX-XPUUQOCRSA-N Gly-Val-Cys Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CS)C(=O)O DKJWUIYLMLUBDX-XPUUQOCRSA-N 0.000 description 1
- 102000057621 Glycerol kinases Human genes 0.000 description 1
- 108700016170 Glycerol kinases Proteins 0.000 description 1
- 239000004471 Glycine Substances 0.000 description 1
- 102000002812 Heat-Shock Proteins Human genes 0.000 description 1
- 108010004889 Heat-Shock Proteins Proteins 0.000 description 1
- VIJMRAIWYWRXSR-CIUDSAMLSA-N His-Ser-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CN=CN1 VIJMRAIWYWRXSR-CIUDSAMLSA-N 0.000 description 1
- WSAILOWUJZEAGC-DCAQKATOSA-N His-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N WSAILOWUJZEAGC-DCAQKATOSA-N 0.000 description 1
- DURWCDDDAWVPOP-JBDRJPRFSA-N Ile-Cys-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(=O)O)N DURWCDDDAWVPOP-JBDRJPRFSA-N 0.000 description 1
- BBQABUDWDUKJMB-LZXPERKUSA-N Ile-Ile-Ile Chemical compound CC[C@H](C)[C@H]([NH3+])C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C([O-])=O BBQABUDWDUKJMB-LZXPERKUSA-N 0.000 description 1
- QZZIBQZLWBOOJH-PEDHHIEDSA-N Ile-Ile-Val Chemical compound N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(=O)O QZZIBQZLWBOOJH-PEDHHIEDSA-N 0.000 description 1
- CKRFDMPBSWYOBT-PPCPHDFISA-N Ile-Lys-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N CKRFDMPBSWYOBT-PPCPHDFISA-N 0.000 description 1
- FGBRXCZYVRFNKQ-MXAVVETBSA-N Ile-Phe-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)O)N FGBRXCZYVRFNKQ-MXAVVETBSA-N 0.000 description 1
- RQJUKVXWAKJDBW-SVSWQMSJSA-N Ile-Ser-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N RQJUKVXWAKJDBW-SVSWQMSJSA-N 0.000 description 1
- ANTFEOSJMAUGIB-KNZXXDILSA-N Ile-Thr-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@@H]1C(=O)O)N ANTFEOSJMAUGIB-KNZXXDILSA-N 0.000 description 1
- PMGDADKJMCOXHX-UHFFFAOYSA-N L-Arginyl-L-glutamine acetate Natural products NC(=N)NCCCC(N)C(=O)NC(CCC(N)=O)C(O)=O PMGDADKJMCOXHX-UHFFFAOYSA-N 0.000 description 1
- XUJNEKJLAYXESH-REOHCLBHSA-N L-Cysteine Chemical compound SC[C@H](N)C(O)=O XUJNEKJLAYXESH-REOHCLBHSA-N 0.000 description 1
- ONIBWKKTOPOVIA-BYPYZUCNSA-N L-Proline Chemical compound OC(=O)[C@@H]1CCCN1 ONIBWKKTOPOVIA-BYPYZUCNSA-N 0.000 description 1
- ODKSFYDXXFIFQN-BYPYZUCNSA-P L-argininium(2+) Chemical compound NC(=[NH2+])NCCC[C@H]([NH3+])C(O)=O ODKSFYDXXFIFQN-BYPYZUCNSA-P 0.000 description 1
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 1
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 1
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 1
- HNDVDQJCIGZPNO-YFKPBYRVSA-N L-histidine Chemical compound OC(=O)[C@@H](N)CC1=CN=CN1 HNDVDQJCIGZPNO-YFKPBYRVSA-N 0.000 description 1
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 1
- WNGVUZWBXZKQES-YUMQZZPRSA-N Leu-Ala-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O WNGVUZWBXZKQES-YUMQZZPRSA-N 0.000 description 1
- HPBCTWSUJOGJSH-MNXVOIDGSA-N Leu-Glu-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HPBCTWSUJOGJSH-MNXVOIDGSA-N 0.000 description 1
- BMVFXOQHDQZAQU-DCAQKATOSA-N Leu-Pro-Asp Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)O)C(=O)O)N BMVFXOQHDQZAQU-DCAQKATOSA-N 0.000 description 1
- IDGZVZJLYFTXSL-DCAQKATOSA-N Leu-Ser-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IDGZVZJLYFTXSL-DCAQKATOSA-N 0.000 description 1
- PPGBXYKMUMHFBF-KATARQTJSA-N Leu-Ser-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PPGBXYKMUMHFBF-KATARQTJSA-N 0.000 description 1
- GZRABTMNWJXFMH-UVOCVTCTSA-N Leu-Thr-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GZRABTMNWJXFMH-UVOCVTCTSA-N 0.000 description 1
- MRWXLRGAFDOILG-DCAQKATOSA-N Lys-Gln-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MRWXLRGAFDOILG-DCAQKATOSA-N 0.000 description 1
- CANPXOLVTMKURR-WEDXCCLWSA-N Lys-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN CANPXOLVTMKURR-WEDXCCLWSA-N 0.000 description 1
- IOQWIOPSKJOEKI-SRVKXCTJSA-N Lys-Ser-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O IOQWIOPSKJOEKI-SRVKXCTJSA-N 0.000 description 1
- DIBZLYZXTSVGLN-CIUDSAMLSA-N Lys-Ser-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O DIBZLYZXTSVGLN-CIUDSAMLSA-N 0.000 description 1
- PLOUVAYOMTYJRG-JXUBOQSCSA-N Lys-Thr-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O PLOUVAYOMTYJRG-JXUBOQSCSA-N 0.000 description 1
- RMOKGALPSPOYKE-KATARQTJSA-N Lys-Thr-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O RMOKGALPSPOYKE-KATARQTJSA-N 0.000 description 1
- CAODKDAPYGUMLK-FXQIFTODSA-N Met-Asn-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O CAODKDAPYGUMLK-FXQIFTODSA-N 0.000 description 1
- HOZNVKDCKZPRER-XUXIUFHCSA-N Met-Lys-Ile Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HOZNVKDCKZPRER-XUXIUFHCSA-N 0.000 description 1
- WTHGNAAQXISJHP-AVGNSLFASA-N Met-Lys-Val Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O WTHGNAAQXISJHP-AVGNSLFASA-N 0.000 description 1
- ZDJICAUBMUKVEJ-CIUDSAMLSA-N Met-Ser-Gln Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(N)=O ZDJICAUBMUKVEJ-CIUDSAMLSA-N 0.000 description 1
- DBMLDOWSVHMQQN-XGEHTFHBSA-N Met-Ser-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DBMLDOWSVHMQQN-XGEHTFHBSA-N 0.000 description 1
- CQRGINSEMFBACV-WPRPVWTQSA-N Met-Val-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O CQRGINSEMFBACV-WPRPVWTQSA-N 0.000 description 1
- 108060004795 Methyltransferase Proteins 0.000 description 1
- WUGMRIBZSVSJNP-UHFFFAOYSA-N N-L-alanyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C)C(O)=O)=CNC2=C1 WUGMRIBZSVSJNP-UHFFFAOYSA-N 0.000 description 1
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 1
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 1
- 108010079364 N-glycylalanine Proteins 0.000 description 1
- 108010065395 Neuropep-1 Proteins 0.000 description 1
- 108010038807 Oligopeptides Proteins 0.000 description 1
- 102000015636 Oligopeptides Human genes 0.000 description 1
- 239000001888 Peptone Substances 0.000 description 1
- 108010080698 Peptones Proteins 0.000 description 1
- UEXCHCYDPAIVDE-SRVKXCTJSA-N Phe-Asp-Cys Chemical compound SC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 UEXCHCYDPAIVDE-SRVKXCTJSA-N 0.000 description 1
- UEADQPLTYBWWTG-AVGNSLFASA-N Phe-Glu-Cys Chemical compound SC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 UEADQPLTYBWWTG-AVGNSLFASA-N 0.000 description 1
- CWFGECHCRMGPPT-MXAVVETBSA-N Phe-Ile-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O CWFGECHCRMGPPT-MXAVVETBSA-N 0.000 description 1
- KLXQWABNAWDRAY-ACRUOGEOSA-N Phe-Lys-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 KLXQWABNAWDRAY-ACRUOGEOSA-N 0.000 description 1
- GLJZDMZJHFXJQG-BZSNNMDCSA-N Phe-Ser-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GLJZDMZJHFXJQG-BZSNNMDCSA-N 0.000 description 1
- SJRQWEDYTKYHHL-SLFFLAALSA-N Phe-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CC3=CC=CC=C3)N)C(=O)O SJRQWEDYTKYHHL-SLFFLAALSA-N 0.000 description 1
- XQLBWXHVZVBNJM-FXQIFTODSA-N Pro-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 XQLBWXHVZVBNJM-FXQIFTODSA-N 0.000 description 1
- HFZNNDWPHBRNPV-KZVJFYERSA-N Pro-Ala-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O HFZNNDWPHBRNPV-KZVJFYERSA-N 0.000 description 1
- AHXPYZRZRMQOAU-QXEWZRGKSA-N Pro-Asn-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H]1CCCN1)C(O)=O AHXPYZRZRMQOAU-QXEWZRGKSA-N 0.000 description 1
- SGCZFWSQERRKBD-BQBZGAKWSA-N Pro-Asp-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(O)=O)NC(=O)[C@@H]1CCCN1 SGCZFWSQERRKBD-BQBZGAKWSA-N 0.000 description 1
- XZONQWUEBAFQPO-HJGDQZAQSA-N Pro-Gln-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XZONQWUEBAFQPO-HJGDQZAQSA-N 0.000 description 1
- LNOWDSPAYBWJOR-PEDHHIEDSA-N Pro-Ile-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LNOWDSPAYBWJOR-PEDHHIEDSA-N 0.000 description 1
- KDBHVPXBQADZKY-GUBZILKMSA-N Pro-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 KDBHVPXBQADZKY-GUBZILKMSA-N 0.000 description 1
- NBDHWLZEMKSVHH-UVBJJODRSA-N Pro-Trp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@@H]3CCCN3 NBDHWLZEMKSVHH-UVBJJODRSA-N 0.000 description 1
- OQSGBXGNAFQGGS-CYDGBPFRSA-N Pro-Val-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O OQSGBXGNAFQGGS-CYDGBPFRSA-N 0.000 description 1
- YDTUEBLEAVANFH-RCWTZXSCSA-N Pro-Val-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 YDTUEBLEAVANFH-RCWTZXSCSA-N 0.000 description 1
- ONIBWKKTOPOVIA-UHFFFAOYSA-N Proline Natural products OC(=O)C1CCCN1 ONIBWKKTOPOVIA-UHFFFAOYSA-N 0.000 description 1
- 108091027981 Response element Proteins 0.000 description 1
- 244000253724 Saccharomyces cerevisiae S288c Species 0.000 description 1
- 235000004905 Saccharomyces cerevisiae S288c Nutrition 0.000 description 1
- 238000012300 Sequence Analysis Methods 0.000 description 1
- WTWGOQRNRFHFQD-JBDRJPRFSA-N Ser-Ala-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WTWGOQRNRFHFQD-JBDRJPRFSA-N 0.000 description 1
- QVOGDCQNGLBNCR-FXQIFTODSA-N Ser-Arg-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O QVOGDCQNGLBNCR-FXQIFTODSA-N 0.000 description 1
- VAUMZJHYZQXZBQ-WHFBIAKZSA-N Ser-Asn-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O VAUMZJHYZQXZBQ-WHFBIAKZSA-N 0.000 description 1
- WXWDPFVKQRVJBJ-CIUDSAMLSA-N Ser-Asn-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CO)N WXWDPFVKQRVJBJ-CIUDSAMLSA-N 0.000 description 1
- WTPKKLMBNBCCNL-ACZMJKKPSA-N Ser-Cys-Glu Chemical compound C(CC(=O)O)[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CO)N WTPKKLMBNBCCNL-ACZMJKKPSA-N 0.000 description 1
- SNVIOQXAHVORQM-WDSKDSINSA-N Ser-Gly-Gln Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O SNVIOQXAHVORQM-WDSKDSINSA-N 0.000 description 1
- GZFAWAQTEYDKII-YUMQZZPRSA-N Ser-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CO GZFAWAQTEYDKII-YUMQZZPRSA-N 0.000 description 1
- UIGMAMGZOJVTDN-WHFBIAKZSA-N Ser-Gly-Ser Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O UIGMAMGZOJVTDN-WHFBIAKZSA-N 0.000 description 1
- ZUDXUJSYCCNZQJ-DCAQKATOSA-N Ser-His-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CO)N ZUDXUJSYCCNZQJ-DCAQKATOSA-N 0.000 description 1
- SFTZTYBXIXLRGQ-JBDRJPRFSA-N Ser-Ile-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O SFTZTYBXIXLRGQ-JBDRJPRFSA-N 0.000 description 1
- LWMQRHDTXHQQOV-MXAVVETBSA-N Ser-Ile-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LWMQRHDTXHQQOV-MXAVVETBSA-N 0.000 description 1
- UIPXCLNLUUAMJU-JBDRJPRFSA-N Ser-Ile-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O UIPXCLNLUUAMJU-JBDRJPRFSA-N 0.000 description 1
- ZOPISOXXPQNOCO-SVSWQMSJSA-N Ser-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CO)N ZOPISOXXPQNOCO-SVSWQMSJSA-N 0.000 description 1
- YUJLIIRMIAGMCQ-CIUDSAMLSA-N Ser-Leu-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YUJLIIRMIAGMCQ-CIUDSAMLSA-N 0.000 description 1
- MUJQWSAWLLRJCE-KATARQTJSA-N Ser-Leu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MUJQWSAWLLRJCE-KATARQTJSA-N 0.000 description 1
- IXZHZUGGKLRHJD-DCAQKATOSA-N Ser-Leu-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O IXZHZUGGKLRHJD-DCAQKATOSA-N 0.000 description 1
- BYCVMHKULKRVPV-GUBZILKMSA-N Ser-Lys-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O BYCVMHKULKRVPV-GUBZILKMSA-N 0.000 description 1
- OCWWJBZQXGYQCA-DCAQKATOSA-N Ser-Lys-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(O)=O OCWWJBZQXGYQCA-DCAQKATOSA-N 0.000 description 1
- FPCGZYMRFFIYIH-CIUDSAMLSA-N Ser-Lys-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O FPCGZYMRFFIYIH-CIUDSAMLSA-N 0.000 description 1
- LRZLZIUXQBIWTB-KATARQTJSA-N Ser-Lys-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LRZLZIUXQBIWTB-KATARQTJSA-N 0.000 description 1
- ASGYVPAVFNDZMA-GUBZILKMSA-N Ser-Met-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CO)N ASGYVPAVFNDZMA-GUBZILKMSA-N 0.000 description 1
- UGTZYIPOBYXWRW-SRVKXCTJSA-N Ser-Phe-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O UGTZYIPOBYXWRW-SRVKXCTJSA-N 0.000 description 1
- NMZXJDSKEGFDLJ-DCAQKATOSA-N Ser-Pro-Lys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CO)N)C(=O)N[C@@H](CCCCN)C(=O)O NMZXJDSKEGFDLJ-DCAQKATOSA-N 0.000 description 1
- KQNDIKOYWZTZIX-FXQIFTODSA-N Ser-Ser-Arg Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCNC(N)=N KQNDIKOYWZTZIX-FXQIFTODSA-N 0.000 description 1
- VFWQQZMRKFOGLE-ZLUOBGJFSA-N Ser-Ser-Cys Chemical compound C([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)O)N)O VFWQQZMRKFOGLE-ZLUOBGJFSA-N 0.000 description 1
- GYDFRTRSSXOZCR-ACZMJKKPSA-N Ser-Ser-Glu Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O GYDFRTRSSXOZCR-ACZMJKKPSA-N 0.000 description 1
- OZPDGESCTGGNAD-CIUDSAMLSA-N Ser-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CO OZPDGESCTGGNAD-CIUDSAMLSA-N 0.000 description 1
- ILZAUMFXKSIUEF-SRVKXCTJSA-N Ser-Ser-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ILZAUMFXKSIUEF-SRVKXCTJSA-N 0.000 description 1
- CUXJENOFJXOSOZ-BIIVOSGPSA-N Ser-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CO)N)C(=O)O CUXJENOFJXOSOZ-BIIVOSGPSA-N 0.000 description 1
- FLMYSKVSDVHLEW-SVSWQMSJSA-N Ser-Thr-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FLMYSKVSDVHLEW-SVSWQMSJSA-N 0.000 description 1
- ZSDXEKUKQAKZFE-XAVMHZPKSA-N Ser-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N)O ZSDXEKUKQAKZFE-XAVMHZPKSA-N 0.000 description 1
- BDMWLJLPPUCLNV-XGEHTFHBSA-N Ser-Thr-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O BDMWLJLPPUCLNV-XGEHTFHBSA-N 0.000 description 1
- SIEBDTCABMZCLF-XGEHTFHBSA-N Ser-Val-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SIEBDTCABMZCLF-XGEHTFHBSA-N 0.000 description 1
- HSWXBJCBYSWBPT-GUBZILKMSA-N Ser-Val-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CO)C(C)C)C(O)=O HSWXBJCBYSWBPT-GUBZILKMSA-N 0.000 description 1
- 108091081024 Start codon Proteins 0.000 description 1
- XSLXHSYIVPGEER-KZVJFYERSA-N Thr-Ala-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O XSLXHSYIVPGEER-KZVJFYERSA-N 0.000 description 1
- QGXCWPNQVCYJEL-NUMRIWBASA-N Thr-Asn-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O QGXCWPNQVCYJEL-NUMRIWBASA-N 0.000 description 1
- PQLXHSACXPGWPD-GSSVUCPTSA-N Thr-Asn-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PQLXHSACXPGWPD-GSSVUCPTSA-N 0.000 description 1
- VGYBYGQXZJDZJU-XQXXSGGOSA-N Thr-Glu-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O VGYBYGQXZJDZJU-XQXXSGGOSA-N 0.000 description 1
- GKWNLDNXMMLRMC-GLLZPBPUSA-N Thr-Glu-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O GKWNLDNXMMLRMC-GLLZPBPUSA-N 0.000 description 1
- XOTBWOCSLMBGMF-SUSMZKCASA-N Thr-Glu-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XOTBWOCSLMBGMF-SUSMZKCASA-N 0.000 description 1
- LKEKWDJCJSPXNI-IRIUXVKKSA-N Thr-Glu-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 LKEKWDJCJSPXNI-IRIUXVKKSA-N 0.000 description 1
- IMULJHHGAUZZFE-MBLNEYKQSA-N Thr-Gly-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O IMULJHHGAUZZFE-MBLNEYKQSA-N 0.000 description 1
- UBDDORVPVLEECX-FJXKBIBVSA-N Thr-Gly-Met Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCSC)C(O)=O UBDDORVPVLEECX-FJXKBIBVSA-N 0.000 description 1
- GXUWHVZYDAHFSV-FLBSBUHZSA-N Thr-Ile-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GXUWHVZYDAHFSV-FLBSBUHZSA-N 0.000 description 1
- KZSYAEWQMJEGRZ-RHYQMDGZSA-N Thr-Leu-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O KZSYAEWQMJEGRZ-RHYQMDGZSA-N 0.000 description 1
- CJXURNZYNHCYFD-WDCWCFNPSA-N Thr-Lys-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O CJXURNZYNHCYFD-WDCWCFNPSA-N 0.000 description 1
- JWQNAFHCXKVZKZ-UVOCVTCTSA-N Thr-Lys-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JWQNAFHCXKVZKZ-UVOCVTCTSA-N 0.000 description 1
- WRUWXBBEFUTJOU-XGEHTFHBSA-N Thr-Met-Ser Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)O)N)O WRUWXBBEFUTJOU-XGEHTFHBSA-N 0.000 description 1
- WYLAVUAWOUVUCA-XVSYOHENSA-N Thr-Phe-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O WYLAVUAWOUVUCA-XVSYOHENSA-N 0.000 description 1
- OLFOOYQTTQSSRK-UNQGMJICSA-N Thr-Pro-Phe Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OLFOOYQTTQSSRK-UNQGMJICSA-N 0.000 description 1
- NBIIPOKZPUGATB-BWBBJGPYSA-N Thr-Ser-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)O)N)O NBIIPOKZPUGATB-BWBBJGPYSA-N 0.000 description 1
- IEZVHOULSUULHD-XGEHTFHBSA-N Thr-Ser-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O IEZVHOULSUULHD-XGEHTFHBSA-N 0.000 description 1
- QYDKSNXSBXZPFK-ZJDVBMNYSA-N Thr-Thr-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QYDKSNXSBXZPFK-ZJDVBMNYSA-N 0.000 description 1
- BBPCSGKKPJUYRB-UVOCVTCTSA-N Thr-Thr-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O BBPCSGKKPJUYRB-UVOCVTCTSA-N 0.000 description 1
- ZMYCLHFLHRVOEA-HEIBUPTGSA-N Thr-Thr-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O ZMYCLHFLHRVOEA-HEIBUPTGSA-N 0.000 description 1
- CSZFFQBUTMGHAH-UAXMHLISSA-N Thr-Thr-Trp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N)O CSZFFQBUTMGHAH-UAXMHLISSA-N 0.000 description 1
- ZOCJFNXUVSGBQI-HSHDSVGOSA-N Thr-Trp-Arg Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N)O ZOCJFNXUVSGBQI-HSHDSVGOSA-N 0.000 description 1
- LXXCHJKHJYRMIY-FQPOAREZSA-N Thr-Tyr-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O LXXCHJKHJYRMIY-FQPOAREZSA-N 0.000 description 1
- CYCGARJWIQWPQM-YJRXYDGGSA-N Thr-Tyr-Ser Chemical compound C[C@@H](O)[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CO)C([O-])=O)CC1=CC=C(O)C=C1 CYCGARJWIQWPQM-YJRXYDGGSA-N 0.000 description 1
- MNYNCKZAEIAONY-XGEHTFHBSA-N Thr-Val-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O MNYNCKZAEIAONY-XGEHTFHBSA-N 0.000 description 1
- VYVBSMCZNHOZGD-RCWTZXSCSA-N Thr-Val-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(O)=O VYVBSMCZNHOZGD-RCWTZXSCSA-N 0.000 description 1
- HDTRYLNUVZCQOY-WSWWMNSNSA-N Trehalose Natural products O[C@@H]1[C@@H](O)[C@@H](O)[C@@H](CO)O[C@@H]1O[C@@H]1[C@H](O)[C@@H](O)[C@@H](O)[C@@H](CO)O1 HDTRYLNUVZCQOY-WSWWMNSNSA-N 0.000 description 1
- VIWQOOBRKCGSDK-RYQLBKOJSA-N Trp-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N)C(=O)O VIWQOOBRKCGSDK-RYQLBKOJSA-N 0.000 description 1
- XKKBFNPJFZLTMY-CWRNSKLLSA-N Trp-Cys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CS)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N)C(=O)O XKKBFNPJFZLTMY-CWRNSKLLSA-N 0.000 description 1
- JVTHMUDOKPQBOT-NSHDSACASA-N Trp-Gly-Gly Chemical compound C1=CC=C2C(C[C@H]([NH3+])C(=O)NCC(=O)NCC([O-])=O)=CNC2=C1 JVTHMUDOKPQBOT-NSHDSACASA-N 0.000 description 1
- XLVRTKPAIXJYOH-HOCLYGCPSA-N Trp-His-Gly Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CN=CN3)C(=O)NCC(=O)O)N XLVRTKPAIXJYOH-HOCLYGCPSA-N 0.000 description 1
- 108060008682 Tumor Necrosis Factor Proteins 0.000 description 1
- YYLHVUCSTXXKBS-IHRRRGAJSA-N Tyr-Pro-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O YYLHVUCSTXXKBS-IHRRRGAJSA-N 0.000 description 1
- XGZBEGGGAUQBMB-KJEVXHAQSA-N Tyr-Pro-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC2=CC=C(C=C2)O)N)O XGZBEGGGAUQBMB-KJEVXHAQSA-N 0.000 description 1
- UMSZZGTXGKHTFJ-SRVKXCTJSA-N Tyr-Ser-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 UMSZZGTXGKHTFJ-SRVKXCTJSA-N 0.000 description 1
- KLQPIEVIKOQRAW-IZPVPAKOSA-N Tyr-Thr-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N)O KLQPIEVIKOQRAW-IZPVPAKOSA-N 0.000 description 1
- AUMNPAUHKUNHHN-BYULHYEWSA-N Val-Asn-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N AUMNPAUHKUNHHN-BYULHYEWSA-N 0.000 description 1
- HIZMLPKDJAXDRG-FXQIFTODSA-N Val-Cys-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(=O)O)N HIZMLPKDJAXDRG-FXQIFTODSA-N 0.000 description 1
- UKEVLVBHRKWECS-LSJOCFKGSA-N Val-Ile-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](C(C)C)N UKEVLVBHRKWECS-LSJOCFKGSA-N 0.000 description 1
- JZWZACGUZVCQPS-RNJOBUHISA-N Val-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C(C)C)N JZWZACGUZVCQPS-RNJOBUHISA-N 0.000 description 1
- UGFMVXRXULGLNO-XPUUQOCRSA-N Val-Ser-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O UGFMVXRXULGLNO-XPUUQOCRSA-N 0.000 description 1
- UJMCYJKPDFQLHX-XGEHTFHBSA-N Val-Ser-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](C(C)C)N)O UJMCYJKPDFQLHX-XGEHTFHBSA-N 0.000 description 1
- NZYNRRGJJVSSTJ-GUBZILKMSA-N Val-Ser-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O NZYNRRGJJVSSTJ-GUBZILKMSA-N 0.000 description 1
- WUFHZIRMAZZWRS-OSUNSFLBSA-N Val-Thr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](C(C)C)N WUFHZIRMAZZWRS-OSUNSFLBSA-N 0.000 description 1
- PMKQKNBISAOSRI-XHSDSOJGSA-N Val-Tyr-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N2CCC[C@@H]2C(=O)O)N PMKQKNBISAOSRI-XHSDSOJGSA-N 0.000 description 1
- 241000584803 Xanthosia rotundifolia Species 0.000 description 1
- 238000010521 absorption reaction Methods 0.000 description 1
- 230000004913 activation Effects 0.000 description 1
- 108010047495 alanylglycine Proteins 0.000 description 1
- 108010011559 alanylphenylalanine Proteins 0.000 description 1
- 125000003158 alcohol group Chemical group 0.000 description 1
- HDTRYLNUVZCQOY-LIZSDCNHSA-N alpha,alpha-trehalose Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CO)O[C@@H]1O[C@@H]1[C@H](O)[C@@H](O)[C@H](O)[C@@H](CO)O1 HDTRYLNUVZCQOY-LIZSDCNHSA-N 0.000 description 1
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 1
- 230000006229 amino acid addition Effects 0.000 description 1
- 125000000539 amino acid group Chemical group 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 1
- 108010008355 arginyl-glutamine Proteins 0.000 description 1
- -1 aromatic amino acid Chemical class 0.000 description 1
- 235000003704 aspartic acid Nutrition 0.000 description 1
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 1
- 108010047857 aspartylglycine Proteins 0.000 description 1
- 239000011324 bead Substances 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 1
- 230000004071 biological effect Effects 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 238000010367 cloning Methods 0.000 description 1
- 150000001875 compounds Chemical class 0.000 description 1
- 238000004590 computer program Methods 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 230000008878 coupling Effects 0.000 description 1
- 238000010168 coupling process Methods 0.000 description 1
- 238000005859 coupling reaction Methods 0.000 description 1
- 230000001186 cumulative effect Effects 0.000 description 1
- 108010060199 cysteinylproline Proteins 0.000 description 1
- SUYVUBYJARFZHO-RRKCRQDMSA-N dATP Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-RRKCRQDMSA-N 0.000 description 1
- SUYVUBYJARFZHO-UHFFFAOYSA-N dATP Natural products C1=NC=2C(N)=NC=NC=2N1C1CC(O)C(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-UHFFFAOYSA-N 0.000 description 1
- RGWHQCVHVJXOKC-SHYZEUOFSA-J dCTP(4-) Chemical compound O=C1N=C(N)C=CN1[C@@H]1O[C@H](COP([O-])(=O)OP([O-])(=O)OP([O-])([O-])=O)[C@@H](O)C1 RGWHQCVHVJXOKC-SHYZEUOFSA-J 0.000 description 1
- HAAZLUGHYHWQIW-KVQBGUIXSA-N dGTP Chemical compound C1=NC=2C(=O)NC(N)=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 HAAZLUGHYHWQIW-KVQBGUIXSA-N 0.000 description 1
- NHVNXKFIZYSCEB-XLPZGREQSA-N dTTP Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)C1 NHVNXKFIZYSCEB-XLPZGREQSA-N 0.000 description 1
- 238000013016 damping Methods 0.000 description 1
- 239000008367 deionised water Substances 0.000 description 1
- 229910021641 deionized water Inorganic materials 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 239000008121 dextrose Substances 0.000 description 1
- 230000029087 digestion Effects 0.000 description 1
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 1
- 235000019301 disodium ethylene diamine tetraacetate Nutrition 0.000 description 1
- 239000012153 distilled water Substances 0.000 description 1
- 238000009826 distribution Methods 0.000 description 1
- 238000004520 electroporation Methods 0.000 description 1
- OCLXJTCGWSSVOE-UHFFFAOYSA-N ethanol etoh Chemical compound CCO.CCO OCLXJTCGWSSVOE-UHFFFAOYSA-N 0.000 description 1
- 230000000763 evoking effect Effects 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 239000012530 fluid Substances 0.000 description 1
- 239000000446 fuel Substances 0.000 description 1
- 238000010230 functional analysis Methods 0.000 description 1
- 238000001502 gel electrophoresis Methods 0.000 description 1
- 230000002068 genetic effect Effects 0.000 description 1
- 239000011521 glass Substances 0.000 description 1
- 239000003292 glue Substances 0.000 description 1
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 1
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 1
- 235000011187 glycerol Nutrition 0.000 description 1
- 230000013595 glycosylation Effects 0.000 description 1
- 238000006206 glycosylation reaction Methods 0.000 description 1
- 108010077515 glycylproline Proteins 0.000 description 1
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 1
- 108010040030 histidinoalanine Proteins 0.000 description 1
- 230000002218 hypoglycaemic effect Effects 0.000 description 1
- 230000008676 import Effects 0.000 description 1
- 230000000977 initiatory effect Effects 0.000 description 1
- 150000002500 ions Chemical class 0.000 description 1
- 230000001678 irradiating effect Effects 0.000 description 1
- 108010060857 isoleucyl-valyl-tyrosine Proteins 0.000 description 1
- XUWPJKDMEZSVTP-LTYMHZPRSA-N kalafungina Chemical compound O=C1C2=C(O)C=CC=C2C(=O)C2=C1[C@@H](C)O[C@H]1[C@@H]2OC(=O)C1 XUWPJKDMEZSVTP-LTYMHZPRSA-N 0.000 description 1
- 239000003550 marker Substances 0.000 description 1
- 239000011159 matrix material Substances 0.000 description 1
- 238000002844 melting Methods 0.000 description 1
- 230000008018 melting Effects 0.000 description 1
- 229930182817 methionine Natural products 0.000 description 1
- 238000010369 molecular cloning Methods 0.000 description 1
- 238000012544 monitoring process Methods 0.000 description 1
- 238000007899 nucleic acid hybridization Methods 0.000 description 1
- 230000000050 nutritive effect Effects 0.000 description 1
- LCCNCVORNKJIRZ-UHFFFAOYSA-N parathion Chemical compound CCOP(=S)(OCC)OC1=CC=C([N+]([O-])=O)C=C1 LCCNCVORNKJIRZ-UHFFFAOYSA-N 0.000 description 1
- 235000019319 peptone Nutrition 0.000 description 1
- 108010051242 phenylalanylserine Proteins 0.000 description 1
- 230000026731 phosphorylation Effects 0.000 description 1
- 238000006366 phosphorylation reaction Methods 0.000 description 1
- 231100000614 poison Toxicity 0.000 description 1
- 230000007096 poisonous effect Effects 0.000 description 1
- 229920000642 polymer Polymers 0.000 description 1
- 108091033319 polynucleotide Proteins 0.000 description 1
- 102000040430 polynucleotide Human genes 0.000 description 1
- 239000002157 polynucleotide Substances 0.000 description 1
- 239000000843 powder Substances 0.000 description 1
- 238000004321 preservation Methods 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 108010004914 prolylarginine Proteins 0.000 description 1
- 239000012264 purified product Substances 0.000 description 1
- 238000000197 pyrolysis Methods 0.000 description 1
- 230000001105 regulatory effect Effects 0.000 description 1
- 102000037983 regulatory factors Human genes 0.000 description 1
- 108091008025 regulatory factors Proteins 0.000 description 1
- 230000003362 replicative effect Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 230000003938 response to stress Effects 0.000 description 1
- 108010026333 seryl-proline Proteins 0.000 description 1
- 108010027322 single cell proteins Proteins 0.000 description 1
- 238000002741 site-directed mutagenesis Methods 0.000 description 1
- 238000010561 standard procedure Methods 0.000 description 1
- 150000008163 sugars Chemical class 0.000 description 1
- 239000006228 supernatant Substances 0.000 description 1
- 239000000725 suspension Substances 0.000 description 1
- 230000009261 transgenic effect Effects 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- 108010058119 tryptophyl-glycyl-glycine Proteins 0.000 description 1
- 229910021642 ultra pure water Inorganic materials 0.000 description 1
- 239000012498 ultrapure water Substances 0.000 description 1
- 238000005406 washing Methods 0.000 description 1
- 239000002351 wastewater Substances 0.000 description 1
Landscapes
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
Abstract
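The present invention provides a method for obtaining the flocculation gene of a flocculating yeast, the flocculation gene thus obtained and the protein it encodes, an expression vector containing the flocculation gene, a strain containing the expression vector, and new production strains constructed with this flocculation gene under the control of different promoters.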
Description
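Technical Field
The present invention relates to FLOsc, a flocculation gene of a self-flocculating yeast, to its product, and to uses thereof.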
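Background Art
The self-flocculating yeast is Saccharomyces cerevisiae flo, a yeast strain with good flocculation characteristics and excellent fermentation performance that was bred in the laboratory of Professor Bai Fengwu at Dalian University of Technology (Self-flocculating yeast variant obtained by domestication and breeding, and its use; Chinese invention patent 200610025259.7). Using the self-flocculating yeast for ethanol fermentation raises the cell density in the fermenter, shortens the fermentation time and increases productivity, while reducing the energy consumed in ethanol separation and lowering the cost of ethanol production. Because this strain is a protoplast fusion strain, obtaining its flocculation gene, studying the function of that gene, and going on to construct new production strains with self-flocculating characteristics are of considerable significance.
Existing methods for obtaining self-flocculating yeast include natural isolation, protoplast fusion, and genetic engineering by introduction of a flocculation gene. Natural isolation is somewhat random, and the isolated yeast has an unclear genetic background and fermentation traits that are hard to predict. Protoplast fusion requires screening a large number of fusants and is difficult to carry out when the parental yeasts carry no selectable marker. Transgenic breeding of flocculating yeast with a flocculation gene, by contrast, allows a parental yeast with excellent fermentation performance to be chosen, so that superior flocculating yeast can be bred in a directed manner.
Flocculation genes have so far been obtained either by PCR (Agric. Biol. Chem. 1991, 55:1547-1552) or by digesting the genome of a flocculating yeast with restriction enzymes, ligating fragments of about 4-6 kb, and identifying the gene by functional verification (Acta Microbiologica Sinica (微生物学报), 2002, 42:110-113). Conventional PCR amplification suffers from fidelity problems and becomes difficult for fragments longer than 3 kb; even when a product is obtained, the gene may carry point mutations, so it cannot be guaranteed to represent the sequence of the original strain. Moreover, because flocculation genes contain very long repeated sequences, they are hard to amplify by PCR. In addition, because the gene is large and the promoter region of FLO1 contains a binding region of about 5 kb for a variety of repressor proteins (EMBO J. 2001, 20(18):5219-31), it is difficult to package the complete fragment when constructing an expression vector for functional complementation, which compromises functional identification. Functional screening also requires culturing a large number of recombinants and observing their phenotypes, which is very laborious, so the gene is also hard to obtain by functional complementation.
There therefore remains a need in the art for a method of obtaining the flocculation gene of a flocculating yeast that offers high fidelity and rapid screening.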
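Summary of the Invention
The present inventors were the first to construct a Fosmid genomic library and to screen it by PCR with conserved probes for flocculation genes, thereby obtaining a new flocculation gene. Compared with the conventional SuperCos vector, Fosmid library construction involves no restriction digestion, which avoids bias toward restriction sites, and the low copy number of the vector makes the library more stable. Rapid PCR screening yielded the full length of the flocculation gene. The gene is as long as 8 kb; a gene of this length is very hard to obtain by conventional PCR, and equally hard to obtain by ligating 4-6 kb fragments for functional complementation. The present invention was completed on this basis.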
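One object of the present application is to provide an isolated nucleic acid selected from: (a) a nucleic acid containing the nucleotide sequence shown in SEQ ID NO:1 or 3; and (b) a nucleic acid that has at least 75% sequence identity to the nucleic acid of (a) and retains the flocculation function of SEQ ID NO:1 or SEQ ID NO:3. The nucleic acid may be selected from SEQ ID NO:1 or SEQ ID NO:3.
Another object of the present application is to provide a protein selected from: (i) a protein containing the amino acid sequence shown in SEQ ID NO:2 or SEQ ID NO:4; and (ii) a protein derived from (i) by substitution, deletion or addition of one or several amino acids in the amino acid sequence of the protein of (i) and having flocculation function. The protein may be selected from the proteins shown in SEQ ID NO:2 or SEQ ID NO:4.
The present application also includes nucleic acids encoding the proteins described in the present application.
A further object of the present application is to provide an expression vector containing a nucleic acid sequence described in the present application, for example the nucleic acid sequence of item (a) or (b) above, or a nucleic acid sequence encoding the protein of item (i) or (ii) above.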
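The expression vector of the present application may contain the TPS1 promoter or the PGK1 promoter. With the TPS1 promoter, a flocculating yeast transformed with the expression vector expresses the flocculation gene inducibly; with the PGK1 promoter, a flocculating yeast transformed with the expression vector expresses the flocculation gene constitutively. Inducible expression means that the flocculation gene is expressed under certain inducing conditions (for example, in the presence of a certain amount of ethanol), whereas constitutive expression means that no inducing conditions are needed and the cells express the flocculation gene from the start of growth.
A further object of the present application is to provide a flocculating yeast containing the expression vector described in the present application.
The flocculating yeasts of the present application include the Saccharomyces cerevisiae strains with accession numbers CGMCC NO:3408 and CGMCC NO:3409. Both strains were deposited on November 5, 2009 with the China General Microbiological Culture Collection Center (CGMCC; No. 3, Courtyard 1, Beichen West Road, Chaoyang District, Beijing, China; postal code 100101).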
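A further object of the present application is to provide a method for obtaining the full-length flocculation gene of a flocculating yeast, comprising the following steps: (1) constructing a genomic library of the flocculating yeast in a Fosmid vector with inserts of about 35-40 kb; (2) transfecting bacteria with the resulting library, plating, and, once the library has been verified as satisfactory, picking single clones from the plates and culturing them in medium; (3) extracting DNA from the cultured single clones, performing PCR amplification, and analyzing the PCR products to obtain positive clones containing the flocculation gene; and (4) sequencing the positive clones to obtain the flocculation gene of the flocculating yeast. The bacteria used may be, for example, Escherichia coli.
Yet another object of the present application is to provide a method for producing a flocculation protein, comprising: constructing an expression vector of the present application, transforming a flocculating yeast with the expression vector, and culturing the transformed flocculating yeast under conditions that allow it to express the flocculation protein, thereby producing the flocculation protein.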
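Brief Description of the Drawings
Figure 1 shows a comparison of the C-terminus of the flocculation protein with the protein of the model strain.
Figure 2 shows the flocculation phenotype of the flocculating yeast (left) and of the disruptant (right).
Figure 3 shows the electrophoresis image of the PGK1 promoter of S288C.
Figure 4 shows the electrophoresis image of the TPS1 promoter of S288C.
Figure 5 shows the construction of the flocculation gene expression vectors.
Figure 6 shows the flocculation morphology of the transgenic flocculating yeasts: a, constitutively flocculating yeast BHL01; b, inducibly flocculating yeast ZLH01; c, free (non-flocculating) host yeast containing the empty vector; d, wild-type flocculating yeast S. cerevisiae flo.
Figure 7 compares the flocculation phenotype of the inducibly flocculating yeast ZLH01 at different added ethanol concentrations. The concentrations marked below the panels are ethanol concentrations (0-8%); the upper row shows the whole culture and the lower row shows the tube bottom, revealing the settled cells.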
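Detailed Description of Embodiments
The present application provides a method for obtaining the flocculation gene of a flocculating yeast, comprising the following steps: (1) constructing a genomic library of the flocculating yeast in a Fosmid vector with inserts of about 35-40 kb; (2) transfecting bacteria, for example Escherichia coli, with the resulting library, plating, and, once the library has been verified as satisfactory, picking single clones from the plates and culturing them in medium; (3) extracting DNA from the cultured single clones, performing PCR amplification, and analyzing the PCR products to obtain positive clones containing the flocculation gene; and (4) sequencing the positive clones to obtain the flocculation gene of the flocculating yeast.
Construction of the flocculating yeast genomic library may include extracting genomic DNA from the flocculating yeast, preparing the inserts of about 35-40 kb, and ligating the inserts to the Fosmid vector. Fosmid vectors are available from various commercial sources; for example, the Fosmid vector supplied with the Copycontrol Fosmid Library Production Kit (Epicentre, USA) may be used. Genomic DNA extraction and insert preparation may be carried out by conventional methods. Before ligation to the Fosmid vector, the inserts may first be blunt-ended with Klenow fragment. The blunt-ended DNA fragments may further be purified by phenol-chloroform extraction and ethanol precipitation. The purified DNA fragments may be confirmed by pulsed-field gel electrophoresis.
The resulting library may be packaged, transfected and plated, and then checked for quality. Packaging may be carried out in vitro with commercially available phage packaging proteins, such as those of the Copycontrol Fosmid Library Production Kit (Epicentre). E. coli may then be transfected and spread on LB plates containing chloramphenicol. Single clones are picked at random, inoculated and grown overnight; DNA is extracted by alkaline lysis and digested with NotI for verification, and the insert length is checked by pulsed-field gel electrophoresis. Whether the library is satisfactory is judged from the plating results and the insert lengths.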
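The text does not state a numerical acceptance criterion for the library. One common way to relate the two quantities it names (number of clones on the plates and insert length) is the Clarke-Carbon estimate N = ln(1 - P) / ln(1 - f), where f is the insert length divided by the genome size and P is the probability that any given locus is represented. The sketch below is illustrative only: the function names are hypothetical, and the genome size of about 12,000 kb for S. cerevisiae is an assumption that does not come from this document.

```python
import math

def clones_needed(insert_kb: float, genome_kb: float, probability: float = 0.99) -> int:
    """Clarke-Carbon estimate of the number of independent clones needed so that
    a given locus is present in the library with the requested probability."""
    fraction = insert_kb / genome_kb
    if not 0 < fraction < 1:
        raise ValueError("insert must be shorter than the genome")
    if not 0 < probability < 1:
        raise ValueError("probability must lie strictly between 0 and 1")
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))

def coverage_probability(n_clones: int, insert_kb: float, genome_kb: float) -> float:
    """Probability that a given locus is represented among n_clones clones."""
    return 1 - (1 - insert_kb / genome_kb) ** n_clones

# Assumed genome size (not stated in the document): about 12,000 kb.
GENOME_KB = 12_000
for insert_kb in (35, 40):
    n = clones_needed(insert_kb, GENOME_KB)
    print(f"{insert_kb} kb inserts: about {n} clones for 99% coverage")
```

Comparing the colony count from plating with such an estimate, together with the NotI-verified insert sizes, is one possible way to make the quality check quantitative.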
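PCR amplification preferably uses the primers shown in SEQ ID NOS:5-8. The amplified product may be sequenced by a genetic manipulation in which transposon Tn5 is inserted at random into the target vector and the primer sites at the two ends of the transposon are used for sequencing. It is worth noting that, because the flocculation gene contains a very long internal repeat region, it is hard to sequence with conventional sequencing techniques; the transposon approach overcomes this difficulty and yields an accurate sequence for a gene containing long repeats.
The function of the flocculation gene obtained may be analyzed as follows: a DNA fragment of interest carrying a selectable marker and regions homologous to the yeast genome is obtained by PCR and introduced into the self-flocculating yeast by electroporation, so that the selectable marker undergoes homologous recombination with the yeast FLOsc gene and disrupts its function, giving a non-flocculating strain. Since flocculation is lost after disruption of the FLOsc gene, this shows that the gene is responsible for the flocculation phenotype of the cells.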
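The present application also relates to exogenous constitutive expression of the flocculation gene: the promoter of 3-phosphoglycerate kinase (PGK1) is used to drive expression of the flocculation gene, and an integrative expression vector is constructed and integrated into the HO locus of a yeast without the flocculation trait, giving a new generation of flocculating yeast with improved fermentation performance and thereby achieving exogenous constitutive expression of the flocculation gene. Because an integrative vector is used, the recombinant yeast needs no antibiotic selection during culture and avoids the tendency of replicative vectors to be lost; it is genetically stable and flocculates stably for more than ten passages.
The present application also relates to exogenous conditional expression of the flocculation gene: the promoter of trehalose synthase (TPS1) is used to drive expression of the flocculation gene, and an expression vector is constructed and integrated into the HO locus of a yeast without the flocculation trait. The resulting yeast transformants begin to flocculate once ethanol production reaches about 3%, which avoids the inhibition of growth and ethanol fermentation caused by flocculation that is too early and too strong.
The present application has obtained the flocculation gene of a self-flocculating variant of Saccharomyces cerevisiae and has constructed a new generation of constitutively and inducibly flocculating yeasts, improving the thermotolerance and fermentation efficiency of the strains.
The flocculation gene of the present application can be used to construct yeast strains for other fields, such as adsorption of heavy metal ions, expression of pharmaceutical proteins, and production of single-cell protein.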
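The flocculation genes of the present application are shown in SEQ ID NO:1 and 3. The present application includes isolated nucleic acids containing the DNA sequence shown in SEQ ID NO:1 or SEQ ID NO:3. The term "isolated" refers to a preparation of the substance it modifies that lacks at least some of the other components that may accompany that substance, or a similar substance, in its natural state or when it was initially prepared.
The terms "polypeptide" and "protein" refer to a polymer of amino acid residues and are not limited to a minimum length of the product. Thus peptides, oligopeptides, dimers, multimers and the like are included in the definition. Both full-length proteins and fragments thereof are encompassed by the definition. The terms also include post-expression modifications of the polypeptide, for example glycosylation, acetylation and phosphorylation. In addition, for the purposes of the present invention, "polypeptide" includes modifications of the native sequence, such as deletions, additions and substitutions (generally conservative in nature), so long as the protein maintains the desired activity. These modifications may be designed through site-directed mutagenesis, or may be accidental, for example through mutations of the host that produces the protein or through errors introduced by PCR amplification.
The term "analog" refers to a compound having the sequence and structure of the native polypeptide with one or more amino acid additions, substitutions (generally conservative in nature) and/or deletions relative to the native molecule, so long as the modifications do not destroy the activity of the original polypeptide from which the analog is derived. Methods for making polypeptide analogs and muteins are known in the art and are described further below.
Particularly preferred analogs include substitutions that are conservative in nature, that is, substitutions that take place within a family of amino acids related in their side chains. Specifically, amino acids are generally divided into four families: (1) acidic: aspartic acid and glutamic acid; (2) basic: lysine, arginine and histidine; (3) non-polar: alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine and tryptophan; and (4) uncharged polar: glycine, asparagine, glutamine, cysteine, serine, threonine and tyrosine. Phenylalanine, tryptophan and tyrosine are sometimes classified as aromatic amino acids. For example, it is reasonable to predict that an isolated replacement of leucine with isoleucine or valine, of aspartic acid with glutamic acid, or of threonine with serine, or a similar conservative replacement of an amino acid with a structurally related amino acid, will not have a major effect on biological activity. For example, the polypeptide of interest may include up to about 2-6 conservative or non-conservative amino acid substitutions, or even up to about 5-10 conservative or non-conservative amino acid substitutions, or any integer between 2 and 10, so long as the desired function of the molecule remains intact. One skilled in the art can readily determine the regions of the molecule of interest that can tolerate change by reference to Hopp/Woods and Kyte-Doolittle plots, which are well known in the art.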
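The four side-chain families listed above can be written down directly as a lookup. The following is a minimal sketch, not a method described in this document: the function names are hypothetical, residues are given as one-letter codes, and the two sequences are assumed to be of equal length and already aligned without gaps.

```python
# Side-chain families as listed in the preceding paragraph (one-letter codes).
FAMILIES = {
    "acidic": set("DE"),
    "basic": set("KRH"),
    "non-polar": set("AVLIPFMW"),
    "uncharged polar": set("GNQCSTY"),
}

def family_of(residue: str) -> str:
    """Return the name of the family that contains the residue."""
    for name, members in FAMILIES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"not one of the 20 standard amino acids: {residue!r}")

def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues belong to the same side-chain family."""
    return family_of(original) == family_of(replacement)

def count_substitutions(reference: str, variant: str) -> tuple[int, int]:
    """Return (conservative, non_conservative) substitution counts for two
    equal-length, gap-free sequences (an assumption of this sketch)."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    conservative = non_conservative = 0
    for ref, var in zip(reference, variant):
        if ref == var:
            continue
        if is_conservative(ref, var):
            conservative += 1
        else:
            non_conservative += 1
    return conservative, non_conservative
```

With these families, the replacements named in the text (leucine by isoleucine or valine, aspartic acid by glutamic acid, threonine by serine) are all classified as conservative. Whether a given variant actually retains flocculation activity still has to be shown experimentally.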
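"Identity" or "homology" may be used to define the polypeptide or nucleotide sequences of the invention. "Identity" or "homology" refers to an exact nucleotide-to-nucleotide or amino acid-to-amino acid correspondence between two polynucleotide or polypeptide sequences. Percent identity can be obtained by aligning the sequences of the two molecules and comparing their sequence information directly: the exact number of matches between the two aligned sequences is counted, divided by the length of the shorter sequence, and multiplied by 100.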
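The calculation just described (matches divided by the length of the shorter sequence, times 100) can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, the two inputs are assumed to be already aligned to equal length with "-" marking gaps, and sequence length is taken as the number of non-gap characters.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences: exact matches divided by
    the length of the shorter (ungapped) sequence, multiplied by 100."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # A column counts as a match only if both characters are equal and not a gap.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    shorter = min(len(aligned_a.replace(gap, "")), len(aligned_b.replace(gap, "")))
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * matches / shorter

# Seven of eight aligned positions match, and both sequences are 8 long.
print(percent_identity("ACGTACGT", "ACGTTCGT"))  # 87.5
```

A value computed this way can be compared with the identity thresholds used in the application, such as the 75% stated for the nucleic acid of item (b).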
在同源性和相同性分析中可辅助使用易于获得的计算机程序,如ALIGH、Dayhoff、M.O.(Atlas of Protein Sequence and Structure、M.O.Dayhoff编辑,5 Suppl.,3:353-358,National Biomedical Research Foundation,Washington,DC),它适用于Smith和Waterman分析肽用的局部同源性算法(Advances in Appl.Math.,2:482-489,1981)。可从WisconsinSequence Analysis Package(第8版,从Genetics Computer Group,Madison,WI获得)获得测定核苷酸序列同源性的程序,例如,BESTFIT、FASTA和GAP程序,这些程序也依赖于Smith和Waterman算法。使用制造者建议的和上述Wisconsin Sequence Analysis Package所述的默认参数可容易地使用这些程序。例如,可使用Smith和Warerman的同源性算法的默认计分表和6个核苷酸位置的间隔罚分(gap penalty)测定的核苷酸序列与参比序列的同源性百分数。
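Another method of establishing percent homology in the context of the present invention is to use the MPSRCH package of programs copyrighted by the University of Edinburgh, developed by John F. Collins and Shane S. Sturrok, and distributed by IntelliGenetics, Inc. (Mountain View, CA). The Smith-Waterman algorithm can be used from this suite of packages, with default parameters for the scoring table (for example, gap open penalty = 12, gap extension penalty = 1, gap = 6). The "match" value generated from these data reflects "sequence homology". Other suitable programs for calculating percent identity or percent similarity between sequences are generally known in the art; for example, another alignment program is BLAST, used with default parameters. For example, BLASTN and BLASTP can be used with the following default parameters: genetic code = standard; filter = none; strand = both; cutoff = 60; expect = 10; matrix = BLOSUM62; descriptions = 50 sequences; sort by = HIGH SCORE; databases = non-redundant, GenBank + EMBL + DDBJ + PDB + GenBank CDS translations + Swiss protein + Spupdate + PIR. Details of these programs can be found at http://www.ncbi.nim.gov/cgi-bin/BLAST.
Alternatively, homology can be determined by hybridizing polynucleotides under conditions that form stable duplexes between homologous regions, followed by digestion with single-strand-specific nuclease and size determination of the digested fragments. DNA sequences that are substantially homologous can be identified in a Southern hybridization experiment under, for example, stringent conditions, as defined for the particular system. Defining appropriate hybridization conditions is within the skill of the art. See, for example, Sambrook et al., supra; DNA Cloning, supra; Nucleic Acid Hybridization, supra.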
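Accordingly, this application includes nucleic acids having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity to SEQ ID NO:1 or SEQ ID NO:3. This application also includes nucleic acids that comprise a nucleic acid having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity to SEQ ID NO:1 or SEQ ID NO:3.
This application includes the amino acid sequences shown in SEQ ID NO:2 and 4, and proteins comprising those amino acid sequences. This application also includes amino acid sequences having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity to SEQ ID NO:2 or SEQ ID NO:4.
In another aspect, this application includes nucleic acids encoding a protein that is derived from SEQ ID NO:2 or 4 by substitution, deletion or addition of one or several amino acids in the amino acid sequence defined by SEQ ID NO:2 or 4 and that has flocculin activity.
The proteins of this application also include proteins that are derived from SEQ ID NO:2 or 4 by substitution, deletion or addition of one or several amino acids in the amino acid sequence defined by SEQ ID NO:2 or 4 and that have flocculin activity.
This application also includes the coding sequences of the proteins of this application.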
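This application includes expression vectors containing the nucleotide sequences of this application. The expression vector of this application may contain the nucleotide sequence shown in SEQ ID NO:1 or SEQ ID NO:3, and may contain the TPS1 promoter or the PGK1 promoter. This application also includes cells or fungi containing the expression vector of this application. In a preferred embodiment the fungus is a yeast, and in a more preferred embodiment the yeast is Saccharomyces cerevisiae.
This application includes a method of producing flocculin, comprising: constructing the expression vector of this application, transforming flocculating yeast with the expression vector, and culturing the transformed flocculating yeast under conditions in which it expresses flocculin, thereby producing flocculin.
In one specific embodiment, this application includes a method of inducing flocculin expression with the TPS1 promoter, comprising: constructing a flocculation gene expression vector containing the TPS1 promoter, transforming flocculating yeast with the expression vector, and culturing the flocculating yeast under conditions in which the promoter induces flocculin expression, thereby inducing expression of flocculin.
This application is further described below with reference to specific examples. It should be understood that the examples are merely illustrative and not limiting. Unless otherwise stated, the reagents used in the experiments are conventional, commercially available reagents.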
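Examples
Example 1: Obtaining the flocculation gene from the genome of the self-flocculating yeast
1. Extraction of genomic DNA from the self-flocculating yeast
The self-flocculating yeast S. cerevisiae flo (China General Microbiological Culture Collection Center accession number CGMCC0587) was cultured overnight at 30°C and 150 rpm in YPD medium (in g/L: glucose 20, yeast extract 10, peptone 20). A suitable amount of cells was embedded in low-melting-point agarose and then treated overnight with cell lysis solution (1 M sorbitol, 0.1 M sodium EDTA buffer pH 7.5, snailase 5.5 mg/ml) to obtain the whole genomic DNA of the self-flocculating yeast.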
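2. Preparation of insert fragments for the genomic library
A suitable amount of genomic DNA from the self-flocculating yeast was sheared with a DNA shearing device (Hydro-Shear 0703, GeneMachine, USA). The fragmented genomic DNA was separated by pulsed-field electrophoresis, and fragments of 35-40 kb were excised from the gel and recovered while avoiding UV exposure.
3. End repair of the DNA fragments and ligation into the Fosmid vector
The recovered DNA fragments were blunt-ended with Klenow fragment and purified by phenol-chloroform extraction and ethanol precipitation. The purified DNA fragments were confirmed by pulsed-field electrophoresis and ligated into the Fosmid vector supplied with the Copycontrol Fosmid Library Production Kit (Epicentre, USA).
4. Packaging, transfection, plating and verification of the library
The ligation product was packaged in vitro with phage packaging proteins (Copycontrol Fosmid Library Production Kit, Epicentre), transfected into E. coli EPI300, and plated on LB plates containing chloramphenicol. Twenty-four single clones were picked at random, inoculated and cultured overnight. DNA was extracted by alkaline lysis and checked by NotI digestion, and the insert lengths were examined by pulsed-field electrophoresis. Whether the library was acceptable was judged from the plating results and the insert lengths.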
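5. Screening of the genomic library of the self-flocculating yeast
Single clones were picked with sterile toothpicks into 1.5 mL of LB medium containing 12.5 μg/mL chloramphenicol and cultured overnight at 37°C with shaking. A suitable amount of each sample was mixed with glycerol to a final concentration of 20% and stored at -70°C. A further 150 μL of each sample was then taken, the samples were pooled in groups of 30, the pools were heated at 99°C for 10 min to lyse the cells, and a suitable amount was used as template for PCR detection.
Because the sequence of the flocculation gene of the flocculating yeast was unknown, two pairs of PCR screening primers (CF/CR and NF/NR, where F denotes the forward primer and R the reverse primer) were designed from the FLO1 sequence of the model strain and used to screen the genomic library for the flocculation gene of the self-flocculating yeast. CF/CR amplifies a sequence of about 1 kb at the C-terminus of the flocculation gene, and NF/NR amplifies a sequence of about 500 bp at the N-terminus.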
CF:5’-GCGGAATTCCCTCTGGTTCTTCTGAGAGC-3’(SEQ ID NO:5)
CR:5’-GCGAAGCTTGTAAGCTGTTGGCACTGC-3’(SEQ ID NO:6)
NF:5’-GGCGAATTCCTTGAAATTAGCTCGGT-3’(SEQ ID NO:7)
NR:5’-GCGAAGCTTGCATATCCATAAGCCAT-3’(SEQ ID NO:8)
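The 5' ends of the four screening primers carry short non-template tails. As an illustration only, the sketch below searches the primers for the standard recognition sequences of EcoRI (GAATTC) and HindIII (AAGCTT). The text does not name the enzymes for these primers, so reading the tails as EcoRI and HindIII sites is an inference, and the helper name `sites_in` is hypothetical.

```python
# Standard recognition sequences; that these were the intended cloning sites
# is an assumption, since the text does not name them for CF/CR/NF/NR.
SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT"}

# Primer sequences as listed above (SEQ ID NO:5-8).
PRIMERS = {
    "CF": "GCGGAATTCCCTCTGGTTCTTCTGAGAGC",
    "CR": "GCGAAGCTTGTAAGCTGTTGGCACTGC",
    "NF": "GGCGAATTCCTTGAAATTAGCTCGGT",
    "NR": "GCGAAGCTTGCATATCCATAAGCCAT",
}


def sites_in(sequence):
    """Map enzyme name -> 0-based position of its first site in the sequence."""
    sequence = sequence.upper()
    return {
        name: sequence.find(site)
        for name, site in SITES.items()
        if site in sequence
    }


for name, primer in PRIMERS.items():
    print(name, sites_in(primer))
```

In this check both forward primers contain GAATTC and both reverse primers contain AAGCTT near the 5' end, which is consistent with directional cloning of the screening products.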
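PCR amplification system:
Template 2 μL
Forward primer (CF or NF) (10 pmol/μL) 1 μL
Reverse primer (CR or NR) (10 pmol/μL) 1 μL
10x buffer 2.5 μL
dNTP (dATP, dGTP, dCTP, dTTP, 2.5 mM each) 2.0 μL
Ex Taq DNA polymerase (TaKaRa) (5 U/μL) 0.1 μL
Distilled water 16.4 μL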
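As a quick arithmetic check of the reaction mixture listed above, the sketch below sums the component volumes and derives the final primer and dNTP concentrations by simple dilution. The dictionary keys and the helper name `final_concentration` are illustrative.

```python
# Component volumes in microlitres, as listed in the amplification system.
COMPONENTS_UL = {
    "template": 2.0,
    "forward primer (10 pmol/uL)": 1.0,
    "reverse primer (10 pmol/uL)": 1.0,
    "10x buffer": 2.5,
    "dNTP mix (2.5 mM each)": 2.0,
    "Ex Taq (5 U/uL)": 0.1,
    "distilled water": 16.4,
}

total_ul = sum(COMPONENTS_UL.values())  # 25 uL reaction


def final_concentration(stock, volume_ul, total=total_ul):
    """Dilution formula: stock concentration x added volume / total volume."""
    return stock * volume_ul / total


primer_final_um = final_concentration(10.0, 1.0)  # pmol/uL is numerically uM
dntp_final_mm = final_concentration(2.5, 2.0)     # mM of each dNTP
taq_units = 5.0 * 0.1                             # enzyme units per reaction

print(round(total_ul, 1), round(primer_final_um, 2), round(dntp_final_mm, 2), taq_units)
```

The volumes add up to a 25 μL reaction, so the 2.5 μL of 10x buffer ends at 1x, each primer at 0.4 μM and each dNTP at 0.2 mM.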
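PCR reaction conditions:
Gel electrophoresis of the PCR products
Of the 160 pooled samples, 19 pools yielded a PCR product. The 30 single clones of each positive pool were then tested individually by PCR, which finally gave 5 positive single clones; one of them was selected for full sequencing.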
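The two-stage pooling scheme described above saves a large number of reactions compared with testing every clone individually. A minimal sketch of the arithmetic, counted per primer pair and assuming every pool holds exactly 30 clones (the function name is hypothetical):

```python
def two_stage_pcr_count(n_pools, pool_size, positive_pools):
    """Reactions needed: one per pool, then one per clone in each positive pool."""
    return n_pools + positive_pools * pool_size


clones_screened = 160 * 30                      # clones covered by the pools
reactions = two_stage_pcr_count(160, 30, 19)    # pooled round + deconvolution

print(clones_screened, reactions)
```

With these figures, about 730 reactions cover 4800 clones, roughly a 6.6-fold reduction relative to one reaction per clone.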
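6. Sequencing of the flocculation gene of the self-flocculating yeast
The positive single clone obtained by PCR screening was sequenced by TaKaRa (Dalian). Using random insertion of transposon Tn5 into the target vector and sequencing from the primer sites at the two ends of the transposon, the full length of the flocculation gene on one positive clone was obtained. The flocculation gene was named FLO1sc; its full length is 8049 bp, comprising 2403 A (30%), 2245 C (28%), 1397 G (17%) and 2004 T (25%). The full-length sequence of the flocculation gene is given in SEQ ID NO:1, and the sequence of its encoded protein product in SEQ ID NO:2. The encoded protein contains 43 internal repeats of 45 amino acids, 25 more than the FLO1 gene of the model yeast (YAR050W). The internal repeat region of flocculation genes has been reported to be related to flocculation strength, genes with longer repeat regions flocculating more strongly (Cell 2008, 135:726-37). Comparison of this flocculation gene with the FLO1 gene of the model yeast showed that its C-terminus also carries some amino acid mutations and contains a sequence insertion (Figure 1).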
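The figures quoted for FLO1sc are internally consistent, which can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text and in the sequence listing (base counts, 8049 bp, 2682 amino acids, 43 repeats, 25 more than the model gene); the variable names are illustrative.

```python
# Base counts reported for FLO1sc.
BASE_COUNTS = {"A": 2403, "C": 2245, "G": 1397, "T": 2004}

gene_length = sum(BASE_COUNTS.values())
percentages = {
    base: round(100 * count / gene_length) for base, count in BASE_COUNTS.items()
}

# An open reading frame of N bases has N / 3 codons, one of which is the stop codon.
codons = gene_length // 3
protein_length = codons - 1

# 43 repeats in FLO1sc, stated to be 25 more than in the model-strain FLO1.
model_repeats = 43 - 25

print(gene_length, percentages, protein_length, model_repeats)
```

The counts sum to 8049 bp and round to the stated 30/28/17/25%, and 8049 / 3 - 1 gives the 2682 residues recorded for SEQ ID NO:2.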
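Example 2: Disruption of the flocculation gene of the self-flocculating yeast
To further verify the function of the flocculation gene, it was disrupted as follows:
1. PCR amplification of the transforming fragment
Primers for disrupting the flocculation gene of self-flocculating yeast SPSC01:
The boxed sequence of each primer is complementary to FLO1sc, at nucleotide positions 3-47 and 1330-1374 respectively; the remainder is kanamycin resistance gene sequence. PCR amplification was carried out with DF and DR as primers and plasmid pFA6a-kanMX4 (GenBank accession number AJ002680; reference: Wach, A. 1996. Yeast 12:259-265) as template. PCR reaction conditions: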
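2. Transformation of yeast cells by electroporation
The activated self-flocculating yeast was cultured overnight in YPD medium at 30°C and 250 rpm. The cells were collected by centrifugation, deflocculated three times with 0.1 M sodium citrate, rinsed twice with deionized water and washed once with 1 M sorbitol to prepare electrocompetent cells, which were then electroporated.
3. Screening for resistant yeast strains
The transformed yeast suspension was spread on YPD plates containing the antibiotic G418 at 300 μg/mL; the yeast colonies that grew after 48 h at 30°C were positive clones. Single clones were picked with sterile toothpicks into YPD medium containing G418 at 100 μg/mL and cultured overnight at 30°C and 150 rpm; non-flocculating single clones were the strains in which the flocculation gene had been disrupted.
After disruption of the flocculation gene FLOsc, the flocculation phenotype of the yeast was observed to disappear (Figure 2), showing that this flocculation gene is responsible for the flocculation phenotype of the self-flocculating yeast.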
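Example 3: Constitutive expression of the flocculation gene
To further demonstrate the function of the flocculation gene, it was transformed into a free (non-flocculating) yeast and acquisition of the flocculation phenotype was observed.
The full-length flocculation gene obtained by PCR amplification was inserted into the HO integration vector (NCBI: #AF324728, a gift from David J. Stillman, University of Utah, USA), and the 3-phosphoglycerate kinase PGK1 promoter was inserted upstream of the gene. After linearization, the construct was electroporated into the free industrial ethanol S. cerevisiae host 4126 to observe whether flocculation occurred.
Detailed procedure:
1. Cloning of the flocculation gene of the self-flocculating yeast by PCR
A small-scale genomic template of the self-flocculating yeast was prepared by the glass bead method (Burke D, Dawson D, Stearns T. Methods in Yeast Genetics: A Cold Spring Harbor Laboratory Course Manual, 1st ed., Beijing: Tsinghua University Press, 2002), and full-length FLOsc was amplified with the Roche PCR amplification kit Expand Long Range dNTPack.
Primers used (the restriction site indicated by underlining is given in the parentheses that follow):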
FLO F:5’-ggcttaattaaATGACAATGCCTCATCGCTATAT-3’(PacI)(SEQ ID NO:11)
FLO R:5’-taccatgtcgctggTTAAATAATTGCCAGCAATAAG-3’(BstXI)(SEQ ID NO:12)
Annealing temperature: 58.5°C.
The reaction system and PCR procedure followed the instructions of the Roche Expand Long Range dNTPack kit.
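The design of the FLO F and FLO R primers can be checked against the ends of SEQ ID NO:1 in the sequence listing. The sketch below assumes that the lowercase letters are the added 5' tails carrying the restriction sites and the uppercase letters are the gene-specific part; the 23-nt start and 22-nt end of SEQ ID NO:1 are copied from the listing, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Complement each base, then reverse the strand."""
    return sequence.translate(COMPLEMENT)[::-1]


def gene_specific_part(primer):
    """Keep uppercase letters only (lowercase is read as the added 5' tail)."""
    return "".join(ch for ch in primer if ch.isupper())


FLO_F = "ggcttaattaaATGACAATGCCTCATCGCTATAT"       # SEQ ID NO:11
FLO_R = "taccatgtcgctggTTAAATAATTGCCAGCAATAAG"     # SEQ ID NO:12

# First 23 and last 22 nucleotides of SEQ ID NO:1 as printed in the listing.
SEQ1_START = "atgacaatgcctcatcgctatat".upper()
SEQ1_END = "cttattgctggcaattatttaa".upper()

# The forward primer should match the sense strand at the start codon;
# the reverse primer should be the reverse complement of the 3' end.
forward_ok = gene_specific_part(FLO_F) == SEQ1_START
reverse_ok = reverse_complement(gene_specific_part(FLO_R)) == SEQ1_END

print(forward_ok, reverse_ok)
```

Both checks pass for the sequences as listed: FLO F starts at the ATG and FLO R ends on the TAA stop codon, so the pair spans the full open reading frame.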
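2. Cloning of the PGK1 promoter of S. cerevisiae S288C
The S288C genomic template was prepared as above, and the promoter was amplified by PCR.
Primers used to amplify the PGK1 promoter (the underlined restriction site is given in parentheses):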
PF:5’-ttggatccACTGTAATTGCTTTTAGTTG-3’(BamHI)(SEQ ID NO:13)
PR:5’-ggcttaattaaTGTTTTATATTTGTTGTAAAAAG-3’(PacI)(SEQ ID NO:14)
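Annealing temperature: 56.5°C.
The electrophoresis image of the S288C PGK1 promoter is shown in Figure 3.
The PCR product was purified, ligated into the pGEM-T Easy Vector (Promega) and transformed into E. coli DH5α; the purified plasmid was sent to Takara for sequencing, and the sequence obtained is given in SEQ ID NO:3.
The amplified product is 5.2 kb in full length. Analysis showed that only half of the repeat region of the wild-type gene was amplified; the amino acid sequence it encodes is shown in SEQ ID NO:4.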
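3. Construction of the integration vector
The HO gene encodes an endonuclease responsible for switching between the a and α mating types of yeast. It is widespread in budding yeasts and is not essential for growth, so disrupting the HO gene does not affect yeast growth (Hereditas (Beijing), 1990, 12(4), 37-39; Yeast, 1997, 13:1563-1573). Warren et al. constructed an integration vector whose left border is 906 bp of the HO promoter (-2720 to -1814 relative to the start codon) and whose right border is the 500 bp from +1199 to +1699 after the start codon, and achieved highly efficient expression of foreign genes in yeast (Nucleic Acids Res. 2001, 29:55-59).
Because the HO promoter is regulated by several regulatory factors, the strong constitutive PGK1 (3-phosphoglycerate kinase) promoter was inserted upstream of FLOsc when the expression vector was constructed. After linearization, the integration vector can be electroporated into the host strain and integrated by homologous recombination.
The construction steps for the constitutive expression vector containing the PGK1 promoter are shown in Figure 5. The inducible expression vector containing the TPS1 promoter was constructed in a similar way, differing only in the promoter amplified and ligated. All restriction endonucleases and T4 ligase were purchased from NEB, and reaction systems and conditions followed the manufacturer's instructions. PCR products were purified and gel-extracted with the Illustra GFX PCR DNA and Gel Band Purification Kit (GE, USA) according to the instructions. Preparation and transformation of competent E. coli cells and plasmid extraction followed standard methods (J. Sambrook, Molecular Cloning: A Laboratory Manual, 3rd ed., Science Press, 2002). The specific steps were as follows: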
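(1) Construction of vector pQL01:
Digestion, ligation and transformation of the ligation product into E. coli: the FLOsc PCR product was purified, the purified FLOsc product and the HO vector (NCBI: #AF324728) were digested stepwise with PacI and BstXI, gel-recovered, and ligated overnight with T4 ligase to give pQL01, which was transformed into E. coli. Colonies picked at random from the plate were inoculated into LB liquid medium containing Amp (final concentration 50 mg/L) and cultured overnight at 37°C. Obvious turbidity of the culture indicated that the recombinant plasmid had probably entered the cells.
Plasmid extraction and verification: plasmid was extracted from 3 ml of culture, double-digested with BamHI and EcoRI, and the band sizes were verified by electrophoresis.
(2) Construction of expression vector pQL02 containing the PGK1 promoter:
The PGK1 promoter cloned in the pGEM-T vector (Promega) and the pQL01 integration vector were both double-digested with BamHI and PacI, and the PGK1 promoter was ligated into pQL01 to give expression vector pQL02. Plasmids of the correct size were sent to Takara for sequencing.
(3) Construction of expression vector pQL03 containing the TPS1 promoter:
The construction method was as above; the expression vector containing the TPS1 promoter was named pQL03.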
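4. Transformation of host yeast 4126 with the linear fragment
(1) Plasmid pQL02 was linearized by NotI digestion. After electrophoresis, the large-fragment band was recovered from the gel.
(2) The linear fragment was electroporated into the free industrial ethanol host yeast 4126.
2 mL of an overnight culture of S. cerevisiae 4126 was added to 200 ml of fresh YPD and cultured at 30°C and 250 rpm for 16 h. The culture was centrifuged at 6000 rpm for 2 min, the supernatant was discarded, and the cells were washed twice with pre-chilled ultrapure water. The cells were washed with 25 mL of ice-cold 1 mol/L sorbitol and resuspended in 0.5 ml of 1 mol/L sorbitol to give competent cells. 5 μl of the recovered target fragment was added to 80 μl of competent cells in a 1.5 mL centrifuge tube, mixed, kept on ice for 5 min and transferred to an electroporation cuvette. After electroporation, 1 mL of 1 mol/L sorbitol was added and the cells were incubated at 30°C for 1 h. 200 μL was spread on selective plates (YPD + G418 at a final concentration of 300 μg/mL) and incubated at 30°C until transformants appeared. The electroporator was a BIO-RAD MicroPulser Electroporator.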
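5. Flocculation phenotype of the transformants
The yeast transformant carrying plasmid pQL02 contains the PGK1 promoter and flocculates constitutively; it was named BHL01. The transformant was verified by extracting genomic DNA and performing PCR with the following primers: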
BHL01F:5’-ATGCTATGATGCCCACTG-3’(SEQ ID NO:15);
BHL01R:5’-AATACACGTATCCCTCGA-3’(SEQ ID NO:16)
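Amplification of the promoter region and flocculation gene region with the above primers gave a fragment of the expected size, confirming integration of the exogenous fragment. Larger colonies from the plate were inoculated into YPD liquid medium and cultured at 30°C for 60 h; the constitutively flocculating transformant showed distinct flocs (see Figure 6a). The flocculation phenotype remained stable through 10 passages of the transformant.
The constitutively flocculating yeast BHL01 was deposited on 5 November 2009 at the China General Microbiological Culture Collection Center of the China Committee for Culture Collection of Microorganisms (CGMCC, No. 3, Yard 1, Beichen West Road, Chaoyang District, Beijing, China, postal code 100101) under accession number CGMCC NO:3408.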
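Example 4: Inducible expression of the flocculation gene
With constitutive flocculation, cells aggregate into clumps as soon as they start to grow, which limits nutrient transport and therefore has the drawback of impairing growth and fermentation rate. We therefore went on to design the construction of an inducibly flocculating yeast. The inducible flocculation reported in the literature so far uses the promoter of the heat shock protein HSP30 (Appl. Environ. Microbiol. 2008, 74:6041-6052). That promoter is induced when glucose is exhausted and can therefore induce flocculation at the end of growth, but the flocculating yeast so constructed flocculates weakly and is strongly inhibited as the ethanol concentration rises (flocculation is only 10% in the presence of 6% ethanol), so it cannot be used for high-concentration ethanol fermentation. In the course of studying the ethanol tolerance of flocculating yeast, we cloned the promoter of the trehalose-6-phosphate synthase gene of the flocculating yeast. This promoter has one more stress response element than that of the model yeast, as shown in Figure 7, and is a new promoter, GenBank accession number FJ536256. Because trehalose synthesis is repressed in the presence of glucose and induced when glucose is exhausted, this is also a usable inducible promoter. Following this reasoning we constructed the inducible flocculation expression vector pQL03 by the method of Figure 5. The flocculation gene in this expression vector is the PCR-amplified 5.2 kb gene (SEQ ID NO:3); the difference is that the PGK1 promoter of pQL02 is replaced by the trehalose synthase promoter TPS1.
The construction method was as follows:
1. Cloning of the S. cerevisiae TPS1 promoter
Primers used for the TPS1 promoter (the underlined restriction site is given in parentheses):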
TF:5’-aaggatccGAGGACGGTTGCTGAAGAA-3’(BamHI)(SEQ ID NO:17)
TR:5’-gcgttaattaaAGTTCTATGTCTTAATAAGTC-3’(PacI)(SEQ ID NO:18)
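Annealing temperature: 56.5°C.
The electrophoresis image of the PCR product of the S288C TPS1 promoter is shown in Figure 4.
2. Obtaining inducibly flocculating yeast transformants
Expression vector pQL03 was transformed into industrial S. cerevisiae 4126 by the method described above, giving an inducibly flocculating yeast strain. Primers for verifying the yeast transformants: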
ZLH01F:5’-TCTTCGTGCTCTTGTTGC-3’(SEQ ID NO:19)
ZLH01R:5’-TTTCCAGGGTTACGTTTG-3’(SEQ ID NO:20)
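Amplification of the promoter region and flocculation gene region with the above primers gave a fragment of the expected size, confirming integration of the exogenous fragment. Larger colonies from the plate were inoculated into YPD liquid medium and cultured at 30°C for 60 h; the inducibly flocculating transformant showed relatively fine flocs (see Figure 6b). The flocculation phenotype remained stable through 10 passages of the transformant. Under the same culture conditions, the control yeast carrying the empty vector remained free throughout (see Figure 6c), whereas the flocs of the wild-type flocculating yeast (see Figure 6d) were comparable in size to those of the constitutively flocculating yeast (see Figure 6a).
This yeast transformant carrying plasmid pQL03 contains the TPS1 promoter and shows inducible flocculation; it was named ZLH01. The yeast was deposited on 5 November 2009 at the China General Microbiological Culture Collection Center of the China Committee for Culture Collection of Microorganisms (CGMCC, No. 3, Yard 1, Beichen West Road, Chaoyang District, Beijing, China, postal code 100101) under accession number CGMCC NO:3409.
The yeast was inoculated from a slant into YPD growth medium and cultured overnight, then deflocculated with 0.1 M sodium citrate (pH 5.0). 0.3 ml of the deflocculated culture was inoculated into fresh YPD growth medium with different volumes of ethanol added (total volume 5 ml) and cultured in test tubes at 30°C and 150 rpm for 16 h; the cultures were then allowed to settle for 10 min and photographed. As shown in Figure 7, strain ZLH01 flocculated visibly after addition of 3% ethanol, flocculation was more pronounced at 4%-10% added ethanol, and flocculation was still not inhibited when ethanol was added to 10%. Analysis of the relationship between ethanol and ZLH01 flocculation showed that, even in the presence of a large amount of glucose, this flocculating yeast does not flocculate at low ethanol concentrations, and that flocculation visible to the naked eye begins at an ethanol concentration of 3%. When a high sugar concentration was used and the ethanol produced reached 118.5 g/L, the flocculation of this inducible yeast was still not inhibited, whereas the inducible flocculation reported in the literature is inhibited at high ethanol concentrations and is also weaker (Appl. Environ. Microbiol. 2008, 74:6041-6052).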
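To relate the ethanol figures above to one another, the sketch below computes the ethanol volume needed per 5 ml tube and converts 118.5 g/L to an approximate volume percentage. Two assumptions are made that the text does not state: the percentages are taken as volume fractions (v/v), and the density of pure ethanol is taken as about 0.789 g/mL near 20°C. Volume contraction on mixing is ignored.

```python
# Assumption: density of pure ethanol near 20 C, in g/mL.
ETHANOL_DENSITY_G_PER_ML = 0.789


def ethanol_volume_ml(percent_vv, total_ml=5.0):
    """Volume of ethanol to add for a given v/v percentage in the tube."""
    return total_ml * percent_vv / 100.0


def g_per_l_to_percent_vv(grams_per_litre):
    """Approximate conversion of a mass concentration to % (v/v)."""
    ml_per_litre = grams_per_litre / ETHANOL_DENSITY_G_PER_ML
    return ml_per_litre / 10.0  # mL per 1000 mL -> percent


for percent in (3, 4, 10):
    print(percent, round(ethanol_volume_ml(percent), 2))

print(round(g_per_l_to_percent_vv(118.5), 1))
```

Under these assumptions 118.5 g/L corresponds to roughly 15% (v/v), which is above the highest concentration (10%) added in the tube experiment.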
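Ethanol fermentation experiments showed that, in both shake-flask fermentation and batch fermentation in a fermenter, the ethanol fermentation efficiency of the recombinant yeasts was clearly better than that of the wild-type flocculating yeast S. cerevisiae flo, especially at high temperature. Tables 2 and 3 compare, respectively, the residual sugar and ethanol levels of the flocculating yeast transformants with those of S. cerevisiae flo at 37°C. Compared with the constitutively flocculating yeast, the inducibly flocculating yeast had a higher ethanol fermentation efficiency and a markedly faster sugar consumption rate, 12 hours faster than the control (Table 2). The poor shake-flask fermentation performance of the constitutively flocculating yeast BHL01 may be related to its flocs becoming too large late in fermentation and impairing growth; further study, however, showed that this constitutively flocculating yeast performs well in high-concentration ethanol fermentation, as detailed below.
Table 2. Comparison of residual sugar levels of the transformants under high-temperature fermentation (37°C)*
Table 3. Comparison of ethanol concentrations of the transformants under high-temperature fermentation (37°C)
*BHL01 is the constitutively flocculating yeast and ZLH01 is the inducibly flocculating yeast; initial sugar concentration 274 g/L, shaker speed 150 rpm.
Strain BHL01 was used for repeated-batch fermentation in a 2.5 L fermenter with a very-high-concentration (255 g/L) glucose medium, following the improved repeated-batch fermentation method in the literature (Chinese Journal of Biotechnology, 2009, 25(9):13299-37); the results are shown in Table 4. With a fermentation time of 10-11 hours per batch and 20 repeated batches, on-line monitoring of floc size with a laser particle size analyzer (Biotechnol Bioeng, 2005, 90(5):523-531) showed that the flocculation of BHL01 is not inhibited by high ethanol concentrations and that the strain maintains good growth activity and fermentation capacity; after 20 batches its flocculation and settling properties remained good. The wild-type flocculating yeast, in contrast, showed marked degeneration of flocculation after some 20 batches of repeated-batch fermentation under the same conditions, and its settling and separation efficiency was severely affected. BHL01 is therefore better suited than S. cerevisiae flo to high-concentration ethanol fermentation and shows good prospects for industrial application.
Table 4. Results of high-concentration ethanol repeated-batch fermentation with the constitutively flocculating yeast BHL01*
*This table shows the results after 20 repeated batches.
Besides fuel ethanol fermentation, flocculating yeasts constructed with the flocculation gene proposed by the present invention can be used for other industrial purposes, such as adsorption of heavy-metal wastewater and production of transgenic pharmaceutical proteins. The conditionally inducible promoter can also be used for the production of toxic proteins, with transcription and expression of the downstream gene regulated by adjusting the sugar content or ethanol concentration of the medium.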
Sequence Listing
<110>Dalian University of Technology
<120>Flocculation gene of flocculating yeast, its expression product and use thereof
<130>095484
<160>20
<170>PatentIn version 3.3
<210>1
<211>8049
<212>DNA
<213>Flocculating yeast
<400>1
atgacaatgc ctcatcgcta tatgtttttg gcagtcttta cacttctggc actaattaat 60
gtggcctcag gagccacaga ggcgtgctta ccagcaggcc agaggaaaag tgggatgaat 120
ataaattttt accagtattc attgaaagat tcctccacat attcgaatgc agcatatatg 180
gcttatggat atgcctcaaa aactaaacta ggttctgtcg gaggacaaac tgatatctcg 240
attgattata atattccttg tgttagttca tcaggcacat ttccttgtcc tcaagaagat 300
tcctatggaa actggggatg caaaggaatg ggtgcttgtt ctaatagtca aggaattgca 360
tactggagta ctgatttatt tggtttctat actaccccaa caaacgtaac cctagaaatg 420
acaggttatt ttttaccacc acagacgggt tcttacacgt tttcttttgc aacagtggat 480
gattctgcaa ttttatcagt cggtggtagc attgcgttcg aatgttgtgc acaagaacaa 540
cctcccatca cgtcgactaa cttcaccatc aatggtatca agccatggca tggaagtctc 600
cctgataata tcgcagggac tgtctacatg tatgctggtt tctattatcc aatgaagatt 660
gtttactcaa atgccgtttc ctggggtaca cttccaatta gtgtgacact accagatggc 720
actaccgtta gtgatgactt tgaagggtac gtatatacct ttgacaacaa tctcagccag 780
tcgaattgta ctattccaga cccttcaaat tatactgcca gtactacaat aactacaacc 840
gagccatgga ccggtacttt cacctctaca tccacagaaa tgactactgt cactggtacc 900
aacggtcaac caactgacga aactgtcatt gttgtcaaaa cacctacaac tgctaacacc 960
atcataacta cgaccgaacc atggaccggc actttcacct ctacatccac tgaaatgacc 1020
acagtcaccg gtactaatgg cttgccaact gacgaaactg tcattgttgt caaaacacct 1080
acaactgcta acaccatcat aactacaact gagccatgga ctggtacttt cacctctaca 1140
tccacagaaa tgactactgt cactggtacc aacggtcaac caactgacga aactgtcatt 1200
gttgttaaaa cacctacaac tgctaacacc atcataacta cgaccgaacc atggactggt 1260
actttcacct ctacatccac agaaatgact actgtcactg gtaccaacgg tcaaccaact 1320
gatgaaactg tcattgttgt caaaacacct acaactgcta acaccgtcat aactacgacc 1380
gaaccatgga ctggtacttt cacctctaca tccacagaaa tgactactgt caccggtacc 1440
aacggtcaac cgaccgatga aaccgttatt gtcattaaaa ctccaaccag tgaaggtcca 1500
atcagcacca ccactgaacc atggaccggt actttcacat ctacatccac tgaaatgacc 1560
acagtcactg gtactaatgg tttaccaacc gatgaaactg tcattattat caaaacacct 1620
acaacagcta gcaccatcat aactacaact gagccatgga acggcacttt cacatctaca 1680
tccacagaaa tgactactgt cactggtacc aacggtcaac caactgacga aactgtcatt 1740
gttgttaaaa cacctacaac tgctaacacc atcataacta cgaccgaacc atggaccggt 1800
atttccactt ctacttctac cgaattgacc acagtcaccg gtactaatgg cttgccaacc 1860
gatgaaactg tcattgttgt caaaacacct acaactgcta acaccatcat aactacaact 1920
gagccatgga ctggtacttt cacatctaca tccacagaaa tgactactgt cactggtacc 1980
aacggtcaac caactgatga aaccatcatt gtcatcagaa caccaacaac tgctagcacc 2040
atcataacta caactgagcc atggaccggt acttccactt ctacatccac agaaatgact 2100
actgtcaccg gtaccaacgg tcaaccgacc gatgaaaccg ttattgtcat taaaactcca 2160
accagtgaag gtccaatcag caccaccact gagccatgga acggcacttt cacatctaca 2220
tccacagaaa tgactactgt cactggtacc aacggtcaac caactgacga aactgtcatt 2280
gttgttaaaa cacctacaac tgctaacacc atcataacta cgaccgaacc atggaccggc 2340
actttcacct ctacatccac tgaaatgacc acagtcaccg gtactaatgg cttgccaact 2400
gacgaaactg tcattgttgt taaaacacct acaactgcta acaccgtcat aactacgacc 2460
gaaccatgga ctggtacttt cacctctaca tccacagaaa tgaccaccgt caccggtacc 2520
aacggtcaac caactgacga aactgtcatt gttgttaaaa cacctacaac tgctaacacc 2580
atcataacta cgaccgaacc atggaccggc actttcacct ctacatccac agaaatgact 2640
actgtcactg gtaccaacgg tcaaccaact gacgaaactg tcattgttgt taaaacacct 2700
acaactgcta acaccatcat aactacgacc gaaccatgga ccggcacttt cacctctaca 2760
tccacagaaa tgactactgt cactggtacc aacggtcaac caactgatga aactgtcatt 2820
gttatcagaa ctccaactag tgagggtttg attacaacca ccactgaacc atggaatggc 2880
actttcacct ctacatccac agaaatgact actgtcactg gtaccaacgg tcaaccaact 2940
gatgaaactg tcattgttat cagaactcca actagtgagg gtttgattac aaccaccact 3000
gaaccatgga ctggtacttt cacttctaca tctactgaga tgaccaccat cactggtact 3060
aatggtcaac caactgacga aaccgtgatt gttatcagaa ctccaaccag tgaaggtttg 3120
gttgcaacca ccactgaacc atggactggc actttcactt ctacatctac tgagatgacc 3180
accgtcaccg gtaccaacgg tcaaccaact gacgaaaccg tgattgttat cagaactcca 3240
actagtgagg gtttgattac aaccaccact gaaccatgga ctggtacttt cacttctaca 3300
tctactgaga tgaccaccgt caccggtacc aacggtcaac caactgacga aaccgtgatt 3360
gttatcagaa ctccaaccag tgaaggtttg attacaacca ccactgaacc atggaatggc 3420
actttcactt cgacttccac tgaggttacc accatcactg gaaccaacgg tcaaccaact 3480
gacgaaactg tgattgttat cagaactcca actagtgagg gtttgattac aaccaccact 3540
gaaccatgga ctggtacttt cacttctaca tctactgaga tgaccaccat cactggtact 3600
aatggtcaac caactgacga aaccgtgatt gttatcagaa ctccaaccag tgaaggtttg 3660
gttgcaacca ccactgaacc atggactggc actttcactt ctacatctac tgagatgacc 3720
accgtcaccg gtaccaacgg tcaaccaact gacgaaaccg tgattgttat cagaactcca 3780
actagtgagg gtttgattac aaccaccact gaaccatgga ctggtacttt cacttctaca 3840
tctactgaga tgaccaccgt caccggtacc aacggtcaac caactgacga aaccgttatt 3900
gttatcagaa ctccaactag tgagggtttg attacaacca ccactgaacc atggactggc 3960
actttcactt ctacatctac tgagatgacc accgtcaccg gtaccaacgg tcaaccaact 4020
gacgaaaccg tgattgttat cagaactcca accagtgaag gtctaatcag caccaccact 4080
gaaccatgga ctggtacttt cacctctacg tctactgaga tgaccaccgt caccggtacc 4140
aacggtcaac caactgacga aaccgtgatt gttatcagaa ctccaaccag tgaaggtcta 4200
atcagcacca ccactgaacc atggactggt actttcacct ctacgtctac tgagatgacc 4260
accgtcaccg gtactaacgg tcaaccaact gatgaaaccg ttattgttat cagaactcca 4320
accagtgaag gtctaatcag caccaccact gaaccatgga ctggcacttt cacctctaca 4380
tccactgaga tgaccaccat caccggtact aatggtcaac caactgacga aaccgttatt 4440
gttatcagaa ctccaactag tgagggtttg attacaacca ccactgaacc atggactggt 4500
actttcactt ctacatctac tgagatgacc accatcactg gtactaatgg tcaaccaact 4560
gacgaaaccg tgattgttat cagaactcca accagtgaag gtttggttgc aaccaccact 4620
gaaccatgga ctggcacttt cacttctaca tctactgaga tgaccaccgt caccggtacc 4680
aacggtcaac caactgatga aaccgtgatt gttatcagaa ctccaaccag tgaaggtttg 4740
attacaacca ccactgaacc atggaatggc actttcactt cgacttccac tgaggttacc 4800
accatcactg gaaccaacgg tcaaccaact gacgaaactg tgattgtcat tagaactcca 4860
actagtgagg gtttgattac tacaactacc gaaccatgga ctggtacttt cacttctaca 4920
tctactgagg ttaccaccgt caccggtact aatggtcaac caactgacga aaccgttatt 4980
gttatcagaa ctccaactag tgagggtttg attacaacca ccactgaacc atggactggt 5040
actttcactt ctacatctac tgagatgacc accgtcaccg gtactaacgg tcaaccaact 5100
gatgaaaccg ttattgttat cagaactcca accagtgaag gtttgattac aaccaccact 5160
gaaccatgga atggcacttt cacttcgact tccactgagg ttaccaccat cactggaacc 5220
aacggtcaac caactgacga aactgtgatt gtcattagaa ctccaactag tgagggtttg 5280
attactacaa ctaccgaacc atggactggt actttcactt ctacatctac tgaggttacc 5340
accgtcaccg gtaccaacgg tcaaccaact gacgaaaccg ttattgttat cagaactcca 5400
actagtgagg gtttgattac aaccaccact gaaccatgga ctggcacttt cacttctaca 5460
tctactgaga tgaccaccgt caccggtact aacggtcaac caactgacga aactgtgatt 5520
gtcattagaa ctccaactag tgagggtttg attacaacca ccactgaacc atggactggt 5580
actttcactt ctacatctac tgaggttacc accgtcaccg gtaccaacgg tcaaccaact 5640
gacgaaaccg ttattgttat cagaactcca actagtgagg gtttgattac aaccaccact 5700
gaaccatgga ctggcacttt cacttctaca tctactgaga tgaccaccgt caccggtact 5760
aacggtcaac caactgatga aactgtgatt gttatcagaa ctccaaccag tgaaggtttg 5820
gttgcaacca ccactgaacc atggactggc actttcacct ctacatccac tgagatgacc 5880
accgtcaccg gtactaacgg tcaaccaact gacgaaaccg tgattgttat cagaactcca 5940
accagtgaag gtttggttgc aaccaccact gaaccatgga ctggcacttt cacctctaca 6000
tccactgaga tgaccaccgt caccggtact aacggtcaac caactgacga aaccgtgatt 6060
gttatcagaa ctccaaccag tgaaggtttg gttgcaacca ccactgaacc atggactggc 6120
actttcacct ctacatccac tgagatgacc accatcaccg gtactaatgg tcaaccaact 6180
gacgaaaccg ttattgttat cagaactcca actagtgagg gtttgattac aaccaccacc 6240
gaaccatgga ctggcacttt cacttcgact tccactgaga tgaccaccat caccggtacc 6300
aacggtcaac caactgacga agctgtgatt gtcattagaa ctccaactag tgagggtttg 6360
gttactacaa ctaccgaacc atggactggt actttcactt cgacttccac tgggatgacc 6420
accgtcaccg gtactaacgg tcaaccaact gacgaaaccg tgattgttat cagaactcca 6480
accagtgaag gtttggttac aaccaccact gaaccatgga ctggtacttt tacttcgact 6540
tccactgaaa tgtctactgt cactggaacc aatggcttgc caactgatga aactgtcatt 6600
gttgtcaaaa ctccaactac tgccatctca tccagtttgt catcatcatc ttcaggacaa 6660
atcaccagct ctatcacgtc ttcgcgtcca attattaccc cattctatcc tagcaatgga 6720
acttctgtga tttcttcctc agtaatttct tcctcagtca cttcttctct agtcacttca 6780
tctccagtca tttcttcttc attcatttct tcctctgtca tttcttcttc tacaacaacc 6840
tccgcttcta tattctctga atcatctaaa tcatccgtca ttccaaccag tagttccacc 6900
tctggttctt ctgagagcga aacgagttca gctagttctg cctcttcttc ctcttctatc 6960
tcttctgaat caccaaagtc tacatattcg tcttcatcat taccacctgt taccagtgca 7020
acaacaagtc aggaaattac ttcttcctta ccacctgtta ccagtgcgac agcaagccag 7080
gaaactgctt cttcattacc acctgctacc actacaaaaa cgagcgaaca aaccactttg 7140
gttaccgtga catcctgcga atctcatgtg tgcactgaat ccatctcctc tgcgattgtt 7200
tccacggcca ccgttactgt tagcggcgtc acaacagagt ataccacatg gtgccctatt 7260
tctaccacag agacaacaag acaaaccaaa gggacaacag agcaaaccac agaaacaaca 7320
aaacaaacca cggtagttac aatttcttct tgtgaatctg acatatgctc taaaactgct 7380
tctccagcca ttgtgtctac aagcactgct actattaacg gcgttaccac ggaatacaca 7440
acatggtgtc ctatttccac cacagaatcg aagcaacaaa ctacgctagt tactgttact 7500
tcctgcgaat ctggtgtgtg ttccgaaact gcttcacctg ccattgtttc gacggccacg 7560
gctactgtga atgatgttgt tacggtctat cctacatgga gaccacagac tacgaatgaa 7620
gagtctgtca gctctaaaat gaacagtgct accagtgaga caacaaccaa tactgtagct 7680
gctgaaacga ctaccaatac tggagctgct gagacaacta ccagtactgg agctgctgag 7740
acgaaaacag tagtcacctc ttcgctttca agatctaatc acgctgaaac acagacggct 7800
tccgcgaccg atgtgattgg tcacagcagt agtgttgttt ctgtatccga aactggcaac 7860
accaagagtc taacaagttc cgggttgagt actatgtcgc aacagcctcg tagcacacca 7920
gcaagtagca tggtaggatc tagtacagct tctttagaaa tttcaacgta tgctggcagt 7980
gccaacagct tactggccgg tagtggttta agtgtcttca ttgcgtcctt attgctggca 8040
attatttaa 8049
<210>2
<211>2682
<212>PRT
<213>Flocculating yeast
<400>2
Met Thr Met Pro His Arg Tyr Met Phe Leu Ala Val Phe Thr Leu Leu
1 5 10 15
Ala Leu Ile Asn Val Ala Ser Gly Ala Thr Glu Ala Cys Leu Pro Ala
20 25 30
Gly Gln Arg Lys Ser Gly Met Asn Ile Asn Phe Tyr Gln Tyr Ser Leu
35 40 45
Lys Asp Ser Ser Thr Tyr Ser Asn Ala Ala Tyr Met Ala Tyr Gly Tyr
50 55 60
Ala Ser Lys Thr Lys Leu Gly Ser Val Gly Gly Gln Thr Asp Ile Ser
65 70 75 80
Ile Asp Tyr Asn Ile Pro Cys Val Ser Ser Ser Gly Thr Phe Pro Cys
85 90 95
Pro Gln Glu Asp Ser Tyr Gly Asn Trp Gly Cys Lys Gly Met Gly Ala
100 105 110
Cys Ser Asn Ser Gln Gly Ile Ala Tyr Trp Ser Thr Asp Leu Phe Gly
115 120 125
Phe Tyr Thr Thr Pro Thr Asn Val Thr Leu Glu Met Thr Gly Tyr Phe
130 135 140
Leu Pro Pro Gln Thr Gly Ser Tyr Thr Phe Ser Phe Ala Thr Val Asp
145 150 155 160
Asp Ser Ala Ile Leu Ser Val Gly Gly Ser Ile Ala Phe Glu Cys Cys
165 170 175
Ala Gln Glu Gln Pro Pro Ile Thr Ser Thr Asn Phe Thr Ile Asn Gly
180 185 190
Ile Lys Pro Trp His Gly Ser Leu Pro Asp Asn Ile Ala Gly Thr Val
195 200 205
Tyr Met Tyr Ala Gly Phe Tyr Tyr Pro Met Lys Ile Val Tyr Ser Asn
210 215 220
Ala Val Ser Trp Gly Thr Leu Pro Ile Ser Val Thr Leu Pro Asp Gly
225 230 235 240
Thr Thr Val Ser Asp Asp Phe Glu Gly Tyr Val Tyr Thr Phe Asp Asn
245 250 255
Asn Leu Ser Gln Ser Asn Cys Thr Ile Pro Asp Pro Ser Asn Tyr Thr
260 265 270
Ala Ser Thr Thr Ile Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
275 280 285
Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Gln Pro
290 295 300
Thr Asp Glu Thr Val Ile Val Val Lys Thr Pro Thr Thr Ala Asn Thr
305 310 315 320
Ile Ile Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser
325 330 335
Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Leu Pro Thr Asp Glu
340 345 350
Thr Val Ile Val Val Lys Thr Pro Thr Thr Ala Asn Thr Ile Ile Thr
355 360 365
Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser Thr Glu Met
370 375 380
Thr Thr Val Thr Gly Thr Asn Gly Gln Pro Thr Asp Glu Thr Val Ile
385 390 395 400
Val Val Lys Thr Pro Thr Thr Ala Asn Thr Ile Ile Thr Thr Thr Glu
405 410 415
Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser Thr Glu Met Thr Thr Val
420 425 430
Thr Gly Thr Asn Gly Gln Pro Thr Asp Glu Thr Val Ile Val Val Lys
435 440 445
Thr Pro Thr Thr Ala Asn Thr Val Ile Thr Thr Thr Glu Pro Trp Thr
450 455 460
Gly Thr Phe Thr Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr
465 470 475 480
Asn Gly Gln Pro Thr Asp Glu Thr Val Ile Val Ile Lys Thr Pro Thr
485 490 495
Ser Glu Gly Pro Ile Ser Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe
500 505 510
Thr Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Leu
515 520 525
Pro Thr Asp Glu Thr Val Ile Ile Ile Lys Thr Pro Thr Thr Ala Ser
530 535 540
Thr Ile Ile Thr Thr Thr Glu Pro Trp Asn Gly Thr Phe Thr Ser Thr
545 550 555 560
Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Gln Pro Thr Asp
565 570 575
Glu Thr Val Ile Val Val Lys Thr Pro Thr Thr Ala Asn Thr Ile Ile
580 585 590
Thr Thr Thr Glu Pro Trp Thr Gly Ile Ser Thr Ser Thr Ser Thr Glu
595 600 605
Leu Thr Thr Val Thr Gly Thr Asn Gly Leu Pro Thr Asp Glu Thr Val
610 615 620
Ile Val Val Lys Thr Pro Thr Thr Ala Asn Thr Ile Ile Thr Thr Thr
625 630 635 640
Glu Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser Thr Glu Met Thr Thr
645 650 655
Val Thr Gly Thr Asn Gly Gln Pro Thr Asp Glu Thr Ile Ile Val Ile
660 665 670
Arg Thr Pro Thr Thr Ala Ser Thr Ile Ile Thr Thr Thr Glu Pro Trp
675 680 685
Thr Gly Thr Ser Thr Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly
690 695 700
Thr Asn Gly Gln Pro Thr Asp Glu Thr Val Ile Val Ile Lys Thr Pro
705 710 715 720
Thr Ser Glu Gly Pro Ile Ser Thr Thr Thr Glu Pro Trp Asn Gly Thr
725 730 735
Phe Thr Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly
740 745 750
Gln Pro Thr Asp Glu Thr Val Ile Val Val Lys Thr Pro Thr Thr Ala
755 760 765
Asn Thr Ile Ile Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr Ser
770 775 780
Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Leu Pro Thr
785 790 795 800
Asp Glu Thr Val Ile Val Val Lys Thr Pro Thr Thr Ala Asn Thr Val
805 810 815
Ile Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser Thr
820 825 830
Glu Met Thr Thr Val Thr Gly Thr Asn Gly Gln Pro Thr Asp Glu Thr
835 840 845
Val Ile Val Val Lys Thr Pro Thr Thr Ala Asn Thr Ile Ile Thr Thr
850 855 860
Thr Glu Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser Thr Glu Met Thr
865 870 875 880
Thr Val Thr Gly Thr Asn Gly Gln Pro Thr Asp Glu Thr Val Ile Val
885 890 895
Val Lys Thr Pro Thr Thr Ala Asn Thr Ile Ile Thr Thr Thr Glu Pro
900 905 910
Trp Thr Gly Thr Phe Thr Ser Thr Ser Thr Glu Met Thr Thr Val Thr
915 920 925
Gly Thr Asn Gly Gln Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr
930 935 940
Pro Thr Ser Glu Gly Leu Ile Thr Thr Thr Thr Glu Pro Trp Asn Gly
945 950 955 960
Thr Phe Thr Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn
965 970 975
Gly Gln Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser
980 985 990
Glu Gly Leu Ile Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
995 1000 1005
Ser Thr Ser Thr Glu Met Thr Thr Ile Thr Gly Thr Asn Gly Gln
1010 1015 1020
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
1025 1030 1035
Gly Leu Val Ala Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
1040 1045 1050
Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Gln
1055 1060 1065
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
1070 1075 1080
Gly Leu Ile Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
1085 1090 1095
Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Gln
1100 1105 1110
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
1115 1120 1125
Gly Leu Ile Thr Thr Thr Thr Glu Pro Trp Asn Gly Thr Phe Thr
1130 1135 1140
Ser Thr Ser Thr Glu Val Thr Thr Ile Thr Gly Thr Asn Gly Gln
1145 1150 1155
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
1160 1165 1170
Gly Leu Ile Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
1175 1180 1185
Ser Thr Ser Thr Glu Met Thr Thr Ile Thr Gly Thr Asn Gly Gln
1190 1195 1200
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
1205 1210 1215
Gly Leu Val Ala Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
1220 1225 1230
Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Gln
1235 1240 1245
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
1250 1255 1260
Gly Leu Ile Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
1265 1270 1275
Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Gln
1280 1285 1290
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
1295 1300 1305
Gly Leu Ile Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
1310 1315 1320
Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Gln
1325 1330 1335
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
1340 1345 1350
Gly Leu Ile Ser Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
1355 1360 1365
Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Gln
1370 1375 1380
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
1385 1390 1395
Gly Leu Ile Ser Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
1400 1405 1410
Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Gln
1415 1420 1425
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
1430 1435 1440
Gly Leu Ile Ser Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
1445 1450 1455
Ser Thr Ser Thr Glu Met Thr Thr Ile Thr Gly Thr Asn Gly Gln
1460 1465 1470
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
1475 1480 1485
Gly Leu Ile Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
1490 1495 1500
Ser Thr Ser Thr Glu Met Thr Thr Ile Thr Gly Thr Asn Gly Gln
1505 1510 1515
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
1520 1525 1530
Gly Leu Val Ala Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
1535 1540 1545
Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Gln
1550 1555 1560
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
1565 1570 1575
Gly Leu Ile Thr Thr Thr Thr Glu Pro Trp Asn Gly Thr Phe Thr
1580 1585 1590
Ser Thr Ser Thr Glu Val Thr Thr Ile Thr Gly Thr Asn Gly Gln
1595 1600 1605
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
1610 1615 1620
Gly Leu Ile Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
1625 1630 1635
Ser Thr Ser Thr Glu Val Thr Thr Val Thr Gly Thr Asn Gly Gln
1640 1645 1650
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
1655 1660 1665
Gly Leu Ile Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
1670 1675 1680
Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Gln
1685 1690 1695
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
1700 1705 1710
Gly Leu Ile Thr Thr Thr Thr Glu Pro Trp Asn Gly Thr Phe Thr
1715 1720 1725
Ser Thr Ser Thr Glu Val Thr Thr Ile Thr Gly Thr Asn Gly Gln
1730 1735 1740
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
1745 1750 1755
Gly Leu Ile Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
1760 1765 1770
Ser Thr Ser Thr Glu Val Thr Thr Val Thr Gly Thr Asn Gly Gln
1775 1780 1785
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
1790 1795 1800
Gly Leu Ile Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
1805 1810 1815
Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Gln
1820 1825 1830
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
1835 1840 1845
Gly Leu Ile Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
1850 1855 1860
Ser Thr Ser Thr Glu Val Thr Thr Val Thr Gly Thr Asn Gly Gln
1865 1870 1875
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
1880 1885 1890
Gly Leu Ile Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
1895 1900 1905
Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Gln
1910 1915 1920
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
1925 1930 1935
Gly Leu Val Ala Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
1940 1945 1950
Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Gln
1955 1960 1965
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
1970 1975 1980
Gly Leu Val Ala Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
1985 1990 1995
Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Gln
2000 2005 2010
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
2015 2020 2025
Gly Leu Val Ala Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
2030 2035 2040
Ser Thr Ser Thr Glu Met Thr Thr Ile Thr Gly Thr Asn Gly Gln
2045 2050 2055
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
2060 2065 2070
Gly Leu Ile Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
2075 2080 2085
Ser Thr Ser Thr Glu Met Thr Thr Ile Thr Gly Thr Asn Gly Gln
2090 2095 2100
Pro Thr Asp Glu Ala Val Ile Val Ile Arg Thr Pro Thr Ser Glu
2105 2110 2115
Gly Leu Val Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
2120 2125 2130
Ser Thr Ser Thr Gly Met Thr Thr Val Thr Gly Thr Asn Gly Gln
2135 2140 2145
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
2150 2155 2160
Gly Leu Val Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
2165 2170 2175
Ser Thr Ser Thr Glu Met Ser Thr Val Thr Gly Thr Asn Gly Leu
2180 2185 2190
Pro Thr Asp Glu Thr Val Ile Val Val Lys Thr Pro Thr Thr Ala
2195 2200 2205
Ile Ser Ser Ser Leu Ser Ser Ser Ser Ser Gly Gln Ile Thr Ser
2210 2215 2220
Ser Ile Thr Ser Ser Arg Pro Ile Ile Thr Pro Phe Tyr Pro Ser
2225 2230 2235
Asn Gly Thr Ser Val Ile Ser Ser Ser Val Ile Ser Ser Ser Val
2240 2245 2250
Thr Ser Ser Leu Val Thr Ser Ser Pro Val Ile Ser Ser Ser Phe
2255 2260 2265
Ile Ser Ser Ser Val Ile Ser Ser Ser Thr Thr Thr Ser Ala Ser
2270 2275 2280
Ile Phe Ser Glu Ser Ser Lys Ser Ser Val Ile Pro Thr Ser Ser
2285 2290 2295
Ser Thr Ser Gly Ser Ser Glu Ser Glu Thr Ser Ser Ala Ser Ser
2300 2305 2310
Ala Ser Ser Ser Ser Ser Ile Ser Ser Glu Ser Pro Lys Ser Thr
2315 2320 2325
Tyr Ser Ser Ser Ser Leu Pro Pro Val Thr Ser Ala Thr Thr Ser
2330 2335 2340
Gln Glu Ile Thr Ser Ser Leu Pro Pro Val Thr Ser Ala Thr Ala
2345 2350 2355
Ser Gln Glu Thr Ala Ser Ser Leu Pro Pro Ala Thr Thr Thr Lys
2360 2365 2370
Thr Ser Glu Gln Thr Thr Leu Val Thr Val Thr Ser Cys Glu Ser
2375 2380 2385
His Val Cys Thr Glu Ser Ile Ser Ser Ala Ile Val Ser Thr Ala
2390 2395 2400
Thr Val Thr Val Ser Gly Val Thr Thr Glu Tyr Thr Thr Trp Cys
2405 2410 2415
Pro Ile Ser Thr Thr Glu Thr Thr Arg Gln Thr Lys Gly Thr Thr
2420 2425 2430
Glu Gln Thr Thr Glu Thr Thr Lys Gln Thr Thr Val Val Thr Ile
2435 2440 2445
Ser Ser Cys Glu Ser Asp Ile Cys Ser Lys Thr Ala Ser Pro Ala
2450 2455 2460
Ile Val Ser Thr Ser Thr Ala Thr Ile Asn Gly Val Thr Thr Glu
2465 2470 2475
Tyr Thr Thr Trp Cys Pro Ile Ser Thr Thr Glu Ser Lys Gln Gln
2480 2485 2490
Thr Thr Leu Val Thr Val Thr Ser Cys Glu Ser Gly Val Cys Ser
2495 2500 2505
Glu Thr Ala Ser Pro Ala Ile Val Ser Thr Ala Thr Ala Thr Val
2510 2515 2520
Asn Asp Val Val Thr Val Tyr Pro Thr Trp Arg Pro Gln Thr Thr
2525 2530 2535
Asn Glu Glu Ser Val Ser Ser Lys Met Asn Ser Ala Thr Ser Glu
2540 2545 2550
Thr Thr Thr Asn Thr Val Ala Ala Glu Thr Thr Thr Asn Thr Gly
2555 2560 2565
Ala Ala Glu Thr Thr Thr Ser Thr Gly Ala Ala Glu Thr Lys Thr
2570 2575 2580
Val Val Thr Ser Ser Leu Ser Arg Ser Asn His Ala Glu Thr Gln
2585 2590 2595
Thr Ala Ser Ala Thr Asp Val Ile Gly His Ser Ser Ser Val Val
2600 2605 2610
Ser Val Ser Glu Thr Gly Asn Thr Lys Ser Leu Thr Ser Ser Gly
2615 2620 2625
Leu Ser Thr Met Ser Gln Gln Pro Arg Ser Thr Pro Ala Ser Ser
2630 2635 2640
Met Val Gly Ser Ser Thr Ala Ser Leu Glu Ile Ser Thr Tyr Ala
2645 2650 2655
Gly Ser Ala Asn Ser Leu Leu Ala Gly Ser Gly Leu Ser Val Phe
2660 2665 2670
Ile Ala Ser Leu Leu Leu Ala Ile Ile
2675 2680
<210>3
<211>5217
<212>DNA
<213>Flocculating yeast
<400>3
atgacaatgc ctcatcgcta tatgtttttg gcagtcttta cacttctggc actaattaat 60
gtggcctcag gagccacaga ggcgtgctta ccagcaggcc agaggaaaag tgggatgaat 120
ataaattttt accagtattc attgaaagat tcctccacat attcgaatgc agcatatatg 180
gcttatggat atgcctcaaa aactaaacta ggttctgtcg gaggacaaac tgatatctcg 240
attgattata atattccttg tgttagttca tcaggcacat ttccttgtcc tcaagaagat 300
tcctatggaa actggggatg caaaggaatg ggtgcttgtt ctaatagtca aggaattgca 360
tactggagta ctgatttatt tggtttctat actaccccaa caaacgtaac cctggaaatg 420
acaggttatt ttttaccacc acagacgggt tcttacacat tcaagtttgc tacagttgac 480
gactctgcaa ttctatcagt aggtggtgct accgcgttcg actgttgtgc tcaacagcaa 540
ccgccgatca catccacaaa ctttacgatt aacggtatca aaccatgggg tggaagtttg 600
ccacctaatg ttgaaggaac agtctacatg tatgctggat tctactaccc aatgaaggtt 660
gtttactcaa atgctgtttc ttggggtaca cttccaatta gtgtgacact gcctgatggt 720
acagctgtca gtgatgactt cgagggatac gtgtattcct ttgatgatga tttgactcaa 780
tctgattgta ccattccaga tccttcaaac tatactatag caggcctaat caccaccacc 840
actgaaccat ggactggtac tttcacttct acatccactg agatgactac tgtcactggt 900
accaacagtc aaccaactga tgaaaccgtt attgttatca gaactccaac tagtgagggt 960
ttgattacaa ccaccactga accatggact ggcactttca cttctacatc tactgagatg 1020
accaccgtca ccggtaccaa cggtcaacca actgacgaaa ccgtgattgt tatcagaact 1080
ccaactagtg agggtttgat tacaaccacc actgaaccat ggactggtac tttcacttct 1140
acatctactg agatgaccac cgtcaccggt actaacagtc aaccaactga tgaaaccgtt 1200
attgttatca gaactccaac tagtgagggt ttgattacaa ccaccactga accatggact 1260
ggcactttca cttctacatc tactgagatg accaccgtca ccggtactaa cggtcaacca 1320
actgacgaaa ccgtgattgt tatcagaact ccaaccagtg aaggtttgat tacaaccacc 1380
actgaagcat ggactggtac tttcacttct acatctactg agatgaccac cgtcaccggt 1440
accaacggtc aaccaactga cgaaaccgtt attgttatca gaactccaac tagtgagggt 1500
ttgattacaa ccaccactga accatggact ggtactttca cctctacgtc tactgagatg 1560
accaccgtca ccggtactaa cggtcaacca actgatgaaa ccgttattgt tatcagaact 1620
ccaaccagtg aaggtctaat cagcaccacc actgaaccat ggactggcac tttcacttct 1680
acatctactg agatgaccac cgtcaccggt accaacggtc aaccaactga tgaaaccgtg 1740
attgttatca gaactccaac cagtgaaggt ttgattacaa ccaccactga accatggaat 1800
ggcactttca cttcgacttc cactgaggtt accaccatca ctggaaccaa cggtcaacca 1860
actgacgaaa ctgtgattgt cattagaact ccaactagtg agggtttgat tactacaact 1920
accgaaccat ggactggtac tttcacttct acatctactg aggttaccac cgtcaccggt 1980
actaatggtc aaccaactga cgaaaccgtt attgttatca gaactccaac tagtgagggt 2040
ttgattacaa ccgccactga accatggact ggtactttca cttctacatc tactgagatg 2100
accaccgtca ccggtactaa cggtcaacca actgatgaaa ccgttattgt tatcagaact 2160
ccaaccagtg aaggtttgat tacaaccacc actgaaccat ggaatggcac tttcacttcg 2220
acttccactg aggttaccac catcactgga accaacggtc aaccaactga cgaaactgtg 2280
attgtcatta gaactccaac tagtgagggt ttgattacta caactaccga accatggact 2340
ggtactttca cttctacatc tactgaggtt accaccgtca ccggtaccaa cggtcaacca 2400
actgacgaaa ccgttattgt tatcagaact ccaactagtg agggtttgat tacaaccacc 2460
accgaaccat ggactggcac tttcacttcg acttccactg agatgaccac catcaccggt 2520
accaacggtc aaccaactga cgaaactgtg attgtcatta gaactccaac tagtgagggt 2580
ttgattacaa ccaccactga accatggact ggtactttca cttctacatc tactgaggtt 2640
accaccgtca ccggtaccaa cggtcaacca actgacgaaa ccgttattgt tatcagaact 2700
ccaactagtg agggtttgat tacaaccacc actgaaccat ggactggcac tttcacttct 2760
acatctactg agatgaccac cgtcaccggt actaacggtc aaccaactga tgaaactgtg 2820
attgttatca gaactccaac cagtgaaggt ttggttacaa ccaccactga accatggaat 2880
ggtactttca cttctacatc tactgagatg accaccgtca ccggtaccaa cggtcaacca 2940
actgacgaaa ccgtgattgt tatcagaact ccaaccagtg aaggtttggt tgcaaccacc 3000
actgaaccat gggctggcac tttcacctct acatccactg agatgaccac cgtcaccggt 3060
actaacggtc aaccaactga cgaaaccgtg attgttatca gaactccaac cagtgaaggt 3120
ttggttgcaa ccaccactga accatggact ggcactttca cctctacatc cactgagatg 3180
accaccgtca ccggtactaa cggtcaacca actgacgaaa ccgtgattgt tatcagaact 3240
ccaaccagtg aaggtttggt tgcaaccacc actgaaccat ggactggcac tttcacctct 3300
acatccactg agatgaccac catcaccggt actaatggtc aaccaactga cgaaaccgtt 3360
attgttatca gaactccaac tagtgagggt ttgattacaa ccaccaccga accatggact 3420
ggcactttca cttcgacttc cactgagatg accaccatca ccggtaccaa cggtcaacca 3480
actgacgaag ctgtgattgt cattagaact ccaactagtg agggtttggt tactacaact 3540
accgaaccat ggactggtac tttcacttcg acttccactg ggatgaccac cgtcaccggt 3600
actaacggtc aaccaactga cgaaaccgtg attgttatca gaactccaac cagtgaaggt 3660
ttggttacaa ccaccactga accatggact ggtactttta cttcgacttc cactgaaatg 3720
tctactgtca ctggaaccaa tggcttgcca actgatgaaa ctgtcattgt tgtcaaaact 3780
ccaactactg ccatctcatc cagtttgtca tcatcatctt caggacaaat caccagctct 3840
atcacgtctt cgcgtccaat tattacccca ttctatccta gcaatggaac ttctgtgatt 3900
tcttcctcag taatttcttc ctcagtcact tcttctctag tcacttcatc tccagtcatt 3960
tcttcttcat tcatttcttc ctctgtcatt tcttcctcta caacaacctc cgcttctata 4020
ttctctgaat catctaaatc atccgtcatt ccaaccagta gttccacctc tggttcttct 4080
gagagcgaaa cgagttcagc tagttctgcc tcttcttcct cttctatctc ttctgaatca 4140
ccaaagtcta catattcgtc ttcatcatta ccacctgtta ccagtgcaac aacaagtcag 4200
gaaattactt cttccttacc acctgttacc agtgcgacag caagccagga aactgcttct 4260
tcattaccac ctgctaccac tacaaaaacg agcgaacaaa ccactttggt taccgtgaca 4320
tcctgcgaat ctcatgtgtg cactgaatcc atctcctctg cgattgtttc cacggccacc 4380
gttactgtta gcggcgtcac aacagagtat accacatggt gccctatttc taccacagag 4440
acaacaagac aaaccaaagg gacaacagag caaaccacag aaacaacaaa acaaaccacg 4500
gtagttacaa tttcttcttg tgaatctgac atatgctcta aaactgcttc tccagccatt 4560
gtgtctacaa gcactgctac tattaacggc gttaccacgg aatacacaac atggtgtcct 4620
atttccacca cagaatcgaa gcaacaaact acgctagtta ctgttacttc ctgcgaatct 4680
ggtgtgtgtt ccgaaactgc ttcacctgcc attgtttcga cggccacggc tactgtgaat 4740
gatgttgtta cggtctatcc tacatggaga ccacagacta cgaatgaaga gtctgtcagc 4800
tctaaaatga acagtgctac cagtgagaca acaaccaata ctgtagctgc tgaaacgact 4860
accaatactg gagctgctga gacaactacc agtactggag ctgctgagac gaaaacagta 4920
gtcacctctt cgctttcaag atctaatcac gctgaaacac agacggcttc cgcgaccgat 4980
gtgattggtc acagcagtag tgttgtttct gtatccgaaa ctggcaacac caagagtcta 5040
acaagttccg ggttgagtac tatgtcgcaa cagcctcgta gcacaccagc aagtagcatg 5100
gtaggatcta gtacagcttc tttagaaatt tcaacgtatg ctggcagtgc caacagctta 5160
ctggccggta gtggtttaag tgtcttcatt gcgtccttat tgctggcaat tatttaa 5217
<210>4
<211>1738
<212>PRT
<213>Flocculating yeast
<400>4
Met Thr Met Pro His Arg Tyr Met Phe Leu Ala Val Phe Thr Leu Leu
1 5 10 15
Ala Leu Ile Asn Val Ala Ser Gly Ala Thr Glu Ala Cys Leu Pro Ala
20 25 30
Gly Gln Arg Lys Ser Gly Met Asn Ile Asn Phe Tyr Gln Tyr Ser Leu
35 40 45
Lys Asp Ser Ser Thr Tyr Ser Asn Ala Ala Tyr Met Ala Tyr Gly Tyr
50 55 60
Ala Ser Lys Thr Lys Leu Gly Ser Val Gly Gly Gln Thr Asp Ile Ser
65 70 75 80
Ile Asp Tyr Asn Ile Pro Cys Val Ser Ser Ser Gly Thr Phe Pro Cys
85 90 95
Pro Gln Glu Asp Ser Tyr Gly Asn Trp Gly Cys Lys Gly Met Gly Ala
100 105 110
Cys Ser Asn Ser Gln Gly Ile Ala Tyr Trp Ser Thr Asp Leu Phe Gly
115 120 125
Phe Tyr Thr Thr Pro Thr Asn Val Thr Leu Glu Met Thr Gly Tyr Phe
130 135 140
Leu Pro Pro Gln Thr Gly Ser Tyr Thr Phe Lys Phe Ala Thr Val Asp
145 150 155 160
Asp Ser Ala Ile Leu Ser Val Gly Gly Ala Thr Ala Phe Asp Cys Cys
165 170 175
Ala Gln Gln Gln Pro Pro Ile Thr Ser Thr Asn Phe Thr Ile Asn Gly
180 185 190
Ile Lys Pro Trp Gly Gly Ser Leu Pro Pro Asn Val Glu Gly Thr Val
195 200 205
Tyr Met Tyr Ala Gly Phe Tyr Tyr Pro Met Lys Val Val Tyr Ser Asn
210 215 220
Ala Val Ser Trp Gly Thr Leu Pro Ile Ser Val Thr Leu Pro Asp Gly
225 230 235 240
Thr Ala Val Ser Asp Asp Phe Glu Gly Tyr Val Tyr Ser Phe Asp Asp
245 250 255
Asp Leu Thr Gln Ser Asp Cys Thr Ile Pro Asp Pro Ser Asn Tyr Thr
260 265 270
Ile Ala Gly Leu Ile Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe
275 280 285
Thr Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Ser Gln
290 295 300
Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu Gly
305 310 315 320
Leu Ile Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr Ser Thr
325 330 335
Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Gln Pro Thr Asp
340 345 350
Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu Gly Leu Ile Thr
355 360 365
Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser Thr Glu
370 375 380
Met Thr Thr Val Thr Gly Thr Asn Ser Gln Pro Thr Asp Glu Thr Val
385 390 395 400
Ile Val Ile Arg Thr Pro Thr Ser Glu Gly Leu Ile Thr Thr Thr Thr
405 410 415
Glu Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser Thr Glu Met Thr Thr
420 425 430
Val Thr Gly Thr Asn Gly Gln Pro Thr Asp Glu Thr Val Ile Val Ile
435 440 445
Arg Thr Pro Thr Ser Glu Gly Leu Ile Thr Thr Thr Thr Glu Ala Trp
450 455 460
Thr Gly Thr Phe Thr Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly
465 470 475 480
Thr Asn Gly Gln Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro
485 490 495
Thr Ser Glu Gly Leu Ile Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr
500 505 510
Phe Thr Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly
515 520 525
Gln Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu
530 535 540
Gly Leu Ile Ser Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr Ser
545 550 555 560
Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Gln Pro Thr
565 570 575
Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu Gly Leu Ile
580 585 590
Thr Thr Thr Thr Glu Pro Trp Asn Gly Thr Phe Thr Ser Thr Ser Thr
595 600 605
Glu Val Thr Thr Ile Thr Gly Thr Asn Gly Gln Pro Thr Asp Glu Thr
610 615 620
Val Ile Val Ile Arg Thr Pro Thr Ser Glu Gly Leu Ile Thr Thr Thr
625 630 635 640
Thr Glu Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser Thr Glu Val Thr
645 650 655
Thr Val Thr Gly Thr Asn Gly Gln Pro Thr Asp Glu Thr Val Ile Val
660 665 670
Ile Arg Thr Pro Thr Ser Glu Gly Leu Ile Thr Thr Ala Thr Glu Pro
675 680 685
Trp Thr Gly Thr Phe Thr Ser Thr Ser Thr Glu Met Thr Thr Val Thr
690 695 700
Gly Thr Asn Gly Gln Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr
705 710 715 720
Pro Thr Ser Glu Gly Leu Ile Thr Thr Thr Thr Glu Pro Trp Asn Gly
725 730 735
Thr Phe Thr Ser Thr Ser Thr Glu Val Thr Thr Ile Thr Gly Thr Asn
740 745 750
Gly Gln Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser
755 760 765
Glu Gly Leu Ile Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
770 775 780
Ser Thr Ser Thr Glu Val Thr Thr Val Thr Gly Thr Asn Gly Gln Pro
785 790 795 800
Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu Gly Leu
805 810 815
Ile Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser
820 825 830
Thr Glu Met Thr Thr Ile Thr Gly Thr Asn Gly Gln Pro Thr Asp Glu
835 840 845
Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu Gly Leu Ile Thr Thr
850 855 860
Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser Thr Glu Val
865 870 875 880
Thr Thr Val Thr Gly Thr Asn Gly Gln Pro Thr Asp Glu Thr Val Ile
885 890 895
Val Ile Arg Thr Pro Thr Ser Glu Gly Leu Ile Thr Thr Thr Thr Glu
900 905 910
Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser Thr Glu Met Thr Thr Val
915 920 925
Thr Gly Thr Asn Gly Gln Pro Thr Asp Glu Thr Val Ile Val Ile Arg
930 935 940
Thr Pro Thr Ser Glu Gly Leu Val Thr Thr Thr Thr Glu Pro Trp Asn
945 950 955 960
Gly Thr Phe Thr Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr
965 970 975
Asn Gly Gln Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr
980 985 990
Ser Glu Gly Leu Val Ala Thr Thr Thr Glu Pro Trp Ala Gly Thr Phe
995 1000 1005
Thr Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly
1010 1015 1020
Gln Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser
1025 1030 1035
Glu Gly Leu Val Ala Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe
1040 1045 1050
Thr Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly
1055 1060 1065
Gln Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser
1070 1075 1080
Glu Gly Leu Val Ala Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe
1085 1090 1095
Thr Ser Thr Ser Thr Glu Met Thr Thr Ile Thr Gly Thr Asn Gly
1100 1105 1110
Gln Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser
1115 1120 1125
Glu Gly Leu Ile Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe
1130 1135 1140
Thr Ser Thr Ser Thr Glu Met Thr Thr Ile Thr Gly Thr Asn Gly
1145 1150 1155
Gln Pro Thr Asp Glu Ala Val Ile Val Ile Arg Thr Pro Thr Ser
1160 1165 1170
Glu Gly Leu Val Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe
1175 1180 1185
Thr Ser Thr Ser Thr Gly Met Thr Thr Val Thr Gly Thr Asn Gly
1190 1195 1200
Gln Pro Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser
1205 1210 1215
Glu Gly Leu Val Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe
1220 1225 1230
Thr Ser Thr Ser Thr Glu Met Ser Thr Val Thr Gly Thr Asn Gly
1235 1240 1245
Leu Pro Thr Asp Glu Thr Val Ile Val Val Lys Thr Pro Thr Thr
1250 1255 1260
Ala Ile Ser Ser Ser Leu Ser Ser Ser Ser Ser Gly Gln Ile Thr
1265 1270 1275
Ser Ser Ile Thr Ser Ser Arg Pro Ile Ile Thr Pro Phe Tyr Pro
1280 1285 1290
Ser Asn Gly Thr Ser Val Ile Ser Ser Ser Val Ile Ser Ser Ser
1295 1300 1305
Val Thr Ser Ser Leu Val Thr Ser Ser Pro Val Ile Ser Ser Ser
1310 1315 1320
Phe Ile Ser Ser Ser Val Ile Ser Ser Ser Thr Thr Thr Ser Ala
1325 1330 1335
Ser Ile Phe Ser Glu Ser Ser Lys Ser Ser Val Ile Pro Thr Ser
1340 1345 1350
Ser Ser Thr Ser Gly Ser Ser Glu Ser Glu Thr Ser Ser Ala Ser
1355 1360 1365
Ser Ala Ser Ser Ser Ser Ser Ile Ser Ser Glu Ser Pro Lys Ser
1370 1375 1380
Thr Tyr Ser Ser Ser Ser Leu Pro Pro Val Thr Ser Ala Thr Thr
1385 1390 1395
Ser Gln Glu Ile Thr Ser Ser Leu Pro Pro Val Thr Ser Ala Thr
1400 1405 1410
Ala Ser Gln Glu Thr Ala Ser Ser Leu Pro Pro Ala Thr Thr Thr
1415 1420 1425
Lys Thr Ser Glu Gln Thr Thr Leu Val Thr Val Thr Ser Cys Glu
1430 1435 1440
Ser His Val Cys Thr Glu Ser Ile Ser Ser Ala Ile Val Ser Thr
1445 1450 1455
Ala Thr Val Thr Val Ser Gly Val Thr Thr Glu Tyr Thr Thr Trp
1460 1465 1470
Cys Pro Ile Ser Thr Thr Glu Thr Thr Arg Gln Thr Lys Gly Thr
1475 1480 1485
Thr Glu Gln Thr Thr Glu Thr Thr Lys Gln Thr Thr Val Val Thr
1490 1495 1500
Ile Ser Ser Cys Glu Ser Asp Ile Cys Ser Lys Thr Ala Ser Pro
1505 1510 1515
Ala Ile Val Ser Thr Ser Thr Ala Thr Ile Asn Gly Val Thr Thr
1520 1525 1530
Glu Tyr Thr Thr Trp Cys Pro Ile Ser Thr Thr Glu Ser Lys Gln
1535 1540 1545
Gln Thr Thr Leu Val Thr Val Thr Ser Cys Glu Ser Gly Val Cys
1550 1555 1560
Ser Glu Thr Ala Ser Pro Ala Ile Val Ser Thr Ala Thr Ala Thr
1565 1570 1575
Val Asn Asp Val Val Thr Val Tyr Pro Thr Trp Arg Pro Gln Thr
1580 1585 1590
Thr Asn Glu Glu Ser Val Ser Ser Lys Met Asn Ser Ala Thr Ser
1595 1600 1605
Glu Thr Thr Thr Asn Thr Val Ala Ala Glu Thr Thr Thr Asn Thr
1610 1615 1620
Gly Ala Ala Glu Thr Thr Thr Ser Thr Gly Ala Ala Glu Thr Lys
1625 1630 1635
Thr Val Val Thr Ser Ser Leu Ser Arg Ser Asn His Ala Glu Thr
1640 1645 1650
Gln Thr Ala Ser Ala Thr Asp Val Ile Gly His Ser Ser Ser Val
1655 1660 1665
Val Ser Val Ser Glu Thr Gly Asn Thr Lys Ser Leu Thr Ser Ser
1670 1675 1680
Gly Leu Ser Thr Met Ser Gln Gln Pro Arg Ser Thr Pro Ala Ser
1685 1690 1695
Ser Met Val Gly Ser Ser Thr Ala Ser Leu Glu Ile Ser Thr Tyr
1700 1705 1710
Ala Gly Ser Ala Asn Ser Leu Leu Ala Gly Ser Gly Leu Ser Val
1715 1720 1725
Phe Ile Ala Ser Leu Leu Leu Ala Ile Ile
1730 1735
<210>5
<211>29
<212>DNA
<213>Artificial sequence
<220>
<223>Primer
<400>5
gcggaattcc ctctggttct tctgagagc 29
<210>6
<211>27
<212>DNA
<213>Artificial sequence
<220>
<223>Primer
<400>6
gcgaagcttg taagctgttg gcactgc 27
<210>7
<211>26
<212>DNA
<213>Artificial sequence
<220>
<223>Primer
<400>7
ggcgaattcc ttgaaattag ctcggt 26
<210>8
<211>26
<212>DNA
<213>Artificial sequence
<220>
<223>Primer
<400>8
gcgaagcttg catatccata agccat 26
<210>9
<211>66
<212>DNA
<213>Artificial sequence
<220>
<223>Primer
<400>9
gacaatgcct catcgctata tgtttttggc agtctttaca cttctgacat ggaggcccag 60
aatacc 66
<210>10
<211>66
<212>DNA
<213>Artificial sequence
<220>
<223>Primer
<400>10
agttatgacg gtgttagcag ttgtaggtgt tttgacaaca atgaccagta tagcgaccag 60
cattca 66
<210>11
<211>34
<212>DNA
<213>Artificial sequence
<220>
<223>Primer
<400>11
ggcttaatta aatgacaatg cctcatcgct atat 34
<210>12
<211>36
<212>DNA
<213>Artificial sequence
<220>
<223>Primer
<400>12
taccatgtcg ctggttaaat aattgccagc aataag 36
<210>13
<211>28
<212>DNA
<213>Artificial sequence
<220>
<223>Primer
<400>13
ttggatccac tgtaattgct tttagttg 28
<210>14
<211>34
<212>DNA
<213>Artificial sequence
<220>
<223>Primer
<400>14
ggcttaatta atgttttata tttgttgtaa aaag 34
<210>15
<211>27
<212>DNA
<213>Artificial sequence
<220>
<223>Primer
<400>15
aaggatccga ggacggttgc tgaagaa 27
<210>16
<211>32
<212>DNA
<213>Artificial sequence
<220>
<223>Primer
<400>16
gcgttaatta aagttctatg tcttaataag tc 32
<210>17
<211>18
<212>DNA
<213>Artificial sequence
<220>
<223>Primer
<400>17
atgctatgat gcccactg 18
<210>18
<211>18
<212>DNA
<213>Artificial sequence
<220>
<223>Primer
<400>18
aatacacgta tccctcga 18
<210>19
<211>18
<212>DNA
<213>Artificial sequence
<220>
<223>Primer
<400>19
tcttcgtgct cttgttgc 18
<210>20
<211>18
<212>DNA
<213>Artificial sequence
<220>
<223>Primer
<400>20
tttccagggt tacgtttg 18
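The listing gives SEQ ID NO:3 as 5217 nucleotides and SEQ ID NO:4 as 1738 residues. As a rough consistency check, the sketch below translates the first and last printed lines of SEQ ID NO:3 and compares the declared lengths. It assumes the standard genetic code, which the listing does not name; the helper `translate` and the constant names are hypothetical and only serve this illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered t, c, a, g.
# Using this table is an assumption; the listing does not name a code table.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string to one-letter amino acids ('*' = stop)."""
    seq = "".join(dna.split()).lower()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First line (nt 1-60) and last line (nt 5161-5217) of SEQ ID NO:3 as printed.
# The last line starts in frame because 5160 is divisible by 3.
SEQ3_FIRST_LINE = "atgacaatgc ctcatcgcta tatgtttttg gcagtcttta cacttctggc actaattaat"
SEQ3_LAST_LINE = "ctggccggta gtggtttaag tgtcttcatt gcgtccttat tgctggcaat tatttaa"

# Declared lengths from the <211> fields of SEQ ID NO:3 and SEQ ID NO:4.
SEQ3_LENGTH_NT = 5217
SEQ4_LENGTH_AA = 1738

n_terminus = translate(SEQ3_FIRST_LINE)
c_terminus = translate(SEQ3_LAST_LINE)
# One extra codon accounts for the terminal stop codon.
lengths_agree = SEQ3_LENGTH_NT == 3 * (SEQ4_LENGTH_AA + 1)

print(n_terminus)
print(c_terminus)
print(lengths_agree)
```

Under these assumptions the first string reads MTMPHRYMFLAVFTLLALIN, which corresponds to residues 1-20 of SEQ ID NO:4 (Met Thr Met Pro His Arg ...), and the second string ends in a stop codon after Ala Ile Ile.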
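Claims (9)
1. An isolated nucleic acid, characterized in that its sequence is as shown in SEQ ID NO:1 or SEQ ID NO:3.
2. A protein, characterized in that the amino acid sequence of the protein is as shown in SEQ ID NO:2 or SEQ ID NO:4.
3. A nucleic acid encoding the amino acid sequence of claim 2.
4. An expression vector containing the nucleic acid of claim 1 or 3.
5. The expression vector of claim 4, characterized in that the expression vector contains the TPS1 promoter or the PGK1 promoter.
6. A flocculating yeast (Saccharomyces cerevisiae) containing the expression vector of claim 4 or 5.
7. A method for obtaining the nucleic acid of claim 1, the nucleic acid being the full-length flocculation gene of a flocculating yeast, the method comprising the following steps:
(1) constructing, with a Fosmid vector, a flocculating yeast genomic library with inserts of about 35-40 kb;
(2) transfecting bacteria with the obtained library, plating, and, once the library has been verified as qualified, picking single clones from the plates and culturing them in medium;
(3) extracting DNA from the cultured single clones, performing PCR amplification, and testing the PCR amplification products to obtain a positive clone containing the flocculation gene; and
(4) sequencing the positive clone to obtain the flocculation gene of the flocculating yeast.
8. A method for producing a flocculation protein, the method comprising:
constructing the expression vector of claim 4 or 5,
transforming a flocculating yeast with the expression vector, and
culturing the transformed flocculating yeast under conditions that allow it to express the flocculation protein, thereby producing the flocculation protein.
9. A Saccharomyces cerevisiae strain selected from those with deposit number CGMCC NO:3408 or CGMCC NO:3409.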
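Steps (1) and (2) of claim 7 rely on a Fosmid library with inserts of about 35-40 kb being large enough to contain the flocculation gene. A common way to estimate the required number of clones is the Clarke-Carbon formula, N = ln(1 - P) / ln(1 - f), where f is the insert size divided by the genome size. The sketch below is illustrative only: the genome size (about 12.1 Mb for haploid S. cerevisiae) and the target probability P = 0.99 are assumptions not stated in the claims, and `clones_needed` is a hypothetical helper name. The formula also assumes random, unbiased cloning of inserts.

```python
import math


def clones_needed(genome_bp: float, insert_bp: float, probability: float) -> int:
    """Clarke-Carbon estimate: N = ln(1 - P) / ln(1 - insert/genome), rounded up."""
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


# Assumed values, not taken from the patent:
GENOME_BP = 12.1e6   # approximate haploid S. cerevisiae genome size
PROBABILITY = 0.99   # desired chance that a given locus is in the library

# Insert sizes at the two ends of the 35-40 kb range named in claim 7.
clones_35kb = clones_needed(GENOME_BP, 35_000, PROBABILITY)
clones_40kb = clones_needed(GENOME_BP, 40_000, PROBABILITY)

print(clones_35kb, clones_40kb)
```

Larger inserts reduce the number of clones that must be screened by PCR in step (3), which is one practical reason to prefer a large-insert vector for a long, repeat-rich gene.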
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 200910200097 CN102086455B (zh) | 2009-12-08 | 2009-12-08 | Flocculation gene of flocculating yeast, its expression product, and use thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102086455A (zh) | 2011-06-08 |
CN102086455B (zh) | 2013-04-17 |
Family
ID=44098460
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN 200910200097 Expired - Fee Related CN102086455B (zh) | Flocculation gene of flocculating yeast, its expression product, and use thereof | 2009-12-08 | 2009-12-08 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102086455B (zh) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
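CN102690836B (zh) * | 2012-06-12 | 2014-05-07 | Dalian University of Technology | Construction of transgenic flocculating microalgae and its application in microalgae harvesting |
JP5694426B2 (ja) * | 2013-05-09 | 2015-04-01 | Asahi Group Holdings, Ltd. | Novel sucrose non-assimilating flocculent yeast |
CN104974945B (zh) * | 2015-06-26 | 2018-05-04 | PetroChina Company Limited | Saccharomyces cerevisiae overexpressing the mig1 gene, and preparation method and application thereof |
CN110643515A (zh) * | 2019-01-02 | 2020-01-03 | Shenyang University of Chemical Technology | Construction of transgenic flocculating microalgae and preparation method thereof |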
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
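CN1352241A (zh) * | 2001-12-04 | 2002-06-05 | Dalian University of Technology | A self-flocculating yeast strain for ethanol fermentation |
CN101045905A (zh) * | 2006-03-30 | 2007-10-03 | Dalian University of Technology | Self-flocculating yeast variant strain obtained by acclimation and breeding, and application thereof |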
Non-Patent Citations (2)
Title |
---|
Verstrepen, K.J. et al. Intragenic tandem repeats generate functional variability. Nat. Genet. 2005, vol. 37, no. 9, 986-990. * |
Watari, J. et al. Molecular cloning and analysis of the yeast flocculation gene FLO1. Yeast. 1994, vol. 10, no. 2, 211-225. * |
Also Published As
Publication number | Publication date |
---|---|
CN102086455A (zh) | 2011-06-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
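CN103571762B (zh) | Recombinant gene expression system of Mortierella alpina, and construction method and application thereof | |
CN109477115A (zh) | Expression system for eukaryotic organisms | |
CN101993482A (zh) | Protein associated with long grain and rolled leaf in rice, and encoding gene and application thereof | |
CN102086455B (zh) | Flocculation gene of flocculating yeast, its expression product, and use thereof | |
CN110028566A (zh) | Application of GhPRXR1 protein and its encoding gene in regulating cottonseed oil content | |
CN103014053A (zh) | High-efficiency double homologous recombination vector for Synechocystis, and construction method and application thereof | |
CN101748069B (zh) | A recombinant cyanobacterium | |
US20240124858A1 (en) | Recombinant algae having high lipid productivity | |
CN105238797B (zh) | Mutant gene of the gshF gene of Streptococcus agalactiae and application thereof | |
CN109266566A (zh) | Method for regulating osmotic stress in Torulopsis glabrata | |
CN101469325A (zh) | Method for secretory expression of exo-inulinase from Kluyveromyces marxianus | |
CN109134662A (zh) | A visualizable antimicrobial peptide fusion protein, and preparation method and application thereof | |
CN102234620A (zh) | DNA sequence, recombinant vector, single and double auxotrophic Hansenula polymorpha, and preparation method thereof | |
CN113736806B (zh) | Gene for improving lipid synthesis in marine Nannochloropsis and use thereof | |
CN102690836B (zh) | Construction of transgenic flocculating microalgae and its application in microalgae harvesting | |
CN109022452A (zh) | Application of the moss lrr1 gene in improving salt tolerance and anti-senescence performance of moss | |
CN107988290A (zh) | A biological method for increasing glutathione accumulation | |
CN103642831A (zh) | Method for constructing a eukaryotic expression vector for marine coccolithophores | |
CN102373189B (zh) | Protein related to fatty acid synthesis, and encoding gene and application thereof | |
CN104672314B (zh) | A recombinant archaerhodopsin 4 protein, and preparation method and application thereof | |
CN103649318B (zh) | Method for accelerating plant growth and increasing yield by altering expression levels of kinases and phosphatases | |
CN101922047A (zh) | GAP promoter library and use thereof | |
CN112481235B (zh) | Application of Trdrs2 protein and related biological materials in regulating the protein synthesis and secretion capacity of Trichoderma reesei | |
CN110003316A (zh) | Amino acid sequence encoding a sugar transporter, mutant, and application | |
CN102146363A (zh) | A novel glucanase, and its encoding gene and application |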
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20130417 Termination date: 20141208 |
EXPY | Termination of patent right or utility model | ||