MXPA99007257A - Novel compounds - Google Patents
Novel compounds
Info
- Publication number
- MXPA99007257A (also MXPA/A/1999/007257A, MX9907257A)
- Authority
- MX
- Mexico
- Prior art keywords
- leu
- gly
- arg
- val
- clavama
- Prior art date
Links
- 150000001875 compounds Chemical class 0.000 title description 3
- HZZVJAQRINQKSD-PBFISZAISA-N Clavulanic acid Chemical compound OC(=O)[C@H]1C(=C/CO)/O[C@@H]2CC(=O)N21 HZZVJAQRINQKSD-PBFISZAISA-N 0.000 claims abstract description 43
- 229960003324 Clavulanic Acid Drugs 0.000 claims abstract description 42
- 238000004519 manufacturing process Methods 0.000 claims abstract description 18
- 244000005700 microbiome Species 0.000 claims abstract description 16
- WSHJJCPTKWSMRR-RXMQYKEDSA-N penam Chemical compound S1CCN2C(=O)C[C@H]21 WSHJJCPTKWSMRR-RXMQYKEDSA-N 0.000 claims abstract description 12
- 238000000034 method Methods 0.000 claims abstract description 11
- 229920003013 deoxyribonucleic acid Polymers 0.000 claims description 40
- 241000187433 Streptomyces clavuligerus Species 0.000 claims description 27
- 230000037348 biosynthesis Effects 0.000 claims description 17
- 230000015572 biosynthetic process Effects 0.000 claims description 17
- 239000000203 mixture Substances 0.000 claims description 15
- 230000002950 deficient Effects 0.000 claims description 10
- 230000002829 reduced Effects 0.000 claims description 4
- 239000003782 beta lactam antibiotic agent Substances 0.000 claims description 3
- 239000002132 β-lactam antibiotic Substances 0.000 claims description 3
- 238000000855 fermentation Methods 0.000 claims description 2
- 230000004151 fermentation Effects 0.000 claims description 2
- ZLMJMSJWJFRBEC-UHFFFAOYSA-N potassium Chemical compound [K] ZLMJMSJWJFRBEC-UHFFFAOYSA-N 0.000 claims description 2
- 229910052700 potassium Inorganic materials 0.000 claims description 2
- 239000011591 potassium Substances 0.000 claims description 2
- 238000002360 preparation method Methods 0.000 claims description 2
- 239000011780 sodium chloride Substances 0.000 claims description 2
- 229960003022 amoxicillin Drugs 0.000 claims 3
- LSQZJLSUYDQPKJ-NJBDSQKTSA-N amoxicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=C(O)C=C1 LSQZJLSUYDQPKJ-NJBDSQKTSA-N 0.000 claims 3
- 159000000001 potassium salts Chemical class 0.000 claims 2
- 230000000875 corresponding Effects 0.000 claims 1
- 238000009877 rendering Methods 0.000 claims 1
- 150000003839 salts Chemical class 0.000 claims 1
- 230000001580 bacterial Effects 0.000 abstract description 3
- 229950006334 APRAMYCIN Drugs 0.000 description 18
- XZNUGFQTQHRASN-XQENGBIVSA-N Apramycin Chemical compound O([C@H]1O[C@@H]2[C@H](O)[C@@H]([C@H](O[C@H]2C[C@H]1N)O[C@@H]1[C@@H]([C@@H](O)[C@H](N)[C@@H](CO)O1)O)NC)[C@@H]1[C@@H](N)C[C@@H](N)[C@H](O)[C@H]1O XZNUGFQTQHRASN-XQENGBIVSA-N 0.000 description 18
- 238000004458 analytical method Methods 0.000 description 17
- 108020004707 nucleic acids Proteins 0.000 description 13
- 150000007523 nucleic acids Chemical class 0.000 description 13
- 229920001850 Nucleic acid sequence Polymers 0.000 description 12
- 150000001413 amino acids Chemical class 0.000 description 11
- 108010013835 arginine glutamate Proteins 0.000 description 11
- 230000003115 biocidal Effects 0.000 description 11
- 108010057821 leucylproline Proteins 0.000 description 11
- 229920001184 polypeptide Polymers 0.000 description 11
- 239000000047 product Substances 0.000 description 11
- 108010061238 threonyl-glycine Proteins 0.000 description 11
- OLIFSFOFKGKIRH-WUJLRWPWSA-N Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CN OLIFSFOFKGKIRH-WUJLRWPWSA-N 0.000 description 10
- LESXFEZIFXFIQR-LURJTMIESA-N Leu-Gly Chemical compound CC(C)C[C@H](N)C(=O)NCC(O)=O LESXFEZIFXFIQR-LURJTMIESA-N 0.000 description 10
- XGDCYUQSFDQISZ-BQBZGAKWSA-N Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O XGDCYUQSFDQISZ-BQBZGAKWSA-N 0.000 description 10
- 108010050848 glycylleucine Proteins 0.000 description 10
- 241000588724 Escherichia coli Species 0.000 description 9
- BCCRXDTUTZHDEU-VKHMYHEASA-N Gly-Ser Chemical compound NCC(=O)N[C@@H](CO)C(O)=O BCCRXDTUTZHDEU-VKHMYHEASA-N 0.000 description 9
- VTJUNIYRYIAIHF-IUCAKERBSA-N Leu-Pro Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O VTJUNIYRYIAIHF-IUCAKERBSA-N 0.000 description 9
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 9
- 108010038633 aspartylglutamate Proteins 0.000 description 9
- XUUXCWCKKCZEAW-YFKPBYRVSA-N 2-[[(2S)-2-amino-5-(diaminomethylideneamino)pentanoyl]amino]acetic acid Chemical compound OC(=O)CNC(=O)[C@@H](N)CCCN=C(N)N XUUXCWCKKCZEAW-YFKPBYRVSA-N 0.000 description 8
- IJYZHIOOBGIINM-WDSKDSINSA-N Arg-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N IJYZHIOOBGIINM-WDSKDSINSA-N 0.000 description 8
- JLXVRFDTDUGQEE-YFKPBYRVSA-N Gly-Arg Chemical compound NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N JLXVRFDTDUGQEE-YFKPBYRVSA-N 0.000 description 8
- LRKCBIUDWAXNEG-CSMHCCOUSA-N Leu-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LRKCBIUDWAXNEG-CSMHCCOUSA-N 0.000 description 8
- 239000002253 acid Substances 0.000 description 8
- 239000002609 media Substances 0.000 description 8
- 108010029020 prolylglycine Proteins 0.000 description 8
- 102000004169 proteins and genes Human genes 0.000 description 8
- 108090000623 proteins and genes Proteins 0.000 description 8
- XNSKSTRGQIPTSE-UHFFFAOYSA-N Arginyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CCCNC(N)=N XNSKSTRGQIPTSE-UHFFFAOYSA-N 0.000 description 7
- CKAJHWFHHFSCDT-WHFBIAKZSA-N Asp-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O CKAJHWFHHFSCDT-WHFBIAKZSA-N 0.000 description 7
- HFKJBCPRWWGPEY-BQBZGAKWSA-N L-arginyl-L-glutamic acid Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(O)=O HFKJBCPRWWGPEY-BQBZGAKWSA-N 0.000 description 7
- RNKSNIBMTUYWSH-YFKPBYRVSA-N L-prolylglycine Chemical compound [O-]C(=O)CNC(=O)[C@@H]1CCC[NH2+]1 RNKSNIBMTUYWSH-YFKPBYRVSA-N 0.000 description 7
- 125000003275 alpha amino acid group Chemical group 0.000 description 7
- 239000003242 anti bacterial agent Substances 0.000 description 7
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 7
- 239000012228 culture supernatant Substances 0.000 description 7
- 108010089804 glycyl-threonine Proteins 0.000 description 7
- 108010037850 glycylvaline Proteins 0.000 description 7
- RZVAJINKPMORJF-UHFFFAOYSA-N p-acetaminophenol Chemical compound CC(=O)NC1=CC=C(O)C=C1 RZVAJINKPMORJF-UHFFFAOYSA-N 0.000 description 7
- YBAFDPFAUTYYRW-YUMQZZPRSA-N Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O YBAFDPFAUTYYRW-YUMQZZPRSA-N 0.000 description 6
- IEFJWDNGDZAYNZ-BYPYZUCNSA-N Gly-Glu Chemical compound NCC(=O)N[C@H](C(O)=O)CCC(O)=O IEFJWDNGDZAYNZ-BYPYZUCNSA-N 0.000 description 6
- DKEXFJVMVGETOO-LURJTMIESA-N Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CN DKEXFJVMVGETOO-LURJTMIESA-N 0.000 description 6
- 241000880493 Leptailurus serval Species 0.000 description 6
- SENJXOPIZNYLHU-IUCAKERBSA-N Leu-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CCCN=C(N)N SENJXOPIZNYLHU-IUCAKERBSA-N 0.000 description 6
- NFNVDJGXRFEYTK-YUMQZZPRSA-N Leu-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O NFNVDJGXRFEYTK-YUMQZZPRSA-N 0.000 description 6
- WOUIMBGNEUWXQG-VKHMYHEASA-N Ser-Gly Chemical compound OC[C@H](N)C(=O)NCC(O)=O WOUIMBGNEUWXQG-VKHMYHEASA-N 0.000 description 6
- 238000002105 Southern blotting Methods 0.000 description 6
- STTYIMSDIYISRG-WDSKDSINSA-N Val-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(O)=O STTYIMSDIYISRG-WDSKDSINSA-N 0.000 description 6
- GVRKWABULJAONN-UHFFFAOYSA-N Valyl-Threonine Chemical compound CC(C)C(N)C(=O)NC(C(C)O)C(O)=O GVRKWABULJAONN-UHFFFAOYSA-N 0.000 description 6
- 125000000539 amino acid group Chemical compound 0.000 description 6
- 108010047857 aspartylglycine Proteins 0.000 description 6
- 230000001809 detectable Effects 0.000 description 6
- 230000029087 digestion Effects 0.000 description 6
- 238000010494 dissociation reaction Methods 0.000 description 6
- 230000005593 dissociations Effects 0.000 description 6
- 125000000267 glycino group Chemical group [H]N([*])C([H])([H])C(=O)O[H] 0.000 description 6
- 108010000761 leucylarginine Proteins 0.000 description 6
- 108010053725 prolylvaline Proteins 0.000 description 6
- IOUPEELXVYPCPG-UHFFFAOYSA-N Val-Gly Chemical compound CC(C)C(N)C(=O)NCC(O)=O IOUPEELXVYPCPG-UHFFFAOYSA-N 0.000 description 6
- BUQICHWNXBIBOG-LMVFSUKVSA-N Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)N BUQICHWNXBIBOG-LMVFSUKVSA-N 0.000 description 5
- SITWEMZOJNKJCH-UHFFFAOYSA-N Alanyl-Arginine Chemical compound CC(N)C(=O)NC(C(O)=O)CCCNC(N)=N SITWEMZOJNKJCH-UHFFFAOYSA-N 0.000 description 5
- AVKUERGKIZMTKX-NJBDSQKTSA-N Ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 5
- 229940064005 Antibiotic throat preparations Drugs 0.000 description 5
- 229940083879 Antibiotics FOR TREATMENT OF HEMORRHOIDS AND ANAL FISSURES FOR TOPICAL USE Drugs 0.000 description 5
- 229940042052 Antibiotics for systemic use Drugs 0.000 description 5
- 229940042786 Antitubercular Antibiotics Drugs 0.000 description 5
- PSZNHSNIGMJYOZ-WDSKDSINSA-N Asp-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PSZNHSNIGMJYOZ-WDSKDSINSA-N 0.000 description 5
- 210000000349 Chromosomes Anatomy 0.000 description 5
- 229940093922 Gynecological Antibiotics Drugs 0.000 description 5
- XWOBNBRUDDUEEY-UWVGGRQHSA-N Leu-His Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 XWOBNBRUDDUEEY-UWVGGRQHSA-N 0.000 description 5
- 241000187747 Streptomyces Species 0.000 description 5
- VPZKQTYZIVOJDV-LMVFSUKVSA-N Thr-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(O)=O VPZKQTYZIVOJDV-LMVFSUKVSA-N 0.000 description 5
- IOWJRKAVLALBQB-IWGUZYHVSA-N Thr-Asp Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O IOWJRKAVLALBQB-IWGUZYHVSA-N 0.000 description 5
- BECPPKYKPSRKCP-ZDLURKLDSA-N Thr-Glu Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O BECPPKYKPSRKCP-ZDLURKLDSA-N 0.000 description 5
- WXVIGTAUZBUDPZ-DTLFHODZSA-N Thr-His Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 WXVIGTAUZBUDPZ-DTLFHODZSA-N 0.000 description 5
- 229940024982 Topical Antifungal Antibiotics Drugs 0.000 description 5
- HPYDSVWYXXKHRD-VIFPVBQESA-N Tyr-Gly Chemical compound [O-]C(=O)CNC(=O)[C@@H]([NH3+])CC1=CC=C(O)C=C1 HPYDSVWYXXKHRD-VIFPVBQESA-N 0.000 description 5
- ZQOOYCZQENFIMC-STQMWFEESA-N Tyr-His Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1N=CNC=1)C(O)=O)C1=CC=C(O)C=C1 ZQOOYCZQENFIMC-STQMWFEESA-N 0.000 description 5
- 229960000723 ampicillin Drugs 0.000 description 5
- 108010068380 arginylarginine Proteins 0.000 description 5
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 5
- 210000004027 cells Anatomy 0.000 description 5
- 238000010367 cloning Methods 0.000 description 5
- 230000000694 effects Effects 0.000 description 5
- 108010028295 histidylhistidine Proteins 0.000 description 5
- 229940079866 intestinal antibiotics Drugs 0.000 description 5
- 239000002773 nucleotide Substances 0.000 description 5
- 125000003729 nucleotide group Chemical group 0.000 description 5
- 229940005935 ophthalmologic Antibiotics Drugs 0.000 description 5
- QLROSWPKSBORFJ-BQBZGAKWSA-N Pro-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 QLROSWPKSBORFJ-BQBZGAKWSA-N 0.000 description 5
- 108010090894 prolylleucine Proteins 0.000 description 5
- 108010026333 seryl-proline Proteins 0.000 description 5
- 108010080629 tryptophan-leucine Proteins 0.000 description 5
- MPZWMIIOPAPAKE-UHFFFAOYSA-N 4-amino-5-[[1-carboxy-4-(diaminomethylideneamino)butyl]amino]-5-oxopentanoic acid Chemical compound OC(=O)CCC(N)C(=O)NC(C(O)=O)CCCN=C(N)N MPZWMIIOPAPAKE-UHFFFAOYSA-N 0.000 description 4
- WPWUFUBLGADILS-WDSKDSINSA-N Ala-Pro Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O WPWUFUBLGADILS-WDSKDSINSA-N 0.000 description 4
- FRYULLIZUDQONW-IMJSIDKUSA-N Asp-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O FRYULLIZUDQONW-IMJSIDKUSA-N 0.000 description 4
- JHFNSBBHKSZXKB-VKHMYHEASA-N Asp-Gly Chemical compound OC(=O)C[C@H](N)C(=O)NCC(O)=O JHFNSBBHKSZXKB-VKHMYHEASA-N 0.000 description 4
- UKGGPJNBONZZCM-WDSKDSINSA-N Aspartyl-L-proline Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O UKGGPJNBONZZCM-WDSKDSINSA-N 0.000 description 4
- NTQDELBZOMWXRS-UHFFFAOYSA-N Aspartyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CC(O)=O NTQDELBZOMWXRS-UHFFFAOYSA-N 0.000 description 4
- 108020004705 Codon Proteins 0.000 description 4
- 108090000790 Enzymes Proteins 0.000 description 4
- 102000004190 Enzymes Human genes 0.000 description 4
- KOSRFJWDECSPRO-WDSKDSINSA-N Glu-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(O)=O KOSRFJWDECSPRO-WDSKDSINSA-N 0.000 description 4
- MRVYVEQPNDSWLH-UHFFFAOYSA-N Glutaminyl-Valine Chemical compound CC(C)C(C(O)=O)NC(=O)C(N)CCC(N)=O MRVYVEQPNDSWLH-UHFFFAOYSA-N 0.000 description 4
- JBCLFWXMTIKCCB-VIFPVBQESA-N Gly-Phe Chemical compound NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-VIFPVBQESA-N 0.000 description 4
- NIKBMHGRNAPJFW-UHFFFAOYSA-N Histidinyl-Arginine Chemical compound NC(=N)NCCCC(C(O)=O)NC(=O)C(N)CC1=CN=CN1 NIKBMHGRNAPJFW-UHFFFAOYSA-N 0.000 description 4
- VYZAGTDAHUIRQA-WHFBIAKZSA-N L-alanyl-L-glutamic acid Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O VYZAGTDAHUIRQA-WHFBIAKZSA-N 0.000 description 4
- JXWLMUIXUXLIJR-QWRGUYRKSA-N Phe-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 JXWLMUIXUXLIJR-QWRGUYRKSA-N 0.000 description 4
- GLUBLISJVJFHQS-VIFPVBQESA-N Phe-Gly Chemical compound OC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 GLUBLISJVJFHQS-VIFPVBQESA-N 0.000 description 4
- ZKQOUHVVXABNDG-IUCAKERBSA-N Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 ZKQOUHVVXABNDG-IUCAKERBSA-N 0.000 description 4
- NFDYGNFETJVMSE-BQBZGAKWSA-N Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CO NFDYGNFETJVMSE-BQBZGAKWSA-N 0.000 description 4
- 235000019764 Soybean Meal Nutrition 0.000 description 4
- BIYXEUAFGLTAEM-WUJLRWPWSA-N Thr-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(O)=O BIYXEUAFGLTAEM-WUJLRWPWSA-N 0.000 description 4
- BQBCIBCLXBKYHW-CSMHCCOUSA-N Thr-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@@H]([NH3+])[C@@H](C)O BQBCIBCLXBKYHW-CSMHCCOUSA-N 0.000 description 4
- OBTCMSPFOITUIJ-FSPLSTOPSA-N Val-Asp Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O OBTCMSPFOITUIJ-FSPLSTOPSA-N 0.000 description 4
- XXDVDTMEVBYRPK-XPUUQOCRSA-N Val-Gln Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O XXDVDTMEVBYRPK-XPUUQOCRSA-N 0.000 description 4
- UPJONISHZRADBH-XPUUQOCRSA-N Val-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O UPJONISHZRADBH-XPUUQOCRSA-N 0.000 description 4
- GIAZPLMMQOERPN-YUMQZZPRSA-N Val-Pro Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(O)=O GIAZPLMMQOERPN-YUMQZZPRSA-N 0.000 description 4
- 108010087924 alanylproline Proteins 0.000 description 4
- 108010093581 aspartyl-proline Proteins 0.000 description 4
- 150000007942 carboxylates Chemical class 0.000 description 4
- 238000002474 experimental method Methods 0.000 description 4
- 108010078144 glutaminyl-glycine Proteins 0.000 description 4
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 4
- 238000009396 hybridization Methods 0.000 description 4
- 239000003999 initiator Substances 0.000 description 4
- 238000003780 insertion Methods 0.000 description 4
- 108010004914 prolylarginine Proteins 0.000 description 4
- 239000004455 soybean meal Substances 0.000 description 4
- 230000028070 sporulation Effects 0.000 description 4
- VNYDHJARLHNEGA-RYUDHWBXSA-N (2S)-1-[(2S)-2-azaniumyl-3-(4-hydroxyphenyl)propanoyl]pyrrolidine-2-carboxylate Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(O)=O)C1=CC=C(O)C=C1 VNYDHJARLHNEGA-RYUDHWBXSA-N 0.000 description 3
- CXISPYVYMQWFLE-VKHMYHEASA-N Ala-Gly Chemical compound C[C@H]([NH3+])C(=O)NCC([O-])=O CXISPYVYMQWFLE-VKHMYHEASA-N 0.000 description 3
- XZWXFWBHYRFLEF-FSPLSTOPSA-N Ala-His Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 XZWXFWBHYRFLEF-FSPLSTOPSA-N 0.000 description 3
- OMLWNBVRVJYMBQ-YUMQZZPRSA-N Arg-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O OMLWNBVRVJYMBQ-YUMQZZPRSA-N 0.000 description 3
- KLKHFFMNGWULBN-VKHMYHEASA-N Asn-Gly Chemical compound NC(=O)C[C@H](N)C(=O)NCC(O)=O KLKHFFMNGWULBN-VKHMYHEASA-N 0.000 description 3
- HSPSXROIMXIJQW-BQBZGAKWSA-N Asp-His Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 HSPSXROIMXIJQW-BQBZGAKWSA-N 0.000 description 3
- NALWOULWGHTVDA-UWVGGRQHSA-N Asp-Tyr Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NALWOULWGHTVDA-UWVGGRQHSA-N 0.000 description 3
- CPMKYMGGYUFOHS-FSPLSTOPSA-N Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(O)=O CPMKYMGGYUFOHS-FSPLSTOPSA-N 0.000 description 3
- ZVDPYSVOZFINEE-UHFFFAOYSA-N Aspartyl-Leucine Chemical compound CC(C)CC(C(O)=O)NC(=O)C(N)CC(O)=O ZVDPYSVOZFINEE-UHFFFAOYSA-N 0.000 description 3
- 102000004594 DNA Polymerase I Human genes 0.000 description 3
- 108010017826 DNA Polymerase I Proteins 0.000 description 3
- KFKWRHQBZQICHA-STQMWFEESA-N Leu-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 3
- KAKJTZWHIUWTTD-VQVTYTSYSA-N Met-Thr Chemical compound CSCC[C@H]([NH3+])C(=O)N[C@@H]([C@@H](C)O)C([O-])=O KAKJTZWHIUWTTD-VQVTYTSYSA-N 0.000 description 3
- 229920000272 Oligonucleotide Polymers 0.000 description 3
- IWIANZLCJVYEFX-RYUDHWBXSA-N Pro-Phe Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H]1NCCC1)C1=CC=CC=C1 IWIANZLCJVYEFX-RYUDHWBXSA-N 0.000 description 3
- WBAXJMCUFIXCNI-WDSKDSINSA-N Ser-Pro Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O WBAXJMCUFIXCNI-WDSKDSINSA-N 0.000 description 3
- ILVGMCVCQBJPSH-WDSKDSINSA-N Ser-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CO ILVGMCVCQBJPSH-WDSKDSINSA-N 0.000 description 3
- RZEQTVHJZCIUBT-UHFFFAOYSA-N Serinyl-Arginine Chemical compound OCC(N)C(=O)NC(C(O)=O)CCCNC(N)=N RZEQTVHJZCIUBT-UHFFFAOYSA-N 0.000 description 3
- DBMJMQXJHONAFJ-UHFFFAOYSA-M Sodium laurylsulphate Chemical compound [Na+].CCCCCCCCCCCCOS([O-])(=O)=O DBMJMQXJHONAFJ-UHFFFAOYSA-M 0.000 description 3
- HYLXOQURIOCKIH-VQVTYTSYSA-N Thr-Arg Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CCCNC(N)=N HYLXOQURIOCKIH-VQVTYTSYSA-N 0.000 description 3
- QOLYAJSZHIJCTO-VQVTYTSYSA-N Thr-Pro Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(O)=O QOLYAJSZHIJCTO-VQVTYTSYSA-N 0.000 description 3
- CKHWEVXPLJBEOZ-UHFFFAOYSA-N Threoninyl-Valine Chemical compound CC(C)C(C(O)=O)NC(=O)C(N)C(C)O CKHWEVXPLJBEOZ-UHFFFAOYSA-N 0.000 description 3
- PWIQCLSQVQBOQV-AAEUAGOBSA-N Trp-Glu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(O)=O)=CNC2=C1 PWIQCLSQVQBOQV-AAEUAGOBSA-N 0.000 description 3
- VEYJKJORLPYVLO-RYUDHWBXSA-N Val-Tyr Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 VEYJKJORLPYVLO-RYUDHWBXSA-N 0.000 description 3
- 108010047495 alanylglycine Proteins 0.000 description 3
- 108010043240 arginyl-leucyl-glycine Proteins 0.000 description 3
- 108010062796 arginyllysine Proteins 0.000 description 3
- 108010036533 arginylvaline Proteins 0.000 description 3
- 108010068265 aspartyltyrosine Proteins 0.000 description 3
- 230000002068 genetic Effects 0.000 description 3
- 108010049041 glutamylalanine Proteins 0.000 description 3
- KZNQNBZMBZJQJO-YFKPBYRVSA-N Gly-Pro Chemical compound NCC(=O)N1CCC[C@H]1C(O)=O KZNQNBZMBZJQJO-YFKPBYRVSA-N 0.000 description 3
- 108010000434 glycyl-alanyl-leucine Proteins 0.000 description 3
- 108010081551 glycylphenylalanine Proteins 0.000 description 3
- 108010077515 glycylproline Proteins 0.000 description 3
- 108010087823 glycyltyrosine Proteins 0.000 description 3
- 238000004128 high performance liquid chromatography Methods 0.000 description 3
- 108010092114 histidylphenylalanine Proteins 0.000 description 3
- 108010085325 histidylproline Proteins 0.000 description 3
- 108010044056 leucyl-phenylalanine Proteins 0.000 description 3
- 108010012581 phenylalanylglutamate Proteins 0.000 description 3
- 108010051242 phenylalanylserine Proteins 0.000 description 3
- 108010077112 prolyl-proline Proteins 0.000 description 3
- 108010070643 prolylglutamic acid Proteins 0.000 description 3
- 239000006152 selective media Substances 0.000 description 3
- 108010048818 seryl-histidine Proteins 0.000 description 3
- 238000004448 titration Methods 0.000 description 3
- 230000001131 transforming Effects 0.000 description 3
- 108010020532 tyrosyl-proline Proteins 0.000 description 3
- SITLTJHOQZFJGG-XPUUQOCRSA-N α-Glu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O SITLTJHOQZFJGG-XPUUQOCRSA-N 0.000 description 3
- LQJAALCCPOTJGB-YUMQZZPRSA-N (2S)-1-[(2S)-2-amino-5-(diaminomethylideneamino)pentanoyl]pyrrolidine-2-carboxylic acid Chemical compound NC(N)=NCCC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O LQJAALCCPOTJGB-YUMQZZPRSA-N 0.000 description 2
- IGROJMCBGRFRGI-YTLHQDLWSA-N (2S)-2-[[(2S)-2-[[(2S,3R)-2-amino-3-hydroxybutanoyl]amino]propanoyl]amino]propanoic acid Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O IGROJMCBGRFRGI-YTLHQDLWSA-N 0.000 description 2
- GJSURZIOUXUGAL-UHFFFAOYSA-N 2-((2,6-Dichlorophenyl)imino)imidazolidine Chemical compound ClC1=CC=CC(Cl)=C1NC1=NCCN1 GJSURZIOUXUGAL-UHFFFAOYSA-N 0.000 description 2
- HKTRDWYCAUTRRL-UHFFFAOYSA-N 4-amino-5-[[1-carboxy-2-(1H-imidazol-5-yl)ethyl]amino]-5-oxopentanoic acid Chemical compound OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CN=CN1 HKTRDWYCAUTRRL-UHFFFAOYSA-N 0.000 description 2
- XAEWTDMGFGHWFK-IMJSIDKUSA-N Ala-Asp Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O XAEWTDMGFGHWFK-IMJSIDKUSA-N 0.000 description 2
- RDIKFPRVLJLMER-BQBZGAKWSA-N Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)N RDIKFPRVLJLMER-BQBZGAKWSA-N 0.000 description 2
- OMNVYXHOSHNURL-WPRPVWTQSA-N Ala-Phe Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OMNVYXHOSHNURL-WPRPVWTQSA-N 0.000 description 2
- WVRUNFYJIHNFKD-WDSKDSINSA-N Arg-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N WVRUNFYJIHNFKD-WDSKDSINSA-N 0.000 description 2
- PMGDADKJMCOXHX-BQBZGAKWSA-N Arg-Gln Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(O)=O PMGDADKJMCOXHX-BQBZGAKWSA-N 0.000 description 2
- WYBVBIHNJWOLCJ-IUCAKERBSA-N Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCNC(N)=N WYBVBIHNJWOLCJ-IUCAKERBSA-N 0.000 description 2
- JQFZHHSQMKZLRU-IUCAKERBSA-N Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N JQFZHHSQMKZLRU-IUCAKERBSA-N 0.000 description 2
- QADCERNTBWTXFV-JSGCOSHPSA-N Arg-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCCNC(N)=N)N)C(O)=O)=CNC2=C1 QADCERNTBWTXFV-JSGCOSHPSA-N 0.000 description 2
- BNODVYXZAAXSHW-UHFFFAOYSA-N Arginyl-Histidine Chemical compound NC(=N)NCCCC(N)C(=O)NC(C(O)=O)CC1=CN=CN1 BNODVYXZAAXSHW-UHFFFAOYSA-N 0.000 description 2
- FYRVDDJMNISIKJ-UWVGGRQHSA-N Asn-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 FYRVDDJMNISIKJ-UWVGGRQHSA-N 0.000 description 2
- YZQCXOFQZKCETR-UWVGGRQHSA-N Asp-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 YZQCXOFQZKCETR-UWVGGRQHSA-N 0.000 description 2
- NPDLYUOYAGBHFB-UHFFFAOYSA-N Asparaginyl-Arginine Chemical compound NC(=O)CC(N)C(=O)NC(C(O)=O)CCCNC(N)=N NPDLYUOYAGBHFB-UHFFFAOYSA-N 0.000 description 2
- RGGVDKVXLBOLNS-UHFFFAOYSA-N Asparaginyl-Tryptophan Chemical compound C1=CC=C2C(CC(NC(=O)C(CC(N)=O)N)C(O)=O)=CNC2=C1 RGGVDKVXLBOLNS-UHFFFAOYSA-N 0.000 description 2
- RGTVXXNMOGHRAY-UHFFFAOYSA-N Cysteinyl-Arginine Chemical compound SCC(N)C(=O)NC(C(O)=O)CCCNC(N)=N RGTVXXNMOGHRAY-UHFFFAOYSA-N 0.000 description 2
- 238000001712 DNA sequencing Methods 0.000 description 2
- JEFZIKRIDLHOIF-BYPYZUCNSA-N Gln-Gly Chemical compound NC(=O)CC[C@H](N)C(=O)NCC(O)=O JEFZIKRIDLHOIF-BYPYZUCNSA-N 0.000 description 2
- JZDHUJAFXGNDSB-WHFBIAKZSA-N Glu-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O JZDHUJAFXGNDSB-WHFBIAKZSA-N 0.000 description 2
- LSPKYLAFTPBWIL-BYPYZUCNSA-N Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(O)=O LSPKYLAFTPBWIL-BYPYZUCNSA-N 0.000 description 2
- YBTCBQBIJKGSJP-BQBZGAKWSA-N Glu-Pro Chemical compound OC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O YBTCBQBIJKGSJP-BQBZGAKWSA-N 0.000 description 2
- JSIQVRIXMINMTA-ZDLURKLDSA-N Glu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O JSIQVRIXMINMTA-ZDLURKLDSA-N 0.000 description 2
- YSWHPLCDIMUKFE-QWRGUYRKSA-N Glu-Tyr Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 YSWHPLCDIMUKFE-QWRGUYRKSA-N 0.000 description 2
- FUESBOMYALLFNI-VKHMYHEASA-N Gly-Asn Chemical compound NCC(=O)N[C@H](C(O)=O)CC(N)=O FUESBOMYALLFNI-VKHMYHEASA-N 0.000 description 2
- SCCPDJAQCXWPTF-VKHMYHEASA-N Gly-Asp Chemical compound NCC(=O)N[C@H](C(O)=O)CC(O)=O SCCPDJAQCXWPTF-VKHMYHEASA-N 0.000 description 2
- YIWFXZNIBQBFHR-LURJTMIESA-N Gly-His Chemical compound [NH3+]CC(=O)N[C@H](C([O-])=O)CC1=CN=CN1 YIWFXZNIBQBFHR-LURJTMIESA-N 0.000 description 2
- IKAIKUBBJHFNBZ-LURJTMIESA-N Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CN IKAIKUBBJHFNBZ-LURJTMIESA-N 0.000 description 2
- PFMUCCYYAAFKTH-YFKPBYRVSA-N Gly-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)CN PFMUCCYYAAFKTH-YFKPBYRVSA-N 0.000 description 2
- AJHCSUXXECOXOY-NSHDSACASA-N Gly-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)CN)C(O)=O)=CNC2=C1 AJHCSUXXECOXOY-NSHDSACASA-N 0.000 description 2
- MMFKFJORZBJVNF-UWVGGRQHSA-N His-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CN=CN1 MMFKFJORZBJVNF-UWVGGRQHSA-N 0.000 description 2
- FBTYOQIYBULKEH-ZFWWWQNUSA-N His-Trp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CNC=N1 FBTYOQIYBULKEH-ZFWWWQNUSA-N 0.000 description 2
- WRPDZHJNLYNFFT-UHFFFAOYSA-N Histidinyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CC1=CN=CN1 WRPDZHJNLYNFFT-UHFFFAOYSA-N 0.000 description 2
- ZUKPVRWZDMRIEO-VKHMYHEASA-N L-cysteinylglycine zwitterion Chemical compound SC[C@H]([NH3+])C(=O)NCC([O-])=O ZUKPVRWZDMRIEO-VKHMYHEASA-N 0.000 description 2
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 2
- DVCSNHXRZUVYAM-BQBZGAKWSA-N Leu-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O DVCSNHXRZUVYAM-BQBZGAKWSA-N 0.000 description 2
- LHSGPCFBGJHPCY-STQMWFEESA-N Leu-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 LHSGPCFBGJHPCY-STQMWFEESA-N 0.000 description 2
- NPBGTPKLVJEOBE-IUCAKERBSA-N Lys-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCCNC(N)=N NPBGTPKLVJEOBE-IUCAKERBSA-N 0.000 description 2
- ADHNYKZHPOEULM-BQBZGAKWSA-N Met-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O ADHNYKZHPOEULM-BQBZGAKWSA-N 0.000 description 2
- IMTUWVJPCQPJEE-IUCAKERBSA-N Met-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN IMTUWVJPCQPJEE-IUCAKERBSA-N 0.000 description 2
- DZMGFGQBRYWJOR-YUMQZZPRSA-N Met-Pro Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O DZMGFGQBRYWJOR-YUMQZZPRSA-N 0.000 description 2
- WEDDFMCSUNNZJR-WDSKDSINSA-N Met-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(O)=O WEDDFMCSUNNZJR-WDSKDSINSA-N 0.000 description 2
- 108010079364 N-glycylalanine Proteins 0.000 description 2
- 101700036361 PCR2 Proteins 0.000 description 2
- WEQJQNWXCSUVMA-RYUDHWBXSA-N Phe-Pro Chemical compound C([C@H]([NH3+])C(=O)N1[C@@H](CCC1)C([O-])=O)C1=CC=CC=C1 WEQJQNWXCSUVMA-RYUDHWBXSA-N 0.000 description 2
- NYQBYASWHVRESG-MIMYLULJSA-N Phe-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 NYQBYASWHVRESG-MIMYLULJSA-N 0.000 description 2
- KNPVDQMEHSCAGX-UHFFFAOYSA-N Phenylalanyl-Cysteine Chemical compound SCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 KNPVDQMEHSCAGX-UHFFFAOYSA-N 0.000 description 2
- KLAONOISLHWJEE-UHFFFAOYSA-N Phenylalanyl-Glutamine Chemical compound NC(=O)CCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 KLAONOISLHWJEE-UHFFFAOYSA-N 0.000 description 2
- FELJDCNGZFDUNR-WDSKDSINSA-N Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 FELJDCNGZFDUNR-WDSKDSINSA-N 0.000 description 2
- SHAQGFGGJSLLHE-BQBZGAKWSA-N Pro-Gln Chemical compound NC(=O)CC[C@@H](C([O-])=O)NC(=O)[C@@H]1CCC[NH2+]1 SHAQGFGGJSLLHE-BQBZGAKWSA-N 0.000 description 2
- RVQDZELMXZRSSI-IUCAKERBSA-N Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 RVQDZELMXZRSSI-IUCAKERBSA-N 0.000 description 2
- AFWBWPCXSWUCLB-WDSKDSINSA-N Pro-Ser Chemical compound OC[C@@H](C([O-])=O)NC(=O)[C@@H]1CCC[NH2+]1 AFWBWPCXSWUCLB-WDSKDSINSA-N 0.000 description 2
- OIDKVWTWGDWMHY-RYUDHWBXSA-N Pro-Tyr Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H]1NCCC1)C1=CC=C(O)C=C1 OIDKVWTWGDWMHY-RYUDHWBXSA-N 0.000 description 2
- BEPSGCXDIVACBU-UHFFFAOYSA-N Prolyl-Histidine Chemical compound C1CCNC1C(=O)NC(C(=O)O)CC1=CN=CN1 BEPSGCXDIVACBU-UHFFFAOYSA-N 0.000 description 2
- GVUVRRPYYDHHGK-UHFFFAOYSA-N Prolyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C1CCCN1 GVUVRRPYYDHHGK-UHFFFAOYSA-N 0.000 description 2
- 102100002802 RAPH1 Human genes 0.000 description 2
- 101700052970 RAPH1 Proteins 0.000 description 2
- VBKBDLMWICBSCY-IMJSIDKUSA-N Ser-Asp Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O VBKBDLMWICBSCY-IMJSIDKUSA-N 0.000 description 2
- LAFKUZYWNCHOHT-WHFBIAKZSA-N Ser-Glu Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O LAFKUZYWNCHOHT-WHFBIAKZSA-N 0.000 description 2
- 229920000978 Start codon Polymers 0.000 description 2
- 241000424942 Streptomyces clavuligerus ATCC 27064 Species 0.000 description 2
- UYKREHOKELZSPB-JTQLQIEISA-N Trp-Gly Chemical compound C1=CC=C2C(C[C@H](N)C(=O)NCC(O)=O)=CNC2=C1 UYKREHOKELZSPB-JTQLQIEISA-N 0.000 description 2
- LYMVXFSTACVOLP-ZFWWWQNUSA-N Trp-Leu Chemical compound C1=CC=C2C(C[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C([O-])=O)=CNC2=C1 LYMVXFSTACVOLP-ZFWWWQNUSA-N 0.000 description 2
- MYVYPSWUSKCCHG-JQWIXIFHSA-N Trp-Ser Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CO)C(O)=O)=CNC2=C1 MYVYPSWUSKCCHG-JQWIXIFHSA-N 0.000 description 2
- UBAQSAUDKMIEQZ-QWRGUYRKSA-N Tyr-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 UBAQSAUDKMIEQZ-QWRGUYRKSA-N 0.000 description 2
- PDSLRCZINIDLMU-QWRGUYRKSA-N Tyr-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 PDSLRCZINIDLMU-QWRGUYRKSA-N 0.000 description 2
- CGWAPUBOXJWXMS-HOTGVXAUSA-N Tyr-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 CGWAPUBOXJWXMS-HOTGVXAUSA-N 0.000 description 2
- QZOSVNLXLSNHQK-UHFFFAOYSA-N Tyrosyl-Aspartate Chemical compound OC(=O)CC(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 QZOSVNLXLSNHQK-UHFFFAOYSA-N 0.000 description 2
- BNQVUHQWZGTIBX-IUCAKERBSA-N Val-His Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@H](C([O-])=O)CC1=CN=CN1 BNQVUHQWZGTIBX-IUCAKERBSA-N 0.000 description 2
- 108010008685 alanyl-glutamyl-aspartic acid Proteins 0.000 description 2
- 108010041407 alanylaspartic acid Proteins 0.000 description 2
- 108010070944 alanylhistidine Proteins 0.000 description 2
- 108010011559 alanylphenylalanine Proteins 0.000 description 2
- 108010050025 alpha-glutamyltryptophan Proteins 0.000 description 2
- 108010008355 arginyl-glutamine Proteins 0.000 description 2
- 108010060035 arginylproline Proteins 0.000 description 2
- 108010077245 asparaginyl-proline Proteins 0.000 description 2
- 238000007796 conventional method Methods 0.000 description 2
- 108010016616 cysteinylglycine Proteins 0.000 description 2
- 238000010353 genetic engineering Methods 0.000 description 2
- VPZXBVLAVMBEQI-VKHMYHEASA-N Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CN VPZXBVLAVMBEQI-VKHMYHEASA-N 0.000 description 2
- STKYPAFSDFAEPH-LURJTMIESA-N Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CN STKYPAFSDFAEPH-LURJTMIESA-N 0.000 description 2
- 108010033719 glycyl-histidyl-glycine Proteins 0.000 description 2
- 108010010147 glycylglutamine Proteins 0.000 description 2
- YMAWOPBAYDPSLA-UHFFFAOYSA-N glycylglycine zwitterion Chemical compound [NH3+]CC(=O)NCC([O-])=O YMAWOPBAYDPSLA-UHFFFAOYSA-N 0.000 description 2
- 108010015792 glycyllysine Proteins 0.000 description 2
- 108010036413 histidylglycine Proteins 0.000 description 2
- 108010025306 histidylleucine Proteins 0.000 description 2
- 108010018006 histidylserine Proteins 0.000 description 2
- 108010012058 leucyltyrosine Proteins 0.000 description 2
- OKKJLVBELUTLKV-UHFFFAOYSA-N methanol Chemical compound OC OKKJLVBELUTLKV-UHFFFAOYSA-N 0.000 description 2
- 108010056582 methionylglutamic acid Proteins 0.000 description 2
- 108010005942 methionylglycine Proteins 0.000 description 2
- 238000010369 molecular cloning Methods 0.000 description 2
- 230000036961 partial Effects 0.000 description 2
- 230000037361 pathway Effects 0.000 description 2
- 108010083476 phenylalanyltryptophan Proteins 0.000 description 2
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 2
- 108010031719 prolyl-serine Proteins 0.000 description 2
- 108010079317 prolyl-tyrosine Proteins 0.000 description 2
- 108091007521 restriction endonucleases Proteins 0.000 description 2
- 230000000717 retained Effects 0.000 description 2
- FAPWRFPIFSIZLT-UHFFFAOYSA-M sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 2
- 108010038745 tryptophylglycine Proteins 0.000 description 2
- 108010009962 valyltyrosine Proteins 0.000 description 2
- 101710041788 ygbT Proteins 0.000 description 2
- OABOXRPGTFRBFZ-IMJSIDKUSA-N (2R)-2-[[(2R)-2-amino-3-sulfanylpropanoyl]amino]-3-sulfanylpropanoic acid Chemical compound SC[C@H](N)C(=O)N[C@@H](CS)C(O)=O OABOXRPGTFRBFZ-IMJSIDKUSA-N 0.000 description 1
- CQGSYZCULZMEDE-SRVKXCTJSA-N (2S)-1-[(2S)-5-amino-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-5-oxopentanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(O)=O CQGSYZCULZMEDE-SRVKXCTJSA-N 0.000 description 1
- POTCZYQVVNXUIG-BQBZGAKWSA-N (2S)-1-[2-[[(2S)-2-amino-3-carboxypropanoyl]amino]acetyl]pyrrolidine-2-carboxylic acid Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N1CCC[C@H]1C(O)=O POTCZYQVVNXUIG-BQBZGAKWSA-N 0.000 description 1
- UCXQIIIFOOGYEM-ULQDDVLXSA-N (2S)-2-[[(2S)-1-[(2S)-2-amino-4-methylpentanoyl]pyrrolidine-2-carbonyl]amino]-3-(4-hydroxyphenyl)propanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 UCXQIIIFOOGYEM-ULQDDVLXSA-N 0.000 description 1
- VKVDRTGWLVZJOM-DCAQKATOSA-N (2S)-2-[[(2S)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-3-methylbutanoyl]amino]-3-hydroxypropanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O VKVDRTGWLVZJOM-DCAQKATOSA-N 0.000 description 1
- RXGLHDWAZQECBI-SRVKXCTJSA-N (2S)-2-[[(2S)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-4-methylpentanoyl]amino]-3-hydroxypropanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O RXGLHDWAZQECBI-SRVKXCTJSA-N 0.000 description 1
- YYSWCHMLFJLLBJ-ZLUOBGJFSA-N (2S)-2-[[(2S)-2-[[(2S)-2-aminopropanoyl]amino]propanoyl]amino]-3-hydroxypropanoic acid Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O YYSWCHMLFJLLBJ-ZLUOBGJFSA-N 0.000 description 1
- XMAUFHMAAVTODF-STQMWFEESA-N (2S)-2-[[(2S)-2-amino-3-(1H-imidazol-5-yl)propanoyl]amino]-3-phenylpropanoic acid Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CN=CN1 XMAUFHMAAVTODF-STQMWFEESA-N 0.000 description 1
- LZDNBBYBDGBADK-KBPBESRZSA-N (2S)-2-[[(2S)-2-amino-3-methylbutanoyl]amino]-3-(1H-indol-3-yl)propanoic acid Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)C(C)C)C(O)=O)=CNC2=C1 LZDNBBYBDGBADK-KBPBESRZSA-N 0.000 description 1
- ILDSIMPXNFWKLH-KATARQTJSA-N (2S)-2-[[(2S,3R)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-3-hydroxybutanoyl]amino]-3-hydroxypropanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O ILDSIMPXNFWKLH-KATARQTJSA-N 0.000 description 1
- CCQOOWAONKGYKQ-BYPYZUCNSA-N (2S)-2-[[2-[(2-azaniumylacetyl)amino]acetyl]amino]propanoate Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)CN CCQOOWAONKGYKQ-BYPYZUCNSA-N 0.000 description 1
- BTBUEVAGZCKULD-XPUUQOCRSA-N (2S)-2-[[2-[[(2S)-2-aminopropanoyl]amino]acetyl]amino]-3-(1H-imidazol-5-yl)propanoic acid Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CN=CN1 BTBUEVAGZCKULD-XPUUQOCRSA-N 0.000 description 1
- VRIUOZAEZDNGTN-HRNNMHKYSA-N (2S)-N-[5-[3-[4-[[(2S)-1-[[2-[[(2S)-2-amino-5-(diaminomethylideneamino)pentanoyl]amino]acetyl]amino]-1-oxopropan-2-yl]amino]butylamino]propanoylamino]pentyl]-2-[[2-(2,4-dihydroxyphenyl)acetyl]amino]butanediamide Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)NC(=O)[C@H](C)NCCCCNCCC(=O)NCCCCCNC(=O)[C@H](CC(N)=O)NC(=O)CC1=CC=C(O)C=C1O VRIUOZAEZDNGTN-HRNNMHKYSA-N 0.000 description 1
- QJVHTELASVOWBE-AGNWQMPPSA-N (2S,5R,6R)-6-[[(2R)-2-amino-2-(4-hydroxyphenyl)acetyl]amino]-3,3-dimethyl-7-oxo-4-thia-1-azabicyclo[3.2.0]heptane-2-carboxylic acid;(2R,3Z,5R)-3-(2-hydroxyethylidene)-7-oxo-4-oxa-1-azabicyclo[3.2.0]heptane-2-carboxylic acid Chemical compound OC(=O)[C@H]1C(=C/CO)/O[C@@H]2CC(=O)N21.C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=C(O)C=C1 QJVHTELASVOWBE-AGNWQMPPSA-N 0.000 description 1
- ULXYQAJWJGLCNR-YUMQZZPRSA-N (3S)-3-[[(2S)-2-amino-4-methylpentanoyl]amino]-4-(carboxymethylamino)-4-oxobutanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O ULXYQAJWJGLCNR-YUMQZZPRSA-N 0.000 description 1
- PVMPDMIKUVNOBD-CIUDSAMLSA-N (3S)-3-[[(2S)-2-amino-4-methylpentanoyl]amino]-4-[[(1S)-1-carboxy-2-hydroxyethyl]amino]-4-oxobutanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O PVMPDMIKUVNOBD-CIUDSAMLSA-N 0.000 description 1
- SOYWRINXUSUWEQ-DLOVCJGASA-N (4S)-4-amino-5-[[(2S)-1-[[(1S)-1-carboxy-2-methylpropyl]amino]-3-methyl-1-oxobutan-2-yl]amino]-5-oxopentanoic acid Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O SOYWRINXUSUWEQ-DLOVCJGASA-N 0.000 description 1
- BUXAPSQPMALTOY-UHFFFAOYSA-N 2-[(2-amino-3-sulfanylpropanoyl)amino]pentanedioic acid Chemical compound SCC(N)C(=O)NC(C(O)=O)CCC(O)=O BUXAPSQPMALTOY-UHFFFAOYSA-N 0.000 description 1
- SNFUTDLOCQQRQD-UHFFFAOYSA-N 2-[(2-amino-4-carboxybutanoyl)amino]-3-methylpentanoic acid Chemical compound CCC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SNFUTDLOCQQRQD-UHFFFAOYSA-N 0.000 description 1
- AAKRWBIIGKPOKQ-ONGXEEELSA-N 2-[[(2S)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-3-methylbutanoyl]amino]acetic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O AAKRWBIIGKPOKQ-ONGXEEELSA-N 0.000 description 1
- WNGVUZWBXZKQES-YUMQZZPRSA-N 2-[[(2S)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]propanoyl]amino]acetic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O WNGVUZWBXZKQES-YUMQZZPRSA-N 0.000 description 1
- YOKVEHGYYQEQOP-QWRGUYRKSA-N 2-[[(2S)-2-[[(2S)-2-azaniumyl-4-methylpentanoyl]amino]-4-methylpentanoyl]amino]acetate Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YOKVEHGYYQEQOP-QWRGUYRKSA-N 0.000 description 1
- VWHGTYCRDRBSFI-ZETCQYMHSA-N 2-[[2-[[(2S)-2-amino-4-methylpentanoyl]amino]acetyl]amino]acetic acid Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)NCC(O)=O VWHGTYCRDRBSFI-ZETCQYMHSA-N 0.000 description 1
- KRKNYBCHXYNGOX-UHFFFAOYSA-K 2qpq Chemical compound [O-]C(=O)CC(O)(CC([O-])=O)C([O-])=O KRKNYBCHXYNGOX-UHFFFAOYSA-K 0.000 description 1
- XMBSYZWANAQXEV-UHFFFAOYSA-N 4-amino-5-[(1-carboxy-2-phenylethyl)amino]-5-oxopentanoic acid Chemical compound OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 1
- MGHKSHCBDXNTHX-UHFFFAOYSA-N 4-amino-5-[(4-amino-1-carboxy-4-oxobutyl)amino]-5-oxopentanoic acid Chemical compound OC(=O)CCC(N)C(=O)NC(CCC(N)=O)C(O)=O MGHKSHCBDXNTHX-UHFFFAOYSA-N 0.000 description 1
- SPBWHPXCWJLQRU-FITJORAGSA-N 4-amino-8-[(2R,3R,4S,5R)-3,4-dihydroxy-5-(hydroxymethyl)oxolan-2-yl]-5-oxopyrido[2,3-d]pyrimidine-6-carboxamide Chemical compound C12=NC=NC(N)=C2C(=O)C(C(=O)N)=CN1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O SPBWHPXCWJLQRU-FITJORAGSA-N 0.000 description 1
- 229960000583 Acetic Acid Drugs 0.000 description 1
- CCUAQNUWXLYFRA-IMJSIDKUSA-N Ala-Asn Chemical compound C[C@H]([NH3+])C(=O)N[C@H](C([O-])=O)CC(N)=O CCUAQNUWXLYFRA-IMJSIDKUSA-N 0.000 description 1
- HJCMDXDYPOUFDY-WHFBIAKZSA-N Ala-Gln Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O HJCMDXDYPOUFDY-WHFBIAKZSA-N 0.000 description 1
- FSHURBQASBLAPO-WDSKDSINSA-N Ala-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](C)N FSHURBQASBLAPO-WDSKDSINSA-N 0.000 description 1
- IPWKGIFRRBGCJO-IMJSIDKUSA-N Ala-Ser Chemical compound C[C@H]([NH3+])C(=O)N[C@@H](CO)C([O-])=O IPWKGIFRRBGCJO-IMJSIDKUSA-N 0.000 description 1
- WUGMRIBZSVSJNP-UFBFGSQYSA-N Ala-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)C)C(O)=O)=CNC2=C1 WUGMRIBZSVSJNP-UFBFGSQYSA-N 0.000 description 1
- 108020005544 Antisense RNA Proteins 0.000 description 1
- SIFXMYAHXJGAFC-WDSKDSINSA-N Arg-Asp Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O SIFXMYAHXJGAFC-WDSKDSINSA-N 0.000 description 1
- DAQIJMOLTMGJLO-YUMQZZPRSA-N Arg-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCNC(N)=N DAQIJMOLTMGJLO-YUMQZZPRSA-N 0.000 description 1
- 239000004475 Arginine Substances 0.000 description 1
- JSLGXODUIAFWCF-UHFFFAOYSA-N Arginyl-Asparagine Chemical compound NC(N)=NCCCC(N)C(=O)NC(CC(N)=O)C(O)=O JSLGXODUIAFWCF-UHFFFAOYSA-N 0.000 description 1
- RJUHZPRQRQLCFL-IMJSIDKUSA-N Asn-Asn Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(O)=O RJUHZPRQRQLCFL-IMJSIDKUSA-N 0.000 description 1
- IIFDPDVJAHQFSR-WHFBIAKZSA-N Asn-Glu Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O IIFDPDVJAHQFSR-WHFBIAKZSA-N 0.000 description 1
- FFMIYIMKQIMDPK-BQBZGAKWSA-N Asn-His Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 FFMIYIMKQIMDPK-BQBZGAKWSA-N 0.000 description 1
- KWBQPGIYEZKDEG-FSPLSTOPSA-N Asn-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(N)=O KWBQPGIYEZKDEG-FSPLSTOPSA-N 0.000 description 1
- DVUFTQLHHHJEMK-IMJSIDKUSA-N Asp-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CC(O)=O DVUFTQLHHHJEMK-IMJSIDKUSA-N 0.000 description 1
- GSMPSRPMQQDRIB-WHFBIAKZSA-N Asp-Gln Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O GSMPSRPMQQDRIB-WHFBIAKZSA-N 0.000 description 1
- OAMLVOVXNKILLQ-BQBZGAKWSA-N Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(O)=O OAMLVOVXNKILLQ-BQBZGAKWSA-N 0.000 description 1
- DWBZEJHQQIURML-IMJSIDKUSA-N Asp-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O DWBZEJHQQIURML-IMJSIDKUSA-N 0.000 description 1
- HXWUJJADFMXNKA-UHFFFAOYSA-N Asparaginyl-Leucine Chemical compound CC(C)CC(C(O)=O)NC(=O)C(N)CC(N)=O HXWUJJADFMXNKA-UHFFFAOYSA-N 0.000 description 1
- VBKIFHUVGLOJKT-UHFFFAOYSA-N Asparaginyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CC(N)=O VBKIFHUVGLOJKT-UHFFFAOYSA-N 0.000 description 1
- VGRHZPNRCLAHQA-UHFFFAOYSA-N Aspartyl-Asparagine Chemical compound OC(=O)CC(N)C(=O)NC(CC(N)=O)C(O)=O VGRHZPNRCLAHQA-UHFFFAOYSA-N 0.000 description 1
- FKBFDTRILNZGAI-UHFFFAOYSA-N Aspartyl-Cysteine Chemical compound OC(=O)CC(N)C(=O)NC(CS)C(O)=O FKBFDTRILNZGAI-UHFFFAOYSA-N 0.000 description 1
- 229940098164 Augmentin Drugs 0.000 description 1
- 241000194110 Bacillus sp. (in: Bacteria) Species 0.000 description 1
- 240000007124 Brassica oleracea Species 0.000 description 1
- 235000011301 Brassica oleracea var capitata Nutrition 0.000 description 1
- 235000001169 Brassica oleracea var oleracea Nutrition 0.000 description 1
- 101710022549 CASBPX2 Proteins 0.000 description 1
- YXQDRIRSAHTJKM-IMJSIDKUSA-N Cys-Ser Chemical compound SC[C@H](N)C(=O)N[C@@H](CO)C(O)=O YXQDRIRSAHTJKM-IMJSIDKUSA-N 0.000 description 1
- YHDXIZKDOIWPBW-UHFFFAOYSA-N Cysteinyl-Glutamine Chemical compound SCC(N)C(=O)NC(C(O)=O)CCC(N)=O YHDXIZKDOIWPBW-UHFFFAOYSA-N 0.000 description 1
- NXTYATMDWQYLGJ-UHFFFAOYSA-N Cysteinyl-Leucine Chemical compound CC(C)CC(C(O)=O)NC(=O)C(N)CS NXTYATMDWQYLGJ-UHFFFAOYSA-N 0.000 description 1
- OELDIVRKHTYFNG-UHFFFAOYSA-N Cysteinyl-Valine Chemical compound CC(C)C(C(O)=O)NC(=O)C(N)CS OELDIVRKHTYFNG-UHFFFAOYSA-N 0.000 description 1
- CZMRCDWAGMRECN-UGDNZRGBSA-N D-sucrose Chemical compound O[C@H]1[C@H](O)[C@@H](CO)O[C@@]1(CO)O[C@@H]1[C@H](O)[C@@H](O)[C@H](O)[C@@H](CO)O1 CZMRCDWAGMRECN-UGDNZRGBSA-N 0.000 description 1
- 230000007023 DNA restriction-modification system Effects 0.000 description 1
- 229920001917 Ficoll Polymers 0.000 description 1
- 101710042240 GLUL Proteins 0.000 description 1
- LOJYQMFIIJVETK-WDSKDSINSA-N Gln-Gln Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(O)=O LOJYQMFIIJVETK-WDSKDSINSA-N 0.000 description 1
- FYYSIASRLDJUNP-WHFBIAKZSA-N Glu-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O FYYSIASRLDJUNP-WHFBIAKZSA-N 0.000 description 1
- BBBXWRGITSUJPB-YUMQZZPRSA-N Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O BBBXWRGITSUJPB-YUMQZZPRSA-N 0.000 description 1
- LLEUXCDZPQOJMY-AAEUAGOBSA-N Glu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCC(O)=O)N)C(O)=O)=CNC2=C1 LLEUXCDZPQOJMY-AAEUAGOBSA-N 0.000 description 1
- SSHIXEILTLPAQT-UHFFFAOYSA-N Glutaminyl-Aspartate Chemical compound NC(=O)CCC(N)C(=O)NC(CC(O)=O)C(O)=O SSHIXEILTLPAQT-UHFFFAOYSA-N 0.000 description 1
- JZOYFBPIEHCDFV-UHFFFAOYSA-N Glutaminyl-Histidine Chemical compound NC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CN=CN1 JZOYFBPIEHCDFV-UHFFFAOYSA-N 0.000 description 1
- ARPVSMCNIDAQBO-UHFFFAOYSA-N Glutaminyl-Leucine Chemical compound CC(C)CC(C(O)=O)NC(=O)C(N)CCC(N)=O ARPVSMCNIDAQBO-UHFFFAOYSA-N 0.000 description 1
- XBGGUPMXALFZOT-VIFPVBQESA-N Gly-Tyr Chemical compound NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-VIFPVBQESA-N 0.000 description 1
- 101710017531 H4C15 Proteins 0.000 description 1
- LNCFUHAPNTYMJB-IUCAKERBSA-N His-Pro Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(O)=O)C1=CN=CN1 LNCFUHAPNTYMJB-IUCAKERBSA-N 0.000 description 1
- KRBMQYPTDYSENE-BQBZGAKWSA-N His-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CNC=N1 KRBMQYPTDYSENE-BQBZGAKWSA-N 0.000 description 1
- MUFXDFWAJSPHIQ-XDTLVQLUSA-N Ile-Tyr Chemical compound CC[C@H](C)[C@H]([NH3+])C(=O)N[C@H](C([O-])=O)CC1=CC=C(O)C=C1 MUFXDFWAJSPHIQ-XDTLVQLUSA-N 0.000 description 1
- 108010044467 Isoenzymes Proteins 0.000 description 1
- PWWVAXIEGOYWEE-UHFFFAOYSA-N Isophenergan Chemical compound C1=CC=C2N(CC(C)N(C)C)C3=CC=CC=C3SC2=C1 PWWVAXIEGOYWEE-UHFFFAOYSA-N 0.000 description 1
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 1
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 1
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 1
- 125000000510 L-tryptophano group Chemical group [H]C1=C([H])C([H])=C2N([H])C([H])=C(C([H])([H])[C@@]([H])(C(O[H])=O)N([H])[*])C2=C1[H] 0.000 description 1
- 101700021119 LEUC Proteins 0.000 description 1
- HIZYETOZLYFUFF-BQBZGAKWSA-N Leu-Cys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CS)C(O)=O HIZYETOZLYFUFF-BQBZGAKWSA-N 0.000 description 1
- JYOAXOMPIXKMKK-UHFFFAOYSA-N Leucyl-Glutamine Chemical compound CC(C)CC(N)C(=O)NC(C(O)=O)CCC(N)=O JYOAXOMPIXKMKK-UHFFFAOYSA-N 0.000 description 1
- CIOWSLJGLSUOME-BQBZGAKWSA-N Lys-Asp Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O CIOWSLJGLSUOME-BQBZGAKWSA-N 0.000 description 1
- NVGBPTNZLWRQSY-UWVGGRQHSA-N Lys-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN NVGBPTNZLWRQSY-UWVGGRQHSA-N 0.000 description 1
- AIXUQKMMBQJZCU-IUCAKERBSA-N Lys-Pro Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O AIXUQKMMBQJZCU-IUCAKERBSA-N 0.000 description 1
- YSZNURNVYFUEHC-BQBZGAKWSA-N Lys-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(O)=O YSZNURNVYFUEHC-BQBZGAKWSA-N 0.000 description 1
- ZOKVLMBYDSIDKG-CSMHCCOUSA-N Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN ZOKVLMBYDSIDKG-CSMHCCOUSA-N 0.000 description 1
- JHKXZYLNVJRAAJ-WDSKDSINSA-N Met-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(O)=O JHKXZYLNVJRAAJ-WDSKDSINSA-N 0.000 description 1
- QTZXSYBVOSXBEJ-WDSKDSINSA-N Met-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O QTZXSYBVOSXBEJ-WDSKDSINSA-N 0.000 description 1
- MUMXFARPYQTTSL-BQBZGAKWSA-N Met-Gln Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O MUMXFARPYQTTSL-BQBZGAKWSA-N 0.000 description 1
- QXOHLNCNYLGICT-YFKPBYRVSA-N Met-Gly Chemical compound CSCC[C@H](N)C(=O)NCC(O)=O QXOHLNCNYLGICT-YFKPBYRVSA-N 0.000 description 1
- PESQCPHRXOFIPX-RYUDHWBXSA-N Met-Tyr Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 PESQCPHRXOFIPX-RYUDHWBXSA-N 0.000 description 1
- JMEWFDUAFKVAAT-UHFFFAOYSA-N Methionyl-Asparagine Chemical compound CSCCC(N)C(=O)NC(C(O)=O)CC(N)=O JMEWFDUAFKVAAT-UHFFFAOYSA-N 0.000 description 1
- 108010066427 N-valyltryptophan Proteins 0.000 description 1
- 108010087066 N2-tryptophyllysine Proteins 0.000 description 1
- 101700080605 NUC1 Proteins 0.000 description 1
- 108020005187 Oligonucleotide Probes Proteins 0.000 description 1
- 101700014647 PCR1 Proteins 0.000 description 1
- 229940049954 Penicillin Drugs 0.000 description 1
- JGSARLDLIJGVTE-MBNYWOFBSA-N Penicillin G Chemical compound N([C@H]1[C@H]2SC([C@@H](N2C1=O)C(O)=O)(C)C)C(=O)CC1=CC=CC=C1 JGSARLDLIJGVTE-MBNYWOFBSA-N 0.000 description 1
- OZILORBBPKKGRI-RYUDHWBXSA-N Phe-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 OZILORBBPKKGRI-RYUDHWBXSA-N 0.000 description 1
- BXNGIHFNNNSEOS-UWVGGRQHSA-N Phe-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 BXNGIHFNNNSEOS-UWVGGRQHSA-N 0.000 description 1
- HWMGTNOVUDIKRE-UWVGGRQHSA-N Phe-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 HWMGTNOVUDIKRE-UWVGGRQHSA-N 0.000 description 1
- OHUXOEXBXPZKPT-STQMWFEESA-N Phe-His Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1N=CNC=1)C(O)=O)C1=CC=CC=C1 OHUXOEXBXPZKPT-STQMWFEESA-N 0.000 description 1
- ROHDXJUFQVRDAV-UWVGGRQHSA-N Phe-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 ROHDXJUFQVRDAV-UWVGGRQHSA-N 0.000 description 1
- 241000048284 Potato virus P Species 0.000 description 1
- HMNSRTLZAJHSIK-YUMQZZPRSA-N Pro-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 HMNSRTLZAJHSIK-YUMQZZPRSA-N 0.000 description 1
- RWCOTTLHDJWHRS-YUMQZZPRSA-N Pro-Pro Chemical compound OC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 RWCOTTLHDJWHRS-YUMQZZPRSA-N 0.000 description 1
- 108010025216 RVF peptide Proteins 0.000 description 1
- 101710012186 SLC7A1 Proteins 0.000 description 1
- SSJMZMUVNKEENT-IMJSIDKUSA-N Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CO SSJMZMUVNKEENT-IMJSIDKUSA-N 0.000 description 1
- LTFSLKWFMWZEBD-IMJSIDKUSA-N Ser-Asn Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O LTFSLKWFMWZEBD-IMJSIDKUSA-N 0.000 description 1
- UJTZHGHXJKIAOS-WHFBIAKZSA-N Ser-Gln Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O UJTZHGHXJKIAOS-WHFBIAKZSA-N 0.000 description 1
- YZMPDHTZJJCGEI-BQBZGAKWSA-N Ser-His Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 YZMPDHTZJJCGEI-BQBZGAKWSA-N 0.000 description 1
- PBUXMVYWOSKHMF-WDSKDSINSA-N Ser-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CO PBUXMVYWOSKHMF-WDSKDSINSA-N 0.000 description 1
- PPQRSMGDOHLTBE-UWVGGRQHSA-N Ser-Phe Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 PPQRSMGDOHLTBE-UWVGGRQHSA-N 0.000 description 1
- XZKQVQKUZMAADP-IMJSIDKUSA-N Ser-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(O)=O XZKQVQKUZMAADP-IMJSIDKUSA-N 0.000 description 1
- LDEBVRIURYMKQS-UHFFFAOYSA-N Serinyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CO LDEBVRIURYMKQS-UHFFFAOYSA-N 0.000 description 1
- 241000187180 Streptomyces sp. Species 0.000 description 1
- NHUHCSRWZMLRLA-UHFFFAOYSA-N Sulfizole Chemical compound CC1=NOC(NS(=O)(=O)C=2C=CC(N)=CC=2)=C1C NHUHCSRWZMLRLA-UHFFFAOYSA-N 0.000 description 1
- 229940063214 Thiostrepton Drugs 0.000 description 1
- IQHUITKNHOKGFC-MIMYLULJSA-N Thr-Phe Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IQHUITKNHOKGFC-MIMYLULJSA-N 0.000 description 1
- DSGIVWSDDRDJIO-ZXXMMSQZSA-N Thr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DSGIVWSDDRDJIO-ZXXMMSQZSA-N 0.000 description 1
- UQTNIFUCMBFWEJ-UHFFFAOYSA-N Threoninyl-Asparagine Chemical compound CC(O)C(N)C(=O)NC(C(O)=O)CC(N)=O UQTNIFUCMBFWEJ-UHFFFAOYSA-N 0.000 description 1
- YKRQRPFODDJQTC-UHFFFAOYSA-N Threoninyl-Lysine Chemical compound CC(O)C(N)C(=O)NC(C(O)=O)CCCCN YKRQRPFODDJQTC-UHFFFAOYSA-N 0.000 description 1
- OHGNSVACHBZKSS-KWQFWETISA-N Trp-Ala Chemical compound C1=CC=C2C(C[C@H]([NH3+])C(=O)N[C@@H](C)C([O-])=O)=CNC2=C1 OHGNSVACHBZKSS-KWQFWETISA-N 0.000 description 1
- LCPVBXOHXMBLFW-JSGCOSHPSA-N Trp-Arg Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O)=CNC2=C1 LCPVBXOHXMBLFW-JSGCOSHPSA-N 0.000 description 1
- LWFWZRANSFAJDR-JSGCOSHPSA-N Trp-Val Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](C(C)C)C(O)=O)=CNC2=C1 LWFWZRANSFAJDR-JSGCOSHPSA-N 0.000 description 1
- SMDQRGAERNMJJF-UHFFFAOYSA-N Tryptophyl-Cysteine Chemical compound C1=CC=C2C(CC(N)C(=O)NC(CS)C(O)=O)=CNC2=C1 SMDQRGAERNMJJF-UHFFFAOYSA-N 0.000 description 1
- YBRHKUNWEYBZGT-UHFFFAOYSA-N Tryptophyl-Threonine Chemical compound C1=CC=C2C(CC(N)C(=O)NC(C(O)C)C(O)=O)=CNC2=C1 YBRHKUNWEYBZGT-UHFFFAOYSA-N 0.000 description 1
- ZHSGGJXRNHWHRS-VIDYELAYSA-N Tunicamycin Chemical compound O([C@H]1[C@@H]([C@H]([C@@H](O)[C@@H](CC(O)[C@@H]2[C@H]([C@@H](O)[C@@H](O2)N2C(NC(=O)C=C2)=O)O)O1)O)NC(=O)/C=C/CC(C)C)[C@H]1O[C@H](CO)[C@@H](O)[C@H](O)[C@H]1NC(C)=O ZHSGGJXRNHWHRS-VIDYELAYSA-N 0.000 description 1
- ZSXJENBJGRHKIG-UHFFFAOYSA-N Tyrosyl-Serine Chemical compound OCC(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 ZSXJENBJGRHKIG-UHFFFAOYSA-N 0.000 description 1
- MFEVVAXTBZELLL-UHFFFAOYSA-N Tyrosyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 MFEVVAXTBZELLL-UHFFFAOYSA-N 0.000 description 1
- IBIDRSSEHFLGSD-YUMQZZPRSA-N Val-Arg Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IBIDRSSEHFLGSD-YUMQZZPRSA-N 0.000 description 1
- WITCOKQIPFWQQD-FSPLSTOPSA-N Val-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O WITCOKQIPFWQQD-FSPLSTOPSA-N 0.000 description 1
- YSGSDAIMSCVPHG-YUMQZZPRSA-N Val-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)C(C)C YSGSDAIMSCVPHG-YUMQZZPRSA-N 0.000 description 1
- GJNDXQBALKCYSZ-RYUDHWBXSA-N Val-Phe Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 GJNDXQBALKCYSZ-RYUDHWBXSA-N 0.000 description 1
- WPSXZFTVLIAPCN-UHFFFAOYSA-N Valyl-Cysteine Chemical compound CC(C)C(N)C(=O)NC(CS)C(O)=O WPSXZFTVLIAPCN-UHFFFAOYSA-N 0.000 description 1
- 101700068378 XRN1 Proteins 0.000 description 1
- 235000005042 Zier Kohl Nutrition 0.000 description 1
- QTBSBXVTEAMEQO-UHFFFAOYSA-N acetic acid Chemical compound CC(O)=O QTBSBXVTEAMEQO-UHFFFAOYSA-N 0.000 description 1
- 238000007792 addition Methods 0.000 description 1
- 239000011543 agarose gel Substances 0.000 description 1
- 238000000246 agarose gel electrophoresis Methods 0.000 description 1
- 108010024078 alanyl-glycyl-serine Proteins 0.000 description 1
- 108010044940 alanylglutamine Proteins 0.000 description 1
- 108010070783 alanyltyrosine Proteins 0.000 description 1
- 229920002847 antisense RNA Polymers 0.000 description 1
- 101700026142 apr-1 Proteins 0.000 description 1
- 108010080488 arginyl-arginyl-leucine Proteins 0.000 description 1
- 108010009111 arginyl-glycyl-glutamic acid Proteins 0.000 description 1
- 108010029539 arginyl-prolyl-proline Proteins 0.000 description 1
- 108010018691 arginyl-threonyl-arginine Proteins 0.000 description 1
- 108010010430 asparagine-proline-alanine Proteins 0.000 description 1
- 108010092854 aspartyllysine Proteins 0.000 description 1
- 229960000626 benzylpenicillin Drugs 0.000 description 1
- 239000003781 beta lactamase inhibitor Substances 0.000 description 1
- 230000027455 binding Effects 0.000 description 1
- 230000001851 biosynthetic Effects 0.000 description 1
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 1
- 150000001780 cephalosporins Chemical class 0.000 description 1
- 238000006243 chemical reaction Methods 0.000 description 1
- 239000003795 chemical substances by application Substances 0.000 description 1
- 230000002759 chromosomal Effects 0.000 description 1
- 108020003054 clavaminate synthase Proteins 0.000 description 1
- 108010092360 clavamine Proteins 0.000 description 1
- 239000003184 complementary RNA Substances 0.000 description 1
- 108010004073 cysteinylcysteine Proteins 0.000 description 1
- 108010060199 cysteinylproline Proteins 0.000 description 1
- 238000006073 displacement reaction Methods 0.000 description 1
- 238000001962 electrophoresis Methods 0.000 description 1
- 150000002148 esters Chemical class 0.000 description 1
- 238000010230 functional analysis Methods 0.000 description 1
- 239000012362 glacial acetic acid Substances 0.000 description 1
- 108010042598 glutamyl-aspartyl-glycine Proteins 0.000 description 1
- 108010079547 glutamylmethionine Proteins 0.000 description 1
- 108010027668 glycyl-alanyl-valine Proteins 0.000 description 1
- 108010019832 glycyl-asparaginyl-glycine Proteins 0.000 description 1
- 108010067216 glycyl-glycyl-glycine Proteins 0.000 description 1
- 108010078326 glycyl-glycyl-valine Proteins 0.000 description 1
- 108010066198 glycyl-leucyl-phenylalanine Proteins 0.000 description 1
- 108010050475 glycyl-leucyl-tyrosine Proteins 0.000 description 1
- 108010079413 glycyl-prolyl-glutamic acid Proteins 0.000 description 1
- 108010082286 glycyl-seryl-alanine Proteins 0.000 description 1
- 108010045126 glycyl-tyrosyl-glycine Proteins 0.000 description 1
- 108010020688 glycylhistidine Proteins 0.000 description 1
- 108010084389 glycyltryptophan Proteins 0.000 description 1
- 108010045383 histidyl-glycyl-glutamic acid Proteins 0.000 description 1
- 230000002779 inactivation Effects 0.000 description 1
- 239000000543 intermediate Substances 0.000 description 1
- 108010044374 isoleucyl-tyrosine Proteins 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 108010044311 leucyl-glycyl-glycine Proteins 0.000 description 1
- 108010051673 leucyl-glycyl-phenylalanine Proteins 0.000 description 1
- 239000003446 ligand Substances 0.000 description 1
- 108010054155 lysyllysine Proteins 0.000 description 1
- 108010038320 lysylphenylalanine Proteins 0.000 description 1
- 108010044655 lysylproline Proteins 0.000 description 1
- 230000002503 metabolic Effects 0.000 description 1
- 230000037353 metabolic pathway Effects 0.000 description 1
- 101700006494 nucA Proteins 0.000 description 1
- 235000016709 nutrition Nutrition 0.000 description 1
- 239000002751 oligonucleotide probe Substances 0.000 description 1
- 150000002960 penicillins Chemical class 0.000 description 1
- 108010072637 phenylalanyl-arginyl-phenylalanine Proteins 0.000 description 1
- 108010084525 phenylalanyl-phenylalanyl-glycine Proteins 0.000 description 1
- 108010065135 phenylalanyl-phenylalanyl-phenylalanine Proteins 0.000 description 1
- 108010018625 phenylalanylarginine Proteins 0.000 description 1
- 235000015108 pies Nutrition 0.000 description 1
- 125000000830 polyketide group Chemical group 0.000 description 1
- 229930001119 polyketides Natural products 0.000 description 1
- 235000013855 polyvinylpyrrolidone Nutrition 0.000 description 1
- 229920000036 polyvinylpyrrolidone Polymers 0.000 description 1
- 239000000843 powder Substances 0.000 description 1
- OZAIFHULBGXAKX-UHFFFAOYSA-N precursor Substances N#CC(C)(C)N=NC(C)(C)C#N OZAIFHULBGXAKX-UHFFFAOYSA-N 0.000 description 1
- 108010025826 prolyl-leucyl-arginine Proteins 0.000 description 1
- 238000002708 random mutagenesis Methods 0.000 description 1
- 230000001105 regulatory Effects 0.000 description 1
- 108010091078 rigin Proteins 0.000 description 1
- 108010071207 serylmethionine Proteins 0.000 description 1
- 238000002741 site-directed mutagenesis Methods 0.000 description 1
- 229910000162 sodium phosphate Inorganic materials 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 230000001629 suppression Effects 0.000 description 1
- NSFFHOGKXHRQEW-AIHSUZKVSA-N thiostrepton Chemical compound C([C@]12C=3SC=C(N=3)C(=O)N[C@H](C(=O)NC(/C=3SC[C@@H](N=3)C(=O)N[C@H](C=3SC=C(N=3)C(=O)N[C@H](C=3SC=C(N=3)[C@H]1N=1)[C@@H](C)OC(=O)C3=CC(=C4C=C[C@H]([C@@H](C4=N3)O)N[C@H](C(N[C@@H](C)C(=O)NC(=C)C(=O)N[C@@H](C)C(=O)N2)=O)[C@@H](C)CC)[C@H](C)O)[C@](C)(O)[C@@H](C)O)=C\C)[C@@H](C)O)CC=1C1=NC(C(=O)NC(=C)C(=O)NC(=C)C(N)=O)=CS1 NSFFHOGKXHRQEW-AIHSUZKVSA-N 0.000 description 1
- 230000002588 toxic Effects 0.000 description 1
- 231100000331 toxic Toxicity 0.000 description 1
- 108010078580 tyrosylleucine Proteins 0.000 description 1
- 108010073969 valyllysine Proteins 0.000 description 1
- OPINTGHFESTVAX-UHFFFAOYSA-N γ-glutamyl-Arginine Chemical compound NC(=O)CCC(N)C(=O)NC(C(O)=O)CCCNC(N)=N OPINTGHFESTVAX-UHFFFAOYSA-N 0.000 description 1
- DXJZITDUDUPINW-UHFFFAOYSA-N γ-glutamyl-Asparagine Chemical compound NC(=O)CCC(N)C(=O)NC(CC(N)=O)C(O)=O DXJZITDUDUPINW-UHFFFAOYSA-N 0.000 description 1
- SIGGQAHUPUBWNF-UHFFFAOYSA-N γ-glutamyl-Methionine Chemical compound CSCCC(C(O)=O)NC(=O)C(N)CCC(N)=O SIGGQAHUPUBWNF-UHFFFAOYSA-N 0.000 description 1
- NJMYZEJORPYOTO-UHFFFAOYSA-N γ-glutamyl-Proline Chemical compound NC(=O)CCC(N)C(=O)N1CCCC1C(O)=O NJMYZEJORPYOTO-UHFFFAOYSA-N 0.000 description 1
- ZQFAGNFSIZZYBA-UHFFFAOYSA-N γ-glutamyl-Tryptophan Chemical compound C1=CC=C2C(CC(NC(=O)C(CCC(N)=O)N)C(O)=O)=CNC2=C1 ZQFAGNFSIZZYBA-UHFFFAOYSA-N 0.000 description 1
Abstract
Novel bacterial genes, microorganisms and processes for improving the manufacture of 5R clavams, e.g. clavulanic acid.
Description
NOVEL COMPOUNDS
DESCRIPTION
The present invention relates to novel bacterial genes and methods for improving the manufacture of clavamas, for example clavulanic acid. The present invention also provides novel organisms capable of producing increased amounts of clavulanic acid. The microorganisms, in particular Streptomyces sp. they produce a number of antibiotics including clavulanic acid and other clavamas, cephalosporins, polyketides, cefamycins, tunicamycin, holomicin and penicillins. There is considerable interest in being able to manipulate the absolute and relative amounts of these antibiotics produced by the microorganism and accordingly there have been a number of studies that investigate the metabolic and genetic mechanisms of the biosynthetic pathways [Domain, A.L. (1990) "Biosynthesis and regulation of beta-lactam antibiotics." In: 50 years of Penicillin applications, history and trends]. Many of the enzymes that carry out the various steps in the metabolic pathways and the genes that encode these enzymes are known. The clavamas can be arbitrarily divided into two groups depending on their annular stereochemistry (clavamas 5S and 5R). The biochemical pathways for biosynthesis have not yet been completely elucidated.
5R and 5S clavams, but it has been suggested that it is derived from the same starting units (a three-carbon compound not yet identified [Townsend, CA and Ho, MF (1985) J. arm. chem. soc. 107 (4) , 1066-1068 and Elson, sw and Oliver, RS (1978) J. Antibiotics XXXI No.6, 568] and arginine [Valentine, BP et al. (1993) J. Am Chem. Soc. 15, 1210-1211] and it shares some common intermediates [Iwata-Reuyl, D. and CA Townsend (1992) J. Am. Chem. Soc. 114: 2762-63, and Jane, JW et al. (1993) Bioorg, Med. Chem. Lett. : 2313-16] Some examples of 5S clavames include clavama-2 (C2C) carboxylate, 2-hydroxymethylclavama (2HMC), 2- (3-alanyl) clavama, valclavama and clavamine acid [GB 1585661, Rohl, F. others, Arch. Microbiol. 147: 315-320, US 4,202,819.] There are, however, few examples of 5S clavamas and until now the best known is the beta-lactamase inhibitor clavulanic acid which is produced by the fermentation of Streptomyces clavuligerus.The combination is combined of clavulanic acid, in the form of potassium clavulanate, with beta-lactam-amoxicillin in the antibiotic AUGMENTIN (Trade Mark SmithKIine Beecham). Because of this commercial interest, research into the understanding of clavama biosynthesis has focused on the biosynthesis of clavama 5R, clavulanic acid, by S. clavuligerus. A number of enzymes and their genes associated with the biosynthesis of clavulanic acid have been identified and published. Some examples of such publications include Hodgson, J. E. et al., Gene 166, 49-55 (1995), Aidoo, K.A. and others, Gene 147, 41-46 (1994),
Paradkar, A.S. et al., J. Bact. 177 (5), 1307-14 (1995). In contrast, nothing is known about the biosynthesis and genetics of 5S clavams other than clavaminic acid, which is a precursor of clavulanic acid produced by clavaminic acid synthase in the clavulanic acid biosynthetic pathway of S. clavuligerus. Gene cloning experiments have identified that S. clavuligerus contains two clavaminic acid synthase isoenzymes, cas1 and cas2 [Marsh, E.N. et al., Biochemistry 31, 12648-657, (1992)], both of which can contribute to clavulanic acid production under certain nutritional conditions [Paradkar, A.S. et al., J. Bact. 177 (5), 1307-14 (1995)]. Clavaminic acid synthase activity has also been detected in other microorganisms that produce clavulanic acid, i.e. S. jumonjinensis [Vidal, C.M., Es 550549m (1987)] and S. katsurahamanus [Kitano, K. et al., JP 53-104796, (1978)], as well as in S. antibioticus, a producer of the 5S clavam valclavam [Baldwin, J.E. et al., Tetrahedron Letts. 35 (17), 2783-86, (1994)]. The latter paper also reported that S. antibioticus has proclavaminic acid amidino hydrolase activity, another enzyme known to be involved in clavulanic acid biosynthesis. All other genes identified in S. clavuligerus as involved in clavam biosynthesis have been reported to be required for clavulanic acid biosynthesis [Hodgson, J.E. et al., Gene 166, 49-55 (1995), Aidoo, K.A. et al., Gene 147, 41-46 (1994)], and as yet none has been reported that is specific for the biosynthesis of 5S clavams.
We have now identified certain genes that are specific for the biosynthesis of 5S clavams, as exemplified by C2C and 2HMC, in S. clavuligerus. Accordingly, the present invention provides DNA comprising one or more genes that are specific for 5S clavam biosynthesis in S. clavuligerus and that are not essential for the biosynthesis of 5R clavams (for example clavulanic acid). By "gene" as used herein we also include any regulatory region required for gene function or expression. In a preferred aspect, the DNA is as identified in SEQ ID NO: 1. Preferably, the DNA comprises the nucleotide sequences indicated in SEQ ID NO: 1 designated orfup3, orfup2, orfdwn1, orfdwn2 and orfdwn3. The present invention also provides proteins encoded by said DNA. The present invention also provides vectors comprising the DNA of the invention and hosts containing such vectors. Surprisingly, it has been found that, when at least one of the genes according to the invention is defective, the amount of clavulanic acid produced by the organism is increased. Accordingly, the invention also provides methods for increasing the amount of clavulanic acid produced by a suitable microorganism. In one aspect of the invention, the identified genes can be manipulated to produce an organism capable of producing increased amounts of clavam, suitably clavulanic acid. The findings of this work also allow an improved procedure for the identification of
organisms with higher production of clavulanic acid, which comprises a preliminary screening for organisms with low or no production of 5S clavams (for example by HPLC and/or clavam assay as described in the examples herein). Suitably, the 5S clavam genes of the present invention can be obtained by conventional cloning methods (such as PCR) based on the sequences provided herein. The function of the gene can be interfered with or eliminated/deleted by genetic techniques such as gene disruption [Aidoo, K.A. et al., (1994), Gene, 147, 41-46], random mutagenesis, site-directed mutagenesis and antisense RNA. In a further aspect of the invention, plasmids containing one or more defective genes are provided, preferably the plasmids pCEC060, pCEC061, pDES3, pCEC056 and pCEC057, described below. Genes can be made defective in a variety of ways, for example by insertion of a DNA fragment encoding an antibiotic resistance gene, which completely abolishes the activity of the interrupted gene. Alternatively, other strategies may be employed to produce defective genes, including insertion of DNA that does not code for an antibiotic resistance gene, deletion of part of the gene, deletion of the entire gene, or alteration of the nucleotide sequence of the gene by addition and/or substitution of one or more nucleotides. The defective genes according to the invention can be defective to different degrees. They may be
defective in the sense that their activity is completely abolished, or a proportion of the original activity may be retained. Suitably, the plasmids of the invention are used to transform an organism such as S. clavuligerus, for example strain ATCC 27064 (which corresponds to S. clavuligerus NRRL 3585). Suitable transformation methods can be found in relevant sources including: Sambrook, J., Fritsch, E.F. and Maniatis, T. (1989), Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.; Hopwood, D.A. et al. (1985), Genetic Manipulation of Streptomyces. A Cloning Manual; and Paradkar, A.S. and Jensen, S.E. (1995), J. Bacteriol. 177 (5): 1307-1314. Strains of the species S. clavuligerus are used industrially to produce clavulanic acid (potassium clavulanate). Within the British and United States Pharmacopoeias for potassium clavulanate (British Pharmacopoeia 1993, Addendum 1994, p1362-3 and U.S. Pharmacopeia Official Monographs 1995, USP 23 NF18 p384-5), the amounts of the toxic 5S clavam, clavam-2-carboxylate, are specifically controlled. Therefore, in a further aspect of the invention there is provided an organism that is capable of producing high amounts of clavulanic acid but has been rendered incapable of forming C2C, or that is capable of producing high amounts of clavulanic acid but capable of forming only low levels of C2C.
Suitably, the organism that produces clavulanic acid contains one or more defective clavam genes and is preferably S. clavuligerus strain 56-1A, 56-3A, 57-2B, 57-1C, 60-1A, 60-2A, 60-3A, 61-1A, 61-2A, 61-3A or 61-4A, described below. Such organisms are suitable for the production of clavulanic acid without production of the 5S clavam, clavam-2-carboxylate, or with significantly reduced production of clavam-2-carboxylate.
EXAMPLES
In the examples, all methods are as in Sambrook, J., Fritsch, E.F. and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual (2nd Edition), or Hopwood, D.A. et al. (1985) Genetic Manipulation of Streptomyces. A Cloning Manual, and Paradkar, A.S. and Jensen, S.E. (1995) J. Bacteriol. 177 (5): 1307-1314, unless otherwise stated.
I. DNA SEQUENCING OF THE STREPTOMYCES CLAVULIGERUS CHROMOSOME UPSTREAM AND DOWNSTREAM OF THE CLAVAMINATE SYNTHASE GENE CAS1
A. Isolation of cas1
To isolate chromosomal DNA fragments of Streptomyces clavuligerus encoding the gene for the clavaminate synthase isozyme 1 (cas1), an oligonucleotide probe, RMO1, was synthesized based on nucleotides 9-44 of the previously sequenced cas1 gene (Marsh, E.N., Chang, M.D.T. and Townsend, C.A. (1992) Biochemistry 31: 12648-12657). Oligonucleotides were constructed using conventional methods on an Applied Biosystems 391 DNA synthesizer. The sequence of RMO1, a 36-mer, was synthesized in the antiparallel sense to that published by Marsh et al. (1992, ibid). RMO1 was radiolabelled with 32P using conventional techniques for end-labelling DNA oligonucleotides (Sambrook et al., 1989 ibid), and was used to screen a cosmid library of Streptomyces clavuligerus genomic DNA by Southern hybridization as described by Stahl and Amann (In: Nucleic acid techniques in bacterial systematics, Ed. E. Stackebrandt and M. Goodfellow, Toronto: John Wiley and Sons, pp. 205-248, 1991). The genomic DNA library of S. clavuligerus, prepared in the cosmid pLAFR3, was as described by Doran, J.L. et al., (1990), J. Bacteriol. 172 (9), 4909-4918. Colony blots of the S. clavuligerus cosmid library were incubated overnight with radiolabelled RMO1 at 60 °C in a solution consisting of 5 x SSC, 5 x Denhardt's solution and 0.5% SDS (1 x SSC: 0.15 M NaCl + 0.015 M Na3 citrate; 1 x Denhardt's solution: 0.02% BSA, 0.01% Ficoll and 0.02% PVP). The blots were then washed at 68 °C for 30 minutes in a solution of 0.5 x SSC + 0.1% SDS. One cosmid clone, 10D7, was isolated that hybridized strongly to RMO1 and that, following digestion with the restriction endonucleases SacI and EcoRI, gave hybridization signals consistent with those detected in similar experiments using digests of S. clavuligerus genomic DNA.
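The salt concentrations implied by the hybridization and wash conditions can be worked out directly from the 1 x definition given in the text (0.15 M NaCl + 0.015 M Na3 citrate). The following is only an illustrative sketch; the function and variable names are hypothetical and form no part of the described method.

```python
# 1 x SSC as defined in the text: 0.15 M NaCl + 0.015 M Na3 citrate.
ONE_X_SSC_MOLAR = {"NaCl": 0.15, "Na3 citrate": 0.015}


def ssc_molarity(fold):
    """Return the molar concentration of each SSC component at `fold` x SSC."""
    return {name: round(molar * fold, 6) for name, molar in ONE_X_SSC_MOLAR.items()}


hybridization = ssc_molarity(5)   # 5 x SSC, overnight hybridization at 60 C
wash = ssc_molarity(0.5)          # 0.5 x SSC, 30 minute wash at 68 C

for label, mix in (("hybridization", hybridization), ("wash", wash)):
    print(label, mix)
```

The wash is thus carried out at one tenth of the salt concentration of the hybridization and at a higher temperature, which corresponds to a more stringent condition than the hybridization itself.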
B. DNA sequencing of the S. clavuligerus chromosome flanking cas1
A partial restriction map of cosmid 10D7 was generated using the restriction endonucleases SacI, NcoI and KpnI. Southern hybridization experiments between RMO1 and various digests of 10D7 DNA indicated that cas1 was most likely located at one end of a 7 kb SacI-SacI DNA subfragment. This fragment consisted of the cas1 open reading frame and approximately 6 kb of upstream DNA. The 7 kb fragment from a SacI digest of 10D7 was then subcloned into the phagemid vector pBluescriptII SK+ (2.96 kb; Stratagene), thus generating the recombinant plasmid pCEC007. To facilitate sequencing of the chromosome upstream of cas1, a 3 kb NcoI-NcoI subfragment of the 7 kb SacI-SacI fragment was subcloned into pUC120 (3.2 kb; Vieira and Messing, Methods Enzymol. 153: 3-11, 1987) in both orientations, generating the recombinant plasmids pCEC026 and pCEC027. The 3 kb subfragment consists of the amino-terminal coding portion of cas1 and approximately 2.6 kb of upstream DNA. Nested, overlapping deletions were created in both pCEC026 and pCEC027 using exonuclease III and S1 nuclease digestion (Sambrook et al., 1989 ibid), and the DNA sequence of the 3 kb NcoI-NcoI fragment was determined on both strands by the dideoxy chain termination method (Sanger, F., Nicklen, S. and Coulson, A.R. (1977), Proc. Natl. Acad. Sci. USA 74: 5463-5467) using a Taq dye-deoxy terminator kit and an Applied Biosystems 373A sequencer.
To determine the DNA sequence of the chromosome immediately downstream of cas1, a 4.3 kb KpnI-EcoRI DNA fragment of the cosmid clone 10D7 was subcloned into pBluescriptII SK+, generating pCEC018. From pCEC018, a 3.7 kb SacI-SacI subfragment was cloned into pSL1180 (3.422 kb; Pharmacia); one of the SacI ends of this fragment partially overlapped the TGA stop codon of cas1, and the other was vector encoded. Both orientations of the 3.7 kb fragment were obtained during subcloning and the resulting recombinant plasmids were designated pCEC023 and pCEC024. Nested, overlapping deletions were created in both plasmids and the DNA sequence of the 3.7 kb fragment was determined on both strands. The nucleotide sequence of the S. clavuligerus chromosome generated in these experiments, including and flanking the cas1 sequence, is shown in SEQ ID NO: 1.
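As a bookkeeping aid, the approximate sizes of the subclones described above can be estimated by adding each insert to its vector. The sketch below assumes the vector and insert sizes quoted in the text, reads the pSL1180 size as 3.422 kb, and ignores any polylinker bases lost on digestion; the names used are hypothetical.

```python
# Vector sizes in kb, as quoted in the text (pSL1180 read as 3.422 kb).
VECTORS_KB = {"pBluescriptII SK+": 2.96, "pUC120": 3.2, "pSL1180": 3.422}

# Construct name -> (vector, approximate insert size in kb), from the text.
CONSTRUCTS = {
    "pCEC007": ("pBluescriptII SK+", 7.0),   # 7 kb SacI-SacI fragment
    "pCEC026": ("pUC120", 3.0),              # 3 kb NcoI-NcoI fragment
    "pCEC027": ("pUC120", 3.0),              # same fragment, other orientation
    "pCEC018": ("pBluescriptII SK+", 4.3),   # 4.3 kb KpnI-EcoRI fragment
    "pCEC023": ("pSL1180", 3.7),             # 3.7 kb SacI-SacI fragment
    "pCEC024": ("pSL1180", 3.7),             # same fragment, other orientation
}


def construct_size_kb(name):
    """Estimate plasmid size as vector plus insert (an approximation only)."""
    vector, insert_kb = CONSTRUCTS[name]
    return round(VECTORS_KB[vector] + insert_kb, 2)


for name in CONSTRUCTS:
    print(name, construct_size_kb(name), "kb")
```

On these assumptions pCEC018 comes to roughly 7.3 kb, which agrees with the size quoted for that plasmid in the disruption experiments of Example II.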
II. FUNCTIONAL ANALYSIS OF OPEN READING FRAMES THAT FLANK CAS1
Computer analysis of the DNA sequence upstream of cas1 predicted the presence of two complete open reading frames (ORFs) and one incomplete ORF. All three ORFs were located on the DNA strand opposite to cas1 and were therefore oriented in the opposite direction. The first ORF, orfup1, was located 579 bp upstream of cas1 and encoded a polypeptide of 344 amino acids (aa). The second ORF, orfup2, was located 437 bp beyond the 3' end of orfup1 and encoded a 151 aa polypeptide. Beyond orfup2 is orfup3. The start codon of orfup3 overlaps the translational stop codon of orfup2, suggesting that the two ORFs are translationally coupled. No translational stop codon for orfup3 was located on the 3 kb NcoI-NcoI fragment. A similar analysis of the DNA sequence downstream of cas1 predicted the presence of two complete ORFs and one incomplete ORF. Two of the ORFs were located on the DNA strand opposite to cas1 and were therefore oriented towards cas1. The third ORF was located on the same strand as cas1 and was therefore oriented away from it. The first downstream ORF, orfdwn1, was located 373 bp downstream of cas1 and encoded a 328 aa polypeptide. The second ORF, orfdwn2, was located 55 bp upstream of orfdwn1 and encoded a polypeptide of 394 aa. At 315 bp upstream of orfdwn2, and on the opposite strand, was orfdwn3. Because no stop codon was observed for orfdwn3 on the 3.7 kb fragment, it encoded an incomplete polypeptide of 219 aa.
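The polypeptide lengths given above translate into expected coding lengths at three nucleotides per codon. The following sketch assumes that each stated length counts the initiating residue and that a complete ORF carries one additional stop codon; the table and function names are hypothetical.

```python
# (ORF name, polypeptide length in aa, complete ORF?) as stated in the text.
ORFS = [
    ("orfup1", 344, True),
    ("orfup2", 151, True),
    ("orfdwn1", 328, True),
    ("orfdwn2", 394, True),
    ("orfdwn3", 219, False),   # no stop codon found on the 3.7 kb fragment
]


def coding_length_nt(aa_count, complete):
    """Three nucleotides per residue, plus a stop codon if the ORF is complete."""
    return aa_count * 3 + (3 if complete else 0)


lengths = {name: coding_length_nt(aa, complete) for name, aa, complete in ORFS}
for name, nt in lengths.items():
    print(name, nt, "nt")
```

For the incomplete orfdwn3 the figure is a lower bound on the length of the full gene, since the remainder lies outside the sequenced fragment.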
Gene disruption of the orfup and orfdwn open reading frames
To assess the possible roles of the open reading frames flanking cas1 in the biosynthesis of clavulanic acid and the other clavams produced by S. clavuligerus, insertional inactivation or deletion mutants were created by gene replacement. The method used for gene disruption and replacement was essentially as described by Paradkar and Jensen (1995 ibid).
A. orfup1
A 1.5 kb NcoI-NcoI fragment carrying the apramycin resistance gene (aprr), constructed as described by Paradkar and Jensen (1995 ibid), was treated with Klenow fragment to generate blunt ends (Sambrook et al., 1989 ibid) and ligated to pCEC026 that had been digested with BsaBI and likewise treated with Klenow fragment. pCEC026 possesses a BsaBI site located within orfup1, 636 bp from the translational start codon. The ligation mixture was used to transform competent cells of E. coli GM2163 (obtainable from New England Biolabs, USA; Marinus, M.G. et al., MGG (1983) vol 122, p288-9) to apramycin resistance. From the resulting transformants, two clones containing the plasmids pCEC054 and pCEC055 were isolated; restriction analysis showed that pCEC054 possessed the aprr fragment in the same orientation as orfup1, while pCEC055 possessed it in the opposite orientation. To introduce pCEC054 into S. clavuligerus, the plasmid DNA was digested with BamHI and HindIII and ligated to the high copy number Streptomyces vector pIJ486 (6.2 kb; Ward et al., (1986) Mol. Gen. Genet. 203: 468-478). The ligation mixture was then used to transform competent E. coli GM2163 cells to apramycin resistance. From the resulting transformants, one clone carrying the shuttle plasmid pCEC061 was isolated. This plasmid was then used to transform S. clavuligerus NRRL 3585. The resulting transformants were put through two successive rounds of sporulation on non-selective media and then replica plated to antibiotic-containing media to identify apramycin-resistant, thiostrepton-sensitive transformants. From this procedure, four putative mutants (61-1A, -2A, -3A and -4A) were chosen for further analysis. To confirm that these putative mutants were disrupted in orfup1, DNA was prepared from isolates 61-1A and 61-2A, digested with SacI and subjected to Southern blot analysis. The Southern blot results were consistent with a double cross-over having occurred and demonstrated that these mutants are true disruption replacement mutants in orfup1. The mutants 61-1A, -2A, -3A and -4A were grown in soybean meal medium and the culture supernatants were assayed by HPLC for production of clavulanic acid and clavams. The composition of the soybean meal medium and the method for assaying clavams by HPLC have been reported previously (Paradkar and Jensen, 1995 ibid), except that the running buffer for the HPLC assay consisted of 0.1 M NaH2PO4 + 6% methanol, pH 3.68 (adjusted with glacial acetic acid). The HPLC analysis indicated that none of the mutants produced detectable levels of clavam-2-carboxylate or 2-hydroxymethylclavam. In addition, when the culture supernatants were bioassayed against Bacillus sp. ATCC 27860, using the method of Pruess and Kellett (1983, J. Antibiot. 36: 208-212), none of the mutants produced detectable levels of alanylclavam. In contrast, HPLC assays of the culture supernatants showed that the mutants appeared to produce higher levels of clavulanic acid than the wild type (Table 1).
TABLE 1 Clavulanic acid (CA) titre of orfup1 mutants in shake flask tests
Deletion of orfup1
A cloning experiment was undertaken to create a gene deletion of 654 nucleotides between the AatII sites of orfup1. PCR products were generated using the oligonucleotide primers listed below and pCEC061, described above, as template. The original nucleotide sequence was altered to incorporate a PstI site into oligo 11 and an SphI site into oligo 14.
Oligonucleotide pair 1, used to generate PCR product 1
Primer 11: 5' dCTGACGCTGCAGGAGGAAGTCCCGC 3'
Primer 12: 5' dCGGGGCGAGGACGTCGTCCCGATCC 3'
Oligonucleotide pair 2, used to generate PCR product 2
Primer 13: 5' dGAGCCCCTGGACGTCGGCGGTGTCC 3'
Primer 14: 5' dGACGGTGCATGCTCAGCAGGGAGCG 3'
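The restriction sites carried by the four primers can be checked by a simple motif search. The sketch below is illustrative only: the primer sequences are those listed above (without the leading "d"), the recognition sequences are the standard ones for PstI (CTGCAG), AatII (GACGTC) and SphI (GCATGC), and the helper name is hypothetical. All three sites are palindromic, so searching one strand is sufficient.

```python
# Standard recognition sequences (all palindromic, so one strand suffices).
SITES = {"PstI": "CTGCAG", "AatII": "GACGTC", "SphI": "GCATGC"}

# Primer sequences as listed in the text, 5' to 3'.
PRIMERS = {
    "11": "CTGACGCTGCAGGAGGAAGTCCCGC",
    "12": "CGGGGCGAGGACGTCGTCCCGATCC",
    "13": "GAGCCCCTGGACGTCGGCGGTGTCC",
    "14": "GACGGTGCATGCTCAGCAGGGAGCG",
}


def sites_in(sequence):
    """Return the names of the restriction sites found in `sequence`."""
    return sorted(name for name, motif in SITES.items() if motif in sequence)


for number, sequence in PRIMERS.items():
    print("primer", number, len(sequence), "nt", sites_in(sequence))
```

On this check primers 11 and 14 carry the introduced PstI and SphI sites respectively, while primers 12 and 13 each span an AatII site, in keeping with a design in which the two PCR products are joined at AatII to remove the intervening region of orfup1.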
Standard PCR reactions were carried out using the PTC-200 Peltier thermal cycler from GRI (Felsted, Dunmow, Essex, CM6 3LD). PCR product 1 was generated using primers 11 and 12. This product is approximately 1 kb and contains the carboxy terminus of orfup1 from the second AatII site and the downstream regions. PCR product 2 was generated using primers 13 and 14. This product is approximately 1.1 kb and contains the amino terminus of orfup1 from the first AatII site and the upstream regions. PCR product 2 was ligated to SrfI-digested pCR-Script Amp SK(+) according to the manufacturer's instructions (Stratagene Ltd, Cambridge Science Park, Milton Road, Cambridge CB4 4GF). The ligation mixture was used to transform Epicurian Coli XL1-Blue MRF' Kan supercompetent E. coli cells (obtainable from Stratagene) to ampicillin resistance (according to the manufacturer's instructions). Plasmid DNA was isolated from the resulting transformants, and DNA restriction analysis revealed that 7 clones contained plasmid into which PCR product 2 had been ligated. One of these plasmids was designated pDES1. PCR product 1 was digested with PstI and AatII and the resulting DNA was fractionated by agarose gel electrophoresis. The 1 kb fragment was excised and eluted from the gel using the Sephaglas band prep kit (Pharmacia, St Albans, Herts, AL1 3AW). The isolated fragment was then ligated to pDES1 digested with AatII and PstI. The ligation mixture was used to transform competent E. coli XL1-Blue cells (obtainable from Stratagene) to ampicillin resistance (according to the manufacturer's instructions). Plasmid DNA was isolated from the resulting transformants, and restriction analysis revealed that one clone contained plasmid into which the PCR product had been ligated. This plasmid was designated pDES2. To introduce pDES2 into S. clavuligerus, the plasmid was further modified to contain an origin of replication able to function in Streptomyces. To achieve this, pDES2 plasmid DNA was digested with EcoRI and HindIII and ligated to the high copy number Streptomyces vector pIJ486 (6.2 kb; Ward et al., (1986) Mol. Gen. Genet. 203: 468-478), also digested with EcoRI and HindIII. The ligation mixture was used to transform competent E. coli (JM109) cells to ampicillin resistance. Plasmid DNA was isolated from the resulting transformants, and restriction analysis revealed that 6 clones possessed pDES2 containing pIJ486. One of these plasmids was designated pDES3. Plasmid pDES3 was used to transform a strain of S. clavuligerus in which the orfup1 gene had already been disrupted by insertion of the apramycin resistance gene (as described above). Thiostrepton-resistant transformants were selected, and these transformants were then put through three rounds of sporulation on non-selective media and screened for loss of apramycin resistance. From this procedure, 45 mutants that had lost apramycin resistance were identified. These were then analyzed by HPLC, which confirmed that these strains, like the orfup1 disruptants 61-1A, 61-2A, 61-3A and 61-4A, were unable to produce clavam-2-carboxylate and 2-hydroxymethylclavam when fermented under conditions in which these clavams are normally produced.
B. orfdwn1 and orfdwn2
A deletion/replacement mutant in orfdwn1 and orfdwn2 was created by first digesting pCEC018 (7.3 kb) with NcoI, releasing a 1 kb subfragment that contained most of orfdwn1 and a portion of orfdwn2. The digest was fractionated by agarose gel electrophoresis and the 6.3 kb fragment was excised and eluted from the gel. This fragment was then ligated to an NcoI-NcoI DNA fragment carrying aprr and used to transform E. coli XL1-Blue to apramycin resistance. One clone was obtained from this experiment, but restriction analysis of the resulting recombinant plasmid revealed that two copies of the apramycin resistance fragment had ligated into the deletion plasmid. To remove the additional copy of the aprr fragment, the plasmid was digested with NcoI and self-ligated. The ligation mixture was used to transform E. coli GM2163 to apramycin resistance. From the transformants, two clones were isolated containing the plasmids pCEC052 and pCEC053, both of which possessed only one copy of the aprr fragment; pCEC052 possessed the aprr fragment oriented inversely with respect to orfdwn1 and 2, whereas pCEC053 possessed the aprr fragment inserted in the same orientation as orfdwn1 and 2. A shuttle plasmid of pCEC052 was constructed by ligating BamHI-digested pCEC052 with similarly digested pIJ486 and transforming E. coli GM2163 to apramycin resistance. From this experiment, one clone containing the shuttle plasmid pCEC060 was isolated. This plasmid was used to transform wild-type S. clavuligerus NRRL 3585 to apramycin and thiostrepton resistance. The resulting transformants were put through two rounds of sporulation under non-selective conditions and then replica plated to antibiotic-containing media to identify apramycin-resistant, thiostrepton-sensitive colonies. Three putative mutants (60-1A, -2A and -3A) were chosen for further analysis. To establish the identity of these putative mutants, genomic DNA was isolated from strains 60-1A and 60-2A, digested with SacI or BstEII, and subjected to Southern blot analysis. The hybridization bands generated in this experiment were consistent with both strains having undergone a double cross-over event, demonstrating that these mutants are true disruption replacement mutants in orfdwn1/2. When these were grown in soybean meal medium and the culture supernatants were assayed by HPLC, none of the mutants produced detectable levels of clavam-2-carboxylate or 2-hydroxymethylclavam. A bioassay of the culture supernatants showed that the mutants also failed to produce detectable levels of alanylclavam. As with the orfup1 mutants, the orfdwn1/2 mutants are able to produce higher levels of clavulanic acid than the wild type (Table 2).
TABLE 2 Clavulanic acid (CA) titre of orfdwn1/2 mutants in shake flask tests
C. orfdwn3
To disrupt orfdwn3, pCEC023 (consisting of a 3.7 kb fragment of DNA downstream of cas1 subcloned into pSL1180) was digested with NcoI and then self-ligated. After transforming E. coli with the ligation mixture, a clone carrying the plasmid pCEC031 was isolated. This plasmid retained only the 1.9 kb NcoI-EcoRI fragment encoding a portion of orfdwn2 and the incomplete orfdwn3. Examination of the DNA sequence revealed that pCEC031 possesses a unique BstEII site 158 bp from the translational start site of orfdwn3. pCEC031 was therefore digested with BstEII, treated with Klenow fragment to create blunt ends and then ligated to a blunt-ended apramycin resistance cassette. The ligation mixture was used to transform E. coli GM2163 to apramycin resistance and ampicillin resistance. Two transformants, containing pCEC050 and pCEC051 respectively, were selected. Restriction analysis revealed that the apramycin resistance cassette was in the same orientation as orfdwn3 in pCEC050 and in the opposite orientation in pCEC051. Both plasmids were then digested with HindIII and ligated to similarly digested pIJ486. The ligation mixtures were then used to transform E. coli GM2163 to apramycin and ampicillin resistance. The shuttle plasmids pCEC056 (pCEC050 + pIJ486) and pCEC057 (pCEC051 + pIJ486) were isolated from the resulting transformants. Both plasmids were then transformed into S. clavuligerus NRRL 3585. One transformant was selected from each transformation experiment, put through successive rounds of sporulation on non-selective media and then replica plated to antibiotic-containing media to identify apramycin-resistant, thiostrepton-sensitive transformants. From this procedure, two putative mutants were isolated from the progeny of each primary transformant (56-1A and 56-3A for pCEC056, and 57-1C and 57-2B for pCEC057). To establish the identity of these putative mutants, genomic DNA was isolated from these strains, digested with SacI or Acc65I, and subjected to Southern blot analysis. The hybridization bands generated in this experiment were consistent with the strains having undergone a double cross-over event, demonstrating that these mutants are true disruption replacement mutants in orfdwn3. When these were grown in soybean meal medium and the culture supernatants were assayed by HPLC, none of the mutants produced detectable levels of clavam-2-carboxylate or 2-hydroxymethylclavam. A bioassay of the culture supernatants showed that the mutants also failed to produce detectable levels of alanylclavam. As with the orfup1 and orfdwn1/2 mutants, the orfdwn3 mutants are able to produce higher levels of clavulanic acid than the wild type (Table 3).
TABLE 3 Clavulanic acid (CA) titre of orfdwn3 mutants in shake flask tests
The application discloses the following nucleotide and amino acid sequences:
SEQ ID NO: 1 - a DNA sequence of 7193 bp
SEQ ID NO: 2 - the sequence of orfup3
SEQ ID NO: 3 - the sequence of orfup2
SEQ ID NO: 4 - the sequence of orfup1
SEQ ID NO: 5 - the sequence of orfdwn1
SEQ ID NO: 6 - the sequence of orfdwn2
SEQ ID NO: 7 - the sequence of orfdwn3
SEQ ID NO: 8 - the sequence of primer 11
SEQ ID NO: 9 - the sequence of primer 12
SEQ ID NO: 10 - the sequence of primer 13
SEQ ID NO: 11 - the sequence of primer 14
SEQ ID NO: 12 - the DNA sequence of the open reading frame of CAS
SEQ ID NO: 13 - the predicted partial amino acid sequence of the polypeptide encoded by orfup3
SEQ ID NO: 14 - the predicted amino acid sequence of the polypeptide encoded by orfup2
SEQ ID NO: 15 - the predicted amino acid sequence of the polypeptide encoded by orfup1
SEQ ID NO: 16 - the predicted amino acid sequence of the polypeptide encoded by orfdwn1
SEQ ID NO: 17 - the predicted amino acid sequence of the polypeptide encoded by orfdwn2
SEQ ID NO: 18 - the predicted amino acid sequence of the polypeptide encoded by orfdwn3
SEQ ID NO: 19 - the amino acid sequence of CAS
SEQUENCE LISTING
(1) GENERAL INFORMATION
(i) APPLICANT: SmithKline Beecham plc et al
(ii) TITLE OF THE INVENTION: Novel compounds
(iii) NUMBER OF SEQUENCES: 19
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: SmithKline Beecham
(B) STREET: Two, New Horizons Court, Great West Road
(C) CITY: Brentford
(D) STATE:
(E) COUNTRY: UNITED KINGDOM
(F) POSTAL CODE: TW8 9EP
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette
(B) COMPUTER: IBM Compatible
(C) OPERATING SYSTEM: DOS
(D) SOFTWARE: FastSEQ for Windows Version 2.0
(vi) APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Valentine, Jill B
(B) REGISTRATION NUMBER:
(C) REFERENCE NUMBER: P31731
(ix) TELECOMMUNICATIONS INFORMATION:
(A) TELEPHONE: 0181-9752000
(B) TELEFAX: 0181-9756294
(C) TELEX:
(2) INFORMATION FOR SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7193 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
CCATGGCGGG CGGCGGCTGC CCCGGAGCCT CGGCCGGACC GGTGACCAGG ACCACCCCGG 60
TGGGATAGTG GCCCGCCACC CGGCGCAGCA GACTCCCGGA CACGGACCCG TGGGTGTGCG 120
CGGAAAGGCC CGGAGGCCGG GTCACAGCCA CGGGTAACGC GCGGTGTCCT TGCCCGCGTA 180
ATCGGGGTCC AGATAGACGA AGGCCCGGTG GACGAGGAAG TCCCGCACCT CGTAGACCGT 240
GCACCAGCGC CCGGCGGCCC ACTCGGGGTC ACCCGCCCGC CACGGCCCGT CCCGGTGCTC 300
ACCGTGGGTG GTGCCCTCCG CGGCGAGGAG TTCGGTCCCG GTCAGAATCC AGTTGACGGA 360
CCACAGATGG TGGGTGATCG AGCGGATGGT GCCCCCGAGG TCGTCGAAGA GCCGGGCGAT 420
CTCGGACTTG CCCCGGGCCA GACCCCACTT GGGGAAGAAG AAGACCGCGT CCTCGGCGAA 480
GTAGTCGATC GCGGGGGTGC CGTCGCTGCC GACGCCGCCG TTGTCGAACG CCTTGAAGTA 540
CGCGGTGATG ACCGCCTTGC GCTGCTCGTC CGTCATACCG GCCGATGCCA CGGACATGAA 600
ACGACCTCCA GAGATTCCGG GTGGCTGTGC TGGGGCTGCG GAAGGGGTGT CCCCCGCGAA 660
GGACGGCGGA CGCCGCGGAC GCCGCGGCCG TCTCCCCGGC GGACGGGTCC CAGCGTCCTG 720
GAGAGGGCTT GGCGGCGGCT TGACGCCGTG CTGTCCCGCG GCTTGCGGAA CGCGAAGTAC 780
CGGCCAGCGT ACGGGCGTTG CACCGGACGT GTACGCCGGT CGGGACCCCT CGTACCCCCG 840
GAGCCGGCCG ACCCCGGCGG CTCCGGGGGT ACGGACGCGC CGGACCGGCC CGAGCGAGCC 900
GGACGGGTCG GACGGTGCGC GTGGTTCCGG TGTGTCGGAC AGCTCGGACG GACCGGACGG 960
TGCGCGTGGT TCCGGTGTGT CGGACAGCTC GGACGGGTCG GACGGTGCGC GTGGTTCCGG 1020
CACGCCGGAC GGGTCAGTTG CCGATCATGG CGAGCAATGC CGGGGTGTAC CGCTCCCCGG 1080
ACACCGGGTG GGAGATCGCG GCCGTCACCT CCGCGAGGGA CCGGTCGTCC AGCCGGATCG 1140
AGGCGGCGGC GAGATTGTCC GCGAGATGGG CCGGGTTCGC GGTGCCCGGG ATCGGGACGA 1200
CGTCCTCGCC CCGGTGGTGC AGCCAGGCGA GCGCGAGCTG TGCCAGGGTC AGCCCCAGAC 1260
CGTCCGCGAC CGGGCGCAGC CGGTGCAGCA ACGAGCGGTT GCGCGCGAGG GCCGGAGCGC 1320
TGAACCGGGG CTGGCCCCGG CGGAAGTCCT CGTCCCCCAG ATCGTCGGTG GTGCGGATGG 1380
TGCCGGTGAG AAAACCCCGT CCCAGAGGGG CGTAAGCGAC GATCCCGATC CCCAGCTCCC 1440
GGCAGACGGG CACCACCTCG TCCTCGATCC CGCGCGACCA CAGGCTCCAC TCGCTCTGCA 1500
CCGCCGTCAC CGGGTGCACC GCGTCCGCCC GGCGCAGCGT GGCCGCGGAG GGCTCGGAGA 1560
GACCGAGCCT GCGGACCTTG CCCTCGCGCA CCAGCTCGGC CACCGCACCC ACGGTCTCCT 1620
CGATCGGCAC CGCCGGGTCC GTCCAGTGCT GGTAGTACAG GTCGATGCGG TCGGTGCCGA 1680
GACGACGCAG GGACCGTTCG CAGGCCGCGC GGACGTAGGA CGGCTCGCCG CACAAGCCCT 1740
GGGAGGCGCC GTCGGACGAG CGCACCATGC CGAACTTGGT GGCGATCAGC ACCTCGTCCC 1800
GGCGGCCCGC GACCGCCCGT CCGAGCAGCT CCTCACCGGC GCCGAGCCCC TGGACGTCGG 1860
CGGTGTCCAG CAGGGTGACC CCGGCGTCGA CGGCGGCGCG GATGGTGGCC GTCGCCCGGG 1920
CGCGGTCCGG GCGTCCGTAG AAGTCGGTGG TCGGCAGGCA GCCGAGCCCC TGGGCACTGA 1980
CCGGAAGGTC CCGCAGGGCG CGGACCGGCG GACGCGGAAC CGCGGCGGAC ACGGAACCGG 2040
CCGGGGACTC GGGCGGAGAG CGGGACATAC GGAACCTCCA CAGGCGGAGC CGGGAACGGG 2100
ACGAGGGCGA GGACGGGACG GAACGAAGGA GAGGACGGGA CGGACAGCAC GGACGGGACG 2160
GACGGAACGG AGTCGGGAAC CGGGGGGGGT GACCGGAACC GGGCCGTCCT TGGCCCTCCC 2220
CCGTCCTCCC CGCCATCCGC CGTTCTCCCC CGTTCCCTCT CCCGTCCTCC AGCCAACACC 2280
GCCGCCCTTT CCAAGCGCTT GACACGGCAC CGACAGCCGC CGCCGGGCGC CCGATGGGGA 2340
CCCGTGCCCG CCGGTGAGCG GCGGTGAGCG CCGGTACGGG ACCCCACGCG CCGCCGCCCG 2400
GGCGCCCGCC AGGGCCCGCG CGGCCACCCC GGCCCGCCCC GGCCGGAGCG GCGATCCGGG 2460
CCGCTCGCTG CAAGAGGAAC ATCCACAGCC GCACAAGGAG CGCTCCGCAC AGTGGGCACC 2520
ACGTCCGCCC CGTCCCCCAC ACCGTGGCCG GTCCCCACCG GACAGCACAG CACCGCACAG 2580
CACCACATCG CACGGCACAG CACAGCACCA CCGGCACGAG GAACCAAGGA AAGGAACCAC 2640
ACCACCATGA CCTCAGTGGA CTGCACCGCG TACGGCCCCG AGCTGCGCGC GCTCGCCGCC 2700
CGGCTGCCCC GGACCCCCCG GGCCGACCTG TACGCCTTCC TGGACGCCGC GCACACAGCC 2760
GCCGCCTCGC TCCCCGGCGC CCTCGCCACC GCGCTGGACA CCTTCAACGC CGAGGGCAGC 2820
GAGGACGGCC ATCTGCTGCT GCGCGGCCTC CCGGTGGAGG CCGACGCCGA CCTCCCCACC 2880
ACCCCGAGCA GCACCCCGGC GCCCGAGGAC CGCTCCCTGC TGACCATGGA GGCCATGCTC 2940
GGACTGGTGG GCCGCCGGCT CGGTCTGCAC ACGGGGTACC GGGAGCTGCG CTCGGGCACG 3000
GTCTACCACG ACGTGTACCC GTCGCCCGGC GCGCACCACC TGTCCTCGGA GACCTCCGAG 3060
ACGCTGCTGG AGTTCCACAC GGAGATGGCC TACCACCGGC TCCAGCCGAA CTACGTCATG 3120
CTGGCCTGCT CCCGGGCCGA CCACGAGCGC ACGGCGGCCA CACTCGTCGC CTCGGTCCGC 3180
AAGGCGCTGC CCCTGCTGGA CGAGAGGACC CGGGCCCGGC TCCTCGACCG GAGGATGCCC 3240
TGCTGCGTGG ATGTGGCCTT CCGCGGCGGG GTGGACGACC CGGGCGCCAT CGCCCAGGTC 3300
AAACCGCTCT ACGGGGACGC GGACGATCCC TTCCTCGGGT ACGACCGCGA GCTGCTGGCG 3360
CCGGAGGACC CCGCGGACAA GGAGGCCGTC GCCGCCCTGT CCAAGGCGCT CGACGAGGTC 3420
ACGGAGGCGG TGTATCTGGA GCCCGGCGAT CTGCTGATCG TCGACAACTT CCGCACCACG 3480
CACGCGCGGA CGCCGTTCTC GCCCCGCTGG GACGGGAAGG ACCGCTGGCT GCACCGCGTC 3540
TACATCCGCA CCGACCGCAA TGGACAGCTC TCCGGCGGCG AGCGCGCGGG CGACGTCGTC 3600
GCCTTCACAC CGCGCGGCTG AGCTCCCGGG TCCGACACCG CGCGGCTGAA CCCACGGTCC 3660
GGGGCCCACG GTCCGGCACC GCGCGGCTGA GCCCCCGGGT CCGGCAGCGG GCGGCTGAAC 3720
CCCCGCCCCG GGCCACCGCC CGACCGCCCC CGCGCACCGG ACGCGCCCGC CTGTACGGCG 3780
GTCCCGCCCG GGCCCGTACA CCTGAAGCGC CCGGCGGACC GCCGCCCCGC CGGGGGACGG 3840
ACAGAGCCGG GTGCGGGAGG ACGTCCTCCC GCACCCGGCT CCCACCGTTC CGCACCGACC 3900
GCACCCGACC GTGCCGCAGG CGCCACCGGC ACCGCACCGC CCGCGCCGGC AGCCACCACA 3960
GGCGCCACGC CGCCCGCACG GTGCCCGCGC TGCTCAGCCC CCGTCCACCG GGCTGTCCAG 4020
CAGCCGCCGC AGCGCGCCCC CGATGAACTC CCGGTCGGCG GCCGACCCCC CGGACCCCGC 4080
GAGATGCCCC CACACTCCCG GGATCACCTC CAGCGAGGCA TACGGCAGCA GATCGGCCAC 4140
CCGCTTCTCG TCCTCGACGG CGAAACACAC GTCCAGGGCG CCCGGCAGCA CCACGGCCCG 4200
CGCCGTGACG GAGGCCAGCG CCGCCTCGAC GCTCCCCCCG GCCCCGGGTG TCGCCCCCAC 4260
ATCCGTGTTC TCCCAGGTGC GCACCATGGT GAGCAGATCC GCGGCGCCGG GCCCGGAGAG 4320
GAAGACCTGC TCCCAGAAGC CGGTGAGGTA CTCCTCGCGG GTGGCGAAAC CCAGCTCCCG 4380
GTGGGCACGG CGGGCCCAGA AGGAACGCGA GGTCCCCCAC CCGGCGAACA CCCGGCCCGC 4440
CGCCTTCCGC CCCCGCTCCC CGGCGTCGGC GCTGAGCGCC GCGGCCAGAC CGGACAGCAG 4500
GACCAGGCTG TGCGGGCTGC TCACCGGCGC CCCGCAGATC GGGGCGATCC GGCGCACCAT 4560
CCCCGGATGC GACACGGCCC ACTGGTAGGC GTGGGCCGCG CCCATCGACC AGCCCGTGAC 4620
CAGGGCCAGT TCCCGTACCC CCAGCTCCTC GGTGAGCAGC CGGTGCTGCG CCGCGACATT 4680
GTCCTGCGGA GTGATCAGCG GAAAGCGGGA CCCCGACGGG TGGTTGCCGG GCGAGCTGGA 4740
GACCCCGTTG CCGAAGAGTC CGGCGGTGAC GACGCAGTAC CGCCGGGTGT CCAGCGGCAG 4800
CCCCGCACCG ATCAGCCAGT CGTACCCGGT GTGGTCCCGG CCGAAGAACG ACGGACAGAG 4860
CACCACGTTC GTCCCGTCGG CGTTCGGCGT GCCGTACATG GCGTAACCGA TCCGGGCGTC 4920
CCGCAGGACC TCCCCGTCCA GCAACGGCAG TTCGTCGATC TCGAATATGC GGCATTCCAC 4980
CGCTGACCTC CTTGTTCGAT CCCCCCGGAC AACAGGTCGG TCGTGGCCGG AGACTCAGAG 5040
CCAGTTGGGG GCGATCTCGG TGGCCCACAG CTCCAGGCTG CGCAGCTGGA CATCGTGCGG 5100
GATCAGCCCG GAGTACTGGC ACTGGAGCAG ATACTCCGGA TCGTGCCGCT CCACCAGCTT 5160
CTCGATCATG CGGTTGATGT CGTCCGGGGT GCCGACCCAC TCCAGCCCCC GGTCGACCAG 5220
GGTCTTGTAG TCCGAGCCGA TCGGACCCGT CTCGCCGGTC GCGCGCAGCG CCTCGGTGAA 5280
GCCCATGGGG CCGAACCAGT TCTCGAAGAT GAAGCCGCCG CCGCGGGACG CCCAGTGGTG 5340
GGCCTCGCCG GAGTCCCGGG AGACCAGGAC GTCCTTCATC ACCCCGACCC GCTCGCCCCG 5400
CCGCAGGGTG CCGTGGCCCG CCGCCTCGGC CTCCTCCCGG TAGATGTCCA TCAGCCGGGC 5460
GACGATCTGG TCGTCGGTGT TCATCAGGAT CGGCACCACG CCCTCCCGGG CACAGAACCG 5520
GAACGTGTCC TCACTGAAGC TGAACGGCTG GAAGACGGGC GGGTGGGGGC GCTGGTAGGG 5580
CTTGGGCGCG ATGCCCACCT CGCGGATGAC GCCGTTCTCG TCGAGGCCCC GGCCGTAGCG 5640
GCGCACCGCC TCGTAGGGGA ACTCCAGGTC CGGCACCGGG ATCGTCCACT GCTCCCCGGA 5700
GTGGGTGAAC GTCTCGGTCG TCCACGCCTT CTTGATGATC TCCCAGTGCT CCTCGAAGAG 5760
GGCACGATTG CGCCGGTCCC GCTCCCCGGC GTCGGACAGG GTGCCGCCGA CCCCGTACAC 5820
CTGCCCCATG ATGTCGGCCC AGCGCTTCTG GAACCCGCGC GCGATCCCGA CGAAGGCGCG 5880
GCCCCGGGTC ATGTGGTCGA GCATCGCCAG ATCCTCGGCC AGCCGCAGCG GATTGTGCAG 5940
CGGCAGGACG TTGGCCATCT GGCCGACCCG GATGTGCCGG GTCTGCATGC CGAGGTAGAG 6000
CCCCAGCATG ATCGGGTTGT TGGAGACCTC GAAACCCTCG GTGTGGAAGT GGTGCTCGGT 6060
GAAGGACAGT CCCCAGTAGC CGAGTTCGTC GGCCGCCTGC GCCTGCCGGG TGAGCTGCCG 6120
GAGCATGTTC TGGTAGTTCT GCGGATTGAC CCCCGCCATA CCCCGCTGGA CCTGCGCATG 6180
ACTGCCGACC GTTGGCAGAT AGAAGAGAAT GGACTTCACC CTGGCTCCTC CGGTTCGCGG 6240
CGCCCTCCAT TGACGTGCGC CGAAAGCGGC TCGACCGTCC CACTCCGCCC TTGAGTTCCG 6300
TCTGACGCCG CGCCAGTCGG CGGGCCGTCC GCCGGGGTGC CCGCCGGGGT CCGCACCCGC 6360
CGGACGGCAC GGCGCGCACC GCGCGCGCGG CGCTTCGGGG CACCGGGCTC GACGGGGTGC 6420
TCAGCGGGAC GTCCAACGGA AGGCAAGCCC CCGTACCCAG CCTGGTCAAG GCGCTCATCG 6480
CCATTCCCTG AGGAGGTCCC GCCTTGACCA CAGCAATCTC CGCGCTCCCG ACCGTGCCCG 6540
GCTCCGGACT CGAAGCACTG GACCGTGCCA CCCTCATCCA CCCCACCCTC TCCGGAAACA 6600
CCGCGGAACG GATCGTGCTG ACCTCGGGGT CCGGCAGCCG GGTCCGCGAC ACCGACGGCC 6660
GGGAGTACCT GGACGCGAGC GCCGTCCTCG GGGTGACCCA GGTGGGCCAC GGCCGGGCCG 6720
AGCTGGCCCG GGTCGCGGCC GAGCAGATGG CCCGGCTGGA GTACTTCCAC ACCTGGGGGA 6780
CGATCAGCAA CGACCGGGCG GTGGAGCTGG CGGCACGGCT GGTGGGGCTG AGCCCGGAGC 6840
CGCTGACCCG CGTCTACTTC ACCAGCGGCG GGGCCGAGGG CAACGAGATC GCCCTGCGGA 6900
TGGCCCGGCT CTACCACCAC CGGCGCGGGG AGTCCGCCCG TACCTGGATA CTCTCCCGCC 6960
GGTCGGCCTA CCACGGCGTC GGATACGGCA GCGGCGGCGT CACCGGCTTC CCCGCCTACC 7020
ACCAGGGCTT CGGCCCCTCC CTCCCGGACG TCGACTTCCT GACCCCGCCG CAGCCCTACC 7080
GCCGGGAGCT GTTCGCCGGT TCCGACGTCA CCGACTTCTG CCTCGCCGAA CTGCGCGAGA 7140
CCATCGACCG GATCGGCCCG GAGCGGATCG CGGCGATGAT CGGCGAGCCG ATC 7193
(2) INFORMATION FOR SEQ ID NO: 2: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 145 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: other (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
GTGACCCGGC CTCCGGGCCT TTCCGCGCAC ACCCACGGGT CCGTGTCCGG GAGTCTGCTG 60
CGCCGGGTGG CGGGCCACTA TCCCACCGGG GTGGTCCTGG TCACCGGTCC GGCCGAGGCT 120
CCGGGGCAGC CGCCGCCCGC CATGG 145
(2) INFORMATION FOR SEQ ID NO: 3: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 453 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: other (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
ATGTCCGTGG CATCGGCCGG TATGACGGAC GAGCAGCGCA AGGCGGTCAT CACCGCGTAC 60
TTCAAGGCGT TCGACAACGG CGGCGTCGGC AGCGACGGCA CCCCCGCGAT CGACTACTTC 120
GCCGAGGACG CGGTCTTCTT CTTCCCCAAG TGGGGTCTGG CCCGGGGCAA GTCCGAGATC 180
GCCCGGCTCT TCGACGACCT CGGGGGCACC ATCCGCTCGA TCACCCACCA TCTGTGGTCC 240
GTCAACTGGA TTCTGACCGG GACCGAACTC CTCGCCGCGG AGGGCACCAC CCACGGTGAG 300
CACCGGGACG GGCCGTGGCG GGCGGGTGAC CCCGAGTGGG CCGCCGGGCG CTGGTGCACG 360
GTCTACGAGG TGCGGGACTT CCTCGTCCAC CGGGCCTTCG TCTATCTGGA CCCCGATTAC 420
GCGGGCAAGG ACACCGCGCG TTACCCGTGG CTG 453
(2) INFORMATION FOR SEQ ID NO: 4: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1032 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: other (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
ATGTCCCGCT CTCCGCCCGA GTCCCCGGCC GGTTCCGTGT CCGCCGCGGT TCCGCGTCCG 60
CCGGTCCGCG CCCTGCGGGA CCTTCCGGTC AGTGCCCAGG GGCTCGGCTG CCTGCCGACC 120
ACCGACTTCT ACGGACGCCC GGACCGCGCC CGGGCGACGG CCACCATCCG CGCCGCCGTC 180
GACGCCGGGG TCACCCTGCT GGACACCGCC GACGTCCAGG GGCTCGGCGC CGGTGAGGAG 240
CTGCTCGGAC GGGCGGTCGC GGGCCGCCGG GACGAGGTGC TGATCGCCAC CAAGTTCGGC 300
ATGGTGCGCT CGTCCGACGG CGCCTCCCAG GGCTTGTGCG GCGAGCCGTC CTACGTCCGC 360
GCGGCCTGCG AACGGTCCCT GCGTCGTCTC GGCACCGACC GCATCGACCT GTACTACCAG 420
CACTGGACGG ACCCGGCGGT GCCGATCGAG GAGACCGTGG GTGCGGTGGC CGAGCTGGTG 480
CGCGAGGGCA AGGTCCGCAG GCTCGGTCTC TCCGAGCCCT CCGCGGCCAC GCTGCGCCGG 540
GCGGACGCGG TGCACCCGGT GACGGCGGTG CAGAGCGAGT GGAGCCTGTG GTCGCGCGGG 600
ATCGAGGACG AGGTGGTGCC CGTCTGCCGG GAGCTGGGGA TCGGGATCGT CGCTTACGCC 660
CCTCTGGGAC GGGGTTTTCT CACCGGCACC ATCCGCACCA CCGACGATCT GGGGGACGAG 720
GACTTCCGCC GGGGCCAGCC CCGGTTCAGC GCTCCGGCCC TCGCGCGCAA CCGCTCGTTG 780
CTGCACCGGC TGCGCCCGGT CGCGGACGGT CTGGGGCTGA CCCTGGCACA GCTCGCGCTC 840
GCCTGGCTGC ACCACCGGGG CGAGGACGTC GTCCCGATCC CGGGCACCGC GAACCCGGCC 900
CATCTCGCGG ACAATCTCGC CGCCGCCTCG ATCCGGCTGG ACGACCGGTC CCTCGCGGAG 960
GTGACGGCCG CGATCTCCCA CCCGGTGTCC GGGGAGCGGT ACACCCCGGC ATTGCTCGCC 1020
ATGATCGGCA AC 1032
(2) INFORMATION FOR SEQ ID NO: 5: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 984 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: other (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
GTGGAATGCC GCATATTCGA GATCGACGAA CTGCCGTTGC TGGACGGGGA GGTCCTGCGG 60
GACGCCCGGA TCGGTTACGC CATGTACGGC ACGCCGAACG CCGACGGGAC GAACGTGGTG 120
CTCTGTCCGT CGTTCTTCGG CCGGGACCAC ACCGGGTACG ACTGGCTGAT CGGTGCGGGG 180
CTGCCGCTGG ACACCCGGCG GTACTGCGTC GTCACCGCCG GACTCTTCGG CAACGGGGTC 240
TCCAGCTCGC CCGGCAACCA CCCGTCGGGG TCCCGCTTTC CGCTGATCAC TCCGCAGGAC 300
AATGTCGCGG CGCAGCACCG GCTGCTCACC GAGGAGCTGG GGGTACGGGA ACTGGCCCTG 360
GTCACGGGCT GGTCGATGGG CGCGGCCCAC GCCTACCAGT GGGCCGTGTC GCATCCGGGG 420
ATGGTGCGCC GGATCGCCCC GATCTGCGGG GCGCCGGTGA GCAGCCCGCA CAGCCTGGTC 480
CTGCTGTCCG GTCTGGCCGC GGCGCTCAGC GCCGACGCCG GGGAGCGGGG GCGGAAGGCG 540
GCGGGCCGGG TGTTCGCCGG GTGGGGGACC TCGCGTTCCT TCTGGGCCCG CCGTGCCCAC 600
CGGGAGCTGG GTTTCGCCAC CCGCGAGGAG TACCTCACCG GCTTCTGGGA GCAGGTCTTC 660
CTCTCCGGGC CCGGCGCCGC GGATCTGCTC ACCATGGTGC GCACCTGGGA GAACACGGAT 720
GTGGGGGCGA CACCCGGGGC CGGGGGGAGC GTCGAGGCGG CGCTGGCCTC CGTCACGGCG 780
CGGGCCGTGG TGCTGCCGGG CGCCCTGGAC GTGTGTTTCG CCGTCGAGGA CGAGAAGCGG 840
GTGGCCGATC TGCTGCCGTA TGCCTCGCTG GAGGTGATCC CGGGAGTGTG GGGGCATCTC 900
GCGGGGTCCG GGGGGTCGGC CGCCGACCGG GAGTTCATCG GGGGCGCGCT GCGGCGGCTG 960
CTGGACAGCC CGGTGGACGG GGGC 984
(2) INFORMATION FOR SEQ ID NO: 6: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1182 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: other (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
GTGAAGTCCA TTCTCTTCTA TCTGCCAACG GTCGGCAGTC ATGCGCAGGT CCAGCGGGGT 60
ATGGCGGGGG TCAATCCGCA GAACTACCAG AACATGCTCC GGCAGCTCAC CCGGCAGGCG 120
CAGGCGGCCG ACGAACTCGG CTACTGGGGA CTGTCCTTCA CCGAGCACCA CTTCCACACC 180
GAGGGTTTCG AGGTCTCCAA CAACCCGATC ATGCTGGGGC TCTACCTCGG CATGCAGACC 240
CGGCACATCC GGGTCGGCCA GATGGCCAAC GTCCTGCCGC TGCACAATCC GCTGCGGCTG 300
GCCGAGGATC TGGCGATGCT CGACCACATG ACCCGGGGCC GCGCCTTCGT CGGGATCGCG 360
CGCGGGTTCC AGAAGCGCTG GGCCGACATC ATGGGGCAGG TGTACGGGGT CGGCGGCACC 420
CTGTCCGACG CCGGGGAGCG GGACCGGCGC AATCGTGCCC TCTTCGAGGA GCACTGGGAG 480
ATCATCAAGA AGGCGTGGAC GACCGAGACG TTCACCCACT CCGGGGAGCA GTGGACGATC 540
CCGGTGCCGG ACCTGGAGTT CCCCTACGAG GCGGTGCGCC GCTACGGCCG GGGCCTCGAC 600
GAGAACGGCG TCATCCGCGA GGTGGGCATC GCGCCCAAGC CCTACCAGCG CCCCCACCCG 660
CCCGTCTTCC AGCCGTTCAG CTTCAGTGAG GACACGTTCC GGTTCTGTGC CCGGGAGGGC 720
GTGGTGCCGA TCCTGATGAA CACCGACGAC CAGATCGTCG CCCGGCTGAT GGACATCTAC 780
CGGGAGGAGG CCGAGGCGGC GGGCCACGGC ACCCTGCGGC GGGGCGAGCG GGTCGGGGTG 840
ATGAAGGACG TCCTGGTCTC CCGGGACTCC GGCGAGGCCC ACCACTGGGC GTCCCGCGGC 900
GGCGGCTTCA TCTTCGAGAA CTGGTTCGGC CCCATGGGCT TCACCGAGGC GCTGCGCGCG 960
ACCGGCGAGA CGGGTCCGAT CGGCTCGGAC TACAAGACCC TGGTCGACCG GGGGCTGGAG 1020
TGGGTCGGCA CCCCGGACGA CATCAACCGC ATGATCGAGA AGCTGGTGGA GCGGCACGAT 1080
CCGGAGTATC TGCTCCAGTG CCAGTACTCC GGGCTGATCC CGCACGATGT CCAGCTGCGC 1140
AGCCTGGAGC TGTGGGCCAC CGAGATCGCC CCCAACTGGC TC 1182
(2) INFORMATION FOR SEQ ID NO: 7: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 660 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: other (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GTGCCCGGCT CCGGACTCGA AGCACTGGAC CGTGCCACCC TCATCCACCC CACCCTCTCC 60
GGAAACACCG CGGAACGGAT CGTGCTGACC TCGGGGTCCG GCAGCCGGGT CCGCGACACC 120
GACGGCCGGG AGTACCTGGA CGCGAGCGCC GTCCTCGGGG TGACCCAGGT GGGCCACGGC 180
CGGGCCGAGC TGGCCCGGGT CGCGGCCGAG CAGATGGCCC GGCTGGAGTA CTTCCACACC 240
TGGGGGACGA TCAGCAACGA CCGGGCGGTG GAGCTGGCGG CACGGCTGGT GGGGCTGAGC 300
CCGGAGCCGC TGACCCGCGT CTACTTCACC AGCGGCGGGG CCGAGGGCAA CGAGATCGCC 360
CTGCGGATGG CCCGGCTCTA CCACCACCGG CGCGGGGAGT CCGCCCGTAC CTGGATACTC 420
TCCCGCCGGT CGGCCTACCA CGGCGTCGGA TACGGCAGCG GCGGCGTCAC CGGCTTCCCC 480
GCCTACCACC AGGGCTTCGG CCCCTCCCTC CCGGACGTCG ACTTCCTGAC CCCGCCGCAG 540
CCCTACCGCC GGGAGCTGTT CGCCGGTTCC GACGTCACCG ACTTCTGCCT CGCCGAACTG 600
CGCGAGACCA TCGACCGGAT CGGCCCGGAG CGGATCGCGG CGATGATCGG CGAGCCGATC 660
(2) INFORMATION FOR SEQ ID NO: 8: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 25 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: other (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
CTGACGCTGC AGGAGGAAGT CCCGC 25
(2) INFORMATION FOR SEQ ID NO: 9: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 25 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: other (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
CGGGGCGAGG ACGTCGTCCC GATCC 25
(2) INFORMATION FOR SEQ ID NO: 10: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 25 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: other (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
GAGCCCCTGG ACGTCGGCGG TGTCC 25
(2) INFORMATION FOR SEQ ID NO: 11: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 25 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: other (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
GACGGTGCAT GCTCAGCAGG GAGCG 25
(2) INFORMATION FOR SEQ ID NO: 12: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 972 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: other (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
ATGACCTCAG TGGACTGCAC CGCGTACGGC CCCGAGCTGC GCGCGCTCGC CGCCCGGCTG 60
CCCCGGACCC CCCGGGCCGA CCTGTACGCC TTCCTGGACG CCGCGCACAC AGCCGCCGCC 120
TCGCTCCCCG GCGCCCTCGC CACCGCGCTG GACACCTTCA ACGCCGAGGG CAGCGAGGAC 180
GGCCATCTGC TGCTGCGCGG CCTCCCGGTG GAGGCCGACG CCGACCTCCC CACCACCCCG 240
AGCAGCACCC CGGCGCCCGA GGACCGCTCC CTGCTGACCA TGGAGGCCAT GCTCGGACTG 300
GTGGGCCGCC GGCTCGGTCT GCACACGGGG TACCGGGAGC TGCGCTCGGG CACGGTCTAC 360
CACGACGTGT ACCCGTCGCC CGGCGCGCAC CACCTGTCCT CGGAGACCTC CGAGACGCTG 420
CTGGAGTTCC ACACGGAGAT GGCCTACCAC CGGCTCCAGC CGAACTACGT CATGCTGGCC 480
TGCTCCCGGG CCGACCACGA GCGCACGGCG GCCACACTCG TCGCCTCGGT CCGCAAGGCG 540
CTGCCCCTGC TGGACGAGAG GACCCGGGCC CGGCTCCTCG ACCGGAGGAT GCCCTGCTGC 600
GTGGATGTGG CCTTCCGCGG CGGGGTGGAC GACCCGGGCG CCATCGCCCA GGTCAAACCG 660
CTCTACGGGG ACGCGGACGA TCCCTTCCTC GGGTACGACC GCGAGCTGCT GGCGCCGGAG 720
GACCCCGCGG ACAAGGAGGC CGTCGCCGCC CTGTCCAAGG CGCTCGACGA GGTCACGGAG 780
GCGGTGTATC TGGAGCCCGG CGATCTGCTG ATCGTCGACA ACTTCCGCAC CACGCACGCG 840
CGGACGCCGT TCTCGCCCCG CTGGGACGGG AAGGACCGCT GGCTGCACCG CGTCTACATC 900
CGCACCGACC GCAATGGACA GCTCTCCGGC GGCGAGCGCG CGGGCGACGT CGTCGCCTTC 960
ACACCGCGCG GC 972
(2) INFORMATION FOR SEQ ID NO: 13: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 48 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
Met Thr Arg Pro Pro Gly Leu Ser Ala His Thr His Gly Ser Val Ser
1 5 10 15
Gly Ser Leu Leu Arg Arg Val Ala Gly His Tyr Pro Thr Gly Val Val
20 25 30
Leu Val Thr Gly Pro Ala Glu Ala Pro Gly Gln Pro Pro Pro Ala Met
35 40 45
(2) INFORMATION FOR SEQ ID NO: 14: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 151 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
Met Ser Val Ala Ser Ala Gly Met Thr Asp Glu Gln Arg Lys Ala Val
1 5 10 15
Ile Thr Ala Tyr Phe Lys Ala Phe Asp Asn Gly Gly Val Gly Ser Asp
20 25 30
Gly Thr Pro Ala Ile Asp Tyr Phe Ala Glu Asp Ala Val Phe Phe Phe
35 40 45
Pro Lys Trp Gly Leu Ala Arg Gly Lys Ser Glu Ile Ala Arg Leu Phe
50 55 60
Asp Asp Leu Gly Gly Thr Ile Arg Ser Ile Thr His His Leu Trp Ser
65 70 75 80
Val Asn Trp Ile Leu Thr Gly Thr Glu Leu Leu Ala Ala Glu Gly Thr
85 90 95
Thr His Gly Glu His Arg Asp Gly Pro Trp Arg Ala Gly Asp Pro Glu
100 105 110
Trp Ala Ala Gly Arg Trp Cys Thr Val Tyr Glu Val Arg Asp Phe Leu
115 120 125
Val His Arg Ala Phe Val Tyr Leu Asp Pro Asp Tyr Ala Gly Lys Asp
130 135 140
Thr Ala Arg Tyr Pro Trp Leu
145 150
(2) INFORMATION FOR SEQ ID NO: 15: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 344 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
Met Ser Arg Ser Pro Pro Glu Ser Pro Ala Gly Ser Val Ser Ala Ala
1 5 10 15
Val Pro Arg Pro Pro Val Arg Ala Leu Arg Asp Leu Pro Val Ser Ala
20 25 30
Gln Gly Leu Gly Cys Leu Pro Thr Thr Asp Phe Tyr Gly Arg Pro Asp
35 40 45
Arg Ala Arg Ala Thr Ala Thr Ile Arg Ala Ala Val Asp Ala Gly Val
50 55 60
Thr Leu Leu Asp Thr Ala Asp Val Gln Gly Leu Gly Ala Gly Glu Glu
65 70 75 80
Leu Leu Gly Arg Ala Val Ala Gly Arg Arg Asp Glu Val Leu Ile Ala
85 90 95
Thr Lys Phe Gly Met Val Arg Ser Ser Asp Gly Ala Ser Gln Gly Leu
100 105 110
Cys Gly Glu Pro Ser Tyr Val Arg Ala Ala Cys Glu Arg Ser Leu Arg
115 120 125
Arg Leu Gly Thr Asp Arg Ile Asp Leu Tyr Tyr Gln His Trp Thr Asp
130 135 140
Pro Ala Val Pro Ile Glu Glu Thr Val Gly Ala Val Ala Glu Leu Val
145 150 155 160
Arg Glu Gly Lys Val Arg Arg Leu Gly Leu Ser Glu Pro Ser Ala Ala
165 170 175
Thr Leu Arg Arg Ala Asp Ala Val His Pro Val Thr Ala Val Gln Ser
180 185 190
Glu Trp Ser Leu Trp Ser Arg Gly Ile Glu Asp Glu Val Val Pro Val
195 200 205
Cys Arg Glu Leu Gly Ile Gly Ile Val Ala Tyr Ala Pro Leu Gly Arg
210 215 220
Gly Phe Leu Thr Gly Thr Ile Arg Thr Thr Asp Asp Leu Gly Asp Glu
225 230 235 240
Asp Phe Arg Arg Gly Gln Pro Arg Phe Ser Ala Pro Ala Leu Ala Arg
245 250 255
Asn Arg Ser Leu Leu His Arg Leu Arg Pro Val Ala Asp Gly Leu Gly
260 265 270
Leu Thr Leu Ala Gln Leu Ala Leu Ala Trp Leu His His Arg Gly Glu
275 280 285
Asp Val Val Pro Ile Pro Gly Thr Ala Asn Pro Ala His Leu Ala Asp
290 295 300
Asn Leu Ala Ala Ala Ser Ile Arg Leu Asp Asp Arg Ser Leu Ala Glu
305 310 315 320
Val Thr Ala Ala Ile Ser His Pro Val Ser Gly Glu Arg Tyr Thr Pro
325 330 335
Ala Leu Leu Ala Met Ile Gly Asn
340
(2) INFORMATION FOR SEQ ID NO: 16: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 328 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
Met Glu Cys Arg Ile Phe Glu Ile Asp Glu Leu Pro Leu Leu Asp Gly
1 5 10 15
Glu Val Leu Arg Asp Ala Arg Ile Gly Tyr Ala Met Tyr Gly Thr Pro
20 25 30
Asn Ala Asp Gly Thr Asn Val Val Leu Cys Pro Ser Phe Phe Gly Arg
35 40 45
Asp His Thr Gly Tyr Asp Trp Leu Ile Gly Ala Gly Leu Pro Leu Asp
50 55 60
Thr Arg Arg Tyr Cys Val Val Thr Ala Gly Leu Phe Gly Asn Gly Val
65 70 75 80
Ser Ser Ser Pro Gly Asn His Pro Ser Gly Ser Arg Phe Pro Leu Ile
85 90 95
Thr Pro Gln Asp Asn Val Ala Ala Gln His Arg Leu Leu Thr Glu Glu
100 105 110
Leu Gly Val Arg Glu Leu Ala Leu Val Thr Gly Trp Ser Met Gly Ala
115 120 125
Ala His Ala Tyr Gln Trp Ala Val Ser His Pro Gly Met Val Arg Arg
130 135 140
Ile Ala Pro Ile Cys Gly Ala Pro Val Ser Ser Pro His Ser Leu Val
145 150 155 160
Leu Leu Ser Gly Leu Ala Ala Ala Leu Ser Ala Asp Ala Gly Glu Arg
165 170 175
Gly Arg Lys Ala Ala Gly Arg Val Phe Ala Gly Trp Gly Thr Ser Arg
180 185 190
Ser Phe Trp Ala Arg Arg Ala His Arg Glu Leu Gly Phe Ala Thr Arg
195 200 205
Glu Glu Tyr Leu Thr Gly Phe Trp Glu Gln Val Phe Leu Ser Gly Pro
210 215 220
Gly Ala Ala Asp Leu Leu Thr Met Val Arg Thr Trp Glu Asn Thr Asp
225 230 235 240
Val Gly Ala Thr Pro Gly Ala Gly Gly Ser Val Glu Ala Ala Leu Ala
245 250 255
Ser Val Thr Ala Arg Ala Val Val Leu Pro Gly Ala Leu Asp Val Cys
260 265 270
Phe Ala Val Glu Asp Glu Lys Arg Val Ala Asp Leu Leu Pro Tyr Ala
275 280 285
Ser Leu Glu Val Ile Pro Gly Val Trp Gly His Leu Ala Gly Ser Gly
290 295 300
Gly Ser Ala Ala Asp Arg Glu Phe Ile Gly Gly Ala Leu Arg Arg Leu
305 310 315 320
Leu Asp Ser Pro Val Asp Gly Gly
325
(2) INFORMATION FOR SEQ ID NO: 17: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 394 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
Met Lys Ser Ile Leu Phe Tyr Leu Pro Thr Val Gly Ser His Ala Gln
1 5 10 15
Val Gln Arg Gly Met Ala Gly Val Asn Pro Gln Asn Tyr Gln Asn Met
20 25 30
Leu Arg Gln Leu Thr Arg Gln Ala Gln Ala Ala Asp Glu Leu Gly Tyr
35 40 45
Trp Gly Leu Ser Phe Thr Glu His His Phe His Thr Glu Gly Phe Glu
50 55 60
Val Ser Asn Asn Pro Ile Met Leu Gly Leu Tyr Leu Gly Met Gln Thr
65 70 75 80
Arg His Ile Arg Val Gly Gln Met Ala Asn Val Leu Pro Leu His Asn
85 90 95
Pro Leu Arg Leu Ala Glu Asp Leu Ala Met Leu Asp His Met Thr Arg
100 105 110
Gly Arg Ala Phe Val Gly Ile Ala Arg Gly Phe Gln Lys Arg Trp Ala
115 120 125
Asp Ile Met Gly Gln Val Tyr Gly Val Gly Gly Thr Leu Ser Asp Ala
130 135 140
Gly Glu Arg Asp Arg Arg Asn Arg Ala Leu Phe Glu Glu His Trp Glu
145 150 155 160
Ile Ile Lys Lys Ala Trp Thr Thr Glu Thr Phe Thr His Ser Gly Glu
165 170 175
Gln Trp Thr Ile Pro Val Pro Asp Leu Glu Phe Pro Tyr Glu Ala Val
180 185 190
Arg Arg Tyr Gly Arg Gly Leu Asp Glu Asn Gly Val Ile Arg Glu Val
195 200 205
Gly Ile Ala Pro Lys Pro Tyr Gln Arg Pro His Pro Pro Val Phe Gln
210 215 220
Pro Phe Ser Phe Ser Glu Asp Thr Phe Arg Phe Cys Ala Arg Glu Gly
225 230 235 240
Val Val Pro Ile Leu Met Asn Thr Asp Asp Gln Ile Val Ala Arg Leu
245 250 255
Met Asp Ile Tyr Arg Glu Glu Ala Glu Ala Ala Gly His Gly Thr Leu
260 265 270
Arg Arg Gly Glu Arg Val Gly Val Met Lys Asp Val Leu Val Ser Arg
275 280 285
Asp Ser Gly Glu Ala His His Trp Ala Ser Arg Gly Gly Gly Phe Ile
290 295 300
Phe Glu Asn Trp Phe Gly Pro Met Gly Phe Thr Glu Ala Leu Arg Ala
305 310 315 320
Thr Gly Glu Thr Gly Pro Ile Gly Ser Asp Tyr Lys Thr Leu Val Asp
325 330 335
Arg Gly Leu Glu Trp Val Gly Thr Pro Asp Asp Ile Asn Arg Met Ile
340 345 350
Glu Lys Leu Val Glu Arg His Asp Pro Glu Tyr Leu Leu Gln Cys Gln
355 360 365
Tyr Ser Gly Leu Ile Pro His Asp Val Gln Leu Arg Ser Leu Glu Leu
370 375 380
Trp Ala Thr Glu Ile Ala Pro Asn Trp Leu
385 390
(2) INFORMATION FOR SEQ ID NO: 18: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 220 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
Met Pro Gly Ser Gly Leu Glu Ala Leu Asp Arg Ala Thr Leu Ile His
1 5 10 15
Pro Thr Leu Ser Gly Asn Thr Ala Glu Arg Ile Val Leu Thr Ser Gly
20 25 30
Ser Gly Ser Arg Val Arg Asp Thr Asp Gly Arg Glu Tyr Leu Asp Ala
35 40 45
Ser Ala Val Leu Gly Val Thr Gln Val Gly His Gly Arg Ala Glu Leu
50 55 60
Ala Arg Val Ala Ala Glu Gln Met Ala Arg Leu Glu Tyr Phe His Thr
65 70 75 80
Trp Gly Thr Ile Ser Asn Asp Arg Ala Val Glu Leu Ala Ala Arg Leu
85 90 95
Val Gly Leu Ser Pro Glu Pro Leu Thr Arg Val Tyr Phe Thr Ser Gly
100 105 110
Gly Ala Glu Gly Asn Glu Ile Ala Leu Arg Met Ala Arg Leu Tyr His
115 120 125
His Arg Arg Gly Glu Ser Ala Arg Thr Trp Ile Leu Ser Arg Arg Ser
130 135 140
Ala Tyr His Gly Val Gly Tyr Gly Ser Gly Gly Val Thr Gly Phe Pro
145 150 155 160
Ala Tyr His Gln Gly Phe Gly Pro Ser Leu Pro Asp Val Asp Phe Leu
165 170 175
Thr Pro Pro Gln Pro Tyr Arg Arg Glu Leu Phe Ala Gly Ser Asp Val
180 185 190
Thr Asp Phe Cys Leu Ala Glu Leu Arg Glu Thr Ile Asp Arg Ile Gly
195 200 205
Pro Glu Arg Ile Ala Ala Met Ile Gly Glu Pro Ile
210 215 220
(2) INFORMATION FOR SEQ ID NO: 19: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 324 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
Met Thr Ser Val Asp Cys Thr Ala Tyr Gly Pro Glu Leu Arg Ala Leu
1 5 10 15
Ala Ala Arg Leu Pro Arg Thr Pro Arg Ala Asp Leu Tyr Ala Phe Leu
20 25 30
Asp Ala Ala His Thr Ala Ala Ala Ser Leu Pro Gly Ala Leu Ala Thr
35 40 45
Ala Leu Asp Thr Phe Asn Ala Glu Gly Ser Glu Asp Gly His Leu Leu
50 55 60
Leu Arg Gly Leu Pro Val Glu Ala Asp Ala Asp Leu Pro Thr Thr Pro
65 70 75 80
Ser Ser Thr Pro Ala Pro Glu Asp Arg Ser Leu Leu Thr Met Glu Ala
85 90 95
Met Leu Gly Leu Val Gly Arg Arg Leu Gly Leu His Thr Gly Tyr Arg
100 105 110
Glu Leu Arg Ser Gly Thr Val Tyr His Asp Val Tyr Pro Ser Pro Gly
115 120 125
Ala His His Leu Ser Ser Glu Thr Ser Glu Thr Leu Leu Glu Phe His
130 135 140
Thr Glu Met Ala Tyr His Arg Leu Gln Pro Asn Tyr Val Met Leu Ala
145 150 155 160
Cys Ser Arg Ala Asp His Glu Arg Thr Ala Ala Thr Leu Val Ala Ser
165 170 175
Val Arg Lys Ala Leu Pro Leu Leu Asp Glu Arg Thr Arg Ala Arg Leu
180 185 190
Leu Asp Arg Arg Met Pro Cys Cys Val Asp Val Ala Phe Arg Gly Gly
195 200 205
Val Asp Asp Pro Gly Ala Ile Ala Gln Val Lys Pro Leu Tyr Gly Asp
210 215 220
Ala Asp Asp Pro Phe Leu Gly Tyr Asp Arg Glu Leu Leu Ala Pro Glu
225 230 235 240
Asp Pro Ala Asp Lys Glu Ala Val Ala Ala Leu Ser Lys Ala Leu Asp
245 250 255
Glu Val Thr Glu Ala Val Tyr Leu Glu Pro Gly Asp Leu Leu Ile Val
260 265 270
Asp Asn Phe Arg Thr Thr His Ala Arg Thr Pro Phe Ser Pro Arg Trp
275 280 285
Asp Gly Lys Asp Arg Trp Leu His Arg Val Tyr Ile Arg Thr Asp Arg
290 295 300
Asn Gly Gln Leu Ser Gly Gly Glu Arg Ala Gly Asp Val Val Ala Phe
305 310 315 320
Thr Pro Arg Gly
Claims (24)
1. DNA comprising one or more genes that are specific for the biosynthesis of 5S clavams in S. clavuligerus and that are not essential for the biosynthesis of 5R clavams.
2. DNA according to claim 1 as identified in SEQ ID NO: 1.
3. DNA according to claim 1, which has the sequences or substantially the sequences designated as orfup3, orfup2, orfup1, orfdwn1, orfdwn2 or orfdwn3, which correspond respectively to SEQ ID NO: 2 to 7.
4. DNA according to claim 1, which has the sequence or substantially the sequence designated as SEQ ID NO: 4.
5. S. clavuligerus comprising DNA corresponding to an open reading frame that flanks cas1, said DNA having been disrupted or otherwise made defective.
6. S. clavuligerus according to claim 5, further characterized in that the open reading frame is orfup3, orfup2, orfup1, orfdwn1, orfdwn2 or orfdwn3.
7. A method for improving the production of 5R clavam in a suitable microorganism, which comprises manipulating DNA as defined in claim 1 or 2 and including it in said microorganism.
8. A method according to claim 7, further characterized in that said suitable microorganism is S. clavuligerus.
9. A method for improving the production of 5R clavam in S. clavuligerus, which comprises disrupting or otherwise rendering defective the DNA regions flanking cas1.
10. A method according to any of claims 7 to 9, further characterized in that said DNA corresponds to the open reading frames orfup3, orfup2, orfup1, orfdwn2 or orfdwn3.
11. A method according to any of claims 7 to 10, further characterized in that said 5R clavam is clavulanic acid.
12. A method for identifying a microorganism suitable for high production of 5R clavam, comprising a preliminary screening for microorganisms with low or no production of 5S clavam.
13. A method according to claim 12, further characterized in that the microorganism is S. clavuligerus.
14. A method according to claim 12 or 13, further characterized in that the 5R clavam is clavulanic acid.
15. A method according to any of claims 12 to 14, further characterized in that one or more genes specific for the production of 5S clavams are defective.
16. A microorganism selected from the group consisting of: a) a microorganism that is capable of producing 5R clavam with low or no production of 5S clavam, obtainable by the method of any of claims 7 to 15; b) a microorganism that is capable of producing 5R clavam with low or no production of 5S clavam, obtainable by the method of claim 12, which is capable of producing clavulanic acid but which does not produce clavam-2-carboxylate and/or 2-hydroxymethylclavam; and c) a microorganism obtained by the method of claim 12 which is strain 56-1A, 56-3A, 57-2B, 1C, 60-1A, 60-2A, 60-3A, 61-1A, 61-2A, 61-3A or 61-4A.
17. Clavulanic acid obtainable by the fermentation of a microorganism as defined in claim 16.
18. Clavulanic acid according to claim 17, which is free of clavam-2-carboxylate or has significantly reduced levels thereof.
19. Clavulanic acid according to claim 18 in the form of its potassium salt.
20. Clavulanic acid which is free of 5S clavam or has significantly reduced levels thereof.
21. Clavulanic acid which is free of clavam-2-carboxylate or has significantly reduced levels thereof.
22. A composition comprising potassium clavulanate according to claim 19, in combination with a beta-lactam antibiotic.
23. A composition according to claim 22, wherein the beta-lactam antibiotic is amoxicillin.
24. A process for the preparation of a composition comprising potassium clavulanate and amoxicillin, which process comprises producing clavulanic acid from a microorganism according to claim 12, converting it to the potassium salt, and combining the potassium salt with amoxicillin.
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB9702218.0 | 1997-02-04 |
Publications (1)
Publication Number | Publication Date |
---|---|
MXPA99007257A (en) | 2000-01-21 |