KR20190138274A - Optimized genetic tool for modifying Clostridium bacteria - Google Patents
Optimized genetic tool for modifying Clostridium bacteria
- Publication number
- KR20190138274A (application number KR1020190061522A)
- Authority
- KR
- South Korea
- Prior art keywords
- bacteria
- asn
- ile
- sequence
- phe
- Prior art date
- 230000002068 genetic effect Effects 0.000 title claims abstract description 108
- 241000193403 Clostridium Species 0.000 title claims abstract description 97
- 241000894006 Bacteria Species 0.000 claims abstract description 248
- 150000007523 nucleic acids Chemical class 0.000 claims abstract description 179
- 102000039446 nucleic acids Human genes 0.000 claims abstract description 164
- 108020004707 nucleic acids Proteins 0.000 claims abstract description 164
- 230000009466 transformation Effects 0.000 claims abstract description 55
- 239000002904 solvent Substances 0.000 claims abstract description 36
- 230000006801 homologous recombination Effects 0.000 claims abstract description 21
- 238000002744 homologous recombination Methods 0.000 claims abstract description 21
- 108090000623 proteins and genes Proteins 0.000 claims description 200
- 108020004414 DNA Proteins 0.000 claims description 153
- 239000013612 plasmid Substances 0.000 claims description 144
- 102000004169 proteins and genes Human genes 0.000 claims description 140
- 108020005004 Guide RNA Proteins 0.000 claims description 123
- 108091033409 CRISPR Proteins 0.000 claims description 69
- KFZMGEQAYNKOFK-UHFFFAOYSA-N Isopropanol Chemical compound CC(C)O KFZMGEQAYNKOFK-UHFFFAOYSA-N 0.000 claims description 56
- 238000010354 CRISPR gene editing Methods 0.000 claims description 52
- 230000001939 inductive effect Effects 0.000 claims description 48
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 claims description 43
- 238000000034 method Methods 0.000 claims description 42
- 230000001580 bacterial effect Effects 0.000 claims description 35
- CSCPPACGZOOCGX-UHFFFAOYSA-N Acetone Chemical compound CC(C)=O CSCPPACGZOOCGX-UHFFFAOYSA-N 0.000 claims description 33
- LRHPLDYGYMQRHN-UHFFFAOYSA-N N-Butanol Chemical compound CCCCO LRHPLDYGYMQRHN-UHFFFAOYSA-N 0.000 claims description 30
- 230000008439 repair process Effects 0.000 claims description 29
- 239000013598 vector Substances 0.000 claims description 28
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 26
- 239000000411 inducer Substances 0.000 claims description 26
- 102000004190 Enzymes Human genes 0.000 claims description 25
- 108090000790 Enzymes Proteins 0.000 claims description 25
- 230000008569 process Effects 0.000 claims description 18
- 239000000203 mixture Substances 0.000 claims description 17
- 238000004519 manufacturing process Methods 0.000 claims description 16
- 235000000346 sugar Nutrition 0.000 claims description 16
- 108091032973 (ribonucleotides)n+m Proteins 0.000 claims description 15
- 241000193454 Clostridium beijerinckii Species 0.000 claims description 15
- 230000001131 transforming effect Effects 0.000 claims description 15
- 108020000946 Bacterial DNA Proteins 0.000 claims description 14
- 238000012239 gene modification Methods 0.000 claims description 14
- 230000005017 genetic modification Effects 0.000 claims description 14
- 235000013617 genetically modified food Nutrition 0.000 claims description 14
- 150000008163 sugars Chemical class 0.000 claims description 12
- 230000000295 complement effect Effects 0.000 claims description 11
- 230000007246 mechanism Effects 0.000 claims description 8
- 108010081734 Ribonucleoproteins Proteins 0.000 claims description 7
- 102000004389 Ribonucleoproteins Human genes 0.000 claims description 7
- 241000193401 Clostridium acetobutylicum Species 0.000 claims description 6
- 241000186522 Clostridium aurantibutyricum Species 0.000 claims description 5
- 241000193171 Clostridium butyricum Species 0.000 claims description 5
- 108010052285 Membrane Proteins Proteins 0.000 claims description 4
- 108091023040 Transcription factor Proteins 0.000 claims description 4
- 102000040945 Transcription factor Human genes 0.000 claims description 4
- 239000001913 cellulose Substances 0.000 claims description 4
- 229920002678 cellulose Polymers 0.000 claims description 4
- 238000006243 chemical reaction Methods 0.000 claims description 4
- WEVYAHXRMPXWCK-UHFFFAOYSA-N Acetonitrile Chemical group CC#N WEVYAHXRMPXWCK-UHFFFAOYSA-N 0.000 claims description 2
- 241000429427 Clostridium saccharobutylicum Species 0.000 claims description 2
- 241000193470 Clostridium sporogenes Species 0.000 claims description 2
- 241000193452 Clostridium tyrobutyricum Species 0.000 claims description 2
- 241000002309 Collariella virescens Species 0.000 claims description 2
- 241000933069 Lachnoclostridium phytofermentans Species 0.000 claims description 2
- 150000001299 aldehydes Chemical class 0.000 claims description 2
- 238000012258 culturing Methods 0.000 claims 2
- 102000018697 Membrane Proteins Human genes 0.000 claims 1
- 150000001298 alcohols Chemical class 0.000 claims 1
- 230000004048 modification Effects 0.000 abstract description 28
- 238000012986 modification Methods 0.000 abstract description 28
- 235000018102 proteins Nutrition 0.000 description 100
- 125000003729 nucleotide group Chemical group 0.000 description 65
- 101150009760 CATB gene Proteins 0.000 description 62
- 239000002773 nucleotide Substances 0.000 description 62
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 51
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 50
- ULGZDMOVFRHVEP-RWJQBGPGSA-N Erythromycin Chemical compound O([C@@H]1[C@@H](C)C(=O)O[C@@H]([C@@]([C@H](O)[C@@H](C)C(=O)[C@H](C)C[C@@](C)(O)[C@H](O[C@H]2[C@@H]([C@H](C[C@@H](C)O2)N(C)C)O)[C@H]1C)(C)O)CC)[C@H]1C[C@@](C)(OC)[C@@H](O)[C@H](C)O1 ULGZDMOVFRHVEP-RWJQBGPGSA-N 0.000 description 48
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 45
- 108010054812 diprotin A Proteins 0.000 description 35
- 239000012634 fragment Substances 0.000 description 35
- 108010008532 Deoxyribonuclease I Proteins 0.000 description 31
- 102000007260 Deoxyribonuclease I Human genes 0.000 description 31
- 108010034529 leucyl-lysine Proteins 0.000 description 31
- 239000002609 medium Substances 0.000 description 31
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 29
- 229930101283 tetracycline Natural products 0.000 description 28
- 101150076274 upp gene Proteins 0.000 description 28
- GRRXPUAICOGISM-RWMBFGLXSA-N Arg-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O GRRXPUAICOGISM-RWMBFGLXSA-N 0.000 description 26
- BZWRLDPIWKOVKB-ZPFDUUQYSA-N Asn-Leu-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BZWRLDPIWKOVKB-ZPFDUUQYSA-N 0.000 description 26
- 108010081404 acein-2 Proteins 0.000 description 26
- XVBDDUPJVQXDSI-PEFMBERDSA-N Asn-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N XVBDDUPJVQXDSI-PEFMBERDSA-N 0.000 description 25
- YFSLJHLQOALGSY-ZPFDUUQYSA-N Asp-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N YFSLJHLQOALGSY-ZPFDUUQYSA-N 0.000 description 25
- DRXOWZZHCSBUOI-YJRXYDGGSA-N Cys-Thr-Tyr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CS)N)O DRXOWZZHCSBUOI-YJRXYDGGSA-N 0.000 description 25
- 101710163270 Nuclease Proteins 0.000 description 25
- FDMCIBSQRKFSTJ-RHYQMDGZSA-N Pro-Thr-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O FDMCIBSQRKFSTJ-RHYQMDGZSA-N 0.000 description 25
- ZTPXSEUVYNNZRB-CDMKHQONSA-N Thr-Gly-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZTPXSEUVYNNZRB-CDMKHQONSA-N 0.000 description 25
- OFVLGDICTFRJMM-WESIUVDSSA-N tetracycline Chemical compound C1=CC=C2[C@](O)(C)[C@H]3C[C@H]4[C@H](N(C)C)C(O)=C(C(N)=O)C(=O)[C@@]4(O)C(O)=C3C(=O)C2=C1O OFVLGDICTFRJMM-WESIUVDSSA-N 0.000 description 25
- ZMUQQMGITUJQTI-CIUDSAMLSA-N Asn-Leu-Asn Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O ZMUQQMGITUJQTI-CIUDSAMLSA-N 0.000 description 24
- VZKXOWRNJDEGLZ-WHFBIAKZSA-N Cys-Asp-Gly Chemical compound SC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O VZKXOWRNJDEGLZ-WHFBIAKZSA-N 0.000 description 24
- VEPIBPGLTLPBDW-URLPEUOOSA-N Ile-Phe-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N VEPIBPGLTLPBDW-URLPEUOOSA-N 0.000 description 24
- AKQFLPNANHNTLP-VKOGCVSHSA-N Ile-Pro-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O)N AKQFLPNANHNTLP-VKOGCVSHSA-N 0.000 description 24
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 24
- AKJAKCBHLJGRBU-JYJNAYRXSA-N Phe-Glu-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N AKJAKCBHLJGRBU-JYJNAYRXSA-N 0.000 description 24
- QGMLKFGTGXWAHF-IHRRRGAJSA-N Ser-Arg-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QGMLKFGTGXWAHF-IHRRRGAJSA-N 0.000 description 24
- ABWNZPOIUJMNKT-IXOXFDKPSA-N Thr-Phe-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O ABWNZPOIUJMNKT-IXOXFDKPSA-N 0.000 description 24
- LTLBNCDNXQCOLB-UBHSHLNASA-N Trp-Asp-Ser Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O)=CNC2=C1 LTLBNCDNXQCOLB-UBHSHLNASA-N 0.000 description 24
- 229960003276 erythromycin Drugs 0.000 description 24
- 108010079317 prolyl-tyrosine Proteins 0.000 description 24
- 108010071207 serylmethionine Proteins 0.000 description 24
- 108010020532 tyrosyl-proline Proteins 0.000 description 24
- LHJDLVVQRJIURS-SRVKXCTJSA-N Cys-Phe-Asp Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CS)N LHJDLVVQRJIURS-SRVKXCTJSA-N 0.000 description 23
- RXSWQCATLWVDLI-XGEHTFHBSA-N Ser-Met-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RXSWQCATLWVDLI-XGEHTFHBSA-N 0.000 description 23
- KWBISLAEQZUYIC-UWJYBYFXSA-N His-His-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC2=CN=CN2)N KWBISLAEQZUYIC-UWJYBYFXSA-N 0.000 description 22
- -1 OPT nucleic acid Chemical class 0.000 description 22
- OQSQCUWQOIHECT-YJRXYDGGSA-N Ser-Tyr-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OQSQCUWQOIHECT-YJRXYDGGSA-N 0.000 description 22
- MVFQLSPDMMFCMW-KKUMJFAQSA-N Tyr-Leu-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O MVFQLSPDMMFCMW-KKUMJFAQSA-N 0.000 description 22
- PIABYSIYPGLLDQ-XVSYOHENSA-N Asn-Thr-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PIABYSIYPGLLDQ-XVSYOHENSA-N 0.000 description 21
- JUIOPCXACJLRJK-AVGNSLFASA-N His-Lys-Glu Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N JUIOPCXACJLRJK-AVGNSLFASA-N 0.000 description 21
- DVRDRICMWUSCBN-UKJIMTQDSA-N Ile-Gln-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](C(C)C)C(=O)O)N DVRDRICMWUSCBN-UKJIMTQDSA-N 0.000 description 21
- UREQLMJCKFLLHM-NAKRPEOUSA-N Pro-Ile-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O UREQLMJCKFLLHM-NAKRPEOUSA-N 0.000 description 21
- FKZSXTKZLPPHQU-GQGQLFGLSA-N Ser-Ile-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CO)N FKZSXTKZLPPHQU-GQGQLFGLSA-N 0.000 description 21
- LKEKWDJCJSPXNI-IRIUXVKKSA-N Thr-Glu-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 LKEKWDJCJSPXNI-IRIUXVKKSA-N 0.000 description 21
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 21
- 108010078580 tyrosylleucine Proteins 0.000 description 21
- SCHZQZPYHBWYEQ-PEFMBERDSA-N Ile-Asn-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N SCHZQZPYHBWYEQ-PEFMBERDSA-N 0.000 description 20
- AUEJLPRZGVVDNU-UHFFFAOYSA-N N-L-tyrosyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 AUEJLPRZGVVDNU-UHFFFAOYSA-N 0.000 description 20
- 108010068265 aspartyltyrosine Proteins 0.000 description 20
- QZKVWWIUSQGWMY-IHRRRGAJSA-N Val-Ser-Phe Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QZKVWWIUSQGWMY-IHRRRGAJSA-N 0.000 description 19
- 239000003242 anti bacterial agent Substances 0.000 description 19
- YSYTWUMRHSFODC-QWRGUYRKSA-N Asn-Tyr-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O YSYTWUMRHSFODC-QWRGUYRKSA-N 0.000 description 18
- FOKISINOENBSDM-WLTAIBSBSA-N Gly-Thr-Tyr Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O FOKISINOENBSDM-WLTAIBSBSA-N 0.000 description 18
- USXAYNCLFSUSBA-MGHWNKPDSA-N Ile-Phe-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N USXAYNCLFSUSBA-MGHWNKPDSA-N 0.000 description 18
- SHUFSZDAIPLZLF-BEAPCOKYSA-N Phe-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N)O SHUFSZDAIPLZLF-BEAPCOKYSA-N 0.000 description 18
- 108010027338 isoleucylcysteine Proteins 0.000 description 18
- WYUHAXJAMDTOAU-IAVJCBSLSA-N Ile-Phe-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N WYUHAXJAMDTOAU-IAVJCBSLSA-N 0.000 description 17
- OMHLATXVNQSALM-FQUUOJAGSA-N Leu-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(C)C)N OMHLATXVNQSALM-FQUUOJAGSA-N 0.000 description 17
- VJGQRELPQWNURN-JYJNAYRXSA-N Leu-Tyr-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O VJGQRELPQWNURN-JYJNAYRXSA-N 0.000 description 17
- PSALWJCUIAQKFW-ACRUOGEOSA-N Tyr-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N PSALWJCUIAQKFW-ACRUOGEOSA-N 0.000 description 17
- DFQZDQPLWBSFEJ-LSJOCFKGSA-N Val-Val-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(=O)N)C(=O)O)N DFQZDQPLWBSFEJ-LSJOCFKGSA-N 0.000 description 17
- OTVAEFIXJLOWRX-NXEZZACHSA-N thiamphenicol Chemical compound CS(=O)(=O)C1=CC=C([C@@H](O)[C@@H](CO)NC(=O)C(Cl)Cl)C=C1 OTVAEFIXJLOWRX-NXEZZACHSA-N 0.000 description 17
- 229960003053 thiamphenicol Drugs 0.000 description 17
- LVHMEJJWEXBMKK-GMOBBJLQSA-N Asn-Ile-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC(=O)N)N LVHMEJJWEXBMKK-GMOBBJLQSA-N 0.000 description 16
- SRBFZHDQGSBBOR-IOVATXLUSA-N D-xylopyranose Chemical compound O[C@@H]1COC(O)[C@H](O)[C@H]1O SRBFZHDQGSBBOR-IOVATXLUSA-N 0.000 description 16
- BSCBBPKDVOZICB-KKUMJFAQSA-N Tyr-Leu-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O BSCBBPKDVOZICB-KKUMJFAQSA-N 0.000 description 16
- MIAZWUMFUURQNP-YDHLFZDLSA-N Val-Tyr-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N MIAZWUMFUURQNP-YDHLFZDLSA-N 0.000 description 16
- WIIZWVCIJKGZOK-RKDXNWHRSA-N chloramphenicol Chemical class ClC(Cl)C(=O)N[C@H](CO)[C@H](O)C1=CC=C([N+]([O-])=O)C=C1 WIIZWVCIJKGZOK-RKDXNWHRSA-N 0.000 description 16
- 108010079547 glutamylmethionine Proteins 0.000 description 16
- 230000035897 transcription Effects 0.000 description 16
- 238000013518 transcription Methods 0.000 description 16
- CNKBMTKICGGSCQ-ACRUOGEOSA-N (2S)-2-[[(2S)-2-[[(2S)-2,6-diamino-1-oxohexyl]amino]-1-oxo-3-phenylpropyl]amino]-3-(4-hydroxyphenyl)propanoic acid Chemical compound C([C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 CNKBMTKICGGSCQ-ACRUOGEOSA-N 0.000 description 15
- CYCKJEFVFNRWEZ-UGYAYLCHSA-N Asp-Ile-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O CYCKJEFVFNRWEZ-UGYAYLCHSA-N 0.000 description 15
- 108010035563 Chloramphenicol O-acetyltransferase Proteins 0.000 description 15
- YSGBJIQXTIVBHZ-AJNGGQMLSA-N Ile-Lys-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O YSGBJIQXTIVBHZ-AJNGGQMLSA-N 0.000 description 15
- YCKPUHHMCFSUMD-IUKAMOBKSA-N Ile-Thr-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YCKPUHHMCFSUMD-IUKAMOBKSA-N 0.000 description 15
- HODVZHLJUUWPKY-STECZYCISA-N Ile-Tyr-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)CC)CC1=CC=C(O)C=C1 HODVZHLJUUWPKY-STECZYCISA-N 0.000 description 15
- DEFGUIIUYAUEDU-ZPFDUUQYSA-N Lys-Asn-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DEFGUIIUYAUEDU-ZPFDUUQYSA-N 0.000 description 15
- UAPZLLPGGOOCRO-IHRRRGAJSA-N Met-Asn-Phe Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N UAPZLLPGGOOCRO-IHRRRGAJSA-N 0.000 description 15
- IHITVQKJXQQGLJ-LPEHRKFASA-N Met-Asn-Pro Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N IHITVQKJXQQGLJ-LPEHRKFASA-N 0.000 description 15
- 229940088710 antibiotic agent Drugs 0.000 description 15
- SRBFZHDQGSBBOR-UHFFFAOYSA-N beta-D-Pyranose-Lyxose Natural products OC1COC(O)C(O)C1O SRBFZHDQGSBBOR-UHFFFAOYSA-N 0.000 description 15
- 230000008685 targeting Effects 0.000 description 15
- YCRAFFCYWOUEOF-DLOVCJGASA-N Ala-Phe-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 YCRAFFCYWOUEOF-DLOVCJGASA-N 0.000 description 14
- 101100162137 Clostridium acetobutylicum (strain ATCC 824 / DSM 792 / JCM 1419 / LMG 5710 / VKM B-1787) bdhB gene Proteins 0.000 description 14
- CYHYBSGMHMHKOA-CIQUZCHMSA-N Ile-Ala-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N CYHYBSGMHMHKOA-CIQUZCHMSA-N 0.000 description 14
- ZXEUFAVXODIPHC-GUBZILKMSA-N Lys-Glu-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ZXEUFAVXODIPHC-GUBZILKMSA-N 0.000 description 14
- FWTBMGAKKPSTBT-GUBZILKMSA-N Met-Gln-Glu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O FWTBMGAKKPSTBT-GUBZILKMSA-N 0.000 description 14
- LGBVMDMZZFYSFW-HJWJTTGWSA-N Phe-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC1=CC=CC=C1)N LGBVMDMZZFYSFW-HJWJTTGWSA-N 0.000 description 14
- XNCUYZKGQOCOQH-YUMQZZPRSA-N Ser-Leu-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O XNCUYZKGQOCOQH-YUMQZZPRSA-N 0.000 description 14
- VPRHDRKAPYZMHL-SZMVWBNQSA-N Trp-Leu-Glu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O)=CNC2=C1 VPRHDRKAPYZMHL-SZMVWBNQSA-N 0.000 description 14
- KEANSLVUGJADPN-LKTVYLICSA-N Tyr-His-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC2=CC=C(C=C2)O)N KEANSLVUGJADPN-LKTVYLICSA-N 0.000 description 14
- GQVZBMROTPEPIF-SRVKXCTJSA-N Tyr-Ser-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O GQVZBMROTPEPIF-SRVKXCTJSA-N 0.000 description 14
- 230000009471 action Effects 0.000 description 14
- PYMYPHUHKUWMLA-UHFFFAOYSA-N arabinose Natural products OCC(O)C(O)C(O)C=O PYMYPHUHKUWMLA-UHFFFAOYSA-N 0.000 description 14
- 108010047079 phenylalanyl-leucyl-arginyl-phenylalanine Proteins 0.000 description 14
- BLIMFWGRQKRCGT-YUMQZZPRSA-N Ala-Gly-Lys Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN BLIMFWGRQKRCGT-YUMQZZPRSA-N 0.000 description 13
- LMIWYCWRJVMAIQ-NHCYSSNCSA-N Asn-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N LMIWYCWRJVMAIQ-NHCYSSNCSA-N 0.000 description 13
- 108091026890 Coding region Proteins 0.000 description 13
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 13
- GHKXHCMRAUYLBS-CIUDSAMLSA-N Lys-Ser-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O GHKXHCMRAUYLBS-CIUDSAMLSA-N 0.000 description 13
- 108010038633 aspartylglutamate Proteins 0.000 description 13
- WQZGKKKJIJFFOK-VFUOTHLCSA-N beta-D-glucose Chemical compound OC[C@H]1O[C@@H](O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-VFUOTHLCSA-N 0.000 description 13
- 229960005091 chloramphenicol Drugs 0.000 description 13
- 239000008103 glucose Substances 0.000 description 13
- 241000588724 Escherichia coli Species 0.000 description 12
- VUTWYNQUSJWBHO-BZSNNMDCSA-N Lys-Leu-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VUTWYNQUSJWBHO-BZSNNMDCSA-N 0.000 description 12
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 12
- 150000001413 amino acids Chemical group 0.000 description 12
- 210000003578 bacterial chromosome Anatomy 0.000 description 12
- 238000000855 fermentation Methods 0.000 description 12
- 230000004151 fermentation Effects 0.000 description 12
- 230000010076 replication Effects 0.000 description 12
- ZEIYPKQQLSUPOT-QORCZRPOSA-N (2s)-2-[[(2s)-2-[[(2s)-2-[[(2s)-2-amino-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]-5-(diaminomethylideneamino)pentanoyl]amino]-3-phenylpropanoic acid Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 ZEIYPKQQLSUPOT-QORCZRPOSA-N 0.000 description 11
- 108020004705 Codon Proteins 0.000 description 11
- UDEPRBFQTWGLCW-CIUDSAMLSA-N Glu-Pro-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O UDEPRBFQTWGLCW-CIUDSAMLSA-N 0.000 description 11
- 108010050025 alpha-glutamyltryptophan Proteins 0.000 description 11
- 230000003115 biocidal effect Effects 0.000 description 11
- 101150038500 cas9 gene Proteins 0.000 description 11
- 108010051242 phenylalanylserine Proteins 0.000 description 11
- REXAUQBGSGDEJY-IGISWZIWSA-N Ile-Tyr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N REXAUQBGSGDEJY-IGISWZIWSA-N 0.000 description 10
- GUBGYTABKSRVRQ-QKKXKWKRSA-N Lactose Natural products OC[C@H]1O[C@@H](O[C@H]2[C@H](O)[C@@H](O)C(O)O[C@@H]2CO)[C@H](O)[C@@H](O)[C@H]1O GUBGYTABKSRVRQ-QKKXKWKRSA-N 0.000 description 10
- ZRACLHJYVRBJFC-ULQDDVLXSA-N Met-Lys-Phe Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ZRACLHJYVRBJFC-ULQDDVLXSA-N 0.000 description 10
- ZWJKVFAYPLPCQB-UNQGMJICSA-N Phe-Arg-Thr Chemical compound C[C@@H](O)[C@H](NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)Cc1ccccc1)C(O)=O ZWJKVFAYPLPCQB-UNQGMJICSA-N 0.000 description 10
- UNLYPPYNDXHGDG-IHRRRGAJSA-N Phe-Gln-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 UNLYPPYNDXHGDG-IHRRRGAJSA-N 0.000 description 10
- 239000013611 chromosomal DNA Substances 0.000 description 10
- 230000002759 chromosomal effect Effects 0.000 description 10
- 230000011987 methylation Effects 0.000 description 10
- 238000007069 methylation reaction Methods 0.000 description 10
- UBKOVSLDWIHYSY-ACZMJKKPSA-N Asn-Glu-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O UBKOVSLDWIHYSY-ACZMJKKPSA-N 0.000 description 9
- 102000004533 Endonucleases Human genes 0.000 description 9
- 108010042407 Endonucleases Proteins 0.000 description 9
- YBDOQKVAGTWZMI-XIRDDKMYSA-N His-Trp-Ser Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC3=CN=CN3)N YBDOQKVAGTWZMI-XIRDDKMYSA-N 0.000 description 9
- 101000952182 Homo sapiens Max-like protein X Proteins 0.000 description 9
- 102100037423 Max-like protein X Human genes 0.000 description 9
- PHURAEXVWLDIGT-LPEHRKFASA-N Met-Ser-Pro Chemical compound CSCC[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N PHURAEXVWLDIGT-LPEHRKFASA-N 0.000 description 9
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 9
- 101150050729 bdhA gene Proteins 0.000 description 9
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 9
- 230000012010 growth Effects 0.000 description 9
- 230000006698 induction Effects 0.000 description 9
- 239000008101 lactose Substances 0.000 description 9
- KQBVNNAPIURMPD-PEFMBERDSA-N Asp-Ile-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O KQBVNNAPIURMPD-PEFMBERDSA-N 0.000 description 8
- 102000053602 DNA Human genes 0.000 description 8
- RJONUNZIMUXUOI-GUBZILKMSA-N Glu-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N RJONUNZIMUXUOI-GUBZILKMSA-N 0.000 description 8
- XTZDZAXYPDISRR-MNXVOIDGSA-N Glu-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N XTZDZAXYPDISRR-MNXVOIDGSA-N 0.000 description 8
- NJNRBRKHOWSGMN-SRVKXCTJSA-N Lys-Leu-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O NJNRBRKHOWSGMN-SRVKXCTJSA-N 0.000 description 8
- WLJYLAQSUSIQNH-GUBZILKMSA-N Pro-Met-Ser Chemical compound CSCC[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@@H]1CCCN1 WLJYLAQSUSIQNH-GUBZILKMSA-N 0.000 description 8
- FIODMZKLZFLYQP-GUBZILKMSA-N Pro-Val-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O FIODMZKLZFLYQP-GUBZILKMSA-N 0.000 description 8
- JGUWRQWULDWNCM-FXQIFTODSA-N Ser-Val-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O JGUWRQWULDWNCM-FXQIFTODSA-N 0.000 description 8
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 8
- QHFQQRKNGCXTHL-AUTRQRHGSA-N Val-Gln-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O QHFQQRKNGCXTHL-AUTRQRHGSA-N 0.000 description 8
- 108010092854 aspartyllysine Proteins 0.000 description 8
- 210000004027 cell Anatomy 0.000 description 8
- 230000001276 controlling effect Effects 0.000 description 8
- 108010050848 glycylleucine Proteins 0.000 description 8
- 239000001963 growth medium Substances 0.000 description 8
- 230000005764 inhibitory process Effects 0.000 description 8
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 8
- 108091079001 CRISPR RNA Proteins 0.000 description 7
- QJVZSVUYZFYLFQ-CIUDSAMLSA-N Glu-Pro-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O QJVZSVUYZFYLFQ-CIUDSAMLSA-N 0.000 description 7
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 7
- CVXURBLRELTJKO-BWAGICSOSA-N Tyr-His-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)O CVXURBLRELTJKO-BWAGICSOSA-N 0.000 description 7
- 108010062796 arginyllysine Proteins 0.000 description 7
- 238000003776 cleavage reaction Methods 0.000 description 7
- 238000012217 deletion Methods 0.000 description 7
- 230000037430 deletion Effects 0.000 description 7
- 239000013613 expression plasmid Substances 0.000 description 7
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 7
- 108010003700 lysyl aspartic acid Proteins 0.000 description 7
- 108010009298 lysylglutamic acid Proteins 0.000 description 7
- 108010015796 prolylisoleucine Proteins 0.000 description 7
- 230000007017 scission Effects 0.000 description 7
- QTBSBXVTEAMEQO-UHFFFAOYSA-N Acetic acid Chemical compound CC(O)=O QTBSBXVTEAMEQO-UHFFFAOYSA-N 0.000 description 6
- WIDVAWAQBRAKTI-YUMQZZPRSA-N Asn-Leu-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O WIDVAWAQBRAKTI-YUMQZZPRSA-N 0.000 description 6
- JDHOJQJMWBKHDB-CIUDSAMLSA-N Asp-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N JDHOJQJMWBKHDB-CIUDSAMLSA-N 0.000 description 6
- WQZGKKKJIJFFOK-QTVWNMPRSA-N D-mannopyranose Chemical compound OC[C@H]1OC(O)[C@@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-QTVWNMPRSA-N 0.000 description 6
- OCDLPQDYTJPWNG-YUMQZZPRSA-N Gly-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)CN OCDLPQDYTJPWNG-YUMQZZPRSA-N 0.000 description 6
- XKIYNCLILDLGRS-QWRGUYRKSA-N His-Lys-Gly Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H](N)CC1=CN=CN1 XKIYNCLILDLGRS-QWRGUYRKSA-N 0.000 description 6
- SAEWJTCJQVZQNZ-IUKAMOBKSA-N Ile-Thr-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N SAEWJTCJQVZQNZ-IUKAMOBKSA-N 0.000 description 6
- 108060004795 Methyltransferase Proteins 0.000 description 6
- 102000016397 Methyltransferase Human genes 0.000 description 6
- 229940072174 amphenicols Drugs 0.000 description 6
- 230000003321 amplification Effects 0.000 description 6
- PYMYPHUHKUWMLA-WDCZJNDASA-N arabinose Chemical compound OC[C@@H](O)[C@@H](O)[C@H](O)C=O PYMYPHUHKUWMLA-WDCZJNDASA-N 0.000 description 6
- 238000010790 dilution Methods 0.000 description 6
- 239000012895 dilution Substances 0.000 description 6
- 230000002401 inhibitory effect Effects 0.000 description 6
- 108010054155 lysyllysine Proteins 0.000 description 6
- 108010038320 lysylphenylalanine Proteins 0.000 description 6
- 244000005700 microbiome Species 0.000 description 6
- 230000003472 neutralizing effect Effects 0.000 description 6
- 238000003199 nucleic acid amplification method Methods 0.000 description 6
- 239000007787 solid Substances 0.000 description 6
- 238000012360 testing method Methods 0.000 description 6
- 108010073969 valyllysine Proteins 0.000 description 6
- NTUPOKHATNSWCY-PMPSAXMXSA-N (2s)-2-[[(2s)-1-[(2r)-2-amino-3-phenylpropanoyl]pyrrolidine-2-carbonyl]amino]-5-(diaminomethylideneamino)pentanoic acid Chemical compound C([C@@H](N)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)C1=CC=CC=C1 NTUPOKHATNSWCY-PMPSAXMXSA-N 0.000 description 5
- DGKCOYGQLNWNCJ-ACZMJKKPSA-N Asp-Glu-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O DGKCOYGQLNWNCJ-ACZMJKKPSA-N 0.000 description 5
- 241000193468 Clostridium perfringens Species 0.000 description 5
- 101100005275 Clostridium perfringens catP gene Proteins 0.000 description 5
- 241001508458 Clostridium saccharoperbutylacetonicum Species 0.000 description 5
- FXJLRZFMKGHYJP-CFMVVWHZSA-N Ile-Tyr-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N FXJLRZFMKGHYJP-CFMVVWHZSA-N 0.000 description 5
- 241000880493 Leptailurus serval Species 0.000 description 5
- LVTJJOJKDCVZGP-QWRGUYRKSA-N Leu-Lys-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LVTJJOJKDCVZGP-QWRGUYRKSA-N 0.000 description 5
- KPJJOZUXFOLGMQ-CIUDSAMLSA-N Lys-Asp-Asn Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N KPJJOZUXFOLGMQ-CIUDSAMLSA-N 0.000 description 5
- 108020005091 Replication Origin Proteins 0.000 description 5
- AYPAIRCDLARHLM-KKUMJFAQSA-N Tyr-Asn-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O AYPAIRCDLARHLM-KKUMJFAQSA-N 0.000 description 5
- LRHBBGDMBLFYGL-FHWLQOOXSA-N Tyr-Phe-Glu Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CCC(O)=O)C(O)=O)C1=CC=C(O)C=C1 LRHBBGDMBLFYGL-FHWLQOOXSA-N 0.000 description 5
- QGFPYRPIUXBYGR-YDHLFZDLSA-N Val-Asn-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N QGFPYRPIUXBYGR-YDHLFZDLSA-N 0.000 description 5
- UXODSMTVPWXHBT-ULQDDVLXSA-N Val-Phe-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N UXODSMTVPWXHBT-ULQDDVLXSA-N 0.000 description 5
- 239000002253 acid Substances 0.000 description 5
- 230000008901 benefit Effects 0.000 description 5
- 238000010367 cloning Methods 0.000 description 5
- 108010081551 glycylphenylalanine Proteins 0.000 description 5
- 230000001976 improved effect Effects 0.000 description 5
- 238000003780 insertion Methods 0.000 description 5
- 230000037431 insertion Effects 0.000 description 5
- 108010017391 lysylvaline Proteins 0.000 description 5
- 239000011159 matrix material Substances 0.000 description 5
- 108010089198 phenylalanyl-prolyl-arginine Proteins 0.000 description 5
- 108090000765 processed proteins & peptides Proteins 0.000 description 5
- 238000011084 recovery Methods 0.000 description 5
- 108091008146 restriction endonucleases Proteins 0.000 description 5
- 241000894007 species Species 0.000 description 5
- 238000001890 transfection Methods 0.000 description 5
- 108010051110 tyrosyl-lysine Proteins 0.000 description 5
- 229930024421 Adenine Natural products 0.000 description 4
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 4
- 229920001817 Agar Polymers 0.000 description 4
- CREYEAPXISDKSB-FQPOAREZSA-N Ala-Thr-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CREYEAPXISDKSB-FQPOAREZSA-N 0.000 description 4
- IETUUAHKCHOQHP-KZVJFYERSA-N Ala-Thr-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@H](C)N)[C@@H](C)O)C(O)=O IETUUAHKCHOQHP-KZVJFYERSA-N 0.000 description 4
- IYKVSFNGSWTTNZ-GUBZILKMSA-N Ala-Val-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IYKVSFNGSWTTNZ-GUBZILKMSA-N 0.000 description 4
- APHUDFFMXFYRKP-CIUDSAMLSA-N Asn-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)N)N APHUDFFMXFYRKP-CIUDSAMLSA-N 0.000 description 4
- GIKOVDMXBAFXDF-NHCYSSNCSA-N Asp-Val-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O GIKOVDMXBAFXDF-NHCYSSNCSA-N 0.000 description 4
- FERIUCNNQQJTOY-UHFFFAOYSA-N Butyric acid Chemical compound CCCC(O)=O FERIUCNNQQJTOY-UHFFFAOYSA-N 0.000 description 4
- 108091092566 Extrachromosomal DNA Proteins 0.000 description 4
- CKRUHITYRFNUKW-WDSKDSINSA-N Glu-Asn-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O CKRUHITYRFNUKW-WDSKDSINSA-N 0.000 description 4
- NOQPTNXSGNPJNS-YUMQZZPRSA-N His-Asn-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O NOQPTNXSGNPJNS-YUMQZZPRSA-N 0.000 description 4
- TXLQHACKRLWYCM-DCAQKATOSA-N His-Glu-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O TXLQHACKRLWYCM-DCAQKATOSA-N 0.000 description 4
- FONIDUOGWNWEAX-XIRDDKMYSA-N His-Trp-Asn Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(N)=O)C(O)=O FONIDUOGWNWEAX-XIRDDKMYSA-N 0.000 description 4
- LQSBBHNVAVNZSX-GHCJXIJMSA-N Ile-Ala-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(=O)N)C(=O)O)N LQSBBHNVAVNZSX-GHCJXIJMSA-N 0.000 description 4
- YBKKLDBBPFIXBQ-MBLNEYKQSA-N Ile-Thr-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(=O)O)N YBKKLDBBPFIXBQ-MBLNEYKQSA-N 0.000 description 4
- APFJUBGRZGMQFF-QWRGUYRKSA-N Leu-Gly-Lys Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN APFJUBGRZGMQFF-QWRGUYRKSA-N 0.000 description 4
- DSFYPIUSAMSERP-IHRRRGAJSA-N Leu-Leu-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DSFYPIUSAMSERP-IHRRRGAJSA-N 0.000 description 4
- KYIIALJHAOIAHF-KKUMJFAQSA-N Leu-Leu-His Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 KYIIALJHAOIAHF-KKUMJFAQSA-N 0.000 description 4
- KPYAOIVPJKPIOU-KKUMJFAQSA-N Leu-Lys-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O KPYAOIVPJKPIOU-KKUMJFAQSA-N 0.000 description 4
- LLSUNJYOSCOOEB-GUBZILKMSA-N Lys-Glu-Asp Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O LLSUNJYOSCOOEB-GUBZILKMSA-N 0.000 description 4
- VMTYLUGCXIEDMV-QWRGUYRKSA-N Lys-Leu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CCCCN VMTYLUGCXIEDMV-QWRGUYRKSA-N 0.000 description 4
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 4
- KIAWKQJTSGRCSA-AVGNSLFASA-N Phe-Asn-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N KIAWKQJTSGRCSA-AVGNSLFASA-N 0.000 description 4
- SRILZRSXIKRGBF-HRCADAONSA-N Phe-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N SRILZRSXIKRGBF-HRCADAONSA-N 0.000 description 4
- IAOHCSQDQDWRQU-GUBZILKMSA-N Ser-Val-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IAOHCSQDQDWRQU-GUBZILKMSA-N 0.000 description 4
- 241000193996 Streptococcus pyogenes Species 0.000 description 4
- UJRIVCPPPMYCNA-HOCLYGCPSA-N Trp-Leu-Gly Chemical compound CC(C)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N UJRIVCPPPMYCNA-HOCLYGCPSA-N 0.000 description 4
- BYOHPUZJVXWHAE-BYULHYEWSA-N Val-Asn-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N BYOHPUZJVXWHAE-BYULHYEWSA-N 0.000 description 4
- 150000007513 acids Chemical class 0.000 description 4
- 101150006589 adc gene Proteins 0.000 description 4
- 229960000643 adenine Drugs 0.000 description 4
- OIRDTQYFTABQOQ-KQYNXXCUSA-N adenosine Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O OIRDTQYFTABQOQ-KQYNXXCUSA-N 0.000 description 4
- 239000008272 agar Substances 0.000 description 4
- 229940041514 candida albicans extract Drugs 0.000 description 4
- 210000000349 chromosome Anatomy 0.000 description 4
- 230000000694 effects Effects 0.000 description 4
- 238000004520 electroporation Methods 0.000 description 4
- 238000005516 engineering process Methods 0.000 description 4
- 238000002474 experimental method Methods 0.000 description 4
- 108010028295 histidylhistidine Proteins 0.000 description 4
- 238000009396 hybridization Methods 0.000 description 4
- 230000001965 increasing effect Effects 0.000 description 4
- 108010057821 leucylproline Proteins 0.000 description 4
- 239000007788 liquid Substances 0.000 description 4
- 108010044348 lysyl-glutamyl-aspartic acid Proteins 0.000 description 4
- 239000003550 marker Substances 0.000 description 4
- 108010073025 phenylalanylphenylalanine Proteins 0.000 description 4
- 239000013600 plasmid vector Substances 0.000 description 4
- 230000001737 promoting effect Effects 0.000 description 4
- 239000011780 sodium chloride Substances 0.000 description 4
- UNFWWIHTNXNPBV-WXKVUWSESA-N spectinomycin Chemical compound O([C@@H]1[C@@H](NC)[C@@H](O)[C@H]([C@@H]([C@H]1O1)O)NC)[C@]2(O)[C@H]1O[C@H](C)CC2=O UNFWWIHTNXNPBV-WXKVUWSESA-N 0.000 description 4
- 229960000268 spectinomycin Drugs 0.000 description 4
- 108010061238 threonyl-glycine Proteins 0.000 description 4
- 229940113082 thymine Drugs 0.000 description 4
- 239000012137 tryptone Substances 0.000 description 4
- 239000012138 yeast extract Substances 0.000 description 4
- ZODMADSIQZZBSQ-FXQIFTODSA-N Ala-Gln-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZODMADSIQZZBSQ-FXQIFTODSA-N 0.000 description 3
- KLALXKYLOMZDQT-ZLUOBGJFSA-N Ala-Ser-Asn Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(N)=O KLALXKYLOMZDQT-ZLUOBGJFSA-N 0.000 description 3
- NVGWESORMHFISY-SRVKXCTJSA-N Asn-Asn-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O NVGWESORMHFISY-SRVKXCTJSA-N 0.000 description 3
- COWITDLVHMZSIW-CIUDSAMLSA-N Asn-Lys-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O COWITDLVHMZSIW-CIUDSAMLSA-N 0.000 description 3
- MLJZMGIXXMTEPO-UBHSHLNASA-N Asn-Trp-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(O)=O MLJZMGIXXMTEPO-UBHSHLNASA-N 0.000 description 3
- DPWDPEVGACCWTC-SRVKXCTJSA-N Asn-Tyr-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O DPWDPEVGACCWTC-SRVKXCTJSA-N 0.000 description 3
- PQKSVQSMTHPRIB-ZKWXMUAHSA-N Asn-Val-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O PQKSVQSMTHPRIB-ZKWXMUAHSA-N 0.000 description 3
- DPNWSMBUYCLEDG-CIUDSAMLSA-N Asp-Lys-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O DPNWSMBUYCLEDG-CIUDSAMLSA-N 0.000 description 3
- NVXLFIPTHPKSKL-UBHSHLNASA-N Asp-Trp-Asn Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CC(O)=O)N)C(=O)N[C@@H](CC(N)=O)C(O)=O)=CNC2=C1 NVXLFIPTHPKSKL-UBHSHLNASA-N 0.000 description 3
- 241000193163 Clostridioides difficile Species 0.000 description 3
- 241000193155 Clostridium botulinum Species 0.000 description 3
- 101100119095 Enterococcus faecalis (strain ATCC 700802 / V583) ermB gene Proteins 0.000 description 3
- 229930091371 Fructose Natural products 0.000 description 3
- RFSUNEUAIZKAJO-ARQDHWQXSA-N Fructose Chemical compound OC[C@H]1O[C@](O)(CO)[C@@H](O)[C@@H]1O RFSUNEUAIZKAJO-ARQDHWQXSA-N 0.000 description 3
- 239000005715 Fructose Substances 0.000 description 3
- PNENQZWRFMUZOM-DCAQKATOSA-N Gln-Glu-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O PNENQZWRFMUZOM-DCAQKATOSA-N 0.000 description 3
- FTIJVMLAGRAYMJ-MNXVOIDGSA-N Gln-Ile-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(N)=O FTIJVMLAGRAYMJ-MNXVOIDGSA-N 0.000 description 3
- FLLRAEJOLZPSMN-CIUDSAMLSA-N Glu-Asn-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N FLLRAEJOLZPSMN-CIUDSAMLSA-N 0.000 description 3
- MWMJCGBSIORNCD-AVGNSLFASA-N Glu-Leu-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O MWMJCGBSIORNCD-AVGNSLFASA-N 0.000 description 3
- NTBOEZICHOSJEE-YUMQZZPRSA-N Gly-Lys-Ser Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O NTBOEZICHOSJEE-YUMQZZPRSA-N 0.000 description 3
- AKEDPWJFQULLPE-IUCAKERBSA-N His-Glu-Gly Chemical compound N[C@@H](Cc1cnc[nH]1)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O AKEDPWJFQULLPE-IUCAKERBSA-N 0.000 description 3
- ZLFNNVATRMCAKN-ZKWXMUAHSA-N Ile-Ser-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)NCC(=O)O)N ZLFNNVATRMCAKN-ZKWXMUAHSA-N 0.000 description 3
- FQZPTCNSNPWHLJ-AVGNSLFASA-N Leu-Gln-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(O)=O FQZPTCNSNPWHLJ-AVGNSLFASA-N 0.000 description 3
- XVZCXCTYGHPNEM-UHFFFAOYSA-N Leu-Leu-Pro Natural products CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)N1CCCC1C(O)=O XVZCXCTYGHPNEM-UHFFFAOYSA-N 0.000 description 3
- RZXLZBIUTDQHJQ-SRVKXCTJSA-N Leu-Lys-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O RZXLZBIUTDQHJQ-SRVKXCTJSA-N 0.000 description 3
- IRMLZWSRWSGTOP-CIUDSAMLSA-N Leu-Ser-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O IRMLZWSRWSGTOP-CIUDSAMLSA-N 0.000 description 3
- XOWMDXHFSBCAKQ-SRVKXCTJSA-N Leu-Ser-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(C)C XOWMDXHFSBCAKQ-SRVKXCTJSA-N 0.000 description 3
- KLSUAWUZBMAZCL-RHYQMDGZSA-N Leu-Thr-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(O)=O KLSUAWUZBMAZCL-RHYQMDGZSA-N 0.000 description 3
- FDBTVENULFNTAL-XQQFMLRXSA-N Leu-Val-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N FDBTVENULFNTAL-XQQFMLRXSA-N 0.000 description 3
- HQVDJTYKCMIWJP-YUMQZZPRSA-N Lys-Asn-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O HQVDJTYKCMIWJP-YUMQZZPRSA-N 0.000 description 3
- LZWNAOIMTLNMDW-NHCYSSNCSA-N Lys-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N LZWNAOIMTLNMDW-NHCYSSNCSA-N 0.000 description 3
- GAHJXEMYXKLZRQ-AJNGGQMLSA-N Lys-Lys-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O GAHJXEMYXKLZRQ-AJNGGQMLSA-N 0.000 description 3
- DRRXXZBXDMLGFC-IHRRRGAJSA-N Lys-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN DRRXXZBXDMLGFC-IHRRRGAJSA-N 0.000 description 3
- 102100025169 Max-binding protein MNT Human genes 0.000 description 3
- JACAKCWAOHKQBV-UWVGGRQHSA-N Met-Gly-Lys Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN JACAKCWAOHKQBV-UWVGGRQHSA-N 0.000 description 3
- OKKJLVBELUTLKV-UHFFFAOYSA-N Methanol Chemical compound OC OKKJLVBELUTLKV-UHFFFAOYSA-N 0.000 description 3
- XZFYRXDAULDNFX-UHFFFAOYSA-N N-L-cysteinyl-L-phenylalanine Natural products SCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XZFYRXDAULDNFX-UHFFFAOYSA-N 0.000 description 3
- 108010047562 NGR peptide Proteins 0.000 description 3
- 238000012408 PCR amplification Methods 0.000 description 3
- BFYHIHGIHGROAT-HTUGSXCWSA-N Phe-Glu-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BFYHIHGIHGROAT-HTUGSXCWSA-N 0.000 description 3
- TXJJXEXCZBHDNA-ACRUOGEOSA-N Phe-Phe-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)N TXJJXEXCZBHDNA-ACRUOGEOSA-N 0.000 description 3
- IIEOLPMQYRBZCN-SRVKXCTJSA-N Phe-Ser-Cys Chemical compound N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)O IIEOLPMQYRBZCN-SRVKXCTJSA-N 0.000 description 3
- ZXLUWXWISXIFIX-ACZMJKKPSA-N Ser-Asn-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZXLUWXWISXIFIX-ACZMJKKPSA-N 0.000 description 3
- ZOPISOXXPQNOCO-SVSWQMSJSA-N Ser-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CO)N ZOPISOXXPQNOCO-SVSWQMSJSA-N 0.000 description 3
- LGIMRDKGABDMBN-DCAQKATOSA-N Ser-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N LGIMRDKGABDMBN-DCAQKATOSA-N 0.000 description 3
- 239000004098 Tetracycline Substances 0.000 description 3
- PWONLXBUSVIZPH-RHYQMDGZSA-N Thr-Val-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N)O PWONLXBUSVIZPH-RHYQMDGZSA-N 0.000 description 3
- 108091028113 Trans-activating crRNA Proteins 0.000 description 3
- RRVUOLRWIZXBRQ-IHPCNDPISA-N Trp-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N RRVUOLRWIZXBRQ-IHPCNDPISA-N 0.000 description 3
- WPXKRJVHBXYLDT-JUKXBJQTSA-N Tyr-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC2=CC=C(C=C2)O)N WPXKRJVHBXYLDT-JUKXBJQTSA-N 0.000 description 3
- KIJLSRYAUGGZIN-CFMVVWHZSA-N Tyr-Ile-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O KIJLSRYAUGGZIN-CFMVVWHZSA-N 0.000 description 3
- KSCVLGXNQXKUAR-JYJNAYRXSA-N Tyr-Leu-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O KSCVLGXNQXKUAR-JYJNAYRXSA-N 0.000 description 3
- UZDHNIJRRTUKKC-DLOVCJGASA-N Val-Gln-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](C(C)C)C(=O)O)N UZDHNIJRRTUKKC-DLOVCJGASA-N 0.000 description 3
- BZMIYHIJVVJPCK-QSFUFRPTSA-N Val-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N BZMIYHIJVVJPCK-QSFUFRPTSA-N 0.000 description 3
- 101150067366 adh gene Proteins 0.000 description 3
- 101150056596 azin2 gene Proteins 0.000 description 3
- 230000010310 bacterial transformation Effects 0.000 description 3
- 101150105239 bdhB gene Proteins 0.000 description 3
- 229960000074 biopharmaceutical Drugs 0.000 description 3
- 108010060199 cysteinylproline Proteins 0.000 description 3
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 3
- 108010049041 glutamylalanine Proteins 0.000 description 3
- 108010087823 glycyltyrosine Proteins 0.000 description 3
- 108010025306 histidylleucine Proteins 0.000 description 3
- 108010018006 histidylserine Proteins 0.000 description 3
- 238000012423 maintenance Methods 0.000 description 3
- 239000002243 precursor Substances 0.000 description 3
- 230000035945 sensitivity Effects 0.000 description 3
- 108010069117 seryl-lysyl-aspartic acid Proteins 0.000 description 3
- 125000006850 spacer group Chemical group 0.000 description 3
- 108010005652 splenotritin Proteins 0.000 description 3
- 239000000126 substance Substances 0.000 description 3
- 230000004083 survival effect Effects 0.000 description 3
- 101150107585 tetA gene Proteins 0.000 description 3
- 229960002180 tetracycline Drugs 0.000 description 3
- 235000019364 tetracycline Nutrition 0.000 description 3
- 150000003522 tetracyclines Chemical class 0.000 description 3
- 230000002103 transcriptional effect Effects 0.000 description 3
- 108091006107 transcriptional repressors Proteins 0.000 description 3
- 230000009261 transgenic effect Effects 0.000 description 3
- 108010015666 tryptophyl-leucyl-glutamic acid Proteins 0.000 description 3
- HLXHCNWEVQNNKA-UHFFFAOYSA-N 5-methoxy-2,3-dihydro-1h-inden-2-amine Chemical compound COC1=CC=C2CC(N)CC2=C1 HLXHCNWEVQNNKA-UHFFFAOYSA-N 0.000 description 2
- 229920000936 Agarose Polymers 0.000 description 2
- LWUWMHIOBPTZBA-DCAQKATOSA-N Ala-Arg-Lys Chemical compound NC(=N)NCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCCN)C(O)=O LWUWMHIOBPTZBA-DCAQKATOSA-N 0.000 description 2
- NWVVKQZOVSTDBQ-CIUDSAMLSA-N Ala-Glu-Arg Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NWVVKQZOVSTDBQ-CIUDSAMLSA-N 0.000 description 2
- CFPQUJZTLUQUTJ-HTFCKZLJSA-N Ala-Ile-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@H](C)N CFPQUJZTLUQUTJ-HTFCKZLJSA-N 0.000 description 2
- HHRAXZAYZFFRAM-CIUDSAMLSA-N Ala-Leu-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O HHRAXZAYZFFRAM-CIUDSAMLSA-N 0.000 description 2
- AJBVYEYZVYPFCF-CIUDSAMLSA-N Ala-Lys-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O AJBVYEYZVYPFCF-CIUDSAMLSA-N 0.000 description 2
- PMQXMXAASGFUDX-SRVKXCTJSA-N Ala-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CCCCN PMQXMXAASGFUDX-SRVKXCTJSA-N 0.000 description 2
- MDNAVFBZPROEHO-UHFFFAOYSA-N Ala-Lys-Val Natural products CC(C)C(C(O)=O)NC(=O)C(NC(=O)C(C)N)CCCCN MDNAVFBZPROEHO-UHFFFAOYSA-N 0.000 description 2
- VNFWDYWTSHFRRG-SRVKXCTJSA-N Arg-Gln-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O VNFWDYWTSHFRRG-SRVKXCTJSA-N 0.000 description 2
- GOWZVQXTHUCNSQ-NHCYSSNCSA-N Arg-Glu-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O GOWZVQXTHUCNSQ-NHCYSSNCSA-N 0.000 description 2
- OTZMRMHZCMZOJZ-SRVKXCTJSA-N Arg-Leu-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O OTZMRMHZCMZOJZ-SRVKXCTJSA-N 0.000 description 2
- FSNVAJOPUDVQAR-AVGNSLFASA-N Arg-Lys-Arg Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FSNVAJOPUDVQAR-AVGNSLFASA-N 0.000 description 2
- AOJYORNRFWWEIV-IHRRRGAJSA-N Arg-Tyr-Asp Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=C(O)C=C1 AOJYORNRFWWEIV-IHRRRGAJSA-N 0.000 description 2
- IZSMEUDYADKZTJ-KJEVXHAQSA-N Arg-Tyr-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IZSMEUDYADKZTJ-KJEVXHAQSA-N 0.000 description 2
- XYOVHPDDWCEUDY-CIUDSAMLSA-N Asn-Ala-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O XYOVHPDDWCEUDY-CIUDSAMLSA-N 0.000 description 2
- BDMIFVIWCNLDCT-CIUDSAMLSA-N Asn-Arg-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O BDMIFVIWCNLDCT-CIUDSAMLSA-N 0.000 description 2
- OPEPUCYIGFEGSW-WDSKDSINSA-N Asn-Gly-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O OPEPUCYIGFEGSW-WDSKDSINSA-N 0.000 description 2
- XLZCLJRGGMBKLR-PCBIJLKTSA-N Asn-Ile-Phe Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XLZCLJRGGMBKLR-PCBIJLKTSA-N 0.000 description 2
- NTWOPSIUJBMNRI-KKUMJFAQSA-N Asn-Lys-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NTWOPSIUJBMNRI-KKUMJFAQSA-N 0.000 description 2
- MVXJBVVLACEGCG-PCBIJLKTSA-N Asn-Phe-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MVXJBVVLACEGCG-PCBIJLKTSA-N 0.000 description 2
- NJSNXIOKBHPFMB-GMOBBJLQSA-N Asn-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC(=O)N)N NJSNXIOKBHPFMB-GMOBBJLQSA-N 0.000 description 2
- UGXYFDQFLVCDFC-CIUDSAMLSA-N Asn-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O UGXYFDQFLVCDFC-CIUDSAMLSA-N 0.000 description 2
- QTKYFZCMSQLYHI-UBHSHLNASA-N Asn-Trp-Asn Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(N)=O)C(O)=O QTKYFZCMSQLYHI-UBHSHLNASA-N 0.000 description 2
- XEGZSHSPQNDNRH-JRQIVUDYSA-N Asn-Tyr-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XEGZSHSPQNDNRH-JRQIVUDYSA-N 0.000 description 2
- UGKZHCBLMLSANF-CIUDSAMLSA-N Asp-Asn-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O UGKZHCBLMLSANF-CIUDSAMLSA-N 0.000 description 2
- KNMRXHIAVXHCLW-ZLUOBGJFSA-N Asp-Asn-Ser Chemical compound C([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N)C(=O)O KNMRXHIAVXHCLW-ZLUOBGJFSA-N 0.000 description 2
- XJQRWGXKUSDEFI-ACZMJKKPSA-N Asp-Glu-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O XJQRWGXKUSDEFI-ACZMJKKPSA-N 0.000 description 2
- GBSUGIXJAAKZOW-GMOBBJLQSA-N Asp-Ile-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O GBSUGIXJAAKZOW-GMOBBJLQSA-N 0.000 description 2
- TZOZNVLBTAFJRW-UGYAYLCHSA-N Asp-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)O)N TZOZNVLBTAFJRW-UGYAYLCHSA-N 0.000 description 2
- LDLZOAJRXXBVGF-GMOBBJLQSA-N Asp-Ile-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC(=O)O)N LDLZOAJRXXBVGF-GMOBBJLQSA-N 0.000 description 2
- UJGRZQYSNYTCAX-SRVKXCTJSA-N Asp-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UJGRZQYSNYTCAX-SRVKXCTJSA-N 0.000 description 2
- GKWFMNNNYZHJHV-SRVKXCTJSA-N Asp-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC(O)=O GKWFMNNNYZHJHV-SRVKXCTJSA-N 0.000 description 2
- LTCKTLYKRMCFOC-KKUMJFAQSA-N Asp-Phe-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O LTCKTLYKRMCFOC-KKUMJFAQSA-N 0.000 description 2
- GWWSUMLEWKQHLR-NUMRIWBASA-N Asp-Thr-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O GWWSUMLEWKQHLR-NUMRIWBASA-N 0.000 description 2
- 241000193830 Bacillus <bacterium> Species 0.000 description 2
- 239000002126 C01EB10 - Adenosine Substances 0.000 description 2
- 238000010356 CRISPR-Cas9 genome editing Methods 0.000 description 2
- 108010078791 Carrier Proteins Proteins 0.000 description 2
- 241001112695 Clostridiales Species 0.000 description 2
- 101100385572 Clostridium acetobutylicum (strain ATCC 824 / DSM 792 / JCM 1419 / LMG 5710 / VKM B-1787) ctfA gene Proteins 0.000 description 2
- 101100385573 Clostridium acetobutylicum (strain ATCC 824 / DSM 792 / JCM 1419 / LMG 5710 / VKM B-1787) ctfB gene Proteins 0.000 description 2
- 241000272479 Clostridium diolis Species 0.000 description 2
- 241000328950 Clostridium drakei Species 0.000 description 2
- 241000193464 Clostridium sp. Species 0.000 description 2
- 241001638344 Clostridium tunisiense Species 0.000 description 2
- 241000194032 Enterococcus faecalis Species 0.000 description 2
- 241000192125 Firmicutes Species 0.000 description 2
- ODBLJLZVLAWVMS-GUBZILKMSA-N Gln-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)N)N ODBLJLZVLAWVMS-GUBZILKMSA-N 0.000 description 2
- SXGMGNZEHFORAV-IUCAKERBSA-N Gln-Lys-Gly Chemical compound C(CCN)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N SXGMGNZEHFORAV-IUCAKERBSA-N 0.000 description 2
- DOQUICBEISTQHE-CIUDSAMLSA-N Gln-Pro-Asp Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O DOQUICBEISTQHE-CIUDSAMLSA-N 0.000 description 2
- KKCUFHUTMKQQCF-SRVKXCTJSA-N Glu-Arg-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O KKCUFHUTMKQQCF-SRVKXCTJSA-N 0.000 description 2
- GLWXKFRTOHKGIT-ACZMJKKPSA-N Glu-Asn-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O GLWXKFRTOHKGIT-ACZMJKKPSA-N 0.000 description 2
- QQLBPVKLJBAXBS-FXQIFTODSA-N Glu-Glu-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O QQLBPVKLJBAXBS-FXQIFTODSA-N 0.000 description 2
- ZHNHJYYFCGUZNQ-KBIXCLLPSA-N Glu-Ile-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(O)=O ZHNHJYYFCGUZNQ-KBIXCLLPSA-N 0.000 description 2
- CUPSDFQZTVVTSK-GUBZILKMSA-N Glu-Lys-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(O)=O CUPSDFQZTVVTSK-GUBZILKMSA-N 0.000 description 2
- CQAHWYDHKUWYIX-YUMQZZPRSA-N Glu-Pro-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O CQAHWYDHKUWYIX-YUMQZZPRSA-N 0.000 description 2
- BXSZPACYCMNKLS-AVGNSLFASA-N Glu-Ser-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O BXSZPACYCMNKLS-AVGNSLFASA-N 0.000 description 2
- UCZXXMREFIETQW-AVGNSLFASA-N Glu-Tyr-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O UCZXXMREFIETQW-AVGNSLFASA-N 0.000 description 2
- YYPFZVIXAVDHIK-IUCAKERBSA-N Gly-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN YYPFZVIXAVDHIK-IUCAKERBSA-N 0.000 description 2
- XPJBQTCXPJNIFE-ZETCQYMHSA-N Gly-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)CN XPJBQTCXPJNIFE-ZETCQYMHSA-N 0.000 description 2
- CVFOYJJOZYYEPE-KBPBESRZSA-N Gly-Lys-Tyr Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CVFOYJJOZYYEPE-KBPBESRZSA-N 0.000 description 2
- PNUFMLXHOLFRLD-KBPBESRZSA-N Gly-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 PNUFMLXHOLFRLD-KBPBESRZSA-N 0.000 description 2
- XINDHUAGVGCNSF-QSFUFRPTSA-N His-Ala-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XINDHUAGVGCNSF-QSFUFRPTSA-N 0.000 description 2
- UOAVQQRILDGZEN-SRVKXCTJSA-N His-Asp-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O UOAVQQRILDGZEN-SRVKXCTJSA-N 0.000 description 2
- CSRRMQFXMBPSIL-SIXJUCDHSA-N His-Trp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@H](CC3=CN=CN3)N CSRRMQFXMBPSIL-SIXJUCDHSA-N 0.000 description 2
- GECLQMBTZCPAFY-PEFMBERDSA-N Ile-Gln-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N GECLQMBTZCPAFY-PEFMBERDSA-N 0.000 description 2
- KUHFPGIVBOCRMV-MNXVOIDGSA-N Ile-Gln-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(C)C)C(=O)O)N KUHFPGIVBOCRMV-MNXVOIDGSA-N 0.000 description 2
- PDTMWFVVNZYWTR-NHCYSSNCSA-N Ile-Gly-Lys Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@@H](CCCCN)C(O)=O PDTMWFVVNZYWTR-NHCYSSNCSA-N 0.000 description 2
- TWYOYAKMLHWMOJ-ZPFDUUQYSA-N Ile-Leu-Asn Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O TWYOYAKMLHWMOJ-ZPFDUUQYSA-N 0.000 description 2
- YGDWPQCLFJNMOL-MNXVOIDGSA-N Ile-Leu-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N YGDWPQCLFJNMOL-MNXVOIDGSA-N 0.000 description 2
- HUORUFRRJHELPD-MNXVOIDGSA-N Ile-Leu-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N HUORUFRRJHELPD-MNXVOIDGSA-N 0.000 description 2
- HPCFRQWLTRDGHT-AJNGGQMLSA-N Ile-Leu-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O HPCFRQWLTRDGHT-AJNGGQMLSA-N 0.000 description 2
- OVDKXUDMKXAZIV-ZPFDUUQYSA-N Ile-Lys-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)N)C(=O)O)N OVDKXUDMKXAZIV-ZPFDUUQYSA-N 0.000 description 2
- RVNOXPZHMUWCLW-GMOBBJLQSA-N Ile-Met-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)N)C(=O)O)N RVNOXPZHMUWCLW-GMOBBJLQSA-N 0.000 description 2
- CIDLJWVDMNDKPT-FIRPJDEBSA-N Ile-Phe-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)O)N CIDLJWVDMNDKPT-FIRPJDEBSA-N 0.000 description 2
- VZSDQFZFTCVEGF-ZEWNOJEFSA-N Ile-Phe-Tyr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](Cc1ccc(O)cc1)C(O)=O VZSDQFZFTCVEGF-ZEWNOJEFSA-N 0.000 description 2
- ZNOBVZFCHNHKHA-KBIXCLLPSA-N Ile-Ser-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZNOBVZFCHNHKHA-KBIXCLLPSA-N 0.000 description 2
- AGGIYSLVUKVOPT-HTFCKZLJSA-N Ile-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N AGGIYSLVUKVOPT-HTFCKZLJSA-N 0.000 description 2
- PXKACEXYLPBMAD-JBDRJPRFSA-N Ile-Ser-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)O)N PXKACEXYLPBMAD-JBDRJPRFSA-N 0.000 description 2
- CNMOKANDJMLAIF-CIQUZCHMSA-N Ile-Thr-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O CNMOKANDJMLAIF-CIQUZCHMSA-N 0.000 description 2
- NAFIFZNBSPWYOO-RWRJDSDZSA-N Ile-Thr-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N NAFIFZNBSPWYOO-RWRJDSDZSA-N 0.000 description 2
- COWHUQXTSYTKQC-RWRJDSDZSA-N Ile-Thr-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N COWHUQXTSYTKQC-RWRJDSDZSA-N 0.000 description 2
- OAQJOXZPGHTJNA-NGTWOADLSA-N Ile-Trp-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N OAQJOXZPGHTJNA-NGTWOADLSA-N 0.000 description 2
- BCISUQVFDGYZBO-QSFUFRPTSA-N Ile-Val-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(O)=O BCISUQVFDGYZBO-QSFUFRPTSA-N 0.000 description 2
- 108010065920 Insulin Lispro Proteins 0.000 description 2
- 101100001037 Komagataeibacter europaeus adhA gene Proteins 0.000 description 2
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 2
- SENJXOPIZNYLHU-UHFFFAOYSA-N L-leucyl-L-arginine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CCCN=C(N)N SENJXOPIZNYLHU-UHFFFAOYSA-N 0.000 description 2
- WUFYAPWIHCUMLL-CIUDSAMLSA-N Leu-Asn-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O WUFYAPWIHCUMLL-CIUDSAMLSA-N 0.000 description 2
- JKGHDYGZRDWHGA-SRVKXCTJSA-N Leu-Asn-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JKGHDYGZRDWHGA-SRVKXCTJSA-N 0.000 description 2
- MYGQXVYRZMKRDB-SRVKXCTJSA-N Leu-Asp-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN MYGQXVYRZMKRDB-SRVKXCTJSA-N 0.000 description 2
- ZTLGVASZOIKNIX-DCAQKATOSA-N Leu-Gln-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZTLGVASZOIKNIX-DCAQKATOSA-N 0.000 description 2
- WIDZHJTYKYBLSR-DCAQKATOSA-N Leu-Glu-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WIDZHJTYKYBLSR-DCAQKATOSA-N 0.000 description 2
- DBSLVQBXKVKDKJ-BJDJZHNGSA-N Leu-Ile-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O DBSLVQBXKVKDKJ-BJDJZHNGSA-N 0.000 description 2
- USLNHQZCDQJBOV-ZPFDUUQYSA-N Leu-Ile-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O USLNHQZCDQJBOV-ZPFDUUQYSA-N 0.000 description 2
- IAJFFZORSWOZPQ-SRVKXCTJSA-N Leu-Leu-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IAJFFZORSWOZPQ-SRVKXCTJSA-N 0.000 description 2
- BJWKOATWNQJPSK-SRVKXCTJSA-N Leu-Met-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N BJWKOATWNQJPSK-SRVKXCTJSA-N 0.000 description 2
- AIRUUHAOKGVJAD-JYJNAYRXSA-N Leu-Phe-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O AIRUUHAOKGVJAD-JYJNAYRXSA-N 0.000 description 2
- YRRCOJOXAJNSAX-IHRRRGAJSA-N Leu-Pro-Lys Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)O)N YRRCOJOXAJNSAX-IHRRRGAJSA-N 0.000 description 2
- IZPVWNSAVUQBGP-CIUDSAMLSA-N Leu-Ser-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O IZPVWNSAVUQBGP-CIUDSAMLSA-N 0.000 description 2
- FBNPMTNBFFAMMH-AVGNSLFASA-N Leu-Val-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-AVGNSLFASA-N 0.000 description 2
- FBNPMTNBFFAMMH-UHFFFAOYSA-N Leu-Val-Arg Natural products CC(C)CC(N)C(=O)NC(C(C)C)C(=O)NC(C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-UHFFFAOYSA-N 0.000 description 2
- MPGHETGWWWUHPY-CIUDSAMLSA-N Lys-Ala-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN MPGHETGWWWUHPY-CIUDSAMLSA-N 0.000 description 2
- FACUGMGEFUEBTI-SRVKXCTJSA-N Lys-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCCCN FACUGMGEFUEBTI-SRVKXCTJSA-N 0.000 description 2
- HKCCVDWHHTVVPN-CIUDSAMLSA-N Lys-Asp-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O HKCCVDWHHTVVPN-CIUDSAMLSA-N 0.000 description 2
- IWWMPCPLFXFBAF-SRVKXCTJSA-N Lys-Asp-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O IWWMPCPLFXFBAF-SRVKXCTJSA-N 0.000 description 2
- XNKDCYABMBBEKN-IUCAKERBSA-N Lys-Gly-Gln Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O XNKDCYABMBBEKN-IUCAKERBSA-N 0.000 description 2
- JYXBNQOKPRQNQS-YTFOTSKYSA-N Lys-Ile-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JYXBNQOKPRQNQS-YTFOTSKYSA-N 0.000 description 2
- ZXFRGTAIIZHNHG-AJNGGQMLSA-N Lys-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CCCCN)N ZXFRGTAIIZHNHG-AJNGGQMLSA-N 0.000 description 2
- SKRGVGLIRUGANF-AVGNSLFASA-N Lys-Leu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SKRGVGLIRUGANF-AVGNSLFASA-N 0.000 description 2
- WVJNGSFKBKOKRV-AJNGGQMLSA-N Lys-Leu-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WVJNGSFKBKOKRV-AJNGGQMLSA-N 0.000 description 2
- LJADEBULDNKJNK-IHRRRGAJSA-N Lys-Leu-Val Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O LJADEBULDNKJNK-IHRRRGAJSA-N 0.000 description 2
- HVAUKHLDSDDROB-KKUMJFAQSA-N Lys-Lys-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O HVAUKHLDSDDROB-KKUMJFAQSA-N 0.000 description 2
- WGILOYIKJVQUPT-DCAQKATOSA-N Lys-Pro-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O WGILOYIKJVQUPT-DCAQKATOSA-N 0.000 description 2
- JMNRXRPBHFGXQX-GUBZILKMSA-N Lys-Ser-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O JMNRXRPBHFGXQX-GUBZILKMSA-N 0.000 description 2
- JHNOXVASMSXSNB-WEDXCCLWSA-N Lys-Thr-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O JHNOXVASMSXSNB-WEDXCCLWSA-N 0.000 description 2
- FRWZTWWOORIIBA-FXQIFTODSA-N Met-Asn-Asn Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N FRWZTWWOORIIBA-FXQIFTODSA-N 0.000 description 2
- RRIHXWPHQSXHAQ-XUXIUFHCSA-N Met-Ile-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCCN)C(O)=O RRIHXWPHQSXHAQ-XUXIUFHCSA-N 0.000 description 2
- 101100162145 Mycolicibacterium smegmatis (strain ATCC 700084 / mc(2)155) adhC2 gene Proteins 0.000 description 2
- 101100162144 Mycolicibacterium smegmatis (strain ATCC 700084 / mc(2)155) adhc1 gene Proteins 0.000 description 2
- 108010079364 N-glycylalanine Proteins 0.000 description 2
- 101100378791 Paenarthrobacter nicotinovorans aldh gene Proteins 0.000 description 2
- 206010034133 Pathogen resistance Diseases 0.000 description 2
- 108091093037 Peptide nucleic acid Proteins 0.000 description 2
- MPGJIHFJCXTVEX-KKUMJFAQSA-N Phe-Arg-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O MPGJIHFJCXTVEX-KKUMJFAQSA-N 0.000 description 2
- RJYBHZVWJPUSLB-QEWYBTABSA-N Phe-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=CC=C1)N RJYBHZVWJPUSLB-QEWYBTABSA-N 0.000 description 2
- WPTYDQPGBMDUBI-QWRGUYRKSA-N Phe-Gly-Asn Chemical compound N[C@@H](Cc1ccccc1)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O WPTYDQPGBMDUBI-QWRGUYRKSA-N 0.000 description 2
- WEMYTDDMDBLPMI-DKIMLUQUSA-N Phe-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N WEMYTDDMDBLPMI-DKIMLUQUSA-N 0.000 description 2
- OQTDZEJJWWAGJT-KKUMJFAQSA-N Phe-Lys-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O OQTDZEJJWWAGJT-KKUMJFAQSA-N 0.000 description 2
- PEFJUUYFEGBXFA-BZSNNMDCSA-N Phe-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 PEFJUUYFEGBXFA-BZSNNMDCSA-N 0.000 description 2
- WEDZFLRYSIDIRX-IHRRRGAJSA-N Phe-Ser-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=CC=C1 WEDZFLRYSIDIRX-IHRRRGAJSA-N 0.000 description 2
- BPCLGWHVPVTTFM-QWRGUYRKSA-N Phe-Ser-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)NCC(O)=O BPCLGWHVPVTTFM-QWRGUYRKSA-N 0.000 description 2
- IPFXYNKCXYGSSV-KKUMJFAQSA-N Phe-Ser-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N IPFXYNKCXYGSSV-KKUMJFAQSA-N 0.000 description 2
- DBNGDEAQXGFGRA-ACRUOGEOSA-N Phe-Tyr-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CCCCN)C(=O)O)N DBNGDEAQXGFGRA-ACRUOGEOSA-N 0.000 description 2
- 102000055027 Protein Methyltransferases Human genes 0.000 description 2
- 108020004511 Recombinant DNA Proteins 0.000 description 2
- 101100490556 Schizosaccharomyces pombe (strain 972 / ATCC 24843) adh1 gene Proteins 0.000 description 2
- WTUJZHKANPDPIN-CIUDSAMLSA-N Ser-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N WTUJZHKANPDPIN-CIUDSAMLSA-N 0.000 description 2
- KJMOINFQVCCSDX-XKBZYTNZSA-N Ser-Gln-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KJMOINFQVCCSDX-XKBZYTNZSA-N 0.000 description 2
- LALNXSXEYFUUDD-GUBZILKMSA-N Ser-Glu-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LALNXSXEYFUUDD-GUBZILKMSA-N 0.000 description 2
- QKQDTEYDEIJPNK-GUBZILKMSA-N Ser-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CO QKQDTEYDEIJPNK-GUBZILKMSA-N 0.000 description 2
- MIJWOJAXARLEHA-WDSKDSINSA-N Ser-Gly-Glu Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O MIJWOJAXARLEHA-WDSKDSINSA-N 0.000 description 2
- UBRMZSHOOIVJPW-SRVKXCTJSA-N Ser-Leu-Lys Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O UBRMZSHOOIVJPW-SRVKXCTJSA-N 0.000 description 2
- CRJZZXMAADSBBQ-SRVKXCTJSA-N Ser-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO CRJZZXMAADSBBQ-SRVKXCTJSA-N 0.000 description 2
- UPLYXVPQLJVWMM-KKUMJFAQSA-N Ser-Phe-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O UPLYXVPQLJVWMM-KKUMJFAQSA-N 0.000 description 2
- WLJPJRGQRNCIQS-ZLUOBGJFSA-N Ser-Ser-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O WLJPJRGQRNCIQS-ZLUOBGJFSA-N 0.000 description 2
- FHXGMDRKJHKLKW-QWRGUYRKSA-N Ser-Tyr-Gly Chemical compound OC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=C(O)C=C1 FHXGMDRKJHKLKW-QWRGUYRKSA-N 0.000 description 2
- ZUXQFMVPAYGPFJ-JXUBOQSCSA-N Thr-Ala-Lys Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN ZUXQFMVPAYGPFJ-JXUBOQSCSA-N 0.000 description 2
- DKDHTRVDOUZZTP-IFFSRLJSSA-N Thr-Gln-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)[C@@H](C)O)C(O)=O DKDHTRVDOUZZTP-IFFSRLJSSA-N 0.000 description 2
- JMGJDTNUMAZNLX-RWRJDSDZSA-N Thr-Glu-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JMGJDTNUMAZNLX-RWRJDSDZSA-N 0.000 description 2
- MPUMPERGHHJGRP-WEDXCCLWSA-N Thr-Gly-Lys Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N)O MPUMPERGHHJGRP-WEDXCCLWSA-N 0.000 description 2
- LECUEEHKUFYOOV-ZJDVBMNYSA-N Thr-Thr-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@@H](N)[C@@H](C)O LECUEEHKUFYOOV-ZJDVBMNYSA-N 0.000 description 2
- PELIQFPESHBTMA-WLTAIBSBSA-N Thr-Tyr-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=C(O)C=C1 PELIQFPESHBTMA-WLTAIBSBSA-N 0.000 description 2
- ABCLYRRGTZNIFU-BWAGICSOSA-N Thr-Tyr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O ABCLYRRGTZNIFU-BWAGICSOSA-N 0.000 description 2
- DANHCMVVXDXOHN-SRVKXCTJSA-N Tyr-Asp-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 DANHCMVVXDXOHN-SRVKXCTJSA-N 0.000 description 2
- MPKPIWFFDWVJGC-IRIUXVKKSA-N Tyr-Gln-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N)O MPKPIWFFDWVJGC-IRIUXVKKSA-N 0.000 description 2
- KOVXHANYYYMBRF-IRIUXVKKSA-N Tyr-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N)O KOVXHANYYYMBRF-IRIUXVKKSA-N 0.000 description 2
- GULIUBBXCYPDJU-CQDKDKBSSA-N Tyr-Leu-Ala Chemical compound [O-]C(=O)[C@H](C)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]([NH3+])CC1=CC=C(O)C=C1 GULIUBBXCYPDJU-CQDKDKBSSA-N 0.000 description 2
- YKCXQOBTISTQJD-BZSNNMDCSA-N Tyr-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N YKCXQOBTISTQJD-BZSNNMDCSA-N 0.000 description 2
- BGFCXQXETBDEHP-BZSNNMDCSA-N Tyr-Phe-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O BGFCXQXETBDEHP-BZSNNMDCSA-N 0.000 description 2
- FDKDGFGTHGJKNV-FHWLQOOXSA-N Tyr-Phe-Gln Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N FDKDGFGTHGJKNV-FHWLQOOXSA-N 0.000 description 2
- NHOVZGFNTGMYMI-KKUMJFAQSA-N Tyr-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NHOVZGFNTGMYMI-KKUMJFAQSA-N 0.000 description 2
- GZWPQZDVTBZVEP-BZSNNMDCSA-N Tyr-Tyr-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O GZWPQZDVTBZVEP-BZSNNMDCSA-N 0.000 description 2
- PAPWZOJOLKZEFR-AVGNSLFASA-N Val-Arg-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N PAPWZOJOLKZEFR-AVGNSLFASA-N 0.000 description 2
- UDLYXGYWTVOIKU-QXEWZRGKSA-N Val-Asn-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N UDLYXGYWTVOIKU-QXEWZRGKSA-N 0.000 description 2
- BMGOFDMKDVVGJG-NHCYSSNCSA-N Val-Asp-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BMGOFDMKDVVGJG-NHCYSSNCSA-N 0.000 description 2
- DIOSYUIWOQCXNR-ONGXEEELSA-N Val-Lys-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O DIOSYUIWOQCXNR-ONGXEEELSA-N 0.000 description 2
- UEPLNXPLHJUYPT-AVGNSLFASA-N Val-Met-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(O)=O UEPLNXPLHJUYPT-AVGNSLFASA-N 0.000 description 2
- FMQGYTMERWBMSI-HJWJTTGWSA-N Val-Phe-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](C(C)C)N FMQGYTMERWBMSI-HJWJTTGWSA-N 0.000 description 2
- JXCOEPXCBVCTRD-JYJNAYRXSA-N Val-Tyr-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N JXCOEPXCBVCTRD-JYJNAYRXSA-N 0.000 description 2
- 101150081538 aad9 gene Proteins 0.000 description 2
- 229960005305 adenosine Drugs 0.000 description 2
- 101150004356 adhC gene Proteins 0.000 description 2
- 108010013835 arginine glutamate Proteins 0.000 description 2
- 108010008355 arginyl-glutamine Proteins 0.000 description 2
- 108010084758 arginyl-tyrosyl-aspartic acid Proteins 0.000 description 2
- 235000003704 aspartic acid Nutrition 0.000 description 2
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 2
- 238000000429 assembly Methods 0.000 description 2
- 230000000712 assembly Effects 0.000 description 2
- 239000002551 biofuel Substances 0.000 description 2
- 101150110215 catQ gene Proteins 0.000 description 2
- 101150008363 celC gene Proteins 0.000 description 2
- 238000010835 comparative analysis Methods 0.000 description 2
- 108010016616 cysteinylglycine Proteins 0.000 description 2
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 2
- 229940032049 enterococcus faecalis Drugs 0.000 description 2
- 230000002255 enzymatic effect Effects 0.000 description 2
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 2
- 108010026364 glycyl-glycyl-leucine Proteins 0.000 description 2
- 108010066198 glycyl-leucyl-phenylalanine Proteins 0.000 description 2
- 108010050475 glycyl-leucyl-tyrosine Proteins 0.000 description 2
- 108010015792 glycyllysine Proteins 0.000 description 2
- 108010037850 glycylvaline Proteins 0.000 description 2
- 108010085325 histidylproline Proteins 0.000 description 2
- 238000010348 incorporation Methods 0.000 description 2
- 230000003993 interaction Effects 0.000 description 2
- 108010084715 isopropanol dehydrogenase (NADP) Proteins 0.000 description 2
- 101150109249 lacI gene Proteins 0.000 description 2
- 108010000761 leucylarginine Proteins 0.000 description 2
- 108010064235 lysylglycine Proteins 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 108010012581 phenylalanylglutamate Proteins 0.000 description 2
- 108010025488 pinealon Proteins 0.000 description 2
- 229920001184 polypeptide Polymers 0.000 description 2
- 102000004196 processed proteins & peptides Human genes 0.000 description 2
- 101150048333 ptb gene Proteins 0.000 description 2
- 230000006798 recombination Effects 0.000 description 2
- 238000005215 recombination Methods 0.000 description 2
- 230000003362 replicative effect Effects 0.000 description 2
- 238000003786 synthesis reaction Methods 0.000 description 2
- 101150061166 tetR gene Proteins 0.000 description 2
- 231100000419 toxicity Toxicity 0.000 description 2
- 230000001988 toxicity Effects 0.000 description 2
- 108010080629 tryptophan-leucine Proteins 0.000 description 2
- 238000011144 upstream manufacturing Methods 0.000 description 2
- 238000012795 verification Methods 0.000 description 2
- 101150052264 xylA gene Proteins 0.000 description 2
- 101150110790 xylB gene Proteins 0.000 description 2
- AYIRNRDRBQJXIF-NXEZZACHSA-N (-)-Florfenicol Chemical compound CS(=O)(=O)C1=CC=C([C@@H](O)[C@@H](CF)NC(=O)C(Cl)Cl)C=C1 AYIRNRDRBQJXIF-NXEZZACHSA-N 0.000 description 1
- JNTMAZFVYNDPLB-PEDHHIEDSA-N (2S,3S)-2-[[[(2S)-1-[(2S,3S)-2-amino-3-methyl-1-oxopentyl]-2-pyrrolidinyl]-oxomethyl]amino]-3-methylpentanoic acid Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JNTMAZFVYNDPLB-PEDHHIEDSA-N 0.000 description 1
- XVZCXCTYGHPNEM-IHRRRGAJSA-N (2s)-1-[(2s)-2-[[(2s)-2-amino-4-methylpentanoyl]amino]-4-methylpentanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(O)=O XVZCXCTYGHPNEM-IHRRRGAJSA-N 0.000 description 1
- BRPMXFSTKXXNHF-IUCAKERBSA-N (2s)-1-[2-[[(2s)-pyrrolidine-2-carbonyl]amino]acetyl]pyrrolidine-2-carboxylic acid Chemical compound OC(=O)[C@@H]1CCCN1C(=O)CNC(=O)[C@H]1NCCC1 BRPMXFSTKXXNHF-IUCAKERBSA-N 0.000 description 1
- AXFMEGAFCUULFV-BLFANLJRSA-N (2s)-2-[[(2s)-1-[(2s,3r)-2-amino-3-methylpentanoyl]pyrrolidine-2-carbonyl]amino]pentanedioic acid Chemical compound CC[C@@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O AXFMEGAFCUULFV-BLFANLJRSA-N 0.000 description 1
- MXHRCPNRJAMMIM-SHYZEUOFSA-N 2'-deoxyuridine Chemical compound C1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C=C1 MXHRCPNRJAMMIM-SHYZEUOFSA-N 0.000 description 1
- 108020005065 3' Flanking Region Proteins 0.000 description 1
- 108020005029 5' Flanking Region Proteins 0.000 description 1
- 101100235000 Acetobacterium woodii (strain ATCC 29683 / DSM 1030 / JCM 2381 / KCTC 1655 / WB1) lctB gene Proteins 0.000 description 1
- 101100235003 Acetobacterium woodii (strain ATCC 29683 / DSM 1030 / JCM 2381 / KCTC 1655 / WB1) lctC gene Proteins 0.000 description 1
- FJVAQLJNTSUQPY-CIUDSAMLSA-N Ala-Ala-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN FJVAQLJNTSUQPY-CIUDSAMLSA-N 0.000 description 1
- QDRGPQWIVZNJQD-CIUDSAMLSA-N Ala-Arg-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O QDRGPQWIVZNJQD-CIUDSAMLSA-N 0.000 description 1
- XEXJJJRVTFGWIC-FXQIFTODSA-N Ala-Asn-Arg Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N XEXJJJRVTFGWIC-FXQIFTODSA-N 0.000 description 1
- NXSFUECZFORGOG-CIUDSAMLSA-N Ala-Asn-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NXSFUECZFORGOG-CIUDSAMLSA-N 0.000 description 1
- ZIBWKCRKNFYTPT-ZKWXMUAHSA-N Ala-Asn-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O ZIBWKCRKNFYTPT-ZKWXMUAHSA-N 0.000 description 1
- WXERCAHAIKMTKX-ZLUOBGJFSA-N Ala-Asp-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O WXERCAHAIKMTKX-ZLUOBGJFSA-N 0.000 description 1
- ZIWWTZWAKYBUOB-CIUDSAMLSA-N Ala-Asp-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O ZIWWTZWAKYBUOB-CIUDSAMLSA-N 0.000 description 1
- LSLIRHLIUDVNBN-CIUDSAMLSA-N Ala-Asp-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN LSLIRHLIUDVNBN-CIUDSAMLSA-N 0.000 description 1
- XAGIMRPOEJSYER-CIUDSAMLSA-N Ala-Cys-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)O)N XAGIMRPOEJSYER-CIUDSAMLSA-N 0.000 description 1
- FVSOUJZKYWEFOB-KBIXCLLPSA-N Ala-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](C)N FVSOUJZKYWEFOB-KBIXCLLPSA-N 0.000 description 1
- MVBWLRJESQOQTM-ACZMJKKPSA-N Ala-Gln-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O MVBWLRJESQOQTM-ACZMJKKPSA-N 0.000 description 1
- YIGLXQRFQVWFEY-NRPADANISA-N Ala-Gln-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O YIGLXQRFQVWFEY-NRPADANISA-N 0.000 description 1
- NJPMYXWVWQWCSR-ACZMJKKPSA-N Ala-Glu-Asn Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O NJPMYXWVWQWCSR-ACZMJKKPSA-N 0.000 description 1
- WKOBSJOZRJJVRZ-FXQIFTODSA-N Ala-Glu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WKOBSJOZRJJVRZ-FXQIFTODSA-N 0.000 description 1
- GGNHBHYDMUDXQB-KBIXCLLPSA-N Ala-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](C)N GGNHBHYDMUDXQB-KBIXCLLPSA-N 0.000 description 1
- XYTNPQNAZREREP-XQXXSGGOSA-N Ala-Glu-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XYTNPQNAZREREP-XQXXSGGOSA-N 0.000 description 1
- MPLOSMWGDNJSEV-WHFBIAKZSA-N Ala-Gly-Asp Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MPLOSMWGDNJSEV-WHFBIAKZSA-N 0.000 description 1
- XCZXVTHYGSMQGH-NAKRPEOUSA-N Ala-Ile-Met Chemical compound C[C@H]([NH3+])C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCSC)C([O-])=O XCZXVTHYGSMQGH-NAKRPEOUSA-N 0.000 description 1
- CCDFBRZVTDDJNM-GUBZILKMSA-N Ala-Leu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O CCDFBRZVTDDJNM-GUBZILKMSA-N 0.000 description 1
- MNZHHDPWDWQJCQ-YUMQZZPRSA-N Ala-Leu-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O MNZHHDPWDWQJCQ-YUMQZZPRSA-N 0.000 description 1
- OYJCVIGKMXUVKB-GARJFASQSA-N Ala-Leu-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N OYJCVIGKMXUVKB-GARJFASQSA-N 0.000 description 1
- LDLSENBXQNDTPB-DCAQKATOSA-N Ala-Lys-Arg Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N LDLSENBXQNDTPB-DCAQKATOSA-N 0.000 description 1
- XHNLCGXYBXNRIS-BJDJZHNGSA-N Ala-Lys-Ile Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XHNLCGXYBXNRIS-BJDJZHNGSA-N 0.000 description 1
- KQESEZXHYOUIIM-CQDKDKBSSA-N Ala-Lys-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KQESEZXHYOUIIM-CQDKDKBSSA-N 0.000 description 1
- ZBLQIYPCUWZSRZ-QEJZJMRPSA-N Ala-Phe-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CC=CC=C1 ZBLQIYPCUWZSRZ-QEJZJMRPSA-N 0.000 description 1
- RMAWDDRDTRSZIR-ZLUOBGJFSA-N Ala-Ser-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O RMAWDDRDTRSZIR-ZLUOBGJFSA-N 0.000 description 1
- HOVPGJUNRLMIOZ-CIUDSAMLSA-N Ala-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](C)N HOVPGJUNRLMIOZ-CIUDSAMLSA-N 0.000 description 1
- PEEYDECOOVQKRZ-DLOVCJGASA-N Ala-Ser-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PEEYDECOOVQKRZ-DLOVCJGASA-N 0.000 description 1
- VRTOMXFZHGWHIJ-KZVJFYERSA-N Ala-Thr-Arg Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VRTOMXFZHGWHIJ-KZVJFYERSA-N 0.000 description 1
- IOFVWPYSRSCWHI-JXUBOQSCSA-N Ala-Thr-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](C)N IOFVWPYSRSCWHI-JXUBOQSCSA-N 0.000 description 1
- KTXKIYXZQFWJKB-VZFHVOOUSA-N Ala-Thr-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O KTXKIYXZQFWJKB-VZFHVOOUSA-N 0.000 description 1
- YXXPVUOMPSZURS-ZLIFDBKOSA-N Ala-Trp-Leu Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@H](C)N)=CNC2=C1 YXXPVUOMPSZURS-ZLIFDBKOSA-N 0.000 description 1
- QRIYOHQJRDHFKF-UWJYBYFXSA-N Ala-Tyr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=C(O)C=C1 QRIYOHQJRDHFKF-UWJYBYFXSA-N 0.000 description 1
- JPOQZCHGOTWRTM-FQPOAREZSA-N Ala-Tyr-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JPOQZCHGOTWRTM-FQPOAREZSA-N 0.000 description 1
- XSLGWYYNOSUMRM-ZKWXMUAHSA-N Ala-Val-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O XSLGWYYNOSUMRM-ZKWXMUAHSA-N 0.000 description 1
- XCIGOVDXZULBBV-DCAQKATOSA-N Ala-Val-Lys Chemical compound CC(C)[C@H](NC(=O)[C@H](C)N)C(=O)N[C@@H](CCCCN)C(O)=O XCIGOVDXZULBBV-DCAQKATOSA-N 0.000 description 1
- 102000007698 Alcohol dehydrogenase Human genes 0.000 description 1
- 108010021809 Alcohol dehydrogenase Proteins 0.000 description 1
- 108700028369 Alleles Proteins 0.000 description 1
- 108091093088 Amplicon Proteins 0.000 description 1
- 241000193457 Anaerocolumna aminovalerica Species 0.000 description 1
- MUXONAMCEUBVGA-DCAQKATOSA-N Arg-Arg-Gln Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(N)=O)C(O)=O MUXONAMCEUBVGA-DCAQKATOSA-N 0.000 description 1
- HJVGMOYJDDXLMI-AVGNSLFASA-N Arg-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)CCCNC(N)=N HJVGMOYJDDXLMI-AVGNSLFASA-N 0.000 description 1
- BVBKBQRPOJFCQM-DCAQKATOSA-N Arg-Asn-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O BVBKBQRPOJFCQM-DCAQKATOSA-N 0.000 description 1
- IIABBYGHLYWVOS-FXQIFTODSA-N Arg-Asn-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O IIABBYGHLYWVOS-FXQIFTODSA-N 0.000 description 1
- KMSHNDWHPWXPEC-BQBZGAKWSA-N Arg-Asp-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O KMSHNDWHPWXPEC-BQBZGAKWSA-N 0.000 description 1
- SQKPKIJVWHAWNF-DCAQKATOSA-N Arg-Asp-Lys Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(O)=O SQKPKIJVWHAWNF-DCAQKATOSA-N 0.000 description 1
- RRGPUNYIPJXJBU-GUBZILKMSA-N Arg-Asp-Met Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O RRGPUNYIPJXJBU-GUBZILKMSA-N 0.000 description 1
- TTXYKSADPSNOIF-IHRRRGAJSA-N Arg-Asp-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O TTXYKSADPSNOIF-IHRRRGAJSA-N 0.000 description 1
- PTVGLOCPAVYPFG-CIUDSAMLSA-N Arg-Gln-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O PTVGLOCPAVYPFG-CIUDSAMLSA-N 0.000 description 1
- JCAISGGAOQXEHJ-ZPFDUUQYSA-N Arg-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N JCAISGGAOQXEHJ-ZPFDUUQYSA-N 0.000 description 1
- LMPKCSXZJSXBBL-NHCYSSNCSA-N Arg-Gln-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O LMPKCSXZJSXBBL-NHCYSSNCSA-N 0.000 description 1
- QAODJPUKWNNNRP-DCAQKATOSA-N Arg-Glu-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O QAODJPUKWNNNRP-DCAQKATOSA-N 0.000 description 1
- RKRSYHCNPFGMTA-CIUDSAMLSA-N Arg-Glu-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O RKRSYHCNPFGMTA-CIUDSAMLSA-N 0.000 description 1
- MZRBYBIQTIKERR-GUBZILKMSA-N Arg-Glu-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MZRBYBIQTIKERR-GUBZILKMSA-N 0.000 description 1
- NKBQZKVMKJJDLX-SRVKXCTJSA-N Arg-Glu-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NKBQZKVMKJJDLX-SRVKXCTJSA-N 0.000 description 1
- RKQRHMKFNBYOTN-IHRRRGAJSA-N Arg-His-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N RKQRHMKFNBYOTN-IHRRRGAJSA-N 0.000 description 1
- LKDHUGLXOHYINY-XUXIUFHCSA-N Arg-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N LKDHUGLXOHYINY-XUXIUFHCSA-N 0.000 description 1
- LLUGJARLJCGLAR-CYDGBPFRSA-N Arg-Ile-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N LLUGJARLJCGLAR-CYDGBPFRSA-N 0.000 description 1
- DNUKXVMPARLPFN-XUXIUFHCSA-N Arg-Leu-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DNUKXVMPARLPFN-XUXIUFHCSA-N 0.000 description 1
- COXMUHNBYCVVRG-DCAQKATOSA-N Arg-Leu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O COXMUHNBYCVVRG-DCAQKATOSA-N 0.000 description 1
- YVTHEZNOKSAWRW-DCAQKATOSA-N Arg-Lys-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O YVTHEZNOKSAWRW-DCAQKATOSA-N 0.000 description 1
- SSZGOKWBHLOCHK-DCAQKATOSA-N Arg-Lys-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N SSZGOKWBHLOCHK-DCAQKATOSA-N 0.000 description 1
- MJINRRBEMOLJAK-DCAQKATOSA-N Arg-Lys-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N MJINRRBEMOLJAK-DCAQKATOSA-N 0.000 description 1
- CVXXSWQORBZAAA-SRVKXCTJSA-N Arg-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N CVXXSWQORBZAAA-SRVKXCTJSA-N 0.000 description 1
- BNYNOWJESJJIOI-XUXIUFHCSA-N Arg-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCN=C(N)N)N BNYNOWJESJJIOI-XUXIUFHCSA-N 0.000 description 1
- NPAVRDPEFVKELR-DCAQKATOSA-N Arg-Lys-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O NPAVRDPEFVKELR-DCAQKATOSA-N 0.000 description 1
- PAPSMOYMQDWIOR-AVGNSLFASA-N Arg-Lys-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PAPSMOYMQDWIOR-AVGNSLFASA-N 0.000 description 1
- PYZPXCZNQSEHDT-GUBZILKMSA-N Arg-Met-Asn Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N PYZPXCZNQSEHDT-GUBZILKMSA-N 0.000 description 1
- KSUALAGYYLQSHJ-RCWTZXSCSA-N Arg-Met-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KSUALAGYYLQSHJ-RCWTZXSCSA-N 0.000 description 1
- CZUHPNLXLWMYMG-UBHSHLNASA-N Arg-Phe-Ala Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=CC=C1 CZUHPNLXLWMYMG-UBHSHLNASA-N 0.000 description 1
- YTMKMRSYXHBGER-IHRRRGAJSA-N Arg-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N YTMKMRSYXHBGER-IHRRRGAJSA-N 0.000 description 1
- BSGSDLYGGHGMND-IHRRRGAJSA-N Arg-Phe-Cys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N BSGSDLYGGHGMND-IHRRRGAJSA-N 0.000 description 1
- NIELFHOLFTUZME-HJWJTTGWSA-N Arg-Phe-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NIELFHOLFTUZME-HJWJTTGWSA-N 0.000 description 1
- UIUXXFIKWQVMEX-UFYCRDLUSA-N Arg-Phe-Tyr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O UIUXXFIKWQVMEX-UFYCRDLUSA-N 0.000 description 1
- NGYHSXDNNOFHNE-AVGNSLFASA-N Arg-Pro-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O NGYHSXDNNOFHNE-AVGNSLFASA-N 0.000 description 1
- ADPACBMPYWJJCE-FXQIFTODSA-N Arg-Ser-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O ADPACBMPYWJJCE-FXQIFTODSA-N 0.000 description 1
- FRBAHXABMQXSJQ-FXQIFTODSA-N Arg-Ser-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O FRBAHXABMQXSJQ-FXQIFTODSA-N 0.000 description 1
- BECXEHHOZNFFFX-IHRRRGAJSA-N Arg-Ser-Tyr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O BECXEHHOZNFFFX-IHRRRGAJSA-N 0.000 description 1
- XMGVWQWEWWULNS-BPUTZDHNSA-N Arg-Trp-Ser Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N XMGVWQWEWWULNS-BPUTZDHNSA-N 0.000 description 1
- CGWVCWFQGXOUSJ-ULQDDVLXSA-N Arg-Tyr-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O CGWVCWFQGXOUSJ-ULQDDVLXSA-N 0.000 description 1
- FMYQECOAIFGQGU-CYDGBPFRSA-N Arg-Val-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FMYQECOAIFGQGU-CYDGBPFRSA-N 0.000 description 1
- 239000004475 Arginine Substances 0.000 description 1
- XWGJDUSDTRPQRK-ZLUOBGJFSA-N Asn-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC(N)=O XWGJDUSDTRPQRK-ZLUOBGJFSA-N 0.000 description 1
- VDCIPFYVCICPEC-FXQIFTODSA-N Asn-Arg-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O VDCIPFYVCICPEC-FXQIFTODSA-N 0.000 description 1
- MEFGKQUUYZOLHM-GMOBBJLQSA-N Asn-Arg-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MEFGKQUUYZOLHM-GMOBBJLQSA-N 0.000 description 1
- IOTKDTZEEBZNCM-UGYAYLCHSA-N Asn-Asn-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O IOTKDTZEEBZNCM-UGYAYLCHSA-N 0.000 description 1
- VKCOHFFSTKCXEQ-OLHMAJIHSA-N Asn-Asn-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VKCOHFFSTKCXEQ-OLHMAJIHSA-N 0.000 description 1
- WVCJSDCHTUTONA-FXQIFTODSA-N Asn-Asp-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O WVCJSDCHTUTONA-FXQIFTODSA-N 0.000 description 1
- XVVOVPFMILMHPX-ZLUOBGJFSA-N Asn-Asp-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O XVVOVPFMILMHPX-ZLUOBGJFSA-N 0.000 description 1
- ZWASIOHRQWRWAS-UGYAYLCHSA-N Asn-Asp-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ZWASIOHRQWRWAS-UGYAYLCHSA-N 0.000 description 1
- UGXVKHRDGLYFKR-CIUDSAMLSA-N Asn-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(N)=O UGXVKHRDGLYFKR-CIUDSAMLSA-N 0.000 description 1
- QISZHYWZHJRDAO-CIUDSAMLSA-N Asn-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N QISZHYWZHJRDAO-CIUDSAMLSA-N 0.000 description 1
- PAXHINASXXXILC-SRVKXCTJSA-N Asn-Asp-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N)O PAXHINASXXXILC-SRVKXCTJSA-N 0.000 description 1
- FJIRXKVEDFLLOQ-SRVKXCTJSA-N Asn-Cys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC(=O)N)N FJIRXKVEDFLLOQ-SRVKXCTJSA-N 0.000 description 1
- SJPZTWAYTJPPBI-GUBZILKMSA-N Asn-Gln-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)N)N SJPZTWAYTJPPBI-GUBZILKMSA-N 0.000 description 1
- WPOLSNAQGVHROR-GUBZILKMSA-N Asn-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)N)N WPOLSNAQGVHROR-GUBZILKMSA-N 0.000 description 1
- HCAUEJAQCXVQQM-ACZMJKKPSA-N Asn-Glu-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HCAUEJAQCXVQQM-ACZMJKKPSA-N 0.000 description 1
- QYXNFROWLZPWPC-FXQIFTODSA-N Asn-Glu-Gln Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O QYXNFROWLZPWPC-FXQIFTODSA-N 0.000 description 1
- BZMWJLLUAKSIMH-FXQIFTODSA-N Asn-Glu-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O BZMWJLLUAKSIMH-FXQIFTODSA-N 0.000 description 1
- JREOBWLIZLXRIS-GUBZILKMSA-N Asn-Glu-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JREOBWLIZLXRIS-GUBZILKMSA-N 0.000 description 1
- ASCGFDYEKSRNPL-CIUDSAMLSA-N Asn-Glu-Met Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O ASCGFDYEKSRNPL-CIUDSAMLSA-N 0.000 description 1
- GFFRWIJAFFMQGM-NUMRIWBASA-N Asn-Glu-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GFFRWIJAFFMQGM-NUMRIWBASA-N 0.000 description 1
- JQSWHKKUZMTOIH-QWRGUYRKSA-N Asn-Gly-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N JQSWHKKUZMTOIH-QWRGUYRKSA-N 0.000 description 1
- FTCGGKNCJZOPNB-WHFBIAKZSA-N Asn-Gly-Ser Chemical compound NC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O FTCGGKNCJZOPNB-WHFBIAKZSA-N 0.000 description 1
- VXLBDJWTONZHJN-YUMQZZPRSA-N Asn-His-Gly Chemical compound C1=C(NC=N1)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC(=O)N)N VXLBDJWTONZHJN-YUMQZZPRSA-N 0.000 description 1
- GQRDIVQPSMPQME-ZPFDUUQYSA-N Asn-Ile-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O GQRDIVQPSMPQME-ZPFDUUQYSA-N 0.000 description 1
- GOKCTAJWRPSCHP-VHWLVUOQSA-N Asn-Ile-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC(=O)N)N GOKCTAJWRPSCHP-VHWLVUOQSA-N 0.000 description 1
- JQBCANGGAVVERB-CFMVVWHZSA-N Asn-Ile-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N JQBCANGGAVVERB-CFMVVWHZSA-N 0.000 description 1
- SPCONPVIDFMDJI-QSFUFRPTSA-N Asn-Ile-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O SPCONPVIDFMDJI-QSFUFRPTSA-N 0.000 description 1
- BXUHCIXDSWRSBS-CIUDSAMLSA-N Asn-Leu-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O BXUHCIXDSWRSBS-CIUDSAMLSA-N 0.000 description 1
- GLWFAWNYGWBMOC-SRVKXCTJSA-N Asn-Leu-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O GLWFAWNYGWBMOC-SRVKXCTJSA-N 0.000 description 1
- JEEFEQCRXKPQHC-KKUMJFAQSA-N Asn-Leu-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JEEFEQCRXKPQHC-KKUMJFAQSA-N 0.000 description 1
- DJIMLSXHXKWADV-CIUDSAMLSA-N Asn-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(N)=O DJIMLSXHXKWADV-CIUDSAMLSA-N 0.000 description 1
- TZFQICWZWFNIKU-KKUMJFAQSA-N Asn-Leu-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 TZFQICWZWFNIKU-KKUMJFAQSA-N 0.000 description 1
- RCFGLXMZDYNRSC-CIUDSAMLSA-N Asn-Lys-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O RCFGLXMZDYNRSC-CIUDSAMLSA-N 0.000 description 1
- KHCNTVRVAYCPQE-CIUDSAMLSA-N Asn-Lys-Asn Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O KHCNTVRVAYCPQE-CIUDSAMLSA-N 0.000 description 1
- NYGILGUOUOXGMJ-YUMQZZPRSA-N Asn-Lys-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O NYGILGUOUOXGMJ-YUMQZZPRSA-N 0.000 description 1
- JWKDQOORUCYUIW-ZPFDUUQYSA-N Asn-Lys-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JWKDQOORUCYUIW-ZPFDUUQYSA-N 0.000 description 1
- ORJQQZIXTOYGGH-SRVKXCTJSA-N Asn-Lys-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O ORJQQZIXTOYGGH-SRVKXCTJSA-N 0.000 description 1
- VOGCFWDZYYTEOY-DCAQKATOSA-N Asn-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)N)N VOGCFWDZYYTEOY-DCAQKATOSA-N 0.000 description 1
- AYOAHKWVQLNPDM-HJGDQZAQSA-N Asn-Lys-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O AYOAHKWVQLNPDM-HJGDQZAQSA-N 0.000 description 1
- LSJQOMAZIKQMTJ-SRVKXCTJSA-N Asn-Phe-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O LSJQOMAZIKQMTJ-SRVKXCTJSA-N 0.000 description 1
- ZVUMKOMKQCANOM-AVGNSLFASA-N Asn-Phe-Gln Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZVUMKOMKQCANOM-AVGNSLFASA-N 0.000 description 1
- RAUPFUCUDBQYHE-AVGNSLFASA-N Asn-Phe-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O RAUPFUCUDBQYHE-AVGNSLFASA-N 0.000 description 1
- HZZIFFOVHLWGCS-KKUMJFAQSA-N Asn-Phe-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O HZZIFFOVHLWGCS-KKUMJFAQSA-N 0.000 description 1
- RVHGJNGNKGDCPX-KKUMJFAQSA-N Asn-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N RVHGJNGNKGDCPX-KKUMJFAQSA-N 0.000 description 1
- YUUIAUXBNOHFRJ-IHRRRGAJSA-N Asn-Phe-Met Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(O)=O YUUIAUXBNOHFRJ-IHRRRGAJSA-N 0.000 description 1
- YUOXLJYVSZYPBJ-CIUDSAMLSA-N Asn-Pro-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O YUOXLJYVSZYPBJ-CIUDSAMLSA-N 0.000 description 1
- AWXDRZJQCVHCIT-DCAQKATOSA-N Asn-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC(N)=O AWXDRZJQCVHCIT-DCAQKATOSA-N 0.000 description 1
- GMUOCGCDOYYWPD-FXQIFTODSA-N Asn-Pro-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O GMUOCGCDOYYWPD-FXQIFTODSA-N 0.000 description 1
- MKJBPDLENBUHQU-CIUDSAMLSA-N Asn-Ser-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O MKJBPDLENBUHQU-CIUDSAMLSA-N 0.000 description 1
- VLDRQOHCMKCXLY-SRVKXCTJSA-N Asn-Ser-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O VLDRQOHCMKCXLY-SRVKXCTJSA-N 0.000 description 1
- MYTHOBCLNIOFBL-SRVKXCTJSA-N Asn-Ser-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MYTHOBCLNIOFBL-SRVKXCTJSA-N 0.000 description 1
- HCZQKHSRYHCPSD-IUKAMOBKSA-N Asn-Thr-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HCZQKHSRYHCPSD-IUKAMOBKSA-N 0.000 description 1
- WUQXMTITJLFXAU-JIOCBJNQSA-N Asn-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N)O WUQXMTITJLFXAU-JIOCBJNQSA-N 0.000 description 1
- BCADFFUQHIMQAA-KKHAAJSZSA-N Asn-Thr-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O BCADFFUQHIMQAA-KKHAAJSZSA-N 0.000 description 1
- FLJVGAFLZVBBNG-BPUTZDHNSA-N Asn-Trp-Arg Chemical compound N[C@@H](CC(=O)N)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@@H](CCCNC(=N)N)C(=O)O FLJVGAFLZVBBNG-BPUTZDHNSA-N 0.000 description 1
- ATHZHGQSAIJHQU-XIRDDKMYSA-N Asn-Trp-Lys Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N ATHZHGQSAIJHQU-XIRDDKMYSA-N 0.000 description 1
- KTDWFWNZLLFEFU-KKUMJFAQSA-N Asn-Tyr-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O KTDWFWNZLLFEFU-KKUMJFAQSA-N 0.000 description 1
- BEHQTVDBCLSCBY-CFMVVWHZSA-N Asn-Tyr-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BEHQTVDBCLSCBY-CFMVVWHZSA-N 0.000 description 1
- CGYKCTPUGXFPMG-IHPCNDPISA-N Asn-Tyr-Trp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O CGYKCTPUGXFPMG-IHPCNDPISA-N 0.000 description 1
- DPSUVAPLRQDWAO-YDHLFZDLSA-N Asn-Tyr-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC(=O)N)N DPSUVAPLRQDWAO-YDHLFZDLSA-N 0.000 description 1
- WQAOZCVOOYUWKG-LSJOCFKGSA-N Asn-Val-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CC(=O)N)N WQAOZCVOOYUWKG-LSJOCFKGSA-N 0.000 description 1
- KDFQZBWWPYQBEN-ZLUOBGJFSA-N Asp-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N KDFQZBWWPYQBEN-ZLUOBGJFSA-N 0.000 description 1
- XBQSLMACWDXWLJ-GHCJXIJMSA-N Asp-Ala-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XBQSLMACWDXWLJ-GHCJXIJMSA-N 0.000 description 1
- BLQBMRNMBAYREH-UWJYBYFXSA-N Asp-Ala-Tyr Chemical compound N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O BLQBMRNMBAYREH-UWJYBYFXSA-N 0.000 description 1
- WSOKZUVWBXVJHX-CIUDSAMLSA-N Asp-Arg-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O WSOKZUVWBXVJHX-CIUDSAMLSA-N 0.000 description 1
- AXXCUABIFZPKPM-BQBZGAKWSA-N Asp-Arg-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O AXXCUABIFZPKPM-BQBZGAKWSA-N 0.000 description 1
- MFMJRYHVLLEMQM-DCAQKATOSA-N Asp-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)O)N MFMJRYHVLLEMQM-DCAQKATOSA-N 0.000 description 1
- BUVNWKQBMZLCDW-UGYAYLCHSA-N Asp-Asn-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BUVNWKQBMZLCDW-UGYAYLCHSA-N 0.000 description 1
- RDRMWJBLOSRRAW-BYULHYEWSA-N Asp-Asn-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O RDRMWJBLOSRRAW-BYULHYEWSA-N 0.000 description 1
- JGDBHIVECJGXJA-FXQIFTODSA-N Asp-Asp-Arg Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JGDBHIVECJGXJA-FXQIFTODSA-N 0.000 description 1
- TVVYVAUGRHNTGT-UGYAYLCHSA-N Asp-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O TVVYVAUGRHNTGT-UGYAYLCHSA-N 0.000 description 1
- SBHUBSDEZQFJHJ-CIUDSAMLSA-N Asp-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O SBHUBSDEZQFJHJ-CIUDSAMLSA-N 0.000 description 1
- CELPEWWLSXMVPH-CIUDSAMLSA-N Asp-Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O CELPEWWLSXMVPH-CIUDSAMLSA-N 0.000 description 1
- QXHVOUSPVAWEMX-ZLUOBGJFSA-N Asp-Asp-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O QXHVOUSPVAWEMX-ZLUOBGJFSA-N 0.000 description 1
- DXQOQMCLWWADMU-ACZMJKKPSA-N Asp-Gln-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O DXQOQMCLWWADMU-ACZMJKKPSA-N 0.000 description 1
- VAWNQIGQPUOPQW-ACZMJKKPSA-N Asp-Glu-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O VAWNQIGQPUOPQW-ACZMJKKPSA-N 0.000 description 1
- VILLWIDTHYPSLC-PEFMBERDSA-N Asp-Glu-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VILLWIDTHYPSLC-PEFMBERDSA-N 0.000 description 1
- KHBLRHKVXICFMY-GUBZILKMSA-N Asp-Glu-Lys Chemical compound N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O KHBLRHKVXICFMY-GUBZILKMSA-N 0.000 description 1
- LTXGDRFJRZSZAV-CIUDSAMLSA-N Asp-Glu-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)O)N LTXGDRFJRZSZAV-CIUDSAMLSA-N 0.000 description 1
- RRKCPMGSRIDLNC-AVGNSLFASA-N Asp-Glu-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O RRKCPMGSRIDLNC-AVGNSLFASA-N 0.000 description 1
- NRIFEOUAFLTMFJ-AAEUAGOBSA-N Asp-Gly-Trp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O NRIFEOUAFLTMFJ-AAEUAGOBSA-N 0.000 description 1
- PGUYEUCYVNZGGV-QWRGUYRKSA-N Asp-Gly-Tyr Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 PGUYEUCYVNZGGV-QWRGUYRKSA-N 0.000 description 1
- TVIZQBFURPLQDV-DJFWLOJKSA-N Asp-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC(=O)O)N TVIZQBFURPLQDV-DJFWLOJKSA-N 0.000 description 1
- SEMWSADZTMJELF-BYULHYEWSA-N Asp-Ile-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O SEMWSADZTMJELF-BYULHYEWSA-N 0.000 description 1
- KYQNAIMCTRZLNP-QSFUFRPTSA-N Asp-Ile-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O KYQNAIMCTRZLNP-QSFUFRPTSA-N 0.000 description 1
- JNNVNVRBYUJYGS-CIUDSAMLSA-N Asp-Leu-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O JNNVNVRBYUJYGS-CIUDSAMLSA-N 0.000 description 1
- CLUMZOKVGUWUFD-CIUDSAMLSA-N Asp-Leu-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O CLUMZOKVGUWUFD-CIUDSAMLSA-N 0.000 description 1
- MYOHQBFRJQFIDZ-KKUMJFAQSA-N Asp-Leu-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MYOHQBFRJQFIDZ-KKUMJFAQSA-N 0.000 description 1
- XWSIYTYNLKCLJB-CIUDSAMLSA-N Asp-Lys-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O XWSIYTYNLKCLJB-CIUDSAMLSA-N 0.000 description 1
- LBOVBQONZJRWPV-YUMQZZPRSA-N Asp-Lys-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LBOVBQONZJRWPV-YUMQZZPRSA-N 0.000 description 1
- MYLZFUMPZCPJCJ-NHCYSSNCSA-N Asp-Lys-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O MYLZFUMPZCPJCJ-NHCYSSNCSA-N 0.000 description 1
- WWOYXVBGHAHQBG-FXQIFTODSA-N Asp-Met-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(O)=O WWOYXVBGHAHQBG-FXQIFTODSA-N 0.000 description 1
- BPTFNDRZKBFMTH-DCAQKATOSA-N Asp-Met-Lys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N BPTFNDRZKBFMTH-DCAQKATOSA-N 0.000 description 1
- PCJOFZYFFMBZKC-PCBIJLKTSA-N Asp-Phe-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PCJOFZYFFMBZKC-PCBIJLKTSA-N 0.000 description 1
- RPUYTJJZXQBWDT-SRVKXCTJSA-N Asp-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC(=O)O)N RPUYTJJZXQBWDT-SRVKXCTJSA-N 0.000 description 1
- HJZLUGQGJWXJCJ-CIUDSAMLSA-N Asp-Pro-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O HJZLUGQGJWXJCJ-CIUDSAMLSA-N 0.000 description 1
- RVMXMLSYBTXCAV-VEVYYDQMSA-N Asp-Pro-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O RVMXMLSYBTXCAV-VEVYYDQMSA-N 0.000 description 1
- CUQDCPXNZPDYFQ-ZLUOBGJFSA-N Asp-Ser-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O CUQDCPXNZPDYFQ-ZLUOBGJFSA-N 0.000 description 1
- DRCOAZZDQRCGGP-GHCJXIJMSA-N Asp-Ser-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DRCOAZZDQRCGGP-GHCJXIJMSA-N 0.000 description 1
- QSFHZPQUAAQHAQ-CIUDSAMLSA-N Asp-Ser-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O QSFHZPQUAAQHAQ-CIUDSAMLSA-N 0.000 description 1
- OFYVKOXTTDCUIL-FXQIFTODSA-N Asp-Ser-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)O)N OFYVKOXTTDCUIL-FXQIFTODSA-N 0.000 description 1
- IQCJOIHDVFJQFV-LKXGYXEUSA-N Asp-Thr-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O IQCJOIHDVFJQFV-LKXGYXEUSA-N 0.000 description 1
- KBJVTFWQWXCYCQ-IUKAMOBKSA-N Asp-Thr-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KBJVTFWQWXCYCQ-IUKAMOBKSA-N 0.000 description 1
- JDDYEZGPYBBPBN-JRQIVUDYSA-N Asp-Thr-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JDDYEZGPYBBPBN-JRQIVUDYSA-N 0.000 description 1
- CXEFNHOVIIDHFU-IHPCNDPISA-N Asp-Trp-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)NC(=O)[C@H](CC(=O)O)N CXEFNHOVIIDHFU-IHPCNDPISA-N 0.000 description 1
- LEYKQPDPZJIRTA-AQZXSJQPSA-N Asp-Trp-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LEYKQPDPZJIRTA-AQZXSJQPSA-N 0.000 description 1
- USENATHVGFXRNO-SRVKXCTJSA-N Asp-Tyr-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=C(O)C=C1 USENATHVGFXRNO-SRVKXCTJSA-N 0.000 description 1
- BJDHEININLSZOT-KKUMJFAQSA-N Asp-Tyr-Lys Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(O)=O BJDHEININLSZOT-KKUMJFAQSA-N 0.000 description 1
- SQIARYGNVQWOSB-BZSNNMDCSA-N Asp-Tyr-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SQIARYGNVQWOSB-BZSNNMDCSA-N 0.000 description 1
- CZIVKMOEXPILDK-SRVKXCTJSA-N Asp-Tyr-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O CZIVKMOEXPILDK-SRVKXCTJSA-N 0.000 description 1
- WAEDSQFVZJUHLI-BYULHYEWSA-N Asp-Val-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O WAEDSQFVZJUHLI-BYULHYEWSA-N 0.000 description 1
- GGBQDSHTXKQSLP-NHCYSSNCSA-N Asp-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N GGBQDSHTXKQSLP-NHCYSSNCSA-N 0.000 description 1
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 1
- 101150076489 B gene Proteins 0.000 description 1
- 241000304886 Bacilli Species 0.000 description 1
- 235000014469 Bacillus subtilis Nutrition 0.000 description 1
- 101100449954 Bacillus subtilis (strain 168) eglS gene Proteins 0.000 description 1
- 101100174784 Bacillus subtilis (strain 168) ganR gene Proteins 0.000 description 1
- 101100299607 Bacillus subtilis (strain 168) licA gene Proteins 0.000 description 1
- 108010040467 CRISPR-Associated Proteins Proteins 0.000 description 1
- 101100512078 Caenorhabditis elegans lys-1 gene Proteins 0.000 description 1
- 241001453245 Campylobacter jejuni subsp. jejuni Species 0.000 description 1
- 241001297667 Candidatus Wallbacteria Species 0.000 description 1
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 1
- 108091092236 Chimeric RNA Proteins 0.000 description 1
- 241000423301 Clostridioides difficile 630 Species 0.000 description 1
- 241001656810 Clostridium aceticum Species 0.000 description 1
- 241000423302 Clostridium acetobutylicum ATCC 824 Species 0.000 description 1
- 241001110912 Clostridium beijerinckii NCIMB 8052 Species 0.000 description 1
- 206010009657 Clostridium difficile colitis Diseases 0.000 description 1
- 241000186566 Clostridium ljungdahlii Species 0.000 description 1
- 241000186581 Clostridium novyi Species 0.000 description 1
- 241000193469 Clostridium pasteurianum Species 0.000 description 1
- 101100166169 Clostridium perfringens catQ gene Proteins 0.000 description 1
- 241001147704 Clostridium puniceum Species 0.000 description 1
- 241000186587 Clostridium scatologenes Species 0.000 description 1
- 241000193449 Clostridium tetani Species 0.000 description 1
- 108700010070 Codon Usage Proteins 0.000 description 1
- KLLFLHBKSJAUMZ-ACZMJKKPSA-N Cys-Asn-Glu Chemical compound C(CC(=O)O)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CS)N KLLFLHBKSJAUMZ-ACZMJKKPSA-N 0.000 description 1
- NIPJKKSXHSBEMX-CIUDSAMLSA-N Cys-Asp-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CS)N NIPJKKSXHSBEMX-CIUDSAMLSA-N 0.000 description 1
- BSFFNUBDVYTDMV-WHFBIAKZSA-N Cys-Gly-Asn Chemical compound [H]N[C@@H](CS)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O BSFFNUBDVYTDMV-WHFBIAKZSA-N 0.000 description 1
- URDUGPGPLNXXES-WHFBIAKZSA-N Cys-Gly-Cys Chemical compound SC[C@H](N)C(=O)NCC(=O)N[C@@H](CS)C(O)=O URDUGPGPLNXXES-WHFBIAKZSA-N 0.000 description 1
- LYSHSHHDBVKJRN-JBDRJPRFSA-N Cys-Ile-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](CS)N LYSHSHHDBVKJRN-JBDRJPRFSA-N 0.000 description 1
- QQOWCDCBFFBRQH-IXOXFDKPSA-N Cys-Phe-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CS)N)O QQOWCDCBFFBRQH-IXOXFDKPSA-N 0.000 description 1
- JEKIARHEWURQRJ-BZSNNMDCSA-N Cys-Phe-Tyr Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)NC(=O)[C@H](CS)N JEKIARHEWURQRJ-BZSNNMDCSA-N 0.000 description 1
- RJPKQCFHEPPTGL-ZLUOBGJFSA-N Cys-Ser-Asp Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O RJPKQCFHEPPTGL-ZLUOBGJFSA-N 0.000 description 1
- NDNZRWUDUMTITL-FXQIFTODSA-N Cys-Ser-Val Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O NDNZRWUDUMTITL-FXQIFTODSA-N 0.000 description 1
- IRKLTAKLAFUTLA-KATARQTJSA-N Cys-Thr-Lys Chemical compound C[C@@H](O)[C@H](NC(=O)[C@@H](N)CS)C(=O)N[C@@H](CCCCN)C(O)=O IRKLTAKLAFUTLA-KATARQTJSA-N 0.000 description 1
- SRBFZHDQGSBBOR-SOOFDHNKSA-N D-ribopyranose Chemical compound O[C@@H]1COC(O)[C@H](O)[C@@H]1O SRBFZHDQGSBBOR-SOOFDHNKSA-N 0.000 description 1
- 108010084276 DNA modification methylase EcoKI Proteins 0.000 description 1
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 1
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 1
- 241001584242 Desnuesiella massiliensis Species 0.000 description 1
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 1
- 101150110799 ETFA gene Proteins 0.000 description 1
- 101150046595 ETFB gene Proteins 0.000 description 1
- YQYJSBFKSSDGFO-UHFFFAOYSA-N Epihygromycin Natural products OC1C(O)C(C(=O)C)OC1OC(C(=C1)O)=CC=C1C=C(C)C(=O)NC1C(O)C(O)C2OCOC2C1O YQYJSBFKSSDGFO-UHFFFAOYSA-N 0.000 description 1
- 101100031701 Escherichia coli (strain K12) bglF gene Proteins 0.000 description 1
- 101100218687 Escherichia coli (strain K12) bglG gene Proteins 0.000 description 1
- 101100125338 Escherichia coli (strain K12) hypF gene Proteins 0.000 description 1
- 241000282326 Felis catus Species 0.000 description 1
- 206010016952 Food poisoning Diseases 0.000 description 1
- 208000019331 Foodborne disease Diseases 0.000 description 1
- 241000589599 Francisella tularensis subsp. novicida Species 0.000 description 1
- 101150106478 GPS1 gene Proteins 0.000 description 1
- 201000000628 Gas Gangrene Diseases 0.000 description 1
- PHZYLYASFWHLHJ-FXQIFTODSA-N Gln-Asn-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O PHZYLYASFWHLHJ-FXQIFTODSA-N 0.000 description 1
- ZPDVKYLJTOFQJV-WDSKDSINSA-N Gln-Asn-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O ZPDVKYLJTOFQJV-WDSKDSINSA-N 0.000 description 1
- AAOBFSKXAVIORT-GUBZILKMSA-N Gln-Asn-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O AAOBFSKXAVIORT-GUBZILKMSA-N 0.000 description 1
- WQWMZOIPXWSZNE-WDSKDSINSA-N Gln-Asp-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O WQWMZOIPXWSZNE-WDSKDSINSA-N 0.000 description 1
- XEYMBRRKIFYQMF-GUBZILKMSA-N Gln-Asp-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O XEYMBRRKIFYQMF-GUBZILKMSA-N 0.000 description 1
- KVXVVDFOZNYYKZ-DCAQKATOSA-N Gln-Gln-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O KVXVVDFOZNYYKZ-DCAQKATOSA-N 0.000 description 1
- SNLOOPZHAQDMJG-CIUDSAMLSA-N Gln-Glu-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SNLOOPZHAQDMJG-CIUDSAMLSA-N 0.000 description 1
- KCJJFESQRXGTGC-BQBZGAKWSA-N Gln-Glu-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O KCJJFESQRXGTGC-BQBZGAKWSA-N 0.000 description 1
- KDXKFBSNIJYNNR-YVNDNENWSA-N Gln-Glu-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KDXKFBSNIJYNNR-YVNDNENWSA-N 0.000 description 1
- LFIVHGMKWFGUGK-IHRRRGAJSA-N Gln-Glu-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N LFIVHGMKWFGUGK-IHRRRGAJSA-N 0.000 description 1
- WVUZERSNWGUKJY-BPUTZDHNSA-N Gln-Glu-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N WVUZERSNWGUKJY-BPUTZDHNSA-N 0.000 description 1
- YXQCLIVLWCKCRS-RYUDHWBXSA-N Gln-Gly-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)N)N)O YXQCLIVLWCKCRS-RYUDHWBXSA-N 0.000 description 1
- NNXIQPMZGZUFJJ-AVGNSLFASA-N Gln-His-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N NNXIQPMZGZUFJJ-AVGNSLFASA-N 0.000 description 1
- GIVHPCWYVWUUSG-HVTMNAMFSA-N Gln-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCC(=O)N)N GIVHPCWYVWUUSG-HVTMNAMFSA-N 0.000 description 1
- PSERKXGRRADTKA-MNXVOIDGSA-N Gln-Leu-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PSERKXGRRADTKA-MNXVOIDGSA-N 0.000 description 1
- IULKWYSYZSURJK-AVGNSLFASA-N Gln-Leu-Lys Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O IULKWYSYZSURJK-AVGNSLFASA-N 0.000 description 1
- ZBKUIQNCRIYVGH-SDDRHHMPSA-N Gln-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)N)N ZBKUIQNCRIYVGH-SDDRHHMPSA-N 0.000 description 1
- MLSKFHLRFVGNLL-WDCWCFNPSA-N Gln-Leu-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MLSKFHLRFVGNLL-WDCWCFNPSA-N 0.000 description 1
- JRHPEMVLTRADLJ-AVGNSLFASA-N Gln-Lys-Lys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N JRHPEMVLTRADLJ-AVGNSLFASA-N 0.000 description 1
- ZEEPYMXTJWIMSN-GUBZILKMSA-N Gln-Lys-Ser Chemical compound NCCCC[C@@H](C(=O)N[C@@H](CO)C(O)=O)NC(=O)[C@@H](N)CCC(N)=O ZEEPYMXTJWIMSN-GUBZILKMSA-N 0.000 description 1
- CELXWPDNIGWCJN-WDCWCFNPSA-N Gln-Lys-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CELXWPDNIGWCJN-WDCWCFNPSA-N 0.000 description 1
- LUGUNEGJNDEBLU-DCAQKATOSA-N Gln-Met-Arg Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N LUGUNEGJNDEBLU-DCAQKATOSA-N 0.000 description 1
- WHVLABLIJYGVEK-QEWYBTABSA-N Gln-Phe-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WHVLABLIJYGVEK-QEWYBTABSA-N 0.000 description 1
- OTQSTOXRUBVWAP-NRPADANISA-N Gln-Ser-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O OTQSTOXRUBVWAP-NRPADANISA-N 0.000 description 1
- PAOHIZNRJNIXQY-XQXXSGGOSA-N Gln-Thr-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O PAOHIZNRJNIXQY-XQXXSGGOSA-N 0.000 description 1
- OEIDWQHTRYEYGG-QEJZJMRPSA-N Gln-Trp-Asp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N OEIDWQHTRYEYGG-QEJZJMRPSA-N 0.000 description 1
- WTJIWXMJESRHMM-XDTLVQLUSA-N Gln-Tyr-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O WTJIWXMJESRHMM-XDTLVQLUSA-N 0.000 description 1
- QZQYITIKPAUDGN-GVXVVHGQSA-N Gln-Val-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCC(=O)N)N QZQYITIKPAUDGN-GVXVVHGQSA-N 0.000 description 1
- RUFHOVYUYSNDNY-ACZMJKKPSA-N Glu-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O RUFHOVYUYSNDNY-ACZMJKKPSA-N 0.000 description 1
- LKDIBBOKUAASNP-FXQIFTODSA-N Glu-Ala-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O LKDIBBOKUAASNP-FXQIFTODSA-N 0.000 description 1
- JJKKWYQVHRUSDG-GUBZILKMSA-N Glu-Ala-Lys Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O JJKKWYQVHRUSDG-GUBZILKMSA-N 0.000 description 1
- YKLNMGJYMNPBCP-ACZMJKKPSA-N Glu-Asn-Asp Chemical compound C(CC(=O)O)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YKLNMGJYMNPBCP-ACZMJKKPSA-N 0.000 description 1
- YYOBUPFZLKQUAX-FXQIFTODSA-N Glu-Asn-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O YYOBUPFZLKQUAX-FXQIFTODSA-N 0.000 description 1
- RDPOETHPAQEGDP-ACZMJKKPSA-N Glu-Asp-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O RDPOETHPAQEGDP-ACZMJKKPSA-N 0.000 description 1
- QPRZKNOOOBWXSU-CIUDSAMLSA-N Glu-Asp-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N QPRZKNOOOBWXSU-CIUDSAMLSA-N 0.000 description 1
- NTBDVNJIWCKURJ-ACZMJKKPSA-N Glu-Asp-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O NTBDVNJIWCKURJ-ACZMJKKPSA-N 0.000 description 1
- DSPQRJXOIXHOHK-WDSKDSINSA-N Glu-Asp-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O DSPQRJXOIXHOHK-WDSKDSINSA-N 0.000 description 1
- HJIFPJUEOGZWRI-GUBZILKMSA-N Glu-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)O)N HJIFPJUEOGZWRI-GUBZILKMSA-N 0.000 description 1
- PAQUJCSYVIBPLC-AVGNSLFASA-N Glu-Asp-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 PAQUJCSYVIBPLC-AVGNSLFASA-N 0.000 description 1
- WATXSTJXNBOHKD-LAEOZQHASA-N Glu-Asp-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O WATXSTJXNBOHKD-LAEOZQHASA-N 0.000 description 1
- ZXQPJYWZSFGWJB-AVGNSLFASA-N Glu-Cys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCC(=O)O)N ZXQPJYWZSFGWJB-AVGNSLFASA-N 0.000 description 1
- LVCHEMOPBORRLB-DCAQKATOSA-N Glu-Gln-Lys Chemical compound NCCCC[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCC(O)=O)C(O)=O LVCHEMOPBORRLB-DCAQKATOSA-N 0.000 description 1
- WPLGNDORMXTMQS-FXQIFTODSA-N Glu-Gln-Ser Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O WPLGNDORMXTMQS-FXQIFTODSA-N 0.000 description 1
- VFZIDQZAEBORGY-GLLZPBPUSA-N Glu-Gln-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VFZIDQZAEBORGY-GLLZPBPUSA-N 0.000 description 1
- SJPMNHCEWPTRBR-BQBZGAKWSA-N Glu-Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SJPMNHCEWPTRBR-BQBZGAKWSA-N 0.000 description 1
- MUSGDMDGNGXULI-DCAQKATOSA-N Glu-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O MUSGDMDGNGXULI-DCAQKATOSA-N 0.000 description 1
- LGYZYFFDELZWRS-DCAQKATOSA-N Glu-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O LGYZYFFDELZWRS-DCAQKATOSA-N 0.000 description 1
- IQACOVZVOMVILH-FXQIFTODSA-N Glu-Glu-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O IQACOVZVOMVILH-FXQIFTODSA-N 0.000 description 1
- PHONAZGUEGIOEM-GLLZPBPUSA-N Glu-Glu-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PHONAZGUEGIOEM-GLLZPBPUSA-N 0.000 description 1
- UHVIQGKBMXEVGN-WDSKDSINSA-N Glu-Gly-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O UHVIQGKBMXEVGN-WDSKDSINSA-N 0.000 description 1
- CUXJIASLBRJOFV-LAEOZQHASA-N Glu-Gly-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O CUXJIASLBRJOFV-LAEOZQHASA-N 0.000 description 1
- VOORMNJKNBGYGK-YUMQZZPRSA-N Glu-Gly-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)O)N VOORMNJKNBGYGK-YUMQZZPRSA-N 0.000 description 1
- RAUDKMVXNOWDLS-WDSKDSINSA-N Glu-Gly-Ser Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O RAUDKMVXNOWDLS-WDSKDSINSA-N 0.000 description 1
- XMPAXPSENRSOSV-RYUDHWBXSA-N Glu-Gly-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XMPAXPSENRSOSV-RYUDHWBXSA-N 0.000 description 1
- XOFYVODYSNKPDK-AVGNSLFASA-N Glu-His-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CCC(=O)O)N XOFYVODYSNKPDK-AVGNSLFASA-N 0.000 description 1
- CXRWMMRLEMVSEH-PEFMBERDSA-N Glu-Ile-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O CXRWMMRLEMVSEH-PEFMBERDSA-N 0.000 description 1
- ITBHUUMCJJQUSC-LAEOZQHASA-N Glu-Ile-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O ITBHUUMCJJQUSC-LAEOZQHASA-N 0.000 description 1
- WTMZXOPHTIVFCP-QEWYBTABSA-N Glu-Ile-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 WTMZXOPHTIVFCP-QEWYBTABSA-N 0.000 description 1
- ZSWGJYOZWBHROQ-RWRJDSDZSA-N Glu-Ile-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZSWGJYOZWBHROQ-RWRJDSDZSA-N 0.000 description 1
- ZCFNZTVIDMLUQC-SXNHZJKMSA-N Glu-Ile-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CCC(=O)O)N ZCFNZTVIDMLUQC-SXNHZJKMSA-N 0.000 description 1
- KRRFFAHEAOCBCQ-SIUGBPQLSA-N Glu-Ile-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KRRFFAHEAOCBCQ-SIUGBPQLSA-N 0.000 description 1
- HVYWQYLBVXMXSV-GUBZILKMSA-N Glu-Leu-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O HVYWQYLBVXMXSV-GUBZILKMSA-N 0.000 description 1
- VSRCAOIHMGCIJK-SRVKXCTJSA-N Glu-Leu-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O VSRCAOIHMGCIJK-SRVKXCTJSA-N 0.000 description 1
- DNPCBMNFQVTHMA-DCAQKATOSA-N Glu-Leu-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O DNPCBMNFQVTHMA-DCAQKATOSA-N 0.000 description 1
- IVGJYOOGJLFKQE-AVGNSLFASA-N Glu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N IVGJYOOGJLFKQE-AVGNSLFASA-N 0.000 description 1
- FBEJIDRSQCGFJI-GUBZILKMSA-N Glu-Leu-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O FBEJIDRSQCGFJI-GUBZILKMSA-N 0.000 description 1
- NJCALAAIGREHDR-WDCWCFNPSA-N Glu-Leu-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NJCALAAIGREHDR-WDCWCFNPSA-N 0.000 description 1
- UJMNFCAHLYKWOZ-DCAQKATOSA-N Glu-Lys-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O UJMNFCAHLYKWOZ-DCAQKATOSA-N 0.000 description 1
- ILWHFUZZCFYSKT-AVGNSLFASA-N Glu-Lys-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O ILWHFUZZCFYSKT-AVGNSLFASA-N 0.000 description 1
- HRBYTAIBKPNZKQ-AVGNSLFASA-N Glu-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(O)=O HRBYTAIBKPNZKQ-AVGNSLFASA-N 0.000 description 1
- ZGEJRLJEAMPEDV-SRVKXCTJSA-N Glu-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)O)N ZGEJRLJEAMPEDV-SRVKXCTJSA-N 0.000 description 1
- RBXSZQRSEGYDFG-GUBZILKMSA-N Glu-Lys-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O RBXSZQRSEGYDFG-GUBZILKMSA-N 0.000 description 1
- ZQYZDDXTNQXUJH-CIUDSAMLSA-N Glu-Met-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CCC(=O)O)N ZQYZDDXTNQXUJH-CIUDSAMLSA-N 0.000 description 1
- HQOGXFLBAKJUMH-CIUDSAMLSA-N Glu-Met-Ser Chemical compound CSCC[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCC(=O)O)N HQOGXFLBAKJUMH-CIUDSAMLSA-N 0.000 description 1
- JZJGEKDPWVJOLD-QEWYBTABSA-N Glu-Phe-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JZJGEKDPWVJOLD-QEWYBTABSA-N 0.000 description 1
- MRWYPDWDZSLWJM-ACZMJKKPSA-N Glu-Ser-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O MRWYPDWDZSLWJM-ACZMJKKPSA-N 0.000 description 1
- GMVCSRBOSIUTFC-FXQIFTODSA-N Glu-Ser-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O GMVCSRBOSIUTFC-FXQIFTODSA-N 0.000 description 1
- RFTVTKBHDXCEEX-WDSKDSINSA-N Glu-Ser-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RFTVTKBHDXCEEX-WDSKDSINSA-N 0.000 description 1
- SYAYROHMAIHWFB-KBIXCLLPSA-N Glu-Ser-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O SYAYROHMAIHWFB-KBIXCLLPSA-N 0.000 description 1
- JWNZHMSRZXXGTM-XKBZYTNZSA-N Glu-Ser-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JWNZHMSRZXXGTM-XKBZYTNZSA-N 0.000 description 1
- DMYACXMQUABZIQ-NRPADANISA-N Glu-Ser-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O DMYACXMQUABZIQ-NRPADANISA-N 0.000 description 1
- TWYSSILQABLLME-HJGDQZAQSA-N Glu-Thr-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TWYSSILQABLLME-HJGDQZAQSA-N 0.000 description 1
- RGJKYNUINKGPJN-RWRJDSDZSA-N Glu-Thr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](CCC(=O)O)N RGJKYNUINKGPJN-RWRJDSDZSA-N 0.000 description 1
- YQAQQKPWFOBSMU-WDCWCFNPSA-N Glu-Thr-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O YQAQQKPWFOBSMU-WDCWCFNPSA-N 0.000 description 1
- CAQXJMUDOLSBPF-SUSMZKCASA-N Glu-Thr-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAQXJMUDOLSBPF-SUSMZKCASA-N 0.000 description 1
- YOTHMZZSJKKEHZ-SZMVWBNQSA-N Glu-Trp-Lys Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@@H](N)CCC(O)=O)=CNC2=C1 YOTHMZZSJKKEHZ-SZMVWBNQSA-N 0.000 description 1
- KIEICAOUSNYOLM-NRPADANISA-N Glu-Val-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O KIEICAOUSNYOLM-NRPADANISA-N 0.000 description 1
- YQPFCZVKMUVZIN-AUTRQRHGSA-N Glu-Val-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O YQPFCZVKMUVZIN-AUTRQRHGSA-N 0.000 description 1
- ZYRXTRTUCAVNBQ-GVXVVHGQSA-N Glu-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N ZYRXTRTUCAVNBQ-GVXVVHGQSA-N 0.000 description 1
- RMWAOBGCZZSJHE-UMNHJUIQSA-N Glu-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N RMWAOBGCZZSJHE-UMNHJUIQSA-N 0.000 description 1
- QRWPTXLWHHTOCO-DZKIICNBSA-N Glu-Val-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O QRWPTXLWHHTOCO-DZKIICNBSA-N 0.000 description 1
- SOYWRINXUSUWEQ-DLOVCJGASA-N Glu-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O SOYWRINXUSUWEQ-DLOVCJGASA-N 0.000 description 1
- 102100036263 Glutamyl-tRNA(Gln) amidotransferase subunit C, mitochondrial Human genes 0.000 description 1
- VSVZIEVNUYDAFR-YUMQZZPRSA-N Gly-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN VSVZIEVNUYDAFR-YUMQZZPRSA-N 0.000 description 1
- LJPIRKICOISLKN-WHFBIAKZSA-N Gly-Ala-Ser Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O LJPIRKICOISLKN-WHFBIAKZSA-N 0.000 description 1
- QIZJOTQTCAGKPU-KWQFWETISA-N Gly-Ala-Tyr Chemical compound [NH3+]CC(=O)N[C@@H](C)C(=O)N[C@H](C([O-])=O)CC1=CC=C(O)C=C1 QIZJOTQTCAGKPU-KWQFWETISA-N 0.000 description 1
- OVSKVOOUFAKODB-UWVGGRQHSA-N Gly-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N OVSKVOOUFAKODB-UWVGGRQHSA-N 0.000 description 1
- KRRMJKMGWWXWDW-STQMWFEESA-N Gly-Arg-Phe Chemical compound NC(=N)NCCC[C@H](NC(=O)CN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KRRMJKMGWWXWDW-STQMWFEESA-N 0.000 description 1
- XUORRGAFUQIMLC-STQMWFEESA-N Gly-Arg-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)CN)O XUORRGAFUQIMLC-STQMWFEESA-N 0.000 description 1
- BGVYNAQWHSTTSP-BYULHYEWSA-N Gly-Asn-Ile Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BGVYNAQWHSTTSP-BYULHYEWSA-N 0.000 description 1
- XCLCVBYNGXEVDU-WHFBIAKZSA-N Gly-Asn-Ser Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O XCLCVBYNGXEVDU-WHFBIAKZSA-N 0.000 description 1
- FMNHBTKMRFVGRO-FOHZUACHSA-N Gly-Asn-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)CN FMNHBTKMRFVGRO-FOHZUACHSA-N 0.000 description 1
- XQHSBNVACKQWAV-WHFBIAKZSA-N Gly-Asp-Asn Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O XQHSBNVACKQWAV-WHFBIAKZSA-N 0.000 description 1
- RPLLQZBOVIVGMX-QWRGUYRKSA-N Gly-Asp-Phe Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O RPLLQZBOVIVGMX-QWRGUYRKSA-N 0.000 description 1
- LCNXZQROPKFGQK-WHFBIAKZSA-N Gly-Asp-Ser Chemical compound NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O LCNXZQROPKFGQK-WHFBIAKZSA-N 0.000 description 1
- HDNXXTBKOJKWNN-WDSKDSINSA-N Gly-Glu-Asn Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O HDNXXTBKOJKWNN-WDSKDSINSA-N 0.000 description 1
- ZQIMMEYPEXIYBB-IUCAKERBSA-N Gly-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN ZQIMMEYPEXIYBB-IUCAKERBSA-N 0.000 description 1
- LHRXAHLCRMQBGJ-RYUDHWBXSA-N Gly-Glu-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)CN LHRXAHLCRMQBGJ-RYUDHWBXSA-N 0.000 description 1
- CCQOOWAONKGYKQ-BYPYZUCNSA-N Gly-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)CN CCQOOWAONKGYKQ-BYPYZUCNSA-N 0.000 description 1
- UFPXDFOYHVEIPI-BYPYZUCNSA-N Gly-Gly-Asp Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O UFPXDFOYHVEIPI-BYPYZUCNSA-N 0.000 description 1
- ZKLYPEGLWFVRGF-IUCAKERBSA-N Gly-His-Gln Chemical compound [H]NCC(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZKLYPEGLWFVRGF-IUCAKERBSA-N 0.000 description 1
- CQIIXEHDSZUSAG-QWRGUYRKSA-N Gly-His-His Chemical compound C([C@H](NC(=O)CN)C(=O)N[C@@H](CC=1NC=NC=1)C(O)=O)C1=CN=CN1 CQIIXEHDSZUSAG-QWRGUYRKSA-N 0.000 description 1
- SXJHOPPTOJACOA-QXEWZRGKSA-N Gly-Ile-Arg Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CCCN=C(N)N SXJHOPPTOJACOA-QXEWZRGKSA-N 0.000 description 1
- AAHSHTLISQUZJL-QSFUFRPTSA-N Gly-Ile-Ile Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O AAHSHTLISQUZJL-QSFUFRPTSA-N 0.000 description 1
- HAXARWKYFIIHKD-ZKWXMUAHSA-N Gly-Ile-Ser Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O HAXARWKYFIIHKD-ZKWXMUAHSA-N 0.000 description 1
- SCWYHUQOOFRVHP-MBLNEYKQSA-N Gly-Ile-Thr Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SCWYHUQOOFRVHP-MBLNEYKQSA-N 0.000 description 1
- TVUWMSBGMVAHSJ-KBPBESRZSA-N Gly-Leu-Phe Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 TVUWMSBGMVAHSJ-KBPBESRZSA-N 0.000 description 1
- LHYJCVCQPWRMKZ-WEDXCCLWSA-N Gly-Leu-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LHYJCVCQPWRMKZ-WEDXCCLWSA-N 0.000 description 1
- AFWYPMDMDYCKMD-KBPBESRZSA-N Gly-Leu-Tyr Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 AFWYPMDMDYCKMD-KBPBESRZSA-N 0.000 description 1
- VBOBNHSVQKKTOT-YUMQZZPRSA-N Gly-Lys-Ala Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O VBOBNHSVQKKTOT-YUMQZZPRSA-N 0.000 description 1
- FXGRXIATVXUAHO-WEDXCCLWSA-N Gly-Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCCN FXGRXIATVXUAHO-WEDXCCLWSA-N 0.000 description 1
- OQQKUTVULYLCDG-ONGXEEELSA-N Gly-Lys-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCCCN)NC(=O)CN)C(O)=O OQQKUTVULYLCDG-ONGXEEELSA-N 0.000 description 1
- MTBIKIMYHUWBRX-QWRGUYRKSA-N Gly-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)CN MTBIKIMYHUWBRX-QWRGUYRKSA-N 0.000 description 1
- GAFKBWKVXNERFA-QWRGUYRKSA-N Gly-Phe-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 GAFKBWKVXNERFA-QWRGUYRKSA-N 0.000 description 1
- IBYOLNARKHMLBG-WHOFXGATSA-N Gly-Phe-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 IBYOLNARKHMLBG-WHOFXGATSA-N 0.000 description 1
- YLEIWGJJBFBFHC-KBPBESRZSA-N Gly-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 YLEIWGJJBFBFHC-KBPBESRZSA-N 0.000 description 1
- HFPVRZWORNJRRC-UWVGGRQHSA-N Gly-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN HFPVRZWORNJRRC-UWVGGRQHSA-N 0.000 description 1
- VNNRLUNBJSWZPF-ZKWXMUAHSA-N Gly-Ser-Ile Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VNNRLUNBJSWZPF-ZKWXMUAHSA-N 0.000 description 1
- FKESCSGWBPUTPN-FOHZUACHSA-N Gly-Thr-Asn Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O FKESCSGWBPUTPN-FOHZUACHSA-N 0.000 description 1
- DBUNZBWUWCIELX-JHEQGTHGSA-N Gly-Thr-Glu Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O DBUNZBWUWCIELX-JHEQGTHGSA-N 0.000 description 1
- RCHFYMASWAZQQZ-ZANVPECISA-N Gly-Trp-Ala Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](C)C(O)=O)NC(=O)CN)=CNC2=C1 RCHFYMASWAZQQZ-ZANVPECISA-N 0.000 description 1
- GNNJKUYDWFIBTK-QWRGUYRKSA-N Gly-Tyr-Asp Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O GNNJKUYDWFIBTK-QWRGUYRKSA-N 0.000 description 1
- DNVDEMWIYLVIQU-RCOVLWMOSA-N Gly-Val-Asp Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O DNVDEMWIYLVIQU-RCOVLWMOSA-N 0.000 description 1
- SYOJVRNQCXYEOV-XVKPBYJWSA-N Gly-Val-Glu Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SYOJVRNQCXYEOV-XVKPBYJWSA-N 0.000 description 1
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 1
- 229920002488 Hemicellulose Polymers 0.000 description 1
- IPIVXQQRZXEUGW-UWJYBYFXSA-N His-Ala-His Chemical compound C([C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC=1NC=NC=1)C(O)=O)C1=CN=CN1 IPIVXQQRZXEUGW-UWJYBYFXSA-N 0.000 description 1
- DZMVESFTHXSSPZ-XVYDVKMFSA-N His-Ala-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O DZMVESFTHXSSPZ-XVYDVKMFSA-N 0.000 description 1
- SVHKVHBPTOMLTO-DCAQKATOSA-N His-Arg-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O SVHKVHBPTOMLTO-DCAQKATOSA-N 0.000 description 1
- JBJNKUOMNZGQIM-PYJNHQTQSA-N His-Arg-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JBJNKUOMNZGQIM-PYJNHQTQSA-N 0.000 description 1
- MJICNEVRDVQXJH-WDSOQIARSA-N His-Arg-Trp Chemical compound N[C@@H](Cc1cnc[nH]1)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(O)=O MJICNEVRDVQXJH-WDSOQIARSA-N 0.000 description 1
- AVQOSMRPITVTRB-CIUDSAMLSA-N His-Asn-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N AVQOSMRPITVTRB-CIUDSAMLSA-N 0.000 description 1
- MVADCDSCFTXCBT-CIUDSAMLSA-N His-Asp-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MVADCDSCFTXCBT-CIUDSAMLSA-N 0.000 description 1
- IMCHNUANCIGUKS-SRVKXCTJSA-N His-Glu-Arg Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IMCHNUANCIGUKS-SRVKXCTJSA-N 0.000 description 1
- XMENRVZYPBKBIL-AVGNSLFASA-N His-Glu-His Chemical compound N[C@@H](Cc1cnc[nH]1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O XMENRVZYPBKBIL-AVGNSLFASA-N 0.000 description 1
- JCOSMKPAOYDKRO-AVGNSLFASA-N His-Glu-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N JCOSMKPAOYDKRO-AVGNSLFASA-N 0.000 description 1
- BZKDJRSZWLPJNI-SRVKXCTJSA-N His-His-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O BZKDJRSZWLPJNI-SRVKXCTJSA-N 0.000 description 1
- MPXGJGBXCRQQJE-MXAVVETBSA-N His-Ile-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O MPXGJGBXCRQQJE-MXAVVETBSA-N 0.000 description 1
- QMUHTRISZMFKAY-MXAVVETBSA-N His-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N QMUHTRISZMFKAY-MXAVVETBSA-N 0.000 description 1
- BILZDIPAKWZFSG-PYJNHQTQSA-N His-Ile-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N BILZDIPAKWZFSG-PYJNHQTQSA-N 0.000 description 1
- IWXMHXYOACDSIA-PYJNHQTQSA-N His-Ile-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O IWXMHXYOACDSIA-PYJNHQTQSA-N 0.000 description 1
- SKYULSWNBYAQMG-IHRRRGAJSA-N His-Leu-Arg Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O SKYULSWNBYAQMG-IHRRRGAJSA-N 0.000 description 1
- RNMNYMDTESKEAJ-KKUMJFAQSA-N His-Leu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CN=CN1 RNMNYMDTESKEAJ-KKUMJFAQSA-N 0.000 description 1
- ZSKJIISDJXJQPV-BZSNNMDCSA-N His-Leu-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CN=CN1 ZSKJIISDJXJQPV-BZSNNMDCSA-N 0.000 description 1
- UMBKDWGQESDCTO-KKUMJFAQSA-N His-Lys-Lys Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O UMBKDWGQESDCTO-KKUMJFAQSA-N 0.000 description 1
- DPQIPEAHIYMUEJ-IHRRRGAJSA-N His-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CN=CN1)N DPQIPEAHIYMUEJ-IHRRRGAJSA-N 0.000 description 1
- YIGCZZKZFMNSIU-RWMBFGLXSA-N His-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N YIGCZZKZFMNSIU-RWMBFGLXSA-N 0.000 description 1
- SVVULKPWDBIPCO-BZSNNMDCSA-N His-Phe-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O SVVULKPWDBIPCO-BZSNNMDCSA-N 0.000 description 1
- PYNPBMCLAKTHJL-SRVKXCTJSA-N His-Pro-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O PYNPBMCLAKTHJL-SRVKXCTJSA-N 0.000 description 1
- WCHONUZTYDQMBY-PYJNHQTQSA-N His-Pro-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WCHONUZTYDQMBY-PYJNHQTQSA-N 0.000 description 1
- ZHHLTWUOWXHVQJ-YUMQZZPRSA-N His-Ser-Gly Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CO)C(=O)NCC(=O)O)N ZHHLTWUOWXHVQJ-YUMQZZPRSA-N 0.000 description 1
- MKWFGXSFLYNTKC-XIRDDKMYSA-N His-Trp-Asp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC3=CN=CN3)N MKWFGXSFLYNTKC-XIRDDKMYSA-N 0.000 description 1
- SWBUZLFWGJETAO-KKUMJFAQSA-N His-Tyr-Asn Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N)O SWBUZLFWGJETAO-KKUMJFAQSA-N 0.000 description 1
- DAKSMIWQZPHRIB-BZSNNMDCSA-N His-Tyr-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O DAKSMIWQZPHRIB-BZSNNMDCSA-N 0.000 description 1
- FFYYUUWROYYKFY-IHRRRGAJSA-N His-Val-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O FFYYUUWROYYKFY-IHRRRGAJSA-N 0.000 description 1
- 101001001786 Homo sapiens Glutamyl-tRNA(Gln) amidotransferase subunit C, mitochondrial Proteins 0.000 description 1
- 101000697649 Homo sapiens Mitochondrial chaperone BCS1 Proteins 0.000 description 1
- 241000019008 Hyda Species 0.000 description 1
- 108010020056 Hydrogenase Proteins 0.000 description 1
- VSZALHITQINTGC-GHCJXIJMSA-N Ile-Ala-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(=O)O)C(=O)O)N VSZALHITQINTGC-GHCJXIJMSA-N 0.000 description 1
- QICVAHODWHIWIS-HTFCKZLJSA-N Ile-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N QICVAHODWHIWIS-HTFCKZLJSA-N 0.000 description 1
- RWIKBYVJQAJYDP-BJDJZHNGSA-N Ile-Ala-Lys Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN RWIKBYVJQAJYDP-BJDJZHNGSA-N 0.000 description 1
- WECYRWOMWSCWNX-XUXIUFHCSA-N Ile-Arg-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(C)C)C(O)=O WECYRWOMWSCWNX-XUXIUFHCSA-N 0.000 description 1
- YOTNPRLPIPHQSB-XUXIUFHCSA-N Ile-Arg-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N YOTNPRLPIPHQSB-XUXIUFHCSA-N 0.000 description 1
- IIXDMJNYALIKGP-DJFWLOJKSA-N Ile-Asn-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N IIXDMJNYALIKGP-DJFWLOJKSA-N 0.000 description 1
- FJWYJQRCVNGEAQ-ZPFDUUQYSA-N Ile-Asn-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N FJWYJQRCVNGEAQ-ZPFDUUQYSA-N 0.000 description 1
- NCSIQAFSIPHVAN-IUKAMOBKSA-N Ile-Asn-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N NCSIQAFSIPHVAN-IUKAMOBKSA-N 0.000 description 1
- QYOGJYIRKACXEP-SLBDDTMCSA-N Ile-Asn-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N QYOGJYIRKACXEP-SLBDDTMCSA-N 0.000 description 1
- QIHJTGSVGIPHIW-QSFUFRPTSA-N Ile-Asn-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](C(C)C)C(=O)O)N QIHJTGSVGIPHIW-QSFUFRPTSA-N 0.000 description 1
- NBJAAWYRLGCJOF-UGYAYLCHSA-N Ile-Asp-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N NBJAAWYRLGCJOF-UGYAYLCHSA-N 0.000 description 1
- HVWXAQVMRBKKFE-UGYAYLCHSA-N Ile-Asp-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N HVWXAQVMRBKKFE-UGYAYLCHSA-N 0.000 description 1
- BGZIJZJBXRVBGJ-SXTJYALSSA-N Ile-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N BGZIJZJBXRVBGJ-SXTJYALSSA-N 0.000 description 1
- RGSOCXHDOPQREB-ZPFDUUQYSA-N Ile-Asp-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N RGSOCXHDOPQREB-ZPFDUUQYSA-N 0.000 description 1
- AQTWDZDISVGCAC-CFMVVWHZSA-N Ile-Asp-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N AQTWDZDISVGCAC-CFMVVWHZSA-N 0.000 description 1
- DMZOUKXXHJQPTL-GRLWGSQLSA-N Ile-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N DMZOUKXXHJQPTL-GRLWGSQLSA-N 0.000 description 1
- LKACSKJPTFSBHR-MNXVOIDGSA-N Ile-Gln-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N LKACSKJPTFSBHR-MNXVOIDGSA-N 0.000 description 1
- LGMUPVWZEYYUMU-YVNDNENWSA-N Ile-Glu-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N LGMUPVWZEYYUMU-YVNDNENWSA-N 0.000 description 1
- MTFVYKQRLXYAQN-LAEOZQHASA-N Ile-Glu-Gly Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O MTFVYKQRLXYAQN-LAEOZQHASA-N 0.000 description 1
- PNDMHTTXXPUQJH-RWRJDSDZSA-N Ile-Glu-Thr Chemical compound N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H]([C@H](O)C)C(=O)O PNDMHTTXXPUQJH-RWRJDSDZSA-N 0.000 description 1
- KFVUBLZRFSVDGO-BYULHYEWSA-N Ile-Gly-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O KFVUBLZRFSVDGO-BYULHYEWSA-N 0.000 description 1
- MQFGXJNSUJTXDT-QSFUFRPTSA-N Ile-Gly-Ile Chemical compound N[C@@H]([C@@H](C)CC)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)O MQFGXJNSUJTXDT-QSFUFRPTSA-N 0.000 description 1
- NYEYYMLUABXDMC-NHCYSSNCSA-N Ile-Gly-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CC(C)C)C(=O)O)N NYEYYMLUABXDMC-NHCYSSNCSA-N 0.000 description 1
- HYLIOBDWPQNLKI-HVTMNAMFSA-N Ile-His-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N HYLIOBDWPQNLKI-HVTMNAMFSA-N 0.000 description 1
- YKLOMBNBQUTJDT-HVTMNAMFSA-N Ile-His-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N YKLOMBNBQUTJDT-HVTMNAMFSA-N 0.000 description 1
- CMNMPCTVCWWYHY-MXAVVETBSA-N Ile-His-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC(C)C)C(=O)O)N CMNMPCTVCWWYHY-MXAVVETBSA-N 0.000 description 1
- YBGTWSFIGHUWQE-MXAVVETBSA-N Ile-His-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)CC)CC1=CN=CN1 YBGTWSFIGHUWQE-MXAVVETBSA-N 0.000 description 1
- PWDSHAAAFXISLE-SXTJYALSSA-N Ile-Ile-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O PWDSHAAAFXISLE-SXTJYALSSA-N 0.000 description 1
- HUWYGQOISIJNMK-SIGLWIIPSA-N Ile-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N HUWYGQOISIJNMK-SIGLWIIPSA-N 0.000 description 1
- CSQNHSGHAPRGPQ-YTFOTSKYSA-N Ile-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCCN)C(=O)O)N CSQNHSGHAPRGPQ-YTFOTSKYSA-N 0.000 description 1
- PKGGWLOLRLOPGK-XUXIUFHCSA-N Ile-Leu-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PKGGWLOLRLOPGK-XUXIUFHCSA-N 0.000 description 1
- OUUCIIJSBIBCHB-ZPFDUUQYSA-N Ile-Leu-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O OUUCIIJSBIBCHB-ZPFDUUQYSA-N 0.000 description 1
- GAZGFPOZOLEYAJ-YTFOTSKYSA-N Ile-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N GAZGFPOZOLEYAJ-YTFOTSKYSA-N 0.000 description 1
- PNTWNAXGBOZMBO-MNXVOIDGSA-N Ile-Lys-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N PNTWNAXGBOZMBO-MNXVOIDGSA-N 0.000 description 1
- ADDYYRVQQZFIMW-MNXVOIDGSA-N Ile-Lys-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ADDYYRVQQZFIMW-MNXVOIDGSA-N 0.000 description 1
- GVNNAHIRSDRIII-AJNGGQMLSA-N Ile-Lys-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N GVNNAHIRSDRIII-AJNGGQMLSA-N 0.000 description 1
- GLYJPWIRLBAIJH-UHFFFAOYSA-N Ile-Lys-Pro Natural products CCC(C)C(N)C(=O)NC(CCCCN)C(=O)N1CCCC1C(O)=O GLYJPWIRLBAIJH-UHFFFAOYSA-N 0.000 description 1
- RCMNUBZKIIJCOI-ZPFDUUQYSA-N Ile-Met-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N RCMNUBZKIIJCOI-ZPFDUUQYSA-N 0.000 description 1
- ZUPJCJINYQISSN-XUXIUFHCSA-N Ile-Met-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)O)N ZUPJCJINYQISSN-XUXIUFHCSA-N 0.000 description 1
- UYNXBNHVWFNVIN-HJWJTTGWSA-N Ile-Phe-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)CC)CC1=CC=CC=C1 UYNXBNHVWFNVIN-HJWJTTGWSA-N 0.000 description 1
- OTSVBELRDMSPKY-PCBIJLKTSA-N Ile-Phe-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N OTSVBELRDMSPKY-PCBIJLKTSA-N 0.000 description 1
- SAVXZJYTTQQQDD-QEWYBTABSA-N Ile-Phe-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N SAVXZJYTTQQQDD-QEWYBTABSA-N 0.000 description 1
- XLXPYSDGMXTTNQ-UHFFFAOYSA-N Ile-Phe-Leu Natural products CCC(C)C(N)C(=O)NC(C(=O)NC(CC(C)C)C(O)=O)CC1=CC=CC=C1 XLXPYSDGMXTTNQ-UHFFFAOYSA-N 0.000 description 1
- BJECXJHLUJXPJQ-PYJNHQTQSA-N Ile-Pro-His Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N BJECXJHLUJXPJQ-PYJNHQTQSA-N 0.000 description 1
- KTNGVMMGIQWIDV-OSUNSFLBSA-N Ile-Pro-Thr Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O KTNGVMMGIQWIDV-OSUNSFLBSA-N 0.000 description 1
- VGSPNSSCMOHRRR-BJDJZHNGSA-N Ile-Ser-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N VGSPNSSCMOHRRR-BJDJZHNGSA-N 0.000 description 1
- WLRJHVNFGAOYPS-HJPIBITLSA-N Ile-Ser-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N WLRJHVNFGAOYPS-HJPIBITLSA-N 0.000 description 1
- HJDZMPFEXINXLO-QPHKQPEJSA-N Ile-Thr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N HJDZMPFEXINXLO-QPHKQPEJSA-N 0.000 description 1
- QGXQHJQPAPMACW-PPCPHDFISA-N Ile-Thr-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)O)N QGXQHJQPAPMACW-PPCPHDFISA-N 0.000 description 1
- OMDWJWGZGMCQND-CFMVVWHZSA-N Ile-Tyr-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N OMDWJWGZGMCQND-CFMVVWHZSA-N 0.000 description 1
- JSLIXOUMAOUGBN-JUKXBJQTSA-N Ile-Tyr-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N JSLIXOUMAOUGBN-JUKXBJQTSA-N 0.000 description 1
- ZGKVPOSSTGHJAF-HJPIBITLSA-N Ile-Tyr-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CO)C(=O)O)N ZGKVPOSSTGHJAF-HJPIBITLSA-N 0.000 description 1
- WRDTXMBPHMBGIB-STECZYCISA-N Ile-Tyr-Val Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=C(O)C=C1 WRDTXMBPHMBGIB-STECZYCISA-N 0.000 description 1
- ZYVTXBXHIKGZMD-QSFUFRPTSA-N Ile-Val-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(=O)N)C(=O)O)N ZYVTXBXHIKGZMD-QSFUFRPTSA-N 0.000 description 1
- DLEBSGAVWRPTIX-PEDHHIEDSA-N Ile-Val-Ile Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)[C@@H](C)CC DLEBSGAVWRPTIX-PEDHHIEDSA-N 0.000 description 1
- JZBVBOKASHNXAD-NAKRPEOUSA-N Ile-Val-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(=O)O)N JZBVBOKASHNXAD-NAKRPEOUSA-N 0.000 description 1
- SWNRZNLXMXRCJC-VKOGCVSHSA-N Ile-Val-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)[C@@H](C)CC)C(O)=O)=CNC2=C1 SWNRZNLXMXRCJC-VKOGCVSHSA-N 0.000 description 1
- 229930010555 Inosine Natural products 0.000 description 1
- UGQMRVRMYYASKQ-KQYNXXCUSA-N Inosine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C2=NC=NC(O)=C2N=C1 UGQMRVRMYYASKQ-KQYNXXCUSA-N 0.000 description 1
- 108091092195 Intron Proteins 0.000 description 1
- PMGDADKJMCOXHX-UHFFFAOYSA-N L-Arginyl-L-glutamin-acetat Natural products NC(=N)NCCCC(N)C(=O)NC(CCC(N)=O)C(O)=O PMGDADKJMCOXHX-UHFFFAOYSA-N 0.000 description 1
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 1
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 1
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 1
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 1
- LHSGPCFBGJHPCY-UHFFFAOYSA-N L-leucine-L-tyrosine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LHSGPCFBGJHPCY-UHFFFAOYSA-N 0.000 description 1
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 1
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 1
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 1
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 1
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 1
- 241001112693 Lachnospiraceae Species 0.000 description 1
- 241000135333 Lactica Species 0.000 description 1
- 241000186660 Lactobacillus Species 0.000 description 1
- 241000589248 Legionella Species 0.000 description 1
- 241000589242 Legionella pneumophila Species 0.000 description 1
- 208000007764 Legionnaires' Disease Diseases 0.000 description 1
- LJHGALIOHLRRQN-DCAQKATOSA-N Leu-Ala-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N LJHGALIOHLRRQN-DCAQKATOSA-N 0.000 description 1
- ZRLUISBDKUWAIZ-CIUDSAMLSA-N Leu-Ala-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O ZRLUISBDKUWAIZ-CIUDSAMLSA-N 0.000 description 1
- MJOZZTKJZQFKDK-GUBZILKMSA-N Leu-Ala-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(N)=O MJOZZTKJZQFKDK-GUBZILKMSA-N 0.000 description 1
- WNGVUZWBXZKQES-YUMQZZPRSA-N Leu-Ala-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O WNGVUZWBXZKQES-YUMQZZPRSA-N 0.000 description 1
- PBCHMHROGNUXMK-DLOVCJGASA-N Leu-Ala-His Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 PBCHMHROGNUXMK-DLOVCJGASA-N 0.000 description 1
- XBBKIIGCUMBKCO-JXUBOQSCSA-N Leu-Ala-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XBBKIIGCUMBKCO-JXUBOQSCSA-N 0.000 description 1
- YOZCKMXHBYKOMQ-IHRRRGAJSA-N Leu-Arg-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N YOZCKMXHBYKOMQ-IHRRRGAJSA-N 0.000 description 1
- KKXDHFKZWKLYGB-GUBZILKMSA-N Leu-Asn-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N KKXDHFKZWKLYGB-GUBZILKMSA-N 0.000 description 1
- OIARJGNVARWKFP-YUMQZZPRSA-N Leu-Asn-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O OIARJGNVARWKFP-YUMQZZPRSA-N 0.000 description 1
- WGNOPSQMIQERPK-UHFFFAOYSA-N Leu-Asn-Pro Natural products CC(C)CC(N)C(=O)NC(CC(=O)N)C(=O)N1CCCC1C(=O)O WGNOPSQMIQERPK-UHFFFAOYSA-N 0.000 description 1
- TWQIYNGNYNJUFM-NHCYSSNCSA-N Leu-Asn-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O TWQIYNGNYNJUFM-NHCYSSNCSA-N 0.000 description 1
- ZURHXHNAEJJRNU-CIUDSAMLSA-N Leu-Asp-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ZURHXHNAEJJRNU-CIUDSAMLSA-N 0.000 description 1
- DLFAACQHIRSQGG-CIUDSAMLSA-N Leu-Asp-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O DLFAACQHIRSQGG-CIUDSAMLSA-N 0.000 description 1
- PVMPDMIKUVNOBD-CIUDSAMLSA-N Leu-Asp-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O PVMPDMIKUVNOBD-CIUDSAMLSA-N 0.000 description 1
- QLQHWWCSCLZUMA-KKUMJFAQSA-N Leu-Asp-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 QLQHWWCSCLZUMA-KKUMJFAQSA-N 0.000 description 1
- QCSFMCFHVGTLFF-NHCYSSNCSA-N Leu-Asp-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O QCSFMCFHVGTLFF-NHCYSSNCSA-N 0.000 description 1
- AXZGZMGRBDQTEY-SRVKXCTJSA-N Leu-Gln-Met Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O AXZGZMGRBDQTEY-SRVKXCTJSA-N 0.000 description 1
- YVKSMSDXKMSIRX-GUBZILKMSA-N Leu-Glu-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O YVKSMSDXKMSIRX-GUBZILKMSA-N 0.000 description 1
- OXRLYTYUXAQTHP-YUMQZZPRSA-N Leu-Gly-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](C)C(O)=O OXRLYTYUXAQTHP-YUMQZZPRSA-N 0.000 description 1
- VGPCJSXPPOQPBK-YUMQZZPRSA-N Leu-Gly-Ser Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O VGPCJSXPPOQPBK-YUMQZZPRSA-N 0.000 description 1
- UCDHVOALNXENLC-KBPBESRZSA-N Leu-Gly-Tyr Chemical compound CC(C)C[C@H]([NH3+])C(=O)NCC(=O)N[C@H](C([O-])=O)CC1=CC=C(O)C=C1 UCDHVOALNXENLC-KBPBESRZSA-N 0.000 description 1
- VZBIUJURDLFFOE-IHRRRGAJSA-N Leu-His-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VZBIUJURDLFFOE-IHRRRGAJSA-N 0.000 description 1
- KVOFSTUWVSQMDK-KKUMJFAQSA-N Leu-His-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CN=CN1 KVOFSTUWVSQMDK-KKUMJFAQSA-N 0.000 description 1
- SGIIOQQGLUUMDQ-IHRRRGAJSA-N Leu-His-Val Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](C(C)C)C(=O)O)N SGIIOQQGLUUMDQ-IHRRRGAJSA-N 0.000 description 1
- HNDWYLYAYNBWMP-AJNGGQMLSA-N Leu-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(C)C)N HNDWYLYAYNBWMP-AJNGGQMLSA-N 0.000 description 1
- IFMPDNRWZZEZSL-SRVKXCTJSA-N Leu-Leu-Cys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(O)=O IFMPDNRWZZEZSL-SRVKXCTJSA-N 0.000 description 1
- QNBVTHNJGCOVFA-AVGNSLFASA-N Leu-Leu-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCC(O)=O QNBVTHNJGCOVFA-AVGNSLFASA-N 0.000 description 1
- LXKNSJLSGPNHSK-KKUMJFAQSA-N Leu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N LXKNSJLSGPNHSK-KKUMJFAQSA-N 0.000 description 1
- RXGLHDWAZQECBI-SRVKXCTJSA-N Leu-Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O RXGLHDWAZQECBI-SRVKXCTJSA-N 0.000 description 1
- ZRHDPZAAWLXXIR-SRVKXCTJSA-N Leu-Lys-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O ZRHDPZAAWLXXIR-SRVKXCTJSA-N 0.000 description 1
- WXUOJXIGOPMDJM-SRVKXCTJSA-N Leu-Lys-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O WXUOJXIGOPMDJM-SRVKXCTJSA-N 0.000 description 1
- ZGUMORRUBUCXEH-AVGNSLFASA-N Leu-Lys-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZGUMORRUBUCXEH-AVGNSLFASA-N 0.000 description 1
- HVHRPWQEQHIQJF-AVGNSLFASA-N Leu-Lys-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HVHRPWQEQHIQJF-AVGNSLFASA-N 0.000 description 1
- FKQPWMZLIIATBA-AJNGGQMLSA-N Leu-Lys-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FKQPWMZLIIATBA-AJNGGQMLSA-N 0.000 description 1
- ONPJGOIVICHWBW-BZSNNMDCSA-N Leu-Lys-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 ONPJGOIVICHWBW-BZSNNMDCSA-N 0.000 description 1
- ZAVCJRJOQKIOJW-KKUMJFAQSA-N Leu-Phe-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=CC=C1 ZAVCJRJOQKIOJW-KKUMJFAQSA-N 0.000 description 1
- INCJJHQRZGQLFC-KBPBESRZSA-N Leu-Phe-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(O)=O INCJJHQRZGQLFC-KBPBESRZSA-N 0.000 description 1
- DRWMRVFCKKXHCH-BZSNNMDCSA-N Leu-Phe-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=CC=C1 DRWMRVFCKKXHCH-BZSNNMDCSA-N 0.000 description 1
- PJWOOBTYQNNRBF-BZSNNMDCSA-N Leu-Phe-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)O)N PJWOOBTYQNNRBF-BZSNNMDCSA-N 0.000 description 1
- PTRKPHUGYULXPU-KKUMJFAQSA-N Leu-Phe-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O PTRKPHUGYULXPU-KKUMJFAQSA-N 0.000 description 1
- YWKNKRAKOCLOLH-OEAJRASXSA-N Leu-Phe-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CC1=CC=CC=C1 YWKNKRAKOCLOLH-OEAJRASXSA-N 0.000 description 1
- VULJUQZPSOASBZ-SRVKXCTJSA-N Leu-Pro-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O VULJUQZPSOASBZ-SRVKXCTJSA-N 0.000 description 1
- CHJKEDSZNSONPS-DCAQKATOSA-N Leu-Pro-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O CHJKEDSZNSONPS-DCAQKATOSA-N 0.000 description 1
- JIHDFWWRYHSAQB-GUBZILKMSA-N Leu-Ser-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O JIHDFWWRYHSAQB-GUBZILKMSA-N 0.000 description 1
- AMSSKPUHBUQBOQ-SRVKXCTJSA-N Leu-Ser-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N AMSSKPUHBUQBOQ-SRVKXCTJSA-N 0.000 description 1
- GOFJOGXGMPHOGL-DCAQKATOSA-N Leu-Ser-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(C)C GOFJOGXGMPHOGL-DCAQKATOSA-N 0.000 description 1
- SQUFDMCWMFOEBA-KKUMJFAQSA-N Leu-Ser-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 SQUFDMCWMFOEBA-KKUMJFAQSA-N 0.000 description 1
- AEDWWMMHUGYIFD-HJGDQZAQSA-N Leu-Thr-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O AEDWWMMHUGYIFD-HJGDQZAQSA-N 0.000 description 1
- QWWPYKKLXWOITQ-VOAKCMCISA-N Leu-Thr-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(C)C QWWPYKKLXWOITQ-VOAKCMCISA-N 0.000 description 1
- DAYQSYGBCUKVKT-VOAKCMCISA-N Leu-Thr-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O DAYQSYGBCUKVKT-VOAKCMCISA-N 0.000 description 1
- AIQWYVFNBNNOLU-RHYQMDGZSA-N Leu-Thr-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O AIQWYVFNBNNOLU-RHYQMDGZSA-N 0.000 description 1
- SUYRAPCRSCCPAK-VFAJRCTISA-N Leu-Trp-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SUYRAPCRSCCPAK-VFAJRCTISA-N 0.000 description 1
- WUHBLPVELFTPQK-KKUMJFAQSA-N Leu-Tyr-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O WUHBLPVELFTPQK-KKUMJFAQSA-N 0.000 description 1
- JGKHAFUAPZCCDU-BZSNNMDCSA-N Leu-Tyr-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=C(O)C=C1 JGKHAFUAPZCCDU-BZSNNMDCSA-N 0.000 description 1
- BTEMNFBEAAOGBR-BZSNNMDCSA-N Leu-Tyr-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BTEMNFBEAAOGBR-BZSNNMDCSA-N 0.000 description 1
- YIRIDPUGZKHMHT-ACRUOGEOSA-N Leu-Tyr-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YIRIDPUGZKHMHT-ACRUOGEOSA-N 0.000 description 1
- VQHUBNVKFFLWRP-ULQDDVLXSA-N Leu-Tyr-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=C(O)C=C1 VQHUBNVKFFLWRP-ULQDDVLXSA-N 0.000 description 1
- XZNJZXJZBMBGGS-NHCYSSNCSA-N Leu-Val-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O XZNJZXJZBMBGGS-NHCYSSNCSA-N 0.000 description 1
- AIMGJYMCTAABEN-GVXVVHGQSA-N Leu-Val-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O AIMGJYMCTAABEN-GVXVVHGQSA-N 0.000 description 1
- YQFZRHYZLARWDY-IHRRRGAJSA-N Leu-Val-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN YQFZRHYZLARWDY-IHRRRGAJSA-N 0.000 description 1
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 1
- XFIHDSBIPWEYJJ-YUMQZZPRSA-N Lys-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN XFIHDSBIPWEYJJ-YUMQZZPRSA-N 0.000 description 1
- NFLFJGGKOHYZJF-BJDJZHNGSA-N Lys-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN NFLFJGGKOHYZJF-BJDJZHNGSA-N 0.000 description 1
- WQWZXKWOEVSGQM-DCAQKATOSA-N Lys-Ala-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN WQWZXKWOEVSGQM-DCAQKATOSA-N 0.000 description 1
- IXHKPDJKKCUKHS-GARJFASQSA-N Lys-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N IXHKPDJKKCUKHS-GARJFASQSA-N 0.000 description 1
- VHXMZJGOKIMETG-CQDKDKBSSA-N Lys-Ala-Tyr Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CCCCN)N VHXMZJGOKIMETG-CQDKDKBSSA-N 0.000 description 1
- NQCJGQHHYZNUDK-DCAQKATOSA-N Lys-Arg-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CCCN=C(N)N NQCJGQHHYZNUDK-DCAQKATOSA-N 0.000 description 1
- SWWCDAGDQHTKIE-RHYQMDGZSA-N Lys-Arg-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SWWCDAGDQHTKIE-RHYQMDGZSA-N 0.000 description 1
- NTSPQIONFJUMJV-AVGNSLFASA-N Lys-Arg-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O NTSPQIONFJUMJV-AVGNSLFASA-N 0.000 description 1
- WLCYCADOWRMSAJ-CIUDSAMLSA-N Lys-Asn-Cys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CS)C(O)=O WLCYCADOWRMSAJ-CIUDSAMLSA-N 0.000 description 1
- ZQCVMVCVPFYXHZ-SRVKXCTJSA-N Lys-Asn-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCCN ZQCVMVCVPFYXHZ-SRVKXCTJSA-N 0.000 description 1
- DGWXCIORNLWGGG-CIUDSAMLSA-N Lys-Asn-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O DGWXCIORNLWGGG-CIUDSAMLSA-N 0.000 description 1
- HIIZIQUUHIXUJY-GUBZILKMSA-N Lys-Asp-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O HIIZIQUUHIXUJY-GUBZILKMSA-N 0.000 description 1
- AAORVPFVUIHEAB-YUMQZZPRSA-N Lys-Asp-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O AAORVPFVUIHEAB-YUMQZZPRSA-N 0.000 description 1
- WGCKDDHUFPQSMZ-ZPFDUUQYSA-N Lys-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCCCN WGCKDDHUFPQSMZ-ZPFDUUQYSA-N 0.000 description 1
- LMVOVCYVZBBWQB-SRVKXCTJSA-N Lys-Asp-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN LMVOVCYVZBBWQB-SRVKXCTJSA-N 0.000 description 1
- PHHYNOUOUWYQRO-XIRDDKMYSA-N Lys-Asp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCCN)N PHHYNOUOUWYQRO-XIRDDKMYSA-N 0.000 description 1
- WTZUSCUIVPVCRH-SRVKXCTJSA-N Lys-Gln-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N WTZUSCUIVPVCRH-SRVKXCTJSA-N 0.000 description 1
- RZHLIPMZXOEJTL-AVGNSLFASA-N Lys-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCCN)N RZHLIPMZXOEJTL-AVGNSLFASA-N 0.000 description 1
- NNCDAORZCMPZPX-GUBZILKMSA-N Lys-Gln-Ser Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N NNCDAORZCMPZPX-GUBZILKMSA-N 0.000 description 1
- NDORZBUHCOJQDO-GVXVVHGQSA-N Lys-Gln-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O NDORZBUHCOJQDO-GVXVVHGQSA-N 0.000 description 1
- CRNNMTHBMRFQNG-GUBZILKMSA-N Lys-Glu-Cys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CS)C(=O)O)N CRNNMTHBMRFQNG-GUBZILKMSA-N 0.000 description 1
- GRADYHMSAUIKPS-DCAQKATOSA-N Lys-Glu-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O GRADYHMSAUIKPS-DCAQKATOSA-N 0.000 description 1
- PBIPLDMFHAICIP-DCAQKATOSA-N Lys-Glu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O PBIPLDMFHAICIP-DCAQKATOSA-N 0.000 description 1
- KZOHPCYVORJBLG-AVGNSLFASA-N Lys-Glu-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCCN)N KZOHPCYVORJBLG-AVGNSLFASA-N 0.000 description 1
- ODUQLUADRKMHOZ-JYJNAYRXSA-N Lys-Glu-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCCN)N)O ODUQLUADRKMHOZ-JYJNAYRXSA-N 0.000 description 1
- ULUQBUKAPDUKOC-GVXVVHGQSA-N Lys-Glu-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O ULUQBUKAPDUKOC-GVXVVHGQSA-N 0.000 description 1
- GQZMPWBZQALKJO-UWVGGRQHSA-N Lys-Gly-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O GQZMPWBZQALKJO-UWVGGRQHSA-N 0.000 description 1
- QZONCCHVHCOBSK-YUMQZZPRSA-N Lys-Gly-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O QZONCCHVHCOBSK-YUMQZZPRSA-N 0.000 description 1
- FHIAJWBDZVHLAH-YUMQZZPRSA-N Lys-Gly-Ser Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O FHIAJWBDZVHLAH-YUMQZZPRSA-N 0.000 description 1
- NNKLKUUGESXCBS-KBPBESRZSA-N Lys-Gly-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O NNKLKUUGESXCBS-KBPBESRZSA-N 0.000 description 1
- SPCHLZUWJTYZFC-IHRRRGAJSA-N Lys-His-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C(C)C)C(O)=O SPCHLZUWJTYZFC-IHRRRGAJSA-N 0.000 description 1
- IUWMQCZOTYRXPL-ZPFDUUQYSA-N Lys-Ile-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O IUWMQCZOTYRXPL-ZPFDUUQYSA-N 0.000 description 1
- QBEPTBMRQALPEV-MNXVOIDGSA-N Lys-Ile-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN QBEPTBMRQALPEV-MNXVOIDGSA-N 0.000 description 1
- QOJDBRUCOXQSSK-AJNGGQMLSA-N Lys-Ile-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCCN)C(O)=O QOJDBRUCOXQSSK-AJNGGQMLSA-N 0.000 description 1
- XREQQOATSMMAJP-MGHWNKPDSA-N Lys-Ile-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XREQQOATSMMAJP-MGHWNKPDSA-N 0.000 description 1
- PINHPJWGVBKQII-SRVKXCTJSA-N Lys-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCCN)N PINHPJWGVBKQII-SRVKXCTJSA-N 0.000 description 1
- ONPDTSFZAIWMDI-AVGNSLFASA-N Lys-Leu-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O ONPDTSFZAIWMDI-AVGNSLFASA-N 0.000 description 1
- XIZQPFCRXLUNMK-BZSNNMDCSA-N Lys-Leu-Phe Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCCCN)N XIZQPFCRXLUNMK-BZSNNMDCSA-N 0.000 description 1
- XOQMURBBIXRRCR-SRVKXCTJSA-N Lys-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN XOQMURBBIXRRCR-SRVKXCTJSA-N 0.000 description 1
- KJIXWRWPOCKYLD-IHRRRGAJSA-N Lys-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N KJIXWRWPOCKYLD-IHRRRGAJSA-N 0.000 description 1
- ATNKHRAIZCMCCN-BZSNNMDCSA-N Lys-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N ATNKHRAIZCMCCN-BZSNNMDCSA-N 0.000 description 1
- YDDDRTIPNTWGIG-SRVKXCTJSA-N Lys-Lys-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O YDDDRTIPNTWGIG-SRVKXCTJSA-N 0.000 description 1
- PLDJDCJLRCYPJB-VOAKCMCISA-N Lys-Lys-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PLDJDCJLRCYPJB-VOAKCMCISA-N 0.000 description 1
- BXPHMHQHYHILBB-BZSNNMDCSA-N Lys-Lys-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O BXPHMHQHYHILBB-BZSNNMDCSA-N 0.000 description 1
- ZCWWVXAXWUAEPZ-SRVKXCTJSA-N Lys-Met-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZCWWVXAXWUAEPZ-SRVKXCTJSA-N 0.000 description 1
- ODTZHNZPINULEU-KKUMJFAQSA-N Lys-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N ODTZHNZPINULEU-KKUMJFAQSA-N 0.000 description 1
- LUAJJLPHUXPQLH-KKUMJFAQSA-N Lys-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCCCN)N LUAJJLPHUXPQLH-KKUMJFAQSA-N 0.000 description 1
- QBHGXFQJFPWJIH-XUXIUFHCSA-N Lys-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCCN QBHGXFQJFPWJIH-XUXIUFHCSA-N 0.000 description 1
- CRIODIGWCUPXKU-AVGNSLFASA-N Lys-Pro-Met Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCSC)C(O)=O CRIODIGWCUPXKU-AVGNSLFASA-N 0.000 description 1
- WQDKIVRHTQYJSN-DCAQKATOSA-N Lys-Ser-Arg Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N WQDKIVRHTQYJSN-DCAQKATOSA-N 0.000 description 1
- MGKFCQFVPKOWOL-CIUDSAMLSA-N Lys-Ser-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)O)C(=O)O)N MGKFCQFVPKOWOL-CIUDSAMLSA-N 0.000 description 1
- JOSAKOKSPXROGQ-BJDJZHNGSA-N Lys-Ser-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JOSAKOKSPXROGQ-BJDJZHNGSA-N 0.000 description 1
- IOQWIOPSKJOEKI-SRVKXCTJSA-N Lys-Ser-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O IOQWIOPSKJOEKI-SRVKXCTJSA-N 0.000 description 1
- SQXZLVXQXWILKW-KKUMJFAQSA-N Lys-Ser-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SQXZLVXQXWILKW-KKUMJFAQSA-N 0.000 description 1
- YFQSSOAGMZGXFT-MEYUZBJRSA-N Lys-Thr-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YFQSSOAGMZGXFT-MEYUZBJRSA-N 0.000 description 1
- YUTZYVTZDVZBJJ-IHPCNDPISA-N Lys-Trp-Lys Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O)=CNC2=C1 YUTZYVTZDVZBJJ-IHPCNDPISA-N 0.000 description 1
- PELXPRPDQRFBGQ-KKUMJFAQSA-N Lys-Tyr-Asn Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N)O PELXPRPDQRFBGQ-KKUMJFAQSA-N 0.000 description 1
- RMKJOQSYLQQRFN-KKUMJFAQSA-N Lys-Tyr-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O RMKJOQSYLQQRFN-KKUMJFAQSA-N 0.000 description 1
- LMMBAXJRYSXCOQ-ACRUOGEOSA-N Lys-Tyr-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](Cc1ccccc1)C(O)=O LMMBAXJRYSXCOQ-ACRUOGEOSA-N 0.000 description 1
- IEIHKHYMBIYQTH-YESZJQIVSA-N Lys-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CCCCN)N)C(=O)O IEIHKHYMBIYQTH-YESZJQIVSA-N 0.000 description 1
- SQRLLZAQNOQCEG-KKUMJFAQSA-N Lys-Tyr-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=C(O)C=C1 SQRLLZAQNOQCEG-KKUMJFAQSA-N 0.000 description 1
- VVURYEVJJTXWNE-ULQDDVLXSA-N Lys-Tyr-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O VVURYEVJJTXWNE-ULQDDVLXSA-N 0.000 description 1
- UGCIQUYEJIEHKX-GVXVVHGQSA-N Lys-Val-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O UGCIQUYEJIEHKX-GVXVVHGQSA-N 0.000 description 1
- RPWQJSBMXJSCPD-XUXIUFHCSA-N Lys-Val-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CCCCN)C(C)C)C(O)=O RPWQJSBMXJSCPD-XUXIUFHCSA-N 0.000 description 1
- NYTDJEZBAAFLLG-IHRRRGAJSA-N Lys-Val-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(O)=O NYTDJEZBAAFLLG-IHRRRGAJSA-N 0.000 description 1
- OZVXDDFYCQOPFD-XQQFMLRXSA-N Lys-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N OZVXDDFYCQOPFD-XQQFMLRXSA-N 0.000 description 1
- RIPJMCFGQHGHNP-RHYQMDGZSA-N Lys-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CCCCN)N)O RIPJMCFGQHGHNP-RHYQMDGZSA-N 0.000 description 1
- 239000004472 Lysine Substances 0.000 description 1
- QAHFGYLFLVGBNW-DCAQKATOSA-N Met-Ala-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN QAHFGYLFLVGBNW-DCAQKATOSA-N 0.000 description 1
- CWFYZYQMUDWGTI-GUBZILKMSA-N Met-Arg-Asp Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O CWFYZYQMUDWGTI-GUBZILKMSA-N 0.000 description 1
- YNOVBMBQSQTLFM-DCAQKATOSA-N Met-Asn-Leu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O YNOVBMBQSQTLFM-DCAQKATOSA-N 0.000 description 1
- TUSOIZOVPJCMFC-FXQIFTODSA-N Met-Asp-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O TUSOIZOVPJCMFC-FXQIFTODSA-N 0.000 description 1
- WGBMNLCRYKSWAR-DCAQKATOSA-N Met-Asp-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN WGBMNLCRYKSWAR-DCAQKATOSA-N 0.000 description 1
- NCFZHKMKRCYQBJ-CIUDSAMLSA-N Met-Cys-Gln Chemical compound CSCC[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N NCFZHKMKRCYQBJ-CIUDSAMLSA-N 0.000 description 1
- RMHHNLKYPOOKQN-FXQIFTODSA-N Met-Cys-Ser Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(O)=O RMHHNLKYPOOKQN-FXQIFTODSA-N 0.000 description 1
- MTBVQFFQMXHCPC-CIUDSAMLSA-N Met-Glu-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MTBVQFFQMXHCPC-CIUDSAMLSA-N 0.000 description 1
- VZBXCMCHIHEPBL-SRVKXCTJSA-N Met-Glu-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN VZBXCMCHIHEPBL-SRVKXCTJSA-N 0.000 description 1
- LQMHZERGCQJKAH-STQMWFEESA-N Met-Gly-Phe Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 LQMHZERGCQJKAH-STQMWFEESA-N 0.000 description 1
- FZUNSVYYPYJYAP-NAKRPEOUSA-N Met-Ile-Ala Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O FZUNSVYYPYJYAP-NAKRPEOUSA-N 0.000 description 1
- QGRJTULYDZUBAY-ZPFDUUQYSA-N Met-Ile-Glu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O QGRJTULYDZUBAY-ZPFDUUQYSA-N 0.000 description 1
- AFVOKRHYSSFPHC-STECZYCISA-N Met-Ile-Tyr Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 AFVOKRHYSSFPHC-STECZYCISA-N 0.000 description 1
- AFFKUNVPPLQUGA-DCAQKATOSA-N Met-Leu-Ala Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O AFFKUNVPPLQUGA-DCAQKATOSA-N 0.000 description 1
- OSZTUONKUMCWEP-XUXIUFHCSA-N Met-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CCSC OSZTUONKUMCWEP-XUXIUFHCSA-N 0.000 description 1
- BEZJTLKUMFMITF-AVGNSLFASA-N Met-Lys-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CCCNC(N)=N BEZJTLKUMFMITF-AVGNSLFASA-N 0.000 description 1
- RSOMVHWMIAZNLE-HJWJTTGWSA-N Met-Phe-Ile Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O RSOMVHWMIAZNLE-HJWJTTGWSA-N 0.000 description 1
- NLDXSXDCNZIQCN-ULQDDVLXSA-N Met-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CCSC)CC1=CC=CC=C1 NLDXSXDCNZIQCN-ULQDDVLXSA-N 0.000 description 1
- JQHYVIKEFYETEW-IHRRRGAJSA-N Met-Phe-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=CC=C1 JQHYVIKEFYETEW-IHRRRGAJSA-N 0.000 description 1
- DSZFTPCSFVWMKP-DCAQKATOSA-N Met-Ser-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN DSZFTPCSFVWMKP-DCAQKATOSA-N 0.000 description 1
- KSIPKXNIQOWMIC-RCWTZXSCSA-N Met-Thr-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCCNC(N)=N KSIPKXNIQOWMIC-RCWTZXSCSA-N 0.000 description 1
- KYXDADPHSNFWQX-VEVYYDQMSA-N Met-Thr-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(O)=O KYXDADPHSNFWQX-VEVYYDQMSA-N 0.000 description 1
- ANCPZNHGZUCSSC-ULQDDVLXSA-N Met-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CCSC)CC1=CC=C(O)C=C1 ANCPZNHGZUCSSC-ULQDDVLXSA-N 0.000 description 1
- KPVLLNDCBYXKNV-CYDGBPFRSA-N Met-Val-Ile Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KPVLLNDCBYXKNV-CYDGBPFRSA-N 0.000 description 1
- LBSWWNKMVPAXOI-GUBZILKMSA-N Met-Val-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O LBSWWNKMVPAXOI-GUBZILKMSA-N 0.000 description 1
- 241000193459 Moorella thermoacetica Species 0.000 description 1
- 101100387128 Myxococcus xanthus (strain DK1622) devR gene Proteins 0.000 description 1
- WUGMRIBZSVSJNP-UHFFFAOYSA-N N-L-alanyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C)C(O)=O)=CNC2=C1 WUGMRIBZSVSJNP-UHFFFAOYSA-N 0.000 description 1
- PESQCPHRXOFIPX-UHFFFAOYSA-N N-L-methionyl-L-tyrosine Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 PESQCPHRXOFIPX-UHFFFAOYSA-N 0.000 description 1
- BQVUABVGYYSDCJ-UHFFFAOYSA-N Nalpha-L-Leucyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)CC(C)C)C(O)=O)=CNC2=C1 BQVUABVGYYSDCJ-UHFFFAOYSA-N 0.000 description 1
- 241000588653 Neisseria Species 0.000 description 1
- 241000588650 Neisseria meningitidis Species 0.000 description 1
- 101100068676 Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) gln-1 gene Proteins 0.000 description 1
- 108091092724 Noncoding DNA Proteins 0.000 description 1
- 101800004910 Nuclease A Proteins 0.000 description 1
- 101710102974 O-acetyl transferase Proteins 0.000 description 1
- 241000606856 Pasteurella multocida Species 0.000 description 1
- WSXKXSBOJXEZDV-DLOVCJGASA-N Phe-Ala-Asn Chemical compound NC(=O)C[C@@H](C([O-])=O)NC(=O)[C@H](C)NC(=O)[C@@H]([NH3+])CC1=CC=CC=C1 WSXKXSBOJXEZDV-DLOVCJGASA-N 0.000 description 1
- CGOMLCQJEMWMCE-STQMWFEESA-N Phe-Arg-Gly Chemical compound NC(N)=NCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 CGOMLCQJEMWMCE-STQMWFEESA-N 0.000 description 1
- WMGVYPPIMZPWPN-SRVKXCTJSA-N Phe-Asp-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N WMGVYPPIMZPWPN-SRVKXCTJSA-N 0.000 description 1
- IUVYJBMTHARMIP-PCBIJLKTSA-N Phe-Asp-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O IUVYJBMTHARMIP-PCBIJLKTSA-N 0.000 description 1
- MQVFHOPCKNTHGT-MELADBBJSA-N Phe-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O MQVFHOPCKNTHGT-MELADBBJSA-N 0.000 description 1
- SWZKMTDPQXLQRD-XVSYOHENSA-N Phe-Asp-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SWZKMTDPQXLQRD-XVSYOHENSA-N 0.000 description 1
- CUMXHKAOHNWRFQ-BZSNNMDCSA-N Phe-Asp-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 CUMXHKAOHNWRFQ-BZSNNMDCSA-N 0.000 description 1
- FGXIJNMDRCZVDE-KKUMJFAQSA-N Phe-Cys-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)O)N FGXIJNMDRCZVDE-KKUMJFAQSA-N 0.000 description 1
- WFDAEEUZPZSMOG-SRVKXCTJSA-N Phe-Cys-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(O)=O WFDAEEUZPZSMOG-SRVKXCTJSA-N 0.000 description 1
- SXJGROGVINAYSH-AVGNSLFASA-N Phe-Gln-Asp Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N SXJGROGVINAYSH-AVGNSLFASA-N 0.000 description 1
- GDBOREPXIRKSEQ-FHWLQOOXSA-N Phe-Gln-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GDBOREPXIRKSEQ-FHWLQOOXSA-N 0.000 description 1
- KYYMILWEGJYPQZ-IHRRRGAJSA-N Phe-Glu-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 KYYMILWEGJYPQZ-IHRRRGAJSA-N 0.000 description 1
- KJJROSNFBRWPHS-JYJNAYRXSA-N Phe-Glu-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O KJJROSNFBRWPHS-JYJNAYRXSA-N 0.000 description 1
- PSKRILMFHNIUAO-JYJNAYRXSA-N Phe-Glu-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N PSKRILMFHNIUAO-JYJNAYRXSA-N 0.000 description 1
- CSDMCMITJLKBAH-SOUVJXGZSA-N Phe-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O CSDMCMITJLKBAH-SOUVJXGZSA-N 0.000 description 1
- UAMFZRNCIFFMLE-FHWLQOOXSA-N Phe-Glu-Tyr Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N UAMFZRNCIFFMLE-FHWLQOOXSA-N 0.000 description 1
- HGNGAMWHGGANAU-WHOFXGATSA-N Phe-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 HGNGAMWHGGANAU-WHOFXGATSA-N 0.000 description 1
- NHCKESBLOMHIIE-IRXDYDNUSA-N Phe-Gly-Phe Chemical compound C([C@H](N)C(=O)NCC(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 NHCKESBLOMHIIE-IRXDYDNUSA-N 0.000 description 1
- YZJKNDCEPDDIDA-BZSNNMDCSA-N Phe-His-Lys Chemical compound C([C@@H](C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC=CC=1)C1=CN=CN1 YZJKNDCEPDDIDA-BZSNNMDCSA-N 0.000 description 1
- QEFHBVDWKFFKQI-PMVMPFDFSA-N Phe-His-Trp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O QEFHBVDWKFFKQI-PMVMPFDFSA-N 0.000 description 1
- KRYSMKKRRRWOCZ-QEWYBTABSA-N Phe-Ile-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O KRYSMKKRRRWOCZ-QEWYBTABSA-N 0.000 description 1
- BWTKUQPNOMMKMA-FIRPJDEBSA-N Phe-Ile-Phe Chemical compound C([C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 BWTKUQPNOMMKMA-FIRPJDEBSA-N 0.000 description 1
- ONORAGIFHNAADN-LLLHUVSDSA-N Phe-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N ONORAGIFHNAADN-LLLHUVSDSA-N 0.000 description 1
- CWFGECHCRMGPPT-MXAVVETBSA-N Phe-Ile-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O CWFGECHCRMGPPT-MXAVVETBSA-N 0.000 description 1
- LRBSWBVUCLLRLU-BZSNNMDCSA-N Phe-Leu-Lys Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)Cc1ccccc1)C(=O)N[C@@H](CCCCN)C(O)=O LRBSWBVUCLLRLU-BZSNNMDCSA-N 0.000 description 1
- YCCUXNNKXDGMAM-KKUMJFAQSA-N Phe-Leu-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YCCUXNNKXDGMAM-KKUMJFAQSA-N 0.000 description 1
- DNAXXTQSTKOHFO-QEJZJMRPSA-N Phe-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 DNAXXTQSTKOHFO-QEJZJMRPSA-N 0.000 description 1
- DMEYUTSDVRCWRS-ULQDDVLXSA-N Phe-Lys-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 DMEYUTSDVRCWRS-ULQDDVLXSA-N 0.000 description 1
- RMKGXGPQIPLTFC-KKUMJFAQSA-N Phe-Lys-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O RMKGXGPQIPLTFC-KKUMJFAQSA-N 0.000 description 1
- MJAYDXWQQUOURZ-JYJNAYRXSA-N Phe-Lys-Gln Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O MJAYDXWQQUOURZ-JYJNAYRXSA-N 0.000 description 1
- WLYPRKLMRIYGPP-JYJNAYRXSA-N Phe-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 WLYPRKLMRIYGPP-JYJNAYRXSA-N 0.000 description 1
- KLXQWABNAWDRAY-ACRUOGEOSA-N Phe-Lys-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 KLXQWABNAWDRAY-ACRUOGEOSA-N 0.000 description 1
- BSHMIVKDJQGLNT-ACRUOGEOSA-N Phe-Lys-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 BSHMIVKDJQGLNT-ACRUOGEOSA-N 0.000 description 1
- GPSMLZQVIIYLDK-ULQDDVLXSA-N Phe-Lys-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O GPSMLZQVIIYLDK-ULQDDVLXSA-N 0.000 description 1
- FUAIIFPQELBNJF-ULQDDVLXSA-N Phe-Met-Lys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N FUAIIFPQELBNJF-ULQDDVLXSA-N 0.000 description 1
- WURZLPSMYZLEGH-UNQGMJICSA-N Phe-Met-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CC1=CC=CC=C1)N)O WURZLPSMYZLEGH-UNQGMJICSA-N 0.000 description 1
- RBRNEFJTEHPDSL-ACRUOGEOSA-N Phe-Phe-Lys Chemical compound C([C@@H](C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 RBRNEFJTEHPDSL-ACRUOGEOSA-N 0.000 description 1
- GPLWGAYGROGDEN-BZSNNMDCSA-N Phe-Phe-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O GPLWGAYGROGDEN-BZSNNMDCSA-N 0.000 description 1
- XDMMOISUAHXXFD-SRVKXCTJSA-N Phe-Ser-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O XDMMOISUAHXXFD-SRVKXCTJSA-N 0.000 description 1
- JXQVYPWVGUOIDV-MXAVVETBSA-N Phe-Ser-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JXQVYPWVGUOIDV-MXAVVETBSA-N 0.000 description 1
- GLJZDMZJHFXJQG-BZSNNMDCSA-N Phe-Ser-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GLJZDMZJHFXJQG-BZSNNMDCSA-N 0.000 description 1
- MCIXMYKSPQUMJG-SRVKXCTJSA-N Phe-Ser-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O MCIXMYKSPQUMJG-SRVKXCTJSA-N 0.000 description 1
- GMWNQSGWWGKTSF-LFSVMHDDSA-N Phe-Thr-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O GMWNQSGWWGKTSF-LFSVMHDDSA-N 0.000 description 1
- MSSXKZBDKZAHCX-UNQGMJICSA-N Phe-Thr-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O MSSXKZBDKZAHCX-UNQGMJICSA-N 0.000 description 1
- VFDRDMOMHBJGKD-UFYCRDLUSA-N Phe-Tyr-Arg Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N VFDRDMOMHBJGKD-UFYCRDLUSA-N 0.000 description 1
- BAONJAHBAUDJKA-BZSNNMDCSA-N Phe-Tyr-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CC=CC=C1 BAONJAHBAUDJKA-BZSNNMDCSA-N 0.000 description 1
- GOUWCZRDTWTODO-YDHLFZDLSA-N Phe-Val-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O GOUWCZRDTWTODO-YDHLFZDLSA-N 0.000 description 1
- YUPRIZTWANWWHK-DZKIICNBSA-N Phe-Val-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N YUPRIZTWANWWHK-DZKIICNBSA-N 0.000 description 1
- XALFIVXGQUEGKV-JSGCOSHPSA-N Phe-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 XALFIVXGQUEGKV-JSGCOSHPSA-N 0.000 description 1
- BQMFWUKNOCJDNV-HJWJTTGWSA-N Phe-Val-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BQMFWUKNOCJDNV-HJWJTTGWSA-N 0.000 description 1
- 108091000080 Phosphotransferase Proteins 0.000 description 1
- DZZCICYRSZASNF-FXQIFTODSA-N Pro-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 DZZCICYRSZASNF-FXQIFTODSA-N 0.000 description 1
- OBVCYFIHIIYIQF-CIUDSAMLSA-N Pro-Asn-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O OBVCYFIHIIYIQF-CIUDSAMLSA-N 0.000 description 1
- RETPETNFPLNLRV-JYJNAYRXSA-N Pro-Asn-Trp Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O RETPETNFPLNLRV-JYJNAYRXSA-N 0.000 description 1
- XKHCJJPNXFBADI-DCAQKATOSA-N Pro-Asp-Lys Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O XKHCJJPNXFBADI-DCAQKATOSA-N 0.000 description 1
- UAYHMOIGIQZLFR-NHCYSSNCSA-N Pro-Gln-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O UAYHMOIGIQZLFR-NHCYSSNCSA-N 0.000 description 1
- KIPIKSXPPLABPN-CIUDSAMLSA-N Pro-Glu-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1 KIPIKSXPPLABPN-CIUDSAMLSA-N 0.000 description 1
- MGDFPGCFVJFITQ-CIUDSAMLSA-N Pro-Glu-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MGDFPGCFVJFITQ-CIUDSAMLSA-N 0.000 description 1
- VOZIBWWZSBIXQN-SRVKXCTJSA-N Pro-Glu-Lys Chemical compound NCCCC[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1)C(O)=O VOZIBWWZSBIXQN-SRVKXCTJSA-N 0.000 description 1
- UUHXBJHVTVGSKM-BQBZGAKWSA-N Pro-Gly-Asn Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O UUHXBJHVTVGSKM-BQBZGAKWSA-N 0.000 description 1
- XQHGISDMVBTGAL-ULQDDVLXSA-N Pro-His-Phe Chemical compound C([C@@H](C(=O)[O-])NC(=O)[C@H](CC=1NC=NC=1)NC(=O)[C@H]1[NH2+]CCC1)C1=CC=CC=C1 XQHGISDMVBTGAL-ULQDDVLXSA-N 0.000 description 1
- LNOWDSPAYBWJOR-PEDHHIEDSA-N Pro-Ile-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LNOWDSPAYBWJOR-PEDHHIEDSA-N 0.000 description 1
- VZKBJNBZMZHKRC-XUXIUFHCSA-N Pro-Ile-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O VZKBJNBZMZHKRC-XUXIUFHCSA-N 0.000 description 1
- SUENWIFTSTWUKD-AVGNSLFASA-N Pro-Leu-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O SUENWIFTSTWUKD-AVGNSLFASA-N 0.000 description 1
- CDGABSWLRMECHC-IHRRRGAJSA-N Pro-Lys-His Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O CDGABSWLRMECHC-IHRRRGAJSA-N 0.000 description 1
- WOIFYRZPIORBRY-AVGNSLFASA-N Pro-Lys-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O WOIFYRZPIORBRY-AVGNSLFASA-N 0.000 description 1
- FRVUYKWGPCQRBL-GUBZILKMSA-N Pro-Met-Cys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@@H]1CCCN1 FRVUYKWGPCQRBL-GUBZILKMSA-N 0.000 description 1
- FNGOXVQBBCMFKV-CIUDSAMLSA-N Pro-Ser-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O FNGOXVQBBCMFKV-CIUDSAMLSA-N 0.000 description 1
- JDJMFMVVJHLWDP-UNQGMJICSA-N Pro-Thr-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JDJMFMVVJHLWDP-UNQGMJICSA-N 0.000 description 1
- XNJVJEHDZPDPQL-BZSNNMDCSA-N Pro-Trp-Arg Chemical compound NC(=N)NCCC[C@H](NC(=O)[C@H](Cc1c[nH]c2ccccc12)NC(=O)[C@@H]1CCCN1)C(O)=O XNJVJEHDZPDPQL-BZSNNMDCSA-N 0.000 description 1
- HOJUNFDJDAPVBI-BZSNNMDCSA-N Pro-Trp-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@@H]3CCCN3 HOJUNFDJDAPVBI-BZSNNMDCSA-N 0.000 description 1
- QDDJNKWPTJHROJ-UFYCRDLUSA-N Pro-Tyr-Tyr Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H]1NCCC1)C1=CC=C(O)C=C1 QDDJNKWPTJHROJ-UFYCRDLUSA-N 0.000 description 1
- ONIBWKKTOPOVIA-UHFFFAOYSA-N Proline Natural products OC(=O)C1CCCN1 ONIBWKKTOPOVIA-UHFFFAOYSA-N 0.000 description 1
- 108700040121 Protein Methyltransferases Proteins 0.000 description 1
- 208000003100 Pseudomembranous Enterocolitis Diseases 0.000 description 1
- 206010037128 Pseudomembranous colitis Diseases 0.000 description 1
- 108010003201 RGH 0205 Proteins 0.000 description 1
- 101100005308 Rattus norvegicus Ctsq gene Proteins 0.000 description 1
- 101710188003 Replication and maintenance protein Proteins 0.000 description 1
- 101710195674 Replication initiator protein Proteins 0.000 description 1
- PYMYPHUHKUWMLA-LMVFSUKVSA-N Ribose Natural products OC[C@@H](O)[C@@H](O)[C@@H](O)C=O PYMYPHUHKUWMLA-LMVFSUKVSA-N 0.000 description 1
- 241000235070 Saccharomyces Species 0.000 description 1
- 101100394024 Salmonella typhimurium (strain LT2 / SGSC1412 / ATCC 700720) bcsZ gene Proteins 0.000 description 1
- BTKUIVBNGBFTTP-WHFBIAKZSA-N Ser-Ala-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)NCC(O)=O BTKUIVBNGBFTTP-WHFBIAKZSA-N 0.000 description 1
- JPIDMRXXNMIVKY-VZFHVOOUSA-N Ser-Ala-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JPIDMRXXNMIVKY-VZFHVOOUSA-N 0.000 description 1
- PZZJMBYSYAKYPK-UWJYBYFXSA-N Ser-Ala-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O PZZJMBYSYAKYPK-UWJYBYFXSA-N 0.000 description 1
- QFBNNYNWKYKVJO-DCAQKATOSA-N Ser-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CCCN=C(N)N QFBNNYNWKYKVJO-DCAQKATOSA-N 0.000 description 1
- OOKCGAYXSNJBGQ-ZLUOBGJFSA-N Ser-Asn-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O OOKCGAYXSNJBGQ-ZLUOBGJFSA-N 0.000 description 1
- YMEXHZTVKDAKIY-GHCJXIJMSA-N Ser-Asn-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CO)C(O)=O YMEXHZTVKDAKIY-GHCJXIJMSA-N 0.000 description 1
- VGNYHOBZJKWRGI-CIUDSAMLSA-N Ser-Asn-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CO VGNYHOBZJKWRGI-CIUDSAMLSA-N 0.000 description 1
- KAAPNMOKUUPKOE-SRVKXCTJSA-N Ser-Asn-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KAAPNMOKUUPKOE-SRVKXCTJSA-N 0.000 description 1
- BNFVPSRLHHPQKS-WHFBIAKZSA-N Ser-Asp-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O BNFVPSRLHHPQKS-WHFBIAKZSA-N 0.000 description 1
- QPFJSHSJFIYDJZ-GHCJXIJMSA-N Ser-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CO QPFJSHSJFIYDJZ-GHCJXIJMSA-N 0.000 description 1
- BYIROAKULFFTEK-CIUDSAMLSA-N Ser-Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CO BYIROAKULFFTEK-CIUDSAMLSA-N 0.000 description 1
- OLIJLNWFEQEFDM-SRVKXCTJSA-N Ser-Asp-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OLIJLNWFEQEFDM-SRVKXCTJSA-N 0.000 description 1
- BTPAWKABYQMKKN-LKXGYXEUSA-N Ser-Asp-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BTPAWKABYQMKKN-LKXGYXEUSA-N 0.000 description 1
- HEQPKICPPDOSIN-SRVKXCTJSA-N Ser-Asp-Tyr Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HEQPKICPPDOSIN-SRVKXCTJSA-N 0.000 description 1
- OJPHFSOMBZKQKQ-GUBZILKMSA-N Ser-Gln-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CO OJPHFSOMBZKQKQ-GUBZILKMSA-N 0.000 description 1
- FMDHKPRACUXATF-ACZMJKKPSA-N Ser-Gln-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O FMDHKPRACUXATF-ACZMJKKPSA-N 0.000 description 1
- DSGYZICNAMEJOC-AVGNSLFASA-N Ser-Glu-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O DSGYZICNAMEJOC-AVGNSLFASA-N 0.000 description 1
- WBINSDOPZHQPPM-AVGNSLFASA-N Ser-Glu-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CO)N)O WBINSDOPZHQPPM-AVGNSLFASA-N 0.000 description 1
- SNVIOQXAHVORQM-WDSKDSINSA-N Ser-Gly-Gln Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O SNVIOQXAHVORQM-WDSKDSINSA-N 0.000 description 1
- WEQAYODCJHZSJZ-KKUMJFAQSA-N Ser-His-Tyr Chemical compound C([C@H](NC(=O)[C@H](CO)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CN=CN1 WEQAYODCJHZSJZ-KKUMJFAQSA-N 0.000 description 1
- JIPVNVNKXJLFJF-BJDJZHNGSA-N Ser-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N JIPVNVNKXJLFJF-BJDJZHNGSA-N 0.000 description 1
- YUJLIIRMIAGMCQ-CIUDSAMLSA-N Ser-Leu-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YUJLIIRMIAGMCQ-CIUDSAMLSA-N 0.000 description 1
- MUJQWSAWLLRJCE-KATARQTJSA-N Ser-Leu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MUJQWSAWLLRJCE-KATARQTJSA-N 0.000 description 1
- PPNPDKGQRFSCAC-CIUDSAMLSA-N Ser-Lys-Asp Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)CO)C(=O)N[C@@H](CC(O)=O)C(O)=O PPNPDKGQRFSCAC-CIUDSAMLSA-N 0.000 description 1
- GVMUJUPXFQFBBZ-GUBZILKMSA-N Ser-Lys-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O GVMUJUPXFQFBBZ-GUBZILKMSA-N 0.000 description 1
- LRWBCWGEUCKDTN-BJDJZHNGSA-N Ser-Lys-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LRWBCWGEUCKDTN-BJDJZHNGSA-N 0.000 description 1
- FPCGZYMRFFIYIH-CIUDSAMLSA-N Ser-Lys-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O FPCGZYMRFFIYIH-CIUDSAMLSA-N 0.000 description 1
- LRZLZIUXQBIWTB-KATARQTJSA-N Ser-Lys-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LRZLZIUXQBIWTB-KATARQTJSA-N 0.000 description 1
- GDUZTEQRAOXYJS-SRVKXCTJSA-N Ser-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CO)N GDUZTEQRAOXYJS-SRVKXCTJSA-N 0.000 description 1
- FBLNYDYPCLFTSP-IXOXFDKPSA-N Ser-Phe-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FBLNYDYPCLFTSP-IXOXFDKPSA-N 0.000 description 1
- ZKBKUWQVDWWSRI-BZSNNMDCSA-N Ser-Phe-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZKBKUWQVDWWSRI-BZSNNMDCSA-N 0.000 description 1
- ADJDNJCSPNFFPI-FXQIFTODSA-N Ser-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO ADJDNJCSPNFFPI-FXQIFTODSA-N 0.000 description 1
- FLONGDPORFIVQW-XGEHTFHBSA-N Ser-Pro-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO FLONGDPORFIVQW-XGEHTFHBSA-N 0.000 description 1
- CKDXFSPMIDSMGV-GUBZILKMSA-N Ser-Pro-Val Chemical compound [H]N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O CKDXFSPMIDSMGV-GUBZILKMSA-N 0.000 description 1
- VFWQQZMRKFOGLE-ZLUOBGJFSA-N Ser-Ser-Cys Chemical compound C([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)O)N)O VFWQQZMRKFOGLE-ZLUOBGJFSA-N 0.000 description 1
- JCLAFVNDBJMLBC-JBDRJPRFSA-N Ser-Ser-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JCLAFVNDBJMLBC-JBDRJPRFSA-N 0.000 description 1
- OZPDGESCTGGNAD-CIUDSAMLSA-N Ser-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CO OZPDGESCTGGNAD-CIUDSAMLSA-N 0.000 description 1
- DKGRNFUXVTYRAS-UBHSHLNASA-N Ser-Ser-Trp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O DKGRNFUXVTYRAS-UBHSHLNASA-N 0.000 description 1
- OLKICIBQRVSQMA-SRVKXCTJSA-N Ser-Ser-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O OLKICIBQRVSQMA-SRVKXCTJSA-N 0.000 description 1
- WUXCHQZLUHBSDJ-LKXGYXEUSA-N Ser-Thr-Asp Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CC(O)=O)C(O)=O WUXCHQZLUHBSDJ-LKXGYXEUSA-N 0.000 description 1
- PURRNJBBXDDWLX-ZDLURKLDSA-N Ser-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CO)N)O PURRNJBBXDDWLX-ZDLURKLDSA-N 0.000 description 1
- PQEQXWRVHQAAKS-SRVKXCTJSA-N Ser-Tyr-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](CO)N)CC1=CC=C(O)C=C1 PQEQXWRVHQAAKS-SRVKXCTJSA-N 0.000 description 1
- BEBVVQPDSHHWQL-NRPADANISA-N Ser-Val-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O BEBVVQPDSHHWQL-NRPADANISA-N 0.000 description 1
- MFQMZDPAZRZAPV-NAKRPEOUSA-N Ser-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CO)N MFQMZDPAZRZAPV-NAKRPEOUSA-N 0.000 description 1
- ODRUTDLAONAVDV-IHRRRGAJSA-N Ser-Val-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ODRUTDLAONAVDV-IHRRRGAJSA-N 0.000 description 1
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 1
- 241000194017 Streptococcus Species 0.000 description 1
- 241000194019 Streptococcus mutans Species 0.000 description 1
- 101100166146 Streptococcus pyogenes serotype M1 cas9 gene Proteins 0.000 description 1
- 206010043376 Tetanus Diseases 0.000 description 1
- 101100157012 Thermoanaerobacterium saccharolyticum (strain DSM 8691 / JW/SL-YS485) xynB gene Proteins 0.000 description 1
- 102000002932 Thiolase Human genes 0.000 description 1
- 108060008225 Thiolase Proteins 0.000 description 1
- FQPQPTHMHZKGFM-XQXXSGGOSA-N Thr-Ala-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O FQPQPTHMHZKGFM-XQXXSGGOSA-N 0.000 description 1
- CEXFELBFVHLYDZ-XGEHTFHBSA-N Thr-Arg-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O CEXFELBFVHLYDZ-XGEHTFHBSA-N 0.000 description 1
- IRKWVRSEQFTGGV-VEVYYDQMSA-N Thr-Asn-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IRKWVRSEQFTGGV-VEVYYDQMSA-N 0.000 description 1
- CTONFVDJYCAMQM-IUKAMOBKSA-N Thr-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H]([C@@H](C)O)N CTONFVDJYCAMQM-IUKAMOBKSA-N 0.000 description 1
- OJRNZRROAIAHDL-LKXGYXEUSA-N Thr-Asn-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O OJRNZRROAIAHDL-LKXGYXEUSA-N 0.000 description 1
- DCLBXIWHLVEPMQ-JRQIVUDYSA-N Thr-Asp-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 DCLBXIWHLVEPMQ-JRQIVUDYSA-N 0.000 description 1
- QILPDQCTQZDHFM-HJGDQZAQSA-N Thr-Gln-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QILPDQCTQZDHFM-HJGDQZAQSA-N 0.000 description 1
- RKDFEMGVMMYYNG-WDCWCFNPSA-N Thr-Gln-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O RKDFEMGVMMYYNG-WDCWCFNPSA-N 0.000 description 1
- LAFLAXHTDVNVEL-WDCWCFNPSA-N Thr-Gln-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O LAFLAXHTDVNVEL-WDCWCFNPSA-N 0.000 description 1
- CQNFRKAKGDSJFR-NUMRIWBASA-N Thr-Glu-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)O CQNFRKAKGDSJFR-NUMRIWBASA-N 0.000 description 1
- UDQBCBUXAQIZAK-GLLZPBPUSA-N Thr-Glu-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O UDQBCBUXAQIZAK-GLLZPBPUSA-N 0.000 description 1
- XFTYVCHLARBHBQ-FOHZUACHSA-N Thr-Gly-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O XFTYVCHLARBHBQ-FOHZUACHSA-N 0.000 description 1
- AQAMPXBRJJWPNI-JHEQGTHGSA-N Thr-Gly-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AQAMPXBRJJWPNI-JHEQGTHGSA-N 0.000 description 1
- XPNSAQMEAVSQRD-FBCQKBJTSA-N Thr-Gly-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)NCC(O)=O XPNSAQMEAVSQRD-FBCQKBJTSA-N 0.000 description 1
- YSXYEJWDHBCTDJ-DVJZZOLTSA-N Thr-Gly-Trp Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N)O YSXYEJWDHBCTDJ-DVJZZOLTSA-N 0.000 description 1
- CRZNCABIJLRFKZ-IUKAMOBKSA-N Thr-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N CRZNCABIJLRFKZ-IUKAMOBKSA-N 0.000 description 1
- BVOVIGCHYNFJBZ-JXUBOQSCSA-N Thr-Leu-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O BVOVIGCHYNFJBZ-JXUBOQSCSA-N 0.000 description 1
- AMXMBCAXAZUCFA-RHYQMDGZSA-N Thr-Leu-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AMXMBCAXAZUCFA-RHYQMDGZSA-N 0.000 description 1
- VTVVYQOXJCZVEB-WDCWCFNPSA-N Thr-Leu-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O VTVVYQOXJCZVEB-WDCWCFNPSA-N 0.000 description 1
- FLPZMPOZGYPBEN-PPCPHDFISA-N Thr-Leu-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FLPZMPOZGYPBEN-PPCPHDFISA-N 0.000 description 1
- MEJHFIOYJHTWMK-VOAKCMCISA-N Thr-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)[C@@H](C)O MEJHFIOYJHTWMK-VOAKCMCISA-N 0.000 description 1
- MECLEFZMPPOEAC-VOAKCMCISA-N Thr-Leu-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MECLEFZMPPOEAC-VOAKCMCISA-N 0.000 description 1
- VRUFCJZQDACGLH-UVOCVTCTSA-N Thr-Leu-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VRUFCJZQDACGLH-UVOCVTCTSA-N 0.000 description 1
- ZSPQUTWLWGWTPS-HJGDQZAQSA-N Thr-Lys-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O ZSPQUTWLWGWTPS-HJGDQZAQSA-N 0.000 description 1
- SCSVNSNWUTYSFO-WDCWCFNPSA-N Thr-Lys-Glu Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O SCSVNSNWUTYSFO-WDCWCFNPSA-N 0.000 description 1
- KKPOGALELPLJTL-MEYUZBJRSA-N Thr-Lys-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 KKPOGALELPLJTL-MEYUZBJRSA-N 0.000 description 1
- FDQXPJCLVPFKJW-KJEVXHAQSA-N Thr-Met-Tyr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N)O FDQXPJCLVPFKJW-KJEVXHAQSA-N 0.000 description 1
- WRQLCVIALDUQEQ-UNQGMJICSA-N Thr-Phe-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O WRQLCVIALDUQEQ-UNQGMJICSA-N 0.000 description 1
- WYLAVUAWOUVUCA-XVSYOHENSA-N Thr-Phe-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O WYLAVUAWOUVUCA-XVSYOHENSA-N 0.000 description 1
- UGFSAPWZBROURT-IXOXFDKPSA-N Thr-Phe-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CS)C(=O)O)N)O UGFSAPWZBROURT-IXOXFDKPSA-N 0.000 description 1
- HSQXHRIRJSFDOH-URLPEUOOSA-N Thr-Phe-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HSQXHRIRJSFDOH-URLPEUOOSA-N 0.000 description 1
- NYQIZWROIMIQSL-VEVYYDQMSA-N Thr-Pro-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O NYQIZWROIMIQSL-VEVYYDQMSA-N 0.000 description 1
- GFRIEEKFXOVPIR-RHYQMDGZSA-N Thr-Pro-Lys Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(O)=O GFRIEEKFXOVPIR-RHYQMDGZSA-N 0.000 description 1
- IWAVRIPRTCJAQO-HSHDSVGOSA-N Thr-Pro-Trp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O IWAVRIPRTCJAQO-HSHDSVGOSA-N 0.000 description 1
- XHWCDRUPDNSDAZ-XKBZYTNZSA-N Thr-Ser-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N)O XHWCDRUPDNSDAZ-XKBZYTNZSA-N 0.000 description 1
- RVMNUBQWPVOUKH-HEIBUPTGSA-N Thr-Ser-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RVMNUBQWPVOUKH-HEIBUPTGSA-N 0.000 description 1
- HUPLKEHTTQBXSC-YJRXYDGGSA-N Thr-Ser-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HUPLKEHTTQBXSC-YJRXYDGGSA-N 0.000 description 1
- MFMGPEKYBXFIRF-SUSMZKCASA-N Thr-Thr-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MFMGPEKYBXFIRF-SUSMZKCASA-N 0.000 description 1
- NJGMALCNYAMYCB-JRQIVUDYSA-N Thr-Tyr-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O NJGMALCNYAMYCB-JRQIVUDYSA-N 0.000 description 1
- REJRKTOJTCPDPO-IRIUXVKKSA-N Thr-Tyr-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O REJRKTOJTCPDPO-IRIUXVKKSA-N 0.000 description 1
- JAWUQFCGNVEDRN-MEYUZBJRSA-N Thr-Tyr-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N)O JAWUQFCGNVEDRN-MEYUZBJRSA-N 0.000 description 1
- CJEHCEOXPLASCK-MEYUZBJRSA-N Thr-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@H](O)C)CC1=CC=C(O)C=C1 CJEHCEOXPLASCK-MEYUZBJRSA-N 0.000 description 1
- FYBFTPLPAXZBOY-KKHAAJSZSA-N Thr-Val-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O FYBFTPLPAXZBOY-KKHAAJSZSA-N 0.000 description 1
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 1
- 239000004473 Threonine Substances 0.000 description 1
- 102000004357 Transferases Human genes 0.000 description 1
- 108090000992 Transferases Proteins 0.000 description 1
- 239000007983 Tris buffer Substances 0.000 description 1
- NXAPHBHZCMQORW-FDARSICLSA-N Trp-Arg-Ile Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NXAPHBHZCMQORW-FDARSICLSA-N 0.000 description 1
- IUFQHOCOKQIOMC-XIRDDKMYSA-N Trp-Asn-Lys Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N IUFQHOCOKQIOMC-XIRDDKMYSA-N 0.000 description 1
- VTHNLRXALGUDBS-BPUTZDHNSA-N Trp-Gln-Glu Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N VTHNLRXALGUDBS-BPUTZDHNSA-N 0.000 description 1
- SNJAPSVIPKUMCK-NWLDYVSISA-N Trp-Glu-Thr Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SNJAPSVIPKUMCK-NWLDYVSISA-N 0.000 description 1
- AIISTODACBDQLW-WDSOQIARSA-N Trp-Leu-Arg Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)=CNC2=C1 AIISTODACBDQLW-WDSOQIARSA-N 0.000 description 1
- OGZRZMJASKKMJZ-XIRDDKMYSA-N Trp-Leu-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N OGZRZMJASKKMJZ-XIRDDKMYSA-N 0.000 description 1
- UKWSFUSPGPBJGU-VFAJRCTISA-N Trp-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O UKWSFUSPGPBJGU-VFAJRCTISA-N 0.000 description 1
- YPBYQWFZAAQMGW-XIRDDKMYSA-N Trp-Lys-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)N)C(=O)O)N YPBYQWFZAAQMGW-XIRDDKMYSA-N 0.000 description 1
- WTRQBSSQBKRNKV-MNSWYVGCSA-N Trp-Thr-Tyr Chemical compound C([C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CC=1C2=CC=CC=C2NC=1)[C@H](O)C)C(O)=O)C1=CC=C(O)C=C1 WTRQBSSQBKRNKV-MNSWYVGCSA-N 0.000 description 1
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 1
- IELISNUVHBKYBX-XDTLVQLUSA-N Tyr-Ala-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 IELISNUVHBKYBX-XDTLVQLUSA-N 0.000 description 1
- DLZKEQQWXODGGZ-KWQFWETISA-N Tyr-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 DLZKEQQWXODGGZ-KWQFWETISA-N 0.000 description 1
- KSVMDJJCYKIXTK-IGNZVWTISA-N Tyr-Ala-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=C(O)C=C1 KSVMDJJCYKIXTK-IGNZVWTISA-N 0.000 description 1
- MICSYKFECRFCTJ-IHRRRGAJSA-N Tyr-Arg-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O MICSYKFECRFCTJ-IHRRRGAJSA-N 0.000 description 1
- OEVJGIHPQOXYFE-SRVKXCTJSA-N Tyr-Asn-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O OEVJGIHPQOXYFE-SRVKXCTJSA-N 0.000 description 1
- PEVVXUGSAKEPEN-AVGNSLFASA-N Tyr-Asn-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O PEVVXUGSAKEPEN-AVGNSLFASA-N 0.000 description 1
- GFHYISDTIWZUSU-QWRGUYRKSA-N Tyr-Asn-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O GFHYISDTIWZUSU-QWRGUYRKSA-N 0.000 description 1
- IUQDEKCCHWRHRW-IHPCNDPISA-N Tyr-Asn-Trp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O IUQDEKCCHWRHRW-IHPCNDPISA-N 0.000 description 1
- JFDGVHXRCKEBAU-KKUMJFAQSA-N Tyr-Asp-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N)O JFDGVHXRCKEBAU-KKUMJFAQSA-N 0.000 description 1
- UABYBEBXFFNCIR-YDHLFZDLSA-N Tyr-Asp-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O UABYBEBXFFNCIR-YDHLFZDLSA-N 0.000 description 1
- QOIKZODVIPOPDD-AVGNSLFASA-N Tyr-Cys-Gln Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(N)=O)C(O)=O QOIKZODVIPOPDD-AVGNSLFASA-N 0.000 description 1
- UXUFNBVCPAWACG-SIUGBPQLSA-N Tyr-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N UXUFNBVCPAWACG-SIUGBPQLSA-N 0.000 description 1
- WZQZUVWEPMGIMM-JYJNAYRXSA-N Tyr-Gln-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O WZQZUVWEPMGIMM-JYJNAYRXSA-N 0.000 description 1
- HKYTWJOWZTWBQB-AVGNSLFASA-N Tyr-Glu-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 HKYTWJOWZTWBQB-AVGNSLFASA-N 0.000 description 1
- IMXAAEFAIBRCQF-SIUGBPQLSA-N Tyr-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N IMXAAEFAIBRCQF-SIUGBPQLSA-N 0.000 description 1
- NZFCWALTLNFHHC-JYJNAYRXSA-N Tyr-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NZFCWALTLNFHHC-JYJNAYRXSA-N 0.000 description 1
- SLCSPPCQWUHPPO-JYJNAYRXSA-N Tyr-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 SLCSPPCQWUHPPO-JYJNAYRXSA-N 0.000 description 1
- CDHQEOXPWBDFPL-QWRGUYRKSA-N Tyr-Gly-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=C(O)C=C1 CDHQEOXPWBDFPL-QWRGUYRKSA-N 0.000 description 1
- PJWCWGXAVIVXQC-STECZYCISA-N Tyr-Ile-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 PJWCWGXAVIVXQC-STECZYCISA-N 0.000 description 1
- DZKFGCNKEVMXFA-JUKXBJQTSA-N Tyr-Ile-His Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](N)Cc1ccc(O)cc1)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O DZKFGCNKEVMXFA-JUKXBJQTSA-N 0.000 description 1
- OHOVFPKXPZODHS-SJWGOKEGSA-N Tyr-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N OHOVFPKXPZODHS-SJWGOKEGSA-N 0.000 description 1
- NKUGCYDFQKFVOJ-JYJNAYRXSA-N Tyr-Leu-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NKUGCYDFQKFVOJ-JYJNAYRXSA-N 0.000 description 1
- DWAMXBFJNZIHMC-KBPBESRZSA-N Tyr-Leu-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O DWAMXBFJNZIHMC-KBPBESRZSA-N 0.000 description 1
- CDKZJGMPZHPAJC-ULQDDVLXSA-N Tyr-Leu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 CDKZJGMPZHPAJC-ULQDDVLXSA-N 0.000 description 1
- GITNQBVCEQBDQC-KKUMJFAQSA-N Tyr-Lys-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O GITNQBVCEQBDQC-KKUMJFAQSA-N 0.000 description 1
- JLKVWTICWVWGSK-JYJNAYRXSA-N Tyr-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 JLKVWTICWVWGSK-JYJNAYRXSA-N 0.000 description 1
- FMXFHNSFABRVFZ-BZSNNMDCSA-N Tyr-Lys-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O FMXFHNSFABRVFZ-BZSNNMDCSA-N 0.000 description 1
- ZOBLBMGJKVJVEV-BZSNNMDCSA-N Tyr-Lys-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N)O ZOBLBMGJKVJVEV-BZSNNMDCSA-N 0.000 description 1
- CWVHKVVKAQIJKY-ACRUOGEOSA-N Tyr-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC2=CC=C(C=C2)O)N CWVHKVVKAQIJKY-ACRUOGEOSA-N 0.000 description 1
- CNNVVEPJTFOGHI-ACRUOGEOSA-N Tyr-Lys-Tyr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CNNVVEPJTFOGHI-ACRUOGEOSA-N 0.000 description 1
- PMHLLBKTDHQMCY-ULQDDVLXSA-N Tyr-Lys-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PMHLLBKTDHQMCY-ULQDDVLXSA-N 0.000 description 1
- LMKKMCGTDANZTR-BZSNNMDCSA-N Tyr-Phe-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CC=C(O)C=C1 LMKKMCGTDANZTR-BZSNNMDCSA-N 0.000 description 1
- SCZJKZLFSSPJDP-ACRUOGEOSA-N Tyr-Phe-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O SCZJKZLFSSPJDP-ACRUOGEOSA-N 0.000 description 1
- JXGUUJMPCRXMSO-HJOGWXRNSA-N Tyr-Phe-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 JXGUUJMPCRXMSO-HJOGWXRNSA-N 0.000 description 1
- VBFVQTPETKJCQW-RPTUDFQQSA-N Tyr-Phe-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VBFVQTPETKJCQW-RPTUDFQQSA-N 0.000 description 1
- RCMWNNJFKNDKQR-UFYCRDLUSA-N Tyr-Pro-Phe Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 RCMWNNJFKNDKQR-UFYCRDLUSA-N 0.000 description 1
- XGZBEGGGAUQBMB-KJEVXHAQSA-N Tyr-Pro-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC2=CC=C(C=C2)O)N)O XGZBEGGGAUQBMB-KJEVXHAQSA-N 0.000 description 1
- SOAUMCDLIUGXJJ-SRVKXCTJSA-N Tyr-Ser-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O SOAUMCDLIUGXJJ-SRVKXCTJSA-N 0.000 description 1
- QPOUERMDWKKZEG-HJPIBITLSA-N Tyr-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 QPOUERMDWKKZEG-HJPIBITLSA-N 0.000 description 1
- MQGGXGKQSVEQHR-KKUMJFAQSA-N Tyr-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 MQGGXGKQSVEQHR-KKUMJFAQSA-N 0.000 description 1
- LUMQYLVYUIRHHU-YJRXYDGGSA-N Tyr-Ser-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LUMQYLVYUIRHHU-YJRXYDGGSA-N 0.000 description 1
- XUIOBCQESNDTDE-FQPOAREZSA-N Tyr-Thr-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N)O XUIOBCQESNDTDE-FQPOAREZSA-N 0.000 description 1
- NZBSVMQZQMEUHI-WZLNRYEVSA-N Tyr-Thr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N NZBSVMQZQMEUHI-WZLNRYEVSA-N 0.000 description 1
- WQOHKVRQDLNDIL-YJRXYDGGSA-N Tyr-Thr-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O WQOHKVRQDLNDIL-YJRXYDGGSA-N 0.000 description 1
- GPLTZEMVOCZVAV-UFYCRDLUSA-N Tyr-Tyr-Arg Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)C1=CC=C(O)C=C1 GPLTZEMVOCZVAV-UFYCRDLUSA-N 0.000 description 1
- SQUMHUZLJDUROQ-YDHLFZDLSA-N Tyr-Val-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O SQUMHUZLJDUROQ-YDHLFZDLSA-N 0.000 description 1
- PQPWEALFTLKSEB-DZKIICNBSA-N Tyr-Val-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O PQPWEALFTLKSEB-DZKIICNBSA-N 0.000 description 1
- ABSXSJZNRAQDDI-KJEVXHAQSA-N Tyr-Val-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ABSXSJZNRAQDDI-KJEVXHAQSA-N 0.000 description 1
- 108010064997 VPY tripeptide Proteins 0.000 description 1
- RUCNAYOMFXRIKJ-DCAQKATOSA-N Val-Ala-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN RUCNAYOMFXRIKJ-DCAQKATOSA-N 0.000 description 1
- AUMNPAUHKUNHHN-BYULHYEWSA-N Val-Asn-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N AUMNPAUHKUNHHN-BYULHYEWSA-N 0.000 description 1
- LIQJSDDOULTANC-QSFUFRPTSA-N Val-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](C(C)C)N LIQJSDDOULTANC-QSFUFRPTSA-N 0.000 description 1
- PVPAOIGJYHVWBT-KKHAAJSZSA-N Val-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](C(C)C)N)O PVPAOIGJYHVWBT-KKHAAJSZSA-N 0.000 description 1
- HZYOWMGWKKRMBZ-BYULHYEWSA-N Val-Asp-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N HZYOWMGWKKRMBZ-BYULHYEWSA-N 0.000 description 1
- VLOYGOZDPGYWFO-LAEOZQHASA-N Val-Asp-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O VLOYGOZDPGYWFO-LAEOZQHASA-N 0.000 description 1
- PFMAFMPJJSHNDW-ZKWXMUAHSA-N Val-Cys-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)N)C(=O)O)N PFMAFMPJJSHNDW-ZKWXMUAHSA-N 0.000 description 1
- IWZYXFRGWKEKBJ-GVXVVHGQSA-N Val-Gln-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N IWZYXFRGWKEKBJ-GVXVVHGQSA-N 0.000 description 1
- NYTKXWLZSNRILS-IFFSRLJSSA-N Val-Gln-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](C(C)C)N)O NYTKXWLZSNRILS-IFFSRLJSSA-N 0.000 description 1
- CVIXTAITYJQMPE-LAEOZQHASA-N Val-Glu-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CVIXTAITYJQMPE-LAEOZQHASA-N 0.000 description 1
- GBESYURLQOYWLU-LAEOZQHASA-N Val-Glu-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N GBESYURLQOYWLU-LAEOZQHASA-N 0.000 description 1
- ROLGIBMFNMZANA-GVXVVHGQSA-N Val-Glu-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N ROLGIBMFNMZANA-GVXVVHGQSA-N 0.000 description 1
- ZXAGTABZUOMUDO-GVXVVHGQSA-N Val-Glu-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N ZXAGTABZUOMUDO-GVXVVHGQSA-N 0.000 description 1
- OQWNEUXPKHIEJO-NRPADANISA-N Val-Glu-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CO)C(=O)O)N OQWNEUXPKHIEJO-NRPADANISA-N 0.000 description 1
- XWYUBUYQMOUFRQ-IFFSRLJSSA-N Val-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N)O XWYUBUYQMOUFRQ-IFFSRLJSSA-N 0.000 description 1
- YTPLVNUZZOBFFC-SCZZXKLOSA-N Val-Gly-Pro Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N1CCC[C@@H]1C(O)=O YTPLVNUZZOBFFC-SCZZXKLOSA-N 0.000 description 1
- KZKMBGXCNLPYKD-YEPSODPASA-N Val-Gly-Thr Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O KZKMBGXCNLPYKD-YEPSODPASA-N 0.000 description 1
- OVBMCNDKCWAXMZ-NAKRPEOUSA-N Val-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](C(C)C)N OVBMCNDKCWAXMZ-NAKRPEOUSA-N 0.000 description 1
- APQIVBCUIUDSMB-OSUNSFLBSA-N Val-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C(C)C)N APQIVBCUIUDSMB-OSUNSFLBSA-N 0.000 description 1
- AGXGCFSECFQMKB-NHCYSSNCSA-N Val-Leu-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N AGXGCFSECFQMKB-NHCYSSNCSA-N 0.000 description 1
- AEMPCGRFEZTWIF-IHRRRGAJSA-N Val-Leu-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O AEMPCGRFEZTWIF-IHRRRGAJSA-N 0.000 description 1
- XXWBHOWRARMUOC-NHCYSSNCSA-N Val-Lys-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)N)C(=O)O)N XXWBHOWRARMUOC-NHCYSSNCSA-N 0.000 description 1
- KTEZUXISLQTDDQ-NHCYSSNCSA-N Val-Lys-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N KTEZUXISLQTDDQ-NHCYSSNCSA-N 0.000 description 1
- WBAJDGWKRIHOAC-GVXVVHGQSA-N Val-Lys-Gln Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O WBAJDGWKRIHOAC-GVXVVHGQSA-N 0.000 description 1
- ZRSZTKTVPNSUNA-IHRRRGAJSA-N Val-Lys-Leu Chemical compound CC(C)C[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)C(C)C)C(O)=O ZRSZTKTVPNSUNA-IHRRRGAJSA-N 0.000 description 1
- YMTOEGGOCHVGEH-IHRRRGAJSA-N Val-Lys-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O YMTOEGGOCHVGEH-IHRRRGAJSA-N 0.000 description 1
- JVGHIFMSFBZDHH-WPRPVWTQSA-N Val-Met-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)NCC(=O)O)N JVGHIFMSFBZDHH-WPRPVWTQSA-N 0.000 description 1
- KISFXYYRKKNLOP-IHRRRGAJSA-N Val-Phe-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)O)N KISFXYYRKKNLOP-IHRRRGAJSA-N 0.000 description 1
- XBJKAZATRJBDCU-GUBZILKMSA-N Val-Pro-Ala Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O XBJKAZATRJBDCU-GUBZILKMSA-N 0.000 description 1
- GQMNEJMFMCJJTD-NHCYSSNCSA-N Val-Pro-Gln Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O GQMNEJMFMCJJTD-NHCYSSNCSA-N 0.000 description 1
- KSFXWENSJABBFI-ZKWXMUAHSA-N Val-Ser-Asn Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O KSFXWENSJABBFI-ZKWXMUAHSA-N 0.000 description 1
- YLBNZCJFSVJDRJ-KJEVXHAQSA-N Val-Thr-Tyr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](Cc1ccc(O)cc1)C(O)=O YLBNZCJFSVJDRJ-KJEVXHAQSA-N 0.000 description 1
- VTIAEOKFUJJBTC-YDHLFZDLSA-N Val-Tyr-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N VTIAEOKFUJJBTC-YDHLFZDLSA-N 0.000 description 1
- GUIYPEKUEMQBIK-JSGCOSHPSA-N Val-Tyr-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)NCC(O)=O GUIYPEKUEMQBIK-JSGCOSHPSA-N 0.000 description 1
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 1
- 230000000789 acetogenic effect Effects 0.000 description 1
- 230000003213 activating effect Effects 0.000 description 1
- 101150014383 adhE gene Proteins 0.000 description 1
- 108010008685 alanyl-glutamyl-aspartic acid Proteins 0.000 description 1
- 108010024078 alanyl-glycyl-serine Proteins 0.000 description 1
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 1
- 108010041407 alanylaspartic acid Proteins 0.000 description 1
- 108010044940 alanylglutamine Proteins 0.000 description 1
- 108010047495 alanylglycine Proteins 0.000 description 1
- 108010070944 alanylhistidine Proteins 0.000 description 1
- 108010087924 alanylproline Proteins 0.000 description 1
- 108010070783 alanyltyrosine Proteins 0.000 description 1
- 125000005600 alkyl phosphonate group Chemical group 0.000 description 1
- HMFHBZSHGGEWLO-UHFFFAOYSA-N alpha-D-Furanose-Ribose Natural products OCC1OC(O)C(O)C1O HMFHBZSHGGEWLO-UHFFFAOYSA-N 0.000 description 1
- 235000001014 amino acid Nutrition 0.000 description 1
- 229940024606 amino acid Drugs 0.000 description 1
- 229960000723 ampicillin Drugs 0.000 description 1
- AVKUERGKIZMTKX-NJBDSQKTSA-N ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 1
- 230000003698 anagen phase Effects 0.000 description 1
- 238000004458 analytical method Methods 0.000 description 1
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 1
- 108010001271 arginyl-glutamyl-arginine Proteins 0.000 description 1
- 108010009111 arginyl-glycyl-glutamic acid Proteins 0.000 description 1
- 108010059459 arginyl-threonyl-phenylalanine Proteins 0.000 description 1
- 108010068380 arginylarginine Proteins 0.000 description 1
- 235000009582 asparagine Nutrition 0.000 description 1
- 229960001230 asparagine Drugs 0.000 description 1
- 108010077245 asparaginyl-proline Proteins 0.000 description 1
- 150000001510 aspartic acids Chemical class 0.000 description 1
- 108010093581 aspartyl-proline Proteins 0.000 description 1
- 108010047857 aspartylglycine Proteins 0.000 description 1
- 101150063145 atoA gene Proteins 0.000 description 1
- 101150006429 atoB gene Proteins 0.000 description 1
- SGRUZFCHLOFYHZ-MWLCHTKSSA-N azidamfenicol Chemical compound [N-]=[N+]=NCC(=O)N[C@H](CO)[C@H](O)C1=CC=C([N+]([O-])=O)C=C1 SGRUZFCHLOFYHZ-MWLCHTKSSA-N 0.000 description 1
- 229960002278 azidamfenicol Drugs 0.000 description 1
- 229960004099 azithromycin Drugs 0.000 description 1
- MQTOSJVFKKJCRP-BICOPXKESA-N azithromycin Chemical compound O([C@@H]1[C@@H](C)C(=O)O[C@@H]([C@@]([C@H](O)[C@@H](C)N(C)C[C@H](C)C[C@@](C)(O)[C@H](O[C@H]2[C@@H]([C@H](C[C@@H](C)O2)N(C)C)O)[C@H]1C)(C)O)CC)[C@H]1C[C@@](C)(OC)[C@@H](O)[C@H](C)O1 MQTOSJVFKKJCRP-BICOPXKESA-N 0.000 description 1
- 101150000872 bcn gene Proteins 0.000 description 1
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 1
- 108010062635 beta-lactotensin Proteins 0.000 description 1
- 101150091659 bglC gene Proteins 0.000 description 1
- 230000015572 biosynthetic process Effects 0.000 description 1
- 230000002051 biphasic effect Effects 0.000 description 1
- 239000008280 blood Substances 0.000 description 1
- 210000004369 blood Anatomy 0.000 description 1
- 229910052799 carbon Inorganic materials 0.000 description 1
- 101150044165 cas7 gene Proteins 0.000 description 1
- 101150014802 catD gene Proteins 0.000 description 1
- 101150047857 catP gene Proteins 0.000 description 1
- 230000010261 cell growth Effects 0.000 description 1
- 101150083131 chbA gene Proteins 0.000 description 1
- 238000007385 chemical modification Methods 0.000 description 1
- 239000003795 chemical substances by application Substances 0.000 description 1
- 238000004140 cleaning Methods 0.000 description 1
- 150000001875 compounds Chemical class 0.000 description 1
- 230000021615 conjugation Effects 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 230000008878 coupling Effects 0.000 description 1
- 238000010168 coupling process Methods 0.000 description 1
- 238000005859 coupling reaction Methods 0.000 description 1
- 238000012136 culture method Methods 0.000 description 1
- 125000004122 cyclic group Chemical group 0.000 description 1
- 235000018417 cysteine Nutrition 0.000 description 1
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 description 1
- 229940104302 cytosine Drugs 0.000 description 1
- 238000000354 decomposition reaction Methods 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- MXHRCPNRJAMMIM-UHFFFAOYSA-N desoxyuridine Natural products C1C(O)C(CO)OC1N1C(=O)NC(=O)C=C1 MXHRCPNRJAMMIM-UHFFFAOYSA-N 0.000 description 1
- 230000029087 digestion Effects 0.000 description 1
- 239000000539 dimer Substances 0.000 description 1
- 230000002708 enhancing effect Effects 0.000 description 1
- 239000013604 expression vector Substances 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 229960003760 florfenicol Drugs 0.000 description 1
- 238000012224 gene deletion Methods 0.000 description 1
- 238000010362 genome editing Methods 0.000 description 1
- 101150042018 glcG gene Proteins 0.000 description 1
- 235000013922 glutamic acid Nutrition 0.000 description 1
- 239000004220 glutamic acid Substances 0.000 description 1
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 1
- 108010042598 glutamyl-aspartyl-glycine Proteins 0.000 description 1
- 150000004676 glycans Chemical class 0.000 description 1
- HPAIKDPJURGQLN-UHFFFAOYSA-N glycyl-L-histidyl-L-phenylalanine Natural products C=1C=CC=CC=1CC(C(O)=O)NC(=O)C(NC(=O)CN)CC1=CN=CN1 HPAIKDPJURGQLN-UHFFFAOYSA-N 0.000 description 1
- 108010090037 glycyl-alanyl-isoleucine Proteins 0.000 description 1
- 108010000434 glycyl-alanyl-leucine Proteins 0.000 description 1
- 108010019832 glycyl-asparaginyl-glycine Proteins 0.000 description 1
- 108010089804 glycyl-threonine Proteins 0.000 description 1
- 108010048994 glycyl-tyrosyl-alanine Proteins 0.000 description 1
- 108010059898 glycyl-tyrosyl-lysine Proteins 0.000 description 1
- 108010020688 glycylhistidine Proteins 0.000 description 1
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 1
- 101150029013 hydA gene Proteins 0.000 description 1
- 238000001727 in vivo Methods 0.000 description 1
- 238000011534 incubation Methods 0.000 description 1
- 238000009776 industrial production Methods 0.000 description 1
- 229960003786 inosine Drugs 0.000 description 1
- 230000003834 intracellular effect Effects 0.000 description 1
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 1
- 229960000310 isoleucine Drugs 0.000 description 1
- 101150043267 lacR gene Proteins 0.000 description 1
- 229940039696 lactobacillus Drugs 0.000 description 1
- 229940115932 legionella pneumophila Drugs 0.000 description 1
- 231100001231 less toxic Toxicity 0.000 description 1
- 108010083708 leucyl-aspartyl-valine Proteins 0.000 description 1
- 108010044311 leucyl-glycyl-glycine Proteins 0.000 description 1
- 108010051673 leucyl-glycyl-phenylalanine Proteins 0.000 description 1
- 108010047926 leucyl-lysyl-tyrosine Proteins 0.000 description 1
- 108010030617 leucyl-phenylalanyl-valine Proteins 0.000 description 1
- 108010087810 leucyl-seryl-glutamyl-leucine Proteins 0.000 description 1
- 108010091871 leucylmethionine Proteins 0.000 description 1
- 108010012058 leucyltyrosine Proteins 0.000 description 1
- 230000000670 limiting effect Effects 0.000 description 1
- 108010012988 lysyl-glutamyl-aspartyl-glycine Proteins 0.000 description 1
- 108010076718 lysyl-glutamyl-tryptophan Proteins 0.000 description 1
- 108010043322 lysyl-tryptophyl-alpha-lysine Proteins 0.000 description 1
- 108010045397 lysyl-tyrosyl-lysine Proteins 0.000 description 1
- 239000000463 material Substances 0.000 description 1
- 229930182817 methionine Natural products 0.000 description 1
- 108010090114 methionyl-tyrosyl-lysine Proteins 0.000 description 1
- 238000000520 microinjection Methods 0.000 description 1
- 108091005573 modified proteins Proteins 0.000 description 1
- 102000035118 modified proteins Human genes 0.000 description 1
- 108091027963 non-coding RNA Proteins 0.000 description 1
- 102000042567 non-coding RNA Human genes 0.000 description 1
- 238000011275 oncology therapy Methods 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 230000036961 partial effect Effects 0.000 description 1
- 229940051027 pasteurella multocida Drugs 0.000 description 1
- 230000001717 pathogenic effect Effects 0.000 description 1
- 230000007918 pathogenicity Effects 0.000 description 1
- 108010091617 pentalysine Proteins 0.000 description 1
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 description 1
- 108010064486 phenylalanyl-leucyl-valine Proteins 0.000 description 1
- 108010083476 phenylalanyltryptophan Proteins 0.000 description 1
- 102000020233 phosphotransferase Human genes 0.000 description 1
- 229920001282 polysaccharide Polymers 0.000 description 1
- 239000005017 polysaccharide Substances 0.000 description 1
- 239000013641 positive control Substances 0.000 description 1
- 230000003389 potentiating effect Effects 0.000 description 1
- 125000000075 primary alcohol group Chemical group 0.000 description 1
- 150000003138 primary alcohols Chemical class 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 108010031719 prolyl-serine Proteins 0.000 description 1
- 108020001580 protein domains Proteins 0.000 description 1
- 101150020896 ptk gene Proteins 0.000 description 1
- 108010042660 rRNA (adenosine-O-2'-)methyltransferase Proteins 0.000 description 1
- 230000008929 regeneration Effects 0.000 description 1
- 238000011069 regeneration method Methods 0.000 description 1
- 230000001105 regulatory effect Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 230000000717 retained effect Effects 0.000 description 1
- 239000006152 selective media Substances 0.000 description 1
- 108010048818 seryl-histidine Proteins 0.000 description 1
- 108010048397 seryl-lysyl-leucine Proteins 0.000 description 1
- 239000013605 shuttle vector Substances 0.000 description 1
- 101150067544 sigF gene Proteins 0.000 description 1
- 101150015060 sigG gene Proteins 0.000 description 1
- 101150077142 sigH gene Proteins 0.000 description 1
- 101150025376 sigK gene Proteins 0.000 description 1
- 239000000243 solution Substances 0.000 description 1
- 238000003756 stirring Methods 0.000 description 1
- 239000000758 substrate Substances 0.000 description 1
- 101150015970 tetM gene Proteins 0.000 description 1
- RYYWUUFWQRZTIU-UHFFFAOYSA-K thiophosphate Chemical compound [O-]P([O-])([O-])=S RYYWUUFWQRZTIU-UHFFFAOYSA-K 0.000 description 1
- 108010033670 threonyl-aspartyl-tyrosine Proteins 0.000 description 1
- 108010031491 threonyl-lysyl-glutamic acid Proteins 0.000 description 1
- 231100000331 toxic Toxicity 0.000 description 1
- 230000002588 toxic effect Effects 0.000 description 1
- 230000037426 transcriptional repression Effects 0.000 description 1
- 230000002463 transducing effect Effects 0.000 description 1
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 1
- 108010014563 tryptophyl-cysteinyl-serine Proteins 0.000 description 1
- 108010038745 tryptophylglycine Proteins 0.000 description 1
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 1
- 108010012567 tyrosyl-glycyl-glycyl-phenylalanyl Proteins 0.000 description 1
- 108010003137 tyrosyltyrosine Proteins 0.000 description 1
- 241001515965 unidentified phage Species 0.000 description 1
- 239000004474 valine Substances 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N1/00—Microorganisms, e.g. protozoa; Compositions thereof; Processes of propagating, maintaining or preserving microorganisms or compositions thereof; Processes of preparing or isolating a composition containing a microorganism; Culture media therefor
- C12N1/20—Bacteria; Culture media therefor
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N1/00—Microorganisms, e.g. protozoa; Compositions thereof; Processes of propagating, maintaining or preserving microorganisms or compositions thereof; Processes of preparing or isolating a composition containing a microorganism; Culture media therefor
- C12N1/20—Bacteria; Culture media therefor
- C12N1/205—Bacterial isolates
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/102—Mutagenizing nucleic acids
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/111—General methods applicable to biologically active non-coding nucleic acids
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/113—Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/74—Vectors or expression systems specially adapted for prokaryotic hosts other than E. coli, e.g. Lactobacillus, Micromonospora
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/74—Vectors or expression systems specially adapted for prokaryotic hosts other than E. coli, e.g. Lactobacillus, Micromonospora
- C12N15/75—Vectors or expression systems specially adapted for prokaryotic hosts other than E. coli, e.g. Lactobacillus, Micromonospora for Bacillus
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/87—Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
- C12N15/90—Stable introduction of foreign DNA into chromosome
- C12N15/902—Stable introduction of foreign DNA into chromosome using homologous recombination
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P7/00—Preparation of oxygen-containing organic compounds
- C12P7/02—Preparation of oxygen-containing organic compounds containing a hydroxy group
- C12P7/04—Preparation of oxygen-containing organic compounds containing a hydroxy group acyclic
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P7/00—Preparation of oxygen-containing organic compounds
- C12P7/02—Preparation of oxygen-containing organic compounds containing a hydroxy group
- C12P7/04—Preparation of oxygen-containing organic compounds containing a hydroxy group acyclic
- C12P7/16—Butanols
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P7/00—Preparation of oxygen-containing organic compounds
- C12P7/24—Preparation of oxygen-containing organic compounds containing a carbonyl group
- C12P7/26—Ketones
- C12P7/28—Acetone-containing products
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/20—Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2830/00—Vector systems having a special element relevant for transcription
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2830/00—Vector systems having a special element relevant for transcription
- C12N2830/001—Vector systems having a special element relevant for transcription controllable enhancer/promoter combination
- C12N2830/002—Vector systems having a special element relevant for transcription controllable enhancer/promoter combination inducible enhancer/promoter combination, e.g. hypoxia, iron, transcription factor
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12R—INDEXING SCHEME ASSOCIATED WITH SUBCLASSES C12C - C12Q, RELATING TO MICROORGANISMS
- C12R2001/00—Microorganisms ; Processes using microorganisms
- C12R2001/01—Bacteria or Actinomycetales ; using bacteria or Actinomycetales
- C12R2001/145—Clostridium
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02E—REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
- Y02E50/00—Technologies for the production of fuel of non-fossil origin
- Y02E50/10—Biofuels, e.g. bio-diesel
Abstract
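The present invention relates to a genetic tool comprising at least two distinct nucleic acids, optimized to facilitate the transformation and modification by homologous recombination of bacteria of the genus Clostridium, typically solventogenic bacteria.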
Description
The present invention relates to a genetic tool comprising at least two distinct nucleic acids, optimized to facilitate the transformation and modification by homologous recombination of a bacterium of the genus Clostridium, typically a solventogenic bacterium of the genus Clostridium.
Technical background
Bacteria belonging to the genus Clostridium, of the phylum Firmicutes, are Gram-positive, strictly anaerobic, endospore-forming bacteria. The genus contains many species that are studied because of their pathogenicity or their industrial and medical interest. For example, Clostridium tetani, Clostridium botulinum, Clostridium perfringens and Clostridium difficile are the agents responsible for tetanus, botulism, gas gangrene and pseudomembranous colitis, respectively. Clostridium novyi and Clostridium sporogenes have been used in studies aimed at developing cancer therapies. Meanwhile, Clostridium acetobutylicum, Clostridium butyricum and Clostridium beijerinckii, which are not pathogenic to humans, are used in fermentation.
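Clostridium species of so-called industrial interest can produce compounds of interest, such as acids and solvents, from a wide variety of sugars and substrates ranging from glucose to cellulose. The growth of solvent-producing Clostridium bacteria ("solventogenic bacteria") is described as biphasic. Acids (acetic and butyric acid) are produced during the exponential growth phase. Then, when cell growth stops, the bacteria enter the stationary phase and produce solvents.
Most solventogenic Clostridium strains produce acetone, butanol and ethanol as end products. Such strains are called "ABE strains". This is the case, for example, of C. acetobutylicum DSM 792 (also known as ATCC 824 or LMG 5710) and C. beijerinckii NCIMB 8052. Other strains can also reduce all or part of the acetone to isopropanol and are called "IBE strains". This is the case, for example, of C. beijerinckii DSM 6423 (also known as NRRL B-593, LMG 7814, LMG 7815), which carries in its genome an adh gene encoding a primary/secondary alcohol dehydrogenase that reduces acetone to isopropanol.
Although they have been used in industry for more than a century, knowledge of bacteria belonging to the genus Clostridium has been limited by the difficulties encountered in modifying them genetically. Various genetic tools have been developed in recent years to optimize strains of this genus, the latest generation being based on CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)/Cas (CRISPR-associated protein) technology. This method relies on an enzyme called a nuclease (typically a Cas nuclease such as the Cas9 protein of Streptococcus pyogenes) which, guided by an RNA molecule, makes a double-stranded cut in a DNA molecule (the target sequence of interest). The sequence of the guide RNA (gRNA) determines the nuclease cleavage site, which makes the cut highly specific (Figure 1).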
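To illustrate how a gRNA sequence pins down the cleavage site, the sketch below scans one DNA strand for candidate Cas9 target sites. It assumes the canonical 5'-NGG-3' PAM of S. pyogenes Cas9 and a 20-nucleotide protospacer, neither of which is stated in the text above; the function name `find_cas9_targets` is hypothetical and only the given strand is scanned.

```python
import re


def find_cas9_targets(dna: str, spacer_len: int = 20) -> list[tuple[int, str, str]]:
    """Return (start, protospacer, pam) for every protospacer followed by NGG.

    Assumption: S. pyogenes Cas9 with an NGG PAM located immediately 3' of the
    protospacer. Only the strand passed in is scanned.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]
```

A real design workflow would also scan the reverse strand and screen candidates for off-target matches elsewhere in the genome.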
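Because a double-stranded cut in an essential DNA molecule is lethal to an organism, its survival depends on its ability to repair the cut (see, for example, Cui & Bikard, 2016). In bacteria of the genus Clostridium, repair of a double-stranded cut relies on a homologous recombination mechanism that requires an intact copy of the cleaved sequence. By providing the bacterium with a DNA fragment that allows this repair while modifying the original sequence, it is possible to force the microorganism to integrate the desired changes into its genome. The modification made must no longer allow the Cas9-gRNA ribonucleoprotein complex to target the genomic DNA, through modification of either the target sequence or the PAM site (Figure 2).
Various attempts to produce such a genetic tool that is functional in Clostridium bacteria have been described. These microorganisms are known to be difficult to modify genetically because of their low transformation and homologous recombination frequencies. Several approaches are based on the use of Cas9, expressed constitutively in C. beijerinckii and C. ljungdahlii (Wang et al., 2015; Huang et al., 2016), or under the control of an inducible promoter in C. beijerinckii, C. saccharoperbutylacetonicum and C. autoethanogenum (Wang et al., 2016; Nagaraju et al., 2016; Wang et al., 2017). Other authors have described the use of Cas9n, a modified version of the nuclease that makes a single-stranded rather than a double-stranded cut in the genome (Xu et al., 2015; Li et al., 2016). This choice stems from the observation that the toxicity of Cas9 is too high for it to be used in Clostridium bacteria under the experimental conditions tested. All the tools described above rely on a single plasmid. Finally, the endogenous CRISPR/Cas system can be used when one has been identified in the genome of the microorganism, as for example in C. pasteurianum (Pyne et al., 2016).
Unless they use the endogenous machinery of the strain to be modified (as in the last case above), tools based on CRISPR technology have the major drawback of significantly limiting the size of the nucleic acid of interest (and hence the number of coding sequences or genes) that can be inserted into the bacterial genome (about 1.8 kb at most according to Xu et al., 2015).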
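The inventors recently developed and described a more efficient genetic tool for modifying bacteria, adapted to Clostridium bacteria and based on the use of two distinct nucleic acids, typically two plasmids (WO2017064439, Wasels et al., 2017 and Figure 3), which solves this problem. In a particular embodiment, the first nucleic acid of this tool allows the expression of cas9, and the second nucleic acid, specific to the modification to be made, contains one or more gRNA expression cassettes and a repair template allowing the portion of bacterial DNA targeted by Cas9 to be replaced by a sequence of interest. The toxicity of the system is limited by placing cas9 and/or the gRNA expression cassette(s) under the control of inducible promoters.
The inventors also very recently succeeded in genetically modifying a bacterium that carries, in the wild-type state, a gene conferring resistance to one or more antibiotics, so as to make the bacterium sensitive to said antibiotic(s), which facilitates the use of their genetic tool based on at least two nucleic acids. They were thus able to genetically modify strain C. beijerinckii DSM 6423, a natural producer of isopropanol. In particular, they succeeded in removing from this strain a natural plasmid that is not essential to it, identified in the present text as "pNF2" (see FR18/73492). The inventors have also discovered, and show herein, that removing plasmid pNF2 yields, in the context of the present invention, a C. beijerinckii DSM 6423 whose efficiency of uptake of genetic material (i.e. transformation efficiency) is increased by a factor of about 10¹ to 5 × 10³.
In the present text, the inventors describe an improved genetic tool for modifying bacteria of the genus Clostridium that very significantly increases their transformation efficiency and thereby makes it possible to obtain the desired mutant (genetically modified) bacteria in useful numbers and quantities, in particular in the context of selecting robust strains for industrial-scale production. As explained below, the inventors succeeded in improving the genetic tool according to the invention, in particular by using part of plasmid pNF2 to design specific nucleic acids carrying a sequence for modifying the genetic material of a bacterium and/or for expressing in the bacterium a DNA sequence absent from the genetic material present in the wild-type version of said bacterium. These nucleic acids and new tools markedly improve the transformation efficiency of bacteria, in particular of bacteria that have first been cured of the natural plasmid(s) they contain in the wild-type state.
The present invention thus very advantageously facilitates the transformation and the exploitation of such bacteria, in particular on an industrial scale.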
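Summary of the invention
In the context of the present invention, and for the first time, the inventors describe a genetic tool enabling optimized transformation and genetic modification by homologous recombination of a bacterium of the genus Clostridium, and/or expression in said bacterium of a DNA sequence partially or totally absent from its genetic material in the wild-type state. This tool is typically characterized in that i) it comprises at least:
- a "first" nucleic acid encoding at least one DNA endonuclease, for example the Cas9 enzyme, in which the sequence encoding the DNA endonuclease is under the control of a promoter, and
- at least a "second" nucleic acid containing a repair template allowing, by a homologous recombination mechanism, the replacement of a portion of the bacterial DNA targeted by the endonuclease with a sequence of interest,
ii) at least one of said nucleic acids further encodes one or more guide RNAs (gRNAs), or the genetic tool further comprises one or more guide RNAs, each guide RNA comprising a DNA-endonuclease-binding RNA structure and a sequence complementary to the targeted portion of the bacterial DNA, and
iii) at least one of said nucleic acids further comprises a sequence encoding an anti-CRISPR protein placed under the control of an inducible promoter, or the genetic tool further comprises a "third" nucleic acid encoding an anti-CRISPR protein placed under the control of an inducible promoter.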
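In this improved tool, at least one nucleic acid comprises a sequence encoding an anti-CRISPR protein ("acr") placed under the control of an inducible promoter. This anti-CRISPR protein represses the activity of the DNA endonuclease/guide RNA complex. Expression of the protein is regulated so that it is expressed only during the bacterial transformation step.
Compared with prior-art tools, this tool has the advantage of markedly facilitating the transformation of Clostridium bacteria and hence of yielding the desired genetically modified bacteria in numbers and quantities that are useful on an industrial scale.
The inventors also describe, in the context of the present invention and for the first time, a nucleic acid (also identified in the present text as an "OPT" nucleic acid) that facilitates the transformation of a bacterium by improving the maintenance within it of all the genetic material introduced. The OPT nucleic acid comprises i) all or part of the sequence SEQ ID NO: 126 and ii) a sequence enabling modification of the genetic material of a bacterium and/or expression in said bacterium of a DNA sequence partially or totally absent from the genetic material present in the wild-type version of said bacterium. The sequence SEQ ID NO: 126 is also identified in the present text as the "OREP" nucleic acid.
In particular, the inventors succeeded in increasing the transformation frequency of nucleic acids in the bacterium C. beijerinckii DSM 6423 by deleting the OREP sequence from said bacterium and, advantageously, by using a genetic tool that contains all or part of this OREP sequence and enables modification of the genetic material of the bacterium and/or expression in said bacterium of a DNA sequence partially or totally absent from the genetic material present in its wild-type version.
The OREP sequence comprises a nucleotide sequence (SEQ ID NO: 127) encoding a protein involved in replication of the OPT nucleic acid of interest. This protein involved in replication is also identified in the present text as the "REP" protein (SEQ ID NO: 128 - "MNNNNTESEELKEQSQLLLDKCTKKKKKNPKFSSYIEPLVSKKLSERIKECGDFLQMLSDLNLENSKLHRASFCGNRFCPMCSWRIACKDSLEISILMEHLRKEESKEFIFLTLTTPNVKGADLDNSIKAYNKAFKKLMERKEVKSIVKGYIRKLEVTYNLDKSSKSYNTYHPHFHVVLAVNRSYFKKQNLYINHHRWLSLWQESTGDYSITQVDVRKAKINDYKEVYELAKYSAKDSDYLINREVFTVFYKSLKGKQVLVFSGLFKDAHKMYKNGELDLYKKLDTIEYAYMVSYNWLKKKYDTSNIRELTEEEKQKFNKNLIEDVDIE"). The REP protein has a domain conserved in Firmicutes, called "COG 5655" (plasmid rolling-circle replication initiator protein REP), of sequence SEQ ID NO: 129.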
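In particular, the present invention thus describes a genetic tool comprising at least:
- a "first" nucleic acid encoding at least one DNA endonuclease, in which the sequence encoding the DNA endonuclease is under the control of a promoter, and
- an "other" nucleic acid comprising or consisting of an "OPT nucleic acid sequence", i.e. a sequence comprising or consisting of i) all or part of the sequence SEQ ID NO: 126 (the OREP sequence) and ii) a sequence enabling expression in said bacterium of a DNA sequence partially or totally absent from the genetic material present in the wild-type version of said bacterium.
In a particular embodiment, the "second nucleic acid containing a repair template" described above comprises this "other nucleic acid".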
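Also described are a method for transforming, and typically genetically modifying, for example by homologous recombination, a bacterium of the genus Clostridium, typically a solventogenic bacterium of the genus Clostridium, as well as the bacterium or bacteria (transformed and typically genetically modified) obtained by such a method. The method comprises the following steps:
a) introducing a genetic tool according to the invention into the bacterium in the presence of an inducer of expression of the anti-CRISPR protein, and
b) culturing the transformed bacterium obtained at the end of step a) on a medium that does not contain the inducer of expression of the anti-CRISPR protein, typically allowing expression of the DNA endonuclease/gRNA ribonucleoprotein complex.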
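The logic of steps a) and b) can be sketched as a small boolean model. This is a deliberately simplified illustration, not a kinetic model: it assumes that the anti-CRISPR protein is present whenever its inducer is in the medium and that it fully blocks the nuclease. The names `Culture`, `nuclease_can_cut`, `STEP_A` and `STEP_B` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Culture:
    acr_inducer: bool         # inducer of the anti-CRISPR promoter is in the medium
    nuclease_expressed: bool  # DNA endonuclease and gRNA are being expressed


def nuclease_can_cut(culture: Culture) -> bool:
    """The endonuclease/gRNA complex cuts only if expressed and not blocked."""
    # Simplifying assumption: inducer present means anti-CRISPR protein present.
    anti_crispr_present = culture.acr_inducer
    return culture.nuclease_expressed and not anti_crispr_present


# Step a): transformation in the presence of the anti-CRISPR inducer.
# Even if the nuclease is already expressed, it is held in check.
STEP_A = Culture(acr_inducer=True, nuclease_expressed=True)

# Step b): culture without the inducer; the nuclease can now act.
STEP_B = Culture(acr_inducer=False, nuclease_expressed=True)
```

In this model the cut, and therefore the selection pressure for repaired genomes, appears only in step b), which is the point of keeping the nuclease silent during transformation.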
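An example of such a process advantageously comprises a step of transforming the bacterium by introducing into it all or part of a genetic tool as described in the present text, in particular a nucleic acid (an "OPT nucleic acid") comprising or consisting of i) all or part of the sequence SEQ ID NO: 126 and ii) a sequence enabling modification of the genetic material of a bacterium and/or expression in said bacterium of a DNA sequence partially or totally absent from the genetic material present in the wild-type version of said bacterium.
The inventors also describe a kit for transforming, and preferably genetically modifying, a bacterium of the genus Clostridium, or for producing at least one solvent, for example a mixture of solvents, using a bacterium of the genus Clostridium. The kit comprises a nucleic acid as described in the present text or the components of a genetic tool as described in the present text, and at least one inducer adapted to the inducible promoter selected to drive expression of the anti-CRISPR protein used in the tool. In a particular embodiment, the kit comprises all or some of the components of a genetic tool as described in the present text.
Also described is the use of a nucleic acid or genetic tool disclosed for the first time in the present text to transform, and optionally genetically modify, a bacterium of the genus Clostridium, for example one that in the wild-type state has both a bacterial chromosome and at least one DNA molecule distinct from the chromosomal DNA (typically a natural plasmid).
Also described is the use of a nucleic acid or genetic tool disclosed for the first time in the present text, of a process for transforming and preferably genetically modifying a bacterium of the genus Clostridium, typically by homologous recombination, of a bacterium obtained by said process, and/or of a kit, to enable the production, preferably on an industrial scale, of a solvent or mixture of solvents, preferably acetone, butanol, ethanol, isopropanol or a mixture thereof, typically an isopropanol/butanol, butanol/ethanol or isopropanol/ethanol mixture.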
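Detailed description of the invention
Although they have been used in industry for more than a century, knowledge of solventogenic bacteria, in particular those belonging to the genus Clostridium, has been limited by the difficulties encountered in modifying them genetically. For example, bacteria of the genus Clostridium that naturally produce isopropanol, which typically carry in their genome an adh gene encoding a primary/secondary alcohol dehydrogenase that reduces acetone to isopropanol, differ both genetically and functionally from bacteria capable of ABE fermentation in their natural state.
The genetic tool described in the present text has the advantage of considerably facilitating the transformation of bacteria of the genus Clostridium with a sequence of interest in order to improve their properties.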
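This tool is typically characterized in that:
i) it comprises at least:
- a "first" nucleic acid encoding at least one DNA endonuclease, for example the Cas9 enzyme, in which the sequence encoding the DNA endonuclease is under the control of a promoter, and
- at least a "second" nucleic acid containing a repair template allowing, by a homologous recombination mechanism, the replacement of a portion of the bacterial DNA targeted by the endonuclease with a sequence of interest,
ii) at least one of said nucleic acids further encodes one or more guide RNAs (gRNAs), or the genetic tool further comprises one or more guide RNAs, each guide RNA comprising a DNA-endonuclease-binding RNA structure and a sequence complementary to the targeted portion of the bacterial DNA, and
iii) at least one of said nucleic acids further comprises a sequence encoding an anti-CRISPR protein placed under the control of an inducible promoter, or the genetic tool further comprises a "third" nucleic acid encoding an anti-CRISPR protein placed under the control of an inducible promoter.
This tool allows large fragments of nucleic acid sequence to be inserted.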
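The tool described by the inventors can be used to transform and/or genetically modify a bacterium of interest, typically a bacterium as described in the present text belonging to the genus Clostridium, preferably a bacterium of the genus Clostridium that can naturally produce isopropanol (i.e. in the wild-type state), in particular one naturally capable of IBE fermentation, and preferably a bacterium that is naturally resistant to one or more antibiotics, such as C. beijerinckii. A preferred bacterium has, in the wild-type state, both a bacterial chromosome and at least one DNA molecule distinct from the chromosomal DNA.
The expression "bacterium of the genus Clostridium" refers in particular to Clostridium species of so-called industrial interest, typically solventogenic or acetogenic bacteria of the genus Clostridium. It covers wild-type bacteria as well as strains derived from them that have been genetically modified to improve their performance (for example by overexpressing the ctfA, ctfB and adc genes) without having been exposed to the CRISPR system.
The expression "Clostridium species of industrial interest" refers to Clostridium species capable of producing, by fermentation, solvents and acids such as butyric acid or acetic acid from sugars or saccharides, typically pentoses such as xylose or arabinose, hexoses such as glucose, mannose or fructose, polysaccharides such as cellulose or hemicellulose, and/or any other carbon source that bacteria of the genus Clostridium can assimilate and use (for example CO, CO2 and methanol). Examples of solventogenic bacteria of interest are bacteria of the genus Clostridium that produce acetone, ethanol and/or isopropanol, such as the strains identified in the literature as "ABE strains" [strains that ferment to produce acetone, butanol and ethanol] and "IBE strains" [strains that ferment to produce isopropanol (by reduction of acetone), butanol and ethanol]. Solventogenic bacteria of the genus Clostridium can be selected, for example, from C. acetobutylicum, C. cellulolyticum, C. phytofermentans, C. beijerinckii, C. saccharobutylicum, C. saccharoperbutylacetonicum, C. sporogenes, C. butyricum, C. aurantibutyricum and C. tyrobutyricum, preferably from C. acetobutylicum, C. beijerinckii, C. butyricum, C. tyrobutyricum and C. cellulolyticum, and even more preferably from C. acetobutylicum and C. beijerinckii.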
Bacteria capable of producing isopropanol in the wild-type state, in particular those capable of IBE fermentation in the wild-type state, can be selected, for example, from C. beijerinckii, C. diolis, C. puniceum, C. butyricum, C. saccharoperbutylacetonicum, C. botulinum, C. drakei, C. scatologenes, C. perfringens and C. tunisiense, preferably from C. beijerinckii, C. diolis, C. puniceum and C. saccharoperbutylacetonicum. A particularly preferred bacterium that can naturally produce isopropanol, and in particular is capable of IBE fermentation in the wild-type state, is C. beijerinckii.
Acetogenic bacteria of interest are bacteria that produce acids and/or solvents from CO2 and H2. Acetogenic bacteria of the genus Clostridium can be selected, for example, from C. aceticum, C. thermoaceticum, C. ljungdahlii, C. autoethanogenum, C. difficile, C. scatologenes and C. carboxidivorans.
In a particular embodiment, the bacterium of the genus Clostridium concerned is an "ABE strain", preferably strain C. acetobutylicum DSM 792 (also known as ATCC 824 or LMG 5710) or strain C. beijerinckii NCIMB 8052.
In another particular embodiment, the bacterium of the genus Clostridium concerned is an "IBE strain", preferably a subclade of C. beijerinckii selected from DSM 6423, LMG 7814, LMG 7815, NRRL B-593 and NCCB 27006, or C. aurantibutyricum DSZM 793 (Georges et al., 1983), or a subclade of such a C. beijerinckii or C. aurantibutyricum bacterium having at least 90%, 95%, 96%, 97%, 98% or 99% identity with strain DSM 6423. A particularly preferred C. beijerinckii bacterium, or subclade of C. beijerinckii, lacks plasmid pNF2.
The respective genomes of subclades LMG 7814, LMG 7815, NRRL B-593 and NCCB 27006 on the one hand, and of DSZM 793 on the other, share at least 97% sequence identity with the genome of subclade DSM 6423.
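The inventors carried out fermentation tests confirming that C. beijerinckii bacteria of subclades DSM 6423, LMG 7815 and NCCB 27006 can produce isopropanol in the wild-type state (see Table 1).
Table 1: Summary of glucose fermentation tests with the natural isopropanol-producing strains C. beijerinckii DSM 6423, LMG 7815 and NCCB 27006.
In a particularly preferred embodiment of the invention, the C. beijerinckii bacterium is a bacterium of subclade DSM 6423. In yet another preferred embodiment of the invention, the C. beijerinckii bacterium is strain C. beijerinckii IFP963 ΔcatB ΔpNF2 (registered on 20 February 2019 in the BCCM-LMG collection under accession number LMG P-31277).
The CRISPR/DNA endonuclease system contains two distinct essential components: i) an endonuclease, in this case a nuclease associated with the CRISPR system (Cas, or "CRISPR-associated protein"), typically Cas9, and ii) a guide RNA. The guide RNA is a chimeric RNA consisting of the combination of a bacterial CRISPR RNA (crRNA) and a tracrRNA (trans-activating CRISPR RNA) (Jinek et al., Science 2012). The gRNA combines, in a single transcript, the targeting specificity of the crRNA, which corresponds to the "spacer sequence" that guides the Cas protein, with the structural properties of the tracrRNA. When the gRNA and the Cas protein are expressed simultaneously in the cell, the target genomic sequence can be permanently modified using the repair template provided.
In recent experiments, the inventors succeeded in transforming and genetically modifying a bacterium of the genus Clostridium that naturally produces isopropanol, C. beijerinckii DSM 6423, as well as the reference strain C. acetobutylicum DSM 792.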
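Some of the work described in the experimental section was carried out in a strain capable of IBE fermentation, namely C. beijerinckii DSM 6423, whose genome and transcriptomic analysis were recently described by the inventors (Mate de Gerando et al., 2018).
While assembling the genome of this strain, the inventors discovered, in addition to the chromosome, the presence of mobile genetic elements (accession number PRJEB11626 - https://www.ebi.ac.uk/ena/data/view/PRJEB11626): two natural plasmids (pNF1 and pNF2) and a linear bacteriophage (Φ6423).
Strain C. beijerinckii DSM 6423 is naturally sensitive to erythromycin but resistant to thiamphenicol. Patent application FR18/73492 describes strain C. beijerinckii DSM 6423 ΔcatB, a particular strain that has been made sensitive to thiamphenicol. In a particular embodiment of the invention, the inventors succeeded in removing the natural plasmid pNF2 from strain C. beijerinckii DSM 6423 and obtained strain C. beijerinckii DSM 6423 ΔcatB ΔpNF2. This strain was registered on 20 February 2019 in the BCCM-LMG collection under accession number LMG P-31277. The description also relates to any derived bacterium, clone, mutant or genetically modified version thereof. More generally, it relates to any bacterium which, in the wild-type state, has both a bacterial chromosome and at least one DNA molecule distinct from the chromosomal DNA (identified in the present text as "non-chromosomal (bacterial) DNA" or "natural (bacterial) plasmid"), and which has been genetically modified, typically using a nucleic acid and/or genetic tool described in the present text, so that it no longer contains at least one of its non-chromosomal DNA molecules, typically several of them (for example two, three or four), and preferably all of them.
The inventors observed that removal of the natural plasmid pNF2 significantly favours the introduction and maintenance of additional natural or synthetic genetic elements (for example expression cassette(s) or plasmid expression vector(s)). Strain DSM 6423 ΔcatB ΔpNF2 can thus be transformed with an efficiency 10 to 5 × 10³ times higher than its wild-type counterpart or strain DSM 6423 ΔcatB.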
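The inventors thus describe a bacterium of the genus Clostridium that can naturally produce isopropanol (i.e. in the wild-type state), in particular one naturally capable of IBE fermentation, which has been genetically modified and which, as a result of this genetic modification, has in particular lost at least one natural plasmid (i.e. a plasmid naturally present in the wild-type version of said bacterium), preferably all of its natural plasmids, as well as the tools, in particular the genetic tools, used to obtain it.
These tools have the advantage of markedly facilitating the transformation and genetic modification of bacteria. The experiments carried out by the inventors demonstrated that the tools, and more generally the technology described in the present text, can be used to genetically modify bacteria of the genus Clostridium, in particular those capable of producing isopropanol in the wild-type state and notably of carrying out IBE fermentation, and in particular those carrying a gene encoding an enzyme responsible for resistance to an antibiotic, notably a gene encoding an amphenicol-O-acetyltransferase, for example a chloramphenicol-O-acetyltransferase or a thiamphenicol-O-acetyltransferase.
In a particular embodiment, the inventors succeeded in making sensitive to an antibiotic of the amphenicol class a bacterium that naturally carries (i.e. carries in the wild-type state) a gene encoding an enzyme responsible for resistance to antibiotics of that class.
Another preferred bacterium contains, in the wild-type state, a bacterial chromosome and at least one DNA molecule distinct from the chromosomal DNA.
Also preferred is a bacterium that contains, in the wild-type state, a bacterial chromosome and at least one DNA molecule distinct from the chromosomal DNA, as well as a gene conferring resistance to an antibiotic. In a particular embodiment, this gene encodes an amphenicol-O-acetyltransferase, for example a chloramphenicol-O-acetyltransferase or a thiamphenicol-O-acetyltransferase.
A particular bacterium intended to be transformed, and preferably genetically modified, is preferably a bacterium that has undergone a first transformation step and a first genetic modification step using a nucleic acid or genetic tool according to the invention, making it possible to remove at least one extrachromosomal DNA molecule (typically at least one plasmid) naturally present in the bacterium in the wild-type state.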
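One object described by the inventors is a nucleic acid (identified in the present text as an "OPT" nucleic acid) that can advantageously be used to facilitate the transformation of a bacterium by improving the maintenance within it of all the genetic material introduced. This OPT nucleic acid comprises i) all or part of the sequence SEQ ID NO: 126 (the "OREP" sequence), or a functional variant thereof, and ii) a sequence (also identified in the present text as the "sequence of interest") enabling modification of the genetic material of a bacterium and/or expression in said bacterium of a DNA sequence partially or totally absent from the genetic material present in the wild-type version of said bacterium.
The OREP sequence (SEQ ID NO: 126) comprises the nucleotide sequence SEQ ID NO: 127. The sequence SEQ ID NO: 127 preferably comprises a sequence encoding a protein involved in replication of the OPT nucleic acid. The protein considered to be involved in replication is also identified in the present text as the "REP" protein (SEQ ID NO: 128). The REP protein has a domain conserved in Firmicutes, called "COG 5655", of sequence SEQ ID NO: 129.
In a particular embodiment, the OPT nucleic acid comprises part of the OREP sequence (SEQ ID NO: 126), typically one or more fragments of the OREP sequence, preferably at least the sequence encoding the REP protein (SEQ ID NO: 128) or a functional variant or fragment thereof (i.e. one involved in replication), typically a variant or fragment encoding the fragment of the REP protein that is involved in replication of the OPT nucleic acid. A functional fragment of the OREP sequence encoding the fragment of the REP protein involved in replication of the OPT nucleic acid comprises the domain of sequence SEQ ID NO: 129. Examples of such nucleic acids encoding a functional fragment of the REP protein, and variants thereof, can readily be prepared by the person skilled in the art. A typical example of a variant has 70% to 100%, preferably 85% to 99%, more preferably 95% to 99%, for example any whole-number percentage from 70 to 100%, sequence homology with the sequence SEQ ID NO: 127.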
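As a rough illustration of the percentage thresholds quoted above, the sketch below computes percent identity between two sequences that have already been aligned to the same length. Identity is used here as a simple stand-in for the "sequence homology" of the text; the alignment step itself is out of scope, and the function names and the toy sequences in the usage note are hypothetical.

```python
def percent_identity(ref: str, variant: str) -> float:
    """Percentage of identical positions between two pre-aligned sequences."""
    if not ref or len(ref) != len(variant):
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(a == b for a, b in zip(ref.upper(), variant.upper()))
    return 100.0 * matches / len(ref)


def meets_threshold(ref: str, variant: str, minimum: float = 70.0) -> bool:
    """True if the variant reaches the chosen minimum identity (70% by default)."""
    return percent_identity(ref, variant) >= minimum
```

For instance, `percent_identity("ATGCATGCAT", "ATGCATGCAA")` compares ten positions of which nine match. In practice the comparison would be run on SEQ ID NO: 127 and a candidate variant after alignment with a dedicated tool.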
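In a preferred embodiment, the functional variant or fragment of the OREP sequence encodes a protein involved in replication of the OPT nucleic acid.
In a preferred embodiment of the invention, the functional variant or fragment of the OREP sequence comprises, in addition to the sequence encoding a protein involved in replication of the OPT nucleic acid (for example a plasmid-like genetic construct), such as the REP protein or a functional variant or fragment thereof, a site of 1 to 150 bases, preferably 1 to 15 bases, for example a sequence rich in A and T bases (Rajewska et al.), preferably a site present in the pNF2 plasmid of sequence SEQ ID NO: 118, that allows binding of the protein enabling replication of the OPT nucleic acid.
A sequence of interest enabling modification of the genetic material of a bacterium is typically a modification template that allows part of the bacterial genetic material to be replaced by a sequence of interest, for example by a homologous recombination mechanism. A sequence of interest enabling modification of the genetic material of a bacterium may also be a sequence that recognizes (binds at least partially), and preferably targets, i.e. recognizes and enables cleavage in the genome of the bacterium of interest of, i) a target sequence, ii) a sequence controlling transcription of a target sequence, or iii) a sequence flanking a target sequence.
A sequence of interest enabling expression in the bacterium of a DNA sequence partially or totally absent from the genetic material present in the wild-type version of said bacterium typically allows the bacterium to express one or more proteins that it cannot express, or cannot express in sufficient quantity, in the wild-type state.
According to a particular aspect, the "OPT nucleic acid" also comprises iii) a sequence encoding a DNA endonuclease, for example Cas9, and/or iv) one or more guide RNAs (gRNAs), each gRNA comprising a DNA-endonuclease-binding RNA structure and a sequence complementary to the targeted portion of the bacterial genetic material.
According to another particular aspect, the "OPT nucleic acid" has no methylation at the motifs recognized by the Dam and Dcm methyltransferases.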
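To make the Dam/Dcm remark concrete, the sketch below lists where these motifs occur in a DNA sequence. The recognition sites used, GATC for Dam and CCWGG (CCAGG or CCTGG) for Dcm, are those of the E. coli enzymes and are not given in the text above; the function name `methylation_motifs` is hypothetical. The sketch only locates motifs and says nothing about the actual methylation state of a plasmid preparation, which depends on the host strain used to propagate it.

```python
import re

# Assumed E. coli recognition sites (not stated in the source text).
DAM_SITE = re.compile(r"(?=GATC)")
DCM_SITE = re.compile(r"(?=CC[AT]GG)")


def methylation_motifs(dna: str) -> dict[str, list[int]]:
    """Return 0-based start positions of Dam and Dcm motifs on the given strand."""
    dna = dna.upper()
    return {
        "dam": [m.start() for m in DAM_SITE.finditer(dna)],
        "dcm": [m.start() for m in DCM_SITE.finditer(dna)],
    }
```

Both motifs are palindromic, so scanning a single strand is sufficient to find every site on a double-stranded molecule.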
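Preferably, the "OPT nucleic acid" is selected from an expression cassette and a vector, and is preferably a plasmid, for example a plasmid having a sequence selected from SEQ ID NO: 119, SEQ ID NO: 123, SEQ ID NO: 124 and SEQ ID NO: 125.
Thus, in particular, a genetic tool is described that comprises at least:
- a first nucleic acid encoding at least one DNA endonuclease, in which the sequence encoding the DNA endonuclease is under the control of a promoter, and
- another nucleic acid (or "nth nucleic acid") comprising or consisting of an "OPT" nucleic acid sequence, i.e. a sequence comprising i) all or part of the sequence SEQ ID NO: 126 ("OREP") and ii) a sequence enabling modification of the genetic material of a bacterium and/or expression in said bacterium of a DNA sequence partially or totally absent from the genetic material present in the wild-type version of said bacterium. Preferably, at least one of said nucleic acids of this particular genetic tool further comprises a sequence encoding an anti-CRISPR protein placed under the control of an inducible promoter, or the particular genetic tool further comprises a third nucleic acid encoding an anti-CRISPR protein placed under the control of an inducible promoter.
In a particular embodiment, the "second" or "nth" nucleic acid containing a repair template, as described above, comprises or consists of this "other nucleic acid".
In another particular embodiment, the "first nucleic acid" also encodes one or more guide RNAs (gRNAs).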
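Within the meaning of the invention, the term "nucleic acid" means any natural, synthetic, semi-synthetic or recombinant DNA or RNA molecule, optionally chemically modified (i.e. comprising non-natural bases or modified nucleotides having, for example, a modified linkage, a modified base and/or a modified sugar), or optimized so that the codons of the transcripts synthesized from the coding sequences are the codons most frequently found in a bacterium of the genus Clostridium, with a view to its use therein. In the case of the genus Clostridium, the optimized codons are typically codons rich in adenine ("A") and thymine ("T") bases.
The genetic tool according to the invention comprises a first nucleic acid encoding at least one endonuclease, typically a Cas nuclease, for example Cas9 or MAD7. "Cas9" refers to the Cas9 protein (also called CRISPR-associated protein 9, Csn1 or Csx12) or to a functional protein, peptide or polypeptide fragment thereof, i.e. one capable of interacting with the guide RNA(s) and of exerting the enzymatic (nuclease) activity that allows it to make a double-stranded cut in the DNA of the target genome. "Cas9" can thus refer to a modified protein, for example one truncated to remove protein domains that are not essential to the intended functions of the protein, in particular domains not essential for interaction with the gRNA(s).
The MAD7 nuclease (whose amino acid sequence corresponds to SEQ ID NO: 72), also identified as "Cas12" or "Cpf1", can likewise be used advantageously in the context of the present invention by combining it with one or more gRNAs known to the person skilled in the art to be capable of binding such a nuclease (see Garcia-Doval et al., 2017 and Stella S. et al., 2017).
According to a particular aspect, the sequence encoding the MAD7 nuclease is a sequence optimized for ready expression in Clostridium strains, preferably the sequence SEQ ID NO: 71.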
The sequence encoding Cas9 (the whole protein or a fragment thereof) as used in the context of the invention can be obtained from any known Cas9 protein (Makarova et al., 2011). Examples of Cas9 proteins that can be used in the present invention include, but are not limited to, the Cas9 proteins of S. pyogenes, Streptococcus thermophilus, Streptococcus mutans, Campylobacter jejuni, Pasteurella multocida, Francisella novicida, Neisseria meningitidis, Neisseria lactamica and Legionella pneumophila (see Fonfara et al., 2013; Makarova et al., 2015).
In a particular embodiment, the Cas9 protein, or functional protein, peptide or polypeptide fragment thereof, encoded by one of the nucleic acids of the genetic tool according to the invention comprises or consists of the amino acid sequence SEQ ID NO: 75, or any other amino acid sequence having at least 50%, preferably at least 60%, identity with it and containing at least the two aspartic acids ("D") occupying positions 10 ("D10") and 840 ("D840") of the amino acid sequence SEQ ID NO: 75.
In a preferred embodiment, Cas9 comprises or consists of the Cas9 protein (NCBI accession number: WP_010922251.1, SEQ ID NO: 75) encoded by the cas9 gene of strain S. pyogenes M1 GAS (NCBI accession number: NC_002737.2 SPy_1046, SEQ ID NO: 76), or a version thereof that has undergone optimization (an "optimized version") yielding a transcript that contains the codons preferentially used by bacteria of the genus Clostridium, typically codons rich in adenine ("A") and thymine ("T") bases, and thereby facilitates expression of the Cas9 protein in this bacterial genus. These optimized codons reflect the codon usage bias, well known to the person skilled in the art, that is specific to each bacterial strain.
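One crude way to see whether a coding sequence leans towards A/T-rich codons is to measure the A+T fraction overall and at the third codon position, where synonymous changes are concentrated. The sketch below is only an indicator under that assumption: it is not a codon optimization procedure and does not use a Clostridium codon usage table. The function names are hypothetical.

```python
def at_fraction(seq: str) -> float:
    """Fraction of A and T bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "AT" for base in seq) / len(seq)


def third_position_at_fraction(cds: str) -> float:
    """A+T fraction at the third position of each codon of a coding sequence."""
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a non-zero multiple of 3")
    # Every third base, starting from index 2, is a codon's third position.
    return at_fraction(cds[2::3])
```

For the three alanine codons GCA, GCT and GCC taken together, two of the three third positions are A or T, so a version of a gene recoded towards GCA/GCT would score higher than one using GCC or GCG.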
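In the peptide sequences described in this document, amino acids are represented by their single-letter code according to the following nomenclature: C: cysteine; D: aspartic acid; E: glutamic acid; F: phenylalanine; G: glycine; H: histidine; I: isoleucine; K: lysine; L: leucine; M: methionine; N: asparagine; P: proline; Q: glutamine; R: arginine; S: serine; T: threonine; V: valine; W: tryptophan and Y: tyrosine.
According to a particular embodiment, the Cas9 domain consists of a whole Cas9 protein, preferably the S. pyogenes Cas9 protein or an optimized version thereof.
The sequence encoding the DNA endonuclease, for example Cas9, present in one of the nucleic acids of the genetic tool according to the invention is placed under the control of a promoter. This promoter may be constitutive or inducible. In a preferred embodiment, the promoter controlling Cas9 expression is an inducible promoter.
Examples of constitutive promoters that can be used in the context of the present invention can be selected from the promoter of the thl gene, the ptb gene, the adc gene or the BCS operon, or a derivative thereof, preferably a functional but shorter (truncated) derivative such as the "miniPthl" derivative of the thl gene promoter from C. acetobutylicum (Dong et al., 2012), or any other promoter well known to the person skilled in the art that allows expression of a protein in a bacterium of the genus Clostridium.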
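Examples of inducible promoters that can be used in the context of the present invention can be selected, for example, from: a promoter whose expression is controlled by the transcriptional repressor TetR, for example the promoter of the tetA gene (a tetracycline resistance gene originally present on the E. coli transposon Tn10); a promoter whose expression is controlled by L-arabinose, for example the ptk gene promoter (Zhang et al., 2015), preferably in combination with the araR regulator expression cassette of C. acetobutylicum so as to build an ARAi system (Zhang et al., 2015); a promoter whose expression is controlled by laminaribiose (a β-1,3 glucose dimer), for example the celC gene promoter, preferably immediately followed by the repressor gene glyR3 and the gene of interest (Mearls et al., 2015), or the celC gene promoter (Newcomb et al., 2011); a promoter whose expression is controlled by lactose, for example the bgaL gene promoter (Banerjee et al., 2014); a promoter whose expression is controlled by xylose, for example the xylB gene promoter (Nariya et al., 2011); and a promoter whose expression is controlled by UV exposure, for example the bcn gene promoter (Dupuy et al., 2005).
A promoter derived from one of the promoters described above, preferably a functional but shorter (truncated) derivative, can also be used in the context of the invention.
Other inducible promoters that can be used in the context of the present invention are described, for example, in the articles by Ransom et al. (2015), Currie et al. (2013) and Hartman et al. (2011).
A preferred inducible promoter is an anhydrotetracycline (aTc)-inducible promoter derived from tetA, selected from Pcm-2tetO1 and Pcm-2tetO2/1 (Dong et al., 2012); aTc is less toxic than tetracycline and can relieve repression by the transcriptional repressor TetR at a lower concentration.
Another preferred inducible promoter is a xylose-inducible promoter, for example the xylB promoter from Clostridium difficile 630 (Nariya et al., 2011).
Yet another preferred inducible promoter is a lactose-inducible promoter, for example the promoter of the bgaL gene (Banerjee et al., 2014).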
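A particular nucleic acid of interest, typically an expression cassette or vector, comprises one or more expression cassettes, each cassette encoding a gRNA (guide RNA).
Within the meaning of the invention, the term "guide RNA" or "gRNA" refers to an RNA molecule capable of interacting with a DNA endonuclease such as Cas9 in order to guide it to a target region of the bacterial chromosome. The specificity of the cut is determined by the gRNA. As explained above, each gRNA comprises two regions:
- a first region (commonly called the "SDS" region), at the 5' end of the gRNA, which is complementary to the target chromosomal region and mimics the crRNA of the endogenous CRISPR system, and
- a second region (commonly called the "handle" region), at the 3' end of the gRNA, which mimics the base-pairing interactions between the tracrRNA (trans-activating crRNA) and the crRNA of the endogenous CRISPR system and has a double-stranded stem-loop structure ending in the 3' direction with an essentially single-stranded sequence. This second region is essential for binding of the gRNA to the DNA endonuclease.
The first region of the gRNA (the "SDS" region) varies according to the targeted chromosomal sequence.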
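The "SDS" region of the gRNA, which is complementary to the target chromosomal region, comprises at least 1 nucleotide, preferably at least 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35 or 40 nucleotides, typically between 1 and 40 nucleotides. Preferably, this region is 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 nucleotides long.
The second region of the gRNA (the "handle" region) has a stem-loop (or hairpin) structure. The handle regions of different gRNAs do not depend on the chromosomal target chosen.
According to a particular embodiment, the "handle" region comprises or consists of a sequence of at least 1 nucleotide, preferably at least 1, 50, 100, 200, 500 or 1000 nucleotides, typically between 1 and 1000 nucleotides. Preferably, this region is 40 to 120 nucleotides long.
The total length of a gRNA is generally 50 to 1000 nucleotides, preferably 80 to 200 nucleotides, and more particularly preferably 90 to 120 nucleotides. According to a particular embodiment, a gRNA as used in the present invention is between 95 and 110 nucleotides long, for example about 100 or about 110 nucleotides.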
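The preferred length ranges given above can be checked mechanically. The sketch below tests a candidate gRNA, split into its SDS and handle regions, against the preferred ranges quoted in the text (SDS 20 to 30 nt, handle 40 to 120 nt, total 90 to 120 nt). The function name `check_grna_lengths` and the dictionary layout are hypothetical, and the sketch checks lengths only, not sequence or secondary structure.

```python
# Preferred ranges quoted in the description, in nucleotides (inclusive).
PREFERRED_RANGES = {
    "sds": (20, 30),
    "handle": (40, 120),
    "total": (90, 120),
}


def check_grna_lengths(sds: str, handle: str) -> dict[str, bool]:
    """Report, for each region, whether its length falls in the preferred range."""
    lengths = {
        "sds": len(sds),
        "handle": len(handle),
        "total": len(sds) + len(handle),
    }
    return {
        name: low <= lengths[name] <= high
        for name, (low, high) in PREFERRED_RANGES.items()
    }
```

A gRNA with a 20-nt SDS and an 80-nt handle passes all three checks; shortening the SDS to 10 nt fails only the SDS check, since the total is still 90 nt.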
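The skilled person can readily define the sequence and structure of the gRNAs according to the chromosomal region to be targeted, using well-known techniques (see, for example, DiCarlo et al., 2013).
The targeted DNA region/portion/sequence within the bacterial chromosome may correspond to a portion of non-coding DNA or a portion of coding DNA.
In a particular embodiment that consists in modifying a given sequence, the targeted portion of the bacterial DNA is essential for bacterial survival. It corresponds, for example, to any region of the bacterial chromosome, or to any region located on non-chromosomal DNA, for example on a mobile genetic element, that is essential for survival of the microorganism under particular growth conditions, for example a plasmid carrying an antibiotic resistance marker when the intended growth conditions require the bacterium to be grown in the presence of said antibiotic.
In another particular embodiment, aimed at removing a genetic element that is not essential under the particular growth conditions used to culture the microorganism, the targeted portion of the bacterial DNA may correspond to any region of said non-chromosomal bacterial DNA, for example of said mobile genetic element.
Particular examples of DNA portions targeted in a bacterium of the genus Clostridium are the sequences used in Example 1 of the experimental section. These are, for example, the sequences encoding the bdhA (SEQ ID NO: 77) and bdhB (SEQ ID NO: 78) genes. The targeted DNA region/portion/sequence is followed by a protospacer adjacent motif ("PAM") sequence, which is involved in Cas9 binding.
The "SDS" region of a given gRNA is identical (100%) or at least 80% identical, preferably 85%, 90%, 95%, 96%, 97%, 98% or 99% identical, to the targeted DNA region/portion/sequence within the bacterial chromosome, and is capable of hybridizing with all or part of the complementary sequence of said region/portion/sequence, typically with a sequence comprising at least 1 nucleotide, preferably at least 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35 or 40 nucleotides, typically between 1 and 40 nucleotides, preferably a sequence comprising 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 nucleotides.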
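The relationship between the SDS region, the targeted DNA and its complementary strand can be sketched as follows. The code compares an SDS written in the RNA alphabet with a target written in the DNA alphabet (U is treated as T), reports percent identity over equal-length sequences, and returns the complementary DNA strand that the SDS would pair with. All function names are hypothetical, and identity is only a proxy for hybridization, which in reality also depends on mismatch position and conditions.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def sds_identity(sds_rna: str, target_dna: str) -> float:
    """Percent identity between an SDS (RNA) and the targeted DNA sequence."""
    sds = sds_rna.upper().replace("U", "T")
    target = target_dna.upper()
    if not sds or len(sds) != len(target):
        raise ValueError("SDS and target must have the same non-zero length")
    matches = sum(a == b for a, b in zip(sds, target))
    return 100.0 * matches / len(sds)


def is_acceptable_sds(sds_rna: str, target_dna: str, minimum: float = 80.0) -> bool:
    """Apply the 'at least 80% identical' criterion quoted in the text."""
    return sds_identity(sds_rna, target_dna) >= minimum
```

The strand that actually base-pairs with the SDS is `reverse_complement(target_dna)`; a 20-nt SDS with four mismatches sits exactly at the 80% limit.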
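In the process according to the invention, one or more gRNAs targeting a sequence (the "target sequence", "targeted sequence" or "recognized sequence") can be used simultaneously. These different gRNAs can target chromosomal regions, or regions of non-chromosomal bacterial DNA (for example mobile genetic elements) that may be present in the microorganism, and these regions may be identical or different.
The gRNAs can be introduced into the bacterial cell as gRNA molecules (mature or precursor), as precursors, or as one or more nucleic acids encoding said gRNAs. They are preferably introduced into the bacterial cell as one or more nucleic acids encoding said gRNAs.
When the gRNA(s) are introduced directly into the cell as RNA molecules, these gRNAs (mature or precursor) may contain modified nucleotides or chemical modifications that, for example, increase their resistance to nucleases and hence their lifetime in the cell. In particular, they may comprise at least one modified or non-natural nucleotide, such as a nucleotide carrying a modified base, for example inosine, methyl-5-deoxycytidine, dimethylamino-5-deoxyuridine, deoxyuridine, diamino-2,6-purine, bromo-5-deoxyuridine or any other modified base that allows hybridization. The gRNAs used according to the invention may also be modified at the level of the internucleotide linkage, for example phosphorothioates, H-phosphonates or alkyl phosphonates, or at the level of the backbone, for example alpha-oligonucleotides, 2'-O-alkyl riboses or peptide nucleic acids (PNAs) (Egholm et al., 1992).
The gRNAs may be natural RNAs, synthetic RNAs, or RNAs produced by recombinant techniques. They can be prepared by any method known to the person skilled in the art, such as chemical synthesis, in vivo transcription or modification techniques.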
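When the gRNAs are introduced into the bacterial cell as one or more nucleic acids, the sequence(s) encoding the gRNA(s) are placed under the control of an expression promoter. This promoter may be constitutive or inducible.
When several gRNAs are used, the expression of each gRNA may be controlled by a different promoter. Preferably, the promoter used is the same for all the gRNAs. In a particular embodiment, the same promoter can be used to drive expression of several of the gRNAs, for example only a few of them, in other words all or some of the gRNAs.
In a preferred embodiment, the promoter(s) controlling gRNA expression are inducible promoters.
Examples of constitutive promoters that can be used in the context of the present invention can be selected from the promoter of the thl gene, the ptb gene or the BCS operon, or a derivative thereof, preferably miniPthl, or any other promoter well known to the person skilled in the art that allows synthesis of an RNA (coding or non-coding) in Clostridium.
Examples of inducible promoters that can be used in the context of the present invention can be selected from the promoter of the tetA gene, the xylA gene, the lacI gene or the bgaL gene, or a derivative thereof, preferably 2tetO1 or tetO2/1. A preferred inducible promoter is 2tetO1.
The promoters controlling expression of the DNA endonuclease, for example Cas9, and of the gRNA(s) may be identical or different, and constitutive or inducible. In a particular and preferred embodiment of the invention, the promoters controlling expression of the DNA endonuclease and of the gRNA(s), respectively, are different promoters that are nevertheless inducible by the same inducer.
Inducible promoters as described above make it possible to advantageously control the action of the DNA endonuclease/gRNA ribonucleoprotein complex and facilitate the selection of transformants that have undergone the desired genetic modifications.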
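The genetic tool according to the invention advantageously comprises a sequence encoding at least one anti-CRISPR protein (also identified in the present text as an "anti-CRISPR/DNA endonuclease protein" or "anti-CRISPR/Cas9 protein"), i.e. a protein capable of inhibiting or preventing/neutralizing the action of Cas, and/or a protein capable of inhibiting or preventing/neutralizing the action of a CRISPR/Cas system, for example a type II CRISPR/Cas system when the nuclease is a Cas9 nuclease. This sequence is typically placed under the control of an inducible promoter that differs from the promoters controlling expression of the DNA endonuclease and/or of the gRNA(s) and is inducible by a different inducer. In a preferred embodiment, the sequence encoding the anti-CRISPR protein is also typically located on one of the at least two nucleic acids present in the genetic tool. In a particular embodiment, the sequence encoding the anti-CRISPR protein is located on a nucleic acid distinct from the first two (typically a "third nucleic acid"). In yet another particular embodiment, both the sequence encoding the anti-CRISPR protein and the sequence encoding the transcriptional repressor of said anti-CRISPR protein are integrated into the bacterial chromosome.
In a preferred embodiment, the sequence encoding the anti-CRISPR protein is located, within the genetic tool, on the nucleic acid encoding the DNA endonuclease (also identified in the present text as the "first nucleic acid"). In another embodiment, it is located, within the genetic tool, on a nucleic acid other than the one encoding the DNA endonuclease, for example on the nucleic acid identified in the present text as the "second nucleic acid" or on an "nth" (typically "third") nucleic acid optionally included in the genetic tool.
The anti-CRISPR protein is typically an "anti-Cas9" protein, i.e. a protein capable of inhibiting or preventing/neutralizing the action of Cas9, and/or a protein capable of inhibiting or preventing/neutralizing the action of a type II CRISPR/Cas9 system.
The anti-CRISPR protein is advantageously an "anti-Cas9" protein or an "anti-MAD7" protein, i.e. a protein capable of inhibiting or preventing/neutralizing the action of Cas9 or MAD7.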
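The anti-CRISPR protein is advantageously an "anti-Cas9" protein, selected for example from AcrIIA1, AcrIIA2, AcrIIA3, AcrIIA4, AcrIIA5, AcrIIC1, AcrIIC2 and AcrIIC3 (Pawluk et al., 2018). Preferably, the "anti-Cas9" protein is AcrIIA2 or AcrIIA4; more preferably, it is AcrIIA4. Such a protein can typically limit very significantly, and ideally prevent, the action of Cas9, for example by binding to the Cas9 enzyme (Dong et al., 2017; Rauch et al., 2017).
Another anti-CRISPR protein that can advantageously be used is an "anti-MAD7" protein, for example AcrVA1 (Marino et al., 2018).
In a preferred embodiment, the anti-CRISPR protein is capable of inhibiting, and preferably neutralizing, the action of the DNA endonuclease, preferably during the phase in which the nucleic acid sequences of the genetic tool are introduced into the bacterial strain of interest.
The promoter controlling expression of the sequence encoding the anti-CRISPR protein is preferably an inducible promoter. The inducible promoter is typically associated with a constitutively expressed gene responsible for expression of a protein that represses transcription from said inducible promoter. Such a promoter can be selected, for example, from the promoter of the tetA gene, the xylA gene, the lacI gene or the bgaL gene, or a derivative thereof.
An example of an inducible promoter that can be used in the context of the invention is the Pbgal promoter (lactose-inducible), present within the genetic tool on the same nucleic acid as the constitutively expressed bgaR gene, whose expression product represses transcription from Pbgal. In the presence of the inducer, lactose, transcriptional repression of the Pbgal promoter is lifted, allowing transcription of the gene placed downstream of it. In the context of the invention, the downstream gene advantageously corresponds to a gene encoding an anti-CRISPR protein, for example acrIIA4.
The promoter controlling expression of the anti-CRISPR protein makes it possible to advantageously control the action of the DNA endonuclease, for example the Cas9 enzyme, and thereby to facilitate the transformation of bacteria of the genus Clostridium and the production of transformants that have undergone the desired genetic modifications.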
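Within the meaning of the invention, the term "nucleic acid" means any natural, synthetic, semi-synthetic or recombinant DNA or RNA molecule, optionally chemically modified (i.e. comprising non-natural bases or modified nucleotides having, for example, a modified linkage, a modified base and/or a modified sugar), or optimized so that the codons of the transcripts synthesized from the coding sequences are the codons most frequently found in a bacterium of the genus Clostridium, with a view to its use therein. As mentioned above, in the case of the genus Clostridium, the optimized codons are typically codons rich in adenine ("A") and thymine ("T") bases.
Each nucleic acid present in the genetic tool according to the invention, typically the "first" nucleic acid and the "second" or "nth" nucleic acid, consists of a distinct entity and corresponds, for example, to i) an expression cassette (or "construct"), such as a nucleic acid comprising at least one transcriptional promoter operably linked to one or more (coding) sequences of interest, typically to an operon comprising several coding sequences of interest whose expression products contribute to performing a function of interest in the bacterium, or such as a nucleic acid further comprising an activating sequence and/or a transcription terminator; or ii) a vector, circular or linear, single- or double-stranded, for example a plasmid, a phage, a cosmid, or an artificial or synthetic chromosome, comprising one or more expression cassettes as defined above. Preferably, the vector is a plasmid.
The nucleic acids of interest, typically the cassettes and expression vectors, can be constructed by conventional techniques well known to the person skilled in the art and may comprise one or more promoters, bacterial origins of replication (ORI sequences), termination sequences, selection genes, for example antibiotic resistance genes, and sequences allowing targeted insertion of the cassette or vector ("flanking regions"). These cassettes and expression vectors can also be integrated into the genome using techniques well known to the person skilled in the art.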
목적한 ORI 서열은 pIP404, pAMβ1, repH (씨. 아세토부틸리쿰의 복제 오리진), ColE1 또는 rep(이. 콜라이내 복제 오리진), 또는 벡터, 전형적으로 플라스미드가 클로스트리디움 세포내에서 유지되도록 하는 임의의 다른 복제 오리진으로부터 선택될 수 있다.
본 발명의 맥락에서, 바람직한 ORI 서열은 플라스미드 pNF2(서열 번호: 118)의 OREP 서열(서열 번호: 126)내에 존재하는 것이다.
목적한 종결 서열은 adc 또는 thl 유전자, bcs 오페론, 또는 클로스트리디움 내에서 전사가 정지되도록 하는 것으로, 기술자에게 잘 알려진 임의의 다른 터미네이터의 것으로부터 선택될 수 있따.
목적한 선택 유전자(내성 유전자)는 ermB, catP, bla, tetA, tetM, 및/또는 암피실린, 에리쓰로마이신, 클로람페니콜, 티암페니콜, 스펙티노마이신, 테트라사이클린 또는 당해 분야의 기술자에게 잘 공지된 클로스트리디움 속의 박테리아를 선택하는데 사용될 수 있는 임의의 다른 항생제에 대한 내성을 위한 임의의 다른 유전자로부터 선택될 수 있다.
특수한 벡터는 하나 이상의 발현 카세트를 포함하며, 각각의 카세트는 gRNA를 암호화한다.
특수한 구현예에서, 본 발명은 청구범위에서 확인된 바와 같은 "제1" 핵산으로서 이의 서열이 서열 번호: 23의 것인 플라스미드를 포함하는 유전 툴에 관한 것이다.
특수한 구현예에서, 본 발명은 "제2" 또는 "n번째" 핵산으로서 이의 서열이 서열 번호: 79, 서열 번호: 80, 서열 번호: 119, 서열 번호: 123, 서열 번호: 124 및 서열 번호: 125 중 하나로부터 선택된 플라스미드 벡터를 포함하는 유전 툴에 관한 것이다.
여전히 다른 특수한 구현예에서, 본 발명은 "OPT 핵산"으로서 이의 서열이 서열 번호: 119, 서열 번호: 123, 서열 번호: 124 및 서열 번호: 125의 서열 중 하나로부터 선택된 플라스미드 벡터를 포함하는 유전 툴에 관한 것이다. 다른 특수한 구현예에서, 유전 툴은 서열 번호: 23, 79, 80, 119, 123, 124 및 125 중에서 수개(예를 들면, 적어도 2개 또는 3개)의 서열을 포함하며, 상기 서열은 서로 상이하다.
목적한 서열은 선택된 복구 주형(CRISPR 기술에 따름)에 의해 안내된 동종 재조합 메카니즘을 통해 박테리아 게놈내로 도입된다. 목적한 서열은 박테리아 게놈 내에서 표적화된 부위를 대체한다. 따라서, 재조합 공정은 박테리아의 게놈내 표적화된 부위의 전체적이거나 부분적인 변형 또는 결실을 허용하거나 박테리아의 게놈 내로 핵산 단편(특수한 구현예에서 큰 단편)의 삽입을 허용한다. 선택된 복구 주형은 박테리아 게놈의 표적화된 서열 중 모두 또는 일부 또는 목적한 형질전환의 특성에 따른 이의 다소 변형된 버젼을 포함할 수 있다. DNA의 표적화된 부위와 유사하게, 주형 자체는 따라서 천연 및/또는 합성, 암호화 및/또는 비-암호화 서열에 상응하는 하나 이상의 핵산 서열 또는 핵산 서열 부위를 포함할 수 있다. 주형은 또한 하나 이상의 "외부" 서열, 즉 클로스트리디움 속에 속하는 박테리아의 게놈으로부터 또는 상기 속의 특수한 종의 게놈으로부터 천연적으로 부재된 서열을 포함할 수 있다. 주형은 또한 상술한 바와 같은 서열의 조합을 포함할 수 있다.
본 발명에 따른 유전 툴은 복구 주형이 클로스트리디움 속의 박테리아의 박테리아 게놈내로 목적한 핵산, 전형적으로 적어도 1개의 염기쌍(bp), 바람직하게는 적어도 1, 2, 3, 4, 5, 10, 15, 20, 50, 100, 1,000, 10,000, 100,000 또는 1,000,000 bp, 전형적으로 1 bp 내지 20 kb 또는 1 bp 내지 10 kb, 바람직하게는 10 bp 내지 10 kb 또는 1 kb 내지 10 kb, 예를 들면, 1 bp 내지 5 kb, 2 kb 내지 5 kb, 또는 2.5 또는 3 kb 내지 5 kb를 포함하는 DNA 서열 또는 서열 부위의 혼입을 안내하도록 한다.
본 발명자는 박테리아 내에서 상기 박테리아의 야생형 버젼에 존재하는 유전 물질로부터 부분적으로 또는 완전히 부재된 DNA 서열의 발현을 가능하게 하는, 목적한 핵산, 전형적으로 목적한 DNA 서열의 예를 기술한다.
특수한 구현예에서, 목적한 DNA 서열의 발현은 클로스트리디움 속의 박테리아가 수개의 상이한 당, 예를 들면, 적어도 2개의 상이한 당, 전형적으로 6-탄당(예를 들면, 글루코즈, 만노즈 또는 프럭토즈) 및/또는 5-탄당(예를 들면, 크실로즈 또는 아라비노즈) 중에서 적어도 2개의 상이한 당, 바람직하게는 예를 들면, 글루코즈, 크실로즈 및 만노즈로부터 선택된 적어도 3개의 상이한 당; 및 글루코즈, 크실로즈 및 아라비노즈를 발효하도록 한다.
다른 특수한 구현예에서, 목적한 DNA 서열은 적어도 하나의 목적한 생성물, 바람직하게는 클로스트리디움 속의 박테리아에 의한 용매 생산을 촉진하는 생성물, 전형적으로 적어도 하나의 목적한 단백질, 예를 들면, 효소; 트랜스포터와 같은 막 단백질; 다른 단백질(차페론 단백질)의 성숙 단백질; 전사 인자; 또는 이의 조합을 암호화한다.
바람직한 구현예에서, 목적한 DNA 서열은 i) 효소, 예를 들면, 알데하이드의 알코올로의 전환에 포함된 효소를 암호화하는 서열, 예를 들면, 알코올 데하이드로게나제를 암호화하는 서열(예를 들면, adh, adhE, adhE1, adhE2, bdhA, bdhB 및 bdhC로부터 선택된 서열), 트랜스퍼라제를 암호화하는 서열(예를 들면, ctfA, ctfB, atoA 및 atoB로부터 선택된 서열), 데카복실라제를 암호화하는 서열(예를 들면, adc), 하이드로게나제를 암호화하는 서열(예를 들면, etfA, etfB 및 hydA로부터 선택된 서열), 및 이의 조합을 암호화하는 서열, ii) 막 단백질, 예를 들면, 포스포트랜스퍼라제(예를 들면, glcG, bglC, cbe4532, cbe4533, cbe4982, cbe4983, cbe0751로부터 선택된 서열)을 암호화하는 서열, iii) 전사 인자(예를 들면, sigE, sigF, sigG, sigH, sigK로부터 선택된 서열)를 암호화하는 서열 및 iv) 이의 조합으로부터 선택된다.
본 발명자는 또한 박테리아의 게놈 내에서, i) 표적 서열, ii) 표적 서열의 전사를 제어하는 서열, 또는 iii) 표적 서열을 플랭킹(flanking)하는 서열의 적어도 하나의 가닥을 인식(적어도 부분적으로 결합)하고 바람직하게는 이를 표적화하는, 즉, 이를 인식하여 절단시키는 목적한 핵산의 예를 기술한다.
인식된 서열은 또한 본 내용에서 "표적 서열" 또는 "표적화된 서열"로서 확인된다.
이러한 목적한 핵산을 포함하거나 이로 이루어진 유전 툴이 또한 기술되어 있다. 이러한 경우에, 목적한 핵산은 본 내용에 기술된 바와 같은 유전 툴의 "제2" 또는 "n번째" 핵산 내에 존재한다.
목적한 핵산은 본 설명의 맥락에서 박테리아의 게놈으로부터 인식된 서열을 제거하거나 이의 발현을 변형시키기 위해, 예를 들면, 이의 발현을 조정/조절하기 위해, 특히 이를 억제하기 위해, 바람직하게는 이를 변형시켜 상기 박테리아가 상기 서열로부터, 단백질, 특히 기능성 단백질을 발현할 수 없도록 하기 위해 사용된다.
표적 서열이 목적한 박테리아가 내성을 부여하는 항생제를 포함하는 배양 배지에서 성장하도록 하는 효소를 암호화하는 서열, 이러한 서열의 전사를 제어하는 서열 또는 이러한 서열을 플랭킹하는 서열인 경우, 항생제는 전형적으로 암페니콜의 부류에 속하는 항생제이다. 본 명세서의 맥락에서 목적한 암페니콜의 예는 클로람페니콜, 티암페니콜, 아지담페니콜 및 플로르페니콜(Schwarz S. et al., 2004), 특히 클로람페니콜 및 티암페니콜이다.
특수한 구현예에서, 목적한 핵산은 박테리아 게놈 내에서 표적화된 DNA 영역/부위/서열에 대해 100% 동일하거나 적어도 80% 동일한, 바람직하게는 85%, 90%, 95%, 96%, 97%, 98% 또는 99% 동일한 표적 서열에 대해 상보성이고 상기 영역/부위/서열의 상보성 서열 중 모두 또는 일부, 전형적으로 적어도 1개의 뉴클레오타이드, 바람직하게는 적어도 1, 2, 3, 4, 5, 10, 14, 15, 20, 25, 30, 35 또는 40개의 뉴클레오타이드, 전형적으로 1, 10 또는 20 및 1000개의 뉴클레오타이드, 예를 들면 1, 10 또는 20 내지 900, 800, 700, 600, 500, 400, 300 또는 200개의 뉴클레오타이드, 1, 10 또는 20 내지 100개의 뉴클레오타이드, 1, 10 또는 20 내지 50개의 뉴클레오타이드, 또는 1, 10 또는 20 내지 40개의 뉴클레오타이드, 예를 들면, 10 내지 40개의 뉴클레오타이드, 10 내지 30개의 뉴클레오타이드, 10 내지 20개의 뉴클레오타이드, 20 내지 30개의 뉴클레오타이드, 15 내지 40개의 뉴클레오타이드, 15 내지 30개의 뉴클레오타이드 또는 15 내지 20개의 뉴클레오타이드를 포함하는 서열, 바람직하게는 14, 15, 16, 17, 18 ,19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 또는 30개의 뉴클레오타이드를 포함하는 서열에 대해 하이브리드화할 수 있는 적어도 하나의 영역을 포함한다. 목적한 핵산 내에 존재하는 표적 서열의 상보성 영역은 본 내용에서 기술된 바와 같은 CRISPR 툴에 사용된 가이드 RNA(gRNA)의 "SDS" 영역에 상응할 수 있다.
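위 단락에서 말하는 gRNA의 "SDS" 영역 후보를 표적 서열에서 찾는 과정은 아래와 같이 스케치할 수 있다. 이 예는 뉴클레아제가 표적의 3' 쪽에서 NGG PAM을 인식하는 Cas9이고 상보성 영역의 길이가 20 nt라는 가정에 기초한 최소한의 예시이다. 함수명 `find_protospacers`와 예시 서열 `demo`는 설명을 위해 임의로 만든 것이며 본 명세서의 서열이 아니다.

```python
import re


def find_protospacers(seq, length=20):
    """정방향 가닥에서 (시작 위치, 표적 서열, PAM) 후보를 모두 반환한다."""
    seq = seq.upper()
    # 서로 겹치는 후보도 놓치지 않도록 전방 탐색(lookahead)을 사용한다.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % length)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# 가상의 예시 서열(가정): 20 nt 표적 뒤에 TGG PAM이 이어진다.
demo = "ACGTACGTACGTACGTACGTTGGCA"
for start, spacer, pam in find_protospacers(demo):
    print(start, spacer, pam)
```

실제 설계에서는 역방향 가닥과 게놈 내 유사 서열(오프-타겟)도 함께 검토해야 하며, 이 스케치는 그 부분을 다루지 않는다.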
기술된 다른 특수한 구현예에서, 목적한 핵산은 적어도 2개의 상보성 영역을 포함하며 표적 서열 각각은 박테리아 게놈내 상기 표적화된 DNA 영역/부위/서열에 대해 100% 동일하거나 적어도 80% 동일하거나, 바람직하게는 적어도 85%, 90%, 95%, 96%, 97%, 98% 또는 99% 동일하다. 이러한 영역은 상기 영역/부위/서열의 상보성 서열 중 일부 또는 부분에 대해, 전형적으로 적어도 1개의 뉴클레오타이드, 바람직하게는 적어도 100개의 뉴클레오타이드, 전형적으로 100 내지 1000개의 뉴클레오타이드를 포함하는 상술한 바와 같은 서열에 대해 하이브리드화할 수 있다. 목적한 핵산내에 존재하는 표적 서열에 대해 상보성인 영역은 본 내용에서 기술된 바와 같은 유전 변형 툴내 표적 서열의 5' 및 3' 플랭킹 영역을 인식하고, 바람직하게는 표적화할 수 있다.
특수한 양태에 따라서, 표적 서열은 암페니콜, 예를 들면, 클로람페니콜 및/또는 티암페니콜의 부류에 속하는 하나 이상의 항생제를 포함하는 배양 배지 속에서 성장할 수 있는 목적한 박테리아, 예를 들면, 클로스트리디움 속의 게놈내에서 암페니콜-O-아세틸트랜스퍼라제, 예를 들면 클로람페니콜-O-아세틸트랜스퍼라제 또는 티암페니콜-O-아세틸트랜스퍼라제를 암호화하고, 이러한 서열의 전사를 제어하거나 이러한 서열을 플랭킹하는 서열이다.
인식된 서열은 예를 들면, 씨. 베이제린키이 DSM 6423으로부터의 클로람페니콜-O-아세틸트랜스퍼라제를 암호화하는 catB 유전자(CIBE△3859)에 상응하는 서열 번호: 18의 서열 또는 상기 클로람페니콜-O-아세틸트랜스퍼라제에 대해 적어도 70%, 75%, 80%, 85%, 90% 또는 95% 동일한 아미노산 서열, 또는 서열 번호: 18의 서열 중 모두 또는 적어도 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% 또는 99%를 포함하는 서열이다. 다시 말하면, 인식된 서열은 적어도 1개의 뉴클레오타이드, 바람직하게는 적어도 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35 또는 40개의 뉴클레오타이드, 전형적으로 1 내지 40개의 뉴클레오타이드를 포함하는 서열, 바람직하게는 서열 번호: 18의 서열의 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 또는 30개의 뉴클레오타이드를 포함하는 서열일 수 있다.
서열 번호: 18의 서열에 의해 암호화된 클로람페니콜-O-아세틸트랜스퍼라제에 대해 적어도 70% 동일한 아미노산 서열의 예는 다음의 참고 하에 NCBI 데이타베이스에서 확인된 서열에 상응한다: WP_077843937.1, 서열 번호: 44(WP_063843219.1), 서열 번호: 45(WP_078116092.1), 서열 번호: 46(WP_077840383.1), 서열 번호: 47(WP_077307770.1), 서열 번호: 48(WP_103699368.1), 서열 번호: 49(WP_087701812.1), 서열 번호: 50(WP_017210112.1), 서열 번호: 51(WP_077831818.1), 서열 번호: 52(WP_012059398.1), 서열 번호: 53(WP_077363893.1), 서열 번호: 54(WP_015393553.1), 서열 번호: 55(WP_023973814.1), 서열 번호: 56(WP_026887895.1), 서열 번호: 57(AWK51568.1), 서열 번호: 58(WP_003359882.1), 서열 번호: 59(WP_091687918.1), 서열 번호: 60(WP_055668544.1), 서열 번호: 61(KGK90159.1), 서열 번호: 62(WP_032079033.1), 서열 번호: 63(WP_029163167.1), 서열 번호: 64(WP_017414356.1), 서열 번호: 65(WP_073285202.1), 서열 번호: 66(WP_063843220.1), 및 서열 번호: 67(WP_021281995.1).
서열 번호: 18에 의해 암호화된 클로람페니콜-O-아세틸트랜스퍼라제에 대해 적어도 75% 동일한 아미노산 서열의 예는 서열 WP_077843937.1, WP_063843219.1, WP_078116092.1, WP_077840383.1, WP_077307770.1, WP_103699368.1, WP_087701812.1, WP_017210112.1, WP_077831818.1, WP_012059398.1, WP_077363893.1, WP_015393553.1, WP_023973814.1, WP_026887895.1, AWK51568.1, WP_003359882.1, WP_091687918.1, WP_055668544.1 및 KGK90159.1에 상응한다.
서열 번호: 18에 의해 암호화된 클로람페니콜-O-아세틸트랜스퍼라제에 대해 적어도 90% 동일한 아미노산 서열의 예는 WP_077843937.1, WP_063843219.1, WP_078116092.1, WP_077840383.1, WP_077307770.1, WP_103699368.1, WP_087701812.1, WP_017210112.1, WP_077831818.1, WP_012059398.1, WP_077363893.1, WP_015393553.1, WP_023973814.1, WP_026887895.1 및 AWK51568.1이다.
서열 번호: 18에 의해 암호화된 클로람페니콜-O-아세틸트랜스퍼라제에 대해 적어도 95% 동일한 아미노산 서열의 예는 서열 WP_077843937.1, WP_063843219.1, WP_078116092.1, WP_077840383.1, WP_077307770.1, WP_103699368.1, WP_087701812.1, WP_017210112.1, WP_077831818.1, WP_012059398.1, WP_077363893.1, WP_015393553.1, WP_023973814.1 및 WP_026887895.1에 상응한다.
서열 번호: 18에 의해 암호화된 클로람페니콜-O-아세틸트랜스퍼라제에 대해 적어도 99% 동일한 바람직한 아미노산 서열은 WP_077843937.1, 서열 번호: 44(WP_063843219.1) 및 서열 번호: 45(WP_078116092.1)이다.
서열 번호: 18과 동일한 특수한 서열은 WP_077843937.1와 같이 NCBI 데이타베이스에서 확인된 서열이다.
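위에서 나열한 동일성 기준(70%, 75%, 90%, 95%, 99%)이 어떤 계산을 뜻하는지 보이기 위한 최소한의 스케치를 덧붙인다. 두 서열이 이미 정렬되어 길이가 같고 갭이 `-`로 표시되어 있다고 가정한다. 함수명과 예시 펩타이드는 설명용으로 임의로 만든 것이며, 데이타베이스 검색 도구가 보고하는 동일성 값은 정렬 방식에 따라 달라질 수 있다.

```python
THRESHOLDS = (70, 75, 80, 85, 90, 95, 99)


def percent_identity(aligned_a, aligned_b):
    """미리 정렬된 두 서열의 동일성(%)을 계산한다."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("정렬된 서열의 길이가 같아야 한다")
    # 양쪽 모두 갭인 위치는 비교에서 제외한다.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")]
    matches = sum(1 for a, b in pairs if a == b and a != "-")
    return 100.0 * matches / len(pairs)


def highest_threshold_met(identity):
    """동일성 값이 충족하는 가장 높은 기준을 반환한다(없으면 None)."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


# 가상의 5-잔기 예시(가정): 한 위치만 다르다.
print(percent_identity("MKNLF", "MKNIF"), highest_threshold_met(80.0))
```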
특수한 예에 따라서, 표적 서열은 씨. 페르프린겐스로부터의 클로람페니콜-O-아세틸트랜스퍼라제를 암호화하는 catQ 유전자에 상응하는 서열 번호: 68의 서열이며 이의 아미노산 서열은 서열 번호: 66(WP_063843220.1), 또는 상기 클로람페니콜-O-아세틸트랜스퍼라제에 대해 적어도 70%, 75%, 80%, 85%, 90% 또는 95% 동일한 서열, 또는 서열 번호: 68의 서열 중 모두 또는 적어도 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% 또는 99%를 포함하는 서열에 상응한다.
다시 말해서, 인식된 서열은 서열 번호: 68의 적어도 1개의 뉴클레오타이드, 바람직하게는 적어도 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35 또는 40개의 뉴클레오타이드, 전형적으로 1 내지 40개의 뉴클레오타이드를 포함하는 서열, 바람직하게는 서열 번호:68의 서열의 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 또는 30개의 뉴클레오타이드를 포함하는 서열일 수 있다.
여전히 다른 특수한 예에 따라서, 인식된 서열은 박테리아내에 천연적으로 존재하거나 상기 박테리아내로 인공적으로 도입된, 기술자에게 공지된 핵산 서열 catB(서열 번호: 18), catQ(서열 번호: 68), catD(서열 번호: 69, Schwarz S. et al., 2004) 또는 catP(서열 번호: 70, Schwarz S. et al., 2004)일 수 있다.
상기 나타낸 바와 같이, 다른 특수한 예에 따라서, 표적 서열은 상술한 바와 같은 암호화 서열(목적한 박테리아가 이에 대해 내성을 부여하는 항생제를 포함하는 배양 배지 속에서 상기 박테리아가 성장하도록 하는 효소를 암호화하는)의 전사를 제어하는 서열, 전형적으로, 프로모터 서열, 예를 들면, catB 유전자의 프로모터 서열(서열 번호: 73) 또는 catQ 유전자의 프로모터 서열(서열 번호: 74)일 수 있다.
이후에 목적한 핵산은 상술한 바와 같은 암호화 서열의 전사를 제어하는 서열을 인식하므로, 전형적으로 이에 결합할 수 있다.
다른 특수한 예에 따라서, 표적 서열은 상술한 바와 같은 암호화 서열을 플랭킹하는 서열, 예를 들면 서열 번호: 18 서열의 catB 유전자를 플랭킹하는 서열 또는 이에 대해 적어도 70% 동일한 서열일 수 있다. 이러한 플랭킹 서열은 전형적으로 1, 10 또는 20 내지 1000개의 뉴클레오타이드, 예를 들면 1, 10 또는 20 내지 900, 800, 700, 600, 500, 400, 300 또는 200개의 뉴클레오타이드, 1, 10 또는 20 내지 100개의 뉴클레오타이드, 1, 10 또는 20 내지 50개의 뉴클레오타이드, 또는 1, 10 또는 20 내지 40개의 뉴클레오타이드, 예를 들면, 10 내지 40개의 뉴클레오타이드, 10 내지 30개의 뉴클레오타이드, 10 내지 20개의 뉴클레오타이드, 20 내지 30개의 뉴클레오타이드, 15 내지 40개의 뉴클레오타이드, 15 내지 30개의 뉴클레오타이드 또는 15 내지 20개의 뉴클레오타이드를 포함한다.
특수한 양태에 따라서, 표적 서열은 이러한 암호화 서열을 플랭킹하는 서열의 쌍에 상응하며, 각각의 플랭킹 서열은 전형적으로 적어도 20개의 뉴클레오타이드, 전형적으로 100 내지 1000개의 뉴클레오타이드, 바람직하게는 200 내지 800개의 뉴클레오타이드를 포함한다.
본 설명의 내용에서, 목적한 박테리아를 형질전환하고/하거나 유전적으로 변형시키는데 사용된, 목적한 핵산의 특수한 예는 i) 암호화 서열을 인지하거나, ii) 암호화 서열의 전사를 제어하거나, iii) 박테리아, 예를 들면, 상술한 바와 같은 클로스트리디움 속의 박테리아내에서 암호화 서열, 목적한 효소, 바람직하게는 암페니콜-O-아세틸트랜스퍼라제, 예를 들면, 클로람페니콜-O-아세틸트랜스퍼라제 또는 티암페니콜-O-아세틸트랜스퍼라제를 플랭킹하는 DNA 단편이다.
본 발명에 따른 목적한 핵산의 예는 박테리아 게놈으로부터 인식된 서열("표적 서열")을 제거하거나 이의 발현을 변형시킬 수 있고, 예를 들면, 이를 조절할 수 있으며, 특히, 이를 억제할 수 있고, 바람직하게는 이를 변형시켜 상기 박테리아가 단백질, 예를 들면, 암페니콜-O-아세틸트랜스퍼라제, 특히 기능성 단백질을 상기 서열로부터 발현할 수 없도록 할 수 있다.
효소를 암호화하는 인식된 서열이 클로람페니콜 및/또는 티암페니콜에 대해 박테리아 내성을 부여하는 서열인 특수한 구현예에서, 사용된 선택 유전자는 클로람페니콜 및/또는 티암페니콜 내성 유전자가 아니고, 바람직하게는 catB, catQ, catD 또는 catP 유전자가 아니다.
특수한 구현예에서, 목적한 핵산은 암호화 서열을 표적화하거나, 암호화 서열의 전사를 제어하거나, 암호화 서열을 플랭킹하는 하나 이상의 가이드 RNA(gRNA), 목적한 효소, 특히 암페니콜-O-아세틸트랜스퍼라제, 및/또는 변형 주형(또한 "편집 주형"으로서 본 내용에서 확인됨), 예를 들면, 표적 서열 중 모두 또는 일부를 제거하거나 변형시키기 위한, 바람직하게는 표적 서열의 발현을 억제하거나 억압할 목적의 변형 주형, 전형적으로 상술한 표적 서열의 상부 및 하부에 위치한 서열에 대한 서열 동족체(이에 상응하는) 서열, 전형적으로 각각 10 또는 20개의 염기쌍 및 1000, 1500 또는 2000개의 염기쌍, 예를 들면, 100, 200, 300, 400 또는 500개의 염기쌍 및 1000, 1200, 1300, 1400 또는 1500개의 염기쌍, 바람직하게는 100 내지 1500개 또는 100 내지 및 1000개의 염기쌍, 및 심지어 보다 바람직하게는 500 내지 1000개의 염기쌍 또는 200 내지 800개의 염기쌍을 포함하는 서열(표적 서열의 상부 및 하부에 위치한 상기 서열에 대해 상동성인)을 포함하는 주형을 포함한다.
특수한 구현예에서, 목적한 박테리아를 형질전환시키고/시키거나 유전적으로 변형시키는데 사용된 목적한 핵산은 Dam 및 Dcm 메틸트랜스퍼라제에 의해 인식된 모티프에서 메틸화를 갖지 않는 핵산(dam- dcm- 유전형을 가진 에스케리키아 콜라이 박테리아로부터 제조)이다.
형질전환되고/되거나 유전적으로 변형된 목적한 박테리아가 특히, 하위분기군 DSM 6423, LMG 7814, LMG 7815, NRRL B-593 및 NCCB 27006 중 하나에 속하는 씨. 베이제린키이 박테리아인 경우, 유전 툴로서 사용된 목적한 핵산, 예를 들어, 플라스미드는 Dam 및 Dcm 메틸트랜스퍼라제에 의해 인식된 모티프에서 메틸화를 갖지 않는 핵산, 전형적으로 GATC 모티프의 아데노신("A") 및/또는 CCWGG 모티프(W는 아데노신("A") 또는 티민("T")에 상응한다)의 제2의 사이토신("C")이 탈메틸화된 핵산이다.
Dam 및 Dcm 메틸트랜스퍼라제에 의해 인식된 모티프에서 메틸화를 갖지 않는 핵산은 전형적으로 dam- dcm- 유전형을 지닌 에스케리키아 콜라이(예를 들면, 에스케리키아 콜라이 INV 110, Invitrogen)로부터 제조될 수 있다. 동일한 핵산은 예를 들면, EcoKI 메틸트랜스퍼라제에 의한 다른 메틸화를 가질 수 있으며, 후자는 모티프 AAC(N6)GTGC 및 GCAC(N6)GTT(N은 임의의 염기에 상응할 수 있다)의 아데닌("A")을 표적화한다.
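플라스미드 서열에서 Dam(GATC) 및 Dcm(CCWGG, W는 A 또는 T) 인식 모티프가 어디에 있는지 확인하는 과정은 다음과 같이 스케치할 수 있다. 함수명 `methylation_motifs`와 예시 서열은 설명용으로 임의로 만든 것이며, 정방향 가닥만 읽는다(두 모티프 모두 팔린드롬이므로 위치 파악에는 한쪽 가닥으로 충분하다고 가정한다).

```python
import re

DAM_MOTIF = re.compile(r"GATC")      # Dam: GATC의 A가 메틸화 대상
DCM_MOTIF = re.compile(r"CC[AT]GG")  # Dcm: CCWGG의 두 번째 C가 메틸화 대상


def methylation_motifs(seq):
    """각 모티프의 시작 위치(0부터 시작)를 반환한다."""
    seq = seq.upper()
    return {
        "dam_GATC": [m.start() for m in DAM_MOTIF.finditer(seq)],
        "dcm_CCWGG": [m.start() for m in DCM_MOTIF.finditer(seq)],
    }


# 가상의 예시 서열(가정)
demo = "AAGATCTTCCAGGTTCCTGGAGATC"
print(methylation_motifs(demo))
```

모티프의 위치는 서열만으로 알 수 있지만, 실제 메틸화 여부는 플라스미드를 제조한 이. 콜라이 균주의 유전형에 달려 있다.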
특수한 구현예에서, 표적화된 서열은 암페니콜-O-아세틸트랜스퍼라제, 예를 들면, catB 유전자와 같은 클로람페니콜-O-아세틸트랜스퍼라제를 암호화하는 서열, 이러한 유전자의 전사를 제어하는 서열, 또는 이러한 유전자를 플랭킹하는 서열에 상응한다.
본 발명자에 의해 기술된 목적한 특수한 핵산은 예를 들면, 벡터, 바람직하게는 플라스미드, 예를 들면, 본 설명의 실험 단락에 기술된 서열 번호: 21의 서열의 플라스미드 pCas9ind-△catB 또는 서열 번호: 38의 서열의 플라스미드 pCas9ind-gRNA△catB(실시예 2 참고), 특히 Dam 및 Dcm 메틸트랜스퍼라제에 의해 인식된 모티프에서 메틸화를 갖지 않는 상기 서열의 버젼이다.
본 설명은 또한 본 내용에 기술된 바와 같은 목적한 박테리아를 형질전환 및/또는 유전적으로 변형시키는 목적한 핵산의 용도를 포함한다.
본 발명은 또한 동종 재조합에 의해, 클로스트리디움 속의 박테리아, 바람직하게는 클로스트리디움 속의 용매생성 박테리아를 형질전환시키고, 전형적으로 유전적으로 변형시키기 위한 방법에 관한 것이다. 이러한 방법은 유리하게는 상기 박테리아내에 본 출원에 기술된 바와 같은 본 발명에 따른 유전 툴, 바람직하게는 i) 서열 번호: 126의 서열(OREP) 중 모두 또는 일부 및 ii) 박테리아의 유전 물질의 변형 및/또는 상기 박테리아내에서 상기 박테리아의 야생형 버젼에 존재하는 유전 물질로부터 부분적으로 또는 전체적으로 부재된 DNA 서열의 발현을 가능하게 하는 서열을 포함하거나, 이로 이루어진 "OPT 핵산"을 도입함으로써 박테리아를 형질전환시키는 단계를 포함한다. 이러한 방법은 또한 형질전환된 박테리아, 즉, 목적한 재조합(들)/변형(들)/최적화(들)을 갖는 박테리아를 수득하거나, 회수하거나, 선택하거나 단리하는 단계를 추가로 포함할 수 있다.
본 발명은 전형적으로 클로스트리디움 속의 박테리아를 형질전환시키고, 바람직하게는 유전적으로 변형시키기 위해 선택된 유전 변형 툴이 씨. 베이제린키이와 같은 박테리아, 하나 이상의 항생제에 대한 내성에 관여하는 효소를 암호화하는 유전자의 야생형 상태의 담체 및/또는 적어도 하나의 염색체외 DNA 서열의 야생형 상태의 담체에서 사용되도록 의도된 경우 유리하게 실행되며, 상기 유전 툴의 실행은 상기 박테리아를 상기 박테리아가 이에 대해 야생형 상태에서 내성인 항생제에 대한 내성의 마커의 발현을 가능하게 하는 핵산을 사용하여 형질전환시키는 단계 및/또는 형질전환되고/되거나 유전적으로 변형된 박테리아를 상기 항생제(이에 대해 박테리아는 야생형 상태에서 내성이다)를 사용하여 선택하는, 바람직하게는 상기 염색체외 DNA 서열을 상실한 박테리아를 상기 박테리아 중에서 선택하는 단계를 포함한다.
전형적으로 CRISPR 유전 변형 툴을 사용하여, 본 발명으로 유리하게 달성가능한 변형은 목적하지 않은 서열, 예를 들면, 하나 이상의 항생제에 대해 박테리아 내성을 제공하는 효소를 암호화하는 서열을 제거하거나, 목적하지 않은 서열이 비-기능성이 되도록 하는 것으로 이루어진다. 본 발명을 통해 유리하게 달성가능한 다른 변형은 박테리아를 유전적으로 변형시켜 이의 성능, 예를 들면, 목적한 용매 또는 용매의 혼합물의 생산에 있어서 이의 성능을 증진시키는 것으로 이루어지며, 상기 박테리아는 이것이 야생형 상태에서 내성이었던 항생제에 대해 민감성이 되고/되거나 이것이 상기 박테리아의 야생형에서 존재하는 염색체외 DNA 서열을 청소(cleaning)하도록 본 발명을 통해 이미 먼저 변형된다.
본 발명에 따른 공정은 CRISPR(집단화하여 일정하게 분포하는 짧은 팔린드롬 반복체) 유전 툴/기술, 특히 CRISPR/Cas(CRISPR-관련된 단백질) 유전 툴의 사용을 기반으로 한다.
본 발명은 Wang et al. (2015)에 의해 기술된 바와 같은 뉴클레아제, gRNA 및 복구 주형을 포함하는 단일 플라스미드를 사용한 통상의 CRISPR/Cas 유전 툴을 사용하여 실행될 수 있다.
gRNA의 서열 및 구조는 잘 공지된 기술을 사용하여 표적화될 염색체 영역 또는 이동성 유전 성분에 따라 기술자에 의해 용이하게 정의될 수 있다(참고: 예를 들면, 문헌, DiCarlo et al., 2013).
본 발명자는 2개의 플라스미드의 사용을 기반으로 하는, 본 발명의 내용에서 또한 사용될 수 있는, 클로스트리디움 속의 박테리아에 대해 적응된, 박테리아를 변형시키기 위한 유전 툴을 개발하고 기술하여 왔다(참고: WO2017/064439, Wasels et al., 2017, 및 본 명세서와 관련된 도 22).
특수한 구현예에서, 이러한 툴의 "제1" 플라스미드는 Cas 뉴클레아제의 발현을 허용하며, 수행될 변형에 대해 특이적인 "제2" 플라스미드는 하나 이상의 gRNA 발현 카세트(전형적으로 박테리아 DNA의 상이한 영역을 표적화하는)뿐만 아니라 동종 재조합 메카니즘에 의해 Cas에 의해 표적화된 박테리아 DNA의 일부의 목적한 서열로의 교체를 가능하게 하는 복구 주형을 함유한다. cas 유전자 및/또는 gRNA 발현 카세트(들)는 구성적 또는 유도성 발현 프로모터, 바람직하게는 기술자에게 공지된(예를 들면, 출원 제WO2017/064439호에 기술되고 본 설명에 참고로 포함된) 유도성 발현 프로모터이고, 바람직하게는 상이하지만 동일한 유도인자에 의해 유도성 프로모터의 제어 하에 위치한다.
사용될 수 있는 gRNA는 본 내용에서 상술한 바와 같은 gRNA에 상응한다.
동종 재조합에 의해 클로스트리디움 속의 용매생성 박테리아를 형질전환시키고, 전형적으로 유전적으로 변형시키기 위한 본 발명에 따른 특수한 공정은 다음 단계를 이러한 순서대로 포함한다:
a) 박테리아내로 본 출원에 기술된 바와 같은 핵산 또는 유전 툴을 항-CRISPR 단백질의 발현의 유도인자의 존재하에서 도입하는 단계, 및
b) 단계 a)의 말단에서 수득된 형질전환된 박테리아를 항-CRISPR 단백질의 발현의 유도인자를 함유하지 않는 배지(또는 포함하지 않는 조건 하에서) 위에서 배양하여, 전형적으로 DNA 엔도뉴클레아제/gRNA 리보뉴클레오단백질 복합체, 전형적으로 Cas/gRNA(상기 항-CRISPR 단백질의 생산을 정지하고 엔도뉴클레아제의 작용을 허용하기 위하여)의 발현을 가능하게 하는 단계.
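단계 a)와 b)의 논리를 두 개의 스위치로 단순화하면 다음과 같이 스케치할 수 있다. 이 모델은 락토즈가 있으면 Pbgal로부터 항-CRISPR 단백질이 발현되어 뉴클레아제를 억제하고, 뉴클레아제가 aTc-유도성 프로모터의 제어하에 있다는 가정에 기초한다. 프로모터의 누출 발현이나 단백질의 잔존 시간은 고려하지 않은 불리언 근사이며, 함수명과 인자명은 설명용으로 임의로 만든 것이다.

```python
def nuclease_active(lactose_present, atc_present, nuclease_inducible=True):
    """단순화한 모델에서 Cas/gRNA 복합체가 표적을 절단할 수 있는지 반환한다."""
    # 락토즈가 있으면 Pbgal의 억제가 해제되어 항-CRISPR 단백질이 발현된다.
    anti_crispr_expressed = lactose_present
    # 유도성 프로모터라면 aTc가 있어야 뉴클레아제가 발현된다.
    nuclease_expressed = atc_present or not nuclease_inducible
    return nuclease_expressed and not anti_crispr_expressed


# 단계 a): 유도인자(락토즈) 존재하에서 형질전환 -> 절단 없음
print(nuclease_active(lactose_present=True, atc_present=False))
# 단계 b) 이후 유도: 락토즈 없이 aTc 첨가 -> 절단 가능
print(nuclease_active(lactose_present=False, atc_present=True))
```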
본 발명에 따른 유전 툴의 성분(핵산 또는 gRNA)은 기술자에게 공지된 임의의 방법에 의해 직접 또는 간접적으로, 예를 들면, 형질전환, 접합, 미세주입, 형질감염, 전기천공(electroporation) 등에 의해, 바람직하게는 전기천공에 의해 도입된다(Mermelstein et al., 1993).
항-CRISPR 단백질의 발현의 유도인자는 상기 발현을 유도하기에 충분한 양으로 존재한다. Pbgal 프로모터의 경우, 유도인자인, 락토즈는 BgaR 단백질의 발현과 연결된 항-CRISPR 단백질의 발현의 억제(전사 억제)를 해제한다.
항-CRISPR 단백질의 발현의 유도인자는 바람직하게는 약 1 mM 내지 약 1 M, 바람직하게는 약 10 mM 내지 약 100 mM, 예를 들면, 약 40 mM을 포함하는 농도에서 사용된다.
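유도인자가 락토즈인 경우, 위의 몰 농도를 칭량할 질량으로 환산하는 계산은 다음과 같이 스케치할 수 있다. 무수 락토즈(C12H22O11)의 몰 질량 342.30 g/mol을 사용한다고 가정한다. 일수화물 형태의 시약을 쓴다면 몰 질량이 달라지므로 `molar_mass` 인자를 바꾸어야 한다. 함수명은 설명용으로 임의로 만든 것이다.

```python
LACTOSE_G_PER_MOL = 342.30  # 무수 락토즈 기준(가정)


def grams_needed(conc_mM, volume_L, molar_mass=LACTOSE_G_PER_MOL):
    """주어진 농도(mM)와 부피(L)에 필요한 질량(g)을 계산한다."""
    return conc_mM / 1000.0 * volume_L * molar_mass


# 명세서의 예시 농도인 약 40 mM, 배지 1 L 기준
print(round(grams_needed(40, 1.0), 3))
```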
바람직한 구현예에서, 항-CRISPR 단백질은 바람직하게는 유전 툴의 핵산 서열을 목적한 박테리아 균주내로 도입하는 단계 동안에 뉴클레아제의 작용을 억제, 바람직하게는 중화시킬 수 있다.
실험 단락에서 입증된 바와 같이, 본 발명은 유리하게는 Cas9 발현 카세트 및 AcrIIA4와 같은 항-CRISPR 단백질에 대한 발현 카세트를 포함하는 클로스트리디움 속의 박테리아를 gRNA 발현 카세트를 포함하는 임의의 핵산으로 형질전환시킨다.
상술한 공정의 단계 a)의 말기에 수득된 형질전환된 박테리아는 이후에 항-CRISPR 단백질의 발현의 유도인자를 함유하지 않는 배지(상기 항-CRISPR 단백질의 생산을 중지하고 뉴클레아제의 작용을 허용하기 위하여)에서 성장시킨다.
특수한 구현예에서, 공정은 단계 b) 동안 또는 후에, 이러한 프로모터(들)이 유전 툴내에 존재하는 경우 뉴클레아제 및/또는 가이드 RNA(들)의 발현을 제어하는 유도성 프로모터(들)의 발현을 유도시켜, 일단 상기 유전 툴이 상기 박테리아내로 도입되면 목적한 유전적 변형을 허용하도록 하는 단계를 추가로 포함한다. 유도는 전형적으로 선택된 유도성 프로모터와 관련된 발현의 억제를 해제하는 물질을 사용하여 수행된다.
존재하는 경우, 유도 단계는 표적 박테리아내로 본 발명에 따른 유전 툴을 도입한 후 기술자에게 공지된 DNA 엔도뉴클레아제/gRNA 리보뉴클레오단백질 복합체의 발현을 가능하게 하는 배지 상에서 임의의 배양 방법으로 수행할 수 있다. 이는 예를 들면, 박테리아를 충분한 양으로 존재하는 적합한 물질과 접촉시키거나, UV 광에 노출시킴으로써 수행된다. 이러한 물질은 선택된 유도성 프로모터와 관련된 발현의 억제를 해제한다. 선택된 프로모터가 Pcm-2tetO1 및 Pcm-tetO2/1로부터 선택된, 안하이드로테트라사이클린(aTc)-유도성 프로모터인 경우, aTc는 바람직하게는 약 1 ng/mL 내지 약 5000 ng/mL, 바람직하게는 약 10 ng/mL 내지 1000 ng/mL, 10 ng/mL 내지 800 ng/mL, 10 ng/mL 내지 500 ng/mL, 100 ng/mL 또는 200 ng/mL 내지 약 800 ng/mL 또는 1000 ng/mL, 또는 약 100 ng/mL 또는 200 ng/mL 내지 약 500 ng/mL, 600 ng/mL 또는 700 ng/mL, 예를 들면, 약 50 ng/mL, 100 ng/mL, 150 ng/mL, 200 ng/mL, 250 ng/mL, 300 ng/mL, 350 ng/mL, 400 ng/mL, 450 ng/mL, 500 ng/mL, 550 ng/mL, 600 ng/mL, 650 ng/mL, 700 ng/mL, 750 ng/mL 또는 800 ng/mL로 포함된 농도에서 사용된다. 특수한 구현예에서, aTc는 바람직하게는 약 200 ng/mL 내지 약 1000 ng/mL 또는 약 200 ng/mL 내지 약 800 ng/mL, 예를 들면, 약 500 ng/mL로 포함된 농도에서 사용된다.
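위에서 제시한 aTc 농도 범위에 맞추어 배지에 넣을 스톡 용액의 부피를 구하는 계산은 다음과 같이 스케치할 수 있다. 스톡 농도 1 mg/mL는 예시를 위한 가정이며 명세서에 기재된 값이 아니다. 함수명도 설명용으로 임의로 만든 것이다.

```python
CLAIMED_RANGE_NG_PER_ML = (1.0, 5000.0)  # 본문의 가장 넓은 범위


def in_claimed_range(final_ng_per_mL):
    """최종 농도가 본문의 가장 넓은 범위 안에 있는지 확인한다."""
    low, high = CLAIMED_RANGE_NG_PER_ML
    return low <= final_ng_per_mL <= high


def stock_volume_uL(final_ng_per_mL, medium_mL, stock_mg_per_mL):
    """원하는 최종 농도를 얻기 위해 필요한 스톡 부피(µL)를 계산한다."""
    stock_ng_per_uL = stock_mg_per_mL * 1e6 / 1000.0  # mg/mL -> ng/µL
    return final_ng_per_mL * medium_mL / stock_ng_per_uL


# 약 500 ng/mL, 배지 100 mL, 스톡 1 mg/mL(가정)
print(in_claimed_range(500), stock_volume_uL(500, 100, 1.0))
```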
특수한 구현예에서, 방법은 복구 주형을 포함하는 핵산을 제거(박테리아 세포는 상기 핵산을 "청소하는" 것으로서 고려된다)하고/하거나 단계 a) 동안 유전 툴과 함께 도입된 가이드 RNA(들) 또는 가이드 RNA(들)을 암호화하는 서열을 제거하는 추가의 단계 c)를 포함한다.
다른 특수한 구현예에서, 공정은 단계 b) 또는 단계 c) 이후에, n번째―예를 들면, 제3, 제4, 제5 등의―이미 도입된 것(들)과는 구별되는 복구 주형을 포함하는 핵산을 포함하는 핵산 및 상기 구별되는 복구 주형내에 포함된 목적한 서열이 항-CRISPR 단백질의 발현의 유도인자의 존재하에서, 박테리아 게놈의 표적화된 영역내로 통합되도록 하는 하나 이상의 가이드 RNA 발현 카세트를 도입하는 하나 이상의 추가의 단계를 포함하며, 각각의 추가의 단계에 이어서 이렇게 형질전환된 박테리아를 전형적으로 Cas/gRNA 리보뉴클레오단백질 복합체, 예를 들면 Cas9/gRNA의 발현을 가능하게 하는, 항-CRISPR 단백질의 발현의 유도인자를 함유하지 않는 배지에서 배양하는 단계가 수반된다.
본 발명에 따른 공정의 특수한 구현예에서, 박테리아는 목적한 표적 서열 중 적어도 하나의 가닥을 절단하는데 관여하는 효소를 사용하는(예를 들면, 암호화하는), 상술한 것과 같은 핵산 또는 유전 툴을 사용하여 형질전환되며, 여기서 효소는 특수한 구현예에서 뉴클레아제, 바람직하게는 Cas9 효소 및 MAD7 효소로부터 우선적으로 선택된 Cas 뉴클레아제이다. 예시적인 구현예에서, 목적한 표적 서열은 하나 이상의 항생제, 바람직하게는 암페니콜의 부류, 전형적으로 클로람페니콜-O-아세틸트랜스퍼라제와 같은 암페니콜-O-아세틸트랜스퍼라제에 속하는 하나 이상의 항생제에 대한 내성을 박테리아에게 부여하는 효소를 암호화하는, 예를 들면 catB 유전자를 암호화하는 서열, 암호화 서열의 전사를 제어하는 서열 또는 상기 암호화 서열을 플랭킹하는 서열이다.
사용된 경우, 항-CRISPR 단백질은 전형적으로 상술한 바와 같은 "항-Cas" 단백질이다. 항-CRISPR 단백질은 유리하게는 "항-Cas9" 단백질 또는 "항-MAD7" 단백질이다.
표적화된 DNA 부위("인식된 서열")와 같이, 편집/복구 주형은 자체적으로 하나 이상의 핵산 서열 또는 천연 및/또는 합성의, 암호화 및/또는 비-암호화 서열에 상응하는 핵산 서열의 부위를 포함할 수 있다. 주형은 또한 즉, 클로스트리디움 속에 속하는 박테리아의 게놈으로부터, 또는 상기 속의 특수한 종의 게놈으로부터 천연적으로 부재된, 하나 이상의 "외부" 서열을 포함할 수 있다. 주형은 또한 서열의 조합을 포함할 수 있다.
본 발명에 사용된 유전 툴은 복구 주형이 목적한 핵산의 박테리아 게놈내로, 전형적으로 적어도 1개의 염기쌍(bp), 바람직하게는 적어도 1, 2, 3, 4, 5, 10, 15, 20, 50, 100, 1000, 10,000, 100,000 또는 1,000,000 bp, 전형적으로 1 bp 내지 20 kb, 예를 들면 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 또는 13 kb, 또는 1 bp 내지 10 kb, 바람직하게는 10 bp 및 10 kb 또는 1 kb 내지 10 kb, 예를 들면 1 bp 내지 5 kb, 2 kb 내지 5 kb, 또는 2.5 또는 3 kb 내지 5 kb를 포함하는 DNA 서열 또는 서열 부위의 혼입을 안내하도록 한다.
특수한 구현예에서, 목적한 DNA 서열의 발현은 필룸 피르미쿠테스(phylum Firmicutes), 특히 클로스트리디움 속, 바실러스 속(genus Bacillus) 또는 락토바실러스 속(genus Lactobacillus)에 속하는 박테리아가 수개의 상이한 당, 예를 들면 적어도 2개의 상이한 당, 전형적으로 6-탄당(예를 들면, 글루코즈, 만노즈 또는 프럭토즈) 및/또는 5-탄당(예를 들면, 크실로즈 또는 아라비노즈) 중에서 적어도 2개의 상이한 당, 바람직하게는 예를 들면, 글루코즈, 크실로즈 및 만노즈로부터 선택된 적어도 3개의 상이한 당; 글루코즈, 아라비노즈 및 만노즈; 및 글루코즈, 크실로즈 및 아라비노즈를 발효(전형적으로 동시에)하도록 한다.
다른 특수한 구현예에서, 목적한 DNA 서열은 적어도 하나의 목적한 생성물, 바람직하게는 변형된 박테리아에 의해 용매 생산을 촉진하는 생성물, 전형적으로 적어도 하나의 목적한 단백질, 예를 들면, 효소; 트랜스포터와 같은 막 단백질; 다른 단백질(차페론 단백질)의 성숙 단백질; 전사 인자; 또는 이의 조합을 암호화한다.
특히 유리하게는, 본 발명에 따른 유전 툴은 목적한 작은 및 큰 서열 둘 다를 1 단계로, 즉, 단일 핵산(전형적으로, 본 내용에 기술된 바와 같은 "OPT 핵산", "제2" 또는 "n번째" 핵산)을 사용하여, 또는 수개의 단계로, 즉, 수개의 핵산(전형적으로 본 내용에서 기술한 바와 같이, "제2" 및 "n번째" 핵산)을 사용하여, 바람직하게는 1 단계로 도입되도록 한다.
본 발명의 특수한 구현예에서, 이러한 "n번째" 핵산과 같은 핵산 및 본 내용에 기술된 유전 툴은 박테리아 DNA의 표적화된 부위가 제거되거나 보다 짧은(예를 들면, 이로부터 적어도 하나의 염기쌍이 결실된 서열에 의해) 및/또는 비-기능성인 서열로 대체되도록 한다. 본 발명의 특수한 바람직한 구현예에서, "제2" 또는 "n번째" 핵산은 유리하게는 박테리아 내로, 예를 들면, 박테리아 게놈내로, 적어도 하나의 염기 쌍, 및 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 또는 15 kb 이하를 포함하는 목적한 핵산을 도입시킨다.
목적한 핵산은 사용된 gRNA에 따라서 동일하거나 상이한 영역내에서 박테리아 염색체내로 삽입시킬 수 있다.
본 발명에 의해, 전형적으로 본 발명에 따른 유전 툴 및 방법에 의해, 이제 클로스트리디움 속의 박테리아를 전신계적으로, 효과적으로(고 빈도의 동종 재조합), 실질적으로(박테리아 게놈내로 목적한 큰 핵산의 가능한 도입) 및 안정하게(형질전환된 박테리아를 항생제와 접촉된 상태로 유지시킬 필요없이) 형질전환시켜 목적한 형질전환된 박테리아, 예를 들면, 이들이 기원한 박테리아와 비교하여 유전형 또는 표현형 차이를 지닌 개선된 돌연변이체, 전형적으로 산업적으로 유용한 박테리아, 예를 들면, 용매 또는 바이오연료(biofuel)의 생산에 유용한 박테리아를 수득할 수 있다.
본 발명의 다른 목적은 공정을 사용하여 수득되고/되거나 본 발명에 따른 유전 툴을 사용하여 형질전환되고 이상적으로는 유전적으로 변형된 클로스트리디움 속의 박테리아, 전형적으로 클로스트리디움 속의 용매생성 박테리아뿐만 아니라 이의 임의의 기원한 박테리아, 클론, 돌연변이체, 또는 유전적으로 변형된 버젼에 관한 것이다. 이러한 박테리아는 복구 주형을 사용한 동종 재조합에 의해 이의 게놈내로 도입된 목적한 핵산(들)을 발현한다. 이러한 박테리아는 본 발명에 따른 유전 툴 중 모두 또는 일부, 전형적으로 Cas9와 같은 뉴클레아제 또는 Cas9와 같은 뉴클레아제를 암호화하는 핵산을 포함할 수 있다.
따라서, 본 발명에 의해 형질전환되고/되거나 유전적으로 변형된 예시적인 박테리아는 하나 이상의 항생제에 대한 내성을 박테리아에게 제공하는 효소를 더 이상 발현하지 않는 박테리아, 특히, 암페니콜-O-아세틸트랜스퍼라제를 발현하지 않는 박테리아, 예를 들면, 야생형 상태에서 catB 유전자를 발현하고, 일단 본 발명에 의해 형질전환 및/또는 유전적으로 변형되면 상기 catB 유전자를 결여하거나 상기 catB 유전자를 발현하지 않는 박테리아이다. 본 발명을 통해 이렇게 형질전환되고/되거나 유전적으로 변형된 박테리아는 암페니콜, 예를 들면, 본 내용에 기술된 바와 같은 암페니콜, 특히 클로람페니콜 또는 티암페니콜에 대해 민감성이 된다.
본 발명에 따른 바람직한 유전적으로 변형된 박테리아의 특수한 예는 본 설명에서 기탁 번호 LMG P-31151 하에 벨기에의 미생물 공동 기탁기관(Belgian Co-ordinated Collections of Micro-organisms)("BCCM", 벨기에 베-9000 겐트 칼.엘. 레데강크슈트라트 35, B-9000 소재)에 2018년 12월 6일자로 등록된, 씨. 베이제린키이 IFP962 △catB (또한 씨. 베이제린키이 DSM6423 △catB로서 확인됨)이다.
본 발명에 따른 바람직한 유전적으로 변형된 박테리아의 다른 특수한 예는 본 설명에서 씨. 베이제린키이로 확인된 박테리아이며, 이는 기탁 번호 LMG P-31277로서 BCCM-LMG 기탁기관(collection)에 2019년 2월 20일자로 등록된 균주 씨. 베이제린키이 IFP963 △ catB △ pNF2이다.
이러한 설명은 또한 상기 박테리아 중 하나의 임의의 유도된 박테리아, 클론, 돌연변이체 또는 유전적으로 변형된 버젼, 예를 들면, 티암페니콜 및/또는 클로람페니콜과 같은 암페니콜에 대해 민감성으로 남아있는 임의의 유도된 박테리아, 클론, 돌연변이체 또는 유전적으로 변형된 버젼에 관한 것이다.
특수한 구현예에 따라서, 본 발명에 따라 형질전환되고/되거나 유전적으로 변형된 박테리아, 예를 들면 씨. 베이제린키이 DSM 6423 △catB 또는 씨. 베이제린키이 DSM6423 △ catB △ pNF2은 여전히 형질전환될 수 있고, 바람직하게는 유전적으로 변형될 수 있다. 이는 핵산, 예를 들면, 본 설명, 예를 들면, 실험 단락에 기술된 바와 같은 플라스미드를 사용하여 수행할 수 있다. 유리하게 사용될 수 있는 예시적인 핵산은 서열 번호: 23의 서열(본 설명의 실험 단락에 기술됨)의 플라스미드 pCas9acr 또는 pCas9ind(서열 번호: 22), pCas9cond(서열 번호: 133) 및 pMAD7(서열 번호: 134)로부터 선택된 플라스미드이다.
본 발명의 특수한 양태는 본 내용에 기술된 유전적으로 변형된 박테리아, 바람직하게는 번호 LMG P-31151 하에 기탁된 박테리아 씨. 베이제린키이 IFP962 △catB, 보다 바람직하게는 번호 LMG P-31277 하에 기탁된 씨. 베이제린키이 IFP963 △catB △ pNF2, 또는 예를 들면, 본 내용에 기술된 핵산, 유전 툴 또는 공정 중 하나를 사용하여, 이의 게놈내로 의도적으로 도입된 목적한 핵산(들)의 발현 덕분에, 하나 이상의 용매, 바람직하게는 적어도 이소프로판올을, 바람직하게는 산업적 규모로 생성하기 위한 용도에 관한 것이다.
특수한 구현예에서, 본 발명에 따른 방법 및 유전 툴을 사용하여 수득된, 본 발명에 따른 클로스트리디움 속의 박테리아는 이의 게놈내로 의도적으로 도입된 목적한 핵산(들)의 발현 덕분만으로 하나 이상의 용매를 생산할 수 있다.
본 발명은 또한 본 내용에 기술된 바와 같은 유전 툴의 성분 중 모두 또는 일부, 전형적으로 i) Cas9와 같은 하나의 DNA 엔도뉴클레아제를 암호화하는 제1의 핵산(여기서 엔도뉴클레아제를 암호화하는 서열은 프로모터의 제어 하에 위치한다), 및 ii) 동종 재조합 메카니즘에 의해, 엔도뉴클레아제에 의해 표적화된 박테리아 DNA의 일부를 목적한 서열로 대체하도록 하는 복구 주형, 및 툴내에서 사용된 선택된 항-CRISPR 단백질의 발현의 유도성 프로모터에 대해 적응된 적어도 하나의 유도인자를 암호화하는 적어도 제2의 핵산을 포함하는 클로스트리디움 속의 박테리아를 형질전환시키고, 전형적으로는 유전적으로 변형시키기 위한 키트에 관한 것이다. 키트는 또한 엔도뉴클레아제 및/또는 하나 이상의 가이드 RNA의 발현을 제어하기 위한 툴내에서 임의로 사용된 선택된 유도성 프로모터(들)에 대해 적응된 하나 이상의 유도인자를 포함할 수 있다.
또한 (i) 본 내용에 기술된 바와 같은 핵산, 예를 들면, 본 내용에 기술된 바와 같은 박테리아내 표적 서열을 인식하는 "OPT 핵산" 또는 DNA 단편, 및 (ii) 상기 박테리아의 개선된 변이체; gRNA로서의 핵산; 복구 매트릭스로서의 핵산; "OPT 핵산"; 프라이머의 적어도 하나의 쌍, 예를 들면, 본 발명의 내용에 기술된 바와 같은 프라이머의 쌍; 및 상기 툴에 의해 암호화된 단백질을 발현하기 위한 유도인자, 예를 들면, Cas9 또는 MAD7 뉴클레아제를 생산하기 위하여, 이러한 박테리아를 형질전환시키고, 전형적으로 유전적으로 변형시키기 위한 본 내용에 기술된 바와 같은 유전 변형 툴의 성분으로부터 선택된 적어도 하나의 툴, 바람직하게는 수개의 툴을 포함하는 키트가 기술되어 있다.
본 내용에 기술된 바와 같은 박테리아를 형질전환하고, 전형적으로 유전적으로 변형시키기 위한 유전 변형 툴은 예를 들면, 상기 설명된 바와 같은, "OPT 핵산", CRISPR 툴, 제II형 인트론의 사용을 기반으로 한 툴 및 대립형질 교환 툴로부터 선택될 수 있다.
특수한 구현예에서, 키트는 본 내용에 기술된 바와 같은 유전 툴의 성분 중 모두 또는 일부를 포함한다.
본 내용에 기술된 바와 같은 필룸 피르미쿠테스에 속하는 박테리아를 형질전환시키고, 바람직하게는 유전적으로 변형시키기 위한, 또는 이러한 박테리아를 사용하여, 적어도 하나의 용매, 예를 들면, 용매의 혼합물을 생산하기 위한 특수한 키트는 i) 서열 번호: 126의 서열 중 모두 또는 일부 및 ii) 박테리아의 유전 물질의 변형 및/또는 상기 박테리아의 야생형 버젼에 존재하는 유전 물질로부터 부분적으로 또는 전체적으로 부재된 DNA 서열; 및 본 내용에 기술된 유전 툴내에서 사용된 선택된 항-CRISPR 단백질의 발현의 유도성 프로모터에 대해 적응된 적어도 하나의 유도인자의 상기 박테리아내 발현을 가능하게 하는 서열을 포함하거나 이로 이루어진 핵산을 포함한다.
키트는 사용된 뉴클레아제 및/또는 하나 이상의 가이드 RNA의 발현을 제어하기 위해 유전 툴내에서 임의로 사용된 선택된 유도성 프로모터(들)에 적응된 하나 이상의 유도인자를 포함할 수 있다.
본 발명에 따른 특수한 키트는 엔도뉴클레아제, 예를 들면, Cas9 또는 tag를 포함하는 MAD7 단백질의 발현을 허용한다.
본 발명에 따른 키트는 배양 배지, 클로스트리디움 속의 적어도 하나의 컴피턴트(competent) 박테리아(즉, 형질전환을 위해 패키지된), 적어도 하나의 gRNA, 뉴클레아제, 예를 들면, Cas9 또는 MAD7 단백질, 하나 이상의 선택 분자, 또는 설명서 세트와 같은 하나 이상의 소비재를 추가로 포함할 수 있다.
본 발명은 전형적으로 본 내용에 기술된 형질전환 및 이상적으로 유전적 변형의 공정을 수행하기 위한, 및/또는 용매(들)(적어도 하나의 용매)를 클로스트리디움 속의 박테리아를 사용하여 생산하기 위한 키트에 관한 것이다.
본 발명은 또한 클로스트리디움 속의 박테리아, 전형적으로 클로스트리디움 속의 용매생성 박테리아를 형질전환시키고, 전형적으로 유전적으로 변형시키기 위한, 예를 들면, 클로스트리디움 속의 박테리아의 개선된 변이체를 생성하기 위한 본 발명에 따른 핵산, 유전 툴, 공정, 또는 키트의 잠재적인 용도에 관한 것이다.
설명은 특히 본 내용에 기술된 바와 같은 박테리아, 전형적으로 클로스트리디움 속(예를 들면, 번호 LMG P-31151 하에 기탁된 씨. 베이제린키이 IFP962 △catB)의 박테리아, 바람직하게는 박테리아 염색체 및 염색체 DNA와 구별되는 적어도 하나의 DNA 분자(전형적으로 천연 플라스미드)를 가진 박테리아, 가장 바람직하게는 번호 LMG P-31277 하에 기탁된 박테리아 씨. 베이제린키이 IFP963 △ catB △pNF2 박테리아의, 본 내용에 기술된 형질전환 공정, 및 이상적으로 유전적 변형을 실행하기 위한, 본 발명에 따른 키트, 또는 이러한 키트의 하나 이상의 성분의 용도에 관한 것이다.
최종적으로, 본 발명은 특히 용매(들) 또는 바이오연료(들), 또는 이의 혼합물을 산업적 규모로 생산할 수 있도록 하는, 본 발명에 따라 형질전환된 클로스트리디움 속의 핵산, 유전 툴, 공정, 키트 또는 박테리아의 잠재적인 용도에 관한 것이다. 생산될 수 있는 용매는 전형적으로 아세톤, 부탄올, 에탄올, 이소프로판올 또는 이의 혼합물, 전형적으로 에탄올/이소프로판올, 부탄올/이소프로판올, 또는 에탄올/부탄올 혼합물, 바람직하게는 이소프로판올/부탄올 혼합물이다.
특수한 구현예에서, 에탄올/이소프로판올 혼합물의 비는 적어도 1/4와 동일하다. 이러한 비는 바람직하게는 1/3 내지 1에 포함되고, 보다 바람직하게는 1과 동일하다.
특수한 구현예에서, 에탄올/부탄올 혼합물의 비는 적어도 1/4과 동일하다. 이러한 비는 바람직하게는 1/3 내지 1에 포함되고, 보다 바람직하게는 1과 동일하다.
특수한 구현예에서, 이소프로판올/부탄올 혼합물의 비는 적어도 1/4과 동일하다. 이러한 비는 바람직하게는 1/3 내지 1에 포함되고, 보다 바람직하게는 1과 동일하다.
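위 세 단락의 비율 기준(적어도 1/4, 바람직하게는 1/3 내지 1)을 측정값에 적용하는 방법은 다음과 같이 스케치할 수 있다. 명세서는 비율의 단위(질량비인지 몰비인지)를 명시하지 않으므로, 두 양이 같은 단위로 측정되었다고 가정한다. 함수명과 예시 값은 설명용으로 임의로 만든 것이다.

```python
def classify_ratio(first_amount, second_amount):
    """first/second 비율과 본문 기준의 충족 여부를 반환한다."""
    if second_amount <= 0:
        raise ValueError("분모가 되는 용매의 양은 0보다 커야 한다")
    ratio = first_amount / second_amount
    return {
        "ratio": ratio,
        "at_least_one_quarter": ratio >= 1 / 4,
        "preferred_one_third_to_one": 1 / 3 <= ratio <= 1,
    }


# 가상의 측정값(가정): 이소프로판올 1.0, 부탄올 2.0 (같은 단위)
print(classify_ratio(1.0, 2.0))
```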
본 발명에 따른 형질전환된 박테리아의 용도는 산업적 규모의 적어도 100톤의 아세톤, 적어도 100톤의 에탄올, 적어도 1000톤의 이소프로판올, 적어도 1800톤의 부탄올, 또는 적어도 40,000톤의 이의 혼합물의 매년 산업적 생산을 전형적으로 허용한다.
다음의 실시예 및 도면은 본 발명을 이의 영역을 제한하지 않고 보다 완전히 나타내기 위해 의도된다.
도 1: Cas9 뉴클레아제를 사용하여, gRNA에 의해 지시된 게놈 DNA내에 하나 이상의 이중 가닥 절단물을 생성하기 위한 유전 툴로서 게놈 편집(genome editing)을 위해 사용된 CRISPR/Cas9 시스템.
gRNA, 가이드 RNA; PAM, 프로토스페이서(protospacer) 인접 모티프. 도는 Jinek et al., 2012로부터 변형되었다.
도 2: 동종 재조합에 의한 Cas9-유도된 이중 가닥 절단물의 복구. PAM, 프로토스페이서 인접 모티프.
도 3: 클로스트리디움에서 CRISPR/Cas9의 용도.
ermB, 에리쓰로마이신 내성 유전자; catP (서열 번호: 70), 티암페니콜/클로람페니콜 내성 유전자; tetR, 이의 발현 생성물이 Pcm-tetO2/1로부터의 전사를 억제하는 유전자; Pcm-2tetO1 및 Pcm-tetO2/1, 안하이드로테트라사이클린 (aTc)-유도성 프로모터(Dong et al., 2012); miniPthl, 구성적 프로모터(Dong et al., 2012).
도 4: pCas9acr 플라스미드 맵(서열 번호: 23).
ermB, 에리쓰로마이신 내성 유전자; rep, 이. 콜라이내 복제 오리진; repH, 씨. 아세토부틸리쿰내 복제 오리진; Tthl, 티올라제 터미네이터; miniPthl, 구성적 프로모터(Dong et al., 2012); Pcm-tetO2/1, tetR의 생성물에 의해 억제되고 안하이드로테트라사이클린(aTc)에 의해 유도성인 프로모터(Dong et al., 2012); Pbgal, bgaR의 생성물에 의해 억제되고 락토즈에 의해 유도성인 프로모터(Hartman et al., 2011); acrIIA4, 항-CRISPR 단백질 AcrIIA4를 암호화하는 유전자; bgaR, 이의 발현 생성물이 Pbgal로부터의 전사를 억제하는 유전자.
도 5: pCas9ind(서열 번호: 22) 또는 pCas9acr(서열 번호: 23)을 포함하는 씨. 아세토부틸리쿰 DSM 792의 상대적인 형질전환 비율. 빈도는 pEC750C(서열 번호: 106)의 형질전환 빈도와 관련하여, 형질전환에 사용된 DNA의 μg당 수득된 형질전환체의 수로서 나타내며, 적어도 2개의 독립된 실험의 평균을 나타낸다.
도 6: pCas9acr 및 (서열 번호: 79 및 서열 번호: 80)의 존재하에서 또는 (서열 번호: 105) 복구 주형의 부재하에서 bdhB를 표적화하는 gRNA에 대한 발현 플라스미드를 포함하는 균주 DSM 792 형질전환체내에서 CRISPR/Cas9 시스템의 유도. Em, 에리쓰로마이신; Tm, 티암페니콜; aTc, 안하이드로테트라사이클린; ND, 희석되지 않음.
도 7: CRISPR/DNA 엔도뉴클레아제 시스템을 통한 씨. 아세토부틸리쿰 DSM792의 bdh 유전자자리의 변형.
A, bdh 유전자자리의 유전적 구조화. 복구 주형과 게놈성 DNA 사이의 상동성은 연회색 평행사변형으로 나타낸다. 프라이머 V1 및 V2의 하이브리드화 부위가 또한 나타나 있다.
B, 프라이머 V1 및 V2를 사용한 bdh 유전자자리의 증폭. M, 2-로그 크기 마커(NEB); P, pGRNA-△bdhA △ bdhB 플라스미드; WT, 야생형 균주.
도 8: Poehlein et al., 2017에 따른, 30개의 용매생성 클로스트리디움 균주의 분류. 하위분기군 씨. 베이제린키이 NRRL B-593이 문헌에서 씨. 베이제린키이 DSM 6423으로 확인되어 있음에 주목하라.
도 9: pCas9ind-△catB 플라스미드 맵(plasmid map).
도 10: pCas9acr 플라스미드 맵.
도 11: pEC750S-uppHR 플라스미드 맵.
도 12: pEX-A2-gRNA-upp 플라스미드 맵.
도 13: pEC750S-△upp 플라스미드 맵.
도 14: pEC750C-△upp 플라스미드 맵.
도 15: pGRNA-pNF2 맵.
도 16: 균주 씨. 베이제린키이 DSM 6423의 박테리아 형질전환으로부터 유도된 클론내 catB 유전자의 PCR 증폭.
균주가 여전히 catB 유전자를 가진 경우 약 1.5 kb, 또는 이러한 유전자가 결실된 경우 약 900 bp의 증폭.
도 17: 2YTG 배지 및 2YTG 티암페니콜 선택성 배지에서 균주 씨. 베이제린키이 DSM 6423 WT 및 △catB의 성장.
도 18: 복구 매트릭스의 존재 또는 부재하에서, pCas9acr 및 upp를 표적화하는 gRNA에 대한 발현 플라스미드를 포함하는 균주 씨. 베이제린키이 DSM 6423의 형질전환체내에서 CRISPR/Cas9acr 시스템의 유도. 범례: Em, 에리쓰로마이신; Tm, 티암페니콜; aTc, 안하이드로테트라사이클린; ND, 희석되지 않음.
도 19: 도 19의 A는 CRISPR/Cas9 시스템을 통해 씨. 베이제린키이 DSM 6423의 upp 유전자자리의 변형을 나타낸다. 도 19의 A는 게놈 DNA내에서 상응하는 상동성 영역과 관련된, upp 유전자자리: 유전자, gRNA 표적 부위 및 복구 매트릭스의 유전적 구조화를 나타낸다. PCR 검증을 위한 프라이머의 하이브리드화 부위(RH010 및 RH011)가 또한 나타나 있다.
도 19의 B는 CRISPR/Cas9 시스템을 통해 씨. 베이제린키이 DSM 6423의 upp 유전자자리의 변형을 나타낸다. 도 19의 B는 프라이머 RH010 및 RH011을 사용하는 upp 유전자자리의 증폭을 나타낸다. 1680 bp의 증폭은 변형된 upp 유전자에 대한 1090 bp와 비교하여, 야생형 유전자의 경우에 예측된다. M, 100 bp 내지 3 kb 크기의 마커(Lonza); WT, 야생형 균주.
도 20: 균주 씨. 베이제린키이 DSM 6423 △catB 내에서 플라스미드 pCas9ind의 존재를 입증하는 PCR 증폭.
도 21: CRISPR-Cas9 시스템으로부터 aTc를 포함하는 배지 상에서 유도 전(양성 대조군 1 및 2) 및 이후 유도 후 천연의 pNF2 플라스미드의 존재 또는 부재를 입증하는 PCR 증폭(△900 bp).
도 22: 2개의 플라스미드의 사용을 기반으로 한, 클로스트리디움 속의 박테리아에 대해 적응된, 박테리아를 변형시키기 위한 유전 툴(참고: WO2017/064439, Wasels et al., 2017).
도 23: pCas9ind-gRNA△catB 플라스미드 맵.
도 24: 균주 씨. 베이제린키이 DSM 6423내에서 20 μg의 pCas9ind 플라스미드에 대한 형질전환 효능(형질전환된 DNA의 μg당 관찰된 콜로니 수). 오차 바아(error bar)는 생물학적 3회 반복에 대한 평균의 표준 오차를 나타낸다.
도 25: pNF3 플라스미드 맵.
도 26: pEC751S 플라스미드 맵.
도 27: pNF3S 플라스미드 맵.
도 28: pNF3E 플라스미드 맵.
도 29: pNF3C 플라스미드 맵.
도 30: 씨. 베이제린키이 DSM 6423의 3개의 균주내 플라스미드 pCas9ind의 형질전환 효능(형질전환된 DNA의 μg당 관찰된 콜로니 수). 오차 바아는 생물학적 3회 반복의 평균의 표준 오차에 상응한다.
도 31: 씨. 베이제린키이 DSM 6423으로부터 유도된 2개의 균주내에서 플라스미드 pEC750C의 형질전환 효능(형질전환된 DNA의 μg당 관찰된 콜로니 수).
도 32: 균주 씨. 베이제린키이 DSM 6423 △catB △pNF2내 플라스미드 pEC750C, pNF3C, pFW01 및 pNF3E의 형질전환 효능(형질전환된 DNA의 μg당 관찰된 콜로니 수). 오차 바아는 생물학적 3회 반복에 대한 평균의 표준 편차에 상응한다.
도 33: 균주 씨. 베이제린키이 NCIMB 8052의 플라스미드 pFW01, pNF3E 및 pNF3S의 형질전환 효능(형질전환된 DNA의 μg당 관찰된 콜로니 수).
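도 19의 B에 대한 설명에서 제시된 예측 증폭 크기(야생형 upp 유전자자리 1680 bp, 변형된 유전자자리 1090 bp)를 이용해 겔에서 읽은 밴드 크기로 유전형을 판정하는 과정은 다음과 같이 스케치할 수 있다. 허용 오차 10%는 예시를 위한 가정이며 명세서에 기재된 값이 아니다. 함수명도 설명용으로 임의로 만든 것이다.

```python
EXPECTED_BP = {"wild_type": 1680, "modified": 1090}  # 도 19의 B 설명의 예측값


def call_genotype(observed_bp, tolerance=0.10):
    """관찰된 밴드 크기가 어느 예측값에 해당하는지 판정한다."""
    for label, size in EXPECTED_BP.items():
        if abs(observed_bp - size) <= tolerance * size:
            return label
    return "undetermined"


# 가상의 관찰값(가정)
for band in (1700, 1100, 1400):
    print(band, call_genotype(band))
```

같은 방식으로 도 16의 catB 유전자자리(약 1.5 kb 대 약 900 bp)에도 `EXPECTED_BP` 값만 바꾸어 적용할 수 있다.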
실시예
실시예 1
물질 및 방법
성장 조건
씨. 아세토부틸리쿰 DSM 792를 2YTG 배지(16 g/l 트립톤, 10 g/l 효모 추출물, 5 g/l 글루코즈, 4 g/l NaCl) 속에서 성장시켰다. 이. 콜라이 NEB10B를 LB 배지(10 g/l 트립톤, 5 g/l 효모 추출물, 5 g/l NaCl) 속에서 성장시켰다. 고체 배지는 15 g/l 아가로즈를 액체 배지에 첨가하여 제조하였다. 에리쓰로마이신(2YTG 또는 LB 배지 속에서 각각 40 또는 500 mg/l의 농도에서), 클로람페니콜(고체 또는 액체 LB 속에서 각각 25 또는 12.5 mg/l의 농도) 및 티암페니콜(2YTG 배지 속에서 15 mg/l)을 필요한 경우 사용하였다.
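위에 기재된 배지 조성(g/l)을 원하는 부피에 맞추어 환산하는 계산은 다음과 같이 스케치할 수 있다. 조성 값은 위 단락에서 그대로 가져온 것이고, 함수명 `recipe`와 딕셔너리 구조는 설명용으로 임의로 만든 것이다.

```python
MEDIA_G_PER_L = {
    "2YTG": {"tryptone": 16, "yeast extract": 10, "glucose": 5, "NaCl": 4},
    "LB": {"tryptone": 10, "yeast extract": 5, "NaCl": 5},
}
AGAROSE_G_PER_L = 15  # 고체 배지용


def recipe(medium, volume_L, solid=False):
    """주어진 부피(L)에 필요한 각 성분의 질량(g)을 반환한다."""
    amounts = {name: g * volume_L for name, g in MEDIA_G_PER_L[medium].items()}
    if solid:
        amounts["agarose"] = AGAROSE_G_PER_L * volume_L
    return amounts


print(recipe("2YTG", 0.5))
print(recipe("LB", 2, solid=True))
```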
핵산의 사용
사용된 모든 효소 및 키트는 제공자의 추천에 따라 그와 같이 수행하였다.
플라스미드
작제
도 4에 나타낸 pCas9acr 플라스미드(서열 번호: 23)는 bgaR 및 acrIIA4를 포함하는 단편(서열 번호: 1)을 Eurofins Genomics가 합성한 Pbgal 프로모터의 제어하에서 pCas9ind 벡터의 SacI 부위에서 클로닝함으로써 작제하였다(Wasels et al., 2017).
pGRNAind 플라스미드(서열 번호: 82)는 gRNA용의 발현 카세트(서열 번호: 83)를 Eurofins Genomics가 합성한 프로모터 Pcm-2tetO1 (Dong et al., 2012)의 제어 하에서pEC750C 벡터(서열 번호: 106)의 SacI 부위에 클로닝함으로써 작제하였다(Wasels et al., 2017).
pGRNA-xylB(서열 번호: 102), pGRNA-xylR(서열 번호: 103), pGRNA-glcG(서열 번호: 104) 및 pGRNA-bdhB(서열 번호: 105) 플라스미드는 각각의 프라이머 쌍 5'-TCATGATTTCTCCATATTAGCTAG-3' 및 5'-AAACCTAGCTAATATGGAGAAATC-3', 5'-TCATGTTACACTTGGAACAGGCGT-3' 및 5'-AAACACGCCTGTTCCAAGTGTAAC-3', 5'-TCATTTCCGGCAGTAGGATCCCCA-3' 및 5'-AAACTGGGGATCCTACTGCCGGAA-3', 5'-TCATGCTTATTACGACATAACACA-3' 및 5'-AAACTGTGTTATGTCGTAATAAGC-3'를 BsaI으로 분해한 pGRNAind 플라스미드(서열 번호: 82) 내에 클로닝함으로써 작제하였다.
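The four primer pairs above can be checked for internal consistency. The sketch below assumes, as an interpretation not stated in the text, that the leading TCAT and AAAC of each oligonucleotide are the single-stranded overhangs left after annealing, and that the remaining 20 nucleotides of the two oligonucleotides are reverse complements of each other. The function names are illustrative only.

```python
# Check that each annealed oligo pair encodes one 20-nt spacer.
# Assumption (not stated in the text): "TCAT" and "AAAC" are the 5' overhangs
# of the annealed duplex that match the BsaI-digested pGRNAind vector.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]

OLIGO_PAIRS = {
    "pGRNA-xylB": ("TCATGATTTCTCCATATTAGCTAG", "AAACCTAGCTAATATGGAGAAATC"),
    "pGRNA-xylR": ("TCATGTTACACTTGGAACAGGCGT", "AAACACGCCTGTTCCAAGTGTAAC"),
    "pGRNA-glcG": ("TCATTTCCGGCAGTAGGATCCCCA", "AAACTGGGGATCCTACTGCCGGAA"),
    "pGRNA-bdhB": ("TCATGCTTATTACGACATAACACA", "AAACTGTGTTATGTCGTAATAAGC"),
}

def spacer_from_pair(top: str, bottom: str,
                     top_overhang: str = "TCAT",
                     bottom_overhang: str = "AAAC") -> str:
    """Return the spacer encoded by an oligo pair, or raise if they disagree."""
    if not top.startswith(top_overhang) or not bottom.startswith(bottom_overhang):
        raise ValueError("unexpected 5' overhang")
    spacer = top[len(top_overhang):]
    if reverse_complement(spacer) != bottom[len(bottom_overhang):]:
        raise ValueError("oligos are not complementary over the spacer")
    return spacer

spacers = {name: spacer_from_pair(*pair) for name, pair in OLIGO_PAIRS.items()}
for name, spacer in spacers.items():
    print(name, spacer, len(spacer))
```

Under these assumptions, each pair yields a single 20-nucleotide spacer, which is the length expected for a guide sequence.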
The pGRNA-ΔbdhB plasmid (SEQ ID NO: 79) was constructed by cloning, into the pGRNA-bdhB vector digested with BamHI and SacI, the DNA fragment obtained by overlapping PCR assembly of the PCR products obtained with primers 5'-ATGCATGGATCCAAACGAACCCAAAAAGAAAGTTTC-3' and 5'-GGTTGATTTCAAATCTGTGTAAACCTACCG-3' on the one hand, and 5'-ACACAGATTTGAAATCAACCACTTTAACCC-3' and 5'-ATGCATGTCGACTCTTAAGAACATGTATAAAGTATGG-3' on the other.
The pGRNA-ΔbdhA ΔbdhB plasmid (SEQ ID NO: 80) was constructed by cloning, into the pGRNA-bdhB vector digested with BamHI and SacI, the DNA fragment obtained by overlapping PCR assembly of the PCR products obtained with primers 5'-ATGCATGGATCCAAACGAACCCAAAAAGAAAGTTTC-3' and 5'-GCTAAGTTTTAAATCTGTGTAAACCTACCG-3' on the one hand, and 5'-ACACAGATTTAAAACTTAGCATACTTCTTACC-3' and 5'-ATGCATGTCGACCTTCTAATCTCCTCTACTATTTTAG-3' on the other.
Transformation
C. acetobutylicum DSM 792 was transformed according to the protocol described by Mermelstein et al., 1993. C. acetobutylicum DSM 792 transformants that already contained a Cas9 expression plasmid (pCas9ind or pCas9acr) and were transformed with a plasmid containing a gRNA expression cassette were selected on solid 2YTG medium containing erythromycin (40 mg/l), thiamphenicol (15 mg/l) and lactose (40 mM).
Induction of cas9 expression
Expression of cas9 was induced by growing the transformants obtained on solid 2YTG medium containing erythromycin (40 mg/l), thiamphenicol (15 mg/l) and aTc (1 mg/l), the inducer of cas9 and gRNA expression.
Amplification of the bdh locus
Editing of the C. acetobutylicum DSM 792 genome at the bdhA and bdhB loci was verified by PCR using the Q5® High-Fidelity DNA Polymerase (NEB) with primers V1 (5'-ACACATTGAAGGGAGCTTTT-3') and V2 (5'-GGCAACAACATCAGGCCTTT-3').
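Results
Transformation efficiency
To evaluate the effect of inserting the acrIIA4 gene into the cas9 expression plasmid on transformation frequency, different gRNA expression plasmids were transformed into strain DSM 792 containing pCas9ind (SEQ ID NO: 22) or pCas9acr (SEQ ID NO: 23), and transformants were selected on medium supplemented with lactose. The transformation frequencies obtained are shown in Figure 5.
Generation of ΔbdhB and ΔbdhA ΔbdhB mutants
The targeting plasmid containing an expression cassette for a gRNA targeting bdhB (pGRNA-bdhB, SEQ ID NO: 105), as well as two plasmids derived from it that target deletion of the bdhB gene alone (pGRNA-ΔbdhB, SEQ ID NO: 79) or of the bdhA and bdhB genes (pGRNA-ΔbdhA ΔbdhB, SEQ ID NO: 80), were transformed into strain DSM 792 containing pCas9ind (SEQ ID NO: 22) or pCas9acr (SEQ ID NO: 23). The transformation frequencies obtained are shown in Table 2:
Table 2: Transformation frequencies of strain DSM 792 containing pCas9ind or pCas9acr with plasmids targeting bdhB. Frequencies are expressed as the number of transformants obtained per μg of DNA used for transformation and represent the mean of at least two independent experiments.
The transformants obtained then underwent a phase of induction of CRISPR/Cas9 system expression by subculturing on medium supplemented with anhydrotetracycline (aTc) (Figure 6).
The desired modifications were confirmed by PCR on the genomic DNA of two aTc-resistant colonies (Figure 7).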
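Conclusion
The CRISPR/Cas9-based genetic tool described in Wasels et al. (2017) uses two plasmids:
- the first plasmid, pCas9ind, contains cas9 under the control of an aTc-inducible promoter;
- the second plasmid, derived from pEC750C, contains an expression cassette for a gRNA (placed under the control of a second aTc-inducible promoter) as well as an editing template for repairing the double-strand break induced by the system.
However, the inventors observed that some gRNAs still appeared to be too toxic, despite the control of their expression and of Cas9 expression by aTc-inducible promoters. This toxicity limited the efficiency of bacterial transformation with the genetic tool and therefore of chromosome modification.
To improve this genetic tool, the cas9 expression plasmid was modified by inserting an anti-CRISPR gene, acrIIA4, under the control of a lactose-inducible promoter. The transformation efficiencies of the different gRNA expression plasmids were thereby significantly improved, so that transformants were obtained for all the plasmids tested.
It was also possible to edit the bdhB locus in the C. acetobutylicum DSM 792 genome using plasmids that could not be introduced into strain DSM 792 containing pCas9ind. The modification frequencies observed are the same as those observed previously (Wasels et al., 2017), with 100% of the colonies tested being modified.
In conclusion, the modification of the cas9 expression plasmid allows better control of the Cas9-gRNA ribonucleoprotein complex, which advantageously facilitates the production of transformants in which the action of Cas9 can then be triggered to obtain the desired mutants.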
Example 2
Materials and methods
Growth conditions
C. beijerinckii DSM 6423 was grown in 2YTG medium (16 g/l tryptone, 10 g/l yeast extract, 5 g/l glucose, 4 g/l NaCl). E. coli NEB 10-beta and INV110 were grown in LB medium (10 g/l tryptone, 5 g/l yeast extract, 5 g/l NaCl). Solid media were prepared by adding 15 g/l agarose to the liquid media. Erythromycin (at 20 or 500 mg/l in 2YTG or LB medium, respectively), chloramphenicol (at 25 or 12.5 mg/l in solid or liquid LB, respectively), thiamphenicol (at 15 mg/l in 2YTG medium) or spectinomycin (at 100 or 650 mg/l in LB or 2YTG medium, respectively) were used when necessary.
Nucleic acids and plasmid vectors
All enzymes and kits were used according to the suppliers' recommendations.
Colony PCR tests followed this protocol:
An isolated colony of C. beijerinckii DSM 6423 is resuspended in 100 μL of 10 mM Tris, pH 7.5, 5 mM EDTA. This suspension is heated to 98°C for 10 minutes without agitation. 0.5 μL of this bacterial lysate can then be used as PCR template in 10 μL reactions with Phire (Thermo Scientific), Phusion (Thermo Scientific), Q5 (NEB) or KAPA2G Robust (Sigma-Aldrich) polymerase.
The primers used for all the constructions (name/DNA sequence) are listed below:
△catB△fwd:
TGTTATGGATTATAAGCGGCTCGAGGACGTCAAACCATGTTAATCATTGC
△catB△rev:
AATCTATCACTGATAGGGACTCGAGCAATTTCACCAAAGAATTCGCTAGC
△catB△gRNA△rev
AATCTATCACTGATAGGGACTCGAGGGGCAAAAGTGTAAAGACAAGCTTC
RH076:
CATATAATAAAAGGAAACCTCTTGATCG
RH077:
ATTGCCAGCCTAACACTTGG
RH001:
ATCTCCATGGACGCGTGACGTCGACATAAGGTACCAGGAATTAGAGCAGC
RH002:
TCTATCTCCAGCTCTAGACCATTATTATTCCTCCAAGTTTGCT
RH003:
ATAATGGTCTAGAGCTGGAGATAGATTATTTGGTACTAAG
RH004:
TATGACCATGATTACGAATTCGAGCTCGAAGCGCTTATTATTGCATTAGC
pEX-fwd:
CAGATTGTACTGAGAGTGCACC
pEX-rev:
GTGAGCGGATAACAATTTCACAC
pEC750C-fwd:
CAATATTCCACAATATTATATTATAAGCTAGC
M13-rev:
CAGGAAACAGCTATGAC
RH010:
CGGATATTGCATTACCAGTAGC
RH011:
TTATCAATCTCTTACACATGGAGC
RH025:
TAGTATGCCGCCATTATTACGACA
RH134:
GTCGACGTGGAATTGTGAGC
pNF2△fwd:
GGGCGCACTTATACACCACC
pNF2△rev:
TGCTACGCACCCCCTAAAGG
RH021
ACTTGGGTCGACCACGATAAAACAAGGTTTTAAGG
RH022
TACCAGGGATCCGTATTAATGTAACTATGATATCAATTCTTG
aad9-fwd2
ATGCATGGTCCCAATGAATAGGTTTACACTTACTTTAGTTTTATGG
aad9-rev
ATGCGAGTTAACAACTTCTAAAATCTGATTACCAATTAG
RH031
ATGCATGGATCCCAATGAATAGGTTTACACTTACTTTAGTTTTATGG
RH032
ATGCGAGAGCTCAACTTCTAAAATCTGATTACCAATTAG
RH138
ATGCATGGATCCGTCTGACAGTTACCAGGTCC
RH139
ATGCGAGAGCTCCAATTGTTCAAAAAAATAATGGCGGAG
RH140
ATGCATGGATCCCGGCAGTTTTTCTTTTTCGG
RH141
ATGCGAGAGCTCGGTTAAATACTAGTTTTTAGTTACAGAC
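The following plasmid vectors were prepared:
- Plasmid No. 1: pEX-A258-ΔcatB (SEQ ID NO: 17)
It contains the synthesized ΔcatB DNA fragment cloned into plasmid pEX-A258. This ΔcatB fragment comprises i) an expression cassette for a guide RNA targeting the catB gene of C. beijerinckii DSM 6423 (chloramphenicol resistance gene encoding a chloramphenicol-O-acetyltransferase, SEQ ID NO: 18) under the control of an anhydrotetracycline-inducible promoter (expression cassette: SEQ ID NO: 19) and ii) an editing template (SEQ ID NO: 20) comprising 400 bp of homology located upstream and downstream of the catB gene.
- Plasmid No. 2: pCas9ind-ΔcatB (see Figure 9 and SEQ ID NO: 21)
It contains the ΔcatB fragment amplified by PCR (primers △catB△fwd and △catB△rev) and cloned into pCas9ind (described in patent application WO2017/064439, SEQ ID NO: 22) after digestion of the various DNAs with the restriction enzyme XhoI.
- Plasmid No. 3: pCas9acr (see Figure 10 and SEQ ID NO: 23)
- Plasmid No. 4: pEC750S-uppHR (see Figure 11 and SEQ ID NO: 24)
It contains a repair template (SEQ ID NO: 25) used for deletion of the upp gene and consisting of two homologous DNA fragments located upstream and downstream of the upp gene (respective sizes: 500 (SEQ ID NO: 26) and 377 (SEQ ID NO: 27) base pairs). The assembly was obtained using the Gibson cloning system (New England Biolabs, Gibson Assembly Master Mix 2X). To this end, the upstream and downstream parts were amplified by PCR from genomic DNA of strain DSM 6423 (see Mate de Gerando et al., 2018 and accession number PRJEB11626 (https://www.ebi.ac.uk/ena/data/view/PRJEB11626)) using primers RH001/RH002 and RH003/RH004, respectively. These two fragments were then assembled in pEC750S previously linearized by enzymatic restriction (SalI and SacI restriction enzymes).
- Plasmid No. 5: pEX-A2-gRNA-upp (see Figure 12 and SEQ ID NO: 28)
This plasmid comprises the gRNA-upp DNA fragment, corresponding to an expression cassette (SEQ ID NO: 29) for a guide RNA targeting the upp gene (protospacer targeting upp: SEQ ID NO: 31) under the control of a constitutive promoter (non-coding RNA of sequence SEQ ID NO: 30), inserted into a replicative plasmid named pEX-A2.
- Plasmid No. 6: pEC750S-Δupp (see Figure 13 and SEQ ID NO: 32)
It is based on plasmid pEC750S-uppHR (SEQ ID NO: 24) and additionally contains the DNA fragment comprising an expression cassette for a guide RNA targeting the upp gene under the control of a constitutive promoter.
This fragment had been inserted into pEX-A2, giving the plasmid called pEX-A2-gRNA-upp. The insert was then amplified by PCR with primers pEX-fwd and pEX-rev and digested with the restriction enzymes XhoI and NcoI. Finally, this fragment was cloned by ligation into pEC750S-uppHR previously digested with the same restriction enzymes, yielding pEC750S-Δupp.
- Plasmid No. 7: pEC750C-Δupp (see Figure 14 and SEQ ID NO: 33)
The cassette carrying the guide RNA and the repair template was then amplified with primers pEC750C-fwd and M13-rev. The amplicon was digested with the enzymes XhoI and SacI and then cloned by enzymatic ligation into pEC750C to obtain pEC750C-Δupp.
- Plasmid No. 8: pGRNA-pNF2 (see Figure 15 and SEQ ID NO: 34)
This plasmid is based on pEC750C and contains an expression cassette for a guide RNA targeting plasmid pNF2 (SEQ ID NO: 118).
- Plasmid No. 9: pCas9ind-gRNA△catB (see Figure 23 and SEQ ID NO: 38)
It contains the sequence encoding the guide RNA targeting the catB locus, amplified by PCR (primers △catB△fwd and △catB△gRNA△rev) and cloned into pCas9ind (described in patent application WO2017/064439) after digestion of the various DNAs with the restriction enzyme XhoI and ligation.
- Plasmid No. 10: pNF3 (see Figure 25 and SEQ ID NO: 119)
It contains a part of pNF2 comprising the origin of replication and the gene encoding a plasmid replication protein (CIBE△p20001), amplified with primers RH021 and RH022. This PCR product was then cloned into plasmid pUC19 (SEQ ID NO: 117) at the SalI and BamHI restriction sites.
- Plasmid No. 11: pEC751S (see Figure 26 and SEQ ID NO: 121)
It contains all the elements of pEC750C (SEQ ID NO: 106) except the chloramphenicol resistance gene catP (SEQ ID NO: 70). The latter is replaced by the aad9 gene of Enterococcus faecalis (SEQ ID NO: 130), which confers resistance to spectinomycin. This element was amplified from plasmid pMTL007S-E1 (SEQ ID NO: 120) with primers aad9-fwd2 and aad9-rev and cloned into the AvaII and HpaI sites of pEC750C in place of the catP gene (SEQ ID NO: 70).
- Plasmid No. 12: pNF3S (see Figure 27 and SEQ ID NO: 123)
It contains all the elements of pNF3, with the aad9 gene (amplified from pEC751S with primers RH031 and RH032) inserted between the BamHI and SacI sites.
- Plasmid No. 13: pNF3E (see Figure 28 and SEQ ID NO: 124)
It contains all the elements of pNF3, with the ermB gene of Clostridium difficile (SEQ ID NO: 131) inserted under the control of the miniPthl promoter. This element was amplified from pFW01 with primers RH138 and RH139 and then cloned between the BamHI and SacI sites of pNF3.
- Plasmid No. 14: pNF3C (see Figure 29 and SEQ ID NO: 125)
It contains all the elements of pNF3, with the catP gene of Clostridium perfringens (SEQ ID NO: 70) inserted. This element was amplified from pEC750C with primers RH140 and RH141 and cloned between the BamHI and SacI sites of pNF3E.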
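The relationship between the guide RNA cassette of plasmid No. 1 (SEQ ID NO: 19) and its catB target (SEQ ID NO: 18) can be illustrated with a short sketch. It assumes that the 20 nucleotides immediately preceding the "gttttagagcta" scaffold in SEQ ID NO: 19 are the spacer, and that the Cas9 in use recognizes a 5'-NGG-3' PAM (as for the Streptococcus pyogenes enzyme). Neither point is stated explicitly in this section. Only positions 541 to 600 of SEQ ID NO: 18 are reproduced, and the function name is illustrative.

```python
# Locate the gRNA spacer of SEQ ID NO: 19 inside catB (SEQ ID NO: 18)
# and inspect the 3 nt that follow it.
# Assumptions: spacer = 20 nt preceding the scaffold in SEQ ID NO: 19;
# the PAM rule checked here is 5'-NGG-3' (S. pyogenes Cas9).

# Positions 541-600 of SEQ ID NO: 18, copied from the sequence listing.
CATB_EXCERPT = "".join(
    "tttattccta tatcaataca agtacatcat gctatctgtg acggttatca tgctagtaga".split()
)
EXCERPT_START = 541  # 1-based coordinate of the first base of the excerpt

SPACER = "gtacatcatgctatctgtga"  # from SEQ ID NO: 19, just before "gttttagagcta"

def find_target(gene: str, spacer: str, first_coord: int = 1):
    """Return (1-based start of the protospacer, following 3 nt) or None."""
    index = gene.find(spacer)
    if index < 0:
        return None
    pam = gene[index + len(spacer): index + len(spacer) + 3]
    return first_coord + index, pam

def is_ngg(pam: str) -> bool:
    """True when a 3-nt string matches the N-G-G pattern."""
    return len(pam) == 3 and pam[1:].lower() == "gg"

hit = find_target(CATB_EXCERPT, SPACER, EXCERPT_START)
print(hit, is_ngg(hit[1]) if hit else None)
```

In this excerpt the spacer matches the coding strand of catB and is followed by a trinucleotide that fits the NGG pattern, consistent with the cassette being designed to target catB.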
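Results No. 1
Transformation of C. beijerinckii DSM 6423
The plasmids were introduced into and replicated in an E. coli dam- dcm- strain (INV110, Invitrogen). This removes Dam and Dcm methylation from the pCas9ind-ΔcatB plasmid before it is introduced by transformation into strain DSM 6423, following the protocol described by Mermelstein et al. (1993) with the following modifications: the strain is transformed with a larger amount of plasmid (20 μg), at an OD600 of 0.8, and using the following electroporation parameters: 100 Ω, 25 μF, 1400 V. Plating on Petri dishes containing erythromycin (20 μg/mL) yielded transformants of C. beijerinckii DSM 6423 containing the pCas9ind-ΔcatB plasmid.
Induction of cas9 expression and production of strain C. beijerinckii DSM 6423 ΔcatB
Several erythromycin-resistant colonies were then picked into 100 μL of culture medium (2YTG) and serially diluted in culture medium up to a dilution factor of 10⁴. For each colony, 8 μL of each dilution was spotted onto a Petri dish containing erythromycin and anhydrotetracycline (200 ng/mL) to induce expression of the gene encoding the Cas9 nuclease.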
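The spotting scheme described above can be written out as a small calculation. This sketch assumes tenfold dilution steps starting from the undiluted suspension (the text gives only the final dilution factor of 10⁴), and the function names are illustrative.

```python
# Dilution series for spot assays: undiluted suspension down to a 10^4-fold dilution.
# Assumption: tenfold steps; only the final factor (10^4) is given in the text.

def tenfold_series(max_factor: int = 10**4) -> list:
    """Return dilution factors 1, 10, 100, ... up to and including max_factor."""
    factors = []
    factor = 1
    while factor <= max_factor:
        factors.append(factor)
        factor *= 10
    return factors

def plated_equivalent_ul(spot_volume_ul: float, dilution_factor: int) -> float:
    """Volume of the original colony suspension contained in one spot."""
    return spot_volume_ul / dilution_factor

for factor in tenfold_series():
    equivalent = plated_equivalent_ul(8, factor)  # 8 uL spots, as in the protocol
    print(f"1:{factor:<6} -> {equivalent:g} uL of original suspension per spot")
```

Under the tenfold assumption each colony gives five spots, so the least diluted spot carries 10⁴ times more of the original suspension than the most diluted one.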
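After extraction of genomic DNA, deletion of the catB gene in the clones that grew on the dishes was verified by PCR using primers RH076 and RH077 (see Figure 16).
Verification of the sensitivity of strain C. beijerinckii DSM 6423 ΔcatB to thiamphenicol
To confirm that deletion of the catB gene does confer a new sensitivity to thiamphenicol, a comparative analysis was carried out on agar medium. Precultures of C. beijerinckii DSM 6423 and C. beijerinckii DSM 6423 ΔcatB were prepared in 2YTG medium, and 100 μL of these precultures were then spread on 2YTG agar medium with or without thiamphenicol at a concentration of 15 mg/L. Figure 17 shows that only the initial C. beijerinckii DSM 6423 strain can grow on thiamphenicol-supplemented medium.
Deletion of the upp gene by the CRISPR-Cas9 tool in strain C. beijerinckii DSM 6423 ΔcatB
A clone of strain C. beijerinckii DSM 6423 ΔcatB was first transformed with the pCas9acr vector carrying no methylation at the motifs recognized by the dam and dcm methyltransferases (prepared from an Escherichia coli bacterium of dam- dcm- genotype). The presence of plasmid pCas9acr maintained in strain C. beijerinckii DSM 6423 was verified by colony PCR with primers RH025 and RH134.
An erythromycin-resistant clone was then transformed with pEC750C-Δupp that had first been demethylated. The colonies obtained were selected on medium containing erythromycin (20 μg/mL), thiamphenicol (15 μg/mL) and lactose (40 mM).
Several of these colonies were then resuspended in 100 μL of culture medium (2YTG) and serially diluted in culture medium (up to a dilution factor of 10⁴). Five microliters of each dilution were spotted onto a Petri dish containing erythromycin, thiamphenicol and anhydrotetracycline (200 ng/mL) (see Figure 18).
For each clone, two aTc-resistant colonies were tested by colony PCR using primers for amplification of the upp locus (see Figure 19).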
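The legend of Figure 19B gives the expected amplicon sizes with primers RH010 and RH011: 1680 bp for the wild-type upp locus and 1090 bp for the modified locus. A minimal sketch of how a band size read from a gel could be interpreted against these two values follows. The tolerance of 50 bp is an arbitrary assumption and the function name is illustrative.

```python
# Interpret a colony-PCR band at the upp locus (primers RH010/RH011).
# Expected sizes are taken from the legend of Figure 19B; the tolerance is an assumption.
WT_AMPLICON_BP = 1680      # wild-type upp locus
EDITED_AMPLICON_BP = 1090  # modified upp locus

def classify_band(observed_bp: float, tolerance_bp: float = 50) -> str:
    """Return 'wild-type', 'edited' or 'unexpected' for an estimated band size."""
    if abs(observed_bp - WT_AMPLICON_BP) <= tolerance_bp:
        return "wild-type"
    if abs(observed_bp - EDITED_AMPLICON_BP) <= tolerance_bp:
        return "edited"
    return "unexpected"

# Size difference between the two amplicons, i.e. the length removed from the locus.
deleted_bp = WT_AMPLICON_BP - EDITED_AMPLICON_BP
print(deleted_bp)
print(classify_band(1100), classify_band(1700), classify_band(1400))
```

The difference between the two expected amplicons corresponds to the length of sequence removed from the locus by the edit.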
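Removal of the native pNF2 plasmid by the CRISPR-Cas9 tool in strain C. beijerinckii DSM 6423 ΔcatB
A clone of strain C. beijerinckii DSM 6423 ΔcatB was first transformed with the vector pCas9ind carrying no methylation at the motifs recognized by the Dam and Dcm methyltransferases (prepared from an Escherichia coli bacterium of dam- dcm- genotype). The presence of plasmid pCas9ind in strain C. beijerinckii DSM 6423 was verified by PCR with primers pCas9ind△fwd (SEQ ID NO: 42) and pCas9ind△rev (SEQ ID NO: 43) (see Figure 20).
An erythromycin-resistant clone was then transformed with pGRNA-pNF2 prepared from an Escherichia coli bacterium of dam- dcm- genotype.
Several colonies obtained on medium containing erythromycin (20 μg/mL) and thiamphenicol (15 μg/mL) were resuspended in culture medium and serially diluted up to a dilution factor of 10⁴. Eight microliters of each dilution were spotted onto a Petri dish containing erythromycin, thiamphenicol and anhydrotetracycline (200 ng/mL) to induce expression of the CRISPR/Cas9 system.
The absence of the native pNF2 plasmid was verified by PCR with primers pNF2△fwd (SEQ ID NO: 39) and pNF2△rev (SEQ ID NO: 40) (see Figure 21).
Conclusion
In the course of this work, the inventors succeeded in introducing and maintaining different plasmids in strain Clostridium beijerinckii DSM 6423. They were able to delete the catB gene using a CRISPR-Cas9 tool based on a single plasmid. The thiamphenicol sensitivity of the recombinant strains obtained was confirmed by tests on agar medium.
This deletion allowed them to make more effective use of the CRISPR-Cas9 tool requiring two plasmids that is described in patent application FR1854835. Two examples were carried out to demonstrate the value of this application: deletion of the upp gene and removal of a native plasmid that is not essential to strain Clostridium beijerinckii DSM 6423.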
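Results No. 2
Transformation of C. beijerinckii strains
Plasmids prepared in strain E. coli NEB 10-beta were also used to transform strain C. beijerinckii NCIMB 8052. For C. beijerinckii DSM 6423, in contrast, the plasmids were first introduced into and replicated in an E. coli dam- dcm- strain (INV110, Invitrogen). This removes Dam and Dcm methylation from the plasmids of interest before they are introduced by transformation into strain DSM 6423.
Transformation is otherwise carried out in the same way for each strain, that is, according to the protocol described by Mermelstein et al. 1992 with the following modifications: the strain is transformed with a larger amount of plasmid (5-20 μg), at an OD600 of 0.6 to 0.8, and the electroporation parameters are 100 Ω, 25 μF, 1400 V. After 3 hours of regeneration in 2YTG, the bacteria are plated on Petri dishes (2YTG agar) containing the desired antibiotic (erythromycin: 20-40 μg/mL; thiamphenicol: 15 μg/mL; spectinomycin: 650 μg/mL).
Comparison of the transformation efficiencies of C. beijerinckii DSM 6423 strains
Transformations were performed in biological replicates in the following C. beijerinckii strains: DSM 6423 wild type, DSM 6423 ΔcatB and DSM 6423 ΔcatB ΔpNF2 (Figure 30). For this purpose, the pCas9ind vector was used, which is particularly difficult to use for modifying bacteria because it does not allow good transformation efficiencies. It also contains a gene conferring resistance to erythromycin, an antibiotic to which all three strains are sensitive.
The results show an increase in transformation efficiency by a factor of about 15 to 20 attributable to the loss of the native pNF2 plasmid.
Transformation efficiency was also tested for plasmid pEC750C, which confers thiamphenicol resistance, only in strains DSM 6423 ΔcatB and DSM 6423 ΔcatB ΔpNF2, because the wild-type strain is resistant to this antibiotic (Figure 31). For this plasmid, the gain in transformation efficiency is even more pronounced (an increase by a factor of about 2000).
Comparison of the transformation of pNF3 plasmids with that of other plasmids
To determine the transformation efficiency of plasmids containing the origin of replication of the native pNF2 plasmid, plasmids pNF3E and pNF3C were introduced into strain C. beijerinckii DSM 6423 ΔcatB ΔpNF2. The use of vectors carrying an erythromycin or a chloramphenicol resistance gene allows the transformation efficiency of the vector to be compared according to the nature of the resistance gene. Plasmids pFW01 and pEC750C were also transformed. These two plasmids contain resistance genes to different antibiotics (erythromycin and thiamphenicol, respectively) and are commonly used to transform C. beijerinckii and C. acetobutylicum.
As shown in Figure 32, pNF3-based vectors have excellent transformation efficiency and are particularly suitable for use in C. beijerinckii DSM 6423 ΔcatB ΔpNF2. In particular, pNF3E (which contains an erythromycin resistance gene) shows a significantly higher transformation efficiency than pFW01, which carries the same resistance gene. This same plasmid could not be introduced into the wild-type C. beijerinckii DSM 6423 strain (0 colonies obtained with 5 μg of plasmid transformed in biological replicates), which demonstrates the effect of the presence of the native pNF2 plasmid.
Verification of the transformability of pNF3 plasmids in other strains/species
To show that these new plasmids can be used in other solventogenic Clostridium strains, the inventors carried out a comparative analysis of the transformation efficiencies of plasmids pFW01, pNF3E and pNF3S in the ABE strain C. beijerinckii NCIMB 8052 (Figure 33). Because strain NCIMB 8052 is naturally resistant to thiamphenicol, pNF3S, which confers resistance to spectinomycin, was used instead of pNF3C.
The results show that strain NCIMB 8052 can be transformed with pNF3-based plasmids, which demonstrates that these vectors are applicable to the species C. beijerinckii in the broad sense.
The applicability of the set of synthetic vectors based on pNF3 was also tested in the reference strain C. acetobutylicum DSM 792. The transformation test showed that this strain can be transformed with plasmid pNF3C (transformation efficiency of 3 colonies observed per μg of transformed DNA, compared with 120 colonies/μg for plasmid pEC750C).
Verification of the compatibility of pNF3 plasmids with the genetic tool described in application FR18/73492
Patent application FR18/73492 describes the ΔcatB strain as well as the use of a two-plasmid CRISPR/Cas9 system requiring an erythromycin resistance gene and a thiamphenicol resistance gene. To demonstrate the benefit of the new set of pNF3 plasmids, the vector pNF3C was transformed into the ΔcatB strain already containing the pCas9acr plasmid. The transformation, performed in duplicate, showed a transformation efficiency of 0.625 ± 0.125 colonies/μg DNA (mean ± standard error), which demonstrates that pNF3C-based vectors can be used together with pCas9acr in the ΔcatB strain.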
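The "mean ± standard error" summary used above can be reproduced with a short calculation. The two replicate values in this sketch (0.5 and 0.75 colonies/μg) are hypothetical: they are not given in the text and were chosen only because they are consistent with the reported 0.625 ± 0.125 for a duplicate. The function name is illustrative.

```python
# Mean and standard error of the mean (SEM) for replicate transformation efficiencies.
# The replicate values below are HYPOTHETICAL; the text reports only the summary
# 0.625 +/- 0.125 colonies/ug DNA for two transformations.
import math
import statistics

def mean_and_sem(values):
    """Return (mean, SEM), where SEM = sample standard deviation / sqrt(n)."""
    if len(values) < 2:
        raise ValueError("at least two replicates are needed for a standard error")
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem

replicates = [0.5, 0.75]  # hypothetical colonies per ug of DNA
mean, sem = mean_and_sem(replicates)
print(f"{mean:.3f} ± {sem:.3f} colonies/µg DNA")
```

With only two replicates the standard error equals half the difference between the two values, so a summary of this kind says little about the underlying variability.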
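In line with these results, the part of plasmid pNF2 containing its origin of replication (SEQ ID NO: 118) was successfully reused to create a new set of shuttle vectors (SEQ ID NO: 119, 123, 124 and 125) that can be modified as required, allowing in particular their replication in E. coli as well as their reintroduction into C. beijerinckii DSM 6423. These new vectors have advantageous transformation efficiencies for genetic editing, for example in C. beijerinckii DSM 6423 and its derivatives, in particular using a CRISPR/Cas9 tool comprising two different nucleic acids.
These new vectors were also successfully tested in another C. beijerinckii strain (NCIMB 8052) and in another Clostridium species (in particular C. acetobutylicum), which demonstrates their applicability in other organisms of the phylum Firmicutes. Tests are also carried out in Bacillus.
Conclusion
These results show that removal of the native pNF2 plasmid significantly increases the transformation frequency of the bacteria that contained it (by a factor of about 15 for pFW01 and about 2000 for pEC750C). This result is of particular interest for bacteria of the genus Clostridium, which are known to be difficult to transform, and especially for strain C. beijerinckii DSM 6423, which naturally suffers from a low transformation efficiency (fewer than 5 colonies/μg of plasmid).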
REFERENCES
SEQUENCE LISTING
<110> IFP Energies nouvelles
<120> Optimized genetic tool for modifying clostridium bacteria
<130> IP20193667FR
<160> 134
<170> KopatentIn 2.0
<210> 1
<211> 50
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer deltacatB-fwd
<400> 1
tgttatggat tataagcggc tcgaggacgt caaaccatgt taatcattgc 50
<210> 2
<211> 50
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer deltacatB-rev
<400> 2
aatctatcac tgatagggac tcgagcaatt tcaccaaaga attcgctagc 50
<210> 3
<211> 28
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer RH076
<400> 3
catataataa aaggaaacct cttgatcg 28
<210> 4
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer RH077
<400> 4
attgccagcc taacacttgg 20
<210> 5
<211> 50
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer RH001
<400> 5
atctccatgg acgcgtgacg tcgacataag gtaccaggaa ttagagcagc 50
<210> 6
<211> 43
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer RH002
<400> 6
tctatctcca gctctagacc attattattc ctccaagttt gct 43
<210> 7
<211> 40
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer RH003
<400> 7
ataatggtct agagctggag atagattatt tggtactaag 40
<210> 8
<211> 50
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer RH004
<400> 8
tatgaccatg attacgaatt cgagctcgaa gcgcttatta ttgcattagc 50
<210> 9
<211> 22
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer pEX-fwd
<400> 9
cagattgtac tgagagtgca cc 22
<210> 10
<211> 23
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer pEX-rev
<400> 10
gtgagcggat aacaatttca cac 23
<210> 11
<211> 32
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer pEC750C-fwd
<400> 11
caatattcca caatattata ttataagcta gc 32
<210> 12
<211> 17
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer M13-rev
<400> 12
caggaaacag ctatgac 17
<210> 13
<211> 22
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer RH010
<400> 13
cggatattgc attaccagta gc 22
<210> 14
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer RH011
<400> 14
ttatcaatct cttacacatg gagc 24
<210> 15
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer RH025
<400> 15
tagtatgccg ccattattac gaca 24
<210> 16
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer RH134
<400> 16
gtcgacgtgg aattgtgagc 20
<210> 17
<211> 3658
<212> DNA
<213> Artificial Sequence
<220>
<223> pEX-A258-deltacatB
<400> 17
ctcgagctgc agcaaaaaaa gcaccgactc ggtgccactt tttcaagttg ataacggact 60
agccttattt taacttgcta tttctagctc taaaactgtg gtctctcttt tcgttgatgg 120
tggaatgata agggtttgca ccttaatttc tcctattgag aaaatcgtct cttctcagac 180
gtcaaaccat gttaatcatt gcttttatca aaaataggat ccactctatc attgatagag 240
tttgaaactc tatcattgat agagtataat atctttgttc atgtacatca tgctatctgt 300
gagttttaga gctagaaata gcaagttaaa ataaggctag tccgttatca acttgaaaaa 360
gtggcaccga gtcggtgctt tttttgaagc ttgtctttac acttttgccc attaattttt 420
gagttcctta tttttaggga gcttttatta tttttatcat gaaaatttca taaaatactc 480
ataaactaag gatgtcttca taatcagatt agtactccat tttcaatcca tttaatctgg 540
gaatatgata ttttaattac gtattattta agatatatta acgtgtaata taataccccg 600
caaatattaa ttatcacata catatccccc ctttattggg gcattttttg tacccattat 660
tttagtattg tgcagtactt aaataaaaaa atgccgcaaa ttcattttta ttgaataatg 720
cggtatttct tctattcttt atttttatta ctctataaat aatgtaatca agacatgact 780
atctaaatat atgatatctt aattcataat tcgggcctcc taaaaatttt cgtaattcta 840
ttttagaagg cttttttccg tgacctagcc atttcaatct cctttttaca atgatattta 900
cgctttagtt tattatagca cattctgtaa taccgaacta ttcaattttc agagaccatt 960
ttttattgat tcataactta agaatactac gaattactct aatattttac tttttcttat 1020
ctcttgttat tttaacatcg gaattactac taatattaat ttttattttt ccatccgcat 1080
ttgctccaac atttttttaa ctatactttc cttttgttaa taaattatgt tattgttgaa 1140
caatataaga aaagtgcgta acatttttta ttaaaaataa ttaggtattt ctatctgtgg 1200
ggtaccctcg aggtggcagc tctagagcta gcgaattctt tggtgaaatt gttatccgct 1260
cacaattcca cacaacatac gagccggaag cataaagtgt aaagcctggg gtgcctaatg 1320
agtgagctaa ctcacattaa ttgcgttgcg ctcactgccc gctttccagt cgggaaacct 1380
gtcgtgccag ctgcattaat gaatcggcca acgcgcgggg agaggcggtt tgcgtattgg 1440
gcgctcttcc gcttcctcgc tcactgactc gctgcgctcg gtcgttcggc tgcggcgagc 1500
ggtatcagct cactcaaagg cggtaatacg gttatccaca gaatcagggg ataacgcagg 1560
aaagaacatg tgagcaaaag gccagcaaaa ggccaggaac cgtaaaaagg ccgcgttgct 1620
ggcgtttttc cataggctcc gcccccctga cgagcatcac aaaaatcgac gctcaagtca 1680
gaggtggcga aacccgacag gactataaag ataccaggcg tttccccctg gaagctccct 1740
cgtgcgctct cctgttccga ccctgccgct taccggatac ctgtccgcct ttctcccttc 1800
gggaagcgtg gcgctttctc atagctcacg ctgtaggtat ctcagttcgg tgtaggtcgt 1860
tcgctccaag ctgggctgtg tgcacgaacc ccccgttcag cccgaccgct gcgccttatc 1920
cggtaactat cgtcttgagt ccaacccggt aagacacgac ttatcgccac tggcagcagc 1980
cactggtaac aggattagca gagcgaggta tgtaggcggt gctacagagt tcttgaagtg 2040
gtggcctaac tacggctaca ctagaagaac agtatttggt atctgcgctc tgctgaagcc 2100
agttaccttc ggaaaaagag ttggtagctc ttgatccggc aaacaaacca ccgctggtag 2160
cggtggtttt tttgtttgca agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga 2220
tcctttgatc ttttctacgg ggtctgacgc tcagtggaac gaaaactcac gttaagggat 2280
tttggtcatg agattatcaa aaaggatctt cacctagatc cttttaaatt aaaaatgaag 2340
ttttaaatca atctaaagta tatatgagta aacttggtct gacagttacc aatgcttaat 2400
cagtgaggca cctatctcag cgatctgtct atttcgttca tccatagttg cctgactccc 2460
cgtcgtgtag ataactacga tacgggaggg cttaccatct ggccccagtg ctgcaatgat 2520
accgcgcgaa ccacgctcac cggctccaga tttatcagca ataaaccagc cagccggaag 2580
ggccgagcgc agaagtggtc ctgcaacttt atccgcctcc atccagtcta ttaattgttg 2640
ccgggaagct agagtaagta gttcgccagt taatagtttg cgcaacgttg ttgccattgc 2700
tacaggcatc gtggtgtcac gctcgtcgtt tggtatggct tcattcagct ccggttccca 2760
acgatcaagg cgagttacat gatcccccat gttgtgcaaa aaagcggtta gctccttcgg 2820
tcctccgatc gttgtcagaa gtaagttggc cgcagtgtta tcactcatgg ttatggcagc 2880
actgcataat tctcttactg tcatgccatc cgtaagatgc ttttctgtga ctggtgagta 2940
ctcaaccaag tcattctgag aatagtgtat gcggcgaccg agttgctctt gcccggcgtc 3000
aatacgggat aataccgcgc cacatagcag aactttaaaa gtgctcatca ttggaaaacg 3060
ttcttcgggg cgaaaactct caaggatctt accgctgttg agatccagtt cgatgtaacc 3120
cactcgtgca cccaactgat cttcagcatc ttttactttc accagcgttt ctgggtgagc 3180
aaaaacagga aggcaaaatg ccgcaaaaaa gggaataagg gcgacacgga aatgttgaat 3240
actcatactc ttcctttttc aatattattg aagcatttat cagggttatt gtctcatgag 3300
cggatacata tttgaatgta tttagaaaaa taaacaaata ggggttccgc gcacatttcc 3360
ccgaaaagtg ccacctgacg tctaagaaac cattattatc atgacattaa cctataaaaa 3420
taggcgtatc acgaggccct ttcgtctcgc gcgtttcggt gatgacggtg aaaacctctg 3480
acacatgcag ctcccggaga cggtcacagc ttgtctgtaa gcggatgccg ggagcagaca 3540
agcccgtcag ggcgcgtcag cgggtgttgg cgggtgtcgg ggctggctta actatgcggc 3600
atcagagcag attgtactga gagtttggca attggtcgac ctcgagggcg cgcccgta 3658
<210> 18
<211> 660
<212> DNA
<213> Clostridium beijerinckii
<400> 18
atgaatttta atttgataga tattaatcat tggagtagaa agccatactt tgaacattat 60
ttaaacaatg tgaaatgtac ttatagtatg actgccaata tagaaataac tgatttattg 120
tatgaaatta aacttaaaaa tattaaattt tatcctaccc ttatttatat gattgcaact 180
gtggttaata agcataaaga attccgtatt tgttttgatc atgaaggtag tttaggatat 240
tgggatagca tgaatccaag ctatactatt tttcataaag aaaacgaaac attttcaagt 300
atttggacgg aatataacaa aagtttttta cgtttttata gtgattatct tgacgatata 360
aaaaactatg gaaatatcat gaagtttact ccgaaatcaa atgaacctga caatacattt 420
tctgtatcaa gcattccttg ggtgagtttt acaggattta acttgaatgt gtataatgaa 480
ggaacatatt taattcctat ttttactgca ggaaagtatt tcaaacaaga aaataaaata 540
tttattccta tatcaataca agtacatcat gctatctgtg acggttatca tgctagtaga 600
tttattaatg aaatgcaaga attagcattt agttttcaag aatggttaga aaataaataa 660
<210> 19
<211> 160
<212> DNA
<213> Artificial Sequence
<220>
<223> gRNA expression cassette
<400> 19
actctatcat tgatagagtt tgaaactcta tcattgatag agtataatat ctttgttcat 60
gtacatcatg ctatctgtga gttttagagc tagaaatagc aagttaaaat aaggctagtc 120
cgttatcaac ttgaaaaagt ggcaccgagt cggtgctttt 160
<210> 20
<211> 808
<212> DNA
<213> Artificial Sequence
<220>
<223> Editing template
<400> 20
gtctttacac ttttgcccat taatttttga gttccttatt tttagggagc ttttattatt 60
tttatcatga aaatttcata aaatactcat aaactaagga tgtcttcata atcagattag 120
tactccattt tcaatccatt taatctggga atatgatatt ttaattacgt attatttaag 180
atatattaac gtgtaatata ataccccgca aatattaatt atcacataca tatcccccct 240
ttattggggc attttttgta cccattattt tagtattgtg cagtacttaa ataaaaaaat 300
gccgcaaatt catttttatt gaataatgcg gtatttcttc tattctttat ttttattact 360
ctataaataa tgtaatcaag acatgactat ctaaatatat gatatcttaa ttcataattc 420
gggcctccta aaaattttcg taattctatt ttagaaggct tttttccgtg acctagccat 480
ttcaatctcc tttttacaat gatatttacg ctttagttta ttatagcaca ttctgtaata 540
ccgaactatt caattttcag agaccatttt ttattgattc ataacttaag aatactacga 600
attactctaa tattttactt tttcttatct cttgttattt taacatcgga attactacta 660
atattaattt ttatttttcc atccgcattt gctccaacat ttttttaact atactttcct 720
tttgttaata aattatgtta ttgttgaaca atataagaaa agtgcgtaac attttttatt 780
aaaaataatt aggtatttct atctgtgg 808
<210> 21
<211> 9954
<212> DNA
<213> Artificial Sequence
<220>
<223> pCas9ind-deltacatB
<400> 21
catggataaa aagtacagta ttggtctaga cataggaact aactctgttg ggtgggctgt 60
tataacagat gaatataaag ttccatcaaa aaaatttaaa gtattaggaa acactgatag 120
acattcaata aaaaaaaact tgataggtgc tttattattc gattcaggag agactgctga 180
agctacacgt ttaaaaagaa cagctagacg tagatataca agaagaaaaa ataggatatg 240
ttatcttcaa gaaattttta gtaatgaaat ggcaaaagtt gatgattcat tctttcacag 300
actagaagaa agtttcttag ttgaagaaga taagaagcat gaaagacacc ctatttttgg 360
taatatcgta gatgaagtag catatcatga gaagtatcca actatctatc atttaagaaa 420
gaaattagtt gattctacag ataaagctga tctgagatta atatatttag ctttagctca 480
tatgattaaa tttagaggac attttttaat agaaggtgat ttaaacccag acaacagcga 540
tgtagataaa ttatttatcc aattagttca aacttataat caattattcg aagagaatcc 600
aattaatgca agtggtgtag acgctaaggc tatattatca gctagattat caaaatctag 660
aagattagaa aatctaatag ctcaacttcc tggagaaaag aaaaatggac tttttgggaa 720
cctaatagct ctctcactcg gactaacacc aaattttaaa agcaattttg atcttgctga 780
agacgcaaag ttacaactat caaaggatac atacgatgat gatttagata atttgttagc 840
tcaaataggt gatcaatatg ctgatttgtt tcttgcagca aaaaacttaa gtgatgcaat 900
tttactatca gatatactta gagtaaatac agaaataaca aaggctcctt tatcagcaag 960
tatgattaaa cgatatgatg agcatcatca agatttaaca ttattaaagg cacttgtaag 1020
acaacaatta ccagaaaaat ataaagaaat tttctttgat caatctaaaa atggatatgc 1080
tggatatata gacggtggag caagtcaaga agagttttat aaatttataa agcctatttt 1140
agaaaaaatg gatggaactg aagaattact tgttaaactt aacagagaag atttacttag 1200
aaaacaaaga acttttgata atggttcaat tcctcaccaa attcatttag gagaattaca 1260
tgctatacta agaagacaag aagattttta tccatttctt aaagataata gagaaaaaat 1320
tgaaaaaatt ttaactttta gaataccata ttatgtagga ccacttgcaa ggggaaattc 1380
aagatttgca tggatgacta gaaaatcaga agaaactata accccgtgga attttgaaga 1440
agtagtagat aaaggagcta gtgctcaatc atttatagaa agaatgacaa attttgataa 1500
gaatcttcct aacgaaaagg ttttgccaaa gcatagcctt ctttatgagt attttacagt 1560
ttataatgag cttactaaag taaaatacgt tacagaagga atgagaaaac cagcattttt 1620
gtctggtgaa caaaagaaag caatagtaga cctattattt aaaacaaata ggaaggttac 1680
cgtaaagcaa cttaaagaag attacttcaa aaaaattgaa tgctttgata gtgttgaaat 1740
atcaggagtt gaagatagat ttaatgcttc acttggtaca tatcacgatc tcttaaaaat 1800
tataaaagat aaggattttt tagataatga agaaaatgaa gatattcttg aagatatagt 1860
attaacattg acactttttg aagatagaga aatgatagaa gaaagattaa aaacatatgc 1920
acatcttttt gatgataagg ttatgaagca acttaaaaga agaagatata caggttgggg 1980
acgtttgtca agaaagctaa ttaatggtat tagagataaa caatcaggaa agactattct 2040
cgattttctt aaatcagatg gatttgctaa tagaaacttt atgcaattaa ttcatgatga 2100
ttctcttact ttcaaagagg atattcaaaa ggctcaagtt tctggacaag gcgatagctt 2160
acacgaacac attgctaacc ttgcagggag ccccgctatc aaaaaaggaa ttttacaaac 2220
agttaaagtt gtagatgaac ttgttaaagt tatgggaaga cacaaacctg agaatatagt 2280
tatagaaatg gccagagaaa atcaaacaac acaaaaagga caaaaaaatt ctagagagag 2340
aatgaagaga attgaagaag gaataaaaga gctaggatca caaatattaa aagaacatcc 2400
agttgaaaat actcaattgc aaaatgaaaa gttatatttg tattacttac aaaatggaag 2460
agatatgtat gttgatcaag aactcgatat taatagatta agtgactatg atgttgatca 2520
tattgttcct caatcatttt taaaagatga ttcaatcgat aacaaagtat taactagatc 2580
agataaaaat agaggaaagt cagataatgt accatctgaa gaagttgtta aaaaaatgaa 2640
gaactattgg agacaacttt taaatgcaaa gctaattaca caaagaaaat ttgacaattt 2700
aacaaaagca gaaagaggag gattaagcga attagacaaa gctggattta taaaaagaca 2760
acttgttgag acaagacaaa taactaagca tgttgctcaa atacttgatt caagaatgaa 2820
tacaaaatat gatgaaaatg ataaattaat cagagaagta aaagtaataa cattaaagtc 2880
aaaattagta tcagatttca gaaaggattt tcaattttac aaagttcgtg aaataaataa 2940
ctatcatcat gctcatgatg catacttaaa tgctgttgta ggaactgctc ttattaagaa 3000
atatcctaaa ctagaaagcg aatttgttta tggagattat aaagtttatg atgtgcgcaa 3060
aatgatcgcg aaatccgaac aagaaatcgg taaggctaca gcaaaatatt tcttttatag 3120
taatataatg aattttttta agacagaaat aactttggct aatggtgaaa tcagaaaaag 3180
accacttatc gaaacaaatg gagagacagg agaaatagta tgggataaag gaagagattt 3240
tgctactgtt agaaaagtac taagtatgcc acaagtaaat atcgtaaaga aaactgaagt 3300
tcaaactgga ggtttctcta aggaatcaat tttacctaag agaaattcag ataagttaat 3360
tgcaaggaaa aaagattggg acccaaaaaa atacggtggt tttgatagtc caacagttgc 3420
ctatagtgtt cttgtagtag cgaaagttga gaaaggtaag tcaaaaaagt tgaaaagcgt 3480
aaaagaactt cttggtatca caattatgga aagatcttca tttgaaaaaa atccaattga 3540
ctttttagaa gctaagggtt ataaagaagt taaaaaggat ttaatcataa aactaccaaa 3600
gtatagtcta tttgaactcg aaaacggaag aaaacgaatg ctcgctagcg caggagaact 3660
tcaaaaagga aatgaacttg cgctgccatc aaagtatgta aatttcttat atttagcttc 3720
tcattatgag aaattaaaag gatcaccaga ggataatgaa caaaagcaac tatttgtaga 3780
acaacacaaa cattatttag atgaaataat agaacaaata tctgaatttt ctaaaagagt 3840
tatacttgcc gacgcaaatc tagataaggt gctttcagcg tataataaac acagagataa 3900
accaataaga gaacaagcag aaaacattat ccatcttttt acattaacta atcttggtgc 3960
accagctgca tttaagtact ttgatacaac aatagataga aaaagataca catctactaa 4020
agaagtatta gacgcaactt taatacatca atctattaca gggctttatg aaacaagaat 4080
tgatttaagt caactaggcg gagattaagt cgacaaagta ttgttaaaaa taactctgta 4140
gaattataaa ttagttctac agagttattt tttgacccgg gtatattgat aaaaataata 4200
atagtgggta taattaagtt gttaggaggt tagttagaat gatgtcaaga ttagataaaa 4260
gtaaagtgat taacagcgca ttagagctgc ttaatgaggt cggaatcgaa ggtttaacaa 4320
cccgtaaact cgcccagaag ctaggtgtag agcagcctac attgtattgg catgtaaaaa 4380
ataagcgggc tttgctcgac gccttagcca ttgagatgtt agataggcac catactcact 4440
tttgcccttt agaaggggaa agctggcaag attttttacg taataacgct aaaagtttta 4500
gatgtgcttt actaagtcat cgcgatggag caaaagtaca tttaggtaca cggcctacag 4560
aaaaacagta tgaaactctc gaaaatcaat tagccttttt atgccaacaa ggtttttcac 4620
tagagaatgc attatatgca ctcagcgctg tggggcattt tactttaggt tgcgtattgg 4680
aagatcaaga gcatcaagtc gctaaagaag aaagggaaac acctactact gatagtatgc 4740
cgccattatt acgacaagct atcgaattat ttgatcacca aggtgcagag ccagccttct 4800
tattcggcct tgaattgatc atatgcggat tagaaaaaca acttaaatgt gaaagtgggt 4860
cttaaaagca gcataacctt tttccgtgat ggtaacttca cggtaaccaa gatgtcgagt 4920
tgagctcgaa ttcgtaatca tggtcatagc tgtttcctgt gtgaaattgt tatccgctca 4980
caattccaca caacatacga gccggaagca taaagtgtaa agcctggggt gcctaatgag 5040
tgagctaact cacattaatt gcgttgcgct cactgcccgc tttccagtcg ggaaacctgt 5100
cgtgccagct gcattaatga atcggccaac gcgcggggag aggcggtttg cgtattgggc 5160
gctcttccgc ttcctcgctc actgactcgc tgcgctcggt cgttcggctg cggcgagcgg 5220
tatcagctca ctcaaaggcg gtaatacggt tatccacaga atcaggggat aacgcaggaa 5280
agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg 5340
cgtttttcca taggctccgc ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga 5400
ggtggcgaaa cccgacagga ctataaagat accaggcgtt tccccctgga agctccctcg 5460
tgcgctctcc tgttccgacc ctgccgctta ccggatacct gtccgccttt ctcccttcgg 5520
gaagcgtggc gctttctcat agctcacgct gtaggtatct cagttcggtg taggtcgttc 5580
gctccaagct gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc gccttatccg 5640
gtaactatcg tcttgagtcc aacccggtaa gacacgactt atcgccactg gcagcagcca 5700
ctggtaacag gattagcaga gcgaggtatg taggcggtgc tacagagttc ttgaagtggt 5760
ggcctaacta cggctacact agaaggacag tatttggtat ctgcgctctg ctgaagccag 5820
ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg 5880
gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc 5940
ctttgatctt ttctacgggg tctgacgctc agtggaacga aaactcacgt taagggattt 6000
tggtcatgag attatcaaaa aggatcttca cctagatcct tttaaattaa aaatgaagtt 6060
ttaaatcaat ctaaagtata tatgagtaaa cttggtctga cagttaccag gtccactgcc 6120
gggcctcttg cgggatcaaa agaaaaacga aatgatacac caatcagtgc aaaaaaagat 6180
ataatgggag ataagacggt tcgtgttcgt gctgacttgc accatatcat aaaaatcgaa 6240
acagcaaaga atggcggaaa cgtaaaagaa gttatggaaa taagacttag aagcaaactt 6300
aagagtgtgt tgatagtgca gtatcttaaa attttgtata ataggaattg aagttaaatt 6360
agatgctaaa aatttgtaat taagaaggag tgattacatg aacaaaaata taaaatattc 6420
tcaaaacttt ttaacgagtg aaaaagtact caaccaaata ataaaacaat tgaatttaaa 6480
agaaaccgat accgtttacg aaattggaac aggtaaaggg catttaacga cgaaactggc 6540
taaaataagt aaacaggtaa cgtctattga attagacagt catctattca acttatcgtc 6600
agaaaaatta aaactgaata ctcgtgtcac tttaattcac caagatattc tacagtttca 6660
attccctaac aaacagaggt ataaaattgt tgggagtatt ccttaccatt taagcacaca 6720
aattattaaa aaagtggttt ttgaaagcca tgcgtctgac atctatctga ttgttgaaga 6780
aggattctac aagcgtacct tggatattca ccgaacacta gggttgctct tgcacactca 6840
agtctcgatt cagcaattgc ttaagctgcc agcggaatgc tttcatccta aaccaaaagt 6900
aaacagtgtc ttaataaaac ttacccgcca taccacagat gttccagata aatattggaa 6960
gctatatacg tactttgttt caaaatgggt caatcgagaa tatcgtcaac tgtttactaa 7020
aaatcagttt catcaagcaa tgaaacacgc caaagtaaac aatttaagta ccgttactta 7080
tgagcaagta ttgtctattt ttaatagtta tctattattt aacgggagga aataattcta 7140
tgagtcccta ggcaggcctc cgccattatt tttttgaaca attgacaatt catttcttat 7200
tttttattaa gtgatagtca aaaggcataa cagtgctgaa tagaaagaaa tttacagaaa 7260
agaaaattat agaatttagt atgattaatt atactcattt atgaatgttt aattgaatac 7320
aaaaaaaaat acttgttatg tattcaatta cgggttaaaa tatagacaag ttgaaaaatt 7380
taataaaaaa ataagtcctc agctcttata tattaagcta ccaacttagt atataagcca 7440
aaacttaaat gtgctaccaa cacatcaagc cgttagagaa ctctatctat agcaatattt 7500
caaatgtacc gacatacaag agaaacatta actatatata ttcaatttat gagattatct 7560
taacagatat aaatgtaaat tgcaataagt aagatttaga agtttatagc ctttgtgtat 7620
tggaagcagt acgcaaaggc ttttttattt gataaaaatt agaagtatat ttattttttc 7680
ataattaatt tatgaaaatg aaagggggtg agcaaagtga cagaggaaag cagtatctta 7740
tcaaataaca aggtattagc aatatcatta ttgactttag cagtaaacat tatgactttt 7800
atagtgcttg tagctaagta gtacgaaagg gggagcttta aaaagctcct tggaatacat 7860
agaattcata aattaattta tgaaaagaag ggcgtatatg aaaacttgta aaaattgcaa 7920
agagtttatt aaagatactg aaatatgcaa aatacattcg ttgatgattc atgataaaac 7980
agtagcaacc tattgcagta aatacaatga gtcaagatgt ttacataaag ggaaagtcca 8040
atgtattaat tgttcaaaga tgaaccgata tggatggtgt gccataaaaa tgagatgttt 8100
tacagaggaa gaacagaaaa aagaacgtac atgcattaaa tattatgcaa ggagctttaa 8160
aaaagctcat gtaaagaaga gtaaaaagaa aaaataattt atttattaat ttaatattga 8220
gagtgccgac acagtatgca ctaaaaaata tatctgtggt gtagtgagcc gatacaaaag 8280
gatagtcact cgcattttca taatacatct tatgttatga ttatgtgtcg gtgggacttc 8340
acgacgaaaa cccacaataa aaaaagagtt cggggtaggg ttaagcatag ttgaggcaac 8400
taaacaatca agctaggata tgcagtagca gaccgtaagg tcgttgttta ggtgtgttgt 8460
aatacatacg ctattaagat gtaaaaatac ggataccaat gaagggaaaa gtataatttt 8520
tggatgtagt ttgtttgttc atctatgggc aaactacgtc caaagccgtt tccaaatctg 8580
ctaaaaagta tatcctttct aaaatcaaag tcaagtatga aatcataaat aaagtttaat 8640
tttgaagtta ttatgatatt atgtttttct attaaaataa attaagtata tagaatagtt 8700
taataatagt atatacttaa tgtgataagt gtctgacagt gtcacagaaa ggatgattgt 8760
tatggattat aagcggctcg aggacgtcaa accatgttaa tcattgcttt tatcaaaaat 8820
aggatccact ctatcattga tagagtttga aactctatca ttgatagagt ataatatctt 8880
tgttcatgta catcatgcta tctgtgagtt ttagagctag aaatagcaag ttaaaataag 8940
gctagtccgt tatcaacttg aaaaagtggc accgagtcgg tgcttttttt gaagcttgtc 9000
tttacacttt tgcccattaa tttttgagtt ccttattttt agggagcttt tattattttt 9060
atcatgaaaa tttcataaaa tactcataaa ctaaggatgt cttcataatc agattagtac 9120
tccattttca atccatttaa tctgggaata tgatatttta attacgtatt atttaagata 9180
tattaacgtg taatataata ccccgcaaat attaattatc acatacatat ccccccttta 9240
ttggggcatt ttttgtaccc attattttag tattgtgcag tacttaaata aaaaaatgcc 9300
gcaaattcat ttttattgaa taatgcggta tttcttctat tctttatttt tattactcta 9360
taaataatgt aatcaagaca tgactatcta aatatatgat atcttaattc ataattcggg 9420
cctcctaaaa attttcgtaa ttctatttta gaaggctttt ttccgtgacc tagccatttc 9480
aatctccttt ttacaatgat atttacgctt tagtttatta tagcacattc tgtaataccg 9540
aactattcaa ttttcagaga ccatttttta ttgattcata acttaagaat actacgaatt 9600
actctaatat tttacttttt cttatctctt gttattttaa catcggaatt actactaata 9660
ttaattttta tttttccatc cgcatttgct ccaacatttt tttaactata ctttcctttt 9720
gttaataaat tatgttattg ttgaacaata taagaaaagt gcgtaacatt ttttattaaa 9780
aataattagg tatttctatc tgtggggtac cctcgaggtg gcagctctag agctagcgaa 9840
ttctttggtg aaattgctcg agtccctatc agtgatagat tgaaactcta tcattgatag 9900
agtataatat ctttgttcat tagagcgata aacttgaatt tgagagggaa cttc 9954
<210> 22
<211> 8874
<212> DNA
<213> Artificial Sequence
<220>
<223> pCas9ind
<400> 22
catggataaa aagtacagta ttggtctaga cataggaact aactctgttg ggtgggctgt 60
tataacagat gaatataaag ttccatcaaa aaaatttaaa gtattaggaa acactgatag 120
acattcaata aaaaaaaact tgataggtgc tttattattc gattcaggag agactgctga 180
agctacacgt ttaaaaagaa cagctagacg tagatataca agaagaaaaa ataggatatg 240
ttatcttcaa gaaattttta gtaatgaaat ggcaaaagtt gatgattcat tctttcacag 300
actagaagaa agtttcttag ttgaagaaga taagaagcat gaaagacacc ctatttttgg 360
taatatcgta gatgaagtag catatcatga gaagtatcca actatctatc atttaagaaa 420
gaaattagtt gattctacag ataaagctga tctgagatta atatatttag ctttagctca 480
tatgattaaa tttagaggac attttttaat agaaggtgat ttaaacccag acaacagcga 540
tgtagataaa ttatttatcc aattagttca aacttataat caattattcg aagagaatcc 600
aattaatgca agtggtgtag acgctaaggc tatattatca gctagattat caaaatctag 660
aagattagaa aatctaatag ctcaacttcc tggagaaaag aaaaatggac tttttgggaa 720
cctaatagct ctctcactcg gactaacacc aaattttaaa agcaattttg atcttgctga 780
agacgcaaag ttacaactat caaaggatac atacgatgat gatttagata atttgttagc 840
tcaaataggt gatcaatatg ctgatttgtt tcttgcagca aaaaacttaa gtgatgcaat 900
tttactatca gatatactta gagtaaatac agaaataaca aaggctcctt tatcagcaag 960
tatgattaaa cgatatgatg agcatcatca agatttaaca ttattaaagg cacttgtaag 1020
acaacaatta ccagaaaaat ataaagaaat tttctttgat caatctaaaa atggatatgc 1080
tggatatata gacggtggag caagtcaaga agagttttat aaatttataa agcctatttt 1140
agaaaaaatg gatggaactg aagaattact tgttaaactt aacagagaag atttacttag 1200
aaaacaaaga acttttgata atggttcaat tcctcaccaa attcatttag gagaattaca 1260
tgctatacta agaagacaag aagattttta tccatttctt aaagataata gagaaaaaat 1320
tgaaaaaatt ttaactttta gaataccata ttatgtagga ccacttgcaa ggggaaattc 1380
aagatttgca tggatgacta gaaaatcaga agaaactata accccgtgga attttgaaga 1440
agtagtagat aaaggagcta gtgctcaatc atttatagaa agaatgacaa attttgataa 1500
gaatcttcct aacgaaaagg ttttgccaaa gcatagcctt ctttatgagt attttacagt 1560
ttataatgag cttactaaag taaaatacgt tacagaagga atgagaaaac cagcattttt 1620
gtctggtgaa caaaagaaag caatagtaga cctattattt aaaacaaata ggaaggttac 1680
cgtaaagcaa cttaaagaag attacttcaa aaaaattgaa tgctttgata gtgttgaaat 1740
atcaggagtt gaagatagat ttaatgcttc acttggtaca tatcacgatc tcttaaaaat 1800
tataaaagat aaggattttt tagataatga agaaaatgaa gatattcttg aagatatagt 1860
attaacattg acactttttg aagatagaga aatgatagaa gaaagattaa aaacatatgc 1920
acatcttttt gatgataagg ttatgaagca acttaaaaga agaagatata caggttgggg 1980
acgtttgtca agaaagctaa ttaatggtat tagagataaa caatcaggaa agactattct 2040
cgattttctt aaatcagatg gatttgctaa tagaaacttt atgcaattaa ttcatgatga 2100
ttctcttact ttcaaagagg atattcaaaa ggctcaagtt tctggacaag gcgatagctt 2160
acacgaacac attgctaacc ttgcagggag ccccgctatc aaaaaaggaa ttttacaaac 2220
agttaaagtt gtagatgaac ttgttaaagt tatgggaaga cacaaacctg agaatatagt 2280
tatagaaatg gccagagaaa atcaaacaac acaaaaagga caaaaaaatt ctagagagag 2340
aatgaagaga attgaagaag gaataaaaga gctaggatca caaatattaa aagaacatcc 2400
agttgaaaat actcaattgc aaaatgaaaa gttatatttg tattacttac aaaatggaag 2460
agatatgtat gttgatcaag aactcgatat taatagatta agtgactatg atgttgatca 2520
tattgttcct caatcatttt taaaagatga ttcaatcgat aacaaagtat taactagatc 2580
agataaaaat agaggaaagt cagataatgt accatctgaa gaagttgtta aaaaaatgaa 2640
gaactattgg agacaacttt taaatgcaaa gctaattaca caaagaaaat ttgacaattt 2700
aacaaaagca gaaagaggag gattaagcga attagacaaa gctggattta taaaaagaca 2760
acttgttgag acaagacaaa taactaagca tgttgctcaa atacttgatt caagaatgaa 2820
tacaaaatat gatgaaaatg ataaattaat cagagaagta aaagtaataa cattaaagtc 2880
aaaattagta tcagatttca gaaaggattt tcaattttac aaagttcgtg aaataaataa 2940
ctatcatcat gctcatgatg catacttaaa tgctgttgta ggaactgctc ttattaagaa 3000
atatcctaaa ctagaaagcg aatttgttta tggagattat aaagtttatg atgtgcgcaa 3060
aatgatcgcg aaatccgaac aagaaatcgg taaggctaca gcaaaatatt tcttttatag 3120
taatataatg aattttttta agacagaaat aactttggct aatggtgaaa tcagaaaaag 3180
accacttatc gaaacaaatg gagagacagg agaaatagta tgggataaag gaagagattt 3240
tgctactgtt agaaaagtac taagtatgcc acaagtaaat atcgtaaaga aaactgaagt 3300
tcaaactgga ggtttctcta aggaatcaat tttacctaag agaaattcag ataagttaat 3360
tgcaaggaaa aaagattggg acccaaaaaa atacggtggt tttgatagtc caacagttgc 3420
ctatagtgtt cttgtagtag cgaaagttga gaaaggtaag tcaaaaaagt tgaaaagcgt 3480
aaaagaactt cttggtatca caattatgga aagatcttca tttgaaaaaa atccaattga 3540
ctttttagaa gctaagggtt ataaagaagt taaaaaggat ttaatcataa aactaccaaa 3600
gtatagtcta tttgaactcg aaaacggaag aaaacgaatg ctcgctagcg caggagaact 3660
tcaaaaagga aatgaacttg cgctgccatc aaagtatgta aatttcttat atttagcttc 3720
tcattatgag aaattaaaag gatcaccaga ggataatgaa caaaagcaac tatttgtaga 3780
acaacacaaa cattatttag atgaaataat agaacaaata tctgaatttt ctaaaagagt 3840
tatacttgcc gacgcaaatc tagataaggt gctttcagcg tataataaac acagagataa 3900
accaataaga gaacaagcag aaaacattat ccatcttttt acattaacta atcttggtgc 3960
accagctgca tttaagtact ttgatacaac aatagataga aaaagataca catctactaa 4020
agaagtatta gacgcaactt taatacatca atctattaca gggctttatg aaacaagaat 4080
tgatttaagt caactaggcg gagattaagt cgacaaagta ttgttaaaaa taactctgta 4140
gaattataaa ttagttctac agagttattt tttgacccgg gtatattgat aaaaataata 4200
atagtgggta taattaagtt gttaggaggt tagttagaat gatgtcaaga ttagataaaa 4260
gtaaagtgat taacagcgca ttagagctgc ttaatgaggt cggaatcgaa ggtttaacaa 4320
cccgtaaact cgcccagaag ctaggtgtag agcagcctac attgtattgg catgtaaaaa 4380
ataagcgggc tttgctcgac gccttagcca ttgagatgtt agataggcac catactcact 4440
tttgcccttt agaaggggaa agctggcaag attttttacg taataacgct aaaagtttta 4500
gatgtgcttt actaagtcat cgcgatggag caaaagtaca tttaggtaca cggcctacag 4560
aaaaacagta tgaaactctc gaaaatcaat tagccttttt atgccaacaa ggtttttcac 4620
tagagaatgc attatatgca ctcagcgctg tggggcattt tactttaggt tgcgtattgg 4680
aagatcaaga gcatcaagtc gctaaagaag aaagggaaac acctactact gatagtatgc 4740
cgccattatt acgacaagct atcgaattat ttgatcacca aggtgcagag ccagccttct 4800
tattcggcct tgaattgatc atatgcggat tagaaaaaca acttaaatgt gaaagtgggt 4860
cttaaaagca gcataacctt tttccgtgat ggtaacttca cggtaaccaa gatgtcgagt 4920
tgagctcgaa ttcgtaatca tggtcatagc tgtttcctgt gtgaaattgt tatccgctca 4980
caattccaca caacatacga gccggaagca taaagtgtaa agcctggggt gcctaatgag 5040
tgagctaact cacattaatt gcgttgcgct cactgcccgc tttccagtcg ggaaacctgt 5100
cgtgccagct gcattaatga atcggccaac gcgcggggag aggcggtttg cgtattgggc 5160
gctcttccgc ttcctcgctc actgactcgc tgcgctcggt cgttcggctg cggcgagcgg 5220
tatcagctca ctcaaaggcg gtaatacggt tatccacaga atcaggggat aacgcaggaa 5280
agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg 5340
cgtttttcca taggctccgc ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga 5400
ggtggcgaaa cccgacagga ctataaagat accaggcgtt tccccctgga agctccctcg 5460
tgcgctctcc tgttccgacc ctgccgctta ccggatacct gtccgccttt ctcccttcgg 5520
gaagcgtggc gctttctcat agctcacgct gtaggtatct cagttcggtg taggtcgttc 5580
gctccaagct gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc gccttatccg 5640
gtaactatcg tcttgagtcc aacccggtaa gacacgactt atcgccactg gcagcagcca 5700
ctggtaacag gattagcaga gcgaggtatg taggcggtgc tacagagttc ttgaagtggt 5760
ggcctaacta cggctacact agaaggacag tatttggtat ctgcgctctg ctgaagccag 5820
ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg 5880
gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc 5940
ctttgatctt ttctacgggg tctgacgctc agtggaacga aaactcacgt taagggattt 6000
tggtcatgag attatcaaaa aggatcttca cctagatcct tttaaattaa aaatgaagtt 6060
ttaaatcaat ctaaagtata tatgagtaaa cttggtctga cagttaccag gtccactgcc 6120
gggcctcttg cgggatcaaa agaaaaacga aatgatacac caatcagtgc aaaaaaagat 6180
ataatgggag ataagacggt tcgtgttcgt gctgacttgc accatatcat aaaaatcgaa 6240
acagcaaaga atggcggaaa cgtaaaagaa gttatggaaa taagacttag aagcaaactt 6300
aagagtgtgt tgatagtgca gtatcttaaa attttgtata ataggaattg aagttaaatt 6360
agatgctaaa aatttgtaat taagaaggag tgattacatg aacaaaaata taaaatattc 6420
tcaaaacttt ttaacgagtg aaaaagtact caaccaaata ataaaacaat tgaatttaaa 6480
agaaaccgat accgtttacg aaattggaac aggtaaaggg catttaacga cgaaactggc 6540
taaaataagt aaacaggtaa cgtctattga attagacagt catctattca acttatcgtc 6600
agaaaaatta aaactgaata ctcgtgtcac tttaattcac caagatattc tacagtttca 6660
attccctaac aaacagaggt ataaaattgt tgggagtatt ccttaccatt taagcacaca 6720
aattattaaa aaagtggttt ttgaaagcca tgcgtctgac atctatctga ttgttgaaga 6780
aggattctac aagcgtacct tggatattca ccgaacacta gggttgctct tgcacactca 6840
agtctcgatt cagcaattgc ttaagctgcc agcggaatgc tttcatccta aaccaaaagt 6900
aaacagtgtc ttaataaaac ttacccgcca taccacagat gttccagata aatattggaa 6960
gctatatacg tactttgttt caaaatgggt caatcgagaa tatcgtcaac tgtttactaa 7020
aaatcagttt catcaagcaa tgaaacacgc caaagtaaac aatttaagta ccgttactta 7080
tgagcaagta ttgtctattt ttaatagtta tctattattt aacgggagga aataattcta 7140
tgagtcccta ggcaggcctc cgccattatt tttttgaaca attgacaatt catttcttat 7200
tttttattaa gtgatagtca aaaggcataa cagtgctgaa tagaaagaaa tttacagaaa 7260
agaaaattat agaatttagt atgattaatt atactcattt atgaatgttt aattgaatac 7320
aaaaaaaaat acttgttatg tattcaatta cgggttaaaa tatagacaag ttgaaaaatt 7380
taataaaaaa ataagtcctc agctcttata tattaagcta ccaacttagt atataagcca 7440
aaacttaaat gtgctaccaa cacatcaagc cgttagagaa ctctatctat agcaatattt 7500
caaatgtacc gacatacaag agaaacatta actatatata ttcaatttat gagattatct 7560
taacagatat aaatgtaaat tgcaataagt aagatttaga agtttatagc ctttgtgtat 7620
tggaagcagt acgcaaaggc ttttttattt gataaaaatt agaagtatat ttattttttc 7680
ataattaatt tatgaaaatg aaagggggtg agcaaagtga cagaggaaag cagtatctta 7740
tcaaataaca aggtattagc aatatcatta ttgactttag cagtaaacat tatgactttt 7800
atagtgcttg tagctaagta gtacgaaagg gggagcttta aaaagctcct tggaatacat 7860
agaattcata aattaattta tgaaaagaag ggcgtatatg aaaacttgta aaaattgcaa 7920
agagtttatt aaagatactg aaatatgcaa aatacattcg ttgatgattc atgataaaac 7980
agtagcaacc tattgcagta aatacaatga gtcaagatgt ttacataaag ggaaagtcca 8040
atgtattaat tgttcaaaga tgaaccgata tggatggtgt gccataaaaa tgagatgttt 8100
tacagaggaa gaacagaaaa aagaacgtac atgcattaaa tattatgcaa ggagctttaa 8160
aaaagctcat gtaaagaaga gtaaaaagaa aaaataattt atttattaat ttaatattga 8220
gagtgccgac acagtatgca ctaaaaaata tatctgtggt gtagtgagcc gatacaaaag 8280
gatagtcact cgcattttca taatacatct tatgttatga ttatgtgtcg gtgggacttc 8340
acgacgaaaa cccacaataa aaaaagagtt cggggtaggg ttaagcatag ttgaggcaac 8400
taaacaatca agctaggata tgcagtagca gaccgtaagg tcgttgttta ggtgtgttgt 8460
aatacatacg ctattaagat gtaaaaatac ggataccaat gaagggaaaa gtataatttt 8520
tggatgtagt ttgtttgttc atctatgggc aaactacgtc caaagccgtt tccaaatctg 8580
ctaaaaagta tatcctttct aaaatcaaag tcaagtatga aatcataaat aaagtttaat 8640
tttgaagtta ttatgatatt atgtttttct attaaaataa attaagtata tagaatagtt 8700
taataatagt atatacttaa tgtgataagt gtctgacagt gtcacagaaa ggatgattgt 8760
tatggattat aagcggctcg agtccctatc agtgatagat tgaaactcta tcattgatag 8820
agtataatat ctttgttcat tagagcgata aacttgaatt tgagagggaa cttc 8874
<210> 23
<211> 10534
<212> DNA
<213> Artificial Sequence
<220>
<223> pCas9acr
<400> 23
cgaattcgta atcatggtca tagctgtttc ctgtgtgaaa ttgttatccg ctcacaattc 60
cacacaacat acgagccgga agcataaagt gtaaagcctg gggtgcctaa tgagtgagct 120
aactcacatt aattgcgttg cgctcactgc ccgctttcca gtcgggaaac ctgtcgtgcc 180
agctgcatta atgaatcggc caacgcgcgg ggagaggcgg tttgcgtatt gggcgctctt 240
ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag 300
ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca ggaaagaaca 360
tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt 420
tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt cagaggtggc 480
gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc ctcgtgcgct 540
ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct tcgggaagcg 600
tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca 660
agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta tccggtaact 720
atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca gccactggta 780
acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta 840
actacggcta cactagaagg acagtatttg gtatctgcgc tctgctgaag ccagttacct 900
tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt 960
tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga 1020
tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg attttggtca 1080
tgagattatc aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga agttttaaat 1140
caatctaaag tatatatgag taaacttggt ctgacagtta ccaggtccac tgccgggcct 1200
cttgcgggat caaaagaaaa acgaaatgat acaccaatca gtgcaaaaaa agatataatg 1260
ggagataaga cggttcgtgt tcgtgctgac ttgcaccata tcataaaaat cgaaacagca 1320
aagaatggcg gaaacgtaaa agaagttatg gaaataagac ttagaagcaa acttaagagt 1380
gtgttgatag tgcagtatct taaaattttg tataatagga attgaagtta aattagatgc 1440
taaaaatttg taattaagaa ggagtgatta catgaacaaa aatataaaat attctcaaaa 1500
ctttttaacg agtgaaaaag tactcaacca aataataaaa caattgaatt taaaagaaac 1560
cgataccgtt tacgaaattg gaacaggtaa agggcattta acgacgaaac tggctaaaat 1620
aagtaaacag gtaacgtcta ttgaattaga cagtcatcta ttcaacttat cgtcagaaaa 1680
attaaaactg aatactcgtg tcactttaat tcaccaagat attctacagt ttcaattccc 1740
taacaaacag aggtataaaa ttgttgggag tattccttac catttaagca cacaaattat 1800
taaaaaagtg gtttttgaaa gccatgcgtc tgacatctat ctgattgttg aagaaggatt 1860
ctacaagcgt accttggata ttcaccgaac actagggttg ctcttgcaca ctcaagtctc 1920
gattcagcaa ttgcttaagc tgccagcgga atgctttcat cctaaaccaa aagtaaacag 1980
tgtcttaata aaacttaccc gccataccac agatgttcca gataaatatt ggaagctata 2040
tacgtacttt gtttcaaaat gggtcaatcg agaatatcgt caactgttta ctaaaaatca 2100
gtttcatcaa gcaatgaaac acgccaaagt aaacaattta agtaccgtta cttatgagca 2160
agtattgtct atttttaata gttatctatt atttaacggg aggaaataat tctatgagtc 2220
cctaggcagg cctccgccat tatttttttg aacaattgac aattcatttc ttatttttta 2280
ttaagtgata gtcaaaaggc ataacagtgc tgaatagaaa gaaatttaca gaaaagaaaa 2340
ttatagaatt tagtatgatt aattatactc atttatgaat gtttaattga atacaaaaaa 2400
aaatacttgt tatgtattca attacgggtt aaaatataga caagttgaaa aatttaataa 2460
aaaaataagt cctcagctct tatatattaa gctaccaact tagtatataa gccaaaactt 2520
aaatgtgcta ccaacacatc aagccgttag agaactctat ctatagcaat atttcaaatg 2580
taccgacata caagagaaac attaactata tatattcaat ttatgagatt atcttaacag 2640
atataaatgt aaattgcaat aagtaagatt tagaagttta tagcctttgt gtattggaag 2700
cagtacgcaa aggctttttt atttgataaa aattagaagt atatttattt tttcataatt 2760
aatttatgaa aatgaaaggg ggtgagcaaa gtgacagagg aaagcagtat cttatcaaat 2820
aacaaggtat tagcaatatc attattgact ttagcagtaa acattatgac ttttatagtg 2880
cttgtagcta agtagtacga aagggggagc tttaaaaagc tccttggaat acatagaatt 2940
cataaattaa tttatgaaaa gaagggcgta tatgaaaact tgtaaaaatt gcaaagagtt 3000
tattaaagat actgaaatat gcaaaataca ttcgttgatg attcatgata aaacagtagc 3060
aacctattgc agtaaataca atgagtcaag atgtttacat aaagggaaag tccaatgtat 3120
taattgttca aagatgaacc gatatggatg gtgtgccata aaaatgagat gttttacaga 3180
ggaagaacag aaaaaagaac gtacatgcat taaatattat gcaaggagct ttaaaaaagc 3240
tcatgtaaag aagagtaaaa agaaaaaata atttatttat taatttaata ttgagagtgc 3300
cgacacagta tgcactaaaa aatatatctg tggtgtagtg agccgataca aaaggatagt 3360
cactcgcatt ttcataatac atcttatgtt atgattatgt gtcggtggga cttcacgacg 3420
aaaacccaca ataaaaaaag agttcggggt agggttaagc atagttgagg caactaaaca 3480
atcaagctag gatatgcagt agcagaccgt aaggtcgttg tttaggtgtg ttgtaataca 3540
tacgctatta agatgtaaaa atacggatac caatgaaggg aaaagtataa tttttggatg 3600
tagtttgttt gttcatctat gggcaaacta cgtccaaagc cgtttccaaa tctgctaaaa 3660
agtatatcct ttctaaaatc aaagtcaagt atgaaatcat aaataaagtt taattttgaa 3720
gttattatga tattatgttt ttctattaaa ataaattaag tatatagaat agtttaataa 3780
tagtatatac ttaatgtgat aagtgtctga cagtgtcaca gaaaggatga ttgttatgga 3840
ttataagcgg ctcgagtccc tatcagtgat agattgaaac tctatcattg atagagtata 3900
atatctttgt tcattagagc gataaacttg aatttgagag ggaacttcca tggataaaaa 3960
gtacagtatt ggtctagaca taggaactaa ctctgttggg tgggctgtta taacagatga 4020
atataaagtt ccatcaaaaa aatttaaagt attaggaaac actgatagac attcaataaa 4080
aaaaaacttg ataggtgctt tattattcga ttcaggagag actgctgaag ctacacgttt 4140
aaaaagaaca gctagacgta gatatacaag aagaaaaaat aggatatgtt atcttcaaga 4200
aatttttagt aatgaaatgg caaaagttga tgattcattc tttcacagac tagaagaaag 4260
tttcttagtt gaagaagata agaagcatga aagacaccct atttttggta atatcgtaga 4320
tgaagtagca tatcatgaga agtatccaac tatctatcat ttaagaaaga aattagttga 4380
ttctacagat aaagctgatc tgagattaat atatttagct ttagctcata tgattaaatt 4440
tagaggacat tttttaatag aaggtgattt aaacccagac aacagcgatg tagataaatt 4500
atttatccaa ttagttcaaa cttataatca attattcgaa gagaatccaa ttaatgcaag 4560
tggtgtagac gctaaggcta tattatcagc tagattatca aaatctagaa gattagaaaa 4620
tctaatagct caacttcctg gagaaaagaa aaatggactt tttgggaacc taatagctct 4680
ctcactcgga ctaacaccaa attttaaaag caattttgat cttgctgaag acgcaaagtt 4740
acaactatca aaggatacat acgatgatga tttagataat ttgttagctc aaataggtga 4800
tcaatatgct gatttgtttc ttgcagcaaa aaacttaagt gatgcaattt tactatcaga 4860
tatacttaga gtaaatacag aaataacaaa ggctccttta tcagcaagta tgattaaacg 4920
atatgatgag catcatcaag atttaacatt attaaaggca cttgtaagac aacaattacc 4980
agaaaaatat aaagaaattt tctttgatca atctaaaaat ggatatgctg gatatataga 5040
cggtggagca agtcaagaag agttttataa atttataaag cctattttag aaaaaatgga 5100
tggaactgaa gaattacttg ttaaacttaa cagagaagat ttacttagaa aacaaagaac 5160
ttttgataat ggttcaattc ctcaccaaat tcatttagga gaattacatg ctatactaag 5220
aagacaagaa gatttttatc catttcttaa agataataga gaaaaaattg aaaaaatttt 5280
aacttttaga ataccatatt atgtaggacc acttgcaagg ggaaattcaa gatttgcatg 5340
gatgactaga aaatcagaag aaactataac cccgtggaat tttgaagaag tagtagataa 5400
aggagctagt gctcaatcat ttatagaaag aatgacaaat tttgataaga atcttcctaa 5460
cgaaaaggtt ttgccaaagc atagccttct ttatgagtat tttacagttt ataatgagct 5520
tactaaagta aaatacgtta cagaaggaat gagaaaacca gcatttttgt ctggtgaaca 5580
aaagaaagca atagtagacc tattatttaa aacaaatagg aaggttaccg taaagcaact 5640
taaagaagat tacttcaaaa aaattgaatg ctttgatagt gttgaaatat caggagttga 5700
agatagattt aatgcttcac ttggtacata tcacgatctc ttaaaaatta taaaagataa 5760
ggatttttta gataatgaag aaaatgaaga tattcttgaa gatatagtat taacattgac 5820
actttttgaa gatagagaaa tgatagaaga aagattaaaa acatatgcac atctttttga 5880
tgataaggtt atgaagcaac ttaaaagaag aagatataca ggttggggac gtttgtcaag 5940
aaagctaatt aatggtatta gagataaaca atcaggaaag actattctcg attttcttaa 6000
atcagatgga tttgctaata gaaactttat gcaattaatt catgatgatt ctcttacttt 6060
caaagaggat attcaaaagg ctcaagtttc tggacaaggc gatagcttac acgaacacat 6120
tgctaacctt gcagggagcc ccgctatcaa aaaaggaatt ttacaaacag ttaaagttgt 6180
agatgaactt gttaaagtta tgggaagaca caaacctgag aatatagtta tagaaatggc 6240
cagagaaaat caaacaacac aaaaaggaca aaaaaattct agagagagaa tgaagagaat 6300
tgaagaagga ataaaagagc taggatcaca aatattaaaa gaacatccag ttgaaaatac 6360
tcaattgcaa aatgaaaagt tatatttgta ttacttacaa aatggaagag atatgtatgt 6420
tgatcaagaa ctcgatatta atagattaag tgactatgat gttgatcata ttgttcctca 6480
atcattttta aaagatgatt caatcgataa caaagtatta actagatcag ataaaaatag 6540
aggaaagtca gataatgtac catctgaaga agttgttaaa aaaatgaaga actattggag 6600
acaactttta aatgcaaagc taattacaca aagaaaattt gacaatttaa caaaagcaga 6660
aagaggagga ttaagcgaat tagacaaagc tggatttata aaaagacaac ttgttgagac 6720
aagacaaata actaagcatg ttgctcaaat acttgattca agaatgaata caaaatatga 6780
tgaaaatgat aaattaatca gagaagtaaa agtaataaca ttaaagtcaa aattagtatc 6840
agatttcaga aaggattttc aattttacaa agttcgtgaa ataaataact atcatcatgc 6900
tcatgatgca tacttaaatg ctgttgtagg aactgctctt attaagaaat atcctaaact 6960
agaaagcgaa tttgtttatg gagattataa agtttatgat gtgcgcaaaa tgatcgcgaa 7020
atccgaacaa gaaatcggta aggctacagc aaaatatttc ttttatagta atataatgaa 7080
tttttttaag acagaaataa ctttggctaa tggtgaaatc agaaaaagac cacttatcga 7140
aacaaatgga gagacaggag aaatagtatg ggataaagga agagattttg ctactgttag 7200
aaaagtacta agtatgccac aagtaaatat cgtaaagaaa actgaagttc aaactggagg 7260
tttctctaag gaatcaattt tacctaagag aaattcagat aagttaattg caaggaaaaa 7320
agattgggac ccaaaaaaat acggtggttt tgatagtcca acagttgcct atagtgttct 7380
tgtagtagcg aaagttgaga aaggtaagtc aaaaaagttg aaaagcgtaa aagaacttct 7440
tggtatcaca attatggaaa gatcttcatt tgaaaaaaat ccaattgact ttttagaagc 7500
taagggttat aaagaagtta aaaaggattt aatcataaaa ctaccaaagt atagtctatt 7560
tgaactcgaa aacggaagaa aacgaatgct cgctagcgca ggagaacttc aaaaaggaaa 7620
tgaacttgcg ctgccatcaa agtatgtaaa tttcttatat ttagcttctc attatgagaa 7680
attaaaagga tcaccagagg ataatgaaca aaagcaacta tttgtagaac aacacaaaca 7740
ttatttagat gaaataatag aacaaatatc tgaattttct aaaagagtta tacttgccga 7800
cgcaaatcta gataaggtgc tttcagcgta taataaacac agagataaac caataagaga 7860
acaagcagaa aacattatcc atctttttac attaactaat cttggtgcac cagctgcatt 7920
taagtacttt gatacaacaa tagatagaaa aagatacaca tctactaaag aagtattaga 7980
cgcaacttta atacatcaat ctattacagg gctttatgaa acaagaattg atttaagtca 8040
actaggcgga gattaagtcg acaaagtatt gttaaaaata actctgtaga attataaatt 8100
agttctacag agttattttt tgacccgggt atattgataa aaataataat agtgggtata 8160
attaagttgt taggaggtta gttagaatga tgtcaagatt agataaaagt aaagtgatta 8220
acagcgcatt agagctgctt aatgaggtcg gaatcgaagg tttaacaacc cgtaaactcg 8280
cccagaagct aggtgtagag cagcctacat tgtattggca tgtaaaaaat aagcgggctt 8340
tgctcgacgc cttagccatt gagatgttag ataggcacca tactcacttt tgccctttag 8400
aaggggaaag ctggcaagat tttttacgta ataacgctaa aagttttaga tgtgctttac 8460
taagtcatcg cgatggagca aaagtacatt taggtacacg gcctacagaa aaacagtatg 8520
aaactctcga aaatcaatta gcctttttat gccaacaagg tttttcacta gagaatgcat 8580
tatatgcact cagcgctgtg gggcatttta ctttaggttg cgtattggaa gatcaagagc 8640
atcaagtcgc taaagaagaa agggaaacac ctactactga tagtatgccg ccattattac 8700
gacaagctat cgaattattt gatcaccaag gtgcagagcc agccttctta ttcggccttg 8760
aattgatcat atgcggatta gaaaaacaac ttaaatgtga aagtgggtct taaaagcagc 8820
ataacctttt tccgtgatgg taacttcacg gtaaccaaga tgtcgagttg agctcttagt 8880
tcaactcact ttttaaggtg attgtttgca tgtcattata aaattcttct tcatcctcgt 8940
attcttgatt ccaaccgttt ttaaatgcag atatgaattt ttcaactatt gattcatttt 9000
cactttcaga aattacatac tcgtttccat cattattaac tctaataatt agctgtgtta 9060
tactattgct atccgtacca ctcaatttca ctgtgtaatc tttgtttttt atttctctaa 9120
ttaagtcatt aatattcatt tcagccctcc tgtgaaattg ttatccgctc acaattccac 9180
gtcgactacc gcggattcta gattctgcag tatcttcatg gtattcattt tttaatatca 9240
ttttaccctc ccaatacatt taaaataatt atgtattcat gaaacatgat tgtatattta 9300
agaaacataa ttccatataa atcatttttc aaaatagttt ttacccataa ttaaatgtta 9360
atatgtaaat taatctttta gaatagttaa aaagttctaa aatatgttat aatgtttctt 9420
ataatcttat aaattttaat aactaatata taaagatatt tctttaaaat attcttatat 9480
ttagaagaat ttattttaaa ataaaaagct tttatgttga taaactgctt tgcaaagctc 9540
tcatgtaaat gtttaatata agactactat aaaattggct aattttatag gttaggaggt 9600
agaaatgcaa atattgtgga aaaagtatgt taaagaaaac tttgaaatga atgtagatga 9660
atgtggtata gaacaaggta taccaggatt aggatataac tatgaagtat tgaaaaatgc 9720
tgttattcat tacgtaacta agggatatgg aacttttaaa tttaatggta aggtatataa 9780
cttaaaacaa ggtgatattt ttatactact aaaaggtatg caagttgagt atgtggcttc 9840
tattgatgat ccttgggaat actactggat aggatttagt ggttcaaatg ctaatgagta 9900
tttaaataga acttctatta ctaactcctg tgttgctaat tgtgaagaaa actcaaaaat 9960
tccacagata atattaaata tgtgcgaaat atcaaaaact tataatcctt caagatctga 10020
tgacatacta ttactaaaag aactttactc attattgtac gcacttatag aagaattccc 10080
aaaacctttt gaatacaaag ataaggaatt acacacatat attcaagatg ctcttaattt 10140
cattaattct aattacatgc atagcataac tgttcaagaa attgctgatt atgtgaactt 10200
aagtagaagt tatttatata aaatgttcat aaaaaacctt ggaatttctc ctcaaagata 10260
tttaataaac cttagaatgt acaaagccac ccttttatta aaaagcacta aacttcctat 10320
aggagaagtc gcaagtagtg taggttatag tgactccctg ttattttcaa aaactttttc 10380
aaaacatttt tcaatgtctc cactaaatta cagaaataat caagtaaata aaccaagtat 10440
ataaatttaa aatacagctt taaaacaaaa aaatttcaaa aataaaaagt ataacagagg 10500
cgtaaattaa aacctctgtt atactttttg agct 10534
<210> 24
<211> 5754
<212> DNA
<213> Artificial Sequence
<220>
<223> pEC750S-uppHR
<400> 24
ataaggtacc aggaattaga gcagcgctat gttcagatac atttagtgct catgcaacaa 60
gagaacataa taatgctaat atattaacta tgggtcaaag ggttgttgga gcaggtcttg 120
ctttagatat agtaaaaaca tttatatcag ctaaatttga aggagatagg caccaaaaaa 180
gaatagataa gatttcagat attgaaaaaa agtatacaca ttagaaaaaa gcagctatgc 240
tgcaaataag atcaatttat attagaaaaa agcagctatg ctgcaaataa gatcaattta 300
tattagaaaa aagcagctat gctgcaaata agatcaattt atattagaaa aaagcagcta 360
tgctacaaat aagatcaatt tatattagaa aaaagtagct atgctgcaac aatattaatt 420
tatattacta gaaagctaaa tggggtatat aaatataaag ggctataaat actaaaagca 480
aacttggagg aataataatg gtctagagct ggagatagat tatttggtac taagtaatta 540
gtaatctatt agaattaaaa gctatctaca taagtttctg aatgacccaa gataatttta 600
ctggggggaa tatagaaaat ggagagacga gataagaaaa attattactt ggatattgct 660
gaaacagttt tagagagagg aacctgtcta aggagaaact atggttctat aattgttaaa 720
aatgatgaaa taatttctac tggatacaca ggagcaccta gaggtagaaa aaattgcatg 780
gatttgaata gttgcataag agaaaagttg aaagttccaa gaggtactca ttatgagttg 840
tgtaggagtg tacatagtga agctaatgca ataataagcg cttcgagctc gaattcgtaa 900
tcatggtcat agctgtttcc tgtgtgaaat tgttatccgc tcacaattcc acacaacata 960
cgagccggaa gcataaagtg taaagcctgg ggtgcctaat gagtgagcta actcacatta 1020
attgcgttgc gctcactgcc cgctttccag tcgggaaacc tgtcgtgcca gctgcattaa 1080
tgaatcggcc aacgcgcggg gagaggcggt ttgcgtattg ggcgctcttc cgcttcctcg 1140
ctcactgact cgctgcgctc ggtcgttcgg ctgcggcgag cggtatcagc tcactcaaag 1200
gcggtaatac ggttatccac agaatcaggg gataacgcag gaaagaacat gtgagcaaaa 1260
ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc tggcgttttt ccataggctc 1320
cgcccccctg acgagcatca caaaaatcga cgctcaagtc agaggtggcg aaacccgaca 1380
ggactataaa gataccaggc gtttccccct ggaagctccc tcgtgcgctc tcctgttccg 1440
accctgccgc ttaccggata cctgtccgcc tttctccctt cgggaagcgt ggcgctttct 1500
catagctcac gctgtaggta tctcagttcg gtgtaggtcg ttcgctccaa gctgggctgt 1560
gtgcacgaac cccccgttca gcccgaccgc tgcgccttat ccggtaacta tcgtcttgag 1620
tccaacccgg taagacacga cttatcgcca ctggcagcag ccactggtaa caggattagc 1680
agagcgaggt atgtaggcgg tgctacagag ttcttgaagt ggtggcctaa ctacggctac 1740
actagaagaa cagtatttgg tatctgcgct ctgctgaagc cagttacctt cggaaaaaga 1800
gttggtagct cttgatccgg caaacaaacc accgctggta gcggtggttt ttttgtttgc 1860
aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag atcctttgat cttttctacg 1920
gggtctgacg ctcagtggaa cgaaaactca cgttaaggga ttttggtcat gagattatca 1980
aaaaggatct tcacctagat ccttttaaat taaaaatgaa gttttaaatc aatctaaagt 2040
atatatgagt aaacttggtc tgacagttac caaagctagc ttaatactag tatatactta 2100
atgtgataag tgtctgacag ctgaccggtc taaagaggtc cgccaatgaa atctataaat 2160
aaactaaatt aagtttattt aattaacaac tatggatata aaataggtac taatcaaaat 2220
agtgaggagg atatatttga atacatacga acaaattaat aaagtgaaaa aaatacttcg 2280
gaaacattta aaaaataacc ttattggtac ttacatgttt ggatcaggag ttgagagtgg 2340
actaaaacca aatagtgatc ttgacttttt agtcgtcgta tctgaaccat tgacagatca 2400
aagtaaagaa atacttatac aaaaaattag acctatttca aagaaaatag gagataaaag 2460
caacttacga tatattgaat taacaattat tattcagcaa gaaatggtac cgtggaatca 2520
tcctcccaaa caagaattta tttatggaga atggttacaa gagctttatg aacaaggata 2580
cattcctcag aaggaattaa attcagattt aaccataatg ctttaccaag caaaacgaaa 2640
aaataaaaga atatacggaa attatgactt agaggaatta ctacctgata ttccattttc 2700
tgatgtgaga agagccatta tggattcgtc agaggaatta atagataatt atcaggatga 2760
tgaaaccaac tctatattaa ctttatgccg tatgatttta actatggaca cgggtaaaat 2820
cataccaaaa gatattgcgg gaaatgcagt ggctgaatct tctccattag aacataggga 2880
gagaattttg ttagcagttc gtagttatct tggagagaat attgaatgga ctaatgaaaa 2940
tgtaaattta actataaact atttaaataa cagattaaaa aaattataaa aaaattgaaa 3000
aaatggtgga aacacttttt tcaatttttt tgttttatta tttaatattt gggaaatatt 3060
cattctaatt ggtaatcaga ttttagaagt tgttaacttc aggtttgtct gtaactaaaa 3120
actagtattt aacctaggat caaaaaaatt tccaataatc ccactctaag ccacaaacac 3180
gccctataaa atcccgcttt aatcccactt tgagacacat gtaatattac tttacgccct 3240
agtatagtga taatttttta cattcaatgc cacgcaaaaa aataaagggg cactataata 3300
aaagttcctt cggaactaac taaagtaaaa aattatcttt acaacctccc caaaaaaaag 3360
aacaggtaca aagtacccta taatacaagc gtaaaaaaaa tgagggtaaa aataaaaaaa 3420
taaaaaaata aaaaaataaa aaaataaaaa aataaaaaaa taaaaaaata taaaaataaa 3480
aaaatataaa aataaaaaaa tataaaaata aaaaaataaa aaaatataaa aataaaaaaa 3540
taaaaaaata taaaaatatt ttttatttaa agtttgaaaa aaattttttt atattatata 3600
atctttgaag aaaagaatat aaaaaatgag cctttataaa agcccatttt ttttcatata 3660
cgtaatatga cgttctaatg tttttattgg tacttctaac attagagtaa tttctttatt 3720
tttaaagcct ttttctttaa gggcttttat tttttttctt aatacattta attcctcttt 3780
ttttgttgct tttcctttag cttttaattg ctcttgataa ttttttttac ctctaatatt 3840
ttctcttctc ttatattcct ttttagaaat tattattgtc atatattttt gttcttcttc 3900
tgtaatttct aataactcta taagagtttc attcttatac ttatattgct tatttttatc 3960
taaataacat ctttcagcac ttctagttgc tcttataact tctctttcac ttaaatgttg 4020
tctaaacata ctattaagtt ctaaaacatc atttaatgcc ttctcaatgt cttctgtaaa 4080
gctacaaaga taatatctat ataaaaataa tataagctct ctgtgtcctt ttaaatcata 4140
ttctcttagt tcacaaagtt ttattatgtc ttgtattctt ccataatata aacttctttc 4200
tctataaata taatttattt tgcttggtct accctttttc ctttcatatg gttttaattc 4260
aggtaaaaat ccattttgta tttctcttaa gtcataaata tattcgtact catctaatat 4320
attgactact gtttttgatt tagagtttat acttcctgga actcttaata ttctcgttgc 4380
atctaaggct tgtctatctg ctccaaagta ttttaattga ttatataaat attcttgaac 4440
cgctttccat aatggtaatg ctttactagg tactgcattt attatccata ttaaatacat 4500
tcctcttcca ctatctatta catagtttgg tataggaata ctttgattaa aataattctt 4560
ttctaagtcc attaatacct ggtctttagt tttgccagtt ttataataat ccaagtctat 4620
aaacagtgta tttaactctt ttatattttc taatcgccta cacggcttat aaaaggtatt 4680
tagagttata tagatatttt catcactcat atctaaatct tttaattcag cgtatttata 4740
gtgccattgg ctatatcctt ttttatctat aacgctcctg gttatccacc ctttacttct 4800
actatgaata ttatctatat agttcttttt attcagcttt aatgcgtttc tcacttattc 4860
acctcccctt ctgtaaaact aagaaaatta tatcatattt tcaataatta ttaactattc 4920
ttaaactctt aataaaaaat agagtaagtc cccaattgaa acttaatcta ttttttatgt 4980
tttaatttat tatttttatt aaaatatttt aaactaaatt aaatgattct ttttaatttt 5040
ttactatttc attccataat atattactat aattatttac aaataatatt tcttcatttg 5100
taatatttag atgatttact aattttagtt tttatatatt aaataattaa tgtataattt 5160
atataaaaaa tcaaaggagc ttataaatta tgattatttc caaagatact aaagatttaa 5220
tttttttcaa ttttaacaat actttttgta atattatgtt taaatttaat tgtatttttt 5280
tcatataata aagccgttga agtaaaccaa tccattttcc ttatgatgtt attattaaat 5340
ttaagtttta taataatatc tttattatat ttattgtttt taaaaaaact agtgaaattt 5400
ctagtgaaat ttccggcttt attaaactta tttttaggaa ttttattttc attttcatct 5460
ttacaggatt tgattatatc tttaaatatg ttttatcaaa tattatcttt ttctaaattt 5520
atatatattt ttattatatt tattattata tatattttat ttttaagttt ctttctaaca 5580
gctattaaaa agaaacttaa aaataaaaac acgtactcta aaccaataaa taaaactatt 5640
tttattattg ctgccttgat tggaatagtt tttagtaaaa ttaatttcaa tattccacaa 5700
tattatatta taagctagca ggcctcgaga tctccatgga cgcgtgacgt cgac 5754
<210> 25
<211> 884
<212> DNA
<213> Artificial Sequence
<220>
<223> Repair template
<400> 25
ataaggtacc aggaattaga gcagcgctat gttcagatac atttagtgct catgcaacaa 60
gagaacataa taatgctaat atattaacta tgggtcaaag ggttgttgga gcaggtcttg 120
ctttagatat agtaaaaaca tttatatcag ctaaatttga aggagatagg caccaaaaaa 180
gaatagataa gatttcagat attgaaaaaa agtatacaca ttagaaaaaa gcagctatgc 240
tgcaaataag atcaatttat attagaaaaa agcagctatg ctgcaaataa gatcaattta 300
tattagaaaa aagcagctat gctgcaaata agatcaattt atattagaaa aaagcagcta 360
tgctacaaat aagatcaatt tatattagaa aaaagtagct atgctgcaac aatattaatt 420
tatattacta gaaagctaaa tggggtatat aaatataaag ggctataaat actaaaagca 480
aacttggagg aataataatg gtctagagct ggagatagat tatttggtac taagtaatta 540
gtaatctatt agaattaaaa gctatctaca taagtttctg aatgacccaa gataatttta 600
ctggggggaa tatagaaaat ggagagacga gataagaaaa attattactt ggatattgct 660
gaaacagttt tagagagagg aacctgtcta aggagaaact atggttctat aattgttaaa 720
aatgatgaaa taatttctac tggatacaca ggagcaccta gaggtagaaa aaattgcatg 780
gatttgaata gttgcataag agaaaagttg aaagttccaa gaggtactca ttatgagttg 840
tgtaggagtg tacatagtga agctaatgca ataataagcg cttc 884
<210> 26
<211> 500
<212> DNA
<213> Artificial Sequence
<220>
<223> upp gene upstream fragment
<400> 26
ataaggtacc aggaattaga gcagcgctat gttcagatac atttagtgct catgcaacaa 60
gagaacataa taatgctaat atattaacta tgggtcaaag ggttgttgga gcaggtcttg 120
ctttagatat agtaaaaaca tttatatcag ctaaatttga aggagatagg caccaaaaaa 180
gaatagataa gatttcagat attgaaaaaa agtatacaca ttagaaaaaa gcagctatgc 240
tgcaaataag atcaatttat attagaaaaa agcagctatg ctgcaaataa gatcaattta 300
tattagaaaa aagcagctat gctgcaaata agatcaattt atattagaaa aaagcagcta 360
tgctacaaat aagatcaatt tatattagaa aaaagtagct atgctgcaac aatattaatt 420
tatattacta gaaagctaaa tggggtatat aaatataaag ggctataaat actaaaagca 480
aacttggagg aataataatg 500
<210> 27
<211> 377
<212> DNA
<213> Artificial Sequence
<220>
<223> upp gene downstream fragment
<400> 27
gctggagata gattatttgg tactaagtaa ttagtaatct attagaatta aaagctatct 60
acataagttt ctgaatgacc caagataatt ttactggggg gaatatagaa aatggagaga 120
cgagataaga aaaattatta cttggatatt gctgaaacag ttttagagag aggaacctgt 180
ctaaggagaa actatggttc tataattgtt aaaaatgatg aaataatttc tactggatac 240
acaggagcac ctagaggtag aaaaaattgc atggatttga atagttgcat aagagaaaag 300
ttgaaagttc caagaggtac tcattatgag ttgtgtagga gtgtacatag tgaagctaat 360
gcaataataa gcgcttc 377
<210> 28
<211> 2666
<212> DNA
<213> Artificial Sequence
<220>
<223> pEX-A2-gRNA-upp
<400> 28
ctcgagtatt tttgataaaa gcaatgatta acatggtttg acgtctgaga agagacgatt 60
ttctcaatag gagaaattaa ggtgcaaacc cttatcattc caccatgatc cacctgtagc 120
aagcatgttt tagagctaga aatagcaagt taaaataagg ctagtccgtt atcaacttga 180
aaaagtggca ccgagtcggt gctttttttg ccatggacct gcttttgctc gcttggatcc 240
gaattcctgt gtgaaattgt tatccgctca caattccaca caacatacga gccggaagca 300
taaagtgtaa agcctggggt gcctaatgag tgagctaact cacattaatt gcgttgcgct 360
cactgcccgc tttccagtcg ggaaacctgt cgtgccagct gcattaatga atcggccaac 420
gcgcggggag aggcggtttg cgtattgggc gctcttccgc ttcctcgctc actgactcgc 480
tgcgctcggt cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt 540
tatccacaga atcaggggat aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg 600
ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca taggctccgc ccccctgacg 660
agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga ctataaagat 720
accaggcgtt tccccctgga agctccctcg tgcgctctcc tgttccgacc ctgccgctta 780
ccggatacct gtccgccttt ctcccttcgg gaagcgtggc gctttctcat agctcacgct 840
gtaggtatct cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc 900
ccgttcagcc cgaccgctgc gccttatccg gtaactatcg tcttgagtcc aacccggtaa 960
gacacgactt atcgccactg gcagcagcca ctggtaacag gattagcaga gcgaggtatg 1020
taggcggtgc tacagagttc ttgaagtggt ggcctaacta cggctacact agaagaacag 1080
tatttggtat ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt 1140
gatccggcaa acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta 1200
cgcgcagaaa aaaaggatct caagaagatc ctttgatctt ttctacgggg tctgacgctc 1260
agtggaacga aaactcacgt taagggattt tggtcatgag attatcaaaa aggatcttca 1320
cctagatcct tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata tatgagtaaa 1380
cttggtctga cagttaccaa tgcttaatca gtgaggcacc tatctcagcg atctgtctat 1440
ttcgttcatc catagttgcc tgactccccg tcgtgtagat aactacgata cgggagggct 1500
taccatctgg ccccagtgct gcaatgatac cgcgactccc acgctcaccg gctccagatt 1560
tatcagcaat aaaccagcca gccggaaggg ccgagcgcag aagtggtcct gcaactttat 1620
ccgcctccat ccagtctatt aattgttgcc gggaagctag agtaagtagt tcgccagtta 1680
atagtttgcg caacgttgtt gccattgcta caggcatcgt ggtgtcacgc tcgtcgtttg 1740
gtatggcttc attcagctcc ggttcccaac gatcaaggcg agttacatga tcccccatgt 1800
tgtgcaaaaa agcggttagc tccttcggtc ctccgatcgt tgtcagaagt aagttggccg 1860
cagtgttatc actcatggtt atggcagcac tgcataattc tcttactgtc atgccatccg 1920
taagatgctt ttctgtgact ggtgagtact caaccaagtc attctgagaa tagtgtatgc 1980
ggcgaccgag ttgctcttgc ccggcgtcaa tacgggataa taccgcgcca catagcagaa 2040
ctttaaaagt gctcatcatt ggaaaacgtt cttcggggcg aaaactctca aggatcttac 2100
cgctgttgag atccagttcg atgtaaccca ctcgtgcacc caactgatct tcagcatctt 2160
ttactttcac cagcgtttct gggtgagcaa aaacaggaag gcaaaatgcc gcaaaaaagg 2220
gaataagggc gacacggaaa tgttgaatac tcatactctt cctttttcaa tattattgaa 2280
gcatttatca gggttattgt ctcatgagcg gatacatatt tgaatgtatt tagaaaaata 2340
aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc acctgacgtc taagaaacca 2400
ttattatcat gacattaacc tataaaaata ggcgtatcac gaggcccttt cgtctcgcgc 2460
gtttcggtga tgacggtgaa aacctctgac acatgcagct cccggagacg gtcacagctt 2520
gtctgtaagc ggatgccggg agcagacaag cccgtcaggg cgcgtcagcg ggtgttggcg 2580
ggtgtcgggg ctggcttaac tatgcggcat cagagcagat tgtactgaga gtgcaccaat 2640
tgggtaccga gctcgcggcc gcaagc 2666
<210> 29
<211> 203
<212> DNA
<213> Artificial Sequence
<220>
<223> gRNA expression cassette
<400> 29
tatttttgat aaaagcaatg attaacatgg tttgacgtct gagaagagac gattttctca 60
ataggagaaa ttaaggtgca aacccttatc attccaccat gatccacctg tagcaagcat 120
gttttagagc tagaaatagc aagttaaaat aaggctagtc cgttatcaac ttgaaaaagt 180
ggcaccgagt cggtgctttt ttt 203
<210> 30
<211> 100
<212> DNA
<213> Artificial Sequence
<220>
<223> Constitutive promoter
<400> 30
tatttttgat aaaagcaatg attaacatgg tttgacgtct gagaagagac gattttctca 60
ataggagaaa ttaaggtgca aacccttatc attccaccat 100
<210> 31
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Protospacer targeting upp
<400> 31
gatccacctg tagcaagcat 20
<210> 32
<211> 5954
<212> DNA
<213> Artificial Sequence
<220>
<223> pEC750S-deltaupp
<400> 32
ataaggtacc aggaattaga gcagcgctat gttcagatac atttagtgct catgcaacaa 60
gagaacataa taatgctaat atattaacta tgggtcaaag ggttgttgga gcaggtcttg 120
ctttagatat agtaaaaaca tttatatcag ctaaatttga aggagatagg caccaaaaaa 180
gaatagataa gatttcagat attgaaaaaa agtatacaca ttagaaaaaa gcagctatgc 240
tgcaaataag atcaatttat attagaaaaa agcagctatg ctgcaaataa gatcaattta 300
tattagaaaa aagcagctat gctgcaaata agatcaattt atattagaaa aaagcagcta 360
tgctacaaat aagatcaatt tatattagaa aaaagtagct atgctgcaac aatattaatt 420
tatattacta gaaagctaaa tggggtatat aaatataaag ggctataaat actaaaagca 480
aacttggagg aataataatg gtctagagct ggagatagat tatttggtac taagtaatta 540
gtaatctatt agaattaaaa gctatctaca taagtttctg aatgacccaa gataatttta 600
ctggggggaa tatagaaaat ggagagacga gataagaaaa attattactt ggatattgct 660
gaaacagttt tagagagagg aacctgtcta aggagaaact atggttctat aattgttaaa 720
aatgatgaaa taatttctac tggatacaca ggagcaccta gaggtagaaa aaattgcatg 780
gatttgaata gttgcataag agaaaagttg aaagttccaa gaggtactca ttatgagttg 840
tgtaggagtg tacatagtga agctaatgca ataataagcg cttcgagctc gaattcgtaa 900
tcatggtcat agctgtttcc tgtgtgaaat tgttatccgc tcacaattcc acacaacata 960
cgagccggaa gcataaagtg taaagcctgg ggtgcctaat gagtgagcta actcacatta 1020
attgcgttgc gctcactgcc cgctttccag tcgggaaacc tgtcgtgcca gctgcattaa 1080
tgaatcggcc aacgcgcggg gagaggcggt ttgcgtattg ggcgctcttc cgcttcctcg 1140
ctcactgact cgctgcgctc ggtcgttcgg ctgcggcgag cggtatcagc tcactcaaag 1200
gcggtaatac ggttatccac agaatcaggg gataacgcag gaaagaacat gtgagcaaaa 1260
ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc tggcgttttt ccataggctc 1320
cgcccccctg acgagcatca caaaaatcga cgctcaagtc agaggtggcg aaacccgaca 1380
ggactataaa gataccaggc gtttccccct ggaagctccc tcgtgcgctc tcctgttccg 1440
accctgccgc ttaccggata cctgtccgcc tttctccctt cgggaagcgt ggcgctttct 1500
catagctcac gctgtaggta tctcagttcg gtgtaggtcg ttcgctccaa gctgggctgt 1560
gtgcacgaac cccccgttca gcccgaccgc tgcgccttat ccggtaacta tcgtcttgag 1620
tccaacccgg taagacacga cttatcgcca ctggcagcag ccactggtaa caggattagc 1680
agagcgaggt atgtaggcgg tgctacagag ttcttgaagt ggtggcctaa ctacggctac 1740
actagaagaa cagtatttgg tatctgcgct ctgctgaagc cagttacctt cggaaaaaga 1800
gttggtagct cttgatccgg caaacaaacc accgctggta gcggtggttt ttttgtttgc 1860
aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag atcctttgat cttttctacg 1920
gggtctgacg ctcagtggaa cgaaaactca cgttaaggga ttttggtcat gagattatca 1980
aaaaggatct tcacctagat ccttttaaat taaaaatgaa gttttaaatc aatctaaagt 2040
atatatgagt aaacttggtc tgacagttac caaagctagc ttaatactag tatatactta 2100
atgtgataag tgtctgacag ctgaccggtc taaagaggtc cgccaatgaa atctataaat 2160
aaactaaatt aagtttattt aattaacaac tatggatata aaataggtac taatcaaaat 2220
agtgaggagg atatatttga atacatacga acaaattaat aaagtgaaaa aaatacttcg 2280
gaaacattta aaaaataacc ttattggtac ttacatgttt ggatcaggag ttgagagtgg 2340
actaaaacca aatagtgatc ttgacttttt agtcgtcgta tctgaaccat tgacagatca 2400
aagtaaagaa atacttatac aaaaaattag acctatttca aagaaaatag gagataaaag 2460
caacttacga tatattgaat taacaattat tattcagcaa gaaatggtac cgtggaatca 2520
tcctcccaaa caagaattta tttatggaga atggttacaa gagctttatg aacaaggata 2580
cattcctcag aaggaattaa attcagattt aaccataatg ctttaccaag caaaacgaaa 2640
aaataaaaga atatacggaa attatgactt agaggaatta ctacctgata ttccattttc 2700
tgatgtgaga agagccatta tggattcgtc agaggaatta atagataatt atcaggatga 2760
tgaaaccaac tctatattaa ctttatgccg tatgatttta actatggaca cgggtaaaat 2820
cataccaaaa gatattgcgg gaaatgcagt ggctgaatct tctccattag aacataggga 2880
gagaattttg ttagcagttc gtagttatct tggagagaat attgaatgga ctaatgaaaa 2940
tgtaaattta actataaact atttaaataa cagattaaaa aaattataaa aaaattgaaa 3000
aaatggtgga aacacttttt tcaatttttt tgttttatta tttaatattt gggaaatatt 3060
cattctaatt ggtaatcaga ttttagaagt tgttaacttc aggtttgtct gtaactaaaa 3120
actagtattt aacctaggat caaaaaaatt tccaataatc ccactctaag ccacaaacac 3180
gccctataaa atcccgcttt aatcccactt tgagacacat gtaatattac tttacgccct 3240
agtatagtga taatttttta cattcaatgc cacgcaaaaa aataaagggg cactataata 3300
aaagttcctt cggaactaac taaagtaaaa aattatcttt acaacctccc caaaaaaaag 3360
aacaggtaca aagtacccta taatacaagc gtaaaaaaaa tgagggtaaa aataaaaaaa 3420
taaaaaaata aaaaaataaa aaaataaaaa aataaaaaaa taaaaaaata taaaaataaa 3480
aaaatataaa aataaaaaaa tataaaaata aaaaaataaa aaaatataaa aataaaaaaa 3540
taaaaaaata taaaaatatt ttttatttaa agtttgaaaa aaattttttt atattatata 3600
atctttgaag aaaagaatat aaaaaatgag cctttataaa agcccatttt ttttcatata 3660
cgtaatatga cgttctaatg tttttattgg tacttctaac attagagtaa tttctttatt 3720
tttaaagcct ttttctttaa gggcttttat tttttttctt aatacattta attcctcttt 3780
ttttgttgct tttcctttag cttttaattg ctcttgataa ttttttttac ctctaatatt 3840
ttctcttctc ttatattcct ttttagaaat tattattgtc atatattttt gttcttcttc 3900
tgtaatttct aataactcta taagagtttc attcttatac ttatattgct tatttttatc 3960
taaataacat ctttcagcac ttctagttgc tcttataact tctctttcac ttaaatgttg 4020
tctaaacata ctattaagtt ctaaaacatc atttaatgcc ttctcaatgt cttctgtaaa 4080
gctacaaaga taatatctat ataaaaataa tataagctct ctgtgtcctt ttaaatcata 4140
ttctcttagt tcacaaagtt ttattatgtc ttgtattctt ccataatata aacttctttc 4200
tctataaata taatttattt tgcttggtct accctttttc ctttcatatg gttttaattc 4260
aggtaaaaat ccattttgta tttctcttaa gtcataaata tattcgtact catctaatat 4320
attgactact gtttttgatt tagagtttat acttcctgga actcttaata ttctcgttgc 4380
atctaaggct tgtctatctg ctccaaagta ttttaattga ttatataaat attcttgaac 4440
cgctttccat aatggtaatg ctttactagg tactgcattt attatccata ttaaatacat 4500
tcctcttcca ctatctatta catagtttgg tataggaata ctttgattaa aataattctt 4560
ttctaagtcc attaatacct ggtctttagt tttgccagtt ttataataat ccaagtctat 4620
aaacagtgta tttaactctt ttatattttc taatcgccta cacggcttat aaaaggtatt 4680
tagagttata tagatatttt catcactcat atctaaatct tttaattcag cgtatttata 4740
gtgccattgg ctatatcctt ttttatctat aacgctcctg gttatccacc ctttacttct 4800
actatgaata ttatctatat agttcttttt attcagcttt aatgcgtttc tcacttattc 4860
acctcccctt ctgtaaaact aagaaaatta tatcatattt tcaataatta ttaactattc 4920
ttaaactctt aataaaaaat agagtaagtc cccaattgaa acttaatcta ttttttatgt 4980
tttaatttat tatttttatt aaaatatttt aaactaaatt aaatgattct ttttaatttt 5040
ttactatttc attccataat atattactat aattatttac aaataatatt tcttcatttg 5100
taatatttag atgatttact aattttagtt tttatatatt aaataattaa tgtataattt 5160
atataaaaaa tcaaaggagc ttataaatta tgattatttc caaagatact aaagatttaa 5220
tttttttcaa ttttaacaat actttttgta atattatgtt taaatttaat tgtatttttt 5280
tcatataata aagccgttga agtaaaccaa tccattttcc ttatgatgtt attattaaat 5340
ttaagtttta taataatatc tttattatat ttattgtttt taaaaaaact agtgaaattt 5400
ctagtgaaat ttccggcttt attaaactta tttttaggaa ttttattttc attttcatct 5460
ttacaggatt tgattatatc tttaaatatg ttttatcaaa tattatcttt ttctaaattt 5520
atatatattt ttattatatt tattattata tatattttat ttttaagttt ctttctaaca 5580
gctattaaaa agaaacttaa aaataaaaac acgtactcta aaccaataaa taaaactatt 5640
tttattattg ctgccttgat tggaatagtt tttagtaaaa ttaatttcaa tattccacaa 5700
tattatatta taagctagca cgcctcgagt atttttgata aaagcaatga ttaacatggt 5760
ttgacgtctg agaagagacg attttctcaa taggagaaat taaggtgcaa acccttatca 5820
ttccaccatg atccacctgt agcaagcatg ttttagagct agaaatagca agttaaaata 5880
aggctagtcc gttatcaact tgaaaaagtg gcaccgagtc ggtgcttttt ttgccatgga 5940
cgcgtgacgt cgac 5954
<210> 33
<211> 5853
<212> DNA
<213> Artificial Sequence
<220>
<223> pEC750C-deltaupp
<400> 33
atcaaaaaaa tttccaataa tcccactcta agccacaaac acgccctata aaatcccgct 60
ttaatcccac tttgagacac atgtaatatt actttacgcc ctagtatagt gataattttt 120
tacattcaat gccacgcaaa aaaataaagg ggcactataa taaaagttcc ttcggaacta 180
actaaagtaa aaaattatct ttacaacctc cccaaaaaaa agaacaggta caaagtaccc 240
tataatacaa gcgtaaaaaa aatgagggta aaaataaaaa aataaaaaaa taaaaaaata 300
aaaaaataaa aaaataaaaa aataaaaaaa tataaaaata aaaaaatata aaaataaaaa 360
aatataaaaa taaaaaaata aaaaaatata aaaataaaaa aataaaaaaa tataaaaata 420
ttttttattt aaagtttgaa aaaaattttt ttatattata taatctttga agaaaagaat 480
ataaaaaatg agcctttata aaagcccatt ttttttcata tacgtaatat gacgttctaa 540
tgtttttatt ggtacttcta acattagagt aatttcttta tttttaaagc ctttttcttt 600
aagggctttt attttttttc ttaatacatt taattcctct ttttttgttg cttttccttt 660
agcttttaat tgctcttgat aatttttttt acctctaata ttttctcttc tcttatattc 720
ctttttagaa attattattg tcatatattt ttgttcttct tctgtaattt ctaataactc 780
tataagagtt tcattcttat acttatattg cttattttta tctaaataac atctttcagc 840
acttctagtt gctcttataa cttctctttc acttaaatgt tgtctaaaca tactattaag 900
ttctaaaaca tcatttaatg ccttctcaat gtcttctgta aagctacaaa gataatatct 960
atataaaaat aatataagct ctctgtgtcc ttttaaatca tattctctta gttcacaaag 1020
ttttattatg tcttgtattc ttccataata taaacttctt tctctataaa tataatttat 1080
tttgcttggt ctaccctttt tcctttcata tggttttaat tcaggtaaaa atccattttg 1140
tatttctctt aagtcataaa tatattcgta ctcatctaat atattgacta ctgtttttga 1200
tttagagttt atacttcctg gaactcttaa tattctcgtt gcatctaagg cttgtctatc 1260
tgctccaaag tattttaatt gattatataa atattcttga accgctttcc ataatggtaa 1320
tgctttacta ggtactgcat ttattatcca tattaaatac attcctcttc cactatctat 1380
tacatagttt ggtataggaa tactttgatt aaaataattc ttttctaagt ccattaatac 1440
ctggtcttta gttttgccag ttttataata atccaagtct ataaacagtg tatttaactc 1500
ttttatattt tctaatcgcc tacacggctt ataaaaggta tttagagtta tatagatatt 1560
ttcatcactc atatctaaat cttttaattc agcgtattta tagtgccatt ggctatatcc 1620
ttttttatct ataacgctcc tggttatcca ccctttactt ctactatgaa tattatctat 1680
atagttcttt ttattcagct ttaatgcgtt tctcacttat tcacctcccc ttctgtaaaa 1740
ctaagaaaat tatatcatat tttcaataat tattaactat tcttaaactc ttaataaaaa 1800
atagagtaag tccccaattg aaacttaatc tattttttat gttttaattt attattttta 1860
ttaaaatatt ttaaactaaa ttaaatgatt ctttttaatt ttttactatt tcattccata 1920
atatattact ataattattt acaaataata tttcttcatt tgtaatattt agatgattta 1980
ctaattttag tttttatata ttaaataatt aatgtataat ttatataaaa aatcaaagga 2040
gcttataaat tatgattatt tccaaagata ctaaagattt aatttttttc aattttaaca 2100
atactttttg taatattatg tttaaattta attgtatttt tttcatataa taaagccgtt 2160
gaagtaaacc aatccatttt ccttatgatg ttattattaa atttaagttt tataataata 2220
tctttattat atttattgtt tttaaaaaaa ctagtgaaat ttctagtgaa atttccggct 2280
ttattaaact tatttttagg aattttattt tcattttcat ctttacagga tttgattata 2340
tctttaaata tgttttatca aatattatct ttttctaaat ttatatatat ttttattata 2400
tttattatta tatatatttt atttttaagt ttctttctaa cagctattaa aaagaaactt 2460
aaaaataaaa acacgtactc taaaccaata aataaaacta tttttattat tgctgccttg 2520
attggaatag tttttagtaa aattaatttc aatattccac aatattatat tataagctag 2580
cacgcctcga gtatttttga taaaagcaat gattaacatg gtttgacgtc tgagaagaga 2640
cgattttctc aataggagaa attaaggtgc aaacccttat cattccacca tgatccacct 2700
gtagcaagca tgttttagag ctagaaatag caagttaaaa taaggctagt ccgttatcaa 2760
cttgaaaaag tggcaccgag tcggtgcttt ttttgccatg gacgcgtgac gtcgacataa 2820
ggtaccagga attagagcag cgctatgttc agatacattt agtgctcatg caacaagaga 2880
acataataat gctaatatat taactatggg tcaaagggtt gttggagcag gtcttgcttt 2940
agatatagta aaaacattta tatcagctaa atttgaagga gataggcacc aaaaaagaat 3000
agataagatt tcagatattg aaaaaaagta tacacattag aaaaaagcag ctatgctgca 3060
aataagatca atttatatta gaaaaaagca gctatgctgc aaataagatc aatttatatt 3120
agaaaaaagc agctatgctg caaataagat caatttatat tagaaaaaag cagctatgct 3180
acaaataaga tcaatttata ttagaaaaaa gtagctatgc tgcaacaata ttaatttata 3240
ttactagaaa gctaaatggg gtatataaat ataaagggct ataaatacta aaagcaaact 3300
tggaggaata ataatggtct agagctggag atagattatt tggtactaag taattagtaa 3360
tctattagaa ttaaaagcta tctacataag tttctgaatg acccaagata attttactgg 3420
ggggaatata gaaaatggag agacgagata agaaaaatta ttacttggat attgctgaaa 3480
cagttttaga gagaggaacc tgtctaagga gaaactatgg ttctataatt gttaaaaatg 3540
atgaaataat ttctactgga tacacaggag cacctagagg tagaaaaaat tgcatggatt 3600
tgaatagttg cataagagaa aagttgaaag ttccaagagg tactcattat gagttgtgta 3660
ggagtgtaca tagtgaagct aatgcaataa taagcgcttc gagctcgaat tcgtaatcat 3720
ggtcatagct gtttcctgtg tgaaattgtt atccgctcac aattccacac aacatacgag 3780
ccggaagcat aaagtgtaaa gcctggggtg cctaatgagt gagctaactc acattaattg 3840
cgttgcgctc actgcccgct ttccagtcgg gaaacctgtc gtgccagctg cattaatgaa 3900
tcggccaacg cgcggggaga ggcggtttgc gtattgggcg ctcttccgct tcctcgctca 3960
ctgactcgct gcgctcggtc gttcggctgc ggcgagcggt atcagctcac tcaaaggcgg 4020
taatacggtt atccacagaa tcaggggata acgcaggaaa gaacatgtga gcaaaaggcc 4080
agcaaaaggc caggaaccgt aaaaaggccg cgttgctggc gtttttccat aggctccgcc 4140
cccctgacga gcatcacaaa aatcgacgct caagtcagag gtggcgaaac ccgacaggac 4200
tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct gttccgaccc 4260
tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg ctttctcata 4320
gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg ggctgtgtgc 4380
acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt cttgagtcca 4440
acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg attagcagag 4500
cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac ggctacacta 4560
gaagaacagt atttggtatc tgcgctctgc tgaagccagt taccttcgga aaaagagttg 4620
gtagctcttg atccggcaaa caaaccaccg ctggtagcgg tggttttttt gtttgcaagc 4680
agcagattac gcgcagaaaa aaaggatctc aagaagatcc tttgatcttt tctacggggt 4740
ctgacgctca gtggaacgaa aactcacgtt aagggatttt ggtcatgaga ttatcaaaaa 4800
ggatcttcac ctagatcctt ttaaattaaa aatgaagttt taaatcaatc taaagtatat 4860
atgagtaaac ttggtctgac agttaccaaa gctagcttaa tactagtata tacttaatgt 4920
gataagtgtc tgacagctga ccggtctaaa gaggtcccta gcgcctacgg ggaatttgta 4980
tcgataaggg gtacaaattc ccactaagcg ctcggccggg gatcgatccc cgggtacgta 5040
cccggcagtt tttctttttc ggcaagtgtt caagaagtta ttaagtcggg agtgcagtcg 5100
aagtgggcaa gttgaaaaat tcacaaaaat gtggtataat atctttgttc attagagcga 5160
taaacttgaa tttgagaggg aacttagatg gtatttgaaa aaattgataa aaatagttgg 5220
aacagaaaag agtattttga ccactacttt gcaagtgtac cttgtaccta cagcatgacc 5280
gttaaagtgg atatcacaca aataaaggaa aagggaatga aactatatcc tgcaatgctt 5340
tattatattg caatgattgt aaaccgccat tcagagttta ggacggcaat caatcaagat 5400
ggtgaattgg ggatatatga tgagatgata ccaagctata caatatttca caatgatact 5460
gaaacatttt ccagcctttg gactgagtgt aagtctgact ttaaatcatt tttagcagat 5520
tatgaaagtg atacgcaacg gtatggaaac aatcatagaa tggaaggaaa gccaaatgct 5580
ccggaaaaca tttttaatgt atctatgata ccgtggtcaa ccttcgatgg ctttaatctg 5640
aatttgcaga aaggatatga ttatttgatt cctattttta ctatggggaa atattataaa 5700
gaagataaca aaattatact tcctttggca attcaagttc atcacgcagt atgtgacgga 5760
tttcacattt gccgttttgt aaacgaattg caggaattga taaatagtta acttcaggtt 5820
tgtctgtaac taaaaactag tatttaacct agg 5853
<210> 34
<211> 4966
<212> DNA
<213> Artificial Sequence
<220>
<223> pGRNA-pNF2
<400> 34
agctcggtac ccggggatcc tctagagtcg acgtcacgcg tccatggaga tctcgaggcg 60
tgctagctta taatataata ttgtggaata ttgaaattaa ttttactaaa aactattcca 120
atcaaggcag caataataaa aatagtttta tttattggtt tagagtacgt gtttttattt 180
ttaagtttct ttttaatagc tgttagaaag aaacttaaaa ataaaatata tataataata 240
aatataataa aaatatatat aaatttagaa aaagataata tttgataaaa catatttaaa 300
gatataatca aatcctgtaa agatgaaaat gaaaataaaa ttcctaaaaa taagtttaat 360
aaagccggaa atttcactag aaatttcact agttttttta aaaacaataa atataataaa 420
gatattatta taaaacttaa atttaataat aacatcataa ggaaaatgga ttggtttact 480
tcaacggctt tattatatga aaaaaataca attaaattta aacataatat tacaaaaagt 540
attgttaaaa ttgaaaaaaa ttaaatcttt agtatctttg gaaataatca taatttataa 600
gctcctttga ttttttatat aaattataca ttaattattt aatatataaa aactaaaatt 660
agtaaatcat ctaaatatta caaatgaaga aatattattt gtaaataatt atagtaatat 720
attatggaat gaaatagtaa aaaattaaaa agaatcattt aatttagttt aaaatatttt 780
aataaaaata ataaattaaa acataaaaaa tagattaagt ttcaattggg gacttactct 840
attttttatt aagagtttaa gaatagttaa taattattga aaatatgata taattttctt 900
agttttacag aaggggaggt gaataagtga gaaacgcatt aaagctgaat aaaaagaact 960
atatagataa tattcatagt agaagtaaag ggtggataac caggagcgtt atagataaaa 1020
aaggatatag ccaatggcac tataaatacg ctgaattaaa agatttagat atgagtgatg 1080
aaaatatcta tataactcta aatacctttt ataagccgtg taggcgatta gaaaatataa 1140
aagagttaaa tacactgttt atagacttgg attattataa aactggcaaa actaaagacc 1200
aggtattaat ggacttagaa aagaattatt ttaatcaaag tattcctata ccaaactatg 1260
taatagatag tggaagagga atgtatttaa tatggataat aaatgcagta cctagtaaag 1320
cattaccatt atggaaagcg gttcaagaat atttatataa tcaattaaaa tactttggag 1380
cagatagaca agccttagat gcaacgagaa tattaagagt tccaggaagt ataaactcta 1440
aatcaaaaac agtagtcaat atattagatg agtacgaata tatttatgac ttaagagaaa 1500
tacaaaatgg atttttacct gaattaaaac catatgaaag gaaaaagggt agaccaagca 1560
aaataaatta tatttataga gaaagaagtt tatattatgg aagaatacaa gacataataa 1620
aactttgtga actaagagaa tatgatttaa aaggacacag agagcttata ttatttttat 1680
atagatatta tctttgtagc tttacagaag acattgagaa ggcattaaat gatgttttag 1740
aacttaatag tatgtttaga caacatttaa gtgaaagaga agttataaga gcaactagaa 1800
gtgctgaaag atgttattta gataaaaata agcaatataa gtataagaat gaaactctta 1860
tagagttatt agaaattaca gaagaagaac aaaaatatat gacaataata atttctaaaa 1920
aggaatataa gagaagagaa aatattagag gtaaaaaaaa ttatcaagag caattaaaag 1980
ctaaaggaaa agcaacaaaa aaagaggaat taaatgtatt aagaaaaaaa ataaaagccc 2040
ttaaagaaaa aggctttaaa aataaagaaa ttactctaat gttagaagta ccaataaaaa 2100
cattagaacg tcatattacg tatatgaaaa aaaatgggct tttataaagg ctcatttttt 2160
atattctttt cttcaaagat tatataatat aaaaaaattt ttttcaaact ttaaataaaa 2220
aatattttta tattttttta tttttttatt tttatatttt tttatttttt tatttttata 2280
tttttttatt tttatatttt tttattttta tattttttta tttttttatt tttttatttt 2340
tttatttttt tattttttta tttttttatt tttaccctca ttttttttac gcttgtatta 2400
tagggtactt tgtacctgtt cttttttttg gggaggttgt aaagataatt ttttacttta 2460
gttagttccg aaggaacttt tattatagtg cccctttatt tttttgcgtg gcattgaatg 2520
taaaaaatta tcactatact agggcgtaaa gtaatattac atgtgtctca aagtgggatt 2580
aaagcgggat tttatagggc gtgtttgtgg cttagagtgg gattattgga aatttttttg 2640
atcctaggtt aaatactagt ttttagttac agacaaacct gaagttaact atttatcaat 2700
tcctgcaatt cgtttacaaa acggcaaatg tgaaatccgt cacatactgc gtgatgaact 2760
tgaattgcca aaggaagtat aattttgtta tcttctttat aatatttccc catagtaaaa 2820
ataggaatca aataatcata tcctttctgc aaattcagat taaagccatc gaaggttgac 2880
cacggtatca tagatacatt aaaaatgttt tccggagcat ttggctttcc ttccattcta 2940
tgattgtttc cataccgttg cgtatcactt tcataatctg ctaaaaatga tttaaagtca 3000
gacttacact cagtccaaag gctggaaaat gtttcagtat cattgtgaaa tattgtatag 3060
cttggtatca tctcatcata tatccccaat tcaccatctt gattgattgc cgtcctaaac 3120
tctgaatggc ggtttacaat cattgcaata taataaagca ttgcaggata tagtttcatt 3180
cccttttcct ttatttgtgt gatatccact ttaacggtca tgctgtaggt acaaggtaca 3240
cttgcaaagt agtggtcaaa atactctttt ctgttccaac tatttttatc aattttttca 3300
aataccatct aagttccctc tcaaattcaa gtttatcgct ctaatgaaca aagatattat 3360
accacatttt tgtgaatttt tcaacttgcc cacttcgact gcactcccga cttaataact 3420
tcttgaacac ttgccgaaaa agaaaaactg ccgggtacgt acccggggat cgatccccgg 3480
ccgagcgctt agtgggaatt tgtacccctt atcgatacaa attccccgta ggcgctaggg 3540
acctctttag accggtcagc tgtcagacac ttatcacatt aagtatatac tagtattaag 3600
ctagctttgg taactgtcag accaagttta ctcatatata ctttagattg atttaaaact 3660
tcatttttaa tttaaaagga tctaggtgaa gatccttttt gataatctca tgaccaaaat 3720
cccttaacgt gagttttcgt tccactgagc gtcagacccc gtagaaaaga tcaaaggatc 3780
ttcttgagat cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct 3840
accagcggtg gtttgtttgc cggatcaaga gctaccaact ctttttccga aggtaactgg 3900
cttcagcaga gcgcagatac caaatactgt tcttctagtg tagccgtagt taggccacca 3960
cttcaagaac tctgtagcac cgcctacata cctcgctctg ctaatcctgt taccagtggc 4020
tgctgccagt ggcgataagt cgtgtcttac cgggttggac tcaagacgat agttaccgga 4080
taaggcgcag cggtcgggct gaacgggggg ttcgtgcaca cagcccagct tggagcgaac 4140
gacctacacc gaactgagat acctacagcg tgagctatga gaaagcgcca cgcttcccga 4200
agggagaaag gcggacaggt atccggtaag cggcagggtc ggaacaggag agcgcacgag 4260
ggagcttcca gggggaaacg cctggtatct ttatagtcct gtcgggtttc gccacctctg 4320
acttgagcgt cgatttttgt gatgctcgtc aggggggcgg agcctatgga aaaacgccag 4380
caacgcggcc tttttacggt tcctggcctt ttgctggcct tttgctcaca tgttctttcc 4440
tgcgttatcc cctgattctg tggataaccg tattaccgcc tttgagtgag ctgataccgc 4500
tcgccgcagc cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg aagagcgccc 4560
aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat taatgcagct ggcacgacag 4620
gtttcccgac tggaaagcgg gcagtgagcg caacgcaatt aatgtgagtt agctcactca 4680
ttaggcaccc caggctttac actttatgct tccggctcgt atgttgtgtg gaattgtgag 4740
cggataacaa tttcacacag gaaacagcta tgaccatgat tacgaattcg agctcactct 4800
atcattgata gagtttgaaa ctctatcatt gatagagtat aatatctttg ttcatttaag 4860
ccatctacta aacaagtttt agagctagaa atagcaagtt aaaataaggc tagtccgtta 4920
tcaacttgaa aaagtggcac cgagtcggtg ctttttttga agcttg 4966
<210> 35
<211> 400
<212> DNA
<213> Artificial Sequence
<220>
<223> Fragment upstream of the catB gene
<400> 35
gtctttacac ttttgcccat taatttttga gttccttatt tttagggagc ttttattatt 60
tttatcatga aaatttcata aaatactcat aaactaagga tgtcttcata atcagattag 120
tactccattt tcaatccatt taatctggga atatgatatt ttaattacgt attatttaag 180
atatattaac gtgtaatata ataccccgca aatattaatt atcacataca tatcccccct 240
ttattggggc attttttgta cccattattt tagtattgtg cagtacttaa ataaaaaaat 300
gccgcaaatt catttttatt gaataatgcg gtatttcttc tattctttat ttttattact 360
ctataaataa tgtaatcaag acatgactat ctaaatatat 400
<210> 36
<211> 400
<212> DNA
<213> Artificial Sequence
<220>
<223> Fragment downstream of the catB gene
<400> 36
aattcataat tcgggcctcc taaaaatttt cgtaattcta ttttagaagg cttttttccg 60
tgacctagcc atttcaatct cctttttaca atgatattta cgctttagtt tattatagca 120
cattctgtaa taccgaacta ttcaattttc agagaccatt ttttattgat tcataactta 180
agaatactac gaattactct aatattttac tttttcttat ctcttgttat tttaacatcg 240
gaattactac taatattaat ttttattttt ccatccgcat ttgctccaac atttttttaa 300
ctatactttc cttttgttaa taaattatgt tattgttgaa caatataaga aaagtgcgta 360
acatttttta ttaaaaataa ttaggtattt ctatctgtgg 400
<210> 37
<211> 218
<212> PRT
<213> Clostridium beijerinckii
<400> 37
Met Asn Phe Asn Leu Ile Asp Ile Asn His Trp Ser Arg Lys Pro Tyr
1 5 10 15
Phe Glu His Tyr Leu Asn Asn Val Lys Cys Thr Tyr Ser Met Thr Ala
20 25 30
Asn Ile Glu Ile Thr Asp Leu Leu Tyr Glu Ile Lys Leu Lys Asn Ile
35 40 45
Lys Phe Tyr Pro Thr Leu Ile Tyr Met Ile Ala Thr Val Val Asn Lys
50 55 60
His Lys Glu Phe Arg Ile Cys Asp His Glu Gly Ser Leu Gly Tyr Trp
65 70 75 80
Asp Ser Met Asn Pro Ser Tyr Thr Ile Phe His Lys Glu Asn Glu Thr
85 90 95
Phe Ser Ser Ile Trp Thr Glu Tyr Asn Lys Ser Phe Leu Arg Phe Tyr
100 105 110
Ser Asp Tyr Leu Asp Asp Ile Lys Asn Tyr Gly Asn Ile Met Lys Phe
115 120 125
Thr Pro Lys Ser Asn Glu Pro Asp Asn Thr Phe Ser Val Ser Ser Ile
130 135 140
Pro Trp Val Ser Phe Thr Gly Phe Asn Leu Asn Val Tyr Asn Glu Gly
145 150 155 160
Thr Tyr Leu Ile Pro Ile Phe Thr Ala Gly Lys Tyr Phe Lys Gln Glu
165 170 175
Asn Lys Ile Phe Ile Pro Ile Ser Ile Gln Val His His Ala Ile Cys
180 185 190
Asp Gly Tyr His Ala Ser Arg Phe Ile Asn Glu Met Gln Glu Leu Ala
195 200 205
Phe Ser Phe Gln Glu Trp Leu Glu Asn Lys
210 215
<210> 38
<211> 9113
<212> DNA
<213> Artificial Sequence
<220>
<223> pCas9ind-gRNA_catB
<400> 38
catggataaa aagtacagta ttggtctaga cataggaact aactctgttg ggtgggctgt 60
tataacagat gaatataaag ttccatcaaa aaaatttaaa gtattaggaa acactgatag 120
acattcaata aaaaaaaact tgataggtgc tttattattc gattcaggag agactgctga 180
agctacacgt ttaaaaagaa cagctagacg tagatataca agaagaaaaa ataggatatg 240
ttatcttcaa gaaattttta gtaatgaaat ggcaaaagtt gatgattcat tctttcacag 300
actagaagaa agtttcttag ttgaagaaga taagaagcat gaaagacacc ctatttttgg 360
taatatcgta gatgaagtag catatcatga gaagtatcca actatctatc atttaagaaa 420
gaaattagtt gattctacag ataaagctga tctgagatta atatatttag ctttagctca 480
tatgattaaa tttagaggac attttttaat agaaggtgat ttaaacccag acaacagcga 540
tgtagataaa ttatttatcc aattagttca aacttataat caattattcg aagagaatcc 600
aattaatgca agtggtgtag acgctaaggc tatattatca gctagattat caaaatctag 660
aagattagaa aatctaatag ctcaacttcc tggagaaaag aaaaatggac tttttgggaa 720
cctaatagct ctctcactcg gactaacacc aaattttaaa agcaattttg atcttgctga 780
agacgcaaag ttacaactat caaaggatac atacgatgat gatttagata atttgttagc 840
tcaaataggt gatcaatatg ctgatttgtt tcttgcagca aaaaacttaa gtgatgcaat 900
tttactatca gatatactta gagtaaatac agaaataaca aaggctcctt tatcagcaag 960
tatgattaaa cgatatgatg agcatcatca agatttaaca ttattaaagg cacttgtaag 1020
acaacaatta ccagaaaaat ataaagaaat tttctttgat caatctaaaa atggatatgc 1080
tggatatata gacggtggag caagtcaaga agagttttat aaatttataa agcctatttt 1140
agaaaaaatg gatggaactg aagaattact tgttaaactt aacagagaag atttacttag 1200
aaaacaaaga acttttgata atggttcaat tcctcaccaa attcatttag gagaattaca 1260
tgctatacta agaagacaag aagattttta tccatttctt aaagataata gagaaaaaat 1320
tgaaaaaatt ttaactttta gaataccata ttatgtagga ccacttgcaa ggggaaattc 1380
aagatttgca tggatgacta gaaaatcaga agaaactata accccgtgga attttgaaga 1440
agtagtagat aaaggagcta gtgctcaatc atttatagaa agaatgacaa attttgataa 1500
gaatcttcct aacgaaaagg ttttgccaaa gcatagcctt ctttatgagt attttacagt 1560
ttataatgag cttactaaag taaaatacgt tacagaagga atgagaaaac cagcattttt 1620
gtctggtgaa caaaagaaag caatagtaga cctattattt aaaacaaata ggaaggttac 1680
cgtaaagcaa cttaaagaag attacttcaa aaaaattgaa tgctttgata gtgttgaaat 1740
atcaggagtt gaagatagat ttaatgcttc acttggtaca tatcacgatc tcttaaaaat 1800
tataaaagat aaggattttt tagataatga agaaaatgaa gatattcttg aagatatagt 1860
attaacattg acactttttg aagatagaga aatgatagaa gaaagattaa aaacatatgc 1920
acatcttttt gatgataagg ttatgaagca acttaaaaga agaagatata caggttgggg 1980
acgtttgtca agaaagctaa ttaatggtat tagagataaa caatcaggaa agactattct 2040
cgattttctt aaatcagatg gatttgctaa tagaaacttt atgcaattaa ttcatgatga 2100
ttctcttact ttcaaagagg atattcaaaa ggctcaagtt tctggacaag gcgatagctt 2160
acacgaacac attgctaacc ttgcagggag ccccgctatc aaaaaaggaa ttttacaaac 2220
agttaaagtt gtagatgaac ttgttaaagt tatgggaaga cacaaacctg agaatatagt 2280
tatagaaatg gccagagaaa atcaaacaac acaaaaagga caaaaaaatt ctagagagag 2340
aatgaagaga attgaagaag gaataaaaga gctaggatca caaatattaa aagaacatcc 2400
agttgaaaat actcaattgc aaaatgaaaa gttatatttg tattacttac aaaatggaag 2460
agatatgtat gttgatcaag aactcgatat taatagatta agtgactatg atgttgatca 2520
tattgttcct caatcatttt taaaagatga ttcaatcgat aacaaagtat taactagatc 2580
agataaaaat agaggaaagt cagataatgt accatctgaa gaagttgtta aaaaaatgaa 2640
gaactattgg agacaacttt taaatgcaaa gctaattaca caaagaaaat ttgacaattt 2700
aacaaaagca gaaagaggag gattaagcga attagacaaa gctggattta taaaaagaca 2760
acttgttgag acaagacaaa taactaagca tgttgctcaa atacttgatt caagaatgaa 2820
tacaaaatat gatgaaaatg ataaattaat cagagaagta aaagtaataa cattaaagtc 2880
aaaattagta tcagatttca gaaaggattt tcaattttac aaagttcgtg aaataaataa 2940
ctatcatcat gctcatgatg catacttaaa tgctgttgta ggaactgctc ttattaagaa 3000
atatcctaaa ctagaaagcg aatttgttta tggagattat aaagtttatg atgtgcgcaa 3060
aatgatcgcg aaatccgaac aagaaatcgg taaggctaca gcaaaatatt tcttttatag 3120
taatataatg aattttttta agacagaaat aactttggct aatggtgaaa tcagaaaaag 3180
accacttatc gaaacaaatg gagagacagg agaaatagta tgggataaag gaagagattt 3240
tgctactgtt agaaaagtac taagtatgcc acaagtaaat atcgtaaaga aaactgaagt 3300
tcaaactgga ggtttctcta aggaatcaat tttacctaag agaaattcag ataagttaat 3360
tgcaaggaaa aaagattggg acccaaaaaa atacggtggt tttgatagtc caacagttgc 3420
ctatagtgtt cttgtagtag cgaaagttga gaaaggtaag tcaaaaaagt tgaaaagcgt 3480
aaaagaactt cttggtatca caattatgga aagatcttca tttgaaaaaa atccaattga 3540
ctttttagaa gctaagggtt ataaagaagt taaaaaggat ttaatcataa aactaccaaa 3600
gtatagtcta tttgaactcg aaaacggaag aaaacgaatg ctcgctagcg caggagaact 3660
tcaaaaagga aatgaacttg cgctgccatc aaagtatgta aatttcttat atttagcttc 3720
tcattatgag aaattaaaag gatcaccaga ggataatgaa caaaagcaac tatttgtaga 3780
acaacacaaa cattatttag atgaaataat agaacaaata tctgaatttt ctaaaagagt 3840
tatacttgcc gacgcaaatc tagataaggt gctttcagcg tataataaac acagagataa 3900
accaataaga gaacaagcag aaaacattat ccatcttttt acattaacta atcttggtgc 3960
accagctgca tttaagtact ttgatacaac aatagataga aaaagataca catctactaa 4020
agaagtatta gacgcaactt taatacatca atctattaca gggctttatg aaacaagaat 4080
tgatttaagt caactaggcg gagattaagt cgacaaagta ttgttaaaaa taactctgta 4140
gaattataaa ttagttctac agagttattt tttgacccgg gtatattgat aaaaataata 4200
atagtgggta taattaagtt gttaggaggt tagttagaat gatgtcaaga ttagataaaa 4260
gtaaagtgat taacagcgca ttagagctgc ttaatgaggt cggaatcgaa ggtttaacaa 4320
cccgtaaact cgcccagaag ctaggtgtag agcagcctac attgtattgg catgtaaaaa 4380
ataagcgggc tttgctcgac gccttagcca ttgagatgtt agataggcac catactcact 4440
tttgcccttt agaaggggaa agctggcaag attttttacg taataacgct aaaagtttta 4500
gatgtgcttt actaagtcat cgcgatggag caaaagtaca tttaggtaca cggcctacag 4560
aaaaacagta tgaaactctc gaaaatcaat tagccttttt atgccaacaa ggtttttcac 4620
tagagaatgc attatatgca ctcagcgctg tggggcattt tactttaggt tgcgtattgg 4680
aagatcaaga gcatcaagtc gctaaagaag aaagggaaac acctactact gatagtatgc 4740
cgccattatt acgacaagct atcgaattat ttgatcacca aggtgcagag ccagccttct 4800
tattcggcct tgaattgatc atatgcggat tagaaaaaca acttaaatgt gaaagtgggt 4860
cttaaaagca gcataacctt tttccgtgat ggtaacttca cggtaaccaa gatgtcgagt 4920
tgagctcgaa ttcgtaatca tggtcatagc tgtttcctgt gtgaaattgt tatccgctca 4980
caattccaca caacatacga gccggaagca taaagtgtaa agcctggggt gcctaatgag 5040
tgagctaact cacattaatt gcgttgcgct cactgcccgc tttccagtcg ggaaacctgt 5100
cgtgccagct gcattaatga atcggccaac gcgcggggag aggcggtttg cgtattgggc 5160
gctcttccgc ttcctcgctc actgactcgc tgcgctcggt cgttcggctg cggcgagcgg 5220
tatcagctca ctcaaaggcg gtaatacggt tatccacaga atcaggggat aacgcaggaa 5280
agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg 5340
cgtttttcca taggctccgc ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga 5400
ggtggcgaaa cccgacagga ctataaagat accaggcgtt tccccctgga agctccctcg 5460
tgcgctctcc tgttccgacc ctgccgctta ccggatacct gtccgccttt ctcccttcgg 5520
gaagcgtggc gctttctcat agctcacgct gtaggtatct cagttcggtg taggtcgttc 5580
gctccaagct gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc gccttatccg 5640
gtaactatcg tcttgagtcc aacccggtaa gacacgactt atcgccactg gcagcagcca 5700
ctggtaacag gattagcaga gcgaggtatg taggcggtgc tacagagttc ttgaagtggt 5760
ggcctaacta cggctacact agaaggacag tatttggtat ctgcgctctg ctgaagccag 5820
ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg 5880
gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc 5940
ctttgatctt ttctacgggg tctgacgctc agtggaacga aaactcacgt taagggattt 6000
tggtcatgag attatcaaaa aggatcttca cctagatcct tttaaattaa aaatgaagtt 6060
ttaaatcaat ctaaagtata tatgagtaaa cttggtctga cagttaccag gtccactgcc 6120
gggcctcttg cgggatcaaa agaaaaacga aatgatacac caatcagtgc aaaaaaagat 6180
ataatgggag ataagacggt tcgtgttcgt gctgacttgc accatatcat aaaaatcgaa 6240
acagcaaaga atggcggaaa cgtaaaagaa gttatggaaa taagacttag aagcaaactt 6300
aagagtgtgt tgatagtgca gtatcttaaa attttgtata ataggaattg aagttaaatt 6360
agatgctaaa aatttgtaat taagaaggag tgattacatg aacaaaaata taaaatattc 6420
tcaaaacttt ttaacgagtg aaaaagtact caaccaaata ataaaacaat tgaatttaaa 6480
agaaaccgat accgtttacg aaattggaac aggtaaaggg catttaacga cgaaactggc 6540
taaaataagt aaacaggtaa cgtctattga attagacagt catctattca acttatcgtc 6600
agaaaaatta aaactgaata ctcgtgtcac tttaattcac caagatattc tacagtttca 6660
attccctaac aaacagaggt ataaaattgt tgggagtatt ccttaccatt taagcacaca 6720
aattattaaa aaagtggttt ttgaaagcca tgcgtctgac atctatctga ttgttgaaga 6780
aggattctac aagcgtacct tggatattca ccgaacacta gggttgctct tgcacactca 6840
agtctcgatt cagcaattgc ttaagctgcc agcggaatgc tttcatccta aaccaaaagt 6900
aaacagtgtc ttaataaaac ttacccgcca taccacagat gttccagata aatattggaa 6960
gctatatacg tactttgttt caaaatgggt caatcgagaa tatcgtcaac tgtttactaa 7020
aaatcagttt catcaagcaa tgaaacacgc caaagtaaac aatttaagta ccgttactta 7080
tgagcaagta ttgtctattt ttaatagtta tctattattt aacgggagga aataattcta 7140
tgagtcccta ggcaggcctc cgccattatt tttttgaaca attgacaatt catttcttat 7200
tttttattaa gtgatagtca aaaggcataa cagtgctgaa tagaaagaaa tttacagaaa 7260
agaaaattat agaatttagt atgattaatt atactcattt atgaatgttt aattgaatac 7320
aaaaaaaaat acttgttatg tattcaatta cgggttaaaa tatagacaag ttgaaaaatt 7380
taataaaaaa ataagtcctc agctcttata tattaagcta ccaacttagt atataagcca 7440
aaacttaaat gtgctaccaa cacatcaagc cgttagagaa ctctatctat agcaatattt 7500
caaatgtacc gacatacaag agaaacatta actatatata ttcaatttat gagattatct 7560
taacagatat aaatgtaaat tgcaataagt aagatttaga agtttatagc ctttgtgtat 7620
tggaagcagt acgcaaaggc ttttttattt gataaaaatt agaagtatat ttattttttc 7680
ataattaatt tatgaaaatg aaagggggtg agcaaagtga cagaggaaag cagtatctta 7740
tcaaataaca aggtattagc aatatcatta ttgactttag cagtaaacat tatgactttt 7800
atagtgcttg tagctaagta gtacgaaagg gggagcttta aaaagctcct tggaatacat 7860
agaattcata aattaattta tgaaaagaag ggcgtatatg aaaacttgta aaaattgcaa 7920
agagtttatt aaagatactg aaatatgcaa aatacattcg ttgatgattc atgataaaac 7980
agtagcaacc tattgcagta aatacaatga gtcaagatgt ttacataaag ggaaagtcca 8040
atgtattaat tgttcaaaga tgaaccgata tggatggtgt gccataaaaa tgagatgttt 8100
tacagaggaa gaacagaaaa aagaacgtac atgcattaaa tattatgcaa ggagctttaa 8160
aaaagctcat gtaaagaaga gtaaaaagaa aaaataattt atttattaat ttaatattga 8220
gagtgccgac acagtatgca ctaaaaaata tatctgtggt gtagtgagcc gatacaaaag 8280
gatagtcact cgcattttca taatacatct tatgttatga ttatgtgtcg gtgggacttc 8340
acgacgaaaa cccacaataa aaaaagagtt cggggtaggg ttaagcatag ttgaggcaac 8400
taaacaatca agctaggata tgcagtagca gaccgtaagg tcgttgttta ggtgtgttgt 8460
aatacatacg ctattaagat gtaaaaatac ggataccaat gaagggaaaa gtataatttt 8520
tggatgtagt ttgtttgttc atctatgggc aaactacgtc caaagccgtt tccaaatctg 8580
ctaaaaagta tatcctttct aaaatcaaag tcaagtatga aatcataaat aaagtttaat 8640
tttgaagtta ttatgatatt atgtttttct attaaaataa attaagtata tagaatagtt 8700
taataatagt atatacttaa tgtgataagt gtctgacagt gtcacagaaa ggatgattgt 8760
tatggattat aagcggctcg aggacgtcaa accatgttaa tcattgcttt tatcaaaaat 8820
aggatccact ctatcattga tagagtttga aactctatca ttgatagagt ataatatctt 8880
tgttcatgta catcatgcta tctgtgagtt ttagagctag aaatagcaag ttaaaataag 8940
gctagtccgt tatcaacttg aaaaagtggc accgagtcgg tgcttttttt gaagcttgtc 9000
tttacacttt tgcccctcga gtccctatca gtgatagatt gaaactctat cattgataga 9060
gtataatatc tttgttcatt agagcgataa acttgaattt gagagggaac ttc 9113
<210> 39
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer pNF2
<400> 39
gggcgcactt atacaccacc 20
<210> 40
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer pNF2
<400> 40
tgctacgcac cccctaaagg 20
<210> 41
<211> 50
<212> DNA
<213> Artificial Sequence
<220>
<223> DeltacatB_gRNA_rev
<400> 41
aatctatcac tgatagggac tcgaggggca aaagtgtaaa gacaagcttc 50
<210> 42
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer pCas9ind_fwd
<400> 42
agctcttgat ccggcaaaca 20
<210> 43
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer pCas9ind_rev
<400> 43
gcaaccctag tgttcggtga 20
<210> 44
<211> 219
<212> PRT
<213> Clostridium butyricum
<400> 44
Met Asn Phe Asn Leu Ile Asp Ile Asn His Trp Ser Arg Lys Pro Tyr
1 5 10 15
Phe Glu His Tyr Leu Asn Asn Val Lys Cys Thr Tyr Ser Met Thr Ala
20 25 30
Asn Ile Glu Ile Thr Asp Leu Leu Tyr Glu Ile Lys Leu Lys Asn Ile
35 40 45
Lys Phe Tyr Pro Thr Leu Ile Tyr Met Ile Ala Thr Val Val Asn Asn
50 55 60
His Lys Glu Phe Arg Ile Cys Phe Asp His Lys Gly Ser Leu Gly Tyr
65 70 75 80
Trp Asp Ser Met Asn Pro Ser Tyr Thr Ile Phe His Lys Glu Asn Glu
85 90 95
Thr Phe Ser Ser Ile Trp Thr Glu Tyr Asn Lys Ser Phe Leu Arg Phe
100 105 110
Tyr Ser Asp Tyr Leu Asp Asp Ile Lys Asn Tyr Gly Asn Ile Met Lys
115 120 125
Phe Thr Pro Lys Ser Asn Glu Pro Asp Asn Thr Phe Ser Val Ser Ser
130 135 140
Ile Pro Trp Val Ser Phe Thr Gly Phe Asn Leu Asn Val Tyr Asn Glu
145 150 155 160
Gly Thr Tyr Leu Ile Pro Ile Phe Thr Ala Gly Lys Tyr Phe Lys Gln
165 170 175
Glu Asn Lys Ile Phe Ile Pro Ile Ser Ile Gln Val His His Ala Ile
180 185 190
Cys Asp Gly Tyr His Ala Ser Arg Phe Ile Asn Glu Met Gln Glu Leu
195 200 205
Ala Phe Ser Phe Gln Glu Trp Leu Glu Asn Lys
210 215
<210> 45
<211> 219
<212> PRT
<213> Clostridium beijerinckii
<400> 45
Met Asn Phe Asn Leu Ile Asp Ile Asn His Trp Ile Arg Lys Pro Tyr
1 5 10 15
Phe Glu His Tyr Leu Asn Asn Val Lys Cys Thr Tyr Ser Met Thr Ala
20 25 30
Asn Ile Glu Ile Thr Asp Leu Leu Tyr Glu Ile Lys Leu Lys Asn Ile
35 40 45
Lys Phe Tyr Pro Thr Leu Ile Tyr Met Ile Ala Thr Val Val Asn Asn
50 55 60
His Lys Glu Phe Arg Ile Cys Phe Asp His Lys Gly Ser Leu Gly Tyr
65 70 75 80
Trp Asp Ser Met Asn Pro Ser Tyr Thr Ile Phe His Lys Glu Asn Glu
85 90 95
Thr Phe Ser Ser Ile Trp Thr Glu Tyr Asn Lys Ser Phe Leu Arg Phe
100 105 110
Tyr Ser Asp Tyr Leu Asp Asp Ile Lys Asn Tyr Gly Asn Ile Met Lys
115 120 125
Phe Thr Pro Lys Ser Asn Glu Pro Asp Asn Thr Phe Ser Val Ser Ser
130 135 140
Ile Pro Trp Val Ser Phe Thr Gly Phe Asn Leu Asn Val Tyr Asn Glu
145 150 155 160
Gly Thr Tyr Leu Ile Pro Ile Phe Thr Ala Gly Lys Tyr Phe Lys Gln
165 170 175
Glu Asn Lys Ile Phe Ile Pro Ile Ser Ile Gln Val His His Ala Ile
180 185 190
Cys Asp Gly Tyr His Ala Ser Arg Phe Ile Asn Glu Met Gln Glu Leu
195 200 205
Ala Phe Ser Phe Gln Glu Trp Leu Glu Asn Lys
210 215
<210> 46
<211> 219
<212> PRT
<213> Clostridium beijerinckii
<400> 46
Met Asn Phe Asn Leu Ile Asp Ile Asn His Trp Ser Arg Lys Pro Tyr
1 5 10 15
Phe Glu His Tyr Leu Asn Asn Val Lys Cys Thr Tyr Ser Met Thr Ala
20 25 30
Asn Ile Glu Ile Thr Asp Leu Leu Tyr Glu Ile Lys Leu Lys Asn Ile
35 40 45
Lys Phe Tyr Pro Thr Leu Ile Tyr Met Ile Ala Thr Val Val Asn Asn
50 55 60
His Glu Glu Phe Arg Ile Cys Phe Asp His Glu Gly Ser Leu Gly Tyr
65 70 75 80
Trp Asp Ser Met Asn Pro Ser Tyr Thr Ile Phe His Lys Glu Asn Glu
85 90 95
Thr Phe Ser Ser Ile Trp Thr Glu Tyr Asn Lys Ser Phe Leu Arg Phe
100 105 110
Tyr Ser Asp Tyr Leu Asp Asp Ile Lys Asn Tyr Gly Asn Ile Met Lys
115 120 125
Phe Thr Pro Lys Ser Asn Glu Pro Asp Asn Thr Phe Ser Val Ser Ser
130 135 140
Ile Pro Trp Val Ser Phe Thr Gly Phe Asn Leu Asn Val Tyr Asn Glu
145 150 155 160
Gly Thr Tyr Leu Ile Pro Ile Phe Thr Ala Gly Lys Tyr Phe Lys Gln
165 170 175
Gly Asn Lys Val Phe Ile Pro Ile Ser Ile Gln Val His His Ala Ile
180 185 190
Cys Asp Gly Tyr His Ala Ser Arg Phe Ile Asn Glu Met Gln Glu Leu
195 200 205
Ala Phe Ser Phe Gln Glu Trp Leu Glu Asn Lys
210 215
<210> 47
<211> 219
<212> PRT
<213> Clostridium beijerinckii
<400> 47
Met Asn Phe Asn Leu Ile Asp Ile Asn His Trp Ser Arg Lys Pro Tyr
1 5 10 15
Phe Glu His Tyr Leu Asn Asn Val Lys Cys Thr Tyr Ser Met Thr Ala
20 25 30
Asn Ile Glu Ile Thr Asp Leu Leu Tyr Glu Ile Lys Leu Lys Asn Ile
35 40 45
Lys Phe Tyr Pro Thr Leu Ile Tyr Met Ile Ala Thr Val Val Asn Asn
50 55 60
His Lys Glu Phe Ser Ile Cys Phe Asp His Glu Gly Ser Leu Gly Tyr
65 70 75 80
Trp Asp Ser Met Asn Pro Ser Tyr Thr Ile Phe His Lys Glu Asn Glu
85 90 95
Thr Phe Ser Ser Ile Trp Thr Glu Tyr Asn Glu Ser Phe Leu Arg Phe
100 105 110
Tyr Ser Asp Tyr Leu Asp Asp Ile Lys Asn Tyr Gly Asn Ile Met Lys
115 120 125
Phe Thr Pro Lys Ser Asn Glu Pro Asp Asn Thr Phe Ser Val Ser Ser
130 135 140
Ile Pro Trp Val Ser Phe Thr Gly Phe Asn Leu Asn Val Tyr Asn Glu
145 150 155 160
Gly Thr Tyr Leu Ile Pro Ile Phe Thr Ala Gly Lys Tyr Phe Lys Gln
165 170 175
Gly Asn Lys Ile Phe Ile Pro Ile Ser Ile Gln Val His His Ala Ile
180 185 190
Cys Asp Gly Tyr His Ala Ser Arg Phe Ile Asn Glu Met Gln Glu Leu
195 200 205
Ala Phe Ser Phe Gln Glu Trp Leu Glu Asn Lys
210 215
<210> 48
<211> 219
<212> PRT
<213> Artificial Sequence
<220>
<223> Clostridium sp. 2-1
<400> 48
Met Asn Phe Asn Leu Ile Asp Ile Asn His Trp Ser Arg Lys Pro Tyr
1 5 10 15
Phe Glu His Tyr Leu Asn Asn Val Lys Cys Thr Tyr Ser Met Thr Ala
20 25 30
Asn Ile Glu Ile Thr Asp Leu Leu Tyr Glu Ile Lys Leu Lys Asn Ile
35 40 45
Lys Phe Tyr Pro Thr Leu Ile Tyr Met Ile Ala Asn Val Val Asn Asn
50 55 60
His Lys Glu Phe Arg Ile Cys Phe Asp His Asn Gly Ser Leu Gly Tyr
65 70 75 80
Trp Asp Ser Met Asn Pro Ser Tyr Thr Ile Phe His Lys Glu Asn Glu
85 90 95
Thr Phe Ser Ser Ile Trp Thr Glu Tyr Asn Glu Ser Phe Leu Arg Phe
100 105 110
Tyr Ser Asp Tyr Leu Asp Asp Ile Lys Asn Tyr Gly Asn Ile Met Lys
115 120 125
Phe Thr Pro Lys Ser Asn Gln Pro Asp Asn Thr Phe Ser Val Ser Ser
130 135 140
Ile Pro Trp Val Ser Phe Thr Gly Phe Asn Leu Asn Val Tyr Asn Glu
145 150 155 160
Gly Thr Tyr Leu Ile Pro Ile Phe Thr Ala Gly Lys Tyr Phe Lys Gln
165 170 175
Glu Asn Lys Ile Phe Ile Pro Ile Ser Ile Gln Val His His Ala Ile
180 185 190
Cys Asp Gly Tyr His Ala Ser Arg Phe Ile Asn Glu Met Gln Glu Leu
195 200 205
Ala Phe Ser Phe Gln Glu Trp Leu Glu Asn Lys
210 215
<210> 49
<211> 219
<212> PRT
<213> Artificial Sequence
<220>
<223> Clostridium diolis
<400> 49
Met Asn Phe Asn Leu Ile Asp Ile Asn His Trp Ser Arg Lys Pro Tyr
1 5 10 15
Phe Glu His Tyr Leu Asn Asn Val Lys Cys Thr Tyr Ser Met Thr Ala
20 25 30
Asn Ile Glu Ile Thr Asp Leu Leu Tyr Glu Ile Lys Leu Lys Asn Ile
35 40 45
Lys Phe Tyr Pro Thr Leu Ile Tyr Met Ile Ala Asn Val Val Asn Asn
50 55 60
His Lys Glu Phe Arg Ile Cys Phe Asp His Asn Gly Ser Leu Gly Tyr
65 70 75 80
Trp Asp Ser Met Asn Pro Ser Tyr Thr Val Phe His Lys Glu Asn Glu
85 90 95
Thr Phe Ser Ser Ile Trp Thr Glu Tyr Asn Glu Ser Phe Leu Arg Phe
100 105 110
Tyr Ser Asp Tyr Leu Asp Asp Ile Lys Asn Tyr Gly Asn Ile Met Lys
115 120 125
Phe Thr Pro Lys Ser Asn Glu Pro Asp Asn Thr Phe Ser Val Ser Ser
130 135 140
Ile Pro Trp Val Ser Phe Thr Gly Phe Asn Leu Asn Val Tyr Asn Glu
145 150 155 160
Gly Thr Tyr Leu Ile Pro Ile Phe Thr Ala Gly Lys Tyr Phe Lys Gln
165 170 175
Gly Asn Lys Ile Phe Ile Pro Ile Ser Ile Gln Val His His Ala Ile
180 185 190
Cys Asp Gly Tyr His Ala Ser Arg Phe Ile Asn Glu Met Gln Glu Leu
195 200 205
Ala Phe Ser Phe Gln Glu Trp Leu Glu Asn Lys
210 215
<210> 50
<211> 219
<212> PRT
<213> Clostridium beijerinckii
<400> 50
Met Asn Phe Asn Leu Ile Asp Ile Asn His Trp Ile Arg Lys Pro Tyr
1 5 10 15
Phe Glu His Tyr Leu Asn Asn Val Lys Cys Thr Tyr Ser Met Thr Ala
20 25 30
Asn Ile Glu Ile Thr Asp Leu Leu Tyr Glu Ile Lys Leu Lys Asn Ile
35 40 45
Lys Phe Tyr Pro Thr Leu Ile Tyr Met Ile Ala Thr Val Val Asn Asn
50 55 60
His Lys Glu Phe Arg Ile Cys Phe Asp His Lys Gly Ser Leu Gly Tyr
65 70 75 80
Trp Asp Ser Met Asn Pro Ser Tyr Thr Ile Phe His Lys Glu Asn Glu
85 90 95
Thr Phe Ser Ser Ile Trp Thr Glu Tyr Asn Lys Ser Phe Leu Arg Phe
100 105 110
Tyr Ser Asp Tyr Leu Asp Asp Ile Lys Asn Tyr Gly Asn Ile Met Lys
115 120 125
Phe Thr Pro Lys Ser Asn Gln Pro Asp Asn Thr Phe Ser Val Ser Ser
130 135 140
Ile Pro Trp Val Ser Phe Thr Gly Phe Asn Leu Asn Val Asn Asn Glu
145 150 155 160
Gly Thr Tyr Leu Ile Pro Ile Phe Thr Ala Gly Lys Tyr Phe Lys Gln
165 170 175
Glu Asn Lys Ile Phe Ile Pro Ile Ser Ile Gln Val His His Ala Ile
180 185 190
Cys Asp Gly Tyr His Ala Ser Arg Phe Ile Asn Glu Met Gln Glu Leu
195 200 205
Ala Phe Ser Phe Lys Glu Trp Leu Glu Asn Lys
210 215
<210> 51
<211> 221
<212> PRT
<213> Clostridium beijerinckii
<400> 51
Met Asn Phe Asn Leu Ile Asp Ile Asn Asn Trp Ser Arg Lys Pro Tyr
1 5 10 15
Phe Glu His Tyr Leu Asn Asn Val Lys Cys Thr Tyr Ser Met Thr Ala
20 25 30
Asn Ile Glu Ile Thr Asp Leu Leu Tyr Glu Ile Lys Leu Lys Asn Ile
35 40 45
Lys Phe Tyr Pro Thr Leu Ile Tyr Met Ile Ala Asn Val Val Asn Asn
50 55 60
His Lys Glu Phe Arg Ile Cys Phe Asp His Glu Gly Ser Leu Gly Tyr
65 70 75 80
Trp Asp Ser Met Asn Pro Ser Tyr Thr Ile Phe His Lys Glu Asn Glu
85 90 95
Thr Phe Ser Ser Ile Trp Thr Glu Tyr Asn Glu Ser Phe Leu Arg Phe
100 105 110
Tyr Ser Asp Tyr Leu Asp Asp Ile Lys Asn Tyr Gly Asn Ile Met Lys
115 120 125
Phe Thr Pro Lys Ser Asn Glu Pro Asp Asn Thr Phe Pro Val Ser Ser
130 135 140
Ile Pro Trp Val Ser Phe Thr Gly Phe Asn Leu Asn Val Tyr Asn Glu
145 150 155 160
Gly Thr Tyr Leu Ile Pro Ile Phe Thr Ala Gly Lys Tyr Phe Lys Gln
165 170 175
Gly Asn Lys Ile Phe Ile Pro Ile Ser Ile Gln Val His His Ala Ile
180 185 190
Cys Asp Gly Tyr His Ala Ser Arg Phe Ile Asn Glu Met Gln Glu Leu
195 200 205
Ala Phe Ser Phe Gln Glu Trp Leu Glu Asn Lys Tyr Ile
210 215 220
<210> 52
<211> 219
<212> PRT
<213> Clostridium beijerinckii
<400> 52
Met Asn Phe Asn Leu Ile Asp Ile Asn Asn Trp Ser Arg Lys Pro Tyr
1 5 10 15
Phe Glu His Tyr Leu Asn Asn Val Lys Cys Thr Tyr Ser Met Thr Ala
20 25 30
Asn Ile Glu Ile Thr Asp Leu Leu Tyr Glu Ile Lys Leu Lys Asn Ile
35 40 45
Lys Phe Tyr Pro Thr Leu Ile Tyr Met Ile Ala Thr Val Val Asn Asn
50 55 60
His Lys Glu Phe Arg Ile Cys Phe Asp His Asn Gly Ser Leu Gly Tyr
65 70 75 80
Trp Asp Ser Met Asn Pro Ser Tyr Thr Ile Phe His Lys Glu Asn Glu
85 90 95
Thr Phe Ser Ser Ile Trp Thr Glu Tyr Asn Glu Ser Phe Leu Arg Phe
100 105 110
Tyr Ser Asp Tyr Leu Asp Asp Ile Lys Asn Tyr Gly Asn Ile Met Lys
115 120 125
Phe Thr Pro Lys Ser Asn Glu Pro Asp Asn Thr Phe Pro Val Ser Ser
130 135 140
Ile Pro Trp Val Ser Phe Thr Gly Phe Asn Leu Asn Val Tyr Asn Glu
145 150 155 160
Gly Thr Tyr Leu Ile Pro Ile Phe Thr Ala Gly Lys Tyr Phe Lys Gln
165 170 175
Gly Asn Lys Ile Phe Ile Pro Ile Ser Ile Gln Val His His Ala Ile
180 185 190
Cys Asp Gly Tyr His Ala Ser Arg Phe Ile Asn Glu Met Gln Glu Leu
195 200 205
Ala Phe Ser Phe Arg Glu Trp Leu Glu Asn Lys
210 215
<210> 53
<211> 219
<212> PRT
<213> Clostridium saccharoperbutylacetonicum
<400> 53
Met Asn Phe Asn Leu Ile Asp Ile Asn His Trp Ser Arg Lys Pro Tyr
1 5 10 15
Phe Glu His Tyr Leu Asn Thr Val Lys Cys Thr Tyr Ser Met Thr Ala
20 25 30
Asn Ile Glu Ile Thr Asp Leu Leu Tyr Glu Ile Lys Leu Lys Asn Ile
35 40 45
Lys Phe Tyr Pro Thr Leu Ile Tyr Met Ile Ala Thr Val Val Asn Asn
50 55 60
His Lys Glu Phe Arg Ile Cys Phe Asp His Lys Gly Ser Leu Gly Tyr
65 70 75 80
Trp Asp Ser Met Asn Pro Ser Tyr Thr Ile Phe His Lys Glu Asn Glu
85 90 95
Thr Phe Ser Ser Ile Trp Thr Glu Tyr Asn Glu Ser Phe Leu Arg Phe
100 105 110
Tyr Ser Asp Tyr Ile Asp Asp Ile Lys Asn Tyr Gly Asn Ile Met Lys
115 120 125
Phe Thr Pro Lys Ser Asn Glu Pro Asp Asn Ile Phe Pro Val Ser Ser
130 135 140
Ile Pro Trp Val Ser Phe Thr Gly Phe Asn Leu Asn Val Tyr Asn Glu
145 150 155 160
Gly Thr Tyr Leu Ile Pro Ile Phe Thr Ala Gly Lys Tyr Phe Lys Gln
165 170 175
Glu Asn Lys Ile Phe Ile Pro Ile Ser Ile Gln Val His His Ala Val
180 185 190
Cys Asp Gly Tyr His Ala Ser Arg Phe Ile Asn Glu Met Gln Glu Leu
195 200 205
Ala Phe Ser Phe Gln Glu Trp Leu Glu Asn Lys
210 215
<210> 54
<211> 219
<212> PRT
<213> Clostridium saccharoperbutylacetonicum
<400> 54
Met Asn Phe Asn Leu Ile Asp Ile Asn His Trp Ser Arg Lys Pro Tyr
1 5 10 15
Phe Glu His Tyr Leu Asn Thr Val Lys Cys Thr Tyr Ser Met Thr Ala
20 25 30
Asn Ile Glu Ile Thr Asp Leu Leu Tyr Glu Ile Lys Leu Lys Asn Ile
35 40 45
Lys Phe Tyr Pro Thr Leu Ile Tyr Met Ile Ala Thr Val Val Asn Asn
50 55 60
His Lys Glu Phe Arg Ile Cys Phe Asp His Lys Gly Ser Leu Gly Tyr
65 70 75 80
Trp Asp Ser Met Asn Pro Ser Tyr Thr Ile Phe Tyr Lys Glu Asn Glu
85 90 95
Thr Phe Ser Ser Ile Trp Thr Glu Tyr Asn Glu Ser Phe Leu Arg Phe
100 105 110
Tyr Ser Asp Tyr Ile Asp Asp Ile Lys Asn Tyr Gly Asn Ile Met Lys
115 120 125
Phe Thr Pro Lys Ser Asn Glu Pro Asp Asn Ile Phe Pro Val Ser Ser
130 135 140
Ile Pro Trp Val Ser Phe Thr Gly Phe Asn Leu Asn Val Tyr Asn Glu
145 150 155 160
Gly Thr Tyr Leu Ile Pro Ile Phe Thr Ala Gly Lys Tyr Phe Lys Gln
165 170 175
Glu Asn Lys Ile Phe Ile Pro Ile Ser Ile Gln Val His His Ala Val
180 185 190
Cys Asp Gly Tyr His Ala Ser Arg Phe Ile Asn Glu Met Gln Glu Leu
195 200 205
Ala Phe Ser Phe Gln Glu Trp Leu Glu Asn Lys
210 215
<210> 55
<211> 219
<212> PRT
<213> Clostridium beijerinckii
<400> 55
Met Asn Phe Asn Leu Ile Asp Ile Asn Asn Trp Ser Arg Lys Pro Tyr
1 5 10 15
Phe Glu His Tyr Leu Asn Asn Val Lys Cys Thr Tyr Ser Met Thr Ala
20 25 30
Asn Ile Glu Ile Thr Asp Leu Leu Tyr Glu Ile Lys Leu Lys Asn Ile
35 40 45
Lys Phe Tyr Pro Thr Leu Ile Tyr Met Ile Ala Thr Val Val Asn Asn
50 55 60
His Lys Glu Phe Arg Ile Cys Phe Asp His Asn Gly Ser Leu Gly Tyr
65 70 75 80
Trp Asp Ser Met Asn Pro Ser Tyr Thr Ile Phe His Lys Glu Asn Glu
85 90 95
Thr Phe Ser Ser Ile Trp Thr Glu Tyr Asn Glu Ser Phe Leu Arg Phe
100 105 110
Tyr Ser Asp Tyr Leu Asp Asp Ile Lys Asn Tyr Gly Asn Ile Met Lys
115 120 125
Phe Thr Pro Lys Ser Asn Arg Ser Asp Asn Thr Phe Pro Val Ser Ser
130 135 140
Ile Pro Trp Val Ser Phe Thr Gly Phe Asn Leu Asn Val Tyr Asn Glu
145 150 155 160
Gly Thr Tyr Leu Ile Pro Ile Phe Thr Ala Gly Lys Tyr Phe Lys Gln
165 170 175
Gly Asn Lys Ile Phe Ile Pro Ile Ser Ile Gln Val His His Ala Ile
180 185 190
Cys Asp Gly Tyr His Ala Ser Arg Phe Ile Asn Glu Met Gln Glu Leu
195 200 205
Ala Phe Ser Phe Arg Glu Trp Leu Glu Asn Lys
210 215
<210> 56
<211> 221
<212> PRT
<213> Clostridium beijerinckii
<400> 56
Met Asn Phe Asn Leu Ile Asp Ile Asn His Trp Asn Arg Lys Pro Phe
1 5 10 15
Phe Glu His Tyr Leu Asn Asn Val Lys Cys Thr Tyr Ser Met Thr Ala
20 25 30
Asn Ile Glu Ile Thr Asp Leu Leu Tyr Glu Ile Lys Leu Lys Asn Ile
35 40 45
Lys Phe Tyr Pro Thr Leu Ile Tyr Met Ile Ala Thr Val Val Asn Asn
50 55 60
His Lys Glu Phe Arg Ile Cys Phe Asp His Lys Gly Ser Leu Gly Tyr
65 70 75 80
Trp Asp Ser Met Asn Pro Ser Tyr Thr Ile Phe His Glu Glu Asn Glu
85 90 95
Thr Phe Ser Ser Ile Trp Thr Glu Tyr Asn Glu Ser Phe Leu Arg Phe
100 105 110
Tyr Ser Asp Tyr Leu Asp Asp Ile Lys Asn Tyr Gly Asn Ile Met Lys
115 120 125
Phe Thr Pro Lys Ser Asn Glu Pro Asp Asn Thr Phe Pro Val Ser Ser
130 135 140
Ile Pro Trp Val Ser Phe Thr Gly Phe Asn Leu Asn Val Cys Asn Glu
145 150 155 160
Gly Thr Tyr Leu Thr Pro Ile Phe Thr Ala Gly Lys Tyr Phe Lys Gln
165 170 175
Glu Asn Lys Ile Phe Ile Pro Ile Ser Ile Gln Val His His Ser Ile
180 185 190
Cys Asp Gly Tyr His Ala Ser Arg Phe Ile Asn Glu Met Gln Glu Leu
195 200 205
Ala Phe Ser Phe Gln Glu Trp Leu Glu Asn Lys Tyr Ile
210 215 220
<210> 57
<211> 219
<212> PRT
<213> Clostridium beijerinckii
<400> 57
Met Asn Phe Asn Leu Ile Asp Ile Lys His Trp Ser Arg Lys Pro Tyr
1 5 10 15
Phe Glu His Tyr Leu Asn Asn Val Lys Cys Thr Tyr Ser Met Thr Ala
20 25 30
Asn Ile Glu Ile Thr Asp Leu Leu Tyr Glu Ile Arg Leu Lys Asn Ile
35 40 45
Lys Leu Tyr Pro Thr Leu Ile Tyr Met Ile Ala Thr Val Val Asn Asn
50 55 60
His Lys Glu Phe Arg Thr Cys Phe Asp His Ser Gly Ser Leu Gly Tyr
65 70 75 80
Trp Asp Ser Met Ser Pro Ser Tyr Thr Ile Phe His Lys Glu Asn Glu
85 90 95
Thr Phe Ser Ser Ile Trp Thr Glu Tyr Asn Glu Ser Phe Pro Arg Phe
100 105 110
Tyr Ser Asp Tyr Phe Asp Asp Ile Lys Asn Tyr Gly Asn Ile Met Lys
115 120 125
Phe Thr Pro Lys Leu Asn Glu Pro Asp Asn Thr Phe Pro Val Ser Ser
130 135 140
Ile Pro Trp Val Ser Phe Thr Gly Phe Asn Leu Asn Val Tyr Asn Glu
145 150 155 160
Gly Thr Tyr Leu Ile Pro Ile Phe Thr Thr Gly Lys Tyr Phe Lys Gln
165 170 175
Glu Asn Lys Met Phe Ile Pro Ile Ser Ile Gln Val His His Ala Ile
180 185 190
Cys Asp Gly Tyr His Ala Ser Arg Phe Ile Asn Glu Met Gln Glu Leu
195 200 205
Ala Phe Ser Phe Gln Asp Trp Leu Glu Asn Lys
210 215
<210> 58
<211> 219
<212> PRT
<213> Clostridium botulinum
<400> 58
Met Lys Phe Asn Leu Ile Asp Ile Glu His Trp Asn Arg Lys Pro Tyr
1 5 10 15
Phe Glu Tyr Tyr Leu His Ser Val Arg Cys Thr Tyr Ser Met Thr Ala
20 25 30
Asn Ile Glu Ile Thr Asn Leu Leu His Glu Ile Lys Leu Lys Lys Leu
35 40 45
Lys Leu Tyr Pro Thr Leu Ile Tyr Ile Ile Ala Thr Val Val Asn Asn
50 55 60
His Lys Glu Phe Arg Thr Cys Phe Asp Glu Asn Gly Asn Leu Gly Tyr
65 70 75 80
Trp Asp Ser Met Ser Pro Ser Tyr Thr Ile Phe His Lys Asp Asn Glu
85 90 95
Thr Phe Ser Ser Ile Trp Thr Asp Tyr Asp Glu Ser Phe Ser Cys Phe
100 105 110
Tyr Asn Asp Tyr Leu Asp Asp Ile Lys Asn Tyr Gly Ala Ile Met Lys
115 120 125
Phe Thr Pro Lys Leu Asn Glu Pro Ala Asn Thr Phe Pro Val Ser Ser
130 135 140
Ile Pro Trp Val Asn Phe Thr Gly Phe Asn Leu Asn Val Tyr Asn Asn
145 150 155 160
Gly Thr Tyr Leu Val Pro Ile Phe Thr Met Gly Lys Tyr Phe Glu Gln
165 170 175
Asn Asn Lys Ile Phe Ile Pro Met Ser Ile Gln Val His His Ala Val
180 185 190
Cys Asp Gly Tyr His Ile Ser Arg Phe Ile Asn Glu Val Gln Glu Leu
195 200 205
Ala Leu Asn Ser Gln Thr Trp Leu Lys His Lys
210 215
<210> 59
<211> 219
<212> PRT
<213> Artificial Sequence
<220>
<223> Anaerocolumna aminovalerica
<400> 59
Met Lys Phe Asn Leu Ile Asp Ile Glu Asn Trp Asn Arg Lys Pro Tyr
1 5 10 15
Phe Glu His Tyr Leu Asn Ser Val Arg Cys Thr Tyr Ser Met Thr Ala
20 25 30
Asn Ile Glu Ile Thr Asn Leu Leu His Glu Ile Lys Leu Lys Asp Leu
35 40 45
Lys Leu Tyr Pro Thr Leu Ile Tyr Ile Leu Ala Thr Val Val Asn Asn
50 55 60
His Lys Glu Phe Arg Thr Cys Phe Asp Glu Asn Gly Asn Leu Gly Tyr
65 70 75 80
Trp Asp Ser Met Ser Pro Ser Tyr Thr Ile Phe His Lys Glu Asn Glu
85 90 95
Thr Phe Ser Ser Ile Trp Thr Glu Tyr Asp Glu Ser Phe Ser Arg Phe
100 105 110
Tyr Thr Ala Tyr Leu Asp Asp Ile Lys Asn His Gly Asn Ile Met Lys
115 120 125
Phe Thr Pro Lys Leu Asn Glu Pro Ala Asn Thr Phe Pro Ile Ser Ser
130 135 140
Ile Pro Trp Val Ser Phe Thr Gly Phe Asn Leu Asn Val Tyr Asn Asp
145 150 155 160
Gly Lys Tyr Leu Leu Pro Ile Phe Thr Thr Gly Lys Tyr Phe Glu Gln
165 170 175
Asn Ser Lys Ile Phe Ile Pro Met Ser Val Gln Val His His Ala Val
180 185 190
Cys Asp Gly Tyr His Ile Ser Arg Phe Ile Asn Glu Val Gln Glu Val
195 200 205
Ile Leu Asn Tyr Gln Thr Trp Leu Gly Asp Lys
210 215
<210> 60
<211> 219
<212> PRT
<213> Artificial Sequence
<220>
<223> Desnuesiella massiliensis
<400> 60
Met Lys Phe Asn Leu Ile Asp Ile Glu His Trp Asn Arg Lys Pro Tyr
1 5 10 15
Phe Glu His Tyr Leu Asn Ser Val Arg Cys Thr Tyr Ser Met Thr Ala
20 25 30
Asn Ile Glu Ile Thr Asn Leu Leu His Asp Ile Lys Leu Lys Lys Leu
35 40 45
Lys Leu Tyr Pro Thr Leu Ile Tyr Ile Ile Ala Thr Val Val Asn Asn
50 55 60
His Glu Glu Phe Arg Thr Cys Phe Tyr Glu Asn Gly Asn Leu Gly Tyr
65 70 75 80
Trp Asp Ser Met Ser Pro Ser Tyr Thr Ile Phe His Lys Asp Asn Glu
85 90 95
Thr Phe Ser Glu Ile Trp Ser Glu Tyr Asp Glu Ser Phe Ser Cys Phe
100 105 110
Tyr Ser Lys Tyr Leu Asp Asp Ile Lys Asn Tyr Gly Asp Ile Met Arg
115 120 125
Phe Thr Pro Lys Leu Asn Glu Pro Ala Asn Thr Phe Pro Ile Ser Cys
130 135 140
Ile Pro Trp Val Ser Phe Thr Gly Phe Asn Leu Asn Val Tyr Asn Asp
145 150 155 160
Gly Arg Tyr Leu Val Pro Ile Phe Thr Ile Gly Lys Tyr Phe Glu Gln
165 170 175
Asn Asn Lys Ile Phe Ile Pro Met Ser Ile Gln Val His His Ala Val
180 185 190
Cys Asp Gly Tyr His Thr Ser Arg Phe Ile Asn Glu Val Gln Glu Leu
195 200 205
Ala Leu Asn Ser Gln Thr Trp Leu Arg His Lys
210 215
<210> 61
<211> 219
<212> PRT
<213> Artificial Sequence
<220>
<223> Clostridium sp. HMP27
<400> 61
Met Lys Phe Asn Leu Ile Asp Thr Glu His Trp Asn Arg Lys Pro Tyr
1 5 10 15
Phe Glu His Tyr Leu Asn Ser Val Arg Cys Thr Tyr Ser Ile Thr Ala
20 25 30
Asn Ile Glu Ile Thr Asn Leu Leu His Asp Ile Lys Gln Lys Lys Leu
35 40 45
Lys Leu Tyr Pro Thr Phe Ile Tyr Ile Ile Ala Thr Val Val Asn Thr
50 55 60
His Lys Glu Phe Arg Thr Cys Phe Asp Glu Ser Gly Asn Leu Gly Tyr
65 70 75 80
Trp Asp Ser Met Ser Pro Ser Tyr Thr Ile Phe His Lys Asp Asn Glu
85 90 95
Thr Phe Ser Ser Ile Trp Thr Glu Tyr Asp Lys Ser Phe Ser Cys Phe
100 105 110
Tyr Ser Lys Tyr Leu His Asp Ile Lys Asn Tyr Gly Asp Ile Met Ser
115 120 125
Phe Thr Pro Lys Leu Asn Glu Pro Ala Asn Thr Phe Pro Ile Ser Cys
130 135 140
Ile Pro Trp Val Ser Phe Thr Gly Phe Asn Leu Asn Ile Tyr Asn Asp
145 150 155 160
Gly Thr Tyr Leu Val Pro Ile Phe Thr Ile Gly Lys Tyr Phe Lys Gln
165 170 175
Ala Asp Lys Ile Leu Ile Pro Ile Ser Ile Gln Val His His Ala Val
180 185 190
Cys Asp Gly Tyr His Thr Ser Arg Phe Ile Asn Glu Val Gln Glu Leu
195 200 205
Ile Leu Asn Tyr Gln Thr Trp Leu Lys His Lys
210 215
<210> 62
<211> 219
<212> PRT
<213> Artificial Sequence
<220>
<223> Clostridium drakei
<400> 62
Met Lys Phe Asn Leu Ile Asp Ile Glu Asn Trp Asn Arg Lys Pro Tyr
1 5 10 15
Phe Glu His Tyr Leu Asn Ala Val Arg Cys Thr Tyr Ser Met Thr Ala
20 25 30
Asn Ile Glu Ile Thr Gly Leu Leu Arg Glu Ile Lys Leu Lys Gly Leu
35 40 45
Lys Leu Tyr Pro Thr Leu Ile Tyr Ile Ile Thr Ala Val Ile Asn Arg
50 55 60
His Lys Glu Phe Arg Thr Cys Phe Asp Glu Asn Arg Lys Leu Gly Tyr
65 70 75 80
Trp Asp Ser Met Ser Pro Ser Tyr Thr Val Phe His Lys Glu Asp Glu
85 90 95
Thr Phe Ser Ser Ile Trp Thr Glu Tyr Asp Glu Ser Phe Pro Arg Phe
100 105 110
Tyr Asp Asn Tyr Leu Asp Asp Ile Lys Ser Tyr Gly Asp Val Leu Lys
115 120 125
Phe Met Pro Lys Pro Asp Glu Pro Gly Asn Thr Phe Asn Val Ser Ser
130 135 140
Ile Pro Trp Val Asn Phe Thr Gly Phe Asn Leu Asn Ile Tyr Asn Asp
145 150 155 160
Ala Thr Tyr Leu Ile Pro Ile Phe Thr Met Gly Lys Phe Phe His Gln
165 170 175
Asp Asn Lys Ile Phe Ile Pro Met Ser Ile Gln Val His His Ala Val
180 185 190
Cys Asp Gly Tyr His Thr Ser Arg Phe Phe Asn Glu Val Gln Glu Leu
195 200 205
Ser Ser Asn Phe Glu Thr Trp Leu Asp Glu Lys
210 215
<210> 63
<211> 219
<212> PRT
<213> Clostridium scatologenes
<400> 63
Met Lys Phe Asn Leu Ile Asp Ile Glu Asp Trp Asn Arg Lys Pro Tyr
1 5 10 15
Phe Glu His Tyr Leu Asn Ala Val Arg Cys Thr Tyr Ser Met Thr Ala
20 25 30
Asn Ile Glu Ile Thr Gly Leu Leu Arg Glu Ile Lys Leu Lys Gly Leu
35 40 45
Lys Leu Tyr Pro Thr Leu Ile Tyr Ile Ile Thr Ala Val Ile Asn Arg
50 55 60
His Lys Glu Phe Arg Thr Cys Phe Asp Glu Asn Arg Lys Leu Gly Tyr
65 70 75 80
Trp Asp Ser Met Ser Pro Ser Tyr Thr Val Phe His Lys Glu Asp Glu
85 90 95
Thr Phe Ser Ser Ile Trp Thr Glu Tyr Asp Glu Ser Phe Pro Arg Phe
100 105 110
Tyr Asp Asn Tyr Leu Asp Asp Ile Lys Ser Tyr Gly Asp Val Leu Lys
115 120 125
Phe Met Pro Lys Pro Asp Glu Pro Gly Asn Thr Phe Asn Val Ser Ser
130 135 140
Ile Pro Trp Val Asn Phe Thr Gly Phe Asn Leu Asn Ile Tyr Asn Asp
145 150 155 160
Ala Thr Tyr Leu Ile Pro Ile Phe Thr Met Gly Lys Phe Phe His Gln
165 170 175
Asp Asn Lys Ile Phe Ile Pro Met Ser Ile Gln Val His His Ala Val
180 185 190
Cys Asp Gly Tyr His Thr Ser Arg Phe Phe Asn Glu Val Gln Glu Leu
195 200 205
Ser Ser Asn Phe Glu Thr Trp Leu Gly Glu Lys
210 215
<210> 64
<211> 219
<212> PRT
<213> Artificial Sequence
<220>
<223> Clostridium tunisiense
<400> 64
Met Lys Phe Asn Leu Ile Asp Thr Glu His Trp Asp Arg Lys Pro Tyr
1 5 10 15
Phe Glu His Tyr Phe Asn Ser Val Lys Cys Thr Tyr Ser Met Thr Ala
20 25 30
Asn Ile Glu Ile Thr Asn Leu Leu Asn His Ile Arg Leu Lys Lys Leu
35 40 45
Lys Leu Tyr Pro Thr Leu Ile Tyr Ile Ile Ala Thr Val Val Asn Asn
50 55 60
His Glu Glu Phe Arg Ile Cys Phe Asp Glu Asn Asn Asn Leu Gly Tyr
65 70 75 80
Trp Asp Ser Met Ser Pro Asn Tyr Thr Ile Phe His Glu Asp Asn Lys
85 90 95
Thr Phe Ser Ser Ile Trp Thr Glu Tyr Glu Glu Ser Phe Ser Gly Phe
100 105 110
Tyr Asn Lys Tyr Leu Glu Asp Ile Lys Thr Tyr Gly His Ile Met Ser
115 120 125
Phe Glu Pro Lys Leu Asn Glu Ser Thr Asn Thr Phe Pro Ile Ser Cys
130 135 140
Ile Pro Trp Val Ser Phe Thr Gly Phe Asn Leu Asn Ile Gln Asp Asp
145 150 155 160
Gly Thr Tyr Leu Thr Pro Ile Phe Thr Leu Gly Lys Tyr Phe Glu Gln
165 170 175
Asn Asn Lys Thr Phe Ile Pro Ile Ser Ile Gln Val His His Ala Val
180 185 190
Cys Asp Gly Tyr His Thr Ser Arg Phe Ile Asn Glu Val Gln Glu Leu
195 200 205
Ala Ser Asp Phe Gln Ile Trp Leu Thr Tyr Lys
210 215
<210> 65
<211> 219
<212> PRT
<213> Artificial Sequence
<220>
<223> Lachnospiraceae
<400> 65
Met Lys Phe Asn Leu Ile Asp Ile Glu Asp Trp Asn Arg Lys Pro Tyr
1 5 10 15
Phe Glu His Tyr Leu Asn Ala Val Arg Cys Thr Tyr Ser Met Thr Ala
20 25 30
Asn Ile Glu Ile Thr Gly Leu Leu Arg Glu Ile Lys Leu Lys Gly Leu
35 40 45
Lys Leu Tyr Pro Thr Leu Ile Tyr Ile Ile Thr Thr Val Val Asn Arg
50 55 60
His Lys Glu Phe Arg Thr Cys Phe Asp Gln Lys Gly Lys Leu Gly Tyr
65 70 75 80
Trp Asp Ser Met Asn Pro Ser Tyr Thr Val Phe His Lys Asp Asn Glu
85 90 95
Thr Phe Ser Ser Ile Trp Thr Glu Tyr Asp Glu Asn Phe Pro Arg Phe
100 105 110
Tyr Tyr Asn Tyr Leu Glu Asp Ile Arg Asn Tyr Ser Asp Val Leu Asn
115 120 125
Phe Met Pro Lys Thr Gly Glu Pro Ala Asn Thr Ile Asn Val Ser Ser
130 135 140
Ile Pro Trp Val Asn Phe Thr Gly Phe Asn Leu Asn Ile Tyr Asn Asp
145 150 155 160
Ala Thr Tyr Leu Ile Pro Ile Phe Thr Leu Gly Lys Tyr Phe Gln Gln
165 170 175
Asp Asn Lys Ile Leu Leu Pro Met Ser Val Gln Val His His Ala Val
180 185 190
Cys Asp Gly Tyr His Thr Ser Arg Phe Phe Asn Glu Ala Gln Glu Leu
195 200 205
Ala Ser Asn Tyr Glu Thr Trp Leu Gly Glu Lys
210 215
<210> 66
<211> 219
<212> PRT
<213> Clostridium perfringens
<400> 66
Met Lys Phe Asn Leu Ile Asp Ile Glu Asp Trp Asn Arg Lys Pro Tyr
1 5 10 15
Phe Glu His Tyr Leu Asn Ala Val Arg Cys Thr Tyr Ser Met Thr Ala
20 25 30
Asn Ile Glu Ile Thr Gly Leu Leu Arg Glu Ile Lys Leu Lys Gly Leu
35 40 45
Lys Leu Tyr Pro Thr Leu Ile Tyr Ile Ile Thr Thr Val Val Asn Arg
50 55 60
His Lys Glu Phe Arg Thr Cys Phe Asp Gln Lys Gly Lys Leu Gly Tyr
65 70 75 80
Trp Asp Ser Met Asn Pro Ser Tyr Thr Val Phe His Lys Asp Asn Glu
85 90 95
Thr Phe Ser Ser Ile Trp Thr Glu Tyr Asp Glu Asn Phe Pro Arg Phe
100 105 110
Tyr Tyr Asn Tyr Leu Glu Asp Ile Arg Asn Tyr Ser Asp Val Leu Asn
115 120 125
Phe Met Pro Lys Thr Gly Glu Pro Ala Asn Thr Ile Asn Val Ser Ser
130 135 140
Ile Pro Trp Val Asn Phe Thr Gly Phe Asn Leu Asn Ile Tyr Asn Asp
145 150 155 160
Ala Thr Tyr Leu Ile Pro Ile Phe Thr Leu Gly Lys Tyr Phe Gln Gln
165 170 175
Asp Asn Lys Ile Leu Leu Pro Met Ser Val Gln Val His His Ala Val
180 185 190
Cys Asp Gly Tyr His Ile Ser Arg Phe Phe Asn Glu Ala Gln Glu Leu
195 200 205
Ala Ser Asn Tyr Glu Thr Trp Leu Gly Glu Lys
210 215
<210> 67
<211> 218
<212> PRT
<213> Artificial Sequence
<220>
<223> Clostridium sp. BL8
<400> 67
Met Lys Phe Asn Leu Ile Asp Ile Asp Gln Trp Asp Arg Lys Pro Tyr
1 5 10 15
Phe Glu His Tyr Phe Asn Ser Val Lys Cys Thr Tyr Ser Ile Thr Ala
20 25 30
Asn Ile Glu Ile Thr Asn Leu Leu Lys Asp Ile Lys Ile Thr Lys Leu
35 40 45
Lys Leu Tyr Pro Thr Leu Ile Tyr Ile Ile Ala Thr Val Ile Asn Asn
50 55 60
His Glu Glu Phe Arg Thr Cys Phe Asp Glu Asn Asn Asn Leu Gly Tyr
65 70 75 80
Trp Asp Ser Met Ser Pro Asn Tyr Thr Ile Phe His Glu Glu Thr Lys
85 90 95
Thr Phe Ser Asn Ile Trp Thr Glu Tyr Asp Lys Ser Phe Ser Gly Phe
100 105 110
Tyr Asn Lys Tyr Val Glu Asp Asn Lys Asn Tyr Gly Asn Ile Met Asn
115 120 125
Phe Asp Pro Lys Leu Asn Glu Pro Ala Asn Thr Phe Pro Ile Ser Cys
130 135 140
Ile Pro Trp Val Ser Phe Thr Gly Phe Asn Leu Asn Ile Gln Asp His
145 150 155 160
Gly Thr Tyr Leu Thr Pro Ile Phe Thr Leu Gly Lys Tyr Phe Glu Glu
165 170 175
Asn Asn Lys Val Phe Ile Pro Met Ser Ile Gln Val His His Ala Val
180 185 190
Cys Asp Gly Tyr His Thr Ser Arg Phe Ile Asn Glu Val Gln Glu Leu
195 200 205
Ala Ser Asn Ser Gln Ser Trp Leu Lys His
210 215
<210> 68
<211> 660
<212> DNA
<213> Clostridium perfringens
<400> 68
atgaaattta atttgataga tattgaggat tggaatagaa agccatactt tgagcattat 60
ttaaatgcgg ttaggtgcac ttacagtatg actgcaaata tagagataac tggtttactg 120
cgtgaaatta aacttaaggg cctgaaactg taccctacgc ttatttatat catcacaact 180
gtggttaacc gtcacaagga gttccgcacc tgttttgatc aaaaaggtaa gttaggatac 240
tgggatagta tgaacccaag ttatactgtc tttcataagg ataacgaaac tttttcaagt 300
atttggacag agtatgacga gaacttccca cgtttttact ataattacct tgaggatatt 360
agaaactata gcgacgtttt gaatttcatg cctaagacag gtgaacctgc taatacaatt 420
aatgtgtcca gcattccttg ggtgaatttt accggattca acctgaatat atacaatgat 480
gcaacatatc taatccctat ttttactttg ggtaagtatt ttcagcagga taataaaatt 540
ttattaccta tgtctgtaca ggtgcatcat gcggtttgcg acggttatca tataagcaga 600
ttttttaatg aggcacagga attagcgtca aattatgaga catggttagg agaaaaataa 660
<210> 69
<211> 624
<212> DNA
<213> Clostridium difficile
<400> 69
atggtatttg aaaaaattga taaaaatagt tggaacagaa aagagtattt tgaccactac 60
tttgcaagtg taccttgtac atacagcatg accgttaaag tggatatcac acaaataaag 120
gaaaagggaa tgaaactata tcctgcaatg ctttattata ttgcaatgat tgtaaaccgc 180
cattcagagt ttaggacggc aatcaatcaa gatggtgaat tggggatata tgatgagatg 240
ataccaagct atacaatatt tcacaatgat actgaaacat tttccagcct ttggactgag 300
tgtaagtctg actttaaatc atttttagca gattatgaaa gtgatacgca acggtatgga 360
aacaatcata gaatggaagg aaagccaaat gctccggaaa acatttttaa tgtatctatg 420
ataccgtggt caaccttcga tggctttaat ctgaatttgc agaaaggata tgattatttg 480
attcctattt ttactatggg gaaatattat aaagaagata acaaaattat acttcctttg 540
gcaattcaag ttcatcacgc agtatgtgac ggatttcaca tttgccgttt tgtaaacgaa 600
ttgcaggaat tgataaatag ttaa 624
<210> 70
<211> 624
<212> DNA
<213> Clostridium perfringens
<400> 70
atggtatttg aaaaaattga taaaaatagt tggaacagaa aagagtattt tgaccactac 60
tttgcaagtg taccttgtac atacagcatg accgttaaag tggatatcac acaaataaag 120
gaaaagggaa tgaaactata tcctgcaatg ctttattata ttgcaatgat tgtaaaccgc 180
cattcagagt ttaggacggc aatcaatcaa gatggtgaat tggggatata tgatgagatg 240
ataccaagct atacaatatt tcacaatgat actgaaacat tttccagcct ttggactgag 300
tgtaagtctg actttaaatc atttttagca gattatgaaa gtgatacgca acggtatgga 360
aacaatcata gaatggaagg aaagccaaat gctccggaaa acatttttaa tgtatctatg 420
ataccgtggt caaccttcga tggctttaat ctgaatttgc agaaaggata tgattatttg 480
attcctattt ttactatggg gaaatattat aaagaagata acaaaattat acttcctttg 540
gcaattcaag ttcatcacgc agtatgtgac ggatttcaca tttgccgttt tgtaaacgaa 600
ttgcaggaat tgataaatag ttaa 624
<210> 71
<211> 3897
<212> DNA
<213> Artificial Sequence
<220>
<223> Optimized MAD7
<400> 71
ctcgagtccc tatcagtgat agattgaaac tctatcattg atagagtata atatctttgt 60
tcattagagc gataaacttg aatttgagag ggaacttaga tgaacaacgg cacaaataat 120
tttcagaact tcatagggat atcaagtttg cagaaaacgt taagaaatgc tttaataccc 180
acggaaacca cgcaacagtt catagttaag aacggaataa ttaaagaaga tgagttaaga 240
ggcgagaaca gacagatttt aaaagatata atggatgact actacagagg attcatatct 300
gagactttaa gttctattga tgacatagat tggactagct tattcgaaaa aatggaaatt 360
cagttaaaaa atggtgataa taaagatacc ttaattaagg aacagacaga gtatagaaaa 420
gcaatacata aaaaatttgc gaacgacgat agatttaaga acatgtttag cgccaaatta 480
attagtgaca tattacctga atttgttata cacaacaata attattcggc atcagagaaa 540
gaggaaaaaa cccaggtgat aaaattgttt tcgagatttg cgactagctt taaagattac 600
ttcaagaaca gagcaaattg cttttcagcg gacgatattt catcaagcag ctgccataga 660
atagttaacg acaatgcaga gatattcttt tcaaatgcgt tagtttacag aagaatagta 720
aaatcgttaa gcaatgacga tataaacaaa atttcgggcg atatgaaaga ttcattaaaa 780
gaaatgagtt tagaagaaat atattcttac gagaagtatg gggaatttat tacccaggaa 840
ggcattagct tctataatga tatatgtggg aaagtgaatt cttttatgaa cttatattgt 900
cagaaaaata aagaaaacaa aaatttatac aaacttcaga aacttcacaa acagattcta 960
tgcattgcgg acactagcta tgaggttccg tataaatttg aaagtgacga ggaagtgtac 1020
caatcagtta acggcttcct tgataacatt agcagcaaac atatagttga aagattaaga 1080
aaaataggcg ataactataa cggctacaac ttagataaaa tttatatagt gtccaaattt 1140
tacgagagcg ttagccaaaa aacctacaga gactgggaaa caattaatac cgccttagaa 1200
attcattaca ataatatatt gccgggtaac ggtaaaagta aagccgacaa agtaaaaaaa 1260
gcggttaaga atgatttaca gaaatccata accgaaataa atgaactagt gtcaaactat 1320
aagttatgca gtgacgacaa cataaaagcg gagacttata tacatgagat tagccatata 1380
ttgaataact ttgaagcaca ggaattgaaa tacaatccgg aaattcacct agttgaatcc 1440
gagttaaaag cgagtgagct taaaaacgtg ttagacgtga taatgaatgc gtttcattgg 1500
tgttcggttt ttatgactga ggaacttgtt gataaagaca acaattttta tgcggaatta 1560
gaggagattt acgatgaaat ttatccagta attagtttat acaacttagt tagaaactac 1620
gttacccaga aaccgtacag cacgaaaaag attaaattga actttggaat accgacgtta 1680
gcagacggtt ggtcaaagtc caaagagtat tctaataacg ctataatatt aatgagagac 1740
aatttatatt atttaggcat atttaatgcg aagaataaac cggacaagaa gattatagag 1800
ggtaatacgt cagaaaataa gggtgactac aaaaagatga tttataattt gttaccgggt 1860
cccaacaaaa tgataccgaa agttttcttg agcagcaaga cgggggtgga aacgtataaa 1920
ccgagcgcct atatactaga ggggtataaa cagaataaac atataaagtc ttcaaaagac 1980
tttgatataa ctttctgtca tgatttaata gactacttca aaaactgtat tgcaattcat 2040
cccgagtgga aaaacttcgg ttttgatttt agcgacacca gtacttatga agacatttcc 2100
gggttttata gagaggtaga gttacaaggt tacaagattg attggacata cattagcgaa 2160
aaagacattg atttattaca ggaaaaaggt caattatatt tattccagat atataacaaa 2220
gatttttcga aaaaatcaac cgggaatgac aaccttcaca ccatgtactt aaaaaatctt 2280
ttctcagaag aaaatcttaa ggatatagtt ttaaaactta acggcgaagc ggaaatattc 2340
ttcaggaaga gcagcataaa gaacccaata attcataaaa aaggctcgat tttagttaac 2400
agaacctacg aagcagaaga aaaagaccag tttggcaaca ttcaaattgt gagaaaaaat 2460
attccggaaa acatttatca ggagttatac aaatacttca acgataaaag cgacaaagag 2520
ttatctgatg aagcagccaa attaaagaat gtagtgggac accacgaggc agcgacgaat 2580
atagttaagg actatagata cacgtatgat aaatacttcc ttcatatgcc tattacgata 2640
aatttcaaag ccaataaaac gggttttatt aatgatagga tattacagta tatagctaaa 2700
gaaaaagact tacatgtgat aggcattgat agaggcgaga gaaacttaat atacgtgtcc 2760
gtgattgata cttgtggtaa tatagttgaa cagaaaagct ttaacattgt aaacggctac 2820
gactatcaga taaaattaaa acaacaggag ggcgctagac agattgcgag aaaagaatgg 2880
aaagaaattg gtaaaattaa agagataaaa gagggctact taagcttagt aatacacgag 2940
atatctaaaa tggtaataaa atacaatgca attatagcga tggaggattt gtcttatggt 3000
tttaaaaaag ggagatttaa ggttgaaaga caagtttacc agaaatttga aaccatgtta 3060
ataaataaat taaactattt agtatttaaa gatatttcga ttaccgagaa tggcggttta 3120
ttaaaaggtt atcagttaac atacattcct gataaactta aaaacgtggg tcatcagtgc 3180
ggctgcattt tttatgtgcc tgctgcatac acgagcaaaa ttgatccgac caccggcttt 3240
gtgaatatat ttaaatttaa agacttaaca gtggacgcaa aaagagaatt cattaaaaaa 3300
tttgactcaa ttagatatga cagtgaaaaa aatttattct gctttacatt tgactacaat 3360
aactttatta cgcaaaacac ggttatgagc aaatcatcgt ggagtgtgta tacatacggc 3420
gtgagaataa aaagaagatt tgtgaacggc agattctcaa acgaaagtga taccattgac 3480
ataaccaaag atatggagaa aacgttggaa atgacggaca ttaactggag agatggccac 3540
gatcttagac aagacattat agattatgaa attgttcagc acatattcga aattttcaga 3600
ttaacagtgc aaatgagaaa ctccttgtct gaattagagg acagagatta cgatagatta 3660
atttcacctg tattaaacga aaataacatt ttttatgaca gcgcgaaagc gggggatgca 3720
cttcctaagg atgccgatgc aaatggtgcg tattgtattg cattaaaagg gttatatgaa 3780
attaaacaaa ttaccgaaaa ttggaaagaa gatggtaaat tttcgagaga taaattaaaa 3840
ataagcaata aagattggtt cgactttata cagaataaga gatatttata agtcgac 3897
<210> 72
<211> 1263
<212> PRT
<213> Artificial Sequence
<220>
<223> MAD7
<400> 72
Met Asn Asn Gly Thr Asn Asn Phe Gln Asn Phe Ile Gly Ile Ser Ser
1 5 10 15
Leu Gln Lys Thr Leu Arg Asn Ala Leu Ile Pro Thr Glu Thr Thr Gln
20 25 30
Gln Phe Ile Val Lys Asn Gly Ile Ile Lys Glu Asp Glu Leu Arg Gly
35 40 45
Glu Asn Arg Gln Ile Leu Lys Asp Ile Met Asp Asp Tyr Tyr Arg Gly
50 55 60
Phe Ile Ser Glu Thr Leu Ser Ser Ile Asp Asp Ile Asp Trp Thr Ser
65 70 75 80
Leu Phe Glu Lys Met Glu Ile Gln Leu Lys Asn Gly Asp Asn Lys Asp
85 90 95
Thr Leu Ile Lys Glu Gln Thr Glu Tyr Arg Lys Ala Ile His Lys Lys
100 105 110
Phe Ala Asn Asp Asp Arg Phe Lys Asn Met Phe Ser Ala Lys Leu Ile
115 120 125
Ser Asp Ile Leu Pro Glu Phe Val Ile His Asn Asn Asn Tyr Ser Ala
130 135 140
Ser Glu Lys Glu Glu Lys Thr Gln Val Ile Lys Leu Phe Ser Arg Phe
145 150 155 160
Ala Thr Ser Phe Lys Asp Tyr Phe Lys Asn Arg Ala Asn Cys Phe Ser
165 170 175
Ala Asp Asp Ile Ser Ser Ser Ser Cys His Arg Ile Val Asn Asp Asn
180 185 190
Ala Glu Ile Phe Phe Ser Asn Ala Leu Val Tyr Arg Arg Ile Val Lys
195 200 205
Ser Leu Ser Asn Asp Asp Ile Asn Lys Ile Ser Gly Asp Met Lys Asp
210 215 220
Ser Leu Lys Glu Met Ser Leu Glu Glu Ile Tyr Ser Tyr Glu Lys Tyr
225 230 235 240
Gly Glu Phe Ile Thr Gln Glu Gly Ile Ser Phe Tyr Asn Asp Ile Cys
245 250 255
Gly Lys Val Asn Ser Phe Met Asn Leu Tyr Cys Gln Lys Asn Lys Glu
260 265 270
Asn Lys Asn Leu Tyr Lys Leu Gln Lys Leu His Lys Gln Ile Leu Cys
275 280 285
Ile Ala Asp Thr Ser Tyr Glu Val Pro Tyr Lys Phe Glu Ser Asp Glu
290 295 300
Glu Val Tyr Gln Ser Val Asn Gly Phe Leu Asp Asn Ile Ser Ser Lys
305 310 315 320
His Ile Val Glu Arg Leu Arg Lys Ile Gly Asp Asn Tyr Asn Gly Tyr
325 330 335
Asn Leu Asp Lys Ile Tyr Ile Val Ser Lys Phe Tyr Glu Ser Val Ser
340 345 350
Gln Lys Thr Tyr Arg Asp Trp Glu Thr Ile Asn Thr Ala Leu Glu Ile
355 360 365
His Tyr Asn Asn Ile Leu Pro Gly Asn Gly Lys Ser Lys Ala Asp Lys
370 375 380
Val Lys Lys Ala Val Lys Asn Asp Leu Gln Lys Ser Ile Thr Glu Ile
385 390 395 400
Asn Glu Leu Val Ser Asn Tyr Lys Leu Cys Ser Asp Asp Asn Ile Lys
405 410 415
Ala Glu Thr Tyr Ile His Glu Ile Ser His Ile Leu Asn Asn Phe Glu
420 425 430
Ala Gln Glu Leu Lys Tyr Asn Pro Glu Ile His Leu Val Glu Ser Glu
435 440 445
Leu Lys Ala Ser Glu Leu Lys Asn Val Leu Asp Val Ile Met Asn Ala
450 455 460
Phe His Trp Cys Ser Val Phe Met Thr Glu Glu Leu Val Asp Lys Asp
465 470 475 480
Asn Asn Phe Tyr Ala Glu Leu Glu Glu Ile Tyr Asp Glu Ile Tyr Pro
485 490 495
Val Ile Ser Leu Tyr Asn Leu Val Arg Asn Tyr Val Thr Gln Lys Pro
500 505 510
Tyr Ser Thr Lys Lys Ile Lys Leu Asn Phe Gly Ile Pro Thr Leu Ala
515 520 525
Asp Gly Trp Ser Lys Ser Lys Glu Tyr Ser Asn Asn Ala Ile Ile Leu
530 535 540
Met Arg Asp Asn Leu Tyr Tyr Leu Gly Ile Phe Asn Ala Lys Asn Lys
545 550 555 560
Pro Asp Lys Lys Ile Ile Glu Gly Asn Thr Ser Glu Asn Lys Gly Asp
565 570 575
Tyr Lys Lys Met Ile Tyr Asn Leu Leu Pro Gly Pro Asn Lys Met Ile
580 585 590
Pro Lys Val Phe Leu Ser Ser Lys Thr Gly Val Glu Thr Tyr Lys Pro
595 600 605
Ser Ala Tyr Ile Leu Glu Gly Tyr Lys Gln Asn Lys His Ile Lys Ser
610 615 620
Ser Lys Asp Phe Asp Ile Thr Phe Cys His Asp Leu Ile Asp Tyr Phe
625 630 635 640
Lys Asn Cys Ile Ala Ile His Pro Glu Trp Lys Asn Phe Gly Phe Asp
645 650 655
Phe Ser Asp Thr Ser Thr Tyr Glu Asp Ile Ser Gly Phe Tyr Arg Glu
660 665 670
Val Glu Leu Gln Gly Tyr Lys Ile Asp Trp Thr Tyr Ile Ser Glu Lys
675 680 685
Asp Ile Asp Leu Leu Gln Glu Lys Gly Gln Leu Tyr Leu Phe Gln Ile
690 695 700
Tyr Asn Lys Asp Phe Ser Lys Lys Ser Thr Gly Asn Asp Asn Leu His
705 710 715 720
Thr Met Tyr Leu Lys Asn Leu Phe Ser Glu Glu Asn Leu Lys Asp Ile
725 730 735
Val Leu Lys Leu Asn Gly Glu Ala Glu Ile Phe Phe Arg Lys Ser Ser
740 745 750
Ile Lys Asn Pro Ile Ile His Lys Lys Gly Ser Ile Leu Val Asn Arg
755 760 765
Thr Tyr Glu Ala Glu Glu Lys Asp Gln Phe Gly Asn Ile Gln Ile Val
770 775 780
Arg Lys Asn Ile Pro Glu Asn Ile Tyr Gln Glu Leu Tyr Lys Tyr Phe
785 790 795 800
Asn Asp Lys Ser Asp Lys Glu Leu Ser Asp Glu Ala Ala Lys Leu Lys
805 810 815
Asn Val Val Gly His His Glu Ala Ala Thr Asn Ile Val Lys Asp Tyr
820 825 830
Arg Tyr Thr Tyr Asp Lys Tyr Phe Leu His Met Pro Ile Thr Ile Asn
835 840 845
Phe Lys Ala Asn Lys Thr Gly Phe Ile Asn Asp Arg Ile Leu Gln Tyr
850 855 860
Ile Ala Lys Glu Lys Asp Leu His Val Ile Gly Ile Asp Arg Gly Glu
865 870 875 880
Arg Asn Leu Ile Tyr Val Ser Val Ile Asp Thr Cys Gly Asn Ile Val
885 890 895
Glu Gln Lys Ser Phe Asn Ile Val Asn Gly Tyr Asp Tyr Gln Ile Lys
900 905 910
Leu Lys Gln Gln Glu Gly Ala Arg Gln Ile Ala Arg Lys Glu Trp Lys
915 920 925
Glu Ile Gly Lys Ile Lys Glu Ile Lys Glu Gly Tyr Leu Ser Leu Val
930 935 940
Ile His Glu Ile Ser Lys Met Val Ile Lys Tyr Asn Ala Ile Ile Ala
945 950 955 960
Met Glu Asp Leu Ser Tyr Gly Phe Lys Lys Gly Arg Phe Lys Val Glu
965 970 975
Arg Gln Val Tyr Gln Lys Phe Glu Thr Met Leu Ile Asn Lys Leu Asn
980 985 990
Tyr Leu Val Phe Lys Asp Ile Ser Ile Thr Glu Asn Gly Gly Leu Leu
995 1000 1005
Lys Gly Tyr Gln Leu Thr Tyr Ile Pro Asp Lys Leu Lys Asn Val
1010 1015 1020
Gly His Gln Cys Gly Cys Ile Phe Tyr Val Pro Ala Ala Tyr Thr
1025 1030 1035
Ser Lys Ile Asp Pro Thr Thr Gly Phe Val Asn Ile Phe Lys Phe
1040 1045 1050
Lys Asp Leu Thr Val Asp Ala Lys Arg Glu Phe Ile Lys Lys Phe
1055 1060 1065
Asp Ser Ile Arg Tyr Asp Ser Glu Lys Asn Leu Phe Cys Phe Thr
1070 1075 1080
Phe Asp Tyr Asn Asn Phe Ile Thr Gln Asn Thr Val Met Ser Lys
1085 1090 1095
Ser Ser Trp Ser Val Tyr Thr Tyr Gly Val Arg Ile Lys Arg Arg
1100 1105 1110
Phe Val Asn Gly Arg Phe Ser Asn Glu Ser Asp Thr Ile Asp Ile
1115 1120 1125
Thr Lys Asp Met Glu Lys Thr Leu Glu Met Thr Asp Ile Asn Trp
1130 1135 1140
Arg Asp Gly His Asp Leu Arg Gln Asp Ile Ile Asp Tyr Glu Ile
1145 1150 1155
Val Gln His Ile Phe Glu Ile Phe Arg Leu Thr Val Gln Met Arg
1160 1165 1170
Asn Ser Leu Ser Glu Leu Glu Asp Arg Asp Tyr Asp Arg Leu Ile
1175 1180 1185
Ser Pro Val Leu Asn Glu Asn Asn Ile Phe Tyr Asp Ser Ala Lys
1190 1195 1200
Ala Gly Asp Ala Leu Pro Lys Asp Ala Asp Ala Asn Gly Ala Tyr
1205 1210 1215
Cys Ile Ala Leu Lys Gly Leu Tyr Glu Ile Lys Gln Ile Thr Glu
1220 1225 1230
Asn Trp Lys Glu Asp Gly Lys Phe Ser Arg Asp Lys Leu Lys Ile
1235 1240 1245
Ser Asn Lys Asp Trp Phe Asp Phe Ile Gln Asn Lys Arg Tyr Leu
1250 1255 1260
<210> 73
<211> 363
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter CatB
<400> 73
taaaaaatgt tacgcacttt tcttatattg ttcaacaata acataattta ttaacaaaag 60
gaaagtatag ttaaaaaaat gttggagcaa atgcggatgg aaaaataaaa attaatatta 120
gtagtaattc cgatgttaaa ataacaagag ataagaaaaa gtaaaatatt agagtaattc 180
gtagtattct taagttatga atcaataaaa aatggtctct gaaaattgaa tagttcggta 240
ttacagaatg tgctataata aactaaagcg taaatatcat tgtaaaaagg agattgaaat 300
ggctaggtca cggaaaaaag ccttctaaaa tagaattacg aaaattttta ggaggcccga 360
att 363
<210> 74
<211> 322
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter CATQ
<400> 74
ctgcgtacac atccagacat cgctttagag tatggtgaat taaagatgga gcgggcttat 60
cgattctcag aggatattga aggctactgc actggtaagg atgcatttgt aaagcaacta 120
gaaaaggatg ctttgcgatg gtggcaaact gtctgttagg aggttattct caaaggattg 180
caagaagcag ttgaggataa tccgtataac taactattac acattcttaa cattgctggt 240
ttgtatcggt agaataacac gaattaacaa aggatatatt ttgtagtagc aagtgtattt 300
gttttatatt ctatgaacct at 322
<210> 75
<211> 1368
<212> PRT
<213> Streptococcus pyogenes
<400> 75
Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys
1100 1105 1110
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly
1205 1210 1215
Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala
1310 1315 1320
Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser
1325 1330 1335
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
<210> 76
<211> 4107
<212> DNA
<213> Streptococcus pyogenes
<400> 76
atggataaga aatactcaat aggcttagat atcggcacaa atagcgtcgg atgggcggtg 60
atcactgatg aatataaggt tccgtctaaa aagttcaagg ttctgggaaa tacagaccgc 120
cacagtatca aaaaaaatct tataggggct cttttatttg acagtggaga gacagcggaa 180
gcgactcgtc tcaaacggac agctcgtaga aggtatacac gtcggaagaa tcgtatttgt 240
tatctacagg agattttttc aaatgagatg gcgaaagtag atgatagttt ctttcatcga 300
cttgaagagt cttttttggt ggaagaagac aagaagcatg aacgtcatcc tatttttgga 360
aatatagtag atgaagttgc ttatcatgag aaatatccaa ctatctatca tctgcgaaaa 420
aaattggtag attctactga taaagcggat ttgcgcttaa tctatttggc cttagcgcat 480
atgattaagt ttcgtggtca ttttttgatt gagggagatt taaatcctga taatagtgat 540
gtggacaaac tatttatcca gttggtacaa acctacaatc aattatttga agaaaaccct 600
attaacgcaa gtggagtaga tgctaaagcg attctttctg cacgattgag taaatcaaga 660
cgattagaaa atctcattgc tcagctcccc ggtgagaaga aaaatggctt atttgggaat 720
ctcattgctt tgtcattggg tttgacccct aattttaaat caaattttga tttggcagaa 780
gatgctaaat tacagctttc aaaagatact tacgatgatg atttagataa tttattggcg 840
caaattggag atcaatatgc tgatttgttt ttggcagcta agaatttatc agatgctatt 900
ttactttcag atatcctaag agtaaatact gaaataacta aggctcccct atcagcttca 960
atgattaaac gctacgatga acatcatcaa gacttgactc ttttaaaagc tttagttcga 1020
caacaacttc cagaaaagta taaagaaatc ttttttgatc aatcaaaaaa cggatatgca 1080
ggttatattg atgggggagc tagccaagaa gaattttata aatttatcaa accaatttta 1140
gaaaaaatgg atggtactga ggaattattg gtgaaactaa atcgtgaaga tttgctgcgc 1200
aagcaacgga cctttgacaa cggctctatt ccccatcaaa ttcacttggg tgagctgcat 1260
gctattttga gaagacaaga agacttttat ccatttttaa aagacaatcg tgagaagatt 1320
gaaaaaatct tgacttttcg aattccttat tatgttggtc cattggcgcg tggcaatagt 1380
cgttttgcat ggatgactcg gaagtctgaa gaaacaatta ccccatggaa ttttgaagaa 1440
gttgtcgata aaggtgcttc agctcaatca tttattgaac gcatgacaaa ctttgataaa 1500
aatcttccaa atgaaaaagt actaccaaaa catagtttgc tttatgagta ttttacggtt 1560
tataacgaat tgacaaaggt caaatatgtt actgaaggaa tgcgaaaacc agcatttctt 1620
tcaggtgaac agaagaaagc cattgttgat ttactcttca aaacaaatcg aaaagtaacc 1680
gttaagcaat taaaagaaga ttatttcaaa aaaatagaat gttttgatag tgttgaaatt 1740
tcaggagttg aagatagatt taatgcttca ttaggtacct accatgattt gctaaaaatt 1800
attaaagata aagatttttt ggataatgaa gaaaatgaag atatcttaga ggatattgtt 1860
ttaacattga ccttatttga agatagggag atgattgagg aaagacttaa aacatatgct 1920
cacctctttg atgataaggt gatgaaacag cttaaacgtc gccgttatac tggttgggga 1980
cgtttgtctc gaaaattgat taatggtatt agggataagc aatctggcaa aacaatatta 2040
gattttttga aatcagatgg ttttgccaat cgcaatttta tgcagctgat ccatgatgat 2100
agtttgacat ttaaagaaga cattcaaaaa gcacaagtgt ctggacaagg cgatagttta 2160
catgaacata ttgcaaattt agctggtagc cctgctatta aaaaaggtat tttacagact 2220
gtaaaagttg ttgatgaatt ggtcaaagta atggggcggc ataagccaga aaatatcgtt 2280
attgaaatgg cacgtgaaaa tcagacaact caaaagggcc agaaaaattc gcgagagcgt 2340
atgaaacgaa tcgaagaagg tatcaaagaa ttaggaagtc agattcttaa agagcatcct 2400
gttgaaaata ctcaattgca aaatgaaaag ctctatctct attatctcca aaatggaaga 2460
gacatgtatg tggaccaaga attagatatt aatcgtttaa gtgattatga tgtcgatcac 2520
attgttccac aaagtttcct taaagacgat tcaatagaca ataaggtctt aacgcgttct 2580
gataaaaatc gtggtaaatc ggataacgtt ccaagtgaag aagtagtcaa aaagatgaaa 2640
aactattgga gacaacttct aaacgccaag ttaatcactc aacgtaagtt tgataattta 2700
acgaaagctg aacgtggagg tttgagtgaa cttgataaag ctggttttat caaacgccaa 2760
ttggttgaaa ctcgccaaat cactaagcat gtggcacaaa ttttggatag tcgcatgaat 2820
actaaatacg atgaaaatga taaacttatt cgagaggtta aagtgattac cttaaaatct 2880
aaattagttt ctgacttccg aaaagatttc caattctata aagtacgtga gattaacaat 2940
taccatcatg cccatgatgc gtatctaaat gccgtcgttg gaactgcttt gattaagaaa 3000
tatccaaaac ttgaatcgga gtttgtctat ggtgattata aagtttatga tgttcgtaaa 3060
atgattgcta agtctgagca agaaataggc aaagcaaccg caaaatattt cttttactct 3120
aatatcatga acttcttcaa aacagaaatt acacttgcaa atggagagat tcgcaaacgc 3180
cctctaatcg aaactaatgg ggaaactgga gaaattgtct gggataaagg gcgagatttt 3240
gccacagtgc gcaaagtatt gtccatgccc caagtcaata ttgtcaagaa aacagaagta 3300
cagacaggcg gattctccaa ggagtcaatt ttaccaaaaa gaaattcgga caagcttatt 3360
gctcgtaaaa aagactggga tccaaaaaaa tatggtggtt ttgatagtcc aacggtagct 3420
tattcagtcc tagtggttgc taaggtggaa aaagggaaat cgaagaagtt aaaatccgtt 3480
aaagagttac tagggatcac aattatggaa agaagttcct ttgaaaaaaa tccgattgac 3540
tttttagaag ctaaaggata taaggaagtt aaaaaagact taatcattaa actacctaaa 3600
tatagtcttt ttgagttaga aaacggtcgt aaacggatgc tggctagtgc cggagaatta 3660
caaaaaggaa atgagctggc tctgccaagc aaatatgtga attttttata tttagctagt 3720
cattatgaaa agttgaaggg tagtccagaa gataacgaac aaaaacaatt gtttgtggag 3780
cagcataagc attatttaga tgagattatt gagcaaatca gtgaattttc taagcgtgtt 3840
attttagcag atgccaattt agataaagtt cttagtgcat ataacaaaca tagagacaaa 3900
ccaatacgtg aacaagcaga aaatattatt catttattta cgttgacgaa tcttggagct 3960
cccgctgctt ttaaatattt tgatacaaca attgatcgta aacgatatac gtctacaaaa 4020
gaagttttag atgccactct tatccatcaa tccatcactg gtctttatga aacacgcatt 4080
gatttgagtc agctaggagg tgactga 4107
<210> 77
<211> 1170
<212> DNA
<213> Artificial Sequence
<220>
<223> bdhA
<400> 77
atgctaagtt ttgattattc aataccaact aaagtttttt ttggaaaagg aaaaatagac 60
gtaattggag aagaaattaa gaaatatggc tcaagagtgc ttatagttta tggcggagga 120
agtataaaaa ggaacggtat atatgataga gcaacagcta tattaaaaga aaacaatata 180
gctttctatg aactttcagg agtagagcca aatcctagga taacaacagt aaaaaaaggc 240
atagaaatat gtagagaaaa taatgtggat ttagtattag caataggggg aggaagtgca 300
atagactgtt ctaaggtaat tgcagctgga gtttattatg atggcgatac atgggacatg 360
gttaaagatc catctaaaat aactaaagtt cttccaattg caagtatact tactctttca 420
gcaacagggt ctgaaatgga tcaaattgca gtaatttcaa atatggagac taatgaaaag 480
cttggagtag gacatgatga tatgagacct aaattttcag tgttagatcc tacatatact 540
tttacagtac ctaaaaatca aacagcagcg ggaacagctg acattatgag tcacaccttt 600
gaatcttact ttagtggtgt tgaaggtgct tatgtgcagg acggtatagc agaagcaatc 660
ttaagaacat gtataaagta tggaaaaata gcaatggaga agactgatga ttacgaggct 720
agagctaatt tgatgtgggc ttcaagttta gctataaatg gtctattatc acttggtaag 780
gatagaaaat ggagttgtca tcctatggaa cacgagttaa gtgcatatta tgatataaca 840
catggtgtag gacttgcaat tttaacacct aattggatgg aatatattct aaatgacgat 900
acacttcata aatttgtttc ttatggaata aatgtttggg gaatagacaa gaacaaagat 960
aactatgaaa tagcacgaga ggctattaaa aatacgagag aatactttaa ttcattgggt 1020
attccttcaa agcttagaga agttggaata ggaaaagata aactagaact aatggcaaag 1080
caagctgtta gaaattctgg aggaacaata ggaagtttaa gaccaataaa tgcagaggat 1140
gttcttgaga tatttaaaaa atcttattaa 1170
<210> 78
<211> 1173
<212> DNA
<213> Artificial Sequence
<220>
<223> bdhB
<400> 78
gtggttgatt tcgaatattc aataccaact agaatttttt tcggtaaaga taagataaat 60
gtacttggaa gagagcttaa aaaatatggt tctaaagtgc ttatagttta tggtggagga 120
agtataaaga gaaatggaat atatgataaa gctgtaagta tacttgaaaa aaacagtatt 180
aaattttatg aacttgcagg agtagagcca aatccaagag taactacagt tgaaaaagga 240
gttaaaatat gtagagaaaa tggagttgaa gtagtactag ctataggtgg aggaagtgca 300
atagattgcg caaaggttat agcagcagca tgtgaatatg atggaaatcc atgggatatt 360
gtgttagatg gctcaaaaat aaaaagggtg cttcctatag ctagtatatt aaccattgct 420
gcaacaggat cagaaatgga tacgtgggca gtaataaata atatggatac aaacgaaaaa 480
ctaattgcgg cacatccaga tatggctcct aagttttcta tattagatcc aacgtatacg 540
tataccgtac ctaccaatca aacagcagca ggaacagctg atattatgag tcatatattt 600
gaggtgtatt ttagtaatac aaaaacagca tatttgcagg atagaatggc agaagcgtta 660
ttaagaactt gtattaaata tggaggaata gctcttgaga agccggatga ttatgaggca 720
agagccaatc taatgtgggc ttcaagtctt gcgataaatg gacttttaac atatggtaaa 780
gacactaatt ggagtgtaca cttaatggaa catgaattaa gtgcttatta cgacataaca 840
cacggcgtag ggcttgcaat tttaacacct aattggatgg agtatatttt aaataatgat 900
acagtgtaca agtttgttga atatggtgta aatgtttggg gaatagacaa agaaaaaaat 960
cactatgaca tagcacatca agcaatacaa aaaacaagag attactttgt aaatgtacta 1020
ggtttaccat ctagactgag agatgttgga attgaagaag aaaaattgga cataatggca 1080
aaggaatcag taaagcttac aggaggaacc ataggaaacc taagaccagt aaacgcctcc 1140
gaagtcctac aaatattcaa aaaatctgtg taa 1173
<210> 79
<211> 6560
<212> DNA
<213> Artificial Sequence
<220>
<223> pGRNA-deltabdhB
<400> 79
gatccccggg taccgagctc gaattcgtaa tcatggtcat agctgtttcc tgtgtgaaat 60
tgttatccgc tcacaattcc acacaacata cgagccggaa gcataaagtg taaagcctgg 120
ggtgcctaat gagtgagcta actcacatta attgcgttgc gctcactgcc cgctttccag 180
tcgggaaacc tgtcgtgcca gctgcattaa tgaatcggcc aacgcgcggg gagaggcggt 240
ttgcgtattg ggcgctcttc cgcttcctcg ctcactgact cgctgcgctc ggtcgttcgg 300
ctgcggcgag cggtatcagc tcactcaaag gcggtaatac ggttatccac agaatcaggg 360
gataacgcag gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa ccgtaaaaag 420
gccgcgttgc tggcgttttt ccataggctc cgcccccctg acgagcatca caaaaatcga 480
cgctcaagtc agaggtggcg aaacccgaca ggactataaa gataccaggc gtttccccct 540
ggaagctccc tcgtgcgctc tcctgttccg accctgccgc ttaccggata cctgtccgcc 600
tttctccctt cgggaagcgt ggcgctttct catagctcac gctgtaggta tctcagttcg 660
gtgtaggtcg ttcgctccaa gctgggctgt gtgcacgaac cccccgttca gcccgaccgc 720
tgcgccttat ccggtaacta tcgtcttgag tccaacccgg taagacacga cttatcgcca 780
ctggcagcag ccactggtaa caggattagc agagcgaggt atgtaggcgg tgctacagag 840
ttcttgaagt ggtggcctaa ctacggctac actagaagaa cagtatttgg tatctgcgct 900
ctgctgaagc cagttacctt cggaaaaaga gttggtagct cttgatccgg caaacaaacc 960
accgctggta gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag aaaaaaagga 1020
tctcaagaag atcctttgat cttttctacg gggtctgacg ctcagtggaa cgaaaactca 1080
cgttaaggga ttttggtcat gagattatca aaaaggatct tcacctagat ccttttaaat 1140
taaaaatgaa gttttaaatc aatctaaagt atatatgagt aaacttggtc tgacagttac 1200
caaagctagc ttaatactag tatatactta atgtgataag tgtctgacag ctgaccggtc 1260
taaagaggtc cctagcgcct acggggaatt tgtatcgata aggggtacaa attcccacta 1320
agcgctcggc cggggatcga tccccgggta cgtacccggc agtttttctt tttcggcaag 1380
tgttcaagaa gttattaagt cgggagtgca gtcgaagtgg gcaagttgaa aaattcacaa 1440
aaatgtggta taatatcttt gttcattaga gcgataaact tgaatttgag agggaactta 1500
gatggtattt gaaaaaattg ataaaaatag ttggaacaga aaagagtatt ttgaccacta 1560
ctttgcaagt gtaccttgta cctacagcat gaccgttaaa gtggatatca cacaaataaa 1620
ggaaaaggga atgaaactat atcctgcaat gctttattat attgcaatga ttgtaaaccg 1680
ccattcagag tttaggacgg caatcaatca agatggtgaa ttggggatat atgatgagat 1740
gataccaagc tatacaatat ttcacaatga tactgaaaca ttttccagcc tttggactga 1800
gtgtaagtct gactttaaat catttttagc agattatgaa agtgatacgc aacggtatgg 1860
aaacaatcat agaatggaag gaaagccaaa tgctccggaa aacattttta atgtatctat 1920
gataccgtgg tcaaccttcg atggctttaa tctgaatttg cagaaaggat atgattattt 1980
gattcctatt tttactatgg ggaaatatta taaagaagat aacaaaatta tacttccttt 2040
ggcaattcaa gttcatcacg cagtatgtga cggatttcac atttgccgtt ttgtaaacga 2100
attgcaggaa ttgataaata gttaacttca ggtttgtctg taactaaaaa ctagtattta 2160
acctaggatc aaaaaaattt ccaataatcc cactctaagc cacaaacacg ccctataaaa 2220
tcccgcttta atcccacttt gagacacatg taatattact ttacgcccta gtatagtgat 2280
aattttttac attcaatgcc acgcaaaaaa ataaaggggc actataataa aagttccttc 2340
ggaactaact aaagtaaaaa attatcttta caacctcccc aaaaaaaaga acaggtacaa 2400
agtaccctat aatacaagcg taaaaaaaat gagggtaaaa ataaaaaaat aaaaaaataa 2460
aaaaataaaa aaataaaaaa ataaaaaaat aaaaaaatat aaaaataaaa aaatataaaa 2520
ataaaaaaat ataaaaataa aaaaataaaa aaatataaaa ataaaaaaat aaaaaaatat 2580
aaaaatattt tttatttaaa gtttgaaaaa aattttttta tattatataa tctttgaaga 2640
aaagaatata aaaaatgagc ctttataaaa gcccattttt tttcatatac gtaatatgac 2700
gttctaatgt ttttattggt acttctaaca ttagagtaat ttctttattt ttaaagcctt 2760
tttctttaag ggcttttatt ttttttctta atacatttaa ttcctctttt tttgttgctt 2820
ttcctttagc ttttaattgc tcttgataat tttttttacc tctaatattt tctcttctct 2880
tatattcctt tttagaaatt attattgtca tatatttttg ttcttcttct gtaatttcta 2940
ataactctat aagagtttca ttcttatact tatattgctt atttttatct aaataacatc 3000
tttcagcact tctagttgct cttataactt ctctttcact taaatgttgt ctaaacatac 3060
tattaagttc taaaacatca tttaatgcct tctcaatgtc ttctgtaaag ctacaaagat 3120
aatatctata taaaaataat ataagctctc tgtgtccttt taaatcatat tctcttagtt 3180
cacaaagttt tattatgtct tgtattcttc cataatataa acttctttct ctataaatat 3240
aatttatttt gcttggtcta ccctttttcc tttcatatgg ttttaattca ggtaaaaatc 3300
cattttgtat ttctcttaag tcataaatat attcgtactc atctaatata ttgactactg 3360
tttttgattt agagtttata cttcctggaa ctcttaatat tctcgttgca tctaaggctt 3420
gtctatctgc tccaaagtat tttaattgat tatataaata ttcttgaacc gctttccata 3480
atggtaatgc tttactaggt actgcattta ttatccatat taaatacatt cctcttccac 3540
tatctattac atagtttggt ataggaatac tttgattaaa ataattcttt tctaagtcca 3600
ttaatacctg gtctttagtt ttgccagttt tataataatc caagtctata aacagtgtat 3660
ttaactcttt tatattttct aatcgcctac acggcttata aaaggtattt agagttatat 3720
agatattttc atcactcata tctaaatctt ttaattcagc gtatttatag tgccattggc 3780
tatatccttt tttatctata acgctcctgg ttatccaccc tttacttcta ctatgaatat 3840
tatctatata gttcttttta ttcagcttta atgcgtttct cacttattca cctccccttc 3900
tgtaaaacta agaaaattat atcatatttt caataattat taactattct taaactctta 3960
ataaaaaata gagtaagtcc ccaattgaaa cttaatctat tttttatgtt ttaatttatt 4020
atttttatta aaatatttta aactaaatta aatgattctt tttaattttt tactatttca 4080
ttccataata tattactata attatttaca aataatattt cttcatttgt aatatttaga 4140
tgatttacta attttagttt ttatatatta aataattaat gtataattta tataaaaaat 4200
caaaggagct tataaattat gattatttcc aaagatacta aagatttaat ttttttcaat 4260
tttaacaata ctttttgtaa tattatgttt aaatttaatt gtattttttt catataataa 4320
agccgttgaa gtaaaccaat ccattttcct tatgatgtta ttattaaatt taagttttat 4380
aataatatct ttattatatt tattgttttt aaaaaaacta gtgaaatttc tagtgaaatt 4440
tccggcttta ttaaacttat ttttaggaat tttattttca ttttcatctt tacaggattt 4500
gattatatct ttaaatatgt tttatcaaat attatctttt tctaaattta tatatatttt 4560
tattatattt attattatat atattttatt tttaagtttc tttctaacag ctattaaaaa 4620
gaaacttaaa aataaaaaca cgtactctaa accaataaat aaaactattt ttattattgc 4680
tgccttgatt ggaatagttt ttagtaaaat taatttcaat attccacaat attatattat 4740
aagctagcac gcctcgagac tctatcattg atagagtttg aaactctatc attgatagag 4800
tataatatct ttgttcatgc ttattacgac ataacacagt tttagagcta gaaatagcaa 4860
gttaaaataa ggctagtccg ttatcaactt gaaaaagtgg caccgagtcg gtgctttttt 4920
tgaagcttct cgagatctcc atggacgcgt gacgtcgact cttaagaaca tgtataaagt 4980
atggaaaaat agcaatggag aagactgatg attacgaggc tagagctaat ttgatgtggg 5040
cttcaagttt agctataaat ggtctattat cacttggtaa ggatagaaaa tggagttgtc 5100
atcctatgga acacgagtta agtgcatatt atgatataac acatggtgta ggacttgcaa 5160
ttttaacacc taattggatg gaatatattc taaatgacga tacacttcat aaatttgttt 5220
cttatggaat aaatgtttgg ggaatagaca agaacaaaga taactatgaa atagcacgag 5280
aggctattaa aaatacgaga gaatacttta attcattggg tattccttca aagcttagag 5340
aagttggaat aggaaaagat aaactagaac taatggcaaa gcaagctgtt agaaattctg 5400
gaggaacaat aggaagttta agaccaataa atgcagagga tgttcttgag atatttaaaa 5460
aatcttatta atagaaactg tagaggtatt tttataattt aaaagatgtt aaagagtgag 5520
gagtaatttt gttctaacgc ctcactcttt tcattttatg attaaatgta tgctgattta 5580
cgctaactta aatcctaaat aataacctaa tgttaatatt ttgtaacaaa tggataaaag 5640
cgtaaaaata ttattgtaat aattttaagt aggtttaaaa tatatataat gtagaagcat 5700
tcctacatta tattatttaa ataataatct aaacaggagg ggttaaagtg gttgatttca 5760
aatctgtgta aacctaccgg ggtttgggcg tagccattat attcatgaac tccaagaaag 5820
cagtatgcta gcaaagaaat aaaactcaaa gcagagagaa aatttagaca ttcaactata 5880
aataaaaaat accccccaaa gcattaatat cttggggagt attttttatt ttgaagtatt 5940
ctgttcagct aaatattctt ctaaggtaat acctctgttc ataatttctt gtgaggcagg 6000
aagaccgata tatcttacat gccatggctc aaaattatac tttgttatgt tttctttatc 6060
cttaggatat cttattatga aaccatattt accacaattt tgttgaagcc atttataaga 6120
atttgtattc ataaatccat catctaaaga agagtattcg gttgatagta agtccattgc 6180
caatccagtt tgatgctcac ttgtaccagg ttcagctaca tatttatcag cttcggcttt 6240
tccgtctcgt gctacttttt cattatataa tttttgctga tacgaataag gtctataacc 6300
tgaaacagct agaagtgtaa gaccatcctt tgatgctgca ttaaacatat tttcaagtcc 6360
tgttgcagct tcgctctcca tttgatttac attaggatca gaactactaa taaatttaac 6420
gttaggagtt ctcaaatttt gaggtatata gtttcctgat aatttacttt gcttgtttac 6480
aagtaggatg ttctgtttct ttacctcggg tttcttggct tgttttttag gtgtagaaac 6540
tttctttttg ggttcgtttg 6560
<210> 80
<211> 6560
<212> DNA
<213> Artificial Sequence
<220>
<223> pGRNA_deltabdhA_deltabdhB
<400> 80
gatccccggg taccgagctc gaattcgtaa tcatggtcat agctgtttcc tgtgtgaaat 60
tgttatccgc tcacaattcc acacaacata cgagccggaa gcataaagtg taaagcctgg 120
ggtgcctaat gagtgagcta actcacatta attgcgttgc gctcactgcc cgctttccag 180
tcgggaaacc tgtcgtgcca gctgcattaa tgaatcggcc aacgcgcggg gagaggcggt 240
ttgcgtattg ggcgctcttc cgcttcctcg ctcactgact cgctgcgctc ggtcgttcgg 300
ctgcggcgag cggtatcagc tcactcaaag gcggtaatac ggttatccac agaatcaggg 360
gataacgcag gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa ccgtaaaaag 420
gccgcgttgc tggcgttttt ccataggctc cgcccccctg acgagcatca caaaaatcga 480
cgctcaagtc agaggtggcg aaacccgaca ggactataaa gataccaggc gtttccccct 540
ggaagctccc tcgtgcgctc tcctgttccg accctgccgc ttaccggata cctgtccgcc 600
tttctccctt cgggaagcgt ggcgctttct catagctcac gctgtaggta tctcagttcg 660
gtgtaggtcg ttcgctccaa gctgggctgt gtgcacgaac cccccgttca gcccgaccgc 720
tgcgccttat ccggtaacta tcgtcttgag tccaacccgg taagacacga cttatcgcca 780
ctggcagcag ccactggtaa caggattagc agagcgaggt atgtaggcgg tgctacagag 840
ttcttgaagt ggtggcctaa ctacggctac actagaagaa cagtatttgg tatctgcgct 900
ctgctgaagc cagttacctt cggaaaaaga gttggtagct cttgatccgg caaacaaacc 960
accgctggta gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag aaaaaaagga 1020
tctcaagaag atcctttgat cttttctacg gggtctgacg ctcagtggaa cgaaaactca 1080
cgttaaggga ttttggtcat gagattatca aaaaggatct tcacctagat ccttttaaat 1140
taaaaatgaa gttttaaatc aatctaaagt atatatgagt aaacttggtc tgacagttac 1200
caaagctagc ttaatactag tatatactta atgtgataag tgtctgacag ctgaccggtc 1260
taaagaggtc cctagcgcct acggggaatt tgtatcgata aggggtacaa attcccacta 1320
agcgctcggc cggggatcga tccccgggta cgtacccggc agtttttctt tttcggcaag 1380
tgttcaagaa gttattaagt cgggagtgca gtcgaagtgg gcaagttgaa aaattcacaa 1440
aaatgtggta taatatcttt gttcattaga gcgataaact tgaatttgag agggaactta 1500
gatggtattt gaaaaaattg ataaaaatag ttggaacaga aaagagtatt ttgaccacta 1560
ctttgcaagt gtaccttgta cctacagcat gaccgttaaa gtggatatca cacaaataaa 1620
ggaaaaggga atgaaactat atcctgcaat gctttattat attgcaatga ttgtaaaccg 1680
ccattcagag tttaggacgg caatcaatca agatggtgaa ttggggatat atgatgagat 1740
gataccaagc tatacaatat ttcacaatga tactgaaaca ttttccagcc tttggactga 1800
gtgtaagtct gactttaaat catttttagc agattatgaa agtgatacgc aacggtatgg 1860
aaacaatcat agaatggaag gaaagccaaa tgctccggaa aacattttta atgtatctat 1920
gataccgtgg tcaaccttcg atggctttaa tctgaatttg cagaaaggat atgattattt 1980
gattcctatt tttactatgg ggaaatatta taaagaagat aacaaaatta tacttccttt 2040
ggcaattcaa gttcatcacg cagtatgtga cggatttcac atttgccgtt ttgtaaacga 2100
attgcaggaa ttgataaata gttaacttca ggtttgtctg taactaaaaa ctagtattta 2160
acctaggatc aaaaaaattt ccaataatcc cactctaagc cacaaacacg ccctataaaa 2220
tcccgcttta atcccacttt gagacacatg taatattact ttacgcccta gtatagtgat 2280
aattttttac attcaatgcc acgcaaaaaa ataaaggggc actataataa aagttccttc 2340
ggaactaact aaagtaaaaa attatcttta caacctcccc aaaaaaaaga acaggtacaa 2400
agtaccctat aatacaagcg taaaaaaaat gagggtaaaa ataaaaaaat aaaaaaataa 2460
aaaaataaaa aaataaaaaa ataaaaaaat aaaaaaatat aaaaataaaa aaatataaaa 2520
ataaaaaaat ataaaaataa aaaaataaaa aaatataaaa ataaaaaaat aaaaaaatat 2580
aaaaatattt tttatttaaa gtttgaaaaa aattttttta tattatataa tctttgaaga 2640
aaagaatata aaaaatgagc ctttataaaa gcccattttt tttcatatac gtaatatgac 2700
gttctaatgt ttttattggt acttctaaca ttagagtaat ttctttattt ttaaagcctt 2760
tttctttaag ggcttttatt ttttttctta atacatttaa ttcctctttt tttgttgctt 2820
ttcctttagc ttttaattgc tcttgataat tttttttacc tctaatattt tctcttctct 2880
tatattcctt tttagaaatt attattgtca tatatttttg ttcttcttct gtaatttcta 2940
ataactctat aagagtttca ttcttatact tatattgctt atttttatct aaataacatc 3000
tttcagcact tctagttgct cttataactt ctctttcact taaatgttgt ctaaacatac 3060
tattaagttc taaaacatca tttaatgcct tctcaatgtc ttctgtaaag ctacaaagat 3120
aatatctata taaaaataat ataagctctc tgtgtccttt taaatcatat tctcttagtt 3180
cacaaagttt tattatgtct tgtattcttc cataatataa acttctttct ctataaatat 3240
aatttatttt gcttggtcta ccctttttcc tttcatatgg ttttaattca ggtaaaaatc 3300
cattttgtat ttctcttaag tcataaatat attcgtactc atctaatata ttgactactg 3360
tttttgattt agagtttata cttcctggaa ctcttaatat tctcgttgca tctaaggctt 3420
gtctatctgc tccaaagtat tttaattgat tatataaata ttcttgaacc gctttccata 3480
atggtaatgc tttactaggt actgcattta ttatccatat taaatacatt cctcttccac 3540
tatctattac atagtttggt ataggaatac tttgattaaa ataattcttt tctaagtcca 3600
ttaatacctg gtctttagtt ttgccagttt tataataatc caagtctata aacagtgtat 3660
ttaactcttt tatattttct aatcgcctac acggcttata aaaggtattt agagttatat 3720
agatattttc atcactcata tctaaatctt ttaattcagc gtatttatag tgccattggc 3780
tatatccttt tttatctata acgctcctgg ttatccaccc tttacttcta ctatgaatat 3840
tatctatata gttcttttta ttcagcttta atgcgtttct cacttattca cctccccttc 3900
tgtaaaacta agaaaattat atcatatttt caataattat taactattct taaactctta 3960
ataaaaaata gagtaagtcc ccaattgaaa cttaatctat tttttatgtt ttaatttatt 4020
atttttatta aaatatttta aactaaatta aatgattctt tttaattttt tactatttca 4080
ttccataata tattactata attatttaca aataatattt cttcatttgt aatatttaga 4140
tgatttacta attttagttt ttatatatta aataattaat gtataattta tataaaaaat 4200
caaaggagct tataaattat gattatttcc aaagatacta aagatttaat ttttttcaat 4260
tttaacaata ctttttgtaa tattatgttt aaatttaatt gtattttttt catataataa 4320
agccgttgaa gtaaaccaat ccattttcct tatgatgtta ttattaaatt taagttttat 4380
aataatatct ttattatatt tattgttttt aaaaaaacta gtgaaatttc tagtgaaatt 4440
tccggcttta ttaaacttat ttttaggaat tttattttca ttttcatctt tacaggattt 4500
gattatatct ttaaatatgt tttatcaaat attatctttt tctaaattta tatatatttt 4560
tattatattt attattatat atattttatt tttaagtttc tttctaacag ctattaaaaa 4620
gaaacttaaa aataaaaaca cgtactctaa accaataaat aaaactattt ttattattgc 4680
tgccttgatt ggaatagttt ttagtaaaat taatttcaat attccacaat attatattat 4740
aagctagcac gcctcgagac tctatcattg atagagtttg aaactctatc attgatagag 4800
tataatatct ttgttcatgc ttattacgac ataacacagt tttagagcta gaaatagcaa 4860
gttaaaataa ggctagtccg ttatcaactt gaaaaagtgg caccgagtcg gtgctttttt 4920
tgaagcttct cgagatctcc atggacgcgt gacgtcgacc ttctaatctc ctctactatt 4980
ttagggttag ctacattagc taaataggta atagctacag ttgtctttga attctcacct 5040
aaagtaagtt cttccacttt aaaatcagtg cttctaattt tttttcttaa aagggctaca 5100
tttgtggtta aagattcagt gaagccctct ctaggacctc ttattacagt ttcaacagtt 5160
ggttctgtta tagctctttc agggggtttt ccaatactta taataattgc tttactttca 5220
ccatctagga ataatgctat acttcctttt aaaatggaca atataacatc atccatgctt 5280
ttatatacat ttttatcatt aacagcaaaa attgattttg tatattcaaa tatgtttaaa 5340
tggggatggt tattgtaatc ttcttctata agttttttta taacagagga ttctattaca 5400
tcagattgga taagattatt tatgtagaca atcattgcag aaaaatttct attattagct 5460
attttaaatt ctctaatcgt taaatctgag caatttgtaa ataaggtttc tatagtatgt 5520
ttatttgttt taaggctagt tgaaaccgtc ttcgcgttat ttttagatgc ttcttcttta 5580
ttaaaaattt tattaaacaa cgaaaaattc accccctcaa tttatttata taatagtagt 5640
ttgcatgaaa tttcgttgtt tattcatatt agatgcttgt attaaaataa taaaatagta 5700
aaatataagt agacaaacta taaatctatt actaggaggt aagaagtatg ctaagtttta 5760
aatctgtgta aacctaccgg ggtttgggcg tagccattat attcatgaac tccaagaaag 5820
cagtatgcta gcaaagaaat aaaactcaaa gcagagagaa aatttagaca ttcaactata 5880
aataaaaaat accccccaaa gcattaatat cttggggagt attttttatt ttgaagtatt 5940
ctgttcagct aaatattctt ctaaggtaat acctctgttc ataatttctt gtgaggcagg 6000
aagaccgata tatcttacat gccatggctc aaaattatac tttgttatgt tttctttatc 6060
cttaggatat cttattatga aaccatattt accacaattt tgttgaagcc atttataaga 6120
atttgtattc ataaatccat catctaaaga agagtattcg gttgatagta agtccattgc 6180
caatccagtt tgatgctcac ttgtaccagg ttcagctaca tatttatcag cttcggcttt 6240
tccgtctcgt gctacttttt cattatataa tttttgctga tacgaataag gtctataacc 6300
tgaaacagct agaagtgtaa gaccatcctt tgatgctgca ttaaacatat tttcaagtcc 6360
tgttgcagct tcgctctcca tttgatttac attaggatca gaactactaa taaatttaac 6420
gttaggagtt ctcaaatttt gaggtatata gtttcctgat aatttacttt gcttgtttac 6480
aagtaggatg ttctgtttct ttacctcggg tttcttggct tgttttttag gtgtagaaac 6540
tttctttttg ggttcgtttg 6560
<210> 81
<211> 1654
<212> DNA
<213> Artificial Sequence
<220>
<223> bgaR acrIIA4 cassette
<400> 81
aaaaagtata acagaggttt taatttacgc ctctgttata ctttttattt ttgaaatttt 60
tttgttttaa agctgtattt taaatttata tacttggttt atttacttga ttatttctgt 120
aatttagtgg agacattgaa aaatgttttg aaaaagtttt tgaaaataac agggagtcac 180
tataacctac actacttgcg acttctccta taggaagttt agtgcttttt aataaaaggg 240
tggctttgta cattctaagg tttattaaat atctttgagg agaaattcca aggtttttta 300
tgaacatttt atataaataa cttctactta agttcacata atcagcaatt tcttgaacag 360
ttatgctatg catgtaatta gaattaatga aattaagagc atcttgaata tatgtgtgta 420
attccttatc tttgtattca aaaggttttg ggaattcttc tataagtgcg tacaataatg 480
agtaaagttc ttttagtaat agtatgtcat cagatcttga aggattataa gtttttgata 540
tttcgcacat atttaatatt atctgtggaa tttttgagtt ttcttcacaa ttagcaacac 600
aggagttagt aatagaagtt ctatttaaat actcattagc atttgaacca ctaaatccta 660
tccagtagta ttcccaagga tcatcaatag aagccacata ctcaacttgc atacctttta 720
gtagtataaa aatatcacct tgttttaagt tatatacctt accattaaat ttaaaagttc 780
catatccctt agttacgtaa tgaataacag catttttcaa tacttcatag ttatatccta 840
atcctggtat accttgttct ataccacatt catctacatt catttcaaag ttttctttaa 900
catacttttt ccacaatatt tgcatttcta cctcctaacc tataaaatta gccaatttta 960
tagtagtctt atattaaaca tttacatgag agctttgcaa agcagtttat caacataaaa 1020
gctttttatt ttaaaataaa ttcttctaaa tataagaata ttttaaagaa atatctttat 1080
atattagtta ttaaaattta taagattata agaaacatta taacatattt tagaactttt 1140
taactattct aaaagattaa tttacatatt aacatttaat tatgggtaaa aactattttg 1200
aaaaatgatt tatatggaat tatgtttctt aaatatacaa tcatgtttca tgaatacata 1260
attattttaa atgtattggg agggtaaaat gatattaaaa aatgaatacc atgaagatac 1320
tgcagaatct agaatccgcg gtagtcgacg tggaattgtg agcggataac aatttcacag 1380
gagggctgaa atgaatatta atgacttaat tagagaaata aaaaacaaag attacacagt 1440
gaaattgagt ggtacggata gcaatagtat aacacagcta attattagag ttaataatga 1500
tggaaacgag tatgtaattt ctgaaagtga aaatgaatca atagttgaaa aattcatatc 1560
tgcatttaaa aacggttgga atcaagaata cgaggatgaa gaagaatttt ataatgacat 1620
gcaaacaatc accttaaaaa gtgagttgaa ctaa 1654
<210> 82
<211> 4984
<212> DNA
<213> Artificial Sequence
<220>
<223> pGRNAind
<400> 82
caagcttcaa aaaaagcacc gactcggtgc cactttttca agttgataac ggactagcct 60
tattttaact tgctatttct agctctaaaa cagagaccgc tagcgatatc cccgggagat 120
ctggtctcaa tgaacaaaga tattatactc tatcaatgat agagtttcaa actctatcaa 180
tgatagagtg agctcgaatt cgtaatcatg gtcatagctg tttcctgtgt gaaattgtta 240
tccgctcaca attccacaca acatacgagc cggaagcata aagtgtaaag cctggggtgc 300
ctaatgagtg agctaactca cattaattgc gttgcgctca ctgcccgctt tccagtcggg 360
aaacctgtcg tgccagctgc attaatgaat cggccaacgc gcggggagag gcggtttgcg 420
tattgggcgc tcttccgctt cctcgctcac tgactcgctg cgctcggtcg ttcggctgcg 480
gcgagcggta tcagctcact caaaggcggt aatacggtta tccacagaat caggggataa 540
cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta aaaaggccgc 600
gttgctggcg tttttccata ggctccgccc ccctgacgag catcacaaaa atcgacgctc 660
aagtcagagg tggcgaaacc cgacaggact ataaagatac caggcgtttc cccctggaag 720
ctccctcgtg cgctctcctg ttccgaccct gccgcttacc ggatacctgt ccgcctttct 780
cccttcggga agcgtggcgc tttctcatag ctcacgctgt aggtatctca gttcggtgta 840
ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc gttcagcccg accgctgcgc 900
cttatccggt aactatcgtc ttgagtccaa cccggtaaga cacgacttat cgccactggc 960
agcagccact ggtaacagga ttagcagagc gaggtatgta ggcggtgcta cagagttctt 1020
gaagtggtgg cctaactacg gctacactag aagaacagta tttggtatct gcgctctgct 1080
gaagccagtt accttcggaa aaagagttgg tagctcttga tccggcaaac aaaccaccgc 1140
tggtagcggt ggtttttttg tttgcaagca gcagattacg cgcagaaaaa aaggatctca 1200
agaagatcct ttgatctttt ctacggggtc tgacgctcag tggaacgaaa actcacgtta 1260
agggattttg gtcatgagat tatcaaaaag gatcttcacc tagatccttt taaattaaaa 1320
atgaagtttt aaatcaatct aaagtatata tgagtaaact tggtctgaca gttaccaaag 1380
ctagcttaat actagtatat acttaatgtg ataagtgtct gacagctgac cggtctaaag 1440
aggtccctag cgcctacggg gaatttgtat cgataagggg tacaaattcc cactaagcgc 1500
tcggccgggg atcgatcccc gggtacgtac ccggcagttt ttctttttcg gcaagtgttc 1560
aagaagttat taagtcggga gtgcagtcga agtgggcaag ttgaaaaatt cacaaaaatg 1620
tggtataata tctttgttca ttagagcgat aaacttgaat ttgagaggga acttagatgg 1680
tatttgaaaa aattgataaa aatagttgga acagaaaaga gtattttgac cactactttg 1740
caagtgtacc ttgtacctac agcatgaccg ttaaagtgga tatcacacaa ataaaggaaa 1800
agggaatgaa actatatcct gcaatgcttt attatattgc aatgattgta aaccgccatt 1860
cagagtttag gacggcaatc aatcaagatg gtgaattggg gatatatgat gagatgatac 1920
caagctatac aatatttcac aatgatactg aaacattttc cagcctttgg actgagtgta 1980
agtctgactt taaatcattt ttagcagatt atgaaagtga tacgcaacgg tatggaaaca 2040
atcatagaat ggaaggaaag ccaaatgctc cggaaaacat ttttaatgta tctatgatac 2100
cgtggtcaac cttcgatggc tttaatctga atttgcagaa aggatatgat tatttgattc 2160
ctatttttac tatggggaaa tattataaag aagataacaa aattatactt cctttggcaa 2220
ttcaagttca tcacgcagta tgtgacggat ttcacatttg ccgttttgta aacgaattgc 2280
aggaattgat aaatagttaa cttcaggttt gtctgtaact aaaaactagt atttaaccta 2340
ggatcaaaaa aatttccaat aatcccactc taagccacaa acacgcccta taaaatcccg 2400
ctttaatccc actttgagac acatgtaata ttactttacg ccctagtata gtgataattt 2460
tttacattca atgccacgca aaaaaataaa ggggcactat aataaaagtt ccttcggaac 2520
taactaaagt aaaaaattat ctttacaacc tccccaaaaa aaagaacagg tacaaagtac 2580
cctataatac aagcgtaaaa aaaatgaggg taaaaataaa aaaataaaaa aataaaaaaa 2640
taaaaaaata aaaaaataaa aaaataaaaa aatataaaaa taaaaaaata taaaaataaa 2700
aaaatataaa aataaaaaaa taaaaaaata taaaaataaa aaaataaaaa aatataaaaa 2760
tattttttat ttaaagtttg aaaaaaattt ttttatatta tataatcttt gaagaaaaga 2820
atataaaaaa tgagccttta taaaagccca ttttttttca tatacgtaat atgacgttct 2880
aatgttttta ttggtacttc taacattaga gtaatttctt tatttttaaa gcctttttct 2940
ttaagggctt ttattttttt tcttaataca tttaattcct ctttttttgt tgcttttcct 3000
ttagctttta attgctcttg ataatttttt ttacctctaa tattttctct tctcttatat 3060
tcctttttag aaattattat tgtcatatat ttttgttctt cttctgtaat ttctaataac 3120
tctataagag tttcattctt atacttatat tgcttatttt tatctaaata acatctttca 3180
gcacttctag ttgctcttat aacttctctt tcacttaaat gttgtctaaa catactatta 3240
agttctaaaa catcatttaa tgccttctca atgtcttctg taaagctaca aagataatat 3300
ctatataaaa ataatataag ctctctgtgt ccttttaaat catattctct tagttcacaa 3360
agttttatta tgtcttgtat tcttccataa tataaacttc tttctctata aatataattt 3420
attttgcttg gtctaccctt tttcctttca tatggtttta attcaggtaa aaatccattt 3480
tgtatttctc ttaagtcata aatatattcg tactcatcta atatattgac tactgttttt 3540
gatttagagt ttatacttcc tggaactctt aatattctcg ttgcatctaa ggcttgtcta 3600
tctgctccaa agtattttaa ttgattatat aaatattctt gaaccgcttt ccataatggt 3660
aatgctttac taggtactgc atttattatc catattaaat acattcctct tccactatct 3720
attacatagt ttggtatagg aatactttga ttaaaataat tcttttctaa gtccattaat 3780
acctggtctt tagttttgcc agttttataa taatccaagt ctataaacag tgtatttaac 3840
tcttttatat tttctaatcg cctacacggc ttataaaagg tatttagagt tatatagata 3900
ttttcatcac tcatatctaa atcttttaat tcagcgtatt tatagtgcca ttggctatat 3960
ccttttttat ctataacgct cctggttatc caccctttac ttctactatg aatattatct 4020
atatagttct ttttattcag ctttaatgcg tttctcactt attcacctcc ccttctgtaa 4080
aactaagaaa attatatcat attttcaata attattaact attcttaaac tcttaataaa 4140
aaatagagta agtccccaat tgaaacttaa tctatttttt atgttttaat ttattatttt 4200
tattaaaata ttttaaacta aattaaatga ttctttttaa ttttttacta tttcattcca 4260
taatatatta ctataattat ttacaaataa tatttcttca tttgtaatat ttagatgatt 4320
tactaatttt agtttttata tattaaataa ttaatgtata atttatataa aaaatcaaag 4380
gagcttataa attatgatta tttccaaaga tactaaagat ttaatttttt tcaattttaa 4440
caatactttt tgtaatatta tgtttaaatt taattgtatt tttttcatat aataaagccg 4500
ttgaagtaaa ccaatccatt ttccttatga tgttattatt aaatttaagt tttataataa 4560
tatctttatt atatttattg tttttaaaaa aactagtgaa atttctagtg aaatttccgg 4620
ctttattaaa cttattttta ggaattttat tttcattttc atctttacag gatttgatta 4680
tatctttaaa tatgttttat caaatattat ctttttctaa atttatatat atttttatta 4740
tatttattat tatatatatt ttatttttaa gtttctttct aacagctatt aaaaagaaac 4800
ttaaaaataa aaacacgtac tctaaaccaa taaataaaac tatttttatt attgctgcct 4860
tgattggaat agtttttagt aaaattaatt tcaatattcc acaatattat attataagct 4920
agcacgcctc gagatctcca tggacgcgtg acgtcgactc tagaggatcc ccgggtaccg 4980
agct 4984
<210> 83
<211> 200
<212> DNA
<213> Artificial Sequence
<220>
<223> gRNA cassette
<400> 83
gagctcactc tatcattgat agagtttgaa actctatcat tgatagagta taatatcttt 60
gttcattgag accagatctc ccggggatat cgctagcggt ctctgtttta gagctagaaa 120
tagcaagtta aaataaggct agtccgttat caacttgaaa aagtggcacc gagtcggtgc 180
tttttttgaa gcttgagctc 200
<210> 84
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 84
tcatgatttc tccatattag ctag 24
<210> 85
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 85
aaacctagct aatatggaga aatc 24
<210> 86
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 86
tcatgttaca cttggaacag gcgt 24
<210> 87
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 87
aaacacgcct gttccaagtg taac 24
<210> 88
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 88
tcatttccgg cagtaggatc ccca 24
<210> 89
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 89
aaactgggga tcctactgcc ggaa 24
<210> 90
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 90
tcatgcttat tacgacataa caca 24
<210> 91
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 91
aaactgtgtt atgtcgtaat aagc 24
<210> 92
<211> 36
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 92
atgcatggat ccaaacgaac ccaaaaagaa agtttc 36
<210> 93
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 93
ggttgatttc aaatctgtgt aaacctaccg 30
<210> 94
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 94
acacagattt gaaatcaacc actttaaccc 30
<210> 95
<211> 37
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 95
atgcatgtcg actcttaaga acatgtataa agtatgg 37
<210> 96
<211> 36
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 96
atgcatggat ccaaacgaac ccaaaaagaa agtttc 36
<210> 97
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 97
gctaagtttt aaatctgtgt aaacctaccg 30
<210> 98
<211> 32
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 98
acacagattt aaaacttagc atacttctta cc 32
<210> 99
<211> 37
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 99
atgcatgtcg accttctaat ctcctctact attttag 37
<210> 100
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 100
acacattgaa gggagctttt 20
<210> 101
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 101
ggcaacaaca tcaggccttt 20
<210> 102
<211> 4966
<212> DNA
<213> Artificial Sequence
<220>
<223> pGRNA-xylB
<400> 102
atcaaaaaaa tttccaataa tcccactcta agccacaaac acgccctata aaatcccgct 60
ttaatcccac tttgagacac atgtaatatt actttacgcc ctagtatagt gataattttt 120
tacattcaat gccacgcaaa aaaataaagg ggcactataa taaaagttcc ttcggaacta 180
actaaagtaa aaaattatct ttacaacctc cccaaaaaaa agaacaggta caaagtaccc 240
tataatacaa gcgtaaaaaa aatgagggta aaaataaaaa aataaaaaaa taaaaaaata 300
aaaaaataaa aaaataaaaa aataaaaaaa tataaaaata aaaaaatata aaaataaaaa 360
aatataaaaa taaaaaaata aaaaaatata aaaataaaaa aataaaaaaa tataaaaata 420
ttttttattt aaagtttgaa aaaaattttt ttatattata taatctttga agaaaagaat 480
ataaaaaatg agcctttata aaagcccatt ttttttcata tacgtaatat gacgttctaa 540
tgtttttatt ggtacttcta acattagagt aatttcttta tttttaaagc ctttttcttt 600
aagggctttt attttttttc ttaatacatt taattcctct ttttttgttg cttttccttt 660
agcttttaat tgctcttgat aatttttttt acctctaata ttttctcttc tcttatattc 720
ctttttagaa attattattg tcatatattt ttgttcttct tctgtaattt ctaataactc 780
tataagagtt tcattcttat acttatattg cttattttta tctaaataac atctttcagc 840
acttctagtt gctcttataa cttctctttc acttaaatgt tgtctaaaca tactattaag 900
ttctaaaaca tcatttaatg ccttctcaat gtcttctgta aagctacaaa gataatatct 960
atataaaaat aatataagct ctctgtgtcc ttttaaatca tattctctta gttcacaaag 1020
ttttattatg tcttgtattc ttccataata taaacttctt tctctataaa tataatttat 1080
tttgcttggt ctaccctttt tcctttcata tggttttaat tcaggtaaaa atccattttg 1140
tatttctctt aagtcataaa tatattcgta ctcatctaat atattgacta ctgtttttga 1200
tttagagttt atacttcctg gaactcttaa tattctcgtt gcatctaagg cttgtctatc 1260
tgctccaaag tattttaatt gattatataa atattcttga accgctttcc ataatggtaa 1320
tgctttacta ggtactgcat ttattatcca tattaaatac attcctcttc cactatctat 1380
tacatagttt ggtataggaa tactttgatt aaaataattc ttttctaagt ccattaatac 1440
ctggtcttta gttttgccag ttttataata atccaagtct ataaacagtg tatttaactc 1500
ttttatattt tctaatcgcc tacacggctt ataaaaggta tttagagtta tatagatatt 1560
ttcatcactc atatctaaat cttttaattc agcgtattta tagtgccatt ggctatatcc 1620
ttttttatct ataacgctcc tggttatcca ccctttactt ctactatgaa tattatctat 1680
atagttcttt ttattcagct ttaatgcgtt tctcacttat tcacctcccc ttctgtaaaa 1740
ctaagaaaat tatatcatat tttcaataat tattaactat tcttaaactc ttaataaaaa 1800
atagagtaag tccccaattg aaacttaatc tattttttat gttttaattt attattttta 1860
ttaaaatatt ttaaactaaa ttaaatgatt ctttttaatt ttttactatt tcattccata 1920
atatattact ataattattt acaaataata tttcttcatt tgtaatattt agatgattta 1980
ctaattttag tttttatata ttaaataatt aatgtataat ttatataaaa aatcaaagga 2040
gcttataaat tatgattatt tccaaagata ctaaagattt aatttttttc aattttaaca 2100
atactttttg taatattatg tttaaattta attgtatttt tttcatataa taaagccgtt 2160
gaagtaaacc aatccatttt ccttatgatg ttattattaa atttaagttt tataataata 2220
tctttattat atttattgtt tttaaaaaaa ctagtgaaat ttctagtgaa atttccggct 2280
ttattaaact tatttttagg aattttattt tcattttcat ctttacagga tttgattata 2340
tctttaaata tgttttatca aatattatct ttttctaaat ttatatatat ttttattata 2400
tttattatta tatatatttt atttttaagt ttctttctaa cagctattaa aaagaaactt 2460
aaaaataaaa acacgtactc taaaccaata aataaaacta tttttattat tgctgccttg 2520
attggaatag tttttagtaa aattaatttc aatattccac aatattatat tataagctag 2580
cacgcctcga gaagcttcaa aaaaagcacc gactcggtgc cactttttca agttgataac 2640
ggactagcct tattttaact tgctatttct agctctaaaa cctagctaat atggagaaat 2700
catgaacaaa gatattatac tctatcaatg atagagtttc aaactctatc aatgatagag 2760
tctcgagatc tccatggacg cgtgacgtcg actctagagg atccccgggt accgagctcg 2820
aattcgtaat catggtcata gctgtttcct gtgtgaaatt gttatccgct cacaattcca 2880
cacaacatac gagccggaag cataaagtgt aaagcctggg gtgcctaatg agtgagctaa 2940
ctcacattaa ttgcgttgcg ctcactgccc gctttccagt cgggaaacct gtcgtgccag 3000
ctgcattaat gaatcggcca acgcgcgggg agaggcggtt tgcgtattgg gcgctcttcc 3060
gcttcctcgc tcactgactc gctgcgctcg gtcgttcggc tgcggcgagc ggtatcagct 3120
cactcaaagg cggtaatacg gttatccaca gaatcagggg ataacgcagg aaagaacatg 3180
tgagcaaaag gccagcaaaa ggccaggaac cgtaaaaagg ccgcgttgct ggcgtttttc 3240
cataggctcc gcccccctga cgagcatcac aaaaatcgac gctcaagtca gaggtggcga 3300
aacccgacag gactataaag ataccaggcg tttccccctg gaagctccct cgtgcgctct 3360
cctgttccga ccctgccgct taccggatac ctgtccgcct ttctcccttc gggaagcgtg 3420
gcgctttctc atagctcacg ctgtaggtat ctcagttcgg tgtaggtcgt tcgctccaag 3480
ctgggctgtg tgcacgaacc ccccgttcag cccgaccgct gcgccttatc cggtaactat 3540
cgtcttgagt ccaacccggt aagacacgac ttatcgccac tggcagcagc cactggtaac 3600
aggattagca gagcgaggta tgtaggcggt gctacagagt tcttgaagtg gtggcctaac 3660
tacggctaca ctagaagaac agtatttggt atctgcgctc tgctgaagcc agttaccttc 3720
ggaaaaagag ttggtagctc ttgatccggc aaacaaacca ccgctggtag cggtggtttt 3780
tttgtttgca agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga tcctttgatc 3840
ttttctacgg ggtctgacgc tcagtggaac gaaaactcac gttaagggat tttggtcatg 3900
agattatcaa aaaggatctt cacctagatc cttttaaatt aaaaatgaag ttttaaatca 3960
atctaaagta tatatgagta aacttggtct gacagttacc aaagctagct taatactagt 4020
atatacttaa tgtgataagt gtctgacagc tgaccggtct aaagaggtcc ctagcgccta 4080
cggggaattt gtatcgataa ggggtacaaa ttcccactaa gcgctcggcc ggggatcgat 4140
ccccgggtac gtacccggca gtttttcttt ttcggcaagt gttcaagaag ttattaagtc 4200
gggagtgcag tcgaagtggg caagttgaaa aattcacaaa aatgtggtat aatatctttg 4260
ttcattagag cgataaactt gaatttgaga gggaacttag atggtatttg aaaaaattga 4320
taaaaatagt tggaacagaa aagagtattt tgaccactac tttgcaagtg taccttgtac 4380
ctacagcatg accgttaaag tggatatcac acaaataaag gaaaagggaa tgaaactata 4440
tcctgcaatg ctttattata ttgcaatgat tgtaaaccgc cattcagagt ttaggacggc 4500
aatcaatcaa gatggtgaat tggggatata tgatgagatg ataccaagct atacaatatt 4560
tcacaatgat actgaaacat tttccagcct ttggactgag tgtaagtctg actttaaatc 4620
atttttagca gattatgaaa gtgatacgca acggtatgga aacaatcata gaatggaagg 4680
aaagccaaat gctccggaaa acatttttaa tgtatctatg ataccgtggt caaccttcga 4740
tggctttaat ctgaatttgc agaaaggata tgattatttg attcctattt ttactatggg 4800
gaaatattat aaagaagata acaaaattat acttcctttg gcaattcaag ttcatcacgc 4860
agtatgtgac ggatttcaca tttgccgttt tgtaaacgaa ttgcaggaat tgataaatag 4920
ttaacttcag gtttgtctgt aactaaaaac tagtatttaa cctagg 4966
<210> 103
<211> 4966
<212> DNA
<213> Artificial Sequence
<220>
<223> pGRNA-xylR
<400> 103
atcaaaaaaa tttccaataa tcccactcta agccacaaac acgccctata aaatcccgct 60
ttaatcccac tttgagacac atgtaatatt actttacgcc ctagtatagt gataattttt 120
tacattcaat gccacgcaaa aaaataaagg ggcactataa taaaagttcc ttcggaacta 180
actaaagtaa aaaattatct ttacaacctc cccaaaaaaa agaacaggta caaagtaccc 240
tataatacaa gcgtaaaaaa aatgagggta aaaataaaaa aataaaaaaa taaaaaaata 300
aaaaaataaa aaaataaaaa aataaaaaaa tataaaaata aaaaaatata aaaataaaaa 360
aatataaaaa taaaaaaata aaaaaatata aaaataaaaa aataaaaaaa tataaaaata 420
ttttttattt aaagtttgaa aaaaattttt ttatattata taatctttga agaaaagaat 480
ataaaaaatg agcctttata aaagcccatt ttttttcata tacgtaatat gacgttctaa 540
tgtttttatt ggtacttcta acattagagt aatttcttta tttttaaagc ctttttcttt 600
aagggctttt attttttttc ttaatacatt taattcctct ttttttgttg cttttccttt 660
agcttttaat tgctcttgat aatttttttt acctctaata ttttctcttc tcttatattc 720
ctttttagaa attattattg tcatatattt ttgttcttct tctgtaattt ctaataactc 780
tataagagtt tcattcttat acttatattg cttattttta tctaaataac atctttcagc 840
acttctagtt gctcttataa cttctctttc acttaaatgt tgtctaaaca tactattaag 900
ttctaaaaca tcatttaatg ccttctcaat gtcttctgta aagctacaaa gataatatct 960
atataaaaat aatataagct ctctgtgtcc ttttaaatca tattctctta gttcacaaag 1020
ttttattatg tcttgtattc ttccataata taaacttctt tctctataaa tataatttat 1080
tttgcttggt ctaccctttt tcctttcata tggttttaat tcaggtaaaa atccattttg 1140
tatttctctt aagtcataaa tatattcgta ctcatctaat atattgacta ctgtttttga 1200
tttagagttt atacttcctg gaactcttaa tattctcgtt gcatctaagg cttgtctatc 1260
tgctccaaag tattttaatt gattatataa atattcttga accgctttcc ataatggtaa 1320
tgctttacta ggtactgcat ttattatcca tattaaatac attcctcttc cactatctat 1380
tacatagttt ggtataggaa tactttgatt aaaataattc ttttctaagt ccattaatac 1440
ctggtcttta gttttgccag ttttataata atccaagtct ataaacagtg tatttaactc 1500
ttttatattt tctaatcgcc tacacggctt ataaaaggta tttagagtta tatagatatt 1560
ttcatcactc atatctaaat cttttaattc agcgtattta tagtgccatt ggctatatcc 1620
ttttttatct ataacgctcc tggttatcca ccctttactt ctactatgaa tattatctat 1680
atagttcttt ttattcagct ttaatgcgtt tctcacttat tcacctcccc ttctgtaaaa 1740
ctaagaaaat tatatcatat tttcaataat tattaactat tcttaaactc ttaataaaaa 1800
atagagtaag tccccaattg aaacttaatc tattttttat gttttaattt attattttta 1860
ttaaaatatt ttaaactaaa ttaaatgatt ctttttaatt ttttactatt tcattccata 1920
atatattact ataattattt acaaataata tttcttcatt tgtaatattt agatgattta 1980
ctaattttag tttttatata ttaaataatt aatgtataat ttatataaaa aatcaaagga 2040
gcttataaat tatgattatt tccaaagata ctaaagattt aatttttttc aattttaaca 2100
atactttttg taatattatg tttaaattta attgtatttt tttcatataa taaagccgtt 2160
gaagtaaacc aatccatttt ccttatgatg ttattattaa atttaagttt tataataata 2220
tctttattat atttattgtt tttaaaaaaa ctagtgaaat ttctagtgaa atttccggct 2280
ttattaaact tatttttagg aattttattt tcattttcat ctttacagga tttgattata 2340
tctttaaata tgttttatca aatattatct ttttctaaat ttatatatat ttttattata 2400
tttattatta tatatatttt atttttaagt ttctttctaa cagctattaa aaagaaactt 2460
aaaaataaaa acacgtactc taaaccaata aataaaacta tttttattat tgctgccttg 2520
attggaatag tttttagtaa aattaatttc aatattccac aatattatat tataagctag 2580
cacgcctcga gactctatca ttgatagagt ttgaaactct atcattgata gagtataata 2640
tctttgttca tgttacactt ggaacaggcg tgttttagag ctagaaatag caagttaaaa 2700
taaggctagt ccgttatcaa cttgaaaaag tggcaccgag tcggtgcttt ttttgaagct 2760
tctcgagatc tccatggacg cgtgacgtcg actctagagg atccccgggt accgagctcg 2820
aattcgtaat catggtcata gctgtttcct gtgtgaaatt gttatccgct cacaattcca 2880
cacaacatac gagccggaag cataaagtgt aaagcctggg gtgcctaatg agtgagctaa 2940
ctcacattaa ttgcgttgcg ctcactgccc gctttccagt cgggaaacct gtcgtgccag 3000
ctgcattaat gaatcggcca acgcgcgggg agaggcggtt tgcgtattgg gcgctcttcc 3060
gcttcctcgc tcactgactc gctgcgctcg gtcgttcggc tgcggcgagc ggtatcagct 3120
cactcaaagg cggtaatacg gttatccaca gaatcagggg ataacgcagg aaagaacatg 3180
tgagcaaaag gccagcaaaa ggccaggaac cgtaaaaagg ccgcgttgct ggcgtttttc 3240
cataggctcc gcccccctga cgagcatcac aaaaatcgac gctcaagtca gaggtggcga 3300
aacccgacag gactataaag ataccaggcg tttccccctg gaagctccct cgtgcgctct 3360
cctgttccga ccctgccgct taccggatac ctgtccgcct ttctcccttc gggaagcgtg 3420
gcgctttctc atagctcacg ctgtaggtat ctcagttcgg tgtaggtcgt tcgctccaag 3480
ctgggctgtg tgcacgaacc ccccgttcag cccgaccgct gcgccttatc cggtaactat 3540
cgtcttgagt ccaacccggt aagacacgac ttatcgccac tggcagcagc cactggtaac 3600
aggattagca gagcgaggta tgtaggcggt gctacagagt tcttgaagtg gtggcctaac 3660
tacggctaca ctagaagaac agtatttggt atctgcgctc tgctgaagcc agttaccttc 3720
ggaaaaagag ttggtagctc ttgatccggc aaacaaacca ccgctggtag cggtggtttt 3780
tttgtttgca agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga tcctttgatc 3840
ttttctacgg ggtctgacgc tcagtggaac gaaaactcac gttaagggat tttggtcatg 3900
agattatcaa aaaggatctt cacctagatc cttttaaatt aaaaatgaag ttttaaatca 3960
atctaaagta tatatgagta aacttggtct gacagttacc aaagctagct taatactagt 4020
atatacttaa tgtgataagt gtctgacagc tgaccggtct aaagaggtcc ctagcgccta 4080
cggggaattt gtatcgataa ggggtacaaa ttcccactaa gcgctcggcc ggggatcgat 4140
ccccgggtac gtacccggca gtttttcttt ttcggcaagt gttcaagaag ttattaagtc 4200
gggagtgcag tcgaagtggg caagttgaaa aattcacaaa aatgtggtat aatatctttg 4260
ttcattagag cgataaactt gaatttgaga gggaacttag atggtatttg aaaaaattga 4320
taaaaatagt tggaacagaa aagagtattt tgaccactac tttgcaagtg taccttgtac 4380
ctacagcatg accgttaaag tggatatcac acaaataaag gaaaagggaa tgaaactata 4440
tcctgcaatg ctttattata ttgcaatgat tgtaaaccgc cattcagagt ttaggacggc 4500
aatcaatcaa gatggtgaat tggggatata tgatgagatg ataccaagct atacaatatt 4560
tcacaatgat actgaaacat tttccagcct ttggactgag tgtaagtctg actttaaatc 4620
atttttagca gattatgaaa gtgatacgca acggtatgga aacaatcata gaatggaagg 4680
aaagccaaat gctccggaaa acatttttaa tgtatctatg ataccgtggt caaccttcga 4740
tggctttaat ctgaatttgc agaaaggata tgattatttg attcctattt ttactatggg 4800
gaaatattat aaagaagata acaaaattat acttcctttg gcaattcaag ttcatcacgc 4860
agtatgtgac ggatttcaca tttgccgttt tgtaaacgaa ttgcaggaat tgataaatag 4920
ttaacttcag gtttgtctgt aactaaaaac tagtatttaa cctagg 4966
<210> 104
<211> 4966
<212> DNA
<213> Artificial Sequence
<220>
<223> pGRNA-glcG
<400> 104
agctcggtac ccggggatcc tctagagtcg acgtcacgcg tccatggaga tctcgaggcg 60
tgctagctta taatataata ttgtggaata ttgaaattaa ttttactaaa aactattcca 120
atcaaggcag caataataaa aatagtttta tttattggtt tagagtacgt gtttttattt 180
ttaagtttct ttttaatagc tgttagaaag aaacttaaaa ataaaatata tataataata 240
aatataataa aaatatatat aaatttagaa aaagataata tttgataaaa catatttaaa 300
gatataatca aatcctgtaa agatgaaaat gaaaataaaa ttcctaaaaa taagtttaat 360
aaagccggaa atttcactag aaatttcact agttttttta aaaacaataa atataataaa 420
gatattatta taaaacttaa atttaataat aacatcataa ggaaaatgga ttggtttact 480
tcaacggctt tattatatga aaaaaataca attaaattta aacataatat tacaaaaagt 540
attgttaaaa ttgaaaaaaa ttaaatcttt agtatctttg gaaataatca taatttataa 600
gctcctttga ttttttatat aaattataca ttaattattt aatatataaa aactaaaatt 660
agtaaatcat ctaaatatta caaatgaaga aatattattt gtaaataatt atagtaatat 720
attatggaat gaaatagtaa aaaattaaaa agaatcattt aatttagttt aaaatatttt 780
aataaaaata ataaattaaa acataaaaaa tagattaagt ttcaattggg gacttactct 840
attttttatt aagagtttaa gaatagttaa taattattga aaatatgata taattttctt 900
agttttacag aaggggaggt gaataagtga gaaacgcatt aaagctgaat aaaaagaact 960
atatagataa tattcatagt agaagtaaag ggtggataac caggagcgtt atagataaaa 1020
aaggatatag ccaatggcac tataaatacg ctgaattaaa agatttagat atgagtgatg 1080
aaaatatcta tataactcta aatacctttt ataagccgtg taggcgatta gaaaatataa 1140
aagagttaaa tacactgttt atagacttgg attattataa aactggcaaa actaaagacc 1200
aggtattaat ggacttagaa aagaattatt ttaatcaaag tattcctata ccaaactatg 1260
taatagatag tggaagagga atgtatttaa tatggataat aaatgcagta cctagtaaag 1320
cattaccatt atggaaagcg gttcaagaat atttatataa tcaattaaaa tactttggag 1380
cagatagaca agccttagat gcaacgagaa tattaagagt tccaggaagt ataaactcta 1440
aatcaaaaac agtagtcaat atattagatg agtacgaata tatttatgac ttaagagaaa 1500
tacaaaatgg atttttacct gaattaaaac catatgaaag gaaaaagggt agaccaagca 1560
aaataaatta tatttataga gaaagaagtt tatattatgg aagaatacaa gacataataa 1620
aactttgtga actaagagaa tatgatttaa aaggacacag agagcttata ttatttttat 1680
atagatatta tctttgtagc tttacagaag acattgagaa ggcattaaat gatgttttag 1740
aacttaatag tatgtttaga caacatttaa gtgaaagaga agttataaga gcaactagaa 1800
gtgctgaaag atgttattta gataaaaata agcaatataa gtataagaat gaaactctta 1860
tagagttatt agaaattaca gaagaagaac aaaaatatat gacaataata atttctaaaa 1920
aggaatataa gagaagagaa aatattagag gtaaaaaaaa ttatcaagag caattaaaag 1980
ctaaaggaaa agcaacaaaa aaagaggaat taaatgtatt aagaaaaaaa ataaaagccc 2040
ttaaagaaaa aggctttaaa aataaagaaa ttactctaat gttagaagta ccaataaaaa 2100
cattagaacg tcatattacg tatatgaaaa aaaatgggct tttataaagg ctcatttttt 2160
atattctttt cttcaaagat tatataatat aaaaaaattt ttttcaaact ttaaataaaa 2220
aatattttta tattttttta tttttttatt tttatatttt tttatttttt tatttttata 2280
tttttttatt tttatatttt tttattttta tattttttta tttttttatt tttttatttt 2340
tttatttttt tattttttta tttttttatt tttaccctca ttttttttac gcttgtatta 2400
tagggtactt tgtacctgtt cttttttttg gggaggttgt aaagataatt ttttacttta 2460
gttagttccg aaggaacttt tattatagtg cccctttatt tttttgcgtg gcattgaatg 2520
taaaaaatta tcactatact agggcgtaaa gtaatattac atgtgtctca aagtgggatt 2580
aaagcgggat tttatagggc gtgtttgtgg cttagagtgg gattattgga aatttttttg 2640
atcctaggtt aaatactagt ttttagttac agacaaacct gaagttaact atttatcaat 2700
tcctgcaatt cgtttacaaa acggcaaatg tgaaatccgt cacatactgc gtgatgaact 2760
tgaattgcca aaggaagtat aattttgtta tcttctttat aatatttccc catagtaaaa 2820
ataggaatca aataatcata tcctttctgc aaattcagat taaagccatc gaaggttgac 2880
cacggtatca tagatacatt aaaaatgttt tccggagcat ttggctttcc ttccattcta 2940
tgattgtttc cataccgttg cgtatcactt tcataatctg ctaaaaatga tttaaagtca 3000
gacttacact cagtccaaag gctggaaaat gtttcagtat cattgtgaaa tattgtatag 3060
cttggtatca tctcatcata tatccccaat tcaccatctt gattgattgc cgtcctaaac 3120
tctgaatggc ggtttacaat cattgcaata taataaagca ttgcaggata tagtttcatt 3180
cccttttcct ttatttgtgt gatatccact ttaacggtca tgctgtaggt acaaggtaca 3240
cttgcaaagt agtggtcaaa atactctttt ctgttccaac tatttttatc aattttttca 3300
aataccatct aagttccctc tcaaattcaa gtttatcgct ctaatgaaca aagatattat 3360
accacatttt tgtgaatttt tcaacttgcc cacttcgact gcactcccga cttaataact 3420
tcttgaacac ttgccgaaaa agaaaaactg ccgggtacgt acccggggat cgatccccgg 3480
ccgagcgctt agtgggaatt tgtacccctt atcgatacaa attccccgta ggcgctaggg 3540
acctctttag accggtcagc tgtcagacac ttatcacatt aagtatatac tagtattaag 3600
ctagctttgg taactgtcag accaagttta ctcatatata ctttagattg atttaaaact 3660
tcatttttaa tttaaaagga tctaggtgaa gatccttttt gataatctca tgaccaaaat 3720
cccttaacgt gagttttcgt tccactgagc gtcagacccc gtagaaaaga tcaaaggatc 3780
ttcttgagat cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct 3840
accagcggtg gtttgtttgc cggatcaaga gctaccaact ctttttccga aggtaactgg 3900
cttcagcaga gcgcagatac caaatactgt tcttctagtg tagccgtagt taggccacca 3960
cttcaagaac tctgtagcac cgcctacata cctcgctctg ctaatcctgt taccagtggc 4020
tgctgccagt ggcgataagt cgtgtcttac cgggttggac tcaagacgat agttaccgga 4080
taaggcgcag cggtcgggct gaacgggggg ttcgtgcaca cagcccagct tggagcgaac 4140
gacctacacc gaactgagat acctacagcg tgagctatga gaaagcgcca cgcttcccga 4200
agggagaaag gcggacaggt atccggtaag cggcagggtc ggaacaggag agcgcacgag 4260
ggagcttcca gggggaaacg cctggtatct ttatagtcct gtcgggtttc gccacctctg 4320
acttgagcgt cgatttttgt gatgctcgtc aggggggcgg agcctatgga aaaacgccag 4380
caacgcggcc tttttacggt tcctggcctt ttgctggcct tttgctcaca tgttctttcc 4440
tgcgttatcc cctgattctg tggataaccg tattaccgcc tttgagtgag ctgataccgc 4500
tcgccgcagc cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg aagagcgccc 4560
aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat taatgcagct ggcacgacag 4620
gtttcccgac tggaaagcgg gcagtgagcg caacgcaatt aatgtgagtt agctcactca 4680
ttaggcaccc caggctttac actttatgct tccggctcgt atgttgtgtg gaattgtgag 4740
cggataacaa tttcacacag gaaacagcta tgaccatgat tacgaattcg agctcactct 4800
atcattgata gagtttgaaa ctctatcatt gatagagtat aatatctttg ttcatttccg 4860
gcagtaggat ccccagtttt agagctagaa atagcaagtt aaaataaggc tagtccgtta 4920
tcaacttgaa aaagtggcac cgagtcggtg ctttttttga agcttg 4966
<210> 105
<211> 4938
<212> DNA
<213> Artificial Sequence
<220>
<223> pGRNA-bdhB
<400> 105
atcaaaaaaa tttccaataa tcccactcta agccacaaac acgccctata aaatcccgct 60
ttaatcccac tttgagacac atgtaatatt actttacgcc ctagtatagt gataattttt 120
tacattcaat gccacgcaaa aaaataaagg ggcactataa taaaagttcc ttcggaacta 180
actaaagtaa aaaattatct ttacaacctc cccaaaaaaa agaacaggta caaagtaccc 240
tataatacaa gcgtaaaaaa aatgagggta aaaataaaaa aataaaaaaa taaaaaaata 300
aaaaaataaa aaaataaaaa aataaaaaaa tataaaaata aaaaaatata aaaataaaaa 360
aatataaaaa taaaaaaata aaaaaatata aaaataaaaa aataaaaaaa tataaaaata 420
ttttttattt aaagtttgaa aaaaattttt ttatattata taatctttga agaaaagaat 480
ataaaaaatg agcctttata aaagcccatt ttttttcata tacgtaatat gacgttctaa 540
tgtttttatt ggtacttcta acattagagt aatttcttta tttttaaagc ctttttcttt 600
aagggctttt attttttttc ttaatacatt taattcctct ttttttgttg cttttccttt 660
agcttttaat tgctcttgat aatttttttt acctctaata ttttctcttc tcttatattc 720
ctttttagaa attattattg tcatatattt ttgttcttct tctgtaattt ctaataactc 780
tataagagtt tcattcttat acttatattg cttattttta tctaaataac atctttcagc 840
acttctagtt gctcttataa cttctctttc acttaaatgt tgtctaaaca tactattaag 900
ttctaaaaca tcatttaatg ccttctcaat gtcttctgta aagctacaaa gataatatct 960
atataaaaat aatataagct ctctgtgtcc ttttaaatca tattctctta gttcacaaag 1020
ttttattatg tcttgtattc ttccataata taaacttctt tctctataaa tataatttat 1080
tttgcttggt ctaccctttt tcctttcata tggttttaat tcaggtaaaa atccattttg 1140
tatttctctt aagtcataaa tatattcgta ctcatctaat atattgacta ctgtttttga 1200
tttagagttt atacttcctg gaactcttaa tattctcgtt gcatctaagg cttgtctatc 1260
tgctccaaag tattttaatt gattatataa atattcttga accgctttcc ataatggtaa 1320
tgctttacta ggtactgcat ttattatcca tattaaatac attcctcttc cactatctat 1380
tacatagttt ggtataggaa tactttgatt aaaataattc ttttctaagt ccattaatac 1440
ctggtcttta gttttgccag ttttataata atccaagtct ataaacagtg tatttaactc 1500
ttttatattt tctaatcgcc tacacggctt ataaaaggta tttagagtta tatagatatt 1560
ttcatcactc atatctaaat cttttaattc agcgtattta tagtgccatt ggctatatcc 1620
ttttttatct ataacgctcc tggttatcca ccctttactt ctactatgaa tattatctat 1680
atagttcttt ttattcagct ttaatgcgtt tctcacttat tcacctcccc ttctgtaaaa 1740
ctaagaaaat tatatcatat tttcaataat tattaactat tcttaaactc ttaataaaaa 1800
atagagtaag tccccaattg aaacttaatc tattttttat gttttaattt attattttta 1860
ttaaaatatt ttaaactaaa ttaaatgatt ctttttaatt ttttactatt tcattccata 1920
atatattact ataattattt acaaataata tttcttcatt tgtaatattt agatgattta 1980
ctaattttag tttttatata ttaaataatt aatgtataat ttatataaaa aatcaaagga 2040
gcttataaat tatgattatt tccaaagata ctaaagattt aatttttttc aattttaaca 2100
atactttttg taatattatg tttaaattta attgtatttt tttcatataa taaagccgtt 2160
gaagtaaacc aatccatttt ccttatgatg ttattattaa atttaagttt tataataata 2220
tctttattat atttattgtt tttaaaaaaa ctagtgaaat ttctagtgaa atttccggct 2280
ttattaaact tatttttagg aattttattt tcattttcat ctttacagga tttgattata 2340
tctttaaata tgttttatca aatattatct ttttctaaat ttatatatat ttttattata 2400
tttattatta tatatatttt atttttaagt ttctttctaa cagctattaa aaagaaactt 2460
aaaaataaaa acacgtactc taaaccaata aataaaacta tttttattat tgctgccttg 2520
attggaatag tttttagtaa aattaatttc aatattccac aatattatat tataagctag 2580
cacgcctcga gtatattgat aaaaataata atagtgggta taattaagtt gttaggaggt 2640
tagttagagc ttattacgac ataacacagt tttagagcta gaaatagcaa gttaaaataa 2700
ggctagtccg ttatcaactt gaaaaagtgg caccgagtcg gtgctttttt tgaagcttgt 2760
cgactctaga ggatccccgg gtaccgagct cgaattcgta atcatggtca tagctgtttc 2820
ctgtgtgaaa ttgttatccg ctcacaattc cacacaacat acgagccgga agcataaagt 2880
gtaaagcctg gggtgcctaa tgagtgagct aactcacatt aattgcgttg cgctcactgc 2940
ccgctttcca gtcgggaaac ctgtcgtgcc agctgcatta atgaatcggc caacgcgcgg 3000
ggagaggcgg tttgcgtatt gggcgctctt ccgcttcctc gctcactgac tcgctgcgct 3060
cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca 3120
cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga 3180
accgtaaaaa ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc 3240
acaaaaatcg acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg 3300
cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat 3360
acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt 3420
atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc 3480
agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg 3540
acttatcgcc actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg 3600
gtgctacaga gttcttgaag tggtggccta actacggcta cactagaaga acagtatttg 3660
gtatctgcgc tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg 3720
gcaaacaaac caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca 3780
gaaaaaaagg atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga 3840
acgaaaactc acgttaaggg attttggtca tgagattatc aaaaaggatc ttcacctaga 3900
tccttttaaa ttaaaaatga agttttaaat caatctaaag tatatatgag taaacttggt 3960
ctgacagtta ccaaagctag cttaatacta gtatatactt aatgtgataa gtgtctgaca 4020
gctgaccggt ctaaagaggt ccctagcgcc tacggggaat ttgtatcgat aaggggtaca 4080
aattcccact aagcgctcgg ccggggatcg atccccgggt acgtacccgg cagtttttct 4140
ttttcggcaa gtgttcaaga agttattaag tcgggagtgc agtcgaagtg ggcaagttga 4200
aaaattcaca aaaatgtggt ataatatctt tgttcattag agcgataaac ttgaatttga 4260
gagggaactt agatggtatt tgaaaaaatt gataaaaata gttggaacag aaaagagtat 4320
tttgaccact actttgcaag tgtaccttgt acctacagca tgaccgttaa agtggatatc 4380
acacaaataa aggaaaaggg aatgaaacta tatcctgcaa tgctttatta tattgcaatg 4440
attgtaaacc gccattcaga gtttaggacg gcaatcaatc aagatggtga attggggata 4500
tatgatgaga tgataccaag ctatacaata tttcacaatg atactgaaac attttccagc 4560
ctttggactg agtgtaagtc tgactttaaa tcatttttag cagattatga aagtgatacg 4620
caacggtatg gaaacaatca tagaatggaa ggaaagccaa atgctccgga aaacattttt 4680
aatgtatcta tgataccgtg gtcaaccttc gatggcttta atctgaattt gcagaaagga 4740
tatgattatt tgattcctat ttttactatg gggaaatatt ataaagaaga taacaaaatt 4800
atacttcctt tggcaattca agttcatcac gcagtatgtg acggatttca catttgccgt 4860
tttgtaaacg aattgcagga attgataaat agttaacttc aggtttgtct gtaactaaaa 4920
actagtattt aacctagg 4938
<210> 106
<211> 4790
<212> DNA
<213> Artificial Sequence
<220>
<223> pEC750C
<400> 106
atcaaaaaaa tttccaataa tcccactcta agccacaaac acgccctata aaatcccgct 60
ttaatcccac tttgagacac atgtaatatt actttacgcc ctagtatagt gataattttt 120
tacattcaat gccacgcaaa aaaataaagg ggcactataa taaaagttcc ttcggaacta 180
actaaagtaa aaaattatct ttacaacctc cccaaaaaaa agaacaggta caaagtaccc 240
tataatacaa gcgtaaaaaa aatgagggta aaaataaaaa aataaaaaaa taaaaaaata 300
aaaaaataaa aaaataaaaa aataaaaaaa tataaaaata aaaaaatata aaaataaaaa 360
aatataaaaa taaaaaaata aaaaaatata aaaataaaaa aataaaaaaa tataaaaata 420
ttttttattt aaagtttgaa aaaaattttt ttatattata taatctttga agaaaagaat 480
ataaaaaatg agcctttata aaagcccatt ttttttcata tacgtaatat gacgttctaa 540
tgtttttatt ggtacttcta acattagagt aatttcttta tttttaaagc ctttttcttt 600
aagggctttt attttttttc ttaatacatt taattcctct ttttttgttg cttttccttt 660
agcttttaat tgctcttgat aatttttttt acctctaata ttttctcttc tcttatattc 720
ctttttagaa attattattg tcatatattt ttgttcttct tctgtaattt ctaataactc 780
tataagagtt tcattcttat acttatattg cttattttta tctaaataac atctttcagc 840
acttctagtt gctcttataa cttctctttc acttaaatgt tgtctaaaca tactattaag 900
ttctaaaaca tcatttaatg ccttctcaat gtcttctgta aagctacaaa gataatatct 960
atataaaaat aatataagct ctctgtgtcc ttttaaatca tattctctta gttcacaaag 1020
ttttattatg tcttgtattc ttccataata taaacttctt tctctataaa tataatttat 1080
tttgcttggt ctaccctttt tcctttcata tggttttaat tcaggtaaaa atccattttg 1140
tatttctctt aagtcataaa tatattcgta ctcatctaat atattgacta ctgtttttga 1200
tttagagttt atacttcctg gaactcttaa tattctcgtt gcatctaagg cttgtctatc 1260
tgctccaaag tattttaatt gattatataa atattcttga accgctttcc ataatggtaa 1320
tgctttacta ggtactgcat ttattatcca tattaaatac attcctcttc cactatctat 1380
tacatagttt ggtataggaa tactttgatt aaaataattc ttttctaagt ccattaatac 1440
ctggtcttta gttttgccag ttttataata atccaagtct ataaacagtg tatttaactc 1500
ttttatattt tctaatcgcc tacacggctt ataaaaggta tttagagtta tatagatatt 1560
ttcatcactc atatctaaat cttttaattc agcgtattta tagtgccatt ggctatatcc 1620
ttttttatct ataacgctcc tggttatcca ccctttactt ctactatgaa tattatctat 1680
atagttcttt ttattcagct ttaatgcgtt tctcacttat tcacctcccc ttctgtaaaa 1740
ctaagaaaat tatatcatat tttcaataat tattaactat tcttaaactc ttaataaaaa 1800
atagagtaag tccccaattg aaacttaatc tattttttat gttttaattt attattttta 1860
ttaaaatatt ttaaactaaa ttaaatgatt ctttttaatt ttttactatt tcattccata 1920
atatattact ataattattt acaaataata tttcttcatt tgtaatattt agatgattta 1980
ctaattttag tttttatata ttaaataatt aatgtataat ttatataaaa aatcaaagga 2040
gcttataaat tatgattatt tccaaagata ctaaagattt aatttttttc aattttaaca 2100
atactttttg taatattatg tttaaattta attgtatttt tttcatataa taaagccgtt 2160
gaagtaaacc aatccatttt ccttatgatg ttattattaa atttaagttt tataataata 2220
tctttattat atttattgtt tttaaaaaaa ctagtgaaat ttctagtgaa atttccggct 2280
ttattaaact tatttttagg aattttattt tcattttcat ctttacagga tttgattata 2340
tctttaaata tgttttatca aatattatct ttttctaaat ttatatatat ttttattata 2400
tttattatta tatatatttt atttttaagt ttctttctaa cagctattaa aaagaaactt 2460
aaaaataaaa acacgtactc taaaccaata aataaaacta tttttattat tgctgccttg 2520
attggaatag tttttagtaa aattaatttc aatattccac aatattatat tataagctag 2580
cacgcctcga gatctccatg gacgcgtgac gtcgactcta gaggatcccc gggtaccgag 2640
ctcgaattcg taatcatggt catagctgtt tcctgtgtga aattgttatc cgctcacaat 2700
tccacacaac atacgagccg gaagcataaa gtgtaaagcc tggggtgcct aatgagtgag 2760
ctaactcaca ttaattgcgt tgcgctcact gcccgctttc cagtcgggaa acctgtcgtg 2820
ccagctgcat taatgaatcg gccaacgcgc ggggagaggc ggtttgcgta ttgggcgctc 2880
ttccgcttcc tcgctcactg actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc 2940
agctcactca aaggcggtaa tacggttatc cacagaatca ggggataacg caggaaagaa 3000
catgtgagca aaaggccagc aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt 3060
tttccatagg ctccgccccc ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg 3120
gcgaaacccg acaggactat aaagatacca ggcgtttccc cctggaagct ccctcgtgcg 3180
ctctcctgtt ccgaccctgc cgcttaccgg atacctgtcc gcctttctcc cttcgggaag 3240
cgtggcgctt tctcatagct cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc 3300
caagctgggc tgtgtgcacg aaccccccgt tcagcccgac cgctgcgcct tatccggtaa 3360
ctatcgtctt gagtccaacc cggtaagaca cgacttatcg ccactggcag cagccactgg 3420
taacaggatt agcagagcga ggtatgtagg cggtgctaca gagttcttga agtggtggcc 3480
taactacggc tacactagaa gaacagtatt tggtatctgc gctctgctga agccagttac 3540
cttcggaaaa agagttggta gctcttgatc cggcaaacaa accaccgctg gtagcggtgg 3600
tttttttgtt tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag aagatccttt 3660
gatcttttct acggggtctg acgctcagtg gaacgaaaac tcacgttaag ggattttggt 3720
catgagatta tcaaaaagga tcttcaccta gatcctttta aattaaaaat gaagttttaa 3780
atcaatctaa agtatatatg agtaaacttg gtctgacagt taccaaagct agcttaatac 3840
tagtatatac ttaatgtgat aagtgtctga cagctgaccg gtctaaagag gtccctagcg 3900
cctacgggga atttgtatcg ataaggggta caaattccca ctaagcgctc ggccggggat 3960
cgatccccgg gtacgtaccc ggcagttttt ctttttcggc aagtgttcaa gaagttatta 4020
agtcgggagt gcagtcgaag tgggcaagtt gaaaaattca caaaaatgtg gtataatatc 4080
tttgttcatt agagcgataa acttgaattt gagagggaac ttagatggta tttgaaaaaa 4140
ttgataaaaa tagttggaac agaaaagagt attttgacca ctactttgca agtgtacctt 4200
gtacctacag catgaccgtt aaagtggata tcacacaaat aaaggaaaag ggaatgaaac 4260
tatatcctgc aatgctttat tatattgcaa tgattgtaaa ccgccattca gagtttagga 4320
cggcaatcaa tcaagatggt gaattgggga tatatgatga gatgatacca agctatacaa 4380
tatttcacaa tgatactgaa acattttcca gcctttggac tgagtgtaag tctgacttta 4440
aatcattttt agcagattat gaaagtgata cgcaacggta tggaaacaat catagaatgg 4500
aaggaaagcc aaatgctccg gaaaacattt ttaatgtatc tatgataccg tggtcaacct 4560
tcgatggctt taatctgaat ttgcagaaag gatatgatta tttgattcct atttttacta 4620
tggggaaata ttataaagaa gataacaaaa ttatacttcc tttggcaatt caagttcatc 4680
acgcagtatg tgacggattt cacatttgcc gttttgtaaa cgaattgcag gaattgataa 4740
atagttaact tcaggtttgt ctgtaactaa aaactagtat ttaacctagg 4790
<210> 107
<211> 35
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 107
acttgggtcg accacgataa aacaaggttt taagg 35
<210> 108
<211> 42
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 108
taccagggat ccgtattaat gtaactatga tatcaattct tg 42
<210> 109
<211> 46
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 109
atgcatggtc ccaatgaata ggtttacact tactttagtt ttatgg 46
<210> 110
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 110
atgcgagtta acaacttcta aaatctgatt accaattag 39
<210> 111
<211> 47
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 111
atgcatggat cccaatgaat aggtttacac ttactttagt tttatgg 47
<210> 112
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 112
atgcgagagc tcaacttcta aaatctgatt accaattag 39
<210> 113
<211> 32
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 113
atgcatggat ccgtctgaca gttaccaggt cc 32
<210> 114
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 114
atgcgagagc tccaattgtt caaaaaaata atggcggag 39
<210> 115
<211> 32
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 115
atgcatggat cccggcagtt tttctttttc gg 32
<210> 116
<211> 40
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 116
atgcgagagc tcggttaaat actagttttt agttacagac 40
<210> 117
<211> 2686
<212> DNA
<213> Artificial Sequence
<220>
<223> pUC19
<400> 117
gcgcccaata cgcaaaccgc ctctccccgc gcgttggccg attcattaat gcagctggca 60
cgacaggttt cccgactgga aagcgggcag tgagcgcaac gcaattaatg tgagttagct 120
cactcattag gcaccccagg ctttacactt tatgcttccg gctcgtatgt tgtgtggaat 180
tgtgagcgga taacaatttc acacaggaaa cagctatgac catgattacg ccaagcttgc 240
atgcctgcag gtcgactcta gaggatcccc gggtaccgag ctcgaattca ctggccgtcg 300
ttttacaacg tcgtgactgg gaaaaccctg gcgttaccca acttaatcgc cttgcagcac 360
atcccccttt cgccagctgg cgtaatagcg aagaggcccg caccgatcgc ccttcccaac 420
agttgcgcag cctgaatggc gaatggcgcc tgatgcggta ttttctcctt acgcatctgt 480
gcggtatttc acaccgcata tggtgcactc tcagtacaat ctgctctgat gccgcatagt 540
taagccagcc ccgacacccg ccaacacccg ctgacgcgcc ctgacgggct tgtctgctcc 600
cggcatccgc ttacagacaa gctgtgaccg tctccgggag ctgcatgtgt cagaggtttt 660
caccgtcatc accgaaacgc gcgagacgaa agggcctcgt gatacgccta tttttatagg 720
ttaatgtcat gataataatg gtttcttaga cgtcaggtgg cacttttcgg ggaaatgtgc 780
gcggaacccc tatttgttta tttttctaaa tacattcaaa tatgtatccg ctcatgagac 840
aataaccctg ataaatgctt caataatatt gaaaaaggaa gagtatgagt attcaacatt 900
tccgtgtcgc ccttattccc ttttttgcgg cattttgcct tcctgttttt gctcacccag 960
aaacgctggt gaaagtaaaa gatgctgaag atcagttggg tgcacgagtg ggttacatcg 1020
aactggatct caacagcggt aagatccttg agagttttcg ccccgaagaa cgttttccaa 1080
tgatgagcac ttttaaagtt ctgctatgtg gcgcggtatt atcccgtatt gacgccgggc 1140
aagagcaact cggtcgccgc atacactatt ctcagaatga cttggttgag tactcaccag 1200
tcacagaaaa gcatcttacg gatggcatga cagtaagaga attatgcagt gctgccataa 1260
ccatgagtga taacactgcg gccaacttac ttctgacaac gatcggagga ccgaaggagc 1320
taaccgcttt tttgcacaac atgggggatc atgtaactcg ccttgatcgt tgggaaccgg 1380
agctgaatga agccatacca aacgacgagc gtgacaccac gatgcctgta gcaatggcaa 1440
caacgttgcg caaactatta actggcgaac tacttactct agcttcccgg caacaattaa 1500
tagactggat ggaggcggat aaagttgcag gaccacttct gcgctcggcc cttccggctg 1560
gctggtttat tgctgataaa tctggagccg gtgagcgtgg gtctcgcggt atcattgcag 1620
cactggggcc agatggtaag ccctcccgta tcgtagttat ctacacgacg gggagtcagg 1680
caactatgga tgaacgaaat agacagatcg ctgagatagg tgcctcactg attaagcatt 1740
ggtaactgtc agaccaagtt tactcatata tactttagat tgatttaaaa cttcattttt 1800
aatttaaaag gatctaggtg aagatccttt ttgataatct catgaccaaa atcccttaac 1860
gtgagttttc gttccactga gcgtcagacc ccgtagaaaa gatcaaagga tcttcttgag 1920
atcctttttt tctgcgcgta atctgctgct tgcaaacaaa aaaaccaccg ctaccagcgg 1980
tggtttgttt gccggatcaa gagctaccaa ctctttttcc gaaggtaact ggcttcagca 2040
gagcgcagat accaaatact gttcttctag tgtagccgta gttaggccac cacttcaaga 2100
actctgtagc accgcctaca tacctcgctc tgctaatcct gttaccagtg gctgctgcca 2160
gtggcgataa gtcgtgtctt accgggttgg actcaagacg atagttaccg gataaggcgc 2220
agcggtcggg ctgaacgggg ggttcgtgca cacagcccag cttggagcga acgacctaca 2280
ccgaactgag atacctacag cgtgagctat gagaaagcgc cacgcttccc gaagggagaa 2340
aggcggacag gtatccggta agcggcaggg tcggaacagg agagcgcacg agggagcttc 2400
cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt tcgccacctc tgacttgagc 2460
gtcgattttt gtgatgctcg tcaggggggc ggagcctatg gaaaaacgcc agcaacgcgg 2520
cctttttacg gttcctggcc ttttgctggc cttttgctca catgttcttt cctgcgttat 2580
cccctgattc tgtggataac cgtattaccg cctttgagtg agctgatacc gctcgccgca 2640
gccgaacgac cgagcgcagc gagtcagtga gcgaggaagc ggaaga 2686
<210> 118
<211> 4282
<212> DNA
<213> Artificial Sequence
<220>
<223> pNF2
<400> 118
ctggagagga ttgtccttat acttatcata agcatgaagg acttgttatt cctagataga 60
gaattaatta tgttaaagag atataataaa ctcattataa ttataatttt tagtataatt 120
attattgcaa ttttttcgta taaatatcta ataatgccaa aagagcatag aatagaaatt 180
tcaacattat caaacataga agtttttaaa tttaatagtt tttcaaagtt tagtaacgaa 240
aaaatgtata ctattaatga tagtgataag ttaataaaat tcaaaacact atttaataat 300
ttagataaat caaaagatat aaaaaagatt agtattccgg aaagtgaaaa tttaaatgca 360
tttaaatttt ctgcacatat aaaacttaac tttaactatg ttaataaaga tagccaaata 420
actgaaggtg cttttcttat gtatattttg gtagacaatt tagaagggaa gtcatatatg 480
acttttttag gacaagattc aagctatata ttagatagta atgaaactaa cattttaaga 540
gaaatattta tgaattcaga gattaattaa tttatgaatt cataaatatt atctaagcac 600
gataaaacaa ggttttaagg ataagaaaag tcatgagatt tatagtaaat cttgtgactt 660
tttttattga atagtagaga gagttcggaa gtataacacg ctatattctt gatattttta 720
gaatagcaag cattggattt gtcctgacac tttcccaaaa attaaggagt tattccttaa 780
accaaaaaga ttaatgtggg aacaaattta gtgtatccat ttttgaaggg cgcacttata 840
caccaccaaa atggtgtgtg cgaaatcttt aaaaaagatt tatcaaaaag cttttttaaa 900
gctgggacat ttagaaaatc aataatgttt tttgcccaat acgctagtct taaaatctgc 960
aaggttgata actatttagt cccaggtatt agaatggggc atatatatac aaagtatata 1020
tatgcgtaaa tatatgtggg actgtgggaa caaaattgcg tgctaaaatt gtattgaaaa 1080
ggtaatgaaa aggtcatgct ttggtattgc taacgtatag aaaaggtaat gaaaagctca 1140
tggttctata aaaaagatgt acccacgaaa ataataggct ttgcctattt ccccatgtaa 1200
tatgggggca gttttctctt atgctctttc ttaacatatt gaataaatac aaaatgcagc 1260
tttgtgggaa taaaaatatt tttgttttta ttcttatagt tagacaaaat tttaatcttt 1320
tttgtgctat aacaagatta aaatttgtgg gaacattaag aaatattgtt gtcacaaata 1380
aaaaggagag tgggaacaat tgctataaaa aacgcagaaa ttaagattag agttacaaaa 1440
gagcaaaaag aattatttaa gaaaattgca aaagctgaaa atatgagtat gagtgaattt 1500
attattgtga ccacagaata tttagccaga aaaaaagatg aaaatatgaa atcaaaagac 1560
atgatcgaga gaagagctgc gaagactgaa gaaaaaatta tgaagctaaa aaagaaacta 1620
aataaaaaca ggtaatatag attacagttt taagcttgtt ttccctatag actagagtaa 1680
atatataaat atacctgtca agggcttata agccccttta gggggtgcgt agcacccttg 1740
acaggtatat ttatatattt tagggtgcca ttaagggaaa caagctttaa aatgccttta 1800
aaggcatttt aaaataaata aaaaaaagat ggtttttacc atctttttta actcccgaaa 1860
gggagttctt tcttttcttg atactatacg taactatttc gatttgccct gaacctaatc 1920
aaagctagat aaattcagta ttagggcata aaaaaacttg ctttttcggg tggaaatctg 1980
tataatttaa attgcttaga taaaaattac caattccata cgaaaggagc aagttttaca 2040
taaggttaaa gccttatgtg aattctcatt taattacatg aataataata acacagaaag 2100
tgaagaatta aaagagcaaa gtcaactatt gcttgacaaa tgcacaaaaa agaaaaagaa 2160
aaatcctaaa tttagtagtt atatagaacc attagtaagc aagaaattat ctgaaagaat 2220
aaaggaatgt ggtgactttt tgcagatgtt atctgattta aaccttgaaa attcgaaact 2280
gcatagagca agtttttgtg gtaacagatt ttgtcctatg tgtagctggc gtattgcttg 2340
taaggatagt ttggaaatat ctattctcat ggagcattta cgcaaagagg aaagcaaaga 2400
atttatcttt ttgaccttaa caactccaaa tgtgaaaggt gcggaccttg ataattccat 2460
aaaagcatac aataaagcat ttaaaaagtt aatggaacgc aaagaggtca agagcatagt 2520
aaaaggctac ataagaaagc tagaagtaac ctataatttg gacaagagtt ccaaatcata 2580
taatacttat cacccacatt tccatgtggt actagcagtc aatagaagtt actttaaaaa 2640
gcaaaatcta tatataaacc atcatagatg gcttagtttg tggcaagagt caactggtga 2700
ttattcgata actcaagttg atgtaagaaa ggctaaaatt aacgattata aagaggttta 2760
tgagcttgct aagtattcgg ctaaggattc cgactattta atcaatagag aagtgtttac 2820
ggtattctac aaatctttaa agggtaaaca ggtacttgta tttagtggat tatttaaaga 2880
cgctcataaa atgtataaga atggagagct agatctgtat aagaagttgg atactatcga 2940
atatgcttat atggtaagtt ataactggct taaaaagaag tatgatactt caaatattag 3000
agaattaact gaggaagaaa agcagaaatt caataaaaat ttaatcgaag atgtggatat 3060
tgagtaggtg ggattatatc tcaccttttt tattgtcttt tcatgttgaa attttgacgc 3120
ttaatgcatg aagtattgac aagtttaaaa attacggttt ttaatcctta gttgattagc 3180
aggattatgg ccggaatgct ccgtccagtc ctgttaagga attaaaattc cctaaaaccc 3240
ttggctatga tttatagcga gaatcgtcaa ttaaaaattt aataggtgct atgaaagtcg 3300
attaataatt aattttaaaa tgcaatatga aacataatta caagaatttg acttttaata 3360
caagaattga tatcatagtt acattaatac atttattttg aagggggaaa atgttttatg 3420
aaaagactac ttaaactacc tattttatca ttattaggat tatttttaat tggatcaact 3480
ccaacattag ctttaactaa agataataat caaaatttag atactatgaa agtaaactta 3540
tatactgaaa cagtagatgt gtttgataaa gatgcattta aacaaacatt tactaataaa 3600
gatataaaat ttctagagga ttctttgaat gcaaaaataa attattcagg taaatctgtt 3660
acagtaacaa tgaaaaacaa aattaagcca tctactaaac aagggcttgt tttatatgta 3720
aatggaaaat cagttaatgt tgattcagat ggcagtataa aagtacctaa agatactaag 3780
aaaatttcta aattaaataa agataaatca atgatggatg gatcaatgat ggataaatca 3840
ttacatgatg agaattgtgt agtatcagat agtttttata atgctgatgt taataatata 3900
aattcaaaag aagcagaagc tgtatttaaa gtaagttctg gtgaattatt agctaaaatg 3960
gatgaaaaag aagatgatta catacaaaag aactcatcta aaattctagc agctgcttat 4020
cataagggat atggggacaa gtactatgaa ggagattggg ttcattgcaa taggtttaat 4080
ggtcaactta cagatgatgt tcactataat tggagaactg gaagtgtttc agaaaaagca 4140
gctgcaatga gaaattttta tggcagtgat tgtcatatag cattagttca agcaggtagt 4200
ggatgtacaa gtataggttc atgcgaatgc aatacagatc aaatagctgc gtattgttca 4260
ggtttcgtaa aagataaaaa ta 4282
<210> 119
<211> 5473
<212> DNA
<213> Artificial Sequence
<220>
<223> pNF3
<400> 119
gcgcccaata cgcaaaccgc ctctccccgc gcgttggccg attcattaat gcagctggca 60
cgacaggttt cccgactgga aagcgggcag tgagcgcaac gcaattaatg tgagttagct 120
cactcattag gcaccccagg ctttacactt tatgcttccg gctcgtatgt tgtgtggaat 180
tgtgagcgga taacaatttc acacaggaaa cagctatgac catgattacg ccaagcttgc 240
atgcctgcag gtcgaccacg ataaaacaag gttttaagga taagaaaagt catgagattt 300
atagtaaatc ttgtgacttt ttttattgaa tagtagagag agttcggaag tataacacgc 360
tatattcttg atatttttag aatagcaagc attggatttg tcctgacact ttcccaaaaa 420
ttaaggagtt attccttaaa ccaaaaagat taatgtggga acaaatttag tgtatccatt 480
tttgaagggc gcacttatac accaccaaaa tggtgtgtgc gaaatcttta aaaaagattt 540
atcaaaaagc ttttttaaag ctgggacatt tagaaaatca ataatgtttt ttgcccaata 600
cgctagtctt aaaatctgca aggttgataa ctatttagtc ccaggtatta gaatggggca 660
tatatataca aagtatatat atgcgtaaat atatgtggga ctgtgggaac aaaattgcgt 720
gctaaaattg tattgaaaag gtaatgaaaa ggtcatgctt tggtattgct aacgtataga 780
aaaggtaatg aaaagctcat ggttctataa aaaagatgta cccacgaaaa taataggctt 840
tgcctatttc cccatgtaat atgggggcag ttttctctta tgctctttct taacatattg 900
aataaataca aaatgcagct ttgtgggaat aaaaatattt ttgtttttat tcttatagtt 960
agacaaaatt ttaatctttt ttgtgctata acaagattaa aatttgtggg aacattaaga 1020
aatattgttg tcacaaataa aaaggagagt gggaacaatt gctataaaaa acgcagaaat 1080
taagattaga gttacaaaag agcaaaaaga attatttaag aaaattgcaa aagctgaaaa 1140
tatgagtatg agtgaattta ttattgtgac cacagaatat ttagccagaa aaaaagatga 1200
aaatatgaaa tcaaaagaca tgatcgagag aagagctgcg aagactgaag aaaaaattat 1260
gaagctaaaa aagaaactaa ataaaaacag gtaatataga ttacagtttt aagcttgttt 1320
tccctataga ctagagtaaa tatataaata tacctgtcaa gggcttataa gcccctttag 1380
ggggtgcgta gcacccttga caggtatatt tatatatttt agggtgccat taagggaaac 1440
aagctttaaa atgcctttaa aggcatttta aaataaataa aaaaaagatg gtttttacca 1500
tcttttttaa ctcccgaaag ggagttcttt cttttcttga tactatacgt aactatttcg 1560
atttgccctg aacctaatca aagctagata aattcagtat tagggcataa aaaaacttgc 1620
tttttcgggt ggaaatctgt ataatttaaa ttgcttagat aaaaattacc aattccatac 1680
gaaaggagca agttttacat aaggttaaag ccttatgtga attctcattt aattacatga 1740
ataataataa cacagaaagt gaagaattaa aagagcaaag tcaactattg cttgacaaat 1800
gcacaaaaaa gaaaaagaaa aatcctaaat ttagtagtta tatagaacca ttagtaagca 1860
agaaattatc tgaaagaata aaggaatgtg gtgacttttt gcagatgtta tctgatttaa 1920
accttgaaaa ttcgaaactg catagagcaa gtttttgtgg taacagattt tgtcctatgt 1980
gtagctggcg tattgcttgt aaggatagtt tggaaatatc tattctcatg gagcatttac 2040
gcaaagagga aagcaaagaa tttatctttt tgaccttaac aactccaaat gtgaaaggtg 2100
cggaccttga taattccata aaagcataca ataaagcatt taaaaagtta atggaacgca 2160
aagaggtcaa gagcatagta aaaggctaca taagaaagct agaagtaacc tataatttgg 2220
acaagagttc caaatcatat aatacttatc acccacattt ccatgtggta ctagcagtca 2280
atagaagtta ctttaaaaag caaaatctat atataaacca tcatagatgg cttagtttgt 2340
ggcaagagtc aactggtgat tattcgataa ctcaagttga tgtaagaaag gctaaaatta 2400
acgattataa agaggtttat gagcttgcta agtattcggc taaggattcc gactatttaa 2460
tcaatagaga agtgtttacg gtattctaca aatctttaaa gggtaaacag gtacttgtat 2520
ttagtggatt atttaaagac gctcataaaa tgtataagaa tggagagcta gatctgtata 2580
agaagttgga tactatcgaa tatgcttata tggtaagtta taactggctt aaaaagaagt 2640
atgatacttc aaatattaga gaattaactg aggaagaaaa gcagaaattc aataaaaatt 2700
taatcgaaga tgtggatatt gagtaggtgg gattatatct cacctttttt attgtctttt 2760
catgttgaaa ttttgacgct taatgcatga agtattgaca agtttaaaaa ttacggtttt 2820
taatccttag ttgattagca ggattatggc cggaatgctc cgtccagtcc tgttaaggaa 2880
ttaaaattcc ctaaaaccct tggctatgat ttatagcgag aatcgtcaat taaaaattta 2940
ataggtgcta tgaaagtcga ttaataatta attttaaaat gcaatatgaa acataattac 3000
aagaatttga cttttaatac aagaattgat atcatagtta cattaatacg gatccccggg 3060
taccgagctc gaattcactg gccgtcgttt tacaacgtcg tgactgggaa aaccctggcg 3120
ttacccaact taatcgcctt gcagcacatc cccctttcgc cagctggcgt aatagcgaag 3180
aggcccgcac cgatcgccct tcccaacagt tgcgcagcct gaatggcgaa tggcgcctga 3240
tgcggtattt tctccttacg catctgtgcg gtatttcaca ccgcatatgg tgcactctca 3300
gtacaatctg ctctgatgcc gcatagttaa gccagccccg acacccgcca acacccgctg 3360
acgcgccctg acgggcttgt ctgctcccgg catccgctta cagacaagct gtgaccgtct 3420
ccgggagctg catgtgtcag aggttttcac cgtcatcacc gaaacgcgcg agacgaaagg 3480
gcctcgtgat acgcctattt ttataggtta atgtcatgat aataatggtt tcttagacgt 3540
caggtggcac ttttcgggga aatgtgcgcg gaacccctat ttgtttattt ttctaaatac 3600
attcaaatat gtatccgctc atgagacaat aaccctgata aatgcttcaa taatattgaa 3660
aaaggaagag tatgagtatt caacatttcc gtgtcgccct tattcccttt tttgcggcat 3720
tttgccttcc tgtttttgct cacccagaaa cgctggtgaa agtaaaagat gctgaagatc 3780
agttgggtgc acgagtgggt tacatcgaac tggatctcaa cagcggtaag atccttgaga 3840
gttttcgccc cgaagaacgt tttccaatga tgagcacttt taaagttctg ctatgtggcg 3900
cggtattatc ccgtattgac gccgggcaag agcaactcgg tcgccgcata cactattctc 3960
agaatgactt ggttgagtac tcaccagtca cagaaaagca tcttacggat ggcatgacag 4020
taagagaatt atgcagtgct gccataacca tgagtgataa cactgcggcc aacttacttc 4080
tgacaacgat cggaggaccg aaggagctaa ccgctttttt gcacaacatg ggggatcatg 4140
taactcgcct tgatcgttgg gaaccggagc tgaatgaagc cataccaaac gacgagcgtg 4200
acaccacgat gcctgtagca atggcaacaa cgttgcgcaa actattaact ggcgaactac 4260
ttactctagc ttcccggcaa caattaatag actggatgga ggcggataaa gttgcaggac 4320
cacttctgcg ctcggccctt ccggctggct ggtttattgc tgataaatct ggagccggtg 4380
agcgtgggtc tcgcggtatc attgcagcac tggggccaga tggtaagccc tcccgtatcg 4440
tagttatcta cacgacgggg agtcaggcaa ctatggatga acgaaataga cagatcgctg 4500
agataggtgc ctcactgatt aagcattggt aactgtcaga ccaagtttac tcatatatac 4560
tttagattga tttaaaactt catttttaat ttaaaaggat ctaggtgaag atcctttttg 4620
ataatctcat gaccaaaatc ccttaacgtg agttttcgtt ccactgagcg tcagaccccg 4680
tagaaaagat caaaggatct tcttgagatc ctttttttct gcgcgtaatc tgctgcttgc 4740
aaacaaaaaa accaccgcta ccagcggtgg tttgtttgcc ggatcaagag ctaccaactc 4800
tttttccgaa ggtaactggc ttcagcagag cgcagatacc aaatactgtt cttctagtgt 4860
agccgtagtt aggccaccac ttcaagaact ctgtagcacc gcctacatac ctcgctctgc 4920
taatcctgtt accagtggct gctgccagtg gcgataagtc gtgtcttacc gggttggact 4980
caagacgata gttaccggat aaggcgcagc ggtcgggctg aacggggggt tcgtgcacac 5040
agcccagctt ggagcgaacg acctacaccg aactgagata cctacagcgt gagctatgag 5100
aaagcgccac gcttcccgaa gggagaaagg cggacaggta tccggtaagc ggcagggtcg 5160
gaacaggaga gcgcacgagg gagcttccag ggggaaacgc ctggtatctt tatagtcctg 5220
tcgggtttcg ccacctctga cttgagcgtc gatttttgtg atgctcgtca ggggggcgga 5280
gcctatggaa aaacgccagc aacgcggcct ttttacggtt cctggccttt tgctggcctt 5340
ttgctcacat gttctttcct gcgttatccc ctgattctgt ggataaccgt attaccgcct 5400
ttgagtgagc tgataccgct cgccgcagcc gaacgaccga gcgcagcgag tcagtgagcg 5460
aggaagcgga aga 5473
<210> 120
<211> 9128
<212> DNA
<213> Artificial Sequence
<220>
<223> pMTL007S-E1
<400> 120
gatcgggccc cctgcagggt gtagtagcct gtgaaataag taaggaaaaa aaagaagtaa 60
gtgttatata tgatgattat tttgtagatg tagataggat aatagaatcc atagaaaata 120
taggttatac agttatataa aaattacttt aaaaattaat aaaaacatgg taaaatataa 180
atcgtataaa gttgtgtaat ttttaagctt gagctcataa caatttcaca caggaaacag 240
ctatgaccat gattacggat tcactggccg tcgttttaca acgtcgtgac tgggaaaacc 300
ctggcgttac ccaacttaat cgccttgcag cacatccccc tttcgccagc tggcgtaata 360
gcgaagaggc ccgcaccgat cgcccttccc aacagttgcg cagcctgaat ggcgaatggc 420
gctaataaag atcttgtaca atctgtagga gaacctatgg gaacgaaacg aaagcgatgc 480
cgagaatctg aatttaccaa gacttaacac taactgggga taccctaaac aagaatgcct 540
aatagaaagg aggaaaaagg ctatagcact agagcttgaa aatcttgcaa gggtacggag 600
tactcgtagt agtctgagaa gggtaacgcc ctttacatgg caaaggggta cagttattgt 660
gtactaaaat taaaaattga ttagggagga aaacctcaaa atgaaaccaa caatggcaat 720
tttagaaaga atcagtaaaa attcacaaga aaatatagac gaagttttta caagacttta 780
tcgttatctt ttacgtccag atatttatta cgtggcgacg cgtgcgactc atagaattat 840
ttcctcccgt taaataatag ataactatta aaaatagaca atacttgctc ataagtaacg 900
gtacttaaat tgtttacttt ggcgtgtttc attgcttgat gaaactgatt tttagtaaac 960
agttgacgat attctcgatt gacccatttt gaaacaaagt acgtatatag cttccaatat 1020
ttatctggaa catctgtggt atggcgggta agttttatta agacactgtt tacttttggt 1080
ttaggatgaa agcattccgc tggcagctta agcaattgct gaatcgagac ttgagtgtgc 1140
aagagcaacc ctagtgttcg gtgaatatcc aaggtacgct tgtagaatcc ttcttcaaca 1200
atcagataga tgtcagacgc atggctttca aaaaccactt ttttaataat ttgtgtgctt 1260
aaatggtaag gaatactccc aacaatttta tacctctgtt tgttagggaa ttgaaactgt 1320
agaatatctt ggtgaattaa agtgacacga gtattcagtt ttaatttttc tgacgataag 1380
ttgaatagat gactgtctaa ttcaatagac gttacctgtt tacttatttt agccagtttc 1440
gtcgttaaat gccctttacc tgttccaatt tcgtaaacgg tatcggtttc ttttaaattc 1500
aattgtttta ttatttggtt gagtactttt tcactcgtta aaaagttttg agaatatttt 1560
atatttttgt tcataccagc accagaagca ccagcatctc ttgggttaat tgaggcctga 1620
gtataaggtg acttatactt gtaatctatc taaacgggga acctctctag tagacaatcc 1680
cgtgctaaat tgtaggactg ccctttaata aatacttcta tatttaaaga ggtatttatg 1740
aaaagcggaa tttatcagat taaaaatact ttctctagag aaaatttcgt ctggattagt 1800
tacttatcgt gtaaaatctg ataaatggaa ttggttctac ataaatgcct aacgactatc 1860
cctttgggga gtagggtcaa gtgactcgaa acgatagaca acttgcttta acaagttgga 1920
gatatagtct gctctgcatg gtgacatgca gctggatata attccggggt aagattaacg 1980
accttatctg aacataatgc catatgaatc cctcctaatt tatacgtttt ctctaacaac 2040
ttaattatac ccactattat tatttttatc aatataacgc gttgggaaat ggcaatgata 2100
gcgaaacaac gtaaaactct tgttgtatgc tttcattgtc atcgtcacgt gattcataaa 2160
cacaagtgaa tgtcgacagt gaatttttac gaacgaacaa taacagagcc gtatactccg 2220
agaggggtac gtacggttcc cgaagagggt ggtgcaaacc agtcacagta atgtgaacaa 2280
ggcggtacct ccctacttca ccatatcatt ttctgcagcc ccctagaaat aattttgttt 2340
aactttaaga aggagatata catatatggc tagatcgtcc attccgacag catcgccagt 2400
cactatggcg tgctgctagc gctatatgcg ttgatgcaat ttctatgcac tcgtagtagt 2460
ctgagaaggg taacgccctt tacatggcaa aggggtacag ttattgtgta ctaaaattaa 2520
aaattgatta gggaggaaaa cctcaaaatg aaaccaacaa tggcaatttt agaaagaatc 2580
agtaaaaatt cacaagaaaa tatagacgaa gtttttacaa gactttatcg ttatctttta 2640
cgtccagata tttattacgt ggcgtatcaa aatttatatt ccaataaagg agcttccaca 2700
aaaggaatat tagatgatac agcggatggc tttagtgaag aaaaaataaa aaagattatt 2760
caatctttaa aagacggaac ttactatcct caacctgtac gaagaatgta tattgcaaaa 2820
aagaattcta aaaagatgag acctttagga attccaactt tcacagataa attgatccaa 2880
gaagctgtga gaataattct tgaatctatc tatgaaccgg tattcgaaga tgtgtctcac 2940
ggttttagac ctcaacgaag ctgtcacaca gctttgaaaa caatcaaaag agagtttggc 3000
ggcgcaagat ggtttgtgga gggagatata aaaggctgct tcgataatat agaccacgtt 3060
acactcattg gactcatcaa tcttaaaatc aaagatatga aaatgagcca attgatttat 3120
aaatttctaa aagcaggtta tctggaaaac tggcagtatc acaaaactta cagcggaaca 3180
cctcaaggtg gaattctatc tcctcttttg gccaacatct atcttcatga attggataag 3240
tttgttttac aactcaaaat gaagtttgac cgagaaagtc cagaaagaat aacacctgaa 3300
tatcgggagc tccacaatga gataaaaaga atttctcacc gtctcaagaa gttggagggt 3360
gaagaaaaag ctaaagttct tttagaatat caagaaaaac gtaaaagatt acccacactc 3420
ccctgtacct cacagacaaa taaagtattg aaatacgtcc ggtatgcgga cgacttcatt 3480
atctctgtta aaggaagcaa agaggactgt caatggataa aagaacaatt aaaacttttt 3540
attcataaca agctaaaaat ggaattgagt gaagaaaaaa cactcatcac acatagcagt 3600
caacccgctc gttttctggg atatgatata cgagtaagga gatctggaac gataaaacga 3660
tctggtaaag tcaaaaagag aacactcaat gggagtgtag aactccttat tcctcttcaa 3720
gacaaaattc gtcaatttat ttttgacaag aaaatagcta tccaaaagaa agatagctca 3780
tggtttccag ttcacaggaa atatcttatt cgttcaacag acttagaaat catcacaatt 3840
tataattctg aactccgcgg gatttgtaat tactacggtc tagcaagtaa ttttaaccag 3900
ctcaattatt ttgcttatct tatggaatac agctgtctaa aaacgatagc ctccaaacat 3960
aagggaacac tttcaaaaac catttccatg tttaaagatg gaagtggttc gtgggggatc 4020
ccgtatgaga taaagcaagg taagcagcgc cgttattttg caaattttag tgaatgtaaa 4080
tccccttatc aatttacgga tgagataagt caagctcctg tattgtatgg ctatgcccgg 4140
aatactcttg aaaacaggtt aaaagctaaa tgttgtgaat tatgtgggac gtctgatgaa 4200
aatacttcct atgaaattca ccatgtcaat aaggtcaaaa atcttaaagg caaagaaaaa 4260
tgggaaatgg caatgatagc gaaacaacgt aaaactcttg ttgtatgctt tcattgtcat 4320
cgtcacgtga ttcataaaca caagtgaatg tcgagcaccc gttctcggag cactgtccga 4380
ccgctttggc cgccgcccag tcctgctcgc ttcgctactt ggagccacta tcgactacgc 4440
gatcatggcg accacacccg tcctgtggat cgccaagccg ccgatggtag tgtggggtct 4500
ccccatgcga gagtagggaa ctgccaggca tcaaataaaa cgaaaggctc agtcgaaaga 4560
ctgggccttt cgttttatct gttgtttgtc ggtgaacgct ctcctgagta ggacaaatcc 4620
gccgggagcg gatttgaacg ttgcgaagca acggcccgga gggtggcggg caggacgccc 4680
gccataaact gccaggcatc aaattaagca gaaggccatc ctgacggatg gcctttttgc 4740
gtttctacaa actcttcctg tcgtcatatc tacaagccat ccccccacag atacgggcgc 4800
gccgccatta tttttttgaa caattgacaa ttcatttctt attttttatt aagtgatagt 4860
caaaaggcat aacagtgctg aatagaaaga aatttacaga aaagaaaatt atagaattta 4920
gtatgattaa ttatactcat ttatgaatgt ttaattgaat acaaaaaaaa atacttgtta 4980
tgtattcaat tacgggttaa aatatagaca agttgaaaaa tttaataaaa aaataagtcc 5040
tcagctctta tatattaagc taccaactta gtatataagc caaaacttaa atgtgctacc 5100
aacacatcaa gccgttagag aactctatct atagcaatat ttcaaatgta ccgacataca 5160
agagaaacat taactatata tattcaattt atgagattat cttaacagat ataaatgtaa 5220
attgcaataa gtaagattta gaagtttata gcctttgtgt attggaagca gtacgcaaag 5280
gcttttttat ttgataaaaa ttagaagtat atttattttt tcataattaa tttatgaaaa 5340
tgaaaggggg tgagcaaagt gacagaggaa agcagtatct tatcaaataa caaggtatta 5400
gcaatatcat tattgacttt agcagtaaac attatgactt ttatagtgct tgtagctaag 5460
tagtacgaaa gggggagctt taaaaagctc cttggaatac atagaattca taaattaatt 5520
tatgaaaaga agggcgtata tgaaaacttg taaaaattgc aaagagttta ttaaagatac 5580
tgaaatatgc aaaatacatt cgttgatgat tcatgataaa acagtagcaa cctattgcag 5640
taaatacaat gagtcaagat gtttacataa agggaaagtc caatgtatta attgttcaaa 5700
gatgaaccga tatggatggt gtgccataaa aatgagatgt tttacagagg aagaacagaa 5760
aaaagaacgt acatgcatta aatattatgc aaggagcttt aaaaaagctc atgtaaagaa 5820
gagtaaaaag aaaaaataat ttatttatta atttaatatt gagagtgccg acacagtatg 5880
cactaaaaaa tatatctgtg gtgtagtgag ccgatacaaa aggatagtca ctcgcatttt 5940
cataatacat cttatgttat gattatgtgt cggtgggact tcacgacgaa aacccacaat 6000
aaaaaaagag ttcggggtag ggttaagcat agttgaggca actaaacaat caagctagga 6060
tatgcagtag cagaccgtaa ggtcgttgtt taggtgtgtt gtaatacata cgctattaag 6120
atgtaaaaat acggatacca atgaagggaa aagtataatt tttggatgta gtttgtttgt 6180
tcatctatgg gcaaactacg tccaaagccg tttccaaatc tgctaaaaag tatatccttt 6240
ctaaaatcaa agtcaagtat gaaatcataa ataaagttta attttgaagt tattatgata 6300
ttatgttttt ctattaaaat aaattaagta tatagaatag tttaataata gtatatactt 6360
aatgtgataa gtgtctgaca gtgtcacaga aaggatgatt gttatggatt ataagcggcc 6420
ggcccaatga ataggtttac acttacttta gttttatgga aatgaaagat catatcatat 6480
ataatctaga ataaaattaa ctaaaataat tattatctag ataaaaaatt tagaagccaa 6540
tgaaatctat aaataaacta aattaagttt atttaattaa caactatgga tataaaatag 6600
gtactaatca aaatagtgag gaggatatat ttgaatacat acgaacaaat taataaagtg 6660
aaaaaaatac ttcggaaaca tttaaaaaat aaccttattg gtacttacat gtttggatca 6720
ggagttgaga gtggactaaa accaaatagt gatcttgact ttttagtcgt cgtatctgaa 6780
ccattgacag atcaaagtaa agaaatactt atacaaaaaa ttagacctat ttcaaagaaa 6840
ataggagata aaagcaactt acgatatatt gaattaacaa ttattattca gcaagaaatg 6900
gtaccgtgga atcatcctcc caaacaagaa tttatttatg gagaatggtt acaagagctt 6960
tatgaacaag gatacattcc tcagaaggaa ttaaattcag atttaaccat aatgctttac 7020
caagcaaaac gaaaaaataa aagaatatac ggaaattatg acttagagga attactacct 7080
gatattccat tttctgatgt gagaagagcc attatggatt cgtcagagga attaatagat 7140
aattatcagg atgatgaaac caactctata ttaactttat gccgtatgat tttaactatg 7200
gacacgggta aaatcatacc aaaagatatt gcgggaaatg cagtggctga atcttctcca 7260
ttagaacata gggagagaat tttgttagca gttcgtagtt atcttggaga gaatattgaa 7320
tggactaatg aaaatgtaaa tttaactata aactatttaa ataacagatt aaaaaaatta 7380
taaaaaaatt gaaaaaatgg tggaaacact tttttcaatt tttttgtttt attatttaat 7440
atttgggaaa tattcattct aattggtaat cagattttag aagtttaaac tcctttttga 7500
taatctcatg accaaaatcc cttaacgtga gttttcgttc cactgagcgt cagaccccgt 7560
agaaaagatc aaaggatctt cttgagatcc tttttttctg cgcgtaatct gctgcttgca 7620
aacaaaaaaa ccaccgctac cagcggtggt ttgtttgccg gatcaagagc taccaactct 7680
ttttccgaag gtaactggct tcagcagagc gcagatacca aatactgttc ttctagtgta 7740
gccgtagtta ggccaccact tcaagaactc tgtagcaccg cctacatacc tcgctctgct 7800
aatcctgtta ccagtggctg ctgccagtgg cgataagtcg tgtcttaccg ggttggactc 7860
aagacgatag ttaccggata aggcgcagcg gtcgggctga acggggggtt cgtgcacaca 7920
gcccagcttg gagcgaacga cctacaccga actgagatac ctacagcgtg agctatgaga 7980
aagcgccacg cttcccgaag ggagaaaggc ggacaggtat ccggtaagcg gcagggtcgg 8040
aacaggagag cgcacgaggg agcttccagg gggaaacgcc tggtatcttt atagtcctgt 8100
cgggtttcgc cacctctgac ttgagcgtcg atttttgtga tgctcgtcag gggggcggag 8160
cctatggaaa aacgccagca acgcggcctt tttacggttc ctggcctttt gctggccttt 8220
tgctcacatg ttctttcctg cgttatcccc tgattctgtg gataaccgta ttaccgcctt 8280
tgagtgagct gataccgctc gccgcagccg aacgaccgag cgcagcgagt cagtgagcga 8340
ggaagcggaa gagcgcccaa tacgcagggc cccctgcttc ggggtcatta tagcgatttt 8400
ttcggtatat ccatcctttt tcgcacgata tacaggattt tgccaaaggg ttcgtgtaga 8460
ctttccttgg tgtatccaac ggcgtcagcc gggcaggata ggtgaagtag gcccacccgc 8520
gagcgggtgt tccttcttca ctgtccctta ttcgcacctg gcggtgctca acgggaatcc 8580
tgctctgcga ggctggccgg ctaccgccgg cgtaacagat gagggcaagc ggatggctga 8640
tgaaaccaag ccaaccagga agggcagccc acctatcaag gtgtactgcc ttccagacga 8700
acgaagagcg attgaggaaa aggcggcggc ggccggcatg agcctgtcgg cctacctgct 8760
ggccgtcggc cagggctaca aaatcacggg cgtcgtggac tatgagcacg tccgcgagct 8820
ggcccgcatc aatggcgacc tgggccgcct gggcggcctg ctgaaactct ggctcaccga 8880
cgacccgcgc acggcgcggt tcggtgatgc cacgatcctc gccctgctgg cgaagatcga 8940
agagaagcag gacgagcttg gcaaggtcat gatgggcgtg gtccgcccga gggcagagcc 9000
atgacttttt tagccgctaa aacggccggg gggtgcgcgt gattgccaag cacgtcccca 9060
tgcgctccat caagaagagc gacttcgcgg agctggtgaa gtacatcacc gacgagcaag 9120
gcaagacc 9128
<210> 121
<211> 5002
<212> DNA
<213> Artificial Sequence
<220>
<223> pEC751S
<400> 121
atcaaaaaaa tttccaataa tcccactcta agccacaaac acgccctata aaatcccgct 60
ttaatcccac tttgagacac atgtaatatt actttacgcc ctagtatagt gataattttt 120
tacattcaat gccacgcaaa aaaataaagg ggcactataa taaaagttcc ttcggaacta 180
actaaagtaa aaaattatct ttacaacctc cccaaaaaaa agaacaggta caaagtaccc 240
tataatacaa gcgtaaaaaa aatgagggta aaaataaaaa aataaaaaaa taaaaaaata 300
aaaaaataaa aaaataaaaa aataaaaaaa tataaaaata aaaaaatata aaaataaaaa 360
aatataaaaa taaaaaaata aaaaaatata aaaataaaaa aataaaaaaa tataaaaata 420
ttttttattt aaagtttgaa aaaaattttt ttatattata taatctttga agaaaagaat 480
ataaaaaatg agcctttata aaagcccatt ttttttcata tacgtaatat gacgttctaa 540
tgtttttatt ggtacttcta acattagagt aatttcttta tttttaaagc ctttttcttt 600
aagggctttt attttttttc ttaatacatt taattcctct ttttttgttg cttttccttt 660
agcttttaat tgctcttgat aatttttttt acctctaata ttttctcttc tcttatattc 720
ctttttagaa attattattg tcatatattt ttgttcttct tctgtaattt ctaataactc 780
tataagagtt tcattcttat acttatattg cttattttta tctaaataac atctttcagc 840
acttctagtt gctcttataa cttctctttc acttaaatgt tgtctaaaca tactattaag 900
ttctaaaaca tcatttaatg ccttctcaat gtcttctgta aagctacaaa gataatatct 960
atataaaaat aatataagct ctctgtgtcc ttttaaatca tattctctta gttcacaaag 1020
ttttattatg tcttgtattc ttccataata taaacttctt tctctataaa tataatttat 1080
tttgcttggt ctaccctttt tcctttcata tggttttaat tcaggtaaaa atccattttg 1140
tatttctctt aagtcataaa tatattcgta ctcatctaat atattgacta ctgtttttga 1200
tttagagttt atacttcctg gaactcttaa tattctcgtt gcatctaagg cttgtctatc 1260
tgctccaaag tattttaatt gattatataa atattcttga accgctttcc ataatggtaa 1320
tgctttacta ggtactgcat ttattatcca tattaaatac attcctcttc cactatctat 1380
tacatagttt ggtataggaa tactttgatt aaaataattc ttttctaagt ccattaatac 1440
ctggtcttta gttttgccag ttttataata atccaagtct ataaacagtg tatttaactc 1500
ttttatattt tctaatcgcc tacacggctt ataaaaggta tttagagtta tatagatatt 1560
ttcatcactc atatctaaat cttttaattc agcgtattta tagtgccatt ggctatatcc 1620
ttttttatct ataacgctcc tggttatcca ccctttactt ctactatgaa tattatctat 1680
atagttcttt ttattcagct ttaatgcgtt tctcacttat tcacctcccc ttctgtaaaa 1740
ctaagaaaat tatatcatat tttcaataat tattaactat tcttaaactc ttaataaaaa 1800
atagagtaag tccccaattg aaacttaatc tattttttat gttttaattt attattttta 1860
ttaaaatatt ttaaactaaa ttaaatgatt ctttttaatt ttttactatt tcattccata 1920
atatattact ataattattt acaaataata tttcttcatt tgtaatattt agatgattta 1980
ctaattttag tttttatata ttaaataatt aatgtataat ttatataaaa aatcaaagga 2040
gcttataaat tatgattatt tccaaagata ctaaagattt aatttttttc aattttaaca 2100
atactttttg taatattatg tttaaattta attgtatttt tttcatataa taaagccgtt 2160
gaagtaaacc aatccatttt ccttatgatg ttattattaa atttaagttt tataataata 2220
tctttattat atttattgtt tttaaaaaaa ctagtgaaat ttctagtgaa atttccggct 2280
ttattaaact tatttttagg aattttattt tcattttcat ctttacagga tttgattata 2340
tctttaaata tgttttatca aatattatct ttttctaaat ttatatatat ttttattata 2400
tttattatta tatatatttt atttttaagt ttctttctaa cagctattaa aaagaaactt 2460
aaaaataaaa acacgtactc taaaccaata aataaaacta tttttattat tgctgccttg 2520
attggaatag tttttagtaa aattaatttc aatattccac aatattatat tataagctag 2580
cacgcctcga gatctccatg gacgcgtgac gtcgactcta gaggatcccc gggtaccgag 2640
ctcgaattcg taatcatggt catagctgtt tcctgtgtga aattgttatc cgctcacaat 2700
tccacacaac atacgagccg gaagcataaa gtgtaaagcc tggggtgcct aatgagtgag 2760
ctaactcaca ttaattgcgt tgcgctcact gcccgctttc cagtcgggaa acctgtcgtg 2820
ccagctgcat taatgaatcg gccaacgcgc ggggagaggc ggtttgcgta ttgggcgctc 2880
ttccgcttcc tcgctcactg actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc 2940
agctcactca aaggcggtaa tacggttatc cacagaatca ggggataacg caggaaagaa 3000
catgtgagca aaaggccagc aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt 3060
tttccatagg ctccgccccc ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg 3120
gcgaaacccg acaggactat aaagatacca ggcgtttccc cctggaagct ccctcgtgcg 3180
ctctcctgtt ccgaccctgc cgcttaccgg atacctgtcc gcctttctcc cttcgggaag 3240
cgtggcgctt tctcatagct cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc 3300
caagctgggc tgtgtgcacg aaccccccgt tcagcccgac cgctgcgcct tatccggtaa 3360
ctatcgtctt gagtccaacc cggtaagaca cgacttatcg ccactggcag cagccactgg 3420
taacaggatt agcagagcga ggtatgtagg cggtgctaca gagttcttga agtggtggcc 3480
taactacggc tacactagaa gaacagtatt tggtatctgc gctctgctga agccagttac 3540
cttcggaaaa agagttggta gctcttgatc cggcaaacaa accaccgctg gtagcggtgg 3600
tttttttgtt tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag aagatccttt 3660
gatcttttct acggggtctg acgctcagtg gaacgaaaac tcacgttaag ggattttggt 3720
catgagatta tcaaaaagga tcttcaccta gatcctttta aattaaaaat gaagttttaa 3780
atcaatctaa agtatatatg agtaaacttg gtctgacagt taccaaagct agcttaatac 3840
tagtatatac ttaatgtgat aagtgtctga cagctgaccg gtctaaagag gtcccaatga 3900
ataggtttac acttacttta gttttatgga aatgaaagat catatcatat ataatctaga 3960
ataaaattaa ctaaaataat tattatctag ataaaaaatt tagaagccaa tgaaatctat 4020
aaataaacta aattaagttt atttaattaa caactatgga tataaaatag gtactaatca 4080
aaatagtgag gaggatatat ttgaatacat acgaacaaat taataaagtg aaaaaaatac 4140
ttcggaaaca tttaaaaaat aaccttattg gtacttacat gtttggatca ggagttgaga 4200
gtggactaaa accaaatagt gatcttgact ttttagtcgt cgtatctgaa ccattgacag 4260
atcaaagtaa agaaatactt atacaaaaaa ttagacctat ttcaaagaaa ataggagata 4320
aaagcaactt acgatatatt gaattaacaa ttattattca gcaagaaatg gtaccgtgga 4380
atcatcctcc caaacaagaa tttatttatg gagaatggtt acaagagctt tatgaacaag 4440
gatacattcc tcagaaggaa ttaaattcag atttaaccat aatgctttac caagcaaaac 4500
gaaaaaataa aagaatatac ggaaattatg acttagagga attactacct gatattccat 4560
tttctgatgt gagaagagcc attatggatt cgtcagagga attaatagat aattatcagg 4620
atgatgaaac caactctata ttaactttat gccgtatgat tttaactatg gacacgggta 4680
aaatcatacc aaaagatatt gcgggaaatg cagtggctga atcttctcca ttagaacata 4740
gggagagaat tttgttagca gttcgtagtt atcttggaga gaatattgaa tggactaatg 4800
aaaatgtaaa tttaactata aactatttaa ataacagatt aaaaaaatta taaaaaaatt 4860
gaaaaaatgg tggaaacact tttttcaatt tttttgtttt attatttaat atttgggaaa 4920
tattcattct aattggtaat cagattttag aagttgttaa cttcaggttt gtctgtaact 4980
aaaaactagt atttaaccta gg 5002
<210> 122
<211> 3907
<212> DNA
<213> Artificial Sequence
<220>
<223> pFW01
<400> 122
tcgagatctc catggacgcg tgacgtcgac tctagaggat ccccgggtac cgagctcgaa 60
ttcgtaatca tggtcatagc tgtttcctgt gtgaaattgt tatccgctca caattccaca 120
caacatacga gccggaagca taaagtgtaa agcctggggt gcctaatgag tgagctaact 180
cacattaatt gcgttgcgct cactgcccgc tttccagtcg ggaaacctgt cgtgccagct 240
gcattaatga atcggccaac gcgcggggag aggcggtttg cgtattgggc gctcttccgc 300
ttcctcgctc actgactcgc tgcgctcggt cgttcggctg cggcgagcgg tatcagctca 360
ctcaaaggcg gtaatacggt tatccacaga atcaggggat aacgcaggaa agaacatgtg 420
agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca 480
taggctccgc ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa 540
cccgacagga ctataaagat accaggcgtt tccccctgga agctccctcg tgcgctctcc 600
tgttccgacc ctgccgctta ccggatacct gtccgccttt ctcccttcgg gaagcgtggc 660
gctttctcat agctcacgct gtaggtatct cagttcggtg taggtcgttc gctccaagct 720
gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc gccttatccg gtaactatcg 780
tcttgagtcc aacccggtaa gacacgactt atcgccactg gcagcagcca ctggtaacag 840
gattagcaga gcgaggtatg taggcggtgc tacagagttc ttgaagtggt ggcctaacta 900
cggctacact agaaggacag tatttggtat ctgcgctctg ctgaagccag ttaccttcgg 960
aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt 1020
tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatctt 1080
ttctacgggg tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgag 1140
attatcaaaa aggatcttca cctagatcct tttaaattaa aaatgaagtt ttaaatcaat 1200
ctaaagtata tatgagtaaa cttggtctga cagttaccag gtccactgcc gggcctcttg 1260
cgggatcaaa agaaaaacga aatgatacac caatcagtgc aaaaaaagat ataatgggag 1320
ataagacggt tcgtgttcgt gctgacttgc accatatcat aaaaatcgaa acagcaaaga 1380
atggcggaaa cgtaaaagaa gttatggaaa taagacttag aagcaaactt aagagtgtgt 1440
tgatagtgca gtatcttaaa attttgtata ataggaattg aagttaaatt agatgctaaa 1500
aatttgtaat taagaaggag tgattacatg aacaaaaata taaaatattc tcaaaacttt 1560
ttaacgagtg aaaaagtact caaccaaata ataaaacaat tgaatttaaa agaaaccgat 1620
accgtttacg aaattggaac aggtaaaggg catttaacga cgaaactggc taaaataagt 1680
aaacaggtaa cgtctattga attagacagt catctattca acttatcgtc agaaaaatta 1740
aaactgaata ctcgtgtcac tttaattcac caagatattc tacagtttca attccctaac 1800
aaacagaggt ataaaattgt tgggagtatt ccttaccatt taagcacaca aattattaaa 1860
aaagtggttt ttgaaagcca tgcgtctgac atctatctga ttgttgaaga aggattctac 1920
aagcgtacct tggatattca ccgaacacta gggttgctct tgcacactca agtctcgatt 1980
cagcaattgc ttaagctgcc agcggaatgc tttcatccta aaccaaaagt aaacagtgtc 2040
ttaataaaac ttacccgcca taccacagat gttccagata aatattggaa gctatatacg 2100
tactttgttt caaaatgggt caatcgagaa tatcgtcaac tgtttactaa aaatcagttt 2160
catcaagcaa tgaaacacgc caaagtaaac aatttaagta ccgttactta tgagcaagta 2220
ttgtctattt ttaatagtta tctattattt aacgggagga aataattcta tgagtcccta 2280
ggcaggcctc cgccattatt tttttgaaca attgacaatt catttcttat tttttattaa 2340
gtgatagtca aaaggcataa cagtgctgaa tagaaagaaa tttacagaaa agaaaattat 2400
agaatttagt atgattaatt atactcattt atgaatgttt aattgaatac aaaaaaaaat 2460
acttgttatg tattcaatta cgggttaaaa tatagacaag ttgaaaaatt taataaaaaa 2520
ataagtcctc agctcttata tattaagcta ccaacttagt atataagcca aaacttaaat 2580
gtgctaccaa cacatcaagc cgttagagaa ctctatctat agcaatattt caaatgtacc 2640
gacatacaag agaaacatta actatatata ttcaatttat gagattatct taacagatat 2700
aaatgtaaat tgcaataagt aagatttaga agtttatagc ctttgtgtat tggaagcagt 2760
acgcaaaggc ttttttattt gataaaaatt agaagtatat ttattttttc ataattaatt 2820
tatgaaaatg aaagggggtg agcaaagtga cagaggaaag cagtatctta tcaaataaca 2880
aggtattagc aatatcatta ttgactttag cagtaaacat tatgactttt atagtgcttg 2940
tagctaagta gtacgaaagg gggagcttta aaaagctcct tggaatacat agaattcata 3000
aattaattta tgaaaagaag ggcgtatatg aaaacttgta aaaattgcaa agagtttatt 3060
aaagatactg aaatatgcaa aatacattcg ttgatgattc atgataaaac agtagcaacc 3120
tattgcagta aatacaatga gtcaagatgt ttacataaag ggaaagtcca atgtattaat 3180
tgttcaaaga tgaaccgata tggatggtgt gccataaaaa tgagatgttt tacagaggaa 3240
gaacagaaaa aagaacgtac atgcattaaa tattatgcaa ggagctttaa aaaagctcat 3300
gtaaagaaga gtaaaaagaa aaaataattt atttattaat ttaatattga gagtgccgac 3360
acagtatgca ctaaaaaata tatctgtggt gtagtgagcc gatacaaaag gatagtcact 3420
cgcattttca taatacatct tatgttatga ttatgtgtcg gtgggacttc acgacgaaaa 3480
cccacaataa aaaaagagtt cggggtaggg ttaagcatag ttgaggcaac taaacaatca 3540
agctaggata tgcagtagca gaccgtaagg tcgttgttta ggtgtgttgt aatacatacg 3600
ctattaagat gtaaaaatac ggataccaat gaagggaaaa gtataatttt tggatgtagt 3660
ttgtttgttc atctatgggc aaactacgtc caaagccgtt tccaaatctg ctaaaaagta 3720
tatcctttct aaaatcaaag tcaagtatga aatcataaat aaagtttaat tttgaagtta 3780
ttatgatatt atgtttttct attaaaataa attaagtata tagaatagtt taataatagt 3840
atatacttaa tgtgataagt gtctgacagt gtcacagaaa ggatgattgt tatggattat 3900
aagcggc 3907
<210> 123
<211> 6525
<212> DNA
<213> Artificial Sequence
<220>
<223> pNF3S
<400> 123
gcgcccaata cgcaaaccgc ctctccccgc gcgttggccg attcattaat gcagctggca 60
cgacaggttt cccgactgga aagcgggcag tgagcgcaac gcaattaatg tgagttagct 120
cactcattag gcaccccagg ctttacactt tatgcttccg gctcgtatgt tgtgtggaat 180
tgtgagcgga taacaatttc acacaggaaa cagctatgac catgattacg ccaagcttgc 240
atgcctgcag gtcgaccacg ataaaacaag gttttaagga taagaaaagt catgagattt 300
atagtaaatc ttgtgacttt ttttattgaa tagtagagag agttcggaag tataacacgc 360
tatattcttg atatttttag aatagcaagc attggatttg tcctgacact ttcccaaaaa 420
ttaaggagtt attccttaaa ccaaaaagat taatgtggga acaaatttag tgtatccatt 480
tttgaagggc gcacttatac accaccaaaa tggtgtgtgc gaaatcttta aaaaagattt 540
atcaaaaagc ttttttaaag ctgggacatt tagaaaatca ataatgtttt ttgcccaata 600
cgctagtctt aaaatctgca aggttgataa ctatttagtc ccaggtatta gaatggggca 660
tatatataca aagtatatat atgcgtaaat atatgtggga ctgtgggaac aaaattgcgt 720
gctaaaattg tattgaaaag gtaatgaaaa ggtcatgctt tggtattgct aacgtataga 780
aaaggtaatg aaaagctcat ggttctataa aaaagatgta cccacgaaaa taataggctt 840
tgcctatttc cccatgtaat atgggggcag ttttctctta tgctctttct taacatattg 900
aataaataca aaatgcagct ttgtgggaat aaaaatattt ttgtttttat tcttatagtt 960
agacaaaatt ttaatctttt ttgtgctata acaagattaa aatttgtggg aacattaaga 1020
aatattgttg tcacaaataa aaaggagagt gggaacaatt gctataaaaa acgcagaaat 1080
taagattaga gttacaaaag agcaaaaaga attatttaag aaaattgcaa aagctgaaaa 1140
tatgagtatg agtgaattta ttattgtgac cacagaatat ttagccagaa aaaaagatga 1200
aaatatgaaa tcaaaagaca tgatcgagag aagagctgcg aagactgaag aaaaaattat 1260
gaagctaaaa aagaaactaa ataaaaacag gtaatataga ttacagtttt aagcttgttt 1320
tccctataga ctagagtaaa tatataaata tacctgtcaa gggcttataa gcccctttag 1380
ggggtgcgta gcacccttga caggtatatt tatatatttt agggtgccat taagggaaac 1440
aagctttaaa atgcctttaa aggcatttta aaataaataa aaaaaagatg gtttttacca 1500
tcttttttaa ctcccgaaag ggagttcttt cttttcttga tactatacgt aactatttcg 1560
atttgccctg aacctaatca aagctagata aattcagtat tagggcataa aaaaacttgc 1620
tttttcgggt ggaaatctgt ataatttaaa ttgcttagat aaaaattacc aattccatac 1680
gaaaggagca agttttacat aaggttaaag ccttatgtga attctcattt aattacatga 1740
ataataataa cacagaaagt gaagaattaa aagagcaaag tcaactattg cttgacaaat 1800
gcacaaaaaa gaaaaagaaa aatcctaaat ttagtagtta tatagaacca ttagtaagca 1860
agaaattatc tgaaagaata aaggaatgtg gtgacttttt gcagatgtta tctgatttaa 1920
accttgaaaa ttcgaaactg catagagcaa gtttttgtgg taacagattt tgtcctatgt 1980
gtagctggcg tattgcttgt aaggatagtt tggaaatatc tattctcatg gagcatttac 2040
gcaaagagga aagcaaagaa tttatctttt tgaccttaac aactccaaat gtgaaaggtg 2100
cggaccttga taattccata aaagcataca ataaagcatt taaaaagtta atggaacgca 2160
aagaggtcaa gagcatagta aaaggctaca taagaaagct agaagtaacc tataatttgg 2220
acaagagttc caaatcatat aatacttatc acccacattt ccatgtggta ctagcagtca 2280
atagaagtta ctttaaaaag caaaatctat atataaacca tcatagatgg cttagtttgt 2340
ggcaagagtc aactggtgat tattcgataa ctcaagttga tgtaagaaag gctaaaatta 2400
acgattataa agaggtttat gagcttgcta agtattcggc taaggattcc gactatttaa 2460
tcaatagaga agtgtttacg gtattctaca aatctttaaa gggtaaacag gtacttgtat 2520
ttagtggatt atttaaagac gctcataaaa tgtataagaa tggagagcta gatctgtata 2580
agaagttgga tactatcgaa tatgcttata tggtaagtta taactggctt aaaaagaagt 2640
atgatacttc aaatattaga gaattaactg aggaagaaaa gcagaaattc aataaaaatt 2700
taatcgaaga tgtggatatt gagtaggtgg gattatatct cacctttttt attgtctttt 2760
catgttgaaa ttttgacgct taatgcatga agtattgaca agtttaaaaa ttacggtttt 2820
taatccttag ttgattagca ggattatggc cggaatgctc cgtccagtcc tgttaaggaa 2880
ttaaaattcc ctaaaaccct tggctatgat ttatagcgag aatcgtcaat taaaaattta 2940
ataggtgcta tgaaagtcga ttaataatta attttaaaat gcaatatgaa acataattac 3000
aagaatttga cttttaatac aagaattgat atcatagtta cattaatacg gatcccaatg 3060
aataggttta cacttacttt agttttatgg aaatgaaaga tcatatcata tataatctag 3120
aataaaatta actaaaataa ttattatcta gataaaaaat ttagaagcca atgaaatcta 3180
taaataaact aaattaagtt tatttaatta acaactatgg atataaaata ggtactaatc 3240
aaaatagtga ggaggatata tttgaataca tacgaacaaa ttaataaagt gaaaaaaata 3300
cttcggaaac atttaaaaaa taaccttatt ggtacttaca tgtttggatc aggagttgag 3360
agtggactaa aaccaaatag tgatcttgac tttttagtcg tcgtatctga accattgaca 3420
gatcaaagta aagaaatact tatacaaaaa attagaccta tttcaaagaa aataggagat 3480
aaaagcaact tacgatatat tgaattaaca attattattc agcaagaaat ggtaccgtgg 3540
aatcatcctc ccaaacaaga atttatttat ggagaatggt tacaagagct ttatgaacaa 3600
ggatacattc ctcagaagga attaaattca gatttaacca taatgcttta ccaagcaaaa 3660
cgaaaaaata aaagaatata cggaaattat gacttagagg aattactacc tgatattcca 3720
ttttctgatg tgagaagagc cattatggat tcgtcagagg aattaataga taattatcag 3780
gatgatgaaa ccaactctat attaacttta tgccgtatga ttttaactat ggacacgggt 3840
aaaatcatac caaaagatat tgcgggaaat gcagtggctg aatcttctcc attagaacat 3900
agggagagaa ttttgttagc agttcgtagt tatcttggag agaatattga atggactaat 3960
gaaaatgtaa atttaactat aaactattta aataacagat taaaaaaatt ataaaaaaat 4020
tgaaaaaatg gtggaaacac ttttttcaat ttttttgttt tattatttaa tatttgggaa 4080
atattcattc taattggtaa tcagatttta gaagttgagc tcgaattcac tggccgtcgt 4140
tttacaacgt cgtgactggg aaaaccctgg cgttacccaa cttaatcgcc ttgcagcaca 4200
tccccctttc gccagctggc gtaatagcga agaggcccgc accgatcgcc cttcccaaca 4260
gttgcgcagc ctgaatggcg aatggcgcct gatgcggtat tttctcctta cgcatctgtg 4320
cggtatttca caccgcatat ggtgcactct cagtacaatc tgctctgatg ccgcatagtt 4380
aagccagccc cgacacccgc caacacccgc tgacgcgccc tgacgggctt gtctgctccc 4440
ggcatccgct tacagacaag ctgtgaccgt ctccgggagc tgcatgtgtc agaggttttc 4500
accgtcatca ccgaaacgcg cgagacgaaa gggcctcgtg atacgcctat ttttataggt 4560
taatgtcatg ataataatgg tttcttagac gtcaggtggc acttttcggg gaaatgtgcg 4620
cggaacccct atttgtttat ttttctaaat acattcaaat atgtatccgc tcatgagaca 4680
ataaccctga taaatgcttc aataatattg aaaaaggaag agtatgagta ttcaacattt 4740
ccgtgtcgcc cttattccct tttttgcggc attttgcctt cctgtttttg ctcacccaga 4800
aacgctggtg aaagtaaaag atgctgaaga tcagttgggt gcacgagtgg gttacatcga 4860
actggatctc aacagcggta agatccttga gagttttcgc cccgaagaac gttttccaat 4920
gatgagcact tttaaagttc tgctatgtgg cgcggtatta tcccgtattg acgccgggca 4980
agagcaactc ggtcgccgca tacactattc tcagaatgac ttggttgagt actcaccagt 5040
cacagaaaag catcttacgg atggcatgac agtaagagaa ttatgcagtg ctgccataac 5100
catgagtgat aacactgcgg ccaacttact tctgacaacg atcggaggac cgaaggagct 5160
aaccgctttt ttgcacaaca tgggggatca tgtaactcgc cttgatcgtt gggaaccgga 5220
gctgaatgaa gccataccaa acgacgagcg tgacaccacg atgcctgtag caatggcaac 5280
aacgttgcgc aaactattaa ctggcgaact acttactcta gcttcccggc aacaattaat 5340
agactggatg gaggcggata aagttgcagg accacttctg cgctcggccc ttccggctgg 5400
ctggtttatt gctgataaat ctggagccgg tgagcgtggg tctcgcggta tcattgcagc 5460
actggggcca gatggtaagc cctcccgtat cgtagttatc tacacgacgg ggagtcaggc 5520
aactatggat gaacgaaata gacagatcgc tgagataggt gcctcactga ttaagcattg 5580
gtaactgtca gaccaagttt actcatatat actttagatt gatttaaaac ttcattttta 5640
atttaaaagg atctaggtga agatcctttt tgataatctc atgaccaaaa tcccttaacg 5700
tgagttttcg ttccactgag cgtcagaccc cgtagaaaag atcaaaggat cttcttgaga 5760
tccttttttt ctgcgcgtaa tctgctgctt gcaaacaaaa aaaccaccgc taccagcggt 5820
ggtttgtttg ccggatcaag agctaccaac tctttttccg aaggtaactg gcttcagcag 5880
agcgcagata ccaaatactg ttcttctagt gtagccgtag ttaggccacc acttcaagaa 5940
ctctgtagca ccgcctacat acctcgctct gctaatcctg ttaccagtgg ctgctgccag 6000
tggcgataag tcgtgtctta ccgggttgga ctcaagacga tagttaccgg ataaggcgca 6060
gcggtcgggc tgaacggggg gttcgtgcac acagcccagc ttggagcgaa cgacctacac 6120
cgaactgaga tacctacagc gtgagctatg agaaagcgcc acgcttcccg aagggagaaa 6180
ggcggacagg tatccggtaa gcggcagggt cggaacagga gagcgcacga gggagcttcc 6240
agggggaaac gcctggtatc tttatagtcc tgtcgggttt cgccacctct gacttgagcg 6300
tcgatttttg tgatgctcgt caggggggcg gagcctatgg aaaaacgcca gcaacgcggc 6360
ctttttacgg ttcctggcct tttgctggcc ttttgctcac atgttctttc ctgcgttatc 6420
ccctgattct gtggataacc gtattaccgc ctttgagtga gctgataccg ctcgccgcag 6480
ccgaacgacc gagcgcagcg agtcagtgag cgaggaagcg gaaga 6525
<210> 124
<211> 6554
<212> DNA
<213> Artificial Sequence
<220>
<223> pNF3E
<400> 124
gcgcccaata cgcaaaccgc ctctccccgc gcgttggccg attcattaat gcagctggca 60
cgacaggttt cccgactgga aagcgggcag tgagcgcaac gcaattaatg tgagttagct 120
cactcattag gcaccccagg ctttacactt tatgcttccg gctcgtatgt tgtgtggaat 180
tgtgagcgga taacaatttc acacaggaaa cagctatgac catgattacg ccaagcttgc 240
atgcctgcag gtcgaccacg ataaaacaag gttttaagga taagaaaagt catgagattt 300
atagtaaatc ttgtgacttt ttttattgaa tagtagagag agttcggaag tataacacgc 360
tatattcttg atatttttag aatagcaagc attggatttg tcctgacact ttcccaaaaa 420
ttaaggagtt attccttaaa ccaaaaagat taatgtggga acaaatttag tgtatccatt 480
tttgaagggc gcacttatac accaccaaaa tggtgtgtgc gaaatcttta aaaaagattt 540
atcaaaaagc ttttttaaag ctgggacatt tagaaaatca ataatgtttt ttgcccaata 600
cgctagtctt aaaatctgca aggttgataa ctatttagtc ccaggtatta gaatggggca 660
tatatataca aagtatatat atgcgtaaat atatgtggga ctgtgggaac aaaattgcgt 720
gctaaaattg tattgaaaag gtaatgaaaa ggtcatgctt tggtattgct aacgtataga 780
aaaggtaatg aaaagctcat ggttctataa aaaagatgta cccacgaaaa taataggctt 840
tgcctatttc cccatgtaat atgggggcag ttttctctta tgctctttct taacatattg 900
aataaataca aaatgcagct ttgtgggaat aaaaatattt ttgtttttat tcttatagtt 960
agacaaaatt ttaatctttt ttgtgctata acaagattaa aatttgtggg aacattaaga 1020
aatattgttg tcacaaataa aaaggagagt gggaacaatt gctataaaaa acgcagaaat 1080
taagattaga gttacaaaag agcaaaaaga attatttaag aaaattgcaa aagctgaaaa 1140
tatgagtatg agtgaattta ttattgtgac cacagaatat ttagccagaa aaaaagatga 1200
aaatatgaaa tcaaaagaca tgatcgagag aagagctgcg aagactgaag aaaaaattat 1260
gaagctaaaa aagaaactaa ataaaaacag gtaatataga ttacagtttt aagcttgttt 1320
tccctataga ctagagtaaa tatataaata tacctgtcaa gggcttataa gcccctttag 1380
ggggtgcgta gcacccttga caggtatatt tatatatttt agggtgccat taagggaaac 1440
aagctttaaa atgcctttaa aggcatttta aaataaataa aaaaaagatg gtttttacca 1500
tcttttttaa ctcccgaaag ggagttcttt cttttcttga tactatacgt aactatttcg 1560
atttgccctg aacctaatca aagctagata aattcagtat tagggcataa aaaaacttgc 1620
tttttcgggt ggaaatctgt ataatttaaa ttgcttagat aaaaattacc aattccatac 1680
gaaaggagca agttttacat aaggttaaag ccttatgtga attctcattt aattacatga 1740
ataataataa cacagaaagt gaagaattaa aagagcaaag tcaactattg cttgacaaat 1800
gcacaaaaaa gaaaaagaaa aatcctaaat ttagtagtta tatagaacca ttagtaagca 1860
agaaattatc tgaaagaata aaggaatgtg gtgacttttt gcagatgtta tctgatttaa 1920
accttgaaaa ttcgaaactg catagagcaa gtttttgtgg taacagattt tgtcctatgt 1980
gtagctggcg tattgcttgt aaggatagtt tggaaatatc tattctcatg gagcatttac 2040
gcaaagagga aagcaaagaa tttatctttt tgaccttaac aactccaaat gtgaaaggtg 2100
cggaccttga taattccata aaagcataca ataaagcatt taaaaagtta atggaacgca 2160
aagaggtcaa gagcatagta aaaggctaca taagaaagct agaagtaacc tataatttgg 2220
acaagagttc caaatcatat aatacttatc acccacattt ccatgtggta ctagcagtca 2280
atagaagtta ctttaaaaag caaaatctat atataaacca tcatagatgg cttagtttgt 2340
ggcaagagtc aactggtgat tattcgataa ctcaagttga tgtaagaaag gctaaaatta 2400
acgattataa agaggtttat gagcttgcta agtattcggc taaggattcc gactatttaa 2460
tcaatagaga agtgtttacg gtattctaca aatctttaaa gggtaaacag gtacttgtat 2520
ttagtggatt atttaaagac gctcataaaa tgtataagaa tggagagcta gatctgtata 2580
agaagttgga tactatcgaa tatgcttata tggtaagtta taactggctt aaaaagaagt 2640
atgatacttc aaatattaga gaattaactg aggaagaaaa gcagaaattc aataaaaatt 2700
taatcgaaga tgtggatatt gagtaggtgg gattatatct cacctttttt attgtctttt 2760
catgttgaaa ttttgacgct taatgcatga agtattgaca agtttaaaaa ttacggtttt 2820
taatccttag ttgattagca ggattatggc cggaatgctc cgtccagtcc tgttaaggaa 2880
ttaaaattcc ctaaaaccct tggctatgat ttatagcgag aatcgtcaat taaaaattta 2940
ataggtgcta tgaaagtcga ttaataatta attttaaaat gcaatatgaa acataattac 3000
aagaatttga cttttaatac aagaattgat atcatagtta cattaatacg gatccgtctg 3060
acagttacca ggtccactgc cgggcctctt gcgggatcaa aagaaaaacg aaatgataca 3120
ccaatcagtg caaaaaaaga tataatggga gataagacgg ttcgtgttcg tgctgacttg 3180
caccatatca taaaaatcga aacagcaaag aatggcggaa acgtaaaaga agttatggaa 3240
ataagactta gaagcaaact taagagtgtg ttgatagtgc agtatcttaa aattttgtat 3300
aataggaatt gaagttaaat tagatgctaa aaatttgtaa ttaagaagga gtgattacat 3360
gaacaaaaat ataaaatatt ctcaaaactt tttaacgagt gaaaaagtac tcaaccaaat 3420
aataaaacaa ttgaatttaa aagaaaccga taccgtttac gaaattggaa caggtaaagg 3480
gcatttaacg acgaaactgg ctaaaataag taaacaggta acgtctattg aattagacag 3540
tcatctattc aacttatcgt cagaaaaatt aaaactgaat actcgtgtca ctttaattca 3600
ccaagatatt ctacagtttc aattccctaa caaacagagg tataaaattg ttgggagtat 3660
tccttaccat ttaagcacac aaattattaa aaaagtggtt tttgaaagcc atgcgtctga 3720
catctatctg attgttgaag aaggattcta caagcgtacc ttggatattc accgaacact 3780
agggttgctc ttgcacactc aagtctcgat tcagcaattg cttaagctgc cagcggaatg 3840
ctttcatcct aaaccaaaag taaacagtgt cttaataaaa cttacccgcc ataccacaga 3900
tgttccagat aaatattgga agctatatac gtactttgtt tcaaaatggg tcaatcgaga 3960
atatcgtcaa ctgtttacta aaaatcagtt tcatcaagca atgaaacacg ccaaagtaaa 4020
caatttaagt accgttactt atgagcaagt attgtctatt tttaatagtt atctattatt 4080
taacgggagg aaataattct atgagtccct aggcaggcct ccgccattat ttttttgaac 4140
aattggagct cgaattcact ggccgtcgtt ttacaacgtc gtgactggga aaaccctggc 4200
gttacccaac ttaatcgcct tgcagcacat ccccctttcg ccagctggcg taatagcgaa 4260
gaggcccgca ccgatcgccc ttcccaacag ttgcgcagcc tgaatggcga atggcgcctg 4320
atgcggtatt ttctccttac gcatctgtgc ggtatttcac accgcatatg gtgcactctc 4380
agtacaatct gctctgatgc cgcatagtta agccagcccc gacacccgcc aacacccgct 4440
gacgcgccct gacgggcttg tctgctcccg gcatccgctt acagacaagc tgtgaccgtc 4500
tccgggagct gcatgtgtca gaggttttca ccgtcatcac cgaaacgcgc gagacgaaag 4560
ggcctcgtga tacgcctatt tttataggtt aatgtcatga taataatggt ttcttagacg 4620
tcaggtggca cttttcgggg aaatgtgcgc ggaaccccta tttgtttatt tttctaaata 4680
cattcaaata tgtatccgct catgagacaa taaccctgat aaatgcttca ataatattga 4740
aaaaggaaga gtatgagtat tcaacatttc cgtgtcgccc ttattccctt ttttgcggca 4800
ttttgccttc ctgtttttgc tcacccagaa acgctggtga aagtaaaaga tgctgaagat 4860
cagttgggtg cacgagtggg ttacatcgaa ctggatctca acagcggtaa gatccttgag 4920
agttttcgcc ccgaagaacg ttttccaatg atgagcactt ttaaagttct gctatgtggc 4980
gcggtattat cccgtattga cgccgggcaa gagcaactcg gtcgccgcat acactattct 5040
cagaatgact tggttgagta ctcaccagtc acagaaaagc atcttacgga tggcatgaca 5100
gtaagagaat tatgcagtgc tgccataacc atgagtgata acactgcggc caacttactt 5160
ctgacaacga tcggaggacc gaaggagcta accgcttttt tgcacaacat gggggatcat 5220
gtaactcgcc ttgatcgttg ggaaccggag ctgaatgaag ccataccaaa cgacgagcgt 5280
gacaccacga tgcctgtagc aatggcaaca acgttgcgca aactattaac tggcgaacta 5340
cttactctag cttcccggca acaattaata gactggatgg aggcggataa agttgcagga 5400
ccacttctgc gctcggccct tccggctggc tggtttattg ctgataaatc tggagccggt 5460
gagcgtgggt ctcgcggtat cattgcagca ctggggccag atggtaagcc ctcccgtatc 5520
gtagttatct acacgacggg gagtcaggca actatggatg aacgaaatag acagatcgct 5580
gagataggtg cctcactgat taagcattgg taactgtcag accaagttta ctcatatata 5640
ctttagattg atttaaaact tcatttttaa tttaaaagga tctaggtgaa gatccttttt 5700
gataatctca tgaccaaaat cccttaacgt gagttttcgt tccactgagc gtcagacccc 5760
gtagaaaaga tcaaaggatc ttcttgagat cctttttttc tgcgcgtaat ctgctgcttg 5820
caaacaaaaa aaccaccgct accagcggtg gtttgtttgc cggatcaaga gctaccaact 5880
ctttttccga aggtaactgg cttcagcaga gcgcagatac caaatactgt tcttctagtg 5940
tagccgtagt taggccacca cttcaagaac tctgtagcac cgcctacata cctcgctctg 6000
ctaatcctgt taccagtggc tgctgccagt ggcgataagt cgtgtcttac cgggttggac 6060
tcaagacgat agttaccgga taaggcgcag cggtcgggct gaacgggggg ttcgtgcaca 6120
cagcccagct tggagcgaac gacctacacc gaactgagat acctacagcg tgagctatga 6180
gaaagcgcca cgcttcccga agggagaaag gcggacaggt atccggtaag cggcagggtc 6240
ggaacaggag agcgcacgag ggagcttcca gggggaaacg cctggtatct ttatagtcct 6300
gtcgggtttc gccacctctg acttgagcgt cgatttttgt gatgctcgtc aggggggcgg 6360
agcctatgga aaaacgccag caacgcggcc tttttacggt tcctggcctt ttgctggcct 6420
tttgctcaca tgttctttcc tgcgttatcc cctgattctg tggataaccg tattaccgcc 6480
tttgagtgag ctgataccgc tcgccgcagc cgaacgaccg agcgcagcga gtcagtgagc 6540
gaggaagcgg aaga 6554
<210> 125
<211> 6271
<212> DNA
<213> Artificial Sequence
<220>
<223> pNF3C
<400> 125
gcgcccaata cgcaaaccgc ctctccccgc gcgttggccg attcattaat gcagctggca 60
cgacaggttt cccgactgga aagcgggcag tgagcgcaac gcaattaatg tgagttagct 120
cactcattag gcaccccagg ctttacactt tatgcttccg gctcgtatgt tgtgtggaat 180
tgtgagcgga taacaatttc acacaggaaa cagctatgac catgattacg ccaagcttgc 240
atgcctgcag gtcgaccacg ataaaacaag gttttaagga taagaaaagt catgagattt 300
atagtaaatc ttgtgacttt ttttattgaa tagtagagag agttcggaag tataacacgc 360
tatattcttg atatttttag aatagcaagc attggatttg tcctgacact ttcccaaaaa 420
ttaaggagtt attccttaaa ccaaaaagat taatgtggga acaaatttag tgtatccatt 480
tttgaagggc gcacttatac accaccaaaa tggtgtgtgc gaaatcttta aaaaagattt 540
atcaaaaagc ttttttaaag ctgggacatt tagaaaatca ataatgtttt ttgcccaata 600
cgctagtctt aaaatctgca aggttgataa ctatttagtc ccaggtatta gaatggggca 660
tatatataca aagtatatat atgcgtaaat atatgtggga ctgtgggaac aaaattgcgt 720
gctaaaattg tattgaaaag gtaatgaaaa ggtcatgctt tggtattgct aacgtataga 780
aaaggtaatg aaaagctcat ggttctataa aaaagatgta cccacgaaaa taataggctt 840
tgcctatttc cccatgtaat atgggggcag ttttctctta tgctctttct taacatattg 900
aataaataca aaatgcagct ttgtgggaat aaaaatattt ttgtttttat tcttatagtt 960
agacaaaatt ttaatctttt ttgtgctata acaagattaa aatttgtggg aacattaaga 1020
aatattgttg tcacaaataa aaaggagagt gggaacaatt gctataaaaa acgcagaaat 1080
taagattaga gttacaaaag agcaaaaaga attatttaag aaaattgcaa aagctgaaaa 1140
tatgagtatg agtgaattta ttattgtgac cacagaatat ttagccagaa aaaaagatga 1200
aaatatgaaa tcaaaagaca tgatcgagag aagagctgcg aagactgaag aaaaaattat 1260
gaagctaaaa aagaaactaa ataaaaacag gtaatataga ttacagtttt aagcttgttt 1320
tccctataga ctagagtaaa tatataaata tacctgtcaa gggcttataa gcccctttag 1380
ggggtgcgta gcacccttga caggtatatt tatatatttt agggtgccat taagggaaac 1440
aagctttaaa atgcctttaa aggcatttta aaataaataa aaaaaagatg gtttttacca 1500
tcttttttaa ctcccgaaag ggagttcttt cttttcttga tactatacgt aactatttcg 1560
atttgccctg aacctaatca aagctagata aattcagtat tagggcataa aaaaacttgc 1620
tttttcgggt ggaaatctgt ataatttaaa ttgcttagat aaaaattacc aattccatac 1680
gaaaggagca agttttacat aaggttaaag ccttatgtga attctcattt aattacatga 1740
ataataataa cacagaaagt gaagaattaa aagagcaaag tcaactattg cttgacaaat 1800
gcacaaaaaa gaaaaagaaa aatcctaaat ttagtagtta tatagaacca ttagtaagca 1860
agaaattatc tgaaagaata aaggaatgtg gtgacttttt gcagatgtta tctgatttaa 1920
accttgaaaa ttcgaaactg catagagcaa gtttttgtgg taacagattt tgtcctatgt 1980
gtagctggcg tattgcttgt aaggatagtt tggaaatatc tattctcatg gagcatttac 2040
gcaaagagga aagcaaagaa tttatctttt tgaccttaac aactccaaat gtgaaaggtg 2100
cggaccttga taattccata aaagcataca ataaagcatt taaaaagtta atggaacgca 2160
aagaggtcaa gagcatagta aaaggctaca taagaaagct agaagtaacc tataatttgg 2220
acaagagttc caaatcatat aatacttatc acccacattt ccatgtggta ctagcagtca 2280
atagaagtta ctttaaaaag caaaatctat atataaacca tcatagatgg cttagtttgt 2340
ggcaagagtc aactggtgat tattcgataa ctcaagttga tgtaagaaag gctaaaatta 2400
acgattataa agaggtttat gagcttgcta agtattcggc taaggattcc gactatttaa 2460
tcaatagaga agtgtttacg gtattctaca aatctttaaa gggtaaacag gtacttgtat 2520
ttagtggatt atttaaagac gctcataaaa tgtataagaa tggagagcta gatctgtata 2580
agaagttgga tactatcgaa tatgcttata tggtaagtta taactggctt aaaaagaagt 2640
atgatacttc aaatattaga gaattaactg aggaagaaaa gcagaaattc aataaaaatt 2700
taatcgaaga tgtggatatt gagtaggtgg gattatatct cacctttttt attgtctttt 2760
catgttgaaa ttttgacgct taatgcatga agtattgaca agtttaaaaa ttacggtttt 2820
taatccttag ttgattagca ggattatggc cggaatgctc cgtccagtcc tgttaaggaa 2880
ttaaaattcc ctaaaaccct tggctatgat ttatagcgag aatcgtcaat taaaaattta 2940
ataggtgcta tgaaagtcga ttaataatta attttaaaat gcaatatgaa acataattac 3000
aagaatttga cttttaatac aagaattgat atcatagtta cattaatacg gatcccggca 3060
gtttttcttt ttcggcaagt gttcaagaag ttattaagtc gggagtgcag tcgaagtggg 3120
caagttgaaa aattcacaaa aatgtggtat aatatctttg ttcattagag cgataaactt 3180
gaatttgaga gggaacttag atggtatttg aaaaaattga taaaaatagt tggaacagaa 3240
aagagtattt tgaccactac tttgcaagtg taccttgtac ctacagcatg accgttaaag 3300
tggatatcac acaaataaag gaaaagggaa tgaaactata tcctgcaatg ctttattata 3360
ttgcaatgat tgtaaaccgc cattcagagt ttaggacggc aatcaatcaa gatggtgaat 3420
tggggatata tgatgagatg ataccaagct atacaatatt tcacaatgat actgaaacat 3480
tttccagcct ttggactgag tgtaagtctg actttaaatc atttttagca gattatgaaa 3540
gtgatacgca acggtatgga aacaatcata gaatggaagg aaagccaaat gctccggaaa 3600
acatttttaa tgtatctatg ataccgtggt caaccttcga tggctttaat ctgaatttgc 3660
agaaaggata tgattatttg attcctattt ttactatggg gaaatattat aaagaagata 3720
acaaaattat acttcctttg gcaattcaag ttcatcacgc agtatgtgac ggatttcaca 3780
tttgccgttt tgtaaacgaa ttgcaggaat tgataaatag ttaacttcag gtttgtctgt 3840
aactaaaaac tagtatttaa ccgagctcga attcactggc cgtcgtttta caacgtcgtg 3900
actgggaaaa ccctggcgtt acccaactta atcgccttgc agcacatccc cctttcgcca 3960
gctggcgtaa tagcgaagag gcccgcaccg atcgcccttc ccaacagttg cgcagcctga 4020
atggcgaatg gcgcctgatg cggtattttc tccttacgca tctgtgcggt atttcacacc 4080
gcatatggtg cactctcagt acaatctgct ctgatgccgc atagttaagc cagccccgac 4140
acccgccaac acccgctgac gcgccctgac gggcttgtct gctcccggca tccgcttaca 4200
gacaagctgt gaccgtctcc gggagctgca tgtgtcagag gttttcaccg tcatcaccga 4260
aacgcgcgag acgaaagggc ctcgtgatac gcctattttt ataggttaat gtcatgataa 4320
taatggtttc ttagacgtca ggtggcactt ttcggggaaa tgtgcgcgga acccctattt 4380
gtttattttt ctaaatacat tcaaatatgt atccgctcat gagacaataa ccctgataaa 4440
tgcttcaata atattgaaaa aggaagagta tgagtattca acatttccgt gtcgccctta 4500
ttcccttttt tgcggcattt tgccttcctg tttttgctca cccagaaacg ctggtgaaag 4560
taaaagatgc tgaagatcag ttgggtgcac gagtgggtta catcgaactg gatctcaaca 4620
gcggtaagat ccttgagagt tttcgccccg aagaacgttt tccaatgatg agcactttta 4680
aagttctgct atgtggcgcg gtattatccc gtattgacgc cgggcaagag caactcggtc 4740
gccgcataca ctattctcag aatgacttgg ttgagtactc accagtcaca gaaaagcatc 4800
ttacggatgg catgacagta agagaattat gcagtgctgc cataaccatg agtgataaca 4860
ctgcggccaa cttacttctg acaacgatcg gaggaccgaa ggagctaacc gcttttttgc 4920
acaacatggg ggatcatgta actcgccttg atcgttggga accggagctg aatgaagcca 4980
taccaaacga cgagcgtgac accacgatgc ctgtagcaat ggcaacaacg ttgcgcaaac 5040
tattaactgg cgaactactt actctagctt cccggcaaca attaatagac tggatggagg 5100
cggataaagt tgcaggacca cttctgcgct cggcccttcc ggctggctgg tttattgctg 5160
ataaatctgg agccggtgag cgtgggtctc gcggtatcat tgcagcactg gggccagatg 5220
gtaagccctc ccgtatcgta gttatctaca cgacggggag tcaggcaact atggatgaac 5280
gaaatagaca gatcgctgag ataggtgcct cactgattaa gcattggtaa ctgtcagacc 5340
aagtttactc atatatactt tagattgatt taaaacttca tttttaattt aaaaggatct 5400
aggtgaagat cctttttgat aatctcatga ccaaaatccc ttaacgtgag ttttcgttcc 5460
actgagcgtc agaccccgta gaaaagatca aaggatcttc ttgagatcct ttttttctgc 5520
gcgtaatctg ctgcttgcaa acaaaaaaac caccgctacc agcggtggtt tgtttgccgg 5580
atcaagagct accaactctt tttccgaagg taactggctt cagcagagcg cagataccaa 5640
atactgttct tctagtgtag ccgtagttag gccaccactt caagaactct gtagcaccgc 5700
ctacatacct cgctctgcta atcctgttac cagtggctgc tgccagtggc gataagtcgt 5760
gtcttaccgg gttggactca agacgatagt taccggataa ggcgcagcgg tcgggctgaa 5820
cggggggttc gtgcacacag cccagcttgg agcgaacgac ctacaccgaa ctgagatacc 5880
tacagcgtga gctatgagaa agcgccacgc ttcccgaagg gagaaaggcg gacaggtatc 5940
cggtaagcgg cagggtcgga acaggagagc gcacgaggga gcttccaggg ggaaacgcct 6000
ggtatcttta tagtcctgtc gggtttcgcc acctctgact tgagcgtcga tttttgtgat 6060
gctcgtcagg ggggcggagc ctatggaaaa acgccagcaa cgcggccttt ttacggttcc 6120
tggccttttg ctggcctttt gctcacatgt tctttcctgc gttatcccct gattctgtgg 6180
ataaccgtat taccgccttt gagtgagctg ataccgctcg ccgcagccga acgaccgagc 6240
gcagcgagtc agtgagcgag gaagcggaag a 6271
<210> 126
<211> 2793
<212> DNA
<213> Artificial Sequence
<220>
<223> OREP
<400> 126
cacgataaaa caaggtttta aggataagaa aagtcatgag atttatagta aatcttgtga 60
ctttttttat tgaatagtag agagagttcg gaagtataac acgctatatt cttgatattt 120
ttagaatagc aagcattgga tttgtcctga cactttccca aaaattaagg agttattcct 180
taaaccaaaa agattaatgt gggaacaaat ttagtgtatc catttttgaa gggcgcactt 240
atacaccacc aaaatggtgt gtgcgaaatc tttaaaaaag atttatcaaa aagctttttt 300
aaagctggga catttagaaa atcaataatg ttttttgccc aatacgctag tcttaaaatc 360
tgcaaggttg ataactattt agtcccaggt attagaatgg ggcatatata tacaaagtat 420
atatatgcgt aaatatatgt gggactgtgg gaacaaaatt gcgtgctaaa attgtattga 480
aaaggtaatg aaaaggtcat gctttggtat tgctaacgta tagaaaaggt aatgaaaagc 540
tcatggttct ataaaaaaga tgtacccacg aaaataatag gctttgccta tttccccatg 600
taatatgggg gcagttttct cttatgctct ttcttaacat attgaataaa tacaaaatgc 660
agctttgtgg gaataaaaat atttttgttt ttattcttat agttagacaa aattttaatc 720
ttttttgtgc tataacaaga ttaaaatttg tgggaacatt aagaaatatt gttgtcacaa 780
ataaaaagga gagtgggaac aattgctata aaaaacgcag aaattaagat tagagttaca 840
aaagagcaaa aagaattatt taagaaaatt gcaaaagctg aaaatatgag tatgagtgaa 900
tttattattg tgaccacaga atatttagcc agaaaaaaag atgaaaatat gaaatcaaaa 960
gacatgatcg agagaagagc tgcgaagact gaagaaaaaa ttatgaagct aaaaaagaaa 1020
ctaaataaaa acaggtaata tagattacag ttttaagctt gttttcccta tagactagag 1080
taaatatata aatatacctg tcaagggctt ataagcccct ttagggggtg cgtagcaccc 1140
ttgacaggta tatttatata ttttagggtg ccattaaggg aaacaagctt taaaatgcct 1200
ttaaaggcat tttaaaataa ataaaaaaaa gatggttttt accatctttt ttaactcccg 1260
aaagggagtt ctttcttttc ttgatactat acgtaactat ttcgatttgc cctgaaccta 1320
atcaaagcta gataaattca gtattagggc ataaaaaaac ttgctttttc gggtggaaat 1380
ctgtataatt taaattgctt agataaaaat taccaattcc atacgaaagg agcaagtttt 1440
acataaggtt aaagccttat gtgaattctc atttaattac atgaataata ataacacaga 1500
aagtgaagaa ttaaaagagc aaagtcaact attgcttgac aaatgcacaa aaaagaaaaa 1560
gaaaaatcct aaatttagta gttatataga accattagta agcaagaaat tatctgaaag 1620
aataaaggaa tgtggtgact ttttgcagat gttatctgat ttaaaccttg aaaattcgaa 1680
actgcataga gcaagttttt gtggtaacag attttgtcct atgtgtagct ggcgtattgc 1740
ttgtaaggat agtttggaaa tatctattct catggagcat ttacgcaaag aggaaagcaa 1800
agaatttatc tttttgacct taacaactcc aaatgtgaaa ggtgcggacc ttgataattc 1860
cataaaagca tacaataaag catttaaaaa gttaatggaa cgcaaagagg tcaagagcat 1920
agtaaaaggc tacataagaa agctagaagt aacctataat ttggacaaga gttccaaatc 1980
atataatact tatcacccac atttccatgt ggtactagca gtcaatagaa gttactttaa 2040
aaagcaaaat ctatatataa accatcatag atggcttagt ttgtggcaag agtcaactgg 2100
tgattattcg ataactcaag ttgatgtaag aaaggctaaa attaacgatt ataaagaggt 2160
ttatgagctt gctaagtatt cggctaagga ttccgactat ttaatcaata gagaagtgtt 2220
tacggtattc tacaaatctt taaagggtaa acaggtactt gtatttagtg gattatttaa 2280
agacgctcat aaaatgtata agaatggaga gctagatctg tataagaagt tggatactat 2340
cgaatatgct tatatggtaa gttataactg gcttaaaaag aagtatgata cttcaaatat 2400
tagagaatta actgaggaag aaaagcagaa attcaataaa aatttaatcg aagatgtgga 2460
tattgagtag gtgggattat atctcacctt ttttattgtc ttttcatgtt gaaattttga 2520
cgcttaatgc atgaagtatt gacaagttta aaaattacgg tttttaatcc ttagttgatt 2580
agcaggatta tggccggaat gctccgtcca gtcctgttaa ggaattaaaa ttccctaaaa 2640
cccttggcta tgatttatag cgagaatcgt caattaaaaa tttaataggt gctatgaaag 2700
tcgattaata attaatttta aaatgcaata tgaaacataa ttacaagaat ttgactttta 2760
atacaagaat tgatatcata gttacattaa tac 2793
<210> 127
<211> 2793
<212> DNA
<213> Clostridium beijerinckii
<400> 127
cacgataaaa caaggtttta aggataagaa aagtcatgag atttatagta aatcttgtga 60
ctttttttat tgaatagtag agagagttcg gaagtataac acgctatatt cttgatattt 120
ttagaatagc aagcattgga tttgtcctga cactttccca aaaattaagg agttattcct 180
taaaccaaaa agattaatgt gggaacaaat ttagtgtatc catttttgaa gggcgcactt 240
atacaccacc aaaatggtgt gtgcgaaatc tttaaaaaag atttatcaaa aagctttttt 300
aaagctggga catttagaaa atcaataatg ttttttgccc aatacgctag tcttaaaatc 360
tgcaaggttg ataactattt agtcccaggt attagaatgg ggcatatata tacaaagtat 420
atatatgcgt aaatatatgt gggactgtgg gaacaaaatt gcgtgctaaa attgtattga 480
aaaggtaatg aaaaggtcat gctttggtat tgctaacgta tagaaaaggt aatgaaaagc 540
tcatggttct ataaaaaaga tgtacccacg aaaataatag gctttgccta tttccccatg 600
taatatgggg gcagttttct cttatgctct ttcttaacat attgaataaa tacaaaatgc 660
agctttgtgg gaataaaaat atttttgttt ttattcttat agttagacaa aattttaatc 720
ttttttgtgc tataacaaga ttaaaatttg tgggaacatt aagaaatatt gttgtcacaa 780
ataaaaagga gagtgggaac aattgctata aaaaacgcag aaattaagat tagagttaca 840
aaagagcaaa aagaattatt taagaaaatt gcaaaagctg aaaatatgag tatgagtgaa 900
tttattattg tgaccacaga atatttagcc agaaaaaaag atgaaaatat gaaatcaaaa 960
gacatgatcg agagaagagc tgcgaagact gaagaaaaaa ttatgaagct aaaaaagaaa 1020
ctaaataaaa acaggtaata tagattacag ttttaagctt gttttcccta tagactagag 1080
taaatatata aatatacctg tcaagggctt ataagcccct ttagggggtg cgtagcaccc 1140
ttgacaggta tatttatata ttttagggtg ccattaaggg aaacaagctt taaaatgcct 1200
ttaaaggcat tttaaaataa ataaaaaaaa gatggttttt accatctttt ttaactcccg 1260
aaagggagtt ctttcttttc ttgatactat acgtaactat ttcgatttgc cctgaaccta 1320
atcaaagcta gataaattca gtattagggc ataaaaaaac ttgctttttc gggtggaaat 1380
ctgtataatt taaattgctt agataaaaat taccaattcc atacgaaagg agcaagtttt 1440
acataaggtt aaagccttat gtgaattctc atttaattac atgaataata ataacacaga 1500
aagtgaagaa ttaaaagagc aaagtcaact attgcttgac aaatgcacaa aaaagaaaaa 1560
gaaaaatcct aaatttagta gttatataga accattagta agcaagaaat tatctgaaag 1620
aataaaggaa tgtggtgact ttttgcagat gttatctgat ttaaaccttg aaaattcgaa 1680
actgcataga gcaagttttt gtggtaacag attttgtcct atgtgtagct ggcgtattgc 1740
ttgtaaggat agtttggaaa tatctattct catggagcat ttacgcaaag aggaaagcaa 1800
agaatttatc tttttgacct taacaactcc aaatgtgaaa ggtgcggacc ttgataattc 1860
cataaaagca tacaataaag catttaaaaa gttaatggaa cgcaaagagg tcaagagcat 1920
agtaaaaggc tacataagaa agctagaagt aacctataat ttggacaaga gttccaaatc 1980
atataatact tatcacccac atttccatgt ggtactagca gtcaatagaa gttactttaa 2040
aaagcaaaat ctatatataa accatcatag atggcttagt ttgtggcaag agtcaactgg 2100
tgattattcg ataactcaag ttgatgtaag aaaggctaaa attaacgatt ataaagaggt 2160
ttatgagctt gctaagtatt cggctaagga ttccgactat ttaatcaata gagaagtgtt 2220
tacggtattc tacaaatctt taaagggtaa acaggtactt gtatttagtg gattatttaa 2280
agacgctcat aaaatgtata agaatggaga gctagatctg tataagaagt tggatactat 2340
cgaatatgct tatatggtaa gttataactg gcttaaaaag aagtatgata cttcaaatat 2400
tagagaatta actgaggaag aaaagcagaa attcaataaa aatttaatcg aagatgtgga 2460
tattgagtag gtgggattat atctcacctt ttttattgtc ttttcatgtt gaaattttga 2520
cgcttaatgc atgaagtatt gacaagttta aaaattacgg tttttaatcc ttagttgatt 2580
agcaggatta tggccggaat gctccgtcca gtcctgttaa ggaattaaaa ttccctaaaa 2640
cccttggcta tgatttatag cgagaatcgt caattaaaaa tttaataggt gctatgaaag 2700
tcgattaata attaatttta aaatgcaata tgaaacataa ttacaagaat ttgactttta 2760
atacaagaat tgatatcata gttacattaa tac 2793
<210> 128
<211> 329
<212> PRT
<213> Clostridium beijerinckii
<400> 128
Met Asn Asn Asn Asn Thr Glu Ser Glu Glu Leu Lys Glu Gln Ser Gln
1 5 10 15
Leu Leu Leu Asp Lys Cys Thr Lys Lys Lys Lys Lys Asn Pro Lys Phe
20 25 30
Ser Ser Tyr Ile Glu Pro Leu Val Ser Lys Lys Leu Ser Glu Arg Ile
35 40 45
Lys Glu Cys Gly Asp Phe Leu Gln Met Leu Ser Asp Leu Asn Leu Glu
50 55 60
Asn Ser Lys Leu His Arg Ala Ser Phe Cys Gly Asn Arg Phe Cys Pro
65 70 75 80
Met Cys Ser Trp Arg Ile Ala Cys Lys Asp Ser Leu Glu Ile Ser Ile
85 90 95
Leu Met Glu His Leu Arg Lys Glu Glu Ser Lys Glu Phe Ile Phe Leu
100 105 110
Thr Leu Thr Thr Pro Asn Val Lys Gly Ala Asp Leu Asp Asn Ser Ile
115 120 125
Lys Ala Tyr Asn Lys Ala Phe Lys Lys Leu Met Glu Arg Lys Glu Val
130 135 140
Lys Ser Ile Val Lys Gly Tyr Ile Arg Lys Leu Glu Val Thr Tyr Asn
145 150 155 160
Leu Asp Lys Ser Ser Lys Ser Tyr Asn Thr Tyr His Pro His Phe His
165 170 175
Val Val Leu Ala Val Asn Arg Ser Tyr Phe Lys Lys Gln Asn Leu Tyr
180 185 190
Ile Asn His His Arg Trp Leu Ser Leu Trp Gln Glu Ser Thr Gly Asp
195 200 205
Tyr Ser Ile Thr Gln Val Asp Val Arg Lys Ala Lys Ile Asn Asp Tyr
210 215 220
Lys Glu Val Tyr Glu Leu Ala Lys Tyr Ser Ala Lys Asp Ser Asp Tyr
225 230 235 240
Leu Ile Asn Arg Glu Val Phe Thr Val Phe Tyr Lys Ser Leu Lys Gly
245 250 255
Lys Gln Val Leu Val Phe Ser Gly Leu Phe Lys Asp Ala His Lys Met
260 265 270
Tyr Lys Asn Gly Glu Leu Asp Leu Tyr Lys Lys Leu Asp Thr Ile Glu
275 280 285
Tyr Ala Tyr Met Val Ser Tyr Asn Trp Leu Lys Lys Lys Tyr Asp Thr
290 295 300
Ser Asn Ile Arg Glu Leu Thr Glu Glu Glu Lys Gln Lys Phe Asn Lys
305 310 315 320
Asn Leu Ile Glu Asp Val Asp Ile Glu
325
<210> 129
<211> 256
<212> PRT
<213> Artificial Sequence
<220>
<223> Consensus COG5655
<400> 129
Met Cys Gln Lys Arg Ser Asp Tyr Ser Asp Glu Lys Ala Trp Leu Lys
1 5 10 15
Asp Lys Ser Lys Asp Gly Lys Val Glu Pro Trp Arg Glu Lys Lys Glu
20 25 30
Ala Asn Val Lys Tyr Phe Glu Leu Leu Lys Ile Leu Met Phe Lys Lys
35 40 45
Ala Glu Arg Val Tyr Arg Cys Asn Glu Leu Leu Glu Leu Gln Lys Val
50 55 60
Asn Glu Thr Gly Glu Asn Lys Leu Cys Pro Asn Trp Phe Cys Lys Ser
65 70 75 80
Leu Leu Cys Pro Met Cys Asn Trp Arg Lys Pro Met Lys Ser Asp Leu
85 90 95
Gln Asp Gly Leu Tyr Val Lys Arg Val Ile Ser Tyr Gly Pro Leu Leu
100 105 110
Lys Trp Lys His Leu Lys Leu Asn Leu Lys Asn Val Glu Asp Gly Asp
115 120 125
Leu Leu Asn Lys Ser Leu Asp Glu Met Ala Leu Gly Phe Lys Arg Thr
130 135 140
Met Gly Phe Lys Lys Ile Ala Lys Asn Phe Val Gly Phe Met Lys Ser
145 150 155 160
Thr Glu Ile Thr Tyr Asn Glu Lys Asp Asn Ser Tyr Asn Gln His Met
165 170 175
His Val Leu Phe Cys Ser Glu Gln Thr Tyr Phe Lys Asn Phe Ile Asn
180 185 190
Asn Thr Pro Gln Glu Phe Trp Asn Lys Arg Trp Ser Lys Ala Met Lys
195 200 205
Leu Asp Tyr Asp Pro Gln Val Met Lys Leu Trp Thr Met Tyr Lys Lys
210 215 220
Glu Ile Lys Asn Tyr Ile Gln Thr Ala Leu Gln Glu Thr Ala Lys Tyr
225 230 235 240
Asp Val Lys Asp Met Asp Ser Ala Thr Ile Asp Asp Glu Lys Ser Leu
245 250 255
<210> 130
<211> 768
<212> DNA
<213> Enterococcus faecalis
<400> 130
gtgaggagga tatatttgaa tacatacgaa caaattaata aagtgaaaaa aatacttcgg 60
aaacatttaa aaaataacct tattggtact tacatgtttg gatcaggagt tgagagtgga 120
ctaaaaccaa atagtgatct tgacttttta gtcgtcgtat ctgaaccatt gacagatcaa 180
agtaaagaaa tacttataca aaaaattaga cctatttcaa agaaaatagg agataaaagc 240
aacttacgat atattgaatt aacaattatt attcagcaag aaatggtacc gtggaatcat 300
cctcccaaac aagaatttat ttatggagaa tggttacaag agctttatga acaaggatac 360
attcctcaga aggaattaaa ttcagattta accataatgc tttaccaagc aaaacgaaaa 420
aataaaagaa tatacggaaa ttatgactta gaggaattac tacctgatat tccattttct 480
gatgtgagaa gagccattat ggattcgtca gaggaattaa tagataatta tcaggatgat 540
gaaaccaact ctatattaac tttatgccgt atgattttaa ctatggacac gggtaaaatc 600
ataccaaaag atattgcggg aaatgcagtg gctgaatctt ctccattaga acatagggag 660
agaattttgt tagcagttcg tagttatctt ggagagaata ttgaatggac taatgaaaat 720
gtaaatttaa ctataaacta tttaaataac agattaaaaa aattataa 768
<210> 131
<211> 738
<212> DNA
<213> Clostridium difficile
<400> 131
atgaacaaaa atataaaata ttctcaaaac tttttaacga gtgaaaaagt actcaaccaa 60
ataataaaac aattgaattt aaaagaaacc gataccgttt acgaaattgg aacaggtaaa 120
gggcatttaa cgacgaaact ggctaaaata agtaaacagg taacgtctat tgaattagac 180
agtcatctat tcaacttatc gtcagaaaaa ttaaaactga atactcgtgt cactttaatt 240
caccaagata ttctacagtt tcaattccct aacaaacaga ggtataaaat tgttgggagt 300
attccttacc atttaagcac acaaattatt aaaaaagtgg tttttgaaag ccatgcgtct 360
gacatctatc tgattgttga agaaggattc tacaagcgta ccttggatat tcaccgaaca 420
ctagggttgc tcttgcacac tcaagtctcg attcagcaat tgcttaagct gccagcggaa 480
tgctttcatc ctaaaccaaa agtaaacagt gtcttaataa aacttacccg ccataccaca 540
gatgttccag ataaatattg gaagctatat acgtactttg tttcaaaatg ggtcaatcga 600
gaatatcgtc aactgtttac taaaaatcag tttcatcaag caatgaaaca cgccaaagta 660
aacaatttaa gtaccgttac ttatgagcaa gtattgtcta tttttaatag ttatctatta 720
tttaacggga ggaaataa 738
<210> 132
<211> 3792
<212> DNA
<213> Artificial Sequence
<220>
<223> Optimized Mad7 CDS for B. subtilis
<400> 132
atgaacaacg gcacaaataa ttttcagaac tttattggca tttcatcatt gcagaaaacg 60
ttaagaaatg ctttaattcc gacggaaaca acgcaacagt ttattgttaa aaacggaatt 120
attaaagaag atgaattaag aggcgaaaac agacagattt taaaagatat tatggatgac 180
tactacagag gatttatttc tgaaacatta tcatctattg atgacattga ttggacaagc 240
ttatttgaaa aaatggaaat tcagttaaaa aatggtgata ataaagatac attaattaaa 300
gaacagacag aatatagaaa agcaattcat aaaaaatttg cgaacgacga tagatttaaa 360
aacatgttta gcgccaaatt aatttcagac attttacctg aatttgttat tcataacaat 420
aattattcag catcagaaaa agaagaaaaa acacaggtga ttaaattgtt ttcaagattt 480
gcgacaagct ttaaagatta ctttaaaaac agagcaaatt gcttttcagc ggacgatatt 540
tcatcaagca gctgccatag aattgttaac gacaatgcag aaattttttt ttcaaatgcg 600
ttagtttaca gaagaattgt aaaatcatta agcaatgacg atattaacaa aatttcaggc 660
gatatgaaag attcattaaa agaaatgtca ttagaagaaa tttattctta cgaaaaatat 720
ggcgaattta ttacacagga aggcattagc ttttataatg atatttgtgg caaagtgaat 780
tcttttatga acttatattg tcagaaaaat aaagaaaaca aaaatttata caaacttcag 840
aaacttcata aacagattct gtgcattgcg gacacaagct atgaagttcc gtataaattt 900
gaatcagacg aagaagtgta ccaatcagtt aacggctttc ttgataacat tagcagcaaa 960
catattgttg aaagattaag aaaaattggc gataactata acggctacaa cttagataaa 1020
atttatattg tgtccaaatt ttacgaaagc gttagccaaa aaacatacag agactgggaa 1080
acaattaata cagccttaga aattcattac aataatattt tgccgggtaa cggtaaatca 1140
aaagccgaca aagtaaaaaa agcggttaaa aatgatttac agaaatccat tacagaaatt 1200
aatgaactgg tgtcaaacta taaattatgc tcagacgaca acattaaagc ggaaacatat 1260
attcatgaaa ttagccatat tttgaataac tttgaagcac aggaattgaa atacaatccg 1320
gaaattcatc tggttgaatc cgaattaaaa gcgtcagaac ttaaaaacgt gttagacgtg 1380
attatgaatg cgtttcattg gtgttcagtt tttatgacag aagaacttgt tgataaagac 1440
aacaattttt atgcggaatt agaagaaatt tacgatgaaa tttatccggt aatttcatta 1500
tacaacttag ttagaaacta cgttacacag aaaccgtaca gcacgaaaaa aattaaattg 1560
aactttggaa ttccgacgtt agcagacggt tggtcaaaat ccaaagaata ttctaataac 1620
gctattattt taatgagaga caatttatat tatttaggca tttttaatgc gaaaaataaa 1680
ccggacaaaa aaattattga aggtaatacg tcagaaaata aaggtgacta caaaaaaatg 1740
atttataatt tgttaccggg tccgaacaaa atgattccga aagttttttt gagcagcaaa 1800
acgggcgtgg aaacgtataa accgagcgcc tatattctgg aaggctataa acagaataaa 1860
catattaaat cttcaaaaga ctttgatatt acattttgtc atgatttaat tgactacttt 1920
aaaaactgta ttgcaattca tccggaatgg aaaaactttg gttttgattt tagcgacaca 1980
tcaacatatg aagacatttc cggcttttat agagaagtag aattacaagg ttacaaaatt 2040
gattggacat acattagcga aaaagacatt gatttattac aggaaaaagg tcaattatat 2100
ttatttcaga tttataacaa agatttttca aaaaaatcaa caggcaatga caaccttcat 2160
acaatgtact taaaaaatct tttttcagaa gaaaatctta aagatattgt tttaaaactt 2220
aacggcgaag cggaaatttt ttttagaaaa agcagcatta aaaacccgat tattcataaa 2280
aaaggctcaa ttttagttaa cagaacatac gaagcagaag aaaaagacca gtttggcaac 2340
attcaaattg tgagaaaaaa tattccggaa aacatttatc aggaattata caaatacttt 2400
aacgataaaa gcgacaaaga attatctgat gaagcagcca aattaaaaaa tgtagtggga 2460
catcatgaag cagcgacgaa tattgttaaa gactatagat acacgtatga taaatacttt 2520
cttcatatgc ctattacgat taattttaaa gccaataaaa cgggttttat taatgataga 2580
attttacagt atattgctaa agaaaaagac ttacatgtga ttggcattga tagaggcgaa 2640
agaaacttaa tttacgtgtc cgtgattgat acatgtggta atattgttga acagaaaagc 2700
tttaacattg taaacggcta cgactatcag attaaattaa aacaacagga aggcgctaga 2760
cagattgcga gaaaagaatg gaaagaaatt ggtaaaatta aagaaattaa agaaggctac 2820
ttaagcttag taattcatga aatttctaaa atggtaatta aatacaatgc aattattgcg 2880
atggaagatt tgtcttatgg ttttaaaaaa ggcagattta aagttgaaag acaagtttac 2940
cagaaatttg aaacaatgtt aattaataaa ttaaactatt tagtatttaa agatatttca 3000
attacagaaa atggcggttt attaaaaggt tatcagttaa catacattcc tgataaactt 3060
aaaaacgtgg gtcatcagtg cggctgcatt ttttatgtgc ctgctgcata cacgagcaaa 3120
attgatccga caacaggctt tgtgaatatt tttaaattta aagacttaac agtggacgca 3180
aaaagagaat ttattaaaaa atttgactca attagatatg actcagaaaa aaatttattt 3240
tgctttacat ttgactacaa taactttatt acgcaaaaca cggttatgag caaatcatca 3300
tggtcagtgt atacatacgg cgtgagaatt aaaagaagat ttgtgaacgg cagattttca 3360
aacgaatcag atacaattga cattacaaaa gatatggaaa aaacgttgga aatgacggac 3420
attaactgga gagatggcca tgatcttaga caagacatta ttgattatga aattgttcag 3480
catatttttg aaatttttag attaacagtg caaatgagaa actccttgtc tgaattagaa 3540
gacagagatt acgatagatt aatttcacct gtattaaacg aaaataacat tttttatgac 3600
agcgcgaaag cgggcgatgc acttcctaaa gatgccgatg caaatggtgc gtattgtatt 3660
gcattaaaag gcttatatga aattaaacaa attacagaaa attggaaaga agatggtaaa 3720
ttttcaagag ataaattaaa aattagcaat aaagattggt ttgactttat tcagaataaa 3780
agatatttat aa 3792
<210> 133
<211> 10469
<212> DNA
<213> Artificial Sequence
<220>
<223> pCas9cond
<400> 133
catggataaa aagtacagta ttggtctaga cataggaact aactctgttg ggtgggctgt 60
tataacagat gaatataaag ttccatcaaa aaaatttaaa gtattaggaa acactgatag 120
acattcaata aaaaaaaact tgataggtgc tttattattc gattcaggag agactgctga 180
agctacacgt ttaaaaagaa cagctagacg tagatataca agaagaaaaa ataggatatg 240
ttatcttcaa gaaattttta gtaatgaaat ggcaaaagtt gatgattcat tctttcacag 300
actagaagaa agtttcttag ttgaagaaga taagaagcat gaaagacacc ctatttttgg 360
taatatcgta gatgaagtag catatcatga gaagtatcca actatctatc atttaagaaa 420
gaaattagtt gattctacag ataaagctga tctgagatta atatatttag ctttagctca 480
tatgattaaa tttagaggac attttttaat agaaggtgat ttaaacccag acaacagcga 540
tgtagataaa ttatttatcc aattagttca aacttataat caattattcg aagagaatcc 600
aattaatgca agtggtgtag acgctaaggc tatattatca gctagattat caaaatctag 660
aagattagaa aatctaatag ctcaacttcc tggagaaaag aaaaatggac tttttgggaa 720
cctaatagct ctctcactcg gactaacacc aaattttaaa agcaattttg atcttgctga 780
agacgcaaag ttacaactat caaaggatac atacgatgat gatttagata atttgttagc 840
tcaaataggt gatcaatatg ctgatttgtt tcttgcagca aaaaacttaa gtgatgcaat 900
tttactatca gatatactta gagtaaatac agaaataaca aaggctcctt tatcagcaag 960
tatgattaaa cgatatgatg agcatcatca agatttaaca ttattaaagg cacttgtaag 1020
acaacaatta ccagaaaaat ataaagaaat tttctttgat caatctaaaa atggatatgc 1080
tggatatata gacggtggag caagtcaaga agagttttat aaatttataa agcctatttt 1140
agaaaaaatg gatggaactg aagaattact tgttaaactt aacagagaag atttacttag 1200
aaaacaaaga acttttgata atggttcaat tcctcaccaa attcatttag gagaattaca 1260
tgctatacta agaagacaag aagattttta tccatttctt aaagataata gagaaaaaat 1320
tgaaaaaatt ttaactttta gaataccata ttatgtagga ccacttgcaa ggggaaattc 1380
aagatttgca tggatgacta gaaaatcaga agaaactata accccgtgga attttgaaga 1440
agtagtagat aaaggagcta gtgctcaatc atttatagaa agaatgacaa attttgataa 1500
gaatcttcct aacgaaaagg ttttgccaaa gcatagcctt ctttatgagt attttacagt 1560
ttataatgag cttactaaag taaaatacgt tacagaagga atgagaaaac cagcattttt 1620
gtctggtgaa caaaagaaag caatagtaga cctattattt aaaacaaata ggaaggttac 1680
cgtaaagcaa cttaaagaag attacttcaa aaaaattgaa tgctttgata gtgttgaaat 1740
atcaggagtt gaagatagat ttaatgcttc acttggtaca tatcacgatc tcttaaaaat 1800
tataaaagat aaggattttt tagataatga agaaaatgaa gatattcttg aagatatagt 1860
attaacattg acactttttg aagatagaga aatgatagaa gaaagattaa aaacatatgc 1920
acatcttttt gatgataagg ttatgaagca acttaaaaga agaagatata caggttgggg 1980
acgtttgtca agaaagctaa ttaatggtat tagagataaa caatcaggaa agactattct 2040
cgattttctt aaatcagatg gatttgctaa tagaaacttt atgcaattaa ttcatgatga 2100
ttctcttact ttcaaagagg atattcaaaa ggctcaagtt tctggacaag gcgatagctt 2160
acacgaacac attgctaacc ttgcagggag ccccgctatc aaaaaaggaa ttttacaaac 2220
agttaaagtt gtagatgaac ttgttaaagt tatgggaaga cacaaacctg agaatatagt 2280
tatagaaatg gccagagaaa atcaaacaac acaaaaagga caaaaaaatt ctagagagag 2340
aatgaagaga attgaagaag gaataaaaga gctaggatca caaatattaa aagaacatcc 2400
agttgaaaat actcaattgc aaaatgaaaa gttatatttg tattacttac aaaatggaag 2460
agatatgtat gttgatcaag aactcgatat taatagatta agtgactatg atgttgatca 2520
tattgttcct caatcatttt taaaagatga ttcaatcgat aacaaagtat taactagatc 2580
agataaaaat agaggaaagt cagataatgt accatctgaa gaagttgtta aaaaaatgaa 2640
gaactattgg agacaacttt taaatgcaaa gctaattaca caaagaaaat ttgacaattt 2700
aacaaaagca gaaagaggag gattaagcga attagacaaa gctggattta taaaaagaca 2760
acttgttgag acaagacaaa taactaagca tgttgctcaa atacttgatt caagaatgaa 2820
tacaaaatat gatgaaaatg ataaattaat cagagaagta aaagtaataa cattaaagtc 2880
aaaattagta tcagatttca gaaaggattt tcaattttac aaagttcgtg aaataaataa 2940
ctatcatcat gctcatgatg catacttaaa tgctgttgta ggaactgctc ttattaagaa 3000
atatcctaaa ctagaaagcg aatttgttta tggagattat aaagtttatg atgtgcgcaa 3060
aatgatcgcg aaatccgaac aagaaatcgg taaggctaca gcaaaatatt tcttttatag 3120
taatataatg aattttttta agacagaaat aactttggct aatggtgaaa tcagaaaaag 3180
accacttatc gaaacaaatg gagagacagg agaaatagta tgggataaag gaagagattt 3240
tgctactgtt agaaaagtac taagtatgcc acaagtaaat atcgtaaaga aaactgaagt 3300
tcaaactgga ggtttctcta aggaatcaat tttacctaag agaaattcag ataagttaat 3360
tgcaaggaaa aaagattggg acccaaaaaa atacggtggt tttgatagtc caacagttgc 3420
ctatagtgtt cttgtagtag cgaaagttga gaaaggtaag tcaaaaaagt tgaaaagcgt 3480
aaaagaactt cttggtatca caattatgga aagatcttca tttgaaaaaa atccaattga 3540
ctttttagaa gctaagggtt ataaagaagt taaaaaggat ttaatcataa aactaccaaa 3600
gtatagtcta tttgaactcg aaaacggaag aaaacgaatg ctcgctagcg caggagaact 3660
tcaaaaagga aatgaacttg cgctgccatc aaagtatgta aatttcttat atttagcttc 3720
tcattatgag aaattaaaag gatcaccaga ggataatgaa caaaagcaac tatttgtaga 3780
acaacacaaa cattatttag atgaaataat agaacaaata tctgaatttt ctaaaagagt 3840
tatacttgcc gacgcaaatc tagataaggt gctttcagcg tataataaac acagagataa 3900
accaataaga gaacaagcag aaaacattat ccatcttttt acattaacta atcttggtgc 3960
accagctgca tttaagtact ttgatacaac aatagataga aaaagataca catctactaa 4020
agaagtatta gacgcaactt taatacatca atctattaca gggctttatg aaacaagaat 4080
tgatttaagt caactaggcg gagattaagt cgacaaagta ttgttaaaaa taactctgta 4140
gaattataaa ttagttctac agagttattt tttgacccgg gtaccgagct cgaattcgta 4200
atcatggtca tagctgtttc ctgtgtgaaa ttgttatccg ctcacaattc cacacaacat 4260
acgagccgga agcataaagt gtaaagcctg gggtgcctaa tgagtgagct aactcacatt 4320
aattgcgttg cgctcactgc ccgctttcca gtcgggaaac ctgtcgtgcc agctgcatta 4380
atgaatcggc caacgcgcgg ggagaggcgg tttgcgtatt gggcgctctt ccgcttcctc 4440
gctcactgac tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa 4500
ggcggtaata cggttatcca cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa 4560
aggccagcaa aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt tccataggct 4620
ccgcccccct gacgagcatc acaaaaatcg acgctcaagt cagaggtggc gaaacccgac 4680
aggactataa agataccagg cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc 4740
gaccctgccg cttaccggat acctgtccgc ctttctccct tcgggaagcg tggcgctttc 4800
tcatagctca cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca agctgggctg 4860
tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta tccggtaact atcgtcttga 4920
gtccaacccg gtaagacacg acttatcgcc actggcagca gccactggta acaggattag 4980
cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta actacggcta 5040
cactagaagg acagtatttg gtatctgcgc tctgctgaag ccagttacct tcggaaaaag 5100
agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt tttttgtttg 5160
caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga tcttttctac 5220
ggggtctgac gctcagtgga acgaaaactc acgttaaggg attttggtca tgagattatc 5280
aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga agttttaaat caatctaaag 5340
tatatatgag taaacttggt ctgacagtta ccaatgctta atcagtgagg cacctatctc 5400
agcgatctgt ctatttcgtt catccatagt tgcctgactc cccgtcgtgt agataactac 5460
gatacgggag ggcttaccat ctggccccag tgctgcaatg ataccgcgag acccacgctc 5520
accggctcca gatttatcag caataaacca gccagccgga agggccgagc gcagaagtgg 5580
tcctgcaact ttatccgcct ccatccagtc tattaattgt tgccgggaag ctagagtaag 5640
tagttcgcca gttaatagtt tgcgcaacgt tgttgccatt gctacaggca tcgtggtgtc 5700
acgctcgtcg tttggtatgg cttcattcag ctccggttcc caacgatcaa ggcgagttac 5760
atgatccccc atgttgtgca aaaaagcggt tagctccttc ggtcctccga tcgttgtcag 5820
aagtaagttg gccgcagtgt tatcactcat ggttatggca gcactgcata attctcttac 5880
tgtcatgcca tccgtaagat gcttttctgt gactggtgag tactcaacca agtcattctg 5940
agaatagtgt atgcggcgac cgagttgctc ttgcccggcg tcaatacggg ataataccgc 6000
gccacatagc agaactttaa aagtgctcat cattggaaaa cgttcttcgg ggcgaaaact 6060
ctcaaggatc ttaccgctgt tgagatccag ttcgatgtaa cccactcgtg cacccaactg 6120
atcttcagca tcttttactt tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa 6180
tgccgcaaaa aagggaataa gggcgacacg gaaatgttga atactcatac tcttcctttt 6240
tcaatattat tgaagcattt atcagggtta ttgtctcatg agcggataca tatttgaatg 6300
tatttagaaa aataaacaaa taggggttcc gcgcacattt ccccgaaaag tgccacctga 6360
ctgccgggcc tcttgcggga tcaaaagaaa aacgaaatga tacaccaatc agtgcaaaaa 6420
aagatataat gggagataag acggttcgtg ttcgtgctga cttgcaccat atcataaaaa 6480
tcgaaacagc aaagaatggc ggaaacgtaa aagaagttat ggaaataaga cttagaagca 6540
aacttaagag tgtgttgata gtgcagtatc ttaaaatttt gtataatagg aattgaagtt 6600
aaattagatg ctaaaaattt gtaattaaga aggagtgatt acatgaacaa aaatataaaa 6660
tattctcaaa actttttaac gagtgaaaaa gtactcaacc aaataataaa acaattgaat 6720
ttaaaagaaa ccgataccgt ttacgaaatt ggaacaggta aagggcattt aacgacgaaa 6780
ctggctaaaa taagtaaaca ggtaacgtct attgaattag acagtcatct attcaactta 6840
tcgtcagaaa aattaaaact gaatactcgt gtcactttaa ttcaccaaga tattctacag 6900
tttcaattcc ctaacaaaca gaggtataaa attgttggga gtattcctta ccatttaagc 6960
acacaaatta ttaaaaaagt ggtttttgaa agccatgcgt ctgacatcta tctgattgtt 7020
gaagaaggat tctacaagcg taccttggat attcaccgaa cactagggtt gctcttgcac 7080
actcaagtct cgattcagca attgcttaag ctgccagcgg aatgctttca tcctaaacca 7140
aaagtaaaca gtgtcttaat aaaacttacc cgccatacca cagatgttcc agataaatat 7200
tggaagctat atacgtactt tgtttcaaaa tgggtcaatc gagaatatcg tcaactgttt 7260
actaaaaatc agtttcatca agcaatgaaa cacgccaaag taaacaattt aagtaccgtt 7320
acttatgagc aagtattgtc tatttttaat agttatctat tatttaacgg gaggaaataa 7380
ttctatgagt ccctaggccc aactaactca acgctagtag tggatttaat cccaaatgag 7440
ccaacagaac cagaaccaga aacagaatca gaacaagtaa cattggattt agaaatggaa 7500
gaagaaaaaa gcaatgactt cgtgtgaata atgcacgaaa tcgttgctta ttttttttta 7560
aaagcggtat actagatata acgaaacaac gaactgaata gaaacgaaaa aagagccatg 7620
acacatttat aaaatgtttg acgacatttt ataaatgcat agcccgataa gattgccaaa 7680
ccaacgctta tcagttagtc agatgaactc ttccctcgta agaagttatt taattaactt 7740
tgtttgaaga cggtatataa ccgtactatc attatatagg gaaatcagag agttttcaag 7800
tatctaagct actgaattta agaattgtta agcaatcaat cggaaatcgt ttgattgctt 7860
tttttgtatt catttataga aggtggagtt tgtatgaatc atgatgaatg taaaacttat 7920
ataaaaaata gtttattgga gataagaaaa ttagcaaata tctatacact agaaacgttt 7980
aagaaagagt tagaaaagag aaatatctac ttagaaacaa aatcagataa gtatttttct 8040
tcggaggggg aagattatat atataagtta atagaaaata acaaaataat ttattcgatt 8100
agtggaaaaa aattgactta taaaggaaaa aaatcttttt caaaacatgc aatattgaaa 8160
cagttgaatg aaaaagcaaa ccaagttaat taaacaacct attttatagg atttatagga 8220
aaggagaaca gctgaatgaa tatccctttt gttgtagaaa ctgtgcttca tgacggcttg 8280
ttaaagtaca aatttaaaaa tagtaaaatt cgctcaatca ctaccaagcc aggtaaaagc 8340
aaaggggcta tttttgcgta tcgctcaaaa tcaagcatga ttggcggtcg tggtgttgtt 8400
ctgacttccg aggaagcgat tcaagaaaat caagatacat ttacacattg gacacccaac 8460
gtttatcgtt atggaacgta tgcagacgaa aaccgttcat acacgaaagg acattctgaa 8520
aacaatttaa gacaaatcaa taccttcttt attgattttg atattcacac ggcaaaagaa 8580
actatttcag caagcgatat tttaacaacc gctattgatt taggttttat gcctactatg 8640
attatcaaat ctgataaagg ttatcaagca tattttgttt tagaaacgcc agtctatgtg 8700
acttcaaaat cagaatttaa atctgtcaaa gcagccaaaa taatttcgca aaatatccga 8760
gaatattttg gaaagtcttt gccagttgat ctaacgtgta atcattttgg tattgctcgc 8820
ataccaagaa cggacaatgt agaatttttt gatcctaatt accgttattc tttcaaagaa 8880
tggcaagatt ggtctttcaa acaaacagat aataagggct ttactcgttc aagtctaacg 8940
gttttaagcg gtacagaagg caaaaaacaa gtagatgaac cctggtttaa tctcttattg 9000
cacgaaacga aattttcagg agaaaagggt ttaatagggc gtaataacgt catgtttacc 9060
ctctctttag cctactttag ttcaggctat tcaatcgaaa cgtgcgaata taatatgttt 9120
gagtttaata atcgattaga tcaaccctta gaagaaaaag aagtaatcaa aattgttaga 9180
agtgcctatt cagaaaacta tcaaggggct aatagggaat acattaccat tctttgcaaa 9240
gcttgggtat caagtgattt aaccagtaaa gatttatttg tccgtcaagg gtggtttaaa 9300
ttcaagaaaa aaagaagcga acgtcaacgt gttcatttgt cagaatggaa agaagattta 9360
atggcttata ttagcgaaaa aagcgatgta tacaagcctt atttagtgac gaccaaaaaa 9420
gagattagag aagtgctagg cattcctgaa cggacattag ataaattgct gaaggtactg 9480
aaggcgaatc aggaaatttt ctttaagatt aaaccaggaa gaaatggtgg cattcaactt 9540
gctagtgtta aatcattgtt gctatcgatc attaaagtaa aaaaagaaga aaaagaaagc 9600
tatataaagg cgctgacaaa ttcttttgac ttagagcata cattcattca agagacttta 9660
aacaagctag cagaacgccc taaaacggac acacaactcg atttgtttag ctatgataca 9720
ggctgaaaat aaaacccgca ctatgccatt acatttatat ctatgatacg tgtttgtttt 9780
ttctttgctg tttagcgaat gattagcaga aatatacaga gtaagatttt aattaattat 9840
tagggggaga aggagagagt agcccgaaaa cttttagttg gcttggactg aacgaagtga 9900
gggaaaggct actaaaacgt cgaggggcag tgagagcgaa gcgaacactt gattttttaa 9960
ttttctatct tttataggtc attagagtat acttatttgt cctataaact atttagcagc 10020
ataatagatt tattgaatag gtcatttaag ttgagcatat tagaggagga aaatcttgga 10080
gaaatatttg aagaacccga ttacatggat tggattagtt cttgtggtta cgtggttttt 10140
aactaaaagt agtgaatttt tgatttttgg tgtgtgtgtc ttgttgttag tatttgctag 10200
tcaaagtgat taaatagaat tctagcgcca ttcgccattc aggctgcgca actgttggga 10260
agggcgatcg gtgcgggcct cttcgctatt acgccagctg gcgaaagggg gatgtgctgc 10320
aaggcgatta agttgggtaa cgccagggtt ttcccagtca cgacgttgta aaacgacggc 10380
cagtgccaag cttgcatgcc tgcaggcctc gagtatattg ataaaaataa taatagtggg 10440
tataattaag ttgttaggag gttagttac 10469
<210> 134
<211> 8559
<212> DNA
<213> Artificial Sequence
<220>
<223> pMAD7
<400> 134
tcgagtccct atcagtgata gattgaaact ctatcattga tagagtataa tatctttgtt 60
cattagagcg ataaacttga atttgagagg gaacttagat gaacaacggc acaaataatt 120
ttcagaactt catagggata tcaagtttgc agaaaacgtt aagaaatgct ttaataccca 180
cggaaaccac gcaacagttc atagttaaga acggaataat taaagaagat gagttaagag 240
gcgagaacag acagatttta aaagatataa tggatgacta ctacagagga ttcatatctg 300
agactttaag ttctattgat gacatagatt ggactagctt attcgaaaaa atggaaattc 360
agttaaaaaa tggtgataat aaagatacct taattaagga acagacagag tatagaaaag 420
caatacataa aaaatttgcg aacgacgata gatttaagaa catgtttagc gccaaattaa 480
ttagtgacat attacctgaa tttgttatac acaacaataa ttattcggca tcagagaaag 540
aggaaaaaac ccaggtgata aaattgtttt cgagatttgc gactagcttt aaagattact 600
tcaagaacag agcaaattgc ttttcagcgg acgatatttc atcaagcagc tgccatagaa 660
tagttaacga caatgcagag atattctttt caaatgcgtt agtttacaga agaatagtaa 720
aatcgttaag caatgacgat ataaacaaaa tttcgggcga tatgaaagat tcattaaaag 780
aaatgagttt agaagaaata tattcttacg agaagtatgg ggaatttatt acccaggaag 840
gcattagctt ctataatgat atatgtggga aagtgaattc ttttatgaac ttatattgtc 900
agaaaaataa agaaaacaaa aatttataca aacttcagaa acttcacaaa cagattctat 960
gcattgcgga cactagctat gaggttccgt ataaatttga aagtgacgag gaagtgtacc 1020
aatcagttaa cggcttcctt gataacatta gcagcaaaca tatagttgaa agattaagaa 1080
aaataggcga taactataac ggctacaact tagataaaat ttatatagtg tccaaatttt 1140
acgagagcgt tagccaaaaa acctacagag actgggaaac aattaatacc gccttagaaa 1200
ttcattacaa taatatattg ccgggtaacg gtaaaagtaa agccgacaaa gtaaaaaaag 1260
cggttaagaa tgatttacag aaatccataa ccgaaataaa tgaactagtg tcaaactata 1320
agttatgcag tgacgacaac ataaaagcgg agacttatat acatgagatt agccatatat 1380
tgaataactt tgaagcacag gaattgaaat acaatccgga aattcaccta gttgaatccg 1440
agttaaaagc gagtgagctt aaaaacgtgt tagacgtgat aatgaatgcg tttcattggt 1500
gttcggtttt tatgactgag gaacttgttg ataaagacaa caatttttat gcggaattag 1560
aggagattta cgatgaaatt tatccagtaa ttagtttata caacttagtt agaaactacg 1620
ttacccagaa accgtacagc acgaaaaaga ttaaattgaa ctttggaata ccgacgttag 1680
cagacggttg gtcaaagtcc aaagagtatt ctaataacgc tataatatta atgagagaca 1740
atttatatta tttaggcata tttaatgcga agaataaacc ggacaagaag attatagagg 1800
gtaatacgtc agaaaataag ggtgactaca aaaagatgat ttataatttg ttaccgggtc 1860
ccaacaaaat gataccgaaa gttttcttga gcagcaagac gggggtggaa acgtataaac 1920
cgagcgccta tatactagag gggtataaac agaataaaca tataaagtct tcaaaagact 1980
ttgatataac tttctgtcat gatttaatag actacttcaa aaactgtatt gcaattcatc 2040
ccgagtggaa aaacttcggt tttgatttta gcgacaccag tacttatgaa gacatttccg 2100
ggttttatag agaggtagag ttacaaggtt acaagattga ttggacatac attagcgaaa 2160
aagacattga tttattacag gaaaaaggtc aattatattt attccagata tataacaaag 2220
atttttcgaa aaaatcaacc gggaatgaca accttcacac catgtactta aaaaatcttt 2280
tctcagaaga aaatcttaag gatatagttt taaaacttaa cggcgaagcg gaaatattct 2340
tcaggaagag cagcataaag aacccaataa ttcataaaaa aggctcgatt ttagttaaca 2400
gaacctacga agcagaagaa aaagaccagt ttggcaacat tcaaattgtg agaaaaaata 2460
ttccggaaaa catttatcag gagttataca aatacttcaa cgataaaagc gacaaagagt 2520
tatctgatga agcagccaaa ttaaagaatg tagtgggaca ccacgaggca gcgacgaata 2580
tagttaagga ctatagatac acgtatgata aatacttcct tcatatgcct attacgataa 2640
atttcaaagc caataaaacg ggttttatta atgataggat attacagtat atagctaaag 2700
aaaaagactt acatgtgata ggcattgata gaggcgagag aaacttaata tacgtgtccg 2760
tgattgatac ttgtggtaat atagttgaac agaaaagctt taacattgta aacggctacg 2820
actatcagat aaaattaaaa caacaggagg gcgctagaca gattgcgaga aaagaatgga 2880
aagaaattgg taaaattaaa gagataaaag agggctactt aagcttagta atacacgaga 2940
tatctaaaat ggtaataaaa tacaatgcaa ttatagcgat ggaggatttg tcttatggtt 3000
ttaaaaaagg gagatttaag gttgaaagac aagtttacca gaaatttgaa accatgttaa 3060
taaataaatt aaactattta gtatttaaag atatttcgat taccgagaat ggcggtttat 3120
taaaaggtta tcagttaaca tacattcctg ataaacttaa aaacgtgggt catcagtgcg 3180
gctgcatttt ttatgtgcct gctgcataca cgagcaaaat tgatccgacc accggctttg 3240
tgaatatatt taaatttaaa gacttaacag tggacgcaaa aagagaattc attaaaaaat 3300
ttgactcaat tagatatgac agtgaaaaaa atttattctg ctttacattt gactacaata 3360
actttattac gcaaaacacg gttatgagca aatcatcgtg gagtgtgtat acatacggcg 3420
tgagaataaa aagaagattt gtgaacggca gattctcaaa cgaaagtgat accattgaca 3480
taaccaaaga tatggagaaa acgttggaaa tgacggacat taactggaga gatggccacg 3540
atcttagaca agacattata gattatgaaa ttgttcagca catattcgaa attttcagat 3600
taacagtgca aatgagaaac tccttgtctg aattagagga cagagattac gatagattaa 3660
tttcacctgt attaaacgaa aataacattt tttatgacag cgcgaaagcg ggggatgcac 3720
ttcctaagga tgccgatgca aatggtgcgt attgtattgc attaaaaggg ttatatgaaa 3780
ttaaacaaat taccgaaaat tggaaagaag atggtaaatt ttcgagagat aaattaaaaa 3840
taagcaataa agattggttc gactttatac agaataagag atatttataa gtcgacaaag 3900
tattgttaaa aataactctg tagaattata aattagttct acagagttat tttttgaccc 3960
gggtatattg ataaaaataa taatagtggg tataattaag ttgttaggag gttagttaga 4020
atgatgtcaa gattagataa aagtaaagtg attaacagcg cattagagct gcttaatgag 4080
gtcggaatcg aaggtttaac aacccgtaaa ctcgcccaga agctaggtgt agagcagcct 4140
acattgtatt ggcatgtaaa aaataagcgg gctttgctcg acgccttagc cattgagatg 4200
ttagataggc accatactca cttttgccct ttagaagggg aaagctggca agatttttta 4260
cgtaataacg ctaaaagttt tagatgtgct ttactaagtc atcgcgatgg agcaaaagta 4320
catttaggta cacggcctac agaaaaacag tatgaaactc tcgaaaatca attagccttt 4380
ttatgccaac aaggtttttc actagagaat gcattatatg cactcagcgc tgtggggcat 4440
tttactttag gttgcgtatt ggaagatcaa gagcatcaag tcgctaaaga agaaagggaa 4500
acacctacta ctgatagtat gccgccatta ttacgacaag ctatcgaatt atttgatcac 4560
caaggtgcag agccagcctt cttattcggc cttgaattga tcatatgcgg attagaaaaa 4620
caacttaaat gtgaaagtgg gtcttaaaag cagcataacc tttttccgtg atggtaactt 4680
cacggtaacc aagatgtcga gttgagctcg aattcgtaat catggtcata gctgtttcct 4740
gtgtgaaatt gttatccgct cacaattcca cacaacatac gagccggaag cataaagtgt 4800
aaagcctggg gtgcctaatg agtgagctaa ctcacattaa ttgcgttgcg ctcactgccc 4860
gctttccagt cgggaaacct gtcgtgccag ctgcattaat gaatcggcca acgcgcgggg 4920
agaggcggtt tgcgtattgg gcgctcttcc gcttcctcgc tcactgactc gctgcgctcg 4980
gtcgttcggc tgcggcgagc ggtatcagct cactcaaagg cggtaatacg gttatccaca 5040
gaatcagggg ataacgcagg aaagaacatg tgagcaaaag gccagcaaaa ggccaggaac 5100
cgtaaaaagg ccgcgttgct ggcgtttttc cataggctcc gcccccctga cgagcatcac 5160
aaaaatcgac gctcaagtca gaggtggcga aacccgacag gactataaag ataccaggcg 5220
tttccccctg gaagctccct cgtgcgctct cctgttccga ccctgccgct taccggatac 5280
ctgtccgcct ttctcccttc gggaagcgtg gcgctttctc atagctcacg ctgtaggtat 5340
ctcagttcgg tgtaggtcgt tcgctccaag ctgggctgtg tgcacgaacc ccccgttcag 5400
cccgaccgct gcgccttatc cggtaactat cgtcttgagt ccaacccggt aagacacgac 5460
ttatcgccac tggcagcagc cactggtaac aggattagca gagcgaggta tgtaggcggt 5520
gctacagagt tcttgaagtg gtggcctaac tacggctaca ctagaaggac agtatttggt 5580
atctgcgctc tgctgaagcc agttaccttc ggaaaaagag ttggtagctc ttgatccggc 5640
aaacaaacca ccgctggtag cggtggtttt tttgtttgca agcagcagat tacgcgcaga 5700
aaaaaaggat ctcaagaaga tcctttgatc ttttctacgg ggtctgacgc tcagtggaac 5760
gaaaactcac gttaagggat tttggtcatg agattatcaa aaaggatctt cacctagatc 5820
cttttaaatt aaaaatgaag ttttaaatca atctaaagta tatatgagta aacttggtct 5880
gacagttacc aggtccactg ccgggcctct tgcgggatca aaagaaaaac gaaatgatac 5940
accaatcagt gcaaaaaaag atataatggg agataagacg gttcgtgttc gtgctgactt 6000
gcaccatatc ataaaaatcg aaacagcaaa gaatggcgga aacgtaaaag aagttatgga 6060
aataagactt agaagcaaac ttaagagtgt gttgatagtg cagtatctta aaattttgta 6120
taataggaat tgaagttaaa ttagatgcta aaaatttgta attaagaagg agtgattaca 6180
tgaacaaaaa tataaaatat tctcaaaact ttttaacgag tgaaaaagta ctcaaccaaa 6240
taataaaaca attgaattta aaagaaaccg ataccgttta cgaaattgga acaggtaaag 6300
ggcatttaac gacgaaactg gctaaaataa gtaaacaggt aacgtctatt gaattagaca 6360
gtcatctatt caacttatcg tcagaaaaat taaaactgaa tactcgtgtc actttaattc 6420
accaagatat tctacagttt caattcccta acaaacagag gtataaaatt gttgggagta 6480
ttccttacca tttaagcaca caaattatta aaaaagtggt ttttgaaagc catgcgtctg 6540
acatctatct gattgttgaa gaaggattct acaagcgtac cttggatatt caccgaacac 6600
tagggttgct cttgcacact caagtctcga ttcagcaatt gcttaagctg ccagcggaat 6660
gctttcatcc taaaccaaaa gtaaacagtg tcttaataaa acttacccgc cataccacag 6720
atgttccaga taaatattgg aagctatata cgtactttgt ttcaaaatgg gtcaatcgag 6780
aatatcgtca actgtttact aaaaatcagt ttcatcaagc aatgaaacac gccaaagtaa 6840
acaatttaag taccgttact tatgagcaag tattgtctat ttttaatagt tatctattat 6900
ttaacgggag gaaataattc tatgagtccc taggcaggcc tccgccatta tttttttgaa 6960
caattgacaa ttcatttctt attttttatt aagtgatagt caaaaggcat aacagtgctg 7020
aatagaaaga aatttacaga aaagaaaatt atagaattta gtatgattaa ttatactcat 7080
ttatgaatgt ttaattgaat acaaaaaaaa atacttgtta tgtattcaat tacgggttaa 7140
aatatagaca agttgaaaaa tttaataaaa aaataagtcc tcagctctta tatattaagc 7200
taccaactta gtatataagc caaaacttaa atgtgctacc aacacatcaa gccgttagag 7260
aactctatct atagcaatat ttcaaatgta ccgacataca agagaaacat taactatata 7320
tattcaattt atgagattat cttaacagat ataaatgtaa attgcaataa gtaagattta 7380
gaagtttata gcctttgtgt attggaagca gtacgcaaag gcttttttat ttgataaaaa 7440
ttagaagtat atttattttt tcataattaa tttatgaaaa tgaaaggggg tgagcaaagt 7500
gacagaggaa agcagtatct tatcaaataa caaggtatta gcaatatcat tattgacttt 7560
agcagtaaac attatgactt ttatagtgct tgtagctaag tagtacgaaa gggggagctt 7620
taaaaagctc cttggaatac atagaattca taaattaatt tatgaaaaga agggcgtata 7680
tgaaaacttg taaaaattgc aaagagttta ttaaagatac tgaaatatgc aaaatacatt 7740
cgttgatgat tcatgataaa acagtagcaa cctattgcag taaatacaat gagtcaagat 7800
gtttacataa agggaaagtc caatgtatta attgttcaaa gatgaaccga tatggatggt 7860
gtgccataaa aatgagatgt tttacagagg aagaacagaa aaaagaacgt acatgcatta 7920
aatattatgc aaggagcttt aaaaaagctc atgtaaagaa gagtaaaaag aaaaaataat 7980
ttatttatta atttaatatt gagagtgccg acacagtatg cactaaaaaa tatatctgtg 8040
gtgtagtgag ccgatacaaa aggatagtca ctcgcatttt cataatacat cttatgttat 8100
gattatgtgt cggtgggact tcacgacgaa aacccacaat aaaaaaagag ttcggggtag 8160
ggttaagcat agttgaggca actaaacaat caagctagga tatgcagtag cagaccgtaa 8220
ggtcgttgtt taggtgtgtt gtaatacata cgctattaag atgtaaaaat acggatacca 8280
atgaagggaa aagtataatt tttggatgta gtttgtttgt tcatctatgg gcaaactacg 8340
tccaaagccg tttccaaatc tgctaaaaag tatatccttt ctaaaatcaa agtcaagtat 8400
gaaatcataa ataaagttta attttgaagt tattatgata ttatgttttt ctattaaaat 8460
aaattaagta tatagaatag tttaataata gtatatactt aatgtgataa gtgtctgaca 8520
gtgtcacaga aaggatgatt gttatggatt ataagcggc 8559
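Each entry in the listing above declares its length in the `<211>` field and its molecule type in `<212>`, so a copied entry can be checked for truncation by counting residues in the `<400>` body. The following is a minimal Python sketch of that check, assuming the layout seen above (lowercase bases with a running count at the end of each DNA line, three-letter residue codes with position rulers for PRT). The function name `check_entry` and the two toy entries are hypothetical and are not taken from the listing.

```python
import re
from typing import Tuple

def check_entry(entry: str) -> Tuple[int, int]:
    """Return (declared_length, counted_length) for one sequence entry."""
    declared = int(re.search(r"<211>\s*(\d+)", entry).group(1))
    seq_type = re.search(r"<212>\s*(\w+)", entry).group(1)
    # Everything after the <400> line is treated as the sequence body.
    body = re.split(r"<400>[^\n]*\n", entry, maxsplit=1)[1]
    if seq_type == "PRT":
        # Count three-letter residue codes such as Met, Lys, Glu.
        counted = len(re.findall(r"\b[A-Z][a-z]{2}\b", body))
    else:
        # Count lowercase base letters; the numeric rulers are ignored.
        counted = sum(len(tok) for tok in re.findall(r"[a-z]+", body))
    return declared, counted

# Hypothetical toy entries in the same layout as the listing.
DNA_ENTRY = """<210> 1
<211> 24
<212> DNA
<213> Artificial Sequence
<400> 1
atgaacaaaa atataaaata ttct 24
"""

PRT_ENTRY = """<210> 2
<211> 5
<212> PRT
<213> Artificial Sequence
<400> 2
Met Cys Gln Lys Arg
1 5
"""

if __name__ == "__main__":
    for name, entry in (("DNA", DNA_ENTRY), ("PRT", PRT_ENTRY)):
        declared, counted = check_entry(entry)
        status = "ok" if declared == counted else "MISMATCH"
        print(f"{name}: declared={declared} counted={counted} {status}")
```

A mismatch between the two numbers would suggest that lines were lost or merged during extraction; it says nothing about whether the sequence itself is biologically correct.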
Claims (14)
- A genetic tool enabling the transformation, and the genetic modification by homologous recombination, of a bacterium of the genus Clostridium, characterized in that:
i) it comprises:
- a first nucleic acid encoding at least Cas9, wherein the sequence encoding Cas9 is under the control of a promoter, and
- at least a second nucleic acid comprising a repair template that allows, by a homologous recombination mechanism, replacement of a portion of the bacterial DNA targeted by Cas9 with a sequence of interest,
ii) at least one of said nucleic acids further encodes one or more guide RNAs (gRNAs), or the genetic tool further comprises one or more guide RNAs, each guide RNA comprising a Cas-enzyme-binding RNA structure and a sequence complementary to the targeted portion of the bacterial DNA, and
iii) at least one of said nucleic acids further comprises a sequence encoding an anti-CRISPR protein under the control of an inducible promoter, or the genetic tool further comprises a third nucleic acid encoding an anti-CRISPR protein under the control of an inducible promoter.
- The genetic tool according to claim 1, wherein the bacterium of the genus Clostridium is a solventogenic bacterium selected from C. acetobutylicum, C. cellulolyticum, C. phytofermentans, C. beijerinckii, C. saccharobutylicum, C. saccharoperbutylacetonicum, C. sporogenes, C. butyricum, C. aurantibutyricum and C. tyrobutyricum.
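- The genetic tool according to claim 2, characterized in that, when the solventogenic bacterium is C. acetobutylicum, said C. acetobutylicum bacterium is strain DSM 792 (also known as ATCC 824 or LMG 5710), and, when the solventogenic bacterium is C. beijerinckii, said C. beijerinckii bacterium is strain NCIMB 8052 or strain DSM 6423 (also known as NRRL B-593, LMG 7814 or LMG 7815).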
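- The genetic tool according to any one of claims 1 to 3, wherein the sequence encoding the anti-CRISPR protein is carried by the first nucleic acid.
- The genetic tool according to any one of claims 1 to 4, characterized in that the anti-CRISPR protein is the protein AcrIIA2 or the protein AcrIIA4.
- The genetic tool according to any one of claims 1 to 5, wherein expression of the DNA sequence of interest enables the bacterium of the genus Clostridium to ferment at least two different sugars among 6-carbon and/or 5-carbon sugars.
- The genetic tool according to any one of claims 1 to 6, characterized in that the sequence of interest encodes at least one product that promotes solvent production by the bacterium of the genus Clostridium, for example at least one enzyme involved in the conversion of aldehydes to alcohols, a membrane protein, a transcription factor, or a combination thereof.
- The genetic tool according to any one of claims 1 to 7, wherein each nucleic acid present in the tool belongs to a distinct expression cassette or a distinct vector, for example a plasmid.
- A method for transforming, and genetically modifying by homologous recombination, a bacterium of the genus Clostridium, characterized in that the method comprises a step of transforming the bacterium by introducing into it the genetic tool according to any one of claims 1 to 8.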
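- The method according to claim 9, characterized in that it comprises the following steps:
a) introducing the genetic tool according to any one of claims 1 to 8 into the bacterium in the presence of an inducer of expression of the anti-CRISPR protein, and
b) culturing the transformed bacterium obtained at the end of step a) on a medium that does not contain the inducer of expression of the anti-CRISPR protein.
- The method according to claim 9 or 10, characterized in that it comprises a further step c) of eliminating the nucleic acid comprising the repair template and/or eliminating the guide RNA(s), or the sequences encoding the guide RNA(s), introduced with the genetic tool during step a).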
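- The method according to any one of claims 9 to 11, characterized in that it comprises, after step b) or step c), one or more additional steps of introducing, in the presence of an inducer of expression of the anti-CRISPR protein, an nth nucleic acid comprising a repair template distinct from that or those already introduced and one or more expression cassettes for guide RNAs that allow the sequence of interest contained in said distinct repair template to be integrated into a targeted region of the bacterial genome, each additional step being followed by a step of culturing the bacterium thus transformed on a medium that does not contain the inducer of expression of the anti-CRISPR protein and that allows expression of the Cas9/gRNA ribonucleoprotein complex.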
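- A kit for transforming, and preferably genetically modifying, a bacterium of the genus Clostridium, or for producing at least one solvent using a bacterium of the genus Clostridium, the kit comprising the components of the genetic tool as described in any one of claims 1 to 8 and at least one inducer adapted to the inducible promoter selected for expression of the anti-CRISPR protein used in the tool.
- Use of the genetic tool according to any one of claims 1 to 8, the method according to any one of claims 9 to 12, or the kit according to claim 13, to enable production on an industrial scale of a solvent or a mixture of solvents, preferably acetone, butanol, ethanol, isopropanol or a mixture thereof, typically an isopropanol/butanol mixture.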
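The inducer handling in the method claims above can be read as a simple on/off schedule: the inducer of the anti-CRISPR protein is present whenever nucleic acids are introduced, and absent during the subsequent culture so that the Cas9/gRNA complex can act. The Python sketch below only illustrates that reading. It assumes, as a simplification not stated in the claims, that the induced anti-CRISPR protein completely blocks Cas9 activity; the function names and step labels are hypothetical, and the optional elimination step c) is left out.

```python
from typing import List, Tuple

def cas9_expected_active(inducer_present: bool) -> bool:
    # Assumption: an induced anti-CRISPR protein fully inhibits Cas9/gRNA.
    return not inducer_present

def editing_schedule(extra_rounds: int = 0) -> List[Tuple[str, bool]]:
    """Return (step label, inducer present?) pairs for steps a) and b),
    followed by any additional introduction/culture rounds."""
    schedule = [
        ("a) introduce the genetic tool", True),
        ("b) culture the transformants", False),
    ]
    for round_no in range(1, extra_rounds + 1):
        schedule.append((f"round {round_no}: introduce a distinct repair template and gRNA cassette", True))
        schedule.append((f"round {round_no}: culture the transformants", False))
    return schedule

if __name__ == "__main__":
    for label, inducer in editing_schedule(extra_rounds=1):
        state = "active" if cas9_expected_active(inducer) else "inhibited"
        print(f"{label}: inducer={'yes' if inducer else 'no'}, Cas9 expected {state}")
```

Under this reading, each round alternates between a protected introduction phase and an unprotected culture phase, which is why the inducer flag simply toggles from one step to the next.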
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR1854835A FR3081881A1 (fr) | 2018-06-04 | 2018-06-04 | Optimized genetic tool for modifying bacteria of the genus Clostridium |
FR1854835 | 2018-06-04 |
Publications (1)
Publication Number | Publication Date |
---|---|
KR20190138274A (ko) | 2019-12-12 |
Family
ID=63684008
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020190061522A KR20190138274A (ko) | 2018-06-04 | 2019-05-24 | Optimized genetic tool for modifying Clostridium bacteria |
Country Status (5)
Country | Link |
---|---|
US (1) | US11946067B2 (ko) |
EP (1) | EP3578662A1 (ko) |
KR (1) | KR20190138274A (ko) |
CN (1) | CN110551713A (ko) |
FR (1) | FR3081881A1 (ko) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20230067316A (ko) | 2021-11-09 | 2023-05-16 | 서울대학교산학협력단 | Method for genetically engineering Clostridium perfringens strains and bacteriophages using the CRISPR-Cas9 system |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR3042506B1 (fr) | 2015-10-16 | 2018-11-30 | IFP Energies Nouvelles | Genetic tool for transforming Clostridium bacteria |
US11851663B2 (en) | 2018-10-14 | 2023-12-26 | Snipr Biome Aps | Single-vector type I vectors |
FR3096373A1 (fr) * | 2019-05-24 | 2020-11-27 | IFP Energies Nouvelles | Optimized genetic tool for modifying bacteria |
US10704033B1 (en) * | 2019-12-13 | 2020-07-07 | Inscripta, Inc. | Nucleic acid-guided nucleases |
EP4077359A1 (en) * | 2019-12-18 | 2022-10-26 | Exomnis Biotech B.V. | Genetically modified clostridium strains and uses thereof |
US20230193409A1 (en) * | 2020-04-03 | 2023-06-22 | The Rockefeller University | PHAGE-ENCODED AcrVIA1 FOR USE AS AN INHIBITOR OF THE RNA-TARGETING CRISPR-Cas13 SYSTEMS |
FR3111642A1 (fr) | 2020-06-23 | 2021-12-24 | IFP Energies Nouvelles | Clostridium bacteria strains resistant to 5-fluorouracil, genetic tools and uses thereof |
CN116535473B (zh) * | 2021-03-04 | 2024-04-30 | 西部(重庆)科学城种质创制大科学中心 | Effector AcrIIIA2TEM123 for controlling the CRISPR-Cas editing system and use thereof |
CN112961226B (zh) * | 2021-03-04 | 2023-05-02 | 西部(重庆)科学城种质创制大科学中心 | Type III-A CRISPR-Cas system inhibitor AcrIIIA1 and use thereof |
CN113278645B (zh) * | 2021-04-15 | 2022-06-24 | 浙江大学 | Method for enhancing genome editing efficiency in Streptomyces and use thereof |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
MY169074A (en) * | 2011-11-30 | 2019-02-13 | Dsm Ip Assets Bv | Yeast strains engineered to produce ethanol from acetic acid and glycerol |
FR3042506B1 (fr) * | 2015-10-16 | 2018-11-30 | IFP Energies Nouvelles | Genetic tool for transforming Clostridium bacteria |
CN110139927A (zh) * | 2016-11-16 | 2019-08-16 | 加利福尼亚大学董事会 | CRISPR-Cas9 inhibitors |
- 2018
  - 2018-06-04 FR FR1854835A patent/FR3081881A1/fr active Pending
- 2019
  - 2019-05-24 KR KR1020190061522A patent/KR20190138274A/ko not_active Application Discontinuation
  - 2019-05-24 US US16/421,572 patent/US11946067B2/en active Active
  - 2019-05-24 EP EP19176373.9A patent/EP3578662A1/fr not_active Withdrawn
  - 2019-05-24 CN CN201910439553.XA patent/CN110551713A/zh active Pending
Also Published As
Publication number | Publication date |
---|---|
CN110551713A (zh) | 2019-12-10 |
US11946067B2 (en) | 2024-04-02 |
US20190367947A1 (en) | 2019-12-05 |
EP3578662A1 (fr) | 2019-12-11 |
FR3081881A1 (fr) | 2019-12-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR20190138274A (ko) | Optimized genetic tool for modifying Clostridium bacteria | |
RU2763170C2 (ru) | Production of human milk oligosaccharides in microbial hosts with modified import/export | |
KR20210149060A (ko) | RNA-guided DNA integration using Tn7-like transposons | |
CN101365788B (zh) | Delta-9 elongases and their use in making polyunsaturated fatty acids | |
US6156567A (en) | Truncated transcriptionally active cytomegalovirus promoters | |
CN101939434B (zh) | DGAT genes from Yarrowia lipolytica for increasing seed storage lipid production and altering fatty acid profiles in soybean | |
KR101982360B1 (ko) | Method for the generation of compact TALE-nucleases and uses thereof | |
KR20140113997A (ko) | Genetic switches for butanol production | |
DK2718440T3 (en) | NUCLEASE ACTIVITY PROTEIN, FUSION PROTEINS AND APPLICATIONS THEREOF | |
KR20230091894A (ko) | Systems, methods, and compositions for site-specific genetic engineering using programmable addition via site-specific targeting elements (PASTE) | |
CN108431221A (zh) | Genetic tool for transforming bacteria of the genus Clostridium | |
DK2324120T3 (en) | Manipulating SNF1 protein kinase OF REVISION OF OIL CONTENT IN OLEAGINOUS ORGANISMS | |
US20040003420A1 (en) | Modified recombinase | |
KR20140092759A (ko) | Host cells and methods for production of isobutanol | |
BRPI0806354A2 (pt) | Transgenic oilseed plants, seeds, oils, food products or food analogs, medical food products or medical food analogs, pharmaceutical products, beverages, infant formulas, nutritional supplements, pet foods, aquaculture feeds, animal feeds, whole seed products, blended oil products, products, by-products and partially processed by-products | |
KR20140099224A (ko) | Keto-isovalerate decarboxylase enzymes and methods of use thereof | |
AU2016343979A1 (en) | Delivery of central nervous system targeting polynucleotides | |
DK2623594T3 (da) | Antibody against human prostaglandin E2 receptor EP4 | |
KR20140146616A (ko) | Acetate supplementation of medium for butanologens | |
KR20120099509A (ko) | Expression of hexose kinase in recombinant host cells | |
CN101627118A (zh) | Mutant delta-8 desaturase genes engineered by targeted mutagenesis and their use in making polyunsaturated fatty acids | |
KR20130027063A (ko) | Improving activity of Fe-S cluster requiring proteins | |
CN101815432A (zh) | Methods for altering plant root architecture involving genes encoding nucleoside diphosphate kinase (NDK) polypeptides and homologs thereof | |
KR20210080375A (ko) | Recombinant poxviruses for cancer immunotherapy | |
CN111094569A (zh) | Light-controllable viral protein, gene thereof, and viral vector containing the gene |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
A201 | Request for examination | ||
E902 | Notification of reason for refusal |