CA2388445A1 - Genetic markers, metabolic markers, and methods for evaluating pathogenicity of strains of E. coli
- Publication number
- CA2388445A1
- Authority
- CA
- Canada
- Prior art keywords
- seq
- coli
- sequence
- strain
- sequences
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Links
- 241000588724 Escherichia coli Species 0.000 title claims abstract description 164
- 238000000034 method Methods 0.000 title claims abstract description 75
- 230000002503 metabolic effect Effects 0.000 title claims abstract description 13
- 230000007918 pathogenicity Effects 0.000 title claims description 16
- 230000002068 genetic effect Effects 0.000 title abstract description 13
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 133
- 230000001717 pathogenic effect Effects 0.000 claims abstract description 84
- 239000000523 sample Substances 0.000 claims abstract description 48
- 239000002773 nucleotide Substances 0.000 claims abstract description 28
- 125000003729 nucleotide group Chemical group 0.000 claims abstract description 28
- 125000003275 alpha amino acid group Chemical group 0.000 claims abstract 7
- 102000004169 proteins and genes Human genes 0.000 claims description 70
- 150000007523 nucleic acids Chemical group 0.000 claims description 65
- ASJSAQIRZKANQN-CRCLSJGQSA-N 2-deoxy-D-ribose Chemical compound OC[C@@H](O)[C@@H](O)CC=O ASJSAQIRZKANQN-CRCLSJGQSA-N 0.000 claims description 62
- 108090000765 processed proteins & peptides Proteins 0.000 claims description 56
- 102000039446 nucleic acids Human genes 0.000 claims description 54
- 108020004707 nucleic acids Proteins 0.000 claims description 54
- 229920001184 polypeptide Polymers 0.000 claims description 50
- 102000004196 processed proteins & peptides Human genes 0.000 claims description 50
- 230000014509 gene expression Effects 0.000 claims description 35
- 239000013598 vector Substances 0.000 claims description 17
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 15
- 239000002609 medium Substances 0.000 claims description 11
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 claims description 10
- 229910052799 carbon Inorganic materials 0.000 claims description 10
- 102000046755 Ribokinases Human genes 0.000 claims description 9
- 108700006309 Ribokinases Proteins 0.000 claims description 9
- 238000010367 cloning Methods 0.000 claims description 6
- 102000004190 Enzymes Human genes 0.000 claims description 5
- 108090000790 Enzymes Proteins 0.000 claims description 5
- 230000000295 complement effect Effects 0.000 claims description 4
- 230000002255 enzymatic effect Effects 0.000 claims description 4
- 238000004519 manufacturing process Methods 0.000 claims description 4
- 238000012258 culturing Methods 0.000 claims description 2
- 239000001963 growth medium Substances 0.000 claims description 2
- LWGJTAZLEJHCPA-UHFFFAOYSA-N n-(2-chloroethyl)-n-nitrosomorpholine-4-carboxamide Chemical compound ClCCN(N=O)C(=O)N1CCOCC1 LWGJTAZLEJHCPA-UHFFFAOYSA-N 0.000 claims description 2
- 239000011535 reaction buffer Substances 0.000 claims description 2
- 239000013599 cloning vector Substances 0.000 claims 1
- 239000013604 expression vector Substances 0.000 claims 1
- -1 antibodies Substances 0.000 abstract description 4
- 235000018102 proteins Nutrition 0.000 description 52
- 210000004027 cell Anatomy 0.000 description 40
- 108020004414 DNA Proteins 0.000 description 26
- 150000001413 amino acids Chemical group 0.000 description 22
- 239000012634 fragment Substances 0.000 description 21
- 230000000694 effects Effects 0.000 description 16
- 230000006870 function Effects 0.000 description 16
- 241000894006 Bacteria Species 0.000 description 13
- 206010037596 Pyelonephritis Diseases 0.000 description 13
- 150000001875 compounds Chemical class 0.000 description 12
- 206010012735 Diarrhoea Diseases 0.000 description 11
- 241000607142 Salmonella Species 0.000 description 11
- 235000001014 amino acid Nutrition 0.000 description 11
- 229940024606 amino acid Drugs 0.000 description 11
- 239000002299 complementary DNA Substances 0.000 description 11
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 10
- 238000004458 analytical method Methods 0.000 description 10
- 208000015181 infectious disease Diseases 0.000 description 10
- 230000001018 virulence Effects 0.000 description 10
- KKZFLSZAWCYPOC-VPENINKCSA-N Deoxyribose 5-phosphate Chemical compound O[C@H]1C[C@H](O)[C@@H](COP(O)(O)=O)O1 KKZFLSZAWCYPOC-VPENINKCSA-N 0.000 description 9
- 241001646716 Escherichia coli K-12 Species 0.000 description 9
- 238000003556 assay Methods 0.000 description 9
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 8
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 8
- 108700026244 Open Reading Frames Proteins 0.000 description 8
- 206010040047 Sepsis Diseases 0.000 description 8
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-Amino-3-hydroxypropanoic acid Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 7
- 102000001390 Fructose-Bisphosphate Aldolase Human genes 0.000 description 7
- 108010068561 Fructose-Bisphosphate Aldolase Proteins 0.000 description 7
- 206010022678 Intestinal infections Diseases 0.000 description 7
- 210000000349 chromosome Anatomy 0.000 description 7
- 238000009396 hybridization Methods 0.000 description 7
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 6
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 6
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 6
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 6
- 230000000721 bacteriological effect Effects 0.000 description 6
- 102000053602 DNA Human genes 0.000 description 5
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 5
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 5
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 5
- DWYAUVCQDTZIJI-VZFHVOOUSA-N Thr-Ala-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O DWYAUVCQDTZIJI-VZFHVOOUSA-N 0.000 description 5
- 230000001580 bacterial effect Effects 0.000 description 5
- 239000012472 biological sample Substances 0.000 description 5
- 108010017391 lysylvaline Proteins 0.000 description 5
- 239000000463 material Substances 0.000 description 5
- 239000000047 product Substances 0.000 description 5
- 208000013223 septicemia Diseases 0.000 description 5
- 238000012360 testing method Methods 0.000 description 5
- 210000001519 tissue Anatomy 0.000 description 5
- JPGBXANAQYHTLA-DRZSPHRISA-N Ala-Gln-Phe Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 JPGBXANAQYHTLA-DRZSPHRISA-N 0.000 description 4
- 108020004635 Complementary DNA Proteins 0.000 description 4
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 4
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 4
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 4
- 241001465754 Metazoa Species 0.000 description 4
- XZFYRXDAULDNFX-UHFFFAOYSA-N N-L-cysteinyl-L-phenylalanine Natural products SCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XZFYRXDAULDNFX-UHFFFAOYSA-N 0.000 description 4
- 241000293871 Salmonella enterica subsp. enterica serovar Typhi Species 0.000 description 4
- 239000002253 acid Substances 0.000 description 4
- 201000003146 cystitis Diseases 0.000 description 4
- 230000007423 decrease Effects 0.000 description 4
- 238000001514 detection method Methods 0.000 description 4
- 201000010099 disease Diseases 0.000 description 4
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 4
- 230000000968 intestinal effect Effects 0.000 description 4
- 230000003472 neutralizing effect Effects 0.000 description 4
- 230000002085 persistent effect Effects 0.000 description 4
- 239000013612 plasmid Substances 0.000 description 4
- 108091033319 polynucleotide Proteins 0.000 description 4
- 102000040430 polynucleotide Human genes 0.000 description 4
- 239000002157 polynucleotide Substances 0.000 description 4
- 238000000746 purification Methods 0.000 description 4
- 238000006467 substitution reaction Methods 0.000 description 4
- 208000019206 urinary tract infection Diseases 0.000 description 4
- 229920001817 Agar Polymers 0.000 description 3
- YHKANGMVQWRMAP-DCAQKATOSA-N Ala-Leu-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YHKANGMVQWRMAP-DCAQKATOSA-N 0.000 description 3
- AWZKCUCQJNTBAD-SRVKXCTJSA-N Ala-Leu-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN AWZKCUCQJNTBAD-SRVKXCTJSA-N 0.000 description 3
- LBOVBQONZJRWPV-YUMQZZPRSA-N Asp-Lys-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LBOVBQONZJRWPV-YUMQZZPRSA-N 0.000 description 3
- 108020004705 Codon Proteins 0.000 description 3
- LHYJCVCQPWRMKZ-WEDXCCLWSA-N Gly-Leu-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LHYJCVCQPWRMKZ-WEDXCCLWSA-N 0.000 description 3
- QICVAHODWHIWIS-HTFCKZLJSA-N Ile-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N QICVAHODWHIWIS-HTFCKZLJSA-N 0.000 description 3
- UMYZBHKAVTXWIW-GMOBBJLQSA-N Ile-Asp-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N UMYZBHKAVTXWIW-GMOBBJLQSA-N 0.000 description 3
- GAZGFPOZOLEYAJ-YTFOTSKYSA-N Ile-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N GAZGFPOZOLEYAJ-YTFOTSKYSA-N 0.000 description 3
- HQEPKOFULQTSFV-JURCDPSOSA-N Ile-Phe-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)O)N HQEPKOFULQTSFV-JURCDPSOSA-N 0.000 description 3
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysine Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 3
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 3
- RCFDOSNHHZGBOY-UHFFFAOYSA-N L-isoleucyl-L-alanine Natural products CCC(C)C(N)C(=O)NC(C)C(O)=O RCFDOSNHHZGBOY-UHFFFAOYSA-N 0.000 description 3
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 3
- KKXDHFKZWKLYGB-GUBZILKMSA-N Leu-Asn-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N KKXDHFKZWKLYGB-GUBZILKMSA-N 0.000 description 3
- LOLUPZNNADDTAA-AVGNSLFASA-N Leu-Gln-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LOLUPZNNADDTAA-AVGNSLFASA-N 0.000 description 3
- DDVHDMSBLRAKNV-IHRRRGAJSA-N Leu-Met-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O DDVHDMSBLRAKNV-IHRRRGAJSA-N 0.000 description 3
- HQVDJTYKCMIWJP-YUMQZZPRSA-N Lys-Asn-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O HQVDJTYKCMIWJP-YUMQZZPRSA-N 0.000 description 3
- CANPXOLVTMKURR-WEDXCCLWSA-N Lys-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN CANPXOLVTMKURR-WEDXCCLWSA-N 0.000 description 3
- YQAIUOWPSUOINN-IUCAKERBSA-N Lys-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN YQAIUOWPSUOINN-IUCAKERBSA-N 0.000 description 3
- MDDUIRLQCYVRDO-NHCYSSNCSA-N Lys-Val-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN MDDUIRLQCYVRDO-NHCYSSNCSA-N 0.000 description 3
- 201000009906 Meningitis Diseases 0.000 description 3
- XMMWDTUFTZMQFD-GMOBBJLQSA-N Met-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCSC XMMWDTUFTZMQFD-GMOBBJLQSA-N 0.000 description 3
- HBGFEEQFVBWYJQ-KBPBESRZSA-N Phe-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 HBGFEEQFVBWYJQ-KBPBESRZSA-N 0.000 description 3
- ONORAGIFHNAADN-LLLHUVSDSA-N Phe-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N ONORAGIFHNAADN-LLLHUVSDSA-N 0.000 description 3
- CMHTUJQZQXFNTQ-OEAJRASXSA-N Phe-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC1=CC=CC=C1)N)O CMHTUJQZQXFNTQ-OEAJRASXSA-N 0.000 description 3
- BAONJAHBAUDJKA-BZSNNMDCSA-N Phe-Tyr-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CC=CC=C1 BAONJAHBAUDJKA-BZSNNMDCSA-N 0.000 description 3
- 108020004511 Recombinant DNA Proteins 0.000 description 3
- IXZHZUGGKLRHJD-DCAQKATOSA-N Ser-Leu-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O IXZHZUGGKLRHJD-DCAQKATOSA-N 0.000 description 3
- DSGIVWSDDRDJIO-ZXXMMSQZSA-N Thr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DSGIVWSDDRDJIO-ZXXMMSQZSA-N 0.000 description 3
- BIVIUZRBCAUNPW-JRQIVUDYSA-N Tyr-Thr-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O BIVIUZRBCAUNPW-JRQIVUDYSA-N 0.000 description 3
- OVLIFGQSBSNGHY-KKHAAJSZSA-N Val-Asp-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N)O OVLIFGQSBSNGHY-KKHAAJSZSA-N 0.000 description 3
- 150000007513 acids Chemical class 0.000 description 3
- 239000008272 agar Substances 0.000 description 3
- 108010011559 alanylphenylalanine Proteins 0.000 description 3
- 230000000692 anti-sense effect Effects 0.000 description 3
- 238000013459 approach Methods 0.000 description 3
- 230000004071 biological effect Effects 0.000 description 3
- 230000001413 cellular effect Effects 0.000 description 3
- 238000012512 characterization method Methods 0.000 description 3
- 230000001010 compromised effect Effects 0.000 description 3
- 239000000284 extract Substances 0.000 description 3
- 108010037850 glycylvaline Proteins 0.000 description 3
- 238000000338 in vitro Methods 0.000 description 3
- 108010064235 lysylglycine Proteins 0.000 description 3
- 230000004048 modification Effects 0.000 description 3
- 238000012986 modification Methods 0.000 description 3
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 3
- 238000012163 sequencing technique Methods 0.000 description 3
- 239000000725 suspension Substances 0.000 description 3
- 238000003786 synthesis reaction Methods 0.000 description 3
- 230000001225 therapeutic effect Effects 0.000 description 3
- 230000009466 transformation Effects 0.000 description 3
- 210000002700 urine Anatomy 0.000 description 3
- IKHGUXGNUITLKF-UHFFFAOYSA-N Acetaldehyde Chemical compound CC=O IKHGUXGNUITLKF-UHFFFAOYSA-N 0.000 description 2
- BYXHQQCXAJARLQ-ZLUOBGJFSA-N Ala-Ala-Ala Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O BYXHQQCXAJARLQ-ZLUOBGJFSA-N 0.000 description 2
- FJVAQLJNTSUQPY-CIUDSAMLSA-N Ala-Ala-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN FJVAQLJNTSUQPY-CIUDSAMLSA-N 0.000 description 2
- HYIDEIQUCBKIPL-CQDKDKBSSA-N Ala-Phe-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N HYIDEIQUCBKIPL-CQDKDKBSSA-N 0.000 description 2
- 102000002260 Alkaline Phosphatase Human genes 0.000 description 2
- 108020004774 Alkaline Phosphatase Proteins 0.000 description 2
- 108091093088 Amplicon Proteins 0.000 description 2
- FRMQITGHXMUNDF-GMOBBJLQSA-N Arg-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N FRMQITGHXMUNDF-GMOBBJLQSA-N 0.000 description 2
- BNYNOWJESJJIOI-XUXIUFHCSA-N Arg-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCN=C(N)N)N BNYNOWJESJJIOI-XUXIUFHCSA-N 0.000 description 2
- MOGMYRUNTKYZFB-UNQGMJICSA-N Arg-Thr-Phe Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MOGMYRUNTKYZFB-UNQGMJICSA-N 0.000 description 2
- PJOPLXOCKACMLK-KKUMJFAQSA-N Arg-Tyr-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O PJOPLXOCKACMLK-KKUMJFAQSA-N 0.000 description 2
- IIFDPDVJAHQFSR-WHFBIAKZSA-N Asn-Glu Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O IIFDPDVJAHQFSR-WHFBIAKZSA-N 0.000 description 2
- OOXUBGLNDRGOKT-FXQIFTODSA-N Asn-Ser-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O OOXUBGLNDRGOKT-FXQIFTODSA-N 0.000 description 2
- SNYCNNPOFYBCEK-ZLUOBGJFSA-N Asn-Ser-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O SNYCNNPOFYBCEK-ZLUOBGJFSA-N 0.000 description 2
- ZSVJVIOVABDTTL-YUMQZZPRSA-N Asp-Gly-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)O)N ZSVJVIOVABDTTL-YUMQZZPRSA-N 0.000 description 2
- SVABRQFIHCSNCI-FOHZUACHSA-N Asp-Gly-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O SVABRQFIHCSNCI-FOHZUACHSA-N 0.000 description 2
- VDUPGIDTWNQAJD-CIUDSAMLSA-N Cys-Lys-Cys Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)CS)C(=O)N[C@@H](CS)C(O)=O VDUPGIDTWNQAJD-CIUDSAMLSA-N 0.000 description 2
- XZFYRXDAULDNFX-UWVGGRQHSA-N Cys-Phe Chemical compound SC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XZFYRXDAULDNFX-UWVGGRQHSA-N 0.000 description 2
- 238000009007 Diagnostic Kit Methods 0.000 description 2
- 206010012741 Diarrhoea haemorrhagic Diseases 0.000 description 2
- 238000002965 ELISA Methods 0.000 description 2
- 241000206602 Eukaryota Species 0.000 description 2
- LVRKAFPPFJRIOF-GARJFASQSA-N Gln-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)N)N LVRKAFPPFJRIOF-GARJFASQSA-N 0.000 description 2
- LHMWTCWZARHLPV-CIUDSAMLSA-N Gln-Met-Ser Chemical compound CSCC[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCC(=O)N)N LHMWTCWZARHLPV-CIUDSAMLSA-N 0.000 description 2
- HMIXCETWRYDVMO-GUBZILKMSA-N Gln-Pro-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O HMIXCETWRYDVMO-GUBZILKMSA-N 0.000 description 2
- MFORDNZDKAVNSR-SRVKXCTJSA-N Gln-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCC(N)=O MFORDNZDKAVNSR-SRVKXCTJSA-N 0.000 description 2
- JILRMFFFCHUUTJ-ACZMJKKPSA-N Gln-Ser-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O JILRMFFFCHUUTJ-ACZMJKKPSA-N 0.000 description 2
- DIXKFOPPGWKZLY-CIUDSAMLSA-N Glu-Arg-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O DIXKFOPPGWKZLY-CIUDSAMLSA-N 0.000 description 2
- VAZZOGXDUQSVQF-NUMRIWBASA-N Glu-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N)O VAZZOGXDUQSVQF-NUMRIWBASA-N 0.000 description 2
- PJBVXVBTTFZPHJ-GUBZILKMSA-N Glu-Leu-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)O)N PJBVXVBTTFZPHJ-GUBZILKMSA-N 0.000 description 2
- JSIQVRIXMINMTA-ZDLURKLDSA-N Glu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O JSIQVRIXMINMTA-ZDLURKLDSA-N 0.000 description 2
- FUTAPPOITCCWTH-WHFBIAKZSA-N Gly-Asp-Asp Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O FUTAPPOITCCWTH-WHFBIAKZSA-N 0.000 description 2
- SABZDFAAOJATBR-QWRGUYRKSA-N Gly-Cys-Phe Chemical compound [H]NCC(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SABZDFAAOJATBR-QWRGUYRKSA-N 0.000 description 2
- FSPVILZGHUJOHS-QWRGUYRKSA-N Gly-His-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CNC=N1 FSPVILZGHUJOHS-QWRGUYRKSA-N 0.000 description 2
- COVXELOAORHTND-LSJOCFKGSA-N Gly-Ile-Val Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O COVXELOAORHTND-LSJOCFKGSA-N 0.000 description 2
- NNCSJUBVFBDDLC-YUMQZZPRSA-N Gly-Leu-Ser Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O NNCSJUBVFBDDLC-YUMQZZPRSA-N 0.000 description 2
- ICUTTWWCDIIIEE-BQBZGAKWSA-N Gly-Met-Asn Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)CN ICUTTWWCDIIIEE-BQBZGAKWSA-N 0.000 description 2
- 102000000587 Glycerolphosphate Dehydrogenase Human genes 0.000 description 2
- 108010041921 Glycerolphosphate Dehydrogenase Proteins 0.000 description 2
- 208000032759 Hemolytic-Uremic Syndrome Diseases 0.000 description 2
- FBOMZVOKCZMDIG-XQQFMLRXSA-N His-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N FBOMZVOKCZMDIG-XQQFMLRXSA-N 0.000 description 2
- CNPNWGHRMBQHBZ-ZKWXMUAHSA-N Ile-Gln Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O CNPNWGHRMBQHBZ-ZKWXMUAHSA-N 0.000 description 2
- OEQKGSPBDVKYOC-ZKWXMUAHSA-N Ile-Gly-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CS)C(=O)O)N OEQKGSPBDVKYOC-ZKWXMUAHSA-N 0.000 description 2
- TWPSALMCEHCIOY-YTFOTSKYSA-N Ile-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(=O)O)N TWPSALMCEHCIOY-YTFOTSKYSA-N 0.000 description 2
- HPCFRQWLTRDGHT-AJNGGQMLSA-N Ile-Leu-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O HPCFRQWLTRDGHT-AJNGGQMLSA-N 0.000 description 2
- DRCKHKZYDLJYFQ-YWIQKCBGSA-N Ile-Thr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DRCKHKZYDLJYFQ-YWIQKCBGSA-N 0.000 description 2
- HZVRQFKRALAMQS-SLBDDTMCSA-N Ile-Trp-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC(=O)O)C(=O)O)N HZVRQFKRALAMQS-SLBDDTMCSA-N 0.000 description 2
- 208000019637 Infantile Diarrhea Diseases 0.000 description 2
- 241000880493 Leptailurus serval Species 0.000 description 2
- SUPVSFFZWVOEOI-CQDKDKBSSA-N Leu-Ala-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 SUPVSFFZWVOEOI-CQDKDKBSSA-N 0.000 description 2
- SUPVSFFZWVOEOI-UHFFFAOYSA-N Leu-Ala-Tyr Natural products CC(C)CC(N)C(=O)NC(C)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 SUPVSFFZWVOEOI-UHFFFAOYSA-N 0.000 description 2
- CUXRXAIAVYLVFD-ULQDDVLXSA-N Leu-Arg-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 CUXRXAIAVYLVFD-ULQDDVLXSA-N 0.000 description 2
- DBVWMYGBVFCRBE-CIUDSAMLSA-N Leu-Asn-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O DBVWMYGBVFCRBE-CIUDSAMLSA-N 0.000 description 2
- WGNOPSQMIQERPK-UHFFFAOYSA-N Leu-Asn-Pro Natural products CC(C)CC(N)C(=O)NC(CC(=O)N)C(=O)N1CCCC1C(=O)O WGNOPSQMIQERPK-UHFFFAOYSA-N 0.000 description 2
- KFKWRHQBZQICHA-STQMWFEESA-N Leu-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 2
- DRWMRVFCKKXHCH-BZSNNMDCSA-N Leu-Phe-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=CC=C1 DRWMRVFCKKXHCH-BZSNNMDCSA-N 0.000 description 2
- IWMJFLJQHIDZQW-KKUMJFAQSA-N Leu-Ser-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IWMJFLJQHIDZQW-KKUMJFAQSA-N 0.000 description 2
- SBANPBVRHYIMRR-UHFFFAOYSA-N Leu-Ser-Pro Natural products CC(C)CC(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O SBANPBVRHYIMRR-UHFFFAOYSA-N 0.000 description 2
- HOMFINRJHIIZNJ-HOCLYGCPSA-N Leu-Trp-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)NCC(O)=O HOMFINRJHIIZNJ-HOCLYGCPSA-N 0.000 description 2
- MDSUKZSLOATHMH-IUCAKERBSA-N Leu-Val Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@@H](C(C)C)C([O-])=O MDSUKZSLOATHMH-IUCAKERBSA-N 0.000 description 2
- VKVDRTGWLVZJOM-DCAQKATOSA-N Leu-Val-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O VKVDRTGWLVZJOM-DCAQKATOSA-N 0.000 description 2
- 239000006137 Luria-Bertani broth Substances 0.000 description 2
- FZIJIFCXUCZHOL-CIUDSAMLSA-N Lys-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN FZIJIFCXUCZHOL-CIUDSAMLSA-N 0.000 description 2
- HGNRJCINZYHNOU-LURJTMIESA-N Lys-Gly Chemical compound NCCCC[C@H](N)C(=O)NCC(O)=O HGNRJCINZYHNOU-LURJTMIESA-N 0.000 description 2
- ITWQLSZTLBKWJM-YUMQZZPRSA-N Lys-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CCCCN ITWQLSZTLBKWJM-YUMQZZPRSA-N 0.000 description 2
- IVFUVMSKSFSFBT-NHCYSSNCSA-N Lys-Ile-Gly Chemical compound OC(=O)CNC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN IVFUVMSKSFSFBT-NHCYSSNCSA-N 0.000 description 2
- BDFHWFUAQLIMJO-KXNHARMFSA-N Lys-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N)O BDFHWFUAQLIMJO-KXNHARMFSA-N 0.000 description 2
- 206010058780 Meningitis neonatal Diseases 0.000 description 2
- IVCPHARVJUYDPA-FXQIFTODSA-N Met-Asn-Asp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N IVCPHARVJUYDPA-FXQIFTODSA-N 0.000 description 2
- HKRYNJSKVLZIFP-IHRRRGAJSA-N Met-Asn-Tyr Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O HKRYNJSKVLZIFP-IHRRRGAJSA-N 0.000 description 2
- MCNGIXXCMJAURZ-VEVYYDQMSA-N Met-Asp-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCSC)N)O MCNGIXXCMJAURZ-VEVYYDQMSA-N 0.000 description 2
- OFNCSQNBSWGGNV-DCAQKATOSA-N Met-Cys-His Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 OFNCSQNBSWGGNV-DCAQKATOSA-N 0.000 description 2
- RMLLCGYYVZKKRT-CIUDSAMLSA-N Met-Ser-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O RMLLCGYYVZKKRT-CIUDSAMLSA-N 0.000 description 2
- DBMLDOWSVHMQQN-XGEHTFHBSA-N Met-Ser-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DBMLDOWSVHMQQN-XGEHTFHBSA-N 0.000 description 2
- KSIPKXNIQOWMIC-RCWTZXSCSA-N Met-Thr-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCCNC(N)=N KSIPKXNIQOWMIC-RCWTZXSCSA-N 0.000 description 2
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 2
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 2
- MDSUKZSLOATHMH-UHFFFAOYSA-N N-L-leucyl-L-valine Natural products CC(C)CC(N)C(=O)NC(C(C)C)C(O)=O MDSUKZSLOATHMH-UHFFFAOYSA-N 0.000 description 2
- 206010028980 Neoplasm Diseases 0.000 description 2
- 108091034117 Oligonucleotide Proteins 0.000 description 2
- 238000002944 PCR assay Methods 0.000 description 2
- MJQFZGOIVBDIMZ-WHOFXGATSA-N Phe-Ile-Gly Chemical compound N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(=O)O MJQFZGOIVBDIMZ-WHOFXGATSA-N 0.000 description 2
- SCKXGHWQPPURGT-KKUMJFAQSA-N Phe-Lys-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O SCKXGHWQPPURGT-KKUMJFAQSA-N 0.000 description 2
- CQZNGNCAIXMAIQ-UBHSHLNASA-N Pro-Ala-Phe Chemical compound C[C@H](NC(=O)[C@@H]1CCCN1)C(=O)N[C@@H](Cc1ccccc1)C(O)=O CQZNGNCAIXMAIQ-UBHSHLNASA-N 0.000 description 2
- WVOXLKUUVCCCSU-ZPFDUUQYSA-N Pro-Glu-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WVOXLKUUVCCCSU-ZPFDUUQYSA-N 0.000 description 2
- VOZIBWWZSBIXQN-SRVKXCTJSA-N Pro-Glu-Lys Chemical compound NCCCC[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1)C(O)=O VOZIBWWZSBIXQN-SRVKXCTJSA-N 0.000 description 2
- LTFSLKWFMWZEBD-IMJSIDKUSA-N Ser-Asn Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O LTFSLKWFMWZEBD-IMJSIDKUSA-N 0.000 description 2
- COAHUSQNSVFYBW-FXQIFTODSA-N Ser-Asn-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O COAHUSQNSVFYBW-FXQIFTODSA-N 0.000 description 2
- XXXAXOWMBOKTRN-XPUUQOCRSA-N Ser-Gly-Val Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O XXXAXOWMBOKTRN-XPUUQOCRSA-N 0.000 description 2
- HZNFKPJCGZXKIC-DCAQKATOSA-N Ser-His-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CO)N HZNFKPJCGZXKIC-DCAQKATOSA-N 0.000 description 2
- UTSWGQNAQRIHAI-UNQGMJICSA-N Thr-Arg-Phe Chemical compound NC(N)=NCCC[C@H](NC(=O)[C@@H](N)[C@H](O)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 UTSWGQNAQRIHAI-UNQGMJICSA-N 0.000 description 2
- DXPURPNJDFCKKO-RHYQMDGZSA-N Thr-Lys-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)[C@@H](C)O)C(O)=O DXPURPNJDFCKKO-RHYQMDGZSA-N 0.000 description 2
- NWECYMJLJGCBOD-UNQGMJICSA-N Thr-Phe-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O NWECYMJLJGCBOD-UNQGMJICSA-N 0.000 description 2
- 102000005924 Triose-Phosphate Isomerase Human genes 0.000 description 2
- 108700015934 Triose-phosphate isomerases Proteins 0.000 description 2
- UUBKSZNKJUJQEJ-JRQIVUDYSA-N Tyr-Thr-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N)O UUBKSZNKJUJQEJ-JRQIVUDYSA-N 0.000 description 2
- 241000700605 Viruses Species 0.000 description 2
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 2
- 238000001042 affinity chromatography Methods 0.000 description 2
- 108010017893 alanyl-alanyl-alanine Proteins 0.000 description 2
- 108010087924 alanylproline Proteins 0.000 description 2
- 150000001412 amines Chemical class 0.000 description 2
- 230000003321 amplification Effects 0.000 description 2
- 108010084758 arginyl-tyrosyl-aspartic acid Proteins 0.000 description 2
- 108010077245 asparaginyl-proline Proteins 0.000 description 2
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 2
- 108010038633 aspartylglutamate Proteins 0.000 description 2
- 230000008901 benefit Effects 0.000 description 2
- 210000004369 blood Anatomy 0.000 description 2
- 239000008280 blood Substances 0.000 description 2
- 238000009640 blood culture Methods 0.000 description 2
- 201000011510 cancer Diseases 0.000 description 2
- 238000010835 comparative analysis Methods 0.000 description 2
- 230000008878 coupling Effects 0.000 description 2
- 238000010168 coupling process Methods 0.000 description 2
- 238000005859 coupling reaction Methods 0.000 description 2
- 230000009089 cytolysis Effects 0.000 description 2
- 238000012217 deletion Methods 0.000 description 2
- 230000037430 deletion Effects 0.000 description 2
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 2
- 230000000369 enteropathogenic effect Effects 0.000 description 2
- 238000000855 fermentation Methods 0.000 description 2
- 230000004151 fermentation Effects 0.000 description 2
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 2
- 210000001035 gastrointestinal tract Anatomy 0.000 description 2
- 108010049041 glutamylalanine Proteins 0.000 description 2
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 2
- 108010089804 glycyl-threonine Proteins 0.000 description 2
- 108010050848 glycylleucine Proteins 0.000 description 2
- 238000004128 high performance liquid chromatography Methods 0.000 description 2
- 210000004408 hybridoma Anatomy 0.000 description 2
- 238000001114 immunoprecipitation Methods 0.000 description 2
- 238000001727 in vivo Methods 0.000 description 2
- 150000002500 ions Chemical class 0.000 description 2
- 108010057821 leucylproline Proteins 0.000 description 2
- 108010025153 lysyl-alanyl-alanine Proteins 0.000 description 2
- 210000004962 mammalian cell Anatomy 0.000 description 2
- 108020004999 messenger RNA Proteins 0.000 description 2
- 239000000203 mixture Substances 0.000 description 2
- 229930027945 nicotinamide-adenine dinucleotide Natural products 0.000 description 2
- BOPGDPNILDQYTO-NNYOXOHSSA-N nicotinamide-adenine dinucleotide Chemical compound C1=CCC(C(=O)N)=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OC[C@@H]2[C@H]([C@@H](O)[C@@H](O2)N2C3=NC=NC(N)=C3N=C2)O)O1 BOPGDPNILDQYTO-NNYOXOHSSA-N 0.000 description 2
- 238000003199 nucleic acid amplification method Methods 0.000 description 2
- 210000000056 organ Anatomy 0.000 description 2
- 244000052769 pathogen Species 0.000 description 2
- 230000008506 pathogenesis Effects 0.000 description 2
- 108010082795 phenylalanyl-arginyl-arginine Proteins 0.000 description 2
- 108010073025 phenylalanylphenylalanine Proteins 0.000 description 2
- 108010051242 phenylalanylserine Proteins 0.000 description 2
- 230000026731 phosphorylation Effects 0.000 description 2
- 238000006366 phosphorylation reaction Methods 0.000 description 2
- 230000008569 process Effects 0.000 description 2
- 108010070643 prolylglutamic acid Proteins 0.000 description 2
- 108010090894 prolylleucine Proteins 0.000 description 2
- 230000001105 regulatory effect Effects 0.000 description 2
- 108091008146 restriction endonucleases Proteins 0.000 description 2
- 230000035945 sensitivity Effects 0.000 description 2
- 108010071207 serylmethionine Proteins 0.000 description 2
- 241000894007 species Species 0.000 description 2
- 238000002560 therapeutic procedure Methods 0.000 description 2
- 108010061238 threonyl-glycine Proteins 0.000 description 2
- 238000012546 transfer Methods 0.000 description 2
- 230000009261 transgenic effect Effects 0.000 description 2
- 238000013519 translation Methods 0.000 description 2
- 241000701161 unidentified adenovirus Species 0.000 description 2
- 241001430294 unidentified retrovirus Species 0.000 description 2
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 2
- 238000001262 western blot Methods 0.000 description 2
- AUXMWYRZQPIXCC-KNIFDHDWSA-N (2s)-2-amino-4-methylpentanoic acid;(2s)-2-aminopropanoic acid Chemical compound C[C@H](N)C(O)=O.CC(C)C[C@H](N)C(O)=O AUXMWYRZQPIXCC-KNIFDHDWSA-N 0.000 description 1
- RHMALYOXPBRJBG-WXHCCQJTSA-N (2s)-6-amino-2-[[(2s)-2-[[(2s)-2-[[(2s)-2-[[(2s)-6-amino-2-[[(2s)-2-[[(2s)-2-[[2-[[(2s,3r)-2-[[(2s)-2-[[2-[[2-[[(2r)-2-amino-3-phenylpropanoyl]amino]acetyl]amino]acetyl]amino]-3-phenylpropanoyl]amino]-3-hydroxybutanoyl]amino]acetyl]amino]propanoyl]amino]- Chemical compound C([C@@H](C(=O)N[C@@H]([C@H](O)C)C(=O)NCC(=O)N[C@@H](C)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(N)=O)NC(=O)CNC(=O)CNC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 RHMALYOXPBRJBG-WXHCCQJTSA-N 0.000 description 1
- LXJXRIRHZLFYRP-VKHMYHEASA-L (R)-2-Hydroxy-3-(phosphonooxy)-propanal Natural products O=C[C@H](O)COP([O-])([O-])=O LXJXRIRHZLFYRP-VKHMYHEASA-L 0.000 description 1
- ZPLCXHWYPWVJDL-UHFFFAOYSA-N 4-[(4-hydroxyphenyl)methyl]-1,3-oxazolidin-2-one Chemical compound C1=CC(O)=CC=C1CC1NC(=O)OC1 ZPLCXHWYPWVJDL-UHFFFAOYSA-N 0.000 description 1
- 101710102786 ATP-dependent leucine adenylase Proteins 0.000 description 1
- 101000768957 Acholeplasma phage L2 Uncharacterized 37.2 kDa protein Proteins 0.000 description 1
- 101000823746 Acidianus ambivalens Uncharacterized 17.7 kDa protein in bps2 3'region Proteins 0.000 description 1
- 101000916369 Acidianus ambivalens Uncharacterized protein in sor 5'region Proteins 0.000 description 1
- 101000769342 Acinetobacter guillouiae Uncharacterized protein in rpoN-murA intergenic region Proteins 0.000 description 1
- 101000823696 Actinobacillus pleuropneumoniae Uncharacterized glycosyltransferase in aroQ 3'region Proteins 0.000 description 1
- 241000589155 Agrobacterium tumefaciens Species 0.000 description 1
- 101000786513 Agrobacterium tumefaciens (strain 15955) Uncharacterized protein outside the virF region Proteins 0.000 description 1
- HHGYNJRJIINWAK-FXQIFTODSA-N Ala-Ala-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N HHGYNJRJIINWAK-FXQIFTODSA-N 0.000 description 1
- AAQGRPOPTAUUBM-ZLUOBGJFSA-N Ala-Ala-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O AAQGRPOPTAUUBM-ZLUOBGJFSA-N 0.000 description 1
- YLTKNGYYPIWKHZ-ACZMJKKPSA-N Ala-Ala-Glu Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O YLTKNGYYPIWKHZ-ACZMJKKPSA-N 0.000 description 1
- VBDMWOKJZDCFJM-FXQIFTODSA-N Ala-Ala-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@H](C)N VBDMWOKJZDCFJM-FXQIFTODSA-N 0.000 description 1
- FSBCNCKIQZZASN-GUBZILKMSA-N Ala-Arg-Met Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(O)=O FSBCNCKIQZZASN-GUBZILKMSA-N 0.000 description 1
- LSLIRHLIUDVNBN-CIUDSAMLSA-N Ala-Asp-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN LSLIRHLIUDVNBN-CIUDSAMLSA-N 0.000 description 1
- IYCZBJXFSZSHPN-DLOVCJGASA-N Ala-Cys-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O IYCZBJXFSZSHPN-DLOVCJGASA-N 0.000 description 1
- SFNFGFDRYJKZKN-XQXXSGGOSA-N Ala-Gln-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](C)N)O SFNFGFDRYJKZKN-XQXXSGGOSA-N 0.000 description 1
- ZVFVBBGVOILKPO-WHFBIAKZSA-N Ala-Gly-Ala Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O ZVFVBBGVOILKPO-WHFBIAKZSA-N 0.000 description 1
- HCZXHQADHZIEJD-CIUDSAMLSA-N Ala-Leu-Ala Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O HCZXHQADHZIEJD-CIUDSAMLSA-N 0.000 description 1
- UWIQWPWWZUHBAO-ZLIFDBKOSA-N Ala-Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@H](C)N)CC(C)C)C(O)=O)=CNC2=C1 UWIQWPWWZUHBAO-ZLIFDBKOSA-N 0.000 description 1
- OMNVYXHOSHNURL-WPRPVWTQSA-N Ala-Phe Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OMNVYXHOSHNURL-WPRPVWTQSA-N 0.000 description 1
- CJQAEJMHBAOQHA-DLOVCJGASA-N Ala-Phe-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N CJQAEJMHBAOQHA-DLOVCJGASA-N 0.000 description 1
- WEZNQZHACPSMEF-QEJZJMRPSA-N Ala-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 WEZNQZHACPSMEF-QEJZJMRPSA-N 0.000 description 1
- WPWUFUBLGADILS-WDSKDSINSA-N Ala-Pro Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O WPWUFUBLGADILS-WDSKDSINSA-N 0.000 description 1
- WQLDNOCHHRISMS-NAKRPEOUSA-N Ala-Pro-Ile Chemical compound [H]N[C@@H](C)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WQLDNOCHHRISMS-NAKRPEOUSA-N 0.000 description 1
- DCVYRWFAMZFSDA-ZLUOBGJFSA-N Ala-Ser-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O DCVYRWFAMZFSDA-ZLUOBGJFSA-N 0.000 description 1
- KLALXKYLOMZDQT-ZLUOBGJFSA-N Ala-Ser-Asn Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(N)=O KLALXKYLOMZDQT-ZLUOBGJFSA-N 0.000 description 1
- RTZCUEHYUQZIDE-WHFBIAKZSA-N Ala-Ser-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RTZCUEHYUQZIDE-WHFBIAKZSA-N 0.000 description 1
- QKHWNPQNOHEFST-VZFHVOOUSA-N Ala-Thr-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](C)N)O QKHWNPQNOHEFST-VZFHVOOUSA-N 0.000 description 1
- JJHBEVZAZXZREW-LFSVMHDDSA-N Ala-Thr-Phe Chemical compound C[C@@H](O)[C@H](NC(=O)[C@H](C)N)C(=O)N[C@@H](Cc1ccccc1)C(O)=O JJHBEVZAZXZREW-LFSVMHDDSA-N 0.000 description 1
- CREYEAPXISDKSB-FQPOAREZSA-N Ala-Thr-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CREYEAPXISDKSB-FQPOAREZSA-N 0.000 description 1
- LIWMQSWFLXEGMA-WDSKDSINSA-N Ala-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C)N LIWMQSWFLXEGMA-WDSKDSINSA-N 0.000 description 1
- BVLPIIBTWIYOML-ZKWXMUAHSA-N Ala-Val-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O BVLPIIBTWIYOML-ZKWXMUAHSA-N 0.000 description 1
- RFJNDTQGEJRBHO-DCAQKATOSA-N Ala-Val-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C)[NH3+] RFJNDTQGEJRBHO-DCAQKATOSA-N 0.000 description 1
- SSQHYGLFYWZWDV-UVBJJODRSA-N Ala-Val-Trp Chemical compound CC(C)[C@H](NC(=O)[C@H](C)N)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(O)=O SSQHYGLFYWZWDV-UVBJJODRSA-N 0.000 description 1
- 101000618005 Alkalihalobacillus pseudofirmus (strain ATCC BAA-2126 / JCM 17055 / OF4) Uncharacterized protein BpOF4_00885 Proteins 0.000 description 1
- 241000710929 Alphavirus Species 0.000 description 1
- 102100020724 Ankyrin repeat, SAM and basic leucine zipper domain-containing protein 1 Human genes 0.000 description 1
- BIOCIVSVEDFKDJ-GUBZILKMSA-N Arg-Arg-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O BIOCIVSVEDFKDJ-GUBZILKMSA-N 0.000 description 1
- NONSEUUPKITYQT-BQBZGAKWSA-N Arg-Asn-Gly Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)NCC(=O)O)N)CN=C(N)N NONSEUUPKITYQT-BQBZGAKWSA-N 0.000 description 1
- GHNDBBVSWOWYII-LPEHRKFASA-N Arg-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O GHNDBBVSWOWYII-LPEHRKFASA-N 0.000 description 1
- UFBURHXMKFQVLM-CIUDSAMLSA-N Arg-Glu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O UFBURHXMKFQVLM-CIUDSAMLSA-N 0.000 description 1
- NVUIWHJLPSZZQC-CYDGBPFRSA-N Arg-Ile-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O NVUIWHJLPSZZQC-CYDGBPFRSA-N 0.000 description 1
- FLYANDHDFRGGTM-PYJNHQTQSA-N Arg-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N FLYANDHDFRGGTM-PYJNHQTQSA-N 0.000 description 1
- LVMUGODRNHFGRA-AVGNSLFASA-N Arg-Leu-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O LVMUGODRNHFGRA-AVGNSLFASA-N 0.000 description 1
- UHFUZWSZQKMDSX-DCAQKATOSA-N Arg-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UHFUZWSZQKMDSX-DCAQKATOSA-N 0.000 description 1
- INXWADWANGLMPJ-JYJNAYRXSA-N Arg-Phe-Arg Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCNC(N)=N)C(O)=O)CC1=CC=CC=C1 INXWADWANGLMPJ-JYJNAYRXSA-N 0.000 description 1
- YTMKMRSYXHBGER-IHRRRGAJSA-N Arg-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N YTMKMRSYXHBGER-IHRRRGAJSA-N 0.000 description 1
- ASQKVGRCKOFKIU-KZVJFYERSA-N Arg-Thr-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O ASQKVGRCKOFKIU-KZVJFYERSA-N 0.000 description 1
- POZKLUIXMHIULG-FDARSICLSA-N Arg-Trp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@H](CCCN=C(N)N)N POZKLUIXMHIULG-FDARSICLSA-N 0.000 description 1
- AOJYORNRFWWEIV-IHRRRGAJSA-N Arg-Tyr-Asp Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=C(O)C=C1 AOJYORNRFWWEIV-IHRRRGAJSA-N 0.000 description 1
- FXGMURPOWCKNAZ-JYJNAYRXSA-N Arg-Val-Phe Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 FXGMURPOWCKNAZ-JYJNAYRXSA-N 0.000 description 1
- 239000004475 Arginine Substances 0.000 description 1
- SWLOHUMCUDRTCL-ZLUOBGJFSA-N Asn-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N SWLOHUMCUDRTCL-ZLUOBGJFSA-N 0.000 description 1
- RJUHZPRQRQLCFL-IMJSIDKUSA-N Asn-Asn Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(O)=O RJUHZPRQRQLCFL-IMJSIDKUSA-N 0.000 description 1
- HZYFHQOWCFUSOV-IMJSIDKUSA-N Asn-Asp Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O HZYFHQOWCFUSOV-IMJSIDKUSA-N 0.000 description 1
- XVAPVJNJGLWGCS-ACZMJKKPSA-N Asn-Glu-Asn Chemical compound C(CC(=O)O)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N XVAPVJNJGLWGCS-ACZMJKKPSA-N 0.000 description 1
- GFFRWIJAFFMQGM-NUMRIWBASA-N Asn-Glu-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GFFRWIJAFFMQGM-NUMRIWBASA-N 0.000 description 1
- KLKHFFMNGWULBN-VKHMYHEASA-N Asn-Gly Chemical compound NC(=O)C[C@H](N)C(=O)NCC(O)=O KLKHFFMNGWULBN-VKHMYHEASA-N 0.000 description 1
- RAQMSGVCGSJKCL-FOHZUACHSA-N Asn-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(N)=O RAQMSGVCGSJKCL-FOHZUACHSA-N 0.000 description 1
- SEKBHZJLARBNPB-GHCJXIJMSA-N Asn-Ile-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O SEKBHZJLARBNPB-GHCJXIJMSA-N 0.000 description 1
- HXWUJJADFMXNKA-BQBZGAKWSA-N Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(N)=O HXWUJJADFMXNKA-BQBZGAKWSA-N 0.000 description 1
- GIQCDTKOIPUDSG-GARJFASQSA-N Asn-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)N)N)C(=O)O GIQCDTKOIPUDSG-GARJFASQSA-N 0.000 description 1
- GADKFYNESXNRLC-WDSKDSINSA-N Asn-Pro Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O GADKFYNESXNRLC-WDSKDSINSA-N 0.000 description 1
- IDUUACUJKUXKKD-VEVYYDQMSA-N Asn-Pro-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O IDUUACUJKUXKKD-VEVYYDQMSA-N 0.000 description 1
- SONUFGRSSMFHFN-IMJSIDKUSA-N Asn-Ser Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O SONUFGRSSMFHFN-IMJSIDKUSA-N 0.000 description 1
- VBKIFHUVGLOJKT-FKZODXBYSA-N Asn-Thr Chemical compound C[C@@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)N)O VBKIFHUVGLOJKT-FKZODXBYSA-N 0.000 description 1
- DVUFTQLHHHJEMK-IMJSIDKUSA-N Asp-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CC(O)=O DVUFTQLHHHJEMK-IMJSIDKUSA-N 0.000 description 1
- DBWYWXNMZZYIRY-LPEHRKFASA-N Asp-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)O)N)C(=O)O DBWYWXNMZZYIRY-LPEHRKFASA-N 0.000 description 1
- VGRHZPNRCLAHQA-IMJSIDKUSA-N Asp-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(O)=O VGRHZPNRCLAHQA-IMJSIDKUSA-N 0.000 description 1
- VAWNQIGQPUOPQW-ACZMJKKPSA-N Asp-Glu-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O VAWNQIGQPUOPQW-ACZMJKKPSA-N 0.000 description 1
- HSPSXROIMXIJQW-BQBZGAKWSA-N Asp-His Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 HSPSXROIMXIJQW-BQBZGAKWSA-N 0.000 description 1
- JNNVNVRBYUJYGS-CIUDSAMLSA-N Asp-Leu-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O JNNVNVRBYUJYGS-CIUDSAMLSA-N 0.000 description 1
- RQHLMGCXCZUOGT-ZPFDUUQYSA-N Asp-Leu-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O RQHLMGCXCZUOGT-ZPFDUUQYSA-N 0.000 description 1
- HXVILZUZXFLVEN-DCAQKATOSA-N Asp-Met-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O HXVILZUZXFLVEN-DCAQKATOSA-N 0.000 description 1
- YZQCXOFQZKCETR-UWVGGRQHSA-N Asp-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 YZQCXOFQZKCETR-UWVGGRQHSA-N 0.000 description 1
- USNJAPJZSGTTPX-XVSYOHENSA-N Asp-Phe-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O USNJAPJZSGTTPX-XVSYOHENSA-N 0.000 description 1
- BRRPVTUFESPTCP-ACZMJKKPSA-N Asp-Ser-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O BRRPVTUFESPTCP-ACZMJKKPSA-N 0.000 description 1
- XAPPCWUWHNWCPQ-PBCZWWQYSA-N Asp-Thr-His Chemical compound N[C@@H](CC(=O)O)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)O XAPPCWUWHNWCPQ-PBCZWWQYSA-N 0.000 description 1
- JDDYEZGPYBBPBN-JRQIVUDYSA-N Asp-Thr-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JDDYEZGPYBBPBN-JRQIVUDYSA-N 0.000 description 1
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 1
- 101000967489 Azorhizobium caulinodans (strain ATCC 43989 / DSM 5975 / JCM 20966 / LMG 6465 / NBRC 14845 / NCIMB 13405 / ORS 571) Uncharacterized protein AZC_3924 Proteins 0.000 description 1
- 101000823761 Bacillus licheniformis Uncharacterized 9.4 kDa protein in flaL 3'region Proteins 0.000 description 1
- 101000819719 Bacillus methanolicus Uncharacterized N-acetyltransferase in lysA 3'region Proteins 0.000 description 1
- 101000765604 Bacillus subtilis (strain 168) FlaA locus 22.9 kDa protein Proteins 0.000 description 1
- 101000789586 Bacillus subtilis (strain 168) UPF0702 transmembrane protein YkjA Proteins 0.000 description 1
- 101000792624 Bacillus subtilis (strain 168) Uncharacterized protein YbxH Proteins 0.000 description 1
- 101000790792 Bacillus subtilis (strain 168) Uncharacterized protein YckC Proteins 0.000 description 1
- 101000819705 Bacillus subtilis (strain 168) Uncharacterized protein YlxR Proteins 0.000 description 1
- 101000948218 Bacillus subtilis (strain 168) Uncharacterized protein YtxJ Proteins 0.000 description 1
- 101000718627 Bacillus thuringiensis subsp. kurstaki Putative RNA polymerase sigma-G factor Proteins 0.000 description 1
- 208000031729 Bacteremia Diseases 0.000 description 1
- 208000035143 Bacterial infection Diseases 0.000 description 1
- 101000641200 Bombyx mori densovirus Putative non-structural protein Proteins 0.000 description 1
- 101100380241 Caenorhabditis elegans arx-2 gene Proteins 0.000 description 1
- 101000964402 Caldicellulosiruptor saccharolyticus Uncharacterized protein in xynC 3'region Proteins 0.000 description 1
- 241000588919 Citrobacter freundii Species 0.000 description 1
- 101000947633 Claviceps purpurea Uncharacterized 13.8 kDa protein Proteins 0.000 description 1
- 108091026890 Coding region Proteins 0.000 description 1
- 108091033380 Coding strand Proteins 0.000 description 1
- 102100031725 Cortactin-binding protein 2 Human genes 0.000 description 1
- 208000011231 Crohn disease Diseases 0.000 description 1
- TVYMKYUSZSVOAG-ZLUOBGJFSA-N Cys-Ala-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O TVYMKYUSZSVOAG-ZLUOBGJFSA-N 0.000 description 1
- SQJSYLDKQBZQTG-FXQIFTODSA-N Cys-Asn-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CS)N SQJSYLDKQBZQTG-FXQIFTODSA-N 0.000 description 1
- DZLQXIFVQFTFJY-BYPYZUCNSA-N Cys-Gly-Gly Chemical compound SC[C@H](N)C(=O)NCC(=O)NCC(O)=O DZLQXIFVQFTFJY-BYPYZUCNSA-N 0.000 description 1
- ZXCAQANTQWBICD-DCAQKATOSA-N Cys-Lys-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CS)N ZXCAQANTQWBICD-DCAQKATOSA-N 0.000 description 1
- WYVKPHCYMTWUCW-YUPRTTJUSA-N Cys-Thr Chemical compound C[C@@H]([C@@H](C(=O)O)NC(=O)[C@H](CS)N)O WYVKPHCYMTWUCW-YUPRTTJUSA-N 0.000 description 1
- YFKWIIRWHGKSQQ-WFBYXXMGSA-N Cys-Trp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@H](CS)N YFKWIIRWHGKSQQ-WFBYXXMGSA-N 0.000 description 1
- LXJXRIRHZLFYRP-VKHMYHEASA-N D-glyceraldehyde 3-phosphate Chemical compound O=C[C@H](O)COP(O)(O)=O LXJXRIRHZLFYRP-VKHMYHEASA-N 0.000 description 1
- HMFHBZSHGGEWLO-SOOFDHNKSA-N D-ribofuranose Chemical compound OC[C@H]1OC(O)[C@H](O)[C@@H]1O HMFHBZSHGGEWLO-SOOFDHNKSA-N 0.000 description 1
- 239000003298 DNA probe Substances 0.000 description 1
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 1
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 1
- 241000702421 Dependoparvovirus Species 0.000 description 1
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 1
- 241000196324 Embryophyta Species 0.000 description 1
- 101000948901 Enterobacteria phage T4 Uncharacterized 16.0 kDa protein in segB-ipI intergenic region Proteins 0.000 description 1
- 101000805958 Equine herpesvirus 4 (strain 1942) Virion protein US10 homolog Proteins 0.000 description 1
- 241001618318 Escherichia coli 55989 Species 0.000 description 1
- 101000790442 Escherichia coli Insertion element IS2 uncharacterized 11.1 kDa protein Proteins 0.000 description 1
- 241000617590 Escherichia coli K1 Species 0.000 description 1
- 241001646719 Escherichia coli O157:H7 Species 0.000 description 1
- 241000660147 Escherichia coli str. K-12 substr. MG1655 Species 0.000 description 1
- 101000788354 Escherichia phage P2 Uncharacterized 8.2 kDa protein in gpA 5'region Proteins 0.000 description 1
- 108700039887 Essential Genes Proteins 0.000 description 1
- 108091029865 Exogenous DNA Proteins 0.000 description 1
- 108060002716 Exonuclease Proteins 0.000 description 1
- 108091060211 Expressed sequence tag Proteins 0.000 description 1
- 101000770304 Frankia alni UPF0460 protein in nifX-nifW intergenic region Proteins 0.000 description 1
- 241000233866 Fungi Species 0.000 description 1
- 208000005577 Gastroenteritis Diseases 0.000 description 1
- 101000797344 Geobacillus stearothermophilus Putative tRNA (cytidine(34)-2'-O)-methyltransferase Proteins 0.000 description 1
- 101000748410 Geobacillus stearothermophilus Uncharacterized protein in fumA 3'region Proteins 0.000 description 1
- FAQVCWVVIYYWRR-WHFBIAKZSA-N Gln-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O FAQVCWVVIYYWRR-WHFBIAKZSA-N 0.000 description 1
- JSYULGSPLTZDHM-NRPADANISA-N Gln-Ala-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O JSYULGSPLTZDHM-NRPADANISA-N 0.000 description 1
- OPINTGHFESTVAX-BQBZGAKWSA-N Gln-Arg Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@H](C(O)=O)CCCN=C(N)N OPINTGHFESTVAX-BQBZGAKWSA-N 0.000 description 1
- MWLYSLMKFXWZPW-ZPFDUUQYSA-N Gln-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CCC(N)=O MWLYSLMKFXWZPW-ZPFDUUQYSA-N 0.000 description 1
- WMOMPXKOKASNBK-PEFMBERDSA-N Gln-Asn-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WMOMPXKOKASNBK-PEFMBERDSA-N 0.000 description 1
- LOJYQMFIIJVETK-WDSKDSINSA-N Gln-Gln Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(O)=O LOJYQMFIIJVETK-WDSKDSINSA-N 0.000 description 1
- YXQCLIVLWCKCRS-RYUDHWBXSA-N Gln-Gly-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)N)N)O YXQCLIVLWCKCRS-RYUDHWBXSA-N 0.000 description 1
- UKKNTTCNGZLJEX-WHFBIAKZSA-N Gln-Ser Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CO)C(O)=O UKKNTTCNGZLJEX-WHFBIAKZSA-N 0.000 description 1
- KUBFPYIMAGXGBT-ACZMJKKPSA-N Gln-Ser-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O KUBFPYIMAGXGBT-ACZMJKKPSA-N 0.000 description 1
- RONJIBWTGKVKFY-HTUGSXCWSA-N Gln-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O RONJIBWTGKVKFY-HTUGSXCWSA-N 0.000 description 1
- IIMZHVKZBGSEKZ-SZMVWBNQSA-N Gln-Trp-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(C)C)C(O)=O IIMZHVKZBGSEKZ-SZMVWBNQSA-N 0.000 description 1
- JZDHUJAFXGNDSB-WHFBIAKZSA-N Glu-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O JZDHUJAFXGNDSB-WHFBIAKZSA-N 0.000 description 1
- RSUVOPBMWMTVDI-XEGUGMAKSA-N Glu-Ala-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CCC(O)=O)C)C(O)=O)=CNC2=C1 RSUVOPBMWMTVDI-XEGUGMAKSA-N 0.000 description 1
- FYYSIASRLDJUNP-WHFBIAKZSA-N Glu-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O FYYSIASRLDJUNP-WHFBIAKZSA-N 0.000 description 1
- XXCDTYBVGMPIOA-FXQIFTODSA-N Glu-Asp-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O XXCDTYBVGMPIOA-FXQIFTODSA-N 0.000 description 1
- SBCYJMOOHUDWDA-NUMRIWBASA-N Glu-Asp-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SBCYJMOOHUDWDA-NUMRIWBASA-N 0.000 description 1
- UMIRPYLZFKOEOH-YVNDNENWSA-N Glu-Gln-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UMIRPYLZFKOEOH-YVNDNENWSA-N 0.000 description 1
- OGNJZUXUTPQVBR-BQBZGAKWSA-N Glu-Gly-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O OGNJZUXUTPQVBR-BQBZGAKWSA-N 0.000 description 1
- LRPXYSGPOBVBEH-IUCAKERBSA-N Glu-Gly-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O LRPXYSGPOBVBEH-IUCAKERBSA-N 0.000 description 1
- BBBXWRGITSUJPB-YUMQZZPRSA-N Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O BBBXWRGITSUJPB-YUMQZZPRSA-N 0.000 description 1
- SUIAHERNFYRBDZ-GVXVVHGQSA-N Glu-Lys-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O SUIAHERNFYRBDZ-GVXVVHGQSA-N 0.000 description 1
- SXGAGTVDWKQYCX-BQBZGAKWSA-N Glu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O SXGAGTVDWKQYCX-BQBZGAKWSA-N 0.000 description 1
- IDEODOAVGCMUQV-GUBZILKMSA-N Glu-Ser-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O IDEODOAVGCMUQV-GUBZILKMSA-N 0.000 description 1
- TZXOPHFCAATANZ-QEJZJMRPSA-N Glu-Ser-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)O)N TZXOPHFCAATANZ-QEJZJMRPSA-N 0.000 description 1
- DMYACXMQUABZIQ-NRPADANISA-N Glu-Ser-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O DMYACXMQUABZIQ-NRPADANISA-N 0.000 description 1
- DLISPGXMKZTWQG-IFFSRLJSSA-N Glu-Thr-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O DLISPGXMKZTWQG-IFFSRLJSSA-N 0.000 description 1
- SITLTJHOQZFJGG-XPUUQOCRSA-N Glu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O SITLTJHOQZFJGG-XPUUQOCRSA-N 0.000 description 1
- WGYHAAXZWPEBDQ-IFFSRLJSSA-N Glu-Val-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WGYHAAXZWPEBDQ-IFFSRLJSSA-N 0.000 description 1
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 1
- PYTZFYUXZZHOAD-WHFBIAKZSA-N Gly-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)CN PYTZFYUXZZHOAD-WHFBIAKZSA-N 0.000 description 1
- MFBYPDKTAJXHNI-VKHMYHEASA-N Gly-Cys Chemical compound [NH3+]CC(=O)N[C@@H](CS)C([O-])=O MFBYPDKTAJXHNI-VKHMYHEASA-N 0.000 description 1
- MOJKRXIRAZPZLW-WDSKDSINSA-N Gly-Glu-Ala Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O MOJKRXIRAZPZLW-WDSKDSINSA-N 0.000 description 1
- MBOAPAXLTUSMQI-JHEQGTHGSA-N Gly-Glu-Thr Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MBOAPAXLTUSMQI-JHEQGTHGSA-N 0.000 description 1
- DGKBSGNCMCLDSL-BYULHYEWSA-N Gly-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)CN DGKBSGNCMCLDSL-BYULHYEWSA-N 0.000 description 1
- LUJVWKKYHSLULQ-ZKWXMUAHSA-N Gly-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)CN LUJVWKKYHSLULQ-ZKWXMUAHSA-N 0.000 description 1
- DKEXFJVMVGETOO-LURJTMIESA-N Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CN DKEXFJVMVGETOO-LURJTMIESA-N 0.000 description 1
- CVFOYJJOZYYEPE-KBPBESRZSA-N Gly-Lys-Tyr Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CVFOYJJOZYYEPE-KBPBESRZSA-N 0.000 description 1
- IFHJOBKVXBESRE-YUMQZZPRSA-N Gly-Met-Gln Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)CN IFHJOBKVXBESRE-YUMQZZPRSA-N 0.000 description 1
- FJWSJWACLMTDMI-WPRPVWTQSA-N Gly-Met-Val Chemical compound [H]NCC(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(O)=O FJWSJWACLMTDMI-WPRPVWTQSA-N 0.000 description 1
- QSQXZZCGPXQBPP-BQBZGAKWSA-N Gly-Pro-Cys Chemical compound C1C[C@H](N(C1)C(=O)CN)C(=O)N[C@@H](CS)C(=O)O QSQXZZCGPXQBPP-BQBZGAKWSA-N 0.000 description 1
- XBGGUPMXALFZOT-VIFPVBQESA-N Gly-Tyr Chemical compound NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-VIFPVBQESA-N 0.000 description 1
- KOYUSMBPJOVSOO-XEGUGMAKSA-N Gly-Tyr-Ile Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KOYUSMBPJOVSOO-XEGUGMAKSA-N 0.000 description 1
- GWCJMBNBFYBQCV-XPUUQOCRSA-N Gly-Val-Ala Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O GWCJMBNBFYBQCV-XPUUQOCRSA-N 0.000 description 1
- 239000004471 Glycine Substances 0.000 description 1
- VPZXBVLAVMBEQI-VKHMYHEASA-N Glycyl-alanine Chemical compound OC(=O)[C@H](C)NC(=O)CN VPZXBVLAVMBEQI-VKHMYHEASA-N 0.000 description 1
- 101000772675 Haemophilus influenzae (strain ATCC 51907 / DSM 11121 / KW20 / Rd) UPF0438 protein HI_0847 Proteins 0.000 description 1
- 101000631019 Haemophilus influenzae (strain ATCC 51907 / DSM 11121 / KW20 / Rd) Uncharacterized protein HI_0350 Proteins 0.000 description 1
- 101000768938 Haemophilus phage HP1 (strain HP1c1) Uncharacterized 8.9 kDa protein in int-C1 intergenic region Proteins 0.000 description 1
- 241001523162 Helle Species 0.000 description 1
- SVHKVHBPTOMLTO-DCAQKATOSA-N His-Arg-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O SVHKVHBPTOMLTO-DCAQKATOSA-N 0.000 description 1
- XJQDHFMUUBRCGA-KKUMJFAQSA-N His-Asn-Phe Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O XJQDHFMUUBRCGA-KKUMJFAQSA-N 0.000 description 1
- JFFAPRNXXLRINI-NHCYSSNCSA-N His-Asp-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O JFFAPRNXXLRINI-NHCYSSNCSA-N 0.000 description 1
- IDQKGZWUPVOGPZ-GUBZILKMSA-N His-Cys-Gln Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N IDQKGZWUPVOGPZ-GUBZILKMSA-N 0.000 description 1
- YAALVYQFVJNXIV-KKUMJFAQSA-N His-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CN=CN1 YAALVYQFVJNXIV-KKUMJFAQSA-N 0.000 description 1
- BSVLMPMIXPQNKC-KBPBESRZSA-N His-Phe-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(O)=O BSVLMPMIXPQNKC-KBPBESRZSA-N 0.000 description 1
- HTOOKGDPMXSJSY-STQMWFEESA-N His-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CN=CN1 HTOOKGDPMXSJSY-STQMWFEESA-N 0.000 description 1
- 101000785414 Homo sapiens Ankyrin repeat, SAM and basic leucine zipper domain-containing protein 1 Proteins 0.000 description 1
- 101000666730 Homo sapiens T-complex protein 1 subunit alpha Proteins 0.000 description 1
- OAKJQQAXSVQMHS-UHFFFAOYSA-N Hydrazine Chemical compound NN OAKJQQAXSVQMHS-UHFFFAOYSA-N 0.000 description 1
- BOTVMTSMOUSDRW-GMOBBJLQSA-N Ile-Arg-Asn Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(N)=O)C(O)=O BOTVMTSMOUSDRW-GMOBBJLQSA-N 0.000 description 1
- PNDMHTTXXPUQJH-RWRJDSDZSA-N Ile-Glu-Thr Chemical compound N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H]([C@H](O)C)C(=O)O PNDMHTTXXPUQJH-RWRJDSDZSA-N 0.000 description 1
- GQKSJYINYYWPMR-NGZCFLSTSA-N Ile-Gly-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N1CCC[C@@H]1C(=O)O)N GQKSJYINYYWPMR-NGZCFLSTSA-N 0.000 description 1
- CSQNHSGHAPRGPQ-YTFOTSKYSA-N Ile-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCCN)C(=O)O)N CSQNHSGHAPRGPQ-YTFOTSKYSA-N 0.000 description 1
- KLBVGHCGHUNHEA-BJDJZHNGSA-N Ile-Leu-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)O)N KLBVGHCGHUNHEA-BJDJZHNGSA-N 0.000 description 1
- FCWFBHMAJZGWRY-XUXIUFHCSA-N Ile-Leu-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)O)N FCWFBHMAJZGWRY-XUXIUFHCSA-N 0.000 description 1
- DSDPLOODKXISDT-XUXIUFHCSA-N Ile-Leu-Val Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O DSDPLOODKXISDT-XUXIUFHCSA-N 0.000 description 1
- UWBDLNOCIDGPQE-GUBZILKMSA-N Ile-Lys Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN UWBDLNOCIDGPQE-GUBZILKMSA-N 0.000 description 1
- PARSHQDZROHERM-NHCYSSNCSA-N Ile-Lys-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)NCC(=O)O)N PARSHQDZROHERM-NHCYSSNCSA-N 0.000 description 1
- GVNNAHIRSDRIII-AJNGGQMLSA-N Ile-Lys-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N GVNNAHIRSDRIII-AJNGGQMLSA-N 0.000 description 1
- IMRKCLXPYOIHIF-ZPFDUUQYSA-N Ile-Met-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N IMRKCLXPYOIHIF-ZPFDUUQYSA-N 0.000 description 1
- MSASLZGZQAXVFP-PEDHHIEDSA-N Ile-Met-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N MSASLZGZQAXVFP-PEDHHIEDSA-N 0.000 description 1
- WMDZARSFSMZOQO-DRZSPHRISA-N Ile-Phe Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 WMDZARSFSMZOQO-DRZSPHRISA-N 0.000 description 1
- OWSWUWDMSNXTNE-GMOBBJLQSA-N Ile-Pro-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)O)C(=O)O)N OWSWUWDMSNXTNE-GMOBBJLQSA-N 0.000 description 1
- IVXJIMGDOYRLQU-XUXIUFHCSA-N Ile-Pro-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O IVXJIMGDOYRLQU-XUXIUFHCSA-N 0.000 description 1
- YKZAMJXNJUWFIK-JBDRJPRFSA-N Ile-Ser-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(=O)O)N YKZAMJXNJUWFIK-JBDRJPRFSA-N 0.000 description 1
- XMYURPUVJSKTMC-KBIXCLLPSA-N Ile-Ser-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N XMYURPUVJSKTMC-KBIXCLLPSA-N 0.000 description 1
- QQVXERGIFIRCGW-NAKRPEOUSA-N Ile-Ser-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(=O)O)N QQVXERGIFIRCGW-NAKRPEOUSA-N 0.000 description 1
- JNLSTRPWUXOORL-MMWGEVLESA-N Ile-Ser-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N JNLSTRPWUXOORL-MMWGEVLESA-N 0.000 description 1
- ANTFEOSJMAUGIB-KNZXXDILSA-N Ile-Thr-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@@H]1C(=O)O)N ANTFEOSJMAUGIB-KNZXXDILSA-N 0.000 description 1
- APQYGMBHIVXFML-OSUNSFLBSA-N Ile-Val-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N APQYGMBHIVXFML-OSUNSFLBSA-N 0.000 description 1
- 102000008394 Immunoglobulin Fragments Human genes 0.000 description 1
- 108010021625 Immunoglobulin Fragments Proteins 0.000 description 1
- 108010065920 Insulin Lispro Proteins 0.000 description 1
- 208000036209 Intraabdominal Infections Diseases 0.000 description 1
- 101000782488 Junonia coenia densovirus (isolate pBRJ/1990) Putative non-structural protein NS2 Proteins 0.000 description 1
- 101000811523 Klebsiella pneumoniae Uncharacterized 55.8 kDa protein in cps region Proteins 0.000 description 1
- HGCNKOLVKRAVHD-UHFFFAOYSA-N L-Met-L-Phe Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 HGCNKOLVKRAVHD-UHFFFAOYSA-N 0.000 description 1
- QLROSWPKSBORFJ-BQBZGAKWSA-N L-Prolyl-L-glutamic acid Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 QLROSWPKSBORFJ-BQBZGAKWSA-N 0.000 description 1
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 1
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 1
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 1
- 101000818409 Lactococcus lactis subsp. lactis Uncharacterized HTH-type transcriptional regulator in lacX 3'region Proteins 0.000 description 1
- 241000713666 Lentivirus Species 0.000 description 1
- 101000878851 Leptolyngbya boryana Putative Fe(2+) transport protein A Proteins 0.000 description 1
- LJHGALIOHLRRQN-DCAQKATOSA-N Leu-Ala-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N LJHGALIOHLRRQN-DCAQKATOSA-N 0.000 description 1
- MJOZZTKJZQFKDK-GUBZILKMSA-N Leu-Ala-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(N)=O MJOZZTKJZQFKDK-GUBZILKMSA-N 0.000 description 1
- CQQGCWPXDHTTNF-GUBZILKMSA-N Leu-Ala-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O CQQGCWPXDHTTNF-GUBZILKMSA-N 0.000 description 1
- WSGXUIQTEZDVHJ-GARJFASQSA-N Leu-Ala-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@@H]1C(O)=O WSGXUIQTEZDVHJ-GARJFASQSA-N 0.000 description 1
- OIARJGNVARWKFP-YUMQZZPRSA-N Leu-Asn-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O OIARJGNVARWKFP-YUMQZZPRSA-N 0.000 description 1
- WGNOPSQMIQERPK-GARJFASQSA-N Leu-Asn-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N WGNOPSQMIQERPK-GARJFASQSA-N 0.000 description 1
- OGCQGUIWMSBHRZ-CIUDSAMLSA-N Leu-Asn-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O OGCQGUIWMSBHRZ-CIUDSAMLSA-N 0.000 description 1
- WXHFZJFZWNCDNB-KKUMJFAQSA-N Leu-Asn-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 WXHFZJFZWNCDNB-KKUMJFAQSA-N 0.000 description 1
- BPANDPNDMJHFEV-CIUDSAMLSA-N Leu-Asp-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O BPANDPNDMJHFEV-CIUDSAMLSA-N 0.000 description 1
- PVMPDMIKUVNOBD-CIUDSAMLSA-N Leu-Asp-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O PVMPDMIKUVNOBD-CIUDSAMLSA-N 0.000 description 1
- JYOAXOMPIXKMKK-YUMQZZPRSA-N Leu-Gln Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C([O-])=O)CCC(N)=O JYOAXOMPIXKMKK-YUMQZZPRSA-N 0.000 description 1
- KUEVMUXNILMJTK-JYJNAYRXSA-N Leu-Gln-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 KUEVMUXNILMJTK-JYJNAYRXSA-N 0.000 description 1
- NFNVDJGXRFEYTK-YUMQZZPRSA-N Leu-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O NFNVDJGXRFEYTK-YUMQZZPRSA-N 0.000 description 1
- DZQMXBALGUHGJT-GUBZILKMSA-N Leu-Glu-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O DZQMXBALGUHGJT-GUBZILKMSA-N 0.000 description 1
- HPBCTWSUJOGJSH-MNXVOIDGSA-N Leu-Glu-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HPBCTWSUJOGJSH-MNXVOIDGSA-N 0.000 description 1
- ZFNLIDNJUWNIJL-WDCWCFNPSA-N Leu-Glu-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZFNLIDNJUWNIJL-WDCWCFNPSA-N 0.000 description 1
- HVJVUYQWFYMGJS-GVXVVHGQSA-N Leu-Glu-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O HVJVUYQWFYMGJS-GVXVVHGQSA-N 0.000 description 1
- APFJUBGRZGMQFF-QWRGUYRKSA-N Leu-Gly-Lys Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN APFJUBGRZGMQFF-QWRGUYRKSA-N 0.000 description 1
- XWOBNBRUDDUEEY-UWVGGRQHSA-N Leu-His Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 XWOBNBRUDDUEEY-UWVGGRQHSA-N 0.000 description 1
- BKTXKJMNTSMJDQ-AVGNSLFASA-N Leu-His-Gln Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N BKTXKJMNTSMJDQ-AVGNSLFASA-N 0.000 description 1
- AZLASBBHHSLQDB-GUBZILKMSA-N Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(C)C AZLASBBHHSLQDB-GUBZILKMSA-N 0.000 description 1
- IAJFFZORSWOZPQ-SRVKXCTJSA-N Leu-Leu-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IAJFFZORSWOZPQ-SRVKXCTJSA-N 0.000 description 1
- QNBVTHNJGCOVFA-AVGNSLFASA-N Leu-Leu-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCC(O)=O QNBVTHNJGCOVFA-AVGNSLFASA-N 0.000 description 1
- LXKNSJLSGPNHSK-KKUMJFAQSA-N Leu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N LXKNSJLSGPNHSK-KKUMJFAQSA-N 0.000 description 1
- KPYAOIVPJKPIOU-KKUMJFAQSA-N Leu-Lys-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O KPYAOIVPJKPIOU-KKUMJFAQSA-N 0.000 description 1
- BJWKOATWNQJPSK-SRVKXCTJSA-N Leu-Met-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N BJWKOATWNQJPSK-SRVKXCTJSA-N 0.000 description 1
- ZAVCJRJOQKIOJW-KKUMJFAQSA-N Leu-Phe-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=CC=C1 ZAVCJRJOQKIOJW-KKUMJFAQSA-N 0.000 description 1
- KQFZKDITNUEVFJ-JYJNAYRXSA-N Leu-Phe-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CC=CC=C1 KQFZKDITNUEVFJ-JYJNAYRXSA-N 0.000 description 1
- INCJJHQRZGQLFC-KBPBESRZSA-N Leu-Phe-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(O)=O INCJJHQRZGQLFC-KBPBESRZSA-N 0.000 description 1
- UHNQRAFSEBGZFZ-YESZJQIVSA-N Leu-Phe-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N2CCC[C@@H]2C(=O)O)N UHNQRAFSEBGZFZ-YESZJQIVSA-N 0.000 description 1
- WMIOEVKKYIMVKI-DCAQKATOSA-N Leu-Pro-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O WMIOEVKKYIMVKI-DCAQKATOSA-N 0.000 description 1
- KIZIOFNVSOSKJI-CIUDSAMLSA-N Leu-Ser-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)O)N KIZIOFNVSOSKJI-CIUDSAMLSA-N 0.000 description 1
- AKVBOOKXVAMKSS-GUBZILKMSA-N Leu-Ser-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O AKVBOOKXVAMKSS-GUBZILKMSA-N 0.000 description 1
- SBANPBVRHYIMRR-GARJFASQSA-N Leu-Ser-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N SBANPBVRHYIMRR-GARJFASQSA-N 0.000 description 1
- PPGBXYKMUMHFBF-KATARQTJSA-N Leu-Ser-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PPGBXYKMUMHFBF-KATARQTJSA-N 0.000 description 1
- ZDJQVSIPFLMNOX-RHYQMDGZSA-N Leu-Thr-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N ZDJQVSIPFLMNOX-RHYQMDGZSA-N 0.000 description 1
- VDIARPPNADFEAV-WEDXCCLWSA-N Leu-Thr-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O VDIARPPNADFEAV-WEDXCCLWSA-N 0.000 description 1
- URHJPNHRQMQGOZ-RHYQMDGZSA-N Leu-Thr-Met Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(O)=O URHJPNHRQMQGOZ-RHYQMDGZSA-N 0.000 description 1
- UIIMIKFNIYPDJF-WDSOQIARSA-N Leu-Trp-Met Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](CCSC)C(O)=O)NC(=O)[C@@H](N)CC(C)C)=CNC2=C1 UIIMIKFNIYPDJF-WDSOQIARSA-N 0.000 description 1
- WUHBLPVELFTPQK-KKUMJFAQSA-N Leu-Tyr-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O WUHBLPVELFTPQK-KKUMJFAQSA-N 0.000 description 1
- ARNIBBOXIAWUOP-MGHWNKPDSA-N Leu-Tyr-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ARNIBBOXIAWUOP-MGHWNKPDSA-N 0.000 description 1
- VQHUBNVKFFLWRP-ULQDDVLXSA-N Leu-Tyr-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=C(O)C=C1 VQHUBNVKFFLWRP-ULQDDVLXSA-N 0.000 description 1
- MVJRBCJCRYGCKV-GVXVVHGQSA-N Leu-Val-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O MVJRBCJCRYGCKV-GVXVVHGQSA-N 0.000 description 1
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 1
- 241001435619 Lile Species 0.000 description 1
- 208000035752 Live birth Diseases 0.000 description 1
- 241001625930 Luria Species 0.000 description 1
- 239000006142 Luria-Bertani Agar Substances 0.000 description 1
- 101000977779 Lymantria dispar multicapsid nuclear polyhedrosis virus Uncharacterized 33.9 kDa protein in PE 3'region Proteins 0.000 description 1
- WQWZXKWOEVSGQM-DCAQKATOSA-N Lys-Ala-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN WQWZXKWOEVSGQM-DCAQKATOSA-N 0.000 description 1
- DEFGUIIUYAUEDU-ZPFDUUQYSA-N Lys-Asn-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DEFGUIIUYAUEDU-ZPFDUUQYSA-N 0.000 description 1
- HKCCVDWHHTVVPN-CIUDSAMLSA-N Lys-Asp-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O HKCCVDWHHTVVPN-CIUDSAMLSA-N 0.000 description 1
- QQUJSUFWEDZQQY-AVGNSLFASA-N Lys-Gln-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CCCCN QQUJSUFWEDZQQY-AVGNSLFASA-N 0.000 description 1
- URGPVYGVWLIRGT-DCAQKATOSA-N Lys-Met-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O URGPVYGVWLIRGT-DCAQKATOSA-N 0.000 description 1
- LMGNWHDWJDIOPK-DKIMLUQUSA-N Lys-Phe-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LMGNWHDWJDIOPK-DKIMLUQUSA-N 0.000 description 1
- AFLBTVGQCQLOFJ-AVGNSLFASA-N Lys-Pro-Arg Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O AFLBTVGQCQLOFJ-AVGNSLFASA-N 0.000 description 1
- OBZHNHBAAVEWKI-DCAQKATOSA-N Lys-Pro-Asn Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O OBZHNHBAAVEWKI-DCAQKATOSA-N 0.000 description 1
- GIKFNMZSGYAPEJ-HJGDQZAQSA-N Lys-Thr-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O GIKFNMZSGYAPEJ-HJGDQZAQSA-N 0.000 description 1
- RMOKGALPSPOYKE-KATARQTJSA-N Lys-Thr-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O RMOKGALPSPOYKE-KATARQTJSA-N 0.000 description 1
- NQOQDINRVQCAKD-ULQDDVLXSA-N Lys-Tyr-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CCCCN)N NQOQDINRVQCAKD-ULQDDVLXSA-N 0.000 description 1
- 239000004472 Lysine Substances 0.000 description 1
- QRHWTCJBCLGYRB-FXQIFTODSA-N Met-Ala-Cys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CS)C(O)=O QRHWTCJBCLGYRB-FXQIFTODSA-N 0.000 description 1
- DTICLBJHRYSJLH-GUBZILKMSA-N Met-Ala-Val Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O DTICLBJHRYSJLH-GUBZILKMSA-N 0.000 description 1
- SDTSLIMYROCDNS-FXQIFTODSA-N Met-Cys-Asn Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(N)=O)C(O)=O SDTSLIMYROCDNS-FXQIFTODSA-N 0.000 description 1
- UOENBSHXYCHSAU-YUMQZZPRSA-N Met-Gln-Gly Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O UOENBSHXYCHSAU-YUMQZZPRSA-N 0.000 description 1
- PHWSCIFNNLLUFJ-NHCYSSNCSA-N Met-Gln-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCSC)N PHWSCIFNNLLUFJ-NHCYSSNCSA-N 0.000 description 1
- JPCHYAUKOUGOIB-HJGDQZAQSA-N Met-Glu-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JPCHYAUKOUGOIB-HJGDQZAQSA-N 0.000 description 1
- GVIVXNFKJQFTCE-YUMQZZPRSA-N Met-Gly-Gln Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O GVIVXNFKJQFTCE-YUMQZZPRSA-N 0.000 description 1
- GETCJHFFECHWHI-QXEWZRGKSA-N Met-Ile-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCSC)N GETCJHFFECHWHI-QXEWZRGKSA-N 0.000 description 1
- PBOUVYGPDSARIS-IUCAKERBSA-N Met-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CC(C)C PBOUVYGPDSARIS-IUCAKERBSA-N 0.000 description 1
- RBGLBUDVQVPTEG-DCAQKATOSA-N Met-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCSC)N RBGLBUDVQVPTEG-DCAQKATOSA-N 0.000 description 1
- LNXGEYIEEUZGGH-JYJNAYRXSA-N Met-Phe-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CCSC)CC1=CC=CC=C1 LNXGEYIEEUZGGH-JYJNAYRXSA-N 0.000 description 1
- DZMGFGQBRYWJOR-YUMQZZPRSA-N Met-Pro Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O DZMGFGQBRYWJOR-YUMQZZPRSA-N 0.000 description 1
- WYDFQSJOARJAMM-GUBZILKMSA-N Met-Pro-Asp Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O WYDFQSJOARJAMM-GUBZILKMSA-N 0.000 description 1
- WEDDFMCSUNNZJR-WDSKDSINSA-N Met-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(O)=O WEDDFMCSUNNZJR-WDSKDSINSA-N 0.000 description 1
- GGXZOTSDJJTDGB-GUBZILKMSA-N Met-Ser-Val Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O GGXZOTSDJJTDGB-GUBZILKMSA-N 0.000 description 1
- 101000758828 Methanosarcina barkeri (strain Fusaro / DSM 804) Uncharacterized protein Mbar_A1602 Proteins 0.000 description 1
- 101001122401 Middle East respiratory syndrome-related coronavirus (isolate United Kingdom/H123990006/2012) Non-structural protein ORF3 Proteins 0.000 description 1
- 101100005318 Mus musculus Ctsr gene Proteins 0.000 description 1
- 101001055788 Mycolicibacterium smegmatis (strain ATCC 700084 / mc(2)155) Pentapeptide repeat protein MfpA Proteins 0.000 description 1
- WUGMRIBZSVSJNP-UHFFFAOYSA-N N-L-alanyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C)C(O)=O)=CNC2=C1 WUGMRIBZSVSJNP-UHFFFAOYSA-N 0.000 description 1
- AUEJLPRZGVVDNU-UHFFFAOYSA-N N-L-tyrosyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 AUEJLPRZGVVDNU-UHFFFAOYSA-N 0.000 description 1
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 1
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 1
- 108010079364 N-glycylalanine Proteins 0.000 description 1
- 101000827630 Narcissus mosaic virus Uncharacterized 10 kDa protein Proteins 0.000 description 1
- 239000000020 Nitrocellulose Substances 0.000 description 1
- 238000000636 Northern blotting Methods 0.000 description 1
- 240000008881 Oenanthe javanica Species 0.000 description 1
- 101000740670 Orgyia pseudotsugata multicapsid polyhedrosis virus Protein C42 Proteins 0.000 description 1
- 206010031252 Osteomyelitis Diseases 0.000 description 1
- 229910019142 PO4 Inorganic materials 0.000 description 1
- 239000001888 Peptone Substances 0.000 description 1
- 108010080698 Peptones Proteins 0.000 description 1
- MIDZLCFIAINOQN-WPRPVWTQSA-N Phe-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 MIDZLCFIAINOQN-WPRPVWTQSA-N 0.000 description 1
- BJEYSVHMGIJORT-NHCYSSNCSA-N Phe-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 BJEYSVHMGIJORT-NHCYSSNCSA-N 0.000 description 1
- BBDSZDHUCPSYAC-QEJZJMRPSA-N Phe-Ala-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O BBDSZDHUCPSYAC-QEJZJMRPSA-N 0.000 description 1
- LZDIENNKWVXJMX-JYJNAYRXSA-N Phe-Arg-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CC1=CC=CC=C1 LZDIENNKWVXJMX-JYJNAYRXSA-N 0.000 description 1
- PLNHHOXNVSYKOB-JYJNAYRXSA-N Phe-Arg-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC1=CC=CC=C1)N PLNHHOXNVSYKOB-JYJNAYRXSA-N 0.000 description 1
- IWRZUGHCHFZYQZ-UFYCRDLUSA-N Phe-Arg-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 IWRZUGHCHFZYQZ-UFYCRDLUSA-N 0.000 description 1
- BXNGIHFNNNSEOS-UWVGGRQHSA-N Phe-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 BXNGIHFNNNSEOS-UWVGGRQHSA-N 0.000 description 1
- IUVYJBMTHARMIP-PCBIJLKTSA-N Phe-Asp-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O IUVYJBMTHARMIP-PCBIJLKTSA-N 0.000 description 1
- GDBOREPXIRKSEQ-FHWLQOOXSA-N Phe-Gln-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GDBOREPXIRKSEQ-FHWLQOOXSA-N 0.000 description 1
- JWBLQDDHSDGEGR-DRZSPHRISA-N Phe-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 JWBLQDDHSDGEGR-DRZSPHRISA-N 0.000 description 1
- CWFGECHCRMGPPT-MXAVVETBSA-N Phe-Ile-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O CWFGECHCRMGPPT-MXAVVETBSA-N 0.000 description 1
- YCCUXNNKXDGMAM-KKUMJFAQSA-N Phe-Leu-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YCCUXNNKXDGMAM-KKUMJFAQSA-N 0.000 description 1
- GKZIWHRNKRBEOH-HOTGVXAUSA-N Phe-Phe Chemical compound C([C@H]([NH3+])C(=O)N[C@@H](CC=1C=CC=CC=1)C([O-])=O)C1=CC=CC=C1 GKZIWHRNKRBEOH-HOTGVXAUSA-N 0.000 description 1
- KAJLHCWRWDSROH-BZSNNMDCSA-N Phe-Phe-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CC=CC=C1 KAJLHCWRWDSROH-BZSNNMDCSA-N 0.000 description 1
- WKLMCMXFMQEKCX-SLFFLAALSA-N Phe-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)[C@H](CC3=CC=CC=C3)N)C(=O)O WKLMCMXFMQEKCX-SLFFLAALSA-N 0.000 description 1
- XOHJOMKCRLHGCY-UNQGMJICSA-N Phe-Pro-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O XOHJOMKCRLHGCY-UNQGMJICSA-N 0.000 description 1
- ROHDXJUFQVRDAV-UWVGGRQHSA-N Phe-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 ROHDXJUFQVRDAV-UWVGGRQHSA-N 0.000 description 1
- MVIJMIZJPHQGEN-IHRRRGAJSA-N Phe-Ser-Val Chemical compound CC(C)[C@@H](C([O-])=O)NC(=O)[C@H](CO)NC(=O)[C@@H]([NH3+])CC1=CC=CC=C1 MVIJMIZJPHQGEN-IHRRRGAJSA-N 0.000 description 1
- 101000769182 Photorhabdus luminescens Uncharacterized protein in pnp 3'region Proteins 0.000 description 1
- 206010035664 Pneumonia Diseases 0.000 description 1
- FELJDCNGZFDUNR-WDSKDSINSA-N Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 FELJDCNGZFDUNR-WDSKDSINSA-N 0.000 description 1
- FZHBZMDRDASUHN-NAKRPEOUSA-N Pro-Ala-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1)C(O)=O FZHBZMDRDASUHN-NAKRPEOUSA-N 0.000 description 1
- XQLBWXHVZVBNJM-FXQIFTODSA-N Pro-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 XQLBWXHVZVBNJM-FXQIFTODSA-N 0.000 description 1
- GRIRJQGZZJVANI-CYDGBPFRSA-N Pro-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H]1CCCN1 GRIRJQGZZJVANI-CYDGBPFRSA-N 0.000 description 1
- UTAUEDINXUMHLG-FXQIFTODSA-N Pro-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H]1CCCN1 UTAUEDINXUMHLG-FXQIFTODSA-N 0.000 description 1
- VJLJGKQAOQJXJG-CIUDSAMLSA-N Pro-Asp-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O VJLJGKQAOQJXJG-CIUDSAMLSA-N 0.000 description 1
- XJROSHJRQTXWAE-XGEHTFHBSA-N Pro-Cys-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XJROSHJRQTXWAE-XGEHTFHBSA-N 0.000 description 1
- MGDFPGCFVJFITQ-CIUDSAMLSA-N Pro-Glu-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MGDFPGCFVJFITQ-CIUDSAMLSA-N 0.000 description 1
- OCYROESYHWUPBP-CIUDSAMLSA-N Pro-Ile Chemical compound CC[C@H](C)[C@@H](C([O-])=O)NC(=O)[C@@H]1CCC[NH2+]1 OCYROESYHWUPBP-CIUDSAMLSA-N 0.000 description 1
- IBGCFJDLCYTKPW-NAKRPEOUSA-N Pro-Ile-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H]1CCCN1 IBGCFJDLCYTKPW-NAKRPEOUSA-N 0.000 description 1
- BCNRNJWSRFDPTQ-HJWJTTGWSA-N Pro-Ile-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O BCNRNJWSRFDPTQ-HJWJTTGWSA-N 0.000 description 1
- GURGCNUWVSDYTP-SRVKXCTJSA-N Pro-Leu-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O GURGCNUWVSDYTP-SRVKXCTJSA-N 0.000 description 1
- MCWHYUWXVNRXFV-RWMBFGLXSA-N Pro-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 MCWHYUWXVNRXFV-RWMBFGLXSA-N 0.000 description 1
- OFGUOWQVEGTVNU-DCAQKATOSA-N Pro-Lys-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O OFGUOWQVEGTVNU-DCAQKATOSA-N 0.000 description 1
- SXMSEHDMNIUTSP-DCAQKATOSA-N Pro-Lys-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O SXMSEHDMNIUTSP-DCAQKATOSA-N 0.000 description 1
- POQFNPILEQEODH-FXQIFTODSA-N Pro-Ser-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O POQFNPILEQEODH-FXQIFTODSA-N 0.000 description 1
- RNEFESSBTOQSAC-DCAQKATOSA-N Pro-Ser-His Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O RNEFESSBTOQSAC-DCAQKATOSA-N 0.000 description 1
- CXGLFEOYCJFKPR-RCWTZXSCSA-N Pro-Thr-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O CXGLFEOYCJFKPR-RCWTZXSCSA-N 0.000 description 1
- OFSZYRZOUMNCCU-BZSNNMDCSA-N Pro-Trp-Met Chemical compound N([C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CCSC)C(O)=O)C(=O)[C@@H]1CCCN1 OFSZYRZOUMNCCU-BZSNNMDCSA-N 0.000 description 1
- AWJGUZSYVIVZGP-YUMQZZPRSA-N Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 AWJGUZSYVIVZGP-YUMQZZPRSA-N 0.000 description 1
- 101710197985 Probable protein Rev Proteins 0.000 description 1
- 102100032859 Protein AMBP Human genes 0.000 description 1
- 101000961392 Pseudescherichia vulneris Uncharacterized 29.9 kDa protein in crtE 3'region Proteins 0.000 description 1
- 101000731030 Pseudomonas oleovorans Poly(3-hydroxyalkanoate) polymerase 2 Proteins 0.000 description 1
- 101001065485 Pseudomonas putida Probable fatty acid methyltransferase Proteins 0.000 description 1
- 101710194805 Putative repressor Proteins 0.000 description 1
- 108010025216 RVF peptide Proteins 0.000 description 1
- 102000009661 Repressor Proteins Human genes 0.000 description 1
- 206010057190 Respiratory tract infections Diseases 0.000 description 1
- 101000711023 Rhizobium leguminosarum bv. trifolii Uncharacterized protein in tfuA 3'region Proteins 0.000 description 1
- 241000191043 Rhodobacter sphaeroides Species 0.000 description 1
- 101000948156 Rhodococcus erythropolis Uncharacterized 47.3 kDa protein in thcA 5'region Proteins 0.000 description 1
- 101000917565 Rhodococcus fascians Uncharacterized 33.6 kDa protein in fasciation locus Proteins 0.000 description 1
- PYMYPHUHKUWMLA-LMVFSUKVSA-N Ribose Natural products OC[C@@H](O)[C@@H](O)[C@@H](O)C=O PYMYPHUHKUWMLA-LMVFSUKVSA-N 0.000 description 1
- 101000790284 Saimiriine herpesvirus 2 (strain 488) Uncharacterized 9.5 kDa protein in DHFR 3'region Proteins 0.000 description 1
- 241001138501 Salmonella enterica Species 0.000 description 1
- 238000012300 Sequence Analysis Methods 0.000 description 1
- SSJMZMUVNKEENT-IMJSIDKUSA-N Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CO SSJMZMUVNKEENT-IMJSIDKUSA-N 0.000 description 1
- IDQFQFVEWMWRQQ-DLOVCJGASA-N Ser-Ala-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O IDQFQFVEWMWRQQ-DLOVCJGASA-N 0.000 description 1
- YQHZVYJAGWMHES-ZLUOBGJFSA-N Ser-Ala-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O YQHZVYJAGWMHES-ZLUOBGJFSA-N 0.000 description 1
- HBZBPFLJNDXRAY-FXQIFTODSA-N Ser-Ala-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O HBZBPFLJNDXRAY-FXQIFTODSA-N 0.000 description 1
- KAAPNMOKUUPKOE-SRVKXCTJSA-N Ser-Asn-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KAAPNMOKUUPKOE-SRVKXCTJSA-N 0.000 description 1
- KNZQGAUEYZJUSQ-ZLUOBGJFSA-N Ser-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CO)N KNZQGAUEYZJUSQ-ZLUOBGJFSA-N 0.000 description 1
- CDVFZMOFNJPUDD-ACZMJKKPSA-N Ser-Gln-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CDVFZMOFNJPUDD-ACZMJKKPSA-N 0.000 description 1
- UQFYNFTYDHUIMI-WHFBIAKZSA-N Ser-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CO UQFYNFTYDHUIMI-WHFBIAKZSA-N 0.000 description 1
- WEQAYODCJHZSJZ-KKUMJFAQSA-N Ser-His-Tyr Chemical compound C([C@H](NC(=O)[C@H](CO)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CN=CN1 WEQAYODCJHZSJZ-KKUMJFAQSA-N 0.000 description 1
- ZUDXUJSYCCNZQJ-DCAQKATOSA-N Ser-His-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CO)N ZUDXUJSYCCNZQJ-DCAQKATOSA-N 0.000 description 1
- RIAKPZVSNBBNRE-BJDJZHNGSA-N Ser-Ile-Leu Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O RIAKPZVSNBBNRE-BJDJZHNGSA-N 0.000 description 1
- PPQRSMGDOHLTBE-UWVGGRQHSA-N Ser-Phe Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 PPQRSMGDOHLTBE-UWVGGRQHSA-N 0.000 description 1
- NQZFFLBPNDLTPO-DLOVCJGASA-N Ser-Phe-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CO)N NQZFFLBPNDLTPO-DLOVCJGASA-N 0.000 description 1
- MQUZANJDFOQOBX-SRVKXCTJSA-N Ser-Phe-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O MQUZANJDFOQOBX-SRVKXCTJSA-N 0.000 description 1
- XZKQVQKUZMAADP-IMJSIDKUSA-N Ser-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(O)=O XZKQVQKUZMAADP-IMJSIDKUSA-N 0.000 description 1
- WLJPJRGQRNCIQS-ZLUOBGJFSA-N Ser-Ser-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O WLJPJRGQRNCIQS-ZLUOBGJFSA-N 0.000 description 1
- SRSPTFBENMJHMR-WHFBIAKZSA-N Ser-Ser-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SRSPTFBENMJHMR-WHFBIAKZSA-N 0.000 description 1
- AABIBDJHSKIMJK-FXQIFTODSA-N Ser-Ser-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O AABIBDJHSKIMJK-FXQIFTODSA-N 0.000 description 1
- LDEBVRIURYMKQS-WISUUJSJSA-N Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CO LDEBVRIURYMKQS-WISUUJSJSA-N 0.000 description 1
- BCAVNDNYOGTQMQ-AAEUAGOBSA-N Ser-Trp-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)NCC(O)=O BCAVNDNYOGTQMQ-AAEUAGOBSA-N 0.000 description 1
- ILVGMCVCQBJPSH-WDSKDSINSA-N Ser-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CO ILVGMCVCQBJPSH-WDSKDSINSA-N 0.000 description 1
- BEBVVQPDSHHWQL-NRPADANISA-N Ser-Val-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O BEBVVQPDSHHWQL-NRPADANISA-N 0.000 description 1
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 1
- 241000700584 Simplexvirus Species 0.000 description 1
- 206010062255 Soft tissue infection Diseases 0.000 description 1
- 101000936719 Streptococcus gordonii Accessory Sec system protein Asp3 Proteins 0.000 description 1
- 101000788499 Streptomyces coelicolor Uncharacterized oxidoreductase in mprA 5'region Proteins 0.000 description 1
- 101001102841 Streptomyces griseus Purine nucleoside phosphorylase ORF3 Proteins 0.000 description 1
- 101000708557 Streptomyces lincolnensis Uncharacterized 17.2 kDa protein in melC2-rnhH intergenic region Proteins 0.000 description 1
- 229930006000 Sucrose Natural products 0.000 description 1
- CZMRCDWAGMRECN-UGDNZRGBSA-N Sucrose Chemical compound O[C@H]1[C@H](O)[C@@H](CO)O[C@@]1(CO)O[C@@H]1[C@H](O)[C@@H](O)[C@H](O)[C@@H](CO)O1 CZMRCDWAGMRECN-UGDNZRGBSA-N 0.000 description 1
- NHUHCSRWZMLRLA-UHFFFAOYSA-N Sulfisoxazole Chemical compound CC1=NOC(NS(=O)(=O)C=2C=CC(N)=CC=2)=C1C NHUHCSRWZMLRLA-UHFFFAOYSA-N 0.000 description 1
- 102100038410 T-complex protein 1 subunit alpha Human genes 0.000 description 1
- 210000001744 T-lymphocyte Anatomy 0.000 description 1
- 108020005038 Terminator Codon Proteins 0.000 description 1
- 101000649826 Thermotoga neapolitana Putative anti-sigma factor antagonist TM1081 homolog Proteins 0.000 description 1
- HYLXOQURIOCKIH-VQVTYTSYSA-N Thr-Arg Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CCCNC(N)=N HYLXOQURIOCKIH-VQVTYTSYSA-N 0.000 description 1
- GLQFKOVWXPPFTP-VEVYYDQMSA-N Thr-Arg-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O GLQFKOVWXPPFTP-VEVYYDQMSA-N 0.000 description 1
- SKHPKKYKDYULDH-HJGDQZAQSA-N Thr-Asn-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O SKHPKKYKDYULDH-HJGDQZAQSA-N 0.000 description 1
- NRUPKQSXTJNQGD-XGEHTFHBSA-N Thr-Cys-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NRUPKQSXTJNQGD-XGEHTFHBSA-N 0.000 description 1
- YAAPRMFURSENOZ-KATARQTJSA-N Thr-Cys-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)O)N)O YAAPRMFURSENOZ-KATARQTJSA-N 0.000 description 1
- BIYXEUAFGLTAEM-WUJLRWPWSA-N Thr-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(O)=O BIYXEUAFGLTAEM-WUJLRWPWSA-N 0.000 description 1
- LUMXICQAOKVQOB-YWIQKCBGSA-N Thr-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](N)[C@@H](C)O LUMXICQAOKVQOB-YWIQKCBGSA-N 0.000 description 1
- IHAPJUHCZXBPHR-WZLNRYEVSA-N Thr-Ile-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N IHAPJUHCZXBPHR-WZLNRYEVSA-N 0.000 description 1
- TZJSEJOXAIWOST-RHYQMDGZSA-N Thr-Lys-Arg Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CCCN=C(N)N TZJSEJOXAIWOST-RHYQMDGZSA-N 0.000 description 1
- CGCMNOIQVAXYMA-UNQGMJICSA-N Thr-Met-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O CGCMNOIQVAXYMA-UNQGMJICSA-N 0.000 description 1
- MXNAOGFNFNKUPD-JHYOHUSXSA-N Thr-Phe-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MXNAOGFNFNKUPD-JHYOHUSXSA-N 0.000 description 1
- MEBDIIKMUUNBSB-RPTUDFQQSA-N Thr-Phe-Tyr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MEBDIIKMUUNBSB-RPTUDFQQSA-N 0.000 description 1
- WPSKTVVMQCXPRO-BWBBJGPYSA-N Thr-Ser-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WPSKTVVMQCXPRO-BWBBJGPYSA-N 0.000 description 1
- BEZTUFWTPVOROW-KJEVXHAQSA-N Thr-Tyr-Arg Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N)O BEZTUFWTPVOROW-KJEVXHAQSA-N 0.000 description 1
- KZTLZZQTJMCGIP-ZJDVBMNYSA-N Thr-Val-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KZTLZZQTJMCGIP-ZJDVBMNYSA-N 0.000 description 1
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 1
- 239000004473 Threonine Substances 0.000 description 1
- 102000004357 Transferases Human genes 0.000 description 1
- 108090000992 Transferases Proteins 0.000 description 1
- LCPVBXOHXMBLFW-JSGCOSHPSA-N Trp-Arg Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O)=CNC2=C1 LCPVBXOHXMBLFW-JSGCOSHPSA-N 0.000 description 1
- HOJPPPKZWFRTHJ-PJODQICGSA-N Trp-Arg-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N HOJPPPKZWFRTHJ-PJODQICGSA-N 0.000 description 1
- HYNAKPYFEYJMAS-XIRDDKMYSA-N Trp-Arg-Glu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O HYNAKPYFEYJMAS-XIRDDKMYSA-N 0.000 description 1
- DNUJCLUFRGGSDJ-YLVFBTJISA-N Trp-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC1=CNC2=CC=CC=C21)N DNUJCLUFRGGSDJ-YLVFBTJISA-N 0.000 description 1
- KULBQAVOXHQLIY-HSCHXYMDSA-N Trp-Ile-Leu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O)=CNC2=C1 KULBQAVOXHQLIY-HSCHXYMDSA-N 0.000 description 1
- VPRHDRKAPYZMHL-SZMVWBNQSA-N Trp-Leu-Glu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O)=CNC2=C1 VPRHDRKAPYZMHL-SZMVWBNQSA-N 0.000 description 1
- GBEAUNVBIMLWIB-IHPCNDPISA-N Trp-Ser-Phe Chemical compound C([C@H](NC(=O)[C@H](CO)NC(=O)[C@H](CC=1C2=CC=CC=C2NC=1)N)C(O)=O)C1=CC=CC=C1 GBEAUNVBIMLWIB-IHPCNDPISA-N 0.000 description 1
- 241001467018 Typhis Species 0.000 description 1
- NOXKHHXSHQFSGJ-FQPOAREZSA-N Tyr-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NOXKHHXSHQFSGJ-FQPOAREZSA-N 0.000 description 1
- QNJYPWZACBACER-KKUMJFAQSA-N Tyr-Asp-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O QNJYPWZACBACER-KKUMJFAQSA-N 0.000 description 1
- HPYDSVWYXXKHRD-VIFPVBQESA-N Tyr-Gly Chemical compound [O-]C(=O)CNC(=O)[C@@H]([NH3+])CC1=CC=C(O)C=C1 HPYDSVWYXXKHRD-VIFPVBQESA-N 0.000 description 1
- HIINQLBHPIQYHN-JTQLQIEISA-N Tyr-Gly-Gly Chemical compound OC(=O)CNC(=O)CNC(=O)[C@@H](N)CC1=CC=C(O)C=C1 HIINQLBHPIQYHN-JTQLQIEISA-N 0.000 description 1
- OHOVFPKXPZODHS-SJWGOKEGSA-N Tyr-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N OHOVFPKXPZODHS-SJWGOKEGSA-N 0.000 description 1
- MVFQLSPDMMFCMW-KKUMJFAQSA-N Tyr-Leu-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O MVFQLSPDMMFCMW-KKUMJFAQSA-N 0.000 description 1
- AOLHUMAVONBBEZ-STQMWFEESA-N Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AOLHUMAVONBBEZ-STQMWFEESA-N 0.000 description 1
- PLXQRTXVLZUNMU-RNXOBYDBSA-N Tyr-Phe-Trp Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O)NC(=O)[C@H](CC4=CC=C(C=C4)O)N PLXQRTXVLZUNMU-RNXOBYDBSA-N 0.000 description 1
- ZSXJENBJGRHKIG-UWVGGRQHSA-N Tyr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 ZSXJENBJGRHKIG-UWVGGRQHSA-N 0.000 description 1
- VYQQQIRHIFALGE-UWJYBYFXSA-N Tyr-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 VYQQQIRHIFALGE-UWJYBYFXSA-N 0.000 description 1
- BCOBSVIZMQXKFY-KKUMJFAQSA-N Tyr-Ser-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O BCOBSVIZMQXKFY-KKUMJFAQSA-N 0.000 description 1
- UMSZZGTXGKHTFJ-SRVKXCTJSA-N Tyr-Ser-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 UMSZZGTXGKHTFJ-SRVKXCTJSA-N 0.000 description 1
- OYOQKMOWUDVWCR-RYUDHWBXSA-N Tyr-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OYOQKMOWUDVWCR-RYUDHWBXSA-N 0.000 description 1
- SMUWZUSWMWVOSL-JYJNAYRXSA-N Tyr-Val-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N SMUWZUSWMWVOSL-JYJNAYRXSA-N 0.000 description 1
- ABSXSJZNRAQDDI-KJEVXHAQSA-N Tyr-Val-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ABSXSJZNRAQDDI-KJEVXHAQSA-N 0.000 description 1
- ZLFHAAGHGQBQQN-GUBZILKMSA-N Val-Ala-Pro Natural products CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O ZLFHAAGHGQBQQN-GUBZILKMSA-N 0.000 description 1
- QPZMOUMNTGTEFR-ZKWXMUAHSA-N Val-Asn-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](C(C)C)N QPZMOUMNTGTEFR-ZKWXMUAHSA-N 0.000 description 1
- IDKGBVZGNTYYCC-QXEWZRGKSA-N Val-Asn-Pro Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(O)=O IDKGBVZGNTYYCC-QXEWZRGKSA-N 0.000 description 1
- TZVUSFMQWPWHON-NHCYSSNCSA-N Val-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N TZVUSFMQWPWHON-NHCYSSNCSA-N 0.000 description 1
- ZSZFTYVFQLUWBF-QXEWZRGKSA-N Val-Asp-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCSC)C(=O)O)N ZSZFTYVFQLUWBF-QXEWZRGKSA-N 0.000 description 1
- XXDVDTMEVBYRPK-XPUUQOCRSA-N Val-Gln Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O XXDVDTMEVBYRPK-XPUUQOCRSA-N 0.000 description 1
- ZXAGTABZUOMUDO-GVXVVHGQSA-N Val-Glu-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N ZXAGTABZUOMUDO-GVXVVHGQSA-N 0.000 description 1
- OACSGBOREVRSME-NHCYSSNCSA-N Val-His-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](Cc1cnc[nH]1)C(=O)N[C@@H](CC(N)=O)C(O)=O OACSGBOREVRSME-NHCYSSNCSA-N 0.000 description 1
- SDUBQHUJJWQTEU-XUXIUFHCSA-N Val-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](C(C)C)N SDUBQHUJJWQTEU-XUXIUFHCSA-N 0.000 description 1
- BZOSBRIDWSSTFN-AVGNSLFASA-N Val-Leu-Met Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](C(C)C)N BZOSBRIDWSSTFN-AVGNSLFASA-N 0.000 description 1
- QRVPEKJBBRYISE-XUXIUFHCSA-N Val-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](C(C)C)N QRVPEKJBBRYISE-XUXIUFHCSA-N 0.000 description 1
- MJFSRZZJQWZHFQ-SRVKXCTJSA-N Val-Met-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(=O)O)N MJFSRZZJQWZHFQ-SRVKXCTJSA-N 0.000 description 1
- BCBFMJYTNKDALA-UFYCRDLUSA-N Val-Phe-Phe Chemical compound N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O BCBFMJYTNKDALA-UFYCRDLUSA-N 0.000 description 1
- KISFXYYRKKNLOP-IHRRRGAJSA-N Val-Phe-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)O)N KISFXYYRKKNLOP-IHRRRGAJSA-N 0.000 description 1
- GIAZPLMMQOERPN-YUMQZZPRSA-N Val-Pro Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(O)=O GIAZPLMMQOERPN-YUMQZZPRSA-N 0.000 description 1
- HPOSMQWRPMRMFO-GUBZILKMSA-N Val-Pro-Cys Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CS)C(=O)O)N HPOSMQWRPMRMFO-GUBZILKMSA-N 0.000 description 1
- SDHZOOIGIUEPDY-JYJNAYRXSA-N Val-Ser-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CO)NC(=O)[C@@H](N)C(C)C)C(O)=O)=CNC2=C1 SDHZOOIGIUEPDY-JYJNAYRXSA-N 0.000 description 1
- NZYNRRGJJVSSTJ-GUBZILKMSA-N Val-Ser-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O NZYNRRGJJVSSTJ-GUBZILKMSA-N 0.000 description 1
- CEKSLIVSNNGOKH-KZVJFYERSA-N Val-Thr-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](C(C)C)N)O CEKSLIVSNNGOKH-KZVJFYERSA-N 0.000 description 1
- MNSSBIHFEUUXNW-RCWTZXSCSA-N Val-Thr-Arg Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N MNSSBIHFEUUXNW-RCWTZXSCSA-N 0.000 description 1
- OFTXTCGQJXTNQS-XGEHTFHBSA-N Val-Thr-Ser Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](C(C)C)N)O OFTXTCGQJXTNQS-XGEHTFHBSA-N 0.000 description 1
- PGBMPFKFKXYROZ-UFYCRDLUSA-N Val-Tyr-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)O)N PGBMPFKFKXYROZ-UFYCRDLUSA-N 0.000 description 1
- ZNGPROMGGGFOAA-JYJNAYRXSA-N Val-Tyr-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=C(O)C=C1 ZNGPROMGGGFOAA-JYJNAYRXSA-N 0.000 description 1
- KRNYOVHEKOBTEF-YUMQZZPRSA-N Val-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(O)=O KRNYOVHEKOBTEF-YUMQZZPRSA-N 0.000 description 1
- IOUPEELXVYPCPG-UHFFFAOYSA-N Valylglycine Chemical compound CC(C)C(N)C(=O)NCC(O)=O IOUPEELXVYPCPG-UHFFFAOYSA-N 0.000 description 1
- 101000827562 Vibrio alginolyticus Uncharacterized protein in proC 3'region Proteins 0.000 description 1
- 101000778915 Vibrio parahaemolyticus serotype O3:K6 (strain RIMD 2210633) Uncharacterized membrane protein VP2115 Proteins 0.000 description 1
- 238000010521 absorption reaction Methods 0.000 description 1
- CUJRVFIICFDLGR-UHFFFAOYSA-N acetylacetonate Chemical compound CC(=O)[CH-]C(C)=O CUJRVFIICFDLGR-UHFFFAOYSA-N 0.000 description 1
- MKUXAQIIEYXACX-UHFFFAOYSA-N aciclovir Chemical compound N1C(N)=NC(=O)C2=C1N(COCCO)C=N2 MKUXAQIIEYXACX-UHFFFAOYSA-N 0.000 description 1
- 101150092805 actc1 gene Proteins 0.000 description 1
- 230000006978 adaptation Effects 0.000 description 1
- 230000001464 adherent effect Effects 0.000 description 1
- 235000004279 alanine Nutrition 0.000 description 1
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 1
- 108010045350 alanyl-tyrosyl-alanine Proteins 0.000 description 1
- 108010005233 alanylglutamic acid Proteins 0.000 description 1
- 108010047495 alanylglycine Proteins 0.000 description 1
- 125000001931 aliphatic group Chemical group 0.000 description 1
- HMFHBZSHGGEWLO-UHFFFAOYSA-N alpha-D-Furanose-Ribose Natural products OCC1OC(O)C(O)C1O HMFHBZSHGGEWLO-UHFFFAOYSA-N 0.000 description 1
- 230000004075 alteration Effects 0.000 description 1
- 238000010171 animal model Methods 0.000 description 1
- 230000000890 antigenic effect Effects 0.000 description 1
- 239000004599 antimicrobial Substances 0.000 description 1
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 1
- 108010059459 arginyl-threonyl-phenylalanine Proteins 0.000 description 1
- 108010062796 arginyllysine Proteins 0.000 description 1
- 235000009582 asparagine Nutrition 0.000 description 1
- 229960001230 asparagine Drugs 0.000 description 1
- 235000003704 aspartic acid Nutrition 0.000 description 1
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 1
- 108010093581 aspartyl-proline Proteins 0.000 description 1
- 238000003149 assay kit Methods 0.000 description 1
- 208000022362 bacterial infectious disease Diseases 0.000 description 1
- 244000052616 bacterial pathogen Species 0.000 description 1
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 1
- 230000015572 biosynthetic process Effects 0.000 description 1
- 238000010804 cDNA synthesis Methods 0.000 description 1
- 239000011575 calcium Substances 0.000 description 1
- 239000001506 calcium phosphate Substances 0.000 description 1
- 229910000389 calcium phosphate Inorganic materials 0.000 description 1
- 235000011010 calcium phosphates Nutrition 0.000 description 1
- 229940041514 candida albicans extract Drugs 0.000 description 1
- FPPNZSSZRUTDAP-UWFZAAFLSA-N carbenicillin Chemical compound N([C@H]1[C@H]2SC([C@@H](N2C1=O)C(O)=O)(C)C)C(=O)C(C(O)=O)C1=CC=CC=C1 FPPNZSSZRUTDAP-UWFZAAFLSA-N 0.000 description 1
- 229960003669 carbenicillin Drugs 0.000 description 1
- 210000001175 cerebrospinal fluid Anatomy 0.000 description 1
- 238000006243 chemical reaction Methods 0.000 description 1
- 239000003153 chemical reaction reagent Substances 0.000 description 1
- 238000000546 chi-square test Methods 0.000 description 1
- 230000002759 chromosomal effect Effects 0.000 description 1
- 230000001684 chronic effect Effects 0.000 description 1
- 238000003776 cleavage reaction Methods 0.000 description 1
- 206010009887 colitis Diseases 0.000 description 1
- 238000004440 column chromatography Methods 0.000 description 1
- 230000000052 comparative effect Effects 0.000 description 1
- 108091036078 conserved sequence Proteins 0.000 description 1
- 239000000356 contaminant Substances 0.000 description 1
- 239000013601 cosmid vector Substances 0.000 description 1
- 239000013058 crude material Substances 0.000 description 1
- 230000001351 cycling effect Effects 0.000 description 1
- 108010016616 cysteinylglycine Proteins 0.000 description 1
- 238000004925 denaturation Methods 0.000 description 1
- 230000036425 denaturation Effects 0.000 description 1
- 239000005549 deoxyribonucleoside Substances 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 238000003745 diagnosis Methods 0.000 description 1
- 238000002405 diagnostic procedure Methods 0.000 description 1
- 230000002416 diarrheagenic effect Effects 0.000 description 1
- 230000029087 digestion Effects 0.000 description 1
- 239000003814 drug Substances 0.000 description 1
- 238000007877 drug screening Methods 0.000 description 1
- QFYBYZLHPIALCZ-ZETCQYMHSA-N eglu Chemical compound CC[C@@](N)(C(O)=O)CCC(O)=O QFYBYZLHPIALCZ-ZETCQYMHSA-N 0.000 description 1
- 238000001962 electrophoresis Methods 0.000 description 1
- 238000004520 electroporation Methods 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 230000000688 enterotoxigenic effect Effects 0.000 description 1
- 238000001976 enzyme digestion Methods 0.000 description 1
- 210000002919 epithelial cell Anatomy 0.000 description 1
- 102000013165 exonuclease Human genes 0.000 description 1
- 238000002474 experimental method Methods 0.000 description 1
- 239000013613 expression plasmid Substances 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 230000002550 fecal effect Effects 0.000 description 1
- 210000003608 fece Anatomy 0.000 description 1
- 238000002866 fluorescence resonance energy transfer Methods 0.000 description 1
- 235000013305 food Nutrition 0.000 description 1
- 238000005194 fractionation Methods 0.000 description 1
- 108020001507 fusion proteins Proteins 0.000 description 1
- 102000037865 fusion proteins Human genes 0.000 description 1
- 238000010353 genetic engineering Methods 0.000 description 1
- 239000011521 glass Substances 0.000 description 1
- 239000008103 glucose Substances 0.000 description 1
- 235000013922 glutamic acid Nutrition 0.000 description 1
- 239000004220 glutamic acid Substances 0.000 description 1
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 1
- 235000004554 glutamine Nutrition 0.000 description 1
- 108010079547 glutamylmethionine Proteins 0.000 description 1
- 150000004676 glycans Chemical class 0.000 description 1
- 230000013595 glycosylation Effects 0.000 description 1
- 238000006206 glycosylation reaction Methods 0.000 description 1
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 1
- 108010077515 glycylproline Proteins 0.000 description 1
- 108010087823 glycyltyrosine Proteins 0.000 description 1
- STKYPAFSDFAEPH-LURJTMIESA-N glycylvaline Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CN STKYPAFSDFAEPH-LURJTMIESA-N 0.000 description 1
- PCHJSUWPFVWCPO-UHFFFAOYSA-N gold Chemical compound [Au] PCHJSUWPFVWCPO-UHFFFAOYSA-N 0.000 description 1
- 239000010931 gold Substances 0.000 description 1
- 229910052737 gold Inorganic materials 0.000 description 1
- 108010036413 histidylglycine Proteins 0.000 description 1
- 108010028295 histidylhistidine Proteins 0.000 description 1
- 238000003384 imaging method Methods 0.000 description 1
- 238000003018 immunoassay Methods 0.000 description 1
- 238000010166 immunofluorescence Methods 0.000 description 1
- 230000016784 immunoglobulin production Effects 0.000 description 1
- 238000012744 immunostaining Methods 0.000 description 1
- 238000012405 in silico analysis Methods 0.000 description 1
- 238000007901 in situ hybridization Methods 0.000 description 1
- 238000011534 incubation Methods 0.000 description 1
- 239000003999 initiator Substances 0.000 description 1
- 238000011081 inoculation Methods 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 208000028774 intestinal disease Diseases 0.000 description 1
- 238000002955 isolation Methods 0.000 description 1
- 229960000310 isoleucine Drugs 0.000 description 1
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 210000002429 large intestine Anatomy 0.000 description 1
- 238000011031 large-scale manufacturing process Methods 0.000 description 1
- 230000003902 lesion Effects 0.000 description 1
- 108010044056 leucyl-phenylalanine Proteins 0.000 description 1
- 238000001638 lipofection Methods 0.000 description 1
- 210000004698 lymphocyte Anatomy 0.000 description 1
- 238000013507 mapping Methods 0.000 description 1
- 230000007102 metabolic function Effects 0.000 description 1
- 108010068488 methionylphenylalanine Proteins 0.000 description 1
- 238000002493 microarray Methods 0.000 description 1
- 238000010369 molecular cloning Methods 0.000 description 1
- 238000002887 multiple sequence alignment Methods 0.000 description 1
- 239000013642 negative control Substances 0.000 description 1
- 229920001220 nitrocellulos Polymers 0.000 description 1
- 244000039328 opportunistic pathogen Species 0.000 description 1
- 230000008520 organization Effects 0.000 description 1
- 230000003647 oxidation Effects 0.000 description 1
- 238000007254 oxidation reaction Methods 0.000 description 1
- 239000002245 particle Substances 0.000 description 1
- 230000003950 pathogenic mechanism Effects 0.000 description 1
- 235000019319 peptone Nutrition 0.000 description 1
- 102000013415 peroxidase activity proteins Human genes 0.000 description 1
- 108040007629 peroxidase activity proteins Proteins 0.000 description 1
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 description 1
- 108010064486 phenylalanyl-leucyl-valine Proteins 0.000 description 1
- 108010024607 phenylalanylalanine Proteins 0.000 description 1
- 108010083476 phenylalanyltryptophan Proteins 0.000 description 1
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 1
- 239000010452 phosphate Substances 0.000 description 1
- 230000004962 physiological condition Effects 0.000 description 1
- 238000002264 polyacrylamide gel electrophoresis Methods 0.000 description 1
- 238000003752 polymerase chain reaction Methods 0.000 description 1
- 229920001282 polysaccharide Polymers 0.000 description 1
- 239000005017 polysaccharide Substances 0.000 description 1
- 230000004481 post-translational protein modification Effects 0.000 description 1
- 238000001556 precipitation Methods 0.000 description 1
- 230000009465 prokaryotic expression Effects 0.000 description 1
- 230000035755 proliferation Effects 0.000 description 1
- 108010015796 prolylisoleucine Proteins 0.000 description 1
- 108010053725 prolylvaline Proteins 0.000 description 1
- 238000000159 protein binding assay Methods 0.000 description 1
- 238000002331 protein detection Methods 0.000 description 1
- 238000011002 quantification Methods 0.000 description 1
- 239000012429 reaction media Substances 0.000 description 1
- 238000003753 real-time PCR Methods 0.000 description 1
- 238000010188 recombinant method Methods 0.000 description 1
- 230000003362 replicative effect Effects 0.000 description 1
- 230000001177 retroviral effect Effects 0.000 description 1
- 238000010839 reverse transcription Methods 0.000 description 1
- 229920002477 rna polymer Polymers 0.000 description 1
- 231100000241 scar Toxicity 0.000 description 1
- 238000002821 scintillation proximity assay Methods 0.000 description 1
- 230000007017 scission Effects 0.000 description 1
- 238000012216 screening Methods 0.000 description 1
- 239000000243 solution Substances 0.000 description 1
- 238000003153 stable transfection Methods 0.000 description 1
- 238000012289 standard assay Methods 0.000 description 1
- 238000010561 standard procedure Methods 0.000 description 1
- 108010003641 statine renin inhibitory peptide Proteins 0.000 description 1
- 238000007619 statistical method Methods 0.000 description 1
- 239000005720 sucrose Substances 0.000 description 1
- CXVGEDCSTKKODG-UHFFFAOYSA-N sulisobenzone Chemical compound C1=C(S(O)(=O)=O)C(OC)=CC(O)=C1C(=O)C1=CC=CC=C1 CXVGEDCSTKKODG-UHFFFAOYSA-N 0.000 description 1
- 239000006228 supernatant Substances 0.000 description 1
- 230000004083 survival effect Effects 0.000 description 1
- 208000011580 syndromic disease Diseases 0.000 description 1
- 230000002194 synthesizing effect Effects 0.000 description 1
- 230000009885 systemic effect Effects 0.000 description 1
- 238000013518 transcription Methods 0.000 description 1
- 230000035897 transcription Effects 0.000 description 1
- 238000010361 transduction Methods 0.000 description 1
- 230000026683 transduction Effects 0.000 description 1
- QORWJWZARLRLPR-UHFFFAOYSA-H tricalcium bis(phosphate) Chemical compound [Ca+2].[Ca+2].[Ca+2].[O-]P([O-])([O-])=O.[O-]P([O-])([O-])=O QORWJWZARLRLPR-UHFFFAOYSA-H 0.000 description 1
- 239000012137 tryptone Substances 0.000 description 1
- 108010015666 tryptophyl-leucyl-glutamic acid Proteins 0.000 description 1
- 108010038745 tryptophylglycine Proteins 0.000 description 1
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 1
- 108010017949 tyrosyl-glycyl-glycine Proteins 0.000 description 1
- 108010051110 tyrosyl-lysine Proteins 0.000 description 1
- 108010078580 tyrosylleucine Proteins 0.000 description 1
- 230000002485 urinary effect Effects 0.000 description 1
- 239000004474 valine Substances 0.000 description 1
- YSGSDAIMSCVPHG-UHFFFAOYSA-N valyl-methionine Chemical compound CSCCC(C(O)=O)NC(=O)C(N)C(C)C YSGSDAIMSCVPHG-UHFFFAOYSA-N 0.000 description 1
- 108010021889 valylvaline Proteins 0.000 description 1
- 230000007923 virulence factor Effects 0.000 description 1
- 239000000304 virulence factor Substances 0.000 description 1
- 238000012800 visualization Methods 0.000 description 1
- 239000012138 yeast extract Substances 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/195—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
- C07K14/24—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Enterobacteriaceae (F), e.g. Citrobacter, Serratia, Proteus, Providencia, Morganella, Yersinia
- C07K14/245—Escherichia (G)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/02—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving viable microorganisms
- C12Q1/04—Determining presence or kind of microorganism; Use of selective media for testing antibiotics or bacteriocides; Compositions containing a chemical indicator therefor
- C12Q1/10—Enterobacteria
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/48—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving transferase
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6888—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
- C12Q1/689—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/158—Expression markers
Landscapes
- Chemical & Material Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Organic Chemistry (AREA)
- Health & Medical Sciences (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Engineering & Computer Science (AREA)
- Analytical Chemistry (AREA)
- Molecular Biology (AREA)
- Biochemistry (AREA)
- Genetics & Genomics (AREA)
- General Health & Medical Sciences (AREA)
- Biophysics (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Immunology (AREA)
- Biotechnology (AREA)
- Microbiology (AREA)
- Toxicology (AREA)
- Gastroenterology & Hepatology (AREA)
- Medicinal Chemistry (AREA)
- Peptides Or Proteins (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
The present invention is concerned with genetic and metabolic markers and with methods to identify pathogenic or potentially pathogenic strains of E. coli.
More particularly, the invention provides nucleotide and amino acid sequences, antibodies, probes, cells, kits and methods concerning genes expressed mostly by pathogenic strains of E. coli.
Description
GENETIC MARKERS, METABOLIC MARKERS, AND METHODS FOR EVALUATING PATHOGENICITY OF STRAINS OF E. COLI
BACKGROUND OF THE INVENTION
a) Field of the invention
The present invention is concerned with genetic and metabolic markers and with methods to identify pathogenic or potentially pathogenic strains of E. coli.
More particularly, the invention provides nucleotide and amino acid sequences, antibodies, probes, cells, kits and methods concerning genes expressed mostly by pathogenic strains of E. coli.
b) Brief description of the prior art
Escherichia coli is a heterogeneous species consisting of both enteric commensal and pathogenic strains. Different types of E. coli cause different diseases in a range of hosts, including extra-intestinal and enteric infections. For example, enteropathogenic E. coli (EPEC) is the leading cause of severe infantile diarrhea in developing countries, and enterohaemorrhagic E. coli (EHEC) (including the well-known O157:H7) has recently been shown to be the cause of bloody diarrhea and hemolytic-uremic syndrome in major food-borne outbreaks in the United States, Europe, and Asia (CMR 1998, 11:142).
Over the last five years, studies have been published on the E. coli chromosome. The whole genome sequence of the laboratory strain K-12 MG1655 was published in 1997 (Science 1997, 277:1453). The genome of E. coli O157:H7 (EHEC strain EDL933) was recently sequenced (Nature 2001, 409:529). Although comparative analysis of these sequences has resulted in the identification of virulence genes and the characterization of pathogenicity islands, the specific virulence regions associated with the pathogenesis of E. coli causing various diseases remain to be elucidated.
Recently, some of the present inventors identified, in the genome of S. enterica serovar Typhi, an operon of three genes (deoK operon) regulated by a repressor, DeoQ, and missing in E. coli K12 (J. Bacteriol., 2000, 182:869-873).
In E. coli strain AL862, sequences similar to the deoK operon have been sequenced (GenBank™ accession Nos. AF286670 and AF286671), but no function has been assigned to these sequences.
Furthermore, although the use of 2-deoxy-D-ribose by E. coli strains has been previously described (Br. J. Biomed. Sci., 1995; 52: 173), this property was never associated with the pathogenic status of the strains, and the genes encoding this function were not identified.
In view of the above, there is a need for methods, nucleic acid molecules, polypeptides, antibodies, vectors and cells useful for the identification of pathogenic strains of E. coli.
The present invention fulfils this need and also other needs, as will be apparent to those skilled in the art upon reading the following specification.
SUMMARY OF THE INVENTION
The present inventors have found that a sugar (deoxyribose) that is not fermented by E. coli K12 is metabolized by a large number of pathogenic isolates belonging to various pathotypes. The present inventors have identified the genes encoding this function and demonstrated that they are conserved among several pathogenic strains. The present inventors have also developed genetic and bacteriological assays to identify deoxyribose-positive E. coli strains.
In general, the invention features an isolated or purified nucleic acid molecule, such as genomic, cDNA, antisense, DNA, RNA or a synthetic nucleic acid molecule that encodes or corresponds to an E. coli deoK polypeptide.
According to a first aspect, the invention features isolated or purified nucleic acid molecules, polynucleotides, polypeptides, E. coli deoK proteins and fragments thereof. Preferred nucleic acid molecules consist of a DNA.
According to another aspect, the invention features a nucleotide probe.
According to another aspect, the invention features a purified antibody. In a preferred embodiment, the antibody is a monoclonal or a polyclonal antibody that specifically binds to an E. coli deoK protein and/or to a fragment thereof.
A further aspect of the invention relates to a method for evaluating pathogenicity of a strain of E. coli, comprising assaying a metabolic activity of the strain. Preferably the metabolic activity consists of metabolization of 2-Deoxy-D-ribose, and the capacity of the strain to metabolize 2-Deoxy-D-ribose is assessed.
In another aspect, the present invention further features a method for identifying a pathogenic or potentially pathogenic strain of E. coli. In a related aspect, the invention relates to a method for determining likelihood of pathogenicity of a strain of E. coli. In one embodiment, the method comprises detecting deoxyribokinase enzymatic activity of the strain. Preferably this is done by assaying, under suitable culture conditions, the capability of the strain to metabolize 2-Deoxy-D-ribose. In another embodiment, the method comprises assaying the E. coli strain for the presence of genes or proteins involved in metabolization of 2-Deoxy-D-ribose. According to the invention, the ability of E. coli strains to metabolize 2-Deoxy-D-ribose and/or the presence of genes or proteins involved in metabolization of 2-Deoxy-D-ribose is indicative that the strain of E. coli is pathogenic or potentially pathogenic. Of course, both aspects of the method may be carried out simultaneously or in parallel.
In another aspect, the present invention further features a method for identifying a pathogenic or a potentially pathogenic strain of E. coli, the method having a level of specificity of at least 30%, 40%, 45%, 46%, 47%, 48%, 49% or 50% for pathogenic E. coli from some clinical isolates. More preferably, the method detects less than 25%, 20%, 18%, 15% or 10% of commensal E. coli from healthy people.
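The percentages above can be illustrated with a minimal Python sketch. It assumes that the "level of specificity" is read as the percentage of pathogenic clinical isolates that score positive, and that the second figure is the percentage of commensal isolates that score positive. The function names, the default thresholds chosen from the listed values, and the example results are hypothetical and are not data from this specification.

```python
def detection_rates(pathogenic_results, commensal_results):
    """Return (% of pathogenic isolates positive, % of commensal isolates positive).

    Each argument is a sequence of booleans, True meaning the isolate
    scored positive in the assay (e.g. deoxyribose-positive).
    """
    pathogenic = list(pathogenic_results)
    commensal = list(commensal_results)
    if not pathogenic or not commensal:
        raise ValueError("both groups must contain at least one isolate")
    pct_pathogenic = 100.0 * sum(pathogenic) / len(pathogenic)
    pct_commensal = 100.0 * sum(commensal) / len(commensal)
    return pct_pathogenic, pct_commensal


def meets_targets(pct_pathogenic, pct_commensal,
                  min_pathogenic=50.0, max_commensal=25.0):
    """Check the two rates against thresholds taken from the ranges in the text."""
    return pct_pathogenic >= min_pathogenic and pct_commensal < max_commensal


# Made-up assay outcomes, for illustration only.
clinical = [True, True, False, False]
healthy = [True] + [False] * 9
rates = detection_rates(clinical, healthy)
```

The two rates are kept separate because the text states them separately: one for clinical isolates and one for commensal isolates from healthy people.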
In another related aspect, the invention features a kit for identifying a strain of E. coli or evaluating pathogenicity of a strain of E. coli, the kit preferably comprising an antibody or a probe as defined previously.
The present invention also features a method of treatment of E. coli infections.
One of the greatest advantages of the present invention is that it provides genetic and proteinic markers, antibodies, probes, kits and methods that can be used for identifying pathogenic strains of E. coli and/or for evaluating pathogenicity of a strain of E. coli, and eventually for treating or preventing E. coli infections.
Other objects and advantages of the present invention will be apparent upon reading the following non-restrictive description of the preferred embodiments thereof and from the claims.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a schematic illustrating the deoK operon in Escherichia coli.
Figure 2 represents the nucleic acid and amino acid sequences of the deoK operon in Escherichia coli - strain AL862. The underlined sequence corresponds to probe A and the double-underlined sequence corresponds to probe B. Bold nucleotides correspond to the primers used in the PCR assay to amplify probes A and B.
Figure 3 represents the nucleic acid and amino acid sequences of the deoK operon in Escherichia coli - strain 55989.
Figure 4 represents the nucleic acid sequence of Probe A.
Figure 5 represents the nucleic acid sequence of Probe B.
DETAILED DESCRIPTION OF THE INVENTION
A) Definitions
Throughout the text, the word "kilobase" is generally abbreviated as "kb", the words "deoxyribonucleic acid" as "DNA", the words "ribonucleic acid" as "RNA", the words "complementary DNA" as "cDNA", the words "polymerase chain reaction" as "PCR", and the words "reverse transcription" as "RT". Nucleotide sequences are written in the 5' to 3' orientation unless stated otherwise.
In order to provide an even clearer and more consistent understanding of the specification and the claims, including the scope given herein to such terms, the following definitions are provided:
Antisense: As used herein in reference to nucleic acids, means a nucleic acid sequence, regardless of length, that is complementary to the coding strand of a gene.
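As an illustration of this definition, the following minimal Python sketch computes the reverse complement of a DNA sequence written 5' to 3' and tests whether a candidate is complementary to a stretch of a coding strand. The function names and the example sequences are hypothetical; they are not taken from SEQ ID NO: 1 or 6.

```python
# Complement table for unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_antisense(candidate: str, coding_strand: str) -> bool:
    """True if `candidate` is complementary to some stretch of `coding_strand`.

    Both sequences are read 5' to 3'; an empty candidate is rejected.
    """
    if not candidate:
        return False
    return reverse_complement(candidate).upper() in coding_strand.upper()


# Made-up sequences, for illustration only.
example_coding = "TTATGCAA"
example_antisense = "GCAT"  # reverse complement of the internal stretch ATGC
```

Exact complementarity is used here for simplicity; hybridization in practice tolerates mismatches depending on stringency, which this sketch does not model.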
Expression: Refers to the process by which gene-encoded information is converted into the structures present and operating in the cell. In the case of cDNAs, cDNA fragments and genomic DNA fragments, the transcribed nucleic acid is subsequently translated into a peptide or a protein in order to carry out its function, if any. By "positioned for expression" is meant that the DNA molecule is positioned adjacent to a DNA sequence which directs transcription and translation of the sequence (i.e., facilitates the production of, e.g., a deoK polypeptide, a recombinant protein or an RNA molecule).
Fragment: Refers to a section of a molecule, such as a protein, a polypeptide or a nucleic acid, and is meant to refer to any portion of the amino acid or nucleotide sequence.
Host: A cell, tissue, organ or organism capable of providing cellular components for allowing the expression of an exogenous nucleic acid embedded into a vector. This term is intended to also include hosts which have been modified in order to accomplish these functions. Bacteria, fungi, animal (cells, tissues, or organisms) and plant (cells, tissues, or organisms) are examples of a host.
Isolated or Purified or Substantially pure: Means altered "by the hand of man" from its natural state, i.e., if it occurs in nature, it has been changed or removed from its original environment, or both. For example, a polynucleotide or a protein/peptide naturally present in a living organism is not "isolated", whereas the same polynucleotide separated from the coexisting materials of its natural state, or obtained by cloning, amplification and/or chemical synthesis, is "isolated" as the term is employed herein. Moreover, a polynucleotide or a protein/peptide that is introduced into an organism by transformation, genetic manipulation or by any other recombinant method is "isolated" even if it is still present in said organism.
Nucleic acid: Any DNA, RNA sequence or molecule having one nucleotide or more, including nucleotide sequences encoding a complete gene. The term is intended to encompass all nucleic acids whether occurring naturally or non-naturally in a particular cell, tissue or organism. This includes DNA and fragments thereof, RNA and fragments thereof, cDNAs and fragments thereof, expressed sequence tags, artificial sequences including randomized artificial sequences.
Open reading frame ("ORF"): The portion of a cDNA that is translated into a protein. Typically, an open reading frame starts with an initiator ATG codon and ends with a termination codon (TAA, TAG or TGA).
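The ORF definition above can be sketched in Python as a scan of the three forward reading frames for an ATG codon followed, in frame, by one of the three termination codons. This is a simplified illustration, not the computer analysis used by the inventors; the function name, the `min_codons` parameter and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 1):
    """Yield (start, end, sequence) for each ATG...stop ORF on the given strand.

    `start` is the 0-based index of the A of ATG, `end` is exclusive and
    includes the termination codon. `min_codons` counts the codons before
    the termination codon, including the initiator ATG.
    """
    dna = dna.upper()
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame initiator codon
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    yield start, i + 3, dna[start:i + 3]
                start = None  # resume the search after this stop codon


# Made-up sequence, for illustration only.
example_orfs = list(find_orfs("CCATGAAATAGCC"))
```

Only the given strand is scanned; a fuller search would also scan the reverse complement.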
Percent identity and Percent similarity: Used herein in comparisons among nucleic acid and/or amino acid sequences. Sequence identity is typically measured using sequence analysis software with the default parameters specified therein (e.g., Sequence Analysis Software Package of the Genetics Computer Group, University of Wisconsin Biotechnology Center, 1710 University Avenue, Madison, WI 53705). This software program matches similar sequences by assigning degrees of homology to various substitutions, deletions, and other modifications. Conservative substitutions typically include substitutions within the following groups: glycine, alanine, valine, isoleucine, leucine; aspartic acid, glutamic acid, asparagine, glutamine; serine, threonine; lysine, arginine; and phenylalanine, tyrosine.
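A minimal Python sketch of these two measures follows. It assumes the two amino acid sequences have already been aligned (equal length, with "-" marking gaps) and it uses the alignment length as the denominator. That is one common convention, not necessarily the one applied by the software cited above. The conservative substitution groups are those listed in the definition; the function names and example sequences are hypothetical.

```python
# Conservative substitution groups from the definition above (one-letter codes).
CONSERVATIVE_GROUPS = [set("GAVIL"), set("DENQ"), set("ST"), set("KR"), set("FY")]


def _similar(a: str, b: str) -> bool:
    """True if residues are identical or fall in the same conservative group."""
    return a == b or any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def percent_identity_similarity(aligned_a: str, aligned_b: str):
    """Return (percent identity, percent similarity) for two aligned sequences."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same non-zero length")
    columns = list(zip(aligned_a.upper(), aligned_b.upper()))
    identical = sum(a == b and a != "-" for a, b in columns)
    similar = sum(a != "-" and b != "-" and _similar(a, b) for a, b in columns)
    n = len(columns)  # assumption: gap columns count in the denominator
    return 100.0 * identical / n, 100.0 * similar / n


# Made-up aligned peptides, for illustration only.
example_scores = percent_identity_similarity("GAVK", "GLVR")
```

In the example, two of the four positions are identical, and the other two (A/L and K/R) are conservative substitutions, so similarity is higher than identity.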
Polypeptide or Protein: Means any chain of more than two amino acids, regardless of post-translational modification such as glycosylation or phosphorylation.
Potentially pathogenic: Refers to a strain which has the capacity to be involved in a pathogenic process. Examples of potentially pathogenic strains are extra-intestinal E. coli strains which are distinct from the commensal and from the intestinal pathogenic strains.
Specifically binds: Means an antibody that recognizes and binds a protein or polypeptide but that does not substantially recognize and bind other molecules in a sample, e.g., a biological sample, that naturally includes protein.
Substantially the same: Refers to nucleic acid or amino acid sequences having sequence variations that do not materially affect the nature of the protein.
With particular reference to nucleic acid sequences, the term "substantially the same" is intended to refer to the coding region and to conserved sequences governing expression, and refers primarily to degenerate codons encoding the same amino acid, or alternate codons encoding conservative substitute amino acids in the encoded polypeptide. With reference to amino acid sequences, the term "substantially the same" refers generally to conservative substitutions and/or variations in regions of the protein that are not involved in determination of structure or function of the protein. "Substantially the same" encompasses "degenerate variants" of nucleic acid or amino acid sequences.
Substantially pure polypeptide: Means a polypeptide that has been separated from the components that naturally accompany it. Typically, the polypeptide is substantially pure when it is at least 60%, by weight, free from the proteins and naturally-occurring organic molecules with which it is naturally associated. Preferably, the polypeptide is at least 75%, 80%, or 85%, more preferably at least 90%, 95% or 97% and most preferably at least 99%, by weight, pure. A substantially pure polypeptide or protein may be obtained, for example, by extraction from a natural source (including but not limited to E. coli), by expression of a recombinant nucleic acid encoding the polypeptide, or by chemically synthesizing the protein. Purity can be measured by any appropriate method, e.g., by column chromatography, polyacrylamide gel electrophoresis, or HPLC analysis.
A protein is substantially free of naturally associated components when it is separated from those contaminants which accompany it in its natural state.
Thus, a protein which is chemically synthesized or produced in a cellular system different from the cell from which it naturally originates will be substantially free from its naturally associated components. Accordingly, substantially pure polypeptides include those derived from eukaryotic organisms but synthesized in E. coli or other prokaryotes. By "substantially pure DNA" is meant DNA that is free of the genes which, in the naturally-occurring genome of the organism from which the DNA of the invention is derived, flank the gene. The term therefore includes, for example, a recombinant DNA which is incorporated into a vector; into an autonomously replicating plasmid or virus; or into the genomic DNA of a prokaryote or eukaryote; or which exists as a separate molecule (e.g., a cDNA or a genomic or cDNA fragment produced by PCR or restriction endonuclease digestion) independent of other sequences. It also includes a recombinant DNA which is part of a hybrid gene encoding an additional polypeptide sequence.
Transformed or Transfected or Transduced or Transgenic cell: Refers to a cell into which (or into an ancestor of which) has been introduced, by means of recombinant DNA techniques, an exogenous DNA molecule encoding a polypeptide of interest. By "transformation" is meant any method for introducing foreign molecules into a cell. Lipofection, calcium phosphate precipitation, retroviral delivery, electroporation, and ballistic transformation are just a few of the techniques which may be used.
Vector: A self-replicating RNA or DNA molecule which can be used to transfer an RNA or DNA segment from one organism to another. Vectors are particularly useful for manipulating genetic constructs, and different vectors may have properties particularly appropriate to express protein(s) in a recipient during cloning procedures and may comprise different selectable markers. Bacterial plasmids are commonly used vectors. Modified viruses such as adenoviruses and retroviruses are other examples of vectors.
B) General overview of the invention
The present inventors have shown that a sugar (deoxyribose) that is not fermented by E. coli K12 is metabolized by a large number of pathogenic isolates belonging to various pathotypes. The present inventors have also identified the genes encoding this function and demonstrated that they are conserved among several pathogenic strains. The present inventors have further developed genetic and bacteriological assays to identify deoxyribose-positive E. coli strains.
i) Cloning and molecular characterization of deoK operon in E. coli
As will be described hereinafter in the exemplification section of the invention, the inventors have discovered, cloned and sequenced the DNA encoding the deoK operon in two pathogenic strains of E. coli. The DNA sequences and the predicted amino acid sequences of the encoded proteins are shown in Figures 2 and 3. Computer analysis revealed four open reading frames (ORF), deoX, deoP, deoK, and deoQ, which mapped to the same loci as, and had sequences similar to, the deoX, deoP, deoK, and deoQ genes from the deoK operon from Salmonella, respectively (see Figure 1).
The function of deoP, deoK, and deoQ is known. These E. coli genes encode a putative 2-Deoxy-D-ribose permease, a deoxyribokinase and a putative repressor protein, respectively. The function of deoX remains to be elucidated. The deoX gene encodes a protein 337 amino acids (A.A.) long. In silico analysis indicates that the protein has the following features: a molecular weight of about 38 kDa; an isoelectric point of about 5.2; an instability index of about 45.4 (i.e. unstable); an aliphatic index of about 79.6; and a grand average of hydropathicity (GRAVY) of about -0.136.
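Two of the in silico indices named above can be sketched in Python under the assumption that they follow their conventional definitions: GRAVY as the mean Kyte-Doolittle hydropathy value per residue, and the aliphatic index as X(Ala) + 2.9 X(Val) + 3.9 [X(Ile) + X(Leu)], where X is a mole percentage. The specification does not state which tool was used, so this correspondence is an assumption, and the peptides in the example are made up rather than the DeoX sequence. The instability index is omitted because it requires a dipeptide weight table.

```python
# Kyte-Doolittle hydropathy scale (one-letter amino acid codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(protein: str) -> float:
    """Grand average of hydropathicity: mean hydropathy value per residue."""
    protein = protein.upper()
    if not protein:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in protein) / len(protein)


def aliphatic_index(protein: str) -> float:
    """Aliphatic index from the mole percentages of Ala, Val, Ile and Leu."""
    protein = protein.upper()
    if not protein:
        raise ValueError("empty sequence")

    def mole_percent(aa: str) -> float:
        return 100.0 * protein.count(aa) / len(protein)

    return (mole_percent("A") + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Made-up peptides, for illustration only.
example_gravy = gravy("AR")
example_aliphatic = aliphatic_index("AVIL")
```

A negative GRAVY value, such as the one reported for DeoX, indicates a protein that is slightly hydrophilic on average.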
ii) deoK homology with other genes and proteins
As shown in Table 1 in the exemplification section, a BLAST search indicates that the deoK operon in E. coli shares a high level of identity with the deoK operon in S. Typhi (about 75 to 80%).
Therefore, the present invention concerns an isolated or purified nucleic acid molecule (such as DNA) comprising a sequence selected from the group consisting of:
a) sequences provided in part or all of SEQ ID NO: 1 or 6;
b) complements of the sequences provided in part or all of SEQ ID NO: 1 or 6;
c) sequences consisting of at least 20 contiguous residues of a sequence provided in SEQ ID NO: 1 or 6;
d) sequences that hybridize to part or all of nucleic acids of SEQ ID NO: 1 or 6, under moderately, preferably highly, stringent conditions;
e) sequences having at least 80% identity to part or all of SEQ ID NO: 1 or 6;
f) degenerate variants of a sequence provided in part or all of SEQ ID NO: 1 or 6; and
g) sequences encoding part or all of polypeptides provided in SEQ ID NO: 2-5 and 7-10.
More preferably, the nucleic acid molecule of the invention comprises a sequence selected from the group consisting of:
a) a nucleotide sequence having at least 80%, 85%, 90%, 95% or 97% nucleotide sequence identity with SEQ ID NO: 1 or 6; and
b) a nucleotide sequence having at least 80%, 85%, 90%, 95% or 97% nucleotide sequence identity with a nucleic acid encoding an amino acid sequence of SEQ ID NO: 2-5 and 7-10.
More preferably, the nucleic acid molecule comprises a sequence substantially the same or having 100% identity with SEQ ID NO: 1 or 6, or a sequence substantially the same or having 100% identity with nucleic acids encoding an amino acid sequence of SEQ ID NO: 2-5 and 7-10.
The present invention also concerns isolated or purified nucleic acid molecules comprising a sequence encoding an E. coli polypeptide involved in metabolization of 2-Deoxy-D-ribose, or degenerate variants thereof, the E. coli polypeptide or degenerate variant comprising part or all of SEQ ID NO: 2-5 and 7-10.
The present invention also concerns isolated or purified nucleic acid molecules which hybridize under moderate, preferably high, stringency conditions with part or all of any of the nucleic acid molecules of the invention mentioned hereinbefore or with part or all of a complementary sequence thereof. The "hybridizing" nucleic acid could be used as a probe or as an antisense molecule, as will be described hereinafter.
In a related aspect, the present invention concerns an isolated or purified polypeptide or a protein comprising an amino acid sequence selected from the group consisting of:
a) sequences encoded by a nucleic acid as defined previously;
b) sequences having at least 80% identity to part or all of any of SEQ ID NO: 2-5 and 7-10;
c) sequences having at least 85% homology to part or all of any of SEQ ID NO: 2-5 and 7-10; and
d) sequences provided in part or all of any of SEQ ID NO: 2-5 and 7-10.
More preferably, the polypeptide comprises an amino acid sequence substantially the same or having 100% identity with any of SEQ ID NO: 2-5 and 7-10. Most preferred polypeptides are those having a biological activity that permits E. coli to metabolize 2-Deoxy-D-ribose.
iii) Anti-deoK antibodies
The invention features purified antibodies that specifically bind to a protein encoded by the E. coli deoK operon. The antibodies of the invention may be prepared by a variety of methods using the deoK proteins or polypeptides described above. For example, the deoK polypeptide, or antigenic fragments thereof, may be administered to an animal in order to induce the production of polyclonal antibodies. Alternatively, antibodies used as described herein may be monoclonal antibodies, which are prepared using hybridoma technology (see, e.g., Hammerling et al., In Monoclonal Antibodies and T-Cell Hybridomas, Elsevier, NY, 1981).
The invention features antibodies that specifically bind E. coli deoK operon polypeptides, or fragments thereof. In particular, the invention features "neutralizing" antibodies. By "neutralizing" antibodies is meant antibodies that interfere with any of the biological activities of any of the E. coli deoK operon polypeptides, particularly the ability of E. coli to metabolize 2-Deoxy-D-ribose. The neutralizing antibody may reduce the ability of E. coli deoK proteins to metabolize 2-Deoxy-D-ribose by, preferably, 50%, more preferably by 70%, and most preferably by 90% or more. Any standard assay of 2-Deoxy-D-ribose metabolization, including those described herein, may be used to assess potentially neutralizing antibodies. Once produced, monoclonal and polyclonal antibodies are preferably tested for specific recognition of deoK proteins by Western blot, immunoprecipitation analysis or any other suitable method.
In addition to intact monoclonal and polyclonal anti-deoK antibodies, the invention features various genetically engineered antibodies, humanized antibodies, and antibody fragments, including F(ab')2, Fab', Fab, Fv and sFv fragments. Antibodies can be humanized by methods known in the art. Fully human antibodies, such as those expressed in transgenic animals, are also features of the invention.
Antibodies that specifically recognize deoK proteins (or fragments thereof), such as those described herein, are considered useful to the invention. Such an antibody may be used in any standard immunodetection method for the detection, quantification, and purification of deoK proteins. The antibody may be a monoclonal or a polyclonal antibody and may be modified for diagnostic purposes.
The antibodies of the invention may, for example, be used in an immunoassay to monitor deoK expression levels, to determine the subcellular location of a deoK or deoK fragment produced by E. coli, to determine the amount of deoK or fragment thereof in a biological sample, and to evaluate the pathogenicity of a strain of E. coli.
In addition, the antibodies may be coupled to compounds for diagnostic and/or therapeutic uses, such as gold particles, alkaline phosphatase or peroxidase, for imaging and therapy. The antibodies may also be labeled (e.g. immunofluorescence) for easier detection.
iv) Identification of E. coli pathogenic strains
According to the present invention, the ability of the E. coli strain to metabolize 2-Deoxy-D-ribose and/or the presence of genes or proteins involved in metabolization of 2-Deoxy-D-ribose in the E. coli strain is indicative that this strain is pathogenic or at least potentially pathogenic.
Therefore, the invention provides a method for evaluating pathogenicity of a strain of E. coli comprising assaying a metabolic activity of that strain.
Preferably, the metabolic activity consists of metabolization of 2-Deoxy-D-ribose and the assessment step consists of growing the strain on a minimal medium comprising 2-Deoxy-D-ribose as a sole source of carbon.
The antibodies described above and probes described hereinafter may be used to monitor deoK protein expression and/or to identify a pathogenic strain of E. coli in a biological sample or in a human or an animal subject. Accordingly, the invention provides a method for identifying a pathogenic strain of E. coli and/or for evaluating likelihood of pathogenicity of a strain of E. coli as compared to a commensal strain.
According to a first embodiment, the method comprises assaying the E. coli strain for the presence of genes or proteins involved in metabolization of 2-Deoxy-D-ribose. Preferably, oligonucleotides such as probes, or cloned nucleotide (RNA or DNA) fragments corresponding to unique portions of genes and proteins from the deoK operon, are used to assess cellular levels of deoK proteins or to detect deoK mRNAs (both indicative of E. coli pathogenicity). Such an assessment may also be done in vitro using well-known methods (Northern analysis, PCR, quantitative PCR, microarrays, etc.). The methods of the invention may be carried out by contacting, in vitro or in vivo, an E. coli isolate or a biological sample (such as urine, feces, blood or cerebrospinal fluid) from an individual or an animal suspected of harboring pathogenic E. coli, or an extract thereof, with an anti-deoK antibody or a probe according to the invention, in order to determine the presence or evaluate the amount of deoK proteins or genes in the sample or the cells therein.
According to a preferred embodiment, the method comprises assessment of the E. coli strain for the presence of a nucleic acid sequence selected from the group consisting of:
a) sequences provided in part or all of SEQ ID NO: 1 or 6;
b) complements of the sequences provided in part or all of SEQ ID NO: 1 or 6;
c) sequences consisting of at least 20 contiguous residues of a sequence provided in SEQ ID NO: 1 or 6;
d) sequences that hybridize to part or all of nucleic acids of SEQ ID NO: 1 or 6, under moderately, preferably highly, stringent conditions;
e) sequences having at least 80% identity to part or all of SEQ ID NO: 1 or 6;
f) degenerate variants of a sequence provided in part or all of SEQ ID NO: 1 or 6; and
g) sequences encoding part or all of polypeptides provided in SEQ ID NO: 2-5 and 7-10.
According to another preferred embodiment, the method comprises assessment of the E. coli strain for the presence of a polypeptide comprising an amino acid sequence selected from the group consisting of:
a) sequences encoded by a nucleic acid as defined in claim 7;
b) sequences having at least 80% identity to part or all of any of SEQ ID NO: 2-5 and 7-10;
c) sequences having at least 85% homology to part or all of any of SEQ ID NO: 2-5 and 7-10; and
d) sequences provided in part or all of any of SEQ ID NO: 2-5 and 7-10.
Accordingly, the invention encompasses nucleotide probes comprising a sequence of at least 15, 20, 25, 30, 40, 50, 75, 100 or more sequential nucleotides of SEQ ID NO: 1 or 6, or of a sequence complementary to SEQ ID NO: 1 or 6.
More preferably, the probe consists of SEQ ID NO: 11 or 12.
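The length criterion for probes can be illustrated with a short Python sketch. It tests whether a candidate probe contains at least a given number of sequential nucleotides that occur either in a reference sequence or in its complementary strand. The function names are hypothetical and the reference string is a made-up placeholder; the actual SEQ ID NO: 1, 6, 11 and 12 sequences are not reproduced here.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of `seq`, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def has_contiguous_match(probe: str, reference: str, min_len: int = 15) -> bool:
    """True if `probe` contains >= min_len sequential nucleotides found in
    `reference` or in the strand complementary to `reference`."""
    probe = probe.upper()
    reference = reference.upper()
    if min_len < 1 or len(probe) < min_len:
        return False
    targets = (reference, reverse_complement(reference))
    for i in range(len(probe) - min_len + 1):
        window = probe[i:i + min_len]  # every stretch of exactly min_len bases
        if any(window in target for target in targets):
            return True
    return False


# Made-up placeholder reference, for illustration only.
example_reference = "ATGGCGTACCTGAAAGGCTTAACGGATCC"
example_probe = example_reference[5:25]  # 20 sequential nucleotides
```

Checking windows of exactly `min_len` bases is sufficient, because any longer shared stretch necessarily contains a shared stretch of that length.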
Of course, it may be preferable to further assay the presence (or absence) of other genes/proteins in order to increase sensitivity and/or specificity of the method.
According to another embodiment, the method for identifying a pathogenic strain of E. coli comprises detecting deoxyribokinase enzymatic activity of the strain. Preferably this is done by assaying, under suitable culture conditions, the capabilities of the strain to metabolize 2-Deoxy-D-ribose. This may be achieved by grow'ng in vitro an E. coli isolate or a biological sample suspected of harboring pathogenic E. coli on a minimal medium comprising 2-Deoxy-D-ribose as a sole source of carbon and evaluating bacteria growth and survival in that medium.
Preferably, the minimal medium comprises from about 0.01 % 2-Deoxy-D-ribose and the bacteria are cultured in the minimal medium for about 24h to about 48h.
Assay kits for determining the amount of deoK genes and proteins in a sample and/or for identifying a pathogenic strain of E. coli are also within the scope of the present invention. According to one embodiment, such a kit would preferably comprise anti-deoK antibody(ies) or probe(s) according to the invention and other element(s) such as instructions for using the kit, assay tubes, enzymes, reagents or reaction buffer(s). In another embodiment, the kit would comprise means for assaying the capability of a strain of E. coli to metabolize 2-Deoxy-D-ribose.
A non-limitative example of use for the methods, kits and probes of the invention is the detection of pathogenic or potentially pathogenic E. coli bacteria in food which may be contaminated by E. coli.
v) Downmodulation of deoK proteins expression
As mentioned previously, expression of proteins of the deoK operon allows E. coli to metabolize 2-Deoxy-D-ribose. Modulation of deoK may be useful.
More particularly, downmodulation of deoK proteins could be used to prevent and/or treat E. coli infections. Therefore, the invention also relates to methods for preventing or treating E. coli infections comprising downmodulating expression or biological activity of deoK proteins or genes. This may be achieved by administering a molecule or compound having such property.
vi) Vectors and Cells
The invention is also directed to a host, such as a genetically modified cell, comprising any of the nucleic acid sequences according to the invention and more preferably, a host capable of expressing the peptide/protein encoded by this nucleic acid.
The host cell may be any type of cell (a transiently-transfected mammalian cell line, an isolated primary cell, or a bacterium such as E. coli). More preferably the host is an Escherichia coli bacterium selected from the Escherichia coli bacteria filed on May 14, 2002 at the CNCM under accession numbers I-2867 and I-2867.
A number of vectors suitable for stable transfection of mammalian cells and bacteria are available to the public (e.g. plasmids, adenoviruses, adeno-associated viruses, retroviruses, Herpes Simplex Viruses, Alphaviruses, Lentiviruses), as are methods for constructing such cell lines. The present invention encompasses any type of vector comprising any of the nucleic acid molecule of the invention and more particularly the vectors capable of directing expression of the peptide encoded by such nucleic acid in a vector-containing cell.
The cells of the invention may be particularly useful for diagnostic purposes and for drug screening (for instance, by measuring the effect of a compound on expression or activity levels of deoK genes or proteins).
vii) Synthesis of E. coli deoK proteins and functional derivatives thereof
Knowledge of the E. coli deoK operon gene sequences opens the door to a series of applications. For instance, the characteristics of the cloned E. coli deoK gene sequences may be analyzed by introducing the sequences into various cell types or by using in vitro extracellular systems. The function of E. coli deoK genes may then be examined under different physiological conditions. The deoK cDNA sequences may be manipulated in studies to understand the expression of the gene and gene product. Alternatively, cell lines may be produced which overexpress the gene product, allowing purification of deoK proteins for biochemical characterization, large-scale production, antibody production, and patient therapy.
For protein expression, eukaryotic and prokaryotic expression systems may be generated in which the deoK operon gene sequences are introduced into a plasmid or other vector which is then introduced into living cells. Constructs in which the deoK cDNA sequences containing the entire open reading frame are inserted in the correct orientation into an expression plasmid may be used for protein expression. Alternatively, portions of the sequence, including wild-type or mutant deoK sequences, may be inserted. Prokaryotic and eukaryotic expression systems allow various important functional domains of the protein to be recovered as fusion proteins and then used for binding, structural and functional studies and also for the generation of appropriate antibodies. The deoK DNA sequences may be altered by using procedures such as restriction enzyme digestion, DNA polymerase fill-in, exonuclease deletion, terminal deoxynucleotide transferase extension, ligation of synthetic or cloned DNA sequences and site-directed sequence alteration using specific oligonucleotides together with PCR.
Accordingly, the invention also concerns a method for producing a polypeptide involved in E. coli metabolization of 2-Deoxy-D-ribose. The method comprises the steps of: (i) providing a cell transformed with a nucleic acid sequence encoding the polypeptide positioned for expression in the cell; (ii) culturing the transformed cell under conditions suitable for expressing the nucleic acid; (iii) producing the polypeptide; and optionally, (iv) recovering the polypeptide produced.
Once the recombinant protein is expressed, it is isolated by, for example, affinity chromatography. In one example, an anti-deoK polypeptide antibody, which may be produced by the methods described herein, can be attached to a column and used to isolate the deoK proteins. Lysis and fractionation of deoK-harboring cells prior to affinity chromatography may be performed by standard methods.
Once isolated, the recombinant protein can, if desired, be purified further.
Methods and techniques for expressing recombinant proteins and foreign sequences in prokaryotes and eukaryotes are well-known in the art and will not be described in more detail. One can refer, if necessary, to Joseph Sambrook and David W. Russell, Molecular Cloning: A Laboratory Manual, 2001, Cold Spring Harbor Laboratory Press. Those skilled in the art of molecular biology will understand that a wide variety of expression systems may be used to produce the recombinant protein. The precise host cell used is not critical to the invention. The deoK proteins may be produced in a prokaryotic host (e.g., E. coli) or in a eukaryotic host. These cells are publicly available, for example, from the American Type Culture Collection, Rockville, MD. The method of transduction and the choice of expression vehicle will depend on the host system selected.
Polypeptides of the invention, particularly short deoK fragments, may also be produced by chemical synthesis. These general techniques of polypeptide expression and purification can also be used to produce and isolate useful deoK
fragments or analogs, as described herein.
Skilled artisans will recognize that a deoK polypeptide, or a fragment thereof (as described herein), may serve for various purposes, in diagnostic kits and methods, and for the obtaining of anti-deoK antibodies for instance.
viii) Identification of Molecules that Modulate deoK Proteins Expression
deoK cDNAs may be used to facilitate the identification of molecules that increase or decrease deoK gene expression. In one approach, candidate molecules are added, in varying concentrations, to the culture medium of cells expressing deoK mRNA. deoK expression (or the capability of the cell to metabolize 2-Deoxy-D-ribose) is then measured, for example, by Northern blot analysis using a deoK cDNA, or cDNA or RNA fragment, as a hybridization probe. The level of deoK expression (or cell metabolizing activity) in the presence of the candidate molecule is compared to the level of deoK expression (or cell metabolizing activity) in the absence of the candidate molecule, all other factors (e.g. cell type and culture conditions) being equal.
Compounds that modulate the level of deoK expression (or cell metabolizing activity) may be purified, or substantially purified, or may be one component of a mixture of compounds such as an extract or supernatant obtained from cells. In an assay of a mixture of compounds, deoK expression (or cell metabolizing activity) is tested against progressively smaller subsets of the compound pool (e.g., produced by standard purification techniques such as HPLC
or FPLC) until a single compound or minimal number of effective compounds is demonstrated to modulate deoK expression (or cell metabolizing activity).
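The pool-splitting strategy described above can be sketched in Python as a recursive search. This is only a sketch under stated assumptions: `pool_is_active` is a hypothetical stand-in for the wet-lab assay (it returns True when a pool modulates deoK expression or cell metabolizing activity), and a pool is assumed to be active whenever it contains at least one active compound. Real fractionation by HPLC or FPLC would not split pools this cleanly.

```python
# Minimal sketch of testing progressively smaller subsets of a compound pool.
# pool_is_active is a hypothetical callback standing in for the actual assay.
from typing import Callable, List, Sequence


def deconvolve_pool(
    compounds: Sequence[str],
    pool_is_active: Callable[[Sequence[str]], bool],
) -> List[str]:
    """Return the compounds that are individually active, found by
    repeatedly halving every pool that still shows activity."""
    if not pool_is_active(compounds):
        return []                      # inactive pool: discard it whole
    if len(compounds) == 1:
        return [compounds[0]]          # single active compound isolated
    mid = len(compounds) // 2
    return (deconvolve_pool(compounds[:mid], pool_is_active)
            + deconvolve_pool(compounds[mid:], pool_is_active))


def make_mock_assay(active: set) -> Callable[[Sequence[str]], bool]:
    """Mock assay for the example: a pool is active if it contains
    any compound from the (normally unknown) active set."""
    return lambda pool: any(c in active for c in pool)


if __name__ == "__main__":
    library = [f"cmpd{i}" for i in range(8)]   # hypothetical compound names
    print(deconvolve_pool(library, make_mock_assay({"cmpd5"})))
```

Halving keeps the number of assays roughly proportional to the logarithm of the pool size when a single compound is active, which is the practical reason for testing subsets rather than every compound individually.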
The effect of candidate molecules on deoK-biological activity may, instead, be measured at the level of translation by using the general approach described above with standard protein detection techniques, such as Western blotting or immunoprecipitation with a deoK-specific antibody (for example, the anti-deoK
antibody described herein).
Another method for detecting compounds that modulate the activity of deoK
is to screen for compounds that interact physically with a given deoK
polypeptide.
Depending on the nature of the compounds to be tested, the binding interaction may be measured using methods such as enzyme-linked immunosorbent assays (ELISA), filter binding assays, FRET assays, scintillation proximity assays, microscopic visualization, immunostaining of the cells, in situ hybridization, PCR, etc.
A molecule that decreases deoK activity is considered particularly useful to the invention; such a molecule may be used, for example, as a therapeutic to decrease and/or block proliferation of pathogenic bacteria (see section (v) hereinbefore).
Molecules that are found, by the methods described above, to effectively modulate deoK gene expression or polypeptide activity, may be tested further in animal models. If they continue to function successfully in an in vivo setting, they may be used as therapeutics to prevent or treat bacterial infections.
EXAMPLES
The following examples are illustrative of the wide range of applicability of the present invention and are not intended to limit its scope. Modifications and variations can be made therein without departing from the spirit and scope of the invention. Although any method and material similar or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred methods and materials are described.
EXAMPLE 1: Cloning and expression of deoxyribose-catalyzing genes in E. coli strains.
Introduction
Escherichia coli is a heterogeneous species consisting of both enteric commensal and pathogenic strains. Different types of E. coli cause different diseases in a range of hosts, including extra-intestinal and enteric infections.
Extra-intestinal infections due to E. coli are common across age groups and can involve almost any organ or anatomical site. Typical extra-intestinal infections include urinary tract infection (UTI), meningitis (mostly in neonates and after neurosurgery), diverse intra-abdominal infections, pneumonia (particularly in hospitalized and institutionalized patients), intravascular-device infection, osteomyelitis, and soft-tissue infection, which usually occurs when the tissue is compromised. Bacteremia can accompany infection at any of these sites (JID 2000, 181:1753; JID 2001, 183:596). In 1999, extra-intestinal pathogenic E. coli strains were the most frequently isolated organisms in US patients receiving antimicrobials (JAMA, 2001, 285:1565). Bacterial UTIs are second in incidence only to respiratory infections. E. coli accounts for up to 90 % of all UTIs in non-hospitalized patients (5th ed. Williams & Wilkins, Baltimore, Md. 1997). Between 85 and 95 % of cases of uncomplicated cystitis in pre-menopausal women are due to E. coli strains; they represent 150-300 million cases per year worldwide (estimated direct cost of $6 billion per year in the US) (JID 2001; 183:51). In the US, there are at least 250,000 cases of uncomplicated pyelonephritis per year, leading to 100,000 hospitalizations at an estimated cost of $175 million per year (JAMA, 2001; 283:1583). E. coli is responsible for one third of all cases of neonatal meningitis, with an incidence rate of 0.1 per 1,000 live births (JAC 1994, 34 (suppl. A):61). The extra-intestinal E. coli strains are epidemiologically and phylogenetically distinct from both the commensal and the intestinal pathogenic strains; they appear to be incapable of causing enteric disease, but they can stably colonize the host intestinal tract. In contrast, intestinal pathogenic strains of E. coli are rarely encountered in the fecal flora of healthy hosts and, instead, appear to be essentially obligate pathogens, causing gastroenteritis or colitis when ingested in sufficient quantities by a naive host. Various pathotypes of E. coli are responsible for significant worldwide diarrheal disease (to date, six have been well characterized). For example, enteropathogenic E. coli (EPEC) are the leading cause of severe infantile diarrhea in developing countries, and enterohaemorrhagic E. coli (EHEC) (including the well-known O157:H7) have recently been shown to be the cause of bloody diarrhea and hemolytic-uremic syndrome in major food-borne outbreaks in the United States, Europe, and Asia (CMR 1998, 11:142). Although there is some overlap between certain diarrhoeagenic pathotypes with respect to virulence traits, each pathotype possesses a unique combination of virulence traits that results in a distinctive pathogenic mechanism. Recent studies have identified other categories of pathogenic E. coli, such as strains isolated from diarrhoeagenic stools of HIV-positive patients, and E. coli that were abnormally predominant in early and chronic ileal lesions of patients with Crohn's disease.
Knowledge of the pathogenic or non-pathogenic status of an isolate may be of use to clinicians for diagnosis, especially in cases of opportunistic pathogens.
Isolation of an E. coli strain from a clinical specimen does not, by itself, confer the designation of pathogenic isolate, since commensal strains of E. coli can cause infections (in particular extraintestinal infections) when the host is compromised.
However, no single virulence factor is limited to (or absolutely required for) infection at any one given site or for any particular syndrome. Consequently, multiple phenotypic and genotypic assays are necessary to identify the pathotype of clinical isolates. The aim was to identify genes encoding functions that are conserved in all pathogenic strains but are absent in commensal E. coli and to use these data to develop new diagnostic and therapeutic tools.
Over the last five years, studies have been published on the E. coli chromosome. The whole genome sequence of the laboratory strain K-12 MG1655 was published in 1997 (Science 1997, 277:1453), and the size of the E. coli chromosome was shown to vary from 4.5 to 5.5 megabases (Mb) (IAI 1999, 19:230). Comparative restriction mapping among the chromosomes of E. coli K-12, newborn sepsis-associated strain RS218, and uropathogenic strain J96 showed that the overall gene order is conserved in the three strains, that large accessory segments (some carrying virulence genes) are unique to the chromosome of pathogenic strains, and that some segments are only absent from the chromosome of pathogenic strains (IAI 1999, 19:230). Comparison of the E. coli K-12 genome and those of different pathogenic E. coli allowed us to identify the major differences. The genome of E. coli O157:H7 (EHEC strain EDL933) was recently sequenced (Nature 2001, 409:529). Comparison with the E. coli K-12 reference strain genome confirmed that the two chromosomes share a common 4.1 Mb 'backbone' sequence, and lineage-specific segments (specific islands) were found throughout both genomes in clusters of up to 88 kilobases. Roughly 26% of the EDL933 genome lies completely within these specific islands, and 33% of these contain genes of unknown function. The Genome Center of Wisconsin is currently sequencing the genome of the newborn sepsis-associated strain RS218, the uropathogenic strain CFT073 and three strains belonging to different pathotypes of diarrhoeagenic E. coli [enterotoxigenic E. coli (ETEC), EPEC, and enteroaggregative E. coli (EAEC)] (http://genome.wisc.edu). It will probably take several years before information from the comparison of the pathogenic specific islands of various pathogenic E. coli isolates becomes available.
Most studies on pathogenic E. coli strains concern the identification of specific virulence regions associated with the pathogenesis of E. coli causing various diseases. Virulence genes have been identified, and pathogenicity islands have been characterized and sequenced. The first studies that investigated the relationship between groups of pathogenic and non-pathogenic E. coli strains were based on multilocus enzyme electrophoresis analysis (IAI 1997, 65:2685) and sequencing of housekeeping genes (Nature 2000, 406:64). They suggested that pathogenic isolates do not have a single evolutionary origin within E. coli but that they arose many times and that the high virulence of clones is a recent, derived state resulting from the acquisition of virulence genes rather than an ancestral condition of primitive E. coli.
E. coli strains expressing the K1 polysaccharide colonize the large intestine of newborn infants and are the leading cause of gram-negative septicaemia and meningitis during the neonatal period. A recent study used signature-tagged mutagenesis to identify E. coli K1 genes that are required for colonization of the gastrointestinal tract, which is one of the initial steps in the development of enteric, urinary and systemic infections caused by E. coli (MM 2000, 37:1293). One of these genes is absent from the genome of E. coli K-12, although related sequences have been found in some representative pathogenic strains (uropathogenic E. coli, EAEC, and EPEC). The sequence of this gene is not available. These data strongly suggest that common (or strongly related) sequences that are absent from the genome of commensal E. coli are present in all pathogenic E. coli strains.
A comparative analysis of metabolic functions expressed by pathogenic and commensal strains of E. coli was developed. The inventors showed that a sugar (deoxyribose) that is not fermented by E. coli K12, is metabolized by a large number of pathogenic isolates belonging to various pathotypes. The inventors identified the genes encoding this function and demonstrated that they are conserved among several pathogenic strains. They have developed genetic and bacteriological assays to identify deoxyribose-positive E. coli strains.
Materials and Methods
Bacterial strains, cosmids, and culture conditions
E. coli K-12/MG1655 (Blattner et al., 1997, Science 277:1453-1474) was used as a host for maintaining cosmid clones.
E. coli strains were routinely grown in Luria broth with glucose (10 g of tryptone, 5 g of yeast extract, and 5 g of NaCl per liter [pH 7.0]) or on Luria agar plates (containing 1.5 % agar) at 37°C. E. coli strains harboring cosmid clones were grown with 100 µg of carbenicillin per ml.
Collections of human commensal and pathogenic E. coli strains were used in this study. One hundred fifteen E. coli strains were isolated from blood cultures from cancer patients. These strains were previously partially characterized (J. Clin. Microbiol., 2001, 30:1738; Infect. Immun., 2000, 68:3983). One hundred E. coli strains were isolated from urine specimens from patients (children and adults) clinically diagnosed with pyelonephritis. They were previously partially characterized and were of various geographical origins (France, USA, Romania).
Thirty-six isolates were from urine specimens from patients with cystitis. They were isolated in Romania and the USA. Twenty-five strains were from the stools of patients with CD4 lymphocyte counts <400 cells/mm3 presenting persistent diarrhea.
Eleven isolates were from diarrhoeagenic stools of children in Brazil. Commensal E. coli strains were isolated from the normal flora of healthy people in France, Romania, Senegal (children), and Central African Republic.
Expression of deoxyribose-catalyzing genes by E. coli strains.
The capacity of bacteria to grow on a minimal medium (K5) (J Bacteriol 1971, 108:639) supplemented with 0.1 % 2-Deoxy-D-ribose as sole source of carbon was tested by inoculating agar plates with a bacterial suspension and incubating the plates at 37°C for 24 and 48 h. The plates were inoculated with a loop from a 1 ml bacterial suspension (in water) prepared with a loop of bacteria grown on LB agar plates.
The fermentation (Méthodes de laboratoire pour l'identification des entérobactéries, Le Minor et Richard, Institut Pasteur, p 169) of 2-Deoxy-D-ribose by E. coli strains was tested as follows: a drop (15 µl) of an overnight culture in LB broth was inoculated in 3 ml of peptone water containing 1.5% (v/v) of bromothymol blue and 1 % (w/v) of 2-Deoxy-D-ribose in a 12 x 120 mm glass tube.
The suspension was incubated for 24 h at 37°C without shaking.
Activity assay: 2-Deoxy-D-ribose is phosphorylated by deoxyribokinase to deoxyribose-5-phosphate, which is subsequently cleaved to acetaldehyde and glyceraldehyde-3-phosphate by deoxyribose-5P aldolase, also called phosphopentose aldolase. Deoxyribose-5P aldolase activity was determined by coupling deoxyribose-5P cleavage to NADH oxidation using glycerophosphate dehydrogenase and triosephosphate isomerase as coupling enzymes. The reaction medium (0.5 ml final volume) contained 50 mM Tris-HCl (pH 7.4), 0.2 mM NADH, and 9 U and 3 U of glycerophosphate dehydrogenase and triosephosphate isomerase, respectively. The reaction was started with crude material extract followed by 1 mM deoxyribose-5-phosphate, then the absorption decrease at 334 nm was monitored with an Eppendorf™ PCP6121 photometer thermostated at 30°C. One unit of deoxyribose-5P aldolase corresponds to 1 mole of product formed per minute.
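The conversion from the recorded absorbance decrease to a specific activity in U/mg can be sketched as follows. This is a sketch under assumptions that the text does not state: a millimolar extinction coefficient for NADH at 334 nm of 6.18 mM^-1 cm^-1, a 1 cm light path, one NADH oxidized per deoxyribose-5-phosphate cleaved, and the conventional enzyme unit of 1 micromole of product per minute. The function name and the example readings are hypothetical.

```python
# Minimal sketch: specific activity (U/mg) from a coupled NADH-oxidation assay.
# Assumptions (not given in the text): extinction coefficient of NADH at
# 334 nm = 6.18 mM^-1 cm^-1, 1 cm light path, 1 NADH per substrate cleaved,
# and 1 U = 1 micromole of product per minute.

NADH_EXT_334_MM_CM = 6.18  # assumed extinction coefficient, mM^-1 cm^-1


def specific_activity_u_per_mg(
    delta_a_per_min: float,       # magnitude of the A334 decrease per minute
    protein_mg: float,            # protein added to the cuvette, in mg
    volume_ml: float = 0.5,       # final reaction volume stated in the text
    path_cm: float = 1.0,         # assumed light path
    ext_mm_cm: float = NADH_EXT_334_MM_CM,
) -> float:
    """Convert an absorbance slope into micromoles per minute per mg protein."""
    if protein_mg <= 0:
        raise ValueError("protein_mg must be positive")
    rate_mm_per_min = delta_a_per_min / (ext_mm_cm * path_cm)  # mM/min
    units = rate_mm_per_min * volume_ml   # mM * ml = micromoles, per minute
    return units / protein_mg


if __name__ == "__main__":
    # Hypothetical reading: 0.0618 absorbance units per minute, 10 ug protein.
    print(specific_activity_u_per_mg(0.0618, 0.010))
```

Any blank rate measured before the substrate is added would be subtracted from `delta_a_per_min` before the conversion.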
DNA analysis and genetic techniques.
Cosmid libraries were previously constructed from the genomic DNA of E. coli AL862, isolated from the blood of a cancer patient (IAI, 2001; 69:937), and of E. coli 55989, isolated from the stools of a patient with persistent diarrhea (C. Bernier, P. Gounon, and C. Le Bouguenec, In press, IAI August 2002). Sau3A restriction fragments (35 to 50 kb) were sized on a sucrose gradient and ligated to BamHI-digested and alkaline phosphatase-treated DNA of the cosmid vector pHC79 (Collins J, 1979, Methods Enzymol., 68:309-326). The recombinant cosmids pILL1272 and pILL1287 resulted from cloning of DNA from the AL862 and 55989 strains, respectively.
Recombinant cosmids were routinely isolated by alkaline lysis. The sequences of the primers to amplify probe A (GenBank™ AF286671) and probe B (GenBank™ AF286670) were derived from the partial sequence of PAI I(AL862) (IAI, 2001, 69:937, and Erratum in IAI June 2002). The sequences of the primers to amplify probe A were 5'-ATCAGATGCCTAAAGAAGGAGAAAC-3' and 5'-CAATACTCGGATAAGATGATTGC-3' and the size of the amplicon was 831 bp (see Figure 4; SEQ ID NO: 11). The sequences of the primers to amplify probe B were 5'-GGACGATAATGTGATCGTCTATAAG-3' and 5'-GTGGAAGATACTCATCTGCTACACG-3' and the size of the amplicon was 816 bp (see Figure 5; SEQ ID NO: 12). The cycling conditions were initial denaturation at 95°C for 5 min followed by 30 cycles at 95°C for 30 s, 60°C or 65°C (for amplification of probe A and probe B, respectively) for 30 s, and 72°C for 1 min.
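The cycling conditions above can be written out as a simple data structure, which makes the probe-specific annealing temperature and the total run time explicit. The sketch below is illustrative only: the class and function names are hypothetical, ramping time between temperatures is ignored, and no final extension step is added because none is described.

```python
# Minimal sketch of the PCR program described above as a list of steps.
# Names are hypothetical; ramp times are ignored.
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


# Annealing temperature per probe, as stated in the text.
ANNEALING_C = {"A": 60.0, "B": 65.0}


def pcr_program(probe: str, cycles: int = 30) -> List[Step]:
    """Build the step list for amplifying probe 'A' or probe 'B'."""
    steps = [Step("initial denaturation", 95.0, 5 * 60)]
    for _ in range(cycles):
        steps.append(Step("denaturation", 95.0, 30))
        steps.append(Step("annealing", ANNEALING_C[probe], 30))
        steps.append(Step("extension", 72.0, 60))
    return steps


def total_minutes(steps: List[Step]) -> float:
    """Total hold time of the program in minutes (no ramping)."""
    return sum(step.seconds for step in steps) / 60


if __name__ == "__main__":
    program = pcr_program("B")
    print(len(program), "steps,", total_minutes(program), "min of hold time")
```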
Hybridization.
Bacteria grown for 3 h on nitrocellulose filters were used for colony hybridization. Hybridization was performed under stringent conditions (overnight at 65°C), with PCR products labeled with 32P using the Megaprime™ DNA labeling system (Amersham International) as probes. The 100 ml hybridization solution contained: 2 ml EDTA 0.5M; 20 mg ATP; and 10 ml 20x SSC.
DNA sequencing.
Double-stranded DNA was sequenced by Genome Express (France).
Multiple sequence alignments were generated with the CLUSTAL W program.
Statistical analysis
Proportions were compared by using the chi-square test.
Results
Presence of the deoK operon in the pathogenic E. coli isolates.
While a large number of bacteria are able to use the 2'-deoxyribosyl moiety of 2'-deoxyribonucleosides as carbon and energy sources via the well-known deo-operon, few organisms, such as Salmonella, are able to use 2-Deoxy-D-ribose (dRib) as the sole carbon source through deoxyribokinase, which catalyses the ATP-dependent phosphorylation of dRib to dRib-5-phosphate. Recently, the inventors identified in the genome of S. enterica serovar Typhi not only the gene encoding deoxyribokinase, deoK, but a whole operon (deoK operon) of three genes regulated by a repressor, DeoQ (J. Bacteriol., 2000, 182:869-873). Searches in databanks showed that this operon was fully represented in one Citrobacter freundii strain and partially present in Agrobacterium tumefaciens, Rhodobacter sphaeroides, and the pathogenic E. coli strain AL862 isolated from a blood culture.
Use of 2-Deoxy-D-ribose by E. coli strains has been previously described (Br. J. Biomed. Sci., 1995; 52:173); however, this property was never associated with the pathogenic status of the strains and the genes encoding this function were not identified.
In strain AL862, the sequences similar to the deoK operon corresponded to ORF3', ORF4, ORF5 and ORF6 of the partial (and not continuous) sequence of a pathogenicity island (PAI I(AL862)) (GenBank™ Nos. AF286670 and AF286671). No function was previously assigned to these sequences. Two probes derived from this PAI I(AL862) region (probes A and B) corresponded to the deoK homologous sequences. Analysis of the distribution of PAI I(AL862) among pathogenic E. coli isolates strongly suggested that the A and B regions are widely distributed among pathogenic strains (IAI, 2001, 69:937-948; IAI June 2002 Errata).
To confirm the presence of the deoK operon in pathogenic E. coli strains, the inventors sequenced again the region of PAI I(AL862) that previously showed similarities to the deoK operon of Salmonella. The sequencing was performed on the recombinant cosmid pILL1272 (see Materials and Methods). They identified a 4486-bp linear region displaying similarities to the entire deoK operon of Salmonella. Computer analysis revealed four open reading frames (ORF), deoX, deoP, deoK, and deoQ, which mapped to the same loci as, and had similar sequences to, the deoX, deoP, deoK, and deoQ genes from the deoK operon of Salmonella, respectively (see Figure 1). These results confirmed that the genetic organization of the deoK operon from E. coli was similar to that of the deoK operon from Salmonella.
The detailed sequence analysis of E. coli strain AL862 is presented in Figure 2. The deoK operon from E. coli strain AL862 displayed 78 % identity with that from Salmonella (4486 bp/4517 bp).
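A percent-identity figure of this kind can be computed from a pairwise alignment as sketched below. The convention used here (identical, non-gap columns divided by all alignment columns that are not gaps in both sequences) is an assumption; the text does not state which denominator was used, and alignment programs differ on this point. The function name and the short example sequences are hypothetical.

```python
# Minimal sketch: percent identity between two already-aligned sequences.
# Assumed convention: matches / alignment columns, ignoring columns that
# are gaps in both sequences. Other conventions give slightly other values.


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return the percentage of alignment columns with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == gap and y == gap)       # skip gap-gap columns
    ]
    if not pairs:
        raise ValueError("no comparable columns")
    matches = sum(1 for x, y in pairs if x == y and x != gap)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    # Toy aligned fragments (made up for the example).
    print(round(percent_identity("ATGC-A", "ATGCTA"), 1))
```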
The position and sequence (determined here) of the two probes (probe A and probe B) that were used in the hybridization experiments are indicated in Figure 2 (single and double underline, respectively). In both cases, the sequences of the primers used in PCR assays are indicated in bold. These primer sequences are identical to those previously described and used (IAI, 2001, 69:937-948; IAI June 2002 Errata). Probes A and B are PCR products obtained from strain AL862.
To study the degree of conservation of the deoK operon among pathogenic E. coli isolates, the inventors determined the nucleotide sequence of the deoK
region in E. coli strain 55989 isolated from the stools of a patient with persistent diarrhea. This isolate was shown to belong to the EAEC pathotype of pathogenic intestinal E. coli. A cosmid library from the genomic DNA of strain 55989 was previously constructed (Bernier et al., In press, IAI August 2002). The recombinant cosmid pILL1287 resulted from the screening of the 55989 cosmid library with both the probe A and the probe B. The sequence of the chromosomal region from strain 55989 that carries the deoK operon is presented in Figure 3.
The deoK operon from E. coli strain AL862 and strain 55989 showed 98%
identity (4486 bp/4489 bp). The degrees of identities of the deo genes from E.
coli and Salmonella strains are summarized in Table 1.
TABLE 1: Degrees of identities of the deo genes from E. coli and Salmonella strains

Strains                    % of identity   No. of nucleotides
55989 / AL862              98 %            4489 bp / 4486 bp
55989 / S. Typhi           78 %            4489 bp / 4517 bp
AL862 / S. Typhi           78 %            4486 bp / 4517 bp

Genes                      % of identity   No. of nucleotides
deoX  55989 / AL862        99 %            1014 bp / 1014 bp
deoX  55989 / S. Typhi     75 %            1014 bp / 1014 bp
deoX  AL862 / S. Typhi     75 %            1014 bp / 1014 bp
deoP  55989 / AL862        99 %            1317 bp / 1317 bp
deoP  55989 / S. Typhi     83 %            1317 bp / 1317 bp
deoP  AL862 / S. Typhi     82 %            1317 bp / 1317 bp
deoK  55989 / AL862        99 %            921 bp / 921 bp
deoK  55989 / S. Typhi     80 %            921 bp / 921 bp
deoK  AL862 / S. Typhi     80 %            921 bp / 921 bp
deoQ  55989 / AL862        96 %            783 bp / 783 bp
deoQ  55989 / S. Typhi     77 %            783 bp / 786 bp
deoQ  AL862 / S. Typhi     76 %            783 bp / 786 bp

Expression of the deoK operon in E. coli strains.
The inventors demonstrated the expression of the deoK operon in clinical isolates 55989 and AL862, as well as in the recombinant strain MG1655 carrying either the cosmid pILL1272 or the cosmid pILL1287. All these four strains were able to grow on K5 plates containing 2-Deoxy-D-ribose as a carbon source. The growth of the strains was evident after 48 h of incubation at 37°C. As a negative control, strain MG1655 alone did not grow on such medium. Deoxyribose-5P
aldolase activity, easier to determine than that of deoxyribokinase, is reported in Table 2.
Table 2: Deoxyribose-5P aldolase activity in E. coli strains

Strain                   Deoxyribose-5P aldolase
                         +dR          -dR
AL862                    0.47 U/mg    0.06 U/mg
55989                    0.45 U/mg    0.08 U/mg
K-12 MG1655 (+1272)      0.36 U/mg    0.10 U/mg
K-12 MG1655 (+1287)      0.24 U/mg    0.10 U/mg

Analysis of the distribution of deoK operon among commensal and pathogenic E. coli isolates
To determine whether deoK operon sequences were specific for pathogenic E. coli, the frequency of occurrence of the A and B regions (corresponding to parts of deoK and deoX genes, respectively) was investigated. These regions were amplified from strain AL862 DNA and used as probes to screen collections of E. coli isolates by colony hybridization. The strains were also tested for their ability to use 2-Deoxy-D-ribose as a carbon source.
These collections comprised strains representative of the various pathotypes of pathogenic E. coli. Archetypal ExPEC (extraintestinal pathogenic E. coli) familiar to investigators in the field include strains CFT073 (pyelonephritis isolate), 536 (pyelonephritis isolate), J96 (pyelonephritis isolate), and RS218 (neonatal meningitis isolate). Prototype strains of the various diarrheagenic E. coli pathotypes are also considered: EDL933 (EHEC), EDL1493 (ETEC), E2348/69 (EPEC), 042 and JM221 (EAEC), and C1845 (diffusely-adherent E. coli (DAEC)). As shown in Table 3, the results indicated that the deoK operon is carried by pathogenic strains belonging to various pathotypes of E. coli and associated with both extra-intestinal and intestinal infections.
Table 3: Frequency of occurrence of the A (deoK) and B (deoX) regions in various E. coli strains

E. coli strains            Probe A   Probe B   Deoxyribose utilization
CFT073 (pyelonephritis)    +         +         +
536 (pyelonephritis)       +         +         +
J96 (pyelonephritis)       -         -         -
RS218 (meningitis)         -         -         -
EDL933 (EHEC)              -         -         -
EDL1493 (ETEC)             +         +         +
E2348/69 (EPEC)            -         -         -
042 (EAEC)                 +         +         +
JM221 (EAEC)               +         +         +
C1845 (DAEC)               -         -         -

The collections studied also comprised 115 clinical isolates from humans with septicemia (isolated in France), 100 clinical isolates from patients with pyelonephritis (origin France, USA, Romania), 36 clinical isolates from patients with cystitis (origin USA, Romania), 25 EAEC isolated from HIV-positive patients with persistent diarrhea (origin Central African Republic and Senegal), and 11 EPEC with a diffuse adherent pattern (DA-EPEC) on epithelial cells, isolated from infants with diarrhea in Brazil. We also investigated 257 commensal E. coli strains isolated from the normal flora of healthy subjects (origin France (36), Romania, Senegal, Central African Republic). The results are summarized in Table 4.
Table 4: Percentage of occurrence of the A (deoK) and B (deoX) regions in various E. coli clinical isolates

E. coli strains                      Probe A   Probe B   Probe A +   Deoxyribose utilization    Probe A + 2-Deoxy-D-
                                                         Probe B     (level of significance)    ribose utilization
Septicemia (n = 115)                   49        48        48          50 (p<0.0001)              46
Pyelonephritis (n = 100)               50        53        48          50 (p<0.0001)              48
Cystitis (n = 36)                       7        10         7           8 (0.2<p<0.4)              7
Diarrhea (EAEC) (n = 25)               13        13        13          12 (p<0.0001)              12
Diarrhea (DA-EPEC) (n = 11)            11        11        11          11 (NA)                    11
Commensal (France) (n = 36)            10        11        10           9                          8
Commensal (Romania, Senegal,           NT        NT        NT          31                         NT
Central African Republic) (n = 221)

NT, not tested; NA, not applicable.
The sensitivity of the two DNA probes appeared equivalent: 43% and 45% of the strains were positive with the A and B probes, respectively.
A total of 147 isolates (36 commensal strains and 113 pathogenic E. coli) were tested for both growth on K5 plates containing 2-Deoxy-D-ribose and fermentation of this sugar. A 100% correlation was observed between the two bacteriological tests: all the strains that grew on K5 plates with 2-Deoxy-D-ribose were able to ferment the sugar. The 2-Deoxy-D-ribose utilization test appeared sensitive but, to a small extent, less specific than genetic detection of the deoK operon (53% of positive strains). Using both the molecular and bacteriological approaches (probe A and growth on K5 plates with deoxyribose), a total of 40.8% of the strains were positive.
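The pooled figures quoted above can be checked against Table 4 with a few lines of Python. This is only a sketch under one assumption: the entries of Table 4 are treated as isolate counts rather than percentages, which is the reading under which the 43%, 45% and 40.8% values are recovered. The dictionary layout and the function name `pooled_percentage` are illustrative and do not come from the patent.

```python
# Entries transcribed from Table 4, read as isolate counts (an assumption).
# Tuple layout: (n, probe A, probe B, probe A + probe B,
#                deoxyribose utilization, probe A + deoxyribose utilization)
TABLE_4 = {
    "Septicemia": (115, 49, 48, 48, 50, 46),
    "Pyelonephritis": (100, 50, 53, 48, 50, 48),
    "Cystitis": (36, 7, 10, 7, 8, 7),
    "Diarrhea (EAEC)": (25, 13, 13, 13, 12, 12),
    "Diarrhea (DA-EPEC)": (11, 11, 11, 11, 11, 11),
    "Commensal (France)": (36, 10, 11, 10, 9, 8),
}

PROBE_A, PROBE_B, PROBE_A_AND_UTILIZATION = 1, 2, 5


def pooled_percentage(column: int) -> float:
    """Percentage of positive isolates over all fully tested collections."""
    total = sum(row[0] for row in TABLE_4.values())
    positive = sum(row[column] for row in TABLE_4.values())
    return 100.0 * positive / total


for label, column in [
    ("Probe A", PROBE_A),
    ("Probe B", PROBE_B),
    ("Probe A + deoxyribose utilization", PROBE_A_AND_UTILIZATION),
]:
    print(f"{label}: {pooled_percentage(column):.1f}%")
```

The commensal collection from Romania, Senegal and the Central African Republic is left out because it was not tested with the probes (NT in Table 4).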
Taking all the data into account, a significant association of the deoK operon with pyelonephritis- and septicemia-associated isolates, as well as with diarrhea-associated EAEC isolates, was observed.
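The text does not say which statistical test produced the significance levels in Table 4. As a hedged illustration, the sketch below applies a Pearson chi-square test (one degree of freedom, no continuity correction) to the deoxyribose utilization counts, comparing each clinical collection with the pooled commensal strains (9 + 31 positive out of 36 + 221). Both the choice of test and the choice of comparison group are assumptions; the function name `chi_square_2x2` is illustrative.

```python
import math


def chi_square_2x2(pos_a: int, n_a: int, pos_b: int, n_b: int) -> tuple[float, float]:
    """Pearson chi-square statistic and p-value for a 2x2 table (1 d.f.)."""
    a, b = pos_a, n_a - pos_a          # group A: positive, negative
    c, d = pos_b, n_b - pos_b          # group B: positive, negative
    n = n_a + n_b
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With one degree of freedom, the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Pooled commensal strains: deoxyribose-positive isolates, total tested.
COMMENSAL = (9 + 31, 36 + 221)

results = {
    "Septicemia": chi_square_2x2(50, 115, *COMMENSAL),
    "Pyelonephritis": chi_square_2x2(50, 100, *COMMENSAL),
    "Cystitis": chi_square_2x2(8, 36, *COMMENSAL),
    "Diarrhea (EAEC)": chi_square_2x2(12, 25, *COMMENSAL),
}

for group, (chi2, p_value) in results.items():
    print(f"{group}: chi2 = {chi2:.2f}, p = {p_value:.2g}")
```

Under these assumptions the p-values fall in the ranges given in Table 4 (below 0.0001 for septicemia, pyelonephritis and EAEC; between 0.2 and 0.4 for cystitis), but this agreement does not establish that the inventors used this test.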
Conclusion
This work confirmed that metabolic characters may be specific to E. coli strains and that those expressed by pathogenic isolates may be considered as virulence-associated factors. Utilization of 2-Deoxy-D-ribose by some E. coli isolates has been previously reported. Here, the inventors identified the genes involved in utilization of 2-Deoxy-D-ribose by E. coli strains. These genes are organized in an operon (deoK) that is highly related to that previously identified in Salmonella enterica strains. Analysis of the sequences adjacent to the deoK operon in several E. coli isolates and in Salmonella strongly suggested that E. coli strains acquired the deoK operon by horizontal transfer from Salmonella strains. The inventors demonstrated that the deoK operon is highly conserved among E. coli strains. From this observation, the inventors defined two probes that were used to study the distribution of the deoK operon among collections of commensal and pathogenic E. coli isolates. Preliminary studies indicated an association of the deoK operon with strains belonging to various pathotypes of E. coli, including strains causing pyelonephritis, septicemia, and some types of diarrhea in children.
Although 40 to 50% of strains associated with pyelonephritis, septicemia, and diarrhea (EAEC and DA-EPEC strains) carry the deoK operon, we also detected it in 14 to 22% of commensal isolates. This may be explained by the fact that commensal strains of E. coli can be potential pathogens when the host is compromised. It is interesting to note that the deoK operon is less prevalent in commensal strains from Romania, Senegal and the Central African Republic than in French commensal strains.
In conclusion, the inventors have identified a metabolic character significantly associated with some pathogenic E. coli. The inventors have developed bacteriological and molecular tests to identify strains expressing this character. These tests could be combined with others in a future diagnostic kit for the identification of the pathogenic status of an E. coli isolate.
While several embodiments of the invention have been described, it will be understood that the present invention is capable of further modifications, and this application is intended to cover any variations, uses, or adaptations of the invention, following in general the principles of the invention and including such departures from the present disclosure as come within known or customary practice in the art to which the invention pertains, and as may be applied to the essential features hereinbefore set forth and falling within the scope of the invention or the limits of the appended claims.
2003-07-08 Listage pour le BdB corrige.txt
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT:
(A) NAME: Institut Pasteur
(B) STREET: 25-28 rue du Docteur Roux
(C) CITY: Paris
(E) COUNTRY: France
(F) POSTAL CODE (ZIP): 75724
(ii) TITLE OF INVENTION: Genetic markers, metabolic markers, and methods for evaluating pathogenicity of strains of E. coli
(iii) NUMBER OF SEQUENCES: 12
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Robic
(B) STREET: 55 St-Jacques
(C) CITY: Montreal
(D) STATE: QC
(E) COUNTRY: Canada
(F) ZIP: H2Y 3X2
(G) TELEPHONE: 514-987-6242
(H) TELEFAX: 514-895-7874
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Disk 3.5" / 1.44 MB
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: TXT ASCII
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: 2.388.445
(B) FILING DATE: :32 May 2002
(2) INFORMATION FOR SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4489 nucleotides
(B) TYPE: nucleic acid
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Escherichia coli (B) STRAIN: 5'i9t39 ( xi ) SEQUENCE DESCRI P'I':I C)PJ : SEQ I D NO : l ggacgataatgtgat:cgtctatangcJgcaacgctatcatagt:cttgtcctg'gcgggtaaa60 aaagcgcgcttaccta ataagcgcgccgctgttcaggccttgagtggttattcaat 120 aacg tcctgtggtgactgt:aaaagtgcclcdt:ttgcr_gcggtgcaacctgaatcagcgtgccatt 180 acgttgcgcggcaactatacc:ccta:a<7gccgacaggttgcaggtaatgcaaaggcggctac 240 ctgttgctctccgt:t:ataaaggatcc::aagc~~Itgtcac:ataattta<Ittcagcactgtagaa 300 acgagtaacaaacgt.agtgccatc:gggagagatcat~g~-gaaactctggctgatctgtata 360 agcgtccagtttgt.cagcaaaga~:uJac.;aat:t:tctggaJ:cat.aaaattccggttgactcag 920 cgtcgacagagaggcatctcCCtgCdfaatccgttgattaaacgccagccactgagcggt 480 gggattaacatgcgaggcactgat:tcacgcaatct:taat:at.t r_<:gtccgggatattctg 540 gctgaatgtagcat.t.tggtatatat-_qr:ataattcatgtggcacatatattgtagtggcat 600 atctacagaagccactattggttar.Jclc:catcataat:atr_c~aacagtgta3gaggatttgtg 660 aaggaccactgttgcrttgagccac~:~t:aatgatgaccgaaacccattacatactcgtaacg 720 Page:
2003- 07-08 BdB corrige.txt Tistage pour 1e ccggttaaggcgtaacatatctc:c,gtctaat.accagc~catget:tcatccatcgcggcaca780 ggccatttcaccgtgtagcagat:gagtat:cttccgcagatgggcagccattagccagcaa840 acctgaatgaaaagcaaaacagccataggtctctatcacctctgtcgccggtttaggctg900 gcgaaacatattgcacatggtgac~gccgt:gt.ccat:caaattgc:gcatc:ccaaatcatctg960 ccccatccagggaagaataatcaaaat:gtcc:ac:gactc3tt.tgcaatttt:aagcccctcgac1020 accgctgtcatagcgaaaagacgi:gacagt.aaaatcactattt.:tccagcaagatacgagg1080 tttctcgccaaaaagcgcccgcc<zcaaat:t:aatacgcgtactc.,at:aacggttctcctcag1140 gacgctgtgacttcagcc::3gtgcc;gt:accttacattgc:tttcac.;gccagaagtagactccg1200 acatagacaaagcagagcatagaaaccaggaatgaaagctgtagtgagtggaacatatct1260 gcaatatatccctgaattgccggaaccaccgcggcaccgacaatagcc:ataacaatgact1320 gctcctgccatttctgtat:gt:tegtt:atcaacagtatccagtctttcct:gcatagatcgtc1380 gcccagcaagggccaaacaaaae,act:taccaggacggr_gacat:agaccgcgctgaaactt1440 ggagccagtgcaacatatgccaggaacagcgcccctataacggaatagagaatcaatact1500 ttttccggattaaaacgcgtcataaggat.gtt:ggctataaact:tc~ccaat:aaagaagcag1560 gcaaagctatagac:catg<3ac~tt:t:gaagcatcacgttcgttgatatcgc~~caactccagc1620 gccagacggatggtaaatgaccatactgc.gacctgcatacccacataaaggaactgcgcc1680 acaataccgcgacc~aaagcgcggatt.tca:agccagat.agcgcaigcgt<it<:cattgctgac1740 gggcgtttatagt~acttc~tctgtc;<~~racattacagc~ti:gggaagcgggttaaaaggaac1800 aacaccatgaccacaaccaaatcataatcatatact.tataccTgttcaagggtgttctct1860 g aacatcagcacctt:aaagt:tgtga,nt:tt:gctcggcgtt:cattcvcggacatc;t:gcttctca1920 aggctttccccctc:ggag<saaaccagatat:t:tgcccaataaaataccagacgcagcacca1980 atcggataaaaggtctgg<.tgatattgagc:cgcaatgtggcat:aggct.tctggaccgatc2040 attgaactgtatgtgttcc~ctgc:agtttcaaggaaactcaggccaatcgc;aatcgcaaaa2100 atagctgcaagga._i::ata<.~tgtactgt:t:gcc:at;atgcga~ggcac~ggaaaaaaagtgtacaa2160 ccaccaatatacagcgtcagccaattaaaattgccaccatat.aactggtctttttaatc2220 g acaagggatgctggtattgcaatt:aaaaaataacctccataaaatgcgctcagcaccaat2280 gctgaagcaaagtt~ctt;~gcgaaaatacact:ttt:gaattgacttgattaat~atgtcattt2340 aatgcagctgcgc~tcccc:atagc:gggaataaacacgataac~aaaataaactggaacaag2400 
ggagtcttattcag.stac<:catr._cggcatctgaatgatgtttt.tatcgttcatagtgcta2460 cctttaactgtgca~~gat<~at:tatt.cgti,taaggttaaaaatt.c,attaaai:t:gttcaata2.520 ctcggataagatg,~ttgccttacct tt gtgacgct:gaaacrcggcaaaa<lagagcggct2580 ccc:t tttttcaaagcggcttcaacatc.;cccgctttgaacataataatgggaaaagcaaccaata2640 aatgcgtcaccagc:gccac::tagt,utcaa<::agcat:ttactttgaatgcagctaacatgga<a2700 tcctgatcgcgggt~~atcc:ataatgcgcca:t.tttc:gctcatc~gtaacaat:aatattgtt:c2760 agccctttatcaactaacgaacgt:gcggccaaacgaatatgatcataagt:atcaaccgac2820 ataccggttaatatatccagttct<~r.ttcattcgggataaagaaatcacat:ta gcaggca2880 taagacatatcta:a..-.tcar:gc:aat.g;:~.~:ggagccggatt:taa~aac:acttc::aataccattt2940 ttcttaccaaacti::~atcc7cgtgcat.aaac:.tgtttccac3ttg:3acttccac~tagtaaaacg3000 atcaatttgcattTvttcagatcttctgcagctcgatcgarat.cttccgc~ggaaagaaat3060 ttattcgctccct~::aattatt:aa~atartattgctcgagtt:ggcattaac::~aagatcggt3120 gcaacaccactgc,:ggtacagggr~a::'~~t:tctcaac_-ataagtgqtattaat:t:ccccatgat3180 tcaagattacgaat:agtai.tatccg~caaaaatatcatcacctactttagt:cagcatcagg3240 acttttgaattca<.u~ttacaccgc~gwcac:r.gcttgatt:agcacctttccc:accacatccg3300 attttgaaggcag<~~:gcti:cag<sg-t cctt:.ttt:aggc~atctgattagtgtaagt:a3360 ~~ wtc:t atgagatccaccat~attgc,aaccaataac.tgcaatgt~catttcactacctcttataaac3420 tttcgcataacaat::c~gtat:ttaa:~t;~.~c:att:agcatgt.tact:tttgcatcatttgtgac:t3480 gagatcgcgattac,c:acat:caacc:c~at::gt.Matt taatagactr_ccagtctcatcactc3590 aggccaacactat<~t:aatc:ataagcaacctaacaagattagtgcccaaaactcagcagcc3600 tataccctttcatttcaaagggcycc~gtcgtatagtat.ggr_:.atgaaaac:aatgtttact3660 t aacgccaaaatgti::atttt:tata:~c:~r_t.cttacggagaga~fiagtl:gatgctaa.acgaagc:a3720 aaaagagcgtatccgacgtttgatggaeactgcttaag:~aaa:.cgacagaatccatttgaa3'780 agacgcagcgcgaai:gctcrgaagt_vct.gtaatga,~tattc:;tcgcgatctccatcagga3840 agatgaacct ctgcc,actcaaccci:.-ic:.:t:c;ggv:ggcar_attctt:aai:ggtg~~ataaacccgc3900 gccatccatgcca<xt:aatc:c~atga:gi~t:cc<3aaa<iatc;at.ytgatgactt: acctattgc3960 aattctggctgccggaatggttaatgaaaatgat.r_-.tg~,t.ctt:ctttgatGatggccagga4020 
gataccactcgtt<_t:aagcatgat:-:c::;;gg~-~tgcaatc::cctt:caccggc~::t:cagtt:acts4080 acatcgcgtcttt<;i:tgcgtt:gaatca<3aaagcctaatgt:a_~::ag.aatac:t:ttgtggtgg4140 tacgtatcgtgccagaagt.gatgc:tl~tt.tacgatgccagtaactcttcgc:cattagactc4200 tctcaatccgcgaaaaatatttat:ti=c-cgccagcggtgtgcataatcactttggcgtcag42.60 ctggtttaaccctgaagat:cttgcca~t.aagcgt:~iaacxcga':gaaccgtggactacggaa4320 aattttgctcgcccqccacgcgt~:gt!=c:gat_gaag':ggcct:ag.~~cagcrt:cgcaccgat4380 ctctgcatttgacqt_tctgattactcc~atcc~i:ccg~taccggcagattatc_~ttacgcactg4440 ccagaatggttctctt:aaagateat:ta<,acctgat:tcaaa~s;rricg.,~atga 4489 Paste 2 2003-07-~~i8 Listage pour 1e BdEj corrige.txt (2) INFORMATION FOR SEQ TC: NO; 2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 337 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
Met Ser Thr Arg Ile Asn Le:u Trp Arg Ala Leu Phe Gly G:Lu Lys Pro Arg Ile Leu Leu Glu Asn Sex Asp Phe Thr Val Thr Ser Phe Arg Tyr Asp Se:r Gly Val. GLu Gly Leu Lys Ile A.la Asn Ser Arg Gly His Leu Ile Ile Leu P:ro 'Prp Met Gly Gln Met. Ile Trp Asp Ala Gln Phe Asp Gly His GLy Leu Thr Met Cys Asn Met Phe Arg 65 70 '75 Gln Pro Lys Pro Ala 'rhr GLu 'dal Ile Glu Thr Tyz Gly Cys I'he Ala Phe His Sexy Gly Leu L~.u Al.a Asn Gly C;ys Pra Ser A).a Glu Asp Thr His Leu Leu His G.y "1.u Met ALa Cys Ala Ala Met Asp Glu Ala Trp Leu Glu Leu Asp c~l.y Asp Met Leu Arg Leu Asn Arg Arg Tyr Glu Ty° Val Met G:Ly L?he ~~:Ly His His Tyz :Leu Ala Gln Pro Thr Val Va.Leu His Ly:per Ser Thr lieu Phe Asp Ile Lys Met Ala Val Thz: Asn Leu A'a :ver Val Asp Met Pro Leu Gln Tyr 170 1'75 180 Met Cys His Met Asn Tyr A:_a 'tyr Ile Pro t~sn Ala Thr Phe Ser Gln Asn Ile Pro Asp Glu Ia_e Leu Arg Leu Arg Glu Ser Val Pro Ser His Val Asn Pro Thr A1_a Gln Trp Leu Ala Phe Asn Gln Arg Ile Met Gln Gly Glu Ala Seer Leu Ser Thr Leu Ser Gln Pro Glu 230 23ti 2.40 Phe Tyr Asp Pro Glu Ile Val Phe Phe Ala Asp Lys Leu Asp Ala Tyr Thr Asp Gln Pro Glu Phe Arg Met Ile Ser Pro Asp Gly Thr Pan°
2003-07-08 L:istage pour 1e BdB corric~e.txt Thr Phe Val Thr Arg Phe Tyr Ser Ala G1u Leu Asn Tyr Val Thr Arg Trp Ile Leu Tyr Asn G~y Gl~_z Gln Gln 'Tal Ala A1a Phe Ala Leu Pro Ala Thr_ Cys Arg Pi:o ~;,1u Gly Tyr :~~eu Ala Ala G:Ln Arg Asn Gly Thr Lea Ile Gln V~s.l. A.L~a Pro G~n Gln 'z: hr Arg Thr Phe Thr Va1 Thr Thr G:Ly Ile Gl a ( 2 ) INFORMATION FOR SEQ I U L~,TO : 3 :
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 438 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
Met Asn Asp Ly:> Asn I1e IIe Gl.n Met Pro Asp (:,1y Tyr Leu Asn 1 C? 7. 5 Lys Thr Pro Let:c Phe Gln Phe Ile heu Leu Ser Cys Leu Phe Pro Leu Trp Gly Cys Ala Ala A.la L~eu Asn Asp IIe l,eu Ile Thr Gl.n Phe Lys Ser Val Phe Ser Le>u Ser Asn Phe Ala Ser Ala Leu Val Gln Ser Ala Phe Tyr Gly GLy 'Cyr Phe Leu :Ile Ala Ile Pro Ala Ser Leu Val IlELys Lys Thr Ser Ty.r Lys Val A_La Ile Leu Ile Gly Leu Thr Leu Tyr Ile G.~y ~::~l.y Cys Thr L~eu Phe Phe Pro A:La 95 100 7.05 Ser His Met Ala Thr Tyr Thr Met Phe Leu .Ala A'__a :Ile Phe A1a 110 17.5 120 Ile Ala Ile Gly Leu Ser Phe L.eu Glu 'I'hr Ala Ala Asn 'Phr Tyr Ser Ser Met Ile Gly Pro Lys Ala Tyr Ala Thr I,eu Arg heu Asn Ile Ser Gln Thr Phe 'ryr Pro Ile G'__y Ala ;Ill.a Sc~~r c;ly Ile Leu Leu Gly Lys Tyr Leu Val Phe :3er t~Lu Gly ~.:;la Ser heu Glu hys 2003-07-08 Listage pour 1e HdI3 corrig~.txt Gln Met Ser Gly Met Asn A;~a Glu Gln Ile His Asn Phe Lys Val 185 190 1.95 Leu Met Leu Glu Asn Thr I:eu Gl~.i Pro 'Pyr Lys Tyr Met Ile Met Ile Leu Val Va:L Val Met Va.l. Leu Phe Leu Leu Th~~ Ar.g Phe Pro Thr Cys Lys Val Ala Gln Thr Ser Hips Tyr Lys Ar<I Pro Ser Ala Met Asp Thr Leu Arg Tyr Leu A:La Arg Asn Pro Arg Phe Arg Arg 245 250 2.55 Gly Ile Val Ala Gln Phe Leu Tyr Val Gly Met Gln Val Ala Val Trp Ser Phe Th.r Ile Arg L~~u .Ala Leu G _u Leu G.Ly Asp I:Le Asn 275 280 <'?85 Glu Arg Asp Al,a Ser Asn P~~e Met: Va1 Tyr Ser .P.he Ala Cys Phe Phe Ile Gly Ly;_: Phe Ile Al.a .Elsru Ile Leu Met T:hr Arg Phe Asn Pro Glu Lys Va.L Leu Ile L,~_~u 'I'yr Ser Va 1 1~ 1e G.Ly Ala Leu I'he Leu Ala Tyr Va:1 Ala Leu Ala Prc; Ser Phe Ser A.1G Val Tyr Val Ala Val Leu Va:l. Ser Val Lf~~u ahe Gly Pro Cys Trp Al.a Thr ILe Tyr Ala Gly Th:~: Leu As.p T!,.r 'Jal. Asp Asn C~lu ,H:i~. Thr G~.u Met A1a Gly Ala Va:L Ile Val Mrt Ala .I: 1e Va 1. Gi y .A:1 G Ala Val Val Pro Ala Ile Gln Gly Tyr I:Le AI_a Asp M~:et E?he Eli. Ser Le~u G1n Leu Ser Phe Leu Val Ser Met Leu Cys Phe Va1 Tyr Val G1y Val Tyr Phe Trp Arg Glu Ser Lays Val. 
Arg Thr Ala Leu Ala Glu Val 425 4?~0 935 Thr Ala Ser (2) INFORMATION FOR SEQ ID t4(:): 4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 306 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
Met Asp Ile Ala Val Ile G__y Ser Asn Met Val Asp Leu Ile Thr 1 Ci 15 Tyr Thr Asn Gln Met Pro Lys Giu Gly Glu Thr Leu Glu Ala Pro 20 2'e 30 Ala Phe Lys Ile Gly Cys Gly G:ly Lys Gly Ala Asru Gln Ala Val Al.a Ala Ala Lys Leu Asn Seer hys Val Leu Met Ge~.x Thr Lys Val Gly Asp Asp Ile Phe Ala Asp Asn '('hr Ile Arg Asn Leu G.Lu Ser Trp Gly Ile As.n Thr Thr 7'yr Val. Glu Lys Val Pro Cys Thr Ser Ser Gly Val Al~a Pro Tle Prze V<31 Asn Ala Asn Ser Ser Asn Ser Ile Leu Ile Ile Lys Gly Ala .?~sn l.ys Phe Leu Ser Pro Glu Asp 110 1:15 120 Ile Asp Arg Al,a Ala Glu A=;p Leu Lys Lys C:ys Lya Leu Tle Val 125 130 7.35 Leu Gln Leu Glm Val Gln L~=~u. Glu Thr V<~1 Tyr HL> Ala I~.e Glu 140 145 7.50 Phe Gly Lys Lys Asn Gly Ile ~:~l.u Val Leu Leu Asn Pro Al.a Pro 155 160 1.65 Ala Leu Arg Glu Leu Asp M~_~t Ser Tyr Aia Cys Lys Cys Asp Phe 170 1'75 1.80 Phe Ile Pro Asn Glu 'Phr G:Lu Leu Glu Ile Leu Thr Gly Met Ser Val Asp Thr Tyr Asp His I:Le Arg Leu Ala Ala Arg Ser Leu Val 200 205 2:10 Asp Lys Gly Lets Asn Asn I:le Ile Val Thr Met Ser Glu Lys Gly Ala Leu Trp Met Thr Arg Asp Gln Glu Va1 His Val Pro Ala Phe 230 23'p 240 Lys Val Asn Ala Val Asp Thr Ser Gly Ala Gly A.>p Ala Pr:e Ile Gly Cys Phe Sei: His Tyr Tyr_ ~~al Gln Ser G1y Asp Va1 Gl.u A1a Ala Leu Lys Ly: Ala Ala LEe;a !?7e Ala A.la E'he S~e.r 'Ja1 Thr G1y Lys Gly Thr Glr~ Ser Ser Tsr~:~ -?ro rer I.le :'~lu C: n P:ze .As;n C~Lu Paqo 2073-07-f78 I,istace pour 1.e Bd3 corrige.txt Phe Leu Thr Leu Asn G1u (2) INFORMAT2CN FOR SEA TI:) LSO:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 260 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
Met Glu Thr Lys Gln Lys lxlu Arg Ile Arg Arg Leu Met Glu Leu Leu Lys Lys Thr Asp Arg Ile His I,eu Lys Asp Ala Ala Arg Met Leu Glu Val Ser Va:1 Met Ttrr Tle Arg Arg Asp Leu His Gln Glu Asp Glu Pro Leu Pro Leu Thr heu Leu G1y Gly Tyr_ Ile Val Met Val Asn Lys Pro Ala Pro SEea: Met: Pro Val :Lle His Asp Val Pro Lys Asn His Arg Asp Asp L~eu Pro Ile Ala Ile Leu Ala A.La Gly Met Val Asn Glu Asn Asp Leu Ile Phe Phe Asp Asr:~. Gly G.Ln G1u Ile Pro Leu Va1 Ile Ser Met I1e Pro Asp Ala Ile: 'Thr Phe Thr Gly Ile Cys Tyr Ser His Arg Val Phe Val Al.a Leu Asn Glu Lys Pro Asn Val Thr Ala Ile L~~~~u Cys Gly Gly Thr Tyr Arg Al.a Arg 140 195 x.50 Ser Asp Ala Phe Tyr Asp Ala Ser Asn Ser Ser Prc? Leu Asp Ser 155 160 7.65 Leu Asn Pro Arg Lys Ile Phe Ile Ser Ala Ser Gly Val His Asn His Phe Gly Val Ser Trp Ph.e Asn Pro Gl.u Asp Leu Ala Thr Lys Arg Lys Ala Met Asn Arg G:ly ~eu Arg Lys Ile Leu Leu Ala Arg His A1a Leu Phe Asp Glu Va1 A1a Ser Ala Ser Leu Ala Pro Ile Ser Ala Phe Asl:~ Val heu I 1e :per Asp Arg Pro I,eu Pro Al,a Asp Page 2003-07-()8 Listage pc:ur 1e BdB corrige.txt 230 2.35 240 Tyr Val Thr His Cys Gln P.:~ru Gly Se:r Val Lys Ile Ile Thr Pro Asp Ser Glu Asp Glu (2) INFORMATION FOR SEQ ID NG: 6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4486 nucleotides
(B) TYPE: nucleic acid
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Escherichia coli
(B) STRAIN: AL862
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
ggacgataatgtgatcgtcaataagggca:cacgctatc:atagtc~ttgtc:ctggcgggtaaa60 aaaacgcgcttaccttaaa:gataa:gcgcgccgctgttcaggccttgagtggttattcaat120 tcctgtggtgactgtaaa~.gtgcgcgt ctgcc;gt.gcaacctgaatcagcgtgccatt180 tt:c~
acgttgcgcggcaagatac:ccoca ggcc_:gacagqtt.gc:aggl::aatgraaaggcggctac240 ctgttgctctccgttata ggat:cc:vagc:gtgtc<~ca.taa.t~t:tagttcac~::actgtagaa300 ~a acgagtaacaaacgtagtgccats:~gygagagatcatgcgaaactctggctgatctgtata360 agcgtccagtttgtctgc~aagaagacaatttctggatcataaaattccggttgactcag420 cgtcgacagagaggcttct;ccctctc:~:taatccgtt.gattaa:~<:gccagcc:actgagcggt480 gggattaacatgcclaaggc:actc~~~tt:cargcaatctt<zatatt.tcgt<:cc~ggatattct=g540 gctgaatgtagcai:ttggtat:at~3t:c:~ca:ataat.tcatc~i:ggc:.ac~at:ataqtagtggc<~t600 t~:
atctacagaagccagatt~gtaacggcc:atcttaatatcgaacagtgtagaggatttgtg660 aaggaccactgttggctgagccactat.aat:gatgacccaaacccattacatactcgtaacg720 cccgttaaggcgtaacatatctcc~qtctaattccagccatgctetcatc::c:r'.:cgcggcaca'780 ggccatttcaccgi:gtagcagat,tac7tt;~tr.t:t.cr_:acac~at.dr,Iqcagccat:~~<~gccagcaa84 acctgaatgaaaagcaaaacagcc:ataggtctctatc:acct~ct:gtcgccdgtataggctg900 gcgaaacatattgcacat<rgtgaagc:cgt.gtccatcaaattgcgcatcccaaatcatctg960 ccccatccagggaagaat:,iat:ca-a:ct:gt.c:c:acgac:U:a;tt:tgc<:at:tttaactcccctcgac1020 accgctgtcatat c;gaaaaagacgt:ctac<igt:aaaatcactattttccagc<3<3gatacgagg1080 tttctcgccaaaaa:~cgr_c,cgcc_cc:~.iaat:vtaatacgc~<Ltactcvataac:gat.-.tctcctcag1140 gacgctgtgacttcagccaagtgc~3gtacgtactttgctttcac:gccaga<3gtagactccg1200 acatagacaaagcagagcataga_caccaggaatgaaagctgtagtgagtggaacatatct1260 gcaatatatccct:Iaat~t.ccggnac:cs~ccgc:ggc:acc;gac.aatagcc:at:aacaatga<:t1320 c gctcctgccattt::tgtat:gttc~;t.t:aitc:aacagtatccar~tc;ttcctgc:at:agatcgtc1:380 gcccagcaagggc~.aaac<3aaac._,ct aggacggcgacat agaccgc:g<agaaactt1440 t<icc:
ggagccagtgcaac.atatgccag~~~aac:agcgcccctataacgcfaatagagaatcaatact1500 ttttccggattaa<iacgccftcat cac~gat~att.ggctat:aaac.t:tgccaai:aaagaagcag1560 gcaaagctataga::::atg~:3agtt tgaagcatcacctttcctttgatatcgcccaactccagc1620 gccagacggatggt:aaatc,tac:catactgc~gac:ctgcat:acc;:acataaaddaactgcgc:c1680 acaataccgcgacg:aaag<:gcggatttctagccagatagcgc<3gcgtatc~cattgctgac1740 gggcgtttatggtg:acttcJtctgt:gccactttacaggttgggaagcgggt:t:aaaaggaac1800 aacaccatgacca~;..aacca~gaat~:at;:,aatcatata<:ttat<~cctgttcaaciggtgttctct1860 aacatcagcaccttaaagi:tgtg:uat,ttgct:cggcgtt:cattc:ctgac:ai:ct:gcttttca1920 aggctttccccctcggagaaaac~~:agatat:ttgcccaataa~~~aaccag<icgcagcacca1980 atcggataaaaggt~~tggc.:tgatatt:gagecgcaatgtgge~ataggctti:tggaccgatc2040 attgaactgtatgt~~ttrc:xct..gc:e.gtaggaaact:caclgccaatcgc:aatcgcaaaa2100 t.i~ca atagctgcaagga,_~~ata<tt<ttaa~=~c3l::t,c~c:.catatgcgaggcactggaaaaa~<tagtgtacaa21 ccaccaatatacag~~gtcaggcc~:attaaaattgcca<:cttat:aactggt:cattttaat:c2220 acaagggatgctggtatt<3caatta.aaaaataacctccataaaatgcgct:ctgcaccaat2280 gctgaagcaaagttacttagcga,~aatacacttttgaattgagtgattaatatgtcattt2390 Paste 8 2003-07-C~E3 BdB corr:ige.txt f.~istage pour :1e aatgcagctgcgcat:ccccatagcg<~gaataaacacgataacaaaataaactggaacaag 2400 ggagtcttattcaciataccr_atcaggcatctgaatgat.gtttttatcgtt;catagtgcta 2460 cctttaactgtgcac~gatgattat:tc~gtataaggttaaaaattc:attaaat:tgttcaat.a2520 ctcggataagatgatagcgtaces=ti:c_cctgtgacgc2:gaaagcggcaaagagagcggct 2580 tttttcaaagcggcatcaacatcacc~c~ctttgaacataataatgggaaaagcaaccaata 2640 aatgcgtcaccagcc~ccac;tagt,:3t:s.~<~aragcatttar;tttgaatgcagctaacatggac,t2700 tcctgatcgcgggt:catcc:ataat:cacc:Lcc:tttttcgct.catggtaacaat;aatattgttc 2760 agccctttatcaar_i~aacgaacgtgcggccaaacgaat.atgatcataagtatcaaccgac 2820 ataccggttaatat.t~tccagttct:cyt~:t:cattcgggataaac~naatcacatt.tgcaggca2880 taagacatatctaactcac:gc:aatc3~~c~qgagccggatttaataacacttc:aataccattt 2990 ttcttaccaaactc~<~atc<~cgtgtt~3aactgtttc~cagttgaact.t:cc:agttgtaaaacg 3000 atcaatttgcatti:.i~ttcagatctt:~rtgcagctcgatcgatat.cttccg<iggaaagaaat 
3060 ttattcgctccctt:<~attattaavatact.attgctcgagtt~3gcattaac:aaagatcggt 3120 gcaacaccactgc!::c~gtac::aggg~,~a;attctcaacataagt:ggtattaat:.t:ccccatgat3180 tcgagattacgaatagtattatc_:gcaaaaatatcatcacctactttagti_agcatcagg3290 acttttgaattcaavttac7ccgc~~:g~car_cgcttgattagcacctttccc:accacatccg 3300 attttgaaggcaggtgctt:.ccag,ugt:ttct:ccttc~ttt:aggr,atctgatt:agtgtaagta 3;360 atgagatccacca~t,3ttg<~aacc~uat;aactgcaatgtccattt cactacc:t_cttataaac3420 tttcgcataacaatggtatataaataacattagcatgttacttttgcatcatttgtgact 3480 gagatcgcgatta~;i;:acat:caac~:cgai~gttt.atttaatagac-ttccagtcttatcactc 3540 aggccaacactat::taatc:ataactcaac;ctaacaggat:t:aataccgaaaat~t:cagcagtc3600 tatacccttttcatttcaaagggt:cggtcgtatagtat~ggt.-3ar_taaaac:aatgtttact 3660 aatgccataatgtt.atttttataacattttacggagagagttgatggaaacgaagcaaaa 3720 agagcgtatccgacgtttgat:tg_uaatact.taagaaaaccgac:agaatccatttgaaaga 3780 cgcggcacgaatg,ctgga<igt:tr_c.t:cttaat:gactatt.<:gtagc:vgatct:cc~at=caggaaga3840 tgaacctctgccactgaccctact:gggtggctatattgt:aatggtgcataaacccgcacc 3900 atccatgccagtaatccaggacgt:tccgagaaatcatc:gtgatgactt.acctattgcaat 3960 tctggccgccggaatggttaatgaaaatgat:ctgatca:t:cttt.gataaat~:~gccaggagat 4020 accgctcgttataagcatgatccc:ggatycaatcacc:ttcactggr_atc=gttactcaca 4080 tcgtgtcttt gttgcgttgaatgaaaaacc:taatgtgar_agcaatactttgtggtggtac 4140 gtatcgtgccagaagtgatgc.~tt.i.t:t:.acc:tatgccagt:aact<a.tcgccatt:agactctct4200 caatccgcgaaaaatattt:atttc.ccaccagc;ggtgta~c~atgat:cactttggcgtcagctg 4260 gtttaatcccgaagatcttgccactaagcgtaaagcgatggcccgtggactaaggaaaat 4320 tttgctcgcccgcc:acgc~:atgt.tcgatgaagtagcctctgc<aagcct:cgc~accgctctc4380 tgcatttgat gttctgattagcgagc.gtccgt:t:accctgcagat;.tatgttacgcactgccg 4440 gaatgcttcgtaaagat~,at t. t:cactaaagacgautga 4486 ttacvacctga (2) INFORMATION FOR SEQ II:7 N0: 7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 337 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
Met Ser Thr Arg Ile Asn i_.n_u:: Trp Arg F,~' a f.~eu Fnee G1y Glu L~ys I. ~ ~. 5 Pro Arg Ile Leu Leu Glu F.:=,n. Ser P.sp P:.e Thr ua!. ~'hr Ser Phe 20 <:~~ 30 Arg Tyr Asp Ser Gly Val ~:l.m Gl~Y~ heu hys I1_e P.la Asn Ser Arg 35 =?(i e5 Gly His Leu Ile Ile Leu Pro Trp Met G~.y Gln Mev Ile Trp Asp 50 5!-i 60 Ala Gln Phe Asp Gly His G..y Leu Thr Met Cys Asn Met Phe Arg Pave 9 2003-07-O8 histage poeir1e BdEcorrige.txt GlnProLys PrrrAlaThr G ';!alI GJ.Thr'TyxGly CysPhe 1.,a 1e a 80 8'~ 90 AlaPheHis Se:rGlyLeu h_~uAlaAsn G:LyCysPrc:~Ser ValGlu 95 100 ~05 AspThrHis Le~_iLeuHis Gl.yG1L.Met.Ala(:ysAlaAI_aMetAsp GluAlaTrp Le,zGluLeu A.>p.,1yAsp MetLeuAr_c)Leu AsnGly ArgTyrGlu TyrValMet.Gl_yPheGly HisHi_sTyr:-Leu AlaGln ProThrVa1 Va:LLeuHis hysSerSer Thr7_~euPheAsp ILeLys 155 160 1.65 MetAlaVal ThrAsnLeu A=_aSerVal AspMetPrc-.Leu GLnTyr 170 175 1.80 MetCysHis MetAsnTyr A:LaTyrIle ProAsnAlaThr PheSer GlnAsnIle ProAspGlu I:LeL,euArg LeuArgGlmSer ValPro SerHisVal AsnProThr RiaGl_nTrp LeuA:LaPheAsn GLnArg IleMetGln GlyGluAla :>c~rLeiaSer ThrLeuSerGln ProGlu PheTyrAsp ProGluIle Val.PhePhe ALaAspLysLeu AspAla 295 2~,0 255 TyrThrAsp GlnProGlu PlueArgMet IieSerPrc~Asp GlyThr 260 2E>5 270 ThrPheVal ThrArgPhe TyxSerA:laGl.uLeuFsnTyr Va1Thr ArgTrpIle LeuTyrAsn C:lyG1~~Gl.nGLnValAl..:xA1a PheAla LeuProAla ThrCysArg k'rc>GluG_y TyrLeuAlaAla G.lnArg 305 31.0 315 AsnGlyThr LeL.IleGln V,a:l.AlaPro GLnGlnThr_Arg ThrPhe 320 325 ;330 ThrValThr Tl-~rGlyIle Ca:Ll.i ( 2 ) INFORMATIOI~I FOR. SEQ I 1) I'!0 : 8 :
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 438 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
MetAsnAsp LysAsnIle I:Le;7 MetP:r Faspu.LyTyr LE.~uAsn n o 5 7.0 7.5 LysThrPro LeuPheGln PheI1_eheuLeu L~er.~.~JSLeu PheI~'ro LeuTrpGly Cy::~AlaAla Ala?~euAsnAsp Il.eLei:Ile TtArGin 35 4~ 45 PheLysSer Va:LPheSer IaE_;a:3erAsnP'~eA1aSerAla LeuVal 50 5.'> 60 GlnSerAla Ph<~TyrGly G 'ryrP:heLeu 7: i~ I ProAla l 1e La 1e y 65 70 'S
SerLeuVal IleLysLys T!-~r,:perTyrLys ValAlaI1_eLeu7:1e GlyLeuThr Le~.rTyrI1e Gl.y~:~'.yCysThr LeuP:ze.~Phe PwoAla SerHisMet A1.3ThrTyr Tr:r!hetPheLeu AlaA1<~ile PheAla 110 1'15 120 IleAlaIle GlyLeuSer F:hc~_LeuGluThr F~7.aAlaA.snThrTyr 125 1:~0 '_35 SerSerMet IleGlyPro hysAiaTyrAia ThrL~uArg LceuAsn 140 145 ~~50 IleSerGln ThrPheTyr ProI:LGlyAla A7.aS2rGly I:l_eI~eu e:.
155 1E~0 165 LeuGlyLys Tyr_LeuVa:LFheSerGluG1y GluSerLeu GluLys 1'70 175 180 GlnMetSer GlyMetAsn A7 G:LiaGlnI HisAsrzPhe LysVal a 1e LeuMetLeu GluAsnThr.LeuGluProTyr I~ysTyrMet I:LeMet IleLeuVal Va:LValMet.Va7.LeuPheLeu LeuThr-Arg PhePro 215 220 'Z25 ThrCysLys Va:LAlaGln ThrSerHisHis -LysArgPro SerAla MetAspThr LeuArgTyr L,euA.laArgAsn ProAr<(Phe ArgArg GlyIleVal AlaGlnPhe L,~~uT ValGly MetGlnVal AlaVal yr TrpSerPhe Th.rIleArg I,euAlaLeuGlu :LeuGlvrAsp I1eAsn GluArgAsp A1;~SerAsn PheMetVa1Tyr SerPheAla CysPhe 2003-07-(:~E3 L~istage pour1e BdBc:orrige.
txt PheIleGly LysPheIle A_aAsnIleLeu MetThrArg PheAsn ProGluLys Va~_LeuIle Lea.x'CyrSerV<i:LLleG:LyAla LeuPhe LeuAlaTyr VaI.Al.aLeu A:LaProSerPhe SerAlaVal TyrVal AlaValLeu ValSerVal L<:uPheGlyPro CysTrpAla ThrIle TyrA1aGly ThwLeuAsp Th.r',7a1AspAsn GluIsisThr GluMet AlaGlyAla Vaa.Il.eVal McaA.laIl.eVal GlyALaAla ValVal ProAlaIle GlnGlyTyr IleAlaAspMet PheH.isSex LeuGln LeuSerPhe LeuValSer M._aLeuC:ysPhe ValT Val G7_yVal yr TyrPheTrp ArgGluSer LysValArgThr AlaLeuAla GluVal Thr Ala Ser (2) INFORMATIOiV FOR SEQ_ IC~ NO: 9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 306 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
Met Asp Ile Ala Val Ile G:y Ser Asn Met 'JaJ. Asp heu Ile Thr 10 :15 Tyr Thr Asn Gln Met Pro L~js Glu Gly Glu 'rrr Leu Glu Aia Pro Ala Phe Lys Ile Gly Cys C:,:Ly G1y hys Gl.y Al.a As~a Gln ALa 'Jal Ala Ala Ala Lys Leu Asn Scar Lys Va1 Leu Met Leu Thr Lys Val 50 5'60 Gly Asp Asp Ile Phe Ala A:~L:~ Asn Thr T:ie Arg Asu heu Glu Ser Trp Gly Ile Asn Thr Thr '1'yr Val Glu Lys Val Pro Cys Thr Ser Ser Gly Val Ala Pro Ile Phe Val Asn Ala Asn Ser Ser Asn Ser 95 1. C)0 1.05 2003-07 -08I~ist~age pourle BdBcorr:ige.txt IleLeuIle IleLysGly ALaAsnLysPhe heuSerProGlu Asp IleAspArg AlaAlaGlu Asp!:.~euhysLys O:yshya,LeuI7_eVal 125 130 1.35 LeuGlnLeu GluVal.Gin LeuGluThrVal 'I'yrH:LsAl.aIl.eGlu 140 195 1.50 PheGlyLys LysAsnGly ILeGluValLeu L~euAsnProAl.aPro 155 160 I_65 AlaLeuArg GlnzLeuAsp Met:SerTyrA:laCysLysCysAsp Phe 1'70 115 180 PheIlePro AsnGlu'rhrGLuLeuGluIle LeuThxGlyMet Ser ValAspThr Ty:c:AspHis I .?ergLeuAla AlaArc)SerLeu Val Le AspLysGly LeuAsnAsn Il.e:L:LeVal.'rhrMetS~~rG1_uLys Gly 215 220 2.25 AlaLeuTrp Met.ThrArg A~;p~:LnGluVal HisVaif'roAl.aPhe LysValAsn Al._iValAsp ThrSerGlyAl.aGlyAspAlaPhe Ile GlyCysPhe Ser_HisTyr Ty:rValGlnSer GlyAsiaValG_LuAla AlaLeuLys LysAlaAla LeuPheAl.aAla hheSerValThr Gly LysGlyThr GlnSerSer TyrProSerIle GluG1nPheA.snGlu PheLeuThr LeuAsnGlu (2) INFORMATION FOR SEQ LD N0: 10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 306 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
Met Asp Ile Ala Val Ile G':y Sez Asn Mc~t Jai Ast~ Leu I1e Thr 1C' 15 Tyr Thr Asn Gl~z Met Pro I~ys G1u GI_y Glu Thr L,e G.I_u A1a Pro Ala Phe Lys Ile Gly Cys GL.y Gly Lvs G?~y .~lla Pan Ciln Ala Val 35 4C) 95 2003-07-C3 '~~is"_:age po~:_ir.1e I3clBcorrige.txt AlaAlaAla Lye;LeuAsn Ser,ysVa.LLeu MetLeu'l LysVal hr G1yAspAsp Ile:PheA1 A:>palsnTh.rI7_e~,rgAsnLeu Gl.uSer a 65 70 7~
TrpGlyIle Asr:ThrThr Tsrr'1<a1.C~:l.uLy,>N'alPrc~:ysThrSer SerGlyVal AlaProIle PheValAsnAia AsnSerSer AsnSer 95 i~:?~ 105 IleLeuIle I1<;LysGly A:laAsnLysPhe LeuSerPro GluAsp IleAspArg AlmAlaGlu A:~p:L,euLysLys C'.ysLysLeu I~.eVal LeuGlnLeu GlozValGln L. GluThrVal TyrH A7_aI:LeGlu ~u L:;
PheGlyLys LysAsnGly Il.eGluVal.Leu LeuAsr;Pro A1aPro 155 1.60 165 AlaLeuArg GluLeuAsp MeatSerTyrAla CysLysCys AspPhe PheIlePro AsnGluThr Gl.uLeuGluIle L,eu'rhrGly MetSer ValAspThr TyrAspHis IleArgL,euAla AlaArc;Sex LeuVal AspLysGly LeuAsnAsn I'_eIleValThr MetSerGlu LysGly AlaLeuTrp MetThrArg AspGLnGl.uVal HisValPro AlaPhe LysValAsn AlaValAsp 7'h~:SerGl.yA:laGlyAspAla P:helle GlyCysPhe Se:rHisTyr TyrValGl.nSer Gl.yAspVal GluAla AlaLeuLys LysAlaAla I:E=u.PheAlaAla PheSerVal ThrGly 2.75 280 285 LysGlyThr GlnSerSer TyrPr<>SerI GluGl.nPhe AsnGlu ~e 290 2.95 300 PheLeuThr LeuAsnGlu ( 2 ) INFORMATI01\! FOR SEQ I D NC): 11 (i) SEQUEI~ICE CHARACTERISTICS:
(A) LENGTH: 834 nucleotides
(B) TYPE: nucleic acid
(ii) MOLECULE TYPE: DNA
2003-07-C'_'s histage poszr 1e BdB c:orrige.txt (vi) ORIGINF~L SOURCE: lsc:herichia coli ( xi ) SEQUENCE DESCRI P'l.' LC)N : S EQ I D N0 : 1 1 caatactcggataactatgattgccttacctttccctgtgacgcagaaagcggcaaagagag 60 cggcttttttcaaagcggcttca<ccat:caccgctttgaacat aataatgggaaaagcaac 120 caataaatgcgtc<~c:cagcgccar:t<~cJt:atcaacagcatttactttgaat:gcaggaacat 180 ggacttcctgatc<Jc:gggt:catcc:ai:~~atg<:gcctttttcgctcatggtaacaataatat 2.40 tgttcagccctttat:caact=aaccJaac;gtgcggcc:aaacgaatatgatcat:aagtatca.a300 ccgacataccggttt~atat:ttcc~igi:t_<agt:ttcattcgggataaagaaat: cacatttgc360 aggcataagacatat:ctaactca<:gc:.aatgccggagccggat:ttaataac:acttcaatac 420 catttttcttacca.aactcaatcgcgtggtaaactgt?-tccagttgaactt:ccagttgta480 aaacgatcaattt<Jcattt:tttcrigat:ctr.ctgcagctcgatcgatatctt:ccggggaaa540 gaaatttattcgct:cccttaattati_aatatacta"~tgct,,:,~t:gttggc~c't:taacaaaga600 tcggtgcaacaccacagctggtac,~gc~ggactttctcaacata.agtggtat:taattcccc660 atgattcgagattac:gaatgta':t,~;:~cgcaaaa;~tatca'~c a:~ctact,t:t.agttagc;a720 a tcaggacttttgaai:tcaactttacJcc:~c:cgc<:ac~~gct..t:gat:tac~cacctt:t:.cccaccac780 atccgattttgaacxgcaggtgctt~:~c~agagtttc;:cctt:ct'Ytaggcatc:t:gat 8:34 ( 2 ) INFORMATIOt~I FOR SEQ IC~ 1v0: 12 ( i ) SEQUEtVC:E CHARACTL,ft . STI CS
(A) LENGTH: 816 nucleotides
(B) TYPE: nucleic acid
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE: Escherichia coli
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
ggacgataatgtg,_~rcgtc::tata,~q;Jgcaacgctatcatagtcatgtcct:ggcgggtaaa 60 aaaacgcgcttaccttaa<:gata~~~g.~.gcgccgctgttc:aggccttgagt.<~gttattcaat 120 tcctgtggtgact~ataaaagtgc~:~cqtttgctgcggtcJcaa :cvtgaatcac~cgtgccat:t180 acgttgcgcggcaagatac:cc:ct~::ag:~<:cgacaggttgc:aggt:aatgcaaaggcggctac 240 ctgttgctctccgt:tata~~aggatccagcqtgtcacat:aa-tt agttcagc~actgtagaa300 acgagtaacaaac:Jtagtgccat.:gcx:~agagatcatgcgaasc:t~ctggctgatctgtata 360 agcgtccagtttgt~tgct3aagaagac ttct:ggat:c:a!:aar.aattcccJgttgactcag 420 aat cgtcgacagagag;Jcttct:ccct.~fc:ataatccgttgattaaacgccagcc::actgagcggt 480 gggattaacatgc;fiaagg~actg=u.t caatctt<iatatt:tcgtccgcJgatattctg540 c<~cg gctgaatgtagcat:ttggn:atat:~stgcat:.aat.tcatgt:ggcac:atatat?:dt:agtggcat600 atctacagaagccagatt<;gttac:ggccatct.taatat=cgaac:agtgtac4aggatttgtg 660 aaggaccactgttc~gctg<igccac~~ataatgat:gaccgaaaccc:attaca'_actcgtaacg '720 cccgttaaggcgtaacata-itctccgtctaattccagrcatgcttcatcc~itcgcggcaca 780 ggccatttcaccgi:gtagc:agat_c~~agtatc:ttc:cac 816 PacJee 1. S
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a schematic illustrating the deoK operon in Escherichia coli.
Figure 2 represents the nucleic acid and amino acid sequences of the deoK operon in Escherichia coli strain AL862. The underlined sequence corresponds to probe A and the double-underlined sequence corresponds to probe B. Bold nucleotides correspond to the primers used in the PCR assay to amplify probes A and B.
Figure 3 represents the nucleic acid and amino acid sequences of the deoK operon in Escherichia coli strain 55989.
Figure 4 represents the nucleic acid sequence of Probe A.
Figure 5 represents the nucleic acid sequence of Probe B.
DETAILED DESCRIPTION OF THE INVENTION
A) Definitions
Throughout the text, the word "kilobase" is generally abbreviated as "kb", the words "deoxyribonucleic acid" as "DNA", the words "ribonucleic acid" as "RNA", the words "complementary DNA" as "cDNA", the words "polymerase chain reaction" as "PCR", and the words "reverse transcription" as "RT". Nucleotide sequences are written in the 5' to 3' orientation unless stated otherwise.
In order to provide an even clearer and more consistent understanding of the specification and the claims, including the scope given herein to such terms, the following definitions are provided:
Antisense: As used herein in reference to nucleic acids, means a nucleic acid sequence, regardless of length, that is complementary to the coding strand of a gene.
Expression: Refers to the process by which gene-encoded information is converted into the structures present and operating in the cell. In the case of cDNAs, cDNA fragments and genomic DNA fragments, the transcribed nucleic acid is subsequently translated into a peptide or a protein in order to carry out its function, if any. By "positioned for expression" is meant that the DNA molecule is positioned adjacent to a DNA sequence which directs transcription and translation of the sequence (i.e., facilitates the production of, e.g., a deoK polypeptide, a recombinant protein or an RNA molecule).
Fragment: Refers to a section of a molecule, such as a protein, a polypeptide or a nucleic acid, and is meant to refer to any portion of the amino acid or nucleotide sequence.
Host: A cell, tissue, organ or organism capable of providing cellular components for allowing the expression of an exogenous nucleic acid embedded into a vector. This term is intended to also include hosts which have been modified in order to accomplish these functions. Bacteria, fungi, animal (cells, tissues, or organisms) and plant (cells, tissues, or organisms) are examples of a host.
Isolated or Purified or Substantially pure: Means altered "by the hand of man" from its natural state, i.e., if it occurs in nature, it has been changed or removed from its original environment, or both. For example, a polynucleotide or a protein/peptide naturally present in a living organism is not "isolated", the same polynucleotide separated from the coexisting materials of its natural state, obtained by cloning, amplification and/or chemical synthesis is "isolated" as the term is employed herein. Moreover, a polynucleotide or a protein/peptide that is introduced into an organism by transformation, genetic manipulation or by any other recombinant method is "isolated" even if it is still present in said organism.
Nucleic acid: Any DNA or RNA sequence or molecule having one nucleotide or more, including nucleotide sequences encoding a complete gene. The term is intended to encompass all nucleic acids whether occurring naturally or non-naturally in a particular cell, tissue or organism. This includes DNA and fragments thereof, RNA and fragments thereof, cDNAs and fragments thereof, expressed sequence tags, and artificial sequences including randomized artificial sequences.
Open reading frame ("ORF"): The portion of a cDNA that is translated into a protein. Typically, an open reading frame starts with an initiator ATG codon and ends with a termination codon (TAA, TAG or TGA).
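The ORF definition above can be illustrated with a minimal sketch. It assumes a single forward strand, the ATG initiator and the three termination codons named in the definition, and a hypothetical helper name (find_orfs); it is not the computer analysis used by the inventors, which the text does not describe.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=2):
    """Return sorted (start, end) index pairs of ATG...stop ORFs on the forward strand.

    'end' is exclusive and includes the termination codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):                      # scan each of the three reading frames
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                       # first initiator codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                    # look for the next ORF in this frame
    return sorted(orfs)

# Toy sequence (hypothetical, not taken from the sequence listing).
print(find_orfs("CCATGAAATAGGG"))
```

A fuller analysis would also scan the reverse complement and consider alternative bacterial start codons; both are omitted here for brevity.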
Percent identity and Percent similarity: Used herein in comparisons between nucleic acid and/or amino acid sequences. Sequence identity is typically measured using sequence analysis software with the default parameters specified therein (e.g., Sequence Analysis Software Package of the Genetics Computer Group, University of Wisconsin Biotechnology Center, 1710 University Avenue, Madison, WI 53705). This software program matches similar sequences by assigning degrees of homology to various substitutions, deletions, and other modifications. Conservative substitutions typically include substitutions within the following groups: glycine, alanine, valine, isoleucine, leucine; aspartic acid, glutamic acid, asparagine, glutamine; serine, threonine; lysine, arginine; and phenylalanine, tyrosine.
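As a rough illustration of these two measures, the sketch below counts matches position by position over two sequences that are assumed to be already aligned to the same length, with "-" as the gap character. It is not the algorithm of the software package cited above; the function names and the gap handling (a residue opposite a gap counts as a mismatch) are assumptions made for this example. The conservative groups are those listed in the definition.

```python
# Conservative substitution groups as listed in the definition above (one-letter codes).
CONSERVATIVE_GROUPS = ("GAVIL", "DENQ", "ST", "KR", "FY")

def _percent(aligned_a, aligned_b, is_match):
    """Percentage of compared alignment columns for which is_match is true."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
             if not (x == "-" and y == "-")]    # ignore columns that are gaps in both
    if not pairs:
        raise ValueError("no aligned positions to compare")
    return 100.0 * sum(is_match(x, y) for x, y in pairs) / len(pairs)

def percent_identity(aligned_a, aligned_b):
    """Percent of columns with identical residues (nucleotides or amino acids)."""
    return _percent(aligned_a, aligned_b, lambda x, y: x == y and x != "-")

def percent_similarity(aligned_a, aligned_b):
    """Percent of columns that are identical or conservative amino acid substitutions."""
    def similar(x, y):
        if x == "-" or y == "-":
            return False
        return x == y or any(x in g and y in g for g in CONSERVATIVE_GROUPS)
    return _percent(aligned_a, aligned_b, similar)
```

With these definitions, two peptides can show low identity but high similarity when every difference falls within one of the conservative groups.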
Polypeptide or Protein: Means any chain of more than two amino acids, regardless of post-translational modification such as glycosylation or phosphorylation.
Potentially pathogenic: Refers to a strain which has the capacity to be involved in a pathogenic process. Examples of potentially pathogenic strains are extra-intestinal E. coli strains which are distinct from the commensal and from the intestinal pathogenic strains.
Specifically binds: Means an antibody that recognizes and binds a protein or polypeptide but that does not substantially recognize and bind other molecules in a sample, e.g., a biological sample, that naturally includes protein.
Substantially the same: Refers to nucleic acid or amino acid sequences having sequence variation that do not materially affect the nature of the protein.
With particular reference to nucleic acid sequences, the term "substantially the same" is intended to refer to the coding region and to conserved sequences governing expression, and refers primarily to degenerate codons encoding the same amino acid, or alternate codons encoding conservative substitute amino acids in the encoded polypeptide. With reference to amino acid sequences, the term "substantially the same" refers generally to conservative substitutions and/or variations in regions of the protein that are not involved in determination of structure or function of the protein. "Substantially the same" encompasses "degenerate variants" of nucleic acid or amino acid sequences.
Substantially pure polypeptide: Means a polypeptide that has been separated from the components that naturally accompany it. Typically, the polypeptide is substantially pure when it is at least 60%, by weight, free from the proteins and naturally-occurring organic molecules with which it is naturally associated. Preferably, the polypeptide is at least 75%, 80%, or 85%, more preferably at least 90%, 95% or 97% and most preferably at least 99%, by weight, pure. A substantially pure polypeptide or protein may be obtained, for example, by extraction from a natural source (including but not limited to E. coli), by expression of a recombinant nucleic acid encoding the polypeptide, or by chemically synthesizing the protein. Purity can be measured by any appropriate method, e.g., by column chromatography, polyacrylamide gel electrophoresis, or HPLC analysis.
A protein is substantially free of naturally associated components when it is separated from those contaminants which accompany it in its natural state.
Thus, a protein which is chemically synthesized or produced in a cellular system different from the cell from which it naturally originates will be substantially free from its naturally associated components. Accordingly, substantially pure polypeptides include those derived from eukaryotic organisms but synthesized in E. coli or other prokaryotes. By "substantially pure DNA" is meant DNA that is free of the genes which, in the naturally-occurring genome of the organism from which the DNA of the invention is derived, flank the gene. The term therefore includes, for example, a recombinant DNA which is incorporated into a vector; into an autonomously replicating plasmid or virus; or into the genomic DNA of a prokaryote or eukaryote; or which exists as a separate molecule (e.g., a cDNA or a genomic or cDNA fragment produced by PCR or restriction endonuclease digestion) independent of other sequences. It also includes a recombinant DNA which is part of a hybrid gene encoding an additional polypeptide sequence.
Transformed or Transfected or Transduced or Transgenic cell: Refers to a cell into which (or into an ancestor of which) has been introduced, by means of recombinant DNA techniques, an exogenous DNA molecule encoding a polypeptide of interest. By "transformation" is meant any method for introducing foreign molecules into a cell. Lipofection, calcium phosphate precipitation, retroviral delivery, electroporation, and ballistic transformation are just a few of the teachings which may be used.
Vector: A self-replicating RNA or DNA molecule which can be used to transfer an RNA or DNA segment from one organism to another. Vectors are particularly useful for manipulating genetic constructs, and different vectors may have properties particularly appropriate to express protein(s) in a recipient during cloning procedures and may comprise different selectable markers. Bacterial plasmids are commonly used vectors. Modified viruses such as adenoviruses and retroviruses are other examples of vectors.
B) General overview of the invention
The present inventors have shown that a sugar (deoxyribose) that is not fermented by E. coli K12 is metabolized by a large number of pathogenic isolates belonging to various pathotypes. The present inventors have also identified the genes encoding this function and demonstrated that they are conserved among several pathogenic strains. The present inventors have further developed genetic and bacteriological assays to identify deoxyribose-positive E. coli strains.
i) Cloning and molecular characterization of the deoK operon in E. coli
As will be described hereinafter in the exemplification section of the invention, the inventors have discovered, cloned and sequenced the DNA encoding the deoK operon in two pathogenic strains of E. coli. The DNA sequences and the predicted amino acid sequences of the encoded proteins are shown in Figures 2 and 3. Computer analysis revealed four open reading frames (ORFs), deoX, deoP, deoK, and deoQ, which mapped to the same loci as, and had sequences similar to, the deoX, deoP, deoK, and deoQ genes of the deoK operon from Salmonella, respectively (see Figure 1).
The function of deoP, deoK, and deoQ is known. These E. coli genes encode a putative 2-Deoxy-D-ribose permease, a deoxyribokinase and a putative repressor protein, respectively. The function of deoX remains to be elucidated. The deoX gene encodes a protein 337 amino acids (A.A.) long. In silico analysis indicates that the protein has the following features: a molecular weight of about 38 kDa; an isoelectric point of about 5.2; an instability index of about 45.4 (i.e., unstable); an aliphatic index of about 79.6; and a grand average of hydropathicity (GRAVY) of about -0.136.
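Two of the in silico parameters quoted above have simple closed-form definitions. As a minimal sketch, assuming the Kyte-Doolittle hydropathy scale for GRAVY and the commonly used coefficients 2.9 (valine) and 3.9 (isoleucine plus leucine) for the aliphatic index, neither of which is stated in the text, they could be computed from an amino acid sequence as follows. The function names and toy sequences are illustrative; the text does not say which tool the inventors used.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the text does not name one).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(protein):
    """Grand average of hydropathicity: mean hydropathy value over all residues."""
    protein = protein.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in protein) / len(protein)

def aliphatic_index(protein):
    """Aliphatic index from the mole percentages of Ala, Val, Ile and Leu."""
    protein = protein.upper()
    n = len(protein)
    mole_pct = {aa: 100.0 * protein.count(aa) / n for aa in "AVIL"}
    return mole_pct["A"] + 2.9 * mole_pct["V"] + 3.9 * (mole_pct["I"] + mole_pct["L"])

# Toy peptides (hypothetical, not the deoX sequence).
print(gravy("AL"), aliphatic_index("AVIL"))
```

A negative GRAVY, such as the value of about -0.136 reported for the deoX product, indicates a protein that is slightly hydrophilic overall.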
ii) deoK homology with other genes and proteins
As shown in Table 1 in the exemplification section, a BLAST search indicates that the deoK operon in E. coli shares a high level of identity with the deoK operon in S. Typhi (about 75 to 80%).
Therefore, the present invention concerns an isolated or purified nucleic acid molecule (such as DNA) comprising a sequence selected from the group consisting of:
a) sequences provided in part or all of SEQ ID NO: 1 or 6;
b) complements of the sequences provided in part or all of SEQ ID NO: 1 or 6;
c) sequences consisting of at least 20 contiguous residues of a sequence provided in SEQ ID NO: 1 or 6;
d) sequences that hybridize to part or all of nucleic acids of SEQ ID NO: 1 or 6, under moderate, preferably high, stringency conditions;
e) sequences having at least 80% identity to part or all of SEQ ID NO: 1 or 6;
f) degenerate variants of a sequence provided in part or all of SEQ ID NO: 1 or 6; and
g) sequences encoding part or all of polypeptides provided in SEQ ID NO: 2-5 and 7-10.
More preferably, the nucleic acid molecule of the invention comprises a sequence selected from the group consisting of:
a) a nucleotide sequence having at least 80%, 85%, 90%, 95% or 97% nucleotide sequence identity with SEQ ID NO: 1 or 6; and
b) a nucleotide sequence having at least 80%, 85%, 90%, 95% or 97% nucleotide sequence identity with a nucleic acid encoding an amino acid sequence of SEQ ID NO: 2-5 and 7-10.
More preferably, the nucleic acid molecule comprises a sequence substantially the same or having 100% identity with SEQ ID NO: 1 or 6, or a sequence substantially the same or having 100% identity with nucleic acids encoding an amino acid sequence of SEQ ID NO: 2-5 and 7-10.
The present invention also concerns isolated or purified nucleic acid molecules comprising a sequence encoding an E. coli polypeptide involved in metabolization of 2-Deoxy-D-ribose, or degenerate variants thereof, the E. coli polypeptide or degenerate variant comprising part or all of SEQ ID NO: 2-5 and 7-10.
The present invention also concerns an isolated or purified nucleic acid molecule which hybridizes under moderate, preferably high, stringency conditions with part or all of any of the nucleic acid molecules of the invention mentioned hereinbefore or with part or all of a complementary sequence thereof. The "hybridizing" nucleic acid could be used as a probe or as an antisense molecule, as will be described hereinafter.
In a related aspect, the present invention concerns an isolated or purified polypeptide or a protein comprising an amino acid sequence selected from the group consisting of:
a) sequences encoded by a nucleic acid as defined previously;
b) sequences having at least 80% identity to part or all of any of SEQ ID NO: 2-5 and 7-10;
c) sequences having at least 85% homology to part or all of any of SEQ ID NO: 2-5 and 7-10; and
d) sequences provided in part or all of any of SEQ ID NO: 2-5 and 7-10.
More preferably, the polypeptide comprises an amino acid sequence substantially the same or having 100% identity with any of SEQ ID NO: 2-5 and 7-10. Most preferred polypeptides are those having a biological activity that permits E. coli to metabolize 2-Deoxy-D-ribose.
iii) Anti-deoK antibodies
The invention features purified antibodies that specifically bind to a protein encoded by the E. coli deoK operon. The antibodies of the invention may be prepared by a variety of methods using the deoK proteins or polypeptides described above. For example, the deoK polypeptide, or antigenic fragments thereof, may be administered to an animal in order to induce the production of polyclonal antibodies. Alternatively, antibodies used as described herein may be monoclonal antibodies, which are prepared using hybridoma technology (see, e.g., Hammerling et al., In Monoclonal Antibodies and T-Cell Hybridomas, Elsevier, NY, 1981).
The invention features antibodies that specifically bind E. coli deoK operon polypeptides, or fragments thereof. In particular, the invention features "neutralizing" antibodies. By "neutralizing" antibodies is meant antibodies that interfere with any of the biological activities of any of the E. coli deoK operon polypeptides, particularly the ability of E. coli to metabolize 2-Deoxy-D-ribose. The neutralizing antibody may reduce the ability of E. coli deoK proteins to metabolize 2-Deoxy-D-ribose, preferably by 50%, more preferably by 70%, and most preferably by 90% or more. Any standard assay of 2-Deoxy-D-ribose metabolization, including those described herein, may be used to assess potentially neutralizing antibodies. Once produced, monoclonal and polyclonal antibodies are preferably tested for specific recognition of deoK proteins by Western blot, immunoprecipitation analysis or any other suitable method.
In addition to intact monoclonal and polyclonal anti-deoK antibodies, the invention features various genetically engineered antibodies, humanized antibodies, and antibody fragments, including F(ab')2, Fab', Fab, Fv and sFv fragments. Antibodies can be humanized by methods known in the art. Fully human antibodies, such as those expressed in transgenic animals, are also features of the invention.
Antibodies that specifically recognize deoK proteins (or deoK fragments), such as those described herein, are considered useful to the invention. Such an antibody may be used in any standard immunodetection method for the detection, quantification, and purification of deoK proteins. The antibody may be a monoclonal or a polyclonal antibody and may be modified for diagnostic purposes.
The antibodies of the invention may, for example, be used in an immunoassay to monitor deoK expression levels, to determine the subcellular location of a deoK protein or deoK fragment produced by E. coli, to determine the amount of deoK protein or fragment thereof in a biological sample, and to evaluate the pathogenicity of a strain of E. coli.
In addition, the antibodies may be coupled to compounds for diagnostic and/or therapeutic uses, such as gold particles, alkaline phosphatase or peroxidase, for imaging and therapy. The antibodies may also be labeled (e.g., for immunofluorescence) for easier detection.
iv) Identification of E. coli pathogenic strains
According to the present invention, the ability of the E. coli strain to metabolize 2-Deoxy-D-ribose and/or the presence of genes or proteins involved in metabolization of 2-Deoxy-D-ribose in the E. coli strain is indicative that this strain is pathogenic or at least potentially pathogenic.
Therefore, the invention provides a method for evaluating pathogenicity of a strain of E. coli comprising assaying a metabolic activity of that strain.
Preferably, the metabolic activity consists of metabolization of 2-Deoxy-D-ribose and the assessment step consists of growing the strain on a minimal medium comprising 2-Deoxy-D-ribose as a sole source of carbon.
The antibodies described above and the probes described hereinafter may be used to monitor deoK protein expression and/or to identify a pathogenic strain of E. coli in a biological sample or in a human or an animal subject. Accordingly, the invention provides a method for identifying a pathogenic strain of E. coli and/or for evaluating the likelihood of pathogenicity of a strain of E. coli as compared to a commensal strain.
According to a first embodiment, the method comprises assaying the E. coli strain for the presence of genes or proteins involved in metabolization of 2-Deoxy-D-ribose. Preferably, oligonucleotides such as probes, or cloned nucleotide (RNA or DNA) fragments corresponding to unique portions of genes and proteins from the deoK operon are used to assess cellular levels of deoK proteins or to detect deoK mRNAs (both indicative of E. coli pathogenicity). Such an assessment may also be done in vitro using well-known methods (Northern analysis, PCR, quantitative PCR, microarrays, etc.). The methods of the invention may be carried out by contacting, in vitro or in vivo, an E. coli isolate or a biological sample (such as urine, feces, blood, or cerebrospinal fluid) from an individual or an animal suspected of harboring pathogenic E. coli, or an extract thereof, with an anti-deoK antibody or a probe according to the invention, in order to determine the presence or evaluate the amount of deoK proteins or genes in the sample or the cells therein.
According to a preferred embodiment, the method comprises assessment of the E. coli strain for the presence of a nucleic acid sequence selected from the group consisting of:
a) sequences provided in part or all of SEQ ID NO: 1 or 6;
b) complements of the sequences provided in part or all of SEQ ID NO: 1 or 6;
c) sequences consisting of at least 20 contiguous residues of a sequence provided in SEQ ID NO: 1 or 6;
d) sequences that hybridize to part or all of nucleic acids of SEQ ID NO: 1 or 6, under moderate, preferably high, stringency conditions;
e) sequences having at least 80% identity to part or all of SEQ ID NO: 1 or 6;
f) degenerate variants of a sequence provided in part or all of SEQ ID NO: 1 or 6; and
g) sequences encoding part or all of polypeptides provided in SEQ ID NO: 2-5 and 7-10.
According to another preferred embodiment, the method comprises assessment of the E. coli strain for the presence of a polypeptide comprising an amino acid sequence selected from the group consisting of:
a) sequences encoded by a nucleic acid as defined in claim 7;
b) sequences having at least 80% identity to part or all of any of SEQ ID NO: 2-5 and 7-10;
c) sequences having at least 85% homology to part or all of any of SEQ ID NO: 2-5 and 7-10; and
d) sequences provided in part or all of any of SEQ ID NO: 2-5 and 7-10.
Accordingly, the invention encompasses nucleotide probes comprising a sequence of at least 15, 20, 25, 30, 40, 50, 75, 100 or more sequential nucleotides of SEQ ID NO: 1 or 6, or of a sequence complementary to SEQ ID NO: 1 or 6.
More preferably, the probe consists of SEQ ID NO: 11 or 12.
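To make the probe definition concrete, the sketch below checks in silico whether a probe of at least 15 sequential nucleotides, or its reverse complement, occurs in a target sequence. Exact string matching is used here only as a crude stand-in for hybridization, which in practice tolerates mismatches depending on stringency. The function names and the toy sequences are hypothetical and are not taken from SEQ ID NO: 1, 6, 11 or 12.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def probe_matches(target, probe, min_len=15):
    """True if the probe or its reverse complement occurs exactly in the target."""
    target, probe = target.upper(), probe.upper()
    if len(probe) < min_len:
        raise ValueError(f"probe must be at least {min_len} nucleotides long")
    return probe in target or reverse_complement(probe) in target

# Hypothetical 15-nt probe and a target carrying its complementary strand.
probe = "AAACCCGGGTTTACG"
target = "TT" + reverse_complement(probe) + "AA"
print(probe_matches(target, probe))
```

Checking both orientations reflects the fact that the probes are defined on SEQ ID NO: 1 or 6 or on a sequence complementary to them.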
Of course, it may be preferable to further assay the presence (or absence) of other genes/proteins in order to increase sensitivity and/or specificity of the method.
According to another embodiment, the method for identifying a pathogenic strain of E. coli comprises detecting deoxyribokinase enzymatic activity of the strain. Preferably this is done by assaying, under suitable culture conditions, the capability of the strain to metabolize 2-Deoxy-D-ribose. This may be achieved by growing in vitro an E. coli isolate or a biological sample suspected of harboring pathogenic E. coli on a minimal medium comprising 2-Deoxy-D-ribose as a sole source of carbon and evaluating bacterial growth and survival in that medium.
Preferably, the minimal medium comprises from about 0.01 % 2-Deoxy-D-ribose and the bacteria are cultured in the minimal medium for about 24h to about 48h.
Assay kits for determining the amount of deoK genes and proteins in a sample and/or for identifying a pathogenic strain of E. coli are also within the scope of the present invention. According to one embodiment, such a kit would preferably comprise anti-deoK antibody(ies) or probe(s) according to the invention and other selected element(s) such as instructions for using the kit, assay tubes, enzyme(s), reagents or reaction buffer(s). In another embodiment, the kit would comprise means for assaying the capability of a strain of E. coli to metabolize 2-Deoxy-D-ribose.
A non-limitative example of use for the methods, kits and probes of the invention is the detection of pathogenic or potentially pathogenic E. coli bacteria in food which may be contaminated by E. coli.
v) Downmodulation of deoK proteins expression
As mentioned previously, expression of proteins of the deoK operon allows E. coli to metabolize 2-Deoxy-D-ribose. Modulation of deoK may be useful. More particularly, downmodulation of deoK proteins could be used to prevent and/or treat E. coli infections. Therefore, the invention also relates to methods for preventing or treating E. coli infections comprising downmodulating expression or biological activity of deoK proteins or genes. This may be achieved by administering a molecule or compound having such property.
vi) Vectors and Cells
The invention is also directed to a host, such as a genetically modified cell, comprising any of the nucleic acid sequences according to the invention and, more preferably, a host capable of expressing the peptide/protein encoded by this nucleic acid.
The host cell may be any type of cell (a transiently-transfected mammalian cell line, an isolated primary cell, or a bacterium such as E. coli). More preferably, the host is an Escherichia coli bacterium selected from the Escherichia coli bacteria filed on May 14, 2002 at the CNCM under accession numbers I-2867 and I-2867.
A number of vectors suitable for stable transfection of mammalian cells and bacteria are available to the public (e.g. plasmids, adenoviruses, adeno-associated viruses, retroviruses, Herpes Simplex Viruses, Alphaviruses, Lentiviruses), as are methods for constructing such cell lines. The present invention encompasses any type of vector comprising any of the nucleic acid molecules of the invention, and more particularly vectors capable of directing expression of the peptide encoded by such nucleic acid in a vector-containing cell.
The cells of the invention may be particularly useful for diagnostic purposes and for drug screening (by measuring the effect of a compound on expression or activity levels of deoK genes or proteins, for instance).
vii) Synthesis of E. coli deoK proteins and functional derivatives thereof
Knowledge of the E. coli deoK operon gene sequences opens the door to a series of applications. For instance, the characteristics of the cloned E. coli deoK gene sequences may be analyzed by introducing the sequences into various cell types or using in vitro extracellular systems. The function of the E. coli deoK genes may then be examined under different physiological conditions. The deoK cDNA sequences may be manipulated in studies to understand the expression of the gene and gene product. Alternatively, cell lines may be produced which overexpress the gene product, allowing purification of deoK proteins for biochemical characterization, large-scale production, antibody production, and patient therapy.
For protein expression, eukaryotic and prokaryotic expression systems may be generated in which the deoK operon gene sequences are introduced into a plasmid or other vector which is then introduced into living cells. Constructs in which the deoK cDNA sequences containing the entire open reading frame are inserted in the correct orientation into an expression plasmid may be used for protein expression. Alternatively, portions of the sequence, including wild-type or mutant deoK sequences, may be inserted. Prokaryotic and eukaryotic expression systems allow various important functional domains of the protein to be recovered as fusion proteins and then used for binding, structural and functional studies and also for the generation of appropriate antibodies. The deoK DNA sequences may be altered by using procedures such as restriction enzyme digestion, DNA polymerase fill-in, exonuclease deletion, terminal deoxynucleotide transferase extension, ligation of synthetic or cloned DNA sequences and site-directed sequence alteration using specific oligonucleotides together with PCR.
Accordingly, the invention also concerns a method for producing a polypeptide involved in E. coli metabolization of 2-Deoxy-D-ribose. The method comprises the steps of: (i) providing a cell transformed with a nucleic acid sequence encoding the polypeptide positioned for expression in the cell; (ii) culturing the transformed cell under conditions suitable for expressing the nucleic acid; (iii) producing the polypeptide; and optionally, (iv) recovering the polypeptide produced.
Once the recombinant protein is expressed, it is isolated by, for example, affinity chromatography. In one example, an anti-deoK polypeptide antibody, which may be produced by the methods described herein, can be attached to a column and used to isolate the deoK proteins. Lysis and fractionation of deoK-harboring cells prior to affinity chromatography may be performed by standard methods.
Once isolated, the recombinant protein can, if desired, be purified further.
Methods and techniques for expressing recombinant proteins and foreign sequences in prokaryotes and eukaryotes are well-known in the art and will not be described in more detail. One can refer, if necessary, to Joseph Sambrook, David W. Russell, Joe Sambrook, Molecular Cloning: A Laboratory Manual, 2001, Cold Spring Harbor Laboratory Press. Those skilled in the art of molecular biology will understand that a wide variety of expression systems may be used to produce the recombinant protein. The precise host cell used is not critical to the invention. The deoK proteins may be produced in a prokaryotic host (e.g., E. coli) or in a eukaryotic host. These cells are publicly available, for example, from the American Type Culture Collection, Rockville, MD. The method of transduction and the choice of expression vehicle will depend on the host system selected.
Polypeptides of the invention, particularly short deoK fragments, may also be produced by chemical synthesis. These general techniques of polypeptide expression and purification can also be used to produce and isolate useful deoK fragments or analogs, as described herein.
Skilled artisans will recognize that a deoK polypeptide, or a fragment thereof (as described herein), may serve for various purposes, in diagnostic kits and methods, and for the obtaining of anti-deoK antibodies for instance.
viii) Identification of Molecules that Modulate deoK Proteins Expression
deoK cDNAs may be used to facilitate the identification of molecules that increase or decrease deoK genes expression. In one approach, candidate molecules are added, in varying concentrations, to the culture medium of cells expressing deoK mRNA. deoK expression (or the capability of the cell to metabolize 2-Deoxy-D-ribose) is then measured, for example, by Northern blot analysis using a deoK cDNA, or cDNA or RNA fragment, as a hybridization probe. The level of deoK expression (or cell metabolizing activity) in the presence of the candidate molecule is compared to the level of deoK expression (or cell metabolizing activity) in the absence of the candidate molecule, all other factors (e.g. cell type and culture conditions) being equal.
Compounds that modulate the level of deoK expression (or cell metabolizing activity) may be purified, or substantially purified, or may be one component of a mixture of compounds such as an extract or supernatant obtained from cells. In an assay of a mixture of compounds, deoK expression (or cell metabolizing activity) is tested against progressively smaller subsets of the compound pool (e.g., produced by standard purification techniques such as HPLC or FPLC) until a single compound or minimal number of effective compounds is demonstrated to modulate deoK expression (or cell metabolizing activity).
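The idea of testing progressively smaller subsets of a pool can be sketched as a recursive halving search. The sketch assumes that a subset tests active exactly when it contains at least one active compound (no masking or synergy between compounds), and it represents the wet-lab assay by a hypothetical callback, is_active; neither the halving strategy nor these names come from the text.

```python
def deconvolve(pool, is_active):
    """Return the compounds of an active pool that are themselves active.

    pool      -- list of compound identifiers
    is_active -- callable taking a list of compounds and returning True when the
                 mixture modulates deoK expression (stand-in for the real assay)
    """
    if not pool or not is_active(pool):
        return []                       # inactive subsets are discarded untested further
    if len(pool) == 1:
        return list(pool)               # a single active compound has been isolated
    mid = len(pool) // 2
    return deconvolve(pool[:mid], is_active) + deconvolve(pool[mid:], is_active)

# Simulated assay with hypothetical compound names.
actives = {"c3", "c6"}
assay = lambda subset: any(c in actives for c in subset)
print(deconvolve([f"c{i}" for i in range(1, 9)], assay))
```

When only a few compounds in the pool are active, halving needs far fewer assays than testing every compound individually.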
The effect of candidate molecules on deoK biological activity may, instead, be measured at the level of translation by using the general approach described above with standard protein detection techniques, such as Western blotting or immunoprecipitation with a deoK-specific antibody (for example, the anti-deoK antibody described herein).
Another method for detecting compounds that modulate the activity of deoK is to screen for compounds that interact physically with a given deoK polypeptide.
Depending on the nature of the compounds to be tested, the binding interaction may be measured using methods such as enzyme-linked immunosorbent assays (ELISA), filter binding assays, FRET assays, scintillation proximity assays, microscopic visualization, immunostaining of the cells, in situ hybridization, PCR, etc.
A molecule that decreases deoK activity is considered particularly useful to the invention; such a molecule may be used, for example, as a therapeutic to decrease and/or block proliferation of pathogenic bacteria (see section (v) hereinbefore).
Molecules that are found, by the methods described above, to effectively modulate deoK gene expression or polypeptide activity, may be tested further in animal models. If they continue to function successfully in an in vivo setting, they may be used as therapeutics to prevent or treat bacterial infections.
EXAMPLES
The following examples are illustrative of the wide range of applicability of the present invention and are not intended to limit its scope. Modifications and variations can be made therein without departing from the spirit and scope of the invention. Although any method and material similar or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred methods and materials are described.
EXAMPLE 1: Cloning and expression of deoxyribose-catalyzing genes in E. coli strains.
Introduction
Escherichia coli is a heterogeneous species consisting of both enteric commensal and pathogenic strains. Different types of E. coli cause different diseases in a range of hosts, including extra-intestinal and enteric infections.
Extra-intestinal infections due to E. coli are common in all age groups and can involve almost any organ or anatomical site. Typical extra-intestinal infections include urinary tract infection (UTI), meningitis (mostly in neonates and after neurosurgery), diverse intra-abdominal infections, pneumonia (particularly in hospitalized and institutionalized patients), intravascular-device infection, osteomyelitis, and soft-tissue infection, which usually occurs when the tissue is compromised. Bacteremia can accompany infection at any of these sites (JID 2000, 181:1753; JID 2001, 183:596). In 1999, extra-intestinal pathogenic E. coli strains were the most frequently isolated organisms in US patients receiving antimicrobials (JAMA, 2001, 285:1565). Bacterial UTIs are second in incidence only to respiratory infections. E. coli accounts for up to 90 % of all UTIs in non-hospitalized patients (5th ed. Williams & Wilkins, Baltimore, Md. 1997).
E. coli strains cause 85 to 95 % of cases of uncomplicated cystitis in pre-menopausal women; globally, these represent 150-300 million cases per year (estimated direct cost of $6 billion per year in the US) (JID 2001; 183:51). In the US, there are at least 250,000 cases of uncomplicated pyelonephritis per year, leading to 100,000 hospitalizations at an estimated cost of $175 million per year (JAMA, 2001; 283:1583). E. coli is responsible for one third of all cases of neonatal meningitis, with an incidence rate of 0.1 per 1,000 live births (JAC 1994, 34 (suppl.
A):61). The extra-intestinal E. coli strains are epidemiologically and phylogenetically distinct from both the commensal and the intestinal pathogenic strains; they appear to be incapable of causing enteric disease, but they can stably colonize the host intestinal tract. In contrast, intestinal pathogenic strains of E. coli are rarely encountered in the fecal flora of healthy hosts and, instead, appear to be essentially obligate pathogens, causing gastroenteritis or colitis when ingested in sufficient quantities by a naive host. Various pathotypes of E. coli are responsible for significant worldwide diarrheal disease (to date, six have been well characterized). For example, enteropathogenic E. coli (EPEC) are the leading cause of severe infantile diarrhea in developing countries, and enterohaemorrhagic E. coli (EHEC) (including the well-known O157:H7) have recently been shown to be the cause of bloody diarrhea and hemolytic-uremic syndrome in major food-borne outbreaks in the United States, Europe, and Asia (CMR 1998, 11:142). Although there is some overlap between certain diarrhoeagenic pathotypes with respect to virulence traits, each pathotype possesses a unique combination of virulence traits that results in a distinctive pathogenic mechanism. Recent studies have identified other categories of pathogenic E. coli, such as strains isolated from diarrhoeagenic stools of HIV-positive patients, and E. coli that were abnormally predominant in early and chronic ileal lesions of patients with Crohn's disease.
Knowledge of the pathogenic or non-pathogenic status of an isolate may be of use to clinicians for diagnosis, especially in cases of opportunistic pathogens.
Isolation of an E. coli strain from a clinical specimen does not, by itself, confer the designation of pathogenic isolate, since commensal strains of E. coli can cause infections (in particular extraintestinal infections) when the host is compromised.
However, no single virulence factor is limited to (or absolutely required for) infection at any one given site or for any particular syndrome. Consequently, multiple phenotypic and genotypic assays are necessary to identify the pathotype of clinical isolates. The aim was to identify genes encoding functions that are conserved in all pathogenic strains but are absent in commensal E. coli and to use these data to develop new diagnostic and therapeutic tools.
Over the last five years, studies have been published on the E. coli chromosome. The whole genome sequence of the laboratory strain K-12 MG1655 was published in 1997 (Science 1997, 277:1453), and the size of the E. coli chromosome was shown to vary from 4.5 to 5.5 megabases (Mb) (IAI 1999, 19:230). Comparative restriction mapping among the chromosomes of E. coli K-12, newborn sepsis-associated strain RS218, and uropathogenic strain J96 showed that the overall gene order is conserved in the three strains, that large accessory segments (some carrying virulence genes) are unique to the chromosome of pathogenic strains, and that some segments are absent only from the chromosome of pathogenic strains (IAI 1999, 19:230). Comparison of the E. coli K-12 genome and those of different pathogenic E. coli allowed us to identify the major differences. The genome of E. coli O157:H7 (EHEC strain EDL933) was recently sequenced (Nature 2001, 409:529). Comparison with the E. coli K-12 reference strain genome confirmed that the two chromosomes share a common 4.1 Mb 'backbone' sequence and that lineage-specific segments (specific islands) are found throughout both genomes in clusters of up to 88 kilobases. Roughly 26% of the EDL933 genome lies completely within these specific islands, and 33% of these contain genes of unknown function. The Genome Center of Wisconsin is currently sequencing the genome of the newborn sepsis-associated strain RS218, the uropathogenic strain CFT073 and three strains belonging to different pathotypes of diarrhoeagenic E. coli [enterotoxigenic E. coli (ETEC), EPEC, and enteroaggregative E. coli (EAEC)] (http://genome.wisc.edu). It will probably take several years before information from the comparison of the pathogenic specific islands of various pathogenic E. coli isolates becomes available.
Most studies on pathogenic E. coli strains concern the identification of specific virulence regions associated with the pathogenesis of E. coli causing various diseases. Virulence genes have been identified, and pathogenicity islands have been characterized and sequenced. The first studies that investigated the relationship between groups of pathogenic and non-pathogenic E. coli strains were based on multilocus enzyme electrophoresis analysis (IAI 1997, 65:2685) and sequencing of housekeeping genes (Nature 2000, 406:64). They suggested that pathogenic isolates do not have a single evolutionary origin within E. coli but that they arose many times and that the high virulence of clones is a recent, derived state resulting from the acquisition of virulence genes rather than an ancestral condition of primitive E. coli.
E. coli strains expressing the K1 polysaccharide colonize the large intestine of newborn infants and are the leading cause of gram-negative septicaemia and meningitis during the neonatal period. A recent study used signature-tagged mutagenesis to identify E. coli K1 genes that are required for colonization of the gastrointestinal tract, which is one of the initial steps in the development of enteric, urinary and systemic infections caused by E. coli (MM 2000, 37:1293). One of these genes is absent from the genome of E. coli K-12, although related sequences have been found in some representative pathogenic strains (uropathogenic E. coli, EAEC, and EPEC). The sequence of this gene is not available. These data strongly suggest that common (or strongly related) sequences that are absent from the genome of commensal E. coli are present in all pathogenic E. coli strains.
A comparative analysis of metabolic functions expressed by pathogenic and commensal strains of E. coli was developed. The inventors showed that a sugar (deoxyribose) that is not fermented by E. coli K12, is metabolized by a large number of pathogenic isolates belonging to various pathotypes. The inventors identified the genes encoding this function and demonstrated that they are conserved among several pathogenic strains. They have developed genetic and bacteriological assays to identify deoxyribose-positive E. coli strains.
Materials and Methods
Bacterial strains, cosmids, and culture conditions
E. coli K-12/MG1655 (Blattner et al., 1997, Science 277:1453-1474) was used as a host for maintaining cosmid clones.
E. coli strains were routinely grown in Luria broth with glucose (10 g of tryptone, 5 g of yeast extract, and 5 g of NaCl per liter [pH 7.0]) or on Luria agar plates (containing 1.5 % agar) at 37°C. E. coli strains harboring cosmid clones were grown with 100 µg of carbenicillin per ml.
Collections of human commensal and pathogenic E. coli strains were used in this study. One hundred fifteen E. coli strains were isolated from blood cultures from cancer patients. These strains were previously partially characterized (J. Clin. Microbiol., 2001, 30:1738; Infect. Immun., 2000, 68:3983). One hundred E. coli strains were isolated from urine specimens from patients (children and adults) clinically diagnosed with pyelonephritis. They were previously partially characterized and were from various geographical origins (France, USA, Romania).
Thirty-six isolates were from urine specimens from patients with cystitis. They were isolated in Romania and USA. Twenty-five strains were from the stools of patients with CD4 lymphocyte counts <400 cells/mm³ presenting persistent diarrhea.
Eleven isolates were from diarrhoeagenic stools of children in Brazil. Commensal E. coli strains were isolated from normal flora of healthy people in France, Romania, Senegal (children), and Central African Republic.
Expression of deoxyribose-catalyzing genes by E. coli strains.
The capacity of bacteria to grow on a minimal medium (K5) (J Bacteriol 1971, 108:639) supplemented with 0.1 % 2-Deoxy-D-ribose as sole source of carbon was tested by inoculating agar plates with a bacterial suspension and incubating the plates at 37°C for 24 and 48 h. The plates were inoculated with a loop from a 1 ml bacterial suspension (in water) prepared with a loop of bacteria grown on LB agar plates.
The fermentation (Methodes de laboratoire pour l'identification des enterobacteries, Le Minor et Richard, Institut Pasteur, p 169) of 2-Deoxy-D-ribose by E. coli strains was tested as follows: a drop (15 µl) of an overnight culture in LB broth was inoculated in 3 ml of peptone water containing 1.5% (v/v) of bromothymol blue and 1 % (w/v) of 2-Deoxy-D-ribose in a 12 x 120 mm glass tube.
The suspension was incubated 24 h at 37°C without shaking.
Activity assay: 2-Deoxy-D-ribose is phosphorylated by deoxyribokinase to deoxyribose-5-phosphate, which is subsequently cleaved to acetaldehyde and glyceraldehyde-3-phosphate by deoxyribose-5P aldolase, also called phosphopentose aldolase. Deoxyribose-5P aldolase activity was determined by coupling deoxyribose-5P cleavage to NADH oxidation using glycerophosphate dehydrogenase and triosephosphate isomerase as coupling enzymes. The reaction medium (0.5 ml final volume) contained 50 mM Tris-HCl (pH 7.4); 0.2 mM NADH; and 9 U and 3 U of glycerophosphate dehydrogenase and triosephosphate isomerase, respectively. The reaction was started with crude material extract followed by 1 mM deoxyribose-5-phosphate, then the decrease in absorbance at 334 nm was monitored with an Eppendorf™ PCP6121 photometer thermostated at 30°C. One unit of deoxyribose-5P aldolase corresponds to 1 mole of product formed per minute.
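The conversion from a measured absorbance slope to a specific activity in the coupled assay can be sketched as follows. This is a minimal illustration, not the inventors' calculation: the function name is hypothetical, and the 1 cm light path, the NADH absorption coefficient of 6.18 mM^-1 cm^-1 at 334 nm, the 1:1 stoichiometry between deoxyribose-5P cleaved and NADH oxidized, and the expression of the result in micromoles per minute are all assumptions that the text does not state.

```python
# Hypothetical helper for the coupled NADH assay described above.
# Assumptions (not given in the text): 1 cm light path, NADH absorption
# coefficient of 6.18 mM^-1 cm^-1 at 334 nm, one NADH oxidized per
# deoxyribose-5P cleaved, result expressed in umol/min per mg protein.

EPSILON_NADH_334_MM = 6.18  # mM^-1 cm^-1, assumed literature value


def aldolase_specific_activity(delta_a_per_min, protein_mg,
                               assay_volume_ml=0.5, path_cm=1.0):
    """Return the specific activity in umol NADH oxidized per min per mg."""
    if protein_mg <= 0:
        raise ValueError("protein_mg must be positive")
    # Beer-Lambert law: concentration change (mM/min) = dA / (epsilon * path)
    rate_mm_per_min = delta_a_per_min / (EPSILON_NADH_334_MM * path_cm)
    # mM multiplied by ml gives umol, so this is umol NADH oxidized per minute
    umol_per_min = rate_mm_per_min * assay_volume_ml
    return umol_per_min / protein_mg
```

With these assumptions, an absorbance decrease of 0.0618 per minute in the 0.5 ml assay with 0.01 mg of extract protein would correspond to about 0.5 units per mg. The result should be rescaled if the unit is defined differently.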
DNA analysis and genetic techniques.
Cosmid libraries were previously constructed from the genomic DNA of E. coli AL862 isolated from the blood of a cancer patient (IAI, 2001; 69:937) and from E. coli 55989 isolated from the stools of a patient with persistent diarrhea (C. Bernier, P. Gounon, and C. Le Bouguenec, In press, IAI August 2002). Sau3A restriction fragments (35 to 50 kb) were sized on a sucrose gradient and ligated to the BamHI-digested and alkaline phosphatase-treated cosmid vector pHC79 (Collins J, 1979, Methods Enzymol., 68:309-326) DNA. The recombinant cosmids pILL1272 and pILL1287 resulted from cloning of DNA from AL862 and 55989 strains, respectively.
Recombinant cosmids were routinely isolated by alkaline lysis. The sequences of the primers to amplify probe A (GenBank™ AF286671) and probe B (GenBank™ AF286670) were derived from the partial sequence of PAI I AL862 (IAI, 2001, 69:937, and Erratum in IAI June 2002). The sequences of the primers to amplify probe A were 5'-ATCAGATGCCTAAAGAAGGAGAAAC-3' and 5'-CAATACTCGGATAAGATGATTGC-3' and the size of the amplicon was 831 bp (see Figure 4; SEQ ID NO:11). The sequences of the primers to amplify probe B were 5'-GGACGATAATGTGATCGTCTATAAG-3' and 5'-GTGGAAGATACTCATCTGCTACACG-3' and the size of the amplicon was 816 bp (see Figure 5; SEQ ID NO:12). The cycling conditions were initial denaturation at 95°C for 5 min followed by 30 cycles at 95°C for 30 s, 60°C or 65°C (for amplification of probe A and probe B, respectively) for 30 s, and 72°C for 1 min.
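As a quick sanity check on the four primers quoted above, their length, GC content, and reverse complement can be computed with a short sketch. The labels "primer 1" and "primer 2" and the function names are hypothetical (the text does not say which primer is forward or reverse), and the second probe B primer is written with its line break removed.

```python
# Illustrative sketch only: primer labels are hypothetical and simply follow
# the order in which the primers are quoted in the text.
PRIMERS = {
    "probe A primer 1": "ATCAGATGCCTAAAGAAGGAGAAAC",
    "probe A primer 2": "CAATACTCGGATAAGATGATTGC",
    "probe B primer 1": "GGACGATAATGTGATCGTCTATAAG",
    "probe B primer 2": "GTGGAAGATACTCATCTGCTACACG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Return the fraction of G and C bases in the sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), round(gc_fraction(seq), 2), reverse_complement(seq))
```

The reverse complement is useful for locating where a primer anneals when only one strand of the target, such as the listed operon sequence, is available.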
Hybridization.
Bacteria grown for 3 h on nitrocellulose filters were used for colony hybridization. Hybridization was performed under stringent conditions (overnight at 65°C), with PCR products labeled with 32P using the Megaprime™ DNA labeling system (Amersham International) as probes. The 100 ml hybridization solution contained: 2 ml EDTA 0.5M; 20 mg ATP; and 10 ml 20x SSC.
DNA sequencing.
Double-stranded DNA was sequenced by Genome Express (France).
Multiple sequence alignments were generated with the CLUSTAL W program.
Statistical analysis
Proportions were compared by using the chi-square test.
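A comparison of two proportions by the chi-square test can be sketched as below. This is a minimal illustration under stated assumptions: the text does not say which software was used or whether a continuity correction was applied, so the sketch uses the plain Pearson statistic on a 2 x 2 table with one degree of freedom, and the function name and any counts passed to it are hypothetical.

```python
import math


def chi_square_2x2(pos_a, n_a, pos_b, n_b):
    """Pearson chi-square (1 df, no continuity correction) for two proportions.

    pos_a / n_a and pos_b / n_b are the positive counts and group sizes.
    Returns (chi2, p_value).
    """
    observed = [[pos_a, n_a - pos_a], [pos_b, n_b - pos_b]]
    total = n_a + n_b
    row_sums = [n_a, n_b]
    col_sums = [pos_a + pos_b, total - pos_a - pos_b]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            if expected == 0:
                raise ValueError("expected count of zero; test not applicable")
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

For example, `chi_square_2x2(10, 30, 20, 30)` compares hypothetical groups with 10/30 and 20/30 positive isolates. With small expected counts an exact test would usually be preferred.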
Results
Presence of the deoK operon in the pathogenic E. coli isolates.
While a large number of bacteria are able to use the 2'-deoxyribosyl moiety of 2'-deoxyribonucleosides as carbon and energy sources via the well-known deo operon, few organisms, such as Salmonella, are able to use 2-Deoxy-D-ribose (dRib) as the sole carbon source through deoxyribokinase, which catalyses the ATP-dependent phosphorylation of dRib to dRib-5-phosphate. Recently, the inventors identified in the genome of S. enterica serovar Typhi not only the gene encoding deoxyribokinase, deoK, but a whole operon (deoK operon) of three genes regulated by a repressor, DeoQ (J. Bacteriol., 2000, 182:869-873). Searches in databanks showed that this operon was fully represented in one Citrobacter freundii strain and partially present in Agrobacterium tumefaciens, Rhodobacter sphaeroides, and the pathogenic E. coli strain AL862 isolated from a blood culture.
Use of 2-Deoxy-D-ribose by E. coli strains has been previously described (Br. J. Biomed. Sci., 1995; 52:173); however, this property was never associated with the pathogenic status of the strains and the genes encoding this function were not identified.
In strain AL862, the sequences similar to the deoK operon corresponded to ORF3', ORF4, ORF5 and ORF6 of the partial (and not continuous) sequence of a pathogenicity island (PAI I AL862) (GenBank™ Nos. AF286670 and AF286671). No function was previously assigned to these sequences. Two probes derived from this PAI I AL862 region (probes A and B) corresponded to the deoK homologous sequences. Analysis of the distribution of PAI I AL862 among pathogenic E. coli isolates strongly suggested that the A and B regions are widely distributed among pathogenic strains (IAI, 2001, 69:937-948; IAI June 2002 Errata).
To confirm the presence of the deoK operon in pathogenic E. coli strains, the inventors resequenced the region of PAI I AL862 that previously showed similarities to the deoK operon of Salmonella. The sequencing was performed on the recombinant cosmid pILL1272 (see Materials and Methods). They identified a 4486-bp linear region displaying similarities to the entire deoK operon of Salmonella. Computer analysis revealed four open reading frames (ORFs), deoX, deoP, deoK, and deoQ, which mapped to the same loci as, and had sequences similar to, the deoX, deoP, deoK, and deoQ genes of the deoK operon from Salmonella, respectively (see Figure 1). These results confirmed that the genetic organization of the deoK operon from E. coli was similar to that of the deoK operon from Salmonella.
The detailed sequence analysis of E. coli strain AL862 is presented in Figure 2. The deoK operon from E. coli strain AL862 displayed 78 % identity with that from Salmonella (4486 bp/4517 bp).
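A percent identity figure of this kind is computed from an alignment of the two sequences. The sketch below shows one common convention, identical columns divided by alignment length, on two already-aligned strings. It is an assumption that this convention applies here: the text does not state which denominator or alignment settings were used, and the function name and example strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same base.

    Both inputs must already be aligned (equal length, gaps marked with `gap`).
    Columns containing a gap never count as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)
```

For instance, `percent_identity("ACGT-A", "ACGTTA")` counts five identical columns out of six. Other conventions divide by the length of the shorter sequence or exclude gapped columns, which changes the figure slightly.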
The position and sequence (determined here) of the two probes (probe A and probe B) that were used in the hybridization experiments are indicated in Figure 2 (single and double underline, respectively). In both cases, the sequences of the primers used in PCR assays are indicated in bold. These primer sequences are identical to those previously described and used (IAI, 2001, 69:937-948; IAI June 2002 Errata). Probes A and B are PCR products obtained from strain AL862.
To study the degree of conservation of the deoK operon among pathogenic E. coli isolates, the inventors determined the nucleotide sequence of the deoK
region in E. coli strain 55989 isolated from the stools of a patient with persistent diarrhea. This isolate was shown to belong to the EAEC pathotype of pathogenic intestinal E. coli. A cosmid library from the genomic DNA of strain 55989 was previously constructed (Bernier et al., In press, IAI August 2002). The recombinant cosmid pILL1287 resulted from the screening of the 55989 cosmid library with both the probe A and the probe B. The sequence of the chromosomal region from strain 55989 that carries the deoK operon is presented in Figure 3.
The deoK operons from E. coli strain AL862 and strain 55989 showed 98% identity (4486 bp/4489 bp). The degrees of identity of the deo genes from E. coli and Salmonella strains are summarized in Table 1.
TABLE 1: Degrees of identity of the deo genes from E. coli and Salmonella strains

Strains                    % of identity   No. of nucleotides
55989 / AL862              98 %            4489bp/4486bp
55989 / S. Typhi           78 %            4489bp/4517bp
AL862 / S. Typhi           78 %            4486bp/4517bp

Genes                      % of identity   No. of nucleotides
deoX 55989 / AL862         99%             1014bp/1014bp
deoX 55989 / S. Typhi      75%             1014bp/1014bp
deoX AL862 / S. Typhi      75%             1014bp/1014bp
deoP 55989 / AL862         99%             1317bp/1317bp
deoP 55989 / S. Typhi      83%             1317bp/1317bp
deoP AL862 / S. Typhi      82%             1317bp/1317bp
deoK 55989 / AL862         99%             921bp/921bp
deoK 55989 / S. Typhi      80%             921bp/921bp
deoK AL862 / S. Typhi      80%             921bp/921bp
deoQ 55989 / AL862         96%             783bp/783bp
deoQ 55989 / S. Typhi      77%             783bp/786bp
deoQ AL862 / S. Typhi      76%             783bp/786bp

Expression of the deoK operon in E. coli strains.
The inventors demonstrated the expression of the deoK operon in clinical isolates 55989 and AL862, as well as in the recombinant strain MG1655 carrying either the cosmid pILL1272 or the cosmid pILL1287. All four strains were able to grow on K5 plates containing 2-Deoxy-D-ribose as a carbon source. The growth of the strains was evident after 48 h of incubation at 37°C. As a negative control, strain MG1655 alone did not grow on such medium. Deoxyribose-5P aldolase activity, which is easier to determine than that of deoxyribokinase, is reported in Table 2.
Table 2: Deoxyribose-5P aldolase activity in E. coli strains

Strain                   Deoxyribose-5P aldolase
                         +dR          -dR
AL862                    0.47 U/mg    0.06 U/mg
55989                    0.45 U/mg    0.08 U/mg
K-12 MG1655 (+1272)      0.36 U/mg    0.10 U/mg
K-12 MG1655 (+1287)      0.24 U/mg    0.10 U/mg

Analysis of the distribution of deoK operon among commensal and pathogenic E. coli isolates
To determine whether deoK operon sequences were specific for pathogenic E. coli, the frequency of occurrence of the A and B regions (corresponding to parts of deoK and deoX genes, respectively) was investigated. These regions were amplified from strain AL862 DNA and used as probes to screen collections of E. coli isolates by colony hybridization. The strains were also tested for their ability to use 2-Deoxy-D-ribose as a carbon source.
These collections comprised strains representative of the various pathotypes of pathogenic E. coli. Archetypal ExPEC (extraintestinal pathogenic E. coli) strains familiar to investigators in the field include CFT073 (pyelonephritis isolate), 536 (pyelonephritis isolate), J96 (pyelonephritis isolate), and RS218 (neonatal meningitis isolate). Prototype strains of the various diarrheagenic E. coli pathotypes were also considered: EDL933 (EHEC), EDL1493 (ETEC), E2348/69 (EPEC), 042 and JM221 (EAEC), and C1845 (diffusely-adherent E. coli (DAEC)). As shown in Table 3, the results indicated that the deoK operon is carried by pathogenic strains belonging to various pathotypes of E. coli and associated with both extra-intestinal and intestinal infections.
Table 3: Frequency of occurrence of the A (deoK) and B (deoX) regions in various E. coli strains

E. coli strains              Probe A   Probe B   Deoxyribose utilization
CFT073 (pyelonephritis)      +         +         +
536 (pyelonephritis)         +         +         +
J96 (pyelonephritis)         -         -         -
RS218 (meningitis)           -         -         -
EDL933 (EHEC)                -         -         -
EDL1493 (ETEC)               +         +         +
E2348/69 (EPEC)              -         -         -
042 (EAEC)                   +         +         +
JM221 (EAEC)                 +         +         +
C1845 (DAEC)                 -         -         -

The collections studied also comprised 115 clinical isolates from humans with septicemia (isolated in France), 100 clinical isolates from patients with pyelonephritis (origin France, USA, Romania), 36 clinical isolates from patients with cystitis (origin USA, Romania), 25 EAEC isolated from HIV-positive patients with persistent diarrhea (origin Central African Republic and Senegal), and 11 EPEC with a diffuse adherent pattern (DA-EPEC) on epithelial cells isolated from infants with diarrhea in Brazil. We also investigated 257 commensal E. coli strains isolated from normal flora of healthy people (origin France (36), Romania, Senegal, Central African Republic). The results are summarized in Table 4.
Table 4: Percentage of occurrence of the A (deoK) and B (deoX) regions in various E. coli clinical isolates

E. coli strains                        Probe A   Probe B   Probe A +   Deoxyribose utilization    Probe A + 2-Deoxy-D-
                                                           Probe B     (level of significance)    ribose utilization
Septicemia (n = 115)                   49        48        48          50 (p<0.0001)              46
Pyelonephritis (n = 100)               50        53        48          50 (p<0.0001)              48
Cystitis (n = 36)                      7         10        7           8 (0.2<p<0.4)              7
Diarrhea (EAEC) (n = 25)               13        13        13          12 (p<0.0001)              12
Diarrhea (DA-EPEC) (n = 11)            11        11        11          11 (NA)                    11
Commensal (France) (n = 36)            10        11        10          9                          8
Commensal (Romania, Senegal,           NT        NT        NT          31                         NT
Central African Republic) (n = 221)

NT, not tested; NA, not applicable.
The sensitivity of the two DNA probes appeared equivalent: 43% and 45% of the strains were positive with the A and B probes, respectively.
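The pooled figures quoted above can be reproduced from Table 4 with a few lines of arithmetic. The sketch assumes that the Table 4 entries are counts of positive isolates rather than percentages, which is an inference from the group sizes and is not stated in the text. The dictionary layout and function name are hypothetical.

```python
# Assumption: Table 4 entries are read as counts of probe-positive isolates.
# Each entry is (group size n, probe A positives, probe B positives).
TABLE_4_PROBE_COUNTS = {
    "Septicemia": (115, 49, 48),
    "Pyelonephritis": (100, 50, 53),
    "Cystitis": (36, 7, 10),
    "Diarrhea (EAEC)": (25, 13, 13),
    "Diarrhea (DA-EPEC)": (11, 11, 11),
    "Commensal (France)": (36, 10, 11),
}


def pooled_percent_positive(counts, column):
    """Percent positive over all groups; column 1 is probe A, 2 is probe B."""
    total = sum(row[0] for row in counts.values())
    positives = sum(row[column] for row in counts.values())
    return 100.0 * positives / total


if __name__ == "__main__":
    print(round(pooled_percent_positive(TABLE_4_PROBE_COUNTS, 1), 1))  # probe A
    print(round(pooled_percent_positive(TABLE_4_PROBE_COUNTS, 2), 1))  # probe B
```

Under this reading, 140 of 323 probe-tested isolates are positive with probe A and 146 of 323 with probe B, consistent with the rounded 43% and 45% given in the text.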
A total of 147 isolates (36 commensal strains and 113 pathogenic E. coli) were tested for both growth on K5 plates containing 2-Deoxy-D-ribose and fermentation of this sugar. A 100 % correlation was observed between the two bacteriological tests; all the strains that grew on K5 plates with 2-Deoxy-D-ribose showed the ability to ferment the sugar. The 2-Deoxy-D-ribose utilization test appeared sensitive but, to a small extent, less specific than the genetic detection of the deoK operon (53 % of positive strains). Using both molecular and bacteriological approaches (probe A and growth on K5 plates with deoxyribose), a total of 40.8 % of the strains were positive.
Taking all the data into account, a significant association of the deoK operon with pyelonephritis- and septicemia-associated isolates, as well as with diarrhea-associated EAEC isolates, was found.
Conclusion
This work confirmed that metabolic characters may be specific to E. coli strains and that those expressed by pathogenic isolates may be considered as virulence-associated factors. Utilization of 2-Deoxy-D-ribose by some E. coli isolates has been previously reported. Here, the inventors identified the genes involved in utilization of 2-Deoxy-D-ribose by E. coli strains. These genes are organized in an operon (deoK) that is highly related to that previously identified in Salmonella enterica strains. Analysis of the sequences adjacent to the deoK operon in several E. coli isolates and in Salmonella strongly suggested that E. coli strains acquired the deoK operon by horizontal transfer from Salmonella strains. The inventors demonstrated that the deoK operon is highly conserved among E. coli strains. From this observation, the inventors defined two probes that were used to study the distribution of the deoK operon among collections of commensal and pathogenic E. coli isolates. Preliminary studies indicated an association of the deoK operon with strains belonging to various pathotypes of E. coli including strains causing pyelonephritis, septicemia, and some types of diarrhea in children.
While 40 to 50% of strains associated with pyelonephritis, septicemia, and diarrhea (EAEC and DA-EPEC strains) carry the deoK operon, we also detected it in 14 to 22 % of commensal isolates. This may be explained by the fact that commensal strains of E. coli can be potential pathogens when the host is compromised. It is interesting to note that the deoK operon is less prevalent in commensal strains from Romania, Senegal and Central African Republic than in French commensal strains.
In conclusion, the inventors have identified a metabolic character significantly associated with some pathogenic E. coli. The inventors have developed bacteriological and molecular tests to identify strains expressing this character. These tests could be associated with others in a future diagnostic kit for the identification of the pathogenic status of an E. coli isolate.
While several embodiments of the invention have been described, it will be understood that the present invention is capable of further modifications, and this application is intended to cover any variations, uses, or adaptations of the invention, following in general the principles of the invention and including such departures from the present disclosure as come within knowledge or customary practice in the art to which the invention pertains, and as may be applied to the essential features hereinbefore set forth and falling within the scope of the invention or the limits of the appended claims.
2003-07-08 Listage pour le BdB corrige.txt
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT:
(A) NAME: Institut Pasteur
(B) STREET: 25-28 rue du Docteur Roux
(C) CITY: Paris
(E) COUNTRY: France
(F) POSTAL CODE (ZIP): 75724
(ii) TITLE OF INVENTION: Genetic markers, metabolic markers, and methods for evaluating pathogenicity of strains of E. coli
(iii) NUMBER OF SEQUENCES: 1?.
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Robic
(B) STREET: 55 St-Jacques
(C) CITY: Montreal
(D) STATE: QC
(E) COUNTRY: Canada
(F) ZIP: H2Y 3X<?
{G) TFLEPH~~IVE: 'i19-987-6242 (H) 'fELEc'A.k: 514-895-7874 ( v ) COMPU'1 ER RF~ADABLI=, FORM
(A) MhDTUNi TYPE:: ~?:i.sk 3.5" / 1.44 Ml3 (B) COMPU'CER: Tt=,M ?C compatible {C) (:)J?ERA~C:ING S'~S'fh)M: fC-DOS/MS-DO:, (D) SOFTWARE: T:~'I' ?,SCI
(vi) CURREN'P APPLICAT.CON DATA:
(A) Al?PLIC:ATION 'JCJMEIER: 2.388.945 (B) r':CLING DATE: :32 May 2002 ( 2 ) INFORMATI02J FOR SEQ I L; J() : 1 (i) SEQUEPJCE CHARACTER:CSTICS:
(A) LHNGTH: 4489 nucleotides (B) "..""PE: nucle:_c: ~:xcid (ii) MOLECULE TYPE: D~dA
(vi) ORIGICJ~1L SG(JRCE:
(A) ORGANISM: Escherichia coli (B) STRAIN: 5'i9t39 ( xi ) SEQUENCE DESCRI P'I':I C)PJ : SEQ I D NO : l ggacgataatgtgat:cgtctatangcJgcaacgctatcatagt:cttgtcctg'gcgggtaaa60 aaagcgcgcttaccta ataagcgcgccgctgttcaggccttgagtggttattcaat 120 aacg tcctgtggtgactgt:aaaagtgcclcdt:ttgcr_gcggtgcaacctgaatcagcgtgccatt 180 acgttgcgcggcaactatacc:ccta:a<7gccgacaggttgcaggtaatgcaaaggcggctac 240 ctgttgctctccgt:t:ataaaggatcc::aagc~~Itgtcac:ataattta<Ittcagcactgtagaa 300 acgagtaacaaacgt.agtgccatc:gggagagatcat~g~-gaaactctggctgatctgtata 360 agcgtccagtttgt.cagcaaaga~:uJac.;aat:t:tctggaJ:cat.aaaattccggttgactcag 920 cgtcgacagagaggcatctcCCtgCdfaatccgttgattaaacgccagccactgagcggt 480 gggattaacatgcgaggcactgat:tcacgcaatct:taat:at.t r_<:gtccgggatattctg 540 gctgaatgtagcat.t.tggtatatat-_qr:ataattcatgtggcacatatattgtagtggcat 600 atctacagaagccactattggttar.Jclc:catcataat:atr_c~aacagtgta3gaggatttgtg 660 aaggaccactgttgcrttgagccac~:~t:aatgatgaccgaaacccattacatactcgtaacg 720 Page:
2003- 07-08 BdB corrige.txt Tistage pour 1e ccggttaaggcgtaacatatctc:c,gtctaat.accagc~catget:tcatccatcgcggcaca780 ggccatttcaccgtgtagcagat:gagtat:cttccgcagatgggcagccattagccagcaa840 acctgaatgaaaagcaaaacagccataggtctctatcacctctgtcgccggtttaggctg900 gcgaaacatattgcacatggtgac~gccgt:gt.ccat:caaattgc:gcatc:ccaaatcatctg960 ccccatccagggaagaataatcaaaat:gtcc:ac:gactc3tt.tgcaatttt:aagcccctcgac1020 accgctgtcatagcgaaaagacgi:gacagt.aaaatcactattt.:tccagcaagatacgagg1080 tttctcgccaaaaagcgcccgcc<zcaaat:t:aatacgcgtactc.,at:aacggttctcctcag1140 gacgctgtgacttcagcc::3gtgcc;gt:accttacattgc:tttcac.;gccagaagtagactccg1200 acatagacaaagcagagcatagaaaccaggaatgaaagctgtagtgagtggaacatatct1260 gcaatatatccctgaattgccggaaccaccgcggcaccgacaatagcc:ataacaatgact1320 gctcctgccatttctgtat:gt:tegtt:atcaacagtatccagtctttcct:gcatagatcgtc1380 gcccagcaagggccaaacaaaae,act:taccaggacggr_gacat:agaccgcgctgaaactt1440 ggagccagtgcaacatatgccaggaacagcgcccctataacggaatagagaatcaatact1500 ttttccggattaaaacgcgtcataaggat.gtt:ggctataaact:tc~ccaat:aaagaagcag1560 gcaaagctatagac:catg<3ac~tt:t:gaagcatcacgttcgttgatatcgc~~caactccagc1620 gccagacggatggtaaatgaccatactgc.gacctgcatacccacataaaggaactgcgcc1680 acaataccgcgacc~aaagcgcggatt.tca:agccagat.agcgcaigcgt<it<:cattgctgac1740 gggcgtttatagt~acttc~tctgtc;<~~racattacagc~ti:gggaagcgggttaaaaggaac1800 aacaccatgaccacaaccaaatcataatcatatact.tataccTgttcaagggtgttctct1860 g aacatcagcacctt:aaagt:tgtga,nt:tt:gctcggcgtt:cattcvcggacatc;t:gcttctca1920 aggctttccccctc:ggag<saaaccagatat:t:tgcccaataaaataccagacgcagcacca1980 atcggataaaaggtctgg<.tgatattgagc:cgcaatgtggcat:aggct.tctggaccgatc2040 attgaactgtatgtgttcc~ctgc:agtttcaaggaaactcaggccaatcgc;aatcgcaaaa2100 atagctgcaagga._i::ata<.~tgtactgt:t:gcc:at;atgcga~ggcac~ggaaaaaaagtgtacaa2160 ccaccaatatacagcgtcagccaattaaaattgccaccatat.aactggtctttttaatc2220 g acaagggatgctggtattgcaatt:aaaaaataacctccataaaatgcgctcagcaccaat2280 gctgaagcaaagtt~ctt;~gcgaaaatacact:ttt:gaattgacttgattaat~atgtcattt2340 aatgcagctgcgc~tcccc:atagc:gggaataaacacgataac~aaaataaactggaacaag2400 
ggagtcttattcag.stac<:catr._cggcatctgaatgatgtttt.tatcgttcatagtgcta2460 cctttaactgtgca~~gat<~at:tatt.cgti,taaggttaaaaatt.c,attaaai:t:gttcaata2.520 ctcggataagatg,~ttgccttacct tt gtgacgct:gaaacrcggcaaaa<lagagcggct2580 ccc:t tttttcaaagcggcttcaacatc.;cccgctttgaacataataatgggaaaagcaaccaata2640 aatgcgtcaccagc:gccac::tagt,utcaa<::agcat:ttactttgaatgcagctaacatgga<a2700 tcctgatcgcgggt~~atcc:ataatgcgcca:t.tttc:gctcatc~gtaacaat:aatattgtt:c2760 agccctttatcaactaacgaacgt:gcggccaaacgaatatgatcataagt:atcaaccgac2820 ataccggttaatatatccagttct<~r.ttcattcgggataaagaaatcacat:ta gcaggca2880 taagacatatcta:a..-.tcar:gc:aat.g;:~.~:ggagccggatt:taa~aac:acttc::aataccattt2940 ttcttaccaaacti::~atcc7cgtgcat.aaac:.tgtttccac3ttg:3acttccac~tagtaaaacg3000 atcaatttgcattTvttcagatcttctgcagctcgatcgarat.cttccgc~ggaaagaaat3060 ttattcgctccct~::aattatt:aa~atartattgctcgagtt:ggcattaac::~aagatcggt3120 gcaacaccactgc,:ggtacagggr~a::'~~t:tctcaac_-ataagtgqtattaat:t:ccccatgat3180 tcaagattacgaat:agtai.tatccg~caaaaatatcatcacctactttagt:cagcatcagg3240 acttttgaattca<.u~ttacaccgc~gwcac:r.gcttgatt:agcacctttccc:accacatccg3300 attttgaaggcag<~~:gcti:cag<sg-t cctt:.ttt:aggc~atctgattagtgtaagt:a3360 ~~ wtc:t atgagatccaccat~attgc,aaccaataac.tgcaatgt~catttcactacctcttataaac3420 tttcgcataacaat::c~gtat:ttaa:~t;~.~c:att:agcatgt.tact:tttgcatcatttgtgac:t3480 gagatcgcgattac,c:acat:caacc:c~at::gt.Matt taatagactr_ccagtctcatcactc3590 aggccaacactat<~t:aatc:ataagcaacctaacaagattagtgcccaaaactcagcagcc3600 tataccctttcatttcaaagggcycc~gtcgtatagtat.ggr_:.atgaaaac:aatgtttact3660 t aacgccaaaatgti::atttt:tata:~c:~r_t.cttacggagaga~fiagtl:gatgctaa.acgaagc:a3720 aaaagagcgtatccgacgtttgatggaeactgcttaag:~aaa:.cgacagaatccatttgaa3'780 agacgcagcgcgaai:gctcrgaagt_vct.gtaatga,~tattc:;tcgcgatctccatcagga3840 agatgaacct ctgcc,actcaaccci:.-ic:.:t:c;ggv:ggcar_attctt:aai:ggtg~~ataaacccgc3900 gccatccatgcca<xt:aatc:c~atga:gi~t:cc<3aaa<iatc;at.ytgatgactt: acctattgc3960 aattctggctgccggaatggttaatgaaaatgat.r_-.tg~,t.ctt:ctttgatGatggccagga4020 
gataccactcgtt<_t:aagcatgat:-:c::;;gg~-~tgcaatc::cctt:caccggc~::t:cagtt:acts4080 acatcgcgtcttt<;i:tgcgtt:gaatca<3aaagcctaatgt:a_~::ag.aatac:t:ttgtggtgg4140 tacgtatcgtgccagaagt.gatgc:tl~tt.tacgatgccagtaactcttcgc:cattagactc4200 tctcaatccgcgaaaaatatttat:ti=c-cgccagcggtgtgcataatcactttggcgtcag42.60 ctggtttaaccctgaagat:cttgcca~t.aagcgt:~iaacxcga':gaaccgtggactacggaa4320 aattttgctcgcccqccacgcgt~:gt!=c:gat_gaag':ggcct:ag.~~cagcrt:cgcaccgat4380 ctctgcatttgacqt_tctgattactcc~atcc~i:ccg~taccggcagattatc_~ttacgcactg4440 ccagaatggttctctt:aaagateat:ta<,acctgat:tcaaa~s;rricg.,~atga 4489 Paste 2 2003-07-~~i8 Listage pour 1e BdEj corrige.txt (2) INFORMATION FOR SEQ TC: NO; 2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 337 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
Met Ser Thr Arg Ile Asn Le:u Trp Arg Ala Leu Phe Gly G:Lu Lys Pro Arg Ile Leu Leu Glu Asn Sex Asp Phe Thr Val Thr Ser Phe Arg Tyr Asp Se:r Gly Val. GLu Gly Leu Lys Ile A.la Asn Ser Arg Gly His Leu Ile Ile Leu P:ro 'Prp Met Gly Gln Met. Ile Trp Asp Ala Gln Phe Asp Gly His GLy Leu Thr Met Cys Asn Met Phe Arg 65 70 '75 Gln Pro Lys Pro Ala 'rhr GLu 'dal Ile Glu Thr Tyz Gly Cys I'he Ala Phe His Sexy Gly Leu L~.u Al.a Asn Gly C;ys Pra Ser A).a Glu Asp Thr His Leu Leu His G.y "1.u Met ALa Cys Ala Ala Met Asp Glu Ala Trp Leu Glu Leu Asp c~l.y Asp Met Leu Arg Leu Asn Arg Arg Tyr Glu Ty° Val Met G:Ly L?he ~~:Ly His His Tyz :Leu Ala Gln Pro Thr Val Va.Leu His Ly:per Ser Thr lieu Phe Asp Ile Lys Met Ala Val Thz: Asn Leu A'a :ver Val Asp Met Pro Leu Gln Tyr 170 1'75 180 Met Cys His Met Asn Tyr A:_a 'tyr Ile Pro t~sn Ala Thr Phe Ser Gln Asn Ile Pro Asp Glu Ia_e Leu Arg Leu Arg Glu Ser Val Pro Ser His Val Asn Pro Thr A1_a Gln Trp Leu Ala Phe Asn Gln Arg Ile Met Gln Gly Glu Ala Seer Leu Ser Thr Leu Ser Gln Pro Glu 230 23ti 2.40 Phe Tyr Asp Pro Glu Ile Val Phe Phe Ala Asp Lys Leu Asp Ala Tyr Thr Asp Gln Pro Glu Phe Arg Met Ile Ser Pro Asp Gly Thr Pan°
2003-07-08 L:istage pour 1e BdB corric~e.txt Thr Phe Val Thr Arg Phe Tyr Ser Ala G1u Leu Asn Tyr Val Thr Arg Trp Ile Leu Tyr Asn G~y Gl~_z Gln Gln 'Tal Ala A1a Phe Ala Leu Pro Ala Thr_ Cys Arg Pi:o ~;,1u Gly Tyr :~~eu Ala Ala G:Ln Arg Asn Gly Thr Lea Ile Gln V~s.l. A.L~a Pro G~n Gln 'z: hr Arg Thr Phe Thr Va1 Thr Thr G:Ly Ile Gl a ( 2 ) INFORMATION FOR SEQ I U L~,TO : 3 :
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 438 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
Met Asn Asp Ly:> Asn I1e IIe Gl.n Met Pro Asp (:,1y Tyr Leu Asn 1 C? 7. 5 Lys Thr Pro Let:c Phe Gln Phe Ile heu Leu Ser Cys Leu Phe Pro Leu Trp Gly Cys Ala Ala A.la L~eu Asn Asp IIe l,eu Ile Thr Gl.n Phe Lys Ser Val Phe Ser Le>u Ser Asn Phe Ala Ser Ala Leu Val Gln Ser Ala Phe Tyr Gly GLy 'Cyr Phe Leu :Ile Ala Ile Pro Ala Ser Leu Val IlELys Lys Thr Ser Ty.r Lys Val A_La Ile Leu Ile Gly Leu Thr Leu Tyr Ile G.~y ~::~l.y Cys Thr L~eu Phe Phe Pro A:La 95 100 7.05 Ser His Met Ala Thr Tyr Thr Met Phe Leu .Ala A'__a :Ile Phe A1a 110 17.5 120 Ile Ala Ile Gly Leu Ser Phe L.eu Glu 'I'hr Ala Ala Asn 'Phr Tyr Ser Ser Met Ile Gly Pro Lys Ala Tyr Ala Thr I,eu Arg heu Asn Ile Ser Gln Thr Phe 'ryr Pro Ile G'__y Ala ;Ill.a Sc~~r c;ly Ile Leu Leu Gly Lys Tyr Leu Val Phe :3er t~Lu Gly ~.:;la Ser heu Glu hys 2003-07-08 Listage pour 1e HdI3 corrig~.txt Gln Met Ser Gly Met Asn A;~a Glu Gln Ile His Asn Phe Lys Val 185 190 1.95 Leu Met Leu Glu Asn Thr I:eu Gl~.i Pro 'Pyr Lys Tyr Met Ile Met Ile Leu Val Va:L Val Met Va.l. Leu Phe Leu Leu Th~~ Ar.g Phe Pro Thr Cys Lys Val Ala Gln Thr Ser Hips Tyr Lys Ar<I Pro Ser Ala Met Asp Thr Leu Arg Tyr Leu A:La Arg Asn Pro Arg Phe Arg Arg 245 250 2.55 Gly Ile Val Ala Gln Phe Leu Tyr Val Gly Met Gln Val Ala Val Trp Ser Phe Th.r Ile Arg L~~u .Ala Leu G _u Leu G.Ly Asp I:Le Asn 275 280 <'?85 Glu Arg Asp Al,a Ser Asn P~~e Met: Va1 Tyr Ser .P.he Ala Cys Phe Phe Ile Gly Ly;_: Phe Ile Al.a .Elsru Ile Leu Met T:hr Arg Phe Asn Pro Glu Lys Va.L Leu Ile L,~_~u 'I'yr Ser Va 1 1~ 1e G.Ly Ala Leu I'he Leu Ala Tyr Va:1 Ala Leu Ala Prc; Ser Phe Ser A.1G Val Tyr Val Ala Val Leu Va:l. Ser Val Lf~~u ahe Gly Pro Cys Trp Al.a Thr ILe Tyr Ala Gly Th:~: Leu As.p T!,.r 'Jal. Asp Asn C~lu ,H:i~. Thr G~.u Met A1a Gly Ala Va:L Ile Val Mrt Ala .I: 1e Va 1. Gi y .A:1 G Ala Val Val Pro Ala Ile Gln Gly Tyr I:Le AI_a Asp M~:et E?he Eli. Ser Le~u G1n Leu Ser Phe Leu Val Ser Met Leu Cys Phe Va1 Tyr Val G1y Val Tyr Phe Trp Arg Glu Ser Lays Val. 
Arg Thr Ala Leu Ala Glu Val 425 4?~0 935 Thr Ala Ser (2) INFORMATION FOR SEQ ID t4(:): 4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 306 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
Met Asp Ile Ala Val Ile G__y Ser Asn Met Val Asp Leu Ile Thr 1 Ci 15 Tyr Thr Asn Gln Met Pro Lys Giu Gly Glu Thr Leu Glu Ala Pro 20 2'e 30 Ala Phe Lys Ile Gly Cys Gly G:ly Lys Gly Ala Asru Gln Ala Val Al.a Ala Ala Lys Leu Asn Seer hys Val Leu Met Ge~.x Thr Lys Val Gly Asp Asp Ile Phe Ala Asp Asn '('hr Ile Arg Asn Leu G.Lu Ser Trp Gly Ile As.n Thr Thr 7'yr Val. Glu Lys Val Pro Cys Thr Ser Ser Gly Val Al~a Pro Tle Prze V<31 Asn Ala Asn Ser Ser Asn Ser Ile Leu Ile Ile Lys Gly Ala .?~sn l.ys Phe Leu Ser Pro Glu Asp 110 1:15 120 Ile Asp Arg Al,a Ala Glu A=;p Leu Lys Lys C:ys Lya Leu Tle Val 125 130 7.35 Leu Gln Leu Glm Val Gln L~=~u. Glu Thr V<~1 Tyr HL> Ala I~.e Glu 140 145 7.50 Phe Gly Lys Lys Asn Gly Ile ~:~l.u Val Leu Leu Asn Pro Al.a Pro 155 160 1.65 Ala Leu Arg Glu Leu Asp M~_~t Ser Tyr Aia Cys Lys Cys Asp Phe 170 1'75 1.80 Phe Ile Pro Asn Glu 'Phr G:Lu Leu Glu Ile Leu Thr Gly Met Ser Val Asp Thr Tyr Asp His I:Le Arg Leu Ala Ala Arg Ser Leu Val 200 205 2:10 Asp Lys Gly Lets Asn Asn I:le Ile Val Thr Met Ser Glu Lys Gly Ala Leu Trp Met Thr Arg Asp Gln Glu Va1 His Val Pro Ala Phe 230 23'p 240 Lys Val Asn Ala Val Asp Thr Ser Gly Ala Gly A.>p Ala Pr:e Ile Gly Cys Phe Sei: His Tyr Tyr_ ~~al Gln Ser G1y Asp Va1 Gl.u A1a Ala Leu Lys Ly: Ala Ala LEe;a !?7e Ala A.la E'he S~e.r 'Ja1 Thr G1y Lys Gly Thr Glr~ Ser Ser Tsr~:~ -?ro rer I.le :'~lu C: n P:ze .As;n C~Lu Paqo 2073-07-f78 I,istace pour 1.e Bd3 corrige.txt Phe Leu Thr Leu Asn G1u (2) INFORMAT2CN FOR SEA TI:) LSO:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 260 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
Met Glu Thr Lys Gln Lys lxlu Arg Ile Arg Arg Leu Met Glu Leu Leu Lys Lys Thr Asp Arg Ile His I,eu Lys Asp Ala Ala Arg Met Leu Glu Val Ser Va:1 Met Ttrr Tle Arg Arg Asp Leu His Gln Glu Asp Glu Pro Leu Pro Leu Thr heu Leu G1y Gly Tyr_ Ile Val Met Val Asn Lys Pro Ala Pro SEea: Met: Pro Val :Lle His Asp Val Pro Lys Asn His Arg Asp Asp L~eu Pro Ile Ala Ile Leu Ala A.La Gly Met Val Asn Glu Asn Asp Leu Ile Phe Phe Asp Asr:~. Gly G.Ln G1u Ile Pro Leu Va1 Ile Ser Met I1e Pro Asp Ala Ile: 'Thr Phe Thr Gly Ile Cys Tyr Ser His Arg Val Phe Val Al.a Leu Asn Glu Lys Pro Asn Val Thr Ala Ile L~~~~u Cys Gly Gly Thr Tyr Arg Al.a Arg 140 195 x.50 Ser Asp Ala Phe Tyr Asp Ala Ser Asn Ser Ser Prc? Leu Asp Ser 155 160 7.65 Leu Asn Pro Arg Lys Ile Phe Ile Ser Ala Ser Gly Val His Asn His Phe Gly Val Ser Trp Ph.e Asn Pro Gl.u Asp Leu Ala Thr Lys Arg Lys Ala Met Asn Arg G:ly ~eu Arg Lys Ile Leu Leu Ala Arg His A1a Leu Phe Asp Glu Va1 A1a Ser Ala Ser Leu Ala Pro Ile Ser Ala Phe Asl:~ Val heu I 1e :per Asp Arg Pro I,eu Pro Al,a Asp Page 2003-07-()8 Listage pc:ur 1e BdB corrige.txt 230 2.35 240 Tyr Val Thr His Cys Gln P.:~ru Gly Se:r Val Lys Ile Ile Thr Pro Asp Ser Glu Asp Glu (2) INFORMATION FOR SEQ ID NG: 6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4486 nucleotides
(B) TYPE: nucleic acid
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Escherichia coli
(B) STRAIN: AL862
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
ggacgataatgtgatcgtcaataagggca:cacgctatc:atagtc~ttgtc:ctggcgggtaaa60 aaaacgcgcttaccttaaa:gataa:gcgcgccgctgttcaggccttgagtggttattcaat120 tcctgtggtgactgtaaa~.gtgcgcgt ctgcc;gt.gcaacctgaatcagcgtgccatt180 tt:c~
acgttgcgcggcaagatac:ccoca ggcc_:gacagqtt.gc:aggl::aatgraaaggcggctac240 ctgttgctctccgttata ggat:cc:vagc:gtgtc<~ca.taa.t~t:tagttcac~::actgtagaa300 ~a acgagtaacaaacgtagtgccats:~gygagagatcatgcgaaactctggctgatctgtata360 agcgtccagtttgtctgc~aagaagacaatttctggatcataaaattccggttgactcag420 cgtcgacagagaggcttct;ccctctc:~:taatccgtt.gattaa:~<:gccagcc:actgagcggt480 gggattaacatgcclaaggc:actc~~~tt:cargcaatctt<zatatt.tcgt<:cc~ggatattct=g540 gctgaatgtagcai:ttggtat:at~3t:c:~ca:ataat.tcatc~i:ggc:.ac~at:ataqtagtggc<~t600 t~:
atctacagaagccagatt~gtaacggcc:atcttaatatcgaacagtgtagaggatttgtg660 aaggaccactgttggctgagccactat.aat:gatgacccaaacccattacatactcgtaacg720 cccgttaaggcgtaacatatctcc~qtctaattccagccatgctetcatc::c:r'.:cgcggcaca'780 ggccatttcaccgi:gtagcagat,tac7tt;~tr.t:t.cr_:acac~at.dr,Iqcagccat:~~<~gccagcaa84 acctgaatgaaaagcaaaacagcc:ataggtctctatc:acct~ct:gtcgccdgtataggctg900 gcgaaacatattgcacat<rgtgaagc:cgt.gtccatcaaattgcgcatcccaaatcatctg960 ccccatccagggaagaat:,iat:ca-a:ct:gt.c:c:acgac:U:a;tt:tgc<:at:tttaactcccctcgac1020 accgctgtcatat c;gaaaaagacgt:ctac<igt:aaaatcactattttccagc<3<3gatacgagg1080 tttctcgccaaaaa:~cgr_c,cgcc_cc:~.iaat:vtaatacgc~<Ltactcvataac:gat.-.tctcctcag1140 gacgctgtgacttcagccaagtgc~3gtacgtactttgctttcac:gccaga<3gtagactccg1200 acatagacaaagcagagcataga_caccaggaatgaaagctgtagtgagtggaacatatct1260 gcaatatatccct:Iaat~t.ccggnac:cs~ccgc:ggc:acc;gac.aatagcc:at:aacaatga<:t1320 c gctcctgccattt::tgtat:gttc~;t.t:aitc:aacagtatccar~tc;ttcctgc:at:agatcgtc1:380 gcccagcaagggc~.aaac<3aaac._,ct aggacggcgacat agaccgc:g<agaaactt1440 t<icc:
ggagccagtgcaac.atatgccag~~~aac:agcgcccctataacgcfaatagagaatcaatact1500 ttttccggattaa<iacgccftcat cac~gat~att.ggctat:aaac.t:tgccaai:aaagaagcag1560 gcaaagctataga::::atg~:3agtt tgaagcatcacctttcctttgatatcgcccaactccagc1620 gccagacggatggt:aaatc,tac:catactgc~gac:ctgcat:acc;:acataaaddaactgcgc:c1680 acaataccgcgacg:aaag<:gcggatttctagccagatagcgc<3gcgtatc~cattgctgac1740 gggcgtttatggtg:acttcJtctgt:gccactttacaggttgggaagcgggt:t:aaaaggaac1800 aacaccatgacca~;..aacca~gaat~:at;:,aatcatata<:ttat<~cctgttcaaciggtgttctct1860 aacatcagcaccttaaagi:tgtg:uat,ttgct:cggcgtt:cattc:ctgac:ai:ct:gcttttca1920 aggctttccccctcggagaaaac~~:agatat:ttgcccaataa~~~aaccag<icgcagcacca1980 atcggataaaaggt~~tggc.:tgatatt:gagecgcaatgtgge~ataggctti:tggaccgatc2040 attgaactgtatgt~~ttrc:xct..gc:e.gtaggaaact:caclgccaatcgc:aatcgcaaaa2100 t.i~ca atagctgcaagga,_~~ata<tt<ttaa~=~c3l::t,c~c:.catatgcgaggcactggaaaaa~<tagtgtacaa21 ccaccaatatacag~~gtcaggcc~:attaaaattgcca<:cttat:aactggt:cattttaat:c2220 acaagggatgctggtatt<3caatta.aaaaataacctccataaaatgcgct:ctgcaccaat2280 gctgaagcaaagttacttagcga,~aatacacttttgaattgagtgattaatatgtcattt2390 Paste 8 2003-07-C~E3 BdB corr:ige.txt f.~istage pour :1e aatgcagctgcgcat:ccccatagcg<~gaataaacacgataacaaaataaactggaacaag 2400 ggagtcttattcaciataccr_atcaggcatctgaatgat.gtttttatcgtt;catagtgcta 2460 cctttaactgtgcac~gatgattat:tc~gtataaggttaaaaattc:attaaat:tgttcaat.a2520 ctcggataagatgatagcgtaces=ti:c_cctgtgacgc2:gaaagcggcaaagagagcggct 2580 tttttcaaagcggcatcaacatcacc~c~ctttgaacataataatgggaaaagcaaccaata 2640 aatgcgtcaccagcc~ccac;tagt,:3t:s.~<~aragcatttar;tttgaatgcagctaacatggac,t2700 tcctgatcgcgggt:catcc:ataat:cacc:Lcc:tttttcgct.catggtaacaat;aatattgttc 2760 agccctttatcaar_i~aacgaacgtgcggccaaacgaat.atgatcataagtatcaaccgac 2820 ataccggttaatat.t~tccagttct:cyt~:t:cattcgggataaac~naatcacatt.tgcaggca2880 taagacatatctaactcac:gc:aatc3~~c~qgagccggatttaataacacttc:aataccattt 2990 ttcttaccaaactc~<~atc<~cgtgtt~3aactgtttc~cagttgaact.t:cc:agttgtaaaacg 3000 atcaatttgcatti:.i~ttcagatctt:~rtgcagctcgatcgatat.cttccg<iggaaagaaat 
3060 ttattcgctccctt:<~attattaavatact.attgctcgagtt~3gcattaac:aaagatcggt 3120 gcaacaccactgc!::c~gtac::aggg~,~a;attctcaacataagt:ggtattaat:.t:ccccatgat3180 tcgagattacgaatagtattatc_:gcaaaaatatcatcacctactttagti_agcatcagg3290 acttttgaattcaavttac7ccgc~~:g~car_cgcttgattagcacctttccc:accacatccg 3300 attttgaaggcaggtgctt:.ccag,ugt:ttct:ccttc~ttt:aggr,atctgatt:agtgtaagta 3;360 atgagatccacca~t,3ttg<~aacc~uat;aactgcaatgtccattt cactacc:t_cttataaac3420 tttcgcataacaatggtatataaataacattagcatgttacttttgcatcatttgtgact 3480 gagatcgcgatta~;i;:acat:caac~:cgai~gttt.atttaatagac-ttccagtcttatcactc 3540 aggccaacactat::taatc:ataactcaac;ctaacaggat:t:aataccgaaaat~t:cagcagtc3600 tatacccttttcatttcaaagggt:cggtcgtatagtat~ggt.-3ar_taaaac:aatgtttact 3660 aatgccataatgtt.atttttataacattttacggagagagttgatggaaacgaagcaaaa 3720 agagcgtatccgacgtttgat:tg_uaatact.taagaaaaccgac:agaatccatttgaaaga 3780 cgcggcacgaatg,ctgga<igt:tr_c.t:cttaat:gactatt.<:gtagc:vgatct:cc~at=caggaaga3840 tgaacctctgccactgaccctact:gggtggctatattgt:aatggtgcataaacccgcacc 3900 atccatgccagtaatccaggacgt:tccgagaaatcatc:gtgatgactt.acctattgcaat 3960 tctggccgccggaatggttaatgaaaatgat:ctgatca:t:cttt.gataaat~:~gccaggagat 4020 accgctcgttataagcatgatccc:ggatycaatcacc:ttcactggr_atc=gttactcaca 4080 tcgtgtcttt gttgcgttgaatgaaaaacc:taatgtgar_agcaatactttgtggtggtac 4140 gtatcgtgccagaagtgatgc.~tt.i.t:t:.acc:tatgccagt:aact<a.tcgccatt:agactctct4200 caatccgcgaaaaatattt:atttc.ccaccagc;ggtgta~c~atgat:cactttggcgtcagctg 4260 gtttaatcccgaagatcttgccactaagcgtaaagcgatggcccgtggactaaggaaaat 4320 tttgctcgcccgcc:acgc~:atgt.tcgatgaagtagcctctgc<aagcct:cgc~accgctctc4380 tgcatttgat gttctgattagcgagc.gtccgt:t:accctgcagat;.tatgttacgcactgccg 4440 gaatgcttcgtaaagat~,at t. t:cactaaagacgautga 4486 ttacvacctga (2) INFORMATION FOR SEQ II:7 N0: 7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 337 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
Met Ser Thr Arg Ile Asn i_.n_u:: Trp Arg F,~' a f.~eu Fnee G1y Glu L~ys I. ~ ~. 5 Pro Arg Ile Leu Leu Glu F.:=,n. Ser P.sp P:.e Thr ua!. ~'hr Ser Phe 20 <:~~ 30 Arg Tyr Asp Ser Gly Val ~:l.m Gl~Y~ heu hys I1_e P.la Asn Ser Arg 35 =?(i e5 Gly His Leu Ile Ile Leu Pro Trp Met G~.y Gln Mev Ile Trp Asp 50 5!-i 60 Ala Gln Phe Asp Gly His G..y Leu Thr Met Cys Asn Met Phe Arg Pave 9 2003-07-O8 histage poeir1e BdEcorrige.txt GlnProLys PrrrAlaThr G ';!alI GJ.Thr'TyxGly CysPhe 1.,a 1e a 80 8'~ 90 AlaPheHis Se:rGlyLeu h_~uAlaAsn G:LyCysPrc:~Ser ValGlu 95 100 ~05 AspThrHis Le~_iLeuHis Gl.yG1L.Met.Ala(:ysAlaAI_aMetAsp GluAlaTrp Le,zGluLeu A.>p.,1yAsp MetLeuAr_c)Leu AsnGly ArgTyrGlu TyrValMet.Gl_yPheGly HisHi_sTyr:-Leu AlaGln ProThrVa1 Va:LLeuHis hysSerSer Thr7_~euPheAsp ILeLys 155 160 1.65 MetAlaVal ThrAsnLeu A=_aSerVal AspMetPrc-.Leu GLnTyr 170 175 1.80 MetCysHis MetAsnTyr A:LaTyrIle ProAsnAlaThr PheSer GlnAsnIle ProAspGlu I:LeL,euArg LeuArgGlmSer ValPro SerHisVal AsnProThr RiaGl_nTrp LeuA:LaPheAsn GLnArg IleMetGln GlyGluAla :>c~rLeiaSer ThrLeuSerGln ProGlu PheTyrAsp ProGluIle Val.PhePhe ALaAspLysLeu AspAla 295 2~,0 255 TyrThrAsp GlnProGlu PlueArgMet IieSerPrc~Asp GlyThr 260 2E>5 270 ThrPheVal ThrArgPhe TyxSerA:laGl.uLeuFsnTyr Va1Thr ArgTrpIle LeuTyrAsn C:lyG1~~Gl.nGLnValAl..:xA1a PheAla LeuProAla ThrCysArg k'rc>GluG_y TyrLeuAlaAla G.lnArg 305 31.0 315 AsnGlyThr LeL.IleGln V,a:l.AlaPro GLnGlnThr_Arg ThrPhe 320 325 ;330 ThrValThr Tl-~rGlyIle Ca:Ll.i ( 2 ) INFORMATIOI~I FOR. SEQ I 1) I'!0 : 8 :
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 438 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
MetAsnAsp LysAsnIle I:Le;7 MetP:r Faspu.LyTyr LE.~uAsn n o 5 7.0 7.5 LysThrPro LeuPheGln PheI1_eheuLeu L~er.~.~JSLeu PheI~'ro LeuTrpGly Cy::~AlaAla Ala?~euAsnAsp Il.eLei:Ile TtArGin 35 4~ 45 PheLysSer Va:LPheSer IaE_;a:3erAsnP'~eA1aSerAla LeuVal 50 5.'> 60 GlnSerAla Ph<~TyrGly G 'ryrP:heLeu 7: i~ I ProAla l 1e La 1e y 65 70 'S
SerLeuVal IleLysLys T!-~r,:perTyrLys ValAlaI1_eLeu7:1e GlyLeuThr Le~.rTyrI1e Gl.y~:~'.yCysThr LeuP:ze.~Phe PwoAla SerHisMet A1.3ThrTyr Tr:r!hetPheLeu AlaA1<~ile PheAla 110 1'15 120 IleAlaIle GlyLeuSer F:hc~_LeuGluThr F~7.aAlaA.snThrTyr 125 1:~0 '_35 SerSerMet IleGlyPro hysAiaTyrAia ThrL~uArg LceuAsn 140 145 ~~50 IleSerGln ThrPheTyr ProI:LGlyAla A7.aS2rGly I:l_eI~eu e:.
155 1E~0 165 LeuGlyLys Tyr_LeuVa:LFheSerGluG1y GluSerLeu GluLys 1'70 175 180 GlnMetSer GlyMetAsn A7 G:LiaGlnI HisAsrzPhe LysVal a 1e LeuMetLeu GluAsnThr.LeuGluProTyr I~ysTyrMet I:LeMet IleLeuVal Va:LValMet.Va7.LeuPheLeu LeuThr-Arg PhePro 215 220 'Z25 ThrCysLys Va:LAlaGln ThrSerHisHis -LysArgPro SerAla MetAspThr LeuArgTyr L,euA.laArgAsn ProAr<(Phe ArgArg GlyIleVal AlaGlnPhe L,~~uT ValGly MetGlnVal AlaVal yr TrpSerPhe Th.rIleArg I,euAlaLeuGlu :LeuGlvrAsp I1eAsn GluArgAsp A1;~SerAsn PheMetVa1Tyr SerPheAla CysPhe 2003-07-(:~E3 L~istage pour1e BdBc:orrige.
txt PheIleGly LysPheIle A_aAsnIleLeu MetThrArg PheAsn ProGluLys Va~_LeuIle Lea.x'CyrSerV<i:LLleG:LyAla LeuPhe LeuAlaTyr VaI.Al.aLeu A:LaProSerPhe SerAlaVal TyrVal AlaValLeu ValSerVal L<:uPheGlyPro CysTrpAla ThrIle TyrA1aGly ThwLeuAsp Th.r',7a1AspAsn GluIsisThr GluMet AlaGlyAla Vaa.Il.eVal McaA.laIl.eVal GlyALaAla ValVal ProAlaIle GlnGlyTyr IleAlaAspMet PheH.isSex LeuGln LeuSerPhe LeuValSer M._aLeuC:ysPhe ValT Val G7_yVal yr TyrPheTrp ArgGluSer LysValArgThr AlaLeuAla GluVal Thr Ala Ser (2) INFORMATIOiV FOR SEQ_ IC~ NO: 9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 306 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
Met Asp Ile Ala Val Ile G:y Ser Asn Met 'JaJ. Asp heu Ile Thr 10 :15 Tyr Thr Asn Gln Met Pro L~js Glu Gly Glu 'rrr Leu Glu Aia Pro Ala Phe Lys Ile Gly Cys C:,:Ly G1y hys Gl.y Al.a As~a Gln ALa 'Jal Ala Ala Ala Lys Leu Asn Scar Lys Va1 Leu Met Leu Thr Lys Val 50 5'60 Gly Asp Asp Ile Phe Ala A:~L:~ Asn Thr T:ie Arg Asu heu Glu Ser Trp Gly Ile Asn Thr Thr '1'yr Val Glu Lys Val Pro Cys Thr Ser Ser Gly Val Ala Pro Ile Phe Val Asn Ala Asn Ser Ser Asn Ser 95 1. C)0 1.05 2003-07 -08I~ist~age pourle BdBcorr:ige.txt IleLeuIle IleLysGly ALaAsnLysPhe heuSerProGlu Asp IleAspArg AlaAlaGlu Asp!:.~euhysLys O:yshya,LeuI7_eVal 125 130 1.35 LeuGlnLeu GluVal.Gin LeuGluThrVal 'I'yrH:LsAl.aIl.eGlu 140 195 1.50 PheGlyLys LysAsnGly ILeGluValLeu L~euAsnProAl.aPro 155 160 I_65 AlaLeuArg GlnzLeuAsp Met:SerTyrA:laCysLysCysAsp Phe 1'70 115 180 PheIlePro AsnGlu'rhrGLuLeuGluIle LeuThxGlyMet Ser ValAspThr Ty:c:AspHis I .?ergLeuAla AlaArc)SerLeu Val Le AspLysGly LeuAsnAsn Il.e:L:LeVal.'rhrMetS~~rG1_uLys Gly 215 220 2.25 AlaLeuTrp Met.ThrArg A~;p~:LnGluVal HisVaif'roAl.aPhe LysValAsn Al._iValAsp ThrSerGlyAl.aGlyAspAlaPhe Ile GlyCysPhe Ser_HisTyr Ty:rValGlnSer GlyAsiaValG_LuAla AlaLeuLys LysAlaAla LeuPheAl.aAla hheSerValThr Gly LysGlyThr GlnSerSer TyrProSerIle GluG1nPheA.snGlu PheLeuThr LeuAsnGlu (2) INFORMATION FOR SEQ LD N0: 10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 306 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
Met Asp Ile Ala Val Ile G':y Sez Asn Mc~t Jai Ast~ Leu I1e Thr 1C' 15 Tyr Thr Asn Gl~z Met Pro I~ys G1u GI_y Glu Thr L,e G.I_u A1a Pro Ala Phe Lys Ile Gly Cys GL.y Gly Lvs G?~y .~lla Pan Ciln Ala Val 35 4C) 95 2003-07-C3 '~~is"_:age po~:_ir.1e I3clBcorrige.txt AlaAlaAla Lye;LeuAsn Ser,ysVa.LLeu MetLeu'l LysVal hr G1yAspAsp Ile:PheA1 A:>palsnTh.rI7_e~,rgAsnLeu Gl.uSer a 65 70 7~
TrpGlyIle Asr:ThrThr Tsrr'1<a1.C~:l.uLy,>N'alPrc~:ysThrSer SerGlyVal AlaProIle PheValAsnAia AsnSerSer AsnSer 95 i~:?~ 105 IleLeuIle I1<;LysGly A:laAsnLysPhe LeuSerPro GluAsp IleAspArg AlmAlaGlu A:~p:L,euLysLys C'.ysLysLeu I~.eVal LeuGlnLeu GlozValGln L. GluThrVal TyrH A7_aI:LeGlu ~u L:;
PheGlyLys LysAsnGly Il.eGluVal.Leu LeuAsr;Pro A1aPro 155 1.60 165 AlaLeuArg GluLeuAsp MeatSerTyrAla CysLysCys AspPhe PheIlePro AsnGluThr Gl.uLeuGluIle L,eu'rhrGly MetSer ValAspThr TyrAspHis IleArgL,euAla AlaArc;Sex LeuVal AspLysGly LeuAsnAsn I'_eIleValThr MetSerGlu LysGly AlaLeuTrp MetThrArg AspGLnGl.uVal HisValPro AlaPhe LysValAsn AlaValAsp 7'h~:SerGl.yA:laGlyAspAla P:helle GlyCysPhe Se:rHisTyr TyrValGl.nSer Gl.yAspVal GluAla AlaLeuLys LysAlaAla I:E=u.PheAlaAla PheSerVal ThrGly 2.75 280 285 LysGlyThr GlnSerSer TyrPr<>SerI GluGl.nPhe AsnGlu ~e 290 2.95 300 PheLeuThr LeuAsnGlu ( 2 ) INFORMATI01\! FOR SEQ I D NC): 11 (i) SEQUEI~ICE CHARACTERISTICS:
(A) LENGTH: 834 nucleotides
(B) TYPE: nucleic acid
(ii) MOLECULE TYPE: DNA
2003-07-C'_'s histage poszr 1e BdB c:orrige.txt (vi) ORIGINF~L SOURCE: lsc:herichia coli ( xi ) SEQUENCE DESCRI P'l.' LC)N : S EQ I D N0 : 1 1 caatactcggataactatgattgccttacctttccctgtgacgcagaaagcggcaaagagag 60 cggcttttttcaaagcggcttca<ccat:caccgctttgaacat aataatgggaaaagcaac 120 caataaatgcgtc<~c:cagcgccar:t<~cJt:atcaacagcatttactttgaat:gcaggaacat 180 ggacttcctgatc<Jc:gggt:catcc:ai:~~atg<:gcctttttcgctcatggtaacaataatat 2.40 tgttcagccctttat:caact=aaccJaac;gtgcggcc:aaacgaatatgatcat:aagtatca.a300 ccgacataccggttt~atat:ttcc~igi:t_<agt:ttcattcgggataaagaaat: cacatttgc360 aggcataagacatat:ctaactca<:gc:.aatgccggagccggat:ttaataac:acttcaatac 420 catttttcttacca.aactcaatcgcgtggtaaactgt?-tccagttgaactt:ccagttgta480 aaacgatcaattt<Jcattt:tttcrigat:ctr.ctgcagctcgatcgatatctt:ccggggaaa540 gaaatttattcgct:cccttaattati_aatatacta"~tgct,,:,~t:gttggc~c't:taacaaaga600 tcggtgcaacaccacagctggtac,~gc~ggactttctcaacata.agtggtat:taattcccc660 atgattcgagattac:gaatgta':t,~;:~cgcaaaa;~tatca'~c a:~ctact,t:t.agttagc;a720 a tcaggacttttgaai:tcaactttacJcc:~c:cgc<:ac~~gct..t:gat:tac~cacctt:t:.cccaccac780 atccgattttgaacxgcaggtgctt~:~c~agagtttc;:cctt:ct'Ytaggcatc:t:gat 8:34 ( 2 ) INFORMATIOt~I FOR SEQ IC~ 1v0: 12 ( i ) SEQUEtVC:E CHARACTL,ft . STI CS
(A) LENGTH: 816 nucleotides
(B) TYPE: nucleic acid
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE: Escherichia coli
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
ggacgataatgtg,_~rcgtc::tata,~q;Jgcaacgctatcatagtcatgtcct:ggcgggtaaa 60 aaaacgcgcttaccttaa<:gata~~~g.~.gcgccgctgttc:aggccttgagt.<~gttattcaat 120 tcctgtggtgact~ataaaagtgc~:~cqtttgctgcggtcJcaa :cvtgaatcac~cgtgccat:t180 acgttgcgcggcaagatac:cc:ct~::ag:~<:cgacaggttgc:aggt:aatgcaaaggcggctac 240 ctgttgctctccgt:tata~~aggatccagcqtgtcacat:aa-tt agttcagc~actgtagaa300 acgagtaacaaac:Jtagtgccat.:gcx:~agagatcatgcgaasc:t~ctggctgatctgtata 360 agcgtccagtttgt~tgct3aagaagac ttct:ggat:c:a!:aar.aattcccJgttgactcag 420 aat cgtcgacagagag;Jcttct:ccct.~fc:ataatccgttgattaaacgccagcc::actgagcggt 480 gggattaacatgc;fiaagg~actg=u.t caatctt<iatatt:tcgtccgcJgatattctg540 c<~cg gctgaatgtagcat:ttggn:atat:~stgcat:.aat.tcatgt:ggcac:atatat?:dt:agtggcat600 atctacagaagccagatt<;gttac:ggccatct.taatat=cgaac:agtgtac4aggatttgtg 660 aaggaccactgttc~gctg<igccac~~ataatgat:gaccgaaaccc:attaca'_actcgtaacg '720 cccgttaaggcgtaacata-itctccgtctaattccagrcatgcttcatcc~itcgcggcaca 780 ggccatttcaccgi:gtagc:agat_c~~agtatc:ttc:cac 816 PacJee 1. S
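For readers handling the listed sequences in electronic form, several of the sequence criteria recited in the claims (complements, at least 20 contiguous residues, at least 80% identity) can be illustrated with a minimal Python sketch. The function names and the toy inputs are illustrative assumptions and are not part of the specification. The identity calculation assumes two sequences that have already been aligned to equal length; it does not reproduce any particular alignment program or the applicant's own procedure.

```python
# Illustrative sketch only: helper names and toy inputs are assumptions,
# not part of the patent specification.

_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (a, c, g, t only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def shares_contiguous_run(query: str, reference: str, k: int = 20) -> bool:
    """True if `query` contains at least `k` contiguous residues of `reference`."""
    query, reference = query.lower(), reference.lower()
    if len(query) < k or len(reference) < k:
        return False
    # Index every k-length window of the reference, then scan the query.
    kmers = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    return any(query[i:i + k] in kmers for i in range(len(query) - k + 1))


def percent_identity(a: str, b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences ('-' = gap)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(1 for x, y in zip(a.lower(), b.lower()) if x == y and x != "-")
    return 100.0 * matches / len(a)
```

With these helpers, a candidate sequence could be screened by calling `shares_contiguous_run(candidate, reference)` and by checking whether `percent_identity(aligned_candidate, aligned_reference) >= 80.0`. Hybridization under stringent conditions is an experimental property and is not modeled here.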
Claims (39)
1. A method for evaluating pathogenicity of a strain of E. coli, comprising the step of assaying a metabolic activity of said strain.
2. The method of claim 1, wherein said metabolic activity consists of metabolization of 2-Deoxy-D-ribose.
3. The method of claim 1 or 2, wherein said assessment comprises growing said strain on a minimal medium comprising 2-Deoxy-D-ribose as a sole source of carbon.
4. A method for determining likelihood of pathogenicity of a strain of E. coli, comprising:
- assaying deoxyribokinase enzymatic activity of said strain; and/or
- assaying said strain for the presence of genes or proteins involved in metabolization of 2-Deoxy-D-ribose;
wherein ability of said strain to metabolize 2-Deoxy-D-ribose and/or presence of genes or proteins involved in metabolization of 2-Deoxy-D-ribose is indicative of a higher likelihood that said strain of E. coli is pathogenic as compared to a commensal strain.
5. A method for identifying a pathogenic strain of E. coli, comprising:
- assaying deoxyribokinase enzymatic activity of said strain; and/or
- assaying said strain for the presence of genes or proteins involved in metabolization of 2-Deoxy-D-ribose (other deoxyriboses??);
wherein ability of said strain to metabolize 2-Deoxy-D-ribose and/or presence of genes or proteins involved in metabolization of 2-Deoxy-D-ribose is indicative that said strain of E. coli is pathogenic.
6. The method of claim 4 or 5, wherein said genes or proteins consist of genes or proteins from the deoK operon.
7. The method of any one of claims 4 to 6, comprising assaying said strain for the presence of a nucleic acid sequence selected from the group consisting of:
a) sequences provided in part or all of SEQ ID NO: 1 or 6;
b) complements of the sequences provided in part or all of SEQ ID NO: 1 or 6;
c) sequences consisting of at least 20 contiguous residues of a sequence provided in SEQ ID NO: 1 or 6;
d) sequences that hybridize to part or all of nucleic acids of SEQ ID NO: 1 or 6, under moderately, preferably highly, stringent conditions;
e) sequences having at least 80% identity to part or all of SEQ ID NO: 1 or 6;
f) degenerate variants of a sequence provided in part or all of SEQ ID NO: 1 or 6; and
g) sequences encoding part or all of polypeptides provided in SEQ ID NO: 2-5 and 7-10.
8. The method of claim 7, wherein said nucleic acid sequence is selected from the group consisting of:
a) a nucleotide sequence having at least 80% nucleotide sequence identity with part or all of SEQ ID NO: 1 or 6; and
b) a nucleotide sequence having at least 80% nucleotide sequence identity with a nucleic acid encoding any of SEQ ID NO:2-5 and 7-10.
9. The method of claim 8, wherein said nucleic acid sequence is selected from the group consisting of:
a) a sequence substantially the same as part or all of SEQ ID NO: 1 or 6; and
b) a sequence substantially the same as a nucleic acid encoding part or all of any of SEQ ID NO:2-5 and 7-10.
10. The method of claim 9, wherein it comprises a sequence selected from the group consisting of:
a) a sequence having 100% identity with SEQ ID NO: 1 or 6; and
b) a sequence having 100% identity with a nucleic acid encoding any of SEQ ID NO:2-5 and 7-10.
11. The method of any one of claims 4 to 6, comprising assaying said strain for the presence of a polypeptide comprising an amino acid sequence selected from the group consisting of:
a) sequences encoded by a nucleic acid as defined in claim 7;
b) sequences having at least 80% identity to part or all of any of SEQ ID NO:2-5 and 7-10;
c) sequences having at least 85% homology to part or all of any of SEQ ID NO:2-5 and 7-10; and
d) sequences provided in part or all of any of SEQ ID NO:2-5 and 7-10.
12. The method of claim 11, wherein said polypeptide comprises an amino acid sequence selected from the group consisting of sequences substantially the same as any of SEQ ID NO:2-5 and 7-10.
13. The method of claim 12, wherein said polypeptide comprises an amino acid sequence selected from the group consisting of sequences 100% identical to any of SEQ ID NO:2-5 and 7-10.
14. The method of any one of claims 4 to 6, comprising assaying, under suitable culture conditions, capabilities of said strain to metabolize 2-Deoxy-D-ribose.
15. The method of claim 14, comprising growing said strain on a minimal medium comprising 2-Deoxy-D-ribose as a sole source of carbon.
16. The method of claim 15, wherein said minimal medium comprises about 0.1% 2-Deoxy-D-ribose.
17. The method of claim 15 or 16, wherein said strain is cultured in said minimal medium for about 24h to about 48h.
18. An isolated or purified nucleic acid molecule comprising a sequence selected from the group consisting of:
a) sequences provided in part or all of SEQ ID NO: 1 or 6;
b) complements of the sequences provided in part or all of SEQ ID NO: 1 or 6;
c) sequences consisting of at least 20 contiguous residues of a sequence provided in SEQ ID NO: 1 or 6;
d) sequences that hybridize to part or all of nucleic acids of SEQ ID NO: 1 or 6, under moderately, preferably highly, stringent conditions;
e) sequences having at least 80% identity to part or all of SEQ ID NO: 1 or 6;
f) degenerate variants of a sequence provided in part or all of SEQ ID NO: 1 or 6; and
g) sequences encoding part or all of polypeptides provided in SEQ ID NO: 2-5 and 7-10.
19. The nucleic acid of claim 18, wherein it comprises a sequence selected from the group consisting of:
a) a nucleotide sequence having at least 80% nucleotide sequence identity with part or all of SEQ ID NO: 1 or 6; and
b) a nucleotide sequence having at least 80% nucleotide sequence identity with a nucleic acid encoding a polypeptide provided in SEQ ID NO: 2-5 and 7-10.
20. The nucleic acid of claim 19, wherein it comprises a sequence selected from the group consisting of:
a) a sequence substantially the same as part or all of SEQ ID NO: 1 or 6; and
b) a sequence substantially the same as a nucleic acid encoding part or all of any of SEQ ID NO: 2-5 and 7-10.
21. The nucleic acid of claim 20, wherein it comprises a sequence selected from the group consisting of:
a) a sequence having 100% identity with SEQ ID NO: 1 or 6;
b) a sequence having 100% identity with a nucleic acid encoding any of SEQ ID NO:2-5 and 7-10.
22. An isolated or purified nucleic acid molecule comprising a sequence encoding an E. coli polypeptide involved in metabolization of 2-Deoxy-D-ribose, or degenerate variants thereof, wherein said E. coli polypeptide or degenerate variant comprises part or all of SEQ ID NO:2-5 and 7-10.
23. An isolated or purified protein comprising an amino acid sequence selected from the group consisting of:
a) sequences encoded by a nucleic acid as defined in claim 7;
b) sequences having at least 80% identity to part or all of any of SEQ ID NO:2-5 and 7-10;
c) sequences having at least 85% homology to part or all of any of SEQ ID NO:2-5 and 7-10; and
d) sequences provided in part or all of any of SEQ ID NO:2-5 and 7-10.
24. The protein of claim 23, wherein it comprises an amino acid sequence selected from the group consisting of sequences substantially the same as any of SEQ ID NO:2-5 and 7-10.
25. The protein of claim 24, wherein it comprises an amino acid sequence selected from the group consisting of sequences 100% identical to any of SEQ ID NO:2-5 and 7-10.
26. An isolated or purified protein involved in E. coli metabolization of 2-Deoxy-D-ribose, or degenerate variants thereof, wherein said protein or degenerate variant comprises part or all of any of SEQ ID NO:2-5 and 7-10.
27. An isolated or purified antibody that specifically binds to a protein as defined in any one of claims 23 to 26.
28. The antibody of claim 27, wherein said antibody consists of a monoclonal or of a polyclonal antibody.
29. A cloning or expression vector comprising the nucleic acid of any one of claims 18 to 22.
30. The vector of claim 29, wherein said vector is capable of directing expression of the peptide encoded by said nucleic acid in a vector-containing cell.
31. A transformed or transfected cell that contains the nucleic acid of any one of claims 18 to 22.
32. The cell of claim 31, wherein said cell consists of an Escherichia coli bacterium.
33. The cell of claim 31, wherein the Escherichia coli bacterium is selected from the group consisting of Escherichia coli bacteria filed at the CNCM under accession numbers I-2867 and I-2867 on May 14, 2002.
34. A nucleotide probe comprising a sequence of at least 15 sequential nucleotides of SEQ ID NO: 1 or 6, or of a sequence complementary to SEQ ID NO: 1 or 6.
35. The probe of claim 30, wherein it consists of SEQ ID NO: 11 or 12.
36. A kit for identifying a pathogenic strain of E. coli, comprising the antibody of claim 27 or 28; or the probe according to claim 34 or 35; and at least one element selected from the group consisting of instructions for using said kit, reaction buffer(s), and enzyme(s).
37. A kit for identifying a pathogenic strain of E. coli, comprising means for assaying capabilities of said strain to metabolize 2-Deoxy-D-ribose.
38. The kit of claim 37, wherein said kit comprises a minimal culture medium with 2-Deoxy-D-ribose as a sole source of carbon.
39. A method for producing a polypeptide involved in E. coli metabolization of 2-Deoxy-D-ribose, comprising:
- providing a cell transformed with a nucleic acid sequence encoding said polypeptide positioned for expression in said cell;
- culturing said transformed cell under conditions suitable for expressing said nucleic acid; and
- producing said human polypeptide.
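Several of the claims above rest on simple sequence relationships: complementary sequences, fragments of at least 20 contiguous residues, and sequences with at least 80% identity. The sketch below illustrates one way such relationships might be checked. It is not the method of the application: the function names and the toy sequences are hypothetical, the sequences are not SEQ ID NO: 1 or 6, and identity is computed here by an ungapped, position-by-position comparison of equal-length sequences, which is an assumption. In practice, identity is normally computed from an alignment.

```python
"""Toy checks for three sequence relationships named in the claims.

All sequences used here are made-up placeholders, not SEQ ID NO: 1 or 6.
"""

# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_contiguous_fragment(candidate: str, reference: str, min_len: int = 20) -> bool:
    """True if candidate is a run of at least min_len contiguous residues of reference."""
    return len(candidate) >= min_len and candidate.upper() in reference.upper()


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences.

    Assumption: positions are compared one by one, with no alignment step.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    reference = "ATGGCTAAGCTTGACCGTTCAGGATCCGTA"  # hypothetical 30-mer
    fragment = reference[5:27]                     # 22 contiguous residues of it
    variant = "ATGGCTAAGCTTGACCGTTCAGGATCCGAT"    # hypothetical variant of the 30-mer

    print(reverse_complement(reference))
    print(is_contiguous_fragment(fragment, reference))
    print(percent_identity(reference, variant) >= 80.0)
```

The 20-residue minimum and the 80% threshold are taken from the claim language; everything else in the sketch is illustrative.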
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA002388445A CA2388445A1 (en) | 2002-05-31 | 2002-05-31 | Genetic markers, metabolic markers, and methods for evaluating pathogenicity of strains of e. coli |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA002388445A CA2388445A1 (en) | 2002-05-31 | 2002-05-31 | Genetic markers, metabolic markers, and methods for evaluating pathogenicity of strains of e. coli |
Publications (1)
Publication Number | Publication Date |
---|---|
CA2388445A1 (en) | 2003-11-30 |
Family
ID=29783844
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CA002388445A CA2388445A1 (en) | Abandoned | 2002-05-31 | 2002-05-31 | Genetic markers, metabolic markers, and methods for evaluating pathogenicity of strains of e. coli |
Country Status (1)
Country | Link |
---|---|
CA (1) | CA2388445A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105886595A (en) * | 2016-04-28 | 2016-08-24 | 夏云 | Method for rapidly detecting proteolysis bacteria in human excrement |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | FZDE | Dead | |