CN1751125A - Mixing and matching TC proteins for pest control - Google Patents
- Publication number
- CN1751125A CNA2004800046014A CN200480004601A
- Authority
- CN
- China
- Prior art keywords
- protein
- seq
- toxin
- amino acid
- gene
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
- IAJFFZORSWOZPQ-SRVKXCTJSA-N Leu-Leu-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IAJFFZORSWOZPQ-SRVKXCTJSA-N 0.000 description 3
- LFSQWRSVPNKJGP-WDCWCFNPSA-N Leu-Thr-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCC(O)=O LFSQWRSVPNKJGP-WDCWCFNPSA-N 0.000 description 3
- WUHBLPVELFTPQK-KKUMJFAQSA-N Leu-Tyr-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O WUHBLPVELFTPQK-KKUMJFAQSA-N 0.000 description 3
- XZNJZXJZBMBGGS-NHCYSSNCSA-N Leu-Val-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O XZNJZXJZBMBGGS-NHCYSSNCSA-N 0.000 description 3
- 239000006142 Luria-Bertani Agar Substances 0.000 description 3
- MYZMQWHPDAYKIE-SRVKXCTJSA-N Lys-Leu-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O MYZMQWHPDAYKIE-SRVKXCTJSA-N 0.000 description 3
- 241000255908 Manduca sexta Species 0.000 description 3
- 241000254043 Melolonthinae Species 0.000 description 3
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 3
- 101000822778 Naja naja Long neurotoxin 4 Proteins 0.000 description 3
- 244000061176 Nicotiana tabacum Species 0.000 description 3
- 108020005187 Oligonucleotide Probes Proteins 0.000 description 3
- 241001310335 Paenibacillus lentimorbus Species 0.000 description 3
- 241001310339 Paenibacillus popilliae Species 0.000 description 3
- 108091093037 Peptide nucleic acid Proteins 0.000 description 3
- 241001216646 Photorhabdus temperata Species 0.000 description 3
- 241000500437 Plutella xylostella Species 0.000 description 3
- 108700008625 Reporter Genes Proteins 0.000 description 3
- 241000589180 Rhizobium Species 0.000 description 3
- FIXILCYTSAUERA-FXQIFTODSA-N Ser-Ala-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FIXILCYTSAUERA-FXQIFTODSA-N 0.000 description 3
- HEMHJVSKTPXQMS-UHFFFAOYSA-M Sodium hydroxide Chemical compound [OH-].[Na+] HEMHJVSKTPXQMS-UHFFFAOYSA-M 0.000 description 3
- 108020005038 Terminator Codon Proteins 0.000 description 3
- ZQUKYJOKQBRBCS-GLLZPBPUSA-N Thr-Gln-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O ZQUKYJOKQBRBCS-GLLZPBPUSA-N 0.000 description 3
- NOXKHHXSHQFSGJ-FQPOAREZSA-N Tyr-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NOXKHHXSHQFSGJ-FQPOAREZSA-N 0.000 description 3
- YGKVNUAKYPGORG-AVGNSLFASA-N Tyr-Asp-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O YGKVNUAKYPGORG-AVGNSLFASA-N 0.000 description 3
- RUCNAYOMFXRIKJ-DCAQKATOSA-N Val-Ala-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN RUCNAYOMFXRIKJ-DCAQKATOSA-N 0.000 description 3
- DDNIHOWRDOXXPF-NGZCFLSTSA-N Val-Asp-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N DDNIHOWRDOXXPF-NGZCFLSTSA-N 0.000 description 3
- CFSSLXZJEMERJY-NRPADANISA-N Val-Gln-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O CFSSLXZJEMERJY-NRPADANISA-N 0.000 description 3
- 108010077245 asparaginyl-proline Proteins 0.000 description 3
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 3
- 238000003556 assay Methods 0.000 description 3
- 235000013339 cereals Nutrition 0.000 description 3
- 239000003795 chemical substances by application Substances 0.000 description 3
- 238000003967 crop rotation Methods 0.000 description 3
- 238000009795 derivation Methods 0.000 description 3
- 239000000539 dimer Substances 0.000 description 3
- 108010054812 diprotin A Proteins 0.000 description 3
- 235000013399 edible fruits Nutrition 0.000 description 3
- 230000002255 enzymatic effect Effects 0.000 description 3
- 239000008103 glucose Substances 0.000 description 3
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 3
- 230000001976 improved effect Effects 0.000 description 3
- 208000015181 infectious disease Diseases 0.000 description 3
- 229930027917 kanamycin Natural products 0.000 description 3
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical class O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 3
- 231100000518 lethal Toxicity 0.000 description 3
- 230000001665 lethal effect Effects 0.000 description 3
- 108010030617 leucyl-phenylalanyl-valine Proteins 0.000 description 3
- 108010009298 lysylglutamic acid Proteins 0.000 description 3
- 238000004519 manufacturing process Methods 0.000 description 3
- 239000011159 matrix material Substances 0.000 description 3
- 238000001840 matrix-assisted laser desorption--ionisation time-of-flight mass spectrometry Methods 0.000 description 3
- 238000002844 melting Methods 0.000 description 3
- 230000008018 melting Effects 0.000 description 3
- 102000039446 nucleic acids Human genes 0.000 description 3
- 108020004707 nucleic acids Proteins 0.000 description 3
- 150000007523 nucleic acids Chemical class 0.000 description 3
- 239000002751 oligonucleotide probe Substances 0.000 description 3
- 238000005457 optimization Methods 0.000 description 3
- 239000002245 particle Substances 0.000 description 3
- 230000007096 poisonous effect Effects 0.000 description 3
- 230000002265 prevention Effects 0.000 description 3
- 108010079317 prolyl-tyrosine Proteins 0.000 description 3
- 108010015796 prolylisoleucine Proteins 0.000 description 3
- 235000004252 protein component Nutrition 0.000 description 3
- 230000006798 recombination Effects 0.000 description 3
- 238000005215 recombination Methods 0.000 description 3
- 230000002829 reductive effect Effects 0.000 description 3
- 238000005507 spraying Methods 0.000 description 3
- 238000010561 standard procedure Methods 0.000 description 3
- 238000005728 strengthening Methods 0.000 description 3
- 108010061238 threonyl-glycine Proteins 0.000 description 3
- 238000013518 transcription Methods 0.000 description 3
- 230000035897 transcription Effects 0.000 description 3
- 230000001131 transforming effect Effects 0.000 description 3
- 108010051110 tyrosyl-lysine Proteins 0.000 description 3
- 108010003137 tyrosyltyrosine Proteins 0.000 description 3
- 241000701447 unidentified baculovirus Species 0.000 description 3
- 230000003612 virological effect Effects 0.000 description 3
- UPMXNNIRAGDFEH-UHFFFAOYSA-N 3,5-dibromo-4-hydroxybenzonitrile Chemical compound OC1=C(Br)C=C(C#N)C=C1Br UPMXNNIRAGDFEH-UHFFFAOYSA-N 0.000 description 2
- 241000238876 Acari Species 0.000 description 2
- 241000589220 Acetobacter Species 0.000 description 2
- 244000235858 Acetobacter xylinum Species 0.000 description 2
- 235000002837 Acetobacter xylinum Nutrition 0.000 description 2
- 229920001817 Agar Polymers 0.000 description 2
- AAQGRPOPTAUUBM-ZLUOBGJFSA-N Ala-Ala-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O AAQGRPOPTAUUBM-ZLUOBGJFSA-N 0.000 description 2
- YYSWCHMLFJLLBJ-ZLUOBGJFSA-N Ala-Ala-Ser Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O YYSWCHMLFJLLBJ-ZLUOBGJFSA-N 0.000 description 2
- XCVRVWZTXPCYJT-BIIVOSGPSA-N Ala-Asn-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N XCVRVWZTXPCYJT-BIIVOSGPSA-N 0.000 description 2
- BGNLUHXLSAQYRQ-FXQIFTODSA-N Ala-Glu-Gln Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O BGNLUHXLSAQYRQ-FXQIFTODSA-N 0.000 description 2
- BEMGNWZECGIJOI-WDSKDSINSA-N Ala-Gly-Glu Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O BEMGNWZECGIJOI-WDSKDSINSA-N 0.000 description 2
- NIZKGBJVCMRDKO-KWQFWETISA-N Ala-Gly-Tyr Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NIZKGBJVCMRDKO-KWQFWETISA-N 0.000 description 2
- HHRAXZAYZFFRAM-CIUDSAMLSA-N Ala-Leu-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O HHRAXZAYZFFRAM-CIUDSAMLSA-N 0.000 description 2
- SUMYEVXWCAYLLJ-GUBZILKMSA-N Ala-Leu-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O SUMYEVXWCAYLLJ-GUBZILKMSA-N 0.000 description 2
- AWZKCUCQJNTBAD-SRVKXCTJSA-N Ala-Leu-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN AWZKCUCQJNTBAD-SRVKXCTJSA-N 0.000 description 2
- SOBIAADAMRHGKH-CIUDSAMLSA-N Ala-Leu-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O SOBIAADAMRHGKH-CIUDSAMLSA-N 0.000 description 2
- FEGOCLZUJUFCHP-CIUDSAMLSA-N Ala-Pro-Gln Chemical compound [H]N[C@@H](C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O FEGOCLZUJUFCHP-CIUDSAMLSA-N 0.000 description 2
- DCVYRWFAMZFSDA-ZLUOBGJFSA-N Ala-Ser-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O DCVYRWFAMZFSDA-ZLUOBGJFSA-N 0.000 description 2
- YYAVDNKUWLAFCV-ACZMJKKPSA-N Ala-Ser-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O YYAVDNKUWLAFCV-ACZMJKKPSA-N 0.000 description 2
- NCQMBSJGJMYKCK-ZLUOBGJFSA-N Ala-Ser-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O NCQMBSJGJMYKCK-ZLUOBGJFSA-N 0.000 description 2
- JJHBEVZAZXZREW-LFSVMHDDSA-N Ala-Thr-Phe Chemical compound C[C@@H](O)[C@H](NC(=O)[C@H](C)N)C(=O)N[C@@H](Cc1ccccc1)C(O)=O JJHBEVZAZXZREW-LFSVMHDDSA-N 0.000 description 2
- VHAQSYHSDKERBS-XPUUQOCRSA-N Ala-Val-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O VHAQSYHSDKERBS-XPUUQOCRSA-N 0.000 description 2
- 241000219195 Arabidopsis thaliana Species 0.000 description 2
- VBFJESQBIWCWRL-DCAQKATOSA-N Arg-Ala-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCNC(N)=N VBFJESQBIWCWRL-DCAQKATOSA-N 0.000 description 2
- SQKPKIJVWHAWNF-DCAQKATOSA-N Arg-Asp-Lys Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(O)=O SQKPKIJVWHAWNF-DCAQKATOSA-N 0.000 description 2
- PBSOQGZLPFVXPU-YUMQZZPRSA-N Arg-Glu-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O PBSOQGZLPFVXPU-YUMQZZPRSA-N 0.000 description 2
- PHHRSPBBQUFULD-UWVGGRQHSA-N Arg-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCCN=C(N)N)N PHHRSPBBQUFULD-UWVGGRQHSA-N 0.000 description 2
- UPKMBGAAEZGHOC-RWMBFGLXSA-N Arg-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O UPKMBGAAEZGHOC-RWMBFGLXSA-N 0.000 description 2
- LKDHUGLXOHYINY-XUXIUFHCSA-N Arg-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N LKDHUGLXOHYINY-XUXIUFHCSA-N 0.000 description 2
- 241000186063 Arthrobacter Species 0.000 description 2
- CMLGVVWQQHUXOZ-GHCJXIJMSA-N Asn-Ala-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O CMLGVVWQQHUXOZ-GHCJXIJMSA-N 0.000 description 2
- KXEGPPNPXOKKHK-ZLUOBGJFSA-N Asn-Asp-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O KXEGPPNPXOKKHK-ZLUOBGJFSA-N 0.000 description 2
- QPTAGIPWARILES-AVGNSLFASA-N Asn-Gln-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QPTAGIPWARILES-AVGNSLFASA-N 0.000 description 2
- JZDZLBJVYWIIQU-AVGNSLFASA-N Asn-Glu-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JZDZLBJVYWIIQU-AVGNSLFASA-N 0.000 description 2
- XLZCLJRGGMBKLR-PCBIJLKTSA-N Asn-Ile-Phe Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XLZCLJRGGMBKLR-PCBIJLKTSA-N 0.000 description 2
- SEKBHZJLARBNPB-GHCJXIJMSA-N Asn-Ile-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O SEKBHZJLARBNPB-GHCJXIJMSA-N 0.000 description 2
- JQBCANGGAVVERB-CFMVVWHZSA-N Asn-Ile-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N JQBCANGGAVVERB-CFMVVWHZSA-N 0.000 description 2
- DJIMLSXHXKWADV-CIUDSAMLSA-N Asn-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(N)=O DJIMLSXHXKWADV-CIUDSAMLSA-N 0.000 description 2
- FHETWELNCBMRMG-HJGDQZAQSA-N Asn-Leu-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FHETWELNCBMRMG-HJGDQZAQSA-N 0.000 description 2
- YRTOMUMWSTUQAX-FXQIFTODSA-N Asn-Pro-Asp Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O YRTOMUMWSTUQAX-FXQIFTODSA-N 0.000 description 2
- PUUPMDXIHCOPJU-HJGDQZAQSA-N Asn-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O PUUPMDXIHCOPJU-HJGDQZAQSA-N 0.000 description 2
- DPWDPEVGACCWTC-SRVKXCTJSA-N Asn-Tyr-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O DPWDPEVGACCWTC-SRVKXCTJSA-N 0.000 description 2
- PBVLJOIPOGUQQP-CIUDSAMLSA-N Asp-Ala-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O PBVLJOIPOGUQQP-CIUDSAMLSA-N 0.000 description 2
- HRGGPWBIMIQANI-GUBZILKMSA-N Asp-Gln-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O HRGGPWBIMIQANI-GUBZILKMSA-N 0.000 description 2
- ZSJFGGSPCCHMNE-LAEOZQHASA-N Asp-Gln-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)O)N ZSJFGGSPCCHMNE-LAEOZQHASA-N 0.000 description 2
- VAWNQIGQPUOPQW-ACZMJKKPSA-N Asp-Glu-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O VAWNQIGQPUOPQW-ACZMJKKPSA-N 0.000 description 2
- IJHUZMGJRGNXIW-CIUDSAMLSA-N Asp-Glu-Arg Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IJHUZMGJRGNXIW-CIUDSAMLSA-N 0.000 description 2
- SNDBKTFJWVEVPO-WHFBIAKZSA-N Asp-Gly-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CO)C(O)=O SNDBKTFJWVEVPO-WHFBIAKZSA-N 0.000 description 2
- WSGVTKZFVJSJOG-RCOVLWMOSA-N Asp-Gly-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O WSGVTKZFVJSJOG-RCOVLWMOSA-N 0.000 description 2
- ORRJQLIATJDMQM-HJGDQZAQSA-N Asp-Leu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O ORRJQLIATJDMQM-HJGDQZAQSA-N 0.000 description 2
- KESWRFKUZRUTAH-FXQIFTODSA-N Asp-Pro-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O KESWRFKUZRUTAH-FXQIFTODSA-N 0.000 description 2
- RVMXMLSYBTXCAV-VEVYYDQMSA-N Asp-Pro-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O RVMXMLSYBTXCAV-VEVYYDQMSA-N 0.000 description 2
- WMLFFCRUSPNENW-ZLUOBGJFSA-N Asp-Ser-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O WMLFFCRUSPNENW-ZLUOBGJFSA-N 0.000 description 2
- DRCOAZZDQRCGGP-GHCJXIJMSA-N Asp-Ser-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DRCOAZZDQRCGGP-GHCJXIJMSA-N 0.000 description 2
- JSHWXQIZOCVWIA-ZKWXMUAHSA-N Asp-Ser-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O JSHWXQIZOCVWIA-ZKWXMUAHSA-N 0.000 description 2
- GCACQYDBDHRVGE-LKXGYXEUSA-N Asp-Thr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CC(O)=O GCACQYDBDHRVGE-LKXGYXEUSA-N 0.000 description 2
- NWAHPBGBDIFUFD-KKUMJFAQSA-N Asp-Tyr-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O NWAHPBGBDIFUFD-KKUMJFAQSA-N 0.000 description 2
- 108091003079 Bovine Serum Albumin Proteins 0.000 description 2
- 239000005489 Bromoxynil Substances 0.000 description 2
- 108090000565 Capsid Proteins Proteins 0.000 description 2
- 102000016938 Catalase Human genes 0.000 description 2
- 108010053835 Catalase Proteins 0.000 description 2
- 102100023321 Ceruloplasmin Human genes 0.000 description 2
- 101710151559 Crystal protein Proteins 0.000 description 2
- 102000053602 DNA Human genes 0.000 description 2
- 238000001712 DNA sequencing Methods 0.000 description 2
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 2
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 2
- 241000489973 Diabrotica undecimpunctata Species 0.000 description 2
- 241000255925 Diptera Species 0.000 description 2
- YQYJSBFKSSDGFO-UHFFFAOYSA-N Epihygromycin Natural products OC1C(O)C(C(=O)C)OC1OC(C(=C1)O)=CC=C1C=C(C)C(=O)NC1C(O)C(O)C2OCOC2C1O YQYJSBFKSSDGFO-UHFFFAOYSA-N 0.000 description 2
- 241000588698 Erwinia Species 0.000 description 2
- 101100437498 Escherichia coli (strain K12) uidA gene Proteins 0.000 description 2
- PLUBXMRUUVWRLT-UHFFFAOYSA-N Ethyl methanesulfonate Chemical compound CCOS(C)(=O)=O PLUBXMRUUVWRLT-UHFFFAOYSA-N 0.000 description 2
- 241000233866 Fungi Species 0.000 description 2
- HHWQMFIGMMOVFK-WDSKDSINSA-N Gln-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O HHWQMFIGMMOVFK-WDSKDSINSA-N 0.000 description 2
- KZKBJEUWNMQTLV-XDTLVQLUSA-N Gln-Ala-Tyr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KZKBJEUWNMQTLV-XDTLVQLUSA-N 0.000 description 2
- LFIVHGMKWFGUGK-IHRRRGAJSA-N Gln-Glu-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N LFIVHGMKWFGUGK-IHRRRGAJSA-N 0.000 description 2
- JXBZEDIQFFCHPZ-PEFMBERDSA-N Gln-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N JXBZEDIQFFCHPZ-PEFMBERDSA-N 0.000 description 2
- CAXXTYYGFYTBPV-IUCAKERBSA-N Gln-Leu-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O CAXXTYYGFYTBPV-IUCAKERBSA-N 0.000 description 2
- RWQCWSGOOOEGPB-FXQIFTODSA-N Gln-Ser-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O RWQCWSGOOOEGPB-FXQIFTODSA-N 0.000 description 2
- IIMZHVKZBGSEKZ-SZMVWBNQSA-N Gln-Trp-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(C)C)C(O)=O IIMZHVKZBGSEKZ-SZMVWBNQSA-N 0.000 description 2
- VCUNGPMMPNJSGS-JYJNAYRXSA-N Gln-Tyr-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O VCUNGPMMPNJSGS-JYJNAYRXSA-N 0.000 description 2
- HPBKQFJXDUVNQV-FHWLQOOXSA-N Gln-Tyr-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O HPBKQFJXDUVNQV-FHWLQOOXSA-N 0.000 description 2
- MKRDNSWGJWTBKZ-GVXVVHGQSA-N Gln-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N MKRDNSWGJWTBKZ-GVXVVHGQSA-N 0.000 description 2
- KKCUFHUTMKQQCF-SRVKXCTJSA-N Glu-Arg-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O KKCUFHUTMKQQCF-SRVKXCTJSA-N 0.000 description 2
- AFODTOLGSZQDSL-PEFMBERDSA-N Glu-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N AFODTOLGSZQDSL-PEFMBERDSA-N 0.000 description 2
- RDDSZZJOKDVPAE-ACZMJKKPSA-N Glu-Asn-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O RDDSZZJOKDVPAE-ACZMJKKPSA-N 0.000 description 2
- ZJICFHQSPWFBKP-AVGNSLFASA-N Glu-Asn-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZJICFHQSPWFBKP-AVGNSLFASA-N 0.000 description 2
- GFLQTABMFBXRIY-GUBZILKMSA-N Glu-Gln-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GFLQTABMFBXRIY-GUBZILKMSA-N 0.000 description 2
- LRPXYSGPOBVBEH-IUCAKERBSA-N Glu-Gly-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O LRPXYSGPOBVBEH-IUCAKERBSA-N 0.000 description 2
- ZSWGJYOZWBHROQ-RWRJDSDZSA-N Glu-Ile-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZSWGJYOZWBHROQ-RWRJDSDZSA-N 0.000 description 2
- HVYWQYLBVXMXSV-GUBZILKMSA-N Glu-Leu-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O HVYWQYLBVXMXSV-GUBZILKMSA-N 0.000 description 2
- VMKCPNBBPGGQBJ-GUBZILKMSA-N Glu-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N VMKCPNBBPGGQBJ-GUBZILKMSA-N 0.000 description 2
- VGBSZQSKQRMLHD-MNXVOIDGSA-N Glu-Leu-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VGBSZQSKQRMLHD-MNXVOIDGSA-N 0.000 description 2
- WNRZUESNGGDCJX-JYJNAYRXSA-N Glu-Leu-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WNRZUESNGGDCJX-JYJNAYRXSA-N 0.000 description 2
- IOUQWHIEQYQVFD-JYJNAYRXSA-N Glu-Leu-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O IOUQWHIEQYQVFD-JYJNAYRXSA-N 0.000 description 2
- BDISFWMLMNBTGP-NUMRIWBASA-N Glu-Thr-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O BDISFWMLMNBTGP-NUMRIWBASA-N 0.000 description 2
- DTLLNDVORUEOTM-WDCWCFNPSA-N Glu-Thr-Lys Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O DTLLNDVORUEOTM-WDCWCFNPSA-N 0.000 description 2
- BEQGFMIBZFNROK-JGVFFNPUSA-N Gly-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)CN)C(=O)O BEQGFMIBZFNROK-JGVFFNPUSA-N 0.000 description 2
- LHYJCVCQPWRMKZ-WEDXCCLWSA-N Gly-Leu-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LHYJCVCQPWRMKZ-WEDXCCLWSA-N 0.000 description 2
- PTIIBFKSLCYQBO-NHCYSSNCSA-N Gly-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)CN PTIIBFKSLCYQBO-NHCYSSNCSA-N 0.000 description 2
- MHXKHKWHPNETGG-QWRGUYRKSA-N Gly-Lys-Leu Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O MHXKHKWHPNETGG-QWRGUYRKSA-N 0.000 description 2
- OHUKZZYSJBKFRR-WHFBIAKZSA-N Gly-Ser-Asp Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O OHUKZZYSJBKFRR-WHFBIAKZSA-N 0.000 description 2
- JQFILXICXLDTRR-FBCQKBJTSA-N Gly-Thr-Gly Chemical compound NCC(=O)N[C@@H]([C@H](O)C)C(=O)NCC(O)=O JQFILXICXLDTRR-FBCQKBJTSA-N 0.000 description 2
- TVTZEOHWHUVYCG-KYNKHSRBSA-N Gly-Thr-Thr Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O TVTZEOHWHUVYCG-KYNKHSRBSA-N 0.000 description 2
- SBVMXEZQJVUARN-XPUUQOCRSA-N Gly-Val-Ser Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O SBVMXEZQJVUARN-XPUUQOCRSA-N 0.000 description 2
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 2
- BDFCIKANUNMFGB-PMVVWTBXSA-N His-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CN=CN1 BDFCIKANUNMFGB-PMVVWTBXSA-N 0.000 description 2
- UGQMRVRMYYASKQ-UHFFFAOYSA-N Hypoxanthine nucleoside Natural products OC1C(O)C(CO)OC1N1C(NC=NC2=O)=C2N=C1 UGQMRVRMYYASKQ-UHFFFAOYSA-N 0.000 description 2
- HDOYNXLPTRQLAD-JBDRJPRFSA-N Ile-Ala-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(=O)O)N HDOYNXLPTRQLAD-JBDRJPRFSA-N 0.000 description 2
- NBJAAWYRLGCJOF-UGYAYLCHSA-N Ile-Asp-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N NBJAAWYRLGCJOF-UGYAYLCHSA-N 0.000 description 2
- MQFGXJNSUJTXDT-QSFUFRPTSA-N Ile-Gly-Ile Chemical compound N[C@@H]([C@@H](C)CC)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)O MQFGXJNSUJTXDT-QSFUFRPTSA-N 0.000 description 2
- YSGBJIQXTIVBHZ-AJNGGQMLSA-N Ile-Lys-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O YSGBJIQXTIVBHZ-AJNGGQMLSA-N 0.000 description 2
- GLYJPWIRLBAIJH-UHFFFAOYSA-N Ile-Lys-Pro Natural products CCC(C)C(N)C(=O)NC(CCCCN)C(=O)N1CCCC1C(O)=O GLYJPWIRLBAIJH-UHFFFAOYSA-N 0.000 description 2
- XLXPYSDGMXTTNQ-UHFFFAOYSA-N Ile-Phe-Leu Natural products CCC(C)C(N)C(=O)NC(C(=O)NC(CC(C)C)C(O)=O)CC1=CC=CC=C1 XLXPYSDGMXTTNQ-UHFFFAOYSA-N 0.000 description 2
- SVZFKLBRCYCIIY-CYDGBPFRSA-N Ile-Pro-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(O)=O SVZFKLBRCYCIIY-CYDGBPFRSA-N 0.000 description 2
- COWHUQXTSYTKQC-RWRJDSDZSA-N Ile-Thr-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N COWHUQXTSYTKQC-RWRJDSDZSA-N 0.000 description 2
- NURNJECQNNCRBK-FLBSBUHZSA-N Ile-Thr-Thr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NURNJECQNNCRBK-FLBSBUHZSA-N 0.000 description 2
- 108010060231 Insect Proteins Proteins 0.000 description 2
- 108010065920 Insulin Lispro Proteins 0.000 description 2
- 241000588748 Klebsiella Species 0.000 description 2
- HGCNKOLVKRAVHD-UHFFFAOYSA-N L-Met-L-Phe Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 HGCNKOLVKRAVHD-UHFFFAOYSA-N 0.000 description 2
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 2
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 2
- 241000186660 Lactobacillus Species 0.000 description 2
- 241000254158 Lampyridae Species 0.000 description 2
- CQQGCWPXDHTTNF-GUBZILKMSA-N Leu-Ala-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O CQQGCWPXDHTTNF-GUBZILKMSA-N 0.000 description 2
- WNGVUZWBXZKQES-YUMQZZPRSA-N Leu-Ala-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O WNGVUZWBXZKQES-YUMQZZPRSA-N 0.000 description 2
- HASRFYOMVPJRPU-SRVKXCTJSA-N Leu-Arg-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(O)=O)C(O)=O HASRFYOMVPJRPU-SRVKXCTJSA-N 0.000 description 2
- STAVRDQLZOTNKJ-RHYQMDGZSA-N Leu-Arg-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O STAVRDQLZOTNKJ-RHYQMDGZSA-N 0.000 description 2
- DBVWMYGBVFCRBE-CIUDSAMLSA-N Leu-Asn-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O DBVWMYGBVFCRBE-CIUDSAMLSA-N 0.000 description 2
- DLCOFDAHNMMQPP-SRVKXCTJSA-N Leu-Asp-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O DLCOFDAHNMMQPP-SRVKXCTJSA-N 0.000 description 2
- JQSXWJXBASFONF-KKUMJFAQSA-N Leu-Asp-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JQSXWJXBASFONF-KKUMJFAQSA-N 0.000 description 2
- ZYLJULGXQDNXDK-GUBZILKMSA-N Leu-Gln-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O ZYLJULGXQDNXDK-GUBZILKMSA-N 0.000 description 2
- KAFOIVJDVSZUMD-DCAQKATOSA-N Leu-Gln-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O KAFOIVJDVSZUMD-DCAQKATOSA-N 0.000 description 2
- DPWGZWUMUUJQDT-IUCAKERBSA-N Leu-Gln-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O DPWGZWUMUUJQDT-IUCAKERBSA-N 0.000 description 2
- HQUXQAMSWFIRET-AVGNSLFASA-N Leu-Glu-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN HQUXQAMSWFIRET-AVGNSLFASA-N 0.000 description 2
- UCDHVOALNXENLC-KBPBESRZSA-N Leu-Gly-Tyr Chemical compound CC(C)C[C@H]([NH3+])C(=O)NCC(=O)N[C@H](C([O-])=O)CC1=CC=C(O)C=C1 UCDHVOALNXENLC-KBPBESRZSA-N 0.000 description 2
- BTNXKBVLWJBTNR-SRVKXCTJSA-N Leu-His-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(O)=O BTNXKBVLWJBTNR-SRVKXCTJSA-N 0.000 description 2
- JNDYEOUZBLOVOF-AVGNSLFASA-N Leu-Leu-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O JNDYEOUZBLOVOF-AVGNSLFASA-N 0.000 description 2
- LXKNSJLSGPNHSK-KKUMJFAQSA-N Leu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N LXKNSJLSGPNHSK-KKUMJFAQSA-N 0.000 description 2
- WXUOJXIGOPMDJM-SRVKXCTJSA-N Leu-Lys-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O WXUOJXIGOPMDJM-SRVKXCTJSA-N 0.000 description 2
- BIZNDKMFQHDOIE-KKUMJFAQSA-N Leu-Phe-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=CC=C1 BIZNDKMFQHDOIE-KKUMJFAQSA-N 0.000 description 2
- IDGZVZJLYFTXSL-DCAQKATOSA-N Leu-Ser-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IDGZVZJLYFTXSL-DCAQKATOSA-N 0.000 description 2
- KZZCOWMDDXDKSS-CIUDSAMLSA-N Leu-Ser-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O KZZCOWMDDXDKSS-CIUDSAMLSA-N 0.000 description 2
- IWMJFLJQHIDZQW-KKUMJFAQSA-N Leu-Ser-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IWMJFLJQHIDZQW-KKUMJFAQSA-N 0.000 description 2
- SBANPBVRHYIMRR-UHFFFAOYSA-N Leu-Ser-Pro Natural products CC(C)CC(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O SBANPBVRHYIMRR-UHFFFAOYSA-N 0.000 description 2
- SVBJIZVVYJYGLA-DCAQKATOSA-N Leu-Ser-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O SVBJIZVVYJYGLA-DCAQKATOSA-N 0.000 description 2
- ZJZNLRVCZWUONM-JXUBOQSCSA-N Leu-Thr-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O ZJZNLRVCZWUONM-JXUBOQSCSA-N 0.000 description 2
- LCNASHSOFMRYFO-WDCWCFNPSA-N Leu-Thr-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCC(N)=O LCNASHSOFMRYFO-WDCWCFNPSA-N 0.000 description 2
- ILDSIMPXNFWKLH-KATARQTJSA-N Leu-Thr-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O ILDSIMPXNFWKLH-KATARQTJSA-N 0.000 description 2
- UCRJTSIIAYHOHE-ULQDDVLXSA-N Leu-Tyr-Arg Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N UCRJTSIIAYHOHE-ULQDDVLXSA-N 0.000 description 2
- VHTIZYYHIUHMCA-JYJNAYRXSA-N Leu-Tyr-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O VHTIZYYHIUHMCA-JYJNAYRXSA-N 0.000 description 2
- AXVIGSRGTMNSJU-YESZJQIVSA-N Leu-Tyr-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N2CCC[C@@H]2C(=O)O)N AXVIGSRGTMNSJU-YESZJQIVSA-N 0.000 description 2
- FDBTVENULFNTAL-XQQFMLRXSA-N Leu-Val-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N FDBTVENULFNTAL-XQQFMLRXSA-N 0.000 description 2
- DEFGUIIUYAUEDU-ZPFDUUQYSA-N Lys-Asn-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DEFGUIIUYAUEDU-ZPFDUUQYSA-N 0.000 description 2
- SLQJJFAVWSZLBL-BJDJZHNGSA-N Lys-Ile-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN SLQJJFAVWSZLBL-BJDJZHNGSA-N 0.000 description 2
- PRSBSVAVOQOAMI-BJDJZHNGSA-N Lys-Ile-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN PRSBSVAVOQOAMI-BJDJZHNGSA-N 0.000 description 2
- WVJNGSFKBKOKRV-AJNGGQMLSA-N Lys-Leu-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WVJNGSFKBKOKRV-AJNGGQMLSA-N 0.000 description 2
- YRNRVKTYDSLKMD-KKUMJFAQSA-N Lys-Ser-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YRNRVKTYDSLKMD-KKUMJFAQSA-N 0.000 description 2
- FRWZTWWOORIIBA-FXQIFTODSA-N Met-Asn-Asn Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N FRWZTWWOORIIBA-FXQIFTODSA-N 0.000 description 2
- VZBXCMCHIHEPBL-SRVKXCTJSA-N Met-Glu-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN VZBXCMCHIHEPBL-SRVKXCTJSA-N 0.000 description 2
- OSZTUONKUMCWEP-XUXIUFHCSA-N Met-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CCSC OSZTUONKUMCWEP-XUXIUFHCSA-N 0.000 description 2
- CONKYWFMLIMRLU-BVSLBCMMSA-N Met-Trp-Tyr Chemical compound C([C@H](NC(=O)[C@H](CC=1C2=CC=CC=C2NC=1)NC(=O)[C@@H](N)CCSC)C(O)=O)C1=CC=C(O)C=C1 CONKYWFMLIMRLU-BVSLBCMMSA-N 0.000 description 2
- AJHCSUXXECOXOY-UHFFFAOYSA-N N-glycyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)CN)C(O)=O)=CNC2=C1 AJHCSUXXECOXOY-UHFFFAOYSA-N 0.000 description 2
- 229910002651 NO3 Inorganic materials 0.000 description 2
- BQVUABVGYYSDCJ-UHFFFAOYSA-N Nalpha-L-Leucyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)CC(C)C)C(O)=O)=CNC2=C1 BQVUABVGYYSDCJ-UHFFFAOYSA-N 0.000 description 2
- 235000002637 Nicotiana tabacum Nutrition 0.000 description 2
- NHNBFGGVMKEFGY-UHFFFAOYSA-N Nitrate Chemical compound [O-][N+]([O-])=O NHNBFGGVMKEFGY-UHFFFAOYSA-N 0.000 description 2
- 108091092724 Noncoding DNA Proteins 0.000 description 2
- 108020004711 Nucleic Acid Probes Proteins 0.000 description 2
- 241001147398 Ostrinia nubilalis Species 0.000 description 2
- 238000009004 PCR Kit Methods 0.000 description 2
- 241000222051 Papiliotrema laurentii Species 0.000 description 2
- JJHVFCUWLSKADD-ONGXEEELSA-N Phe-Gly-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](C)C(O)=O JJHVFCUWLSKADD-ONGXEEELSA-N 0.000 description 2
- PBWNICYZGJQKJV-BZSNNMDCSA-N Phe-Phe-Cys Chemical compound N[C@@H](Cc1ccccc1)C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](CS)C(O)=O PBWNICYZGJQKJV-BZSNNMDCSA-N 0.000 description 2
- BONHGTUEEPIMPM-AVGNSLFASA-N Phe-Ser-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O BONHGTUEEPIMPM-AVGNSLFASA-N 0.000 description 2
- QSWKNJAPHQDAAS-MELADBBJSA-N Phe-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O QSWKNJAPHQDAAS-MELADBBJSA-N 0.000 description 2
- BAONJAHBAUDJKA-BZSNNMDCSA-N Phe-Tyr-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CC=CC=C1 BAONJAHBAUDJKA-BZSNNMDCSA-N 0.000 description 2
- ZOGICTVLQDWPER-UFYCRDLUSA-N Phe-Tyr-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O ZOGICTVLQDWPER-UFYCRDLUSA-N 0.000 description 2
- GOUWCZRDTWTODO-YDHLFZDLSA-N Phe-Val-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O GOUWCZRDTWTODO-YDHLFZDLSA-N 0.000 description 2
- RGMLUHANLDVMPB-ULQDDVLXSA-N Phe-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N RGMLUHANLDVMPB-ULQDDVLXSA-N 0.000 description 2
- XYFCBTPGUUZFHI-UHFFFAOYSA-N Phosphine Chemical compound P XYFCBTPGUUZFHI-UHFFFAOYSA-N 0.000 description 2
- 241001123094 Photorhabdus asymbiotica Species 0.000 description 2
- 241000255969 Pieris brassicae Species 0.000 description 2
- 108020005089 Plant RNA Proteins 0.000 description 2
- IFMDQWDAJUMMJC-DCAQKATOSA-N Pro-Ala-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O IFMDQWDAJUMMJC-DCAQKATOSA-N 0.000 description 2
- DMKWYMWNEKIPFC-IUCAKERBSA-N Pro-Gly-Arg Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O DMKWYMWNEKIPFC-IUCAKERBSA-N 0.000 description 2
- VYWNORHENYEQDW-YUMQZZPRSA-N Pro-Gly-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 VYWNORHENYEQDW-YUMQZZPRSA-N 0.000 description 2
- DXTOOBDIIAJZBJ-BQBZGAKWSA-N Pro-Gly-Ser Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CO)C(O)=O DXTOOBDIIAJZBJ-BQBZGAKWSA-N 0.000 description 2
- FKYKZHOKDOPHSA-DCAQKATOSA-N Pro-Leu-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O FKYKZHOKDOPHSA-DCAQKATOSA-N 0.000 description 2
- WOIFYRZPIORBRY-AVGNSLFASA-N Pro-Lys-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O WOIFYRZPIORBRY-AVGNSLFASA-N 0.000 description 2
- SBVPYBFMIGDIDX-SRVKXCTJSA-N Pro-Pro-Pro Chemical compound OC(=O)[C@@H]1CCCN1C(=O)[C@H]1N(C(=O)[C@H]2NCCC2)CCC1 SBVPYBFMIGDIDX-SRVKXCTJSA-N 0.000 description 2
- BVRBCQBUNGAWFP-KKUMJFAQSA-N Pro-Tyr-Gln Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O BVRBCQBUNGAWFP-KKUMJFAQSA-N 0.000 description 2
- 101100365469 Pseudomonas putida (strain ATCC 700007 / DSM 6899 / BCRC 17059 / F1) sepB gene Proteins 0.000 description 2
- 101100371218 Pseudomonas putida (strain DOT-T1E) ttgD gene Proteins 0.000 description 2
- 241000191043 Rhodobacter sphaeroides Species 0.000 description 2
- 241000190932 Rhodopseudomonas Species 0.000 description 2
- 241000235070 Saccharomyces Species 0.000 description 2
- 102400000827 Saposin-D Human genes 0.000 description 2
- 241000242583 Scyphozoa Species 0.000 description 2
- HRNQLKCLPVKZNE-CIUDSAMLSA-N Ser-Ala-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O HRNQLKCLPVKZNE-CIUDSAMLSA-N 0.000 description 2
- WDXYVIIVDIDOSX-DCAQKATOSA-N Ser-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CCCN=C(N)N WDXYVIIVDIDOSX-DCAQKATOSA-N 0.000 description 2
- GHPQVUYZQQGEDA-BIIVOSGPSA-N Ser-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CO)N)C(=O)O GHPQVUYZQQGEDA-BIIVOSGPSA-N 0.000 description 2
- UAJAYRMZGNQILN-BQBZGAKWSA-N Ser-Gly-Met Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCSC)C(O)=O UAJAYRMZGNQILN-BQBZGAKWSA-N 0.000 description 2
- SFTZWNJFZYOLBD-ZDLURKLDSA-N Ser-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CO SFTZWNJFZYOLBD-ZDLURKLDSA-N 0.000 description 2
- JIPVNVNKXJLFJF-BJDJZHNGSA-N Ser-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N JIPVNVNKXJLFJF-BJDJZHNGSA-N 0.000 description 2
- VMLONWHIORGALA-SRVKXCTJSA-N Ser-Leu-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]([NH3+])CO VMLONWHIORGALA-SRVKXCTJSA-N 0.000 description 2
- FBLNYDYPCLFTSP-IXOXFDKPSA-N Ser-Phe-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FBLNYDYPCLFTSP-IXOXFDKPSA-N 0.000 description 2
- BMKNXTJLHFIAAH-CIUDSAMLSA-N Ser-Ser-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O BMKNXTJLHFIAAH-CIUDSAMLSA-N 0.000 description 2
- VGQVAVQWKJLIRM-FXQIFTODSA-N Ser-Ser-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O VGQVAVQWKJLIRM-FXQIFTODSA-N 0.000 description 2
- NADLKBTYNKUJEP-KATARQTJSA-N Ser-Thr-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O NADLKBTYNKUJEP-KATARQTJSA-N 0.000 description 2
- LLSLRQOEAFCZLW-NRPADANISA-N Ser-Val-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O LLSLRQOEAFCZLW-NRPADANISA-N 0.000 description 2
- JGUWRQWULDWNCM-FXQIFTODSA-N Ser-Val-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O JGUWRQWULDWNCM-FXQIFTODSA-N 0.000 description 2
- 241000607715 Serratia marcescens Species 0.000 description 2
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 2
- 241000222068 Sporobolomyces <Sporidiobolaceae> Species 0.000 description 2
- 241000187747 Streptomyces Species 0.000 description 2
- 241001514658 Symmetrospora marina Species 0.000 description 2
- 108700005078 Synthetic Genes Proteins 0.000 description 2
- 101710137500 T7 RNA polymerase Proteins 0.000 description 2
- 241000589500 Thermus aquaticus Species 0.000 description 2
- XYEXCEPTALHNEV-RCWTZXSCSA-N Thr-Arg-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O XYEXCEPTALHNEV-RCWTZXSCSA-N 0.000 description 2
- MQBTXMPQNCGSSZ-OSUNSFLBSA-N Thr-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)O)CCCN=C(N)N MQBTXMPQNCGSSZ-OSUNSFLBSA-N 0.000 description 2
- VFEHSAJCWWHDBH-RHYQMDGZSA-N Thr-Arg-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O VFEHSAJCWWHDBH-RHYQMDGZSA-N 0.000 description 2
- YLXAMFZYJTZXFH-OLHMAJIHSA-N Thr-Asn-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O YLXAMFZYJTZXFH-OLHMAJIHSA-N 0.000 description 2
- OJRNZRROAIAHDL-LKXGYXEUSA-N Thr-Asn-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O OJRNZRROAIAHDL-LKXGYXEUSA-N 0.000 description 2
- GNHRVXYZKWSJTF-HJGDQZAQSA-N Thr-Asp-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N)O GNHRVXYZKWSJTF-HJGDQZAQSA-N 0.000 description 2
- RKDFEMGVMMYYNG-WDCWCFNPSA-N Thr-Gln-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O RKDFEMGVMMYYNG-WDCWCFNPSA-N 0.000 description 2
- CRZNCABIJLRFKZ-IUKAMOBKSA-N Thr-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N CRZNCABIJLRFKZ-IUKAMOBKSA-N 0.000 description 2
- FQPDRTDDEZXCEC-SVSWQMSJSA-N Thr-Ile-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O FQPDRTDDEZXCEC-SVSWQMSJSA-N 0.000 description 2
- IMDMLDSVUSMAEJ-HJGDQZAQSA-N Thr-Leu-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IMDMLDSVUSMAEJ-HJGDQZAQSA-N 0.000 description 2
- MEJHFIOYJHTWMK-VOAKCMCISA-N Thr-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)[C@@H](C)O MEJHFIOYJHTWMK-VOAKCMCISA-N 0.000 description 2
- YOOAQCZYZHGUAZ-KATARQTJSA-N Thr-Leu-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YOOAQCZYZHGUAZ-KATARQTJSA-N 0.000 description 2
- VRUFCJZQDACGLH-UVOCVTCTSA-N Thr-Leu-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VRUFCJZQDACGLH-UVOCVTCTSA-N 0.000 description 2
- COYHRQWNJDJCNA-NUJDXYNKSA-N Thr-Thr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O COYHRQWNJDJCNA-NUJDXYNKSA-N 0.000 description 2
- BKVICMPZWRNWOC-RHYQMDGZSA-N Thr-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)[C@@H](C)O BKVICMPZWRNWOC-RHYQMDGZSA-N 0.000 description 2
- MNYNCKZAEIAONY-XGEHTFHBSA-N Thr-Val-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O MNYNCKZAEIAONY-XGEHTFHBSA-N 0.000 description 2
- IQFYYKKMVGJFEH-XLPZGREQSA-N Thymidine Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](CO)[C@@H](O)C1 IQFYYKKMVGJFEH-XLPZGREQSA-N 0.000 description 2
- AIISTODACBDQLW-WDSOQIARSA-N Trp-Leu-Arg Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)=CNC2=C1 AIISTODACBDQLW-WDSOQIARSA-N 0.000 description 2
- UIRPULWLRODAEQ-QEJZJMRPSA-N Trp-Ser-Glu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O)=CNC2=C1 UIRPULWLRODAEQ-QEJZJMRPSA-N 0.000 description 2
- PMHLLBKTDHQMCY-ULQDDVLXSA-N Tyr-Lys-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PMHLLBKTDHQMCY-ULQDDVLXSA-N 0.000 description 2
- PLXQRTXVLZUNMU-RNXOBYDBSA-N Tyr-Phe-Trp Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O)NC(=O)[C@H](CC4=CC=C(C=C4)O)N PLXQRTXVLZUNMU-RNXOBYDBSA-N 0.000 description 2
- YMZYSCDRTXEOKD-IHPCNDPISA-N Tyr-Trp-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC3=CC=C(C=C3)O)N YMZYSCDRTXEOKD-IHPCNDPISA-N 0.000 description 2
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical compound O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 description 2
- IZFVRRYRMQFVGX-NRPADANISA-N Val-Ala-Gln Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N IZFVRRYRMQFVGX-NRPADANISA-N 0.000 description 2
- SLLKXDSRVAOREO-KZVJFYERSA-N Val-Ala-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C)NC(=O)[C@H](C(C)C)N)O SLLKXDSRVAOREO-KZVJFYERSA-N 0.000 description 2
- VJOWWOGRNXRQMF-UVBJJODRSA-N Val-Ala-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](C)NC(=O)[C@@H](N)C(C)C)C(O)=O)=CNC2=C1 VJOWWOGRNXRQMF-UVBJJODRSA-N 0.000 description 2
- WFENBJPLZMPVAX-XVKPBYJWSA-N Val-Gly-Glu Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O WFENBJPLZMPVAX-XVKPBYJWSA-N 0.000 description 2
- OVBMCNDKCWAXMZ-NAKRPEOUSA-N Val-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](C(C)C)N OVBMCNDKCWAXMZ-NAKRPEOUSA-N 0.000 description 2
- MBGFDZDWMDLXHQ-GUBZILKMSA-N Val-Met-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](C(C)C)N MBGFDZDWMDLXHQ-GUBZILKMSA-N 0.000 description 2
- LTTQCQRTSHJPPL-ZKWXMUAHSA-N Val-Ser-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)O)C(=O)O)N LTTQCQRTSHJPPL-ZKWXMUAHSA-N 0.000 description 2
- LCHZBEUVGAVMKS-RHYQMDGZSA-N Val-Thr-Leu Chemical compound CC(C)C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)[C@@H](C)O)C(O)=O LCHZBEUVGAVMKS-RHYQMDGZSA-N 0.000 description 2
- 241000589634 Xanthomonas Species 0.000 description 2
- 241000589636 Xanthomonas campestris Species 0.000 description 2
- 230000002411 adverse Effects 0.000 description 2
- 239000008272 agar Substances 0.000 description 2
- 108010044940 alanylglutamine Proteins 0.000 description 2
- 238000000137 annealing Methods 0.000 description 2
- 230000000692 anti-sense effect Effects 0.000 description 2
- 108010013835 arginine glutamate Proteins 0.000 description 2
- 108010018691 arginyl-threonyl-arginine Proteins 0.000 description 2
- 108010059459 arginyl-threonyl-phenylalanine Proteins 0.000 description 2
- 108010062796 arginyllysine Proteins 0.000 description 2
- 108010068265 aspartyltyrosine Proteins 0.000 description 2
- 239000003139 biocide Substances 0.000 description 2
- 229940098773 bovine serum albumin Drugs 0.000 description 2
- 238000011088 calibration curve Methods 0.000 description 2
- 238000005352 clarification Methods 0.000 description 2
- 239000002131 composite material Substances 0.000 description 2
- 230000001276 controlling effect Effects 0.000 description 2
- 238000012217 deletion Methods 0.000 description 2
- 230000037430 deletion Effects 0.000 description 2
- 238000001514 detection method Methods 0.000 description 2
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 2
- 241001493065 dsRNA viruses Species 0.000 description 2
- 239000002158 endotoxin Substances 0.000 description 2
- 239000003623 enhancer Substances 0.000 description 2
- 238000013467 fragmentation Methods 0.000 description 2
- 238000006062 fragmentation reaction Methods 0.000 description 2
- 210000001035 gastrointestinal tract Anatomy 0.000 description 2
- BRZYSWJRSDMWLG-CAXSIQPQSA-N geneticin Natural products O1C[C@@](O)(C)[C@H](NC)[C@@H](O)[C@H]1O[C@@H]1[C@@H](O)[C@H](O[C@@H]2[C@@H]([C@@H](O)[C@H](O)[C@@H](C(C)O)O2)N)[C@@H](N)C[C@H]1N BRZYSWJRSDMWLG-CAXSIQPQSA-N 0.000 description 2
- 108010049041 glutamylalanine Proteins 0.000 description 2
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 2
- 108010090037 glycyl-alanyl-isoleucine Proteins 0.000 description 2
- XKUKSGPZAADMRA-UHFFFAOYSA-N glycyl-glycyl-glycine Chemical compound NCC(=O)NCC(=O)NCC(O)=O XKUKSGPZAADMRA-UHFFFAOYSA-N 0.000 description 2
- 108010010147 glycylglutamine Proteins 0.000 description 2
- 108010020688 glycylhistidine Proteins 0.000 description 2
- 108010037850 glycylvaline Proteins 0.000 description 2
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 2
- 108010040030 histidinoalanine Proteins 0.000 description 2
- 108010025306 histidylleucine Proteins 0.000 description 2
- 229940088597 hormone Drugs 0.000 description 2
- 239000005556 hormone Substances 0.000 description 2
- 238000002347 injection Methods 0.000 description 2
- 239000007924 injection Substances 0.000 description 2
- 210000000936 intestine Anatomy 0.000 description 2
- 229940039696 lactobacillus Drugs 0.000 description 2
- 108010073472 leucyl-prolyl-proline Proteins 0.000 description 2
- 108010091871 leucylmethionine Proteins 0.000 description 2
- 108010012058 leucyltyrosine Proteins 0.000 description 2
- 108010038320 lysylphenylalanine Proteins 0.000 description 2
- 210000001161 mammalian embryo Anatomy 0.000 description 2
- 230000007246 mechanism Effects 0.000 description 2
- 230000002503 metabolic effect Effects 0.000 description 2
- 108010068488 methionylphenylalanine Proteins 0.000 description 2
- 230000005012 migration Effects 0.000 description 2
- 238000013508 migration Methods 0.000 description 2
- 238000010369 molecular cloning Methods 0.000 description 2
- 230000035772 mutation Effects 0.000 description 2
- 239000013642 negative control Substances 0.000 description 2
- 231100000989 no adverse effect Toxicity 0.000 description 2
- 239000002853 nucleic acid probe Substances 0.000 description 2
- 235000015097 nutrients Nutrition 0.000 description 2
- 230000008520 organization Effects 0.000 description 2
- 108010065135 phenylalanyl-phenylalanyl-phenylalanine Proteins 0.000 description 2
- 108010024654 phenylalanyl-prolyl-alanine Proteins 0.000 description 2
- 108010012581 phenylalanylglutamate Proteins 0.000 description 2
- 108010051242 phenylalanylserine Proteins 0.000 description 2
- 230000008488 polyadenylation Effects 0.000 description 2
- 230000003389 potentiating effect Effects 0.000 description 2
- 102000004196 processed proteins & peptides Human genes 0.000 description 2
- 108010070643 prolylglutamic acid Proteins 0.000 description 2
- 239000011546 protein dye Substances 0.000 description 2
- 230000018883 protein targeting Effects 0.000 description 2
- RYVMUASDIZQXAA-UHFFFAOYSA-N pyranoside Natural products O1C2(OCC(C)C(OC3C(C(O)C(O)C(CO)O3)O)C2)C(C)C(C2(CCC3C4(C)CC5O)C)C1CC2C3CC=C4CC5OC(C(C1O)O)OC(CO)C1OC(C1OC2C(C(OC3C(C(O)C(O)C(CO)O3)O)C(O)C(CO)O2)O)OC(CO)C(O)C1OC1OCC(O)C(O)C1O RYVMUASDIZQXAA-UHFFFAOYSA-N 0.000 description 2
- 230000008929 regeneration Effects 0.000 description 2
- 238000011069 regeneration method Methods 0.000 description 2
- 230000032537 response to toxin Effects 0.000 description 2
- 150000003839 salts Chemical class 0.000 description 2
- 230000028327 secretion Effects 0.000 description 2
- 101150085300 sepA gene Proteins 0.000 description 2
- 238000012163 sequencing technique Methods 0.000 description 2
- 108010071207 serylmethionine Proteins 0.000 description 2
- 229960005322 streptomycin Drugs 0.000 description 2
- 239000004094 surface-active agent Substances 0.000 description 2
- 108010071097 threonyl-lysyl-proline Proteins 0.000 description 2
- 231100000331 toxic Toxicity 0.000 description 2
- 230000002588 toxic effect Effects 0.000 description 2
- 238000001890 transfection Methods 0.000 description 2
- 108700004896 tripeptide FEG Proteins 0.000 description 2
- LWIHDJKSTIGBAC-UHFFFAOYSA-K tripotassium phosphate Chemical compound [K+].[K+].[K+].[O-]P([O-])([O-])=O LWIHDJKSTIGBAC-UHFFFAOYSA-K 0.000 description 2
- 238000002525 ultrasonication Methods 0.000 description 2
- 238000011144 upstream manufacturing Methods 0.000 description 2
- 239000013603 viral vector Substances 0.000 description 2
- 101150033098 xpt gene Proteins 0.000 description 2
- JNTMAZFVYNDPLB-PEDHHIEDSA-N (2S,3S)-2-[[[(2S)-1-[(2S,3S)-2-amino-3-methyl-1-oxopentyl]-2-pyrrolidinyl]-oxomethyl]amino]-3-methylpentanoic acid Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JNTMAZFVYNDPLB-PEDHHIEDSA-N 0.000 description 1
- XVZCXCTYGHPNEM-IHRRRGAJSA-N (2s)-1-[(2s)-2-[[(2s)-2-amino-4-methylpentanoyl]amino]-4-methylpentanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(O)=O XVZCXCTYGHPNEM-IHRRRGAJSA-N 0.000 description 1
- CWFMWBHMIMNZLN-NAKRPEOUSA-N (2s)-1-[(2s)-2-[[(2s,3s)-2-amino-3-methylpentanoyl]amino]propanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O CWFMWBHMIMNZLN-NAKRPEOUSA-N 0.000 description 1
- VKYBWTVHQHFSJL-ZATYTLRZSA-N (2s)-2,5-diamino-5-oxopentanoic acid;(2s)-pyrrolidine-2-carboxylic acid Chemical compound OC(=O)[C@@H]1CCCN1.OC(=O)[C@@H](N)CCC(N)=O VKYBWTVHQHFSJL-ZATYTLRZSA-N 0.000 description 1
- AZTKONSEYGYKKM-QIKNFSLBSA-N (2s)-2-aminobutanedioic acid;(2s)-2-aminopentanedioic acid;(2s)-2-amino-3-phenylpropanoic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O.OC(=O)[C@@H](N)CCC(O)=O.OC(=O)[C@@H](N)CC1=CC=CC=C1 AZTKONSEYGYKKM-QIKNFSLBSA-N 0.000 description 1
- YUXKOWPNKJSTPQ-AXWWPMSFSA-N (2s,3r)-2-amino-3-hydroxybutanoic acid;(2s)-2-amino-3-hydroxypropanoic acid Chemical compound OC[C@H](N)C(O)=O.C[C@@H](O)[C@H](N)C(O)=O YUXKOWPNKJSTPQ-AXWWPMSFSA-N 0.000 description 1
- NMNYMRMXUPRAKF-TYHJCQIPSA-N (2s,3s)-2-amino-3-methylpentanoic acid;(2s)-2,6-diaminohexanoic acid Chemical compound CC[C@H](C)[C@H](N)C(O)=O.NCCCC[C@H](N)C(O)=O NMNYMRMXUPRAKF-TYHJCQIPSA-N 0.000 description 1
- NNRFRJQMBSBXGO-CIUDSAMLSA-N (3s)-3-[[2-[[(2s)-2-amino-5-(diaminomethylideneamino)pentanoyl]amino]acetyl]amino]-4-[[(1s)-1-carboxy-2-hydroxyethyl]amino]-4-oxobutanoic acid Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O NNRFRJQMBSBXGO-CIUDSAMLSA-N 0.000 description 1
- 108020004465 16S ribosomal RNA Proteins 0.000 description 1
- WTLKTXIHIHFSGU-UHFFFAOYSA-N 2-nitrosoguanidine Chemical compound NC(N)=NN=O WTLKTXIHIHFSGU-UHFFFAOYSA-N 0.000 description 1
- CAAMSDWKXXPUJR-UHFFFAOYSA-N 3,5-dihydro-4H-imidazol-4-one Chemical compound O=C1CNC=N1 CAAMSDWKXXPUJR-UHFFFAOYSA-N 0.000 description 1
- IXHADCPJRQNDGG-UHFFFAOYSA-N 3-[bis(2-chloroethyl)amino]-1-(4-phenylphenyl)propan-1-one Chemical compound C1=CC(C(=O)CCN(CCCl)CCCl)=CC=C1C1=CC=CC=C1 IXHADCPJRQNDGG-UHFFFAOYSA-N 0.000 description 1
- 108010085238 Actins Proteins 0.000 description 1
- 102000007469 Actins Human genes 0.000 description 1
- 229930024421 Adenine Natural products 0.000 description 1
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 1
- 241000242764 Aequorea victoria Species 0.000 description 1
- 101000689231 Aeromonas salmonicida S-layer protein Proteins 0.000 description 1
- 241000589155 Agrobacterium tumefaciens Species 0.000 description 1
- 241000218475 Agrotis segetum Species 0.000 description 1
- BYXHQQCXAJARLQ-ZLUOBGJFSA-N Ala-Ala-Ala Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O BYXHQQCXAJARLQ-ZLUOBGJFSA-N 0.000 description 1
- DKJPOZOEBONHFS-ZLUOBGJFSA-N Ala-Ala-Asp Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O DKJPOZOEBONHFS-ZLUOBGJFSA-N 0.000 description 1
- BUANFPRKJKJSRR-ACZMJKKPSA-N Ala-Ala-Gln Chemical compound C[C@H]([NH3+])C(=O)N[C@@H](C)C(=O)N[C@H](C([O-])=O)CCC(N)=O BUANFPRKJKJSRR-ACZMJKKPSA-N 0.000 description 1
- YLTKNGYYPIWKHZ-ACZMJKKPSA-N Ala-Ala-Glu Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O YLTKNGYYPIWKHZ-ACZMJKKPSA-N 0.000 description 1
- RLMISHABBKUNFO-WHFBIAKZSA-N Ala-Ala-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O RLMISHABBKUNFO-WHFBIAKZSA-N 0.000 description 1
- CXRCVCURMBFFOL-FXQIFTODSA-N Ala-Ala-Pro Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O CXRCVCURMBFFOL-FXQIFTODSA-N 0.000 description 1
- JBVSSSZFNTXJDX-YTLHQDLWSA-N Ala-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@H](C)N JBVSSSZFNTXJDX-YTLHQDLWSA-N 0.000 description 1
- UGLPMYSCWHTZQU-AUTRQRHGSA-N Ala-Ala-Tyr Chemical compound C[C@H]([NH3+])C(=O)N[C@@H](C)C(=O)N[C@H](C([O-])=O)CC1=CC=C(O)C=C1 UGLPMYSCWHTZQU-AUTRQRHGSA-N 0.000 description 1
- LWUWMHIOBPTZBA-DCAQKATOSA-N Ala-Arg-Lys Chemical compound NC(=N)NCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCCN)C(O)=O LWUWMHIOBPTZBA-DCAQKATOSA-N 0.000 description 1
- JAMAWBXXKFGFGX-KZVJFYERSA-N Ala-Arg-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JAMAWBXXKFGFGX-KZVJFYERSA-N 0.000 description 1
- WYPUMLRSQMKIJU-BPNCWPANSA-N Ala-Arg-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O WYPUMLRSQMKIJU-BPNCWPANSA-N 0.000 description 1
- YAXNATKKPOWVCP-ZLUOBGJFSA-N Ala-Asn-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O YAXNATKKPOWVCP-ZLUOBGJFSA-N 0.000 description 1
- XEXJJJRVTFGWIC-FXQIFTODSA-N Ala-Asn-Arg Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N XEXJJJRVTFGWIC-FXQIFTODSA-N 0.000 description 1
- ZEXDYVGDZJBRMO-ACZMJKKPSA-N Ala-Asn-Gln Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N ZEXDYVGDZJBRMO-ACZMJKKPSA-N 0.000 description 1
- CVGNCMIULZNYES-WHFBIAKZSA-N Ala-Asn-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O CVGNCMIULZNYES-WHFBIAKZSA-N 0.000 description 1
- FXKNPWNXPQZLES-ZLUOBGJFSA-N Ala-Asn-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O FXKNPWNXPQZLES-ZLUOBGJFSA-N 0.000 description 1
- GORKKVHIBWAQHM-GCJQMDKQSA-N Ala-Asn-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GORKKVHIBWAQHM-GCJQMDKQSA-N 0.000 description 1
- PBAMJJXWDQXOJA-FXQIFTODSA-N Ala-Asp-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PBAMJJXWDQXOJA-FXQIFTODSA-N 0.000 description 1
- MCKSLROAGSDNFC-ACZMJKKPSA-N Ala-Asp-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MCKSLROAGSDNFC-ACZMJKKPSA-N 0.000 description 1
- GWFSQQNGMPGBEF-GHCJXIJMSA-N Ala-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C)N GWFSQQNGMPGBEF-GHCJXIJMSA-N 0.000 description 1
- ZIWWTZWAKYBUOB-CIUDSAMLSA-N Ala-Asp-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O ZIWWTZWAKYBUOB-CIUDSAMLSA-N 0.000 description 1
- OILNWMNBLIHXQK-ZLUOBGJFSA-N Ala-Cys-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(O)=O OILNWMNBLIHXQK-ZLUOBGJFSA-N 0.000 description 1
- BLGHHPHXVJWCNK-GUBZILKMSA-N Ala-Gln-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O BLGHHPHXVJWCNK-GUBZILKMSA-N 0.000 description 1
- SFNFGFDRYJKZKN-XQXXSGGOSA-N Ala-Gln-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](C)N)O SFNFGFDRYJKZKN-XQXXSGGOSA-N 0.000 description 1
- YIGLXQRFQVWFEY-NRPADANISA-N Ala-Gln-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O YIGLXQRFQVWFEY-NRPADANISA-N 0.000 description 1
- KXEVYGKATAMXJJ-ACZMJKKPSA-N Ala-Glu-Asp Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O KXEVYGKATAMXJJ-ACZMJKKPSA-N 0.000 description 1
- WKOBSJOZRJJVRZ-FXQIFTODSA-N Ala-Glu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WKOBSJOZRJJVRZ-FXQIFTODSA-N 0.000 description 1
- PAIHPOGPJVUFJY-WDSKDSINSA-N Ala-Glu-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O PAIHPOGPJVUFJY-WDSKDSINSA-N 0.000 description 1
- GGNHBHYDMUDXQB-KBIXCLLPSA-N Ala-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](C)N GGNHBHYDMUDXQB-KBIXCLLPSA-N 0.000 description 1
- FBHOPGDGELNWRH-DRZSPHRISA-N Ala-Glu-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O FBHOPGDGELNWRH-DRZSPHRISA-N 0.000 description 1
- ROLXPVQSRCPVGK-XDTLVQLUSA-N Ala-Glu-Tyr Chemical compound N[C@@H](C)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O ROLXPVQSRCPVGK-XDTLVQLUSA-N 0.000 description 1
- NHLAEBFGWPXFGI-WHFBIAKZSA-N Ala-Gly-Asn Chemical compound C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)N)C(=O)O)N NHLAEBFGWPXFGI-WHFBIAKZSA-N 0.000 description 1
- MPLOSMWGDNJSEV-WHFBIAKZSA-N Ala-Gly-Asp Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MPLOSMWGDNJSEV-WHFBIAKZSA-N 0.000 description 1
- PCIFXPRIFWKWLK-YUMQZZPRSA-N Ala-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N PCIFXPRIFWKWLK-YUMQZZPRSA-N 0.000 description 1
- ZPXCNXMJEZKRLU-LSJOCFKGSA-N Ala-His-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CN=CN1 ZPXCNXMJEZKRLU-LSJOCFKGSA-N 0.000 description 1
- FAJIYNONGXEXAI-CQDKDKBSSA-N Ala-His-Phe Chemical compound C([C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CNC=N1 FAJIYNONGXEXAI-CQDKDKBSSA-N 0.000 description 1
- NYDBKUNVSALYPX-NAKRPEOUSA-N Ala-Ile-Arg Chemical compound C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CCCN=C(N)N NYDBKUNVSALYPX-NAKRPEOUSA-N 0.000 description 1
- DVJSJDDYCYSMFR-ZKWXMUAHSA-N Ala-Ile-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O DVJSJDDYCYSMFR-ZKWXMUAHSA-N 0.000 description 1
- TZDNWXDLYFIFPT-BJDJZHNGSA-N Ala-Ile-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O TZDNWXDLYFIFPT-BJDJZHNGSA-N 0.000 description 1
- RZZMZYZXNJRPOJ-BJDJZHNGSA-N Ala-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](C)N RZZMZYZXNJRPOJ-BJDJZHNGSA-N 0.000 description 1
- VNYMOTCMNHJGTG-JBDRJPRFSA-N Ala-Ile-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O VNYMOTCMNHJGTG-JBDRJPRFSA-N 0.000 description 1
- LXAARTARZJJCMB-CIQUZCHMSA-N Ala-Ile-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LXAARTARZJJCMB-CIQUZCHMSA-N 0.000 description 1
- LNNSWWRRYJLGNI-NAKRPEOUSA-N Ala-Ile-Val Chemical compound C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O LNNSWWRRYJLGNI-NAKRPEOUSA-N 0.000 description 1
- YHKANGMVQWRMAP-DCAQKATOSA-N Ala-Leu-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YHKANGMVQWRMAP-DCAQKATOSA-N 0.000 description 1
- CCDFBRZVTDDJNM-GUBZILKMSA-N Ala-Leu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O CCDFBRZVTDDJNM-GUBZILKMSA-N 0.000 description 1
- DPNZTBKGAUAZQU-DLOVCJGASA-N Ala-Leu-His Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N DPNZTBKGAUAZQU-DLOVCJGASA-N 0.000 description 1
- VHVVPYOJIIQCKS-QEJZJMRPSA-N Ala-Leu-Phe Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 VHVVPYOJIIQCKS-QEJZJMRPSA-N 0.000 description 1
- OYJCVIGKMXUVKB-GARJFASQSA-N Ala-Leu-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N OYJCVIGKMXUVKB-GARJFASQSA-N 0.000 description 1
- MEFILNJXAVSUTO-JXUBOQSCSA-N Ala-Leu-Thr Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MEFILNJXAVSUTO-JXUBOQSCSA-N 0.000 description 1
- MLNSNVLOEIYJIU-ZUDIRPEPSA-N Ala-Leu-Thr-Gln Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MLNSNVLOEIYJIU-ZUDIRPEPSA-N 0.000 description 1
- RGQCNKIDEQJEBT-CQDKDKBSSA-N Ala-Leu-Tyr Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 RGQCNKIDEQJEBT-CQDKDKBSSA-N 0.000 description 1
- QUIGLPSHIFPEOV-CIUDSAMLSA-N Ala-Lys-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O QUIGLPSHIFPEOV-CIUDSAMLSA-N 0.000 description 1
- SUHLZMHFRALVSY-YUMQZZPRSA-N Ala-Lys-Gly Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)NCC(O)=O SUHLZMHFRALVSY-YUMQZZPRSA-N 0.000 description 1
- VCSABYLVNWQYQE-UHFFFAOYSA-N Ala-Lys-Lys Natural products NCCCCC(NC(=O)C(N)C)C(=O)NC(CCCCN)C(O)=O VCSABYLVNWQYQE-UHFFFAOYSA-N 0.000 description 1
- OINVDEKBKBCPLX-JXUBOQSCSA-N Ala-Lys-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OINVDEKBKBCPLX-JXUBOQSCSA-N 0.000 description 1
- KQESEZXHYOUIIM-CQDKDKBSSA-N Ala-Lys-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KQESEZXHYOUIIM-CQDKDKBSSA-N 0.000 description 1
- MDNAVFBZPROEHO-UHFFFAOYSA-N Ala-Lys-Val Natural products CC(C)C(C(O)=O)NC(=O)C(NC(=O)C(C)N)CCCCN MDNAVFBZPROEHO-UHFFFAOYSA-N 0.000 description 1
- AWNAEZICPNGAJK-FXQIFTODSA-N Ala-Met-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(O)=O AWNAEZICPNGAJK-FXQIFTODSA-N 0.000 description 1
- 108010011667 Ala-Phe-Ala Proteins 0.000 description 1
- CNQAFFMNJIQYGX-DRZSPHRISA-N Ala-Phe-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 CNQAFFMNJIQYGX-DRZSPHRISA-N 0.000 description 1
- HYIDEIQUCBKIPL-CQDKDKBSSA-N Ala-Phe-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N HYIDEIQUCBKIPL-CQDKDKBSSA-N 0.000 description 1
- JAQNUEWEJWBVAY-WBAXXEDZSA-N Ala-Phe-Phe Chemical compound C([C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 JAQNUEWEJWBVAY-WBAXXEDZSA-N 0.000 description 1
- IPZQNYYAYVRKKK-FXQIFTODSA-N Ala-Pro-Ala Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O IPZQNYYAYVRKKK-FXQIFTODSA-N 0.000 description 1
- IORKCNUBHNIMKY-CIUDSAMLSA-N Ala-Pro-Glu Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O IORKCNUBHNIMKY-CIUDSAMLSA-N 0.000 description 1
- VRTOMXFZHGWHIJ-KZVJFYERSA-N Ala-Thr-Arg Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VRTOMXFZHGWHIJ-KZVJFYERSA-N 0.000 description 1
- LSMDIAAALJJLRO-XQXXSGGOSA-N Ala-Thr-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O LSMDIAAALJJLRO-XQXXSGGOSA-N 0.000 description 1
- IOFVWPYSRSCWHI-JXUBOQSCSA-N Ala-Thr-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](C)N IOFVWPYSRSCWHI-JXUBOQSCSA-N 0.000 description 1
- QDGMZAOSMNGBLP-MRFFXTKBSA-N Ala-Trp-Tyr Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC3=CC=C(C=C3)O)C(=O)O)N QDGMZAOSMNGBLP-MRFFXTKBSA-N 0.000 description 1
- MTDDMSUUXNQMKK-BPNCWPANSA-N Ala-Tyr-Arg Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N MTDDMSUUXNQMKK-BPNCWPANSA-N 0.000 description 1
- YCTIYBUTCKNOTI-UWJYBYFXSA-N Ala-Tyr-Asp Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YCTIYBUTCKNOTI-UWJYBYFXSA-N 0.000 description 1
- JNJHNBXBGNJESC-KKXDTOCCSA-N Ala-Tyr-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JNJHNBXBGNJESC-KKXDTOCCSA-N 0.000 description 1
- MUGAESARFRGOTQ-IGNZVWTISA-N Ala-Tyr-Tyr Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N MUGAESARFRGOTQ-IGNZVWTISA-N 0.000 description 1
- XSLGWYYNOSUMRM-ZKWXMUAHSA-N Ala-Val-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O XSLGWYYNOSUMRM-ZKWXMUAHSA-N 0.000 description 1
- LYILPUNCKACNGF-NAKRPEOUSA-N Ala-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C)N LYILPUNCKACNGF-NAKRPEOUSA-N 0.000 description 1
- XKHLBBQNPSOGPI-GUBZILKMSA-N Ala-Val-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C)N XKHLBBQNPSOGPI-GUBZILKMSA-N 0.000 description 1
- REWSWYIDQIELBE-FXQIFTODSA-N Ala-Val-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O REWSWYIDQIELBE-FXQIFTODSA-N 0.000 description 1
- 101900318283 Alfalfa mosaic virus Capsid protein Proteins 0.000 description 1
- 101100437895 Alternaria brassicicola bsc3 gene Proteins 0.000 description 1
- 241000254175 Anthonomus grandis Species 0.000 description 1
- 241000256844 Apis mellifera Species 0.000 description 1
- SGYSTDWPNPKJPP-GUBZILKMSA-N Arg-Ala-Arg Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O SGYSTDWPNPKJPP-GUBZILKMSA-N 0.000 description 1
- GXCSUJQOECMKPV-CIUDSAMLSA-N Arg-Ala-Gln Chemical compound C[C@H](NC(=O)[C@@H](N)CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O GXCSUJQOECMKPV-CIUDSAMLSA-N 0.000 description 1
- PEFFAAKJGBZBKL-NAKRPEOUSA-N Arg-Ala-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PEFFAAKJGBZBKL-NAKRPEOUSA-N 0.000 description 1
- GIVATXIGCXFQQA-FXQIFTODSA-N Arg-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N GIVATXIGCXFQQA-FXQIFTODSA-N 0.000 description 1
- BIOCIVSVEDFKDJ-GUBZILKMSA-N Arg-Arg-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O BIOCIVSVEDFKDJ-GUBZILKMSA-N 0.000 description 1
- MUXONAMCEUBVGA-DCAQKATOSA-N Arg-Arg-Gln Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(N)=O)C(O)=O MUXONAMCEUBVGA-DCAQKATOSA-N 0.000 description 1
- OVVUNXXROOFSIM-SDDRHHMPSA-N Arg-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O OVVUNXXROOFSIM-SDDRHHMPSA-N 0.000 description 1
- USNSOPDIZILSJP-FXQIFTODSA-N Arg-Asn-Asn Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O USNSOPDIZILSJP-FXQIFTODSA-N 0.000 description 1
- QPOARHANPULOTM-GMOBBJLQSA-N Arg-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N QPOARHANPULOTM-GMOBBJLQSA-N 0.000 description 1
- ITVINTQUZMQWJR-QXEWZRGKSA-N Arg-Asn-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O ITVINTQUZMQWJR-QXEWZRGKSA-N 0.000 description 1
- XVLLUZMFSAYKJV-GUBZILKMSA-N Arg-Asp-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O XVLLUZMFSAYKJV-GUBZILKMSA-N 0.000 description 1
- JSHVMZANPXCDTL-GMOBBJLQSA-N Arg-Asp-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JSHVMZANPXCDTL-GMOBBJLQSA-N 0.000 description 1
- RRGPUNYIPJXJBU-GUBZILKMSA-N Arg-Asp-Met Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O RRGPUNYIPJXJBU-GUBZILKMSA-N 0.000 description 1
- FBLMOFHNVQBKRR-IHRRRGAJSA-N Arg-Asp-Tyr Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 FBLMOFHNVQBKRR-IHRRRGAJSA-N 0.000 description 1
- YUGFLWBWAJFGKY-BQBZGAKWSA-N Arg-Cys-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CS)C(=O)NCC(O)=O YUGFLWBWAJFGKY-BQBZGAKWSA-N 0.000 description 1
- PTVGLOCPAVYPFG-CIUDSAMLSA-N Arg-Gln-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O PTVGLOCPAVYPFG-CIUDSAMLSA-N 0.000 description 1
- HPKSHFSEXICTLI-CIUDSAMLSA-N Arg-Glu-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O HPKSHFSEXICTLI-CIUDSAMLSA-N 0.000 description 1
- XLWSGICNBZGYTA-CIUDSAMLSA-N Arg-Glu-Asp Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O XLWSGICNBZGYTA-CIUDSAMLSA-N 0.000 description 1
- OHYQKYUTLIPFOX-ZPFDUUQYSA-N Arg-Glu-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O OHYQKYUTLIPFOX-ZPFDUUQYSA-N 0.000 description 1
- JAYIQMNQDMOBFY-KKUMJFAQSA-N Arg-Glu-Tyr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JAYIQMNQDMOBFY-KKUMJFAQSA-N 0.000 description 1
- PPPXVIBMLFWNSK-BQBZGAKWSA-N Arg-Gly-Cys Chemical compound C(C[C@@H](C(=O)NCC(=O)N[C@@H](CS)C(=O)O)N)CN=C(N)N PPPXVIBMLFWNSK-BQBZGAKWSA-N 0.000 description 1
- CYXCAHZVPFREJD-LURJTMIESA-N Arg-Gly-Gly Chemical compound NC(=N)NCCC[C@H](N)C(=O)NCC(=O)NCC(O)=O CYXCAHZVPFREJD-LURJTMIESA-N 0.000 description 1
- QEHMMRSQJMOYNO-DCAQKATOSA-N Arg-His-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N QEHMMRSQJMOYNO-DCAQKATOSA-N 0.000 description 1
- YKBHOXLMMPZPHQ-GMOBBJLQSA-N Arg-Ile-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O YKBHOXLMMPZPHQ-GMOBBJLQSA-N 0.000 description 1
- HJDNZFIYILEIKR-OSUNSFLBSA-N Arg-Ile-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O HJDNZFIYILEIKR-OSUNSFLBSA-N 0.000 description 1
- GMFAGHNRXPSSJS-SRVKXCTJSA-N Arg-Leu-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O GMFAGHNRXPSSJS-SRVKXCTJSA-N 0.000 description 1
- OTZMRMHZCMZOJZ-SRVKXCTJSA-N Arg-Leu-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O OTZMRMHZCMZOJZ-SRVKXCTJSA-N 0.000 description 1
- YBZMTKUDWXZLIX-UWVGGRQHSA-N Arg-Leu-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YBZMTKUDWXZLIX-UWVGGRQHSA-N 0.000 description 1
- DNUKXVMPARLPFN-XUXIUFHCSA-N Arg-Leu-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DNUKXVMPARLPFN-XUXIUFHCSA-N 0.000 description 1
- UZGFHWIJWPUPOH-IHRRRGAJSA-N Arg-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UZGFHWIJWPUPOH-IHRRRGAJSA-N 0.000 description 1
- COXMUHNBYCVVRG-DCAQKATOSA-N Arg-Leu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O COXMUHNBYCVVRG-DCAQKATOSA-N 0.000 description 1
- OGSQONVYSTZIJB-WDSOQIARSA-N Arg-Leu-Trp Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CCCN=C(N)N)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(O)=O OGSQONVYSTZIJB-WDSOQIARSA-N 0.000 description 1
- PZBSKYJGKNNYNK-ULQDDVLXSA-N Arg-Leu-Tyr Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CCCN=C(N)N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(O)=O PZBSKYJGKNNYNK-ULQDDVLXSA-N 0.000 description 1
- YVTHEZNOKSAWRW-DCAQKATOSA-N Arg-Lys-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O YVTHEZNOKSAWRW-DCAQKATOSA-N 0.000 description 1
- AFNHFVVOJZBIJD-GUBZILKMSA-N Arg-Met-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(O)=O AFNHFVVOJZBIJD-GUBZILKMSA-N 0.000 description 1
- INXWADWANGLMPJ-JYJNAYRXSA-N Arg-Phe-Arg Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCNC(N)=N)C(O)=O)CC1=CC=CC=C1 INXWADWANGLMPJ-JYJNAYRXSA-N 0.000 description 1
- IGFJVXOATGZTHD-UHFFFAOYSA-N Arg-Phe-His Natural products NC(CCNC(=N)N)C(=O)NC(Cc1ccccc1)C(=O)NC(Cc2c[nH]cn2)C(=O)O IGFJVXOATGZTHD-UHFFFAOYSA-N 0.000 description 1
- VUGWHBXPMAHEGZ-SRVKXCTJSA-N Arg-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCN=C(N)N VUGWHBXPMAHEGZ-SRVKXCTJSA-N 0.000 description 1
- AUIJUTGLPVHIRT-FXQIFTODSA-N Arg-Ser-Cys Chemical compound C(C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)O)N)CN=C(N)N AUIJUTGLPVHIRT-FXQIFTODSA-N 0.000 description 1
- KMFPQTITXUKJOV-DCAQKATOSA-N Arg-Ser-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O KMFPQTITXUKJOV-DCAQKATOSA-N 0.000 description 1
- AIFHRTPABBBHKU-RCWTZXSCSA-N Arg-Thr-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O AIFHRTPABBBHKU-RCWTZXSCSA-N 0.000 description 1
- LYJXHXGPWDTLKW-HJGDQZAQSA-N Arg-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O LYJXHXGPWDTLKW-HJGDQZAQSA-N 0.000 description 1
- AUZAXCPWMDBWEE-HJGDQZAQSA-N Arg-Thr-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O AUZAXCPWMDBWEE-HJGDQZAQSA-N 0.000 description 1
- YNSUUAOAFCVINY-OSUNSFLBSA-N Arg-Thr-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YNSUUAOAFCVINY-OSUNSFLBSA-N 0.000 description 1
- MOGMYRUNTKYZFB-UNQGMJICSA-N Arg-Thr-Phe Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MOGMYRUNTKYZFB-UNQGMJICSA-N 0.000 description 1
- INOIAEUXVVNJKA-XGEHTFHBSA-N Arg-Thr-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O INOIAEUXVVNJKA-XGEHTFHBSA-N 0.000 description 1
- YHZQOSXDTFRZKU-WDSOQIARSA-N Arg-Trp-Leu Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N)=CNC2=C1 YHZQOSXDTFRZKU-WDSOQIARSA-N 0.000 description 1
- QJWLLRZTJFPCHA-STECZYCISA-N Arg-Tyr-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O QJWLLRZTJFPCHA-STECZYCISA-N 0.000 description 1
- QCTOLCVIGRLMQS-HRCADAONSA-N Arg-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O QCTOLCVIGRLMQS-HRCADAONSA-N 0.000 description 1
- QHUOOCKNNURZSL-IHRRRGAJSA-N Arg-Tyr-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O QHUOOCKNNURZSL-IHRRRGAJSA-N 0.000 description 1
- IZSMEUDYADKZTJ-KJEVXHAQSA-N Arg-Tyr-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IZSMEUDYADKZTJ-KJEVXHAQSA-N 0.000 description 1
- 239000004475 Arginine Substances 0.000 description 1
- SWLOHUMCUDRTCL-ZLUOBGJFSA-N Asn-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N SWLOHUMCUDRTCL-ZLUOBGJFSA-N 0.000 description 1
- XYOVHPDDWCEUDY-CIUDSAMLSA-N Asn-Ala-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O XYOVHPDDWCEUDY-CIUDSAMLSA-N 0.000 description 1
- SLKLLQWZQHXYSV-CIUDSAMLSA-N Asn-Ala-Lys Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O SLKLLQWZQHXYSV-CIUDSAMLSA-N 0.000 description 1
- NTXNUXPCNRDMAF-WFBYXXMGSA-N Asn-Ala-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CC(N)=O)C)C(O)=O)=CNC2=C1 NTXNUXPCNRDMAF-WFBYXXMGSA-N 0.000 description 1
- VDCIPFYVCICPEC-FXQIFTODSA-N Asn-Arg-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O VDCIPFYVCICPEC-FXQIFTODSA-N 0.000 description 1
- GXMSVVBIAMWMKO-BQBZGAKWSA-N Asn-Arg-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CCCN=C(N)N GXMSVVBIAMWMKO-BQBZGAKWSA-N 0.000 description 1
- HUZGPXBILPMCHM-IHRRRGAJSA-N Asn-Arg-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HUZGPXBILPMCHM-IHRRRGAJSA-N 0.000 description 1
- GOVUDFOGXOONFT-VEVYYDQMSA-N Asn-Arg-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GOVUDFOGXOONFT-VEVYYDQMSA-N 0.000 description 1
- RCENDENBBJFJHZ-ACZMJKKPSA-N Asn-Asn-Gln Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O RCENDENBBJFJHZ-ACZMJKKPSA-N 0.000 description 1
- DXZNJWFECGJCQR-FXQIFTODSA-N Asn-Asn-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)N)N DXZNJWFECGJCQR-FXQIFTODSA-N 0.000 description 1
- NVGWESORMHFISY-SRVKXCTJSA-N Asn-Asn-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O NVGWESORMHFISY-SRVKXCTJSA-N 0.000 description 1
- BHQQRVARKXWXPP-ACZMJKKPSA-N Asn-Asp-Glu Chemical compound C(CC(=O)O)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N BHQQRVARKXWXPP-ACZMJKKPSA-N 0.000 description 1
- VYLVOMUVLMGCRF-ZLUOBGJFSA-N Asn-Asp-Ser Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O VYLVOMUVLMGCRF-ZLUOBGJFSA-N 0.000 description 1
- XQQVCUIBGYFKDC-OLHMAJIHSA-N Asn-Asp-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XQQVCUIBGYFKDC-OLHMAJIHSA-N 0.000 description 1
- HJRBIWRXULGMOA-ACZMJKKPSA-N Asn-Gln-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HJRBIWRXULGMOA-ACZMJKKPSA-N 0.000 description 1
- VJTWLBMESLDOMK-WDSKDSINSA-N Asn-Gln-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O VJTWLBMESLDOMK-WDSKDSINSA-N 0.000 description 1
- UPALZCBCKAMGIY-PEFMBERDSA-N Asn-Gln-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UPALZCBCKAMGIY-PEFMBERDSA-N 0.000 description 1
- KWQPAXYXVMHJJR-AVGNSLFASA-N Asn-Gln-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 KWQPAXYXVMHJJR-AVGNSLFASA-N 0.000 description 1
- SNAKIVFVLVUCKB-UHFFFAOYSA-N Asn-Glu-Ala-Lys Natural products NCCCCC(C(O)=O)NC(=O)C(C)NC(=O)C(CCC(O)=O)NC(=O)C(N)CC(N)=O SNAKIVFVLVUCKB-UHFFFAOYSA-N 0.000 description 1
- QYXNFROWLZPWPC-FXQIFTODSA-N Asn-Glu-Gln Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O QYXNFROWLZPWPC-FXQIFTODSA-N 0.000 description 1
- JREOBWLIZLXRIS-GUBZILKMSA-N Asn-Glu-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JREOBWLIZLXRIS-GUBZILKMSA-N 0.000 description 1
- CTQIOCMSIJATNX-WHFBIAKZSA-N Asn-Gly-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](C)C(O)=O CTQIOCMSIJATNX-WHFBIAKZSA-N 0.000 description 1
- IICZCLFBILYRCU-WHFBIAKZSA-N Asn-Gly-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O IICZCLFBILYRCU-WHFBIAKZSA-N 0.000 description 1
- OLISTMZJGQUOGS-GMOBBJLQSA-N Asn-Ile-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N OLISTMZJGQUOGS-GMOBBJLQSA-N 0.000 description 1
- XVBDDUPJVQXDSI-PEFMBERDSA-N Asn-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N XVBDDUPJVQXDSI-PEFMBERDSA-N 0.000 description 1
- NVWJMQNYLYWVNQ-BYULHYEWSA-N Asn-Ile-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O NVWJMQNYLYWVNQ-BYULHYEWSA-N 0.000 description 1
- ACKNRKFVYUVWAC-ZPFDUUQYSA-N Asn-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N ACKNRKFVYUVWAC-ZPFDUUQYSA-N 0.000 description 1
- PNHQRQTVBRDIEF-CIUDSAMLSA-N Asn-Leu-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(=O)N)N PNHQRQTVBRDIEF-CIUDSAMLSA-N 0.000 description 1
- HFPXZWPUVFVNLL-GUBZILKMSA-N Asn-Leu-Gln Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O HFPXZWPUVFVNLL-GUBZILKMSA-N 0.000 description 1
- BZWRLDPIWKOVKB-ZPFDUUQYSA-N Asn-Leu-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BZWRLDPIWKOVKB-ZPFDUUQYSA-N 0.000 description 1
- GLWFAWNYGWBMOC-SRVKXCTJSA-N Asn-Leu-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O GLWFAWNYGWBMOC-SRVKXCTJSA-N 0.000 description 1
- YVXRYLVELQYAEQ-SRVKXCTJSA-N Asn-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N YVXRYLVELQYAEQ-SRVKXCTJSA-N 0.000 description 1
- FBODFHMLALOPHP-GUBZILKMSA-N Asn-Lys-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O FBODFHMLALOPHP-GUBZILKMSA-N 0.000 description 1
- RZNAMKZJPBQWDJ-SRVKXCTJSA-N Asn-Lys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)N)N RZNAMKZJPBQWDJ-SRVKXCTJSA-N 0.000 description 1
- ZYPWIUFLYMQZBS-SRVKXCTJSA-N Asn-Lys-Lys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N ZYPWIUFLYMQZBS-SRVKXCTJSA-N 0.000 description 1
- MYVBTYXSWILFCG-BQBZGAKWSA-N Asn-Met-Gly Chemical compound CSCC[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC(=O)N)N MYVBTYXSWILFCG-BQBZGAKWSA-N 0.000 description 1
- AEZCCDMZZJOGII-DCAQKATOSA-N Asn-Met-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O AEZCCDMZZJOGII-DCAQKATOSA-N 0.000 description 1
- PBFXCUOEGVJTMV-QXEWZRGKSA-N Asn-Met-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(O)=O PBFXCUOEGVJTMV-QXEWZRGKSA-N 0.000 description 1
- RTFWCVDISAMGEQ-SRVKXCTJSA-N Asn-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N RTFWCVDISAMGEQ-SRVKXCTJSA-N 0.000 description 1
- LSJQOMAZIKQMTJ-SRVKXCTJSA-N Asn-Phe-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O LSJQOMAZIKQMTJ-SRVKXCTJSA-N 0.000 description 1
- PPCORQFLAZWUNO-QWRGUYRKSA-N Asn-Phe-Gly Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC(=O)N)N PPCORQFLAZWUNO-QWRGUYRKSA-N 0.000 description 1
- HZZIFFOVHLWGCS-KKUMJFAQSA-N Asn-Phe-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O HZZIFFOVHLWGCS-KKUMJFAQSA-N 0.000 description 1
- BKFXFUPYETWGGA-XVSYOHENSA-N Asn-Phe-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BKFXFUPYETWGGA-XVSYOHENSA-N 0.000 description 1
- YUOXLJYVSZYPBJ-CIUDSAMLSA-N Asn-Pro-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O YUOXLJYVSZYPBJ-CIUDSAMLSA-N 0.000 description 1
- NJSNXIOKBHPFMB-GMOBBJLQSA-N Asn-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC(=O)N)N NJSNXIOKBHPFMB-GMOBBJLQSA-N 0.000 description 1
- BYLSYQASFJJBCL-DCAQKATOSA-N Asn-Pro-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O BYLSYQASFJJBCL-DCAQKATOSA-N 0.000 description 1
- GMUOCGCDOYYWPD-FXQIFTODSA-N Asn-Pro-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O GMUOCGCDOYYWPD-FXQIFTODSA-N 0.000 description 1
- GZXOUBTUAUAVHD-ACZMJKKPSA-N Asn-Ser-Glu Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O GZXOUBTUAUAVHD-ACZMJKKPSA-N 0.000 description 1
- MKJBPDLENBUHQU-CIUDSAMLSA-N Asn-Ser-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O MKJBPDLENBUHQU-CIUDSAMLSA-N 0.000 description 1
- XHTUGJCAEYOZOR-UBHSHLNASA-N Asn-Ser-Trp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O XHTUGJCAEYOZOR-UBHSHLNASA-N 0.000 description 1
- MYTHOBCLNIOFBL-SRVKXCTJSA-N Asn-Ser-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MYTHOBCLNIOFBL-SRVKXCTJSA-N 0.000 description 1
- NCXTYSVDWLAQGZ-ZKWXMUAHSA-N Asn-Ser-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O NCXTYSVDWLAQGZ-ZKWXMUAHSA-N 0.000 description 1
- WLVLIYYBPPONRJ-GCJQMDKQSA-N Asn-Thr-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O WLVLIYYBPPONRJ-GCJQMDKQSA-N 0.000 description 1
- QUMKPKWYDVMGNT-NUMRIWBASA-N Asn-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O QUMKPKWYDVMGNT-NUMRIWBASA-N 0.000 description 1
- FMNBYVSGRCXWEK-FOHZUACHSA-N Asn-Thr-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O FMNBYVSGRCXWEK-FOHZUACHSA-N 0.000 description 1
- PIABYSIYPGLLDQ-XVSYOHENSA-N Asn-Thr-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PIABYSIYPGLLDQ-XVSYOHENSA-N 0.000 description 1
- WUQXMTITJLFXAU-JIOCBJNQSA-N Asn-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N)O WUQXMTITJLFXAU-JIOCBJNQSA-N 0.000 description 1
- UXHYOWXTJLBEPG-GSSVUCPTSA-N Asn-Thr-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O UXHYOWXTJLBEPG-GSSVUCPTSA-N 0.000 description 1
- BCADFFUQHIMQAA-KKHAAJSZSA-N Asn-Thr-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O BCADFFUQHIMQAA-KKHAAJSZSA-N 0.000 description 1
- JPPLRQVZMZFOSX-UWJYBYFXSA-N Asn-Tyr-Ala Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=C(O)C=C1 JPPLRQVZMZFOSX-UWJYBYFXSA-N 0.000 description 1
- QIRJQYQOIKBPBZ-IHRRRGAJSA-N Asn-Tyr-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QIRJQYQOIKBPBZ-IHRRRGAJSA-N 0.000 description 1
- YSYTWUMRHSFODC-QWRGUYRKSA-N Asn-Tyr-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O YSYTWUMRHSFODC-QWRGUYRKSA-N 0.000 description 1
- MJIJBEYEHBKTIM-BYULHYEWSA-N Asn-Val-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N MJIJBEYEHBKTIM-BYULHYEWSA-N 0.000 description 1
- MYRLSKYSMXNLLA-LAEOZQHASA-N Asn-Val-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O MYRLSKYSMXNLLA-LAEOZQHASA-N 0.000 description 1
- JNCRAQVYJZGIOW-QSFUFRPTSA-N Asn-Val-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JNCRAQVYJZGIOW-QSFUFRPTSA-N 0.000 description 1
- HBUJSDCLZCXXCW-YDHLFZDLSA-N Asn-Val-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HBUJSDCLZCXXCW-YDHLFZDLSA-N 0.000 description 1
- UWMIZBCTVWVMFI-FXQIFTODSA-N Asp-Ala-Arg Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O UWMIZBCTVWVMFI-FXQIFTODSA-N 0.000 description 1
- WSWYMRLTJVKRCE-ZLUOBGJFSA-N Asp-Ala-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O WSWYMRLTJVKRCE-ZLUOBGJFSA-N 0.000 description 1
- VTYQAQFKMQTKQD-ACZMJKKPSA-N Asp-Ala-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O VTYQAQFKMQTKQD-ACZMJKKPSA-N 0.000 description 1
- XEDQMTWEYFBOIK-ACZMJKKPSA-N Asp-Ala-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O XEDQMTWEYFBOIK-ACZMJKKPSA-N 0.000 description 1
- XBQSLMACWDXWLJ-GHCJXIJMSA-N Asp-Ala-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XBQSLMACWDXWLJ-GHCJXIJMSA-N 0.000 description 1
- RGKKALNPOYURGE-ZKWXMUAHSA-N Asp-Ala-Val Chemical compound N[C@@H](CC(=O)O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)O RGKKALNPOYURGE-ZKWXMUAHSA-N 0.000 description 1
- MFMJRYHVLLEMQM-DCAQKATOSA-N Asp-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)O)N MFMJRYHVLLEMQM-DCAQKATOSA-N 0.000 description 1
- CASGONAXMZPHCK-FXQIFTODSA-N Asp-Asn-Arg Chemical compound C(C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N)CN=C(N)N CASGONAXMZPHCK-FXQIFTODSA-N 0.000 description 1
- QRULNKJGYQQZMW-ZLUOBGJFSA-N Asp-Asn-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O QRULNKJGYQQZMW-ZLUOBGJFSA-N 0.000 description 1
- UGKZHCBLMLSANF-CIUDSAMLSA-N Asp-Asn-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O UGKZHCBLMLSANF-CIUDSAMLSA-N 0.000 description 1
- KNMRXHIAVXHCLW-ZLUOBGJFSA-N Asp-Asn-Ser Chemical compound C([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N)C(=O)O KNMRXHIAVXHCLW-ZLUOBGJFSA-N 0.000 description 1
- XACXDSRQIXRMNS-OLHMAJIHSA-N Asp-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N)O XACXDSRQIXRMNS-OLHMAJIHSA-N 0.000 description 1
- NAPNAGZWHQHZLG-ZLUOBGJFSA-N Asp-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)O)N NAPNAGZWHQHZLG-ZLUOBGJFSA-N 0.000 description 1
- WCFCYFDBMNFSPA-ACZMJKKPSA-N Asp-Asp-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(O)=O WCFCYFDBMNFSPA-ACZMJKKPSA-N 0.000 description 1
- SBHUBSDEZQFJHJ-CIUDSAMLSA-N Asp-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O SBHUBSDEZQFJHJ-CIUDSAMLSA-N 0.000 description 1
- QXHVOUSPVAWEMX-ZLUOBGJFSA-N Asp-Asp-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O QXHVOUSPVAWEMX-ZLUOBGJFSA-N 0.000 description 1
- BFOYULZBKYOKAN-OLHMAJIHSA-N Asp-Asp-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BFOYULZBKYOKAN-OLHMAJIHSA-N 0.000 description 1
- KGAJCJXBEWLQDZ-UBHSHLNASA-N Asp-Asp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)O)N KGAJCJXBEWLQDZ-UBHSHLNASA-N 0.000 description 1
- AMRANMVXQWXNAH-ZLUOBGJFSA-N Asp-Cys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CS)NC(=O)[C@@H](N)CC(O)=O AMRANMVXQWXNAH-ZLUOBGJFSA-N 0.000 description 1
- VHQOCWWKXIOAQI-WDSKDSINSA-N Asp-Gln-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O VHQOCWWKXIOAQI-WDSKDSINSA-N 0.000 description 1
- XAJRHVUUVUPFQL-ACZMJKKPSA-N Asp-Glu-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O XAJRHVUUVUPFQL-ACZMJKKPSA-N 0.000 description 1
- GHODABZPVZMWCE-FXQIFTODSA-N Asp-Glu-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O GHODABZPVZMWCE-FXQIFTODSA-N 0.000 description 1
- LTXGDRFJRZSZAV-CIUDSAMLSA-N Asp-Glu-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)O)N LTXGDRFJRZSZAV-CIUDSAMLSA-N 0.000 description 1
- DGKCOYGQLNWNCJ-ACZMJKKPSA-N Asp-Glu-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O DGKCOYGQLNWNCJ-ACZMJKKPSA-N 0.000 description 1
- XDGBFDYXZCMYEX-NUMRIWBASA-N Asp-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)O)N)O XDGBFDYXZCMYEX-NUMRIWBASA-N 0.000 description 1
- JUWZKMBALYLZCK-WHFBIAKZSA-N Asp-Gly-Asn Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O JUWZKMBALYLZCK-WHFBIAKZSA-N 0.000 description 1
- HAFCJCDJGIOYPW-WDSKDSINSA-N Asp-Gly-Gln Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O HAFCJCDJGIOYPW-WDSKDSINSA-N 0.000 description 1
- OMMIEVATLAGRCK-BYPYZUCNSA-N Asp-Gly-Gly Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)NCC(O)=O OMMIEVATLAGRCK-BYPYZUCNSA-N 0.000 description 1
- QCVXMEHGFUMKCO-YUMQZZPRSA-N Asp-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O QCVXMEHGFUMKCO-YUMQZZPRSA-N 0.000 description 1
- PZXPWHFYZXTFBI-YUMQZZPRSA-N Asp-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O PZXPWHFYZXTFBI-YUMQZZPRSA-N 0.000 description 1
- OGTCOKZFOJIZFG-CIUDSAMLSA-N Asp-His-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(O)=O OGTCOKZFOJIZFG-CIUDSAMLSA-N 0.000 description 1
- TZOZNVLBTAFJRW-UGYAYLCHSA-N Asp-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)O)N TZOZNVLBTAFJRW-UGYAYLCHSA-N 0.000 description 1
- QNFRBNZGVVKBNJ-PEFMBERDSA-N Asp-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N QNFRBNZGVVKBNJ-PEFMBERDSA-N 0.000 description 1
- KQBVNNAPIURMPD-PEFMBERDSA-N Asp-Ile-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O KQBVNNAPIURMPD-PEFMBERDSA-N 0.000 description 1
- NHSDEZURHWEZPN-SXTJYALSSA-N Asp-Ile-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)NC(=O)[C@H](CC(=O)O)N NHSDEZURHWEZPN-SXTJYALSSA-N 0.000 description 1
- PYXXJFRXIYAESU-PCBIJLKTSA-N Asp-Ile-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PYXXJFRXIYAESU-PCBIJLKTSA-N 0.000 description 1
- SPKCGKRUYKMDHP-GUDRVLHUSA-N Asp-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)O)N SPKCGKRUYKMDHP-GUDRVLHUSA-N 0.000 description 1
- KLYPOCBLKMPBIQ-GHCJXIJMSA-N Asp-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC(=O)O)N KLYPOCBLKMPBIQ-GHCJXIJMSA-N 0.000 description 1
- UZNSWMFLKVKJLI-VHWLVUOQSA-N Asp-Ile-Trp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O UZNSWMFLKVKJLI-VHWLVUOQSA-N 0.000 description 1
- KYQNAIMCTRZLNP-QSFUFRPTSA-N Asp-Ile-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O KYQNAIMCTRZLNP-QSFUFRPTSA-N 0.000 description 1
- CLUMZOKVGUWUFD-CIUDSAMLSA-N Asp-Leu-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O CLUMZOKVGUWUFD-CIUDSAMLSA-N 0.000 description 1
- PAYPSKIBMDHZPI-CIUDSAMLSA-N Asp-Leu-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O PAYPSKIBMDHZPI-CIUDSAMLSA-N 0.000 description 1
- UMHUHHJMEXNSIV-CIUDSAMLSA-N Asp-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UMHUHHJMEXNSIV-CIUDSAMLSA-N 0.000 description 1
- DPNWSMBUYCLEDG-CIUDSAMLSA-N Asp-Lys-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O DPNWSMBUYCLEDG-CIUDSAMLSA-N 0.000 description 1
- VMVUDJUXJKDGNR-FXQIFTODSA-N Asp-Met-Asn Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N VMVUDJUXJKDGNR-FXQIFTODSA-N 0.000 description 1
- HSGOFISJLFDMBJ-CIUDSAMLSA-N Asp-Met-Gln Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N HSGOFISJLFDMBJ-CIUDSAMLSA-N 0.000 description 1
- JXGJJQJHXHXJQF-CIUDSAMLSA-N Asp-Met-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(O)=O)C(O)=O JXGJJQJHXHXJQF-CIUDSAMLSA-N 0.000 description 1
- IDDMGSKZQDEDGA-SRVKXCTJSA-N Asp-Phe-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=CC=C1 IDDMGSKZQDEDGA-SRVKXCTJSA-N 0.000 description 1
- GWIJZUVQVDJHDI-AVGNSLFASA-N Asp-Phe-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O GWIJZUVQVDJHDI-AVGNSLFASA-N 0.000 description 1
- LTCKTLYKRMCFOC-KKUMJFAQSA-N Asp-Phe-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O LTCKTLYKRMCFOC-KKUMJFAQSA-N 0.000 description 1
- UCHSVZYJKJLPHF-BZSNNMDCSA-N Asp-Phe-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O UCHSVZYJKJLPHF-BZSNNMDCSA-N 0.000 description 1
- ZQFRDAZBTSFGGW-SRVKXCTJSA-N Asp-Ser-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZQFRDAZBTSFGGW-SRVKXCTJSA-N 0.000 description 1
- MGSVBZIBCCKGCY-ZLUOBGJFSA-N Asp-Ser-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O MGSVBZIBCCKGCY-ZLUOBGJFSA-N 0.000 description 1
- MNQMTYSEKZHIDF-GCJQMDKQSA-N Asp-Thr-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O MNQMTYSEKZHIDF-GCJQMDKQSA-N 0.000 description 1
- IWLZBRTUIVXZJD-OLHMAJIHSA-N Asp-Thr-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O IWLZBRTUIVXZJD-OLHMAJIHSA-N 0.000 description 1
- JJQGZGOEDSSHTE-FOHZUACHSA-N Asp-Thr-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O JJQGZGOEDSSHTE-FOHZUACHSA-N 0.000 description 1
- RSMZEHCMIOKNMW-GSSVUCPTSA-N Asp-Thr-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RSMZEHCMIOKNMW-GSSVUCPTSA-N 0.000 description 1
- ITGFVUYOLWBPQW-KKHAAJSZSA-N Asp-Thr-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O ITGFVUYOLWBPQW-KKHAAJSZSA-N 0.000 description 1
- HCOQNGIHSXICCB-IHRRRGAJSA-N Asp-Tyr-Arg Chemical compound N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)O HCOQNGIHSXICCB-IHRRRGAJSA-N 0.000 description 1
- NJLLRXWFPQQPHV-SRVKXCTJSA-N Asp-Tyr-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O NJLLRXWFPQQPHV-SRVKXCTJSA-N 0.000 description 1
- KNDCWFXCFKSEBM-AVGNSLFASA-N Asp-Tyr-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O KNDCWFXCFKSEBM-AVGNSLFASA-N 0.000 description 1
- 108010083946 Asp-Tyr-Leu-Lys Proteins 0.000 description 1
- VHUKCUHLFMRHOD-MELADBBJSA-N Asp-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CC(=O)O)N)C(=O)O VHUKCUHLFMRHOD-MELADBBJSA-N 0.000 description 1
- CZIVKMOEXPILDK-SRVKXCTJSA-N Asp-Tyr-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O CZIVKMOEXPILDK-SRVKXCTJSA-N 0.000 description 1
- PLOKOIJSGCISHE-BYULHYEWSA-N Asp-Val-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O PLOKOIJSGCISHE-BYULHYEWSA-N 0.000 description 1
- UXRVDHVARNBOIO-QSFUFRPTSA-N Asp-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CC(=O)O)N UXRVDHVARNBOIO-QSFUFRPTSA-N 0.000 description 1
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 1
- 241000223678 Aureobasidium pullulans Species 0.000 description 1
- DWRXFEITVBNRMK-UHFFFAOYSA-N Beta-D-1-Arabinofuranosylthymine Natural products O=C1NC(=O)C(C)=CN1C1C(O)C(O)C(CO)O1 DWRXFEITVBNRMK-UHFFFAOYSA-N 0.000 description 1
- 102000006734 Beta-Globulins Human genes 0.000 description 1
- 108010087504 Beta-Globulins Proteins 0.000 description 1
- 108010006654 Bleomycin Proteins 0.000 description 1
- 101150078024 CRY2 gene Proteins 0.000 description 1
- 101100465058 Caenorhabditis elegans prk-2 gene Proteins 0.000 description 1
- BVKZGUZCCUSVTD-UHFFFAOYSA-L Carbonate Chemical compound [O-]C([O-])=O BVKZGUZCCUSVTD-UHFFFAOYSA-L 0.000 description 1
- 241000701489 Cauliflower mosaic virus Species 0.000 description 1
- 108010035563 Chloramphenicol O-acetyltransferase Proteins 0.000 description 1
- 239000005496 Chlorsulfuron Substances 0.000 description 1
- KRKNYBCHXYNGOX-UHFFFAOYSA-K Citrate Chemical compound [O-]C(=O)CC(O)(CC([O-])=O)C([O-])=O KRKNYBCHXYNGOX-UHFFFAOYSA-K 0.000 description 1
- 108091033380 Coding strand Proteins 0.000 description 1
- 241000500845 Costelytra zealandica Species 0.000 description 1
- 241000195493 Cryptophyta Species 0.000 description 1
- 241000252867 Cupriavidus metallidurans Species 0.000 description 1
- XRTISHJEPHMBJG-SRVKXCTJSA-N Cys-Asp-Tyr Chemical compound SC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 XRTISHJEPHMBJG-SRVKXCTJSA-N 0.000 description 1
- PFAQXUDMZVMADG-AVGNSLFASA-N Cys-Gln-Tyr Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O PFAQXUDMZVMADG-AVGNSLFASA-N 0.000 description 1
- MRVSLWQRNWEROS-SVSWQMSJSA-N Cys-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CS)N MRVSLWQRNWEROS-SVSWQMSJSA-N 0.000 description 1
- WVLZTXGTNGHPBO-SRVKXCTJSA-N Cys-Leu-Leu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O WVLZTXGTNGHPBO-SRVKXCTJSA-N 0.000 description 1
- VTBGVPWSWJBERH-DCAQKATOSA-N Cys-Leu-Met Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CS)N VTBGVPWSWJBERH-DCAQKATOSA-N 0.000 description 1
- GGRDJANMZPGMNS-CIUDSAMLSA-N Cys-Ser-Leu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O GGRDJANMZPGMNS-CIUDSAMLSA-N 0.000 description 1
- MWVDDZUTWXFYHL-XKBZYTNZSA-N Cys-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CS)N)O MWVDDZUTWXFYHL-XKBZYTNZSA-N 0.000 description 1
- 108010066133 D-octopine dehydrogenase Proteins 0.000 description 1
- YAHZABJORDUQGO-NQXXGFSBSA-N D-ribulose 1,5-bisphosphate Chemical compound OP(=O)(O)OC[C@@H](O)[C@@H](O)C(=O)COP(O)(O)=O YAHZABJORDUQGO-NQXXGFSBSA-N 0.000 description 1
- 108010090461 DFG peptide Proteins 0.000 description 1
- 239000003298 DNA probe Substances 0.000 description 1
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 1
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 1
- NDUPDOJHUQKPAG-UHFFFAOYSA-N Dalapon Chemical compound CC(Cl)(Cl)C(O)=O NDUPDOJHUQKPAG-UHFFFAOYSA-N 0.000 description 1
- 208000034423 Delivery Diseases 0.000 description 1
- 241001303048 Ditta Species 0.000 description 1
- 102000002322 Egg Proteins Human genes 0.000 description 1
- 108010000912 Egg Proteins Proteins 0.000 description 1
- 102100031780 Endonuclease Human genes 0.000 description 1
- 108010042407 Endonucleases Proteins 0.000 description 1
- 108090000371 Esterases Proteins 0.000 description 1
- 108060002716 Exonuclease Proteins 0.000 description 1
- 108010001515 Galectin 4 Proteins 0.000 description 1
- 102100039556 Galectin-4 Human genes 0.000 description 1
- 108010010803 Gelatin Proteins 0.000 description 1
- YJIUYQKQBBQYHZ-ACZMJKKPSA-N Gln-Ala-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O YJIUYQKQBBQYHZ-ACZMJKKPSA-N 0.000 description 1
- REJJNXODKSHOKA-ACZMJKKPSA-N Gln-Ala-Asp Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N REJJNXODKSHOKA-ACZMJKKPSA-N 0.000 description 1
- NNQHEEQNPQYPGL-FXQIFTODSA-N Gln-Ala-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O NNQHEEQNPQYPGL-FXQIFTODSA-N 0.000 description 1
- RZSLYUUFFVHFRQ-FXQIFTODSA-N Gln-Ala-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O RZSLYUUFFVHFRQ-FXQIFTODSA-N 0.000 description 1
- UWZLBXOBVKRUFE-HGNGGELXSA-N Gln-Ala-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCC(=O)N)N UWZLBXOBVKRUFE-HGNGGELXSA-N 0.000 description 1
- LKUWAWGNJYJODH-KBIXCLLPSA-N Gln-Ala-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LKUWAWGNJYJODH-KBIXCLLPSA-N 0.000 description 1
- OVQXQLWWJSNYFV-XEGUGMAKSA-N Gln-Ala-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CCC(N)=O)C)C(O)=O)=CNC2=C1 OVQXQLWWJSNYFV-XEGUGMAKSA-N 0.000 description 1
- JSYULGSPLTZDHM-NRPADANISA-N Gln-Ala-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O JSYULGSPLTZDHM-NRPADANISA-N 0.000 description 1
- PRBLYKYHAJEABA-SRVKXCTJSA-N Gln-Arg-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O PRBLYKYHAJEABA-SRVKXCTJSA-N 0.000 description 1
- MQANCSUBSBJNLU-KKUMJFAQSA-N Gln-Arg-Tyr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MQANCSUBSBJNLU-KKUMJFAQSA-N 0.000 description 1
- TWHDOEYLXXQYOZ-FXQIFTODSA-N Gln-Asn-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N TWHDOEYLXXQYOZ-FXQIFTODSA-N 0.000 description 1
- AAOBFSKXAVIORT-GUBZILKMSA-N Gln-Asn-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O AAOBFSKXAVIORT-GUBZILKMSA-N 0.000 description 1
- SSHIXEILTLPAQT-WHFBIAKZSA-N Gln-Asp Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O SSHIXEILTLPAQT-WHFBIAKZSA-N 0.000 description 1
- QYTKAVBFRUGYAU-ACZMJKKPSA-N Gln-Asp-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O QYTKAVBFRUGYAU-ACZMJKKPSA-N 0.000 description 1
- ULXXDWZMMSQBDC-ACZMJKKPSA-N Gln-Asp-Asp Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N ULXXDWZMMSQBDC-ACZMJKKPSA-N 0.000 description 1
- RKAQZCDMSUQTSS-FXQIFTODSA-N Gln-Asp-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N RKAQZCDMSUQTSS-FXQIFTODSA-N 0.000 description 1
- JFSNBQJNDMXMQF-XHNCKOQMSA-N Gln-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)N)N)C(=O)O JFSNBQJNDMXMQF-XHNCKOQMSA-N 0.000 description 1
- OFPWCBGRYAOLMU-AVGNSLFASA-N Gln-Asp-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)N)N)O OFPWCBGRYAOLMU-AVGNSLFASA-N 0.000 description 1
- COYGBRTZEVWZBW-XKBZYTNZSA-N Gln-Cys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CS)NC(=O)[C@@H](N)CCC(N)=O COYGBRTZEVWZBW-XKBZYTNZSA-N 0.000 description 1
- ZDJZEGYVKANKED-NRPADANISA-N Gln-Cys-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(O)=O ZDJZEGYVKANKED-NRPADANISA-N 0.000 description 1
- NKCZYEDZTKOFBG-GUBZILKMSA-N Gln-Gln-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NKCZYEDZTKOFBG-GUBZILKMSA-N 0.000 description 1
- XFKUFUJECJUQTQ-CIUDSAMLSA-N Gln-Gln-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O XFKUFUJECJUQTQ-CIUDSAMLSA-N 0.000 description 1
- GPISLLFQNHELLK-DCAQKATOSA-N Gln-Gln-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCC(=O)N)N GPISLLFQNHELLK-DCAQKATOSA-N 0.000 description 1
- RBWKVOSARCFSQQ-FXQIFTODSA-N Gln-Gln-Ser Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O RBWKVOSARCFSQQ-FXQIFTODSA-N 0.000 description 1
- CGVWDTRDPLOMHZ-FXQIFTODSA-N Gln-Glu-Asp Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O CGVWDTRDPLOMHZ-FXQIFTODSA-N 0.000 description 1
- ZQPOVSJFBBETHQ-CIUDSAMLSA-N Gln-Glu-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZQPOVSJFBBETHQ-CIUDSAMLSA-N 0.000 description 1
- ZNZPKVQURDQFFS-FXQIFTODSA-N Gln-Glu-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O ZNZPKVQURDQFFS-FXQIFTODSA-N 0.000 description 1
- XJKAKYXMFHUIHT-AUTRQRHGSA-N Gln-Glu-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N XJKAKYXMFHUIHT-AUTRQRHGSA-N 0.000 description 1
- SMLDOQHTOAAFJQ-WDSKDSINSA-N Gln-Gly-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)NCC(=O)N[C@@H](CO)C(O)=O SMLDOQHTOAAFJQ-WDSKDSINSA-N 0.000 description 1
- ORYMMTRPKVTGSJ-XVKPBYJWSA-N Gln-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCC(N)=O ORYMMTRPKVTGSJ-XVKPBYJWSA-N 0.000 description 1
- NROSLUJMIQGFKS-IUCAKERBSA-N Gln-His-Gly Chemical compound C1=C(NC=N1)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N NROSLUJMIQGFKS-IUCAKERBSA-N 0.000 description 1
- NNXIQPMZGZUFJJ-AVGNSLFASA-N Gln-His-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N NNXIQPMZGZUFJJ-AVGNSLFASA-N 0.000 description 1
- YRWWJCDWLVXTHN-LAEOZQHASA-N Gln-Ile-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N YRWWJCDWLVXTHN-LAEOZQHASA-N 0.000 description 1
- RGAOLBZBLOJUTP-GRLWGSQLSA-N Gln-Ile-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)NC(=O)[C@H](CCC(=O)N)N RGAOLBZBLOJUTP-GRLWGSQLSA-N 0.000 description 1
- MTCXQQINVAFZKW-MNXVOIDGSA-N Gln-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N MTCXQQINVAFZKW-MNXVOIDGSA-N 0.000 description 1
- ITZWDGBYBPUZRG-KBIXCLLPSA-N Gln-Ile-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O ITZWDGBYBPUZRG-KBIXCLLPSA-N 0.000 description 1
- ZNTDJIMJKNNSLR-RWRJDSDZSA-N Gln-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N ZNTDJIMJKNNSLR-RWRJDSDZSA-N 0.000 description 1
- JKGHMESJHRTHIC-SIUGBPQLSA-N Gln-Ile-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N JKGHMESJHRTHIC-SIUGBPQLSA-N 0.000 description 1
- HYPVLWGNBIYTNA-GUBZILKMSA-N Gln-Leu-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O HYPVLWGNBIYTNA-GUBZILKMSA-N 0.000 description 1
- VZRAXPGTUNDIDK-GUBZILKMSA-N Gln-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N VZRAXPGTUNDIDK-GUBZILKMSA-N 0.000 description 1
- HHQCBFGKQDMWSP-GUBZILKMSA-N Gln-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)N)N HHQCBFGKQDMWSP-GUBZILKMSA-N 0.000 description 1
- LGIKBBLQVSWUGK-DCAQKATOSA-N Gln-Leu-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O LGIKBBLQVSWUGK-DCAQKATOSA-N 0.000 description 1
- QKCZZAZNMMVICF-DCAQKATOSA-N Gln-Leu-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O QKCZZAZNMMVICF-DCAQKATOSA-N 0.000 description 1
- XFAUJGNLHIGXET-AVGNSLFASA-N Gln-Leu-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O XFAUJGNLHIGXET-AVGNSLFASA-N 0.000 description 1
- IULKWYSYZSURJK-AVGNSLFASA-N Gln-Leu-Lys Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O IULKWYSYZSURJK-AVGNSLFASA-N 0.000 description 1
- JNENSVNAUWONEZ-GUBZILKMSA-N Gln-Lys-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O JNENSVNAUWONEZ-GUBZILKMSA-N 0.000 description 1
- ATTWDCRXQNKRII-GUBZILKMSA-N Gln-Lys-Cys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)N)N ATTWDCRXQNKRII-GUBZILKMSA-N 0.000 description 1
- HPCOBEHVEHWREJ-DCAQKATOSA-N Gln-Lys-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HPCOBEHVEHWREJ-DCAQKATOSA-N 0.000 description 1
- ZEEPYMXTJWIMSN-GUBZILKMSA-N Gln-Lys-Ser Chemical compound NCCCC[C@@H](C(=O)N[C@@H](CO)C(O)=O)NC(=O)[C@@H](N)CCC(N)=O ZEEPYMXTJWIMSN-GUBZILKMSA-N 0.000 description 1
- XBWGJWXGUNSZAT-CIUDSAMLSA-N Gln-Met-Asp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N XBWGJWXGUNSZAT-CIUDSAMLSA-N 0.000 description 1
- QMVCEWKHIUHTSD-GUBZILKMSA-N Gln-Met-Glu Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N QMVCEWKHIUHTSD-GUBZILKMSA-N 0.000 description 1
- JNVGVECJCOZHCN-DRZSPHRISA-N Gln-Phe-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(O)=O JNVGVECJCOZHCN-DRZSPHRISA-N 0.000 description 1
- HMIXCETWRYDVMO-GUBZILKMSA-N Gln-Pro-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O HMIXCETWRYDVMO-GUBZILKMSA-N 0.000 description 1
- XQDGOJPVMSWZSO-SRVKXCTJSA-N Gln-Pro-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CCC(=O)N)N XQDGOJPVMSWZSO-SRVKXCTJSA-N 0.000 description 1
- VNTGPISAOMAXRK-CIUDSAMLSA-N Gln-Pro-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O VNTGPISAOMAXRK-CIUDSAMLSA-N 0.000 description 1
- DCWNCMRZIZSZBL-KKUMJFAQSA-N Gln-Pro-Tyr Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CCC(=O)N)N)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O DCWNCMRZIZSZBL-KKUMJFAQSA-N 0.000 description 1
- NYCVMJGIJYQWDO-CIUDSAMLSA-N Gln-Ser-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NYCVMJGIJYQWDO-CIUDSAMLSA-N 0.000 description 1
- UTOQQOMEJDPDMX-ACZMJKKPSA-N Gln-Ser-Asp Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O UTOQQOMEJDPDMX-ACZMJKKPSA-N 0.000 description 1
- LGWNISYVKDNJRP-FXQIFTODSA-N Gln-Ser-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O LGWNISYVKDNJRP-FXQIFTODSA-N 0.000 description 1
- KPNWAJMEMRCLAL-GUBZILKMSA-N Gln-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)N)N KPNWAJMEMRCLAL-GUBZILKMSA-N 0.000 description 1
- OTQSTOXRUBVWAP-NRPADANISA-N Gln-Ser-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O OTQSTOXRUBVWAP-NRPADANISA-N 0.000 description 1
- PAOHIZNRJNIXQY-XQXXSGGOSA-N Gln-Thr-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O PAOHIZNRJNIXQY-XQXXSGGOSA-N 0.000 description 1
- XKPACHRGOWQHFH-IRIUXVKKSA-N Gln-Thr-Tyr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XKPACHRGOWQHFH-IRIUXVKKSA-N 0.000 description 1
- HLRLXVPRJJITSK-IFFSRLJSSA-N Gln-Thr-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O HLRLXVPRJJITSK-IFFSRLJSSA-N 0.000 description 1
- DITJVHONFRJKJW-BPUTZDHNSA-N Gln-Trp-Glu Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N DITJVHONFRJKJW-BPUTZDHNSA-N 0.000 description 1
- GTBXHETZPUURJE-KKUMJFAQSA-N Gln-Tyr-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GTBXHETZPUURJE-KKUMJFAQSA-N 0.000 description 1
- OACQOWPRWGNKTP-AVGNSLFASA-N Gln-Tyr-Asp Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O OACQOWPRWGNKTP-AVGNSLFASA-N 0.000 description 1
- WIMVKDYAKRAUCG-IHRRRGAJSA-N Gln-Tyr-Glu Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O WIMVKDYAKRAUCG-IHRRRGAJSA-N 0.000 description 1
- SGVGIVDZLSHSEN-RYUDHWBXSA-N Gln-Tyr-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O SGVGIVDZLSHSEN-RYUDHWBXSA-N 0.000 description 1
- BJVBMSTUUWGZKX-JYJNAYRXSA-N Gln-Tyr-His Chemical compound N[C@@H](CCC(N)=O)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O BJVBMSTUUWGZKX-JYJNAYRXSA-N 0.000 description 1
- UGEZSPWLJABDAR-KKUMJFAQSA-N Gln-Tyr-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CCC(=O)N)N UGEZSPWLJABDAR-KKUMJFAQSA-N 0.000 description 1
- JTWZNMUVQWWGOX-SOUVJXGZSA-N Gln-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CCC(=O)N)N)C(=O)O JTWZNMUVQWWGOX-SOUVJXGZSA-N 0.000 description 1
- SJMJMEWQMBJYPR-DZKIICNBSA-N Gln-Tyr-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CCC(=O)N)N SJMJMEWQMBJYPR-DZKIICNBSA-N 0.000 description 1
- OACPJRQRAHMQEQ-NHCYSSNCSA-N Gln-Val-Arg Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O OACPJRQRAHMQEQ-NHCYSSNCSA-N 0.000 description 1
- FITIQFSXXBKFFM-NRPADANISA-N Gln-Val-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O FITIQFSXXBKFFM-NRPADANISA-N 0.000 description 1
- SOEXCCGNHQBFPV-DLOVCJGASA-N Gln-Val-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(O)=O SOEXCCGNHQBFPV-DLOVCJGASA-N 0.000 description 1
- 108010044091 Globulins Proteins 0.000 description 1
- 102000006395 Globulins Human genes 0.000 description 1
- RUFHOVYUYSNDNY-ACZMJKKPSA-N Glu-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O RUFHOVYUYSNDNY-ACZMJKKPSA-N 0.000 description 1
- WZZSKAJIHTUUSG-ACZMJKKPSA-N Glu-Ala-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O WZZSKAJIHTUUSG-ACZMJKKPSA-N 0.000 description 1
- OGMQXTXGLDNBSS-FXQIFTODSA-N Glu-Ala-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O OGMQXTXGLDNBSS-FXQIFTODSA-N 0.000 description 1
- JJKKWYQVHRUSDG-GUBZILKMSA-N Glu-Ala-Lys Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O JJKKWYQVHRUSDG-GUBZILKMSA-N 0.000 description 1
- AVZHGSCDKIQZPQ-CIUDSAMLSA-N Glu-Arg-Ala Chemical compound C[C@H](NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)CCC(O)=O)C(O)=O AVZHGSCDKIQZPQ-CIUDSAMLSA-N 0.000 description 1
- VTTSANCGJWLPNC-ZPFDUUQYSA-N Glu-Arg-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VTTSANCGJWLPNC-ZPFDUUQYSA-N 0.000 description 1
- LTUVYLVIZHJCOQ-KKUMJFAQSA-N Glu-Arg-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LTUVYLVIZHJCOQ-KKUMJFAQSA-N 0.000 description 1
- ZOXBSICWUDAOHX-GUBZILKMSA-N Glu-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCC(O)=O ZOXBSICWUDAOHX-GUBZILKMSA-N 0.000 description 1
- VAZZOGXDUQSVQF-NUMRIWBASA-N Glu-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N)O VAZZOGXDUQSVQF-NUMRIWBASA-N 0.000 description 1
- BUVMZWZNWMKASN-QEJZJMRPSA-N Glu-Asn-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CCC(O)=O)N)C(O)=O)=CNC2=C1 BUVMZWZNWMKASN-QEJZJMRPSA-N 0.000 description 1
- PCBBLFVHTYNQGG-LAEOZQHASA-N Glu-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N PCBBLFVHTYNQGG-LAEOZQHASA-N 0.000 description 1
- JRCUFCXYZLPSDZ-ACZMJKKPSA-N Glu-Asp-Ser Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O JRCUFCXYZLPSDZ-ACZMJKKPSA-N 0.000 description 1
- CYHBMLHCQXXCCT-AVGNSLFASA-N Glu-Asp-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CYHBMLHCQXXCCT-AVGNSLFASA-N 0.000 description 1
- WATXSTJXNBOHKD-LAEOZQHASA-N Glu-Asp-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O WATXSTJXNBOHKD-LAEOZQHASA-N 0.000 description 1
- PXHABOCPJVTGEK-BQBZGAKWSA-N Glu-Gln-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O PXHABOCPJVTGEK-BQBZGAKWSA-N 0.000 description 1
- UMIRPYLZFKOEOH-YVNDNENWSA-N Glu-Gln-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UMIRPYLZFKOEOH-YVNDNENWSA-N 0.000 description 1
- WLIPTFCZLHCNFD-LPEHRKFASA-N Glu-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCC(=O)O)N)C(=O)O WLIPTFCZLHCNFD-LPEHRKFASA-N 0.000 description 1
- WPLGNDORMXTMQS-FXQIFTODSA-N Glu-Gln-Ser Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O WPLGNDORMXTMQS-FXQIFTODSA-N 0.000 description 1
- HTTSBEBKVNEDFE-AUTRQRHGSA-N Glu-Gln-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCC(=O)O)N HTTSBEBKVNEDFE-AUTRQRHGSA-N 0.000 description 1
- CGOHAEBMDSEKFB-FXQIFTODSA-N Glu-Glu-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O CGOHAEBMDSEKFB-FXQIFTODSA-N 0.000 description 1
- BUZMZDDKFCSKOT-CIUDSAMLSA-N Glu-Glu-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O BUZMZDDKFCSKOT-CIUDSAMLSA-N 0.000 description 1
- AUTNXSQEVVHSJK-YVNDNENWSA-N Glu-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O AUTNXSQEVVHSJK-YVNDNENWSA-N 0.000 description 1
- YLJHCWNDBKKOEB-IHRRRGAJSA-N Glu-Glu-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O YLJHCWNDBKKOEB-IHRRRGAJSA-N 0.000 description 1
- UHVIQGKBMXEVGN-WDSKDSINSA-N Glu-Gly-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O UHVIQGKBMXEVGN-WDSKDSINSA-N 0.000 description 1
- CUXJIASLBRJOFV-LAEOZQHASA-N Glu-Gly-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O CUXJIASLBRJOFV-LAEOZQHASA-N 0.000 description 1
- KRGZZKWSBGPLKL-IUCAKERBSA-N Glu-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)O)N KRGZZKWSBGPLKL-IUCAKERBSA-N 0.000 description 1
- ZWQVYZXPYSYPJD-RYUDHWBXSA-N Glu-Gly-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ZWQVYZXPYSYPJD-RYUDHWBXSA-N 0.000 description 1
- HPJLZFTUUJKWAJ-JHEQGTHGSA-N Glu-Gly-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O HPJLZFTUUJKWAJ-JHEQGTHGSA-N 0.000 description 1
- XMPAXPSENRSOSV-RYUDHWBXSA-N Glu-Gly-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XMPAXPSENRSOSV-RYUDHWBXSA-N 0.000 description 1
- ZJFNRQHUIHKZJF-GUBZILKMSA-N Glu-His-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(O)=O ZJFNRQHUIHKZJF-GUBZILKMSA-N 0.000 description 1
- XIKYNVKEUINBGL-IUCAKERBSA-N Glu-His-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)NCC(O)=O XIKYNVKEUINBGL-IUCAKERBSA-N 0.000 description 1
- QLPYYTDOUQNJGQ-AVGNSLFASA-N Glu-His-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N QLPYYTDOUQNJGQ-AVGNSLFASA-N 0.000 description 1
- WVYJNPCWJYBHJG-YVNDNENWSA-N Glu-Ile-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(N)=O)C(O)=O WVYJNPCWJYBHJG-YVNDNENWSA-N 0.000 description 1
- QXDXIXFSFHUYAX-MNXVOIDGSA-N Glu-Ile-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(O)=O QXDXIXFSFHUYAX-MNXVOIDGSA-N 0.000 description 1
- XTZDZAXYPDISRR-MNXVOIDGSA-N Glu-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N XTZDZAXYPDISRR-MNXVOIDGSA-N 0.000 description 1
- ZHNHJYYFCGUZNQ-KBIXCLLPSA-N Glu-Ile-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(O)=O ZHNHJYYFCGUZNQ-KBIXCLLPSA-N 0.000 description 1
- ATVYZJGOZLVXDK-IUCAKERBSA-N Glu-Leu-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O ATVYZJGOZLVXDK-IUCAKERBSA-N 0.000 description 1
- DWBBKNPKDHXIAC-SRVKXCTJSA-N Glu-Leu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CCC(O)=O DWBBKNPKDHXIAC-SRVKXCTJSA-N 0.000 description 1
- FBEJIDRSQCGFJI-GUBZILKMSA-N Glu-Leu-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O FBEJIDRSQCGFJI-GUBZILKMSA-N 0.000 description 1
- GJBUAAAIZSRCDC-GVXVVHGQSA-N Glu-Leu-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O GJBUAAAIZSRCDC-GVXVVHGQSA-N 0.000 description 1
- BBBXWRGITSUJPB-YUMQZZPRSA-N Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O BBBXWRGITSUJPB-YUMQZZPRSA-N 0.000 description 1
- YKBUCXNNBYZYAY-MNXVOIDGSA-N Glu-Lys-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YKBUCXNNBYZYAY-MNXVOIDGSA-N 0.000 description 1
- ILWHFUZZCFYSKT-AVGNSLFASA-N Glu-Lys-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O ILWHFUZZCFYSKT-AVGNSLFASA-N 0.000 description 1
- QDMVXRNLOPTPIE-WDCWCFNPSA-N Glu-Lys-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QDMVXRNLOPTPIE-WDCWCFNPSA-N 0.000 description 1
- KJBGAZSLZAQDPV-KKUMJFAQSA-N Glu-Phe-Arg Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N KJBGAZSLZAQDPV-KKUMJFAQSA-N 0.000 description 1
- YRMZCZIRHYCNHX-RYUDHWBXSA-N Glu-Phe-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(O)=O YRMZCZIRHYCNHX-RYUDHWBXSA-N 0.000 description 1
- JZJGEKDPWVJOLD-QEWYBTABSA-N Glu-Phe-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JZJGEKDPWVJOLD-QEWYBTABSA-N 0.000 description 1
- TWYFJOHWGCCRIR-DCAQKATOSA-N Glu-Pro-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TWYFJOHWGCCRIR-DCAQKATOSA-N 0.000 description 1
- UDEPRBFQTWGLCW-CIUDSAMLSA-N Glu-Pro-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O UDEPRBFQTWGLCW-CIUDSAMLSA-N 0.000 description 1
- SYWCGQOIIARSIX-SRVKXCTJSA-N Glu-Pro-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O SYWCGQOIIARSIX-SRVKXCTJSA-N 0.000 description 1
- SWDNPSMMEWRNOH-HJGDQZAQSA-N Glu-Pro-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O SWDNPSMMEWRNOH-HJGDQZAQSA-N 0.000 description 1
- ARIORLIIMJACKZ-KKUMJFAQSA-N Glu-Pro-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ARIORLIIMJACKZ-KKUMJFAQSA-N 0.000 description 1
- GMVCSRBOSIUTFC-FXQIFTODSA-N Glu-Ser-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O GMVCSRBOSIUTFC-FXQIFTODSA-N 0.000 description 1
- BXSZPACYCMNKLS-AVGNSLFASA-N Glu-Ser-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O BXSZPACYCMNKLS-AVGNSLFASA-N 0.000 description 1
- JWNZHMSRZXXGTM-XKBZYTNZSA-N Glu-Ser-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JWNZHMSRZXXGTM-XKBZYTNZSA-N 0.000 description 1
- WXONSNSSBYQGNN-AVGNSLFASA-N Glu-Ser-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O WXONSNSSBYQGNN-AVGNSLFASA-N 0.000 description 1
- MWTGQXBHVRTCOR-GLLZPBPUSA-N Glu-Thr-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MWTGQXBHVRTCOR-GLLZPBPUSA-N 0.000 description 1
- RGJKYNUINKGPJN-RWRJDSDZSA-N Glu-Thr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](CCC(=O)O)N RGJKYNUINKGPJN-RWRJDSDZSA-N 0.000 description 1
- YQAQQKPWFOBSMU-WDCWCFNPSA-N Glu-Thr-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O YQAQQKPWFOBSMU-WDCWCFNPSA-N 0.000 description 1
- MXJYXYDREQWUMS-XKBZYTNZSA-N Glu-Thr-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O MXJYXYDREQWUMS-XKBZYTNZSA-N 0.000 description 1
- VJVAQZYGLMJPTK-QEJZJMRPSA-N Glu-Trp-Asp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)O)N VJVAQZYGLMJPTK-QEJZJMRPSA-N 0.000 description 1
- HHSKZJZWQFPSKN-AVGNSLFASA-N Glu-Tyr-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O HHSKZJZWQFPSKN-AVGNSLFASA-N 0.000 description 1
- PMSDOVISAARGAV-FHWLQOOXSA-N Glu-Tyr-Phe Chemical compound C([C@H](NC(=O)[C@H](CCC(O)=O)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 PMSDOVISAARGAV-FHWLQOOXSA-N 0.000 description 1
- HQTDNEZTGZUWSY-XVKPBYJWSA-N Glu-Val-Gly Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCC(O)=O)C(=O)NCC(O)=O HQTDNEZTGZUWSY-XVKPBYJWSA-N 0.000 description 1
- GZUKEVBTYNNUQF-WDSKDSINSA-N Gly-Ala-Gln Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O GZUKEVBTYNNUQF-WDSKDSINSA-N 0.000 description 1
- YMUFWNJHVPQNQD-ZKWXMUAHSA-N Gly-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN YMUFWNJHVPQNQD-ZKWXMUAHSA-N 0.000 description 1
- QXPRJQPCFXMCIY-NKWVEPMBSA-N Gly-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN QXPRJQPCFXMCIY-NKWVEPMBSA-N 0.000 description 1
- QSDKBRMVXSWAQE-BFHQHQDPSA-N Gly-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN QSDKBRMVXSWAQE-BFHQHQDPSA-N 0.000 description 1
- LERGJIVJIIODPZ-ZANVPECISA-N Gly-Ala-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)CN)C)C(O)=O)=CNC2=C1 LERGJIVJIIODPZ-ZANVPECISA-N 0.000 description 1
- OVSKVOOUFAKODB-UWVGGRQHSA-N Gly-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N OVSKVOOUFAKODB-UWVGGRQHSA-N 0.000 description 1
- KFMBRBPXHVMDFN-UWVGGRQHSA-N Gly-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCNC(N)=N KFMBRBPXHVMDFN-UWVGGRQHSA-N 0.000 description 1
- CIMULJZTTOBOPN-WHFBIAKZSA-N Gly-Asn-Asn Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CIMULJZTTOBOPN-WHFBIAKZSA-N 0.000 description 1
- AIJAPFVDBFYNKN-WHFBIAKZSA-N Gly-Asn-Asp Chemical compound C([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)CN)C(=O)N AIJAPFVDBFYNKN-WHFBIAKZSA-N 0.000 description 1
- NZAFOTBEULLEQB-WDSKDSINSA-N Gly-Asn-Glu Chemical compound C(CC(=O)O)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)CN NZAFOTBEULLEQB-WDSKDSINSA-N 0.000 description 1
- BGVYNAQWHSTTSP-BYULHYEWSA-N Gly-Asn-Ile Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BGVYNAQWHSTTSP-BYULHYEWSA-N 0.000 description 1
- JVWPPCWUDRJGAE-YUMQZZPRSA-N Gly-Asn-Leu Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JVWPPCWUDRJGAE-YUMQZZPRSA-N 0.000 description 1
- XQHSBNVACKQWAV-WHFBIAKZSA-N Gly-Asp-Asn Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O XQHSBNVACKQWAV-WHFBIAKZSA-N 0.000 description 1
- FUTAPPOITCCWTH-WHFBIAKZSA-N Gly-Asp-Asp Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O FUTAPPOITCCWTH-WHFBIAKZSA-N 0.000 description 1
- MHHUEAIBJZWDBH-YUMQZZPRSA-N Gly-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN MHHUEAIBJZWDBH-YUMQZZPRSA-N 0.000 description 1
- PMNHJLASAAWELO-FOHZUACHSA-N Gly-Asp-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PMNHJLASAAWELO-FOHZUACHSA-N 0.000 description 1
- JPWIMMUNWUKOAD-STQMWFEESA-N Gly-Asp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN JPWIMMUNWUKOAD-STQMWFEESA-N 0.000 description 1
- BPQYBFAXRGMGGY-LAEOZQHASA-N Gly-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)CN BPQYBFAXRGMGGY-LAEOZQHASA-N 0.000 description 1
- JUGQPPOVWXSPKJ-RYUDHWBXSA-N Gly-Gln-Phe Chemical compound [H]NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JUGQPPOVWXSPKJ-RYUDHWBXSA-N 0.000 description 1
- NPSWCZIRBAYNSB-JHEQGTHGSA-N Gly-Gln-Thr Chemical compound [H]NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NPSWCZIRBAYNSB-JHEQGTHGSA-N 0.000 description 1
- JLJLBWDKDRYOPA-RYUDHWBXSA-N Gly-Gln-Tyr Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 JLJLBWDKDRYOPA-RYUDHWBXSA-N 0.000 description 1
- MOJKRXIRAZPZLW-WDSKDSINSA-N Gly-Glu-Ala Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O MOJKRXIRAZPZLW-WDSKDSINSA-N 0.000 description 1
- HDNXXTBKOJKWNN-WDSKDSINSA-N Gly-Glu-Asn Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O HDNXXTBKOJKWNN-WDSKDSINSA-N 0.000 description 1
- NTOWAXLMQFKJPT-YUMQZZPRSA-N Gly-Glu-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)CN NTOWAXLMQFKJPT-YUMQZZPRSA-N 0.000 description 1
- MBOAPAXLTUSMQI-JHEQGTHGSA-N Gly-Glu-Thr Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MBOAPAXLTUSMQI-JHEQGTHGSA-N 0.000 description 1
- QPTNELDXWKRIFX-YFKPBYRVSA-N Gly-Gly-Gln Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O QPTNELDXWKRIFX-YFKPBYRVSA-N 0.000 description 1
- XMPXVJIDADUOQB-RCOVLWMOSA-N Gly-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C([O-])=O)NC(=O)CNC(=O)C[NH3+] XMPXVJIDADUOQB-RCOVLWMOSA-N 0.000 description 1
- YWAQATDNEKZFFK-BYPYZUCNSA-N Gly-Gly-Ser Chemical compound NCC(=O)NCC(=O)N[C@@H](CO)C(O)=O YWAQATDNEKZFFK-BYPYZUCNSA-N 0.000 description 1
- UQJNXZSSGQIPIQ-FBCQKBJTSA-N Gly-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)CN UQJNXZSSGQIPIQ-FBCQKBJTSA-N 0.000 description 1
- UPADCCSMVOQAGF-LBPRGKRZSA-N Gly-Gly-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)CNC(=O)CN)C(O)=O)=CNC2=C1 UPADCCSMVOQAGF-LBPRGKRZSA-N 0.000 description 1
- HHSOPSCKAZKQHQ-PEXQALLHSA-N Gly-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)CN HHSOPSCKAZKQHQ-PEXQALLHSA-N 0.000 description 1
- BHPQOIPBLYJNAW-NGZCFLSTSA-N Gly-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN BHPQOIPBLYJNAW-NGZCFLSTSA-N 0.000 description 1
- SCWYHUQOOFRVHP-MBLNEYKQSA-N Gly-Ile-Thr Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SCWYHUQOOFRVHP-MBLNEYKQSA-N 0.000 description 1
- COVXELOAORHTND-LSJOCFKGSA-N Gly-Ile-Val Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O COVXELOAORHTND-LSJOCFKGSA-N 0.000 description 1
- NSTUFLGQJCOCDL-UWVGGRQHSA-N Gly-Leu-Arg Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N NSTUFLGQJCOCDL-UWVGGRQHSA-N 0.000 description 1
- YTSVAIMKVLZUDU-YUMQZZPRSA-N Gly-Leu-Asp Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O YTSVAIMKVLZUDU-YUMQZZPRSA-N 0.000 description 1
- NNCSJUBVFBDDLC-YUMQZZPRSA-N Gly-Leu-Ser Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O NNCSJUBVFBDDLC-YUMQZZPRSA-N 0.000 description 1
- FHQRLHFYVZAQHU-IUCAKERBSA-N Gly-Lys-Gln Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O FHQRLHFYVZAQHU-IUCAKERBSA-N 0.000 description 1
- VEPBEGNDJYANCF-QWRGUYRKSA-N Gly-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCCN VEPBEGNDJYANCF-QWRGUYRKSA-N 0.000 description 1
- FJWSJWACLMTDMI-WPRPVWTQSA-N Gly-Met-Val Chemical compound [H]NCC(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(O)=O FJWSJWACLMTDMI-WPRPVWTQSA-N 0.000 description 1
- GAFKBWKVXNERFA-QWRGUYRKSA-N Gly-Phe-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 GAFKBWKVXNERFA-QWRGUYRKSA-N 0.000 description 1
- SCJJPCQUJYPHRZ-BQBZGAKWSA-N Gly-Pro-Asn Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O SCJJPCQUJYPHRZ-BQBZGAKWSA-N 0.000 description 1
- IRJWAYCXIYUHQE-WHFBIAKZSA-N Gly-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)CN IRJWAYCXIYUHQE-WHFBIAKZSA-N 0.000 description 1
- YOBGUCWZPXJHTN-BQBZGAKWSA-N Gly-Ser-Arg Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YOBGUCWZPXJHTN-BQBZGAKWSA-N 0.000 description 1
- ABPRMMYHROQBLY-NKWVEPMBSA-N Gly-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)CN)C(=O)O ABPRMMYHROQBLY-NKWVEPMBSA-N 0.000 description 1
- IMRNSEPSPFQNHF-STQMWFEESA-N Gly-Ser-Trp Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=CC=CC=C12)C(=O)O IMRNSEPSPFQNHF-STQMWFEESA-N 0.000 description 1
- YXTFLTJYLIAZQG-FJXKBIBVSA-N Gly-Thr-Arg Chemical compound NCC(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YXTFLTJYLIAZQG-FJXKBIBVSA-N 0.000 description 1
- ZZWUYQXMIFTIIY-WEDXCCLWSA-N Gly-Thr-Leu Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O ZZWUYQXMIFTIIY-WEDXCCLWSA-N 0.000 description 1
- MYXNLWDWWOTERK-BHNWBGBOSA-N Gly-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN)O MYXNLWDWWOTERK-BHNWBGBOSA-N 0.000 description 1
- FFALDIDGPLUDKV-ZDLURKLDSA-N Gly-Thr-Ser Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O FFALDIDGPLUDKV-ZDLURKLDSA-N 0.000 description 1
- LKJCZEPXHOIAIW-HOTGVXAUSA-N Gly-Trp-Lys Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)CN LKJCZEPXHOIAIW-HOTGVXAUSA-N 0.000 description 1
- YJDALMUYJIENAG-QWRGUYRKSA-N Gly-Tyr-Asn Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)CN)O YJDALMUYJIENAG-QWRGUYRKSA-N 0.000 description 1
- KBBFOULZCHWGJX-KBPBESRZSA-N Gly-Tyr-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)CN)O KBBFOULZCHWGJX-KBPBESRZSA-N 0.000 description 1
- RYAOJUMWLWUGNW-QMMMGPOBSA-N Gly-Val-Gly Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O RYAOJUMWLWUGNW-QMMMGPOBSA-N 0.000 description 1
- ZVXMEWXHFBYJPI-LSJOCFKGSA-N Gly-Val-Ile Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ZVXMEWXHFBYJPI-LSJOCFKGSA-N 0.000 description 1
- MUGLKCQHTUFLGF-WPRPVWTQSA-N Gly-Val-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)CN MUGLKCQHTUFLGF-WPRPVWTQSA-N 0.000 description 1
- 239000004471 Glycine Substances 0.000 description 1
- 244000068988 Glycine max Species 0.000 description 1
- 235000010469 Glycine max Nutrition 0.000 description 1
- 108700037728 Glycine max beta-conglycinin Proteins 0.000 description 1
- 239000005562 Glyphosate Substances 0.000 description 1
- 108010043121 Green Fluorescent Proteins Proteins 0.000 description 1
- 102000004144 Green Fluorescent Proteins Human genes 0.000 description 1
- 241000190714 Gymnosporangium clavipes Species 0.000 description 1
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 1
- 241001465746 Heterorhabditidae Species 0.000 description 1
- 241000500097 Heterorhabditis indica Species 0.000 description 1
- OBTMRGFRLJBSFI-GARJFASQSA-N His-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC2=CN=CN2)N)C(=O)O OBTMRGFRLJBSFI-GARJFASQSA-N 0.000 description 1
- DFHVLUKTTVTCKY-PBCZWWQYSA-N His-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CN=CN1)N)O DFHVLUKTTVTCKY-PBCZWWQYSA-N 0.000 description 1
- FMRKUXFLLPKVPG-JYJNAYRXSA-N His-Gln-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC2=CN=CN2)N)O FMRKUXFLLPKVPG-JYJNAYRXSA-N 0.000 description 1
- BQFGKVYHKCNEMF-DCAQKATOSA-N His-Glu-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CN=CN1 BQFGKVYHKCNEMF-DCAQKATOSA-N 0.000 description 1
- AKEDPWJFQULLPE-IUCAKERBSA-N His-Glu-Gly Chemical compound N[C@@H](Cc1cnc[nH]1)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O AKEDPWJFQULLPE-IUCAKERBSA-N 0.000 description 1
- HQKADFMLECZIQJ-HVTMNAMFSA-N His-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC1=CN=CN1)N HQKADFMLECZIQJ-HVTMNAMFSA-N 0.000 description 1
- VBOFRJNDIOPNDO-YUMQZZPRSA-N His-Gly-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)N)C(=O)O)N VBOFRJNDIOPNDO-YUMQZZPRSA-N 0.000 description 1
- QAMFAYSMNZBNCA-UWVGGRQHSA-N His-Gly-Met Chemical compound CSCC[C@H](NC(=O)CNC(=O)[C@@H](N)Cc1cnc[nH]1)C(O)=O QAMFAYSMNZBNCA-UWVGGRQHSA-N 0.000 description 1
- CNHSMSFYVARZLI-YJRXYDGGSA-N His-His-Thr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CNHSMSFYVARZLI-YJRXYDGGSA-N 0.000 description 1
- QPSCMXDWVKWVOW-BZSNNMDCSA-N His-His-Tyr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O QPSCMXDWVKWVOW-BZSNNMDCSA-N 0.000 description 1
- MPXGJGBXCRQQJE-MXAVVETBSA-N His-Ile-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O MPXGJGBXCRQQJE-MXAVVETBSA-N 0.000 description 1
- ZRSJXIKQXUGKRB-TUBUOCAGSA-N His-Ile-Thr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZRSJXIKQXUGKRB-TUBUOCAGSA-N 0.000 description 1
- VYUXYMRNGALHEA-DLOVCJGASA-N His-Leu-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O VYUXYMRNGALHEA-DLOVCJGASA-N 0.000 description 1
- RNMNYMDTESKEAJ-KKUMJFAQSA-N His-Leu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CN=CN1 RNMNYMDTESKEAJ-KKUMJFAQSA-N 0.000 description 1
- UXSATKFPUVZVDK-KKUMJFAQSA-N His-Lys-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CN=CN1)N UXSATKFPUVZVDK-KKUMJFAQSA-N 0.000 description 1
- AYUOWUNWZGTNKB-ULQDDVLXSA-N His-Phe-Arg Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AYUOWUNWZGTNKB-ULQDDVLXSA-N 0.000 description 1
- RLAOTFTXBFQJDV-KKUMJFAQSA-N His-Phe-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CN=CN1 RLAOTFTXBFQJDV-KKUMJFAQSA-N 0.000 description 1
- JSQIXEHORHLQEE-MEYUZBJRSA-N His-Phe-Thr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JSQIXEHORHLQEE-MEYUZBJRSA-N 0.000 description 1
- WHKLDLQHSYAVGU-ACRUOGEOSA-N His-Phe-Tyr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O WHKLDLQHSYAVGU-ACRUOGEOSA-N 0.000 description 1
- STGQSBKUYSPPIG-CIUDSAMLSA-N His-Ser-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CN=CN1 STGQSBKUYSPPIG-CIUDSAMLSA-N 0.000 description 1
- CSRRMQFXMBPSIL-SIXJUCDHSA-N His-Trp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@H](CC3=CN=CN3)N CSRRMQFXMBPSIL-SIXJUCDHSA-N 0.000 description 1
- JATYGDHMDRAISQ-KKUMJFAQSA-N His-Tyr-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O JATYGDHMDRAISQ-KKUMJFAQSA-N 0.000 description 1
- KFQDSSNYWKZFOO-LSJOCFKGSA-N His-Val-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O KFQDSSNYWKZFOO-LSJOCFKGSA-N 0.000 description 1
- 101000669528 Homo sapiens Tachykinin-4 Proteins 0.000 description 1
- 101000802734 Homo sapiens eIF5-mimic protein 2 Proteins 0.000 description 1
- 101710126256 Hydrolase in agr operon Proteins 0.000 description 1
- VSZALHITQINTGC-GHCJXIJMSA-N Ile-Ala-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(=O)O)C(=O)O)N VSZALHITQINTGC-GHCJXIJMSA-N 0.000 description 1
- VAXBXNPRXPHGHG-BJDJZHNGSA-N Ile-Ala-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)O)N VAXBXNPRXPHGHG-BJDJZHNGSA-N 0.000 description 1
- DPTBVFUDCPINIP-JURCDPSOSA-N Ile-Ala-Phe Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 DPTBVFUDCPINIP-JURCDPSOSA-N 0.000 description 1
- CYHYBSGMHMHKOA-CIQUZCHMSA-N Ile-Ala-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N CYHYBSGMHMHKOA-CIQUZCHMSA-N 0.000 description 1
- WZPIKDWQVRTATP-SYWGBEHUSA-N Ile-Ala-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](C)NC(=O)[C@@H](N)[C@@H](C)CC)C(O)=O)=CNC2=C1 WZPIKDWQVRTATP-SYWGBEHUSA-N 0.000 description 1
- HERITAGIPLEJMT-GVARAGBVSA-N Ile-Ala-Tyr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HERITAGIPLEJMT-GVARAGBVSA-N 0.000 description 1
- TZCGZYWNIDZZMR-UHFFFAOYSA-N Ile-Arg-Ala Natural products CCC(C)C(N)C(=O)NC(C(=O)NC(C)C(O)=O)CCCN=C(N)N TZCGZYWNIDZZMR-UHFFFAOYSA-N 0.000 description 1
- FVEWRQXNISSYFO-ZPFDUUQYSA-N Ile-Arg-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N FVEWRQXNISSYFO-ZPFDUUQYSA-N 0.000 description 1
- QLRMMMQNCWBNPQ-QXEWZRGKSA-N Ile-Arg-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)NCC(=O)O)N QLRMMMQNCWBNPQ-QXEWZRGKSA-N 0.000 description 1
- WECYRWOMWSCWNX-XUXIUFHCSA-N Ile-Arg-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(C)C)C(O)=O WECYRWOMWSCWNX-XUXIUFHCSA-N 0.000 description 1
- DMHGKBGOUAJRHU-UHFFFAOYSA-N Ile-Arg-Pro Natural products CCC(C)C(N)C(=O)NC(CCCN=C(N)N)C(=O)N1CCCC1C(O)=O DMHGKBGOUAJRHU-UHFFFAOYSA-N 0.000 description 1
- NULSANWBUWLTKN-NAKRPEOUSA-N Ile-Arg-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CO)C(=O)O)N NULSANWBUWLTKN-NAKRPEOUSA-N 0.000 description 1
- CWJQMCPYXNVMBS-STECZYCISA-N Ile-Arg-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N CWJQMCPYXNVMBS-STECZYCISA-N 0.000 description 1
- QADCTXFNLZBZAB-GHCJXIJMSA-N Ile-Asn-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](C)C(=O)O)N QADCTXFNLZBZAB-GHCJXIJMSA-N 0.000 description 1
- YKRIXHPEIZUDDY-GMOBBJLQSA-N Ile-Asn-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YKRIXHPEIZUDDY-GMOBBJLQSA-N 0.000 description 1
- QYZYJFXHXYUZMZ-UGYAYLCHSA-N Ile-Asn-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N QYZYJFXHXYUZMZ-UGYAYLCHSA-N 0.000 description 1
- SCHZQZPYHBWYEQ-PEFMBERDSA-N Ile-Asn-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N SCHZQZPYHBWYEQ-PEFMBERDSA-N 0.000 description 1
- YPQDTQJBOFOTJQ-SXTJYALSSA-N Ile-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N YPQDTQJBOFOTJQ-SXTJYALSSA-N 0.000 description 1
- ZZHGKECPZXPXJF-PCBIJLKTSA-N Ile-Asn-Phe Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ZZHGKECPZXPXJF-PCBIJLKTSA-N 0.000 description 1
- LEDRIAHEWDJRMF-CFMVVWHZSA-N Ile-Asn-Tyr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 LEDRIAHEWDJRMF-CFMVVWHZSA-N 0.000 description 1
- HVWXAQVMRBKKFE-UGYAYLCHSA-N Ile-Asp-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N HVWXAQVMRBKKFE-UGYAYLCHSA-N 0.000 description 1
- BGZIJZJBXRVBGJ-SXTJYALSSA-N Ile-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N BGZIJZJBXRVBGJ-SXTJYALSSA-N 0.000 description 1
- RGSOCXHDOPQREB-ZPFDUUQYSA-N Ile-Asp-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N RGSOCXHDOPQREB-ZPFDUUQYSA-N 0.000 description 1
- NPROWIBAWYMPAZ-GUDRVLHUSA-N Ile-Asp-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N NPROWIBAWYMPAZ-GUDRVLHUSA-N 0.000 description 1
- GECLQMBTZCPAFY-PEFMBERDSA-N Ile-Gln-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N GECLQMBTZCPAFY-PEFMBERDSA-N 0.000 description 1
- WNQKUUQIVDDAFA-ZPFDUUQYSA-N Ile-Gln-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCSC)C(=O)O)N WNQKUUQIVDDAFA-ZPFDUUQYSA-N 0.000 description 1
- BALLIXFZYSECCF-QEWYBTABSA-N Ile-Gln-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N BALLIXFZYSECCF-QEWYBTABSA-N 0.000 description 1
- JRYQSFOFUFXPTB-RWRJDSDZSA-N Ile-Gln-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N JRYQSFOFUFXPTB-RWRJDSDZSA-N 0.000 description 1
- BEWFWZRGBDVXRP-PEFMBERDSA-N Ile-Glu-Asn Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O BEWFWZRGBDVXRP-PEFMBERDSA-N 0.000 description 1
- KIMHKBDJQQYLHU-PEFMBERDSA-N Ile-Glu-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N KIMHKBDJQQYLHU-PEFMBERDSA-N 0.000 description 1
- LGMUPVWZEYYUMU-YVNDNENWSA-N Ile-Glu-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N LGMUPVWZEYYUMU-YVNDNENWSA-N 0.000 description 1
- UBHUJPVCJHPSEU-GRLWGSQLSA-N Ile-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N UBHUJPVCJHPSEU-GRLWGSQLSA-N 0.000 description 1
- LPXHYGGZJOCAFR-MNXVOIDGSA-N Ile-Glu-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N LPXHYGGZJOCAFR-MNXVOIDGSA-N 0.000 description 1
- SPQWWEZBHXHUJN-KBIXCLLPSA-N Ile-Glu-Ser Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O SPQWWEZBHXHUJN-KBIXCLLPSA-N 0.000 description 1
- JXMSHKFPDIUYGS-SIUGBPQLSA-N Ile-Glu-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N JXMSHKFPDIUYGS-SIUGBPQLSA-N 0.000 description 1
- IGJWJGIHUFQANP-LAEOZQHASA-N Ile-Gly-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CCC(=O)N)C(=O)O)N IGJWJGIHUFQANP-LAEOZQHASA-N 0.000 description 1
- LPFBXFILACZHIB-LAEOZQHASA-N Ile-Gly-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CCC(=O)O)C(=O)O)N LPFBXFILACZHIB-LAEOZQHASA-N 0.000 description 1
- PDTMWFVVNZYWTR-NHCYSSNCSA-N Ile-Gly-Lys Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@@H](CCCCN)C(O)=O PDTMWFVVNZYWTR-NHCYSSNCSA-N 0.000 description 1
- LWWILHPVAKKLQS-QXEWZRGKSA-N Ile-Gly-Met Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CCSC)C(=O)O)N LWWILHPVAKKLQS-QXEWZRGKSA-N 0.000 description 1
- RWYCOSAAAJBJQL-KCTSRDHCSA-N Ile-Gly-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N RWYCOSAAAJBJQL-KCTSRDHCSA-N 0.000 description 1
- UAQSZXGJGLHMNV-XEGUGMAKSA-N Ile-Gly-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N UAQSZXGJGLHMNV-XEGUGMAKSA-N 0.000 description 1
- AMSYMDIIIRJRKZ-HJPIBITLSA-N Ile-His-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N AMSYMDIIIRJRKZ-HJPIBITLSA-N 0.000 description 1
- KOPIAUWNLKKELG-SIGLWIIPSA-N Ile-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N KOPIAUWNLKKELG-SIGLWIIPSA-N 0.000 description 1
- VNDQNDYEPSXHLU-JUKXBJQTSA-N Ile-His-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N VNDQNDYEPSXHLU-JUKXBJQTSA-N 0.000 description 1
- APDIECQNNDGFPD-PYJNHQTQSA-N Ile-His-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](C(C)C)C(=O)O)N APDIECQNNDGFPD-PYJNHQTQSA-N 0.000 description 1
- PWDSHAAAFXISLE-SXTJYALSSA-N Ile-Ile-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O PWDSHAAAFXISLE-SXTJYALSSA-N 0.000 description 1
- CSQNHSGHAPRGPQ-YTFOTSKYSA-N Ile-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCCN)C(=O)O)N CSQNHSGHAPRGPQ-YTFOTSKYSA-N 0.000 description 1
- MTONDYJJCIBZTK-PEDHHIEDSA-N Ile-Ile-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCSC)C(=O)O)N MTONDYJJCIBZTK-PEDHHIEDSA-N 0.000 description 1
- PFPUFNLHBXKPHY-HTFCKZLJSA-N Ile-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(=O)O)N PFPUFNLHBXKPHY-HTFCKZLJSA-N 0.000 description 1
- KBAPKNDWAGVGTH-IGISWZIWSA-N Ile-Ile-Tyr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 KBAPKNDWAGVGTH-IGISWZIWSA-N 0.000 description 1
- YGDWPQCLFJNMOL-MNXVOIDGSA-N Ile-Leu-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N YGDWPQCLFJNMOL-MNXVOIDGSA-N 0.000 description 1
- HUORUFRRJHELPD-MNXVOIDGSA-N Ile-Leu-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N HUORUFRRJHELPD-MNXVOIDGSA-N 0.000 description 1
- TVYWVSJGSHQWMT-AJNGGQMLSA-N Ile-Leu-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N TVYWVSJGSHQWMT-AJNGGQMLSA-N 0.000 description 1
- IOVUXUSIGXCREV-DKIMLUQUSA-N Ile-Leu-Phe Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IOVUXUSIGXCREV-DKIMLUQUSA-N 0.000 description 1
- DSDPLOODKXISDT-XUXIUFHCSA-N Ile-Leu-Val Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O DSDPLOODKXISDT-XUXIUFHCSA-N 0.000 description 1
- OVDKXUDMKXAZIV-ZPFDUUQYSA-N Ile-Lys-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)N)C(=O)O)N OVDKXUDMKXAZIV-ZPFDUUQYSA-N 0.000 description 1
- RMNMUUCYTMLWNA-ZPFDUUQYSA-N Ile-Lys-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N RMNMUUCYTMLWNA-ZPFDUUQYSA-N 0.000 description 1
- PNTWNAXGBOZMBO-MNXVOIDGSA-N Ile-Lys-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N PNTWNAXGBOZMBO-MNXVOIDGSA-N 0.000 description 1
- GLYJPWIRLBAIJH-FQUUOJAGSA-N Ile-Lys-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N GLYJPWIRLBAIJH-FQUUOJAGSA-N 0.000 description 1
- NPAYJTAXWXJKLO-NAKRPEOUSA-N Ile-Met-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)O)N NPAYJTAXWXJKLO-NAKRPEOUSA-N 0.000 description 1
- UYNXBNHVWFNVIN-HJWJTTGWSA-N Ile-Phe-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)CC)CC1=CC=CC=C1 UYNXBNHVWFNVIN-HJWJTTGWSA-N 0.000 description 1
- OTSVBELRDMSPKY-PCBIJLKTSA-N Ile-Phe-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N OTSVBELRDMSPKY-PCBIJLKTSA-N 0.000 description 1
- UAELWXJFLZBKQS-WHOFXGATSA-N Ile-Phe-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](Cc1ccccc1)C(=O)NCC(O)=O UAELWXJFLZBKQS-WHOFXGATSA-N 0.000 description 1
- WYUHAXJAMDTOAU-IAVJCBSLSA-N Ile-Phe-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N WYUHAXJAMDTOAU-IAVJCBSLSA-N 0.000 description 1
- YKZAMJXNJUWFIK-JBDRJPRFSA-N Ile-Ser-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(=O)O)N YKZAMJXNJUWFIK-JBDRJPRFSA-N 0.000 description 1
- ZLFNNVATRMCAKN-ZKWXMUAHSA-N Ile-Ser-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)NCC(=O)O)N ZLFNNVATRMCAKN-ZKWXMUAHSA-N 0.000 description 1
- PELCGFMHLZXWBQ-BJDJZHNGSA-N Ile-Ser-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)O)N PELCGFMHLZXWBQ-BJDJZHNGSA-N 0.000 description 1
- ZDNNDIJTUHQCAM-MXAVVETBSA-N Ile-Ser-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N ZDNNDIJTUHQCAM-MXAVVETBSA-N 0.000 description 1
- RQJUKVXWAKJDBW-SVSWQMSJSA-N Ile-Ser-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N RQJUKVXWAKJDBW-SVSWQMSJSA-N 0.000 description 1
- NAFIFZNBSPWYOO-RWRJDSDZSA-N Ile-Thr-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N NAFIFZNBSPWYOO-RWRJDSDZSA-N 0.000 description 1
- HJDZMPFEXINXLO-QPHKQPEJSA-N Ile-Thr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N HJDZMPFEXINXLO-QPHKQPEJSA-N 0.000 description 1
- KBDIBHQICWDGDL-PPCPHDFISA-N Ile-Thr-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N KBDIBHQICWDGDL-PPCPHDFISA-N 0.000 description 1
- QHUREMVLLMNUAX-OSUNSFLBSA-N Ile-Thr-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)O)N QHUREMVLLMNUAX-OSUNSFLBSA-N 0.000 description 1
- YBHKCXNNNVDYEB-SPOWBLRKSA-N Ile-Trp-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CO)C(=O)O)N YBHKCXNNNVDYEB-SPOWBLRKSA-N 0.000 description 1
- HQLSBZFLOUHQJK-STECZYCISA-N Ile-Tyr-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N HQLSBZFLOUHQJK-STECZYCISA-N 0.000 description 1
- GNXGAVNTVNOCLL-SIUGBPQLSA-N Ile-Tyr-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N GNXGAVNTVNOCLL-SIUGBPQLSA-N 0.000 description 1
- ZUWSVOYKBCHLRR-MGHWNKPDSA-N Ile-Tyr-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCCN)C(=O)O)N ZUWSVOYKBCHLRR-MGHWNKPDSA-N 0.000 description 1
- NXRNRBOKDBIVKQ-CXTHYWKRSA-N Ile-Tyr-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N NXRNRBOKDBIVKQ-CXTHYWKRSA-N 0.000 description 1
- NJGXXYLPDMMFJB-XUXIUFHCSA-N Ile-Val-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N NJGXXYLPDMMFJB-XUXIUFHCSA-N 0.000 description 1
- RQZFWBLDTBDEOF-RNJOBUHISA-N Ile-Val-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N RQZFWBLDTBDEOF-RNJOBUHISA-N 0.000 description 1
- QSXSHZIRKTUXNG-STECZYCISA-N Ile-Val-Tyr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 QSXSHZIRKTUXNG-STECZYCISA-N 0.000 description 1
- 206010061217 Infestation Diseases 0.000 description 1
- 108700001097 Insect Genes Proteins 0.000 description 1
- OWYWGLHRNBIFJP-UHFFFAOYSA-N Ipazine Chemical compound CCN(CC)C1=NC(Cl)=NC(NC(C)C)=N1 OWYWGLHRNBIFJP-UHFFFAOYSA-N 0.000 description 1
- 240000007049 Juglans regia Species 0.000 description 1
- 235000009496 Juglans regia Nutrition 0.000 description 1
- 108010025815 Kanamycin Kinase Proteins 0.000 description 1
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 1
- LHSGPCFBGJHPCY-UHFFFAOYSA-N L-leucine-L-tyrosine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LHSGPCFBGJHPCY-UHFFFAOYSA-N 0.000 description 1
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 1
- FBOZXECLQNJBKD-ZDUSSCGKSA-N L-methotrexate Chemical compound C=1N=C2N=C(N)N=C(N)C2=NC=1CN(C)C1=CC=C(C(=O)N[C@@H](CCC(O)=O)C(O)=O)C=C1 FBOZXECLQNJBKD-ZDUSSCGKSA-N 0.000 description 1
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 1
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 1
- ZRLUISBDKUWAIZ-CIUDSAMLSA-N Leu-Ala-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O ZRLUISBDKUWAIZ-CIUDSAMLSA-N 0.000 description 1
- XIRYQRLFHWWWTC-QEJZJMRPSA-N Leu-Ala-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XIRYQRLFHWWWTC-QEJZJMRPSA-N 0.000 description 1
- BQSLGJHIAGOZCD-CIUDSAMLSA-N Leu-Ala-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O BQSLGJHIAGOZCD-CIUDSAMLSA-N 0.000 description 1
- GRZSCTXVCDUIPO-SRVKXCTJSA-N Leu-Arg-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O GRZSCTXVCDUIPO-SRVKXCTJSA-N 0.000 description 1
- FJUKMPUELVROGK-IHRRRGAJSA-N Leu-Arg-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N FJUKMPUELVROGK-IHRRRGAJSA-N 0.000 description 1
- UILIPCLTHRPCRB-XUXIUFHCSA-N Leu-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(C)C)N UILIPCLTHRPCRB-XUXIUFHCSA-N 0.000 description 1
- WUFYAPWIHCUMLL-CIUDSAMLSA-N Leu-Asn-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O WUFYAPWIHCUMLL-CIUDSAMLSA-N 0.000 description 1
- KKXDHFKZWKLYGB-GUBZILKMSA-N Leu-Asn-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N KKXDHFKZWKLYGB-GUBZILKMSA-N 0.000 description 1
- WGNOPSQMIQERPK-GARJFASQSA-N Leu-Asn-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N WGNOPSQMIQERPK-GARJFASQSA-N 0.000 description 1
- WGNOPSQMIQERPK-UHFFFAOYSA-N Leu-Asn-Pro Natural products CC(C)CC(N)C(=O)NC(CC(=O)N)C(=O)N1CCCC1C(=O)O WGNOPSQMIQERPK-UHFFFAOYSA-N 0.000 description 1
- OGCQGUIWMSBHRZ-CIUDSAMLSA-N Leu-Asn-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O OGCQGUIWMSBHRZ-CIUDSAMLSA-N 0.000 description 1
- TWQIYNGNYNJUFM-NHCYSSNCSA-N Leu-Asn-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O TWQIYNGNYNJUFM-NHCYSSNCSA-N 0.000 description 1
- PJYSOYLLTJKZHC-GUBZILKMSA-N Leu-Asp-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(N)=O PJYSOYLLTJKZHC-GUBZILKMSA-N 0.000 description 1
- ILJREDZFPHTUIE-GUBZILKMSA-N Leu-Asp-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ILJREDZFPHTUIE-GUBZILKMSA-N 0.000 description 1
- ULXYQAJWJGLCNR-YUMQZZPRSA-N Leu-Asp-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O ULXYQAJWJGLCNR-YUMQZZPRSA-N 0.000 description 1
- MYGQXVYRZMKRDB-SRVKXCTJSA-N Leu-Asp-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN MYGQXVYRZMKRDB-SRVKXCTJSA-N 0.000 description 1
- PVMPDMIKUVNOBD-CIUDSAMLSA-N Leu-Asp-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O PVMPDMIKUVNOBD-CIUDSAMLSA-N 0.000 description 1
- CLVUXCBGKUECIT-HJGDQZAQSA-N Leu-Asp-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CLVUXCBGKUECIT-HJGDQZAQSA-N 0.000 description 1
- GBDMISNMNXVTNV-XIRDDKMYSA-N Leu-Asp-Trp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O GBDMISNMNXVTNV-XIRDDKMYSA-N 0.000 description 1
- PPBKJAQJAUHZKX-SRVKXCTJSA-N Leu-Cys-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@H](C(O)=O)CC(C)C PPBKJAQJAUHZKX-SRVKXCTJSA-N 0.000 description 1
- ZTLGVASZOIKNIX-DCAQKATOSA-N Leu-Gln-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZTLGVASZOIKNIX-DCAQKATOSA-N 0.000 description 1
- BOFAFKVZQUMTID-AVGNSLFASA-N Leu-Gln-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N BOFAFKVZQUMTID-AVGNSLFASA-N 0.000 description 1
- GPICTNQYKHHHTH-GUBZILKMSA-N Leu-Gln-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O GPICTNQYKHHHTH-GUBZILKMSA-N 0.000 description 1
- WIDZHJTYKYBLSR-DCAQKATOSA-N Leu-Glu-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WIDZHJTYKYBLSR-DCAQKATOSA-N 0.000 description 1
- HPBCTWSUJOGJSH-MNXVOIDGSA-N Leu-Glu-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HPBCTWSUJOGJSH-MNXVOIDGSA-N 0.000 description 1
- OGUUKPXUTHOIAV-SDDRHHMPSA-N Leu-Glu-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N OGUUKPXUTHOIAV-SDDRHHMPSA-N 0.000 description 1
- LLBQJYDYOLIQAI-JYJNAYRXSA-N Leu-Glu-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O LLBQJYDYOLIQAI-JYJNAYRXSA-N 0.000 description 1
- OXRLYTYUXAQTHP-YUMQZZPRSA-N Leu-Gly-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](C)C(O)=O OXRLYTYUXAQTHP-YUMQZZPRSA-N 0.000 description 1
- LAPSXOAUPNOINL-YUMQZZPRSA-N Leu-Gly-Asp Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O LAPSXOAUPNOINL-YUMQZZPRSA-N 0.000 description 1
- VBZOAGIPCULURB-QWRGUYRKSA-N Leu-Gly-His Chemical compound CC(C)C[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N VBZOAGIPCULURB-QWRGUYRKSA-N 0.000 description 1
- CCQLQKZTXZBXTN-NHCYSSNCSA-N Leu-Gly-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O CCQLQKZTXZBXTN-NHCYSSNCSA-N 0.000 description 1
- HYIFFZAQXPUEAU-QWRGUYRKSA-N Leu-Gly-Leu Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(C)C HYIFFZAQXPUEAU-QWRGUYRKSA-N 0.000 description 1
- APFJUBGRZGMQFF-QWRGUYRKSA-N Leu-Gly-Lys Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN APFJUBGRZGMQFF-QWRGUYRKSA-N 0.000 description 1
- KEVYYIMVELOXCT-KBPBESRZSA-N Leu-Gly-Phe Chemical compound CC(C)C[C@H]([NH3+])C(=O)NCC(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 KEVYYIMVELOXCT-KBPBESRZSA-N 0.000 description 1
- YFBBUHJJUXXZOF-UWVGGRQHSA-N Leu-Gly-Pro Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N1CCC[C@H]1C(O)=O YFBBUHJJUXXZOF-UWVGGRQHSA-N 0.000 description 1
- VGPCJSXPPOQPBK-YUMQZZPRSA-N Leu-Gly-Ser Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O VGPCJSXPPOQPBK-YUMQZZPRSA-N 0.000 description 1
- HYMLKESRWLZDBR-WEDXCCLWSA-N Leu-Gly-Thr Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O HYMLKESRWLZDBR-WEDXCCLWSA-N 0.000 description 1
- POZULHZYLPGXMR-ONGXEEELSA-N Leu-Gly-Val Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O POZULHZYLPGXMR-ONGXEEELSA-N 0.000 description 1
- XQXGNBFMAXWIGI-MXAVVETBSA-N Leu-His-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CN=CN1 XQXGNBFMAXWIGI-MXAVVETBSA-N 0.000 description 1
- DBSLVQBXKVKDKJ-BJDJZHNGSA-N Leu-Ile-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O DBSLVQBXKVKDKJ-BJDJZHNGSA-N 0.000 description 1
- USLNHQZCDQJBOV-ZPFDUUQYSA-N Leu-Ile-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O USLNHQZCDQJBOV-ZPFDUUQYSA-N 0.000 description 1
- AUBMZAMQCOYSIC-MNXVOIDGSA-N Leu-Ile-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(N)=O)C(O)=O AUBMZAMQCOYSIC-MNXVOIDGSA-N 0.000 description 1
- KUIDCYNIEJBZBU-AJNGGQMLSA-N Leu-Ile-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O KUIDCYNIEJBZBU-AJNGGQMLSA-N 0.000 description 1
- QLDHBYRUNQZIJQ-DKIMLUQUSA-N Leu-Ile-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QLDHBYRUNQZIJQ-DKIMLUQUSA-N 0.000 description 1
- JKSIBWITFMQTOA-XUXIUFHCSA-N Leu-Ile-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O JKSIBWITFMQTOA-XUXIUFHCSA-N 0.000 description 1
- PDQDCFBVYXEFSD-SRVKXCTJSA-N Leu-Leu-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O PDQDCFBVYXEFSD-SRVKXCTJSA-N 0.000 description 1
- QNBVTHNJGCOVFA-AVGNSLFASA-N Leu-Leu-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCC(O)=O QNBVTHNJGCOVFA-AVGNSLFASA-N 0.000 description 1
- YOKVEHGYYQEQOP-QWRGUYRKSA-N Leu-Leu-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YOKVEHGYYQEQOP-QWRGUYRKSA-N 0.000 description 1
- XVZCXCTYGHPNEM-UHFFFAOYSA-N Leu-Leu-Pro Natural products CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)N1CCCC1C(O)=O XVZCXCTYGHPNEM-UHFFFAOYSA-N 0.000 description 1
- FOBUGKUBUJOWAD-IHPCNDPISA-N Leu-Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(C)C)C(O)=O)=CNC2=C1 FOBUGKUBUJOWAD-IHPCNDPISA-N 0.000 description 1
- ZRHDPZAAWLXXIR-SRVKXCTJSA-N Leu-Lys-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O ZRHDPZAAWLXXIR-SRVKXCTJSA-N 0.000 description 1
- HVHRPWQEQHIQJF-AVGNSLFASA-N Leu-Lys-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HVHRPWQEQHIQJF-AVGNSLFASA-N 0.000 description 1
- BGZCJDGBBUUBHA-KKUMJFAQSA-N Leu-Lys-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O BGZCJDGBBUUBHA-KKUMJFAQSA-N 0.000 description 1
- RTIRBWJPYJYTLO-MELADBBJSA-N Leu-Lys-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N RTIRBWJPYJYTLO-MELADBBJSA-N 0.000 description 1
- VCHVSKNMTXWIIP-SRVKXCTJSA-N Leu-Lys-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O VCHVSKNMTXWIIP-SRVKXCTJSA-N 0.000 description 1
- ONPJGOIVICHWBW-BZSNNMDCSA-N Leu-Lys-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 ONPJGOIVICHWBW-BZSNNMDCSA-N 0.000 description 1
- POMXSEDNUXYPGK-IHRRRGAJSA-N Leu-Met-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N POMXSEDNUXYPGK-IHRRRGAJSA-N 0.000 description 1
- DDVHDMSBLRAKNV-IHRRRGAJSA-N Leu-Met-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O DDVHDMSBLRAKNV-IHRRRGAJSA-N 0.000 description 1
- DRWMRVFCKKXHCH-BZSNNMDCSA-N Leu-Phe-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=CC=C1 DRWMRVFCKKXHCH-BZSNNMDCSA-N 0.000 description 1
- PTRKPHUGYULXPU-KKUMJFAQSA-N Leu-Phe-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O PTRKPHUGYULXPU-KKUMJFAQSA-N 0.000 description 1
- WMIOEVKKYIMVKI-DCAQKATOSA-N Leu-Pro-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O WMIOEVKKYIMVKI-DCAQKATOSA-N 0.000 description 1
- VULJUQZPSOASBZ-SRVKXCTJSA-N Leu-Pro-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O VULJUQZPSOASBZ-SRVKXCTJSA-N 0.000 description 1
- XXXXOVFBXRERQL-ULQDDVLXSA-N Leu-Pro-Phe Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XXXXOVFBXRERQL-ULQDDVLXSA-N 0.000 description 1
- CHJKEDSZNSONPS-DCAQKATOSA-N Leu-Pro-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O CHJKEDSZNSONPS-DCAQKATOSA-N 0.000 description 1
- JDBQSGMJBMPNFT-AVGNSLFASA-N Leu-Pro-Val Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O JDBQSGMJBMPNFT-AVGNSLFASA-N 0.000 description 1
- IZPVWNSAVUQBGP-CIUDSAMLSA-N Leu-Ser-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O IZPVWNSAVUQBGP-CIUDSAMLSA-N 0.000 description 1
- JIHDFWWRYHSAQB-GUBZILKMSA-N Leu-Ser-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O JIHDFWWRYHSAQB-GUBZILKMSA-N 0.000 description 1
- ADJWHHZETYAAAX-SRVKXCTJSA-N Leu-Ser-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N ADJWHHZETYAAAX-SRVKXCTJSA-N 0.000 description 1
- GOFJOGXGMPHOGL-DCAQKATOSA-N Leu-Ser-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(C)C GOFJOGXGMPHOGL-DCAQKATOSA-N 0.000 description 1
- SBANPBVRHYIMRR-GARJFASQSA-N Leu-Ser-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N SBANPBVRHYIMRR-GARJFASQSA-N 0.000 description 1
- PPGBXYKMUMHFBF-KATARQTJSA-N Leu-Ser-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PPGBXYKMUMHFBF-KATARQTJSA-N 0.000 description 1
- HWMQRQIFVGEAPH-XIRDDKMYSA-N Leu-Ser-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(C)C)C(O)=O)=CNC2=C1 HWMQRQIFVGEAPH-XIRDDKMYSA-N 0.000 description 1
- SQUFDMCWMFOEBA-KKUMJFAQSA-N Leu-Ser-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 SQUFDMCWMFOEBA-KKUMJFAQSA-N 0.000 description 1
- ZDJQVSIPFLMNOX-RHYQMDGZSA-N Leu-Thr-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N ZDJQVSIPFLMNOX-RHYQMDGZSA-N 0.000 description 1
- LJBVRCDPWOJOEK-PPCPHDFISA-N Leu-Thr-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LJBVRCDPWOJOEK-PPCPHDFISA-N 0.000 description 1
- QWWPYKKLXWOITQ-VOAKCMCISA-N Leu-Thr-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(C)C QWWPYKKLXWOITQ-VOAKCMCISA-N 0.000 description 1
- KLSUAWUZBMAZCL-RHYQMDGZSA-N Leu-Thr-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(O)=O KLSUAWUZBMAZCL-RHYQMDGZSA-N 0.000 description 1
- GZRABTMNWJXFMH-UVOCVTCTSA-N Leu-Thr-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GZRABTMNWJXFMH-UVOCVTCTSA-N 0.000 description 1
- AIQWYVFNBNNOLU-RHYQMDGZSA-N Leu-Thr-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O AIQWYVFNBNNOLU-RHYQMDGZSA-N 0.000 description 1
- ISSAURVGLGAPDK-KKUMJFAQSA-N Leu-Tyr-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O ISSAURVGLGAPDK-KKUMJFAQSA-N 0.000 description 1
- VJGQRELPQWNURN-JYJNAYRXSA-N Leu-Tyr-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O VJGQRELPQWNURN-JYJNAYRXSA-N 0.000 description 1
- JGKHAFUAPZCCDU-BZSNNMDCSA-N Leu-Tyr-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=C(O)C=C1 JGKHAFUAPZCCDU-BZSNNMDCSA-N 0.000 description 1
- VQHUBNVKFFLWRP-ULQDDVLXSA-N Leu-Tyr-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=C(O)C=C1 VQHUBNVKFFLWRP-ULQDDVLXSA-N 0.000 description 1
- AIMGJYMCTAABEN-GVXVVHGQSA-N Leu-Val-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O AIMGJYMCTAABEN-GVXVVHGQSA-N 0.000 description 1
- YQFZRHYZLARWDY-IHRRRGAJSA-N Leu-Val-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN YQFZRHYZLARWDY-IHRRRGAJSA-N 0.000 description 1
- VKVDRTGWLVZJOM-DCAQKATOSA-N Leu-Val-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O VKVDRTGWLVZJOM-DCAQKATOSA-N 0.000 description 1
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 1
- 241000192132 Leuconostoc Species 0.000 description 1
- 241000209510 Liliopsida Species 0.000 description 1
- 108060001084 Luciferase Proteins 0.000 description 1
- NFLFJGGKOHYZJF-BJDJZHNGSA-N Lys-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN NFLFJGGKOHYZJF-BJDJZHNGSA-N 0.000 description 1
- NQCJGQHHYZNUDK-DCAQKATOSA-N Lys-Arg-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CCCN=C(N)N NQCJGQHHYZNUDK-DCAQKATOSA-N 0.000 description 1
- SWWCDAGDQHTKIE-RHYQMDGZSA-N Lys-Arg-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SWWCDAGDQHTKIE-RHYQMDGZSA-N 0.000 description 1
- DGAAQRAUOFHBFJ-CIUDSAMLSA-N Lys-Asn-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O DGAAQRAUOFHBFJ-CIUDSAMLSA-N 0.000 description 1
- FACUGMGEFUEBTI-SRVKXCTJSA-N Lys-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCCCN FACUGMGEFUEBTI-SRVKXCTJSA-N 0.000 description 1
- ZQCVMVCVPFYXHZ-SRVKXCTJSA-N Lys-Asn-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCCN ZQCVMVCVPFYXHZ-SRVKXCTJSA-N 0.000 description 1
- YVSHZSUKQHNDHD-KKUMJFAQSA-N Lys-Asn-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N YVSHZSUKQHNDHD-KKUMJFAQSA-N 0.000 description 1
- LMVOVCYVZBBWQB-SRVKXCTJSA-N Lys-Asp-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN LMVOVCYVZBBWQB-SRVKXCTJSA-N 0.000 description 1
- YEIYAQQKADPIBJ-GARJFASQSA-N Lys-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCCN)N)C(=O)O YEIYAQQKADPIBJ-GARJFASQSA-N 0.000 description 1
- DFXQCCBKGUNYGG-GUBZILKMSA-N Lys-Gln-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCCCN DFXQCCBKGUNYGG-GUBZILKMSA-N 0.000 description 1
- WTZUSCUIVPVCRH-SRVKXCTJSA-N Lys-Gln-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N WTZUSCUIVPVCRH-SRVKXCTJSA-N 0.000 description 1
- YFGWNAROEYWGNL-GUBZILKMSA-N Lys-Gln-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O YFGWNAROEYWGNL-GUBZILKMSA-N 0.000 description 1
- RZHLIPMZXOEJTL-AVGNSLFASA-N Lys-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCCN)N RZHLIPMZXOEJTL-AVGNSLFASA-N 0.000 description 1
- IMAKMJCBYCSMHM-AVGNSLFASA-N Lys-Glu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN IMAKMJCBYCSMHM-AVGNSLFASA-N 0.000 description 1
- DUTMKEAPLLUGNO-JYJNAYRXSA-N Lys-Glu-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O DUTMKEAPLLUGNO-JYJNAYRXSA-N 0.000 description 1
- VEGLGAOVLFODGC-GUBZILKMSA-N Lys-Glu-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O VEGLGAOVLFODGC-GUBZILKMSA-N 0.000 description 1
- ODUQLUADRKMHOZ-JYJNAYRXSA-N Lys-Glu-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCCN)N)O ODUQLUADRKMHOZ-JYJNAYRXSA-N 0.000 description 1
- GQZMPWBZQALKJO-UWVGGRQHSA-N Lys-Gly-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O GQZMPWBZQALKJO-UWVGGRQHSA-N 0.000 description 1
- ISHNZELVUVPCHY-ZETCQYMHSA-N Lys-Gly-Gly Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)NCC(O)=O ISHNZELVUVPCHY-ZETCQYMHSA-N 0.000 description 1
- DTUZCYRNEJDKSR-NHCYSSNCSA-N Lys-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN DTUZCYRNEJDKSR-NHCYSSNCSA-N 0.000 description 1
- NKKFVJRLCCUJNA-QWRGUYRKSA-N Lys-Gly-Lys Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN NKKFVJRLCCUJNA-QWRGUYRKSA-N 0.000 description 1
- PBLLTSKBTAHDNA-KBPBESRZSA-N Lys-Gly-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PBLLTSKBTAHDNA-KBPBESRZSA-N 0.000 description 1
- IUWMQCZOTYRXPL-ZPFDUUQYSA-N Lys-Ile-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O IUWMQCZOTYRXPL-ZPFDUUQYSA-N 0.000 description 1
- YWJQHDDBFAXNIR-MXAVVETBSA-N Lys-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCCCN)N YWJQHDDBFAXNIR-MXAVVETBSA-N 0.000 description 1
- JYXBNQOKPRQNQS-YTFOTSKYSA-N Lys-Ile-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JYXBNQOKPRQNQS-YTFOTSKYSA-N 0.000 description 1
- NCZIQZYZPUPMKY-PPCPHDFISA-N Lys-Ile-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NCZIQZYZPUPMKY-PPCPHDFISA-N 0.000 description 1
- MUXNCRWTWBMNHX-SRVKXCTJSA-N Lys-Leu-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O MUXNCRWTWBMNHX-SRVKXCTJSA-N 0.000 description 1
- SKRGVGLIRUGANF-AVGNSLFASA-N Lys-Leu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SKRGVGLIRUGANF-AVGNSLFASA-N 0.000 description 1
- WRODMZBHNNPRLN-SRVKXCTJSA-N Lys-Leu-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O WRODMZBHNNPRLN-SRVKXCTJSA-N 0.000 description 1
- OIQSIMFSVLLWBX-VOAKCMCISA-N Lys-Leu-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OIQSIMFSVLLWBX-VOAKCMCISA-N 0.000 description 1
- XOQMURBBIXRRCR-SRVKXCTJSA-N Lys-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN XOQMURBBIXRRCR-SRVKXCTJSA-N 0.000 description 1
- ALGGDNMLQNFVIZ-SRVKXCTJSA-N Lys-Lys-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N ALGGDNMLQNFVIZ-SRVKXCTJSA-N 0.000 description 1
- PLDJDCJLRCYPJB-VOAKCMCISA-N Lys-Lys-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PLDJDCJLRCYPJB-VOAKCMCISA-N 0.000 description 1
- BXPHMHQHYHILBB-BZSNNMDCSA-N Lys-Lys-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O BXPHMHQHYHILBB-BZSNNMDCSA-N 0.000 description 1
- VSTNAUBHKQPVJX-IHRRRGAJSA-N Lys-Met-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O VSTNAUBHKQPVJX-IHRRRGAJSA-N 0.000 description 1
- MTBLFIQZECOEBY-IHRRRGAJSA-N Lys-Met-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(O)=O MTBLFIQZECOEBY-IHRRRGAJSA-N 0.000 description 1
- KVNLHIXLLZBAFQ-RWMBFGLXSA-N Lys-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N KVNLHIXLLZBAFQ-RWMBFGLXSA-N 0.000 description 1
- LUAJJLPHUXPQLH-KKUMJFAQSA-N Lys-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCCCN)N LUAJJLPHUXPQLH-KKUMJFAQSA-N 0.000 description 1
- AFLBTVGQCQLOFJ-AVGNSLFASA-N Lys-Pro-Arg Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O AFLBTVGQCQLOFJ-AVGNSLFASA-N 0.000 description 1
- SVSQSPICRKBMSZ-SRVKXCTJSA-N Lys-Pro-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O SVSQSPICRKBMSZ-SRVKXCTJSA-N 0.000 description 1
- CNGOEHJCLVCJHN-SRVKXCTJSA-N Lys-Pro-Glu Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O CNGOEHJCLVCJHN-SRVKXCTJSA-N 0.000 description 1
- QBHGXFQJFPWJIH-XUXIUFHCSA-N Lys-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCCN QBHGXFQJFPWJIH-XUXIUFHCSA-N 0.000 description 1
- ZUGVARDEGWMMLK-SRVKXCTJSA-N Lys-Ser-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN ZUGVARDEGWMMLK-SRVKXCTJSA-N 0.000 description 1
- MIFFFXHMAHFACR-KATARQTJSA-N Lys-Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CCCCN MIFFFXHMAHFACR-KATARQTJSA-N 0.000 description 1
- PLOUVAYOMTYJRG-JXUBOQSCSA-N Lys-Thr-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O PLOUVAYOMTYJRG-JXUBOQSCSA-N 0.000 description 1
- QVTDVTONTRSQMF-WDCWCFNPSA-N Lys-Thr-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CCCCN QVTDVTONTRSQMF-WDCWCFNPSA-N 0.000 description 1
- DLCAXBGXGOVUCD-PPCPHDFISA-N Lys-Thr-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DLCAXBGXGOVUCD-PPCPHDFISA-N 0.000 description 1
- RPWTZTBIFGENIA-VOAKCMCISA-N Lys-Thr-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O RPWTZTBIFGENIA-VOAKCMCISA-N 0.000 description 1
- CAVRAQIDHUPECU-UVOCVTCTSA-N Lys-Thr-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAVRAQIDHUPECU-UVOCVTCTSA-N 0.000 description 1
- GVKINWYYLOLEFQ-XIRDDKMYSA-N Lys-Trp-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(O)=O GVKINWYYLOLEFQ-XIRDDKMYSA-N 0.000 description 1
- PELXPRPDQRFBGQ-KKUMJFAQSA-N Lys-Tyr-Asn Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N)O PELXPRPDQRFBGQ-KKUMJFAQSA-N 0.000 description 1
- VVURYEVJJTXWNE-ULQDDVLXSA-N Lys-Tyr-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O VVURYEVJJTXWNE-ULQDDVLXSA-N 0.000 description 1
- VKCPHIOZDWUFSW-ONGXEEELSA-N Lys-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN VKCPHIOZDWUFSW-ONGXEEELSA-N 0.000 description 1
- RPWQJSBMXJSCPD-XUXIUFHCSA-N Lys-Val-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CCCCN)C(C)C)C(O)=O RPWQJSBMXJSCPD-XUXIUFHCSA-N 0.000 description 1
- VWJFOUBDZIUXGA-AVGNSLFASA-N Lys-Val-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CCCCN)N VWJFOUBDZIUXGA-AVGNSLFASA-N 0.000 description 1
- OZVXDDFYCQOPFD-XQQFMLRXSA-N Lys-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N OZVXDDFYCQOPFD-XQQFMLRXSA-N 0.000 description 1
- 101000819644 Lysinibacillus sphaericus UPF0309 protein in nagA 3'region Proteins 0.000 description 1
- 241000702489 Maize streak virus Species 0.000 description 1
- KUQWVNFMZLHAPA-CIUDSAMLSA-N Met-Ala-Gln Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O KUQWVNFMZLHAPA-CIUDSAMLSA-N 0.000 description 1
- ONGCSGVHCSAATF-CIUDSAMLSA-N Met-Ala-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O ONGCSGVHCSAATF-CIUDSAMLSA-N 0.000 description 1
- NSGXXVIHCIAISP-CIUDSAMLSA-N Met-Asn-Gln Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O NSGXXVIHCIAISP-CIUDSAMLSA-N 0.000 description 1
- OSOLWRWQADPDIQ-DCAQKATOSA-N Met-Asp-Leu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O OSOLWRWQADPDIQ-DCAQKATOSA-N 0.000 description 1
- XOMXAVJBLRROMC-IHRRRGAJSA-N Met-Asp-Phe Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XOMXAVJBLRROMC-IHRRRGAJSA-N 0.000 description 1
- HLYIDXAXQIJYIG-CIUDSAMLSA-N Met-Gln-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HLYIDXAXQIJYIG-CIUDSAMLSA-N 0.000 description 1
- JYCQGAGDJQYEDB-GUBZILKMSA-N Met-Gln-Gln Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O JYCQGAGDJQYEDB-GUBZILKMSA-N 0.000 description 1
- GXYYFDKJHLRNSI-SRVKXCTJSA-N Met-Gln-His Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CNC=N1)C(O)=O GXYYFDKJHLRNSI-SRVKXCTJSA-N 0.000 description 1
- MTBVQFFQMXHCPC-CIUDSAMLSA-N Met-Glu-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MTBVQFFQMXHCPC-CIUDSAMLSA-N 0.000 description 1
- JPCHYAUKOUGOIB-HJGDQZAQSA-N Met-Glu-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JPCHYAUKOUGOIB-HJGDQZAQSA-N 0.000 description 1
- WWWGMQHQSAUXBU-BQBZGAKWSA-N Met-Gly-Asn Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(N)=O WWWGMQHQSAUXBU-BQBZGAKWSA-N 0.000 description 1
- MVBZBRKNZVJEKK-DTWKUNHWSA-N Met-Gly-Pro Chemical compound CSCC[C@@H](C(=O)NCC(=O)N1CCC[C@@H]1C(=O)O)N MVBZBRKNZVJEKK-DTWKUNHWSA-N 0.000 description 1
- OBCRZLRPJFNLAN-DCAQKATOSA-N Met-His-Asp Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(O)=O OBCRZLRPJFNLAN-DCAQKATOSA-N 0.000 description 1
- RRIHXWPHQSXHAQ-XUXIUFHCSA-N Met-Ile-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCCN)C(O)=O RRIHXWPHQSXHAQ-XUXIUFHCSA-N 0.000 description 1
- ZIIMORLEZLVRIP-SRVKXCTJSA-N Met-Leu-Gln Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZIIMORLEZLVRIP-SRVKXCTJSA-N 0.000 description 1
- SODXFJOPSCXOHE-IHRRRGAJSA-N Met-Leu-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O SODXFJOPSCXOHE-IHRRRGAJSA-N 0.000 description 1
- AWGBEIYZPAXXSX-RWMBFGLXSA-N Met-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCSC)N AWGBEIYZPAXXSX-RWMBFGLXSA-N 0.000 description 1
- DBXMFHGGHMXYHY-DCAQKATOSA-N Met-Leu-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O DBXMFHGGHMXYHY-DCAQKATOSA-N 0.000 description 1
- BEZJTLKUMFMITF-AVGNSLFASA-N Met-Lys-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CCCNC(N)=N BEZJTLKUMFMITF-AVGNSLFASA-N 0.000 description 1
- JCMMNFZUKMMECJ-DCAQKATOSA-N Met-Lys-Asn Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O JCMMNFZUKMMECJ-DCAQKATOSA-N 0.000 description 1
- HAQLBBVZAGMESV-IHRRRGAJSA-N Met-Lys-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O HAQLBBVZAGMESV-IHRRRGAJSA-N 0.000 description 1
- VBGGTAPDGFQMKF-AVGNSLFASA-N Met-Lys-Met Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(O)=O VBGGTAPDGFQMKF-AVGNSLFASA-N 0.000 description 1
- UDOYVQQKQHZYMB-DCAQKATOSA-N Met-Met-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(O)=O)C(O)=O UDOYVQQKQHZYMB-DCAQKATOSA-N 0.000 description 1
- WUYLWZRHRLLEGB-AVGNSLFASA-N Met-Met-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O WUYLWZRHRLLEGB-AVGNSLFASA-N 0.000 description 1
- OBPCXINRFKHSRY-SDDRHHMPSA-N Met-Met-Pro Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCSC)C(=O)N1CCC[C@@H]1C(=O)O)N OBPCXINRFKHSRY-SDDRHHMPSA-N 0.000 description 1
- OIFHHODAXVWKJN-ULQDDVLXSA-N Met-Phe-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)CC1=CC=CC=C1 OIFHHODAXVWKJN-ULQDDVLXSA-N 0.000 description 1
- ZDJICAUBMUKVEJ-CIUDSAMLSA-N Met-Ser-Gln Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(N)=O ZDJICAUBMUKVEJ-CIUDSAMLSA-N 0.000 description 1
- MIXPUVSPPOWTCR-FXQIFTODSA-N Met-Ser-Ser Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O MIXPUVSPPOWTCR-FXQIFTODSA-N 0.000 description 1
- WXJLBSXNUHIGSS-OSUNSFLBSA-N Met-Thr-Ile Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WXJLBSXNUHIGSS-OSUNSFLBSA-N 0.000 description 1
- QQPMHUCGDRJFQK-RHYQMDGZSA-N Met-Thr-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(C)C QQPMHUCGDRJFQK-RHYQMDGZSA-N 0.000 description 1
- NDJSSFWDYDUQID-YTWAJWBKSA-N Met-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCSC)N)O NDJSSFWDYDUQID-YTWAJWBKSA-N 0.000 description 1
- DOQXHOUYYSPISL-SZMVWBNQSA-N Met-Trp-Met Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CCSC)C(=O)O)N DOQXHOUYYSPISL-SZMVWBNQSA-N 0.000 description 1
- VOAKKHOIAFKOQZ-JYJNAYRXSA-N Met-Tyr-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CCSC)CC1=CC=C(O)C=C1 VOAKKHOIAFKOQZ-JYJNAYRXSA-N 0.000 description 1
- YJNDFEWPGLNLNH-IHRRRGAJSA-N Met-Tyr-Cys Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CS)C(O)=O)CC1=CC=C(O)C=C1 YJNDFEWPGLNLNH-IHRRRGAJSA-N 0.000 description 1
- PNHRPOWKRRJATF-IHRRRGAJSA-N Met-Tyr-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=C(O)C=C1 PNHRPOWKRRJATF-IHRRRGAJSA-N 0.000 description 1
- FSTWDRPCQQUJIT-NHCYSSNCSA-N Met-Val-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCSC)N FSTWDRPCQQUJIT-NHCYSSNCSA-N 0.000 description 1
- PVSPJQWHEIQTEH-JYJNAYRXSA-N Met-Val-Tyr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 PVSPJQWHEIQTEH-JYJNAYRXSA-N 0.000 description 1
- 101000751492 Mycoplasma hominis (strain ATCC 23114 / NBRC 14850 / NCTC 10111 / PG21) Uncharacterized protein MHO_3790 Proteins 0.000 description 1
- WUGMRIBZSVSJNP-UHFFFAOYSA-N N-L-alanyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C)C(O)=O)=CNC2=C1 WUGMRIBZSVSJNP-UHFFFAOYSA-N 0.000 description 1
- AUEJLPRZGVVDNU-UHFFFAOYSA-N N-L-tyrosyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 AUEJLPRZGVVDNU-UHFFFAOYSA-N 0.000 description 1
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 1
- 108010079364 N-glycylalanine Proteins 0.000 description 1
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 1
- 241001443590 Naganishia albida Species 0.000 description 1
- 241000033319 Naganishia diffluens Species 0.000 description 1
- 229920002274 Nalgene Polymers 0.000 description 1
- 101710202365 Napin Proteins 0.000 description 1
- 229930193140 Neomycin Natural products 0.000 description 1
- 101150073872 ORF3 gene Proteins 0.000 description 1
- RCEAADKTGXTDOA-UHFFFAOYSA-N OS(O)(=O)=O.CCCCCCCCCCCC[Na] Chemical compound OS(O)(=O)=O.CCCCCCCCCCCC[Na] RCEAADKTGXTDOA-UHFFFAOYSA-N 0.000 description 1
- 101710089395 Oleosin Proteins 0.000 description 1
- 108091034117 Oligonucleotide Proteins 0.000 description 1
- 240000007594 Oryza sativa Species 0.000 description 1
- 235000007164 Oryza sativa Nutrition 0.000 description 1
- 241000193418 Paenibacillus larvae Species 0.000 description 1
- 241000269800 Percidae Species 0.000 description 1
- ULECEJGNDHWSKD-QEJZJMRPSA-N Phe-Ala-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 ULECEJGNDHWSKD-QEJZJMRPSA-N 0.000 description 1
- LBSARGIQACMGDF-WBAXXEDZSA-N Phe-Ala-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 LBSARGIQACMGDF-WBAXXEDZSA-N 0.000 description 1
- BKWJQWJPZMUWEG-LFSVMHDDSA-N Phe-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 BKWJQWJPZMUWEG-LFSVMHDDSA-N 0.000 description 1
- LZDIENNKWVXJMX-JYJNAYRXSA-N Phe-Arg-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CC1=CC=CC=C1 LZDIENNKWVXJMX-JYJNAYRXSA-N 0.000 description 1
- MQWISMJKHOUEMW-ULQDDVLXSA-N Phe-Arg-His Chemical compound C([C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC=1NC=NC=1)C(O)=O)C1=CC=CC=C1 MQWISMJKHOUEMW-ULQDDVLXSA-N 0.000 description 1
- AGYXCMYVTBYGCT-ULQDDVLXSA-N Phe-Arg-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O AGYXCMYVTBYGCT-ULQDDVLXSA-N 0.000 description 1
- LJUUGSWZPQOJKD-JYJNAYRXSA-N Phe-Arg-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)Cc1ccccc1)C(O)=O LJUUGSWZPQOJKD-JYJNAYRXSA-N 0.000 description 1
- QCHNRQQVLJYDSI-DLOVCJGASA-N Phe-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 QCHNRQQVLJYDSI-DLOVCJGASA-N 0.000 description 1
- HXSUFWQYLPKEHF-IHRRRGAJSA-N Phe-Asn-Arg Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N HXSUFWQYLPKEHF-IHRRRGAJSA-N 0.000 description 1
- HCTXJGRYAACKOB-SRVKXCTJSA-N Phe-Asn-Asp Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N HCTXJGRYAACKOB-SRVKXCTJSA-N 0.000 description 1
- MRNRMSDVVSKPGM-AVGNSLFASA-N Phe-Asn-Gln Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MRNRMSDVVSKPGM-AVGNSLFASA-N 0.000 description 1
- WGXOKDLDIWSOCV-MELADBBJSA-N Phe-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O WGXOKDLDIWSOCV-MELADBBJSA-N 0.000 description 1
- CDNPIRSCAFMMBE-SRVKXCTJSA-N Phe-Asn-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O CDNPIRSCAFMMBE-SRVKXCTJSA-N 0.000 description 1
- JIYJYFIXQTYDNF-YDHLFZDLSA-N Phe-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CC=CC=C1)N JIYJYFIXQTYDNF-YDHLFZDLSA-N 0.000 description 1
- LDSOBEJVGGVWGD-DLOVCJGASA-N Phe-Asp-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 LDSOBEJVGGVWGD-DLOVCJGASA-N 0.000 description 1
- WMGVYPPIMZPWPN-SRVKXCTJSA-N Phe-Asp-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N WMGVYPPIMZPWPN-SRVKXCTJSA-N 0.000 description 1
- ZENDEDYRYVHBEG-SRVKXCTJSA-N Phe-Asp-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 ZENDEDYRYVHBEG-SRVKXCTJSA-N 0.000 description 1
- DDYIRGBOZVKRFR-AVGNSLFASA-N Phe-Asp-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N DDYIRGBOZVKRFR-AVGNSLFASA-N 0.000 description 1
- WIVCOAKLPICYGY-KKUMJFAQSA-N Phe-Asp-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N WIVCOAKLPICYGY-KKUMJFAQSA-N 0.000 description 1
- MQVFHOPCKNTHGT-MELADBBJSA-N Phe-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O MQVFHOPCKNTHGT-MELADBBJSA-N 0.000 description 1
- CUMXHKAOHNWRFQ-BZSNNMDCSA-N Phe-Asp-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 CUMXHKAOHNWRFQ-BZSNNMDCSA-N 0.000 description 1
- FSPGBMWPNMRWDB-AVGNSLFASA-N Phe-Cys-Gln Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N FSPGBMWPNMRWDB-AVGNSLFASA-N 0.000 description 1
- FIRWJEJVFFGXSH-RYUDHWBXSA-N Phe-Glu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 FIRWJEJVFFGXSH-RYUDHWBXSA-N 0.000 description 1
- NAXPHWZXEXNDIW-JTQLQIEISA-N Phe-Gly-Gly Chemical compound OC(=O)CNC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 NAXPHWZXEXNDIW-JTQLQIEISA-N 0.000 description 1
- HBGFEEQFVBWYJQ-KBPBESRZSA-N Phe-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 HBGFEEQFVBWYJQ-KBPBESRZSA-N 0.000 description 1
- WFHRXJOZEXUKLV-IRXDYDNUSA-N Phe-Gly-Tyr Chemical compound C([C@H](N)C(=O)NCC(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 WFHRXJOZEXUKLV-IRXDYDNUSA-N 0.000 description 1
- PMKIMKUGCSVFSV-CQDKDKBSSA-N Phe-His-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC2=CC=CC=C2)N PMKIMKUGCSVFSV-CQDKDKBSSA-N 0.000 description 1
- ISYSEOWLRQKQEQ-JYJNAYRXSA-N Phe-His-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(O)=O ISYSEOWLRQKQEQ-JYJNAYRXSA-N 0.000 description 1
- PBXYXOAEQQUVMM-ULQDDVLXSA-N Phe-His-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC2=CC=CC=C2)N PBXYXOAEQQUVMM-ULQDDVLXSA-N 0.000 description 1
- BEEVXUYVEHXWRQ-YESZJQIVSA-N Phe-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CC3=CC=CC=C3)N)C(=O)O BEEVXUYVEHXWRQ-YESZJQIVSA-N 0.000 description 1
- DVOCGBNHAUHKHJ-DKIMLUQUSA-N Phe-Ile-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O DVOCGBNHAUHKHJ-DKIMLUQUSA-N 0.000 description 1
- KXUZHWXENMYOHC-QEJZJMRPSA-N Phe-Leu-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O KXUZHWXENMYOHC-QEJZJMRPSA-N 0.000 description 1
- YTILBRIUASDGBL-BZSNNMDCSA-N Phe-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 YTILBRIUASDGBL-BZSNNMDCSA-N 0.000 description 1
- KPEIBEPEUAZWNS-ULQDDVLXSA-N Phe-Leu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 KPEIBEPEUAZWNS-ULQDDVLXSA-N 0.000 description 1
- OSBADCBXAMSPQD-YESZJQIVSA-N Phe-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N OSBADCBXAMSPQD-YESZJQIVSA-N 0.000 description 1
- RMKGXGPQIPLTFC-KKUMJFAQSA-N Phe-Lys-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O RMKGXGPQIPLTFC-KKUMJFAQSA-N 0.000 description 1
- WLYPRKLMRIYGPP-JYJNAYRXSA-N Phe-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 WLYPRKLMRIYGPP-JYJNAYRXSA-N 0.000 description 1
- DOXQMJCSSYZSNM-BZSNNMDCSA-N Phe-Lys-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O DOXQMJCSSYZSNM-BZSNNMDCSA-N 0.000 description 1
- BNRFQGLWLQESBG-YESZJQIVSA-N Phe-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O BNRFQGLWLQESBG-YESZJQIVSA-N 0.000 description 1
- VHDNDCPMHQMXIR-IHRRRGAJSA-N Phe-Met-Cys Chemical compound SC[C@@H](C(O)=O)NC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CC1=CC=CC=C1 VHDNDCPMHQMXIR-IHRRRGAJSA-N 0.000 description 1
- QRUOLOPKCOEZKU-HJWJTTGWSA-N Phe-Met-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CC1=CC=CC=C1)N QRUOLOPKCOEZKU-HJWJTTGWSA-N 0.000 description 1
- WURZLPSMYZLEGH-UNQGMJICSA-N Phe-Met-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CC1=CC=CC=C1)N)O WURZLPSMYZLEGH-UNQGMJICSA-N 0.000 description 1
- ROOQMPCUFLDOSB-FHWLQOOXSA-N Phe-Phe-Gln Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CCC(N)=O)C(O)=O)C1=CC=CC=C1 ROOQMPCUFLDOSB-FHWLQOOXSA-N 0.000 description 1
- CBENHWCORLVGEQ-HJOGWXRNSA-N Phe-Phe-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 CBENHWCORLVGEQ-HJOGWXRNSA-N 0.000 description 1
- MMJJFXWMCMJMQA-STQMWFEESA-N Phe-Pro-Gly Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)NCC(O)=O)C1=CC=CC=C1 MMJJFXWMCMJMQA-STQMWFEESA-N 0.000 description 1
- FKFCKDROTNIVSO-JYJNAYRXSA-N Phe-Pro-Met Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCSC)C(O)=O FKFCKDROTNIVSO-JYJNAYRXSA-N 0.000 description 1
- XOHJOMKCRLHGCY-UNQGMJICSA-N Phe-Pro-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O XOHJOMKCRLHGCY-UNQGMJICSA-N 0.000 description 1
- ZLAKUZDMKVKFAI-JYJNAYRXSA-N Phe-Pro-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O ZLAKUZDMKVKFAI-JYJNAYRXSA-N 0.000 description 1
- WEDZFLRYSIDIRX-IHRRRGAJSA-N Phe-Ser-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=CC=C1 WEDZFLRYSIDIRX-IHRRRGAJSA-N 0.000 description 1
- HBXAOEBRGLCLIW-AVGNSLFASA-N Phe-Ser-Gln Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N HBXAOEBRGLCLIW-AVGNSLFASA-N 0.000 description 1
- JXQVYPWVGUOIDV-MXAVVETBSA-N Phe-Ser-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JXQVYPWVGUOIDV-MXAVVETBSA-N 0.000 description 1
- UNBFGVQVQGXXCK-KKUMJFAQSA-N Phe-Ser-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O UNBFGVQVQGXXCK-KKUMJFAQSA-N 0.000 description 1
- GLJZDMZJHFXJQG-BZSNNMDCSA-N Phe-Ser-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GLJZDMZJHFXJQG-BZSNNMDCSA-N 0.000 description 1
- MRWOVVNKSXXLRP-IHPCNDPISA-N Phe-Ser-Trp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O MRWOVVNKSXXLRP-IHPCNDPISA-N 0.000 description 1
- MVIJMIZJPHQGEN-IHRRRGAJSA-N Phe-Ser-Val Chemical compound CC(C)[C@@H](C([O-])=O)NC(=O)[C@H](CO)NC(=O)[C@@H]([NH3+])CC1=CC=CC=C1 MVIJMIZJPHQGEN-IHRRRGAJSA-N 0.000 description 1
- GMWNQSGWWGKTSF-LFSVMHDDSA-N Phe-Thr-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O GMWNQSGWWGKTSF-LFSVMHDDSA-N 0.000 description 1
- BSKMOCNNLNDIMU-CDMKHQONSA-N Phe-Thr-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O BSKMOCNNLNDIMU-CDMKHQONSA-N 0.000 description 1
- SHUFSZDAIPLZLF-BEAPCOKYSA-N Phe-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N)O SHUFSZDAIPLZLF-BEAPCOKYSA-N 0.000 description 1
- GLUYKHMBGKQBHE-JYJNAYRXSA-N Phe-Val-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 GLUYKHMBGKQBHE-JYJNAYRXSA-N 0.000 description 1
- KUSYCSMTTHSZOA-DZKIICNBSA-N Phe-Val-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N KUSYCSMTTHSZOA-DZKIICNBSA-N 0.000 description 1
- FXEKNHAJIMHRFJ-ULQDDVLXSA-N Phe-Val-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N FXEKNHAJIMHRFJ-ULQDDVLXSA-N 0.000 description 1
- 101100226891 Phomopsis amygdali PaP450-1 gene Proteins 0.000 description 1
- 241001148064 Photorhabdus luminescens Species 0.000 description 1
- 241000694891 Photorhabdus luminescens subsp. akhurstii Species 0.000 description 1
- 241001246813 Photorhabdus luminescens subsp. laumondii Species 0.000 description 1
- 241000694892 Photorhabdus luminescens subsp. luminescens Species 0.000 description 1
- 239000004743 Polypropylene Substances 0.000 description 1
- APKRGYLBSCWJJP-FXQIFTODSA-N Pro-Ala-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O APKRGYLBSCWJJP-FXQIFTODSA-N 0.000 description 1
- IWNOFCGBMSFTBC-CIUDSAMLSA-N Pro-Ala-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IWNOFCGBMSFTBC-CIUDSAMLSA-N 0.000 description 1
- KIZQGKLMXKGDIV-BQBZGAKWSA-N Pro-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 KIZQGKLMXKGDIV-BQBZGAKWSA-N 0.000 description 1
- VCYJKOLZYPYGJV-AVGNSLFASA-N Pro-Arg-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O VCYJKOLZYPYGJV-AVGNSLFASA-N 0.000 description 1
- KDIIENQUNVNWHR-JYJNAYRXSA-N Pro-Arg-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O KDIIENQUNVNWHR-JYJNAYRXSA-N 0.000 description 1
- ZSKJPKFTPQCPIH-RCWTZXSCSA-N Pro-Arg-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZSKJPKFTPQCPIH-RCWTZXSCSA-N 0.000 description 1
- OYEUSRAZOGIDBY-JYJNAYRXSA-N Pro-Arg-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O OYEUSRAZOGIDBY-JYJNAYRXSA-N 0.000 description 1
- UVKNEILZSJMKSR-FXQIFTODSA-N Pro-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H]1CCCN1 UVKNEILZSJMKSR-FXQIFTODSA-N 0.000 description 1
- SWXSLPHTJVAWDF-VEVYYDQMSA-N Pro-Asn-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SWXSLPHTJVAWDF-VEVYYDQMSA-N 0.000 description 1
- JARJPEMLQAWNBR-GUBZILKMSA-N Pro-Asp-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JARJPEMLQAWNBR-GUBZILKMSA-N 0.000 description 1
- SGCZFWSQERRKBD-BQBZGAKWSA-N Pro-Asp-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(O)=O)NC(=O)[C@@H]1CCCN1 SGCZFWSQERRKBD-BQBZGAKWSA-N 0.000 description 1
- GDXZRWYXJSGWIV-GMOBBJLQSA-N Pro-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H]1CCCN1 GDXZRWYXJSGWIV-GMOBBJLQSA-N 0.000 description 1
- QVIZLAUEAMQKGS-GUBZILKMSA-N Pro-Asp-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H]1CCCN1 QVIZLAUEAMQKGS-GUBZILKMSA-N 0.000 description 1
- XUSDDSLCRPUKLP-QXEWZRGKSA-N Pro-Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H]1CCCN1 XUSDDSLCRPUKLP-QXEWZRGKSA-N 0.000 description 1
- OZAPWFHRPINHND-GUBZILKMSA-N Pro-Cys-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(O)=O OZAPWFHRPINHND-GUBZILKMSA-N 0.000 description 1
- PZSCUPVOJGKHEP-CIUDSAMLSA-N Pro-Gln-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O PZSCUPVOJGKHEP-CIUDSAMLSA-N 0.000 description 1
- WGAQWMRJUFQXMF-ZPFDUUQYSA-N Pro-Gln-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WGAQWMRJUFQXMF-ZPFDUUQYSA-N 0.000 description 1
- UAYHMOIGIQZLFR-NHCYSSNCSA-N Pro-Gln-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O UAYHMOIGIQZLFR-NHCYSSNCSA-N 0.000 description 1
- VDGTVWFMRXVQCT-GUBZILKMSA-N Pro-Glu-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1 VDGTVWFMRXVQCT-GUBZILKMSA-N 0.000 description 1
- FRKBNXCFJBPJOL-GUBZILKMSA-N Pro-Glu-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O FRKBNXCFJBPJOL-GUBZILKMSA-N 0.000 description 1
- NXEYSLRNNPWCRN-SRVKXCTJSA-N Pro-Glu-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NXEYSLRNNPWCRN-SRVKXCTJSA-N 0.000 description 1
- VOZIBWWZSBIXQN-SRVKXCTJSA-N Pro-Glu-Lys Chemical compound NCCCC[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1)C(O)=O VOZIBWWZSBIXQN-SRVKXCTJSA-N 0.000 description 1
- LGSANCBHSMDFDY-GARJFASQSA-N Pro-Glu-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCC(=O)O)C(=O)N2CCC[C@@H]2C(=O)O LGSANCBHSMDFDY-GARJFASQSA-N 0.000 description 1
- ZTVCLZLGHZXLOT-ULQDDVLXSA-N Pro-Glu-Trp Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O ZTVCLZLGHZXLOT-ULQDDVLXSA-N 0.000 description 1
- CLNJSLSHKJECME-BQBZGAKWSA-N Pro-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H]1CCCN1 CLNJSLSHKJECME-BQBZGAKWSA-N 0.000 description 1
- UUHXBJHVTVGSKM-BQBZGAKWSA-N Pro-Gly-Asn Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O UUHXBJHVTVGSKM-BQBZGAKWSA-N 0.000 description 1
- FEPSEIDIPBMIOS-QXEWZRGKSA-N Pro-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 FEPSEIDIPBMIOS-QXEWZRGKSA-N 0.000 description 1
- FKLSMYYLJHYPHH-UWVGGRQHSA-N Pro-Gly-Leu Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O FKLSMYYLJHYPHH-UWVGGRQHSA-N 0.000 description 1
- IBGCFJDLCYTKPW-NAKRPEOUSA-N Pro-Ile-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H]1CCCN1 IBGCFJDLCYTKPW-NAKRPEOUSA-N 0.000 description 1
- BBFRBZYKHIKFBX-GMOBBJLQSA-N Pro-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@@H]1CCCN1 BBFRBZYKHIKFBX-GMOBBJLQSA-N 0.000 description 1
- AQGUSRZKDZYGGV-GMOBBJLQSA-N Pro-Ile-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O AQGUSRZKDZYGGV-GMOBBJLQSA-N 0.000 description 1
- KWMUAKQOVYCQJQ-ZPFDUUQYSA-N Pro-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@@H]1CCCN1 KWMUAKQOVYCQJQ-ZPFDUUQYSA-N 0.000 description 1
- FJLODLCIOJUDRG-PYJNHQTQSA-N Pro-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@@H]2CCCN2 FJLODLCIOJUDRG-PYJNHQTQSA-N 0.000 description 1
- LXLFEIHKWGHJJB-XUXIUFHCSA-N Pro-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@@H]1CCCN1 LXLFEIHKWGHJJB-XUXIUFHCSA-N 0.000 description 1
- BCNRNJWSRFDPTQ-HJWJTTGWSA-N Pro-Ile-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O BCNRNJWSRFDPTQ-HJWJTTGWSA-N 0.000 description 1
- UREQLMJCKFLLHM-NAKRPEOUSA-N Pro-Ile-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O UREQLMJCKFLLHM-NAKRPEOUSA-N 0.000 description 1
- KLSOMAFWRISSNI-OSUNSFLBSA-N Pro-Ile-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H]1CCCN1 KLSOMAFWRISSNI-OSUNSFLBSA-N 0.000 description 1
- HFNPOYOKIPGAEI-SRVKXCTJSA-N Pro-Leu-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]1CCCN1 HFNPOYOKIPGAEI-SRVKXCTJSA-N 0.000 description 1
- RMODQFBNDDENCP-IHRRRGAJSA-N Pro-Lys-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O RMODQFBNDDENCP-IHRRRGAJSA-N 0.000 description 1
- MZNUJZBYRWXWLQ-AVGNSLFASA-N Pro-Met-His Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@@H]2CCCN2 MZNUJZBYRWXWLQ-AVGNSLFASA-N 0.000 description 1
- ZZCJYPLMOPTZFC-SRVKXCTJSA-N Pro-Met-Met Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCSC)C(O)=O ZZCJYPLMOPTZFC-SRVKXCTJSA-N 0.000 description 1
- VGVCNKSUVSZEIE-IHRRRGAJSA-N Pro-Phe-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O VGVCNKSUVSZEIE-IHRRRGAJSA-N 0.000 description 1
- WHNJMTHJGCEKGA-ULQDDVLXSA-N Pro-Phe-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O WHNJMTHJGCEKGA-ULQDDVLXSA-N 0.000 description 1
- BUEIYHBJHCDAMI-UFYCRDLUSA-N Pro-Phe-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O BUEIYHBJHCDAMI-UFYCRDLUSA-N 0.000 description 1
- ZVEQWRWMRFIVSD-HRCADAONSA-N Pro-Phe-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)N3CCC[C@@H]3C(=O)O ZVEQWRWMRFIVSD-HRCADAONSA-N 0.000 description 1
- FHZJRBVMLGOHBX-GUBZILKMSA-N Pro-Pro-Asp Chemical compound OC(=O)C[C@H](NC(=O)[C@@H]1CCCN1C(=O)[C@@H]1CCCN1)C(O)=O FHZJRBVMLGOHBX-GUBZILKMSA-N 0.000 description 1
- RCYUBVHMVUHEBM-RCWTZXSCSA-N Pro-Pro-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O RCYUBVHMVUHEBM-RCWTZXSCSA-N 0.000 description 1
- AJNGQVUFQUVRQT-JYJNAYRXSA-N Pro-Pro-Tyr Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H]1N(CCC1)C(=O)[C@H]1NCCC1)C1=CC=C(O)C=C1 AJNGQVUFQUVRQT-JYJNAYRXSA-N 0.000 description 1
- POQFNPILEQEODH-FXQIFTODSA-N Pro-Ser-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O POQFNPILEQEODH-FXQIFTODSA-N 0.000 description 1
- GMJDSFYVTAMIBF-FXQIFTODSA-N Pro-Ser-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O GMJDSFYVTAMIBF-FXQIFTODSA-N 0.000 description 1
- BJCXXMGGPHRSHV-GUBZILKMSA-N Pro-Ser-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@@H]1CCCN1 BJCXXMGGPHRSHV-GUBZILKMSA-N 0.000 description 1
- PRKWBYCXBBSLSK-GUBZILKMSA-N Pro-Ser-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O PRKWBYCXBBSLSK-GUBZILKMSA-N 0.000 description 1
- PKHDJFHFMGQMPS-RCWTZXSCSA-N Pro-Thr-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O PKHDJFHFMGQMPS-RCWTZXSCSA-N 0.000 description 1
- GBUNEGKQPSAMNK-QTKMDUPCSA-N Pro-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@@H]2CCCN2)O GBUNEGKQPSAMNK-QTKMDUPCSA-N 0.000 description 1
- CHYAYDLYYIJCKY-OSUNSFLBSA-N Pro-Thr-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O CHYAYDLYYIJCKY-OSUNSFLBSA-N 0.000 description 1
- JDJMFMVVJHLWDP-UNQGMJICSA-N Pro-Thr-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JDJMFMVVJHLWDP-UNQGMJICSA-N 0.000 description 1
- ZYJMLBCDFPIGNL-JYJNAYRXSA-N Pro-Tyr-Arg Chemical compound NC(=N)NCCC[C@H](NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@@H]1CCCN1)C(O)=O ZYJMLBCDFPIGNL-JYJNAYRXSA-N 0.000 description 1
- CWZUFLWPEFHWEI-IHRRRGAJSA-N Pro-Tyr-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O CWZUFLWPEFHWEI-IHRRRGAJSA-N 0.000 description 1
- FZXSYIPVAFVYBH-KKUMJFAQSA-N Pro-Tyr-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O FZXSYIPVAFVYBH-KKUMJFAQSA-N 0.000 description 1
- FIDNSJUXESUDOV-JYJNAYRXSA-N Pro-Tyr-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O FIDNSJUXESUDOV-JYJNAYRXSA-N 0.000 description 1
- XDKKMRPRRCOELJ-GUBZILKMSA-N Pro-Val-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 XDKKMRPRRCOELJ-GUBZILKMSA-N 0.000 description 1
- OOZJHTXCLJUODH-QXEWZRGKSA-N Pro-Val-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 OOZJHTXCLJUODH-QXEWZRGKSA-N 0.000 description 1
- OQSGBXGNAFQGGS-CYDGBPFRSA-N Pro-Val-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O OQSGBXGNAFQGGS-CYDGBPFRSA-N 0.000 description 1
- IIRBTQHFVNGPMQ-AVGNSLFASA-N Pro-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@@H]1CCCN1 IIRBTQHFVNGPMQ-AVGNSLFASA-N 0.000 description 1
- 241000169446 Promethis Species 0.000 description 1
- 108010076504 Protein Sorting Signals Proteins 0.000 description 1
- 241001209258 Pseudomonas syringae pv. syringae B728a Species 0.000 description 1
- 241001340896 Pyralis Species 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 101000798665 Rhizobium etli (strain CFN 42 / ATCC 51251) Acyl carrier protein AcpXL Proteins 0.000 description 1
- 108010046685 Rho Factor Proteins 0.000 description 1
- 241000223253 Rhodotorula glutinis Species 0.000 description 1
- 241000223254 Rhodotorula mucilaginosa Species 0.000 description 1
- 241001479507 Senecio odorus Species 0.000 description 1
- ZUGXSSFMTXKHJS-ZLUOBGJFSA-N Ser-Ala-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O ZUGXSSFMTXKHJS-ZLUOBGJFSA-N 0.000 description 1
- MWMKFWJYRRGXOR-ZLUOBGJFSA-N Ser-Ala-Asn Chemical compound N[C@H](C(=O)N[C@H](C(=O)N[C@H](C(=O)O)CC(N)=O)C)CO MWMKFWJYRRGXOR-ZLUOBGJFSA-N 0.000 description 1
- LVVBAKCGXXUHFO-ZLUOBGJFSA-N Ser-Ala-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O LVVBAKCGXXUHFO-ZLUOBGJFSA-N 0.000 description 1
- SRTCFKGBYBZRHA-ACZMJKKPSA-N Ser-Ala-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SRTCFKGBYBZRHA-ACZMJKKPSA-N 0.000 description 1
- WTWGOQRNRFHFQD-JBDRJPRFSA-N Ser-Ala-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WTWGOQRNRFHFQD-JBDRJPRFSA-N 0.000 description 1
- WTUJZHKANPDPIN-CIUDSAMLSA-N Ser-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N WTUJZHKANPDPIN-CIUDSAMLSA-N 0.000 description 1
- YQHZVYJAGWMHES-ZLUOBGJFSA-N Ser-Ala-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O YQHZVYJAGWMHES-ZLUOBGJFSA-N 0.000 description 1
- PZZJMBYSYAKYPK-UWJYBYFXSA-N Ser-Ala-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O PZZJMBYSYAKYPK-UWJYBYFXSA-N 0.000 description 1
- FCRMLGJMPXCAHD-FXQIFTODSA-N Ser-Arg-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O FCRMLGJMPXCAHD-FXQIFTODSA-N 0.000 description 1
- QEDMOZUJTGEIBF-FXQIFTODSA-N Ser-Arg-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O QEDMOZUJTGEIBF-FXQIFTODSA-N 0.000 description 1
- VQBLHWSPVYYZTB-DCAQKATOSA-N Ser-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CO)N VQBLHWSPVYYZTB-DCAQKATOSA-N 0.000 description 1
- QGMLKFGTGXWAHF-IHRRRGAJSA-N Ser-Arg-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QGMLKFGTGXWAHF-IHRRRGAJSA-N 0.000 description 1
- NRCJWSGXMAPYQX-LPEHRKFASA-N Ser-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CO)N)C(=O)O NRCJWSGXMAPYQX-LPEHRKFASA-N 0.000 description 1
- UBRXAVQWXOWRSJ-ZLUOBGJFSA-N Ser-Asn-Asp Chemical compound C([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CO)N)C(=O)N UBRXAVQWXOWRSJ-ZLUOBGJFSA-N 0.000 description 1
- BCKYYTVFBXHPOG-ACZMJKKPSA-N Ser-Asn-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CO)N BCKYYTVFBXHPOG-ACZMJKKPSA-N 0.000 description 1
- ZXLUWXWISXIFIX-ACZMJKKPSA-N Ser-Asn-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZXLUWXWISXIFIX-ACZMJKKPSA-N 0.000 description 1
- FIDMVVBUOCMMJG-CIUDSAMLSA-N Ser-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CO FIDMVVBUOCMMJG-CIUDSAMLSA-N 0.000 description 1
- UGJRQLURDVGULT-LKXGYXEUSA-N Ser-Asn-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O UGJRQLURDVGULT-LKXGYXEUSA-N 0.000 description 1
- TYYBJUYSTWJHGO-ZKWXMUAHSA-N Ser-Asn-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O TYYBJUYSTWJHGO-ZKWXMUAHSA-N 0.000 description 1
- OHKLFYXEOGGGCK-ZLUOBGJFSA-N Ser-Asp-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O OHKLFYXEOGGGCK-ZLUOBGJFSA-N 0.000 description 1
- VAIZFHMTBFYJIA-ACZMJKKPSA-N Ser-Asp-Gln Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(N)=O VAIZFHMTBFYJIA-ACZMJKKPSA-N 0.000 description 1
- BNFVPSRLHHPQKS-WHFBIAKZSA-N Ser-Asp-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O BNFVPSRLHHPQKS-WHFBIAKZSA-N 0.000 description 1
- BYIROAKULFFTEK-CIUDSAMLSA-N Ser-Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CO BYIROAKULFFTEK-CIUDSAMLSA-N 0.000 description 1
- OLIJLNWFEQEFDM-SRVKXCTJSA-N Ser-Asp-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OLIJLNWFEQEFDM-SRVKXCTJSA-N 0.000 description 1
- BTPAWKABYQMKKN-LKXGYXEUSA-N Ser-Asp-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BTPAWKABYQMKKN-LKXGYXEUSA-N 0.000 description 1
- XWCYBVBLJRWOFR-WDSKDSINSA-N Ser-Gln-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O XWCYBVBLJRWOFR-WDSKDSINSA-N 0.000 description 1
- OJPHFSOMBZKQKQ-GUBZILKMSA-N Ser-Gln-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CO OJPHFSOMBZKQKQ-GUBZILKMSA-N 0.000 description 1
- KJMOINFQVCCSDX-XKBZYTNZSA-N Ser-Gln-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KJMOINFQVCCSDX-XKBZYTNZSA-N 0.000 description 1
- PVDTYLHUWAEYGY-CIUDSAMLSA-N Ser-Glu-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O PVDTYLHUWAEYGY-CIUDSAMLSA-N 0.000 description 1
- SQBLRDDJTUJDMV-ACZMJKKPSA-N Ser-Glu-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O SQBLRDDJTUJDMV-ACZMJKKPSA-N 0.000 description 1
- UOLGINIHBRIECN-FXQIFTODSA-N Ser-Glu-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O UOLGINIHBRIECN-FXQIFTODSA-N 0.000 description 1
- BRGQQXQKPUCUJQ-KBIXCLLPSA-N Ser-Glu-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BRGQQXQKPUCUJQ-KBIXCLLPSA-N 0.000 description 1
- LALNXSXEYFUUDD-GUBZILKMSA-N Ser-Glu-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LALNXSXEYFUUDD-GUBZILKMSA-N 0.000 description 1
- QKQDTEYDEIJPNK-GUBZILKMSA-N Ser-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CO QKQDTEYDEIJPNK-GUBZILKMSA-N 0.000 description 1
- UFKPDBLKLOBMRH-XHNCKOQMSA-N Ser-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CO)N)C(=O)O UFKPDBLKLOBMRH-XHNCKOQMSA-N 0.000 description 1
- MUARUIBTKQJKFY-WHFBIAKZSA-N Ser-Gly-Asp Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MUARUIBTKQJKFY-WHFBIAKZSA-N 0.000 description 1
- GZFAWAQTEYDKII-YUMQZZPRSA-N Ser-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CO GZFAWAQTEYDKII-YUMQZZPRSA-N 0.000 description 1
- WSTIOCFMWXNOCX-YUMQZZPRSA-N Ser-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CO)N WSTIOCFMWXNOCX-YUMQZZPRSA-N 0.000 description 1
- HMRAQFJFTOLDKW-GUBZILKMSA-N Ser-His-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(O)=O HMRAQFJFTOLDKW-GUBZILKMSA-N 0.000 description 1
- CICQXRWZNVXFCU-SRVKXCTJSA-N Ser-His-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O CICQXRWZNVXFCU-SRVKXCTJSA-N 0.000 description 1
- SFTZTYBXIXLRGQ-JBDRJPRFSA-N Ser-Ile-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O SFTZTYBXIXLRGQ-JBDRJPRFSA-N 0.000 description 1
- BKZYBLLIBOBOOW-GHCJXIJMSA-N Ser-Ile-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O BKZYBLLIBOBOOW-GHCJXIJMSA-N 0.000 description 1
- CJINPXGSKSZQNE-KBIXCLLPSA-N Ser-Ile-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(N)=O)C(O)=O CJINPXGSKSZQNE-KBIXCLLPSA-N 0.000 description 1
- DJACUBDEDBZKLQ-KBIXCLLPSA-N Ser-Ile-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O DJACUBDEDBZKLQ-KBIXCLLPSA-N 0.000 description 1
- IFPBAGJBHSNYPR-ZKWXMUAHSA-N Ser-Ile-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O IFPBAGJBHSNYPR-ZKWXMUAHSA-N 0.000 description 1
- RIAKPZVSNBBNRE-BJDJZHNGSA-N Ser-Ile-Leu Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O RIAKPZVSNBBNRE-BJDJZHNGSA-N 0.000 description 1
- ZOPISOXXPQNOCO-SVSWQMSJSA-N Ser-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CO)N ZOPISOXXPQNOCO-SVSWQMSJSA-N 0.000 description 1
- MQQBBLVOUUJKLH-HJPIBITLSA-N Ser-Ile-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MQQBBLVOUUJKLH-HJPIBITLSA-N 0.000 description 1
- FUMGHWDRRFCKEP-CIUDSAMLSA-N Ser-Leu-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O FUMGHWDRRFCKEP-CIUDSAMLSA-N 0.000 description 1
- QYSFWUIXDFJUDW-DCAQKATOSA-N Ser-Leu-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QYSFWUIXDFJUDW-DCAQKATOSA-N 0.000 description 1
- NLOAIFSWUUFQFR-CIUDSAMLSA-N Ser-Leu-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O NLOAIFSWUUFQFR-CIUDSAMLSA-N 0.000 description 1
- IUXGJEIKJBYKOO-SRVKXCTJSA-N Ser-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CO)N IUXGJEIKJBYKOO-SRVKXCTJSA-N 0.000 description 1
- YUJLIIRMIAGMCQ-CIUDSAMLSA-N Ser-Leu-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YUJLIIRMIAGMCQ-CIUDSAMLSA-N 0.000 description 1
- MUJQWSAWLLRJCE-KATARQTJSA-N Ser-Leu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MUJQWSAWLLRJCE-KATARQTJSA-N 0.000 description 1
- JWOBLHJRDADHLN-KKUMJFAQSA-N Ser-Leu-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JWOBLHJRDADHLN-KKUMJFAQSA-N 0.000 description 1
- HDBOEVPDIDDEPC-CIUDSAMLSA-N Ser-Lys-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O HDBOEVPDIDDEPC-CIUDSAMLSA-N 0.000 description 1
- OWCVUSJMEBGMOK-YUMQZZPRSA-N Ser-Lys-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O OWCVUSJMEBGMOK-YUMQZZPRSA-N 0.000 description 1
- CRJZZXMAADSBBQ-SRVKXCTJSA-N Ser-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO CRJZZXMAADSBBQ-SRVKXCTJSA-N 0.000 description 1
- PTWIYDNFWPXQSD-GARJFASQSA-N Ser-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CO)N)C(=O)O PTWIYDNFWPXQSD-GARJFASQSA-N 0.000 description 1
- NIOYDASGXWLHEZ-CIUDSAMLSA-N Ser-Met-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(O)=O)C(O)=O NIOYDASGXWLHEZ-CIUDSAMLSA-N 0.000 description 1
- VIIJCAQMJBHSJH-FXQIFTODSA-N Ser-Met-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(O)=O VIIJCAQMJBHSJH-FXQIFTODSA-N 0.000 description 1
- JAWGSPUJAXYXJA-IHRRRGAJSA-N Ser-Phe-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](CO)N)CC1=CC=CC=C1 JAWGSPUJAXYXJA-IHRRRGAJSA-N 0.000 description 1
- UPLYXVPQLJVWMM-KKUMJFAQSA-N Ser-Phe-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O UPLYXVPQLJVWMM-KKUMJFAQSA-N 0.000 description 1
- RWDVVSKYZBNDCO-MELADBBJSA-N Ser-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)[C@H](CO)N)C(=O)O RWDVVSKYZBNDCO-MELADBBJSA-N 0.000 description 1
- PJIQEIFXZPCWOJ-FXQIFTODSA-N Ser-Pro-Asp Chemical compound [H]N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O PJIQEIFXZPCWOJ-FXQIFTODSA-N 0.000 description 1
- BSXKBOUZDAZXHE-CIUDSAMLSA-N Ser-Pro-Glu Chemical compound [H]N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O BSXKBOUZDAZXHE-CIUDSAMLSA-N 0.000 description 1
- NMZXJDSKEGFDLJ-DCAQKATOSA-N Ser-Pro-Lys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CO)N)C(=O)N[C@@H](CCCCN)C(=O)O NMZXJDSKEGFDLJ-DCAQKATOSA-N 0.000 description 1
- PPCZVWHJWJFTFN-ZLUOBGJFSA-N Ser-Ser-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O PPCZVWHJWJFTFN-ZLUOBGJFSA-N 0.000 description 1
- JCLAFVNDBJMLBC-JBDRJPRFSA-N Ser-Ser-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JCLAFVNDBJMLBC-JBDRJPRFSA-N 0.000 description 1
- AABIBDJHSKIMJK-FXQIFTODSA-N Ser-Ser-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O AABIBDJHSKIMJK-FXQIFTODSA-N 0.000 description 1
- WUXCHQZLUHBSDJ-LKXGYXEUSA-N Ser-Thr-Asp Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CC(O)=O)C(O)=O WUXCHQZLUHBSDJ-LKXGYXEUSA-N 0.000 description 1
- PURRNJBBXDDWLX-ZDLURKLDSA-N Ser-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CO)N)O PURRNJBBXDDWLX-ZDLURKLDSA-N 0.000 description 1
- ZKOKTQPHFMRSJP-YJRXYDGGSA-N Ser-Thr-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZKOKTQPHFMRSJP-YJRXYDGGSA-N 0.000 description 1
- FZNNGIHSIPKFRE-QEJZJMRPSA-N Ser-Trp-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(N)=O)C(O)=O FZNNGIHSIPKFRE-QEJZJMRPSA-N 0.000 description 1
- BCAVNDNYOGTQMQ-AAEUAGOBSA-N Ser-Trp-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)NCC(O)=O BCAVNDNYOGTQMQ-AAEUAGOBSA-N 0.000 description 1
- ZWSZBWAFDZRBNM-UBHSHLNASA-N Ser-Trp-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(O)=O ZWSZBWAFDZRBNM-UBHSHLNASA-N 0.000 description 1
- PIQRHJQWEPWFJG-UWJYBYFXSA-N Ser-Tyr-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O PIQRHJQWEPWFJG-UWJYBYFXSA-N 0.000 description 1
- PLQWGQUNUPMNOD-KKUMJFAQSA-N Ser-Tyr-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O PLQWGQUNUPMNOD-KKUMJFAQSA-N 0.000 description 1
- VVKVHAOOUGNDPJ-SRVKXCTJSA-N Ser-Tyr-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O VVKVHAOOUGNDPJ-SRVKXCTJSA-N 0.000 description 1
- ANOQEBQWIAYIMV-AEJSXWLSSA-N Ser-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N ANOQEBQWIAYIMV-AEJSXWLSSA-N 0.000 description 1
- SIEBDTCABMZCLF-XGEHTFHBSA-N Ser-Val-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SIEBDTCABMZCLF-XGEHTFHBSA-N 0.000 description 1
- 241000147799 Serratia entomophila Species 0.000 description 1
- 102000039471 Small Nuclear RNA Human genes 0.000 description 1
- 241001480223 Steinernema carpocapsae Species 0.000 description 1
- 241001465745 Steinernematidae Species 0.000 description 1
- 101000901034 Streptococcus gordonii Accessory Sec system protein Asp2 Proteins 0.000 description 1
- 101000870438 Streptococcus gordonii UDP-N-acetylglucosamine-peptide N-acetylglucosaminyltransferase stabilizing protein GtfB Proteins 0.000 description 1
- 101100038645 Streptomyces griseus rppA gene Proteins 0.000 description 1
- 101710172711 Structural protein Proteins 0.000 description 1
- QAOWNCQODCNURD-UHFFFAOYSA-L Sulfate Chemical compound [O-]S([O-])(=O)=O QAOWNCQODCNURD-UHFFFAOYSA-L 0.000 description 1
- 229940100389 Sulfonylurea Drugs 0.000 description 1
- 108700026226 TATA Box Proteins 0.000 description 1
- 102100039365 Tachykinin-4 Human genes 0.000 description 1
- 108010006785 Taq Polymerase Proteins 0.000 description 1
- 244000269722 Thea sinensis Species 0.000 description 1
- 101000748795 Thermus thermophilus (strain ATCC 27634 / DSM 579 / HB8) Cytochrome c oxidase polypeptide I+III Proteins 0.000 description 1
- IGROJMCBGRFRGI-YTLHQDLWSA-N Thr-Ala-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O IGROJMCBGRFRGI-YTLHQDLWSA-N 0.000 description 1
- BSNZTJXVDOINSR-JXUBOQSCSA-N Thr-Ala-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O BSNZTJXVDOINSR-JXUBOQSCSA-N 0.000 description 1
- KEGBFULVYKYJRD-LFSVMHDDSA-N Thr-Ala-Phe Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KEGBFULVYKYJRD-LFSVMHDDSA-N 0.000 description 1
- CAJFZCICSVBOJK-SHGPDSBTSA-N Thr-Ala-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAJFZCICSVBOJK-SHGPDSBTSA-N 0.000 description 1
- XSLXHSYIVPGEER-KZVJFYERSA-N Thr-Ala-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O XSLXHSYIVPGEER-KZVJFYERSA-N 0.000 description 1
- JMZKMSTYXHFYAK-VEVYYDQMSA-N Thr-Arg-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)O JMZKMSTYXHFYAK-VEVYYDQMSA-N 0.000 description 1
- LHUBVKCLOVALIA-HJGDQZAQSA-N Thr-Arg-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O LHUBVKCLOVALIA-HJGDQZAQSA-N 0.000 description 1
- UKBSDLHIKIXJKH-HJGDQZAQSA-N Thr-Arg-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O UKBSDLHIKIXJKH-HJGDQZAQSA-N 0.000 description 1
- JHBHMCMKSPXRHV-NUMRIWBASA-N Thr-Asn-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O JHBHMCMKSPXRHV-NUMRIWBASA-N 0.000 description 1
- TZKPNGDGUVREEB-FOHZUACHSA-N Thr-Asn-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O TZKPNGDGUVREEB-FOHZUACHSA-N 0.000 description 1
- CTONFVDJYCAMQM-IUKAMOBKSA-N Thr-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H]([C@@H](C)O)N CTONFVDJYCAMQM-IUKAMOBKSA-N 0.000 description 1
- VBPDMBAFBRDZSK-HOUAVDHOSA-N Thr-Asn-Trp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N)O VBPDMBAFBRDZSK-HOUAVDHOSA-N 0.000 description 1
- LMMDEZPNUTZJAY-GCJQMDKQSA-N Thr-Asp-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O LMMDEZPNUTZJAY-GCJQMDKQSA-N 0.000 description 1
- YOSLMIPKOUAHKI-OLHMAJIHSA-N Thr-Asp-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O YOSLMIPKOUAHKI-OLHMAJIHSA-N 0.000 description 1
- GKMYGVQDGVYCPC-IUKAMOBKSA-N Thr-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H]([C@@H](C)O)N GKMYGVQDGVYCPC-IUKAMOBKSA-N 0.000 description 1
- XDARBNMYXKUFOJ-GSSVUCPTSA-N Thr-Asp-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XDARBNMYXKUFOJ-GSSVUCPTSA-N 0.000 description 1
- VUKVQVNKIIZBPO-HOUAVDHOSA-N Thr-Asp-Trp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N)O VUKVQVNKIIZBPO-HOUAVDHOSA-N 0.000 description 1
- DCLBXIWHLVEPMQ-JRQIVUDYSA-N Thr-Asp-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 DCLBXIWHLVEPMQ-JRQIVUDYSA-N 0.000 description 1
- OYTNZCBFDXGQGE-XQXXSGGOSA-N Thr-Gln-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](C)C(=O)O)N)O OYTNZCBFDXGQGE-XQXXSGGOSA-N 0.000 description 1
- UHBPFYOQQPFKQR-JHEQGTHGSA-N Thr-Gln-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O UHBPFYOQQPFKQR-JHEQGTHGSA-N 0.000 description 1
- LIXBDERDAGNVAV-XKBZYTNZSA-N Thr-Gln-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O LIXBDERDAGNVAV-XKBZYTNZSA-N 0.000 description 1
- DKDHTRVDOUZZTP-IFFSRLJSSA-N Thr-Gln-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)[C@@H](C)O)C(O)=O DKDHTRVDOUZZTP-IFFSRLJSSA-N 0.000 description 1
- CQNFRKAKGDSJFR-NUMRIWBASA-N Thr-Glu-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)O CQNFRKAKGDSJFR-NUMRIWBASA-N 0.000 description 1
- LGNBRHZANHMZHK-NUMRIWBASA-N Thr-Glu-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O LGNBRHZANHMZHK-NUMRIWBASA-N 0.000 description 1
- SHOMROOOQBDGRL-JHEQGTHGSA-N Thr-Glu-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SHOMROOOQBDGRL-JHEQGTHGSA-N 0.000 description 1
- LHEZGZQRLDBSRR-WDCWCFNPSA-N Thr-Glu-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LHEZGZQRLDBSRR-WDCWCFNPSA-N 0.000 description 1
- OQCXTUQTKQFDCX-HTUGSXCWSA-N Thr-Glu-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N)O OQCXTUQTKQFDCX-HTUGSXCWSA-N 0.000 description 1
- KBLYJPQSNGTDIU-LOKLDPHHSA-N Thr-Glu-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N)O KBLYJPQSNGTDIU-LOKLDPHHSA-N 0.000 description 1
- LKEKWDJCJSPXNI-IRIUXVKKSA-N Thr-Glu-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 LKEKWDJCJSPXNI-IRIUXVKKSA-N 0.000 description 1
- BNGDYRRHRGOPHX-IFFSRLJSSA-N Thr-Glu-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)[C@@H](C)O)C(O)=O BNGDYRRHRGOPHX-IFFSRLJSSA-N 0.000 description 1
- SLUWOCTZVGMURC-BFHQHQDPSA-N Thr-Gly-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O SLUWOCTZVGMURC-BFHQHQDPSA-N 0.000 description 1
- XFTYVCHLARBHBQ-FOHZUACHSA-N Thr-Gly-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O XFTYVCHLARBHBQ-FOHZUACHSA-N 0.000 description 1
- AQAMPXBRJJWPNI-JHEQGTHGSA-N Thr-Gly-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AQAMPXBRJJWPNI-JHEQGTHGSA-N 0.000 description 1
- IMULJHHGAUZZFE-MBLNEYKQSA-N Thr-Gly-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O IMULJHHGAUZZFE-MBLNEYKQSA-N 0.000 description 1
- UBDDORVPVLEECX-FJXKBIBVSA-N Thr-Gly-Met Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCSC)C(O)=O UBDDORVPVLEECX-FJXKBIBVSA-N 0.000 description 1
- KBBRNEDOYWMIJP-KYNKHSRBSA-N Thr-Gly-Thr Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(=O)O)N)O KBBRNEDOYWMIJP-KYNKHSRBSA-N 0.000 description 1
- JQAWYCUUFIMTHE-WLTAIBSBSA-N Thr-Gly-Tyr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JQAWYCUUFIMTHE-WLTAIBSBSA-N 0.000 description 1
- WPAKPLPGQNUXGN-OSUNSFLBSA-N Thr-Ile-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O WPAKPLPGQNUXGN-OSUNSFLBSA-N 0.000 description 1
- DDDLIMCZFKOERC-SVSWQMSJSA-N Thr-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N DDDLIMCZFKOERC-SVSWQMSJSA-N 0.000 description 1
- AHOLTQCAVBSUDP-PPCPHDFISA-N Thr-Ile-Lys Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](N)[C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O AHOLTQCAVBSUDP-PPCPHDFISA-N 0.000 description 1
- HOVLHEKTGVIKAP-WDCWCFNPSA-N Thr-Leu-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O HOVLHEKTGVIKAP-WDCWCFNPSA-N 0.000 description 1
- FLPZMPOZGYPBEN-PPCPHDFISA-N Thr-Leu-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FLPZMPOZGYPBEN-PPCPHDFISA-N 0.000 description 1
- PRNGXSILMXSWQQ-OEAJRASXSA-N Thr-Leu-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PRNGXSILMXSWQQ-OEAJRASXSA-N 0.000 description 1
- NCXVJIQMWSGRHY-KXNHARMFSA-N Thr-Leu-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N)O NCXVJIQMWSGRHY-KXNHARMFSA-N 0.000 description 1
- ISLDRLHVPXABBC-IEGACIPQSA-N Thr-Leu-Trp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O ISLDRLHVPXABBC-IEGACIPQSA-N 0.000 description 1
- CJXURNZYNHCYFD-WDCWCFNPSA-N Thr-Lys-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O CJXURNZYNHCYFD-WDCWCFNPSA-N 0.000 description 1
- WFAUDCSNCWJJAA-KXNHARMFSA-N Thr-Lys-Pro Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(O)=O WFAUDCSNCWJJAA-KXNHARMFSA-N 0.000 description 1
- OHDXOXIZXSFCDN-RCWTZXSCSA-N Thr-Met-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O OHDXOXIZXSFCDN-RCWTZXSCSA-N 0.000 description 1
- GUHLYMZJVXUIPO-RCWTZXSCSA-N Thr-Met-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(O)=O GUHLYMZJVXUIPO-RCWTZXSCSA-N 0.000 description 1
- WYLAVUAWOUVUCA-XVSYOHENSA-N Thr-Phe-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O WYLAVUAWOUVUCA-XVSYOHENSA-N 0.000 description 1
- NYQIZWROIMIQSL-VEVYYDQMSA-N Thr-Pro-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O NYQIZWROIMIQSL-VEVYYDQMSA-N 0.000 description 1
- NDXSOKGYKCGYKT-VEVYYDQMSA-N Thr-Pro-Asp Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O NDXSOKGYKCGYKT-VEVYYDQMSA-N 0.000 description 1
- LKJCABTUFGTPPY-HJGDQZAQSA-N Thr-Pro-Gln Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O LKJCABTUFGTPPY-HJGDQZAQSA-N 0.000 description 1
- DEGCBBCMYWNJNA-RHYQMDGZSA-N Thr-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)[C@@H](C)O DEGCBBCMYWNJNA-RHYQMDGZSA-N 0.000 description 1
- BDENGIGFTNYZSJ-RCWTZXSCSA-N Thr-Pro-Met Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCSC)C(O)=O BDENGIGFTNYZSJ-RCWTZXSCSA-N 0.000 description 1
- OLFOOYQTTQSSRK-UNQGMJICSA-N Thr-Pro-Phe Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OLFOOYQTTQSSRK-UNQGMJICSA-N 0.000 description 1
- KERCOYANYUPLHJ-XGEHTFHBSA-N Thr-Pro-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O KERCOYANYUPLHJ-XGEHTFHBSA-N 0.000 description 1
- DOBIBIXIHJKVJF-XKBZYTNZSA-N Thr-Ser-Gln Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(N)=O DOBIBIXIHJKVJF-XKBZYTNZSA-N 0.000 description 1
- NQQMWWVVGIXUOX-SVSWQMSJSA-N Thr-Ser-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NQQMWWVVGIXUOX-SVSWQMSJSA-N 0.000 description 1
- AHERARIZBPOMNU-KATARQTJSA-N Thr-Ser-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O AHERARIZBPOMNU-KATARQTJSA-N 0.000 description 1
- WPSKTVVMQCXPRO-BWBBJGPYSA-N Thr-Ser-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WPSKTVVMQCXPRO-BWBBJGPYSA-N 0.000 description 1
- IEZVHOULSUULHD-XGEHTFHBSA-N Thr-Ser-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O IEZVHOULSUULHD-XGEHTFHBSA-N 0.000 description 1
- NDZYTIMDOZMECO-SHGPDSBTSA-N Thr-Thr-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O NDZYTIMDOZMECO-SHGPDSBTSA-N 0.000 description 1
- YRJOLUDFVAUXLI-GSSVUCPTSA-N Thr-Thr-Asp Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(O)=O YRJOLUDFVAUXLI-GSSVUCPTSA-N 0.000 description 1
- MFMGPEKYBXFIRF-SUSMZKCASA-N Thr-Thr-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MFMGPEKYBXFIRF-SUSMZKCASA-N 0.000 description 1
- LECUEEHKUFYOOV-ZJDVBMNYSA-N Thr-Thr-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@@H](N)[C@@H](C)O LECUEEHKUFYOOV-ZJDVBMNYSA-N 0.000 description 1
- FBQHKSPOIAFUEI-OWLDWWDNSA-N Thr-Trp-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C)C(O)=O FBQHKSPOIAFUEI-OWLDWWDNSA-N 0.000 description 1
- ZOCJFNXUVSGBQI-HSHDSVGOSA-N Thr-Trp-Arg Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N)O ZOCJFNXUVSGBQI-HSHDSVGOSA-N 0.000 description 1
- NDLHSJWPCXKOGG-VLCNGCBASA-N Thr-Trp-Tyr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC3=CC=C(C=C3)O)C(=O)O)N)O NDLHSJWPCXKOGG-VLCNGCBASA-N 0.000 description 1
- KAJRRNHOVMZYBL-IRIUXVKKSA-N Thr-Tyr-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O KAJRRNHOVMZYBL-IRIUXVKKSA-N 0.000 description 1
- JAWUQFCGNVEDRN-MEYUZBJRSA-N Thr-Tyr-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N)O JAWUQFCGNVEDRN-MEYUZBJRSA-N 0.000 description 1
- DIHPMRTXPYMDJZ-KAOXEZKKSA-N Thr-Tyr-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N2CCC[C@@H]2C(=O)O)N)O DIHPMRTXPYMDJZ-KAOXEZKKSA-N 0.000 description 1
- OGOYMQWIWHGTGH-KZVJFYERSA-N Thr-Val-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O OGOYMQWIWHGTGH-KZVJFYERSA-N 0.000 description 1
- QNXZCKMXHPULME-ZNSHCXBVSA-N Thr-Val-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N)O QNXZCKMXHPULME-ZNSHCXBVSA-N 0.000 description 1
- 244000288561 Torulaspora delbrueckii Species 0.000 description 1
- 235000014681 Torulaspora delbrueckii Nutrition 0.000 description 1
- 241001495125 Torulaspora pretoriensis Species 0.000 description 1
- 101710126507 Toxin 5 Proteins 0.000 description 1
- 239000007983 Tris buffer Substances 0.000 description 1
- MQVGIFJSFFVGFW-XEGUGMAKSA-N Trp-Ala-Glu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O MQVGIFJSFFVGFW-XEGUGMAKSA-N 0.000 description 1
- NMCBVGFGWSIGSB-NUTKFTJISA-N Trp-Ala-Leu Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N NMCBVGFGWSIGSB-NUTKFTJISA-N 0.000 description 1
- PEYSVKMXSLPQRU-FJHTZYQYSA-N Trp-Ala-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O PEYSVKMXSLPQRU-FJHTZYQYSA-N 0.000 description 1
- PNKDNKGMEHJTJQ-BPUTZDHNSA-N Trp-Arg-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N PNKDNKGMEHJTJQ-BPUTZDHNSA-N 0.000 description 1
- RNFZZCMCRDFNAE-WFBYXXMGSA-N Trp-Asn-Ala Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O RNFZZCMCRDFNAE-WFBYXXMGSA-N 0.000 description 1
- IBBBOLAPFHRDHW-BPUTZDHNSA-N Trp-Asn-Arg Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N IBBBOLAPFHRDHW-BPUTZDHNSA-N 0.000 description 1
- DXDMNBJJEXYMLA-UBHSHLNASA-N Trp-Asn-Asp Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O)=CNC2=C1 DXDMNBJJEXYMLA-UBHSHLNASA-N 0.000 description 1
- QNTBGBCOEYNAPV-CWRNSKLLSA-N Trp-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N)C(=O)O QNTBGBCOEYNAPV-CWRNSKLLSA-N 0.000 description 1
- BORCDLUWGBGTKL-XIRDDKMYSA-N Trp-Gln-Met Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O)=CNC2=C1 BORCDLUWGBGTKL-XIRDDKMYSA-N 0.000 description 1
- DXHHCIYKHRKBOC-BHYGNILZSA-N Trp-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N)C(=O)O DXHHCIYKHRKBOC-BHYGNILZSA-N 0.000 description 1
- YXONONCLMLHWJX-SZMVWBNQSA-N Trp-Glu-Leu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O)=CNC2=C1 YXONONCLMLHWJX-SZMVWBNQSA-N 0.000 description 1
- UDCHKDYNMRJYMI-QEJZJMRPSA-N Trp-Glu-Ser Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O UDCHKDYNMRJYMI-QEJZJMRPSA-N 0.000 description 1
- SNJAPSVIPKUMCK-NWLDYVSISA-N Trp-Glu-Thr Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SNJAPSVIPKUMCK-NWLDYVSISA-N 0.000 description 1
- HXNVJPQADLRHGR-JBACZVJFSA-N Trp-Glu-Tyr Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC3=CC=C(C=C3)O)C(=O)O)N HXNVJPQADLRHGR-JBACZVJFSA-N 0.000 description 1
- UYKREHOKELZSPB-JTQLQIEISA-N Trp-Gly Chemical compound C1=CC=C2C(C[C@H](N)C(=O)NCC(O)=O)=CNC2=C1 UYKREHOKELZSPB-JTQLQIEISA-N 0.000 description 1
- DZIKVMCFXIIETR-JSGCOSHPSA-N Trp-Gly-Glu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O DZIKVMCFXIIETR-JSGCOSHPSA-N 0.000 description 1
- WSGPBCAGEGHKQJ-BBRMVZONSA-N Trp-Gly-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC1=CNC2=CC=CC=C21)N WSGPBCAGEGHKQJ-BBRMVZONSA-N 0.000 description 1
- XLVRTKPAIXJYOH-HOCLYGCPSA-N Trp-His-Gly Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CN=CN3)C(=O)NCC(=O)O)N XLVRTKPAIXJYOH-HOCLYGCPSA-N 0.000 description 1
- HNIWONZFMIPCCT-SIXJUCDHSA-N Trp-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N HNIWONZFMIPCCT-SIXJUCDHSA-N 0.000 description 1
- KIMOCKLJBXHFIN-YLVFBTJISA-N Trp-Ile-Gly Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O)=CNC2=C1 KIMOCKLJBXHFIN-YLVFBTJISA-N 0.000 description 1
- YVXIAOOYAKBAAI-SZMVWBNQSA-N Trp-Leu-Gln Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O)=CNC2=C1 YVXIAOOYAKBAAI-SZMVWBNQSA-N 0.000 description 1
- VPRHDRKAPYZMHL-SZMVWBNQSA-N Trp-Leu-Glu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O)=CNC2=C1 VPRHDRKAPYZMHL-SZMVWBNQSA-N 0.000 description 1
- IQXWAJUIAQLZNX-IHPCNDPISA-N Trp-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N IQXWAJUIAQLZNX-IHPCNDPISA-N 0.000 description 1
- NWQCKAPDGQMZQN-IHPCNDPISA-N Trp-Lys-Leu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O NWQCKAPDGQMZQN-IHPCNDPISA-N 0.000 description 1
- UUIYFDAWNBSWPG-IHPCNDPISA-N Trp-Lys-Lys Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N UUIYFDAWNBSWPG-IHPCNDPISA-N 0.000 description 1
- JZSLIZLZGWOJBJ-PMVMPFDFSA-N Trp-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N JZSLIZLZGWOJBJ-PMVMPFDFSA-N 0.000 description 1
- SUEGAFMNTXXNLR-WFBYXXMGSA-N Trp-Ser-Ala Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O SUEGAFMNTXXNLR-WFBYXXMGSA-N 0.000 description 1
- ARKBYVBCEOWRNR-UBHSHLNASA-N Trp-Ser-Ser Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O ARKBYVBCEOWRNR-UBHSHLNASA-N 0.000 description 1
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 1
- BURPTJBFWIOHEY-UWJYBYFXSA-N Tyr-Ala-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 BURPTJBFWIOHEY-UWJYBYFXSA-N 0.000 description 1
- DLZKEQQWXODGGZ-KWQFWETISA-N Tyr-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 DLZKEQQWXODGGZ-KWQFWETISA-N 0.000 description 1
- ZWZOCUWOXSDYFZ-CQDKDKBSSA-N Tyr-Ala-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 ZWZOCUWOXSDYFZ-CQDKDKBSSA-N 0.000 description 1
- ADBDQGBDNUTRDB-ULQDDVLXSA-N Tyr-Arg-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O ADBDQGBDNUTRDB-ULQDDVLXSA-N 0.000 description 1
- GFZQWWDXJVGEMW-ULQDDVLXSA-N Tyr-Arg-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O GFZQWWDXJVGEMW-ULQDDVLXSA-N 0.000 description 1
- DKKHULUSOSWGHS-UWJYBYFXSA-N Tyr-Asn-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N DKKHULUSOSWGHS-UWJYBYFXSA-N 0.000 description 1
- CKKFTIQYURNSEI-IHRRRGAJSA-N Tyr-Asn-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 CKKFTIQYURNSEI-IHRRRGAJSA-N 0.000 description 1
- GFHYISDTIWZUSU-QWRGUYRKSA-N Tyr-Asn-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O GFHYISDTIWZUSU-QWRGUYRKSA-N 0.000 description 1
- MTEQZJFSEMXXRK-CFMVVWHZSA-N Tyr-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N MTEQZJFSEMXXRK-CFMVVWHZSA-N 0.000 description 1
- AYHSJESDFKREAR-KKUMJFAQSA-N Tyr-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AYHSJESDFKREAR-KKUMJFAQSA-N 0.000 description 1
- NSTPFWRAIDTNGH-BZSNNMDCSA-N Tyr-Asn-Tyr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O NSTPFWRAIDTNGH-BZSNNMDCSA-N 0.000 description 1
- GAYLGYUVTDMLKC-UWJYBYFXSA-N Tyr-Asp-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 GAYLGYUVTDMLKC-UWJYBYFXSA-N 0.000 description 1
- JWHOIHCOHMZSAR-QWRGUYRKSA-N Tyr-Asp-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 JWHOIHCOHMZSAR-QWRGUYRKSA-N 0.000 description 1
- RCLOWEZASFJFEX-KKUMJFAQSA-N Tyr-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 RCLOWEZASFJFEX-KKUMJFAQSA-N 0.000 description 1
- TZXFLDNBYYGLKA-BZSNNMDCSA-N Tyr-Asp-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=C(O)C=C1 TZXFLDNBYYGLKA-BZSNNMDCSA-N 0.000 description 1
- UXUFNBVCPAWACG-SIUGBPQLSA-N Tyr-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N UXUFNBVCPAWACG-SIUGBPQLSA-N 0.000 description 1
- XQYHLZNPOTXRMQ-KKUMJFAQSA-N Tyr-Glu-Arg Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O XQYHLZNPOTXRMQ-KKUMJFAQSA-N 0.000 description 1
- HKYTWJOWZTWBQB-AVGNSLFASA-N Tyr-Glu-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 HKYTWJOWZTWBQB-AVGNSLFASA-N 0.000 description 1
- HVHJYXDXRIWELT-RYUDHWBXSA-N Tyr-Glu-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O HVHJYXDXRIWELT-RYUDHWBXSA-N 0.000 description 1
- IMXAAEFAIBRCQF-SIUGBPQLSA-N Tyr-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N IMXAAEFAIBRCQF-SIUGBPQLSA-N 0.000 description 1
- SLCSPPCQWUHPPO-JYJNAYRXSA-N Tyr-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 SLCSPPCQWUHPPO-JYJNAYRXSA-N 0.000 description 1
- KOVXHANYYYMBRF-IRIUXVKKSA-N Tyr-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N)O KOVXHANYYYMBRF-IRIUXVKKSA-N 0.000 description 1
- PMDWYLVWHRTJIW-STQMWFEESA-N Tyr-Gly-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=C(O)C=C1 PMDWYLVWHRTJIW-STQMWFEESA-N 0.000 description 1
- AKLNEFNQWLHIGY-QWRGUYRKSA-N Tyr-Gly-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)O)C(=O)O)N)O AKLNEFNQWLHIGY-QWRGUYRKSA-N 0.000 description 1
- QAYSODICXVZUIA-WLTAIBSBSA-N Tyr-Gly-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O QAYSODICXVZUIA-WLTAIBSBSA-N 0.000 description 1
- FBHBVXUBTYVCRU-BZSNNMDCSA-N Tyr-His-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CN=CN1 FBHBVXUBTYVCRU-BZSNNMDCSA-N 0.000 description 1
- AXWBYOVVDRBOGU-SIUGBPQLSA-N Tyr-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N AXWBYOVVDRBOGU-SIUGBPQLSA-N 0.000 description 1
- GGXUDPQWAWRINY-XEGUGMAKSA-N Tyr-Ile-Gly Chemical compound OC(=O)CNC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 GGXUDPQWAWRINY-XEGUGMAKSA-N 0.000 description 1
- QARCDOCCDOLJSF-HJPIBITLSA-N Tyr-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N QARCDOCCDOLJSF-HJPIBITLSA-N 0.000 description 1
- BSCBBPKDVOZICB-KKUMJFAQSA-N Tyr-Leu-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O BSCBBPKDVOZICB-KKUMJFAQSA-N 0.000 description 1
- AVIQBBOOTZENLH-KKUMJFAQSA-N Tyr-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N AVIQBBOOTZENLH-KKUMJFAQSA-N 0.000 description 1
- KSCVLGXNQXKUAR-JYJNAYRXSA-N Tyr-Leu-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O KSCVLGXNQXKUAR-JYJNAYRXSA-N 0.000 description 1
- KHCSOLAHNLOXJR-BZSNNMDCSA-N Tyr-Leu-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O KHCSOLAHNLOXJR-BZSNNMDCSA-N 0.000 description 1
- NSGZILIDHCIZAM-KKUMJFAQSA-N Tyr-Leu-Ser Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N NSGZILIDHCIZAM-KKUMJFAQSA-N 0.000 description 1
- DMWNPLOERDAHSY-MEYUZBJRSA-N Tyr-Leu-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DMWNPLOERDAHSY-MEYUZBJRSA-N 0.000 description 1
- JAGGEZACYAAMIL-CQDKDKBSSA-N Tyr-Lys-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CC=C(C=C1)O)N JAGGEZACYAAMIL-CQDKDKBSSA-N 0.000 description 1
- ZOBLBMGJKVJVEV-BZSNNMDCSA-N Tyr-Lys-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N)O ZOBLBMGJKVJVEV-BZSNNMDCSA-N 0.000 description 1
- KGSDLCMCDFETHU-YESZJQIVSA-N Tyr-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)C(=O)O KGSDLCMCDFETHU-YESZJQIVSA-N 0.000 description 1
- PGEFRHBWGOJPJT-KKUMJFAQSA-N Tyr-Lys-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O PGEFRHBWGOJPJT-KKUMJFAQSA-N 0.000 description 1
- LMKKMCGTDANZTR-BZSNNMDCSA-N Tyr-Phe-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CC=C(O)C=C1 LMKKMCGTDANZTR-BZSNNMDCSA-N 0.000 description 1
- NVZVJIUDICCMHZ-BZSNNMDCSA-N Tyr-Phe-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O NVZVJIUDICCMHZ-BZSNNMDCSA-N 0.000 description 1
- BIWVVOHTKDLRMP-ULQDDVLXSA-N Tyr-Pro-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O BIWVVOHTKDLRMP-ULQDDVLXSA-N 0.000 description 1
- YYLHVUCSTXXKBS-IHRRRGAJSA-N Tyr-Pro-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O YYLHVUCSTXXKBS-IHRRRGAJSA-N 0.000 description 1
- RWOKVQUCENPXGE-IHRRRGAJSA-N Tyr-Ser-Arg Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O RWOKVQUCENPXGE-IHRRRGAJSA-N 0.000 description 1
- HRHYJNLMIJWGLF-BZSNNMDCSA-N Tyr-Ser-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 HRHYJNLMIJWGLF-BZSNNMDCSA-N 0.000 description 1
- LUMQYLVYUIRHHU-YJRXYDGGSA-N Tyr-Ser-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LUMQYLVYUIRHHU-YJRXYDGGSA-N 0.000 description 1
- HMPMGPISLMLHSI-JBACZVJFSA-N Tyr-Trp-Gln Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC3=CC=C(C=C3)O)N HMPMGPISLMLHSI-JBACZVJFSA-N 0.000 description 1
- DJSYPCWZPNHQQE-FHWLQOOXSA-N Tyr-Tyr-Gln Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CCC(N)=O)C(O)=O)C1=CC=C(O)C=C1 DJSYPCWZPNHQQE-FHWLQOOXSA-N 0.000 description 1
- KRXFXDCNKLANCP-CXTHYWKRSA-N Tyr-Tyr-Ile Chemical compound C([C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 KRXFXDCNKLANCP-CXTHYWKRSA-N 0.000 description 1
- KHPLUFDSWGDRHD-SLFFLAALSA-N Tyr-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CC3=CC=C(C=C3)O)N)C(=O)O KHPLUFDSWGDRHD-SLFFLAALSA-N 0.000 description 1
- AGDDLOQMXUQPDY-BZSNNMDCSA-N Tyr-Tyr-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O AGDDLOQMXUQPDY-BZSNNMDCSA-N 0.000 description 1
- KSGKJSFPWSMJHK-JNPHEJMOSA-N Tyr-Tyr-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KSGKJSFPWSMJHK-JNPHEJMOSA-N 0.000 description 1
- FZADUTOCSFDBRV-RNXOBYDBSA-N Tyr-Tyr-Trp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CC=C(O)C=C1 FZADUTOCSFDBRV-RNXOBYDBSA-N 0.000 description 1
- HZWPGKAKGYJWCI-ULQDDVLXSA-N Tyr-Val-Leu Chemical compound CC(C)C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)Cc1ccc(O)cc1)C(C)C)C(O)=O HZWPGKAKGYJWCI-ULQDDVLXSA-N 0.000 description 1
- 108090000848 Ubiquitin Proteins 0.000 description 1
- 102000044159 Ubiquitin Human genes 0.000 description 1
- YFOCMOVJBQDBCE-NRPADANISA-N Val-Ala-Glu Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N YFOCMOVJBQDBCE-NRPADANISA-N 0.000 description 1
- UUYCNAXCCDNULB-QXEWZRGKSA-N Val-Arg-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(N)=O)C(O)=O UUYCNAXCCDNULB-QXEWZRGKSA-N 0.000 description 1
- COYSIHFOCOMGCF-UHFFFAOYSA-N Val-Arg-Gly Natural products CC(C)C(N)C(=O)NC(C(=O)NCC(O)=O)CCCN=C(N)N COYSIHFOCOMGCF-UHFFFAOYSA-N 0.000 description 1
- JYVKKBDANPZIAW-AVGNSLFASA-N Val-Arg-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](C(C)C)N JYVKKBDANPZIAW-AVGNSLFASA-N 0.000 description 1
- CVUDMNSZAIZFAE-TUAOUCFPSA-N Val-Arg-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N1CCC[C@@H]1C(=O)O)N CVUDMNSZAIZFAE-TUAOUCFPSA-N 0.000 description 1
- CVUDMNSZAIZFAE-UHFFFAOYSA-N Val-Arg-Pro Natural products NC(N)=NCCCC(NC(=O)C(N)C(C)C)C(=O)N1CCCC1C(O)=O CVUDMNSZAIZFAE-UHFFFAOYSA-N 0.000 description 1
- ZMDCGGKHRKNWKD-LAEOZQHASA-N Val-Asn-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZMDCGGKHRKNWKD-LAEOZQHASA-N 0.000 description 1
- QGFPYRPIUXBYGR-YDHLFZDLSA-N Val-Asn-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N QGFPYRPIUXBYGR-YDHLFZDLSA-N 0.000 description 1
- PVPAOIGJYHVWBT-KKHAAJSZSA-N Val-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](C(C)C)N)O PVPAOIGJYHVWBT-KKHAAJSZSA-N 0.000 description 1
- ISERLACIZUGCDX-ZKWXMUAHSA-N Val-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N ISERLACIZUGCDX-ZKWXMUAHSA-N 0.000 description 1
- QHDXUYOYTPWCSK-RCOVLWMOSA-N Val-Asp-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)NCC(=O)O)N QHDXUYOYTPWCSK-RCOVLWMOSA-N 0.000 description 1
- ZQGPWORGSNRQLN-NHCYSSNCSA-N Val-Asp-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N ZQGPWORGSNRQLN-NHCYSSNCSA-N 0.000 description 1
- XLDYBRXERHITNH-QSFUFRPTSA-N Val-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)C(C)C XLDYBRXERHITNH-QSFUFRPTSA-N 0.000 description 1
- HHSILIQTHXABKM-YDHLFZDLSA-N Val-Asp-Phe Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](Cc1ccccc1)C(O)=O HHSILIQTHXABKM-YDHLFZDLSA-N 0.000 description 1
- BWVHQINTNLVWGZ-ZKWXMUAHSA-N Val-Cys-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)O)C(=O)O)N BWVHQINTNLVWGZ-ZKWXMUAHSA-N 0.000 description 1
- JXGWQYWDUOWQHA-DZKIICNBSA-N Val-Gln-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N JXGWQYWDUOWQHA-DZKIICNBSA-N 0.000 description 1
- BRPKEERLGYNCNC-NHCYSSNCSA-N Val-Glu-Arg Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N BRPKEERLGYNCNC-NHCYSSNCSA-N 0.000 description 1
- OQWNEUXPKHIEJO-NRPADANISA-N Val-Glu-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CO)C(=O)O)N OQWNEUXPKHIEJO-NRPADANISA-N 0.000 description 1
- JTWIMNMUYLQNPI-WPRPVWTQSA-N Val-Gly-Arg Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCNC(N)=N JTWIMNMUYLQNPI-WPRPVWTQSA-N 0.000 description 1
- JPPXDMBGXJBTIB-ULQDDVLXSA-N Val-His-Tyr Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N JPPXDMBGXJBTIB-ULQDDVLXSA-N 0.000 description 1
- VXDSPJJQUQDCKH-UKJIMTQDSA-N Val-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N VXDSPJJQUQDCKH-UKJIMTQDSA-N 0.000 description 1
- FTKXYXACXYOHND-XUXIUFHCSA-N Val-Ile-Leu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O FTKXYXACXYOHND-XUXIUFHCSA-N 0.000 description 1
- PYXQBKJPHNCTNW-CYDGBPFRSA-N Val-Ile-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](C(C)C)N PYXQBKJPHNCTNW-CYDGBPFRSA-N 0.000 description 1
- BZWUSZGQOILYEU-STECZYCISA-N Val-Ile-Tyr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 BZWUSZGQOILYEU-STECZYCISA-N 0.000 description 1
- FEXILLGKGGTLRI-NHCYSSNCSA-N Val-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N FEXILLGKGGTLRI-NHCYSSNCSA-N 0.000 description 1
- BMOFUVHDBROBSE-DCAQKATOSA-N Val-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](C(C)C)N BMOFUVHDBROBSE-DCAQKATOSA-N 0.000 description 1
- HGJRMXOWUWVUOA-GVXVVHGQSA-N Val-Leu-Gln Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N HGJRMXOWUWVUOA-GVXVVHGQSA-N 0.000 description 1
- UMPVMAYCLYMYGA-ONGXEEELSA-N Val-Leu-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O UMPVMAYCLYMYGA-ONGXEEELSA-N 0.000 description 1
- XTDDIVQWDXMRJL-IHRRRGAJSA-N Val-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](C(C)C)N XTDDIVQWDXMRJL-IHRRRGAJSA-N 0.000 description 1
- BTWMICVCQLKKNR-DCAQKATOSA-N Val-Leu-Ser Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C([O-])=O BTWMICVCQLKKNR-DCAQKATOSA-N 0.000 description 1
- SYSWVVCYSXBVJG-RHYQMDGZSA-N Val-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)N)O SYSWVVCYSXBVJG-RHYQMDGZSA-N 0.000 description 1
- RWOGENDAOGMHLX-DCAQKATOSA-N Val-Lys-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](C(C)C)N RWOGENDAOGMHLX-DCAQKATOSA-N 0.000 description 1
- CXWJFWAZIVWBOS-XQQFMLRXSA-N Val-Lys-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N CXWJFWAZIVWBOS-XQQFMLRXSA-N 0.000 description 1
- UOUIMEGEPSBZIV-ULQDDVLXSA-N Val-Lys-Tyr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 UOUIMEGEPSBZIV-ULQDDVLXSA-N 0.000 description 1
- UEPLNXPLHJUYPT-AVGNSLFASA-N Val-Met-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(O)=O UEPLNXPLHJUYPT-AVGNSLFASA-N 0.000 description 1
- WSUWDIVCPOJFCX-TUAOUCFPSA-N Val-Met-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N1CCC[C@@H]1C(=O)O)N WSUWDIVCPOJFCX-TUAOUCFPSA-N 0.000 description 1
- QPPZEDOTPZOSEC-RCWTZXSCSA-N Val-Met-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](C(C)C)N)O QPPZEDOTPZOSEC-RCWTZXSCSA-N 0.000 description 1
- UZFNHAXYMICTBU-DZKIICNBSA-N Val-Phe-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N UZFNHAXYMICTBU-DZKIICNBSA-N 0.000 description 1
- UXODSMTVPWXHBT-ULQDDVLXSA-N Val-Phe-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N UXODSMTVPWXHBT-ULQDDVLXSA-N 0.000 description 1
- MHHAWNPHDLCPLF-ULQDDVLXSA-N Val-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC1=CC=CC=C1 MHHAWNPHDLCPLF-ULQDDVLXSA-N 0.000 description 1
- USLVEJAHTBLSIL-CYDGBPFRSA-N Val-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)C(C)C USLVEJAHTBLSIL-CYDGBPFRSA-N 0.000 description 1
- DOFAQXCYFQKSHT-SRVKXCTJSA-N Val-Pro-Pro Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 DOFAQXCYFQKSHT-SRVKXCTJSA-N 0.000 description 1
- NSUUANXHLKKHQB-BZSNNMDCSA-N Val-Pro-Trp Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CNC2=CC=CC=C12 NSUUANXHLKKHQB-BZSNNMDCSA-N 0.000 description 1
- QWCZXKIFPWPQHR-JYJNAYRXSA-N Val-Pro-Tyr Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 QWCZXKIFPWPQHR-JYJNAYRXSA-N 0.000 description 1
- DEGUERSKQBRZMZ-FXQIFTODSA-N Val-Ser-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O DEGUERSKQBRZMZ-FXQIFTODSA-N 0.000 description 1
- PGQUDQYHWICSAB-NAKRPEOUSA-N Val-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](C(C)C)N PGQUDQYHWICSAB-NAKRPEOUSA-N 0.000 description 1
- VHIZXDZMTDVFGX-DCAQKATOSA-N Val-Ser-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](C(C)C)N VHIZXDZMTDVFGX-DCAQKATOSA-N 0.000 description 1
- QTPQHINADBYBNA-DCAQKATOSA-N Val-Ser-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN QTPQHINADBYBNA-DCAQKATOSA-N 0.000 description 1
- GBIUHAYJGWVNLN-UHFFFAOYSA-N Val-Ser-Pro Natural products CC(C)C(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O GBIUHAYJGWVNLN-UHFFFAOYSA-N 0.000 description 1
- UJMCYJKPDFQLHX-XGEHTFHBSA-N Val-Ser-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](C(C)C)N)O UJMCYJKPDFQLHX-XGEHTFHBSA-N 0.000 description 1
- MNSSBIHFEUUXNW-RCWTZXSCSA-N Val-Thr-Arg Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N MNSSBIHFEUUXNW-RCWTZXSCSA-N 0.000 description 1
- DLRZGNXCXUGIDG-KKHAAJSZSA-N Val-Thr-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O DLRZGNXCXUGIDG-KKHAAJSZSA-N 0.000 description 1
- UQMPYVLTQCGRSK-IFFSRLJSSA-N Val-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N)O UQMPYVLTQCGRSK-IFFSRLJSSA-N 0.000 description 1
- UVHFONIHVHLDDQ-IFFSRLJSSA-N Val-Thr-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O UVHFONIHVHLDDQ-IFFSRLJSSA-N 0.000 description 1
- GVNLOVJNNDZUHS-RHYQMDGZSA-N Val-Thr-Lys Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O GVNLOVJNNDZUHS-RHYQMDGZSA-N 0.000 description 1
- PDDJTOSAVNRJRH-UNQGMJICSA-N Val-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](C(C)C)N)O PDDJTOSAVNRJRH-UNQGMJICSA-N 0.000 description 1
- YLBNZCJFSVJDRJ-KJEVXHAQSA-N Val-Thr-Tyr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](Cc1ccc(O)cc1)C(O)=O YLBNZCJFSVJDRJ-KJEVXHAQSA-N 0.000 description 1
- NGXQOQNXSGOYOI-BQFCYCMXSA-N Val-Trp-Gln Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O)=CNC2=C1 NGXQOQNXSGOYOI-BQFCYCMXSA-N 0.000 description 1
- POFQRHFHYPSCOI-FHWLQOOXSA-N Val-Trp-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)N POFQRHFHYPSCOI-FHWLQOOXSA-N 0.000 description 1
- VBTFUDNTMCHPII-FKBYEOEOSA-N Val-Trp-Tyr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@@H](Cc1ccc(O)cc1)C(O)=O VBTFUDNTMCHPII-FKBYEOEOSA-N 0.000 description 1
- VBTFUDNTMCHPII-UHFFFAOYSA-N Val-Trp-Tyr Natural products C=1NC2=CC=CC=C2C=1CC(NC(=O)C(N)C(C)C)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 VBTFUDNTMCHPII-UHFFFAOYSA-N 0.000 description 1
- QPJSIBAOZBVELU-BPNCWPANSA-N Val-Tyr-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](C(C)C)N QPJSIBAOZBVELU-BPNCWPANSA-N 0.000 description 1
- JXCOEPXCBVCTRD-JYJNAYRXSA-N Val-Tyr-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N JXCOEPXCBVCTRD-JYJNAYRXSA-N 0.000 description 1
- PFMSJVIPEZMKSC-DZKIICNBSA-N Val-Tyr-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N PFMSJVIPEZMKSC-DZKIICNBSA-N 0.000 description 1
- NLNCNKIVJPEFBC-DLOVCJGASA-N Val-Val-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCC(O)=O NLNCNKIVJPEFBC-DLOVCJGASA-N 0.000 description 1
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 1
- 241000082085 Verticillium <Phyllachorales> Species 0.000 description 1
- 101000645119 Vibrio campbellii (strain ATCC BAA-1116 / BB120) Nucleotide-binding protein VIBHAR_03667 Proteins 0.000 description 1
- 208000036142 Viral infection Diseases 0.000 description 1
- 241000123579 Xenorhabdus bovienii Species 0.000 description 1
- 241000123581 Xenorhabdus poinarii Species 0.000 description 1
- 241000209149 Zea Species 0.000 description 1
- 235000007244 Zea mays Nutrition 0.000 description 1
- 108010055615 Zein Proteins 0.000 description 1
- 229920002494 Zein Polymers 0.000 description 1
- HYJODZUSLXOFNC-UHFFFAOYSA-N [S].[Cl] Chemical compound [S].[Cl] HYJODZUSLXOFNC-UHFFFAOYSA-N 0.000 description 1
- 238000002835 absorbance Methods 0.000 description 1
- 238000010521 absorption reaction Methods 0.000 description 1
- 238000009825 accumulation Methods 0.000 description 1
- 239000002253 acid Substances 0.000 description 1
- 150000007513 acids Chemical class 0.000 description 1
- 230000009471 action Effects 0.000 description 1
- 230000004913 activation Effects 0.000 description 1
- 239000011149 active material Substances 0.000 description 1
- 239000000654 additive Substances 0.000 description 1
- 230000000996 additive effect Effects 0.000 description 1
- 229960000643 adenine Drugs 0.000 description 1
- 239000002671 adjuvant Substances 0.000 description 1
- 235000004279 alanine Nutrition 0.000 description 1
- 108010008685 alanyl-glutamyl-aspartic acid Proteins 0.000 description 1
- 108010069020 alanyl-prolyl-glycine Proteins 0.000 description 1
- 108010045023 alanyl-prolyl-tyrosine Proteins 0.000 description 1
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 1
- 108010045350 alanyl-tyrosyl-alanine Proteins 0.000 description 1
- 108010041407 alanylaspartic acid Proteins 0.000 description 1
- 108010070944 alanylhistidine Proteins 0.000 description 1
- 108010087924 alanylproline Proteins 0.000 description 1
- 108010070783 alanyltyrosine Proteins 0.000 description 1
- 125000000266 alpha-aminoacyl group Chemical group 0.000 description 1
- 102000006646 aminoglycoside phosphotransferase Human genes 0.000 description 1
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 1
- 108010052670 arginyl-glutamyl-glutamic acid Proteins 0.000 description 1
- 108010089975 arginyl-glycyl-aspartyl-serine Proteins 0.000 description 1
- 108010043240 arginyl-leucyl-glycine Proteins 0.000 description 1
- 235000009582 asparagine Nutrition 0.000 description 1
- 229960001230 asparagine Drugs 0.000 description 1
- 108010010430 asparagine-proline-alanine Proteins 0.000 description 1
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 1
- IQFYYKKMVGJFEH-UHFFFAOYSA-N beta-L-thymidine Natural products O=C1NC(=O)C(C)=CN1C1OC(CO)C(O)C1 IQFYYKKMVGJFEH-UHFFFAOYSA-N 0.000 description 1
- GINJFDRNADDBIN-FXQIFTODSA-N bilanafos Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCP(C)(O)=O GINJFDRNADDBIN-FXQIFTODSA-N 0.000 description 1
- 238000004166 bioassay Methods 0.000 description 1
- 230000029918 bioluminescence Effects 0.000 description 1
- 238000005415 bioluminescence Methods 0.000 description 1
- 229960001561 bleomycin Drugs 0.000 description 1
- OYVAGSVQBOHSSS-UAPAGMARSA-O bleomycin A2 Chemical compound N([C@H](C(=O)N[C@H](C)[C@@H](O)[C@H](C)C(=O)N[C@@H]([C@H](O)C)C(=O)NCCC=1SC=C(N=1)C=1SC=C(N=1)C(=O)NCCC[S+](C)C)[C@@H](O[C@H]1[C@H]([C@@H](O)[C@H](O)[C@H](CO)O1)O[C@@H]1[C@H]([C@@H](OC(N)=O)[C@H](O)[C@@H](CO)O1)O)C=1N=CNC=1)C(=O)C1=NC([C@H](CC(N)=O)NC[C@H](N)C(N)=O)=NC(N)=C1C OYVAGSVQBOHSSS-UAPAGMARSA-O 0.000 description 1
- 238000009835 boiling Methods 0.000 description 1
- 238000009395 breeding Methods 0.000 description 1
- 230000001488 breeding effect Effects 0.000 description 1
- 210000000234 capsid Anatomy 0.000 description 1
- 230000015556 catabolic process Effects 0.000 description 1
- 230000010261 cell growth Effects 0.000 description 1
- VJYIFXVZLXQVHO-UHFFFAOYSA-N chlorsulfuron Chemical compound COC1=NC(C)=NC(NC(=O)NS(=O)(=O)C=2C(=CC=CC=2)Cl)=N1 VJYIFXVZLXQVHO-UHFFFAOYSA-N 0.000 description 1
- 238000003776 cleavage reaction Methods 0.000 description 1
- 230000004186 co-expression Effects 0.000 description 1
- 230000001427 coherent effect Effects 0.000 description 1
- 210000001072 colon Anatomy 0.000 description 1
- 210000002777 columnar cell Anatomy 0.000 description 1
- 150000001875 compounds Chemical class 0.000 description 1
- 239000013068 control sample Substances 0.000 description 1
- 229940089639 cornsilk Drugs 0.000 description 1
- 210000002251 corpora allata Anatomy 0.000 description 1
- 239000013078 crystal Substances 0.000 description 1
- 235000018417 cysteine Nutrition 0.000 description 1
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 description 1
- 230000009089 cytolysis Effects 0.000 description 1
- 230000003247 decreasing effect Effects 0.000 description 1
- 238000006731 degradation reaction Methods 0.000 description 1
- 239000003398 denaturant Substances 0.000 description 1
- 238000004925 denaturation Methods 0.000 description 1
- 230000036425 denaturation Effects 0.000 description 1
- 108010033011 des-Arg- enterostatin Proteins 0.000 description 1
- 230000001066 destructive effect Effects 0.000 description 1
- 238000003745 diagnosis Methods 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 235000014113 dietary fatty acids Nutrition 0.000 description 1
- 230000004069 differentiation Effects 0.000 description 1
- 108010009297 diglycyl-histidine Proteins 0.000 description 1
- 108010054813 diprotin B Proteins 0.000 description 1
- CEYULKASIQJZGP-UHFFFAOYSA-L disodium;2-(carboxymethyl)-2-hydroxybutanedioate Chemical compound [Na+].[Na+].[O-]C(=O)CC(O)(C(=O)O)CC([O-])=O CEYULKASIQJZGP-UHFFFAOYSA-L 0.000 description 1
- 239000002270 dispersing agent Substances 0.000 description 1
- 239000002552 dosage form Substances 0.000 description 1
- 238000010410 dusting Methods 0.000 description 1
- 239000000975 dye Substances 0.000 description 1
- 102100035859 eIF5-mimic protein 2 Human genes 0.000 description 1
- 230000005611 electricity Effects 0.000 description 1
- 238000001962 electrophoresis Methods 0.000 description 1
- 239000004495 emulsifiable concentrate Substances 0.000 description 1
- 239000003995 emulsifying agent Substances 0.000 description 1
- 210000002919 epithelial cell Anatomy 0.000 description 1
- 102000013165 exonuclease Human genes 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 229930195729 fatty acid Natural products 0.000 description 1
- 239000000194 fatty acid Substances 0.000 description 1
- 150000004665 fatty acids Chemical class 0.000 description 1
- 230000002349 favourable effect Effects 0.000 description 1
- 239000006260 foam Substances 0.000 description 1
- 238000005187 foaming Methods 0.000 description 1
- ZHNUHDYFZUAESO-UHFFFAOYSA-N formamide Substances NC=O ZHNUHDYFZUAESO-UHFFFAOYSA-N 0.000 description 1
- 239000003205 fragrance Substances 0.000 description 1
- 235000012055 fruits and vegetables Nutrition 0.000 description 1
- 230000004927 fusion Effects 0.000 description 1
- 239000008273 gelatin Substances 0.000 description 1
- 229920000159 gelatin Polymers 0.000 description 1
- 235000019322 gelatine Nutrition 0.000 description 1
- 235000011852 gelatine desserts Nutrition 0.000 description 1
- 238000012215 gene cloning Methods 0.000 description 1
- 108010042598 glutamyl-aspartyl-glycine Proteins 0.000 description 1
- 108010057083 glutamyl-aspartyl-leucine Proteins 0.000 description 1
- 108010008237 glutamyl-valyl-glycine Proteins 0.000 description 1
- 108010000434 glycyl-alanyl-leucine Proteins 0.000 description 1
- 108010050475 glycyl-leucyl-tyrosine Proteins 0.000 description 1
- 108010082286 glycyl-seryl-alanine Proteins 0.000 description 1
- 108010074027 glycyl-seryl-phenylalanine Proteins 0.000 description 1
- 108010089804 glycyl-threonine Proteins 0.000 description 1
- 108010059898 glycyl-tyrosyl-lysine Proteins 0.000 description 1
- 108010081551 glycylphenylalanine Proteins 0.000 description 1
- 108010077515 glycylproline Proteins 0.000 description 1
- 108010084389 glycyltryptophan Proteins 0.000 description 1
- XDDAORKBJWWYJS-UHFFFAOYSA-N glyphosate Chemical compound OC(=O)CNCP(O)(O)=O XDDAORKBJWWYJS-UHFFFAOYSA-N 0.000 description 1
- 229940097068 glyphosate Drugs 0.000 description 1
- 210000002175 goblet cell Anatomy 0.000 description 1
- 239000011544 gradient gel Substances 0.000 description 1
- 239000005090 green fluorescent protein Substances 0.000 description 1
- 238000003306 harvesting Methods 0.000 description 1
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 1
- 108010036413 histidylglycine Proteins 0.000 description 1
- 108010018006 histidylserine Proteins 0.000 description 1
- 230000007062 hydrolysis Effects 0.000 description 1
- 238000006460 hydrolysis reaction Methods 0.000 description 1
- 230000036039 immunity Effects 0.000 description 1
- 238000001114 immunoprecipitation Methods 0.000 description 1
- 230000003116 impacting effect Effects 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 238000000338 in vitro Methods 0.000 description 1
- 238000001727 in vivo Methods 0.000 description 1
- 230000002779 inactivation Effects 0.000 description 1
- 210000003000 inclusion body Anatomy 0.000 description 1
- 238000010348 incorporation Methods 0.000 description 1
- 238000007373 indentation Methods 0.000 description 1
- 230000002458 infectious effect Effects 0.000 description 1
- 230000005764 inhibitory process Effects 0.000 description 1
- 229910052500 inorganic mineral Inorganic materials 0.000 description 1
- 238000007689 inspection Methods 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 210000004347 intestinal mucosa Anatomy 0.000 description 1
- 230000003834 intracellular effect Effects 0.000 description 1
- 108010031424 isoleucyl-prolyl-proline Proteins 0.000 description 1
- 108010060857 isoleucyl-valyl-tyrosine Proteins 0.000 description 1
- 108010078274 isoleucylvaline Proteins 0.000 description 1
- XUWPJKDMEZSVTP-LTYMHZPRSA-N kalafungina Chemical compound O=C1C2=C(O)C=CC=C2C(=O)C2=C1[C@@H](C)O[C@H]1[C@@H]2OC(=O)C1 XUWPJKDMEZSVTP-LTYMHZPRSA-N 0.000 description 1
- 108010053037 kyotorphin Proteins 0.000 description 1
- 108010077158 leucinyl-arginyl-tryptophan Proteins 0.000 description 1
- 108010076756 leucyl-alanyl-phenylalanine Proteins 0.000 description 1
- 108010044311 leucyl-glycyl-glycine Proteins 0.000 description 1
- 108010047926 leucyl-lysyl-tyrosine Proteins 0.000 description 1
- 108010087810 leucyl-seryl-glutamyl-leucine Proteins 0.000 description 1
- 239000007791 liquid phase Substances 0.000 description 1
- 108010064235 lysylglycine Proteins 0.000 description 1
- 108010054155 lysyllysine Proteins 0.000 description 1
- 108010017391 lysylvaline Proteins 0.000 description 1
- 230000014759 maintenance of location Effects 0.000 description 1
- 238000004949 mass spectrometry Methods 0.000 description 1
- 230000010534 mechanism of action Effects 0.000 description 1
- 229930182817 methionine Natural products 0.000 description 1
- 108010085203 methionylmethionine Proteins 0.000 description 1
- 229960000485 methotrexate Drugs 0.000 description 1
- 230000000813 microbial effect Effects 0.000 description 1
- 244000005706 microflora Species 0.000 description 1
- 239000011859 microparticle Substances 0.000 description 1
- 238000007479 molecular analysis Methods 0.000 description 1
- 238000000329 molecular dynamics simulation Methods 0.000 description 1
- 239000000178 monomer Substances 0.000 description 1
- 230000004899 motility Effects 0.000 description 1
- 229960004927 neomycin Drugs 0.000 description 1
- 230000007935 neutral effect Effects 0.000 description 1
- 238000006386 neutralization reaction Methods 0.000 description 1
- 239000000618 nitrogen fertilizer Substances 0.000 description 1
- 108010058731 nopaline synthase Proteins 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 210000000056 organ Anatomy 0.000 description 1
- 210000004681 ovum Anatomy 0.000 description 1
- 238000005192 partition Methods 0.000 description 1
- 230000001717 pathogenic effect Effects 0.000 description 1
- 108010082795 phenylalanyl-arginyl-arginine Proteins 0.000 description 1
- 108010072637 phenylalanyl-arginyl-phenylalanine Proteins 0.000 description 1
- 108010070409 phenylalanyl-glycyl-glycine Proteins 0.000 description 1
- 108010064486 phenylalanyl-leucyl-valine Proteins 0.000 description 1
- 108010084572 phenylalanyl-valine Proteins 0.000 description 1
- 108010073101 phenylalanylleucine Proteins 0.000 description 1
- 229910000073 phosphorus hydride Inorganic materials 0.000 description 1
- 229910052615 phyllosilicate Inorganic materials 0.000 description 1
- 230000019612 pigmentation Effects 0.000 description 1
- 238000003976 plant breeding Methods 0.000 description 1
- 239000003375 plant hormone Substances 0.000 description 1
- 231100000614 poison Toxicity 0.000 description 1
- 231100000572 poisoning Toxicity 0.000 description 1
- 230000000607 poisoning effect Effects 0.000 description 1
- 229920001184 polypeptide Polymers 0.000 description 1
- 229920001155 polypropylene Polymers 0.000 description 1
- 229910000160 potassium phosphate Inorganic materials 0.000 description 1
- 235000011009 potassium phosphates Nutrition 0.000 description 1
- 238000004321 preservation Methods 0.000 description 1
- 108700042769 prolyl-leucyl-glycine Proteins 0.000 description 1
- 108010077112 prolyl-proline Proteins 0.000 description 1
- 108010053725 prolylvaline Proteins 0.000 description 1
- 230000001681 protective effect Effects 0.000 description 1
- 230000004853 protein function Effects 0.000 description 1
- 238000001742 protein purification Methods 0.000 description 1
- 230000007026 protein scission Effects 0.000 description 1
- 239000012460 protein solution Substances 0.000 description 1
- 230000017854 proteolysis Effects 0.000 description 1
- 210000001938 protoplast Anatomy 0.000 description 1
- YIEHRKORSQMTOV-UHFFFAOYSA-N pyrimidine;2h-triazolo[4,5-d]pyrimidine Chemical compound C1=CN=CN=C1.N1=CN=CC2=NNN=C21 YIEHRKORSQMTOV-UHFFFAOYSA-N 0.000 description 1
- 238000013102 re-test Methods 0.000 description 1
- 239000012744 reinforcing agent Substances 0.000 description 1
- 230000008521 reorganization Effects 0.000 description 1
- 230000000717 retained effect Effects 0.000 description 1
- 235000009566 rice Nutrition 0.000 description 1
- 230000000630 rising effect Effects 0.000 description 1
- 239000012723 sample buffer Substances 0.000 description 1
- 230000007017 scission Effects 0.000 description 1
- 108010048818 seryl-histidine Proteins 0.000 description 1
- 108010069117 seryl-lysyl-aspartic acid Proteins 0.000 description 1
- 108010048397 seryl-lysyl-leucine Proteins 0.000 description 1
- 230000035939 shock Effects 0.000 description 1
- 208000007056 sickle cell anemia Diseases 0.000 description 1
- 230000037432 silent mutation Effects 0.000 description 1
- 238000004088 simulation Methods 0.000 description 1
- 108091029842 small nuclear ribonucleic acid Proteins 0.000 description 1
- 239000011780 sodium chloride Substances 0.000 description 1
- 239000007787 solid Substances 0.000 description 1
- 239000007790 solid phase Substances 0.000 description 1
- 239000011877 solvent mixture Substances 0.000 description 1
- 238000010183 spectrum analysis Methods 0.000 description 1
- 108010005652 splenotritin Proteins 0.000 description 1
- 239000007921 spray Substances 0.000 description 1
- 230000006641 stabilisation Effects 0.000 description 1
- 239000003381 stabilizer Substances 0.000 description 1
- 230000000638 stimulation Effects 0.000 description 1
- 238000003860 storage Methods 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 239000000725 suspension Substances 0.000 description 1
- 230000031068 symbiosis, encompassing mutualism through parasitism Effects 0.000 description 1
- 101150078747 tcaA gene Proteins 0.000 description 1
- 238000010998 test method Methods 0.000 description 1
- DPJRMOMPQZCRJU-UHFFFAOYSA-M thiamine hydrochloride Chemical compound Cl.[Cl-].CC1=C(CCO)SC=[N+]1CC1=CN=C(C)N=C1N DPJRMOMPQZCRJU-UHFFFAOYSA-M 0.000 description 1
- 108010031491 threonyl-lysyl-glutamic acid Proteins 0.000 description 1
- 229940104230 thymidine Drugs 0.000 description 1
- 231100000167 toxic agent Toxicity 0.000 description 1
- 239000003440 toxic substance Substances 0.000 description 1
- 231100000033 toxigenic Toxicity 0.000 description 1
- 230000001551 toxigenic effect Effects 0.000 description 1
- 238000012549 training Methods 0.000 description 1
- 230000005030 transcription termination Effects 0.000 description 1
- 230000014621 translational initiation Effects 0.000 description 1
- 150000003852 triazoles Chemical class 0.000 description 1
- 108010015666 tryptophyl-leucyl-glutamic acid Proteins 0.000 description 1
- 108010038745 tryptophylglycine Proteins 0.000 description 1
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 1
- 108010005834 tyrosyl-alanyl-glycine Proteins 0.000 description 1
- 108010017949 tyrosyl-glycyl-glycine Proteins 0.000 description 1
- 108010020532 tyrosyl-proline Proteins 0.000 description 1
- 108010078580 tyrosylleucine Proteins 0.000 description 1
- 241000701366 unidentified nuclear polyhedrosis viruses Species 0.000 description 1
- 229940035893 uracil Drugs 0.000 description 1
- 239000004474 valine Substances 0.000 description 1
- 108010015385 valyl-prolyl-proline Proteins 0.000 description 1
- 108010009962 valyltyrosine Proteins 0.000 description 1
- 235000013311 vegetables Nutrition 0.000 description 1
- 235000020234 walnut Nutrition 0.000 description 1
- 238000001262 western blot Methods 0.000 description 1
- 239000004563 wettable powder Substances 0.000 description 1
- 108010027345 wheylin-1 peptide Proteins 0.000 description 1
- 239000001231 zea mays silk Substances 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/195—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01N—PRESERVATION OF BODIES OF HUMANS OR ANIMALS OR PLANTS OR PARTS THEREOF; BIOCIDES, e.g. AS DISINFECTANTS, AS PESTICIDES OR AS HERBICIDES; PEST REPELLANTS OR ATTRACTANTS; PLANT GROWTH REGULATORS
- A01N37/00—Biocides, pest repellants or attractants, or plant growth regulators containing organic compounds containing a carbon atom having three bonds to hetero atoms with at the most two bonds to halogen, e.g. carboxylic acids
- A01N37/44—Biocides, pest repellants or attractants, or plant growth regulators containing organic compounds containing a carbon atom having three bonds to hetero atoms with at the most two bonds to halogen, e.g. carboxylic acids containing at least one carboxylic group or a thio analogue, or a derivative thereof, and a nitrogen atom attached to the same carbon skeleton by a single or double bond, this nitrogen atom not being a member of a derivative or of a thio analogue of a carboxylic group, e.g. amino-carboxylic acids
- A01N37/46—N-acyl derivatives
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01N—PRESERVATION OF BODIES OF HUMANS OR ANIMALS OR PLANTS OR PARTS THEREOF; BIOCIDES, e.g. AS DISINFECTANTS, AS PESTICIDES OR AS HERBICIDES; PEST REPELLANTS OR ATTRACTANTS; PLANT GROWTH REGULATORS
- A01N63/00—Biocides, pest repellants or attractants, or plant growth regulators containing microorganisms, viruses, microbial fungi, animals or substances produced by, or obtained from, microorganisms, viruses, microbial fungi or animals, e.g. enzymes or fermentates
- A01N63/50—Isolated enzymes; Isolated proteins
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/195—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
- C07K14/24—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Enterobacteriaceae (F), e.g. Citrobacter, Serratia, Proteus, Providencia, Morganella, Yersinia
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8241—Phenotypically and genetically modified plants via recombinant DNA technology
- C12N15/8261—Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield
- C12N15/8271—Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance
- C12N15/8279—Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance for biotic stress resistance, pathogen resistance, disease resistance
- C12N15/8286—Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance for biotic stress resistance, pathogen resistance, disease resistance for insect resistance
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A40/00—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
- Y02A40/10—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture
- Y02A40/146—Genetically Modified [GMO] plants, e.g. transgenic plants
Abstract
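The subject invention relates to the surprising discovery that toxin complex (TC) proteins obtained from Xenorhabdus, Photorhabdus, and Paenibacillus can be used interchangeably with one another. In particularly preferred embodiments of the invention, the toxicity of a "stand-alone" TC protein (for example, from Photorhabdus, Xenorhabdus, or Paenibacillus) is enhanced by one or more TC protein "potentiators" whose source organism belongs to a different genus than the organism from which the toxin is derived. As one skilled in the art will recognize with the benefit of this disclosure, this has broad implications and expands the range of utility that individual types of TC proteins are now recognized to have. Most importantly, one skilled in the art can now use a single set of potentiators to enhance the activity of a stand-alone Xenorhabdus protein toxin as well as a stand-alone Photorhabdus protein toxin. This reduces the number of genes, and transformation events, that a transgenic plant must express to achieve effective control of a broad spectrum of target pests. Certain preferred combinations of heterologous TC proteins are also disclosed herein.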
Description
发明背景
在作物损失和使这些害虫得到控制的代价方面,昆虫和其它害虫每年花费农民几十亿美元。在农业生产环境中由虫害引起的损失包括减少作物产量、降低作物质量和增加收获成本。对于蔬菜和水果栽培者、观赏花生产者,以及家庭园丁和户主,虫害也是一种负担。
耕作方法,例如轮作和施用高水平氮肥,已经部分解决了由农业害虫引起的问题。然而,农田利用方面的多种需求限制轮作的使用。此外,在一些地区轮作会中断一些昆虫的越冬特性。
因此,合成的化学杀虫剂最主要是取决于达到足够水平的控制。然而,合成的化学杀虫剂的使用具有几个缺点。例如,这些化学试剂能够不利地影响一些有益的昆虫。靶昆虫也已形成对一些化学杀虫剂的抗性。而且,下雨和杀虫剂施用装置不适当的校准能够导致较差的控制。杀虫剂的使用常常引起环境问题,例如当使用不当时土壤和水源的污染,并且残余物也能够保留在所处理的水果和蔬菜上。使用一些杀虫剂的工作也可使进行施用的人员处于危险中。关于杀虫剂使用的严格的新的限制和一些有效杀虫剂的去除可能限制了实际的控制损害和病害虫的选择。
用生物杀虫药代替合成的化学杀虫剂,或者将化学杀虫剂与生物杀虫药相组合能够降低环境中毒性化学药品的水平。一些目前成功使用的生物杀虫剂来自土壤微生物苏云金芽孢杆菌(Bacillus thuringiensis(B.t.))。然而绝大多数B.t.菌株不呈现杀虫活性,仅一些B.t.菌株产生对害虫例如昆虫具有高度毒性并且其毒性活性是特异的蛋白质。已经分离了编码δ-内毒素蛋白质的基因。其它芽孢杆菌属(Bacillus)物种也产生杀虫蛋白质。
已经产生了基于重组DNA的B.t.产物并且批准其进行使用。另外,随着遗传工程技术的使用,用于将这些毒素递送至农业环境的多种方法都是优选的。这些方法包括使用用毒素基因进行遗传工程化的植物以获得昆虫抗性和使用稳定化的完整微生物细胞作为毒素递送载体。因此,所分离的芽孢杆菌属毒素基因正变得具有商业价值。
B.t.蛋白质毒素最初制剂为可喷雾昆虫控制剂。B.t.技术相对更新的应用已经用于分离编码这些毒素的基因和用其转化植物。转基因植物随后产生毒素,从而提供昆虫控制。见Mycogen公司的美国专利号5,380,831;5,567,600;和5,567,862。转基因B.t.植物是非常有效的,并且预测将在一些作物和区域具有重要用途。
对于芽孢杆菌(和其它生物学)杀虫蛋白质的成功农业使用还存在一些障碍。某些昆虫能够耐受芽孢杆菌毒素的作用。迄今为止,已经证明诸如棉子象鼻虫、黑切根虫和美洲棉铃虫(Helicoverpa zea),以及大部分物种的成虫对许多B.t.δ-内毒素没有显著的敏感性。
另一个潜在的障碍是昆虫B.t.毒素抗性的形成。B.t.植物广泛使用的潜势已经引起人们关注与传统可喷雾施用相比可能更快地产生抗性管理事项。虽然在实验室中一些昆虫已经筛选出可产生B.t.毒素抗性,但只有有菱形斑纹的蛾(菜蛾(Plutella xylostella))已经证明在田地环境中具有抗性(Ferre,J.和Van Rie,J.,Annu.Rev.Entomol.47:501-533,2002)。
B.t.转基因植物技术中的抗性管理策略已经受到极大关注。已经建议几种策略保留有效使用苏云金芽孢杆菌毒素的能力。例如,这些策略包括高剂量保护,和交替或共同配置不同的毒素(McGaughey等(1998),“B.t.抗性管理”,Nature Biotechnol 16:144-146),正如在天然细菌中的策略一样。
因此,为了有效地控制多种昆虫,仍然非常需要研发出能够在植物中表达的额外基因。除了继续设法发现新的B.t.毒素(由于已经发现了大量B.t.毒素,这将变得越来越困难)之外,将非常期望发现其它产生能够用于转基因植物策略的毒素的细菌源(不同于B.t.)。
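More recent efforts to clone insecticidal toxin genes from the Photorhabdus/Xenorhabdus group of bacteria present potential alternatives to toxins derived from B. thuringiensis. The genus Xenorhabdus is taxonomically defined as a member of the family Enterobacteriaceae, although it has certain traits atypical of this family. For example, strains of this genus are typically nitrate-reduction negative and catalase negative. Only recently has Xenorhabdus been subdivided to create a second genus, Photorhabdus, which comprises three species: Photorhabdus asymbiotica, Photorhabdus temperata, and Photorhabdus luminescens. P. luminescens has three recognized subspecies: Photorhabdus luminescens subsp. akhurstii, Photorhabdus luminescens subsp. laumondii, and Photorhabdus luminescens subsp. luminescens (type species). (Fischer-Le Saux, M., Viallard, V., Brunel, B., Normand, P., Boemare, N.E., "Polyphasic classification of the genus Photorhabdus and proposal of new taxa: P. luminescens subsp. luminescens subsp. nov., P. luminescens subsp. akhurstii subsp. nov., P. luminescens subsp. laumondii subsp. nov., P. temperata sp. nov., P. temperata subsp. temperata subsp. nov. and P. asymbiotica sp. nov.," Int. J. Syst. Bacteriol. 49:1645-1656 (1999)). This differentiation is based on several distinguishing characteristics that are easily identified by the skilled artisan. These differences include the following: DNA-DNA characterization studies; phenotypic presence (Photorhabdus) or absence (Xenorhabdus) of catalase activity; presence (Photorhabdus) or absence (Xenorhabdus) of bioluminescence; the family of the nematode host (Xenorhabdus is found in Steinernematidae and Photorhabdus is found in Heterorhabditidae); and comparative cellular fatty-acid analyses (Janse et al. 1990, Lett. Appl. Microbiol. 10, 131-135; Suzuki et al. 1990, J. Gen. Appl. Microbiol., 36, 393-401). In addition, recent molecular studies focused on the sequence (Rainey et al. 1995, Int. J. Syst. Bacteriol., 45, 379-381) and restriction analysis (Brunel et al., 1997, App. Environ. Micro., 63, 574-580) of 16S rRNA genes also support the separation of these two genera.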
对于致病杆菌属所预期的特性如下:革兰氏染色阴性杆菌、白色到黄/棕色菌落色素沉着、包涵体的存在、过氧化氢酶的缺乏、还原硝酸盐能力的缺乏、生物发光的缺乏、从培养基摄入染料的能力、阳性明胶水解作用、在肠杆菌科选择培养基上生长、生长温度低于37℃、在厌氧条件下生存和游动性。
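At present, the bacterial genus Xenorhabdus comprises four recognized species: Xenorhabdus nematophilus, Xenorhabdus poinarii, Xenorhabdus bovienii, and Xenorhabdus beddingii (Brunel et al., 1997, App. Environ. Micro., 63, 574-580). A variety of related strains have been described in the literature (e.g., Akhurst and Boemare 1988, J. Gen. Microbiol., 134, 1835-1845; Boemare et al. 1993, Int. J. Syst. Bacteriol. 43, pp. 249-255; Putz et al. 1990, Appl. Environ. Microbiol., 56, 181-186; Brunel et al., 1997, App. Environ. Micro., 63, 574-580; Rainey et al. 1995, Int. J. Syst. Bacteriol., 45, 379-381).
Photorhabdus and Xenorhabdus spp. are Gram-negative bacteria that are entomopathogenically and symbiotically associated with soil nematodes. These bacteria are found in the gut of entomopathogenic nematodes that invade and kill insects. When the nematode invades an insect host, the bacteria are released into the insect haemocoel (the open circulatory system), and both the bacteria and the nematode undergo multiple rounds of replication; the insect host typically dies. These bacteria can be cultured away from their nematode hosts. For a more detailed discussion of these bacteria, see Forst and Nealson, 60 Microbiol. Rev. 1 (1996), pp. 21-43. Unfortunately, as reported in a number of articles, the bacteria had pesticidal activity only when injected into insect larvae and did not exhibit biological activity when delivered orally.
Xenorhabdus and Photorhabdus bacteria secrete a wide variety of substances into the culture medium. See R.H. ffrench-Constant et al., 66 AEM No. 8, pp. 3310-3329 (Aug. 2000), for a review of various factors involved in Photorhabdus virulence toward insects.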
已经难于有效地开发线虫或其细菌共生生物的杀虫特性。因此,来自具有口服活性的光杆状菌属/致病杆菌属细菌的蛋白质因子是令人期望的,以至于由其产生的产物能够制剂成可喷雾杀虫剂,或者编码所述蛋白质因子的基因能够进行分离并用于产生转基因植物。
在从发光光杆状菌和嗜线虫致病杆菌克隆编码杀虫毒素的基因方面已经取得实质性进展。首先检测了来自发光光杆状菌的毒素-复合体编码基因。参见WO 98/08932。最近更是从嗜线虫致病杆菌克隆了相似基因。参见,例如,Morgan等人,Applied and Environmental Microbiology 2001,67:2062-69。“相似性”的程度在下面进行更加详细地论述。
WO 95/00647涉及使用致病杆菌属蛋白质毒素控制昆虫,但是该专利不包括口服有效毒素。WO 98/08388涉及来自致病杆菌属的口服施用杀虫剂因子。美国专利号6,048,838涉及具有口服活性、从致病杆菌属物种和菌株可获得的蛋白质毒素/毒素复合体。
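Four different toxin complexes (TCs), Tca, Tcb, Tcc, and Tcd, have been identified in Photorhabdus spp. Each of these toxin complexes resolves as either a single or a dimeric species on a native agarose gel, but resolution on a denaturing gel reveals that each complex consists of a range of species between 25 and 280 kDa. The ORFs that encode the typical TCs from Photorhabdus, together with protease cleavage sites (vertical arrows), are illustrated in FIG. 7. See also R.H. ffrench-Constant and Bowen, 57 Cell. Mol. Life Sci. 828-833 (2000).
Genomic libraries of P. luminescens were screened with DNA probes and with monoclonal and/or polyclonal antibodies raised against the toxins. Four tc loci were cloned: tca, tcb, tcc, and tcd. The tca locus is a putative operon of three open reading frames (ORFs), tcaA, tcaB, and tcaC, transcribed from the same DNA strand, with a smaller terminal ORF (tcaZ) transcribed in the opposite direction. The tcc locus also comprises three ORFs putatively transcribed in the same direction (tccA, tccB, and tccC). The tcb locus is a single large ORF (tcbA), and the tcd locus is composed of two ORFs (tcdA and tcdB); tcbA and tcdA, each about 7.5 kb, encode large insect toxins. The products of many of these genes were determined to be cleaved by proteases. For example, both TcbA and TcdA are cleaved into three fragments termed i, ii, and iii (e.g., TcbAi, TcbAii, and TcbAiii). Products of the tca and tcc ORFs are also cleaved. See FIG. 7. See also R.H. ffrench-Constant and D.J. Bowen, Current Opinions in Microbiology, 1999, 12:284-288.
As reported in WO 98/08932, protein toxins from the genus Photorhabdus have been shown to have oral toxicity against insects. For example, the toxin complex produced by Photorhabdus luminescens (W-14) has been shown to contain ten to fourteen proteins, and these are known to be produced by expression of genes from four distinct genomic regions: tca, tcb, tcc, and tcd. WO 98/08932 discloses nucleotide sequences for many of the native toxin genes.
Bioassays of the Tca toxin complexes revealed them to be highly toxic to first-instar tomato hornworms (Manduca sexta) when given orally (LD50 of 875 ng per square centimeter of artificial diet). R.H. ffrench-Constant and Bowen 1999. Feeding was inhibited at Tca doses as low as 40 ng/cm2. Given the high predicted molecular weight of Tca, on a molar basis the P. luminescens toxins are highly active, and relatively few molecules appear to be necessary to exert a toxic effect. R.H. ffrench-Constant and Bowen, Current Opinions in Microbiology, 1999, 12:284-288.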
这4个基因座与GenBank中已知功能的任意序列均不显示总体的相似性。序列同一性的区域引起一些建议,即这些蛋白质(TcaC和TccA)可以通过攻击昆虫血细胞而克服昆虫免疫性。R.H.ffrench-Constant和Bowen,Current Opinions in Microbiology,1999,12:284-288。
TcaB、TcbA和TcdA彼此相互比较,全部显示氨基酸保守区(~50%同一性)紧密围绕其预测的蛋白质切割位点。3种不同的Tc蛋白质之间的该保守性表明它们可以全部通过相同或相似的蛋白酶进行加工。TcbA和TcdA也具有~50%同一性,以及相似的羧基-和氨基-末端预测切割模式。因此可以假定这些蛋白质可能是彼此的同系物(在某种程度上)。而且,TcbA和TcdA相似的大尺寸和两种毒素均作用于昆虫肠道的事实,也可以暗示相似的作用方式。R.H.ffrench-Constant和Bowen,Current Opinionsin Microbiology,1999,12:284-288。
缺失/敲除研究表明tca和tcd基因座的产物负责多数对鳞翅类昆虫的口服毒性。缺失tca或者tcd基因极大地降低针对烟草天蛾的口服活性。也就是说,tca和tcd基因座的产物其自身是口服鳞翅类毒素;其组合的效果有助于绝大多数分泌的口服活性。R.H.ffrench-Constant和D.J.Bowen,57Cell.Mol.Life.Sci.831(2000)。有趣地是,单独缺失tcb或者tcc基因座的任何一个也会降低死亡率,这表明可能存在不同基因产物之间的复合体相互作用。因此,tca基因座的产物可以增强tcd产物的毒性。另外,tcd产物可以调节tca产物的毒性并且可能调节其它复合体。值得注意的是上述涉及到针对单一昆虫物种的口服活性,而tcb或tcc基因座可以产生针对其它类型昆虫具有更高活性(或者通过直接注射入昆虫血腔—当通过细菌体内分泌时正常的递送路线—具有活性)的毒素。R.H.ffrench-Constant和Bowen,Current Opinions in Microbiology,1999,12:284-288。
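The midgut epithelium of insects contains both columnar (structural) and goblet (secretory) cells. Ingestion of tca products by M. sexta leads to apical swelling and blebbing of large cytoplasmic vesicles by the columnar cells, leading to the eventual extrusion of cell nuclei in vesicles into the gut lumen. Goblet cells are apparently affected in the same fashion. Products of tca act on the insect midgut following either oral delivery or injection. R.H. ffrench-Constant and D.J. Bowen, Current Opinions in Microbiology, 1999, 12:284-288. Purified tca products have shown oral toxicity against Manduca sexta (LD50 of 875 ng/cm2). R.H. ffrench-Constant and D.J. Bowen, 57 Cell. Mol. Life Sci. 828-833 (2000).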
WO 99/42589和美国专利号6,281,413公开了来自发光光杆状菌的TC-样ORF。WO 00/30453和WO 00/42855公开了来自致病杆菌属的TC-样蛋白质。WO 99/03328和WO 99/54472(和美国专利号6,174,860和6,277,823)涉及来自致病杆菌属和光杆状菌属的其它毒素。
WO 01/11029和美国专利号6,590,142B1公开了编码TcdA和TcbA并且具有已经从天然基因进行改变使得其与植物基因更加相似的碱基组合物的核苷酸序列。还公开了表达毒素A和毒素B的转基因植物。这些参考文献还公开了发光光杆状菌W-14株(ATCC 55397;保藏于1993年3月5日)和许多其它菌株。
在从发光光杆状菌(W-14)分离的分别的毒素中,那些命名为毒素A和毒素B的毒素已经成为所关注的一些研究的主题,这些研究是关于这些毒素针对目的靶昆虫物种(例如玉米根虫)的活性的。毒素A包括2个不同的亚基。天然基因tcdA编码前毒素TcdA。如通过质谱测定法所确定的那样,TcdA由一种或多种蛋白酶进行加工以提供毒素A。更加具体而言,TcdA是大约282.9kDa的蛋白质(2516aa),其进行加工以提供TcdAi(最开始的88个氨基酸)、TcdAii(接下来的1849aa;由tcdA的核苷酸265-5811编码的大约208.2kDa的蛋白质)和TcdAiii,大约63.5kDa(579aa)的蛋白质(由tcdA的核苷酸5812-7551编码)。TcdAii和TcdAiii表现出组装成二聚体(可能由TcdAi辅助形成),并且该二聚体再组装成4个二聚体的四聚体。毒素B同样来源于TcbA。
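The processing figures quoted above for TcdA can be cross-checked with simple codon arithmetic. The following is a minimal sketch, not part of the original disclosure: the helper names are hypothetical, the nucleotide span for TcdAi (1-264) is inferred from "the first 88 amino acids", and the final codon of the TcdAiii span is assumed to be the stop codon.

```python
# Fragment coordinates on tcdA (1-based, inclusive), as quoted in the text.
# The TcdAi span (1-264) is an inference from "first 88 amino acids".
FRAGMENTS_NT = {
    "TcdAi": (1, 264),
    "TcdAii": (265, 5811),
    "TcdAiii": (5812, 7551),
}
FULL_LENGTH_AA = 2516  # length of the TcdA protoxin quoted in the text


def codon_count(first_nt: int, last_nt: int) -> int:
    """Return the number of whole codons in an inclusive nucleotide span."""
    span = last_nt - first_nt + 1
    if span % 3 != 0:
        raise ValueError(f"span {first_nt}-{last_nt} is not a whole number of codons")
    return span // 3


def fragment_lengths_aa() -> dict:
    """Amino-acid length of each fragment implied by the coordinates."""
    lengths = {name: codon_count(*span) for name, span in FRAGMENTS_NT.items()}
    # Assumption: the last codon of the TcdAiii span (7549-7551) is the stop codon.
    lengths["TcdAiii"] -= 1
    return lengths


lengths = fragment_lengths_aa()
print(lengths)                                   # per-fragment lengths in amino acids
print(sum(lengths.values()) == FULL_LENGTH_AA)   # do the fragments add up to TcdA?
```

Under these assumptions the three fragments (88, 1849, and 579 residues) account for all 2516 residues of the protoxin, so the quoted coordinates and lengths are mutually consistent.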
然而,目前还没有了解TC蛋白质彼此之间确切的分子相互作用及其作用机制,例如,已知光杆状菌属的Tca毒素复合体对烟草天蛾是具有毒性的。此外,已知一些TC蛋白质具有“独特(stand-alone)”杀虫活性,而其它TC蛋白质已知能够赋予或增强独特毒素的活性。已知单独的TcdA蛋白质针对烟草天蛾是有活性的,但与TcdB和TccC一起时,能够用于增强TcdA的活性。Waterfield,N.等人,Appl.Environ.Microbiol.2001,67:5017-5024。TcbA(仅有的唯一一种Tcb蛋白质)是另一种来自光杆状菌属的独特毒素。该毒素(TcbA)的活性也能够通过TcdB及TccC-样蛋白质增强。
美国专利申请20020078478提供来自发光光杆状菌W-14tcd基因组区域的两个增强子基因tcdB2和tccC2的序列。在其中表明了,与当tcdA在此类异源宿主中单独表达时所获得的毒性相比,tcdB和tccC1与tcdA在异源宿主中共表达会导致增强水平的口服昆虫毒性。tcdB和tccC1与tcdA或tcbA共表达会提供增强了的口服昆虫活性。
如在下面图表中所指出,TccA与TcdA的N末端具有一定水平的同源性,并且TccB与TcdA的C末端具有一定水平的同源性。TccA和TccB对于某些测试昆虫的活性比TcdA更低。来自光杆状菌属菌株W-14的TccA和TccB称为“毒素D”。在下面还指出了“毒素A”(TcdA)、“毒素B”(TcbA)和“毒素C”(TcaA和TcaB)。
而且,TcaA与TccA具有一定水平同源性并且同样与TcdA的N末端也具有一定水平的同源性。此外,TcaB与TccB具有一定水平同源性并且同样与TcdA的N末端也具有一定水平的同源性。
TccA和TcaA具有相似的大小,正如TccB和TcaB一样。TcdB与TcaC具有显著水平的相似性(在序列和大小方面)。
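Photorhabdus protein | Photorhabdus strain W-14 nomenclature | Homologous to |
TcaA | Toxin C | TccA |
TcaB | Toxin C | TccB |
TcaC | | TcdB |
TcbA | Toxin B | |
TccA | Toxin D | N-terminus of TcdA |
TccB | Toxin D | C-terminus of TcdA |
TccC | | |
TcdA | Toxin A | TccA + TccB |
TcdB | | TcaC |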
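More recent cloning efforts in Xenorhabdus nematophilus also appear to have identified novel insecticidal toxin genes with homology to the P. luminescens tc loci. See, e.g., WO 98/08388 and Morgan et al., Applied and Environmental Microbiology 2001, 67:2062-69. In R.H. ffrench-Constant and D.J. Bowen, Current Opinions in Microbiology, 1999, 12:284-288, cosmid clones were screened directly for oral toxicity to another lepidopteran, Pieris brassicae. One orally toxic cosmid clone was sequenced. Analysis of the sequence in that cosmid suggested that there are five different ORFs with similarity to Photorhabdus tc genes; orf2 and orf5 both have some level of sequence relatedness to both tcbA and tcdA, whereas orf1 is similar to tccB, orf3 is similar to tccC, and orf4 is similar to tcaC. A number of these predicted ORFs also share the putative cleavage sites documented in P. luminescens, suggesting that active toxins may also be proteolytically processed.
The finding of somewhat similar toxin-encoding loci in these two different bacteria is interesting in terms of the possible origins of these virulence genes. The X. nematophilus cosmid also appears to contain transposase-like sequences, whose presence may suggest that these loci can be transferred horizontally between different strains or species of bacteria. A range of such transfer events may also explain the apparently different genomic organization of the tc operons in the two different bacteria. Further, only a subset of X. nematophilus and P. luminescens strains appear toxic to M. sexta, suggesting either that different strains lack the tc genes or that they carry a different tc gene complement. Detailed analysis of both strain and toxin phylogeny within, and between, these bacterial species should help clarify the likely origin of the toxin genes and how they are maintained in different bacterial populations. R.H. ffrench-Constant and Bowen, Current Opinions in Microbiology, 1999, 12:284-288.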
存在五种典型的致病杆菌属TC蛋白质:XptA1、XptA2、XptB1、XptC1和XptD1。XptA1是“独特”毒素。XptA2是另一种来自致病杆菌属具有独特毒素活性的TC蛋白质。XptB1和XptC1是能够增强任一种(或两种)XptA毒素的活性的增效剂。XptD1与TccB具有一定水平同源性。XptC1与光杆状菌属TcaC具有一定水平相似性。致病杆菌属的XptA2蛋白质与光杆状菌属TcdA蛋白质具有一定程度相似性。XptB1与光杆状菌属TccC具有一定水平相似性。
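TC proteins and genes have more recently been described from other insect-associated bacteria such as Serratia entomophila, an insect pathogen. Pseudomonas species were found to have potentiators. Waterfield et al., TRENDS in Microbiology, Vol. 9, No. 4, April 2001.
Bacteria of the genus Paenibacillus are distinguishable from other bacteria by distinctive rRNA and phenotypic characteristics (C. Ash et al. (1993), "Molecular identification of rRNA group 3 bacilli (Ash, Farrow, Wallbanks and Collins) using a PCR probe test: Proposal for the creation of a new genus Paenibacillus," Antonie van Leeuwenhoek 64:253-260). Some species in this genus are known to be pathogenic to honeybees (Paenibacillus larvae) and to scarab beetle grubs (P. popilliae and P. lentimorbus). P. larvae, P. popilliae, and P. lentimorbus are considered obligate insect pathogens involved with milky disease of scarab beetles (D.P. Stahly et al. (1992), "The genus Bacillus: insect pathogens," pp. 1697-1745, in A. Balows et al., eds., The Procaryotes, 2nd Ed., Vol. 2, Springer-Verlag, New York, NY).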
在Paenibacillus popilliae和Paenibacillus lentimorbus菌株中已经鉴定了晶体蛋白质Cry18。Cry18具有金龟子和幼虫毒性,并且与Cry2蛋白质具有大约40%同一性(Zhang等人,1997;Harrison等人,2000)。最近已经在类芽孢杆菌属中发现了TC蛋白质和鳞翅类毒素Cry蛋白质。参见美国系列号60/392,633(Bintrim等人),2002年6月28日申请。在该类芽孢杆菌属菌株中发现了6种TC蛋白质ORF。ORF3和ORF1显示出分别与TcaA具有一定水平同源性。ORF4和ORF2显示出与TcaB具有一定水平同源性。ORF5表现出是TcaC-样增效剂,并且ORF6与TccC增效剂具有同源性。
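Although some Xenorhabdus TC proteins were found to "correspond" (have a similar function and some level of sequence homology) to some of the Photorhabdus TC proteins, a given Photorhabdus protein shares only about 40% sequence identity with the "corresponding" Xenorhabdus protein. This is illustrated in the following table for four "stand-alone" toxins:
Xenorhabdus toxin | Identity to P.l. W-14 TcbA | Identity to P.l. W-14 TcdA |
Xwi XptA1 | 44% | 46% |
Xwi XptA2 | 41% | 41% |
(For a more complete review, see, e.g., Morgan et al., "Sequence Analysis of Insecticidal Genes from Xenorhabdus nematophilus PMFI296," Vol. 67, Applied and Environmental Microbiology, May 2001, pp. 2062-2069.) This approximate degree of sequence relatedness was also observed when the more recently discovered TC proteins from Paenibacillus (those proteins and that discovery are the subject of co-pending U.S. Ser. No. 60/392,633) were compared with their Xenorhabdus and Photorhabdus "counterparts."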
虽然光杆状菌属毒素已经成功地应用,并且致病杆菌属毒素也已经成功地应用(除了光杆状菌属毒素外),但目前为止还没有提出或阐明用来自其它生物源(例如致病杆菌属)的一种或多种TC蛋白质增效剂增强来自于这些生物源(诸如光杆状菌属)之一的TC蛋白质毒素的活性。
发明概述
本发明的主题涉及令人惊讶的发现,即毒素复合体(TC)蛋白质(从诸如致病杆菌属、光杆状菌属和类芽孢杆菌属的生物体获得)能够彼此可替换的进行使用。作为本领域的技术人员将认识到本公开的益处,这具有广泛的意义并且扩大了各种类型TC蛋白质目前已经认识到的用途的范围。这在以前没有考虑到,并且这在以前将认为是不可能的,例如,尤其在序列水平来自光杆状菌属的TC蛋白质与致病杆菌属和类芽孢杆菌属的“相应”TC蛋白质相比的高水平的序列差异。
在本发明特别优选的实施方案中,通过来自不同来源生物体的一种或多种TC蛋白质“增效剂”增强“独特”TC蛋白质(例如,来自光杆状菌属、致病杆菌属或类芽孢杆菌属)的毒性。本发明提供给本领域技术人员许多令人惊讶的优点。最重要的优点之一是本领域的技术人员现在将能够使用单一一对增效剂增强独特致病杆菌属蛋白质毒素和独特光杆状菌属蛋白质毒素的活性。(如本领域的技术人员所知,致病杆菌属毒素蛋白质更加倾向于控制鳞翅类昆虫而光杆状菌属毒素蛋白质更加倾向于控制鞘翅类昆虫。)这会降低需要由不同的转基因植物和/或植物细胞共表达的基因(和转化事件)的量,以获得对更广谱的靶害虫的有效控制。
从另一方面说明,本发明涉及致病杆菌属TC蛋白质能够用于增强光杆状菌属TC蛋白质的活性并且反之亦然。同样,并且也是令人惊奇的,发现了来自类芽孢杆菌属的TC蛋白质能够用于替换致病杆菌属/光杆状菌属的TC蛋白质,并且反之亦然。此外,没有预计到来自这些趋异生物体的蛋白质能彼此兼容;这在以前没有提出或证明。本发明尤其令人惊奇之处在于阐明了致病杆菌属、光杆状菌属和类芽孢杆菌属TC蛋白质(以及来自其它属的TC蛋白质)之间显著的差别,尽管这些蛋白质也具有一些共同的特性。
此处还公开了异源TC蛋白质的某些优选的组合。
本发明的一些目的、优点和特点对于本领域受益于本公开的技术人员将是显而易见的。
附图简述
图1显示pDAB2097中所鉴定的ORF的方向。
图2显示表达载体质粒pBT-TcdA。
图3显示表达载体质粒pET280载体。
图4显示表达载体质粒pCot-3。
图5是pBT构建的图解式图。
图6是pET/pCot构建的图解式图。
图7显示来自光杆状菌属的TC操纵子。
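Brief Description of the Sequences
SEQ ID NO:1 is the N-terminus of the toxin XwiA 220 kDa protein (XptA2Wi).
SEQ ID NO:2 is an internal peptide of the purified toxin XwiA (XptA2Wi).
SEQ ID NO:3 is an internal peptide of the purified toxin XwiA (XptA2Wi).
SEQ ID NO:4 is an internal peptide of the purified toxin XwiA (XptA2Wi).
SEQ ID NO:5 is an internal peptide of the purified toxin XwiA (XptA2Wi).
SEQ ID NO:6 is the pDAB2097 cosmid insert: 39,005 bp.
SEQ ID NO:7 is pDAB2097 cosmid ORF1: nucleotides 1-1,533 of SEQ ID NO:6.
SEQ ID NO:8 is the protein deduced from pDAB2097 cosmid ORF1: 511 aa.
SEQ ID NO:9 is pDAB2097 cosmid ORF2 (xptD1wi): nucleotides 1,543-5,715 of SEQ ID NO:6.
SEQ ID NO:10 is the protein deduced from pDAB2097 cosmid ORF2: 1,391 aa.
SEQ ID NO:11 is pDAB2097 cosmid ORF3: nucleotides 5,764-7,707 of SEQ ID NO:6.
SEQ ID NO:12 is the protein deduced from pDAB2097 cosmid ORF3: 648 aa.
SEQ ID NO:13 is pDAB2097 cosmid ORF4 (xptA1wi): nucleotides 10,709-18,277 of SEQ ID NO:6.
SEQ ID NO:14 is the protein deduced from pDAB2097 cosmid ORF4: 2,523 aa.
SEQ ID NO:15 is pDAB2097 cosmid ORF5 (xptB1wi): nucleotides 18,383-21,430 (C) of SEQ ID NO:6.
SEQ ID NO:16 is the protein deduced from pDAB2097 cosmid ORF5: 1,016 aa.
SEQ ID NO:17 is pDAB2097 cosmid ORF6 (xptC1wi): nucleotides 21,487-25,965 (C) of SEQ ID NO:6.
SEQ ID NO:18 is the protein deduced from pDAB2097 cosmid ORF6: 1,493 aa.
SEQ ID NO:19 is pDAB2097 cosmid ORF7 (xptA2wi): nucleotides 26,021-33,634 (C) of SEQ ID NO:6.
SEQ ID NO:20 is the protein deduced from pDAB2097 cosmid ORF7: 2,538 aa.
SEQ ID NO:21 is the TcdA gene and protein sequence from GENBANK Accession No. AF188483.
SEQ ID NO:22 is the TcdB1 gene and protein sequence from GENBANK Accession No. AF346500.
SEQ ID NO:23 is the forward primer used to amplify the TcdB1 sequence from plasmid pBC-AS4.
SEQ ID NO:24 is the reverse primer used to amplify the TcdB1 sequence from plasmid pBC-AS4.
SEQ ID NO:25 is the gene and protein sequence of TccC1 from GENBANK Accession No. AAC38630.1.
SEQ ID NO:26 is the forward primer used to amplify TccC1 from the plasmid pBC KS+ vector.
SEQ ID NO:27 is the reverse primer used to amplify TccC1 from the plasmid pBC KS+ vector.
SEQ ID NO:28 is the forward primer used to amplify xptA2wi.
SEQ ID NO:29 is the reverse primer used to amplify xptA2wi.
SEQ ID NO:30 is the forward primer used to amplify xptC1wi.
SEQ ID NO:31 is the reverse primer used to amplify xptC1wi.
SEQ ID NO:32 is the forward primer used to amplify xptB1wi.
SEQ ID NO:33 is the reverse primer used to amplify xptB1wi.
SEQ ID NO:34 is the amino acid sequence of the XptA2wi protein from Xenorhabdus nematophilus Xwi.
SEQ ID NO:35 is the nucleic acid sequence of ORF3 of Paenibacillus strain DAS1529, which encodes a tcaA-like protein.
SEQ ID NO:36 is the amino acid sequence encoded by Paenibacillus ORF3.
SEQ ID NO:37 is the nucleic acid sequence of ORF4 of Paenibacillus strain DAS1529, which encodes a tcaB-like protein.
SEQ ID NO:38 is the amino acid sequence encoded by Paenibacillus ORF4.
SEQ ID NO:39 is the nucleic acid sequence of ORF5 of Paenibacillus strain DAS1529, which encodes a tcaC-like protein (pptB11529).
SEQ ID NO:40 is the amino acid sequence encoded by Paenibacillus ORF5 (PptB11529).
SEQ ID NO:41 is the nucleic acid sequence of ORF6 (short) of Paenibacillus strain DAS1529, which encodes a tccC-like protein (pptC11529S).
SEQ ID NO:42 is the protein sequence encoded by Paenibacillus (short) ORF6 (PptC11529S).
SEQ ID NO:43 is the alternate (long) protein sequence (PptC11529L) encoded by Paenibacillus ORF6 (long) of SEQ ID NO:55.
SEQ ID NO:44 is the nucleotide sequence encoding TcdB2.
SEQ ID NO:45 is the amino acid sequence of the TcdB2 protein.
SEQ ID NO:46 is the nucleotide sequence of TccC3.
SEQ ID NO:47 is the amino acid sequence of the TccC3 protein.
SEQ ID NO:48 is the native xptB1xb coding region (4521 bases).
SEQ ID NO:49 is the native XptB1xb protein encoded by SEQ ID NO:48 (1506 amino acids).
SEQ ID NO:50 is the native xptC1xb coding region (2889 bases).
SEQ ID NO:51 is the native XptC1xb protein encoded by SEQ ID NO:50 (962 amino acids).
SEQ ID NO:52 is the Xba I to Xho I fragment of expression plasmid pDAB6031 comprising the native xptB1xb coding region (4595 bases), where bases 40 to 4557 encode the protein of SEQ ID NO:49.
SEQ ID NO:53 is the Xba I to Xho I fragment of expression plasmid pDAB6032 comprising the native xptC1xb coding region (2947 bases), where bases 40 to 2925 encode the protein of SEQ ID NO:51.
SEQ ID NO:54 is the Xba I to Xho I fragment of expression plasmid pDAB6033 comprising the native xptB1xb and native xptC1xb coding regions (7508 bases), where bases 40 to 4557 encode the protein of SEQ ID NO:49 and bases 4601 to 7486 encode the protein of SEQ ID NO:51.
SEQ ID NO:55 is the nucleic acid sequence of ORF6 (long; pptC11529L) of Paenibacillus strain DAS1529, which encodes the tccC-like protein (PptC11529L) disclosed in SEQ ID NO:43.
SEQ ID NO:56 is the gene and protein sequence of TcaC from GENBANK Accession No. AF346497.1.
SEQ ID NO:57 is the gene and protein sequence of TccC5 from GENBANK Accession No. AF346500.2.
SEQ ID NO:58 is the protein sequence of TccC2 from GENBANK Accession No. AAL18492.
SEQ ID NO:59 shows the amino acid sequence of the TcbAw-14 protein.
SEQ ID NO:60 shows the amino acid sequence of the SepB protein.
SEQ ID NO:61 shows the amino acid sequence of the SepC protein.
SEQ ID NO:62 shows the amino acid sequence of the TcdA2w-14 protein.
SEQ ID NO:63 shows the amino acid sequence of the TcdA4w-14 protein.
SEQ ID NO:64 shows the amino acid sequence of the TccC4W-14 protein.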
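The ORF coordinates and deduced protein lengths listed above for the pDAB2097 cosmid insert can be checked against each other. The sketch below is illustrative only: the `Orf` type and helper names are hypothetical, and it assumes that the cosmid coordinates exclude the stop codon while the "coding region" lengths given for xptB1xb and xptC1xb include it, which is what the quoted numbers imply.

```python
from typing import NamedTuple


class Orf(NamedTuple):
    name: str
    start: int        # first nucleotide in SEQ ID NO:6 (1-based, inclusive)
    end: int          # last nucleotide in SEQ ID NO:6 (inclusive)
    protein_aa: int   # deduced protein length quoted in the sequence list


INSERT_LENGTH_BP = 39005

PDAB2097_ORFS = [
    Orf("ORF1", 1, 1533, 511),
    Orf("ORF2 (xptD1wi)", 1543, 5715, 1391),
    Orf("ORF3", 5764, 7707, 648),
    Orf("ORF4 (xptA1wi)", 10709, 18277, 2523),
    Orf("ORF5 (xptB1wi)", 18383, 21430, 1016),
    Orf("ORF6 (xptC1wi)", 21487, 25965, 1493),
    Orf("ORF7 (xptA2wi)", 26021, 33634, 2538),
]


def codons_in(orf: Orf) -> int:
    """Whole codons spanned by an ORF; raises if the span is not a multiple of 3."""
    span = orf.end - orf.start + 1
    if span % 3 != 0:
        raise ValueError(f"{orf.name}: span of {span} nt is not a whole number of codons")
    return span // 3


def protein_length_from_coding_region(n_bases: int) -> int:
    """Protein length for a coding region that includes its stop codon."""
    if n_bases % 3 != 0:
        raise ValueError("coding region is not a whole number of codons")
    return n_bases // 3 - 1


consistent = {orf.name: codons_in(orf) == orf.protein_aa for orf in PDAB2097_ORFS}
in_bounds = all(1 <= orf.start < orf.end <= INSERT_LENGTH_BP for orf in PDAB2097_ORFS)
non_overlapping = all(a.end < b.start for a, b in zip(PDAB2097_ORFS, PDAB2097_ORFS[1:]))

print(consistent)
print(in_bounds, non_overlapping)
print(protein_length_from_coding_region(4521))  # xptB1xb coding region
print(protein_length_from_coding_region(2889))  # xptC1xb coding region
```

A check of this kind is a cheap way to catch transcription errors in a sequence listing before the coordinates are used to design primers or expression constructs.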
发明详述
本发明涉及毒素复合体(TC)蛋白质的新用途,这些TC蛋白质可从诸如致病杆菌属、光杆状菌属和类芽孢杆菌属的生物体获得。如下面所讨论,一种或多种TC增效剂用于增强TC毒素蛋白质的活性,这不同于天然情况下一种或两种增效剂增强的TC毒素蛋白质。作为本领域的技术人员将认识到本公开的益处,这具有广泛的意义并且扩大了多种类型TC蛋白质目前已经认识到的用途的范围。
已经知道一些TC蛋白质具有“独特”杀虫活性,并且其它TC蛋白质已知可以增强由相同特定生物体所产生的该独特毒素的活性。在本发明特别优选的实施方案中,“独特”TC蛋白质(例如,来自光杆状菌属、致病杆菌属或类芽孢杆菌属)的毒性通过来自不同属源生物体的一种或多种TC蛋白质“增效剂”增强。
存在三种主要类型的TC蛋白质。如此处所提及,A类蛋白质(“蛋白质A”)是独特的毒素。天然A类蛋白质大约280kDa。
B类蛋白质(“蛋白质B”)和C类蛋白质(“蛋白质C”)增强A类蛋白质的毒性。如此处提及所使用的,天然B类蛋白质大约170kDa,并且天然C类蛋白质大约112kDa。
A类蛋白质的例子是TcbA、TcdA、XptA1和XptA2。B类蛋白质的例子是TcaC、TcdB、XptB1xb和XptC1wi。C类蛋白质的例子是TccC、XptC1xb和XptB1wi。
目前仍然不知道毒性和活性增强作用的确切机制,但是作用的确切机制也不是重要的。重要的是靶昆虫吃或者摄入A、B和C类蛋白质。
已知仅用TcdA蛋白质是对烟草天蛾具有活性的。已知TcdB1和TccC一起能够用于增强TcdA的活性。TcbA是另一种独特的光杆状菌属毒素。目前本领域中所考虑的TC蛋白质的一种组合是TcaC(或TcdB)和TccC(作为增效剂)共同与TcdA或TcbA组合。在致病杆菌属中也是同样的,已知XptB1和XptC1增强XptA1或XptA2的活性,其中后者各自是“独特”毒素。
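While the arrangement of the (TcbA or TcdA)+(TcaC+TccC) complex appears similar to that of the (XptA1 or XptA2)+(XptC1+XptB1) complex, each Photorhabdus component shares only about 40% (approximately) sequence identity with the "corresponding" Xenorhabdus component. The stand-alone TC proteins from Paenibacillus likewise share only about 40% sequence identity with the "corresponding" Photorhabdus and Xenorhabdus TC proteins (those proteins and that discovery are the subject of co-pending U.S. application Ser. No. 60/392,633, Bintrim et al., filed June 28, 2002).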
如此处所述,正是在本申请的内容中发现了致病杆菌属TC蛋白质能够用于增强光杆状菌属TC蛋白质的活性并且反之亦然。此处还令人惊讶地证明了类芽孢杆菌属TC蛋白质具有增强致病杆菌属(和光杆状菌属)TC毒素活性的潜能。这在以前是未提出和进行证明的,并且尤其令人惊讶的是:在于阐明了致病杆菌属、光杆状菌属和类芽孢杆菌属TC蛋白质之间显著的不同。当然以前也没有预计到来自趋异生物体的不同蛋白质彼此互相兼容。
本发明能够以多种不同的方式实施。植物能够设计成产生两种类型A类蛋白质和单一一对增效剂(B和C蛋白质)。植物的每个细胞,或给定类型组织(例如根或叶)的每个细胞能够具有编码两种A蛋白质和B与C对的基因。
另外,植物的不同细胞仅能够产生这些蛋白质中的一种(或多种)。在这种情况下,当昆虫咬食并食入植物组织时,昆虫食入了产生第一种蛋白质A的细胞,另一种产生第二种蛋白质A的细胞,另一种产生B蛋白质的细胞,以及另一种产生C蛋白质的细胞。因此,产生本发明的两种A蛋白质、B蛋白质和C蛋白质的植物(不必由每一植物细胞均产生这四种蛋白质)将是重要的,从而当害虫食入植物组织时会食入全部这四种蛋白质。
除了转基因植物以外,在本发明的组合中,还有一些其它方式将蛋白质施用于靶害虫。喷射应用是本领域中已知的。几种或全部A、B和C蛋白质能够进行喷射应用(植物能够产生一种或多种蛋白质,并且可将其它蛋白质进行喷雾)。例如,土壤应用的多种类型的饵料颗粒也是本领域已知的,并且能够依照本发明进行使用。
此处显示了多种TC蛋白质的许多组合以令人惊讶的、新的方式起作用。例如,文中提出的一个实例显示TcdB1和TccC1用于增强XptA2针对谷物害虫幼虫的活性的用途。例如,文中所提出的另一个实例是XptB1与TcdB1一起增强TcdA针对谷物根虫的活性的用途。相类似地,并且也是令人惊讶地,进一步发现了来自类芽孢杆菌属的TC蛋白质能够用于增强TcdA-样和XptA2Xwi-样蛋白质的活性。文中所包括的一些例子如下:
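Protein A (toxin) | Protein B (potentiator 1) | Protein C (potentiator 2) |
XptA2 | Paenibacillus ORF5 (TcaC-like) | Paenibacillus ORF6 |
XptA2 | Photorhabdus TcdB1 | Photorhabdus TccC1 |
Photorhabdus TcdA | Photorhabdus TcdB1 | XptB1 |
The use of these and other combinations will now be apparent to those skilled in the art having the benefit of this disclosure.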
诸如TcbA、TcdA、XptA1和XptA2的独特毒素各自在大约280kDa的大小范围内。TcaC、TcdB1、TcdB2和XptC1每个大约170kDa。TccC1、TccC3和XptB1每个大约112kDa。因此,本发明优选的实施方案包括280-kDa型TC蛋白质毒素(如此处所述)与170-kDa类TC蛋白质(如此处所述)和112-kDa类TC蛋白质(如此处所述)的用途,其中所述的三类蛋白质中至少一种来自于源生物体(诸如光杆状菌属、致病杆菌属或类芽孢杆菌属),该源生物体不同于一种或多种其它TC蛋白质所来源的源生物体的属。
本发明提供给本领域的技术人员许多令人惊奇的优点。其中最重要的优点是本领域的技术人员现在能够使用单一一对增效剂增强例如独特致病杆菌属蛋白质毒素以及例如独特光杆状菌属蛋白质毒素的活性。(如本领域的技术人员所知,致病杆菌属毒素蛋白质更预期倾向于控制鳞翅类昆虫,而光杆状菌属毒素蛋白质更预期倾向于控制鞘翅类昆虫。)这减少了需要通过转基因植物表达以获得对更广谱的靶害虫有效控制的基因(和转化事件)的数量。也就是说,不需要表达六个基因—两个毒素和两对增效剂—本发明允许仅表达四个基因—两个毒素和一对增效剂蛋白质。
因此,本发明包括共表达编码两种(或多种)不同独特TC蛋白质毒素的一条多核苷酸或多条多核苷酸,以及编码单一一对TC蛋白质增效剂—B类蛋白质和C类蛋白质—的一条多核苷酸或多条多核苷酸的转基因植物和/或转基因植物细胞,其中一种或两种所述增效剂来自与独特TC蛋白质毒素所来源的属不同属的细菌。因此,目前技术人员能够获得具有两种(或多种)能够被单一一对蛋白质增效剂(B类和C类蛋白质)增强的TC蛋白质毒素(A类蛋白质)。以前没有产生此类细胞的建议,并且肯定没有期望由所述细胞所产生的两种(或全部)此类毒素将具有足够水平的活性(由于如此处所报道的令人惊讶的增强作用)。作为术语在此处使用的TC蛋白质在本领域是已知的。此类蛋白质包括独特毒素和增效剂。已知产生TC蛋白质的细菌包括以下属的那些细菌:光杆状菌属、致病杆菌属、类芽孢杆菌属、沙雷氏菌属(Serratia)和假单胞菌属。参见,例如,丁香假单胞菌(Pseudomonas syringae pv.syringae B728a)(GenBank登录号gi:23470933和gi:23472543)。根据本发明能够使用任一此类TC蛋白质。
独特(A类)毒素(此处作为术语使用)的例子包括来自光杆状菌属的TcbA和TcdA,以及来自致病杆菌属的XptA1和XptA2。该类毒素为大约280kDa。其它独特毒素的例子包括来自嗜线虫沙雷菌的SepA(GenBank登录号AAG09642.1)。例如,A类蛋白质可以是~230kDa(特别是如果被截短时)、~250-290kDa、~260-285kDa和~270kDa。
存在两种主要型或类的增效剂,此处作为术语进行使用。“B类”增效剂(文中有时称为增效剂1)的例子包括来自光杆状菌属的TcaC、TcdB1和TcdB2,来自致病杆菌属的XptC1,以及类芽孢杆菌属菌株DAS 1529ORF5的蛋白质产物。该类增效剂一般处于大约170kDa的大小范围内。其它~170kDa类增效剂的例子是来自嗜线虫沙雷菌的SepB(GenBank登录号AAG09643.1;这里显示为SEQ ID NO:60)、来自丁香假单孢菌B728a的TcaC同系物(GenBank登录号gi23472544和gi23059431)和嗜线虫致病杆菌PO ORF268(由WO 20/004855图2的碱基258-1991所编码)。优选的~170kDa增效剂是TcdB2(SEQ ID NO:44-45)。B类蛋白质可以是例如~130-180kDa、~140-170kDa、~150-165kDa和~155kDa。
“C类”增效剂(文中有时称为增效剂2)的例子包括来自光杆状菌属的TccC1和TccC3,来自致病杆菌属的Xpt B1以及类芽孢杆菌DAS1529株ORF6的蛋白质产物。该类增效剂一般处于大约112kDa的大小范围内。其它~112kDa类增效剂的例子是来自嗜线虫沙雷菌的SepC(GenBank登录号AAG09644.1;这里显示为SEQ ID NO:61),以及来自丁香假单孢菌B728a的TccC同系物(GenBank登录号gi:23470227、gi:23472546、gi:23472540、gi:23472541、gi:23468542、gi:23472545、gi:23058175、gi:23058176、gi:23059433、gi:23059435和gi:23059432)。优选的~112kDa增效剂是TccC3(SEQ ID NO:46-47)。C类蛋白质可以是例如~90-120kDa、~95-115kDa、~100-110kDa和~105-107kDa。
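The approximate size ranges given above for the three classes of TC protein can be written as a simple lookup. This is only a sketch under stated assumptions: the function name is hypothetical, the ranges are the broadest ones quoted in the text (about 230-290 kDa for Class A, 130-180 kDa for Class B, and 90-120 kDa for Class C), and molecular mass alone is of course not sufficient to assign a protein to a class.

```python
from typing import Optional

# Broadest approximate mass ranges (kDa) quoted in the text for each class.
CLASS_RANGES_KDA = {
    "A": (230.0, 290.0),  # stand-alone toxins, nominally ~280 kDa
    "B": (130.0, 180.0),  # potentiator 1, nominally ~170 kDa
    "C": (90.0, 120.0),   # potentiator 2, nominally ~112 kDa
}


def tc_class_by_mass(mass_kda: float) -> Optional[str]:
    """Return the TC protein class whose quoted size range contains mass_kda."""
    for tc_class, (low, high) in CLASS_RANGES_KDA.items():
        if low <= mass_kda <= high:
            return tc_class
    return None  # outside every quoted range


for mass in (280, 170, 112, 200):
    print(mass, tc_class_by_mass(mass))
```

Because the three ranges do not overlap, a mass falls into at most one class; a value such as 200 kDa falls into none and would need sequence-based evidence instead.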
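WO 02/94867, U.S. Patent Application 20020078478, and Waterfield et al. (TRENDS in Microbiology Vol. 10, No. 12, Dec. 2002, pp. 541-545) disclose TC proteins that can be used according to the subject invention. For example, Waterfield et al. disclose tcdB2, tccC3, tccC5, tcdA2, tcdA3, and tcdA4 genes and proteins. Any of the relevant TC proteins disclosed by the references discussed in the Background section above (and any other reference relating to TC proteins) can also be used according to the subject invention.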
因此,本发明的一个实施方案包括产生一种、两种或多种类型独特TC蛋白质毒素和单一一对增效剂:增效剂1和增效剂2(在上面或文中其它地方给出了这三种成分每一种的例子)的转基因植物或植物细胞,其中至少一种所述TC蛋白质来自与一种或多种其它TC蛋白质所来源的属不同的属的生物体。
应该很清楚本发明的例子包括这样的转基因植物或植物细胞,该转基因植物或植物细胞产生/共表达一种类型的光杆状菌属毒素(例如,TcbA或TcdA)、一种类型的致病杆菌属毒素(例如,XptA1或XptA2)和单一(一对并仅一对)一对增效剂蛋白质(例如,TcaC和TccC,不含XptC1或XptB1;或者XptC1和XptB1,不含TcaC或TccC;或者TcaC和类芽孢杆菌属ORF6,不含任意其它增效剂;或者TcdB和XptB1,不含任意其它增效剂;这些组合仅作为例证;许多其它组合对于将利用本公开的本领域技术人员是很清楚的)。根据本发明添加的增效剂能够用于增强异源毒素,但多种类型的增效剂对不是必需的。这是本发明非常令人惊讶的一个方面。
也很明显本发明能够以多种方式进行定义—而不是根据通过转基因植物或植物细胞所表达的产物进行定义。例如,本发明包括通过将一种或多种独特TC蛋白质毒素与单一一对增效剂共表达/共产生使得能够增强该种或多种独特TC蛋白质毒素的活性,其中一种或两种增效剂来自与TC蛋白质毒素所来源的生物体属不同的属的生物体。本发明还包括通过将一种或多种类型的TC蛋白质毒素与一对或多对增效剂一起(例如,TcbA和XptA1和XptC1和XptB1,可能不含TcaC和TccC),包括产生该种蛋白质组合的细胞,投喂给害虫来控制昆虫(或类似)病虫害的方法,其中一种或两种增效剂来自与一种或两种独特毒素所来源的属不同的属的生物体。
在此之前没有考虑到此类安排或预期其具有活性。理解为何本结果是令人惊奇的一个方面是考虑到此处作为例证的一些蛋白质成分的序列相关性。例如,XptA2,来自致病杆菌属的独特毒素,与TcdA具有大约43%的序列同一性并且与TcbA具有大约41%的序列同一性。TcdA和TcbA各自是来自光杆状菌属的独特毒素。XptA1(另一种来自致病杆菌属的独特毒素)与TcdA和TcbA具有45%的同一性。
TcaC(光杆状菌属~170kDa增效剂)与XptC1(170kDa致病杆菌属增效剂)具有大约49%的序列同一性。TccC(~112kDa光杆状菌属增效剂)与XptB1(~112kDa致病杆菌属增效剂)具有大约48%的序列同一性。例如,在此之前,从未期望TcaC+TccC会增强与TcaC+TccC相关的天然“靶”仅具有40-45%序列同一性的蛋白质(XptA1或XptA2)的活性。(通过使用程序FASTA 6.0获得上面报道的数值并且来自Morgan等人“来自嗜线虫致病杆菌PMFI296的杀虫剂基因的序列分析”,第67卷,Applied and Environmental Microbiology,2001年5月,2062-2069页)。
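Some examples of components for use according to the subject invention, and their relatedness to each other, include:
Class A proteins
Photorhabdus TcdA toxin homologs | ||
Name | Reference | Sequence identity to W-14 TcdA (GenBank Accession No. AAF05542.1) |
P.l. Hph2 | SEQ ID NO:13 of U.S. 6,281,413 B1 | ~93% |
P.l. Hph3 | Bases 2416-9909 disclosed in SEQ ID NO:11 of U.S. 6,281,413 B1 | ~57% |
Photorhabdus TcbA toxin homologs | ||
Name | Reference | Sequence identity to W-14 TcdA (GenBank Accession No. AAF05542.1) |
P.l. W-14 TcbA | GenBank Accession No. AAC38627.1 (reproduced here as SEQ ID NO:59) | (~50% sequence identity to W-14 TcdA) |
Xenorhabdus XptA1 toxin homologs | ||
Name | Reference | Sequence identity to Xwi XptA1 (disclosed herein as SEQ ID NO:14) |
X.n. XptA1 | GenBank Accession No. CAC38401.1 (AJ308438) | ~96% |
Xenorhabdus XptA2 toxin homologs | ||
Name | Reference | Sequence identity to Xwi XptA2 (disclosed herein as SEQ ID NO:20) |
X.n. XptA2 | GenBank Accession No. CAC38401.1 (AJ308438) | ~95% |
Class B proteins
Photorhabdus ~170 kDa potentiators | ||
Name | Identifier | Sequence identity to P.l. W-14 TcdB (GenBank Accession No. AAL18487.1) |
P.l. ORF2 | SEQ ID NO:14 of U.S. 6,281,413 B1 | ~93% |
P.l. ORF4 | Encoded by bases 9966-14633 disclosed in SEQ ID NO:11 of U.S. 6,281,413 B1 | ~71% |
P.l. W-14 TcaC | GenBank Accession No. AF046867 | ~58% |
Xenorhabdus ~170 kDa potentiators | ||
Name | Identifier | Sequence identity to Xwi XptC1 (disclosed herein as SEQ ID NO:18) |
X.n. XptC1 | GenBank Accession No. CAC38403.1 | ~90% |
Class C proteins
Photorhabdus ~112 kDa potentiators | ||
Name | Identifier | Sequence identity to P.l. W-14 TccC1 (GenBank Accession No. AAC38630.1) |
P.l. ORF5 | SEQ ID NO:12 of U.S. 6,281,413 B1 | ~51% |
P.l. TccC2 | GenBank Accession No. AAL18492 | ~48% |
P.l. W-14 TccC3 | SEQ ID NO:47 | ~53% |
Xenorhabdus ~112 kDa potentiators | ||
Name | Identifier | Sequence identity to Xwi XptB1 (disclosed herein as SEQ ID NO:16) |
X.n. XptB1 | GenBank Accession No. CAC38402 | ~96% |
X.nem. P2-ORF2071 | Encoded by bases 2071-4929 of FIG. 2 of WO 20/004855 | ~48% |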
因此,所提到的TC蛋白质由其来源的细菌的属不是任意命名的细菌属。如上面图解说明,如此有助于定义一类TC蛋白质,该类蛋白质在彼此之间是相对保守的(例如由光杆状菌属物种和菌株产生的TC蛋白质的特定类型),但是与其它来源于不同微生物属的“相应”TC蛋白质(例如由多种致病杆菌属物种和菌株产生的TC蛋白质)相对是非常不同的。
定义本发明每种TC蛋白质成分的另一种方法是通过所给定的蛋白质与所给定的毒素或增效剂的序列同一性的程度。此处提供了计算同一性分值的方法。因此,本发明特定的实施方案包括共产生与XptA2具有至少75%序列同一性的毒素、与TcdA或TcbA具有至少75%同一性的毒素、与TcdB1或TcdB2具有至少75%序列同一性的增效剂和与TccC1或TccC3具有至少75%序列同一性的增效剂的转基因植物或植物细胞。根据本发明的教导,其它TC蛋白质能够替换入上面的配方。另外,此处其它地方还提供了更加特定范围的同一性分值。
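The identity-threshold definition above (for example, "at least 75% sequence identity") can be illustrated with a minimal percent-identity calculation over two sequences that have already been aligned. This is a sketch under stated assumptions and is not the scoring method used in this document: the function names and the toy alignment are hypothetical, gapped columns are excluded from the denominator (other conventions exist), and in practice the alignment itself would come from a dedicated alignment program.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("alignment has no ungapped columns")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the aligned pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy alignment (hypothetical peptides, for illustration only).
query = "MKT-AYIAK"
reference = "MKTQAY-AR"
print(round(percent_identity(query, reference), 1))
print(meets_identity_threshold(query, reference, 75.0))
```

In the toy alignment, seven columns are ungapped and six of them match, so the pair clears a 75% threshold; reported identity values depend on the alignment program and on how gaps are counted, which is why the choice of method should be stated alongside any threshold.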
然而另一种定义本发明给定类型TC蛋白质成分的方法是通过编码该蛋白质的多核苷酸的杂交特性。本说明书自始至终都提供了关于此类“测试”和杂交(和洗涤)更加详细的信息。因此,根据本发明通过编码TC蛋白质的多核苷酸与给定“tc”基因杂交的能力能够定义所使用的TC蛋白质。
将该指导应用于特定实例,本发明XptA2-型毒素能够定义为由这样的多核苷酸所编码,其中编码所述XptA2-型毒素的核酸序列与SEQ ID NO:19的xptA2基因杂交,其中在杂交和洗涤之后杂交维持在任意此处所述或所建议的此类条件下(例如此处所述的低、中度和高度严格杂交/洗涤条件的例子)。任意其它作为例证或所建议的TC蛋白质(包括增效剂或其它毒素)例如TcdB2、TccC3、TcdA和TcbA能够替代该定义中的XptA2。
因此,本发明包括共表达编码本发明TC蛋白质的多核苷酸的某些组合的转基因植物、转基因植物细胞或者细菌细胞。应该清楚本发明包括共表达两个毒素基因和一对增效剂的转基因植物或植物细胞。因此,本发明包括包含一条或多条编码如下面毒素对1、2、3或4所指出的一类毒素的多核苷酸的转基因植物或植物细胞,并且其中所述植物或植物细胞由编码一对增效剂的DNA组成,如下面所指出,该对增效剂选自显示于增效剂对1、2、3、4、5或6的增效剂分类中的蛋白质。以另一种方式进行叙述,所述植物或细胞由编码增效剂对1、2、3、4、5或6中的一种增效剂的多核苷酸部分组成,并且所述植物或细胞由编码所选增效剂对中的其它增效剂的另一多核苷酸部分组成。
Toxin pair # | Toxins |
1 | TcbA & XptA1 |
2 | TcbA & XptA2 |
3 | TcdA & XptA1 |
4 | TcdA & XptA2 |
Potentiator pair # | Potentiators |
1 | TcdB1 & TccC |
2 | TcaC & TccC |
3 | XptC1 & TccC |
4 | TcdB1 & XptB1 |
5 | TcaC & XptB1 |
6 | XptC1 & XptB1 |
植物或细胞能够包含编码额外TC蛋白质毒素的基因(例如,从而使细胞产生TcbA与TcdA,和/或XptA1与XptA2),但是根据本发明优选的实施方案仅使用了一对增效剂。(当然,细胞或植物将产生多拷贝增效剂;关键是能够避免额外的转化事件。)
本发明其它实施方案包括共表达独特蛋白质毒素和包含至少一种“异源”(所来源的细菌的属不是毒素所来源的生物体的属)TC蛋白质的单一(不多于一种)增效剂对。本发明还包括使用作为增效剂的一对TC蛋白质赋予TC蛋白质毒素增强的杀虫活性,其中至少一种(一种或两种)所述TC蛋白质增效剂相对于它发挥增强作用的TC蛋白质而言是异源TC蛋白质。毒素和用于增强该毒素的增效剂的组包括以下结合:
TcbA | XptC1 | XptB1 |
TcbA | TcdB1 | XptB1 |
TcbA | TcaC | XptB1 |
TcbA | XptC1 | TccC1 |
TcdA | XptC1 | XptB1 |
TcdA | TcdB1 | XptB1 |
TcdA | TcaC | XptB1 |
TcdA | XptC1 | TccC1 |
XptA1 | TcdB1 | TccC1 |
XptA1 | TcdB1 | XptB1 |
XptA1 | TcaC | TccC1 |
XptA1 | TcaC | XptB1 |
XptA1 | XptC1 | TccC1 |
XptA2 | TcdB1 | TccC1 |
XptA2 | TcdB1 | XptB1 |
XptA2 | TcaC | TccC1 |
XptA2 | TcaC | XptB1 |
XptA2 | XptC1 | TccC1 |
很明显,上面的矩阵试图包括例如与诸如XptA1和/或XptA2(以及TcbA和/或TcdA)的任意毒素组合的TcdB2+TccC3(优选的增效剂对)。
对于利用本公开的本领域技术人员其它实施方案和组合将是明显的。
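The matrix of toxin and potentiator combinations above can be regenerated by enumerating every toxin with every Class B/Class C pair and keeping only those in which at least one potentiator comes from a different genus than the toxin. The sketch below is illustrative: the variable and function names are hypothetical, and the genus and class assignments are taken from the descriptions given earlier in this document.

```python
from itertools import product

# Source genus of each protein, per the descriptions in this document.
GENUS = {
    "TcbA": "Photorhabdus", "TcdA": "Photorhabdus",
    "XptA1": "Xenorhabdus", "XptA2": "Xenorhabdus",
    "TcdB1": "Photorhabdus", "TcaC": "Photorhabdus", "XptC1": "Xenorhabdus",
    "TccC1": "Photorhabdus", "XptB1": "Xenorhabdus",
}

TOXINS = ["TcbA", "TcdA", "XptA1", "XptA2"]   # Class A
CLASS_B = ["XptC1", "TcdB1", "TcaC"]          # ~170 kDa potentiators
CLASS_C = ["XptB1", "TccC1"]                  # ~112 kDa potentiators


def heterologous_combinations():
    """All (toxin, B, C) triples in which at least one potentiator is heterologous."""
    combos = []
    for toxin, pot_b, pot_c in product(TOXINS, CLASS_B, CLASS_C):
        if GENUS[pot_b] != GENUS[toxin] or GENUS[pot_c] != GENUS[toxin]:
            combos.append((toxin, pot_b, pot_c))
    return combos


combos = heterologous_combinations()
for toxin, pot_b, pot_c in combos:
    print(f"{toxin} | {pot_b} | {pot_c}")
print(len(combos))
```

Of the 24 possible triples, the six in which toxin and both potentiators share a genus are excluded, leaving the 18 combinations listed in the matrix. Extending the dictionaries (for example with TcdB2, TccC3, or the Paenibacillus ORF5 and ORF6 products) would enumerate further combinations in the same way.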
本发明还提供“混合对”增效剂,例如上面图表说明的增效剂对3、4和5。在此之前没有预期(或暗示)此类组合具有TC蛋白质毒素增强剂的活性。因此,目前增效剂的此类“异源”组合能够选择用于使其增强(例如)两种杀虫毒素的能力最大化。也就是说,对于特定的使用,目前人们可以发现例如TcdB1和XptB1是比XptC1和XptB1更加令人期望的增效剂对。此外,令人惊奇地给出了所给定的光杆状菌属增效剂和其可以替代的致病杆菌属增效剂之间序列趋异性的相对程度,以及增效剂天然增强的天然“靶”毒素之间差异的程度。所以,很显然本发明还提供异源增效剂对(即,B类(~170kDa)增效剂来自于与C类(~112kDa)增效剂所来源的细菌属不同的细菌属)。
本发明不限于280kDa TC蛋白质毒素和异源112kDa和/或170kDaTC蛋白质增效剂。由于这是首次发现“混合和匹配”致病杆菌属和光杆状菌属例如TC蛋白质的能力,本发明包括致病杆菌属TC蛋白质用“相应”光杆状菌属TC蛋白质的任意替代,并且反之亦然。例如,本领域技术人员目前也将寻找使用涉及“毒素C”成分(如上面背景部分中所论述)和“毒素D”成分的多种异源组合(例如,TccA+XptD1)。
例如,本发明还包括产生与一种或多种苏云金芽孢杆菌Cry蛋白质结合的受试TC蛋白质的转基因植物的用途。
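Proteins and toxins: The present invention provides easily administered, functional proteins. The present invention also provides a method for delivering insecticidal toxins that are functionally active and effective against many orders of insects, preferably lepidopteran insects. By "functional activity" (or "active against") it is meant herein that the protein toxins function as orally active insect control agents (alone or in combination with other proteins), such that the proteins have a toxic effect (alone or in combination with other proteins) or are able to disrupt or deter insect growth and/or feeding, which may or may not cause death of the insect. When an insect comes into contact with an effective amount of a "toxin" of the subject invention delivered via transgenic plant expression, a formulated protein composition, a sprayable protein composition, a bait matrix, or another delivery system, the results are typically death of the insect, inhibition of the growth and/or proliferation of the insect, and/or prevention of the insect from feeding upon the source (preferably a transgenic plant) that makes the toxins available to the insect. Functional proteins of the subject invention can also work together or alone to enhance or improve the activity of one or more other toxin proteins. The terms "toxic," "toxicity," and "toxin" as used herein are meant to convey that the subject "toxins" have "functional activity" as defined herein.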
虽然对于喂食的昆虫完全致死是优选的,但是完全致死不需要达到功能活性。如果昆虫避免毒素或者停止摄食,既使在作用是亚致死的或者死亡推迟了或是间接的情况下,在一些应用中这种避免会是有用的。例如,如果想得到抗昆虫的转基因植物,昆虫不愿摄食该植物同对昆虫的致死毒性同样有用,因为最终目标是避免昆虫所导致的植物的损害。
有许多其它方法可以将毒素掺入到昆虫的食物中。例如通过对食物喷射如此处公开的蛋白质溶液将毒性蛋白质掺入到幼虫食物源中是可能的。备选地,可以将纯化的蛋白质经遗传工程改造进入无害的细菌中,然后将该细菌在培养基中进行生长并且应用到食物源中或者允许存在于想要根除昆虫的区域内的土壤中。还可以将蛋白质经过遗传工程改造直接进入昆虫的食物源。例如,对于许多昆虫幼虫的主要食物源是植物材料。因此,可以将编码毒素的基因转移到植物材料中以致于此植物材料能够编码目的毒素。
将功能活性转移到植物或者微生物系统通常需要将编码毒素的氨基酸序列的核苷酸序列整合进入蛋白质表达载体上,该蛋白质表达载体对于载体所要居留的宿主细胞是恰当的。得到编码具有功能活性蛋白质的核苷酸序列的一种方式是使用从毒素的氨基酸序列(如文中所公开)推导出来的信息从产生毒素的细菌物种中分离天然的遗传物质。可以对天然的序列进行优化以便在植物中表达,例如,如下面更详细的讨论。优化的多核苷酸也可以根据蛋白质序列进行设计。
本发明提供了具有毒素活性的多类TC蛋白质。表述这些毒素类和编码这些毒素的多核苷酸的一种方式是通过在具体条件的范围内,其与例证的核苷酸序列(其互补序列和/或来源于任一条链的探针)杂交的能力和/或通过使用来源于例证序列的引物进行PCR而扩增它们的能力来定义多核苷酸。
根据本发明可以用多种方法得到能够使用的杀虫毒素。例如,此处公开的杀虫毒素的抗体可以用来从蛋白质混合物中鉴定和分离其它的毒素。具体而言,针对毒素中最恒定和最能与其它毒素相区别的部分来产生抗体。通过免疫沉淀、酶联免疫吸附测定(ELISA)或者免疫印迹可以使用这些抗体特异地鉴定具有特征活性的等效毒素。使用标准的操作步骤可以容易地制备针对此处所公开的毒素、或者等效毒素、或者这些毒素片段的抗体。此类抗体是本发明的一个方面。本发明的毒素可以从大量来源的微生物中得到。
本领域技术人员可以清楚地认识到本发明的毒素(和基因)可以从多种来源中得到。毒素“从”或者“自……获得”任何分离株在此处指的意思或建议的意思是毒素(或者相似的毒素)可以从分离株或者诸如另一细菌菌株或者植物的其它一些来源得到。“来自……”也具有该含义并且包括从例如经过修饰用于在植物中进行表达的特定类型的自细菌中得到的蛋白质。本领域技术人员会清楚地认识到,一旦细菌基因和毒素被公开,可以将植物进行工程改造以产生该毒素。使用此处公开的多核苷酸和/或氨基酸序列可以制备抗体、核酸探针(DNA和RNA)等等,并且这些抗体、核酸探针等等可以用来从其它(天然)来源中筛选和回收其它毒素基因。
多核苷酸和探针:本发明进一步提供了编码根据本发明使用的TC蛋白质的核苷酸序列。本发明进一步提供了鉴定和表征编码具有毒素活性蛋白质的基因。在一个实施方案中,本发明提供了独特的核苷酸序列,该序列用作杂交探针和/或用于PCR技术的引物。这些引物可以产生能够用于鉴定、表征、和/或分离特异毒素基因的特征性基因片段。本发明的核苷酸序列编码与先前叙述的毒素不同的毒素。
本发明的多核苷酸可以用来形成在目的宿主细胞中编码蛋白质或者肽的完整“基因”。例如,如本领域技术人员能够清楚地认识到,本多核苷酸可以合适地置于目的宿主启动子的控制之下,这在本领域是已知的。
如本领域技术人员所知,DNA一般以双链形式存在。在这种排列中,一条链与另外一个链互补并且反之亦然。由于(例如)在植物中DNA是可以复制的,所以DNA的额外互补链可以产生。在本领域中经常使用“编码链”来指能够与反义链接合的链。mRNA从DNA的“反义”链上进行转录。在“有义”或者“编码”链上具有一系列的密码子(密码子是可以作为三个残基单位进行阅读以指定一个具体氨基酸的三核苷酸),其可以作为开放阅读框(ORF)进行阅读以形成目的蛋白质或者肽。为了在体内产生蛋白质,DNA的一条链一般转录成能够作为蛋白质模板使用的mRNA的互补链。因此,本发明包括在所附的序列表中所示的例证性多核苷酸和/或包括互补链的等效物的用途。与例证DNA功能等同的RNA和PNA(肽核酸)也包括在本发明之内。
在本发明的一个实施方案中,在导致微生物高度增殖的条件下培养细菌分离株。在处理微生物以提供单链基因组核酸后,将DNA与本发明的引物接触并且进行PCR扩增。通过所述操作步骤可以对编码毒素的基因的特征片段进行扩增,从而可以鉴定编码毒素的基因的存在。
本发明的一个进一步的方面包括使用文中公开的方法和核苷酸序列鉴定的基因和分离株。由此鉴定的基因编码抗害虫的毒素活性。
通过使用例如寡核苷酸探针可以鉴定和得到本发明有用的蛋白质和基因。探针是能够依靠恰当标记可以检测的可检测核苷酸序列或者如国际申请序列号WO 93/16094中所述的制作成固有的荧光。探针(和本发明的多核苷酸)可以是DNA、RNA或者PNA。除了腺嘌呤(A),胞嘧啶(C),鸟嘌呤(G),胸腺嘧啶(T)和尿嘧啶(U;对于RNA分子)外,本发明的合成探针(和多核苷酸)还可以含有次黄嘌呤(能够与所有四种碱基配对的中性碱基;有时用于代替合成的探针中所有四种碱基的混合物)。因此,当文中谈及合成的简并寡核苷酸,并且通常使用“N”或者“n”时,“N”或者“n”可以是G,A,T,C或者次黄嘌呤。文中所使用多义密码子在本申请文件中与标准的IUPAC命名规则相一致(例如,R指A或者G,Y指C或者T等)。
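As is well known in the art, if a probe molecule hybridizes with a nucleic acid sample, it can reasonably be assumed that the probe and sample have substantial homology/similarity/identity. Preferably, hybridization of the polynucleotide is conducted first, followed by washes under conditions of low, moderate, or high stringency, by techniques well known in the art, as described in, for example, Keller, G.H., M.M. Manak (1987) DNA Probes, Stockton Press, New York, NY, pp. 169-170. For example, as described herein, low stringency conditions can be achieved by first washing with 2×SSC (standard saline citrate)/0.1% SDS (sodium dodecyl sulfate) for 15 minutes at room temperature. Two washes are typically performed. Higher stringency can then be achieved by lowering the salt concentration and/or raising the temperature. For example, the wash described above can be followed by two washes with 0.1×SSC/0.1% SDS for 15 minutes each at room temperature, followed by subsequent washes with 0.1×SSC/0.1% SDS for 30 minutes each at 55°C. These temperatures can be used with other hybridization and wash protocols set forth herein and as would be known to one skilled in the art (SSPE can be used as the salt instead of SSC, for example). The 2×SSC/0.1% SDS can be prepared by adding 50 ml of 20×SSC and 5 ml of 10% SDS to 445 ml of water. 20×SSC can be prepared by combining NaCl (175.3 g/0.150 M), sodium citrate (88.2 g/0.015 M), and water, adjusting the pH to 7.0 with 10 N NaOH, and then adjusting the volume to 1 liter; the molarities given in parentheses correspond to the 1× working concentration. 10% SDS can be prepared by dissolving 10 g of SDS in 50 ml of autoclaved water and then diluting to 100 ml.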
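The buffer recipes above follow from the dilution relation C1 x V1 = C2 x V2. The sketch below reproduces the quoted volumes and checks the quoted molarities. It is illustrative only: the function names are hypothetical, and the molecular weights (58.44 g/mol for NaCl, 294.10 g/mol for trisodium citrate dihydrate) are assumptions, since the text does not say which citrate salt is meant.

```python
NACL_MW = 58.44                    # g/mol
SODIUM_CITRATE_2H2O_MW = 294.10    # g/mol, assumed to be trisodium citrate dihydrate


def molarity(grams: float, mol_weight: float, litres: float = 1.0) -> float:
    """Molar concentration of a solute dissolved to the given final volume."""
    return grams / mol_weight / litres


def stock_volume_ml(stock_conc: float, final_conc: float, final_volume_ml: float) -> float:
    """Volume of stock needed, from C1 * V1 = C2 * V2 (any consistent units)."""
    return final_conc * final_volume_ml / stock_conc


def wash_buffer_recipe(ssc_x: float, sds_percent: float, final_volume_ml: float,
                       ssc_stock_x: float = 20.0, sds_stock_percent: float = 10.0) -> dict:
    """Volumes of 20x SSC stock, 10% SDS stock, and water for an SSC/SDS wash buffer."""
    ssc_ml = stock_volume_ml(ssc_stock_x, ssc_x, final_volume_ml)
    sds_ml = stock_volume_ml(sds_stock_percent, sds_percent, final_volume_ml)
    return {
        "ssc_stock_ml": ssc_ml,
        "sds_stock_ml": sds_ml,
        "water_ml": final_volume_ml - ssc_ml - sds_ml,
    }


print(wash_buffer_recipe(2.0, 0.1, 500.0))   # the 2x SSC / 0.1% SDS low-stringency wash
print(wash_buffer_recipe(0.1, 0.1, 500.0))   # the 0.1x SSC / 0.1% SDS higher-stringency wash

# Concentrations of the 20x stock and of the corresponding 1x working solution.
nacl_20x = molarity(175.3, NACL_MW)
citrate_20x = molarity(88.2, SODIUM_CITRATE_2H2O_MW)
print(round(nacl_20x / 20, 3), round(citrate_20x / 20, 3))
```

The first call returns the 50 ml, 5 ml, and 445 ml quoted in the text for a 500 ml batch, and the 1× concentrations come out at about 0.150 M NaCl and 0.015 M citrate, which agrees with the figures given in parentheses in the recipe.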
探针检测提供了一种以已知的方式确定杂交是否维持的方法。此种探针分析提供了一种鉴定本发明编码毒素的基因的快速方法。作为本发明探针使用的核苷酸片段可以使用DNA合成仪和标准方法进行合成。这些核苷酸序列还可以用作PCR引物用于扩增本发明的基因。
分子的杂交特征可以用来定义本发明的多核苷酸。因此,本发明包括能够与文中例证的多核苷酸杂交的多核苷酸(和/或它们的互补链,优选地是它们的全长互补链)。即,例如,定义tcdA-样基因(和它编码的蛋白质)的一种方式为通过其与先前已知的(包括具体例证的)tcdA基因杂交(在此处公开的任何具体条件下)的能力而定义。例如对于xptA2-,tcaC-,tcaA-,tcaB-,tcdB-,tccC-和XptB1-样基因和相关蛋白质也是同样。这还包括tcdB2和tccC3基因。
如文中所使用,杂交的“严格”条件是指实现同当前申请所采用的条件相同或者大约相同程度的杂交特异性的条件。具体而言,通过标准的方法(见,例如Maniatis,T.,E.F.Fritsch,J.Sambrook[1982]MolecularCloning:A Laboratory Manual,Cold Spring Harbor Laboratory,ColdSpring Harbor,NY)使用32P-标记的基因特异的探针开展固相DNA杂交的Southern印迹。通常,杂交和随后的洗涤在允许检测靶序列的条件下进行。对于双链DNA基因探针,在6×SSPE,5×Denhardt′s溶液,0.1%SDS,0.1mg/ml变性DNA中,在低于DNA杂交体解链温度(Tm)20-25℃条件下过夜杂交。在下面的公式中叙述了解链温度(Beltz,G.A.,K.A.Jacobs,T.H.Eickbush,P.T.Cherbas和F.C.Kafatos[1983]Methods ofEnzyymology,R.Wu,L.Grossman和K.Moldave[著者]Academic Press,New York 100:266-285):
Tm = 81.5°C + 16.6 Log[Na+] + 0.41(%G+C) - 0.61(% formamide) - 600/(length of duplex in base pairs).
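As a worked illustration, the sketch below evaluates the melting-temperature relation in its commonly cited form, in which the salt term is added (+16.6 log10[Na+]). The input values are hypothetical and chosen only to make the arithmetic easy to follow; they are not measurements from this disclosure.

```python
import math

def tm_long_probe(na_molar, gc_percent, formamide_percent, duplex_length_bp):
    """Tm (deg C) for a long DNA duplex, commonly cited form of the relation."""
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 600.0 / duplex_length_bp)

def hybridization_window(tm, low_offset=20.0, high_offset=25.0):
    """Temperature range 20-25 deg C below Tm, as used for overnight hybridization."""
    return (tm - high_offset, tm - low_offset)

# Hypothetical inputs: 1.0 M Na+, 50% G+C, no formamide, 600 bp duplex.
tm = tm_long_probe(na_molar=1.0, gc_percent=50.0,
                   formamide_percent=0.0, duplex_length_bp=600)
low, high = hybridization_window(tm)
print(f"Tm ~ {tm:.1f} C; hybridize between {low:.1f} and {high:.1f} C")
```

With these inputs the terms are 81.5 + 0 + 20.5 - 0 - 1.0, so the result is close to 101°C.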
洗涤通常如下进行:
(1)1×SSPE,0.1%SDS室温15分钟(低严格洗涤),两次。
(2)0.2×SSPE,0.1%SDS,在Tm-20℃下洗涤15分钟(中等严格洗涤),一次。
对于寡核苷酸探针,在6×SSPE,5×Denhardt′s溶液,0.1%SDS,0.1mg/ml变性DNA中,在低于杂交体解链温度(Tm)10-20℃条件下过夜杂交。对于寡核苷酸探针的Tm可以通过下式确定:
Tm (°C) = 2(number of T/A base pairs) + 4(number of G/C base pairs) (Suggs, S. V., T. Miyake, E. H. Kawashima, M. J. Johnson, K. Itakura, and R. B. Wallace [1981] ICN-UCLA Symp. Dev. Biol. Using Purified Genes, D. D. Brown [ed.], Academic Press, New York, 23:683-693).
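The oligonucleotide rule above can be applied by simple counting. The sketch below is a minimal implementation; the 18-mer used as input is a made-up example, not a probe from this disclosure.

```python
def tm_oligo(seq):
    """Tm (deg C) of an oligonucleotide: 2 per A/T base pair plus 4 per G/C base pair."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return 2 * at + 4 * gc

# Hypothetical 18-mer with 10 A/T and 8 G/C positions.
probe = "ATGCATGCATGCATGCAT"
print(tm_oligo(probe))
```

Hybridization would then be run 10-20°C below the value returned, as stated in the text.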
洗涤通常如下进行:
(1)1×SSPE,0.1%SDS室温15分钟(低严格洗涤),两次。
(2)1×SSPE,0.1%SDS,在杂交温度下洗涤15分钟(中等严格洗涤),一次。
通常,改变盐和/或温度以改变严格度。对于标记的大约>70碱基长度的DNA片段,可以使用下面的条件:
低:1或者2×SSPE,室温
低:1或者2×SSPE,42℃
中:0.2×或者1×SSPE,65℃
高:0.1×SSPE,65℃。
双链体的形成和稳定性取决于杂合体两条链之间的基本互补性,并且,如上指出,能够耐受一定程度的错配。因此,本发明的探针序列包括所述序列的突变(单个和多重突变)、缺失、插入,及其组合,其中所述突变、插入和缺失允许与目的靶多核苷酸形成稳定杂交体。突变、插入和缺失能够以许多方式在给定的多核苷酸序列内产生,并且这些方法是本领域普通技术人员熟知的。其它方法将来可以逐渐为人们所知。
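PCR technology: The polymerase chain reaction (PCR) is a repetitive, enzymatic, primed synthesis of a nucleic acid sequence. This procedure is well known and commonly used by those skilled in the art (see Mullis, U.S. Patent Nos. 4,683,195, 4,683,202, and 4,800,159; Saiki, Randall K., Stephen Scharf, Fred Faloona, Kary B. Mullis, Glenn T. Horn, Henry A. Erlich, Norman Arnheim [1985] "Enzymatic Amplification of β-Globin Genomic Sequences and Restriction Site Analysis for Diagnosis of Sickle Cell Anemia," Science 230:1350-1354). PCR is based on the enzymatic amplification of a DNA fragment of interest that is flanked by two oligonucleotide primers that hybridize to opposite strands of the target sequence. The primers are oriented with their 3′ ends pointing towards each other. Repeated cycles of heat denaturation of the template, annealing of the primers to their complementary sequences, and extension of the annealed primers with a DNA polymerase result in the amplification of the segment defined by the 5′ ends of the PCR primers. The extension product of each primer can serve as a template for the other primer, so each cycle essentially doubles the amount of DNA fragment produced in the previous cycle. This results in the exponential accumulation of the specific target fragment, up to several million-fold in a few hours. By using a thermostable DNA polymerase such as Taq polymerase (isolated from the thermophilic bacterium Thermus aquaticus), the amplification process can be completely automated. Other enzymes that can be used are well known to those skilled in the art.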
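The doubling described above implies exponential growth in copy number. The following sketch, under the idealized assumption of a fixed per-cycle efficiency (1.0 meaning perfect doubling), shows why roughly 20 cycles are enough for a million-fold increase. The function names and the efficiency parameter are illustrative assumptions.

```python
import math

def ideal_copies(start_copies, cycles, efficiency=1.0):
    """Copies after a number of cycles; efficiency=1.0 means perfect doubling."""
    return start_copies * (1 + efficiency) ** cycles

def cycles_for_fold(fold, efficiency=1.0):
    """Smallest whole number of cycles that reaches at least the given fold increase."""
    return math.ceil(math.log(fold) / math.log(1 + efficiency))

print(ideal_copies(1, 20))       # copies from one template after 20 ideal cycles
print(cycles_for_fold(1e6))      # cycles needed for a million-fold increase
```

Real reactions fall short of perfect doubling, so a lower efficiency value (for example 0.9) gives a more conservative estimate.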
本发明的DNA序列能够用作PCR扩增的引物。进行PCR扩增的过程中,能够耐受引物与模板之间一定程度的错配。因此,作为例证的引物的突变、缺失和插入(特别是将核苷酸加至5′末端)包括在本发明的范围之内。通过普通技术人员已知的方法能够在给定引物中产生突变、插入和缺失。
基因和毒素的修饰:根据本发明有用的基因和毒素不仅包括特定的作为例证的全长序列,还包括这些序列、变体、突变体、嵌合体的部分、节段和/或片段(包括与全长分子相比的内部和/或末端缺失),及其融合物。本发明的蛋白质可具有替换的氨基酸,只要它们保留此处特定作为例证的蛋白质的特异杀虫/功能活性。“变体”基因具有编码相同毒素或具有等效于作为例证蛋白质的杀虫活性的等效毒素的核苷酸序列。术语“变体蛋白质”和“等效毒素”是指具有与作为例证的毒素相同或基本上相同的针对靶害虫的生物学/功能活性和等效序列的毒素。如文中所使用,当谈及“等效”序列是指具有能够提高杀虫活性或对杀虫活性无不利影响的氨基酸替换、缺失、添加或插入的序列。该定义中还包括了保留杀虫活性的片段。保留与作为例证的毒素的相应片段相同或相似功能或“毒素活性”的片段和其它等效物也包括于本发明的范围之内。变化,例如氨基酸置换或添加,能够用于多种目的,例如增加(或降低)蛋白质的蛋白酶稳定性(而没有大大地/相当大地降低毒素的功能活性)。
等效毒素和/或编码这些等效毒素的基因能够获得于/来自于野生型或重组细菌和/或使用此处所提供的技术获得于/来自于其它野生型或重组生物体。例如,其它芽胞杆菌属、沙雷氏菌属、类芽胞杆菌属、光杆状菌属和致病杆菌属能够用作来源分离株。
Variations of genes can readily be constructed using, for example, standard techniques for making point mutations. In addition, U.S. Patent No. 5,605,793, for example, describes methods for generating additional molecular diversity by using DNA reassembly after random fragmentation. Variant genes can be used to produce variant proteins; recombinant hosts can be used to produce the variant proteins. Using these "gene shuffling" techniques, equivalent genes and proteins can be constructed that comprise any 5, 10, or 20 contiguous residues (amino acid or nucleotide) of any sequence exemplified herein. As one skilled in the art knows, these techniques can be adjusted to obtain equivalents having, for example, any whole number of contiguous residues (amino acid or nucleotide) from 3 through 500 inclusive (that is, 3, 4, 5, and every integer up to and including 500), corresponding to a segment (of the same size) in any of the exemplified or suggested sequences (or the complements (full complements) thereof). Similarly sized segments, especially those for conserved regions, can also be used as probes and/or primers.
利用商业可得的核酸外切酶或核酸内切酶,按照标准步骤能够制备全长基因的片段。例如,诸如Bal31的酶或定点诱变能够用于从这些基因的末端系统地切除核苷酸。同样,利用多种限制性内切酶可以获得编码活性片段的基因。蛋白酶可以用于直接获得这些毒素的活性片段。
如此处所述,毒素(和TC蛋白质)可以被截短而仍然保留功能活性也在本发明的范围之内。“截短的蛋白质”意思是指一部分毒素蛋白质可以被切割并且切割之后仍然呈现活性。通过昆虫肠道内部或外部的蛋白酶能够实现切割。而且,使用标准分子生物学技术能够产生有效切割的蛋白质,其中编码所述毒素的DNA碱基通过用限制性内切酶消化或者使用技术人员可用的其它技术移出。截短之后,所述蛋白质能够在异源系统例如大肠杆菌、杆状病毒、基于植物的病毒系统、酵母等等进行表达,并且然后置于此处所公开的昆虫测定法中确定其活性。在本领域中众所周知,能够成功地产生截短的毒素,以至于虽然小于完整的全长序列,它们仍保留了功能活性。在本领域中众所周知,B.t.毒素能够以截短的(核心毒素)形式使用。见,例如,Adang等人,Gene 36:289-300(1985),“苏云金芽孢杆菌亚种kurstaki HD-73的晶体蛋白质的全长和截短的质粒克隆特性表征及其对烟草天蛾的毒性”。保留杀虫活性的截短蛋白质的其它例子包括昆虫保幼激素酯酶(Regents of University of California的美国专利号5,674,485)。如文中所使用,术语“毒素”意思也包括功能上有活性的截短形式。
在一些情况下,尤其对于在植物中表达,使用表达截短蛋白质的截短基因是有利的。例如,在上面背景部分所论述,Hofte等人,1989,论述了B.t.毒素的前毒素和核心毒素片段。优选的截短基因一般编码全长蛋白质的40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98或99%。例如背景部分还论述了加工和重排TcdA和TcbA片段的蛋白酶。
本发明的某些毒素/TC蛋白质已经在文中明确地举例说明。由于这些毒素/TC蛋白质仅仅是本发明蛋白质的例证,非常明显本发明包含与例证蛋白质具有相同或相似毒素活性的变体或等效蛋白质(和编码其等效物的核苷酸序列)。等效蛋白质与作为例证的毒素/TC蛋白质具有氨基酸相似性(和/或同源性)。氨基酸同一性一般将大于60%,优选地大于75%,更加优选地大于80%,甚至更加优选地大于90%,并且可大于95%。本发明优选的多核苷酸和蛋白质也可根据更加特定的同一性和/或相似性范围定义。例如,与此处作为例证或建议的序列相比,同一性和/或相似性可以是49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98或99%。上面列出的任意数字能够用于定义上限和下限。例如,A类蛋白质能够定义为与给定的TcdA蛋白质具有50-90%的同一性。因此,与任意先前已知的TcdA蛋白质(包括此处特别举例说明的任意TcdA蛋白质(并且同样地XptA2蛋白质))相比,通过此处所提供或建议的任意以数字表示的同一性分值能够定义TcdA-样蛋白质(和/或tcdA-样基因)。根据本发明,这同样适用于任意蛋白质或基因,例如TcaC-、TcaA-、TcaB-、TcdB-、TccC-和XptB2-样蛋白质和基因。因此,这适用于增效剂(例如TcdB2和TccC3)和独特毒素。
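Unless otherwise specified, as used herein, percent sequence identity and/or similarity of two nucleic acids is determined using the algorithm of Karlin and Altschul (1990), Proc. Natl. Acad. Sci. USA 87:2264-2268, modified as in Karlin and Altschul (1993), Proc. Natl. Acad. Sci. USA 90:5873-5877. Such an algorithm is incorporated into the NBLAST and XBLAST programs of Altschul et al. (1990), J. Mol. Biol. 215:402-410. BLAST nucleotide searches are performed with the NBLAST program, score = 100, wordlength = 12. Gapped BLAST can be used as described in Altschul et al. (1997), Nucl. Acids Res. 25:3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (NBLAST and XBLAST) are used. See the NCBI/NIH website. The scores can also be calculated using the methods and algorithms of Crickmore et al., as described in the Background section above. To obtain gapped alignments for comparison purposes, the AlignX function of Vector NTI Suite 8 (InforMax, Inc., North Bethesda, MD, U.S.A.) was used, employing the default parameters. These were: a gap opening penalty of 15, a gap extension penalty of 6.66, and a gap separation penalty range of 8.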
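To make the notion of "percent identity" concrete, the sketch below computes it for two sequences that have already been aligned (gaps written as "-"). This is only one of several conventions (matches divided by the number of alignment columns) and is not the BLAST or AlignX calculation named above; the two short sequences are invented for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns with identical, non-gap residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)

def within_range(identity, lower, upper):
    """True if an identity value lies inside a claimed range such as 50-90%."""
    return lower <= identity <= upper

# Hypothetical aligned peptides: 5 identical columns out of 7.
pid = percent_identity("MKT-LLV", "MKTALIV")
print(round(pid, 1), within_range(pid, 50, 90))
```

Different tools normalize by different lengths, so values from this sketch are not directly comparable with reported BLAST identities.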
在蛋白质的关键区域氨基酸同源性/相似性/同一性是最高的,这些关键区域负责着毒素的活性或者参与确定最终负责毒素活性的三维构象。在这点上,某些氨基酸替代是可接受的并且预计是可以忍受的。例如,这些替代可以是在蛋白质非关键活性的区域。可以使用蛋白质晶体结构分析和基于软件的蛋白质结构建模来确定蛋白质中能够被修饰的区域(使用定点诱变、改组等),以确实改变蛋白质的特性和/或提高蛋白质的功能性。
还可以对蛋白质的多种特性和三维特征进行改变而对蛋白质的毒素活性/功能性无不利影响。保守氨基酸的替代预计能够忍受/不会不利地影响分子的三维构象。氨基酸可以分为下面的几类:非极性的、不带电荷的极性的、碱性的和酸性的。一类中的氨基酸被该类中的另一个氨基酸替代的这种保守替代处于本发明的范畴,条件是只要替代不为不利于所述化合物生物学活性的替代即可。
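Table 1 provides a listing of examples of amino acids belonging to each class.
Table 1. | |
Class of amino acid | Examples of amino acids |
Nonpolar | Ala, Val, Leu, Ile, Pro, Met, Phe, Trp |
Uncharged polar | Gly, Ser, Thr, Cys, Tyr, Asn, Gln |
Acidic | Asp, Glu |
Basic | Lys, Arg, His |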
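The class-based definition of a conservative substitution can be expressed as a lookup. The sketch below uses the four classes of Table 1; the function names are illustrative, and whether a given substitution actually preserves activity still has to be tested experimentally.

```python
# Amino acid classes as listed in Table 1 (three-letter codes).
AMINO_ACID_CLASSES = {
    "nonpolar": {"Ala", "Val", "Leu", "Ile", "Pro", "Met", "Phe", "Trp"},
    "uncharged polar": {"Gly", "Ser", "Thr", "Cys", "Tyr", "Asn", "Gln"},
    "acidic": {"Asp", "Glu"},
    "basic": {"Lys", "Arg", "His"},
}

def class_of(residue):
    """Return the Table 1 class name for a three-letter residue code."""
    for name, members in AMINO_ACID_CLASSES.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue}")

def is_conservative(original, replacement):
    """True if both residues fall in the same Table 1 class."""
    return class_of(original) == class_of(replacement)

print(is_conservative("Asp", "Glu"))  # both acidic
print(is_conservative("Leu", "Lys"))  # nonpolar vs basic
```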
在一些情况下,也可以进行非保守替代。关键的因素是这种替代必须不明显损失蛋白质的功能/生物学/毒素活性。
如文中所使用,当谈及“分离的”多核苷酸和/或“纯化的”毒素是指没有与天然存在的其它分子相关联的这些分子。因此,如文中所使用,当谈及“分离的”和/或“纯化的”意思是存在“人工”的参与。例如,置入植物中用来表达的本发明细菌毒素“基因”是“分离的多核苷酸”。同样,此处例证的由植物产生的类芽孢杆菌属蛋白质是“分离的蛋白质”。
由于遗传密码的简并性/冗余性,多种不同的DNA序列可以编码此处公开的氨基酸序列。产生编码相同或者基本上相同毒素的备选DNA序列是受过本领域培训的技术人员的技术范围之内。这些变体DNA序列属于本发明的范畴。
优化序列用于在植物中表达:为了实现异源基因在植物中高表达,优选地是对所述基因重新进行改造使得它们能够在植物细胞的(胞质)中更有效的表达。玉米是此种植物的一种,优选地是在转化前将外源基因进行重新设计以提高它们在所述植物中的表达水平。因此,设计编码细菌毒素的基因的额外步骤是改造异源基因用于优化表达。
对重新改造细菌毒素用于在玉米中表达的一个原因是天然基因的G+C含量是非优化的。例如,许多天然的细菌基因G+C含量非常低(并且结果向高A+T含量倾斜)导致了模拟或者重复已知高度富含A+T的植物基因控制序列的产生。在导入植物的基因DNA中一些富含A+T的序列的存在(例如正常情况下在基因启动子中发现的TATA盒区域)会导致异常的基因转录。另一方面,位于转录的mRNA(例如多聚腺苷酸化信号序列(AAUAAA)或者与参与前体-mRNA剪接的小核RNA互补的序列)中的其它调节序列的存在可能导致RNA不稳定。因此,在对用于在玉米中表达的编码细菌毒素基因(更加优选地称为为植物优化的基因)的设计中的一个目标是产生具有较高G+C含量的DNA序列,并且优选地是与编码代谢酶玉米的基因相近的G+C含量。在设计编码细菌毒素的植物优化的基因的另一个目标是产生这样的DNA序列,其中序列的修饰不阻碍翻译。
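The table below (Table 2) illustrates how high the G+C content is in maize. For the data in Table 2, coding regions of the genes were extracted from GenBank (Release 71) entries, and base compositions were calculated using the MacVector™ program (Accelrys, San Diego, California). Intron sequences were ignored in the calculations.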
由于遗传密码的冗余性/简并性所赋予的可塑性(即一些氨基酸被多于一个的密码子所指定),不同生物或者不同生物类别的基因组的进化导致对冗余密码子的不同使用。这种“密码子偏倚性”反映在蛋白质编码区的平均碱基组成上。例如,具有相对低G+C含量的生物利用冗余密码子中第三位为A或者T的密码子,而具有较高G+C含量的生物利用第三位为G或者C的密码子。据认为在mRNA中“稀有”密码子的存在降低了mRNA的绝对翻译速率,特别是当对应于稀有密码子的氨酰tRNA的相对丰度很低时。将此扩展,由于单个稀有碱基导致的翻译速率的降低对于多个稀有碱基而言至少为加成的。因此,具有相对高的稀有密码子含量的mRNA相应地具有低的翻译速率。该速率会反映在后来所编码蛋白质的低水平上。
在用于在玉米(或其它植物,例如棉花或者大豆)中表达的编码细菌毒素的工程改造基因中,确定了植物的密码子偏倚性。对于玉米的密码子偏倚性是该植物用来编码其蛋白质的统计上的密码子分布,并且优选的密码子选择示于表3。在确定了偏倚性之后,确定目的基因中密码子的频率百分数。应该确定植物优选的主要密码子,当存在多种选择时还应该确定优选密码子的第二、第三和第四种选择。因此可以设计出编码细菌毒素的氨基酸序列的新DNA序列,但是通过在毒素氨基酸序列的每一个位置上的氨基酸用植物(第一优选、第二优选、第三优选或者第四优选)密码子进行替代使得新的DNA序列不同于天然的细菌DNA序列(编码毒素)。然后对新的序列进行限制性酶切位点的分析,这些酶切位点可能通过修饰而产生。将鉴定出来的位点通过用第一、第二、第三或者第四选择的优选密码子进行替代来进一步进行修饰。可能影响目的基因转录和翻译的序列中的其它位点是外显子:内含子连接处(5′或者3′)、多聚A添加信号或者RNA多聚酶终止信号。将序列进行进一步的分析和修饰以降低TA或者GC双联体的频率。除了双联体外,具有多于约4个同一残基的G或者C的序列区会影响序列的转录。因此,通过用次优选的所选密码子替代第一或者第二选择的密码子等等也可以对所述区域进行修饰。
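The sequence checks described above (polyadenylation signals, TA or GC doublets, and runs of G or C) can be automated with simple pattern searches. The sketch below is an assumption-laden illustration: "more than about 4" is interpreted as a run of 5 or more, the DNA form AATAAA stands in for the mRNA signal AAUAAA, and the test sequence is invented.

```python
import re

def scan_design_problems(seq, min_run=5):
    """Report motifs the text flags as undesirable in a plant-optimized gene."""
    seq = seq.upper()
    run_pattern = "G{%d,}|C{%d,}" % (min_run, min_run)
    return {
        # Lookahead finds overlapping occurrences of the poly(A) signal.
        "polyA_signal_positions": [m.start() for m in re.finditer("(?=AATAAA)", seq)],
        # Runs of identical G or C residues of at least min_run bases.
        "g_or_c_runs": [(m.start(), m.group()) for m in re.finditer(run_pattern, seq)],
        # Doublet counts; "TA" and "GC" cannot overlap themselves.
        "TA_doublets": seq.count("TA"),
        "GC_doublets": seq.count("GC"),
    }

# Hypothetical fragment containing one of each flagged feature.
report = scan_design_problems("ATGAATAAAGGGGGGCTATAGC")
print(report)
```

Positions are 0-based. Flagged sites would then be altered by swapping in another preferred codon for the same amino acid, as the text describes.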
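Table 2 | ||
Compilation of G+C contents of protein coding regions of maize genes | ||
Protein class (a) | Range of %G+C | Mean %G+C (b) |
Metabolic enzymes (76) | 44.4-75.3 | 59.0 (±8.0) |
Structural proteins (18) | 48.6-70.5 | 63.6 (±6.7) |
Regulatory proteins (5) | 57.2-68.8 | 62.0 (±4.9) |
Uncharacterized proteins (9) | 41.5-70.3 | 64.3 (±7.2) |
All proteins (108) | 44.4-75.3 | 60.8 (±5.2) |
(a) Number of genes in the class is given in parentheses.
(b) Standard deviations are given in parentheses.
(c) The combined groups mean was ignored in the mean calculation.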
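The %G+C values in Table 2 are base-composition percentages over coding regions. A minimal sketch of that calculation follows; the short coding sequence is a made-up example, and the range used for comparison is the metabolic-enzyme row of Table 2.

```python
def gc_percent(coding_sequence):
    """Percent of G and C bases in a DNA coding sequence (introns already removed)."""
    seq = coding_sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

# Range of %G+C for maize metabolic enzymes, from Table 2.
MAIZE_METABOLIC_RANGE = (44.4, 75.3)

# Hypothetical 15-base coding fragment with 10 G/C bases.
value = gc_percent("ATGGCCGCGTGCTGA")
in_range = MAIZE_METABOLIC_RANGE[0] <= value <= MAIZE_METABOLIC_RANGE[1]
print(round(value, 1), in_range)
```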
在编码细菌毒素的植物优化的基因中含有大约63%第一选择密码子、大约22%到37%之间的第二选择密码子和大约15%到大约0%之间的第三或者第四选择密码子是优选的,其中总的百分数是100%。植物优化的基因中含有大约63%第一选择密码子、至少大约22%第二选择密码子、大约7.5%第三密码子和大约7.5%第四密码子是最优选的,其中总的百分数是100%。用于在玉米中表达的工程改造基因中的优选密码子使用示于表3。上述方法能够使本领域技术人员修饰对特定植物而言为外来的基因,从而该基因在植物中优化表达。在PCT申请WO 97/13402中做了进一步阐述。
为了设计编码细菌毒素的植物优化基因,使用如表2所示的,从针对具体植物的基因序列编辑的密码子偏倚性表中建立的丰余遗传密码,将DNA序列设计成能够编码所述蛋白质毒素的氨基酸序列。所获得的DNA序列具有较高程度的密码子多样性,所期望的碱基组成可以含有战略上设置的限制性酶识别位点,并缺乏可能干扰基因转录或者产物mRNA翻译的序列。
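Table 3. | |
Preferred amino acid codons for proteins expressed in maize | |
Amino acid | Codon* |
Alanine | GCC/GCG |
Cysteine | TGC/TGT |
Aspartic acid | GAC/GAT |
Glutamic acid | GAG/GAA |
Phenylalanine | TTC/TTT |
Glycine | GGC/GGG |
Histidine | CAC/CAT |
Isoleucine | ATC/ATT |
Lysine | AAG/AAA |
Leucine | CTG/CTC |
Methionine | ATG |
Asparagine | AAC/AAT |
Proline | CCG/CCA |
Glutamine | CAG/CAA |
Arginine | AGG/CGC |
Serine | AGC/TCC |
Threonine | ACC/ACG |
Valine | GTG/GTC |
Tryptophan | TGG |
Tyrosine | TAC/TAT |
Stop | TGA/TAG |
*The first and second preferred codons for maize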
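As an illustration of how Table 3 could be used, the sketch below back-translates a peptide using only the first-listed maize codon for each amino acid. This is a deliberate simplification: the text calls for a mixture of first- through fourth-choice codons (about 63% first choice), followed by removal of unwanted sites, none of which is modeled here. The peptide is an invented example.

```python
# First-listed (first-choice) maize codon for each amino acid, from Table 3.
FIRST_CHOICE_CODON = {
    "A": "GCC", "C": "TGC", "D": "GAC", "E": "GAG", "F": "TTC",
    "G": "GGC", "H": "CAC", "I": "ATC", "K": "AAG", "L": "CTG",
    "M": "ATG", "N": "AAC", "P": "CCG", "Q": "CAG", "R": "AGG",
    "S": "AGC", "T": "ACC", "V": "GTG", "W": "TGG", "Y": "TAC",
}
STOP_CODON = "TGA"  # first-listed stop codon in Table 3

def back_translate(peptide, add_stop=True):
    """Encode a one-letter peptide with first-choice maize codons only."""
    try:
        codons = [FIRST_CHOICE_CODON[aa] for aa in peptide.upper()]
    except KeyError as err:
        raise ValueError(f"unknown amino acid: {err.args[0]}") from None
    if add_stop:
        codons.append(STOP_CODON)
    return "".join(codons)

# Hypothetical tripeptide Met-Lys-Trp.
print(back_translate("MKW"))
```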
因此,功能等效于本发明毒素基因的人工基因可以用来转化包括植物在内的宿主。关于产生人工基因的额外的指导可以在例如美国专利号5,380,831中找到。
转基因宿主:可以将本发明的编码毒素的基因导入广泛的多种微生物或者植物宿主中。在优选的实施方案中,使用了转基因植物细胞和植物。优选的植物(和植物细胞)是谷物、玉米和棉花。
在优选的实施方案中,毒素基因的表达直接或者间接导致杀虫蛋白质在细胞内产生(和维持)。以这种方式植物表现出昆虫抗性。当害虫摄取转基因/重组/转化/转染的宿主细胞(或其内容物)时,害虫会摄取毒素。这是导致害虫和毒素接触的优选方式。结果是对害虫的控制(杀死或者致病)。对于吮吸害虫也可以用相似的方式进行控制。备选地,在靶害虫存在的地方可以使用适宜的微生物宿主,例如假单胞菌属如荧光假单胞菌(P.fluorescens),微生物可以在此处增殖并且被靶害虫摄取。可以将具有毒素基因的微生物在延长毒素活性和稳定细胞的条件下进行处理。从而经过处理的保留毒素活性的细胞可以应用到靶害虫的环境中。
如果通过适当的载体将毒素基因导入微生物宿主,并且将所述宿主以活的状态应用到环境中时,可以用到一些宿主微生物。选择了已知占具有一种或多种目的作物植物圈(叶面、叶圈、根际和/或根面)的微生物宿主。将这些微生物进行选择以致于具有在具体环境中(作物和其它昆虫栖息地)成功竞争野生型微生物的能力,提供能够表达多肽杀虫剂的基因的稳定维持和表达,并且,希望提供使杀虫剂免受环境降解和失活的增强的保护。
A large number of microorganisms are known to inhabit the phylloplane (the surface of plant leaves) and/or the rhizosphere (the soil surrounding plant roots) of a wide variety of important crops. These microorganisms include bacteria, algae, and fungi. Of particular interest are microorganisms such as bacteria, e.g., the genera Pseudomonas, Erwinia, Serratia, Klebsiella, Xanthomonas, Streptomyces, Rhizobium, Rhodopseudomonas, Methylophilus, Agrobacterium, Acetobacter, Lactobacillus, Arthrobacter, Azotobacter, Leuconostoc, and Alcaligenes; and fungi, particularly yeast, e.g., the genera Saccharomyces, Cryptococcus, Kluyveromyces, Sporobolomyces, Rhodotorula, and Aureobasidium. Of particular interest are such phytosphere bacterial species as Pseudomonas syringae, Pseudomonas fluorescens, Serratia marcescens, Acetobacter xylinum, Agrobacterium tumefaciens, Rhodopseudomonas spheroides, Xanthomonas campestris, Rhizobium meliloti, Alcaligenes eutrophus, and Azotobacter vinelandii; and phytosphere yeast species such as Rhodotorula rubra, R. glutinis, R. marina, R. aurantiaca, Cryptococcus albidus, C. diffluens, C. laurentii, Saccharomyces rosei, S. pretoriensis, S. cerevisiae, Sporobolomyces roseus, S. odorus, Kluyveromyces veronae, and Aureobasidium pullulans. Also of interest are pigmented microorganisms.
将基因插入以形成转基因宿主:本发明的一个方面是用表达本发明蛋白质的本发明多核苷酸转化/转染植物、植物细胞和其它宿主细胞。用此种方式转化的植物具有抗靶害虫攻击的抗性。
可以用广泛大量的方法在允许基因稳定维持和表达的条件下将编码杀虫蛋白质的基因导入靶宿主。这些方法对于本领域技术人员而言是熟知的并且在例如美国专利号5,135,867中有所叙述。
例如,包含大肠杆菌复制系统和允许对转化细胞进行选择的标记物的大量克隆载体可用于将外来基因插入到高等植物中。载体包含例如pBR322、pUC系列、M13mp系列、pACYC184等等。相应地,可以将编码毒素的序列插入到载体的适当限制性位点。所得到的质粒用于转化入大肠杆菌。将大肠杆菌在适当的营养培养基中进行培养,然后收获和裂解。将质粒回收。通常实施序列分析、限制性分析、电泳和其它生物化学-分子生物学方法作为分析的方法。每一次操作之后,将所使用的DNA序列进行切割并且连接到下一个DNA序列中。可以将每一个质粒序列克隆到相同的或者其它的质粒中。根据目的基因插入植物的方法,其它的DNA序列可能是需要的。例如,如果Ti或者Ri质粒用于转化植物细胞,那么至少是Ti或者Ri质粒T-DNA的右边界,但常常是右边界和左边界,作为所插入基因的侧翼区而连接。对于T-DNA用于转化植物细胞的用途已经得以广泛研究并在EP 120516;Hoekema(1985):The Binary Plant VectorSystem,Offset-durkkerii Kanters B.v.,Alblasserdam,第5章;Fraley等,Crit.Rev.Plant Sci.4:1-46;和An等(1985)EMBO J.4:277-287中有叙述。
有大量的技术可以将DNA插入到植物宿主细胞。这些技术包括使用根癌农杆菌或者发根农杆菌(Agrobacterium rhizogenes)作为转化介质的T-DNA转化、融合、注射、生物弹射击法(biolistics)(微粒子轰击)或者电穿孔以及其它可能的方法。如果农杆菌属用于转化,必须将欲被插入的DNA克隆到特定的质粒中,即克隆到中间载体或者双元载体。由于序列与T-DNA上的序列同源,通过同源重组可以将中间载体整合到Ti或者Ri质粒中。Ti或者Ri质粒还含有对于T-DNA转移必需的vir区域。中间载体不能够在农杆菌属中自我复制。通过辅助质粒(接合)的方法可以将中间载体转移入根癌农杆菌。双元载体可以在大肠杆菌和农杆菌属中进行自我复制。它们包含选择性标记基因和接头或者多接头,这些接头或者多接头被T-DNA右边界和左边界框起来。它们可以直接转化进入农杆菌属中(Holsters等[1978]MoL Gen.Genet.163:181-187)。作为宿主细胞使用的农杆菌属要包含携带vir区域的质粒。vir区域对于将T-DNA转移进入植物细胞是必需的。可以包含额外的T-DNA。如此转化的细菌用于转化植物细胞。植物外植体可以有利地与根癌农杆菌或者发根农杆菌一起培养,以便将DNA转移进入植物细胞。在含有用于选择的抗生素或者杀生物剂的适当培养基中使被感染的植物材料(例如叶片、茎段、根,还可以是原生质体或者悬浮培养的细胞)再生出整株植物。然后对如此得到的植物检验插入DNA是否存在。在注射和电穿孔的情况下,对质粒没有特殊的要求。使用普通的质粒,例如pUC衍生物是可行的。
通常转化的细胞生长在植物的内部。它们可以形成生殖细胞并且将转化的特性传给后代植物。此种植物可以以正常方式生长并可与转化有同样遗传因子或者其它遗传因子的植物进行杂交。所得的杂交个体具有相应的表型特征。
在本发明的一些优选实施方案中,编码细菌毒素的基因从插入植物基因组中的转录单元进行表达。优选地,所述转录单元是能够稳定整合进入植物基因组并能够对表达mRNA编码蛋白质的经转化植物品系进行筛选的重组载体。
一旦插入的DNA被整合进入基因组,则它在基因组中是相对稳定的(并且不能够再次出来)。正常情况下它含有赋予转化的植物细胞抗杀生物剂或者抗生素的选择性标记,诸如卡那霉素、G418、博来霉素、潮霉素或者氯霉素及其它。各种所应用的标记因此应该允许对转化细胞进行选择而不是对不含有插入DNA的细胞进行选择。目的基因优选地通过组成型或者诱导型启动子在植物细胞中表达。一旦表达,mRNA可以翻译成蛋白质,因此将目的氨基酸掺入到蛋白质中去。在植物细胞中表达的编码毒素的基因可以在组成型启动子、组织特异型启动子或者诱导型启动子的控制下。
存在一些用于将外来重组载体导入植物细胞并且得到稳定维持和表达所导入基因的植物的技术。此类技术包括将包被在微颗粒上的遗传物质直接导入细胞(Cornell的美国专利号4,945,050和DowElance的美国专利号5,141,131,现在为Dow AgroSciences,LLC)。此外,使用农杆菌属的技术将植物进行转化,见University of Toledo的美国专利号5,177,010;TexasA&M的美国专利号5,104,310;欧洲专利申请0131624B1;Schilperoot的欧洲专利申请120516,159418B1和176,112;Schilperoot的美国专利申请号5,149,645,5,469,976,5,464,763和4,940,838和4,693,976;MaxPlanck的欧洲专利申请116718,290799,320500;日本Tobacco的欧洲专利申请604662和627752和美国专利号5,591,616;Ciba Geigy的,现在为Novartis的欧洲专利申请0267159和0292435和美国专利号5,231,019;Calgene的美国专利号5,463,174和4,762,785和Agracetus的美国专利号5,004,863和5,159,135。其它的转化技术包括whiskers技术。见Zeneca的美国专利号5,302,523和5,464,765。电穿孔技术也用来转化植物。见BoyceThompson Institute的WO 87/06614;Dekalb的美国专利号5,472,869和5,384,253和Plant Genetic Systems的Wo 92/09696和WO 93/21335。此外,病毒载体也可以用于产生表达目的蛋白质的转基因植物。例如,使用Mycogen Plant Science和Ciba-Giegy现在为Novartis的美国专利号5,569,597以及Biosource的美国专利号5,589,367和5,316,931中叙述的方法可以用病毒载体转化单子叶植物。
如先前所提到,DNA构建体导入植物宿主的方式不是本发明的关键。能够提供有效转化的任何方法均可采用。例如,此处叙述了植物细胞转化的多种方法并且包括Ti或者Ri质粒等的使用进行农杆菌属介导的转化。在很多情况下,希望用于转化的构建体位于T-DNA边界的一侧或者两侧,更加优选的是位于右边界。虽然在其它转化模式中可见T-DNA边界的使用,但当此构建体用根癌农杆菌或者发根农杆菌作为转化模式时特别有用。当使用农杆菌属用于植物细胞转化时,欲被导入宿主的载体用来与存在于宿主中的T-DNA或者Ti或者Ri质粒进行同源重组。可以通过电穿孔、三亲杂交和本领域技术人员已知的用于转化革兰氏阴性细菌的其它技术实施载体的导入。载体转化进入农杆菌宿主的方式不是本发明的关键。包含用于重组的T-DNA的Ti或者Ri质粒能够或者不能够导致菌瘿的形成,并且这不是本发明的重点,只要所述此宿主中存在vir基因即可。
在用农杆菌属用于转化的一些情况下,位于T-DNA边界中的表达构建体插入到广谱的载体中,例如如在Ditta等(PNAS USA(1980)77:7347-7351)和EPO 0 120 515(此处引用作为参考)中所叙述的pRK2或其衍生物。表达构建体和T-DNA中包含如此处所述的一种或多种标记,所述标记允许对转化的农杆菌属和转化的植物细胞进行选择。采用的具体标记不是本发明的本质,优选的标记取决于所使用的宿主和构建体。
对于使用农杆菌转化植物细胞,将外植体和转化的农杆菌属结合并孵育足够的时间以便允许其转化。在转化后,通过用恰当的抗生素进行选择将农杆菌杀死,并且将植物细胞培养在恰当的选择培养基中。愈伤组织一旦形成,根据植物组织培养和植物再生领域内熟知的方法,通过采用适当的植物激素促进枝条的形成。然而,愈伤组织中间阶段并不总是必要。在枝条形成后,可以将此植物细胞转移至培养基中促进根的形成,从而完成植物的再生。然后将植物培养至结种并且所述种子可以用来建立未来的子代。不管转化的技术为何技术,编码细菌毒素的基因优选地掺入到基因转移载体上,通过在载体上包含植物启动子调节元件和诸如Nos等的3′非翻译转录终止区,使得该载体用来在植物细胞中表达此种基因。
除了用于转化植物的技术有多种外,与外来基因相接触的组织类型同样也会变化。此种组织包括但不限于胚发生组织、愈伤组织类型I、II和III、胚轴、分裂组织、根组织、在韧皮部表达的组织等等。使用此处叙述的恰当的技术,几乎所有植物组织在去分化过程中均能够被转化。
如上面所提到,如果期望的话,多种选择标记可以使用。对具体标记的优先使用根据本领域技术人员的判断,但是任何下面的选择标记可以与此处没有列出的但能够作为选择标记起作用的其它基因一起使用。此种选择标记包括但不限于编码对抗生素卡那霉素、新霉素和G418抗性的转座子Tn5的氨基糖苷磷酸转移酶(Aph II)和编码对草甘膦、潮霉素、氨甲喋呤、膦丝菌素(双丙氨膦)、咪唑啉酮、磺酰脲和诸如氯磺隆(chlorsulfuron)、溴草腈(bromoxynil)、茅草枯等的三唑嘧啶(triazolopyrimidine)除草剂抗性或者耐受的那些基因。
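In addition to a selectable marker, a reporter gene may also be used. In some instances a reporter gene may be used with or without a selectable marker. Reporter genes are genes that are typically not present in the recipient organism or tissue and typically encode proteins resulting in some phenotypic change or enzymatic property. Examples of such genes are provided in K. Wising et al., Ann. Rev. Genetics, 22, 421 (1988). Preferred reporter genes include the β-glucuronidase (GUS) of the uidA locus of E. coli, the chloramphenicol acetyltransferase gene from Tn9 of E. coli, the green fluorescent protein from the bioluminescent jellyfish Aequorea victoria, and the luciferase gene from the firefly Photinus pyralis. An assay for detecting reporter gene expression is performed at a suitable time after the gene has been introduced into recipient cells. A preferred such assay uses the gene encoding β-glucuronidase (GUS) of the uidA locus of E. coli, as described by Jefferson et al. (1987, Biochem. Soc. Trans. 15, 17-19), to identify transformed cells.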
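In addition to plant promoter regulatory elements, promoter regulatory elements from a variety of sources can be used efficiently in plant cells to express foreign genes. For example, promoter regulatory elements of bacterial origin, such as the octopine synthase promoter, the nopaline synthase promoter, and the mannopine synthase promoter, and promoters of viral origin, such as the cauliflower mosaic virus (35S and 19S) promoters and 35T (a re-engineered 35S promoter; see U.S. Patent No. 6,166,302, especially Example 7E), may be used. Plant promoter regulatory elements include but are not limited to the ribulose-1,5-bisphosphate (RUBP) carboxylase small subunit (ssu) promoter, the β-conglycinin promoter, the β-phaseolin promoter, the ADH promoter, heat-shock promoters, and tissue-specific promoters. Other elements such as matrix attachment regions, scaffold attachment regions, introns, enhancers, polyadenylation sequences, and the like may also be present, and these may improve transcription efficiency or DNA integration. Such elements may or may not be necessary for DNA function, although they can provide better expression or functioning of the DNA by affecting transcription, mRNA stability, and the like. Such elements may be included in the DNA as desired to obtain optimal performance of the transformed DNA in the plant. Typical elements include but are not limited to Adh-intron 1, Adh-intron 6, the alfalfa mosaic virus coat protein leader sequence, the maize streak virus coat protein leader sequence, and others available to those skilled in the art. Constitutive promoter regulatory elements may also be used, thereby directing continuous gene expression in all cell types and at all times (e.g., actin, ubiquitin, CaMV 35S, and the like). Tissue-specific promoter regulatory elements are responsible for gene expression in specific cell or tissue types, such as leaves or seeds (e.g., zein, oleosin, napin, ACP, globulin, and the like), and these may also be used.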
启动子调节元件还可以是在植物发育的一定阶段活化的,也可以是在植物的组织和器官中活化的。此类启动子的例子包括但不限于花粉特异、胚特异、玉米穗丝特异、棉花纤维特异、根特异、种子胚乳特异的启动子调节元件等等。在一定情况下,使用可诱导启动子调节元件是所希望的,该元件负责对特异信号作出反应而使基因表达,其中所述特异的信号诸如物理刺激(热激基因)、光(RUBP羧化酶)、激素(Em)、代谢物、化学品和胁迫。可以使用在植物中有功能的其它目的转录和翻译元件。多种植物特异的基因转移载体在本领域是已知的。
使用标准的分子生物学技术对此处所述的毒素进行克隆和测序。其它信息可以在Sambrook,J.,Fritsch,E.F.,和Maniatis,T.(1989),MolecularCloning,A Laboratory Manual,Cold Spring Harbor Press中找到,此处公开作为参考。
抗性管理:随着杀虫蛋白质在植物中使用的日益商业化,抗性管理成为一种担心。即在其产品中使用苏云金芽孢杆菌毒素的公司有很多,并且对发展成抗B.t.毒素的昆虫存在忧虑。对昆虫抗性管理的一个策略是将由致病杆菌属、光杆状菌属等产生的TC毒素与诸如B.t.晶体毒素、来源于芽孢杆菌菌株的可溶性杀虫蛋白质(见,例如WO 98/18932和WO99/57282)或其它杀昆虫的毒素相结合。这种结合可以是配制成喷雾施用的制剂或者是分子的结合。可以用产生2种或者多种不同杀昆虫毒素的细菌基因转化植物(见,例如Gould,38Bioscience 26-33(1988)和美国专利号5,500,365;同样,欧洲专利申请0400246A1和美国专利5,866,784、5,908,970和6,172,281也叙述了用两种B.t.晶体毒素进行植物转化)。产生含有多于一种抗昆虫基因的转基因植物的另一种方法是首先产生2种植物,每种植物含有一种抗昆虫的基因。然后使用常规的植物育种技术将这些植物进行杂交以产生含有多于一种抗昆虫基因的植物。因此,很明显,如文中所使用,短语“包含多核苷酸”意思是指至少一种多核苷酸(并且可能更多,它们互相接近或者不接近),除非另外特别指出。
制剂和其它递送系统:所配制的含有本发明细胞和/或蛋白质(包括含有此处所述基因的重组微生物)的诱饵颗粒可以施用于土壤。所配制的产品还可以在作物循环的后期阶段作为种子包被、根处理或者全植物处理而施用。对植物和土壤进行处理的细胞可以采用可湿性粉末、颗粒或者粉剂,通过与多种惰性物质混合施用,惰性物质诸如无机矿物质(矿页硅酸盐、碳酸盐、硫酸盐、磷酸盐等)或者植物材料(弄成粉末的玉米芯、稻皮、胡桃壳等)。制剂可以包含展布剂-粘着剂佐剂、稳定剂、其它杀虫添加剂或者表面活性剂。液体制剂可以是基于水的或者非水的并且采用诸如泡沫、凝胶、混悬液、可乳化原液等。成分可以包含流变剂、表面活性剂、乳化剂、分散剂或者聚合体。
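As those skilled in the art will appreciate, the pesticidal concentration will vary widely depending upon the nature of the particular formulation, particularly whether it is a concentrate or is to be used directly. The pesticide will be present at at least 1% by weight and may be 100% by weight. Dry formulations will have from about 1-95% by weight of the pesticide, while liquid formulations will generally have from about 1-60% by weight of solids in the liquid phase. The formulations will generally have from about 10^2 to about 10^4 cells/mg. These formulations will be applied at about 50 mg (liquid or dry) to 1 kg or more per hectare.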
通过喷雾、撒粉、喷洒等可以将制剂施用于昆虫的环境中,例如土壤和树叶。
另一个递送方案是将毒素的遗传物质掺入到杆状病毒载体。杆状病毒感染特定的昆虫宿主,包括欲用毒素靶向的那些昆虫宿主。可以将带有毒素表达构建体的感染性杆状病毒引入昆虫侵害的区域,从而使被感染的昆虫中毒或者被毒死。
已知昆虫病毒或者杆状病毒可以感染并且负面影响一些昆虫。病毒对昆虫的影响是慢的,并且病毒不能够立即阻止昆虫的摄食。因此,可见病毒不是最佳的昆虫害虫控制剂。然而将毒素基因结合到杆状病毒载体中提供了一种传递毒素的有效方式。除此之外,由于不同的杆状病毒对不同的昆虫是特异的,因此有可能使用特定的毒素去选择性靶向特定的破坏性昆虫害虫。对于毒素基因特别有用的载体是核型多角体病毒。已经对使用这种病毒的转移载体进行了叙述并且现在是用于转移外来基因进入昆虫的一种选择。可以将病毒-毒素基因重组体制作成口服转移的形式。杆状病毒通常通过中肠肠粘膜感染昆虫。插入到强的病毒衣壳蛋白启动子后的毒素基因会表达并迅速杀死感染的昆虫。
除了用于本发明蛋白质毒素的昆虫病毒或杆状病毒或者转基因植物递送系统外,使用苏云金芽孢杆菌包封技术可以将蛋白质进行包封,该技术诸如但不限于美国专利号4,695,455、4,695,462、4,861,595,此处引用作为参考。用于本发明蛋白质毒素的另一个递送系统是将蛋白质配制成诱饵介质,该介质可以在地上和地下昆虫诱饵装置中使用。此种技术的例子包括但不限于PCT专利申请WO 93/23998,此处引用作为参考。
基于植物RNA病毒的系统也可以用于表达细菌毒素。在这种情况下,编码毒素的基因被插入到适当植物病毒的衣壳启动子区域,该病毒可感染目的宿主植物。然后毒素被表达,因此保护植物免受昆虫损害。基于植物RNA病毒的系统叙述于Mycogen Plant Sciences的美国专利号5,500,360和Biosource Genetics Corp.的美国专利号5,316,931和5,589,367。
除了产生经转化植物外,还有其它递送系统,其中对细菌基因进行了工程改造。例如,通过将作为食物源而吸引昆虫的分子与毒素融合在一起而构建成蛋白质毒素。在实验室纯化后,此种具有“内部”诱饵的毒性剂可以被包装在标准昆虫捕捉屋中。
突变体:通过本领域众所周知的操作步骤,可以制成细菌分离株的突变体。例如,通过对分离株进行甲基磺酸乙酯(EMS)诱变可以得到不产孢子的突变型。通过本领域众所周知的操作步骤使用紫外线和亚硝基胍可以产生突变型。
此处所提到或引用的所有专利、专利申请、临时申请和公开全部整合作为参考,直至它们与本说明书的明确指导不一致的程度之内均可作为参考。
下面是阐明用于实践本发明的步骤的实施例。这些实施例不应该被解释成限制。所有的百分数是以重量计,并且所有溶剂混合物比例是以体积计,除非另外注释。
实施例1-从致病杆菌Xwi株得到TC蛋白质和基因
以前证明(美国专利号6,048,838)嗜线虫致病杆菌菌株Xwi(NRRLB-21733,于1997年4月29日保藏)能够产生对鞘翅目(Coleoptera)、鳞翅目(Lepidoptera)、双翅目(Diptera)和蜱螨目(Acarina)的多个目的昆虫具有口服杀虫活性的胞外蛋白质。下面公开了来源于菌株Xwi的全长基因和TC蛋白质序列。用于得到它们的方法在同时申请的题目为“用于病虫害防治的致病杆菌属TC蛋白质和基因”的Bintrim的美国临时申请(系列号60/441,717)中充分讨论。这些序列(包括N-端和内部肽序列(SEQ ID NO:1-5))在序列简述部分也进行了概述。
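In summary, a 39,005 bp fragment of genomic DNA was obtained from strain Xwi and cloned as cosmid pDAB2097. The sequence of the cosmid insert (SEQ ID NO:6) was analyzed using Vector NTI™ Suite (Informax, Inc., North Bethesda, MD, USA) to identify the encoded ORFs (open reading frames). Six full-length ORFs and one partial ORF were identified (Figure 1 and Table 4).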
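Table 4. ORFs identified in the pDAB2097 cosmid insert | ||||
ORF designation | ORF position in SEQ ID NO.13 | SEQ ID NO. (nucleotide) | No. of deduced amino acids | SEQ ID NO. (amino acid) |
ORF1 | 1-1,533 | 7 | 511 | 8 |
ORF2 | 1,543-5,715 | 9 | 1,391 | 10 |
ORF3 | 5,764-7,707 | 11 | 648 | 12 |
ORF4 | 10,709-18,277 | 13 | 2,523 | 14 |
ORF5 | 18,383-21,430 (C*) | 15 | 1,016 | 16 |
ORF6 | 21,487-25,965 (C) | 17 | 1,493 | 18 |
ORF7 | 26,021-33,634 (C) | 19 | 2,538 | 20 |
*(C) designates the complementary strand of SEQ ID NO:6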
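The coordinates and deduced amino acid counts in Table 4 can be cross-checked with simple arithmetic: an ORF spanning positions start through end covers (end - start + 1) nucleotides, and dividing by three gives the number of codons. The sketch below uses the values as read from Table 4; the dictionary layout and function name are illustrative.

```python
# (start, end, deduced amino acids) for each ORF, as read from Table 4.
TABLE4 = {
    "ORF1": (1, 1533, 511),
    "ORF2": (1543, 5715, 1391),
    "ORF3": (5764, 7707, 648),
    "ORF4": (10709, 18277, 2523),
    "ORF5": (18383, 21430, 1016),
    "ORF6": (21487, 25965, 1493),
    "ORF7": (26021, 33634, 2538),
}

def codons_spanned(start, end):
    """Number of complete codons in an inclusive nucleotide interval."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("interval length is not a multiple of three")
    return length // 3

for name, (start, end, residues) in TABLE4.items():
    status = "consistent" if codons_spanned(start, end) == residues else "MISMATCH"
    print(name, end - start + 1, "nt", codons_spanned(start, end), "codons", status)
```

Strand orientation (the "(C)" entries) does not affect the length calculation.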
将鉴定出的ORF的核苷酸序列和由这些ORF编码的推导氨基酸序列用于查询位于美国国家生物技术信息中心(National Center forBiotechnology Information)的数据库,查询时通过使用BLASTn,BLASTp和BLASTx通过用于进行BLAST的ncbi/nih的官方(“.gov”)网站进行。这些分析表明在pDAB2097插入片段中鉴定出的ORF同先前在发光光杆状菌和嗜线虫致病杆菌中鉴定出来的基因具有明显氨基酸序列同一性(表5)。值得注意的是GenBank登录号AJ308438中的xpt基因序列是从表达口服杀昆虫活性的重组粘粒中得到的。
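Table 5. Similarity of the deduced proteins encoded by the pDAB2097 ORFs to known genes | ||
pDAB2097 ORF* (deduced amino acids) | Gene/ORF designation (GenBank accession) | % amino acid sequence identity to database match |
ORF1 (1-511) | tccA (AF047028) | 21.4% |
ORF2 (313-1,391) | xptD1 (AJ308438) | 96.6% |
ORF3 (1-648) | Chi (AF308438) | 100% |
ORF4 (1-2,523) | xptA1 (AJ308438) | 99.5% |
ORF5 (1-1,016) | xptB1 (AJ308438) | 95.9% |
ORF6 (1-1,402) | xptC1 (AJ308438) | 96.4% |
ORF7 (1-2,538) | xptA2 (AJ308438) | 95.1% |
*Deduced amino acid positions with identity to the database sequence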
由于ORF2,ORF4,ORF5,ORF6和ORF7表现出与先前鉴定的基因具有至少95%的氨基酸序列同一性,因此在对从pDAB2097插入序列中鉴定的ORF进行进一步研究时采用相同的基因名称(表6)。
如本申请中所使用,例如XptA2表示蛋白质,例如xptA2表示基因。此外,用于分离基因和蛋白质的来源分离株用下角标表示。这种表示的例子见表6。
Table 6. Designations of the ORFs identified in the pDAB2097 insert | |
pDAB2097 ORF | Gene designation |
ORF2 | xptD1Xwi |
ORF4 | xptA1Xwi |
ORF5 | xptB1Xwi |
ORF6 | xptC1Xwi |
ORF7 | xptA2Xwi |
实施例2-来源于光杆状菌属和致病杆菌属的毒素复合体基因的异源表达
开展了光杆状菌属和致病杆菌属基因在大肠杆菌中表达的一系列实验。表明将tcdA或者xptA2基因与tcdB1,tccC1,xptB1和xptC1基因特异的联合进行共表达导致在抗敏感昆虫的生物测定法中具有显著活性。此处还证明光杆状菌属基因tcdA和tcdB与致病杆菌属基因xptB1的表达导致抗南方玉米根叶甲(Diabrotica undecimpunctata howardii)的显著活性。同样,致病杆菌属xptA2与光杆状菌属tcdB1和tccC1的表达产生抗美洲棉铃虫(Helicoverpa zea)的活性。
采用了两个大肠杆菌系统来测试光杆状菌属和致病杆菌属基因。第一个依赖于存在于表达载体pBT-TcdA中的大肠杆菌启动子(图2)。构建有多达三个基因的多顺反子排列的几个质粒被构建了。每一个基因含有一个单独的核糖体结合位点和起始密码子、编码序列和终止密码子。第二个系统是通过强的T7噬菌体启动子和T7RNA聚合酶介导的(图3,pET;图4,pCot)。相似地,在一些构建体中使用了编码序列的多顺反子排列。在其它实验中,相容的质粒用于共表达。叙述实验中使用的构建体的示意图示于图5和6。
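Construction of pBT-TcdA. The expression plasmid pBT-TcdA comprises the replication and antibiotic selection components of plasmid pBC KS+ (Stratagene) and the expression components of plasmid pTrc99a (Amersham Biosciences Corp., Piscataway, N.J.) (i.e., a strong E. coli promoter upstream of the multiple cloning site, the lac operon repressor, and the operator). The Nco I site in the chloramphenicol resistance gene of pBC KS+ was removed by in vitro mutagenesis. This modification did not change the amino acid sequence of the chloramphenicol acetyltransferase protein. As previously described (Example 27 of WO 98/08932, Insecticidal Protein Toxins from Photorhabdus), the TcdA coding sequence (GenBank accession No. AF188483; reproduced here as SEQ ID NO:21) was modified by using PCR to engineer the 5′ and 3′ ends. The modified coding sequence was then cloned into pTrc99a. Plasmid pBT-TcdA (Figures 2 and 5) was produced by ligating the blunt-ended Sph I/Pvu I fragment of pTrc-TcdA to the blunt-ended Asn I/Pvu I fragment of pBC KS+.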
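Construction of pBT-TcdA-TcdB. The TcdB1 coding sequence (GenBank accession No. AF346500; reproduced here as SEQ ID NO:22) was amplified from plasmid pBC-AS4 (R. ffrench-Constant, University of Wisconsin) using the following forward primer:
5′ ATATAGTCGACGAATTTTAATCTACTAGTAAAAAGGAGATAACCATGCAGAATTCACAAACATTCAGTGTTACC 3′ (SEQ ID NO:23)
This primer did not change the protein coding sequence and added Sal I and Spe I sites in the 5′ noncoding region. The reverse primer used was:
5′ ATAATACGATCGTTTCTCGAGTCATTACACCAGCGCATCAGCGGCCGTATCATTCTC 3′ (SEQ ID NO:24)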
再一次没有改变蛋白质编码序列但是在3′非编码区添加了一个Xho I位点。将扩增产物克隆到pCR2.1(Invitrogen)中并且测定了DNA序列。在预期的序列中出现了2个改变,在正向引物Spe I位点中一个A的删除(除去该Spe I位点)并且在对应于氨基酸位置1041处出现由A到T的替代,这导致了由Asp到Glu的保守替代。对这两个变化没有校正。用XhoI和Pvu I消化质粒pBT-TcdA(在TcdA编码序列的3′末端进行酶切)。用Sal I和Pvu I酶切质粒pCR2.1-TcdB1。将片段连接起来并且分离了pBT-TcdA-TcdB1重组体(图5)。Xho I和Sal I的末端是相容的,但在连接后该两个位点均消除了。质粒编码多顺反子TcdA-TcdB1 RNA。每个编码区带有单独的终止密码子和起始密码子,并且每一个的前面都是单独的核糖体结合位点。
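The cloning logic above can be checked in silico. The sketch below searches the two TcdB1 primers (SEQ ID NO:23 and NO:24) for the recognition sequences of Sal I (GTCGAC), Spe I (ACTAGT), and Xho I (CTCGAG), and shows why an Xho I end joined to a Sal I end recreates neither site: both enzymes cut after the first base and leave the same TCGA overhang. The helper names are illustrative assumptions.

```python
# Standard recognition sequences for the enzymes named in the text.
SITES = {"Sal I": "GTCGAC", "Spe I": "ACTAGT", "Xho I": "CTCGAG"}

# Primers as given in the text (SEQ ID NO:23 and SEQ ID NO:24).
FORWARD = ("ATATAGTCGACGAATTTTAATCTACTAGTAAAAAGGAGATAACC"
           "ATGCAGAATTCACAAACATTCAGTGTTACC")
REVERSE = "ATAATACGATCGTTTCTCGAGTCATTACACCAGCGCATCAGCGGCCGTATCATTCTC"

def sites_in(seq):
    """Names of the listed enzymes whose recognition sequence occurs in seq."""
    return sorted(name for name, site in SITES.items() if site in seq)

def ligation_junction(left_site, right_site, cut_offset=1):
    """Sequence formed when the left half of one cut site joins the right half of another."""
    return left_site[:cut_offset] + right_site[cut_offset:]

print(sites_in(FORWARD))   # sites added in the 5' noncoding region
print(sites_in(REVERSE))   # site added in the 3' noncoding region
hybrid = ligation_junction(SITES["Xho I"], SITES["Sal I"])
print(hybrid, hybrid in SITES.values())
```

Only the intact primer sequences are checked here; the single-base deletion reported in the cloned Spe I site would need the sequenced product, which is not reproduced in this section.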
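Construction of pDAB3059: The coding sequence of the TccC1 protein (GenBank accession No. AAC38630.1; reproduced here as SEQ ID NO:25) was amplified from a pBC KS+ vector containing the three-gene Tcc operon (pTccC chl; from R. ffrench-Constant, the University of Wisconsin). The forward primer was:
5′ GTCGACGCACTACTAGTAAAAAGGAGATAACCCCATGAGCCCGTCTGAGACTACTCTTTATACTCAAACCCCAACAG 3′ (SEQ ID NO:26)
This primer did not change the coding sequence of the tccC1 gene, but provided 5′ noncoding Sal I and Spe I sites as well as a ribosome binding site and an ATG start codon. The reverse primer was:
5′ CGGCCGCAGTCCTCGAGTCAGATTAATTACAAAGAAAAAACTCGTCGTGCGGCTCCC 3′ (SEQ ID NO:27)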
该引物也没有改变tccC1编码序列,但是提供了3′NotI和Xho I克隆位点。在用EPICENTRE FailSafe PCR试剂盒(EPICENTRE;Madison,WI)的成分扩增之后,将经过工程改造的TccC1编码序列克隆到pCR2.1-TOPO(invitrogen)中。通过5′Sal I和3′Not I酶切位点,将编码序列从pCR2.1上酶切下来并且转移到经改进的pET载体中(Novagen;Madison WI)。pET载体包含赋予壮观霉素/链霉素抗性的基因并且具有经修饰的多克隆位点。使用pTccC chl质粒DNA作为模板校正DNA测序发现的PCR-导入的突变,并且包含正确编码区的质粒命名为pDAB3059。双链DNA测序证实突变已被校正。
pBT-TcdA-TccC1的构建 用XhoI酶切质粒pBT-TcdA DNA,并且将其与用Sal I和Xho I酶切的pDAB3059DNA连接。随后将tccC1基因连接到tcdA基因的下游产生pBT-TcdA-TccC1(图5)。
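Construction of pBT-TcdA-TcdB1-TccC1. Plasmid pBT-TcdA-TcdB1 DNA was cut with Xho I and ligated to pDAB3059 DNA cut with Sal I and Xho I. Recombinants in which the tccC1 gene was inserted after the tcdB gene were selected, yielding plasmid pBT-TcdA-TcdB1-TccC1 (Figure 5).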
pBT-TcdA-TcdB1-XptB1的构建 用Xho I酶切质粒pBT-TcdA-TcdB1 DNA,并将其与用SalI和Xho I酶切的pET280-XptB1DNA进行鸟枪法连接。如果重组体经鉴定为XptB1编码区插入到了pBT-TcdA-TcdB1的Xho I酶切位点,则产生质粒pBT-TcdA-TcdB1-XptB1(图5)。
pET28-TcdA的构建 对于该质粒的叙述可见于别处,如WO98/08932实施例27,从光杆状菌属来源的杀昆虫蛋白质毒素。
pCot-TcdB1的构建 将用Xho I和Sal I酶切的质粒pCR2.1-TcdB1连接到T7表达质粒pCot-3的Sal I酶切位点(图4)。质粒pCot-3具有一个pACYC的复制起点,这使得其能够与具有ColE1起源的质粒(pBR322衍生物)相容。此外,它携带有氯霉素抗生素抗性标记基因和用于表达插入到多克隆位点的编码区的T7 RNA聚合酶特异的启动子。
pCot-TccC1-TcdB1的构建 用Spe I和Not I将TccC1编码区从pDAB3059 DNA上酶切下来,并且连接到质粒pET280-K(多克隆位点被替换的经修饰pET28)的多克隆位点。这导致在Spe I位点上游得到Swa I位点而在Not I位点下游得到一个Xho I位点。用Swa I和Xho I酶切质粒pET280-K-TccC1DNA以释放TccC1编码区,然后将其连接到质粒pCot-3-TcdB1的Swa I and SalI位点,从而产生质粒pCot-3-TccC1-TcdB(图6)。
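Construction of pET280-XptA2, pET280-XptC1, and pET280-XptB1. The coding sequences for the XptA2, XptC1, and XptB1 proteins were each PCR amplified from pDAB2097, a recombinant cosmid containing the three genes that encode these proteins. The PCR primer sets used to amplify these coding sequences are listed in Table 7. In all of these primer sets, the forward primer did not change the coding sequence of the gene but provided 5′ noncoding Sal I and Xba I sites as well as a ribosome binding site. The reverse primers also did not alter the corresponding coding sequences, but provided a 3′ Xho I cloning site. Following amplification with components of the EPICENTRE FailSafe PCR kit, the engineered XptA2, XptC1, and XptB1 coding sequences were each cloned into pCR2.1. The cloned amplified products were sequenced to confirm that PCR-induced mutations had not altered the coding sequences. Recombinant plasmids that contained unaltered coding sequences for XptA2, XptC1, and XptB1 were identified and designated pDAB3056, pDAB3064, and pDAB3055, respectively. The coding sequences were each cut from the pCR2.1 derivatives via the 5′ Xba I and 3′ Xho I sites and transferred to a modified pET vector (pET280-SS, a pET28 derivative whose multiple cloning site had been replaced and into whose backbone a streptomycin/spectinomycin resistance gene had been inserted to provide a selectable marker [Figure 3]) to generate plasmids pET280-XptA2, pET280-XptC1, and pET280-XptB1.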
表7.用于扩增XptA2,XptC1和XptB1编码序列的PCR引物 | ||
扩增的编码序列 | 正向引物序列(5′-3′) | 反向引物序列(5′-3′) |
XptA2 | GTCTAGACGTGCGTCGACAAGAAGGAGATATACCATGTATAGCACGGCTGTATTACTCAATAAAATCAGTCCCACTCGCGACGG*(SEQ IDNO:28) | GCTCGAG-ATTAATTAAGAACGAATGGTATAGCGGATATGCAGAATGATATCGCTCAGGCTCTCC(SEQ ID NO:29) |
XptC1 | GTCTAGACGTGCGTCGACAAGAAGGAGATATACCATGCAGGGTTCAACACCTTTGAAACTTGAAATACCGTCATTGCCCTC(SEQ ID NO:30) | GACTCGAGAGCATTAATTATGCTGTCATTTCACCGGCAGTGTCATTTTCATCTTCATTCACCAC (SEQ ID NO:31) |
XptB1 | GTCTAGACGTGCGTCGACAAGAAGGAGATATACCATGAAGAATTTCGTTCACAGCAATACGCCATCCGTCACCGTACTGGACAACC (SEQ IDNO:32) | GCTCGAGCAGATTAATTATGCTTCGGATTCATTATGACGTGCAGAGGCGTTAAAGAAGAAGTTATT(SEQ ID NO:33) |
*引物中的下划线序列对应着蛋白质编码序列
pET280-XptA2-XptC1的构建:用Xho I酶切质粒pET280-XptA2 DNA并将其连接到pDAB3064的单一Sal I位点。所得到的连接产物含有pCR2.1和pET280-SS两个载体的主链,并且通过联合使用链霉素(25μg/mL),壮观霉素(25μg/mL)和氨苄青霉素(100μg/mL)的抗生素选择法将其回收。用Xho I消化所回收质粒的DNA以检查片段的方向。得到了XptC1编码区刚好处于XptA2编码区下游的质粒,并且用Xho I消化DNA以去除pCR2.1载体主链。含有pET280-SS载体主链和XptA2和XptC1编码序列的所得构建体自我连接产生pET280-XptA2-XptC1。
pET280-XptC1-XptB1的构建:用Xho I酶切质粒pET280-XptC1DNA并将其连接入pDAB3055的SalI单一酶切位点。所得到的连接产物含有pCR2.1和pET280-SS两个载体的主链,并且通过联合使用链霉素(25μg/mL),壮观霉素(25μg/mL)和氨苄青霉素(100μg/mL)的抗生素选择法将其回收。用Xho I消化所回收质粒的DNA以检查片段的方向。得到了XptB1编码区刚好处于XptC1编码区下游的质粒,并且用Xho I消化DNA以去除pCR2.1载体主链。含有pET280-SS载体主链和XptC1和XptB1编码序列的所得构建体自我连接产生pET280-XptC1-XptB1。
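Construction of pET280-XptA2-XptB1: Plasmid pET280-XptA2 DNA was cut with Xho I and ligated into the unique Sal I site of pDAB3055. The resulting ligation product contained both the pCR2.1 and pET280-SS vector backbones and was recovered by antibiotic selection using a combination of streptomycin (25 μg/mL), spectinomycin (25 μg/mL), and ampicillin (100 μg/mL). DNA of the recovered plasmids was digested with Xho I to check fragment orientation. A plasmid with the XptB1 coding region immediately downstream of the XptA2 coding region was obtained, and its DNA was digested with Xho I to remove the pCR2.1 vector backbone. The resulting construct, which contained the pET280-SS vector backbone and the XptA2 and XptB1 coding sequences, was self-ligated to produce pET280-XptA2-XptB1.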
pET280-XptA2-XptC1-XptB1的构建:用Xho I酶切质粒pET280-XptA2-XptC1 DNA并将其连接入pDAB3055的Sal I单一酶切位点。所得到的连接产物含有pCR2.1和pET280-SS两个载体的主链,并且通过联合使用链霉素(25μg/mL),壮观霉素(25μg/mL)和氨苄青霉素(100μg/mL)的抗生素选择法将其回收。用Xho I消化所回收质粒的DNA以检查片段的方向。得到了XptB1编码区刚好处于XptC1编码区下游的质粒,并且用Xho I消化DNA以去除pCR2.1载体主链。含有pET280-SS载体主链和XptA2,XptC1和XptB1编码序列的所得构建体自我连接产生pET280-XptA2-XptC1-XptB1。
基于pBT的构建体的表达:将pBT表达质粒转化大肠杆菌菌株BL21细胞并且涂板于含有50μg/mL氯霉素和50mM葡萄糖的LB琼脂上,并且转化体在37℃下生长过夜。将大约10-100个分离较好的克隆接种到500mL带挡板摇瓶的200mL含有50μg/mL氯霉素和75μM异丙基硫代-β-D-半乳糖吡喃糖苷(IPTG)的无菌LB中。在28℃下以200转每分振荡培养24小时。通过离心(大约3000×g)收集细胞并将细胞以30-120 OD600单位/mL的细胞密度悬浮于磷酸盐缓冲液中(30mM,pH7.4;NutraMax;Gloucester,MA)。然后将稀释的细胞用于昆虫的生物测定。
备选地,将生长24小时的细胞置于冰上冷却,并且用磷酸盐缓冲液调整为20-30 OD600单位/mL。使用探头超声破碎仪(Soniprep 150,MSE),并在1/3体积的0.1mm的玻璃珠(Biospec;Bartletsville,OK)中以20微米的振幅进行2×45秒脉冲。在Eppendorf微离心管中以14,000转每分离心10分钟使裂解物变得澄清。将澄清的裂解物用UltraFree 100kDaunits(Millipore;Bedford,MA)进行浓缩、收集并用磷酸盐缓冲液调节至10mg/mL,并用于昆虫生物测定法。
基于T7的构建体的表达 除了将其转化进入T7表达菌株BL21(DE3)(Novagen,Madison,WI)和联合使用链霉素(25μg/mL)和壮观霉素(25μg/mL)作为抗生素选择以外,将基于T7的表达质粒进行上述的同pBT表达质粒相同的操作。
实施例3-异源表达的毒素复合体基因的杀昆虫生物测定结果
如上所述,使用pBT表达系统开展了一系列的表达实验。将大肠杆菌进行转化、诱导和在28℃下过夜生长。将细胞进行收集、洗涤和标准化成相同的浓度并且将其作为南方玉米根叶甲的食物进行应用并进行生物测定。如表8所示,只有当所有三个光杆状菌属基因tcdA,tcdB1和tccC1在同一细胞中表达时,才观察到明显的死亡。基因的其它组合不能够导致明显的死亡。例如,表8表明tcdB1和tccC1基因的特定组合未表现出杀昆虫的活性。当在pBT-TcdA-TcdB1-TccC1质粒上的基因表达时,通常观察到死亡,并且在这些实验中(表8)在应用于昆虫食物之前将细胞在4℃储存24小时没有明显降低或提高死亡率。如果在共表达质粒pBT-TcdA-TcdB1-TccC1上的基因之后将细胞进行裂解也观察到南方玉米根叶甲的死亡。如果将裂解液在-70℃冻存一周,活性没有降低(表9)。
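Table 8. Bioassay of pBT-expressed Photorhabdus toxin complex genes against southern corn rootworm | ||
Experiment A. | ||
Plasmid | Day 1, 68 units/ml | Day 2, 68 units/ml |
pBT | 0 | 0 |
pBT-TcdA | 0 | 0 |
pBT-TcdA-TcdB1 | 0 | 0 |
pBT-TcdA-TccC1 | 0 | 0 |
pBT-TcdA-TcdB1-TccC1 | ++++ | ++++ |
Experiment B. | ||
Plasmid | Day 1, 85 units/ml | Day 2, 85 units/ml |
pBT | 0 | 0 |
pBT-TcdA | 0 | 0 |
pBT-TcdA-TcdB1 | 0 | + |
pBT-TcdA-TccC1 | 0 | 0 |
pBT-TcdA-TcdB1-TccC1 | ++++ | ++++ |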
将完整大肠杆菌细胞用磷酸盐缓冲液洗涤、浓缩并调整至相同的细胞浓度,并且应用到昆虫的食物中。第1天的样品是立即测定的。第2天的样品是相同的细胞制品,但在应用于昆虫食物之前在4℃过夜储存了。
等级划分代表南方玉米根叶甲的死亡百分数%(0=0-10%;+=11-20%;++=21-40%;+++=41-60%;++++=61-80%;+++++=81-100%)。
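The grading scale used in the bioassay tables maps a percentage (mortality or growth inhibition) onto a symbol. A minimal sketch of that mapping follows. The scale is stated for whole percentages; treating each upper bound as inclusive for fractional values is an assumption made here.

```python
# Upper bound (inclusive) of each grade, from the scale given in the text.
GRADE_BINS = [
    (10, "0"),
    (20, "+"),
    (40, "++"),
    (60, "+++"),
    (80, "++++"),
    (100, "+++++"),
]

def grade(percent):
    """Convert a percentage (0-100) into the 0 / + ... +++++ score."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    for upper, symbol in GRADE_BINS:
        if percent <= upper:
            return symbol
    raise AssertionError("unreachable for valid input")

for value in (5, 20, 75, 100):
    print(value, grade(value))
```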
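Table 9. Bioassay of pBT-expressed Photorhabdus toxin complex genes against southern corn rootworm | |||
Plasmid | Cells, 74 units/ml | Lysate, 10 mg/ml | Frozen lysate, 10 mg/ml |
pBT | 0 | 0 | 0 |
pBT-TcdA-TcdB1-TccC1 | +++++ | +++++ | +++++ |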
将大肠杆菌细胞用磷酸盐缓冲液洗涤、浓缩并调整至相同的细胞浓度,并且应用到昆虫的食物制剂中。可选择地,裂解液用超声进行制备,并将新鲜制备的裂解液应用到食物中或者在-70℃冻存7天后再应用到食物中。
等级划分代表南方玉米根叶甲的死亡百分数%(0=0-10%;+=11-20%;++=21-40%;+++=41-60%;++++=61-80%;+++++=81-100%)。
在另一系列实验中,致病杆菌属xptB1基因代替了光杆状菌属tccC1基因,并且作为质粒pBT-TcdA-TcdB1-XptB1的多顺反子操纵子的一部分进行了表达。这些实验证明致病杆菌属xptB1基因能够代替光杆状菌属tccC1基因,在完整大肠杆菌细胞生物测定中导致了南方玉米根叶甲的死亡(表10)。
表10.pBT-表达的光杆状菌属和致病杆菌属毒素复合体基因对南方玉米根叶甲的生物测定 | |||
Plasmid | Trial 1, 110 units/ml | Trial 2, 55 units/ml | Trial 3, 111 units/ml |
pBTpBT-TcdA-TcdB1-TccC1pBT-TcdA-TcdB1-XptB1 | 0+++++ | 0+++++ | 0++++++ |
将完整大肠杆菌细胞用磷酸盐缓冲液洗涤、浓缩并调整至相同的细胞浓度,并且应用到昆虫的食物制剂中。
等级划分代表南方玉米根叶甲的死亡百分数%(0=0-10%;+=11-20%;++=21-40%;+++=41-60%;++++=61-80%;+++++=81-100%)。
多种光杆状菌属基因从分别的质粒上进行表达也导致南方玉米根叶甲的死亡。当将存在于pET表达质粒上的tcdA和存在于相容表达载体pCot-3上的tccC1和tcdB1基因与这些质粒的对照组合相比较时可以观察到明显的活性(表11)。如上面所提到,tcdB1和TccC1基因单独存在时不能导致明显的活性(表11)。
将完整大肠杆菌细胞用磷酸盐缓冲液洗涤、浓缩并调整至相同的细胞浓度,并且应用到昆虫的食物制剂中。
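Table 11. Bioassay of pCoT/pBT (T7 promoter)-expressed Photorhabdus toxin complex genes against southern corn rootworm | ||
Plasmid | Trial 1, 40 units/ml | Trial 2, 60 units/ml |
pCoT/pET | 0 | 0 |
pCoT/pET-TcdA | 0 | 0 |
pCoT-TccC1-TcdB1/pET | 0 | 0 |
pCoT-TccC1-TcdB1/pET-TcdA | +++ | +++ |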
等级划分代表南方玉米根叶甲的死亡百分数%(0=0-10%;+=11-20%;++=21-40%;+++=41-60%;++++=61-80%;+++++=81-100%)。
异源表达的致病杆菌属毒素复合体基因的生物测定结果 如上所述使用pET表达系统开展了一系列表达实验。将大肠杆菌细胞进行转化、诱导并在28℃下生长过夜。将细胞收集、洗涤、并标准化成相同的浓度,并测试对欧洲玉米螟(Ostrinia nubilalis)(ECB),棉铃虫(CEW)和烟夜蛾(TBW)的杀昆虫活性。如表12所示,当xptA2,xptC1和xptB1存在于同一个构建体中时可观察到最大水平的杀昆虫活性。
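Table 12. Bioassay of heterologously expressed Xenorhabdus toxin complex genes against TBW, CEW, and ECB | |||
Plasmid tested | TBW bioassay | CEW bioassay | ECB bioassay |
pET-280 | 0* | 0 | 0 |
pET-280-XptA2 | +++ | +++ | ++ |
pET-280-XptC1 | 0 | 0 | 0 |
pET-280-XptB1 | 0 | 0 | 0 |
pET-280-XptA2-XptC1 | + | + | 0 |
pET-280-XptA2-XptB1 | 0 | 0 | 0 |
pET-280-XptC1-XptB1 | 0 | 0 | 0 |
pET-280-XptA2-XptC1-XptB1 | +++++ | +++++ | +++++ |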
将完整大肠杆菌细胞用磷酸盐缓冲液洗涤、浓缩并调整至相同的细胞浓度,并且应用到昆虫的食物制剂中。
等级划分代表与对照相比的生长抑制百分数%(0=0-10%;+=11-20%;++=21-40%;+++=41-60%;++++=61-80%;+++++=81-100%)。
异源表达的xptA2,tcdB1和tccC1的生物测定结果 用表13列出的pET280和pCoT构建体共转化大肠杆菌细胞。如上所述对转化体进行诱导、处理和生物测定。在这些测定中,含有pCOT/pET280-XptA2-XptC1-XptB1或者pCoT-TcdB1-TccC1/pET280-XptA2质粒组合的共转化体表现出最高水平的杀昆虫活性。这些实验表明光杆状菌属tcdB1和tccC1基因即使相对于xptA2反式存在,也能够代替致病杆菌属xptC1和xptB1基因,这导致质量上相似水平的增强了的杀昆虫活性。
表13.异源表达的xptA2,tcdB1和TccC1对CEW的生物测定 | |
受测试质粒 | CEW生物测定 |
pET280/pCoTpET280/pCoT-TcdB1-TccC1pCoT/pET280-XptA2pCoT/pET280-XptA2-XptC1-XptB1pCoT-TcdB1-TccC1/pET280-XptA2 | 0*0+++++++++++++ |
将完整大肠杆菌细胞用磷酸盐缓冲液洗涤、浓缩并调整至相同的细胞浓度,并且应用到昆虫的食物制剂中。
等级划分代表相对于对照的生长抑制百分数%(0=0-10%;+=11-20%;++=21-40%;+++=41-60%;++++=61-80%;+++++=81-100%)。
Example 4 - Complementation of Xenorhabdus XptA2 Toxin with Paenibacillus Strain DAS1529 TC Proteins
该实施例提供了关于共同未决的美国临时申请系列号60/392,633的额外的数据,其在上面的背景部分进行了讨论。该数据与本申请相关,因为它为类芽孢杆菌DAS1529株TC蛋白质(此处作为单一操纵子表达)能够补偿例如源于嗜线虫致病杆菌Xwi的XptA2毒素(见SEQ ID NO:34)的能力提供了实验证据。开展了独立表达DAS1529TC操纵子和XptA2或者在同一大肠杆菌细胞中共表达XptA2基因和TC操纵子的两个独立实验。表达不同毒素/毒素组合的完整细胞用于测试抗两种鳞翅昆虫:美洲棉铃虫(Heliothis zea;CEW)和美洲烟夜蛾(Heliothis virescens;TBW)的活性。从此两个实验得到的数据表明DAS1529TC蛋白质能够增强致病杆菌属XptA2抗所测试昆虫的活性。
A. Co-expression of DAS1529 TC proteins and the Xenorhabdus TC XptA2 toxin
DAS 1529TC操纵子的表达受pET101.D表达载体上的T7启动子/lac操纵基因控制,该表达载体携带有ColEl复制起点和氨苄青霉素抗性选择标记(Invitrogen)。在美国系列号60/392,633的实施例8中可以找到关于类芽孢杆菌属TC操纵子的克隆和表达的详细叙述。将XptA2基因克隆进入pCot-3表达载体,该载体携带有氯霉素抗性选择标记和与ColEl相容的复制起点。pCot-3载体表达系统也受T7启动子/lac操纵基因的调节。因此,相容的复制起点和不同的选择标记形成了TC操纵子和XptA2在同一大肠杆菌细胞中共表达的基础。将携带有TC操纵子和XptA2的质粒DNA单独地或者组合在一起进行转化大肠杆菌BL21starTM(DE3)。在含有针对pET101.D-TC操纵子的50μg/ml羧苄青霉素、针对pCot-3-XptA2的50μg/ml氯霉素和针对pET101.D-TC操纵子/pCot-3-XptA2的两种抗生素的LB琼脂平板上筛选转化体。为了抑制毒素的基础表达,在琼脂LB培养板和液体LB培养基中含有终浓度为50mM的葡萄糖。
对于毒素的生产,将生长在LB琼脂平板上的过夜培养物接种到5mL和50mL含有抗生素和50mM葡萄糖的LB培养基中。将培养物在30℃、300转每分的摇动器中进行生长。一旦培养物浓度于600nm达到O.D.约为0.4时,将终浓度为75μM的IPTG加入到培养基中以诱导基因表达。24小时之后,收集大肠杆菌细胞通过NuPAGE系统(Invitrogen)进行蛋白质凝胶分析。将0.5mL1×培养液中的细胞沉淀重悬于100μL1×NuPAGELDS样本缓冲液中。在短暂的超声裂解和5分钟的煮沸后,将5μL的样品加到4-12%NuPAGEbis-tris梯度凝胶上用于进行总蛋白质分布分析。当XptA2单独表达时或者在TC操纵子存在下表达时,XptA2的表达达到了可检测水平。基于通过Personal Densitometer SI(Molecular Dynamics)的凝胶扫描分析,当与TC操纵子共表达时,XptA2的表达几乎是自身表达的8倍。对于5mL的诱导实验,XptA2的表达几乎相等。
B.
杀昆虫活性的生物测定
如在美国系列号60/392,633的实施例8所述,当DAS1529TC ORF单独表达或者作为一个操纵子表达时,没有抗TBW和CEW的活性。下面的生物测定实验集中在确定类芽孢杆菌属(DAS1529)TC蛋白质(ORF3-6;TcaA-,TcaB-,TcaC-和TccC样蛋白质;见SEQ ID NO:35-43)是否能够增强致病杆菌属TC蛋白质(以XptA2为例)活性。在5mL诱导实验中制备了完整大肠杆菌细胞的4×细胞浓缩液作为生物测定样品,XptA2和XptA2/TC操纵子的细胞含有的XptA2非常低但是几乎是等量的。表14中的数据显示在4×细胞浓度下,类芽孢杆菌属TC蛋白质(在表14中为“TCs”)+XptA2的组合对于抗CEW是有效的。这表明类芽孢杆菌属(DAS1529)TCs具有对致病杆菌属XptA2的补偿效应。
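Table 14. Bioassay of DAS1529 TC enhancement of Xenorhabdus toxin against corn earworm | |
Insect: | CEW |
Negative control | - |
TCs (DAS1529) | - |
XptA2 | - |
TCs + XptA2 | ++ |
*-, ++, +++ = no activity, moderate activity and high activity, respectively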
在第二个生物测定实验中,根据光密度计凝胶扫描分析将XptA2细胞和XptA2+TC操纵子细胞内的XptA2的量进行了标准化。如表15所示,XptA2本身在40×的条件下具有对TBW(H.virescens)的中度活性,但是在20×和低于20×的条件下活性降低到不可检测的水平。但是,当与类芽孢杆菌属TC蛋白质共表达时,在10×和5×XptA2存在下高水平活性十分明显,并且在1.25×XptA2时仍明显具有低的活性。这些观察表明这些DAS1529TC蛋白质对于XptA2抗烟夜蛾具有显著的增效作用。在所测试的最高剂量,阴性对照或者TC操纵子本身均没有抗该昆虫的任何活性。
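Table 15. Bioassay of DAS1529 TC enhancement of XptA2 against tobacco budworm | ||||||
Normalized XptA2 | 40× | 20× | 10× | 5× | 2.5× | 1.25× |
XptA2 | + | - | - | - | n.d. | n.d. |
TCs + XptA2 | n.d. | n.d. | ++ | ++ | + | - |
*n.d. = not determined; -, +, ++, +++ = no activity, low, moderate and high activity, respectively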
实施例5-伯氏致病杆菌B和C蛋白质的混合互补作用
实施例5A.综述
使用粘粒补偿筛选开展了对编码能够加强或者协同杀昆虫活性蛋白质光杆状菌属TcdA和致病杆菌属XptA2wi活性的因子的基因的鉴定和分离。来源于伯氏致病杆菌(菌株ILM104)粘粒基因组文库的单个大肠杆菌克隆用于制备粗细胞提取物,将这些提取物与纯化的毒素混合并进行生物测定。将裂解液与纯化的光杆状菌属毒素TcdA一起分析其抗南方玉米根叶甲幼虫的活性。同时,将裂解液与纯化的致病杆菌属XptA2wi混合并测定其抗烟夜蛾和美洲棉铃虫幼虫的活性。如果裂解液加上纯化毒素的组合具有比其单独每种成分要高的活性,则粘粒裂解液记为阳性。
对于初级筛选的样品(以96-孔的形式)的杀昆虫活性进行两次重复测试,并将所得的数值与对照进行比较。将阳性的样品进行再生长,并在二级筛选中进行测试。在初级和次级筛选中被鉴定为阳性的粘粒进行第三次筛选。较大的培养体积(见下)用于第四次筛选,在128-孔形式的生物测定中测试生物学活性。
将在本筛选中鉴定为具有增强活性的粘粒之一的DNA进行亚克隆。将仍具有活性的单个亚克隆的DNA序列进行测序并且表明其含有两个开放阅读框架,命名为xptB1xb和xptC1xb。将这些编码区亚克隆到pET质粒中并在大肠杆菌中进行了表达。当将TcdA或者XptA2wi蛋白质与共表达XptB1xb和XptC1xb的裂解液混合后可见杀昆虫活性急剧增加。当与纯化的TcdA或者XptA2wi混合后,仅含有XptB1xb或者XptC1xb的裂解液具有最小的作用。
实施例5B.杀昆虫生物测定方法
杀昆虫生物测定是通过在96孔微孔板(Becton Dickinson andCompany,Franklin Lakes,NJ)或者专门设计用于杀昆虫生物测定的128孔碟子(C-D International,Pitman,NJ)中使用人工饲料进行的。将2种鳞翅目物种的卵用于在96孔微孔板中的生物测定:美洲棉铃虫(Helicoverpa zea(Boddie))和美洲烟夜蛾(Heliothis virescens(F.))。新生的幼虫用于在128孔碟子中的生物测定。以这种形式测试的鳞翅目物种包括美洲棉铃虫、美洲烟夜蛾和甜菜夜蛾(Spodoptera exigua(Hubner))。一个鞘翅目物种,南方玉米根叶甲(Diabroticaundecimpunctata howardii(Barber)),也用于该种生物测定形式的测试。
在这些生物测定中记录的数据包括处理的昆虫总数、死亡昆虫数目、生长受妨碍的昆虫数目和存活昆虫的重量。在报导生长抑制的情况下,其计算如下:
%生长抑制=[1-(处理昆虫的平均重量/只有载体的对照中昆虫的平均重量)]*100
实施例5C.其它实验操作步骤
这在由Apel-Birkhold等同时提出的申请中有更加详细的叙述,该申请题目为“来源于伯氏致病杆菌的毒素复合体的蛋白质和基因”代理号(attorney docket no.)为DAS-114P(系列号———)。
实施例5D.
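Plasmid pDAB6026 was shown to encode activity that synergizes the insecticidal toxicity of TcdA and XptA2wi. E. coli cells containing plasmid pDAB6026 or the pBCKS+ vector control were inoculated into 200 mL of LB containing chloramphenicol (50 μg/mL) and 75 μM IPTG (isopropyl-β-D-thiogalactopyranoside) and grown for 2 days at 28°C with shaking at 180 rpm. The cells were then centrifuged at 3500×g for 10 minutes. The pellets were resuspended in 5 mL of Butterfield's phosphate solution (Fisher Scientific) and transferred to sonication tubes containing 1.5 mL of 0.1 mm diameter glass beads (Biospec, Bartlesville, OK, catalog number 1107901). The cell-glass bead mixture was chilled on ice, and the cells were then lysed by sonication with two 45-second bursts using a 2 mm probe on a Branson Sonifier 250 (Danbury CT) at an output of about 20, chilling completely between bursts. The supernatants were transferred to 2 mL microcentrifuge tubes and centrifuged for 5 minutes at 16,000×g. The supernatants were then transferred to 15 mL tubes and the protein concentration was measured. Bio-Rad Protein Dye Assay Reagent was diluted 1:5 with H2O, and 1 mL was added to 10 μL of a 1:10 dilution of each sample and to bovine serum albumin (BSA) at concentrations of 5, 10, 15, 20 and 25 μg/mL. The optical density of the samples was then read at 595 nm on a Shimadzu UV160U spectrophotometer (Kyoto, JP). The amount of protein in each sample was calculated against the BSA standard curve and adjusted to 3-6 mg/mL with phosphate buffer. Before testing in the insect feeding bioassay, 600 ng of XptA2wi toxin protein was added to 500 μL of E. coli lysate. The combination of pDAB6026 and XptA2wi showed potent activity (Table 16).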
Table 16. Response of two lepidopteran species to pDAB6026 lysate alone and in combination with purified XptA2wi protein | ||||||||
Treatment | Tobacco budworm | | | | Corn earworm | | | |
 | Dead | Stunted | Total | Weight | Dead | Stunted | Total | Weight |
1 pBC | 0 | 0 | 8 | 674 | 0 | 0 | 8 | 352 |
2 pBC+XptA2wi | 0 | 0 | 8 | 538 | 0 | 0 | 8 | 423 |
3 pDAB6026 | 0 | 0 | 8 | 539 | 0 | 0 | 8 | 519 |
4 pDAB6026+XptA2wi | 0 | 8 | 8 | 18 | 8 | n/a | 8 | n/a |
实施例5E.xptB1xb和xptC1xb基因的发现、工程改造和测试
将质粒pDAB6026的DNA送交SeqWright DNA Sequencing(Houston,TX)进行DNA序列的测定。发现了两个具有相当大小的完整开放阅读框(ORF)。第一个(作为SEQ ID NO:48公开)与属于“B”类的已知毒素复合体基因具有明显的相似性。因此将该ORF称作xptB1xb并且其编码如SEQ ID NO:49中公开的蛋白质。第二个ORF(SEQ ID NO:50)编码与毒素复合体“C”蛋白质同源的蛋白质(SEQ ID NO:51)并且因此而称作xptC1xb。
通过在编码区的5′和3′末端加入限制性酶切位点以及加入核糖体结合位点和最佳的翻译终止信号对xptB1xb和xptC1xb基因进行工程改造(使用聚合酶链式反应;PCR)以便于高水平的重组表达。此外,将沉默突变(氨基酸序列没有变化)引入到编码区的5′末端以便减少潜在的mRNA二级结构并因此而增加翻译。策略是在基因的5′和3′末端扩增/工程改造片段,使用“剪接重叠延伸(Splice Overlap Extensions)”反应将远端片段进行连接,然后通过限制性位点加入开放阅读框的非扩增中心部分。该方法使在DNA序列中PCR-诱导的变化的可能性最小化。将工程改造的编码区作为分开的编码区(SEQ ID NO:52和SEQ ID NO:53)或者作为二顺反子操纵子(SEQ ID NO:54)克隆到pET表达质粒(Novagen,Madison,WI)中。表达质粒的名称示于表17。
表17.含有克隆到pET载体中的多种编码区的表达质粒 | |
质粒名称 | 经过工程改造用于表达的编码区 |
pDAB6031pDAB6032pDAB6033 | 如SEQ ID NO:52的xptB1xb如SEQ ID NO:53的xptC1xb如SEQ ID NO:54的xptB1xb+xptC1xb |
用pET(对照)载体或者载体pDAB6031、pDAB6032或者pDAB6033的DNA转化新鲜制备的大肠杆菌T7表达菌株BL21 StarTM(DE3)(Stratagene,La Jolla,CA)感受态细胞,并将其接种于250mL含有50μg/mL氯霉素和75μM IPTG的LB中。于28℃下180转每分振荡生长24小时后,将细胞于5500×g离心10分钟。将沉淀重悬于5mL磷酸盐溶液并且转移至含有1.5mL直径0.1mm玻璃珠的50ml锥形管中,然后用上述的作为“恒量”和设定为30的两次45秒的脉冲进行超声破碎。将样品于3000×g离心15分钟,将上清转移至2mL微量离心管中,14,000转每分离心5分钟,然后将上清液转移至15mL的管中。如上所述对蛋白质浓度进行测定并且用磷酸盐缓冲液将裂解液调整至5mg/mL。将每个裂解液的一组3个样品接受杀昆虫生物测定。对于第一个样品,加入磷酸盐缓冲液以代替纯化的毒素;对于第二个样品,加入足够量的TcdA蛋白质以便在杀昆虫生物测定孔中提供50ng/cm2的剂量;并且对于第三个样品,加入足够量的XptA2wi蛋白质以便在杀昆虫生物测定孔中提供250ng/cm2的剂量。
生物测定的结果示于表18。没有添加低水平额外TcdA或者XptA2wi蛋白质的对照样品(如来源于载体、pDAB6031、pDAB6032和pDAB6033的样品)对昆虫几乎没有影响。同样,含有低水平TcdA或者XptA2wi和pDAB6031或者pDAB6032裂解液的样品具有最小的效应。相反,在包括低水平TcdA或者XptA2wi和pDAB6033裂解液的样品中观察到明显的活性。
Table 18. Response of coleopteran and lepidopteran species to E. coli lysates and purified proteins; responses are expressed as percent mortality/percent growth inhibition | |||||
Sample | | Insect species | | | |
Lysate tested | Proteins present | Southern corn rootworm | Corn earworm | Tobacco budworm | Beet armyworm |
Vector | | 0/0 | 8/0 | 0/0 | 31/0 |
pDAB6031 | XptB1xb | 0/0 | 0/0 | 0/0 | 31/33 |
pDAB6032 | XptC1xb | 0/0 | 4/11 | 0/2 | 13/15 |
pDAB6033 | XptB1xb+XptC1xb | 0/0 | 0/0 | 0/6 | 13/38 |
Vector+TcdA | | 4/0 | 4/3 | 0/6 | 25/22 |
pDAB6031+TcdA | XptB1xb+TcdA | 0/0 | 0/0 | 0/5 | 13/34 |
pDAB6032+TcdA | XptC1xb+TcdA | 0/0 | 0/2 | 0/14 | 6/25 |
pDAB6033+TcdA | XptB1xb+XptC1xb+TcdA | 25/68 | 4/14 | 4/0 | 31/48 |
Vector+XptA2wi | | 0/0 | 0/79 | 0/9 | 31/0 |
pDAB6031+XptA2wi | XptB1xb+XptA2wi | 0/0 | 4/75 | 8/22 | 25/36 |
pDAB6032+XptA2wi | XptC1xb+XptA2wi | 0/0 | 0/71 | 0/22 | 6/14 |
pDAB6033+XptA2wi | XptB1xb+XptC1xb+XptA2wi | 0/0 | 83/100 | 29/98 | 81/100 |
实施例5F.伯氏致病杆菌菌株ILM104的XptB1xb和XptC1xb蛋白质的鉴定、纯化和表征
生物测定驱使的含有pDAB6033的大肠杆菌裂解液的分级导致通过MALDI-TOF鉴定出两个共纯化的蛋白质XptB1xb和XptC1xb。含有该两个蛋白质的峰有效地增强TcdA和XptA2wi的活性。
根据它们协同或增强TcdA抗南方玉米根叶甲的能力或者XptA2wi抗美国棉铃虫的能力鉴定出活性级分。所有的生物测定在上面实施例5A中叙述的128孔形式中进行。
在电导22-24mS/cm洗脱的蛋白质级分中检测到2个活性的峰(峰1和峰2)。峰1和峰2增强活性的例子示于表19。对峰1和峰2进行随后的纯化和分析。
峰1和峰2的凝胶中都含有两个主要的条带,一个迁移在约170kDa并且另一个迁移在约80kDa。从峰1的凝胶中含有三个额外的蛋白质,它们迁移在大约18、33和50kDa。经过回顾分析显示在纯化的起始阶段大约170kDa和大约80kDa的条带是丰富的并且变得在每一步进行性富集。
使用MALDI-TOF质谱分析了提取的肽,在Voyager DE-STRMALDI-TOF质谱仪(PerSeptive Biosystems,Framingham,MA)产生肽质量图谱(PMF)。对从约170kDa条带中提取的样品的分析证实与XptB1xb相同。对从约80kDa条带中提取的样品的分析证实与XptC1xb相同。虽然XptC1xb蛋白质预测的分子量根据从基因序列(SEQ ID NO:50)计算,分子量为108kDa,但是提取的蛋白质在SDS/PAGE中跑的明显地比预计的要快。代表着全长肽序列的肽片段的存在表明提取的蛋白质是全长的。
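Table 19. Biological activity of Peak 1 and Peak 2 purified from pDAB6033 | ||||
Sample | Corn earworm | | Southern corn rootworm | |
 | Dead | Stunted | Dead | Stunted |
Peak 1, 0 | 0 | 0 | 0 | 0 |
Peak 1, 125 | 2 | 6 | 4 | 2 |
Peak 2, 0 | 1 | 0 | 0 | 0 |
Peak 2, 125 | 0 | 8 | 5 | 3 |
Values in the Sample column are the concentrations (ng/cm2) of Peak 1 or Peak 2 XptB1xb/XptC1xb protein applied to the diet. For bioassays against corn earworm, 250 ng/cm2 of XptA2wi was included in the bioassay. For bioassays against southern corn rootworm, 100 ng/cm2 of TcdA was included in the bioassay. A total of 8 larvae were used per sample.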
实施例6-其它混合和匹配例子
在该实施例中,证明三种毒素复合体(TC)蛋白质的组合可以实现强效的昆虫抑制。当将A类蛋白质与B类和C类蛋白质混合后可以观察到令人信服的杀昆虫活性。本发明的惊人之处在于A类、B类和C类蛋白质的多种组合均导致强烈的昆虫抑制。毒素复合体蛋白质可以有广泛不同的来源并且可以与该类的其它功能成员拥有有限量的氨基酸同一性。
实施例6A.引言
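The insecticidal and growth-inhibitory activities encoded by 15 different toxin complex genes were tested individually and in combination with one another. Several examples from each of the A, B and C classes were tested. The genes were derived from three genera (Photorhabdus, Xenorhabdus and Paenibacillus; two Gram-negative and one Gram-positive) and four different species. The results of this example are consistent with the observation that toxin complex Class A proteins (e.g. TcdA and XptA2wi) have significant activity alone. This was recently demonstrated in transgenic plants by Liu et al. (Liu, D., Burton, S., Glancy, T., Li, Z-S., Hampton, R., Meade, T. and Merlo, D.J. "Insect resistance conferred by 283-kDa Photorhabdus luminescens protein TcdA in Arabidopsis thaliana", Nature Biotechnology, October 2003, Vol. 21, No. 10, pp. 1222-1228). The results are also consistent with the observation that co-expression of three toxin complex genes (Class A, Class B and Class C) from the same operon, strain or genus gives greater insecticidal activity than the Class A gene alone, or than any one or two of the three classes in combination (Hurst, M., Glare, T., Jackson, T. and Ronson, C. "Plasmid-located pathogenicity determinants of Serratia entomophila, the causal agent of amber disease of grass grub, show similarity to the insecticidal toxins of Photorhabdus luminescens", Journal of Bacteriology, September 2000, Vol. 182, No. 18, pp. 5127-5138; Morgan, J.A., Sergeant, M., Ellis, D., Ousley, M. and Jarrett, P. "Sequence analysis of insecticidal genes from Xenorhabdus nematophilus PMFI296", Applied and Environmental Microbiology, May 2001, Vol. 67, No. 5, pp. 2062-2069; Waterfield, N., Dowling, A., Sharma, S., Daborn, P., Potter, U. and ffrench-Constant, R. "Oral toxicity of Photorhabdus luminescens W14 toxin complexes in Escherichia coli", Applied and Environmental Microbiology, November 2001, Vol. 67, No. 11, pp. 5017-5024).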
令人惊奇地,下面的数据证明了这样一种发现,即毒素复合体A类蛋白质可以与例如从大肠杆菌制备的表达具有广泛不同来源的B类和C类基因的裂解液进行混合或者匹配,以产生极好的杀昆虫活性或者昆虫生长抑制活性。例如,来源于致病杆菌属的A类蛋白质可以与表达光杆状菌属来源的B类基因和类芽孢杆菌属来源的C类基因的裂解液进行混合以提供杀昆虫活性的组合。同样,来源于光杆状菌属的A类蛋白质可以与表达致病杆菌属来源的B类和C类的裂解液混合,反之亦然。许多组合是可能的;下面所示的许多会导致强效的杀昆虫活性。不可预计的新发现是来源于以对鞘翅目具活性的菌株(发光光杆状菌W-14株)或者对鳞翅目具活性的菌株(嗜线虫致病杆菌菌株Xwi)的毒素复合体A、B和C类的成分可以功能性的混合和匹配。另一个另人惊奇的是对于单个A、B或者C蛋白质的可能的差异程度的发现。例如,能与B类/C类功能性结合的单个A类蛋白质(例如TcdA和XptA2wi)彼此只享有41%的氨基酸同一性。同样,任何一个单个B类蛋白质可以只与另一个功能性B类蛋白质享有41%的同一性。相似地,任何给定的C类蛋白质可与另一个C类蛋白质只享有35%的同一性。
实施例6B.蛋白质来源和构建
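The Class A proteins TcdA and XptA2wi were used in purified form, prepared from cultures of Pseudomonas fluorescens heterologously expressing the proteins. Preparations of TcdA and XptA2wi from other heterologous sources (plants; bacteria) were functionally equivalent in the assays. The Class B and Class C proteins were tested as components of E. coli lysates. The use of lysates was validated by comparison with purified preparations of several Class B and Class C combinations. Reading frames encoding the Class B and Class C proteins were engineered for expression in E. coli by cloning them into pET vectors (Novagen, Madison WI). Each coding region contained an appropriately spaced ribosome binding site (relative to the start codon) and a termination signal. The 5' DNA sequences of some genes were modified to reduce predicted RNA secondary structure and thereby improve translation. These base changes were silent and caused no amino acid changes in the protein. Where a Class B gene was tested together with a Class C gene, an operon was constructed in the pET expression plasmid such that the Class B coding sequence was transcribed first, followed by the Class C coding sequence. The two coding regions were separated by a linker sequence containing a ribosome binding site appropriately spaced relative to the start codon of the Class C coding region. The DNA sequences between the coding regions of the dicistronic constructs are shown in the 5' to 3' direction. Tables 20-27 list the proteins encoded by the various expression plasmids, the sources of the coding regions and the plasmid reference numbers. Tables 22B, 23B and 28-31 show the linker sequences used in the expression plasmids.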
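Table 20. | ||
Class B protein | Source | Plasmid number |
TcdB1 | Photorhabdus luminescens strain W-14 | pDAB8907 |
TcdB2 | Photorhabdus luminescens strain W-14 | pDAB3089 |
TcaC | Photorhabdus luminescens strain W-14 | pDAB8905 |
XptC1wi | Xenorhabdus nematophilus strain Xwi | pDAB8908 |
XptB1xb | Xenorhabdus bovienii strain ILM104 | pDAB6031 |
PptB11529 | Paenibacillus strain 1529 | pDAB8722 |
Table 21. | ||
Class C protein | Source | Plasmid number |
TccC1 | Photorhabdus luminescens strain W-14 | pDAB8913 |
TccC2 | Photorhabdus luminescens strain W-14 | pDAB3118 |
TccC3 | Photorhabdus luminescens strain W-14 | pDAB3090 |
TccC5 | Photorhabdus luminescens strain W-14 | pDAB3119 |
XptB1wi | Xenorhabdus nematophilus strain Xwi | pDAB8909 |
XptC1xb | Xenorhabdus bovienii strain ILM104 | pDAB6032 |
PptC11529 | Paenibacillus strain 1529 | pDAB8723 |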
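Table 22A. | ||
Protein combination | Source | Plasmid number |
TcdB1+TccC1 | Photorhabdus luminescens strain W-14 | pDAB8912 |
TcdB1+TccC2 | Photorhabdus luminescens strain W-14 | pDAB8712 |
TcdB1+TccC3 | Photorhabdus luminescens strain W-14 | pDAB3104 |
TcdB1+TccC5 | Photorhabdus luminescens strain W-14 | pDAB8718 |
TcdB1+XptB1wi | Photorhabdus luminescens strain W-14; Xenorhabdus nematophilus strain Xwi | pDAB8713 |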
表22B. | ||
质粒编号 | 蛋白质组合 | 接头序列 |
pDAB8912 | TcdB1+TccC1 | tgactcgacgcactactagtaaaaaggagataacccc |
pDAB8712 | TcdB1+TccC2 | tgactcgaatttaaattatatatatatatatactcgacgaattttaatctactagtaaaaaggagataacc |
pDAB3104 | TcdB1+TccC3 | tgactcgacgcactactagtaaacaagaaggagatatacc |
pDAB8718 | TcdB1+TccC5 | tgactcgaatttaaattatatatatatatatactcgacgaattttaatctactagatttatttaaatttttttactagttttgtcgacaaaaaggagataacccc |
pDAB8713 | TcdB1+XptB1wi | tgactcgaatttaaattatatatatatatatactcgacaagaaggagatatacc |
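Table 23A. | ||
Protein combination | Source | Plasmid number |
TcdB2+TccC1 | Photorhabdus luminescens strain W-14 | pDAB3114 |
TcdB2+TccC2 | Photorhabdus luminescens strain W-14 | pDAB3115 |
TcdB2+TccC3 | Photorhabdus luminescens strain W-14 | pDAB3093 |
TcdB2+TccC5 | Photorhabdus luminescens strain W-14 | pDAB3106 |
TcdB2+XptB1wi | Photorhabdus luminescens strain W-14; Xenorhabdus nematophilus strain Xwi | pDAB3097 |
TcdB2+XptC1xb | Photorhabdus luminescens strain W-14; Xenorhabdus bovienii strain ILM104 | pDAB8910 |
TcdB2+PptC11529 | Photorhabdus luminescens strain W-14; Paenibacillus strain 1529 | pDAB8725 |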
表23B. | ||
质粒编号 | 蛋白质组合 | 接头序列 |
pDAB3114 | TcdB2+TccC1 | ttaatctgactcgacgcactactagtaaaaaggagataacccc |
pDAB3115 | TcdB2+TccC2 | ttaatctgactcgacgaattttaatctactagtaaaaaggagataacc |
pDAB3093 | TcdB2+TccC3 | ttaatctgactcgacgcactactagtaaacaagaaggagatatacc |
pDAB3106 | TcdB2+TccC5 | ttaatctgactcgacaaaaaggagataacccc |
pDAB3097 | TcdB2+XptB1wi | ttaatctgactcgacaagaaggagatatacc |
pDAB8910 | TcdB2+XptC1xb | ttaatctgactcgacaaaaaggagataaccccatgccttaaagaagagagagatatacc |
pDAB8725 | TcdB2+PptC11529 | ttaatctgactcgactttactagtaaggagatatacc |
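Table 24. | ||
Protein combination | Source | Plasmid number |
TcaC+TccC1 | Photorhabdus luminescens strain W-14 | pDAB8901 |
TcaC+TccC2 | Photorhabdus luminescens strain W-14 | pDAB8902 |
TcaC+TccC3 | Photorhabdus luminescens strain W-14 | pDAB8903 |
TcaC+TccC5 | Photorhabdus luminescens strain W-14 | pDAB8904 |
TcaC+XptB1wi | Photorhabdus luminescens strain W-14; Xenorhabdus nematophilus strain Xwi | pDAB8900 |
TcaC+XptC1xb | Photorhabdus luminescens strain W-14; Xenorhabdus bovienii strain ILM104 | pDAB8906 |
Table 25. | ||
Protein combination | Source | Plasmid number |
XptC1wi+TccC1 | Xenorhabdus nematophilus strain Xwi; Photorhabdus luminescens strain W-14 | pDAB8914 |
XptC1wi+TccC2 | Xenorhabdus nematophilus strain Xwi; Photorhabdus luminescens strain W-14 | pDAB8915 |
XptC1wi+TccC3 | Xenorhabdus nematophilus strain Xwi; Photorhabdus luminescens strain W-14 | pDAB3103 |
XptC1wi+TccC5 | Xenorhabdus nematophilus strain Xwi; Photorhabdus luminescens strain W-14 | pDAB3105 |
XptC1wi+XptB1wi | Xenorhabdus nematophilus strain Xwi | pDAB8916 |
Table 26. | ||
Protein combination | Source | Plasmid number |
XptB1xb+TccC1 | Xenorhabdus bovienii strain ILM104; Photorhabdus luminescens strain W-14 | pDAB8918 |
XptB1xb+TccC3 | Xenorhabdus bovienii strain ILM104; Photorhabdus luminescens strain W-14 | pDAB6039 |
XptB1xb+XptC1xb | Xenorhabdus bovienii strain ILM104 | pDAB6033 |
XptB1xb+PptC11529 | Xenorhabdus bovienii strain ILM104; Paenibacillus strain 1529 | pDAB8732 |
Table 27. | ||
Protein combination | Source | Plasmid number |
PptB11529+PptC11529 | Paenibacillus strain 1529 | pDAB8724 |
PptB11529+TccC3 | Paenibacillus strain 1529; Photorhabdus luminescens strain W-14 | pDAB8726 |
PptB11529+TccC1 | Paenibacillus strain 1529; Photorhabdus luminescens strain W-14 | pDAB8733 |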
表28. | ||
质粒编号 | 蛋白质组合 | 接头序列 |
pDAB8901 | TcaC+TccC1 | taactcgatatggctagcatgactggtggacagcaaatgggtcgcggatcgatccgaattcgcccttgtcgacgcactactagtaaaaaggagataacccc |
pDAB8902 | TcaC+TccC2 | taactcgatatggctagcatgactggtggacagcaaatgggtcgcggatcaaattatatatatatatatactcgacgaattttaatctactagtaaaaaggagataacc |
pDAB8903 | TcaC+TccC3 | taactcgatatggctagcatgactggtggacagcaaatgggtcgcggatccgaattcgagctccgtcgacgcactactagtaaacaagaaggagatatacc |
pDAB8904 | TcaC+TccC5 | taactcgatatggctagcatgactggtggacagcaaatgggtcgcggatcaaatttttttactagttttgtcgacaaaaaggagataacccc |
pDAB8900 | TcaC+XptB1wi | taactcgatatggctagcatgactggtggacagcaaatgggtcgcggatctcgatcccgcgaaattaatacgactcactataggggaattgtgagcggataacaattcccctctagacgtgcgcgacaagaaggagatatacc |
pDAB8906 | TcaC+XptC1xb | taactcgatatggctagcatgactggtggacagcaaatgggtcgcggatcccttaaagaagagagagatatacc |
表29. | ||
质粒编号 | 蛋白质组合 | 接头序列 |
pDAB8914 | XptC1wi+TccC1 | ttaatgctctcgaatttgactagaaataattttgtttaactttaagaaggagatataccatgggcagcagccatcatcatcatcatcacagcagcggcctggtgccgcgcggcagccatatggctagcatgactggtggacagcaaatgggtcgcggatccgaattcgcccttgtcgacgcactactagtaaaaaggagataacccc |
pDAB8915 | XptC1wi+TccC2 | ttaatgctctcgaatttgactagagtcgacgaattttaatctactagtaaaaaggagataacc |
pDAB3103 | XptC1wi+TccC3 | ttaatgctctcgaatttgactagtcaaattatatatatatatatactcgacgcactactagtaaacaagaaggagatatacc |
pDAB3105 | XptC1wi+TccC5 | ttaatgctctcgaatttgactagatttatttaaatttttttactagttttgtcgacaaaaaggagataacccc |
pDAB8916 | XptC1wi+XptB1wi | ttaatgctctcgaatttgactagacgtgcgtcgacaagaaggagatatacc |
表30. | ||
质粒编号 | 蛋白质组合 | 接头序列 |
pDAB8918 | XptB1xb+TccC1 | ttaatgcggccgcaggaaatttttttgtcgactttactagtaaaaaggagataacccc |
pDAB6039 | XptB1xb+TccC3 | ttaatgcggccgcaggctagtaaacaagaaggagatatacc |
pDAB6033 | XptB1xb+XptC1xb | ttaatgcggccgcaggccttaaagaagagagagatatacc |
pDAB8732 | XptB1xb+PptC11529 | ttaatgcggccgcaggcctctgtaagactctcgactttactagtaaggagatatacc |
表31. | ||
质粒编号 | 蛋白质组合 | 接头序列 |
pDAB8724 | PptB11529+PptC11529 | taatgtcgactttactagtaaggagatatacc |
pDAB8726 | PptB11529+TccC3 | taatgtcgactttactagtaaacaagaaggagatatacc |
pDAB8733 | PptB11529+TccC1 | taatgtcgactttactagtaaaaaggagataacccc |
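The linker tables can be inspected for the ribosome binding site that the construct description mentions. The Python sketch below locates a Shine-Dalgarno-like core in a linker and reports how many nucleotides separate it from the linker's 3' end. Two points are assumptions rather than statements from the source: the core motif `aggag` is used as a stand-in for the ribosome binding site, and the linker is taken to end immediately before the start codon of the downstream coding region. The function name `sd_spacing` is hypothetical.

```python
def sd_spacing(linker, motif="aggag"):
    """Return the number of nucleotides between the last occurrence of
    `motif` and the 3' end of `linker`, or None if the motif is absent.

    Whitespace inside the sequence (an extraction artifact) is ignored.
    """
    cleaned = "".join(linker.lower().split())
    position = cleaned.rfind(motif)
    if position < 0:
        return None
    return len(cleaned) - (position + len(motif))


if __name__ == "__main__":
    linkers = {
        "pDAB8912": "tgactcgacgcactactagtaaaaaggagataacccc",
        "pDAB8724": "taatgtcgactttactagtaaggagatatacc",
        "pDAB6033": "ttaatgcggccgcaggccttaaagaagagagagatatacc",
    }
    for plasmid, sequence in linkers.items():
        print(plasmid, sd_spacing(sequence))
```

A result of `None` only means that this particular motif is absent; linkers such as the one in pDAB6033 evidently rely on a different purine-rich stretch, so a broader motif definition would be needed to cover every construct.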
实施例6C.表达条件和裂解液制备
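The pET expression plasmids listed in Tables 20-27 were transformed into the E. coli T7 expression strains BL21(DE3) (Novagen, Madison WI) or BL21 Star™(DE3) (Stratagene, La Jolla, CA) using standard methods. Expression cultures were initiated by inoculating 10-200 freshly transformed colonies into 250 mL of LB containing 50 μg/ml antibiotic and 75 μM IPTG. The cultures were grown at 28°C for 24 hours at 180-200 rpm. Cells were collected by centrifugation in 250 ml Nalgene bottles at 3,400×g for 10 minutes at 4°C. The pellets were suspended in 4-4.5 mL of Butterfield's phosphate solution (Hardy Diagnostics, Santa Maria, CA; 0.3 mM potassium phosphate pH 7.2). The suspended cells were transferred to 50 mL screw-cap polypropylene centrifuge tubes containing 1 mL of 0.1 mm diameter glass beads (Biospec, Bartlesville, OK, catalog number 1107901). The cell-glass bead mixture was chilled on ice, and the cells were then lysed by sonication with two 45-second bursts using a 2 mm probe on a Branson Sonifier 250 (Danbury CT) at an output of about 20, chilling completely between bursts. The lysates were transferred to 2 mL Eppendorf tubes and centrifuged for 5 minutes at 16,000×g. The supernatants were collected and the protein concentration was measured. Bio-Rad Protein Dye Assay Reagent was diluted 1:5 with H2O, and 1 mL was added to 10 μl of a 1:10 dilution of each sample and to bovine serum albumin (BSA) at concentrations of 5, 10, 15, 20 and 25 μg/mL. The absorbance at 595 nm was then read on a Shimadzu UV160U spectrophotometer (Kyoto, JP). The amount of protein in each sample was calculated from the BSA standard curve and adjusted to 3-6 mg/mL with phosphate buffer. Lysates were typically assayed fresh; however, no loss of activity was observed after storage at -70°C.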
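The protein quantification step (BSA standard curve, then adjustment with phosphate buffer) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' procedure: the absorbance readings are hypothetical, the dye response is treated as linear over 5-25 μg/mL, and the BSA values are taken to be final concentrations in the assay mixture, so that a sample's overall dilution is the 1:10 pre-dilution times the 10 μL into 1 mL reagent step. All function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lysate_mg_per_ml(a595, slope, intercept,
                     predilution=10.0, sample_ul=10.0, reagent_ul=1000.0):
    """Back-calculate the undiluted lysate concentration in mg/mL."""
    in_assay_ug_per_ml = (a595 - intercept) / slope
    # Assumed overall dilution: 1:10 pre-dilution, then 10 uL into 1 mL reagent.
    total_dilution = predilution * (sample_ul + reagent_ul) / sample_ul
    return in_assay_ug_per_ml * total_dilution / 1000.0


def buffer_to_add_ml(current_mg_per_ml, target_mg_per_ml, volume_ml):
    """Volume of buffer that dilutes `volume_ml` of lysate to the target."""
    if target_mg_per_ml > current_mg_per_ml:
        raise ValueError("lysate is already below the target concentration")
    return volume_ml * (current_mg_per_ml / target_mg_per_ml - 1.0)


# BSA standards from the text; absorbances are hypothetical example values.
BSA_UG_PER_ML = [5, 10, 15, 20, 25]
A595_STANDARDS = [0.15, 0.25, 0.35, 0.45, 0.55]
slope, intercept = fit_line(BSA_UG_PER_ML, A595_STANDARDS)

if __name__ == "__main__":
    concentration = lysate_mg_per_ml(0.25, slope, intercept)
    print(round(concentration, 2), "mg/mL")
    print(round(buffer_to_add_ml(concentration, 5.0, 1.0), 2), "mL buffer per mL lysate")
```

If the standards were instead prepared as 10 μL aliquots treated like the samples, `total_dilution` would reduce to the pre-dilution factor alone; the source does not say which convention was used.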
实施例6D.生物测定条件
杀昆虫生物测定通过在特别设计用于杀昆虫生物测定法的128孔碟子(C-D International,Pitman,NJ)中人工饵料上的新生幼虫进行。所测定的物种有南方玉米根叶甲、美洲棉铃虫、美洲烟夜蛾和甜菜夜蛾。
生物测定法是通过在控制的环境条件下(28℃,~40%r.h.,16:8[L:D])孵育5天,此时记录处理的昆虫的总数、死亡昆虫数和存活昆虫的重量。对于每一个处理计算死亡的百分数和生长抑制的百分数。生长抑制的计算如下:
%生长抑制=[1-(处理的昆虫的平均重量/只有载体的对照中昆虫的平均重量)]*100
在处理的昆虫的平均重量大于只有载体的对照中昆虫的平均重量的情况下,生长抑制记录为0%。
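The two summary statistics defined above (percent mortality, and percent growth inhibition floored at 0% when treated insects outweigh the controls) can be written as a short Python sketch. The function names and the example numbers are hypothetical; only the formula and the 0% floor come from the text.

```python
def percent_mortality(n_dead, n_total):
    """Percent of treated insects that died."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_dead / n_total


def percent_growth_inhibition(mean_weight_treated, mean_weight_control):
    """[1 - (treated mean weight / vector-only control mean weight)] * 100,
    reported as 0% when the treated insects are heavier than the controls."""
    if mean_weight_control <= 0:
        raise ValueError("control mean weight must be positive")
    inhibition = (1.0 - mean_weight_treated / mean_weight_control) * 100.0
    return max(0.0, inhibition)


if __name__ == "__main__":
    print(percent_mortality(2, 8))
    print(percent_growth_inhibition(25.0, 100.0))
    print(percent_growth_inhibition(120.0, 100.0))
```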
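The biological activity of the crude lysates, alone or with added TcdA or XptA2wi toxin protein, was assayed as follows. Crude E. coli lysates (40 μL) from control cultures or from cultures expressing potentiator proteins were applied to the surface of the artificial diet in 8 wells of a bioassay tray. The average surface area of treated diet in each well was approximately 1.5 cm2. The lysates were adjusted to 2-5 mg/mL total protein and were applied with or without TcdA or XptA2wi. The added TcdA or XptA2wi was a highly purified fraction from bacterial cultures heterologously expressing the protein. The final concentrations of TcdA and XptA2wi on the diet were 50 ng/cm2 and 250 ng/cm2, respectively.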
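The dosing arithmetic implied by this paragraph (40 μL applied to roughly 1.5 cm2 of diet per well) can be made explicit. The Python sketch below converts a surface dose in ng/cm2 into the amount of toxin per well and the concentration needed in the applied liquid. It assumes that the purified toxin is delivered in the same 40 μL as the lysate, which the source does not state, and the function names are hypothetical.

```python
WELL_AREA_CM2 = 1.5      # average treated diet surface per well (from the text)
APPLIED_VOLUME_UL = 40.0  # lysate volume applied per well (from the text)


def toxin_per_well_ng(dose_ng_per_cm2, area_cm2=WELL_AREA_CM2):
    """Mass of toxin needed in one well to reach the surface dose."""
    return dose_ng_per_cm2 * area_cm2


def toxin_in_sample_ng_per_ul(dose_ng_per_cm2, area_cm2=WELL_AREA_CM2,
                              volume_ul=APPLIED_VOLUME_UL):
    """Toxin concentration in the applied sample, assuming the toxin is
    delivered in the same volume as the lysate."""
    return toxin_per_well_ng(dose_ng_per_cm2, area_cm2) / volume_ul


def lysate_protein_per_well_ug(lysate_mg_per_ml, volume_ul=APPLIED_VOLUME_UL):
    """Total lysate protein per well; mg/mL is numerically ug/uL."""
    return lysate_mg_per_ml * volume_ul


if __name__ == "__main__":
    for dose in (50.0, 250.0):
        print(dose, toxin_per_well_ng(dose), toxin_in_sample_ng_per_ul(dose))
    print(lysate_protein_per_well_ug(2.0), lysate_protein_per_well_ug(5.0))
```

Under these assumptions, the lysate protein (80-200 μg per well) exceeds the added toxin (75-375 ng per well) by roughly three orders of magnitude.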
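The results of the bioassays are summarized in Tables 32-39. Little to no effect on the survival or growth of the insect species tested was observed when larvae were fed lysates from E. coli clones engineered to express only a Class B or only a Class C protein (Tables 32 and 33). Likewise, little to no effect was observed when larvae were fed combinations of Class B and Class C proteins in the absence of purified toxin (Tables 34-39, "无" (no toxin) columns). Pronounced effects on survival and/or growth were usually observed when larvae were fed lysates from E. coli clones engineered to express a combination of Class B and Class C proteins together with a purified toxin (Tables 34-39, "TcdA" and "XptA2wi" columns). The Class B and Class C combinations tested with purified TcdA most typically affected southern corn rootworm, whereas the combinations tested with purified XptA2wi mainly affected one of the three lepidopteran species, with corn earworm consistently the most sensitive. It should be noted that many Class B and Class C combinations produced an observable effect of XptA2wi on southern corn rootworm. The converse was not true: these combinations did not produce an observable effect of TcdA on the lepidopteran species.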
表32-39表明喂食给昆虫幼虫单独的大肠杆菌裂解液和与光杆状菌属或者致病杆菌属毒素蛋白质的组合的生物学活性。在每一大肠杆菌克隆中包含的基因对应于表20-27中包含的那些基因。使用下面的等级将生物学活性进行分类:0=平均死亡率<50%并且平均重量>空载体/无毒素处理的平均重量的50%,+=平均重量≤空载体/无毒素处理的平均重量的50%,++=平均死亡率≥50%或者平均重量≤空载体/无毒素处理的平均重量的20%,和+++=平均死亡率≥95%或者平均重量≤空载体/无毒素处理的平均重量的5%。
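The four-level scale used in Tables 32-39 combines mean mortality with mean weight relative to the empty-vector/no-toxin treatment. A minimal Python sketch of that classification follows, checking the most severe category first so that each result receives the highest rating it qualifies for. The function name is hypothetical, and allowing the weight to be `None` when no insects survive to be weighed is an assumption added for illustration.

```python
def classify_activity(mean_mortality_pct, weight_pct_of_control=None):
    """Return "0", "+", "++" or "+++" using the thresholds in the text.

    weight_pct_of_control is the mean weight of treated insects expressed as
    a percentage of the empty-vector/no-toxin mean weight; pass None when no
    survivors could be weighed (assumption for illustration).
    """
    weight = float("inf") if weight_pct_of_control is None else weight_pct_of_control
    if mean_mortality_pct >= 95 or weight <= 5:
        return "+++"
    if mean_mortality_pct >= 50 or weight <= 20:
        return "++"
    if weight <= 50:
        return "+"
    return "0"


if __name__ == "__main__":
    print(classify_activity(0, 100))
    print(classify_activity(10, 45))
    print(classify_activity(60, 80))
    print(classify_activity(100))
```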
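Table 32. Biological activity of E. coli clones engineered to express Class B protein genes. Lysates were tested alone or together with purified TcdA or XptA2wi. | ||||||||||||
Insect species | Southern corn rootworm | | | Tobacco budworm | | | Corn earworm | | | Beet armyworm | | |
Toxin protein | None | TcdA | XptA2wi | None | TcdA | XptA2wi | None | TcdA | XptA2wi | None | TcdA | XptA2wi |
tcdB1 | 0 | + | + | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
tcdB2 | 0 | + | 0 | 0 | 0 | 0 | 0 | 0 | ++ | 0 | 0 | 0 |
tcaC | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
xptC1wi | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
xptB1xb | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | + | 0 | 0 | 0 |
pptB11529 | 0 | + | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |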
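Table 33. Biological activity of E. coli clones engineered to express Class C protein genes. Lysates were tested alone or together with purified TcdA or XptA2wi. | ||||||||||||
Insect species | Southern corn rootworm | | | Tobacco budworm | | | Corn earworm | | | Beet armyworm | | |
Toxin protein | None | TcdA | XptA2wi | None | TcdA | XptA2wi | None | TcdA | XptA2wi | None | TcdA | XptA2wi |
tccC1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
tccC2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
tccC3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | + | 0 | 0 | 0 |
tccC5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
xptB1wi | 0 | 0 | + | 0 | 0 | 0 | 0 | 0 | + | 0 | 0 | 0 |
xptC1xb | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | + | 0 | 0 | 0 |
pptC11529 | 0 | + | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |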
表34.工程改造表达B类蛋白质基因tcdB1与多种C类蛋白质基因的组合的大肠杆菌克隆的生物学活性。裂解液进行单独测试或者与纯化的TcdA或者XptA2wi一起测试。 | ||||||||||||
昆虫物种 | ||||||||||||
南方玉米根叶甲 | 美洲烟夜蛾 | 美洲棉铃虫 | 甜菜夜蛾 | |||||||||
毒素蛋白质 | 无 | TcdA | XptA2wi | 无 | TcdA | XptA2wi | 无 | TcdA | XptA2wi | 无 | TcdA | XptA2wi |
tcdB1组合tcdB1+tccC1tcdB1+tccC2tcdB1+tccC3tcdB1+tccC5tcdB1+xptB1wi | 00000 | +0+++0 | +0++0 | 00000 | 00000 | ++0++++++ | 00000 | +0000 | ++++++++++++ | 00000 | 00000 | 00+++++++ |
表35.工程改造表达B类蛋白质基因tcdB2与多种C类蛋白质基因的组合的大肠杆菌克隆的生物学活性。裂解液进行单独测试或者与纯化的TcdA或者XptA2wi一起测试。 | ||||||||||||
昆虫物种 | ||||||||||||
南方玉米根叶甲 | 美洲烟夜蛾 | 美洲棉铃虫 | 甜菜夜蛾 | |||||||||
毒素蛋白质 | 无 | TcdA | XptA2wi | 无 | TcdA | XptA2wi | 无 | TcdA | XptA2wi | 无 | TcdA | XptA2wi |
tcdB2组合tcdB2+tccC3tcdB2+tccC5tcdB2+xptB1witcdB2+xptC1xbtcdB2+pptC11529 | 000nt0 | +++++++nt+++ | +++nt+ | 00000 | 00000 | ++++++++++ | 00000 | 00+00 | ++++++++++++++ | 00000 | 00000 | +++++++++ |
表36.工程改造表达B类蛋白质基因tcaC与多种C类蛋白质基因的组合的大肠杆菌克隆的生物学活性。裂解液进行单独测试或者与纯化的TcdA或者XptA2wi一起测试。 | ||||||||||||
昆虫物种 | ||||||||||||
南方玉米根叶甲 | 美洲烟夜蛾 | 美洲棉铃虫 | 甜菜夜蛾 | |||||||||
毒素蛋白质 | 无 | TcdA | XptA2wi | 无 | TcdA | XptA2wi | 无 | TcdA | XptA2wi | 无 | TcdA | XptA2wi |
tcaC组合tcaC+tccC1tcaC+tccC2tcaC+tccC3tcaC+tccC5tcaC+xptB1wi | 00000 | ++0++++++ | +0++0 | 0000+ | 0000+ | ++0++++++ | 00000 | 00000 | ++++++++++++ | 00000 | 00000 | +++0++++++++ |
表37.工程改造表达B类蛋白质基因xptC1wi与多种C类蛋白质基因的组合的大肠杆菌克隆的生物学活性。裂解液进行单独测试或者与纯化的TcdA或者XptA2wi一起测试。 | ||||||||||||
昆虫物种 | ||||||||||||
南方玉米根叶甲 | 美洲烟夜蛾 | 美洲棉铃虫 | 甜菜夜蛾 | |||||||||
毒素蛋白质 | 无 | TcdA | XptA2wi | 无 | TcdA | XptA2wi | 无 | TcdA | XptA2wi | 无 | TcdA | XptA2wi |
xptC1wi组合xptC1wi+tccC1xptC1wi+tccC2xptC1wi+tccC3xptC1wi+tccC5xptC1wi+xptB1wi | 00000 | 00++++0 | 00++0 | 00000 | 00000 | +0++++ | 00000 | 00000 | ++++++++++++ | 00000 | 00000 | 00++++0 |
表38.工程改造表达B类蛋白质基因xptB1xb与多种C类蛋白质基因的组合的大肠杆菌克隆的生物学活性。裂解液进行单独测试或者与纯化的TcdA或者XptA2wi一起测试。 | ||||||||||||
昆虫物种 | ||||||||||||
南方玉米根叶甲 | 美洲烟夜蛾 | 美洲棉铃虫 | 甜菜夜蛾 | |||||||||
毒素蛋白质 | 无 | TcdA | XptA2wi | 无 | TcdA | XptA2wi | 无 | TcdA | XptA2wi | 无 | TcdA | XptA2wi |
xptB1xb组合xptB1xb+tccC3xptB1xb+xptC1xb | nt0 | nt+ | nt0 | 00 | 00 | ++++ | 00 | 00 | ++++++ | 00 | 00 | ++++ |
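Table 39. Biological activity of E. coli clones engineered to express the Class B protein gene pptB11529 in combination with various Class C protein genes. Lysates were tested alone or together with purified TcdA or XptA2wi. | ||||||||||||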
昆虫物种 | ||||||||||||
南方玉米根叶甲 | 美洲烟夜蛾 | 美洲棉铃虫 | 甜菜夜蛾 | |||||||||
毒素蛋白质 | 无 | TcdA | XptA2wi | 无 | TcdA | XptA2wi | 无 | TcdA | XptA2wi | 无 | TcdA | XptA2wi |
pptB11529 combinations: pptB11529+pptC11529, pptB11529+tccC3 | 00 | ++++++ | 0+ | 00 | 00 | ++++ | 00 | 00 | ++++++ | 00 | 00 | ++++ |
实施例7-TC蛋白质的其它混合和匹配
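To demonstrate the versatility of the TC proteins described here, additional E. coli expression experiments were carried out using two-plasmid expression systems. A T7 promoter-based system used a pACYC derivative (designated pCot-3 or 4, chloramphenicol resistant) to express TcdA or XptA2 protein, while a compatible T7 promoter pET280 plasmid (kanamycin resistant) expressed various combinations of the TcdB1 (SEQ ID NO:22), TcdB2 (SEQ ID NO:45), XptC1 (SEQ ID NO:18), TccC1 (SEQ ID NO:25), TccC3 (SEQ ID NO:47) and XptB1 (SEQ ID NO:16) proteins, all within the same cell. In a further series of experiments, an E. coli promoter system using a different pACYC derivative (designated pCTS, spectinomycin/streptomycin resistant) was used to express TcdA (SEQ ID NO:21) or XptA2 (SEQ ID NO:34) protein, while a compatible pBT280 plasmid (chloramphenicol resistant) expressed various combinations of TcdB1, TcdB2, XptC1, TccC1, TccC3 and XptB1. When bioassayed, the two systems produced proteins of similar activity.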
通过首先制备含有pCot-3、pCot-TcdA或者pCot-XptA2的感受态BL21(DE3)细胞原种进行基于T7启动子的实验。用对照pET280质粒或者存在于pET280载体中的上面提到的TC基因的任何组合转化这些细胞。在含有氯霉素和卡那霉素的培养基上对含有两种质粒的细胞进行选择。相似地,对于大肠杆菌启动子的系统,制备了含有pCTS、pCTS-TcdA或者pCTS-XptA2的感受态BL21细胞。然后用pBT280对照质粒或者存在于pBT280载体中的上面提到的任何TC组合转化感受态细胞。当在一个特定质粒中存在多于一个的TC基因时,它们则作为在5′末端具有单一一个启动子的两个基因操纵子而排列。在第一个编码区后跟着翻译终止信号;一个分开的核糖体结合位点(SD序列)和翻译起始信号用来起始第二个编码区的翻译。实施例2和3中叙述的方法用于表达培养物的生长、裂解液的制备和杀昆虫活性的评价。在一些实验中使用了改良的测定方法,其中将蛋白质TcdA和XptA2的浓缩制品添加到含有TcdB1、TcdB2、XptC1、TccC1、TccC3和XptB1的单一或组合的裂解液中(表40和41)。
表40.异源表达的毒素复合体基因对TBW、SCR、ECB和BAW的生物测定结果 | ||||
所测试样品 | TBW生物测定 | SCR生物测定 | ECB生物测定 | BAW生物测定 |
XptA2 | ++ | 0 | ++ | ++ |
TcdB1 | 0 | 0 | + | +++ |
XptC1 | 0 | 0 | 0 | +++ |
TccC1 | + | 0 | + | +++ |
XptB1 | 0 | 0 | 0 | +++ |
TcdB1+TccC1 | 0 | 0 | ||
TcdB1+XptB1 | 0 | 0 | ||
XptC1+TccC1 | 0 | 0 | ||
XptC1+XptB1 | + | 0 | ||
XptA2+TcdB1 | +++ | + | ++ | |
XptA2+XptC1 | ++ | 0 | 0 | ++ |
XptA2+TccC1 | +++ | + | + | +++ |
XptA2+XptB1 | +++ | + | 0 | ++++ |
XptA2+TcdB1+TccC1 | +++++ | +++ | +++++ | |
XptA2+TcdB1+XptB1 | +++++ | +++ | +++++ | ++++ |
XptA2+XptC1+TccC1 | ++++ | 0 | ++++ | ++++ |
XptA2+XptC1+XptB1 | ++++ | + | +++++ | ++++ |
TcdA | 0 | +++ | ++ | 0 |
TcdA+TcdB1 | 0 | +++ | ++ | 0 |
TcdA+XptC1 | 0 | +++ | ++++ | 0 |
TcdA+TccC1 | 0 | ++ | 0 | 0 |
TcdA+XptB1 | 0 | +++ | ++ | 0 |
TcdA+TcdB1+TccC1 | 0 | ++++ | ++++ | 0 |
TcdA+TcdB1+XptB1 | 0 | ++++ | ++++ | 0 |
TcdA+XptC1+TccC1 | 0 | ++++ | ++ | 0 |
TcdA+XptC1+XptB1 | 0 | +++ | ++++ | 0 |
裂解完整的大肠杆菌细胞并在实验中将可溶的蛋白质通常规一化至5-10mg/ml。通过加至昆虫食物上如所述进行裂解物的生物测定。级别划分为以相对于对照的生长抑制百分数%表示(0=0-25%;+=26-50%;++=51-65%;+++=66-80%;++++=81-95%;+++++=96-100%)。
表41.异源表达的毒素复合体基因与加入的纯化TcdA毒素蛋白质*对SCR、TBW、CEW和FAW的生物测定结果 | ||||
所测试质粒 | SCR生物测定 | TBW生物测定 | CEW生物测定 | FAW生物测定 |
pET-280 | 0 | 0 | 0 | 0 |
pET-280-TcdB1+XptB1 | ++++ | + | ++ | 0 |
pET-280-TcdB2+TccC3 | +++++ | + | 0 | ++ |
裂解完整大肠杆菌细胞并将可溶蛋白质调整到8-15mg/ml的同一样品浓度。*当应用到昆虫食物制品的表面时,TcdA蛋白质以50ng/cm2的终浓度加入到样品中。
等级范围以饲喂添加TcdA毒素蛋白质的处理的存活昆虫相对于饲喂没有添加TcdA毒素蛋白质的处理存活昆虫的生长抑制%表示(0=0-10%;+=11-20%;++=21-40%;+++=41-60%;++++=61-80%;+++++>80%)。
表42.异源表达的毒素复合体基因与加入的纯化XptA2毒素蛋白质*对SCR、TBW、CEW和FAW的生物测定结果 | ||||
所测试质粒 | SCR生物测定 | TBW生物测定 | CEW生物测定 | FAW生物测定 |
pET-280 | 0 | + | +++ | 0 |
pET-280-TcdB1+XptB1 | ++++ | +++++ | +++++ | +++ |
pET-280-TcdB2+TccC3 | ++++ | +++++ | +++++ | ++++ |
裂解完整的大肠杆菌细胞并将可溶蛋白质调整到8-15mg/ml的同一样品浓度。*当应用到昆虫食物制品的表面时,XptA2蛋白质以250ng/cm2的终浓度加入到样品中。
等级范围以饲喂添加XptA2毒素蛋白质的处理的存活昆虫相对于饲喂没有添加XptA2毒素蛋白质的处理存活昆虫的生长抑制的%表示(0=0-10%;+=11-20%;++=21-40%;+++=41-60%;++++=61-80%;+++++>80%)。
实施例8-混合和匹配测定和序列关系的总结
下列表总结和比较了在上述的测定中使用的蛋白质。表43-45比较了A类、B类和C类蛋白质。表46-48比较了A类、B类和C类基因(细菌)。表中的任何数字可以用来作为定义本发明蛋白质和多核苷酸的上限和/或下限。表49比较了多种TC蛋白质的大小。另外,该表中的任何数字可以用来作为定义本发明蛋白质(和多核苷酸)的上限和下限。
这些表有助于表明一种令人惊奇的事实,即根据本发明甚至具有高度差异的蛋白质(在大约40-75%同一性范围内)能够被使用并且可以彼此替代。在此处TcdA2w-14以SEQ ID NO:62重现,TcdA4w-14以SEQ ID NO:63重现,并且TccCw-14以SEQ ID NO:64重现。
表43. | ||||||||||||||
TcdA | TcdA2 | TcdA4 | TcbA | XptA1xwi | XptA2xwi | SepA | ||||||||
相似性% | 同一性% | 相似性% | 同一性% | 相似性% | 同一性% | 相似性% | 同一性% | 相似性% | 同一性% | 相似性% | 同一性% | 相似性% | 同一性% | |
发光光杆状菌A类 | ||||||||||||||
TcdA | 100.0 | 100.0 | 61.3 | 55.0 | 74.3 | 68.0 | 61.4 | 50.1 | 57.3 | 46.3 | 53.8 | 40.6 | 52.6 | 40.7 |
TcdA2 | 100.0 | 100.0 | 63.7 | 55.9 | 52.7 | 42.4 | 52.3 | 41.3 | 48.3 | 36.8 | 45.5 | 34.7 | ||
TcdA4 | 100.0 | 100.0 | 59.0 | 49.4 | 54.8 | 44.4 | 51.7 | 38.7 | 50.6 | 38.7 | ||||
TcbA | 100.0 | 100.0 | 54.7 | 43.7 | 54.0 | 40.8 | 52.8 | 40.2 | ||||||
嗜线虫致病杆菌xwi A类 | ||||||||||||||
XptA1xwi | 100.0 | 100.0 | 57.6 | 44.2 | 57.7 | 46.6 | ||||||||
XptA2xwi | 100.0 | 100.0 | 50.7 | 38.2 | ||||||||||
嗜线虫沙雷氏菌A类 | ||||||||||||||
SepA | 100.0 | 100.0 | ||||||||||||
在混合和匹配测定中进行了测试? | 是 | 否 | 否 | 是 | 否 | 是 | 否 | |||||||
起作用吗? | 是 | NA | NA | 是 | NA | 是 | NA |
注意:tcdA3是个假基因(不编码全长蛋白质)所以不做该分析
表44. | ||||||||||||||
TcdB1 | TcdB2 | TcaC | XptC1xwi | XptB1xb | PptB1(ORF5) | SepB | ||||||||
相似性% | 同一性% | 相似性% | 同一性% | 相似性% | 同一性% | 相似性% | 同一性% | 相似性% | 同一性% | 相似性% | 同一性% | 相似性% | 同一性% | |
发光光杆状菌B类 | ||||||||||||||
TcdB1 | 100.0 | 100.0 | 79.9 | 75.6 | 69.5 | 58.2 | 61.3 | 50.2 | 65.6 | 54.6 | 55.3 | 42.3 | 63.7 | 52.6 |
TcdB2 | 100.0 | 100.0 | 68.1 | 57.2 | 60.7 | 49.8 | 65.6 | 53.3 | 54.2 | 42.0 | 61.7 | 51.4 | ||
TcaC | 100.0 | 100.0 | 63.9 | 51.6 | 70.6 | 59.8 | 56.9 | 42.6 | 61.4 | 50.1 | ||||
嗜线虫致病杆菌xwi B类 | ||||||||||||||
XptC1xwi | 100.0 | 100.0 | 65.2 | 53.2 | 53.9 | 40.7 | 58.1 | 47.8 | ||||||
伯氏致病杆菌B类 | ||||||||||||||
XptB1xb | 100.0 | 100.0 | 54.2 | 40.6 | 57.4 | 46.0 | ||||||||
类芽孢杆菌1529株B类 | ||||||||||||||
PptB1(ORF5) | 100.0 | 100.0 | 51.5 | 38.7 | ||||||||||
嗜线虫沙雷氏菌B类 | ||||||||||||||
SepB | 100.0 | 100.0 | ||||||||||||
在混合和匹配测定中进行了测试? | 是 | 是 | 是 | 是 | 是 | 是 | 否 | |||||||
起作用吗? | 是 | 是 | 是 | 是 | 是 | 是 | NA |
表45. | ||||||||||||||||||||
TccC1 | TccC2 | TccC3 | TccC4 | TccC5 | XptB1xwi | XptC1xb | PptC1(Orf6长) | PptC1(Orf6短) | SepC | |||||||||||
相似性% | 同一性% | 相似性% | 同一性% | 相似性% | 同一性% | 相似性% | 同一性% | 相似性% | 同一性% | 相似性% | 同一性% | 相似性% | 同一性% | 相似性% | 同一性% | 相似性% | 同一性% | 相似性% | 同一性% | |
发光光杆状菌C类 | ||||||||||||||||||||
TccC1 | 100.0 | 100.0 | 57.8 | 48.1 | 62.0 | 52.8 | 62.5 | 52.9 | 59.7 | 51.3 | 59.0 | 45.5 | 55.8 | 46.5 | 45.0 | 35.0 | 45.9 | 35.7 | 56.0 | 44.1 |
TccC2 | 100.0 | 100.0 | 60.3 | 52.5 | 62.2 | 53.7 | 67.9 | 61.4 | 54.0 | 44.1 | 56.4 | 47.2 | 46.5 | 35.3 | 45.7 | 36.1 | 55.8 | 46.1 | ||
TccC3 | 100.0 | 100.0 | 65.4 | 59.5 | 66.0 | 58.4 | 54.8 | 46.0 | 56.5 | 48.1 | 45.1 | 35.4 | 46.1 | 36.1 | 56.4 | 46.6 | ||||
TccC4 | 100.0 | 100.0 | 64.8 | 57.2 | 53.6 | 44.8 | 58.8 | 49.1 | 46.3 | 36.9 | 47.3 | 37.7 | 56.6 | 45.3 | ||||||
TccC5 | 100.0 | 100.0 | 55.1 | 45.6 | 57.6 | 48.7 | 45.3 | 35.2 | 46.3 | 36.0 | 54.8 | 44.9 | ||||||||
嗜线虫致病杆菌xwiC类 | ||||||||||||||||||||
XptB1xwi | 100.0 | 100.0 | 52.6 | 41.4 | 43.3 | 32.7 | 44.3 | 33.5 | 55.2 | 46.3 | ||||||||||
伯氏致病杆菌C类 | ||||||||||||||||||||
XptC1xb | 100.0 | 100.0 | 46.4 | 35.4 | 47.4 | 36.2 | 53.0 | 43.5 | ||||||||||||
类芽孢杆菌1529株C类 | ||||||||||||||||||||
PptC1(Orf6长) | 100.0 | 100.0 | 97.6 | 97.6 | 45.1 | 34.9 | ||||||||||||||
PptC1(Orf6短) | 100.0 | 100.0 | 46.2 | 35.7 | ||||||||||||||||
嗜线虫沙雷氏菌C类 | ||||||||||||||||||||
SepC | 100.0 | 100.0 | ||||||||||||||||||
在混合和匹配测定中进行了测试? | 是 | 是 | 是 | 否 | 是 | 是 | 是 | 是 | 当前测试 | 否 | ||||||||||
起作用吗? | 是 | 否 | 是 | NA | 是 | 是 | 是 | 是 | ? | NA |
表46 | |||||||
tcdA | tcdA2 | tcdA4 | tcbA | xptA1xwi | xptA2xwi | sepA | |
%同一性 | %同一性 | %同一性 | %同一性 | %同一性 | %同一性 | %同一性 | |
Photorhabdus luminescens class A | |||||||
tcdA | 100.0 | 65.3 | 70.6 | 58.2 | 56.8 | 54.4 | 53.1 |
tcdA2 | 100.0 | 64.5 | 56.2 | 55.9 | 53.3 | 51.9 | |
tcdA4 | 100.0 | 57.8 | 55.6 | 52.5 | 51.7 | ||
tcbA | 100.0 | 56.3 | 54.0 | 52.7 | |||
嗜线虫致病杆菌xwi A类 | |||||||
xptA1xwi | 100.0 | 55.8 | 55.4 | ||||
xptA2xwi | 100.0 | 53.8 | |||||
嗜线虫沙雷氏菌A类 | |||||||
sepA | 100.0 | ||||||
在混合和匹配测试中进行了测试? | 是 | 否 | 否 | 是 | 否 | 是 | 否 |
起作用吗? | 是 | NA | NA | 是 | NA | 是 | NA |
注:tcdA3是一个假基因(不编码全长蛋白),因此不进行分析。
表47 | ||||||||
tcdB1 | tcdB2 | tcaC | xptC1xwi | xptB1xb | pptB1(Orf5) | sepB | ||
%同一性 | %同一性 | %同一性 | %同一性 | %同一性 | %同一性 | %同一性 | ||
Photorhabdus luminescens class B | ||||||||
tcdB1 | 100.0 | 74.1 | 62.3 | 44.7 | 59.7 | 52.3 | 57.6 | |
tcdB2 | 100.0 | 61.5 | 44.7 | 59.6 | 52.6 | 57.1 | ||
tcaC | 100.0 | 46.0 | 62.0 | 52.5 | 55.3 | |||
嗜线虫致病杆菌xwi B类 | ||||||||
xptC1xwi | 100.0 | 44.9 | 44.9 | 44.5 | ||||
伯氏致病杆菌B类 | ||||||||
xptB1xb | 100.0 | 52.3 | 54.7 | |||||
类芽孢杆菌1529株B类 | ||||||||
pptB1(Orf5) | 100.0 | 52.5 | |||
嗜线虫沙雷氏菌B类 | ||||||||
sepB | 100.0 | |||||||
在混合和匹配测试中进行了测试? | 是 | 是 | 是 | 是 | 是 | 是 | 否 | |
起作用吗? | 是 | 是 | 是 | 是 | 是 | 是 | NA |
表48 | ||||||||||
TccC1 | TccC2 | TccC3 | TccC4 | TccC5 | XptB1xwi | XptC1xb | PptC1(Orf6长) | PptC1(Orf6短) | SepC | |
同一性% | 同一性% | 同一性% | 同一性% | 同一性% | 同一性% | 同一性% | 同一性% | 同一性% | 同一性% | |
发光光杆状菌C类 | ||||||||||
TccC1 | 100.0 | 55.7 | 58.8 | 60.0 | 59.4 | 45.0 | 55.7 | 47.5 | 48.4 | 54.3 |
TccC2 | 100.0 | 62.6 | 62.2 | 69.9 | 43.5 | 58.1 | 51.6 | 52.4 | 55.4 | |
TccC3 | 100.0 | 65.7 | 66.4 | 44.9 | 58.4 | 51.5 | 52.4 | 56.8 | ||
TccC4 | 100.0 | 65.8 | 43.0 | 59.4 | 52.2 | 53.2 | 54.9 | |||
TccC5 | 100.0 | 43.0 | 58.5 | 50.8 | 51.7 | 56.2 | ||||
嗜线虫致病杆菌xwi C类 | ||||||||||
XptB1xwi | 100.0 | 44.6 | 43.6 | 43.2 | 44.0 | |||||
伯氏致病杆菌C类 | ||||||||||
XptC1xb | 100.0 | 49.7 | 50.6 | 54.5 | ||||||
类芽孢杆菌1529株C类 | ||||||||||
PptC1(Orf6长) | 100.0 | 97.6 | 50.6 | |||||||
PptC1(Orf6短) | 100.0 | 51.8 | ||||||||
嗜线虫沙雷氏菌C类 | ||||||||||
SepC | 100.0 | |||||||||
在混合和匹配测定中进行了测试? | 是 | 是 | 是 | 否 | 是 | 是 | 是 | 是 | 进行中 | 否 |
起作用吗? | 是 | 否 | 是 | NA | 是 | 是 | 是 | 是 | ? | NA |
表49. | ||||
DNA碱基 | 蛋白质氨基酸 | 蛋白质道尔顿 | 功能? | |
发光光杆状菌A类 | ||||
tcdA | 7548 | 2516 | 282,932 | 有 |
tcdA2 | 7497 | 2499 | 283,725 | ? |
tcdA4 | 7143 | 2381 | 270,397 | ? |
tcbA | 7512 | 2504 | 280,632 | 有 |
嗜线虫致病杆菌xwiA类 | ||||
xptA1xwi | 7569 | 2523 | 286,799 | ? |
xptA2xwi | 7614 | 2538 | 284,108 | 有 |
嗜线虫沙雷氏菌A类 | ||||
SepA | 7128 | 2376 | 262,631 | ? |
范围 | 7128-7614 | 2376-2538 | 262,631-286,799 | |
发光光杆状菌B类 | ||||
tcdB1 | 4428 | 1476 | 165,127 | 有 |
tcdB2 | 4422 | 1474 | 166,326 | 有 |
tcaC | 4455 | 1485 | 166,153 | 有 |
嗜线虫致病杆菌xwi B类 | ||||
xptC1xwi | 4479 | 1493 | 168,076 | 有 |
伯氏致病杆菌B类 | ||||
xptB1xb | 4518 | 1506 | 168,635 | 有 |
类芽孢杆菌1529株B类 | ||||
pptB1(Orf5) | 4332 | 1444 | 161,708 | 有 |
嗜线虫沙雷氏菌B类 | ||||
SepB | 4284 | 1428 | 156,539 | ? |
范围 | 4284-4518 | 1428-1506 | 156,539-168,635 | |
发光光杆状菌C类 | ||||
TccC1 | 3129 | 1043 | 111,686 | 有 |
TccC2 | 2745 | 915 | 103,398 | 无 |
TccC3 | 2880 | 960 | 107,054 | 有 |
TccC4 | 2847 | 949 | 106,563 | ? |
TccC5 | 2814 | 938 | 105,106 | 有 |
嗜线虫致病杆菌xwi C类 | ||||
XptB1xwi | 3048 | 1016 | 111,037 | 有 |
伯氏致病杆菌C类 | ||||
XptC1xb | 2886 | 962 | 107,960 | 有 |
类芽孢杆菌1529株C类 | ||||
PptC1(Orf6长) | 2859 | 953 | 109,130 | 有 |
PptC1(Orf6短) | 2790 | 930 | 106,244 | ? |
嗜线虫沙雷氏菌C类 | ||||
SepC | 2919 | 973 | 107,020 | ? |
范围 | 2745-3129 | 915-1043 | 103,398-111,686 |
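The size columns of Table 49 are internally related: the DNA base count of each coding region is three times the number of amino acids, and the molecular weight divided by the residue count gives a mean residue mass. The Python sketch below checks both for a handful of entries copied from the table. The 100-120 Da plausibility window is an assumption chosen for illustration, and the helper names are hypothetical.

```python
# (DNA bases, amino acids, daltons) for selected Table 49 entries.
TABLE_49 = {
    "tcdA": (7548, 2516, 282932),
    "xptA2xwi": (7614, 2538, 284108),
    "tcdB1": (4428, 1476, 165127),
    "xptB1xb": (4518, 1506, 168635),
    "TccC1": (3129, 1043, 111686),
    "XptC1xb": (2886, 962, 107960),
}


def codons_match(bases, amino_acids):
    """True when the base count equals three bases per encoded residue."""
    return bases == 3 * amino_acids


def mean_residue_mass(daltons, amino_acids):
    """Average mass per amino acid residue in daltons."""
    return daltons / amino_acids


def check_entry(bases, amino_acids, daltons, low=100.0, high=120.0):
    """Combine both checks; the mass window is an illustrative assumption."""
    mass = mean_residue_mass(daltons, amino_acids)
    return codons_match(bases, amino_acids) and low < mass < high


if __name__ == "__main__":
    for name, (bases, residues, daltons) in TABLE_49.items():
        print(name, codons_match(bases, residues),
              round(mean_residue_mass(daltons, residues), 1))
```

The exact three-to-one ratio indicates that the tabulated base counts exclude the stop codon.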
序列表
<110>美国陶氏益农公司
<120>混合并匹配TC蛋白质用于病虫害防治
<130>DAS-104XC1
<150>US 60/441,723
<151>2003-01-21
<160>64
<170>PatentIn版本3.2
<210>1
<211>20
<212>PRT
<213>嗜线虫致病杆菌(Xenorhabdus nematophilus)
<400>1
Met Tyr Ser Thr Ala Val Leu Leu Asn Lys Ile Ser Pro Thr Arg Asp
1 5 10 15
Gly Gln Thr Met
20
<210>2
<211>5
<212>PRT
<213>嗜线虫致病杆菌
<400>2
Met Trp Tyr Val Arg
1 5
<210>3
<211>6
<212>PRT
<213>嗜线虫致病杆菌
<400>3
Leu Thr Gln Phe Leu Arg
1 5
<210>4
<211>10
<212>PRT
<213>嗜线虫致病杆菌
<400>4
Ala Asn Pro Gln Leu Ser Gly Ala Ile Arg
1 5 10
<210>5
<211>8
<212>PRT
<213>嗜线虫致病杆菌
<400>5
Leu Leu Asp Gln Leu Ile Leu Arg
1 5
<210>6
<211>39005
<212>DNA
<213>嗜线虫致病杆菌
<400>6
gatcaggtat tcaatcaacc caaactgttt gatgaacctt tctttgttga taatcgtact 60
tttgattaca acgccattcg tggtaatgat gcacgaacaa ttaagcaact gtgcgccgga 120
ttgaaaatca ccgtagccac cttccaattg ttagctgagc aggtaaacac cgcctttcat 180
ctgccatccg gcaaattaac ctgttcactg cctgttattt cagcgcttta tcgtctggtg 240
actgttcctc ggttatttaa tttaaccgct gaacagggca tgatgctgat taacgcatta 300
aatgccagcg agaaattctc acctcatatt ctggctggtg agcctcgatt aagcctgtta 360
acaacagagg gttcagatac cacagaggtc gatttattgg atgttattct gatgttggaa 420
gaagttgctg tctggctgca acagagcaaa ctgaaaccgg aagaattctg cctgatgctg 480
caaagtgtta tgttgccggt ggttgccacg gacagcagtg tgacattctt cgacaacctg 540
ctgcaaggca ttcccaaaac cttactcaca gaagataact tcaacgcagg ggatatcccc 600
agactccctg aaggagaaac ctggtttgac aaactttcga tgctgataac cagcgatgga 660
ctcgtcaacg tttaccctct cagttggggc cagagtgatg aagattatct gaaatcagta 720
ttgacacctg tcgtcgaaaa aatcattagc gatccaaaca gtgtgattat cactgtttcc 780
gcattaacac aggtcattac tcaggcgaaa actgcgcagg aagatctggt ttccgccagc 840
gtgacacggg aatacggtac tggacgtgat atcgttcctt ggttattacg ctggattggc 900
agcagtgttc ccgatttcct tggcaaaatt tatatacaag gcgcaaccag aggcggacac 960
ttgcgcactc cgccggatat cagcgctgaa ttactgcata tcacctatca tctggcgatg 1020
aataacatgc tgattaagca gttacgactc aaagctcaaa tcatttcatt acgtatcatc 1080
atgcctgaat ggctcggatt accaacgata gatggcagtc cgctatccgt gcatgaaatt 1140
tgggcactga gccggttccg taactgggcg accagctcat tgttcagtga agacgagtta 1200
atcgagtatt ttgcttttgc caatcagccg gagcaggacg ttcgtaacga tgaagatttt 1260
aatcgggact gtgctgaaaa gcttgccgac atactggaat gggatgccga tgaaattgag 1320
ctggcaaccc gacattttga tcctgcccca gcacgtgcca gaaatatggg acaaattgac 1380
tggctgcgtc gtgtcatggc gttgtcgcgt cagactggcc tgtcagtgac accgttaatg 1440
acagccgcaa cgttaccgcc tttcccgccc tatgaccaga taacccatgt cggtgaagcg 1500
gtgattgcgg caacccagta cccatcagag gagtaaggaa cgatgagttc agttacccaa 1560
cctattgaag agcgtttact ggaatcacag cgcgacgcac tgctggattt ctatctcgga 1620
caggtcgttg cctattcacc tgacatgaca agtcagcgcg acaaaattaa ggatattgac 1680
gatgcctgcg actacctcct gctggatctg ctgacttccg ccaaagtcaa agcgacacga 1740
ctttcacttg cgaccaattc attgcagcaa tttgtgaacc gcgtgtcact gaatattgaa 1800
cccggtttgt ttatgaccgc ggaagagagc gaaaattggc aggaatttgc gaatcgttat 1860
aattactggt ctgcggatcg cttattacgg acttatccgg aaagctatct ggaacccctg 1920
ttacgcctga ataaaacaga attcttcttc caactggaaa gtgcccttaa tcagggaaaa 1980
attaccgaag attccgtaca acaagcggtg ctcggttatc tgaataattt tgaagatgtc 2040
agtaacctga aagttatcgc aggttatgaa gatggtgtta acatcaaacg cgataagttc 2100
ttctttgtcg gacgtacccg tacacagcca taccaatatt actggcgttc actgaatctt 2160
tcgatacgcc atcctgatac cgatgcgtta tctcccaatg cctggagcga gtggaaacct 2220
attgacctgc cattgggcag cgtagacccc aatttgatac gccccatttt cctgaataat 2280
cgcctgtata ttgcctggac ggaagttgaa gaacagtctg aaactaaaga tacaactgcg 2340
ttatcactgc ataaccaaaa cgttgagcct agtgcgggtg attgggttcc tcccacaccg 2400
ttcctgaccc ggatcaaaat cgcttatgcc aaatatgatg gcagctggag tacacccacc 2460
attctgcgcg aagacaatct gcaataccgg atggcccaga tggttgctgt gatggatata 2520
cagcaagacc cgcataaccc gtttctggct ctggttccgt ttgtccgtct tcaggggaca 2580
gataagaaag gtaaggatta tgattatgac gaagccttcg gttatgtctg cgatacactg 2640
ctggtagaaa ttactgattt gccggatgac gaatatgctg atggacgaaa aggaaaatat 2700
gtcggcaacc tggtctggta ttactcacgt gaacacaagg atgcagaagg caatcctatc 2760
gattaccgta ctatggtgct ctatccggca acccgggaag aacgctttcc tattgccgga 2820
gaagccaaac cggaaggaag ccctgatttt ggcaaagaca gtatcaaact gattgtcaat 2880
tttgttcatg gcactgatga cacactggag attgtcgctc aatctgactt taagtttggt 2940
gcgatagaag atcatcaata ttacaacggt tctttccggc tgatgcacga taatactgtc 3000
ttggatgaac aaccactggt actgaacgaa aaagttcctg atttaaccta tccatcaatc 3060
aagctggggt cggataatcg aatcaccctg aaagccgaac ttctctttaa gcccaaaggt 3120
ggtgttggca atgaaagtgc cagctgtact caagagttca gaatcggtat gcacattcgc 3180
gaactgatta aactcaatga acaggatcag gtgcaattcc tttccttccc cgcagatgaa 3240
actggtaacg cgccacaaaa cattcgcctt aatacactgt ttgcaaaaaa actgatcgcc 3300
attgccagtc agggtatccc gcaggtactg agctggaata cacagcttat tactgaacaa 3360
cccatacccg gttcattccc tacgccgatt gatttaaatg gcgcaaatgg gatctatttc 3420
tgggaactgt ttttccatat gccatttctg gtcgcgtggc gactgaatat cgaacaacga 3480
ttaaaagagg ccaccgaatg gctgcactat atttttaatc cgctggaaga tgaacttgtt 3540
caggccagca accaaggtaa accgcgttac tggaattcac ggccaattat tgatcctcca 3600
cccaccgtgt accggatgtt aattgaacca accgatccgg atgccattgc agccagtgaa 3660
cccattcact accggaaagc aatattccgt ttctatgtca agaatctgtt agatcaggga 3720
gacatggaat accgtaagct gacatccagt gcacgtactg tcgccaagca gatctatgac 3780
tccgtcaata tgttactggg taccagccct gatattctgc tcgcggcaaa ctggcaaccc 3840
cgtacgctgc aagatgtggc tctgtatgaa aacagtgaag cacgggcaca ggagttaatg 3900
cttactgtca gcagcgtgcc acttctgcct gtgacatatg atacatccgt ctctgccgca 3960
ccgtctgatt tatttgtcaa acctgttgat acggaatatc tcaaactgtg gcaaatgttg 4020
gatcagcgtc tatataactt acgtcataac ctgaccttgg atggtaaaga gtttccggcc 4080
ggattatacg atgaacccat cagcccgcaa gatctgctca ggcagcgtta ccagcgtgtt 4140
gtggctaatc gtatggcggg catgaaacgc cgggcaatcc cgaattatcg tttcaccccg 4200
atcatgagcc gggcaaaaga ggccgcagaa acgctgattc agtacggcag cacgttactg 4260
agtttgctgg agaaaaaaga caataccgat tttgaacact tccgtatgca gcagcaactg 4320
gggctgtaca gctttacccg caatctgcaa cagcaagcga ttgacatgca acaggcttca 4380
ttggatgcac tgaccatcag ccgacgggcc gctcaggagc gccagcaaca ctataaatcg 4440
ctctatgatg aaaacatctc catcaccgag caggaagtta tcgcattaca atcaagagcg 4500
gctgaaggtg tgatcgctgc ccagtcagcc gccactgcgg ccgctgtggc ggatatggtt 4560
cccaatattt tcggtctggc cgtcgggggg atggtctttg gcggtatgct tcgggcaatc 4620
ggtgaaggaa tacgcattga cgttgaaagt aaaaatgcca aagccaccag cctgagcgtg 4680
tcagaaaatt accgtcgccg tcagcaagaa tgggagctgc aatacaaaca ggcggatatc 4740
aacattgagg agatcgacgc acagattggt atccagcaac gccaactgaa tatcagcaca 4800
acccaactgg cacaattgga agcccagcat gagcaggatc aagtcctgct ggagtactat 4860
tcaaaccgtt ttaccaatga tgcgttatac atgtggatga tcagccaaat ctccgggctt 4920
tacctgcaag cctatgatgc ggttaattcc ctctgtttac tggccgaagc ctcctggcag 4980
tacgaaacag gtcagtatga tatgaatttc gtccaaagtg gtctctggaa tgatctttat 5040
caggggctgc tggtcggaga acatctgaaa ttagccttac aacggatgga tcaggcgtat 5100
ttgcaacata acaccagacg tctggagatc ataaaaacca tatcggtaaa atcattactg 5160
acatcatcac agtgggaaat tggcaagagt acgggttcat tcactttctt actgagcgcc 5220
gaaatgttct tgcgcgatta tccgacccac gctgatcggc gtataaaaac cgtagcgctg 5280
tcattgcccg cattgctggg gccttatgaa gatgtacggg cttcactggt acaactcagc 5340
aatacgcttt acagtactgc tgacttaaaa actatcgatt atttgcttaa ccccttggaa 5400
tacaccaaac ccgaaaacgt tttgctgaac gtacaggcta atcaaggtgt ggtgatttca 5460
acggccatgg aagacagcgg catgttcagg ctcaattttg atgatgaact tttcctgcct 5520
tttgaaggga caggcgccat ttcacagtgg aagttggaat tcggttccga tcaggatcag 5580
ctgctggagt cgctgagcga tattatcctc catctgcgtt ataccgcgcg tgatgtgagt 5640
ggcggaagta atgagttcag ccagcaggtt cgtagccgtc tgaataaaca tcaattaaaa 5700
caagacaatt ctaactgata tcaggagccg gccccggaat ataacggggc cggaagtgaa 5760
attatgtctc aaaatgttta tcgataccct tcaattaaag cgatgtctga cgccagcagc 5820
gaagtaggcg catctctggt tgcctggcag aatcaatctg gtggtcaaac ctggtatgtc 5880
atttatgata gcgcggtttt taaaaacatc ggctgggttg aacgctggca tattcccgac 5940
cgcaatattt cacctgattt accggtttat gagaatgcct ggcaatatgt ccgtgaggcg 6000
acaccggaag aaattgccga tcacggtaac cccaatacgc ctgatgtacc gccgggagaa 6060
aaaaccgagg tattgcaata tgatgcactc acagaagaaa cctatcagaa ggtgggatat 6120
aaacctgacg gcagcggaac tcctttgagt tattcttcag cacgtgttgc caagtccctg 6180
tacaacgaat atgaagttga tccggaaaat acagaaccgc tgcctaaagt ctctgcctat 6240
attactgact ggtgccagta tgatgcgcgt ttgtcgccag aaacccagga taacactgcg 6300
ctgaccagcg acgatgcccc cggccgtggt tttgatctgg aaaaaatccc gcctaccgcc 6360
tacgaccgcc tgattttcag ttttatggcc gtcaacggtg ataaaggcaa gttatccgaa 6420
cggattaatg aggttgttga cgggtggaac cggcaagcag aagccagcag tggccagatt 6480
gcccctatta cattaggcca tattgtaccc gttgatcctt atggtgattt aggcaccaca 6540
cgcaatgtcg gtctggacgc ggatcagcgc cgtgatgcca gcccgaagaa tttcttgcaa 6600
tattacaatc aggatgcagc ctccggttta ctggggggat tgcgtaatct gaaagcgcga 6660
gcaaaacagg cagggcacaa gctggaactc gcattcagta tcggcggctg gagtatgtca 6720
gggtatttct ctgtgatggc caaagatcct gagcaacgtg ctacatttgt gagtagcatc 6780
gtcgacttct tccggcgttt tcccatgttt actgcggtgg atatcgactg ggaatacccc 6840
ggcgccacag gtgaagaagg taatgaattc gacccggaac atgatggccc aaactatgtt 6900
ttgttagtga aagagctgcg tgaagcactg aacatcgcct ttggaacccg ggcccgtaaa 6960
gaaatcacga tagcctgtag cgccgtcgtt gccaaaatgg agaagtccag cttcaaagaa 7020
atcgcacctt atttagacaa tatctttgtg atgacctacg acttctttgg taccggttgg 7080
gcagaataca tcggtcacca tactaacctg tatcccccca gatatgaata tgacggcgat 7140
aaccctcctc cgcccaatcc tgatcgggac atggattact cggctgatga ggcgatccgc 7200
tttttactgt cacaaggtgt acaaccggag aaaattcacc tcggatttgc taactatgga 7260
cgttcatgtc tgggtgctga tctgacaact cgccgctata acagaacagg agagccactg 7320
ggcacgatgg aaaaaggtgc tccggaattc ttctgtctgc tgaataacca atacgatgcg 7380
gaatatgaaa ttgcacgcgg gaaaaatcag tttgaactgg tgacagacac ggaaaccgac 7440
gctgacgcac tctttaatgc tgacggtggt cactggattt cactggatac gccccgcact 7500
gtgctgcata agggaattta tgcaaccaaa atgaaattgg gcgggatctt ctcttggtca 7560
ggcgatcagg atgatggcct gttggcaaat gctgctcacg aaggtttggg ttacttacct 7620
gtacgcggaa aagagaagat tgatatggga ccgttatata acaaaggacg tctcattcag 7680
cttcctaaag taacccgtcg taaatcgtag taaataaaat tttccggtgg cctcacaggg 7740
gtcaccatat cctgctgtga aaaagcgtat ccatttaatg ctttaacgct tcaattttct 7800
cccggctcag gccggtactg gtgacaatga tgtccagact gacaccatgc cgtaataatg 7860
cgcgcgccgt ttccagcttg ccttcttccc gtccttcagc tctgccttct gttctgcctt 7920
cagccctgcc ttctgtccgg ccttgctcac gccctttttg ttcaagctgt tctgcaatag 7980
tcatcaacat ggtttcatgc tccggagatt gttcagtcag ttgatggaca aactgggcga 8040
gatccagcgt atgtccattc agtaaaatat agcttaacac aacatggcgc tgttcggcgc 8100
tattataacc ggcattcaac aacgccacta attggggaac ccactccagc atatcccggc 8160
aacggatatg tttttgtacc agctccatca aggcaatgct tttatgtgtc aggatctctt 8220
catcactgag cgcactgata tccaccaacg gcaggggctg attatacagg tgagccgcgt 8280
gttcagagag tgtaaaacaa tccagccatc gatttgagta agggtaaggc ctcacctcac 8340
catgataaaa cagcaggggg acgaccaaag ggagttcagt atgtcctttt ttcagatgcg 8400
cagccatggc tgacagcgaa taatacatca gccgccaggc cattaacgga tcaggcgtgg 8460
actggtgttc aatcaggcaa taaatgtaac cgtccccgtg ggttgtctcg acagaataca 8520
gcacatcact gtgcaactga cgtaattgcc tgtccacaaa gctgccgggt tccagtttta 8580
gtgtggttaa atcacacact gaccggatcg cttccggcag ataaagggat aaaaattccc 8640
gggcggtttc tggttgggtt aaaaaatgtt tgaataacgc gtcatggtga ggcttttttg 8700
ctttcctggc cacaatccgt ctctctgttt tatcggttat taatcgcctt tactgccaaa 8760
gctatcatct cgctgaaaaa tccacagcca atatacaaca tattatctgc tgacccaaca 8820
ctcgtccggc taatcaatcc agtatcaatg cgagttctac agtaaataca gctcttcatg 8880
gtcaggaaac cggacaaaag ttgattgaat ttcctaacca tgaattttct gttatgttaa 8940
ttattaccgt ctcacaataa taatcacatc caacagaatt tatttactat ataaataaac 9000
tatcaattat tataagaaaa ataatatgat tggcattaaa tataaaacca taaaaaagta 9060
gaattaattt ttaaaactta attgcagaaa ccagatgaaa tataaactta atttcttatc 9120
cataaataat aatgaatcaa tatttattca ataccatcag tggaaggttc ccgtttgttt 9180
taatttcaag cttataatcc cctttgcctt tagctgaatc accagacata atttgcttat 9240
tgctaaattg tttactactg tctgtaaaat aaacataact gccatgttga aacatgtagt 9300
tcacaatatc agcagcgtcc tttttactga aagtaacttt gatataatgg ccagagttaa 9360
tatctttctg actatcgcac caaggaatcc acataccacc ggtagatgaa tcatttcccg 9420
gagaaacaac cacatggtca ggtattatgg ggaataactc atttgctgac tcctgattaa 9480
ataaatccgc tttatattca caaccaaaat tgttatcaac attaataata ttacgaacat 9540
ctgacataat aatttccccc gaatatagtt taaaggtttt ttcaatttaa taacatatca 9600
aaggaactat aatactgtat atttacatcc gtcaacatta ttcacctaca gggtgacatt 9660
cctctattaa ataaaaaata agttttgatt tttaactttt gataacttat gcaccaaatc 9720
agtgaccact gccgttaact tagttttgat cctcgtcact acggttaaac ttccgactcc 9780
cagaaagcaa aaaaccccgc gagtgcgggg ctatattcaa agtgcttgag ttatttcact 9840
atgcggatag ttttgacatc aatttcaaca ctgttccagt ctttgtccac ttcaccttcg 9900
atacgaactt tgtcagttgg agtggccgtc agacccatcc agcgcttatc atcaatgtca 9960
acataaacag aaccactgtt atccctgaat tcatagagtt cgtgaccaac ctgtttaaca 10020
atgtttcctt ccagaacaac ccacgcatca tcacgaaaag attttgcttg agcaacgctg 10080
gtcaggttgg gagttggacc tttaaatcca ccctgagtat agtctgtgct gtctggggaa 10140
acgaagccac cctgctgtgc caaagcacca aaagaaaggg tactgagaat aagagtaatc 10200
agtgtttttt tcatagcttt ctctttgatt atgcgaagaa aaaccccgca tttgcgaggt 10260
tcgggtattc aataaattat gtgacattac tatcactctt gtcacgatat atcaactttt 10320
gtaattacgc aactttatta aggatttctt tttgcacaca tttatctgac tccaacgtag 10380
ccccctgaaa ccagcaagac atcctcaata aataatcttt catagataaa tattagttat 10440
tcatttttca aacagcacaa acacaattaa aaatatttaa acaattgttg agttgaattt 10500
tttcatgaaa gtttgttaaa atttaatttt taacatacgg tattcattat ttaaatccat 10560
gtattatagg gaagttcttt attttttatt gaaagaatag agcgataaat cagtatcaat 10620
ttaattaacc ataatattcc tatcagatta taataatctc cacctaaaaa ccattaatca 10680
ttaaattgac aataacttaa ggatttatat gataaaagtt aatgaactgt tagataagat 10740
aaatagaaaa aggtctggtg atactttatt attgacaaac atttcgttta tgtctttcag 10800
cgaatttcgt cataggacaa gtggaactct gacgtggcga gaaacagact ttttatatca 10860
acaggctcat caggaatcaa aacagaataa acttgaagaa ctgcgcattt tgtcccgtgc 10920
taatccacaa ctggctaata ccactaacct taatattaca ccgtcaaccc taaacaatag 10980
ttacaacagt tggttttatg gccgtgccca ccgttttgta aaaccgggat caattgcttc 11040
catattttca ccagcggctt atttaacaga attatatcgg gaagcgaaag attttcatcc 11100
tgacaattct caatatcacc tgaataaacg acgccccgac attgcttcac tggcactgac 11160
acagaataat atggatgaag aaatttccac attatcctta tctaatgaat tactgctgca 11220
taatattcag acgttagaga aaactgacta taacggtgta atgaaaatgt tgtccactta 11280
ccggcaaacc ggcatgacac cctatcatct gccgtatgag tcagcccgtc aggcaatttt 11340
attgcaagat aaaaacctca ccgcatttag ccgtaataca gacgtagcgg aattaatgga 11400
cccaacatcg ctactggcta ttaagactga tatatcgcct gaattgtatc aaatccttgt 11460
agaagaaatt acaccggaaa attcaacaga actgatgaag aaaaatttcg gtacagatga 11520
tgtactgatt tttaagagtt atgcttcttt ggctcgctac tacgatttgt cttatgatga 11580
actcagttta tttgtcaatc tctccttcgg taagaaaaat acaaatcaac agtataagaa 11640
tgagcaactg ataacattgg tcaatgacgg gaatgatacg gcaacggcaa gattgattaa 11700
gcgaacccgc aaagatttct acgattcaca tttaaactat gcagaactaa ttccaatcaa 11760
agaaaatgaa tacaaatata atttcagtgt aaaaaaaaca gaacctgacc acttggattt 11820
tcgtctccag aatggagata aagaatatat ataccaagat aaaaatttcg tccccattgc 11880
taatacccat tacagtattc ccattaaatt gacgacagag caaatcacca acggtataac 11940
actccgctta tggcgagtta aaccaaatcc gtcggatgct atcaatgcca atgcatactt 12000
taaaatgatg gagttccccg gtgatatatt cctgttaaag ctgaataaag cgattcgttt 12060
gtataaagcc acaggcatat ctccagaaga tatctggcaa gtaatagaaa gtatttatga 12120
tgacttaacc attgacagca atgtgttggg taagctgttt tatgttcaat attatatgca 12180
gcactataat attagcgtca gcgatgcgct ggtattgtgt cattcagata tcagccaata 12240
ttccactaaa caacaaccca gtcattttac aatactgttc aatacaccgc tattaaatgg 12300
ccaagagttt tctgctgata ataccaaact ggatttaacc cccggtgaat caaaaaacca 12360
tttttatttg ggaataatga aacgtgcttt cagagtgaat gatactgaac tgtatacatt 12420
atggaagctg gctaatggcg gaacaaatcc agaatttatg tgttccatcg agaacctgtc 12480
tctgctttat cgcgttcgtc tgctggcaga cattcatcat ctgacagtga atgaattatc 12540
catgttgttg tcggtttctc cctatgtgaa cacgaaaatt gccctttttt ctgatacagc 12600
attaacgcaa ttaatcagct ttctgttcca atgcacccag tggctgacaa cacagaaatg 12660
gtctgtcagt gatgtgtttc tgatgaccac ggataattac agcactgtcc ttacgccgga 12720
tattgaaaac cttatcacga cactaagtaa tggattatca acactttcac tcggtgatga 12780
cgaactgatc cgtgcagctg ccccgctgat tgctgccagc attcaaatgg attcagccaa 12840
gacagcagaa actattttgc tgtggattaa tcagataaaa ccacaaggac tgacattcga 12900
tgatttcatg attattgcgg ctaaccgtga tcgctcagag aatgaaacca gcaacatggt 12960
ggctttttgt caggtactgg ggcaactttc tctgattgtg cgcaatattg gactcagcga 13020
aaacgaactg accctgttgg tgacaaaacc ggagaaattc caatcagaaa ccacagcact 13080
gcaacatgat ctccccactt tgcaagcgct gacccgcttc catgctgtga tcatgcgttg 13140
tggaagctac gcgacagaaa tcttaacagc attggaacta ggagcgctga ctgccgaaca 13200
attggcggtg gcgttaaaat ttgatgctca ggttgtgaca caagcattgc aacagaccgg 13260
tttgggagtg aataccttta ccaactggag aactatagat gtcactctgc aatggctgga 13320
tgtcgctgct acattgggta ttaccccgga tggtgttgct gcactcataa aattaaaata 13380
tatcggtgaa ccagaaaccc cgatgccaac atttgatgat tggcaagccg ccagtacttt 13440
gttgcaggcg ggactgaaca gtcaacaatc cgaccagctt caggcatggc tggatgaagc 13500
cacgacgaca gcggccagtg cttactacat caaaaatagt gcacctcaac agattaagag 13560
ccgggatgag ttgtacagct atctgctgat tgataaccaa gtttctgccc aagtgaaaac 13620
cacccgtgtg gcagaagcca ttgccagcat tcagttatat gtcaaccggg cgttgaataa 13680
tgttgaagga aaagtatcaa agccagtgaa aacccgtcag ttcttctgcg actgggaaac 13740
ctacaatcga cggtatagca cctgggccgg cgtatctgaa ctggcctatt atccggaaaa 13800
ctatatcgac cccacgattc gtattggtca gacaggtatg atgaacaacc tgttacagca 13860
actttcccaa agtcagttaa atatcgatac cgttgaagat agctttaaaa attatctgac 13920
cgcatttgaa gatgtcgcta acttgcaggt gattagcgga tatcatgaca gtatcaatgt 13980
caatgaggga ctcacttatt taattggtta tagccagaca gaacccagaa tatattattg 14040
gcgcaatgtc gatcaccaaa agtgccagca cggtcaattt gctgccaatg cctggggaga 14100
atggaaaaaa attgaaatac ccatcaatgt atggcaggaa aatatcagac ctgttattta 14160
caagtctcgt ttgtatttac tgtggctgga acaaaaagag ctgaaaaatg aaagtgaaga 14220
tggcaagata gatatcactg attatatatt aaaactgtca catattcgtt atgatggcag 14280
ctggagctca ccgtttaatt ttaatgtgac tgataaaata gaaaacctga tcaataaaaa 14340
agccagcatt ggtatgtatt gttcttctga ttatgaaaaa gacgtcatta ttgtttattt 14400
ccatgagaaa aaagacaatt attcttttaa tagtcttcct gcaagagaag ggatgaccat 14460
taaccctgat atgacattat ccattctcac agaaaatgat ttagacgcca ttgttaagag 14520
cacattatca gaacttgata ccaggacaga atacaaagtc aacaatcaat ttgctacaga 14580
ttatttggcc gaatataagg aatctataac cacaaaaaat aaattagcca gttttaccgg 14640
aaatattttt gatctctcgt atatatcacc aggaaatggt catattaatt taacgttcaa 14700
tccttcaatg gaaattaatt tttcaaaagg caatatatat aatgatgagg ttaaatacct 14760
gttatcgatg gtagaagatg aaacggttat tttatttgat tatgatagac atgatgaaat 14820
gcttggaaaa gaagaagaag tttttcatta tggaactttg gattttatta tttccatcga 14880
tcttaaaaat gccgaatatt ttagagtgtt aatgcatcta agaaccaagg aaaaaattcc 14940
tagaaaatca gaaattggag ttggtataaa ttatgattat gaatcaaatg atgctgaatt 15000
caaacttgat actaacatag tattagattg gaaagataac acaggagtat ggcatactat 15060
atgtgaatca tttactaatg atgtttcaat cattaataac atgggaaata ttgcggcact 15120
gttccttcgc gaggatccat gtgtgtattt atgttcaata gccacagata taaaaattgc 15180
ttcatctatg atcgaacaga tccaagataa aaacattagt tttttattaa aaaatggctc 15240
tgatattcta gtggagttaa atgctgaaga ccatgtggca tctaaacctt cacacgaatc 15300
tgaccctatg gtatatgatt ttaatcaagt aaaagttgat attgaaggct atgatattcc 15360
tctggtgagc gagtttatta ttaagcaacc cgacggcggt tataacgata ttgttattga 15420
atcgccaatt catataaaac taaaatccaa agatacaagt aacgttatat cactgcataa 15480
aatgccatca ggcacacaat atatgcagat tggcccttac agaacccggt taaatacttt 15540
attttccaga aaattagctg aaagagccaa tattggtatt gataatgttt taagtatgga 15600
aacgcaaaat ttaccagagc cgcaattagg tgaagggttt tatgcgacat ttaagttgcc 15660
cccctacaat aaagaggagc atggtgatga acgttggttt aagatccata ttgggaatat 15720
tgatggcaat tctgccagac aaccttatta cgaaggaatg ttatctgata ttgaaaccac 15780
agtaacgctc tttgttccct atgctaaagg atattacata cgtgaaggtg tcagattagg 15840
ggttgggtac aaaaaaatta tctatgacaa atcctgggaa tctgctttct tttattttga 15900
tgagacgaaa aatcaattta tattcattaa tgatgccgat catgattcgg gaatgacaca 15960
acaggggata gtaaaaaata tcaaaaaata taaagggttt attcatgtcg ttgtcatgaa 16020
aaataacact gaacccatgg atttcaacgg cgccaatgca atctatttct gggaattgtt 16080
ctattacacg cccatgatgg tattccagcg cttattgcaa gagcagaatt ttaccgaatc 16140
gacacgctgg ctgcgctata tctggaaccc ggccggatat tcggttcagg gtgaaatgca 16200
ggattattac tggaacgtcc gcccattgga ggaagatacg tcctggaatg ccaatccgct 16260
ggattcggtc gatcctgacg ccgttgccca gcatgatccg atgcactata aagtggctac 16320
ctttatgaaa atgctggatt tgttgattac ccgcggagat agcgcctatc gccagcttga 16380
acgtgatacc ttaaacgaag ctaaaatgtg gtatgtacag gcgctcactt tattgggtga 16440
tgagccttat ttttcattgg ataacgattg gtcagagcca cggctggaag aagctgccag 16500
ccaaacaatg cggcatcatt atcaacataa aatgctgcaa ctgcgtcagc gcgctgcatt 16560
acccacgaaa cgtacggcaa attcgttaac cgcattgttc ctccctcaaa ttaataaaaa 16620
actgcaaggt tactggcaga cattgacgca acgcctctat aacttacgcc ataacctgac 16680
aatcgacggt cagccactgt cattatctct ctatgccacg cccgcagatc cgtccatgtt 16740
actcagtgct gccatcactg cttcacaagg cggcggcgat ttacctcatg cagtgatgcc 16800
gatgtaccgt tttccggtga ttctggaaaa tgccaagtgg ggggtaagcc agttgataca 16860
atttggcaat accctgctca gcattactga acggcaggat gcagaagcct tggctgaaat 16920
actgcaaact caaggcagtg agttagccct gcaaagtatt aaaatgcagg ataaggtcat 16980
ggctgaaatt gatgctgata aattggcgct tcaagaaagc cgtcatggtg cacagtctcg 17040
ttttgacagt ttcaatacgc tgtacgacga agatgttaac gctggtgaaa aacaagcgat 17100
ggatctttac ctctcttcat cggtcttgag caccagcggc acagccctgc atatggccgc 17160
cgccgcggca gatctcgtcc ccaatattta cggttttgct gtgggaggtt cccgttttgg 17220
ggcgcttttc aatgccagtg cgattggtat cgaaatttct gcgtcagcaa cacgtattgc 17280
cgcagacaaa atcagccaat cagaaatata ccgtcgccgt cggcaagagt gggaaattca 17340
gcgcaataat gcggaagctg agataaaaca aattgatgct caattagcga cgctggctgt 17400
acgtcgtgaa gcggcagtat tacaaaaaaa ctatctggaa actcagcagg cacaaactca 17460
ggcgcagtta gcctttctgc aaagtaaatt cagtaatgca gcgctataca actggctccg 17520
tggaaggttg tccgctattt attatcagtt ttatgatttg gcggtctcac tctgtttaat 17580
ggcagagcaa acttatcagt atgaattgaa taatgcggca gcacacttta ttaaaccagg 17640
tgcctggcat gggacttatg cgggtttatt agcgggtgaa accctgatgc tgaatttagc 17700
acagatggaa aaaagctatt tggaaaaaga tgaacgggca ctggaggtca ccagaaccgt 17760
ttctctggct gaagtgtatg ctggtctgac agaaaatagt ttcattttaa aagataaagt 17820
gactgagtta gtcaatgcag gtgaaggcag tgcaggcaca acgcttaacg gtttgaacgt 17880
cgaagggaca caactgcaag ccagcctcaa attatcggat ctgaatattg ctaccgatta 17940
tcctgacggt ttaggtaata cacgccgtat caaacaaatc agtgtgacat tacctgccct 18000
tttagggcct tatcaggatg ttcgggcaat actaagttat ggcggcagca caatgatgcc 18060
acgtggctgc aaagcgattg cgatctcaca tggcatgaat gacagtggtc aattccagat 18120
ggatttcaat gatgccaagt acctgccatt tgaagggctt cctgtggccg atacaggcac 18180
attaaccctc agttttcccg gtatcagtgg taaacagaaa agcttattgc tcagcctgag 18240
cgatatcatt ctgcatatcc gttacaccat tcgttcttga tccaaaaatt aactggacag 18300
agaccctgta cgggtctctg tccacacatc cgaaaaaccc accttgtcat ccatgacaaa 18360
gtgggaatga acatgattgt tatgcttcgg attcattatg acgtgcagag gcgttaaaga 18420
agaagttatt aaaagcccgc ttaaagccgc tccaggtaac ccggctagcg gcattggcaa 18480
cttcccctcc aacggcatga tgagcggccg cggctgtccc gccaatggct gcaccaaccc 18540
attcaccggg tgtacggcta taaggtaata atacttcaga aatatttctc ccgacacttt 18600
ctcctatcat tcggccaaac cagctcctgg aactgacagc gtgggaaatg gcagagctaa 18660
tgcctcttct gagcagtaac ctgccgataa accgataagg gccatcccat agattaccaa 18720
tgatccttcc ccatcgagca ccatacatag caccaatcgc tgcccgttca cccagctcag 18780
aacttccctg atggcggcca agtaatatgc cgccaataat tgcgcctgat agtgccccta 18840
accgctctgg cgcgctgaca ttaccgggcc tgagcgtatc cagcgtacct tgtccggcgg 18900
gtgtggcaat actgatagcc atgcccgtgt tatgctctcc ggctaaagcc attaatcctc 18960
caacggtgac cgctgttgct gcggaaatgg cggtacctgt cgaagagctg ttaaatagtg 19020
cagacgtcac aagcgatgtg acaacaaaag cgccaacctg aacaggaaca gaacgtttac 19080
gcgtcagata acttaaaact tccccaattt tttctgagat gttgttcgcg aaaaacccca 19140
tcaccgcccc ggagacaaaa ccaccaatgg cagccccgac aatcccccaa ggcgacgctc 19200
ctgcaatcgt ggccgccttc acccccagac ttgctacccc cacacccaaa acaaacgttc 19260
gcaatcctcg gtttaatttc aagaacgtat caaaggaagc gccttgttca agcaggtgtt 19320
ctgtcgtgat gttgactgcc tttcgatacg cttttttccc tatccaggca aggacaccct 19380
gaccggggaa acgaccatca gaatcagaaa aaacgatggg gttattcctg cacattcgga 19440
acaaattgag accatcgacc tcaccggcag gatctacact caaccatcgc cctgtccacg 19500
attgataata acgataaccg tagtaataca accctgttgc atcccgctct ttgccagaat 19560
aacgcacggt tttgtaatca gcttctgact gacttcgggc tgcccacacg gcggttcccc 19620
cataggggta atattcttcc tgactaatga tctgcccgtc actgtccaat tccagcccgc 19680
tactgccaat caggttgcca taactgtagc gcagctgatc attgctgata tccgccggtt 19740
tgcctgtttc ccaatgcagc acccgcactt gtgcctgacc cgattcaccg acagtgatga 19800
cctgcaaaaa ctcttttaat gtattgccgc tatatgtcgt gcgccattcc agctctggca 19860
aatataatgt tcgctgtatt tgctcactgt tacctgtctt ctgaatatga gtcttaatga 19920
cacgctgact gtctgcatca taacggtaga attcctgatc aggcgtcgta ttttccctat 19980
tgaccaatat cacttgttgc aattcgtcac ggggtgtcca gaaaagatcc tgaccgggaa 20040
caagccgggt ctgatgcccg ccgggggtga acaacatatc cacctgagtg ggatcttgcg 20100
ccagctcttc cagtacagcc cggttgctgt gatctgaaac ggtcatgttc gttgtatagt 20160
tattaccggt gatcggtgaa ttatggcgaa ttctggtcag atttccccca cgatcatagt 20220
cgtaagtgcg agagtaattc gtataagtat tgttatcaat cagagcgggg atgggtaact 20280
ggtttttttg tcggccaata ttcgccattt cacgcccagt gacggaaacc agctggtaca 20340
ggctgtcata ggtgtaagta ttttccggta caattttctg gttgcgccaa aagcgggtaa 20400
tttcagcatc attagttgat ttcagcacat ttccgacagg atcatattca taacgcaggt 20460
tttgtaaaat tttctcccca gcggcatgac cggaaggacg ttctgttttt atgccaataa 20520
ctcgttgcgt ctcgggttca taggtatatg tagtcactat cccgttacca tgttcctccc 20580
gtagcttctg gctggcagcc gaataggtca gggatttcac gataacttgt tcttgtttcc 20640
ccttcagcgc caaccaactg ccttgaagca gaccggccac atcataggcg atacgttgct 20700
tgtttccggc agcatctgta ctcgttaata ccgtgccggt agcatccgtt gtgctgacag 20760
aagtgaagct ttccggcgcc agcgcgtttt tccagccaga ttcatccata ccgtgccaat 20820
cggcttcgct gtcatctttc agtaattgct gtgtgatgga caagggtatg ctggttaacg 20880
atatgctgtt ggtttgattc attccggtgg gatcataatg gaccacgcac tggccggcca 20940
gattattgcc tttttctgcc ggcgtatttc ctgaccagat caatcgctcc gtgatacagg 21000
cgttctctcc ttttacctgc tcggtaatcg ttagcaatcg tcccggaagg ttatcacttt 21060
catactgaaa cgttcggcta acgccattgg cgctgacagc taaaacggga cgcccggcaa 21120
catcatgcag ggcgacacgg gttccggcat ccacactttg cgtacgcaat gccttcttac 21180
tgagtgatga caagagaata agattgggtg taatggcgtt cttgtcactc gctgtctgct 21240
ggcgttcata aaatcgcgga tcaatactct gagtcagaga tccttgagca tcatattgat 21300
aaccggtgat gcgttcatcg gttacctgag gtgtatcggg gtgccgatac caggctattt 21360
cgcgtactgt ctgaccacgg ttgtccagta cggtgacgga tggcgtattg ctgtgaacga 21420
aattcttcat gattcattcc taaatggagt gatgtctgtt cagtgaacag gcatcactga 21480
gctttatgct gtcatttcac cggcagtgtc attttcatct tcattcacca caaaccaggg 21540
agtgaataag gatcgacgaa acccgccttt ggccgtgata acctgatatt cacgccccaa 21600
cggatcatag taatgggtat cggcatatat atcctgccgg gcactgtcat cactgacgta 21660
ctgccaacta ttcaggaaat acggttgata cttacgcagg gcttggcctt ttccgtcata 21720
ttctgtacgt ccggaaactg cccaacggaa atctgtcatc gccgtttcag gcgcgccatg 21780
attttcagcc acaatggctc catactcatc acgtacccag gcttcaccac tttcatggcg 21840
tacggctgtt tgtaaggttc gcccaaaacc atcactaaac gtaaacgttt gacgtaattg 21900
ttgttccgga tcggcatcat agcggtcggt gatcacactc agtacatggg gtgggttctg 21960
tgaattgact tgctttggca tggcagcggc agggttattt tgttgccagc ggcgaaaagc 22020
aagcgacagg agataaccat cttcagtgat gatcccagcc ggtttcagct ctccataaag 22080
ctccccatca ttagaaaagc tggcctgaac catccagctc agaggggcat aaaccatcag 22140
ccctgcaaca ggtataccgg gtttcaatgc cagagcatca tccaccgttg tggggacaat 22200
aaaggggaca gtttcatttt ccgcaggggt atatccttgt ttttcaccgt tttcagtccc 22260
ccagaaacgg aagctggtta ccctccccag tgcatcaaac gtcacggtgt gatagttatc 22320
attgacatct gtggtgttat ccgcaaccat aaatcgataa tcgtaatgcg cttgcatacg 22380
caggccagcc gcatcctctg ttgcggtgat aacacagtaa tggctatccc acgtgactgt 22440
cgttttacct gtaagcttgg tttcccgttg caccaatggc cgatagaatc cgtctgcacc 22500
ggcatattct gtaaattcct tttgtcccac ccagacatgg aaatctgtct tttcactgaa 22560
cggcactttt gccgtattcc agcccgcatc attcagctgt tttgtcagct cctgctcatc 22620
catcacctcc tcaaaagccg ccaacgatcg ttcatcaaac tctgcggttt caatgtatgc 22680
caccagcgga ggaatagcgg gttgttcttc tggaccggta tatgctacac gctgatgtcc 22740
cagataatcg gctgcggcat caggcaacaa caatgctcct gcacctgtgg cagaaaacca 22800
ttcaagggaa aatccaccgt ccggcacttt atcggcttga taaatacgtg cgtcactgcg 22860
tgaggtatcc ataagccctg tgatccacgt attatcatca tgattcagat gatgataaga 22920
agaacgctgg cgtgtcagac gaaggaacat ctgctgttcg tcgaaactgc tggtgaaaag 22980
tgtttcgggc agggtatccg gataaggcga gaactcaggc tgtggacgtc tcgaataggc 23040
aatctcaaga ttgtcctgcg gaaatcctaa cgcatcagat ttaaggacga tcttttggct 23100
gcactgtgga tcggtagcaa cccgttcata tcggtattgg cgggattcgg ccaccgaaac 23160
cagtaccgca ggcacgtccg ataccatcac cggtaacaaa cgtacttggg tgcgggattc 23220
atccactgaa taaggcgtac cggccagtat agaatcatca tccccataca gctcactgcg 23280
taaacgttgt ccttttaagg ctcgatgtaa ccagtattct tcctgttcgc tcggcgtgac 23340
cgtcatatca ccaccggatt tttcgtcata acgggtaaag cgtggggtaa aatggggaaa 23400
tgcctgttga tccccctgcc aatattccgt gggcagaaga atatcgactt cccgtacgcc 23460
agtgccgtac caattaaccg tgcgcgaagg tgccggtggt tcagcatgtg tcccctgtgt 23520
cgcactcgcc cgtgaatcaa tatcagtttg tgtcacccgc ccaaaaccac gaaactcccg 23580
ttccagacca tcccaggcac catgtgagta atgataatgg ctggtcaatc ggttaccgga 23640
aatttcatcc agcacttccg tgcgccacaa cacatgcacc gggaacggta agtagctgac 23700
caccgtcatc ccggattcag aagcctgtaa tttctcatcc agccagaact gggcagagct 23760
gcgataatac agcgtggttt ctgttcccat attgttattg acggcattca gcagccaagg 23820
cttgaatatg gtcatatcca atcgccagtg ctgcaccttc atatggggga tcgtcaaaat 23880
aatgctggca gtccctaatc cttgtgtatc cgctatttgt aaccgacaag tatcatcaaa 23940
acgtacccca tccggcagat caatacgctg aggttcagca aaatgattgc cgctttcatt 24000
ggcatagagt tcaaggtaag tattgcgggc ataaataaaa tcggtggtgc ctgagccatc 24060
tatgtctacc atatacagtc tgtcggggtt aaacgtttcc ccgctaatct ggaagcctgt 24120
catcatcaga ggctcaccaa attttccatg ccccaggttc ggccagtagc gcacgctatc 24180
tgccgttact tccaccagat gtgattgccc ggagcctgtc atatcactga atgcgacaag 24240
atgacgctca tttctgccgg gaaccggcag tggcatatct gacaaatgaa tcacatcctg 24300
agcgcgatcc catcctgccc gattatttga ccagacacgt acactatttg gcccgataag 24360
cgctaagtca ggcagcccag ccccatcaat atcagccagt tttgcctgcg gatggaaata 24420
ttccattggc acagcggata atggaataaa gggtgtccat tcaccttccg gtgacatggt 24480
gtggtagccc cgtaaccctg atgccgtaat cacccaatcc agacgcccgt caccattgat 24540
gtccaacaac atcgcgcttt cctgttgtgc cggaatatgt ggcagtggtt tggcctcctc 24600
ataggtaacc gcattcgttc cttcggcagt gatatcccgt accggagcac ggtaccacca 24660
ggctttctga gtatcctgat aaagtacgcc ggaaattcct tctccatata aatcaaccaa 24720
ttggtatggc tgcaacgtgt tcattttttc taactgcggc atggactgcc agttcagatt 24780
cacgccatga ttaacacgtt gataatccat ttccagcggg gacatcatca ctggcgtacc 24840
gtccgtttca tgggccagtc tgcgggccgt ttgcagcaag gaaaccttgt tgttcaggtc 24900
ataatccaga ataagacggg aaaccagcgc cggtgtttct tctgcaacct tttcccctgc 24960
cagcgctttc agctgatgaa acatcagaac ttggcgacac aagcgacggg ttcgaatttc 25020
aaacccatat tcatagcggg agaaactgtc cggacgacaa cgccattttt caggcacatt 25080
gttttcagac acattgtttt ctgacacatt gaattcgggt acagagttca gcgaagatga 25140
gcgctcaccg taatcaaata ccagatgaaa cagccagtca ttatcagcag gaatacctga 25200
ttttaccgcg aaaaaagcgg tttccggctg agtattgcca tagctgactt ttgccagata 25260
acgctgggcc gtaacacctg aatgctgagc aagttcatgc tcatcacagt caagatcgtc 25320
ttctgcccga tagtgatagt aaatatgttc cccggtatgc gtgacggttt cctccatcag 25380
ccagcgggca attctggttt catcctgcgg gtcagcaata cgtgcatggt gatgcttacc 25440
gaataggtgc actaaaccat ccgcagtaaa aagtacccaa aaagacgtct cttcctcacg 25500
tctctgctgt ggctgccagt gttctaaacg aacgattttt tctgccacgc gggactgata 25560
gcgggtaaca gtatgcggct gtgtcagaac cgtccccaac agtgaggttg cggtgcgttg 25620
ctctggttgc ccttggctgt ccggcacaat actcaacact tccccatccg gcccgagata 25680
ctcatcttgt cccgtatagt gcggaacgcc cttggcggta cgcaggctga taaaaccaac 25740
cccacattgc caccccatcc cgaatgaccc attgccggca gtactgctgt aattcagtga 25800
tagcaccggc accagaccac gcccgacaga gatcggcaag ggcagtgaaa atgacgctcc 25860
cccttccgct ccgacggcat tgagtgcttc tcccattcct tttagtgatc cgcccccaga 25920
gggcaatgac ggtatttcaa gtttcaaagg tgttgaaccc tgcataaaaa ctccttaaac 25980
aggctccctc aggagcctgc ctatcacaat gttttaatta agaacgaatg gtatagcgga 26040
tatgcagaat gatatcgctc aggctctcca gcagcgcttt ctgccgatca gtcgcatccg 26100
ggaaactcaa cgtcaggctg ccgctgtcat tcacggaaat accttcaaac ggcagataac 26160
gggaatcgtt gaaatccagc ataaattgac cactgtcatt cacgccgtgg gagagagcaa 26220
tagcactgca accgcgtggc atgacgatgc tgcccccgta attcagcacc gcccgaatat 26280
cttcatacgg cccaaccagc gccggcaagg tgacactcac ctgtttcaac tgacgggtat 26340
tgccaaggct ttcggggtag tcgctgaaaa ttttcaaatc agacaatcgc actgaggctt 26400
ctatctgacg gttactgagt tttaattcat tgccggaagc tcctacgttg cctttccctt 26460
cacgcaggaa ttgcgtgagt ttttcggtca gattaaagtt gtctgatgat aaggcctgat 26520
agaactgtgc caacgagacg gtacgggtca cttccagtgc ccgctcatca cgctccagcc 26580
agactttttc catttctgcc agattcagca gcaacgtttc acccgccatc aaacccgcag 26640
tcgtaccgtt ccaggcccca ccccggataa aggtaacacc gttgtcggtc agctcgcggc 26700
gcagcgcttc ctgtgccatc aggcagaagg actgggtcag gtcaaagaac tggtaataga 26760
tagcactcag cttgccgcgc atccaactgt aaagcgcttt gtttgtgaat ttacgctgta 26820
acagctctaa ctgagcctga gtatgggcct gctgggtctc ctgatattcc acctgcatct 26880
gtgctgcttc gcggcggatt ttcaggcttt ccaactgggc atccatttgt ttgacttcac 26940
cgtcagcatt atcacgctga atttcccact cctgacggcg gcggcggtag gcttccgaac 27000
ggctgatttt gtctgcggaa tattgggaag ctgtggcaga aagcgacatc acggaggcgg 27060
aagcacgcag tgctgccccc caacgactgc cgccacaagc taaaccgaac acgtttggca 27120
ctaaatcggc caccccttcc gctattgaaa gcacctgccc ggccagagac tgacctgccg 27180
ctgcatcaag cagtgacatt gcccgctgtt ctccgtggtt gatatcctcg tcatacagct 27240
gctggtattt ttccagacga ttttgtgcac tgcggcggct ctctgccaat acagcaatat 27300
cagcatccac ttcatcgaca gttcgttgct gaatacggat gctctgtgtc gccagttcca 27360
taccctgctg tagtagcagc gtggtgagtt catcggcatc atcatgctct gccatactga 27420
gcagagaggt gccgaactgg gttaattgcg ctaccagatt gcgggtccgc tccagcatca 27480
ccgggaagcg gtataacgac aatgtgccgg gcagcactgc actaccgccc tgagaggcct 27540
gtaccatact ggtgagcagc gctttcggat cggtaggctc ggcgtaaatc gccagcgata 27600
acggctgtcc gtcaatggaa agattatggc gcaggttaaa caggcgcaaa cgcagggttt 27660
gccagtaatc ggtgagcgcc gggttatatt ccggcaggaa caaacccacc aacgagttag 27720
cggtacggag attcttggaa accccaccac ggcccagcat cgtaagatcc tgctgataag 27780
ccgcctgcac ggtttgactc gccgccccgg aaagggacgg tgctgcccac tgttggctac 27840
cgtaatcctc cggctcatca ccgagcaatt ctaaagtacg cacataccac attttggctt 27900
cattcaacgc atcgcgggtc agttctcgat aggccatatc gccgcgcaga ataagttgat 27960
ccaacaggcg cataaaggtg gcaatcttgt agtgcattgg gtcattttgg gcgacggcat 28020
ccggatcgat ggcatccagc ggattggcat tccaggaggt ggtctcttcc agcggccggc 28080
agttccagat ccagggggcg atttctccgt taacgatata gccggcggga ttgtagacgt 28140
agtttatcca ttgtgtggct tcgtcgaatt gtttttcctg tagcaaacgc tggaagcaca 28200
tcatcggggt gtaatagaac aattcccagt aatagagggc gctggcacta ttgaaatcca 28260
tcggggcgga atagcccgtt gcgatagaaa cattcaaaaa tcctttgtat ttcttgatat 28320
ttttcacgat cccctgttgc gtcattcctg aatcatgatc agcatcgtta attaatacaa 28380
attgctgttt tgtctcatca aaataaaaga aagcagattc ccaagtgttg tcataggtaa 28440
ttttctggta tccaaccccc aatctgacac cttcatgcat gtaataccct tcggcataag 28500
ggacaaacag tgtcatactg gtttccgacg tatcggataa cattccgctg taataaggct 28560
gccttcccgt gttaccgcca acattcccaa tatggatttt aaaccaccgc tcatcgccat 28620
gttcagcagg gtcatattta ggcagaacaa agttggcaaa gaagccttct cccaacggag 28680
gttccggtaa ccgctgggtt tccattgtca ggatagtatc aatgcccgtg tttgctctgg 28740
ataccagttg agaagccagc agggtattaa gacgaatacg atacaccccg agctgcatat 28800
attgggcacc cgaatgagtt tcacgcagaa acagaatatc ttccggatta taatttaccc 28860
gtttcaccga taatgtttgc ttgatcttac ccagcactcg cccgtctttg gctttggtct 28920
caaaaacgat atccagagga gcaatattat tggtaaaggc caacgatgaa gcatcgattt 28980
ccagtggctt aaaggtgtac ggcatagcat caaaactgtt tgccggcaag gaagcaatat 29040
ggtcactggc cgtaaaggtg tgggttttac tgccagccat caccgtaatt ttgatatcgg 29100
tattgttaat gcctgtatca atatccagcc agccggatga ttgataggaa ctaaatatct 29160
ggaagccctg agaattatta ccatcaacag cataactgca ctttttaaaa tcactggttt 29220
tattgctacc aaccgtgaaa acggtgttta aaattgtgtt tggagaaaat ggaaactcga 29280
acaatctggc ataataatta ttttcaactg gtgttagaat caaacgccta gtgtaatctg 29340
cgttcatcaa gtggccttga actgatgcaa tatagttttt cgttttatta taaacggtga 29400
tcggcccccc cagatcagag taaccgccat aatgtttaac ggtatttgcg atgataaatg 29460
cattgccgta ctgggacttt ccatccaccc ccgtcagttt catggcgctg atttgtttgt 29520
ttctgatgac attgccactg ccatcatatc tgacagtgaa agcggcgtta tgtagcgtaa 29580
tagcaaggtt atcgctggag tatttactgg ttatctgcgg aatattcccg ttctccatca 29640
ccgtcagact atcatcaccg atggcagaac ccatattcaa cgaggcaggc acttcaaaat 29700
cctgcgcgaa acgatagctg gcctttctta ccaagtcgtt gccttgagta tgaatgatat 29760
caaaggtatt tttcagttgg ctgtaacggc tgagtgctgt gttctccatc tttttgaagg 29820
agccatcgcc gtaaatggtc atgcctgcca catttttatt gctgccgcca aaatccgagt 29880
aactcttccc ggttttgtag acaaacacca gcagagtgtc ctcgccctga aagcctgatg 29940
cggccagcgc cagccgttca gtgtcaggtt ttttgtcagt gaccgcctcc acctgcgttg 30000
tgatatcgta agaccagggg gcactccaac tgccatcatg acgcagaaac gccagtttca 30060
gagtaaaacg gtcataggtt tccaccggat cagtaccatt tttcgccact tcctcttttt 30120
ctacccagat aaggtgcaaa cgttccctga atatgaccgg acgtattgca tccttgtagg 30180
ggttgaccgc tgtatcaatc ttcgtccact ctttccaggc attggcggcc agttcacccg 30240
cctgcatccg tgatatatcc acgttacgcc agtagtattc cggcaggttc tcccgcgttt 30300
ggccgacaaa ccaggtcagt ccggtgttgc tgttgacgtt gtcgtgatag gcgctgacaa 30360
ctttcagatc cgccacggtt tcaaagcggg tcaggtaagt tttaaaggca tcctccactg 30420
tgtcccggct aagtttactc tggctgatat tttccagcag ttcatccatc atccgggtct 30480
gcccgatacg ctgggttggg tcaatgtaat tttccggata ataaaccagc cgcgacaccc 30540
cgccccaggt gctgtaacgg ttattcaccg tccagtcggt aaaaaactgg cgggttgaca 30600
catcggcacg ggcattaggc tctatccgat tcagcgcccg gttgatgtag agctgaatac 30660
cggcaatggc ctctgccagt cgggtggttt ttatggcaga agagacctga ttatcaatca 30720
ggaaatagct gtacaggtca tcccggctgt gcagggacac cccttctggc tggatattcg 30780
ccagaaacca attgcacagc acgctactca ggcgctccgc ggtataatcc gccagcgtct 30840
gagcctgttg tgtactgagt ccggcttcca tattttctgc cagtgtctgc cactcatccc 30900
aggaaggcag attcgactcg gctttgttta atgcagtcac gtaacggata ttcaccagcg 30960
tacggataac cgacggcatc gtgtgcagtg ctgatgccac atctatccac tgcaacacgg 31020
tgttgatatc ctgccaacac tgaagctggt tcacgccggc ggaaaccatg gcctgcgtta 31080
ccatactgat gtccagcccc atcacggagg ccagtctgtc ggccgtgagt gtctgctggc 31140
gcagcatatc cagcgtgtca gagccgggat tgcccagccc attaatccac tggtggaatc 31200
ggtagagtga gaatagcgta tcaatattgt gctgtccggc aggttgattt tttgccccca 31260
gcacggcgaa tccggagatg accagcacgg atagctccgc ttcactgagg cgcagtgtct 31320
gtacggaaag cgataactgt gccatcacat ggcagaattg taccaattgg gtggtttcat 31380
tggcatttaa cgactctttc aataccagtg tcataaaccc ggcaatatct aagccacccg 31440
gccgcaggtt atcggtccac aacaggatat accgtgccat atccggtgac gccagatgca 31500
gcgttgcagc aataaacggc gcgagaattt cagcctgcag ctcccgattg tgactctgtg 31560
ccatatcttc actaatactc ggtcggaggt tattgagcag attactgatt tccggtgaaa 31620
tattcccgct aaactctggc gtacataata accagatcgc ttcagtggtg atttccgcct 31680
cagtcagcca ctgcgtcacc tgatacagcc agataaccag ccgtggcaac tccccggaag 31740
acaaagaagc cgttgttttg ccattgaacg gcgaaagacc ataaagcata cacagttcat 31800
tgaccgtcag ctgatggaca cgggccagta acgtgaggcg atacagtgaa gagataacga 31860
agacagaaag tgtgatggta ttttgggcgt ccagcacacc cgccagtttg cctaactgat 31920
acagttcacc actgttgacc cccagaccac gcatcagggc tgaacgggca aaggtagatt 31980
gctcttcatc cggatcaatg ctgaccgtgt tgccgtcggc ttcaaagatt ttccctttca 32040
gcggcggtgt attaaagaga cggttaaaat gactgacact gtcatcgtcg gcatattgat 32100
taatgaccga tccgttcagt acctgtgcat catcaaagct cagtgcataa cggtgactgt 32160
agaacagagt atagaaaact ttggtcagaa cggagtcgtt gatgatgcct tgtgcattgt 32220
cactgcgtac gatagtttgc agttcattcg gtgaaagccc gctagtcagg cacaagcgaa 32280
tggctttatt cagtttgagc gcaaatatag tcaggggata agactcaaaa gtgaatattc 32340
cgccgccctg atttgtggcg ctggtggaag acgtatagcg ataggcgtat atctttacac 32400
cgtttttgta ttcagaatca gatatgttac ttagataatt acttttaaaa ttcgtattgg 32460
ctattagagg accggaaagg ctgccgacaa tgccacttgg ccctgcgttt tttctaagag 32520
tagccccaaa ttctcttgat accttaaaat tagcacgtat aaagaactga ttatttcctt 32580
catacatcaa atcaaagtaa tttatatttt tatcataatc atctgttttt acacgtgtta 32640
ttttgtaagc ttcgagttta ctttcattat tgaccactaa acccgttgag atattatcca 32700
cataagcaga ggtgctgtca gaatagccat tctgcaacat cccgaggtat ttttgcacct 32760
cagaaagttc aagaccataa tacttggcta tccatgattg tgacgcgaaa ttttcgggcg 32820
tgatattttc actgaagttt tgcgcaaata aagcatcagc gttcttttcc gtaatctctt 32880
cggtcaaaat gttataaagc tccggagaaa tattggccag aatcgccagt aatgaagccc 32940
cttccgcctg ccccatcacc tcaggattac gggacagcgc tgacagtgta ctgtcatggg 33000
tcataatgac ctgacggata gtctcgtaag gctgatggta aggggtatca atggcctgac 33060
ggtaagttga caggctctcc atcaatgcgt ccgaatcacc tccggtcttg cgggtaatat 33120
gctccagcaa cagttcgtta gacagtgtca gggtggaaat ttctgtatcc atattactct 33180
ggctcagagt cagatcagcc agatccggac ggcgattatc aagatgataa gcagagcttg 33240
aaaaatgtaa gtccttcgct tcacgataca attcggtgag atagccagcc ggtgaaaaca 33300
tggaagccac tgaacccggt ttcacaaagg aagaagaacg ggcaccaaac atttcatcat 33360
aactgcgtga aacgctgtct cgttcaatac cgagtcggat agcaccggat aattgtgggt 33420
tggcacgggt aaaaatacgc gcttccagca agcgattatt ttttttctgc tctatagttt 33480
catgatagag atggcgagcc tctccccaac tgagctggtc atcaaagatt tttctcagtt 33540
cactgaagga taaatattgc agatccgcaa gagtcatcgt ctgaccgtcg cgagtgggac 33600
tgattttatt gagtaataca gccgtgctat acataataac ctcaattatt ttataaaata 33660
gtgttgtcag ttaagagttc atctgaagat ttagtgctta ttttgtaagt cattattcta 33720
ttcacattgt aattatttgt tttatctgag attaatgata ttaaagagga tgctattgta 33780
aatggcggaa tagaatacga ttttctactg aaatttcatt ttaatcataa aatttataac 33840
tgactttaat gttacagtcg taatcgatat tgtgtcatgt tggcatcctc ttcatctgcc 33900
ttaaaataaa gtagggtacc aaaaggaata catacttgaa tccaagattg agcacaaatc 33960
cacatattca gcttattaaa gataattaaa ttttatttat cataaataaa taggataacg 34020
gccctggatt ctgaccaggc gaggccaaaa gtcgatgaag ctaagttacg gttgaacaaa 34080
tttgtttact ggttaaatgg gcacaaactc tttatataaa taaatagcat attgtaacga 34140
gtaattaaaa aatgaaaatc cagcttacct ggttattcat tcattaacaa aacacaaaat 34200
atttatgcca acggcactta gaattaaata attttcttta tcaactttta cgttaacttc 34260
atttgataaa agtaagatcc catgattttt caagatcctt attcggttat aactgaccag 34320
attgggaaaa tcaaccttaa tgtctcatgt gaaataaaat attgtccaag tgatttattg 34380
ttttgtatta taattcagtc tcttttatca acatctaact taagtcctca agagaaatta 34440
attgcaaatc ggtcaccata accggctaat aatgtattga tctcatattc cattgtttcc 34500
tgagtccagg tgataaaacg tcgccagtgg tatttagcct ccctccagac gatttcaatc 34560
aaattcagtt ctgggctgta ggcgggaagg taaagtaaaa gcaggttgtg ctctcgtaac 34620
cagcgatttt taagtttttc atcgatccca tgatggatag gcgcattatc taacacgaca 34680
aatgtcaggc gatgctcgcc ttgttgggcg acctgctcta aaaaatcaat gacattactt 34740
cgcgtgacac tgcttgatat tatctgataa aacagcctgt tatcagtgta atttagcgca 34800
cctagcactg accgtctgac agagctttgc ggctctgctt catggggctt acctcgtgga 34860
ctccatccat attggaccgg tgggcaggcg gaaaaacccg cctcatcaag atagagcagc 34920
cgataatgac ctgcccgtgc gcccgcctta attttattca gtaaggcggc tttttcagca 34980
aattccgttt tattgcgttt tttttaagcg acaggcgggg gcgtttatag gggagtccct 35040
gttttttcag ggtattcgcc agcgtttcaa gcgtacaggg cagggaaccc tgcctggctt 35100
cgacgcacgt cagggactct gcgctggcgg cttcgagcgc agtggcaatc atgtcaggcg 35160
tcatggcgag ataccggcct ccggcatgac cccctaataa tcccgctatc cctgaatggt 35220
gccacatgtg aacccaatta tagataaccc ggagactgca tccgatttca gcggtgatct 35280
gggacggctt gatccctctg gcaagcatga gcaaacccgt tcctcgcgta cgaatgtccc 35340
ggtgtgggtg attcaaagcg agtggttgca atgtgattcg ttcaggctca gaaaggatta 35400
tcttcgagtt cataagaaca ggcagaaagt caggttatcg tgttatcgat tataacagta 35460
atgcagataa tttatctgat taacttaatt tatttttgca ataaacgttt ttagcaccat 35520
gaaaaataat aagaaagaat cctgattatg ttgagagagt atacaaaagt ataaaaatgg 35580
cgaattaaat caccattctg atagtgacaa ttattccctt acttttatag tataattttt 35640
attgaactct ttccctgcgt acattgtacc caaagtaaat cctaccactt caatttttat 35700
caattctgtc ttctttgggg tacctgttat atttatatga tgataatctt ttcttatttc 35760
tttttttcca ccctatagga aagaactatc ttttggattc catgttagtc ctgagccagt 35820
aggatgaatt tttacccaaa cgcttttttc attcaatgca cctcctgtga tggtaattga 35880
tgcctgatat ggttcattta tcactgcatc aggaagaaac tctgattttg gtaaaacttt 35940
tggcgttgga ttaccacaac cataaagtaa aaaaataaaa ataattaata tttttttcat 36000
gttatttcat tggtttaaaa accaaccctc aaaaaatatg gttgagtaca taatctagtg 36060
atttgttttt tttaatttga tgttccactt taccccatga aaacagattt aaatttacat 36120
attgattcag atctgaatta tttgtgatat tttctttctc atagttttct actactcctt 36180
cccaaactat ccaatgattt tttcctgatg tttctatgtc accaaaatct gataacattc 36240
ctgctgaaat caaagtaaca acatgatatc ctttgttata gtaatcacta agagttacta 36300
tgtcatttat attagaatgg gataagccga cattactaaa tactttttca taccctgatt 36360
tttcaaacca ttctgtcaat tttccccaca ttgtaatacc agctacttca tcatcaacct 36420
catcataact catcatcata ttttctgaat ctcttaggct tgccaatgtc agccaatcta 36480
acccagatat tctttcacca tattgattat aaaaagtacc tttaggatgt cggcaaccct 36540
cacccagctt aatttccagt tgaccaattt tagttcggcc atattgccat aattctcgtg 36600
cagcttgctc ataaatatct ggtctatcta ttggcaggca ataaaaaaaa gcggcagggc 36660
cacataaact tgctccattt tgatccggat aagttctttt cgataccctg tcctgtattt 36720
atgattcaat tttacttttt tcaaatggat cgtgtgggtg accaatggga tattctttgg 36780
caataaaagt ccgttcagga atggtgattt ttaattcgac ggtattatcc ccgtcgctga 36840
caggttttgt ttcaactgta ctttcaaaat aacaggcatt ttcccgtttt ttctccgaac 36900
aggctttacc ggaagttaat tcaccgaggg tatacatttt cttaaaatgg gtttgcagtg 36960
ccggaaatat atttgagcca tgtttgcggt gaagcataat ttgcataata taaatcacag 37020
aaagatcgtc gtctatttcc tgcgtgatat gctgttgtgt ttcataatta ctctgtagtt 37080
tctcctgaaa aacatcaagc aagatatgga aatttccttt tgcattctga cgataaatgg 37140
ttaagtcata aagcaggttt tcaacaggta cgttggattc tatttttgtc tgaacggtat 37200
aggctggcat ggtattaatc ctttaaaata tgaaattcaa gtttattttt gtcatccgta 37260
agatgccatt gggtgtaacc ttgtttatca gtctttcctt cttttatctg accatccggc 37320
aggcaaaccc gatacttgcg ttcggttaaa agattgccgt catcatccac acaacgataa 37380
cgggcatgat gtttagcggg ttttaccggg gtttcttcaa ccagcggctt cttaagtggc 37440
ttaacattcg agaccccaga taaaagcggc ttgccttctt cacttccggc taattgcgta 37500
atgctgtagc ccgaagtccc tgcggcgatg tggttgcatg aaatggactt ataatcgagc 37560
atcaccactt catggggtat cgcaccatta tcattgattg aatgcggata attataggaa 37620
atatccacaa tcgtggcact ggtcagcttg atttcataaa agaactccag ttgtcccatc 37680
tggcttgtcc tgtaatgcac aaaactggca tccagcaaac aggggagagg atttatcaat 37740
gggtttcaca aaactgacgg gttgatgatt aacattctgg tcacggctca tcgaatgatt 37800
caggctcaat acctgtattt gatcacacag gccagctggt ttgcccgttc gttaagctgg 37860
cggtaggtca gtgttgcgcc ttcaaacacc agtgccccgt tatccggtgt cctttccacc 37920
tgagcttcaa acagttgtgg cagggttttg tcctgtggat aaggcgcatc ggtctggttc 37980
caggtatgca gcagggtatg gcgctcctgt gcggacagaa tatccagcgc ggacagcggt 38040
tgtttctggt ctgccacaaa ggcttccagt acccgttgat agctctctgt cagcctgacg 38100
atggtggttt cattaaacag gctgactgcg taattcaggc aaccggtaat ctcggtttgt 38160
ccgtcggaca taaacaggct gaggtcaaac ttggcggggc tgtatagcgg ctcatccaga 38220
gtcaccggcc tgaatggcag gcggttgtct gacggatttt ctccaaagct ctgtaaacca 38280
aacatgatct gaaaaatcgg gtggcgggcg gtatcacgtt caatattcag ggcatcaagg 38340
agctgttcaa acggcatatc ctgataggcc ttggcttcgg caacctgttt atgggtctgc 38400
tcaatcaggg cttccacgct gacagtctgt tgcaactgtg cccttaagac cagtgaattg 38460
acaaacatcc caatcagggg ctgagtctgg gcatggtggc ggttatcggt tggcgtcccc 38520
agtacgatat cgttttgccc ggataatttt gccagcgtga cataaaaggc actgagcaac 38580
acggtataca gggtggtttc ctgtgttttt gccagactcc ttaactgttc agatagccgg 38640
gtattcagcc caaaactgaa attacatccc tgataattca cctgagccgg tctggggtaa 38700
tcggttggca aggccagtga ttcatagttg gctaaagcct gttgccagta agcgagttgg 38760
cgttcgcgcc ggtccccttg taaatagttg cgttgccatg cggcataatc gccataggtg 38820
atatccagcg ctgcaagctg gctgtcgcgg ttttcccgca aggactggta aatttccgcc 38880
agttcagcca taaagatatc aattgaccag ccatcaatgg cgatatggtg ccataacaat 38940
aataaatagt ggctgtcaga aaccggatag tgacacaggc gcagactggg ttctgtggtc 39000
agatc 39005
<210>7
<211>1533
<212>DNA
<213>Xenorhabdus nematophilus
<400>7
gatcaggtat tcaatcaacc caaactgttt gatgaacctt tctttgttga taatcgtact 60
tttgattaca acgccattcg tggtaatgat gcacgaacaa ttaagcaact gtgcgccgga 120
ttgaaaatca ccgtagccac cttccaattg ttagctgagc aggtaaacac cgcctttcat 180
ctgccatccg gcaaattaac ctgttcactg cctgttattt cagcgcttta tcgtctggtg 240
actgttcctc ggttatttaa tttaaccgct gaacagggca tgatgctgat taacgcatta 300
aatgccagcg agaaattctc acctcatatt ctggctggtg agcctcgatt aagcctgtta 360
acaacagagg gttcagatac cacagaggtc gatttattgg atgttattct gatgttggaa 420
gaagttgctg tctggctgca acagagcaaa ctgaaaccgg aagaattctg cctgatgctg 480
caaagtgtta tgttgccggt ggttgccacg gacagcagtg tgacattctt cgacaacctg 540
ctgcaaggca ttcccaaaac cttactcaca gaagataact tcaacgcagg ggatatcccc 600
agactccctg aaggagaaac ctggtttgac aaactttcga tgctgataac cagcgatgga 660
ctcgtcaacg tttaccctct cagttggggc cagagtgatg aagattatct gaaatcagta 720
ttgacacctg tcgtcgaaaa aatcattagc gatccaaaca gtgtgattat cactgtttcc 780
gcattaacac aggtcattac tcaggcgaaa actgcgcagg aagatctggt ttccgccagc 840
gtgacacggg aatacggtac tggacgtgat atcgttcctt ggttattacg ctggattggc 900
agcagtgttc ccgatttcct tggcaaaatt tatatacaag gcgcaaccag aggcggacac 960
ttgcgcactc cgccggatat cagcgctgaa ttactgcata tcacctatca tctggcgatg 1020
aataacatgc tgattaagca gttacgactc aaagctcaaa tcatttcatt acgtatcatc 1080
atgcctgaat ggctcggatt accaacgata gatggcagtc cgctatccgt gcatgaaatt 1140
tgggcactga gccggttccg taactgggcg accagctcat tgttcagtga agacgagtta 1200
atcgagtatt ttgcttttgc caatcagccg gagcaggacg ttcgtaacga tgaagatttt 1260
aatcgggact gtgctgaaaa gcttgccgac atactggaat gggatgccga tgaaattgag 1320
ctggcaaccc gacattttga tcctgcccca gcacgtgcca gaaatatggg acaaattgac 1380
tggctgcgtc gtgtcatggc gttgtcgcgt cagactggcc tgtcagtgac accgttaatg 1440
acagccgcaa cgttaccgcc tttcccgccc tatgaccaga taacccatgt cggtgaagcg 1500
gtgattgcgg caacccagta cccatcagag gag 1533
<210>8
<211>511
<212>PRT
<213>Xenorhabdus nematophilus
<400>8
Asp Gln Val Phe Asn Gln Pro Lys Leu Phe Asp Glu Pro Phe Phe Val
1 5 10 15
Asp Asn Arg Thr Phe Asp Tyr Asn Ala Ile Arg Gly Asn Asp Ala Arg
20 25 30
Thr Ile Lys Gln Leu Cys Ala Gly Leu Lys Ile Thr Val Ala Thr Phe
35 40 45
Gln Leu Leu Ala Glu Gln Val Asn Thr Ala Phe His Leu Pro Ser Gly
50 55 60
Lys Leu Thr Cys Ser Leu Pro Val Ile Ser Ala Leu Tyr Arg Leu Val
65 70 75 80
Thr Val Pro Arg Leu Phe Asn Leu Thr Ala Glu Gln Gly Met Met Leu
85 90 95
Ile Asn Ala Leu Asn Ala Ser Glu Lys Phe Ser Pro His Ile Leu Ala
100 105 110
Gly Glu Pro Arg Leu Ser Leu Leu Thr Thr Glu Gly Ser Asp Thr Thr
115 120 125
Glu Val Asp Leu Leu Asp Val Ile Leu Met Leu Glu Glu Val Ala Val
130 135 140
Trp Leu Gln Gln Ser Lys Leu Lys Pro Glu Glu Phe Cys Leu Met Leu
145 150 155 160
Gln Ser Val Met Leu Pro Val Val Ala Thr Asp Ser Ser Val Thr Phe
165 170 175
Phe Asp Asn Leu Leu Gln Gly Ile Pro Lys Thr Leu Leu Thr Glu Asp
180 185 190
Asn Phe Asn Ala Gly Asp Ile Pro Arg Leu Pro Glu Gly Glu Thr Trp
195 200 205
Phe Asp Lys Leu Ser Met Leu Ile Thr Ser Asp Gly Leu Val Asn Val
210 215 220
Tyr Pro Leu Ser Trp Gly Gln Ser Asp Glu Asp Tyr Leu Lys Ser Val
225 230 235 240
Leu Thr Pro Val Val Glu Lys Ile Ile Ser Asp Pro Asn Ser Val Ile
245 250 255
Ile Thr Val Ser Ala Leu Thr Gln Val Ile Thr Gln Ala Lys Thr Ala
260 265 270
Gln Glu Asp Leu Val Ser Ala Ser Val Thr Arg Glu Tyr Gly Thr Gly
275 280 285
Arg Asp Ile Val Pro Trp Leu Leu Arg Trp Ile Gly Ser Ser Val Pro
290 295 300
Asp Phe Leu Gly Lys Ile Tyr Ile Gln Gly Ala Thr Arg Gly Gly His
305 310 315 320
Leu Arg Thr Pro Pro Asp Ile Ser Ala Glu Leu Leu His Ile Thr Tyr
325 330 335
His Leu Ala Met Asn Asn Met Leu Ile Lys Gln Leu Arg Leu Lys Ala
340 345 350
Gln Ile Ile Ser Leu Arg Ile Ile Met Pro Glu Trp Leu Gly Leu Pro
355 360 365
Thr Ile Asp Gly Ser Pro Leu Ser Val His Glu Ile Trp Ala Leu Ser
370 375 380
Arg Phe Arg Asn Trp Ala Thr Ser Ser Leu Phe Ser Glu Asp Glu Leu
385 390 395 400
Ile Glu Tyr Phe Ala Phe Ala Asn Gln Pro Glu Gln Asp Val Arg Asn
405 410 415
Asp Glu Asp Phe Asn Arg Asp Cys Ala Glu Lys Leu Ala Asp Ile Leu
420 425 430
Glu Trp Asp Ala Asp Glu Ile Glu Leu Ala Thr Arg His Phe Asp Pro
435 440 445
Ala Pro Ala Arg Ala Arg Asn Met Gly Gln Ile Asp Trp Leu Arg Arg
450 455 460
Val Met Ala Leu Ser Arg Gln Thr Gly Leu Ser Val Thr Pro Leu Met
465 470 475 480
Thr Ala Ala Thr Leu Pro Pro Phe Pro Pro Tyr Asp Gln Ile Thr His
485 490 495
Val Gly Glu Ala Val Ile Ala Ala Thr Gln Tyr Pro Ser Glu Glu
500 505 510
<210>9
<211>4173
<212>DNA
<213>Xenorhabdus nematophilus
<400>9
atgagttcag ttacccaacc tattgaagag cgtttactgg aatcacagcg cgacgcactg 60
ctggatttct atctcggaca ggtcgttgcc tattcacctg acatgacaag tcagcgcgac 120
aaaattaagg atattgacga tgcctgcgac tacctcctgc tggatctgct gacttccgcc 180
aaagtcaaag cgacacgact ttcacttgcg accaattcat tgcagcaatt tgtgaaccgc 240
gtgtcactga atattgaacc cggtttgttt atgaccgcgg aagagagcga aaattggcag 300
gaatttgcga atcgttataa ttactggtct gcggatcgct tattacggac ttatccggaa 360
agctatctgg aacccctgtt acgcctgaat aaaacagaat tcttcttcca actggaaagt 420
gcccttaatc agggaaaaat taccgaagat tccgtacaac aagcggtgct cggttatctg 480
aataattttg aagatgtcag taacctgaaa gttatcgcag gttatgaaga tggtgttaac 540
atcaaacgcg ataagttctt ctttgtcgga cgtacccgta cacagccata ccaatattac 600
tggcgttcac tgaatctttc gatacgccat cctgataccg atgcgttatc tcccaatgcc 660
tggagcgagt ggaaacctat tgacctgcca ttgggcagcg tagaccccaa tttgatacgc 720
cccattttcc tgaataatcg cctgtatatt gcctggacgg aagttgaaga acagtctgaa 780
actaaagata caactgcgtt atcactgcat aaccaaaacg ttgagcctag tgcgggtgat 840
tgggttcctc ccacaccgtt cctgacccgg atcaaaatcg cttatgccaa atatgatggc 900
agctggagta cacccaccat tctgcgcgaa gacaatctgc aataccggat ggcccagatg 960
gttgctgtga tggatataca gcaagacccg cataacccgt ttctggctct ggttccgttt 1020
gtccgtcttc aggggacaga taagaaaggt aaggattatg attatgacga agccttcggt 1080
tatgtctgcg atacactgct ggtagaaatt actgatttgc cggatgacga atatgctgat 1140
ggacgaaaag gaaaatatgt cggcaacctg gtctggtatt actcacgtga acacaaggat 1200
gcagaaggca atcctatcga ttaccgtact atggtgctct atccggcaac ccgggaagaa 1260
cgctttccta ttgccggaga agccaaaccg gaaggaagcc ctgattttgg caaagacagt 1320
atcaaactga ttgtcaattt tgttcatggc actgatgaca cactggagat tgtcgctcaa 1380
tctgacttta agtttggtgc gatagaagat catcaatatt acaacggttc tttccggctg 1440
atgcacgata atactgtctt ggatgaacaa ccactggtac tgaacgaaaa agttcctgat 1500
ttaacctatc catcaatcaa gctggggtcg gataatcgaa tcaccctgaa agccgaactt 1560
ctctttaagc ccaaaggtgg tgttggcaat gaaagtgcca gctgtactca agagttcaga 1620
atcggtatgc acattcgcga actgattaaa ctcaatgaac aggatcaggt gcaattcctt 1680
tccttccccg cagatgaaac tggtaacgcg ccacaaaaca ttcgccttaa tacactgttt 1740
gcaaaaaaac tgatcgccat tgccagtcag ggtatcccgc aggtactgag ctggaataca 1800
cagcttatta ctgaacaacc catacccggt tcattcccta cgccgattga tttaaatggc 1860
gcaaatggga tctatttctg ggaactgttt ttccatatgc catttctggt cgcgtggcga 1920
ctgaatatcg aacaacgatt aaaagaggcc accgaatggc tgcactatat ttttaatccg 1980
ctggaagatg aacttgttca ggccagcaac caaggtaaac cgcgttactg gaattcacgg 2040
ccaattattg atcctccacc caccgtgtac cggatgttaa ttgaaccaac cgatccggat 2100
gccattgcag ccagtgaacc cattcactac cggaaagcaa tattccgttt ctatgtcaag 2160
aatctgttag atcagggaga catggaatac cgtaagctga catccagtgc acgtactgtc 2220
gccaagcaga tctatgactc cgtcaatatg ttactgggta ccagccctga tattctgctc 2280
gcggcaaact ggcaaccccg tacgctgcaa gatgtggctc tgtatgaaaa cagtgaagca 2340
cgggcacagg agttaatgct tactgtcagc agcgtgccac ttctgcctgt gacatatgat 2400
acatccgtct ctgccgcacc gtctgattta tttgtcaaac ctgttgatac ggaatatctc 2460
aaactgtggc aaatgttgga tcagcgtcta tataacttac gtcataacct gaccttggat 2520
ggtaaagagt ttccggccgg attatacgat gaacccatca gcccgcaaga tctgctcagg 2580
cagcgttacc agcgtgttgt ggctaatcgt atggcgggca tgaaacgccg ggcaatcccg 2640
aattatcgtt tcaccccgat catgagccgg gcaaaagagg ccgcagaaac gctgattcag 2700
tacggcagca cgttactgag tttgctggag aaaaaagaca ataccgattt tgaacacttc 2760
cgtatgcagc agcaactggg gctgtacagc tttacccgca atctgcaaca gcaagcgatt 2820
gacatgcaac aggcttcatt ggatgcactg accatcagcc gacgggccgc tcaggagcgc 2880
cagcaacact ataaatcgct ctatgatgaa aacatctcca tcaccgagca ggaagttatc 2940
gcattacaat caagagcggc tgaaggtgtg atcgctgccc agtcagccgc cactgcggcc 3000
gctgtggcgg atatggttcc caatattttc ggtctggccg tcggggggat ggtctttggc 3060
ggtatgcttc gggcaatcgg tgaaggaata cgcattgacg ttgaaagtaa aaatgccaaa 3120
gccaccagcc tgagcgtgtc agaaaattac cgtcgccgtc agcaagaatg ggagctgcaa 3180
tacaaacagg cggatatcaa cattgaggag atcgacgcac agattggtat ccagcaacgc 3240
caactgaata tcagcacaac ccaactggca caattggaag cccagcatga gcaggatcaa 3300
gtcctgctgg agtactattc aaaccgtttt accaatgatg cgttatacat gtggatgatc 3360
agccaaatct ccgggcttta cctgcaagcc tatgatgcgg ttaattccct ctgtttactg 3420
gccgaagcct cctggcagta cgaaacaggt cagtatgata tgaatttcgt ccaaagtggt 3480
ctctggaatg atctttatca ggggctgctg gtcggagaac atctgaaatt agccttacaa 3540
cggatggatc aggcgtattt gcaacataac accagacgtc tggagatcat aaaaaccata 3600
tcggtaaaat cattactgac atcatcacag tgggaaattg gcaagagtac gggttcattc 3660
actttcttac tgagcgccga aatgttcttg cgcgattatc cgacccacgc tgatcggcgt 3720
ataaaaaccg tagcgctgtc attgcccgca ttgctggggc cttatgaaga tgtacgggct 3780
tcactggtac aactcagcaa tacgctttac agtactgctg acttaaaaac tatcgattat 3840
ttgcttaacc ccttggaata caccaaaccc gaaaacgttt tgctgaacgt acaggctaat 3900
caaggtgtgg tgatttcaac ggccatggaa gacagcggca tgttcaggct caattttgat 3960
gatgaacttt tcctgccttt tgaagggaca ggcgccattt cacagtggaa gttggaattc 4020
ggttccgatc aggatcagct gctggagtcg ctgagcgata ttatcctcca tctgcgttat 4080
accgcgcgtg atgtgagtgg cggaagtaat gagttcagcc agcaggttcg tagccgtctg 4140
aataaacatc aattaaaaca agacaattct aac 4173
<210>10
<211>1391
<212>PRT
<213>Xenorhabdus nematophilus
<400>10
Met Ser Ser Val Thr Gln Pro Ile Glu Glu Arg Leu Leu Glu Ser Gln
1 5 10 15
Arg Asp Ala Leu Leu Asp Phe Tyr Leu Gly Gln Val Val Ala Tyr Ser
20 25 30
Pro Asp Met Thr Ser Gln Arg Asp Lys Ile Lys Asp Ile Asp Asp Ala
35 40 45
Cys Asp Tyr Leu Leu Leu Asp Leu Leu Thr Ser Ala Lys Val Lys Ala
50 55 60
Thr Arg Leu Ser Leu Ala Thr Asn Ser Leu Gln Gln Phe Val Asn Arg
65 70 75 80
Val Ser Leu Asn Ile Glu Pro Gly Leu Phe Met Thr Ala Glu Glu Ser
85 90 95
Glu Asn Trp Gln Glu Phe Ala Asn Arg Tyr Asn Tyr Trp Ser Ala Asp
100 105 110
Arg Leu Leu Arg Thr Tyr Pro Glu Ser Tyr Leu Glu Pro Leu Leu Arg
115 120 125
Leu Asn Lys Thr Glu Phe Phe Phe Gln Leu Glu Ser Ala Leu Asn Gln
130 135 140
Gly Lys Ile Thr Glu Asp Ser Val Gln Gln Ala Val Leu Gly Tyr Leu
145 150 155 160
Asn Asn Phe Glu Asp Val Ser Asn Leu Lys Val Ile Ala Gly Tyr Glu
165 170 175
Asp Gly Val Asn Ile Lys Arg Asp Lys Phe Phe Phe Val Gly Arg Thr
180 185 190
Arg Thr Gln Pro Tyr Gln Tyr Tyr Trp Arg Ser Leu Asn Leu Ser Ile
195 200 205
Arg His Pro Asp Thr Asp Ala Leu Ser Pro Asn Ala Trp Ser Glu Trp
210 215 220
Lys Pro Ile Asp Leu Pro Leu Gly Ser Val Asp Pro Asn Leu Ile Arg
225 230 235 240
Pro Ile Phe Leu Asn Asn Arg Leu Tyr Ile Ala Trp Thr Glu Val Glu
245 250 255
Glu Gln Ser Glu Thr Lys Asp Thr Thr Ala Leu Ser Leu His Asn Gln
260 265 270
Asn Val Glu Pro Ser Ala Gly Asp Trp Val Pro Pro Thr Pro Phe Leu
275 280 285
Thr Arg Ile Lys Ile Ala Tyr Ala Lys Tyr Asp Gly Ser Trp Ser Thr
290 295 300
Pro Thr Ile Leu Arg Glu Asp Asn Leu Gln Tyr Arg Met Ala Gln Met
305 310 315 320
Val Ala Val Met Asp Ile Gln Gln Asp Pro His Asn Pro Phe Leu Ala
325 330 335
Leu Val Pro Phe Val Arg Leu Gln Gly Thr Asp Lys Lys Gly Lys Asp
340 345 350
Tyr Asp Tyr Asp Glu Ala Phe Gly Tyr Val Cys Asp Thr Leu Leu Val
355 360 365
Glu Ile Thr Asp Leu Pro Asp Asp Glu Tyr Ala Asp Gly Arg Lys Gly
370 375 380
Lys Tyr Val Gly Asn Leu Val Trp Tyr Tyr Ser Arg Glu His Lys Asp
385 390 395 400
Ala Glu Gly Asn Pro Ile Asp Tyr Arg Thr Met Val Leu Tyr Pro Ala
405 410 415
Thr Arg Glu Glu Arg Phe Pro Ile Ala Gly Glu Ala Lys Pro Glu Gly
420 425 430
Ser Pro Asp Phe Gly Lys Asp Ser Ile Lys Leu Ile Val Asn Phe Val
435 440 445
His Gly Thr Asp Asp Thr Leu Glu Ile Val Ala Gln Ser Asp Phe Lys
450 455 460
Phe Gly Ala Ile Glu Asp His Gln Tyr Tyr Asn Gly Ser Phe Arg Leu
465 470 475 480
Met His Asp Asn Thr Val Leu Asp Glu Gln Pro Leu Val Leu Asn Glu
485 490 495
Lys Val Pro Asp Leu Thr Tyr Pro Ser Ile Lys Leu Gly Ser Asp Asn
500 505 510
Arg Ile Thr Leu Lys Ala Glu Leu Leu Phe Lys Pro Lys Gly Gly Val
515 520 525
Gly Asn Glu Ser Ala Ser Cys Thr Gln Glu Phe Arg Ile Gly Met His
530 535 540
Ile Arg Glu Leu Ile Lys Leu Asn Glu Gln Asp Gln Val Gln Phe Leu
545 550 555 560
Ser Phe Pro Ala Asp Glu Thr Gly Asn Ala Pro Gln Asn Ile Arg Leu
565 570 575
Asn Thr Leu Phe Ala Lys Lys Leu Ile Ala Ile Ala Ser Gln Gly Ile
580 585 590
Pro Gln Val Leu Ser Trp Asn Thr Gln Leu Ile Thr Glu Gln Pro Ile
595 600 605
Pro Gly Ser Phe Pro Thr Pro Ile Asp Leu Asn Gly Ala Asn Gly Ile
610 615 620
Tyr Phe Trp Glu Leu Phe Phe His Met Pro Phe Leu Val Ala Trp Arg
625 630 635 640
Leu Asn Ile Glu Gln Arg Leu Lys Glu Ala Thr Glu Trp Leu His Tyr
645 650 655
Ile Phe Asn Pro Leu Glu Asp Glu Leu Val Gln Ala Ser Asn Gln Gly
660 665 670
Lys Pro Arg Tyr Trp Asn Ser Arg Pro Ile Ile Asp Pro Pro Pro Thr
675 680 685
Val Tyr Arg Met Leu Ile Glu Pro Thr Asp Pro Asp Ala Ile Ala Ala
690 695 700
Ser Glu Pro Ile His Tyr Arg Lys Ala Ile Phe Arg Phe Tyr Val Lys
705 710 715 720
Asn Leu Leu Asp Gln Gly Asp Met Glu Tyr Arg Lys Leu Thr Ser Ser
725 730 735
Ala Arg Thr Val Ala Lys Gln Ile Tyr Asp Ser Val Asn Met Leu Leu
740 745 750
Gly Thr Ser Pro Asp Ile Leu Leu Ala Ala Asn Trp Gln Pro Arg Thr
755 760 765
Leu Gln Asp Val Ala Leu Tyr Glu Asn Ser Glu Ala Arg Ala Gln Glu
770 775 780
Leu Met Leu Thr Val Ser Ser Val Pro Leu Leu Pro Val Thr Tyr Asp
785 790 795 800
Thr Ser Val Ser Ala Ala Pro Ser Asp Leu Phe Val Lys Pro Val Asp
805 810 815
Thr Glu Tyr Leu Lys Leu Trp Gln Met Leu Asp Gln Arg Leu Tyr Asn
820 825 830
Leu Arg His Asn Leu Thr Leu Asp Gly Lys Glu Phe Pro Ala Gly Leu
835 840 845
Tyr Asp Glu Pro Ile Ser Pro Gln Asp Leu Leu Arg Gln Arg Tyr Gln
850 855 860
Arg Val Val Ala Asn Arg Met Ala Gly Met Lys Arg Arg Ala Ile Pro
865 870 875 880
Asn Tyr Arg Phe Thr Pro Ile Met Ser Arg Ala Lys Glu Ala Ala Glu
885 890 895
Thr Leu Ile Gln Tyr Gly Ser Thr Leu Leu Ser Leu Leu Glu Lys Lys
900 905 910
Asp Asn Thr Asp Phe Glu His Phe Arg Met Gln Gln Gln Leu Gly Leu
915 920 925
Tyr Ser Phe Thr Arg Asn Leu Gln Gln Gln Ala Ile Asp Met Gln Gln
930 935 940
Ala Ser Leu Asp Ala Leu Thr Ile Ser Arg Arg Ala Ala Gln Glu Arg
945 950 955 960
Gln Gln His Tyr Lys Ser Leu Tyr Asp Glu Asn Ile Ser Ile Thr Glu
965 970 975
Gln Glu Val Ile Ala Leu Gln Ser Arg Ala Ala Glu Gly Val Ile Ala
980 985 990
Ala Gln Ser Ala Ala Thr Ala Ala Ala Val Ala Asp Met Val Pro Asn
995 1000 1005
Ile Phe Gly Leu Ala Val Gly Gly Met Val Phe Gly Gly Met Leu
1010 1015 1020
Arg Ala Ile Gly Glu Gly Ile Arg Ile Asp Val Glu Ser Lys Asn
1025 1030 1035
Ala Lys Ala Thr Ser Leu Ser Val Ser Glu Asn Tyr Arg Arg Arg
1040 1045 1050
Gln Gln Glu Trp Glu Leu Gln Tyr Lys Gln Ala Asp Ile Asn Ile
1055 1060 1065
Glu Glu Ile Asp Ala Gln Ile Gly Ile Gln Gln Arg Gln Leu Asn
1070 1075 1080
Ile Ser Thr Thr Gln Leu Ala Gln Leu Glu Ala Gln His Glu Gln
1085 1090 1095
Asp Gln Val Leu Leu Glu Tyr Tyr Ser Asn Arg Phe Thr Asn Asp
1100 1105 1110
Ala Leu Tyr Met Trp Met Ile Ser Gln Ile Ser Gly Leu Tyr Leu
1115 1120 1125
Gln Ala Tyr Asp Ala Val Asn Ser Leu Cys Leu Leu Ala Glu Ala
1130 1135 1140
Ser Trp Gln Tyr Glu Thr Gly Gln Tyr Asp Met Asn Phe Val Gln
1145 1150 1155
Ser Gly Leu Trp Asn Asp Leu Tyr Gln Gly Leu Leu Val Gly Glu
1160 1165 1170
His Leu Lys Leu Ala Leu Gln Arg Met Asp Gln Ala Tyr Leu Gln
1175 1180 1185
His Asn Thr Arg Arg Leu Glu Ile Ile Lys Thr Ile Ser Val Lys
1190 1195 1200
Ser Leu Leu Thr Ser Ser Gln Trp Glu Ile Gly Lys Ser Thr Gly
1205 1210 1215
Ser Phe Thr Phe Leu Leu Ser Ala Glu Met Phe Leu Arg Asp Tyr
1220 1225 1230
Pro Thr His Ala Asp Arg Arg Ile Lys Thr Val Ala Leu Ser Leu
1235 1240 1245
Pro Ala Leu Leu Gly Pro Tyr Glu Asp Val Arg Ala Ser Leu Val
1250 1255 1260
Gln Leu Ser Asn Thr Leu Tyr Ser Thr Ala Asp Leu Lys Thr Ile
1265 1270 1275
Asp Tyr Leu Leu Asn Pro Leu Glu Tyr Thr Lys Pro Glu Asn Val
1280 1285 1290
Leu Leu Asn Val Gln Ala Asn Gln Gly Val Val Ile Ser Thr Ala
1295 1300 1305
Met Glu Asp Ser Gly Met Phe Arg Leu Asn Phe Asp Asp Glu Leu
1310 1315 1320
Phe Leu Pro Phe Glu Gly Thr Gly Ala Ile Ser Gln Trp Lys Leu
1325 1330 1335
Glu Phe Gly Ser Asp Gln Asp Gln Leu Leu Glu Ser Leu Ser Asp
1340 1345 1350
Ile Ile Leu His Leu Arg Tyr Thr Ala Arg Asp Val Ser Gly Gly
1355 1360 1365
Ser Asn Glu Phe Ser Gln Gln Val Arg Ser Arg Leu Asn Lys His
1370 1375 1380
Gln Leu Lys Gln Asp Asn Ser Asn
1385 1390
<210>11
<211>1944
<212>DNA
<213>Xenorhabdus nematophilus
<400>11
atgtctcaaa atgtttatcg atacccttca attaaagcga tgtctgacgc cagcagcgaa 60
gtaggcgcat ctctggttgc ctggcagaat caatctggtg gtcaaacctg gtatgtcatt 120
tatgatagcg cggtttttaa aaacatcggc tgggttgaac gctggcatat tcccgaccgc 180
aatatttcac ctgatttacc ggtttatgag aatgcctggc aatatgtccg tgaggcgaca 240
ccggaagaaa ttgccgatca cggtaacccc aatacgcctg atgtaccgcc gggagaaaaa 300
accgaggtat tgcaatatga tgcactcaca gaagaaacct atcagaaggt gggatataaa 360
cctgacggca gcggaactcc tttgagttat tcttcagcac gtgttgccaa gtccctgtac 420
aacgaatatg aagttgatcc ggaaaataca gaaccgctgc ctaaagtctc tgcctatatt 480
actgactggt gccagtatga tgcgcgtttg tcgccagaaa cccaggataa cactgcgctg 540
accagcgacg atgcccccgg ccgtggtttt gatctggaaa aaatcccgcc taccgcctac 600
gaccgcctga ttttcagttt tatggccgtc aacggtgata aaggcaagtt atccgaacgg 660
attaatgagg ttgttgacgg gtggaaccgg caagcagaag ccagcagtgg ccagattgcc 720
cctattacat taggccatat tgtacccgtt gatccttatg gtgatttagg caccacacgc 780
aatgtcggtc tggacgcgga tcagcgccgt gatgccagcc cgaagaattt cttgcaatat 840
tacaatcagg atgcagcctc cggtttactg gggggattgc gtaatctgaa agcgcgagca 900
aaacaggcag ggcacaagct ggaactcgca ttcagtatcg gcggctggag tatgtcaggg 960
tatttctctg tgatggccaa agatcctgag caacgtgcta catttgtgag tagcatcgtc 1020
gacttcttcc ggcgttttcc catgtttact gcggtggata tcgactggga ataccccggc 1080
gccacaggtg aagaaggtaa tgaattcgac ccggaacatg atggcccaaa ctatgttttg 1140
ttagtgaaag agctgcgtga agcactgaac atcgcctttg gaacccgggc ccgtaaagaa 1200
atcacgatag cctgtagcgc cgtcgttgcc aaaatggaga agtccagctt caaagaaatc 1260
gcaccttatt tagacaatat ctttgtgatg acctacgact tctttggtac cggttgggca 1320
gaatacatcg gtcaccatac taacctgtat ccccccagat atgaatatga cggcgataac 1380
cctcctccgc ccaatcctga tcgggacatg gattactcgg ctgatgaggc gatccgcttt 1440
ttactgtcac aaggtgtaca accggagaaa attcacctcg gatttgctaa ctatggacgt 1500
tcatgtctgg gtgctgatct gacaactcgc cgctataaca gaacaggaga gccactgggc 1560
acgatggaaa aaggtgctcc ggaattcttc tgtctgctga ataaccaata cgatgcggaa 1620
tatgaaattg cacgcgggaa aaatcagttt gaactggtga cagacacgga aaccgacgct 1680
gacgcactct ttaatgctga cggtggtcac tggatttcac tggatacgcc ccgcactgtg 1740
ctgcataagg gaatttatgc aaccaaaatg aaattgggcg ggatcttctc ttggtcaggc 1800
gatcaggatg atggcctgtt ggcaaatgct gctcacgaag gtttgggtta cttacctgta 1860
cgcggaaaag agaagattga tatgggaccg ttatataaca aaggacgtct cattcagctt 1920
cctaaagtaa cccgtcgtaa atcg 1944
<210>12
<211>648
<212>PRT
<213>Xenorhabdus nematophilus
<400>12
Met Ser Gln Asn Val Tyr Arg Tyr Pro Ser Ile Lys Ala Met Ser Asp
1 5 10 15
Ala Ser Ser Glu Val Gly Ala Ser Leu Val Ala Trp Gln Asn Gln Ser
20 25 30
Gly Gly Gln Thr Trp Tyr Val Ile Tyr Asp Ser Ala Val Phe Lys Asn
35 40 45
Ile Gly Trp Val Glu Arg Trp His Ile Pro Asp Arg Asn Ile Ser Pro
50 55 60
Asp Leu Pro Val Tyr Glu Asn Ala Trp Gln Tyr Val Arg Glu Ala Thr
65 70 75 80
Pro Glu Glu Ile Ala Asp His Gly Asn Pro Asn Thr Pro Asp Val Pro
85 90 95
Pro Gly Glu Lys Thr Glu Val Leu Gln Tyr Asp Ala Leu Thr Glu Glu
100 105 110
Thr Tyr Gln Lys Val Gly Tyr Lys Pro Asp Gly Ser Gly Thr Pro Leu
115 120 125
Ser Tyr Ser Ser Ala Arg Val Ala Lys Ser Leu Tyr Asn Glu Tyr Glu
130 135 140
Val Asp Pro Glu Asn Thr Glu Pro Leu Pro Lys Val Ser Ala Tyr Ile
145 150 155 160
Thr Asp Trp Cys Gln Tyr Asp Ala Arg Leu Ser Pro Glu Thr Gln Asp
165 170 175
Asn Thr Ala Leu Thr Ser Asp Asp Ala Pro Gly Arg Gly Phe Asp Leu
180 185 190
Glu Lys Ile Pro Pro Thr Ala Tyr Asp Arg Leu Ile Phe Ser Phe Met
195 200 205
Ala Val Asn Gly Asp Lys Gly Lys Leu Ser Glu Arg Ile Asn Glu Val
210 215 220
Val Asp Gly Trp Asn Arg Gln Ala Glu Ala Ser Ser Gly Gln Ile Ala
225 230 235 240
Pro Ile Thr Leu Gly His Ile Val Pro Val Asp Pro Tyr Gly Asp Leu
245 250 255
Gly Thr Thr Arg Asn Val Gly Leu Asp Ala Asp Gln Arg Arg Asp Ala
260 265 270
Ser Pro Lys Asn Phe Leu Gln Tyr Tyr Asn Gln Asp Ala Ala Ser Gly
275 280 285
Leu Leu Gly Gly Leu Arg Asn Leu Lys Ala Arg Ala Lys Gln Ala Gly
290 295 300
His Lys Leu Glu Leu Ala Phe Ser Ile Gly Gly Trp Ser Met Ser Gly
305 310 315 320
Tyr Phe Ser Val Met Ala Lys Asp Pro Glu Gln Arg Ala Thr Phe Val
325 330 335
Ser Ser Ile Val Asp Phe Phe Arg Arg Phe Pro Met Phe Thr Ala Val
340 345 350
Asp Ile Asp Trp Glu Tyr Pro Gly Ala Thr Gly Glu Glu Gly Asn Glu
355 360 365
Phe Asp Pro Glu His Asp Gly Pro Asn Tyr Val Leu Leu Val Lys Glu
370 375 380
Leu Arg Glu Ala Leu Asn Ile Ala Phe Gly Thr Arg Ala Arg Lys Glu
385 390 395 400
Ile Thr Ile Ala Cys Ser Ala Val Val Ala Lys Met Glu Lys Ser Ser
405 410 415
Phe Lys Glu Ile Ala Pro Tyr Leu Asp Asn Ile Phe Val Met Thr Tyr
420 425 430
Asp Phe Phe Gly Thr Gly Trp Ala Glu Tyr Ile Gly His His Thr Asn
435 440 445
Leu Tyr Pro Pro Arg Tyr Glu Tyr Asp Gly Asp Asn Pro Pro Pro Pro
450 455 460
Asn Pro Asp Arg Asp Met Asp Tyr Ser Ala Asp Glu Ala Ile Arg Phe
465 470 475 480
Leu Leu Ser Gln Gly Val Gln Pro Glu Lys Ile His Leu Gly Phe Ala
485 490 495
Asn Tyr Gly Arg Ser Cys Leu Gly Ala Asp Leu Thr Thr Arg Arg Tyr
500 505 510
Asn Arg Thr Gly Glu Pro Leu Gly Thr Met Glu Lys Gly Ala Pro Glu
515 520 525
Phe Phe Cys Leu Leu Asn Asn Gln Tyr Asp Ala Glu Tyr Glu Ile Ala
530 535 540
Arg Gly Lys Asn Gln Phe Glu Leu Val Thr Asp Thr Glu Thr Asp Ala
545 550 555 560
Asp Ala Leu Phe Asn Ala Asp Gly Gly His Trp Ile Ser Leu Asp Thr
565 570 575
Pro Arg Thr Val Leu His Lys Gly Ile Tyr Ala Thr Lys Met Lys Leu
580 585 590
Gly Gly Ile Phe Ser Trp Ser Gly Asp Gln Asp Asp Gly Leu Leu Ala
595 600 605
Asn Ala Ala His Glu Gly Leu Gly Tyr Leu Pro Val Arg Gly Lys Glu
610 615 620
Lys Ile Asp Met Gly Pro Leu Tyr Asn Lys Gly Arg Leu Ile Gln Leu
625 630 635 640
Pro Lys Val Thr Arg Arg Lys Ser
645
<210>13
<211>7569
<212>DNA
<213>Xenorhabdus nematophilus
<400>13
atgataaaag ttaatgaact gttagataag ataaatagaa aaaggtctgg tgatacttta 60
ttattgacaa acatttcgtt tatgtctttc agcgaatttc gtcataggac aagtggaact 120
ctgacgtggc gagaaacaga ctttttatat caacaggctc atcaggaatc aaaacagaat 180
aaacttgaag aactgcgcat tttgtcccgt gctaatccac aactggctaa taccactaac 240
cttaatatta caccgtcaac cctaaacaat agttacaaca gttggtttta tggccgtgcc 300
caccgttttg taaaaccggg atcaattgct tccatatttt caccagcggc ttatttaaca 360
gaattatatc gggaagcgaa agattttcat cctgacaatt ctcaatatca cctgaataaa 420
cgacgccccg acattgcttc actggcactg acacagaata atatggatga agaaatttcc 480
acattatcct tatctaatga attactgctg cataatattc agacgttaga gaaaactgac 540
tataacggtg taatgaaaat gttgtccact taccggcaaa ccggcatgac accctatcat 600
ctgccgtatg agtcagcccg tcaggcaatt ttattgcaag ataaaaacct caccgcattt 660
agccgtaata cagacgtagc ggaattaatg gacccaacat cgctactggc tattaagact 720
gatatatcgc ctgaattgta tcaaatcctt gtagaagaaa ttacaccgga aaattcaaca 780
gaactgatga agaaaaattt cggtacagat gatgtactga tttttaagag ttatgcttct 840
ttggctcgct actacgattt gtcttatgat gaactcagtt tatttgtcaa tctctccttc 900
ggtaagaaaa atacaaatca acagtataag aatgagcaac tgataacatt ggtcaatgac 960
gggaatgata cggcaacggc aagattgatt aagcgaaccc gcaaagattt ctacgattca 1020
catttaaact atgcagaact aattccaatc aaagaaaatg aatacaaata taatttcagt 1080
gtaaaaaaaa cagaacctga ccacttggat tttcgtctcc agaatggaga taaagaatat 1140
atataccaag ataaaaattt cgtccccatt gctaataccc attacagtat tcccattaaa 1200
ttgacgacag agcaaatcac caacggtata acactccgct tatggcgagt taaaccaaat 1260
ccgtcggatg ctatcaatgc caatgcatac tttaaaatga tggagttccc cggtgatata 1320
ttcctgttaa agctgaataa agcgattcgt ttgtataaag ccacaggcat atctccagaa 1380
gatatctggc aagtaataga aagtatttat gatgacttaa ccattgacag caatgtgttg 1440
ggtaagctgt tttatgttca atattatatg cagcactata atattagcgt cagcgatgcg 1500
ctggtattgt gtcattcaga tatcagccaa tattccacta aacaacaacc cagtcatttt 1560
acaatactgt tcaatacacc gctattaaat ggccaagagt tttctgctga taataccaaa 1620
ctggatttaa cccccggtga atcaaaaaac catttttatt tgggaataat gaaacgtgct 1680
ttcagagtga atgatactga actgtataca ttatggaagc tggctaatgg cggaacaaat 1740
ccagaattta tgtgttccat cgagaacctg tctctgcttt atcgcgttcg tctgctggca 1800
gacattcatc atctgacagt gaatgaatta tccatgttgt tgtcggtttc tccctatgtg 1860
aacacgaaaa ttgccctttt ttctgataca gcattaacgc aattaatcag ctttctgttc 1920
caatgcaccc agtggctgac aacacagaaa tggtctgtca gtgatgtgtt tctgatgacc 1980
acggataatt acagcactgt ccttacgccg gatattgaaa accttatcac gacactaagt 2040
aatggattat caacactttc actcggtgat gacgaactga tccgtgcagc tgccccgctg 2100
attgctgcca gcattcaaat ggattcagcc aagacagcag aaactatttt gctgtggatt 2160
aatcagataa aaccacaagg actgacattc gatgatttca tgattattgc ggctaaccgt 2220
gatcgctcag agaatgaaac cagcaacatg gtggcttttt gtcaggtact ggggcaactt 2280
tctctgattg tgcgcaatat tggactcagc gaaaacgaac tgaccctgtt ggtgacaaaa 2340
ccggagaaat tccaatcaga aaccacagca ctgcaacatg atctccccac tttgcaagcg 2400
ctgacccgct tccatgctgt gatcatgcgt tgtggaagct acgcgacaga aatcttaaca 2460
gcattggaac taggagcgct gactgccgaa caattggcgg tggcgttaaa atttgatgct 2520
caggttgtga cacaagcatt gcaacagacc ggtttgggag tgaatacctt taccaactgg 2580
agaactatag atgtcactct gcaatggctg gatgtcgctg ctacattggg tattaccccg 2640
gatggtgttg ctgcactcat aaaattaaaa tatatcggtg aaccagaaac cccgatgcca 2700
acatttgatg attggcaagc cgccagtact ttgttgcagg cgggactgaa cagtcaacaa 2760
tccgaccagc ttcaggcatg gctggatgaa gccacgacga cagcggccag tgcttactac 2820
atcaaaaata gtgcacctca acagattaag agccgggatg agttgtacag ctatctgctg 2880
attgataacc aagtttctgc ccaagtgaaa accacccgtg tggcagaagc cattgccagc 2940
attcagttat atgtcaaccg ggcgttgaat aatgttgaag gaaaagtatc aaagccagtg 3000
aaaacccgtc agttcttctg cgactgggaa acctacaatc gacggtatag cacctgggcc 3060
ggcgtatctg aactggccta ttatccggaa aactatatcg accccacgat tcgtattggt 3120
cagacaggta tgatgaacaa cctgttacag caactttccc aaagtcagtt aaatatcgat 3180
accgttgaag atagctttaa aaattatctg accgcatttg aagatgtcgc taacttgcag 3240
gtgattagcg gatatcatga cagtatcaat gtcaatgagg gactcactta tttaattggt 3300
tatagccaga cagaacccag aatatattat tggcgcaatg tcgatcacca aaagtgccag 3360
cacggtcaat ttgctgccaa tgcctgggga gaatggaaaa aaattgaaat acccatcaat 3420
gtatggcagg aaaatatcag acctgttatt tacaagtctc gtttgtattt actgtggctg 3480
gaacaaaaag agctgaaaaa tgaaagtgaa gatggcaaga tagatatcac tgattatata 3540
ttaaaactgt cacatattcg ttatgatggc agctggagct caccgtttaa ttttaatgtg 3600
actgataaaa tagaaaacct gatcaataaa aaagccagca ttggtatgta ttgttcttct 3660
gattatgaaa aagacgtcat tattgtttat ttccatgaga aaaaagacaa ttattctttt 3720
aatagtcttc ctgcaagaga agggatgacc attaaccctg atatgacatt atccattctc 3780
acagaaaatg atttagacgc cattgttaag agcacattat cagaacttga taccaggaca 3840
gaatacaaag tcaacaatca atttgctaca gattatttgg ccgaatataa ggaatctata 3900
accacaaaaa ataaattagc cagttttacc ggaaatattt ttgatctctc gtatatatca 3960
ccaggaaatg gtcatattaa tttaacgttc aatccttcaa tggaaattaa tttttcaaaa 4020
ggcaatatat ataatgatga ggttaaatac ctgttatcga tggtagaaga tgaaacggtt 4080
attttatttg attatgatag acatgatgaa atgcttggaa aagaagaaga agtttttcat 4140
tatggaactt tggattttat tatttccatc gatcttaaaa atgccgaata ttttagagtg 4200
ttaatgcatc taagaaccaa ggaaaaaatt cctagaaaat cagaaattgg agttggtata 4260
aattatgatt atgaatcaaa tgatgctgaa ttcaaacttg atactaacat agtattagat 4320
tggaaagata acacaggagt atggcatact atatgtgaat catttactaa tgatgtttca 4380
atcattaata acatgggaaa tattgcggca ctgttccttc gcgaggatcc atgtgtgtat 4440
ttatgttcaa tagccacaga tataaaaatt gcttcatcta tgatcgaaca gatccaagat 4500
aaaaacatta gttttttatt aaaaaatggc tctgatattc tagtggagtt aaatgctgaa 4560
gaccatgtgg catctaaacc ttcacacgaa tctgacccta tggtatatga ttttaatcaa 4620
gtaaaagttg atattgaagg ctatgatatt cctctggtga gcgagtttat tattaagcaa 4680
cccgacggcg gttataacga tattgttatt gaatcgccaa ttcatataaa actaaaatcc 4740
aaagatacaa gtaacgttat atcactgcat aaaatgccat caggcacaca atatatgcag 4800
attggccctt acagaacccg gttaaatact ttattttcca gaaaattagc tgaaagagcc 4860
aatattggta ttgataatgt tttaagtatg gaaacgcaaa atttaccaga gccgcaatta 4920
ggtgaagggt tttatgcgac atttaagttg cccccctaca ataaagagga gcatggtgat 4980
gaacgttggt ttaagatcca tattgggaat attgatggca attctgccag acaaccttat 5040
tacgaaggaa tgttatctga tattgaaacc acagtaacgc tctttgttcc ctatgctaaa 5100
ggatattaca tacgtgaagg tgtcagatta ggggttgggt acaaaaaaat tatctatgac 5160
aaatcctggg aatctgcttt cttttatttt gatgagacga aaaatcaatt tatattcatt 5220
aatgatgccg atcatgattc gggaatgaca caacagggga tagtaaaaaa tatcaaaaaa 5280
tataaagggt ttattcatgt cgttgtcatg aaaaataaca ctgaacccat ggatttcaac 5340
ggcgccaatg caatctattt ctgggaattg ttctattaca cgcccatgat ggtattccag 5400
cgcttattgc aagagcagaa ttttaccgaa tcgacacgct ggctgcgcta tatctggaac 5460
ccggccggat attcggttca gggtgaaatg caggattatt actggaacgt ccgcccattg 5520
gaggaagata cgtcctggaa tgccaatccg ctggattcgg tcgatcctga cgccgttgcc 5580
cagcatgatc cgatgcacta taaagtggct acctttatga aaatgctgga tttgttgatt 5640
acccgcggag atagcgccta tcgccagctt gaacgtgata ccttaaacga agctaaaatg 5700
tggtatgtac aggcgctcac tttattgggt gatgagcctt atttttcatt ggataacgat 5760
tggtcagagc cacggctgga agaagctgcc agccaaacaa tgcggcatca ttatcaacat 5820
aaaatgctgc aactgcgtca gcgcgctgca ttacccacga aacgtacggc aaattcgtta 5880
accgcattgt tcctccctca aattaataaa aaactgcaag gttactggca gacattgacg 5940
caacgcctct ataacttacg ccataacctg acaatcgacg gtcagccact gtcattatct 6000
ctctatgcca cgcccgcaga tccgtccatg ttactcagtg ctgccatcac tgcttcacaa 6060
ggcggcggcg atttacctca tgcagtgatg ccgatgtacc gttttccggt gattctggaa 6120
aatgccaagt ggggggtaag ccagttgata caatttggca ataccctgct cagcattact 6180
gaacggcagg atgcagaagc cttggctgaa atactgcaaa ctcaaggcag tgagttagcc 6240
ctgcaaagta ttaaaatgca ggataaggtc atggctgaaa ttgatgctga taaattggcg 6300
cttcaagaaa gccgtcatgg tgcacagtct cgttttgaca gtttcaatac gctgtacgac 6360
gaagatgtta acgctggtga aaaacaagcg atggatcttt acctctcttc atcggtcttg 6420
agcaccagcg gcacagccct gcatatggcc gccgccgcgg cagatctcgt ccccaatatt 6480
tacggttttg ctgtgggagg ttcccgtttt ggggcgcttt tcaatgccag tgcgattggt 6540
atcgaaattt ctgcgtcagc aacacgtatt gccgcagaca aaatcagcca atcagaaata 6600
taccgtcgcc gtcggcaaga gtgggaaatt cagcgcaata atgcggaagc tgagataaaa 6660
caaattgatg ctcaattagc gacgctggct gtacgtcgtg aagcggcagt attacaaaaa 6720
aactatctgg aaactcagca ggcacaaact caggcgcagt tagcctttct gcaaagtaaa 6780
ttcagtaatg cagcgctata caactggctc cgtggaaggt tgtccgctat ttattatcag 6840
ttttatgatt tggcggtctc actctgttta atggcagagc aaacttatca gtatgaattg 6900
aataatgcgg cagcacactt tattaaacca ggtgcctggc atgggactta tgcgggttta 6960
ttagcgggtg aaaccctgat gctgaattta gcacagatgg aaaaaagcta tttggaaaaa 7020
gatgaacggg cactggaggt caccagaacc gtttctctgg ctgaagtgta tgctggtctg 7080
acagaaaata gtttcatttt aaaagataaa gtgactgagt tagtcaatgc aggtgaaggc 7140
agtgcaggca caacgcttaa cggtttgaac gtcgaaggga cacaactgca agccagcctc 7200
aaattatcgg atctgaatat tgctaccgat tatcctgacg gtttaggtaa tacacgccgt 7260
atcaaacaaa tcagtgtgac attacctgcc cttttagggc cttatcagga tgttcgggca 7320
atactaagtt atggcggcag cacaatgatg ccacgtggct gcaaagcgat tgcgatctca 7380
catggcatga atgacagtgg tcaattccag atggatttca atgatgccaa gtacctgcca 7440
tttgaagggc ttcctgtggc cgatacaggc acattaaccc tcagttttcc cggtatcagt 7500
ggtaaacaga aaagcttatt gctcagcctg agcgatatca ttctgcatat ccgttacacc 7560
attcgttct 7569
<210>14
<211>2523
<212>PRT
<213>Xenorhabdus nematophilus
<400>14
Met Ile Lys Val Asn Glu Leu Leu Asp Lys Ile Asn Arg Lys Arg Ser
1 5 10 15
Gly Asp Thr Leu Leu Leu Thr Asn Ile Ser Phe Met Ser Phe Ser Glu
20 25 30
Phe Arg His Arg Thr Ser Gly Thr Leu Thr Trp Arg Glu Thr Asp Phe
35 40 45
Leu Tyr Gln Gln Ala His Gln Glu Ser Lys Gln Asn Lys Leu Glu Glu
50 55 60
Leu Arg Ile Leu Ser Arg Ala Asn Pro Gln Leu Ala Asn Thr Thr Asn
65 70 75 80
Leu Asn Ile Thr Pro Ser Thr Leu Asn Asn Ser Tyr Asn Ser Trp Phe
85 90 95
Tyr Gly Arg Ala His Arg Phe Val Lys Pro Gly Ser Ile Ala Ser Ile
100 105 110
Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Lys Asp
115 120 125
Phe His Pro Asp Asn Ser Gln Tyr His Leu Asn Lys Arg Arg Pro Asp
130 135 140
Ile Ala Ser Leu Ala Leu Thr Gln Asn Asn Met Asp Glu Glu Ile Ser
145 150 155 160
Thr Leu Ser Leu Ser Asn Glu Leu Leu Leu His Asn Ile Gln Thr Leu
165 170 175
Glu Lys Thr Asp Tyr Asn Gly Val Met Lys Met Leu Ser Thr Tyr Arg
180 185 190
Gln Thr Gly Met Thr Pro Tyr His Leu Pro Tyr Glu Ser Ala Arg Gln
195 200 205
Ala Ile Leu Leu Gln Asp Lys Asn Leu Thr Ala Phe Ser Arg Asn Thr
210 215 220
Asp Val Ala Glu Leu Met Asp Pro Thr Ser Leu Leu Ala Ile Lys Thr
225 230 235 240
Asp Ile Ser Pro Glu Leu Tyr Gln Ile Leu Val Glu Glu Ile Thr Pro
245 250 255
Glu Asn Ser Thr Glu Leu Met Lys Lys Asn Phe Gly Thr Asp Asp Val
260 265 270
Leu Ile Phe Lys Ser Tyr Ala Ser Leu Ala Arg Tyr Tyr Asp Leu Ser
275 280 285
Tyr Asp Glu Leu Ser Leu Phe Val Asn Leu Ser Phe Gly Lys Lys Asn
290 295 300
Thr Asn Gln Gln Tyr Lys Asn Glu Gln Leu Ile Thr Leu Val Asn Asp
305 310 315 320
Gly Asn Asp Thr Ala Thr Ala Arg Leu Ile Lys Arg Thr Arg Lys Asp
325 330 335
Phe Tyr Asp Ser His Leu Asn Tyr Ala Glu Leu Ile Pro Ile Lys Glu
340 345 350
Asn Glu Tyr Lys Tyr Asn Phe Ser Val Lys Lys Thr Glu Pro Asp His
355 360 365
Leu Asp Phe Arg Leu Gln Asn Gly Asp Lys Glu Tyr Ile Tyr Gln Asp
370 375 380
Lys Asn Phe Val Pro Ile Ala Asn Thr His Tyr Ser Ile Pro Ile Lys
385 390 395 400
Leu Thr Thr Glu Gln Ile Thr Asn Gly Ile Thr Leu Arg Leu Trp Arg
405 410 415
Val Lys Pro Asn Pro Ser Asp Ala Ile Asn Ala Asn Ala Tyr Phe Lys
420 425 430
Met Met Glu Phe Pro Gly Asp Ile Phe Leu Leu Lys Leu Asn Lys Ala
435 440 445
Ile Arg Leu Tyr Lys Ala Thr Gly Ile Ser Pro Glu Asp Ile Trp Gln
450 455 460
Val Ile Glu Ser Ile Tyr Asp Asp Leu Thr Ile Asp Ser Asn Val Leu
465 470 475 480
Gly Lys Leu Phe Tyr Val Gln Tyr Tyr Met Gln His Tyr Asn Ile Ser
485 490 495
Val Ser Asp Ala Leu Val Leu Cys His Ser Asp Ile Ser Gln Tyr Ser
500 505 510
Thr Lys Gln Gln Pro Ser His Phe Thr Ile Leu Phe Asn Thr Pro Leu
515 520 525
Leu Asn Gly Gln Glu Phe Ser Ala Asp Asn Thr Lys Leu Asp Leu Thr
530 535 540
Pro Gly Glu Ser Lys Asn His Phe Tyr Leu Gly Ile Met Lys Arg Ala
545 550 555 560
Phe Arg Val Asn Asp Thr Glu Leu Tyr Thr Leu Trp Lys Leu Ala Asn
565 570 575
Gly Gly Thr Asn Pro Glu Phe Met Cys Ser Ile Glu Asn Leu Ser Leu
580 585 590
Leu Tyr Arg Val Arg Leu Leu Ala Asp Ile His His Leu Thr Val Asn
595 600 605
Glu Leu Ser Met Leu Leu Ser Val Ser Pro Tyr Val Asn Thr Lys Ile
610 615 620
Ala Leu Phe Ser Asp Thr Ala Leu Thr Gln Leu Ile Ser Phe Leu Phe
625 630 635 640
Gln Cys Thr Gln Trp Leu Thr Thr Gln Lys Trp Ser Val Ser Asp Val
645 650 655
Phe Leu Met Thr Thr Asp Asn Tyr Ser Thr Val Leu Thr Pro Asp Ile
660 665 670
Glu Asn Leu Ile Thr Thr Leu Ser Asn Gly Leu Ser Thr Leu Ser Leu
675 680 685
Gly Asp Asp Glu Leu Ile Arg Ala Ala Ala Pro Leu Ile Ala Ala Ser
690 695 700
Ile Gln Met Asp Ser Ala Lys Thr Ala Glu Thr Ile Leu Leu Trp Ile
705 710 715 720
Asn Gln Ile Lys Pro Gln Gly Leu Thr Phe Asp Asp Phe Met Ile Ile
725 730 735
Ala Ala Asn Arg Asp Arg Ser Glu Asn Glu Thr Ser Asn Met Val Ala
740 745 750
Phe Cys Gln Val Leu Gly Gln Leu Ser Leu Ile Val Arg Asn Ile Gly
755 760 765
Leu Ser Glu Asn Glu Leu Thr Leu Leu Val Thr Lys Pro Glu Lys Phe
770 775 780
Gln Ser Glu Thr Thr Ala Leu Gln His Asp Leu Pro Thr Leu Gln Ala
785 790 795 800
Leu Thr Arg Phe His Ala Val Ile Met Arg Cys Gly Ser Tyr Ala Thr
805 810 815
Glu Ile Leu Thr Ala Leu Glu Leu Gly Ala Leu Thr Ala Glu Gln Leu
820 825 830
Ala Val Ala Leu Lys Phe Asp Ala Gln Val Val Thr Gln Ala Leu Gln
835 840 845
Gln Thr Gly Leu Gly Val Asn Thr Phe Thr Asn Trp Arg Thr Ile Asp
850 855 860
Val Thr Leu Gln Trp Leu Asp Val Ala Ala Thr Leu Gly Ile Thr Pro
865 870 875 880
Asp Gly Val Ala Ala Leu Ile Lys Leu Lys Tyr Ile Gly Glu Pro Glu
885 890 895
Thr Pro Met Pro Thr Phe Asp Asp Trp Gln Ala Ala Ser Thr Leu Leu
900 905 910
Gln Ala Gly Leu Asn Ser Gln Gln Ser Asp Gln Leu Gln Ala Trp Leu
915 920 925
Asp Glu Ala Thr Thr Thr Ala Ala Ser Ala Tyr Tyr Ile Lys Asn Ser
930 935 940
Ala Pro Gln Gln Ile Lys Ser Arg Asp Glu Leu Tyr Ser Tyr Leu Leu
945 950 955 960
Ile Asp Asn Gln Val Ser Ala Gln Val Lys Thr Thr Arg Val Ala Glu
965 970 975
Ala Ile Ala Ser Ile Gln Leu Tyr Val Asn Arg Ala Leu Asn Asn Val
980 985 990
Glu Gly Lys Val Ser Lys Pro Val Lys Thr Arg Gln Phe Phe Cys Asp
995 1000 1005
Trp Glu Thr Tyr Asn Arg Arg Tyr Ser Thr Trp Ala Gly Val Ser
1010 1015 1020
Glu Leu Ala Tyr Tyr Pro Glu Asn Tyr Ile Asp Pro Thr Ile Arg
1025 1030 1035
Ile Gly Gln Thr Gly Met Met Asn Asn Leu Leu Gln Gln Leu Ser
1040 1045 1050
Gln Ser Gln Leu Asn Ile Asp Thr Val Glu Asp Ser Phe Lys Asn
1055 1060 1065
Tyr Leu Thr Ala Phe Glu Asp Val Ala Asn Leu Gln Val Ile Ser
1070 1075 1080
Gly Tyr His Asp Ser Ile Asn Val Asn Glu Gly Leu Thr Tyr Leu
1085 1090 1095
Ile Gly Tyr Ser Gln Thr Glu Pro Arg Ile Tyr Tyr Trp Arg Asn
1100 1105 1110
Val Asp His Gln Lys Cys Gln His Gly Gln Phe Ala Ala Asn Ala
1115 1120 1125
Trp Gly Glu Trp Lys Lys Ile Glu Ile Pro Ile Asn Val Trp Gln
1130 1135 1140
Glu Asn Ile Arg Pro Val Ile Tyr Lys Ser Arg Leu Tyr Leu Leu
1145 1150 1155
Trp Leu Glu Gln Lys Glu Leu Lys Asn Glu Ser Glu Asp Gly Lys
1160 1165 1170
Ile Asp Ile Thr Asp Tyr Ile Leu Lys Leu Ser His Ile Arg Tyr
1175 1180 1185
Asp Gly Ser Trp Ser Ser Pro Phe Asn Phe Asn Val Thr Asp Lys
1190 1195 1200
Ile Glu Asn Leu Ile Asn Lys Lys Ala Ser Ile Gly Met Tyr Cys
1205 1210 1215
Ser Ser Asp Tyr Glu Lys Asp Val Ile Ile Val Tyr Phe His Glu
1220 1225 1230
Lys Lys Asp Asn Tyr Ser Phe Asn Ser Leu Pro Ala Arg Glu Gly
1235 1240 1245
Met Thr Ile Asn Pro Asp Met Thr Leu Ser Ile Leu Thr Glu Asn
1250 1255 1260
Asp Leu Asp Ala Ile Val Lys Ser Thr Leu Ser Glu Leu Asp Thr
1265 1270 1275
Arg Thr Glu Tyr Lys Val Asn Asn Gln Phe Ala Thr Asp Tyr Leu
1280 1285 1290
Ala Glu Tyr Lys Glu Ser Ile Thr Thr Lys Asn Lys Leu Ala Ser
1295 1300 1305
Phe Thr Gly Asn Ile Phe Asp Leu Ser Tyr Ile Ser Pro Gly Asn
1310 1315 1320
Gly His Ile Asn Leu Thr Phe Asn Pro Ser Met Glu Ile Asn Phe
1325 1330 1335
Ser Lys Gly Asn Ile Tyr Asn Asp Glu Val Lys Tyr Leu Leu Ser
1340 1345 1350
Met Val Glu Asp Glu Thr Val Ile Leu Phe Asp Tyr Asp Arg His
1355 1360 1365
Asp Glu Met Leu Gly Lys Glu Glu Glu Val Phe His Tyr Gly Thr
1370 1375 1380
Leu Asp Phe Ile Ile Ser Ile Asp Leu Lys Asn Ala Glu Tyr Phe
1385 1390 1395
Arg Val Leu Met His Leu Arg Thr Lys Glu Lys Ile Pro Arg Lys
1400 1405 1410
Ser Glu Ile Gly Val Gly Ile Asn Tyr Asp Tyr Glu Ser Asn Asp
1415 1420 1425
Ala Glu Phe Lys Leu Asp Thr Asn Ile Val Leu Asp Trp Lys Asp
1430 1435 1440
Asn Thr Gly Val Trp His Thr Ile Cys Glu Ser Phe Thr Asn Asp
1445 1450 1455
Val Ser Ile Ile Asn Asn Met Gly Asn Ile Ala Ala Leu Phe Leu
1460 1465 1470
Arg Glu Asp Pro Cys Val Tyr Leu Cys Ser Ile Ala Thr Asp Ile
1475 1480 1485
Lys Ile Ala Ser Ser Met Ile Glu Gln Ile Gln Asp Lys Asn Ile
1490 1495 1500
Ser Phe Leu Leu Lys Asn Gly Ser Asp Ile Leu Val Glu Leu Asn
1505 1510 1515
Ala Glu Asp His Val Ala Ser Lys Pro Ser His Glu Ser Asp Pro
1520 1525 1530
Met Val Tyr Asp Phe Asn Gln Val Lys Val Asp Ile Glu Gly Tyr
1535 1540 1545
Asp Ile Pro Leu Val Ser Glu Phe Ile Ile Lys Gln Pro Asp Gly
1550 1555 1560
Gly Tyr Asn Asp Ile Val Ile Glu Ser Pro Ile His Ile Lys Leu
1565 1570 1575
Lys Ser Lys Asp Thr Ser Asn Val Ile Ser Leu His Lys Met Pro
1580 1585 1590
Ser Gly Thr Gln Tyr Met Gln Ile Gly Pro Tyr Arg Thr Arg Leu
1595 1600 1605
Asn Thr Leu Phe Ser Arg Lys Leu Ala Glu Arg Ala Asn Ile Gly
1610 1615 1620
Ile Asp Asn Val Leu Ser Met Glu Thr Gln Asn Leu Pro Glu Pro
1625 1630 1635
Gln Leu Gly Glu Gly Phe Tyr Ala Thr Phe Lys Leu Pro Pro Tyr
1640 1645 1650
Asn Lys Glu Glu His Gly Asp Glu Arg Trp Phe Lys Ile His Ile
1655 1660 1665
Gly Asn Ile Asp Gly Asn Ser Ala Arg Gln Pro Tyr Tyr Glu Gly
1670 1675 1680
Met Leu Ser Asp Ile Glu Thr Thr Val Thr Leu Phe Val Pro Tyr
1685 1690 1695
Ala Lys Gly Tyr Tyr Ile Arg Glu Gly Val Arg Leu Gly Val Gly
1700 1705 1710
Tyr Lys Lys Ile Ile Tyr Asp Lys Ser Trp Glu Ser Ala Phe Phe
1715 1720 1725
Tyr Phe Asp Glu Thr Lys Asn Gln Phe Ile Phe Ile Asn Asp Ala
1730 1735 1740
Asp His Asp Ser Gly Met Thr Gln Gln Gly Ile Val Lys Asn Ile
1745 1750 1755
Lys Lys Tyr Lys Gly Phe Ile His Val Val Val Met Lys Asn Asn
1760 1765 1770
Thr Glu Pro Met Asp Phe Asn Gly Ala Asn Ala Ile Tyr Phe Trp
1775 1780 1785
Glu Leu Phe Tyr Tyr Thr Pro Met Met Val Phe Gln Arg Leu Leu
1790 1795 1800
Gln Glu Gln Asn Phe Thr Glu Ser Thr Arg Trp Leu Arg Tyr Ile
1805 1810 1815
Trp Asn Pro Ala Gly Tyr Ser Val Gln Gly Glu Met Gln Asp Tyr
1820 1825 1830
Tyr Trp Asn Val Arg Pro Leu Glu Glu Asp Thr Ser Trp Asn Ala
1835 1840 1845
Asn Pro Leu Asp Ser Val Asp Pro Asp Ala Val Ala Gln His Asp
1850 1855 1860
Pro Met His Tyr Lys Val Ala Thr Phe Met Lys Met Leu Asp Leu
1865 1870 1875
Leu Ile Thr Arg Gly Asp Ser Ala Tyr Arg Gln Leu Glu Arg Asp
1880 1885 1890
Thr Leu Asn Glu Ala Lys Met Trp Tyr Val Gln Ala Leu Thr Leu
1895 1900 1905
Leu Gly Asp Glu Pro Tyr Phe Ser Leu Asp Asn Asp Trp Ser Glu
1910 1915 1920
Pro Arg Leu Glu Glu Ala Ala Ser Gln Thr Met Arg His His Tyr
1925 1930 1935
Gln His Lys Met Leu Gln Leu Arg Gln Arg Ala Ala Leu Pro Thr
1940 1945 1950
Lys Arg Thr Ala Asn Ser Leu Thr Ala Leu Phe Leu Pro Gln Ile
1955 1960 1965
Asn Lys Lys Leu Gln Gly Tyr Trp Gln Thr Leu Thr Gln Arg Leu
1970 1975 1980
Tyr Asn Leu Arg His Asn Leu Thr Ile Asp Gly Gln Pro Leu Ser
1985 1990 1995
Leu Ser Leu Tyr Ala Thr Pro Ala Asp Pro Ser Met Leu Leu Ser
2000 2005 2010
Ala Ala Ile Thr Ala Ser Gln Gly Gly Gly Asp Leu Pro His Ala
2015 2020 2025
Val Met Pro Met Tyr Arg Phe Pro Val Ile Leu Glu Asn Ala Lys
2030 2035 2040
Trp Gly Val Ser Gln Leu Ile Gln Phe Gly Asn Thr Leu Leu Ser
2045 2050 2055
Ile Thr Glu Arg Gln Asp Ala Glu Ala Leu Ala Glu Ile Leu Gln
2060 2065 2070
Thr Gln Gly Ser Glu Leu Ala Leu Gln Ser Ile Lys Met Gln Asp
2075 2080 2085
Lys Val Met Ala Glu Ile Asp Ala Asp Lys Leu Ala Leu Gln Glu
2090 2095 2100
Ser Arg His Gly Ala Gln Ser Arg Phe Asp Ser Phe Asn Thr Leu
2105 2110 2115
Tyr Asp Glu Asp Val Asn Ala Gly Glu Lys Gln Ala Met Asp Leu
2120 2125 2130
Tyr Leu Ser Ser Ser Val Leu Ser Thr Ser Gly Thr Ala Leu His
2135 2140 2145
Met Ala Ala Ala Ala Ala Asp Leu Val Pro Asn Ile Tyr Gly Phe
2150 2155 2160
Ala Val Gly Gly Ser Arg Phe Gly Ala Leu Phe Asn Ala Ser Ala
2165 2170 2175
Ile Gly Ile Glu Ile Ser Ala Ser Ala Thr Arg Ile Ala Ala Asp
2180 2185 2190
Lys Ile Ser Gln Ser Glu Ile Tyr Arg Arg Arg Arg Gln Glu Trp
2195 2200 2205
Glu Ile Gln Arg Asn Asn Ala Glu Ala Glu Ile Lys Gln Ile Asp
2210 2215 2220
Ala Gln Leu Ala Thr Leu Ala Val Arg Arg Glu Ala Ala Val Leu
2225 2230 2235
Gln Lys Asn Tyr Leu Glu Thr Gln Gln Ala Gln Thr Gln Ala Gln
2240 2245 2250
Leu Ala Phe Leu Gln Ser Lys Phe Ser Asn Ala Ala Leu Tyr Asn
2255 2260 2265
Trp Leu Arg Gly Arg Leu Ser Ala Ile Tyr Tyr Gln Phe Tyr Asp
2270 2275 2280
Leu Ala Val Ser Leu Cys Leu Met Ala Glu Gln Thr Tyr Gln Tyr
2285 2290 2295
Glu Leu Asn Asn Ala Ala Ala His Phe Ile Lys Pro Gly Ala Trp
2300 2305 2310
His Gly Thr Tyr Ala Gly Leu Leu Ala Gly Glu Thr Leu Met Leu
2315 2320 2325
Asn Leu Ala Gln Met Glu Lys Ser Tyr Leu Glu Lys Asp Glu Arg
2330 2335 2340
Ala Leu Glu Val Thr Arg Thr Val Ser Leu Ala Glu Val Tyr Ala
2345 2350 2355
Gly Leu Thr Glu Asn Ser Phe Ile Leu Lys Asp Lys Val Thr Glu
2360 2365 2370
Leu Val Asn Ala Gly Glu Gly Ser Ala Gly Thr Thr Leu Asn Gly
2375 2380 2385
Leu Asn Val Glu Gly Thr Gln Leu Gln Ala Ser Leu Lys Leu Ser
2390 2395 2400
Asp Leu Asn Ile Ala Thr Asp Tyr Pro Asp Gly Leu Gly Asn Thr
2405 2410 2415
Arg Arg Ile Lys Gln Ile Ser Val Thr Leu Pro Ala Leu Leu Gly
2420 2425 2430
Pro Tyr Gln Asp Val Arg Ala Ile Leu Ser Tyr Gly Gly Ser Thr
2435 2440 2445
Met Met Pro Arg Gly Cys Lys Ala Ile Ala Ile Ser His Gly Met
2450 2455 2460
Asn Asp Ser Gly Gln Phe Gln Met Asp Phe Asn Asp Ala Lys Tyr
2465 2470 2475
Leu Pro Phe Glu Gly Leu Pro Val Ala Asp Thr Gly Thr Leu Thr
2480 2485 2490
Leu Ser Phe Pro Gly Ile Ser Gly Lys Gln Lys Ser Leu Leu Leu
2495 2500 2505
Ser Leu Ser Asp Ile Ile Leu His Ile Arg Tyr Thr Ile Arg Ser
2510 2515 2520
<210>15
<211>3048
<212>DNA
<213>Xenorhabdus nematophilus
<400>15
atgaagaatt tcgttcacag caatacgcca tccgtcaccg tactggacaa ccgtggtcag 60
acagtacgcg aaatagcctg gtatcggcac cccgatacac ctcaggtaac cgatgaacgc 120
atcaccggtt atcaatatga tgctcaagga tctctgactc agagtattga tccgcgattt 180
tatgaacgcc agcagacagc gagtgacaag aacgccatta cacccaatct tattctcttg 240
tcatcactca gtaagaaggc attgcgtacg caaagtgtgg atgccggaac ccgtgtcgcc 300
ctgcatgatg ttgccgggcg tcccgtttta gctgtcagcg ccaatggcgt tagccgaacg 360
tttcagtatg aaagtgataa ccttccggga cgattgctaa cgattaccga gcaggtaaaa 420
ggagagaacg cctgtatcac ggagcgattg atctggtcag gaaatacgcc ggcagaaaaa 480
ggcaataatc tggccggcca gtgcgtggtc cattatgatc ccaccggaat gaatcaaacc 540
aacagcatat cgttaaccag catacccttg tccatcacac agcaattact gaaagatgac 600
agcgaagccg attggcacgg tatggatgaa tctggctgga aaaacgcgct ggcgccggaa 660
agcttcactt ctgtcagcac aacggatgct accggcacgg tattaacgag tacagatgct 720
gccggaaaca agcaacgtat cgcctatgat gtggccggtc tgcttcaagg cagttggttg 780
gcgctgaagg ggaaacaaga acaagttatc gtgaaatccc tgacctattc ggctgccagc 840
cagaagctac gggaggaaca tggtaacggg atagtgacta catataccta tgaacccgag 900
acgcaacgag ttattggcat aaaaacagaa cgtccttccg gtcatgccgc tggggagaaa 960
attttacaaa acctgcgtta tgaatatgat cctgtcggaa atgtgctgaa atcaactaat 1020
gatgctgaaa ttacccgctt ttggcgcaac cagaaaattg taccggaaaa tacttacacc 1080
tatgacagcc tgtaccagct ggtttccgtc actgggcgtg aaatggcgaa tattggccga 1140
caaaaaaacc agttacccat ccccgctctg attgataaca atacttatac gaattactct 1200
cgcacttacg actatgatcg tgggggaaat ctgaccagaa ttcgccataa ttcaccgatc 1260
accggtaata actatacaac gaacatgacc gtttcagatc acagcaaccg ggctgtactg 1320
gaagagctgg cgcaagatcc cactcaggtg gatatgttgt tcacccccgg cgggcatcag 1380
acccggcttg ttcccggtca ggatcttttc tggacacccc gtgacgaatt gcaacaagtg 1440
atattggtca atagggaaaa tacgacgcct gatcaggaat tctaccgtta tgatgcagac 1500
agtcagcgtg tcattaagac tcatattcag aagacaggta acagtgagca aatacagcga 1560
acattatatt tgccagagct ggaatggcgc acgacatata gcggcaatac attaaaagag 1620
tttttgcagg tcatcactgt cggtgaatcg ggtcaggcac aagtgcgggt gctgcattgg 1680
gaaacaggca aaccggcgga tatcagcaat gatcagctgc gctacagtta tggcaacctg 1740
attggcagta gcgggctgga attggacagt gacgggcaga tcattagtca ggaagaatat 1800
tacccctatg ggggaaccgc cgtgtgggca gcccgaagtc agtcagaagc tgattacaaa 1860
accgtgcgtt attctggcaa agagcgggat gcaacagggt tgtattacta cggttatcgt 1920
tattatcaat cgtggacagg gcgatggttg agtgtagatc ctgccggtga ggtcgatggt 1980
ctcaatttgt tccgaatgtg caggaataac cccatcgttt tttctgattc tgatggtcgt 2040
ttccccggtc agggtgtcct tgcctggata gggaaaaaag cgtatcgaaa ggcagtcaac 2100
atcacgacag aacacctgct tgaacaaggc gcttcctttg atacgttctt gaaattaaac 2160
cgaggattgc gaacgtttgt tttgggtgtg ggggtagcaa gtctgggggt gaaggcggcc 2220
acgattgcag gagcgtcgcc ttgggggatt gtcggggctg ccattggtgg ttttgtctcc 2280
ggggcggtga tggggttttt cgcgaacaac atctcagaaa aaattgggga agttttaagt 2340
tatctgacgc gtaaacgttc tgttcctgtt caggttggcg cttttgttgt cacatcgctt 2400
gtgacgtctg cactatttaa cagctcttcg acaggtaccg ccatttccgc agcaacagcg 2460
gtcaccgttg gaggattaat ggctttagcc ggagagcata acacgggcat ggctatcagt 2520
attgccacac ccgccggaca aggtacgctg gatacgctca ggcccggtaa tgtcagcgcg 2580
ccagagcggt taggggcact atcaggcgca attattggcg gcatattact tggccgccat 2640
cagggaagtt ctgagctggg tgaacgggca gcgattggtg ctatgtatgg tgctcgatgg 2700
ggaaggatca ttggtaatct atgggatggc ccttatcggt ttatcggcag gttactgctc 2760
agaagaggca ttagctctgc catttcccac gctgtcagtt ccaggagctg gtttggccga 2820
atgataggag aaagtgtcgg gagaaatatt tctgaagtat tattacctta tagccgtaca 2880
cccggtgaat gggttggtgc agccattggc gggacagccg cggccgctca tcatgccgtt 2940
ggaggggaag ttgccaatgc cgctagccgg gttacctgga gcggctttaa gcgggctttt 3000
aataacttct tctttaacgc ctctgcacgt cataatgaat ccgaagca 3048
<210>16
<211>1016
<212>PRT
<213>Xenorhabdus nematophilus
<400>16
Met Lys Asn Phe Val His Ser Asn Thr Pro Ser Val Thr Val Leu Asp
1 5 10 15
Asn Arg Gly Gln Thr Val Arg Glu Ile Ala Trp Tyr Arg His Pro Asp
20 25 30
Thr Pro Gln Val Thr Asp Glu Arg Ile Thr Gly Tyr Gln Tyr Asp Ala
35 40 45
Gln Gly Ser Leu Thr Gln Ser Ile Asp Pro Arg Phe Tyr Glu Arg Gln
50 55 60
Gln Thr Ala Ser Asp Lys Asn Ala Ile Thr Pro Asn Leu Ile Leu Leu
65 70 75 80
Ser Ser Leu Ser Lys Lys Ala Leu Arg Thr Gln Ser Val Asp Ala Gly
85 90 95
Thr Arg Val Ala Leu His Asp Val Ala Gly Arg Pro Val Leu Ala Val
100 105 110
Ser Ala Asn Gly Val Ser Arg Thr Phe Gln Tyr Glu Ser Asp Asn Leu
115 120 125
Pro Gly Arg Leu Leu Thr Ile Thr Glu Gln Val Lys Gly Glu Asn Ala
130 135 140
Cys Ile Thr Glu Arg Leu Ile Trp Ser Gly Asn Thr Pro Ala Glu Lys
145 150 155 160
Gly Asn Asn Leu Ala Gly Gln Cys Val Val His Tyr Asp Pro Thr Gly
165 170 175
Met Asn Gln Thr Asn Ser Ile Ser Leu Thr Ser Ile Pro Leu Ser Ile
180 185 190
Thr Gln Gln Leu Leu Lys Asp Asp Ser Glu Ala Asp Trp His Gly Met
195 200 205
Asp Glu Ser Gly Trp Lys Asn Ala Leu Ala Pro Glu Ser Phe Thr Ser
210 215 220
Val Ser Thr Thr Asp Ala Thr Gly Thr Val Leu Thr Ser Thr Asp Ala
225 230 235 240
Ala Gly Asn Lys Gln Arg Ile Ala Tyr Asp Val Ala Gly Leu Leu Gln
245 250 255
Gly Ser Trp Leu Ala Leu Lys Gly Lys Gln Glu Gln Val Ile Val Lys
260 265 270
Ser Leu Thr Tyr Ser Ala Ala Ser Gln Lys Leu Arg Glu Glu His Gly
275 280 285
Asn Gly Ile Val Thr Thr Tyr Thr Tyr Glu Pro Glu Thr Gln Arg Val
290 295 300
Ile Gly Ile Lys Thr Glu Arg Pro Ser Gly His Ala Ala Gly Glu Lys
305 310 315 320
Ile Leu Gln Asn Leu Arg Tyr Glu Tyr Asp Pro Val Gly Asn Val Leu
325 330 335
Lys Ser Thr Asn Asp Ala Glu Ile Thr Arg Phe Trp Arg Asn Gln Lys
340 345 350
Ile Val Pro Glu Asn Thr Tyr Thr Tyr Asp Ser Leu Tyr Gln Leu Val
355 360 365
Ser Val Thr Gly Arg Glu Met Ala Asn Ile Gly Arg Gln Lys Asn Gln
370 375 380
Leu Pro Ile Pro Ala Leu Ile Asp Asn Asn Thr Tyr Thr Asn Tyr Ser
385 390 395 400
Arg Thr Tyr Asp Tyr Asp Arg Gly Gly Asn Leu Thr Arg Ile Arg His
405 410 415
Asn Ser Pro Ile Thr Gly Asn Asn Tyr Thr Thr Asn Met Thr Val Ser
420 425 430
Asp His Ser Asn Arg Ala Val Leu Glu Glu Leu Ala Gln Asp Pro Thr
435 440 445
Gln Val Asp Met Leu Phe Thr Pro Gly Gly His Gln Thr Arg Leu Val
450 455 460
Pro Gly Gln Asp Leu Phe Trp Thr Pro Arg Asp Glu Leu Gln Gln Val
465 470 475 480
Ile Leu Val Asn Arg Glu Asn Thr Thr Pro Asp Gln Glu Phe Tyr Arg
485 490 495
Tyr Asp Ala Asp Ser Gln Arg Val Ile Lys Thr His Ile Gln Lys Thr
500 505 510
Gly Asn Ser Glu Gln Ile Gln Arg Thr Leu Tyr Leu Pro Glu Leu Glu
515 520 525
Trp Arg Thr Thr Tyr Ser Gly Asn Thr Leu Lys Glu Phe Leu Gln Val
530 535 540
Ile Thr Val Gly Glu Ser Gly Gln Ala Gln Val Arg Val Leu His Trp
545 550 555 560
Glu Thr Gly Lys Pro Ala Asp Ile Ser Asn Asp Gln Leu Arg Tyr Ser
565 570 575
Tyr Gly Asn Leu Ile Gly Ser Ser Gly Leu Glu Leu Asp Ser Asp Gly
580 585 590
Gln Ile Ile Ser Gln Glu Glu Tyr Tyr Pro Tyr Gly Gly Thr Ala Val
595 600 605
Trp Ala Ala Arg Ser Gln Ser Glu Ala Asp Tyr Lys Thr Val Arg Tyr
610 615 620
Ser Gly Lys Glu Arg Asp Ala Thr Gly Leu Tyr Tyr Tyr Gly Tyr Arg
625 630 635 640
Tyr Tyr Gln Ser Trp Thr Gly Arg Trp Leu Ser Val Asp Pro Ala Gly
645 650 655
Glu Val Asp Gly Leu Asn Leu Phe Arg Met Cys Arg Asn Asn Pro Ile
660 665 670
Val Phe Ser Asp Ser Asp Gly Arg Phe Pro Gly Gln Gly Val Leu Ala
675 680 685
Trp Ile Gly Lys Lys Ala Tyr Arg Lys Ala Val Asn Ile Thr Thr Glu
690 695 700
His Leu Leu Glu Gln Gly Ala Ser Phe Asp Thr Phe Leu Lys Leu Asn
705 710 715 720
Arg Gly Leu Arg Thr Phe Val Leu Gly Val Gly Val Ala Ser Leu Gly
725 730 735
Val Lys Ala Ala Thr Ile Ala Gly Ala Ser Pro Trp Gly Ile Val Gly
740 745 750
Ala Ala Ile Gly Gly Phe Val Ser Gly Ala Val Met Gly Phe Phe Ala
755 760 765
Asn Asn Ile Ser Glu Lys Ile Gly Glu Val Leu Ser Tyr Leu Thr Arg
770 775 780
Lys Arg Ser Val Pro Val Gln Val Gly Ala Phe Val Val Thr Ser Leu
785 790 795 800
Val Thr Ser Ala Leu Phe Asn Ser Ser Ser Thr Gly Thr Ala Ile Ser
805 810 815
Ala Ala Thr Ala Val Thr Val Gly Gly Leu Met Ala Leu Ala Gly Glu
820 825 830
His Asn Thr Gly Met Ala Ile Ser Ile Ala Thr Pro Ala Gly Gln Gly
835 840 845
Thr Leu Asp Thr Leu Arg Pro Gly Asn Val Ser Ala Pro Glu Arg Leu
850 855 860
Gly Ala Leu Ser Gly Ala Ile Ile Gly Gly Ile Leu Leu Gly Arg His
865 870 875 880
Gln Gly Ser Ser Glu Leu Gly Glu Arg Ala Ala Ile Gly Ala Met Tyr
885 890 895
Gly Ala Arg Trp Gly Arg Ile Ile Gly Asn Leu Trp Asp Gly Pro Tyr
900 905 910
Arg Phe Ile Gly Arg Leu Leu Leu Arg Arg Gly Ile Ser Ser Ala Ile
915 920 925
Ser His Ala Val Ser Ser Arg Ser Trp Phe Gly Arg Met Ile Gly Glu
930 935 940
Ser Val Gly Arg Asn Ile Ser Glu Val Leu Leu Pro Tyr Ser Arg Thr
945 950 955 960
Pro Gly Glu Trp Val Gly Ala Ala Ile Gly Gly Thr Ala Ala Ala Ala
965 970 975
His His Ala Val Gly Gly Glu Val Ala Asn Ala Ala Ser Arg Val Thr
980 985 990
Trp Ser Gly Phe Lys Arg Ala Phe Asn Asn Phe Phe Phe Asn Ala Ser
995 1000 1005
Ala Arg His Asn Glu Ser Glu Ala
1010 1015
<210>17
<211>4479
<212>DNA
<213>Xenorhabdus nematophilus
<400>17
atgcagggtt caacaccttt gaaacttgaa ataccgtcat tgccctctgg gggcggatca 60
ctaaaaggaa tgggagaagc actcaatgcc gtcggagcgg aagggggagc gtcattttca 120
ctgcccttgc cgatctctgt cgggcgtggt ctggtgccgg tgctatcact gaattacagc 180
agtactgccg gcaatgggtc attcgggatg gggtggcaat gtggggttgg ttttatcagc 240
ctgcgtaccg ccaagggcgt tccgcactat acgggacaag atgagtatct cgggccggat 300
ggggaagtgt tgagtattgt gccggacagc caagggcaac cagagcaacg caccgcaacc 360
tcactgttgg ggacggttct gacacagccg catactgtta cccgctatca gtcccgcgtg 420
gcagaaaaaa tcgttcgttt agaacactgg cagccacagc agagacgtga ggaagagacg 480
tctttttggg tactttttac tgcggatggt ttagtgcacc tattcggtaa gcatcaccat 540
gcacgtattg ctgacccgca ggatgaaacc agaattgccc gctggctgat ggaggaaacc 600
gtcacgcata ccggggaaca tatttactat cactatcggg cagaagacga tcttgactgt 660
gatgagcatg aacttgctca gcattcaggt gttacggccc agcgttatct ggcaaaagtc 720
agctatggca atactcagcc ggaaaccgct tttttcgcgg taaaatcagg tattcctgct 780
gataatgact ggctgtttca tctggtattt gattacggtg agcgctcatc ttcgctgaac 840
tctgtacccg aattcaatgt gtcagaaaac aatgtgtctg aaaacaatgt gcctgaaaaa 900
tggcgttgtc gtccggacag tttctcccgc tatgaatatg ggtttgaaat tcgaacccgt 960
cgcttgtgtc gccaagttct gatgtttcat cagctgaaag cgctggcagg ggaaaaggtt 1020
gcagaagaaa caccggcgct ggtttcccgt cttattctgg attatgacct gaacaacaag 1080
gtttccttgc tgcaaacggc ccgcagactg gcccatgaaa cggacggtac gccagtgatg 1140
atgtccccgc tggaaatgga ttatcaacgt gttaatcatg gcgtgaatct gaactggcag 1200
tccatgccgc agttagaaaa aatgaacacg ttgcagccat accaattggt tgatttatat 1260
ggagaaggaa tttccggcgt actttatcag gatactcaga aagcctggtg gtaccgtgct 1320
ccggtacggg atatcactgc cgaaggaacg aatgcggtta cctatgagga ggccaaacca 1380
ctgccacata ttccggcaca acaggaaagc gcgatgttgt tggacatcaa tggtgacggg 1440
cgtctggatt gggtgattac ggcatcaggg ttacggggct accacaccat gtcaccggaa 1500
ggtgaatgga caccctttat tccattatcc gctgtgccaa tggaatattt ccatccgcag 1560
gcaaaactgg ctgatattga tggggctggg ctgcctgact tagcgcttat cgggccaaat 1620
agtgtacgtg tctggtcaaa taatcgggca ggatgggatc gcgctcagga tgtgattcat 1680
ttgtcagata tgccactgcc ggttcccggc agaaatgagc gtcatcttgt cgcattcagt 1740
gatatgacag gctccgggca atcacatctg gtggaagtaa cggcagatag cgtgcgctac 1800
tggccgaacc tggggcatgg aaaatttggt gagcctctga tgatgacagg cttccagatt 1860
agcggggaaa cgtttaaccc cgacagactg tatatggtag acatagatgg ctcaggcacc 1920
accgatttta tttatgcccg caatacttac cttgaactct atgccaatga aagcggcaat 1980
cattttgctg aacctcagcg tattgatctg ccggatgggg tacgttttga tgatacttgt 2040
cggttacaaa tagcggatac acaaggatta gggactgcca gcattatttt gacgatcccc 2100
catatgaagg tgcagcactg gcgattggat atgaccatat tcaagccttg gctgctgaat 2160
gccgtcaata acaatatggg aacagaaacc acgctgtatt atcgcagctc tgcccagttc 2220
tggctggatg agaaattaca ggcttctgaa tccgggatga cggtggtcag ctacttaccg 2280
ttcccggtgc atgtgttgtg gcgcacggaa gtgctggatg aaatttccgg taaccgattg 2340
accagccatt atcattactc acatggtgcc tgggatggtc tggaacggga gtttcgtggt 2400
tttgggcggg tgacacaaac tgatattgat tcacgggcga gtgcgacaca ggggacacat 2460
gctgaaccac cggcaccttc gcgcacggtt aattggtacg gcactggcgt acgggaagtc 2520
gatattcttc tgcccacgga atattggcag ggggatcaac aggcatttcc ccattttacc 2580
ccacgcttta cccgttatga cgaaaaatcc ggtggtgata tgacggtcac gccgagcgaa 2640
caggaagaat actggttaca tcgagcctta aaaggacaac gtttacgcag tgagctgtat 2700
ggggatgatg attctatact ggccggtacg ccttattcag tggatgaatc ccgcacccaa 2760
gtacgtttgt taccggtgat ggtatcggac gtgcctgcgg tactggtttc ggtggccgaa 2820
tcccgccaat accgatatga acgggttgct accgatccac agtgcagcca aaagatcgtc 2880
cttaaatctg atgcgttagg atttccgcag gacaatcttg agattgccta ttcgagacgt 2940
ccacagcctg agttctcgcc ttatccggat accctgcccg aaacactttt caccagcagt 3000
ttcgacgaac agcagatgtt ccttcgtctg acacgccagc gttcttctta tcatcatctg 3060
aatcatgatg ataatacgtg gatcacaggg cttatggata cctcacgcag tgacgcacgt 3120
atttatcaag ccgataaagt gccggacggt ggattttccc ttgaatggtt ttctgccaca 3180
ggtgcaggag cattgttgtt gcctgatgcc gcagccgatt atctgggaca tcagcgtgta 3240
gcatataccg gtccagaaga acaacccgct attcctccgc tggtggcata cattgaaacc 3300
gcagagtttg atgaacgatc gttggcggct tttgaggagg tgatggatga gcaggagctg 3360
acaaaacagc tgaatgatgc gggctggaat acggcaaaag tgccgttcag tgaaaagaca 3420
gatttccatg tctgggtggg acaaaaggaa tttacagaat atgccggtgc agacggattc 3480
tatcggccat tggtgcaacg ggaaaccaag cttacaggta aaacgacagt cacgtgggat 3540
agccattact gtgttatcac cgcaacagag gatgcggctg gcctgcgtat gcaagcgcat 3600
tacgattatc gatttatggt tgcggataac accacagatg tcaatgataa ctatcacacc 3660
gtgacgtttg atgcactggg gagggtaacc agcttccgtt tctgggggac tgaaaacggt 3720
gaaaaacaag gatatacccc tgcggaaaat gaaactgtcc cctttattgt ccccacaacg 3780
gtggatgatg ctctggcatt gaaacccggt atacctgttg cagggctgat ggtttatgcc 3840
cctctgagct ggatggttca ggccagcttt tctaatgatg gggagcttta tggagagctg 3900
aaaccggctg ggatcatcac tgaagatggt tatctcctgt cgcttgcttt tcgccgctgg 3960
caacaaaata accctgccgc tgccatgcca aagcaagtca attcacagaa cccaccccat 4020
gtactgagtg tgatcaccga ccgctatgat gccgatccgg aacaacaatt acgtcaaacg 4080
tttacgttta gtgatggttt tgggcgaacc ttacaaacag ccgtacgcca tgaaagtggt 4140
gaagcctggg tacgtgatga gtatggagcc attgtggctg aaaatcatgg cgcgcctgaa 4200
acggcgatga cagatttccg ttgggcagtt tccggacgta cagaatatga cggaaaaggc 4260
caagccctgc gtaagtatca accgtatttc ctgaatagtt ggcagtacgt cagtgatgac 4320
agtgcccggc aggatatata tgccgatacc cattactatg atccgttggg gcgtgaatat 4380
caggttatca cggccaaagg cgggtttcgt cgatccttat tcactccctg gtttgtggtg 4440
aatgaagatg aaaatgacac tgccggtgaa atgacagca 4479
<210>18
<211>1493
<212>PRT
<213>Xenorhabdus nematophilus
<400>18
Met Gln Gly Ser Thr Pro Leu Lys Leu Glu Ile Pro Ser Leu Pro Ser
1 5 10 15
Gly Gly Gly Ser Leu Lys Gly Met Gly Glu Ala Leu Asn Ala Val Gly
20 25 30
Ala Glu Gly Gly Ala Ser Phe Ser Leu Pro Leu Pro Ile Ser Val Gly
35 40 45
Arg Gly Leu Val Pro Val Leu Ser Leu Asn Tyr Ser Ser Thr Ala Gly
50 55 60
Asn Gly Ser Phe Gly Met Gly Trp Gln Cys Gly Val Gly Phe Ile Ser
65 70 75 80
Leu Arg Thr Ala Lys Gly Val Pro His Tyr Thr Gly Gln Asp Glu Tyr
85 90 95
Leu Gly Pro Asp Gly Glu Val Leu Ser Ile Val Pro Asp Ser Gln Gly
100 105 110
Gln Pro Glu Gln Arg Thr Ala Thr Ser Leu Leu Gly Thr Val Leu Thr
115 120 125
Gln Pro His Thr Val Thr Arg Tyr Gln Ser Arg Val Ala Glu Lys Ile
130 135 140
Val Arg Leu Glu His Trp Gln Pro Gln Gln Arg Arg Glu Glu Glu Thr
145 150 155 160
Ser Phe Trp Val Leu Phe Thr Ala Asp Gly Leu Val His Leu Phe Gly
165 170 175
Lys His His His Ala Arg Ile Ala Asp Pro Gln Asp Glu Thr Arg Ile
180 185 190
Ala Arg Trp Leu Met Glu Glu Thr Val Thr His Thr Gly Glu His Ile
195 200 205
Tyr Tyr His Tyr Arg Ala Glu Asp Asp Leu Asp Cys Asp Glu His Glu
210 215 220
Leu Ala Gln His Ser Gly Val Thr Ala Gln Arg Tyr Leu Ala Lys Val
225 230 235 240
Ser Tyr Gly Asn Thr Gln Pro Glu Thr Ala Phe Phe Ala Val Lys Ser
245 250 255
Gly Ile Pro Ala Asp Asn Asp Trp Leu Phe His Leu Val Phe Asp Tyr
260 265 270
Gly Glu Arg Ser Ser Ser Leu Asn Ser Val Pro Glu Phe Asn Val Ser
275 280 285
Glu Asn Asn Val Ser Glu Asn Asn Val Pro Glu Lys Trp Arg Cys Arg
290 295 300
Pro Asp Ser Phe Ser Arg Tyr Glu Tyr Gly Phe Glu Ile Arg Thr Arg
305 310 315 320
Arg Leu Cys Arg Gln Val Leu Met Phe His Gln Leu Lys Ala Leu Ala
325 330 335
Gly Glu Lys Val Ala Glu Glu Thr Pro Ala Leu Val Ser Arg Leu Ile
340 345 350
Leu Asp Tyr Asp Leu Asn Asn Lys Val Ser Leu Leu Gln Thr Ala Arg
355 360 365
Arg Leu Ala His Glu Thr Asp Gly Thr Pro Val Met Met Ser Pro Leu
370 375 380
Glu Met Asp Tyr Gln Arg Val Asn His Gly Val Asn Leu Asn Trp Gln
385 390 395 400
Ser Met Pro Gln Leu Glu Lys Met Asn Thr Leu Gln Pro Tyr Gln Leu
405 410 415
Val Asp Leu Tyr Gly Glu Gly Ile Ser Gly Val Leu Tyr Gln Asp Thr
420 425 430
Gln Lys Ala Trp Trp Tyr Arg Ala Pro Val Arg Asp Ile Thr Ala Glu
435 440 445
Gly Thr Asn Ala Val Thr Tyr Glu Glu Ala Lys Pro Leu Pro His Ile
450 455 460
Pro Ala Gln Gln Glu Ser Ala Met Leu Leu Asp Ile Asn Gly Asp Gly
465 470 475 480
Arg Leu Asp Trp Val Ile Thr Ala Ser Gly Leu Arg Gly Tyr His Thr
485 490 495
Met Ser Pro Glu Gly Glu Trp Thr Pro Phe Ile Pro Leu Ser Ala Val
500 505 510
Pro Met Glu Tyr Phe His Pro Gln Ala Lys Leu Ala Asp Ile Asp Gly
515 520 525
Ala Gly Leu Pro Asp Leu Ala Leu Ile Gly Pro Asn Ser Val Arg Val
530 535 540
Trp Ser Asn Asn Arg Ala Gly Trp Asp Arg Ala Gln Asp Val Ile His
545 550 555 560
Leu Ser Asp Met Pro Leu Pro Val Pro Gly Arg Asn Glu Arg His Leu
565 570 575
Val Ala Phe Ser Asp Met Thr Gly Ser Gly Gln Ser His Leu Val Glu
580 585 590
Val Thr Ala Asp Ser Val Arg Tyr Trp Pro Asn Leu Gly His Gly Lys
595 600 605
Phe Gly Glu Pro Leu Met Met Thr Gly Phe Gln Ile Ser Gly Glu Thr
610 615 620
Phe Asn Pro Asp Arg Leu Tyr Met Val Asp Ile Asp Gly Ser Gly Thr
625 630 635 640
Thr Asp Phe Ile Tyr Ala Arg Asn Thr Tyr Leu Glu Leu Tyr Ala Asn
645 650 655
Glu Ser Gly Asn His Phe Ala Glu Pro Gln Arg Ile Asp Leu Pro Asp
660 665 670
Gly Val Arg Phe Asp Asp Thr Cys Arg Leu Gln Ile Ala Asp Thr Gln
675 680 685
Gly Leu Gly Thr Ala Ser Ile Ile Leu Thr Ile Pro His Met Lys Val
690 695 700
Gln His Trp Arg Leu Asp Met Thr Ile Phe Lys Pro Trp Leu Leu Asn
705 710 715 720
Ala Val Asn Asn Asn Met Gly Thr Glu Thr Thr Leu Tyr Tyr Arg Ser
725 730 735
Ser Ala Gln Phe Trp Leu Asp Glu Lys Leu Gln Ala Ser Glu Ser Gly
740 745 750
Met Thr Val Val Ser Tyr Leu Pro Phe Pro Val His Val Leu Trp Arg
755 760 765
Thr Glu Val Leu Asp Glu Ile Ser Gly Asn Arg Leu Thr Ser His Tyr
770 775 780
His Tyr Ser His Gly Ala Trp Asp Gly Leu Glu Arg Glu Phe Arg Gly
785 790 795 800
Phe Gly Arg Val Thr Gln Thr Asp Ile Asp Ser Arg Ala Ser Ala Thr
805 810 815
Gln Gly Thr His Ala Glu Pro Pro Ala Pro Ser Arg Thr Val Asn Trp
820 825 830
Tyr Gly Thr Gly Val Arg Glu Val Asp Ile Leu Leu Pro Thr Glu Tyr
835 840 845
Trp Gln Gly Asp Gln Gln Ala Phe Pro His Phe Thr Pro Arg Phe Thr
850 855 860
Arg Tyr Asp Glu Lys Ser Gly Gly Asp Met Thr Val Thr Pro Ser Glu
865 870 875 880
Gln Glu Glu Tyr Trp Leu His Arg Ala Leu Lys Gly Gln Arg Leu Arg
885 890 895
Ser Glu Leu Tyr Gly Asp Asp Asp Ser Ile Leu Ala Gly Thr Pro Tyr
900 905 910
Ser Val Asp Glu Ser Arg Thr Gln Val Arg Leu Leu Pro Val Met Val
915 920 925
Ser Asp Val Pro Ala Val Leu Val Ser Val Ala Glu Ser Arg Gln Tyr
930 935 940
Arg Tyr Glu Arg Val Ala Thr Asp Pro Gln Cys Ser Gln Lys Ile Val
945 950 955 960
Leu Lys Ser Asp Ala Leu Gly Phe Pro Gln Asp Asn Leu Glu Ile Ala
965 970 975
Tyr Ser Arg Arg Pro Gln Pro Glu Phe Ser Pro Tyr Pro Asp Thr Leu
980 985 990
Pro Glu Thr Leu Phe Thr Ser Ser Phe Asp Glu Gln Gln Met Phe Leu
995 1000 1005
Arg Leu Thr Arg Gln Arg Ser Ser Tyr His His Leu Asn His Asp
1010 1015 1020
Asp Asn Thr Trp Ile Thr Gly Leu Met Asp Thr Ser Arg Ser Asp
1025 1030 1035
Ala Arg Ile Tyr Gln Ala Asp Lys Val Pro Asp Gly Gly Phe Ser
1040 1045 1050
Leu Glu Trp Phe Ser Ala Thr Gly Ala Gly Ala Leu Leu Leu Pro
1055 1060 1065
Asp Ala Ala Ala Asp Tyr Leu Gly His Gln Arg Val Ala Tyr Thr
1070 1075 1080
Gly Pro Glu Glu Gln Pro Ala Ile Pro Pro Leu Val Ala Tyr Ile
1085 1090 1095
Glu Thr Ala Glu Phe Asp Glu Arg Ser Leu Ala Ala Phe Glu Glu
1100 1105 1110
Val Met Asp Glu Gln Glu Leu Thr Lys Gln Leu Asn Asp Ala Gly
1115 1120 1125
Trp Asn Thr Ala Lys Val Pro Phe Ser Glu Lys Thr Asp Phe His
1130 1135 1140
Val Trp Val Gly Gln Lys Glu Phe Thr Glu Tyr Ala Gly Ala Asp
1145 1150 1155
Gly Phe Tyr Arg Pro Leu Val Gln Arg Glu Thr Lys Leu Thr Gly
1160 1165 1170
Lys Thr Thr Val Thr Trp Asp Ser His Tyr Cys Val Ile Thr Ala
1175 1180 1185
Thr Glu Asp Ala Ala Gly Leu Arg Met Gln Ala His Tyr Asp Tyr
1190 1195 1200
Arg Phe Met Val Ala Asp Asn Thr Thr Asp Val Asn Asp Asn Tyr
1205 1210 1215
His Thr Val Thr Phe Asp Ala Leu Gly Arg Val Thr Ser Phe Arg
1220 1225 1230
Phe Trp Gly Thr Glu Asn Gly Glu Lys Gln Gly Tyr Thr Pro Ala
1235 1240 1245
Glu Asn Glu Thr Val Pro Phe Ile Val Pro Thr Thr Val Asp Asp
1250 1255 1260
Ala Leu Ala Leu Lys Pro Gly Ile Pro Val Ala Gly Leu Met Val
1265 1270 1275
Tyr Ala Pro Leu Ser Trp Met Val Gln Ala Ser Phe Ser Asn Asp
1280 1285 1290
Gly Glu Leu Tyr Gly Glu Leu Lys Pro Ala Gly Ile Ile Thr Glu
1295 1300 1305
Asp Gly Tyr Leu Leu Ser Leu Ala Phe Arg Arg Trp Gln Gln Asn
1310 1315 1320
Asn Pro Ala Ala Ala Met Pro Lys Gln Val Asn Ser Gln Asn Pro
1325 1330 1335
Pro His Val Leu Ser Val Ile Thr Asp Arg Tyr Asp Ala Asp Pro
1340 1345 1350
Glu Gln Gln Leu Arg Gln Thr Phe Thr Phe Ser Asp Gly Phe Gly
1355 1360 1365
Arg Thr Leu Gln Thr Ala Val Arg His Glu Ser Gly Glu Ala Trp
1370 1375 1380
Val Arg Asp Glu Tyr Gly Ala Ile Val Ala Glu Asn His Gly Ala
1385 1390 1395
Pro Glu Thr Ala Met Thr Asp Phe Arg Trp Ala Val Ser Gly Arg
1400 1405 1410
Thr Glu Tyr Asp Gly Lys Gly Gln Ala Leu Arg Lys Tyr Gln Pro
1415 1420 1425
Tyr Phe Leu Asn Ser Trp Gln Tyr Val Ser Asp Asp Ser Ala Arg
1430 1435 1440
Gln Asp Ile Tyr Ala Asp Thr His Tyr Tyr Asp Pro Leu Gly Arg
1445 1450 1455
Glu Tyr Gln Val Ile Thr Ala Lys Gly Gly Phe Arg Arg Ser Leu
1460 1465 1470
Phe Thr Pro Trp Phe Val Val Asn Glu Asp Glu Asn Asp Thr Ala
1475 1480 1485
Gly Glu Met Thr Ala
1490
<210>19
<211>7614
<212>DNA
<213>Xenorhabdus nematophilus
<400>19
atgtatagca cggctgtatt actcaataaa atcagtccca ctcgcgacgg tcagacgatg 60
actcttgcgg atctgcaata tttatccttc agtgaactga gaaaaatctt tgatgaccag 120
ctcagttggg gagaggctcg ccatctctat catgaaacta tagagcagaa aaaaaataat 180
cgcttgctgg aagcgcgtat ttttacccgt gccaacccac aattatccgg tgctatccga 240
ctcggtattg aacgagacag cgtttcacgc agttatgatg aaatgtttgg tgcccgttct 300
tcttcctttg tgaaaccggg ttcagtggct tccatgtttt caccggctgg ctatctcacc 360
gaattgtatc gtgaagcgaa ggacttacat ttttcaagct ctgcttatca tcttgataat 420
cgccgtccgg atctggctga tctgactctg agccagagta atatggatac agaaatttcc 480
accctgacac tgtctaacga actgttgctg gagcatatta cccgcaagac cggaggtgat 540
tcggacgcat tgatggagag cctgtcaact taccgtcagg ccattgatac cccttaccat 600
cagccttacg agactatccg tcaggtcatt atgacccatg acagtacact gtcagcgctg 660
tcccgtaatc ctgaggtgat ggggcaggcg gaaggggctt cattactggc gattctggcc 720
aatatttctc cggagcttta taacattttg accgaagaga ttacggaaaa gaacgctgat 780
gctttatttg cgcaaaactt cagtgaaaat atcacgcccg aaaatttcgc gtcacaatca 840
tggatagcca agtattatgg tcttgaactt tctgaggtgc aaaaatacct cgggatgttg 900
cagaatggct attctgacag cacctctgct tatgtggata atatctcaac gggtttagtg 960
gtcaataatg aaagtaaact cgaagcttac aaaataacac gtgtaaaaac agatgattat 1020
gataaaaata taaattactt tgatttgatg tatgaaggaa ataatcagtt ctttatacgt 1080
gctaatttta aggtatcaag agaatttggg gctactctta gaaaaaacgc agggccaagt 1140
ggcattgtcg gcagcctttc cggtcctcta atagccaata cgaattttaa aagtaattat 1200
ctaagtaaca tatctgattc tgaatacaaa aacggtgtaa agatatacgc ctatcgctat 1260
acgtcttcca ccagcgccac aaatcagggc ggcggaatat tcacttttga gtcttatccc 1320
ctgactatat ttgcgctcaa actgaataaa gccattcgct tgtgcctgac tagcgggctt 1380
tcaccgaatg aactgcaaac tatcgtacgc agtgacaatg cacaaggcat catcaacgac 1440
tccgttctga ccaaagtttt ctatactctg ttctacagtc accgttatgc actgagcttt 1500
gatgatgcac aggtactgaa cggatcggtc attaatcaat atgccgacga tgacagtgtc 1560
agtcatttta accgtctctt taatacaccg ccgctgaaag ggaaaatctt tgaagccgac 1620
ggcaacacgg tcagcattga tccggatgaa gagcaatcta cctttgcccg ttcagccctg 1680
atgcgtggtc tgggggtcaa cagtggtgaa ctgtatcagt taggcaaact ggcgggtgtg 1740
ctggacgccc aaaataccat cacactttct gtcttcgtta tctcttcact gtatcgcctc 1800
acgttactgg cccgtgtcca tcagctgacg gtcaatgaac tgtgtatgct ttatggtctt 1860
tcgccgttca atggcaaaac aacggcttct ttgtcttccg gggagttgcc acggctggtt 1920
atctggctgt atcaggtgac gcagtggctg actgaggcgg aaatcaccac tgaagcgatc 1980
tggttattat gtacgccaga gtttagcggg aatatttcac cggaaatcag taatctgctc 2040
aataacctcc gaccgagtat tagtgaagat atggcacaga gtcacaatcg ggagctgcag 2100
gctgaaattc tcgcgccgtt tattgctgca acgctgcatc tggcgtcacc ggatatggca 2160
cggtatatcc tgttgtggac cgataacctg cggccgggtg gcttagatat tgccgggttt 2220
atgacactgg tattgaaaga gtcgttaaat gccaatgaaa ccacccaatt ggtacaattc 2280
tgccatgtga tggcacagtt atcgctttcc gtacagacac tgcgcctcag tgaagcggag 2340
ctatccgtgc tggtcatctc cggattcgcc gtgctggggg caaaaaatca acctgccgga 2400
cagcacaata ttgatacgct attctcactc taccgattcc accagtggat taatgggctg 2460
ggcaatcccg gctctgacac gctggatatg ctgcgccagc agacactcac ggccgacaga 2520
ctggcctccg tgatggggct ggacatcagt atggtaacgc aggccatggt ttccgccggc 2580
gtgaaccagc ttcagtgttg gcaggatatc aacaccgtgt tgcagtggat agatgtggca 2640
tcagcactgc acacgatgcc gtcggttatc cgtacgctgg tgaatatccg ttacgtgact 2700
gcattaaaca aagccgagtc gaatctgcct tcctgggatg agtggcagac actggcagaa 2760
aatatggaag ccggactcag tacacaacag gctcagacgc tggcggatta taccgcggag 2820
cgcctgagta gcgtgctgtg caattggttt ctggcgaata tccagccaga aggggtgtcc 2880
ctgcacagcc gggatgacct gtacagctat ttcctgattg ataatcaggt ctcttctgcc 2940
ataaaaacca cccgactggc agaggccatt gccggtattc agctctacat caaccgggcg 3000
ctgaatcgga tagagcctaa tgcccgtgcc gatgtgtcaa cccgccagtt ttttaccgac 3060
tggacggtga ataaccgtta cagcacctgg ggcggggtgt cgcggctggt ttattatccg 3120
gaaaattaca ttgacccaac ccagcgtatc gggcagaccc ggatgatgga tgaactgctg 3180
gaaaatatca gccagagtaa acttagccgg gacacagtgg aggatgcctt taaaacttac 3240
ctgacccgct ttgaaaccgt ggcggatctg aaagttgtca gcgcctatca cgacaacgtc 3300
aacagcaaca ccggactgac ctggtttgtc ggccaaacgc gggagaacct gccggaatac 3360
tactggcgta acgtggatat atcacggatg caggcgggtg aactggccgc caatgcctgg 3420
aaagagtgga cgaagattga tacagcggtc aacccctaca aggatgcaat acgtccggtc 3480
atattcaggg aacgtttgca ccttatctgg gtagaaaaag aggaagtggc gaaaaatggt 3540
actgatccgg tggaaaccta tgaccgtttt actctgaaac tggcgtttct gcgtcatgat 3600
ggcagttgga gtgccccctg gtcttacgat atcacaacgc aggtggaggc ggtcactgac 3660
aaaaaacctg acactgaacg gctggcgctg gccgcatcag gctttcaggg cgaggacact 3720
ctgctggtgt ttgtctacaa aaccgggaag agttactcgg attttggcgg cagcaataaa 3780
aatgtggcag gcatgaccat ttacggcgat ggctccttca aaaagatgga gaacacagca 3840
ctcagccgtt acagccaact gaaaaatacc tttgatatca ttcatactca aggcaacgac 3900
ttggtaagaa aggccagcta tcgtttcgcg caggattttg aagtgcctgc ctcgttgaat 3960
atgggttctg ccatcggtga tgatagtctg acggtgatgg agaacgggaa tattccgcag 4020
ataaccagta aatactccag cgataacctt gctattacgc tacataacgc cgctttcact 4080
gtcagatatg atggcagtgg caatgtcatc agaaacaaac aaatcagcgc catgaaactg 4140
acgggggtgg atggaaagtc ccagtacggc aatgcattta tcatcgcaaa taccgttaaa 4200
cattatggcg gttactctga tctggggggg ccgatcaccg tttataataa aacgaaaaac 4260
tatattgcat cagttcaagg ccacttgatg aacgcagatt acactaggcg tttgattcta 4320
acaccagttg aaaataatta ttatgccaga ttgttcgagt ttccattttc tccaaacaca 4380
attttaaaca ccgttttcac ggttggtagc aataaaacca gtgattttaa aaagtgcagt 4440
tatgctgttg atggtaataa ttctcagggc ttccagatat ttagttccta tcaatcatcc 4500
ggctggctgg atattgatac aggcattaac aataccgata tcaaaattac ggtgatggct 4560
ggcagtaaaa cccacacctt tacggccagt gaccatattg cttccttgcc ggcaaacagt 4620
tttgatgcta tgccgtacac ctttaagcca ctggaaatcg atgcttcatc gttggccttt 4680
accaataata ttgctcctct ggatatcgtt tttgagacca aagccaaaga cgggcgagtg 4740
ctgggtaaga tcaagcaaac attatcggtg aaacgggtaa attataatcc ggaagatatt 4800
ctgtttctgc gtgaaactca ttcgggtgcc caatatatgc agctcggggt gtatcgtatt 4860
cgtcttaata ccctgctggc ttctcaactg gtatccagag caaacacggg cattgatact 4920
atcctgacaa tggaaaccca gcggttaccg gaacctccgt tgggagaagg cttctttgcc 4980
aactttgttc tgcctaaata tgaccctgct gaacatggcg atgagcggtg gtttaaaatc 5040
catattggga atgttggcgg taacacggga aggcagcctt attacagcgg aatgttatcc 5100
gatacgtcgg aaaccagtat gacactgttt gtcccttatg ccgaagggta ttacatgcat 5160
gaaggtgtca gattgggggt tggataccag aaaattacct atgacaacac ttgggaatct 5220
gctttctttt attttgatga gacaaaacag caatttgtat taattaacga tgctgatcat 5280
gattcaggaa tgacgcaaca ggggatcgtg aaaaatatca agaaatacaa aggatttttg 5340
aatgtttcta tcgcaacggg ctattccgcc ccgatggatt tcaatagtgc cagcgccctc 5400
tattactggg aattgttcta ttacaccccg atgatgtgct tccagcgttt gctacaggaa 5460
aaacaattcg acgaagccac acaatggata aactacgtct acaatcccgc cggctatatc 5520
gttaacggag aaatcgcccc ctggatctgg aactgccggc cgctggaaga gaccacctcc 5580
tggaatgcca atccgctgga tgccatcgat ccggatgccg tcgcccaaaa tgacccaatg 5640
cactacaaga ttgccacctt tatgcgcctg ttggatcaac ttattctgcg cggcgatatg 5700
gcctatcgag aactgacccg cgatgcgttg aatgaagcca aaatgtggta tgtgcgtact 5760
ttagaattgc tcggtgatga gccggaggat tacggtagcc aacagtgggc agcaccgtcc 5820
ctttccgggg cggcgagtca aaccgtgcag gcggcttatc agcaggatct tacgatgctg 5880
ggccgtggtg gggtttccaa gaatctccgt accgctaact cgttggtggg tttgttcctg 5940
ccggaatata acccggcgct caccgattac tggcaaaccc tgcgtttgcg cctgtttaac 6000
ctgcgccata atctttccat tgacggacag ccgttatcgc tggcgattta cgccgagcct 6060
accgatccga aagcgctgct caccagtatg gtacaggcct ctcagggcgg tagtgcagtg 6120
ctgcccggca cattgtcgtt ataccgcttc ccggtgatgc tggagcggac ccgcaatctg 6180
gtagcgcaat taacccagtt cggcacctct ctgctcagta tggcagagca tgatgatgcc 6240
gatgaactca ccacgctgct actacagcag ggtatggaac tggcgacaca gagcatccgt 6300
attcagcaac gaactgtcga tgaagtggat gctgatattg ctgtattggc agagagccgc 6360
cgcagtgcac aaaatcgtct ggaaaaatac cagcagctgt atgacgagga tatcaaccac 6420
ggagaacagc gggcaatgtc actgcttgat gcagcggcag gtcagtctct ggccgggcag 6480
gtgctttcaa tagcggaagg ggtggccgat ttagtgccaa acgtgttcgg tttagcttgt 6540
ggcggcagtc gttggggggc agcactgcgt gcttccgcct ccgtgatgtc gctttctgcc 6600
acagcttccc aatattccgc agacaaaatc agccgttcgg aagcctaccg ccgccgccgt 6660
caggagtggg aaattcagcg tgataatgct gacggtgaag tcaaacaaat ggatgcccag 6720
ttggaaagcc tgaaaatccg ccgcgaagca gcacagatgc aggtggaata tcaggagacc 6780
cagcaggccc atactcaggc tcagttagag ctgttacagc gtaaattcac aaacaaagcg 6840
ctttacagtt ggatgcgcgg caagctgagt gctatctatt accagttctt tgacctgacc 6900
cagtccttct gcctgatggc acaggaagcg ctgcgccgcg agctgaccga caacggtgtt 6960
acctttatcc ggggtggggc ctggaacggt acgactgcgg gtttgatggc gggtgaaacg 7020
ttgctgctga atctggcaga aatggaaaaa gtctggctgg agcgtgatga gcgggcactg 7080
gaagtgaccc gtaccgtctc gttggcacag ttctatcagg ccttatcatc agacaacttt 7140
aatctgaccg aaaaactcac gcaattcctg cgtgaaggga aaggcaacgt aggagcttcc 7200
ggcaatgaat taaaactcag taaccgtcag atagaagcct cagtgcgatt gtctgatttg 7260
aaaattttca gcgactaccc cgaaagcctt ggcaataccc gtcagttgaa acaggtgagt 7320
gtcaccttgc cggcgctggt tgggccgtat gaagatattc gggcggtgct gaattacggg 7380
ggcagcatcg tcatgccacg cggttgcagt gctattgctc tctcccacgg cgtgaatgac 7440
agtggtcaat ttatgctgga tttcaacgat tcccgttatc tgccgtttga aggtatttcc 7500
gtgaatgaca gcggcagcct gacgttgagt ttcccggatg cgactgatcg gcagaaagcg 7560
ctgctggaga gcctgagcga tatcattctg catatccgct ataccattcg ttct 7614
<210>20
<211>2538
<212>PRT
<213>Xenorhabdus nematophilus
<400>20
Met Tyr Ser Thr Ala Val Leu Leu Asn Lys Ile Ser Pro Thr Arg Asp
1 5 10 15
Gly Gln Thr Met Thr Leu Ala Asp Leu Gln Tyr Leu Ser Phe Ser Glu
20 25 30
Leu Arg Lys Ile Phe Asp Asp Gln Leu Ser Trp Gly Glu Ala Arg His
35 40 45
Leu Tyr His Glu Thr Ile Glu Gln Lys Lys Asn Asn Arg Leu Leu Glu
50 55 60
Ala Arg Ile Phe Thr Arg Ala Asn Pro Gln Leu Ser Gly Ala Ile Arg
65 70 75 80
Leu Gly Ile Glu Arg Asp Ser Val Ser Arg Ser Tyr Asp Glu Met Phe
85 90 95
Gly Ala Arg Ser Ser Ser Phe Val Lys Pro Gly Ser Val Ala Ser Met
100 105 110
Phe Ser Pro Ala Gly Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Lys Asp
115 120 125
Leu His Phe Ser Ser Ser Ala Tyr His Leu Asp Asn Arg Arg Pro Asp
130 135 140
Leu Ala Asp Leu Thr Leu Ser Gln Ser Asn Met Asp Thr Glu Ile Ser
145 150 155 160
Thr Leu Thr Leu Ser Asn Glu Leu Leu Leu Glu His Ile Thr Arg Lys
165 170 175
Thr Gly Gly Asp Ser Asp Ala Leu Met Glu Ser Leu Ser Thr Tyr Arg
180 185 190
Gln Ala Ile Asp Thr Pro Tyr His Gln Pro Tyr Glu Thr Ile Arg Gln
195 200 205
Val Ile Met Thr His Asp Ser Thr Leu Ser Ala Leu Ser Arg Asn Pro
210 215 220
Glu Val Met Gly Gln Ala Glu Gly Ala Ser Leu Leu Ala Ile Leu Ala
225 230 235 240
Asn Ile Ser Pro Glu Leu Tyr Asn Ile Leu Thr Glu Glu Ile Thr Glu
245 250 255
Lys Asn Ala Asp Ala Leu Phe Ala Gln Asn Phe Ser Glu Asn Ile Thr
260 265 270
Pro Glu Asn Phe Ala Ser Gln Ser Trp Ile Ala Lys Tyr Tyr Gly Leu
275 280 285
Glu Leu Ser Glu Val Gln Lys Tyr Leu Gly Met Leu Gln Asn Gly Tyr
290 295 300
Ser Asp Ser Thr Ser Ala Tyr Val Asp Asn Ile Ser Thr Gly Leu Val
305 310 315 320
Val Asn Asn Glu Ser Lys Leu Glu Ala Tyr Lys Ile Thr Arg Val Lys
325 330 335
Thr Asp Asp Tyr Asp Lys Asn Ile Asn Tyr Phe Asp Leu Met Tyr Glu
340 345 350
Gly Asn Asn Gln Phe Phe Ile Arg Ala Asn Phe Lys Val Ser Arg Glu
355 360 365
Phe Gly Ala Thr Leu Arg Lys Asn Ala Gly Pro Ser Gly Ile Val Gly
370 375 380
Ser Leu Ser Gly Pro Leu Ile Ala Asn Thr Asn Phe Lys Ser Asn Tyr
385 390 395 400
Leu Ser Asn Ile Ser Asp Ser Glu Tyr Lys Asn Gly Val Lys Ile Tyr
405 410 415
Ala Tyr Arg Tyr Thr Ser Ser Thr Ser Ala Thr Asn Gln Gly Gly Gly
420 425 430
Ile Phe Thr Phe Glu Ser Tyr Pro Leu Thr Ile Phe Ala Leu Lys Leu
435 440 445
Asn Lys Ala Ile Arg Leu Cys Leu Thr Ser Gly Leu Ser Pro Asn Glu
450 455 460
Leu Gln Thr Ile Val Arg Ser Asp Asn Ala Gln Gly Ile Ile Asn Asp
465 470 475 480
Ser Val Leu Thr Lys Val Phe Tyr Thr Leu Phe Tyr Ser His Arg Tyr
485 490 495
Ala Leu Ser Phe Asp Asp Ala Gln Val Leu Asn Gly Ser Val Ile Asn
500 505 510
Gln Tyr Ala Asp Asp Asp Ser Val Ser His Phe Asn Arg Leu Phe Asn
515 520 525
Thr Pro Pro Leu Lys Gly Lys Ile Phe Glu Ala Asp Gly Asn Thr Val
530 535 540
Ser Ile Asp Pro Asp Glu Glu Gln Ser Thr Phe Ala Arg Ser Ala Leu
545 550 555 560
Met Arg Gly Leu Gly Val Asn Ser Gly Glu Leu Tyr Gln Leu Gly Lys
565 570 575
Leu Ala Gly Val Leu Asp Ala Gln Asn Thr Ile Thr Leu Ser Val Phe
580 585 590
Val Ile Ser Ser Leu Tyr Arg Leu Thr Leu Leu Ala Arg Val His Gln
595 600 605
Leu Thr Val Asn Glu Leu Cys Met Leu Tyr Gly Leu Ser Pro Phe Asn
610 615 620
Gly Lys Thr Thr Ala Ser Leu Ser Ser Gly Glu Leu Pro Arg Leu Val
625 630 635 640
Ile Trp Leu Tyr Gln Val Thr Gln Trp Leu Thr Glu Ala Glu Ile Thr
645 650 655
Thr Glu Ala Ile Trp Leu Leu Cys Thr Pro Glu Phe Ser Gly Asn Ile
660 665 670
Ser Pro Glu Ile Ser Asn Leu Leu Asn Asn Leu Arg Pro Ser Ile Ser
675 680 685
Glu Asp Met Ala Gln Ser His Asn Arg Glu Leu Gln Ala Glu Ile Leu
690 695 700
Ala Pro Phe Ile Ala Ala Thr Leu His Leu Ala Ser Pro Asp Met Ala
705 710 715 720
Arg Tyr Ile Leu Leu Trp Thr Asp Asn Leu Arg Pro Gly Gly Leu Asp
725 730 735
Ile Ala Gly Phe Met Thr Leu Val Leu Lys Glu Ser Leu Asn Ala Asn
740 745 750
Glu Thr Thr Gln Leu Val Gln Phe Cys His Val Met Ala Gln Leu Ser
755 760 765
Leu Ser Val Gln Thr Leu Arg Leu Ser Glu Ala Glu Leu Ser Val Leu
770 775 780
Val Ile Ser Gly Phe Ala Val Leu Gly Ala Lys Asn Gln Pro Ala Gly
785 790 795 800
Gln His Asn Ile Asp Thr Leu Phe Ser Leu Tyr Arg Phe His Gln Trp
805 810 815
Ile Asn Gly Leu Gly Asn Pro Gly Ser Asp Thr Leu Asp Met Leu Arg
820 825 830
Gln Gln Thr Leu Thr Ala Asp Arg Leu Ala Ser Val Met Gly Leu Asp
835 840 845
Ile Ser Met Val Thr Gln Ala Met Val Ser Ala Gly Val Asn Gln Leu
850 855 860
Gln Cys Trp Gln Asp Ile Asn Thr Val Leu Gln Trp Ile Asp Val Ala
865 870 875 880
Ser Ala Leu His Thr Met Pro Ser Val Ile Arg Thr Leu Val Asn Ile
885 890 895
Arg Tyr Val Thr Ala Leu Asn Lys Ala Glu Ser Asn Leu Pro Ser Trp
900 905 910
Asp Glu Trp Gln Thr Leu Ala Glu Asn Met Glu Ala Gly Leu Ser Thr
915 920 925
Gln Gln Ala Gln Thr Leu Ala Asp Tyr Thr Ala Glu Arg Leu Ser Ser
930 935 940
Val Leu Cys Asn Trp Phe Leu Ala Asn Ile Gln Pro Glu Gly Val Ser
945 950 955 960
Leu His Ser Arg Asp Asp Leu Tyr Ser Tyr Phe Leu Ile Asp Asn Gln
965 970 975
Val Ser Ser Ala Ile Lys Thr Thr Arg Leu Ala Glu Ala Ile Ala Gly
980 985 990
Ile Gln Leu Tyr Ile Asn Arg Ala Leu Asn Arg Ile Glu Pro Asn Ala
995 1000 1005
Arg Ala Asp Val Ser Thr Arg Gln Phe Phe Thr Asp Trp Thr Val
1010 1015 1020
Asn Asn Arg Tyr Ser Thr Trp Gly Gly Val Ser Arg Leu Val Tyr
1025 1030 1035
Tyr Pro Glu Asn Tyr Ile Asp Pro Thr Gln Arg Ile Gly Gln Thr
1040 1045 1050
Arg Met Met Asp Glu Leu Leu Glu Asn Ile Ser Gln Ser Lys Leu
1055 1060 1065
Ser Arg Asp Thr Val Glu Asp Ala Phe Lys Thr Tyr Leu Thr Arg
1070 1075 1080
Phe Glu Thr Val Ala Asp Leu Lys Val Val Ser Ala Tyr His Asp
1085 1090 1095
Asn Val Asn Ser Asn Thr Gly Leu Thr Trp Phe Val Gly Gln Thr
1100 1105 1110
Arg Glu Asn Leu Pro Glu Tyr Tyr Trp Arg Asn Val Asp Ile Ser
1115 1120 1125
Arg Met Gln Ala Gly Glu Leu Ala Ala Asn Ala Trp Lys Glu Trp
1130 1135 1140
Thr Lys Ile Asp Thr Ala Val Asn Pro Tyr Lys Asp Ala Ile Arg
1145 1150 1155
Pro Val Ile Phe Arg Glu Arg Leu His Leu Ile Trp Val Glu Lys
1160 1165 1170
Glu Glu Val Ala Lys Asn Gly Thr Asp Pro Val Glu Thr Tyr Asp
1175 1180 1185
Arg Phe Thr Leu Lys Leu Ala Phe Leu Arg His Asp Gly Ser Trp
1190 1195 1200
Ser Ala Pro Trp Ser Tyr Asp Ile Thr Thr Gln Val Glu Ala Val
1205 1210 1215
Thr Asp Lys Lys Pro Asp Thr Glu Arg Leu Ala Leu Ala Ala Ser
1220 1225 1230
Gly Phe Gln Gly Glu Asp Thr Leu Leu Val Phe Val Tyr Lys Thr
1235 1240 1245
Gly Lys Ser Tyr Ser Asp Phe Gly Gly Ser Asn Lys Asn Val Ala
1250 1255 1260
Gly Met Thr Ile Tyr Gly Asp Gly Ser Phe Lys Lys Met Glu Asn
1265 1270 1275
Thr Ala Leu Ser Arg Tyr Ser Gln Leu Lys Asn Thr Phe Asp Ile
1280 1285 1290
Ile His Thr Gln Gly Asn Asp Leu Val Arg Lys Ala Ser Tyr Arg
1295 1300 1305
Phe Ala Gln Asp Phe Glu Val Pro Ala Ser Leu Asn Met Gly Ser
1310 1315 1320
Ala Ile Gly Asp Asp Ser Leu Thr Val Met Glu Asn Gly Asn Ile
1325 1330 1335
Pro Gln Ile Thr Ser Lys Tyr Ser Ser Asp Asn Leu Ala Ile Thr
1340 1345 1350
Leu His Asn Ala Ala Phe Thr Val Arg Tyr Asp Gly Ser Gly Asn
1355 1360 1365
Val Ile Arg Asn Lys Gln Ile Ser Ala Met Lys Leu Thr Gly Val
1370 1375 1380
Asp Gly Lys Ser Gln Tyr Gly Asn Ala Phe Ile Ile Ala Asn Thr
1385 1390 1395
Val Lys His Tyr Gly Gly Tyr Ser Asp Leu Gly Gly Pro Ile Thr
1400 1405 1410
Val Tyr Asn Lys Thr Lys Asn Tyr Ile Ala Ser Val Gln Gly His
1415 1420 1425
Leu Met Asn Ala Asp Tyr Thr Arg Arg Leu Ile Leu Thr Pro Val
1430 1435 1440
Glu Asn Asn Tyr Tyr Ala Arg Leu Phe Glu Phe Pro Phe Ser Pro
1445 1450 1455
Asn Thr Ile Leu Asn Thr Val Phe Thr Val Gly Ser Asn Lys Thr
1460 1465 1470
Ser Asp Phe Lys Lys Cys Ser Tyr Ala Val Asp Gly Asn Asn Ser
1475 1480 1485
Gln Gly Phe Gln Ile Phe Ser Ser Tyr Gln Ser Ser Gly Trp Leu
1490 1495 1500
Asp Ile Asp Thr Gly Ile Asn Asn Thr Asp Ile Lys Ile Thr Val
1505 1510 1515
Met Ala Gly Ser Lys Thr His Thr Phe Thr Ala Ser Asp His Ile
1520 1525 1530
Ala Ser Leu Pro Ala Asn Ser Phe Asp Ala Met Pro Tyr Thr Phe
1535 1540 1545
Lys Pro Leu Glu Ile Asp Ala Ser Ser Leu Ala Phe Thr Asn Asn
1550 1555 1560
Ile Ala Pro Leu Asp Ile Val Phe Glu Thr Lys Ala Lys Asp Gly
1565 1570 1575
Arg Val Leu Gly Lys Ile Lys Gln Thr Leu Ser Val Lys Arg Val
1580 1585 1590
Asn Tyr Asn Pro Glu Asp Ile Leu Phe Leu Arg Glu Thr His Ser
1595 1600 1605
Gly Ala Gln Tyr Met Gln Leu Gly Val Tyr Arg Ile Arg Leu Asn
1610 1615 1620
Thr Leu Leu Ala Ser Gln Leu Val Ser Arg Ala Asn Thr Gly Ile
1625 1630 1635
Asp Thr Ile Leu Thr Met Glu Thr Gln Arg Leu Pro Glu Pro Pro
1640 1645 1650
Leu Gly Glu Gly Phe Phe Ala Asn Phe Val Leu Pro Lys Tyr Asp
1655 1660 1665
Pro Ala Glu His Gly Asp Glu Arg Trp Phe Lys Ile His Ile Gly
1670 1675 1680
Asn Val Gly Gly Asn Thr Gly Arg Gln Pro Tyr Tyr Ser Gly Met
1685 1690 1695
Leu Ser Asp Thr Ser Glu Thr Ser Met Thr Leu Phe Val Pro Tyr
1700 1705 1710
Ala Glu Gly Tyr Tyr Met His Glu Gly Val Arg Leu Gly Val Gly
1715 1720 1725
Tyr Gln Lys Ile Thr Tyr Asp Asn Thr Trp Glu Ser Ala Phe Phe
1730 1735 1740
Tyr Phe Asp Glu Thr Lys Gln Gln Phe Val Leu Ile Asn Asp Ala
1745 1750 1755
Asp His Asp Ser Gly Met Thr Gln Gln Gly Ile Val Lys Asn Ile
1760 1765 1770
Lys Lys Tyr Lys Gly Phe Leu Asn Val Ser Ile Ala Thr Gly Tyr
1775 1780 1785
Ser Ala Pro Met Asp Phe Asn Ser Ala Ser Ala Leu Tyr Tyr Trp
1790 1795 1800
Glu Leu Phe Tyr Tyr Thr Pro Met Met Cys Phe Gln Arg Leu Leu
1805 1810 1815
Gln Glu Lys Gln Phe Asp Glu Ala Thr Gln Trp Ile Asn Tyr Val
1820 1825 1830
Tyr Asn Pro Ala Gly Tyr Ile Val Asn Gly Glu Ile Ala Pro Trp
1835 1840 1845
Ile Trp Asn Cys Arg Pro Leu Glu Glu Thr Thr Ser Trp Asn Ala
1850 1855 1860
Asn Pro Leu Asp Ala Ile Asp Pro Asp Ala Val Ala Gln Asn Asp
1865 1870 1875
Pro Met His Tyr Lys Ile Ala Thr Phe Met Arg Leu Leu Asp Gln
1880 1885 1890
Leu Ile Leu Arg Gly Asp Met Ala Tyr Arg Glu Leu Thr Arg Asp
1895 1900 1905
Ala Leu Asn Glu Ala Lys Met Trp Tyr Val Arg Thr Leu Glu Leu
1910 1915 1920
Leu Gly Asp Glu Pro Glu Asp Tyr Gly Ser Gln Gln Trp Ala Ala
1925 1930 1935
Pro Ser Leu Ser Gly Ala Ala Ser Gln Thr Val Gln Ala Ala Tyr
1940 1945 1950
Gln Gln Asp Leu Thr Met Leu Gly Arg Gly Gly Val Ser Lys Asn
1955 1960 1965
Leu Arg Thr Ala Asn Ser Leu Val Gly Leu Phe Leu Pro Glu Tyr
1970 1975 1980
Asn Pro Ala Leu Thr Asp Tyr Trp Gln Thr Leu Arg Leu Arg Leu
1985 1990 1995
Phe Asn Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Ser
2000 2005 2010
Leu Ala Ile Tyr Ala Glu Pro Thr Asp Pro Lys Ala Leu Leu Thr
2015 2020 2025
Ser Met Val Gln Ala Ser Gln Gly Gly Ser Ala Val Leu Pro Gly
2030 2035 2040
Thr Leu Ser Leu Tyr Arg Phe Pro Val Met Leu Glu Arg Thr Arg
2045 2050 2055
Asn Leu Val Ala Gln Leu Thr Gln Phe Gly Thr Ser Leu Leu Ser
2060 2065 2070
Met Ala Glu His Asp Asp Ala Asp Glu Leu Thr Thr Leu Leu Leu
2075 2080 2085
Gln Gln Gly Met Glu Leu Ala Thr Gln Ser Ile Arg Ile Gln Gln
2090 2095 2100
Arg Thr Val Asp Glu Val Asp Ala Asp Ile Ala Val Leu Ala Glu
2105 2110 2115
Ser Arg Arg Ser Ala Gln Asn Arg Leu Glu Lys Tyr Gln Gln Leu
2120 2125 2130
Tyr Asp Glu Asp Ile Asn His Gly Glu Gln Arg Ala Met Ser Leu
2135 2140 2145
Leu Asp Ala Ala Ala Gly Gln Ser Leu Ala Gly Gln Val Leu Ser
2150 2155 2160
Ile Ala Glu Gly Val Ala Asp Leu Val Pro Asn Val Phe Gly Leu
2165 2170 2175
Ala Cys Gly Gly Ser Arg Trp Gly Ala Ala Leu Arg Ala Ser Ala
2180 2185 2190
Ser Val Met Ser Leu Ser Ala Thr Ala Ser Gln Tyr Ser Ala Asp
2195 2200 2205
Lys Ile Ser Arg Ser Glu Ala Tyr Arg Arg Arg Arg Gln Glu Trp
2210 2215 2220
Glu Ile Gln Arg Asp Asn Ala Asp Gly Glu Val Lys Gln Met Asp
2225 2230 2235
Ala Gln Leu Glu Ser Leu Lys Ile Arg Arg Glu Ala Ala Gln Met
2240 2245 2250
Gln Val Glu Tyr Gln Glu Thr Gln Gln Ala His Thr Gln Ala Gln
2255 2260 2265
Leu Glu Leu Leu Gln Arg Lys Phe Thr Asn Lys Ala Leu Tyr Ser
2270 2275 2280
Trp Met Arg Gly Lys Leu Ser Ala Ile Tyr Tyr Gln Phe Phe Asp
2285 2290 2295
Leu Thr Gln Ser Phe Cys Leu Met Ala Gln Glu Ala Leu Arg Arg
2300 2305 2310
Glu Leu Thr Asp Asn Gly Val Thr Phe Ile Arg Gly Gly Ala Trp
2315 2320 2325
Asn Gly Thr Thr Ala Gly Leu Met Ala Gly Glu Thr Leu Leu Leu
2330 2335 2340
Asn Leu Ala Glu Met Glu Lys Val Trp Leu Glu Arg Asp Glu Arg
2345 2350 2355
Ala Leu Glu Val Thr Arg Thr Val Ser Leu Ala Gln Phe Tyr Gln
2360 2365 2370
Ala Leu Ser Ser Asp Asn Phe Asn Leu Thr Glu Lys Leu Thr Gln
2375 2380 2385
Phe Leu Arg Glu Gly Lys Gly Asn Val Gly Ala Ser Gly Asn Glu
2390 2395 2400
Leu Lys Leu Ser Asn Arg Gln Ile Glu Ala Ser Val Arg Leu Ser
2405 2410 2415
Asp Leu Lys Ile Phe Ser Asp Tyr Pro Glu Ser Leu Gly Asn Thr
2420 2425 2430
Arg Gln Leu Lys Gln Val Ser Val Thr Leu Pro Ala Leu Val Gly
2435 2440 2445
Pro Tyr Glu Asp Ile Arg Ala Val Leu Asn Tyr Gly Gly Ser Ile
2450 2455 2460
Val Met Pro Arg Gly Cys Ser Ala Ile Ala Leu Ser His Gly Val
2465 2470 2475
Asn Asp Ser Gly Gln Phe Met Leu Asp Phe Asn Asp Ser Arg Tyr
2480 2485 2490
Leu Pro Phe Glu Gly Ile Ser Val Asn Asp Ser Gly Ser Leu Thr
2495 2500 2505
Leu Ser Phe Pro Asp Ala Thr Asp Arg Gln Lys Ala Leu Leu Glu
2510 2515 2520
Ser Leu Ser Asp Ile Ile Leu His Ile Arg Tyr Thr Ile Arg Ser
2525 2530 2535
<210>21
<211>7551
<212>DNA
<213>Photorhabdus luminescens
<220>
<221>exon
<222>(1)..(7551)
<400>21
atg aac gag tct gta aaa gag ata cct gat gta tta aaa agc cag tgt 48
Met Asn Glu Ser Val Lys Glu Ile Pro Asp Val Leu Lys Ser Gln Cys
1 5 10 15
ggt ttt aat tgt ctg aca gat att agc cac agc tct ttt aat gaa ttt 96
Gly Phe Asn Cys Leu Thr Asp Ile Ser His Ser Ser Phe Asn Glu Phe
20 25 30
cgc cag caa gta tct gag cac ctc tcc tgg tcc gaa aca cac gac tta 144
Arg Gln Gln Val Ser Glu His Leu Ser Trp Ser Glu Thr His Asp Leu
35 40 45
tat cat gat gca caa cag gca caa aag gat aat cgc ctg tat gaa gcg 192
Tyr His Asp Ala Gln Gln Ala Gln Lys Asp Asn Arg Leu Tyr Glu Ala
50 55 60
cgt att ctc aaa cgc gcc aat ccc caa tta caa aat gcg gtg cat ctt 240
Arg Ile Leu Lys Arg Ala Asn Pro Gln Leu Gln Asn Ala Val His Leu
65 70 75 80
gcc att ctc gct ccc aat gct gaa ctg ata ggc tat aac aat caa ttt 288
Ala Ile Leu Ala Pro Asn Ala Glu Leu Ile Gly Tyr Asn Asn Gln Phe
85 90 95
agc ggt aga gcc agt caa tat gtt gcg ccg ggt acc gtt tct tcc atg 336
Ser Gly Arg Ala Ser Gln Tyr Val Ala Pro Gly Thr Val Ser Ser Met
100 105 110
ttc tcc ccc gcc gct tat ttg act gaa ctt tat cgt gaa gca cgc aat 384
Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Arg Asn
115 120 125
tta cac gca agt gac tcc gtt tat tat ctg gat acc cgc cgc cca gat 432
Leu His Ala Ser Asp Ser Val Tyr Tyr Leu Asp Thr Arg Arg Pro Asp
130 135 140
ctc aaa tca atg gcg ctc agt cag caa aat atg gat ata gaa tta tcc 480
Leu Lys Ser Met Ala Leu Ser Gln Gln Asn Met Asp Ile Glu Leu Ser
145 150 155 160
aca ctc tct ttg tcc aat gag ctg tta ttg gaa agc att aaa act gaa 528
Thr Leu Ser Leu Ser Asn Glu Leu Leu Leu Glu Ser Ile Lys Thr Glu
165 170 175
tct aaa ctg gaa aac tat act aaa gtg atg gaa atg ctc tcc act ttc 576
Ser Lys Leu Glu Asn Tyr Thr Lys Val Met Glu Met Leu Ser Thr Phe
180 185 190
cgt cct tcc ggc gca acg cct tat cat gat gct tat gaa aat gtg cgt 624
Arg Pro Ser Gly Ala Thr Pro Tyr His Asp Ala Tyr Glu Asn Val Arg
195 200 205
gaa gtt atc cag cta caa gat cct gga ctt gag caa ctc aat gca tca 672
Glu Val Ile Gln Leu Gln Asp Pro Gly Leu Glu Gln Leu Asn Ala Ser
210 215 220
ccg gca att gcc ggg ttg atg cat caa gcc tcc cta ttg ggt att aac 720
Pro Ala Ile Ala Gly Leu Met His Gln Ala Ser Leu Leu Gly Ile Asn
225 230 235 240
gct tca atc tcg cct gag cta ttt aat att ctg acg gag gag att acc 768
Ala Ser Ile Ser Pro Glu Leu Phe Asn Ile Leu Thr Glu Glu Ile Thr
245 250 255
gaa ggt aat gct gag gaa ctt tat aag aaa aat ttt ggt aat atc gaa 816
Glu Gly Asn Ala Glu Glu Leu Tyr Lys Lys Asn Phe Gly Asn Ile Glu
260 265 270
ccg gcc tca ttg gct atg ccg gaa tac ctt aaa cgt tat tat aat tta 864
Pro Ala Ser Leu Ala Met Pro Glu Tyr Leu Lys Arg Tyr Tyr Asn Leu
275 280 285
agc gat gaa gaa ctt agt cag ttt att ggt aaa gcc agc aat ttt ggt 912
Ser Asp Glu Glu Leu Ser Gln Phe Ile Gly Lys Ala Ser Asn Phe Gly
290 295 300
caa cag gaa tat agt aat aac caa ctt att act ccg gta gtc aac agc 960
Gln Gln Glu Tyr Ser Asn Asn Gln Leu Ile Thr Pro Val Val Asn Ser
305 310 315 320
agt gat ggc acg gtt aag gta tat cgg atc acc cgc gaa tat aca acc 1008
Ser Asp Gly Thr Val Lys Val Tyr Arg Ile Thr Arg Glu Tyr Thr Thr
325 330 335
aat gct tat caa atg gat gtg gag cta ttt ccc ttc ggt ggt gag aat 1056
Asn Ala Tyr Gln Met Asp Val Glu Leu Phe Pro Phe Gly Gly Glu Asn
340 345 350
tat cgg tta gat tat aaa ttc aaa aat ttt tat aat gcc tct tat tta 1104
Tyr Arg Leu Asp Tyr Lys Phe Lys Asn Phe Tyr Asn Ala Ser Tyr Leu
355 360 365
tcc atc aag tta aat gat aaa aga gaa ctt gtt cga act gaa ggc gct 1152
Ser Ile Lys Leu Asn Asp Lys Arg Glu Leu Val Arg Thr Glu Gly Ala
370 375 380
cct caa gtc aat ata gaa tac tcc gca aat atc aca tta aat acc gct 1200
Pro Gln Val Asn Ile Glu Tyr Ser Ala Asn Ile Thr Leu Asn Thr Ala
385 390 395 400
gat atc agt caa cct ttt gaa att ggc ctg aca cga gta ctt cct tcc 1248
Asp Ile Ser Gln Pro Phe Glu Ile Gly Leu Thr Arg Val Leu Pro Ser
405 410 415
ggt tct tgg gca tat gcc gcc gca aaa ttt acc gtt gaa gag tat aac 1296
Gly Ser Trp Ala Tyr Ala Ala Ala Lys Phe Thr Val Glu Glu Tyr Asn
420 425 430
caa tac tct ttt ctg cta aaa ctt aac aag gct att cgt cta tca cgt 1344
Gln Tyr Ser Phe Leu Leu Lys Leu Asn Lys Ala Ile Arg Leu Ser Arg
435 440 445
gcg aca gaa ttg tca ccc acg att ctg gaa ggc att gtg cgc agt gtt 1392
Ala Thr Glu Leu Ser Pro Thr Ile Leu Glu Gly Ile Val Arg Ser Val
450 455 460
aat cta caa ctg gat atc aac aca gac gta tta ggt aaa gtt ttt ctg 1440
Asn Leu Gln Leu Asp Ile Asn Thr Asp Val Leu Gly Lys Val Phe Leu
465 470 475 480
act aaa tat tat atg cag cgt tat gct att cat gct gaa act gcc ctg 1488
Thr Lys Tyr Tyr Met Gln Arg Tyr Ala Ile His Ala Glu Thr Ala Leu
485 490 495
ata cta tgc aac gcg cct att tca caa cgt tca tat gat aat caa cct 1536
Ile Leu Cys Asn Ala Pro Ile Ser Gln Arg Ser Tyr Asp Asn Gln Pro
500 505 510
agc caa ttt gat cgc ctg ttt aat acg cca tta ctg aac gga caa tat 1584
Ser Gln Phe Asp Arg Leu Phe Asn Thr Pro Leu Leu Asn Gly Gln Tyr
515 520 525
ttt tct acc ggc gat gag gag att gat tta aat tca ggt agc acc ggc 1632
Phe Ser Thr Gly Asp Glu Glu Ile Asp Leu Asn Ser Gly Ser Thr Gly
530 535 540
gat tgg cga aaa acc ata ctt aag cgt gca ttt aat att gat gat gtc 1680
Asp Trp Arg Lys Thr Ile Leu Lys Arg Ala Phe Asn Ile Asp Asp Val
545 550 555 560
tcg ctc ttc cgc ctg ctt aaa att acc gac cat gat aat aaa gat gga 1728
Ser Leu Phe Arg Leu Leu Lys Ile Thr Asp His Asp Asn Lys Asp Gly
565 570 575
aaa att aaa aat aac cta aag aat ctt tcc aat tta tat att gga aaa 1776
Lys Ile Lys Asn Asn Leu Lys Asn Leu Ser Asn Leu Tyr Ile Gly Lys
580 585 590
tta ctg gca gat att cat caa tta acc att gat gaa ctg gat tta tta 1824
Leu Leu Ala Asp Ile His Gln Leu Thr Ile Asp Glu Leu Asp Leu Leu
595 600 605
ctg att gcc gta ggt gaa gga aaa act aat tta tcc gct atc agt gat 1872
Leu Ile Ala Val Gly Glu Gly Lys Thr Asn Leu Ser Ala Ile Ser Asp
610 615 620
aag caa ttg gct acc ctg atc aga aaa ctc aat act att acc agc tgg 1920
Lys Gln Leu Ala Thr Leu Ile Arg Lys Leu Asn Thr Ile Thr Ser Trp
625 630 635 640
cta cat aca cag aag tgg agt gta ttc cag cta ttt atc atg acc tcc 1968
Leu His Thr Gln Lys Trp Ser Val Phe Gln Leu Phe Ile Met Thr Ser
645 650 655
acc agc tat aac aaa acg cta acg cct gaa att aag aat ttg ctg gat 2016
Thr Ser Tyr Asn Lys Thr Leu Thr Pro Glu Ile Lys Asn Leu Leu Asp
660 665 670
acc gtc tac cac ggt tta caa ggt ttt gat aaa gac aaa gca gat ttg 2064
Thr Val Tyr His Gly Leu Gln Gly Phe Asp Lys Asp Lys Ala Asp Leu
675 680 685
cta cat gtc atg gcg ccc tat att gcg gcc acc ttg caa tta tca tcg 2112
Leu His Val Met Ala Pro Tyr Ile Ala Ala Thr Leu Gln Leu Ser Ser
690 695 700
gaa aat gtc gcc cac tcg gta ctc ctt tgg gca gat aag tta cag ccc 2160
Glu Asn Val Ala His Ser Val Leu Leu Trp Ala Asp Lys Leu Gln Pro
705 710 715 720
ggc gac ggc gca atg aca gca gaa aaa ttc tgg gac tgg ttg aat act 2208
Gly Asp Gly Ala Met Thr Ala Glu Lys Phe Trp Asp Trp Leu Asn Thr
725 730 735
aag tat acg ccg ggt tca tcg gaa gcc gta gaa acg cag gaa cat atc 2256
Lys Tyr Thr Pro Gly Ser Ser Glu Ala Val Glu Thr Gln Glu His Ile
740 745 750
gtt cag tat tgt cag gct ctg gca caa ttg gaa atg gtt tac cat tcc 2304
Val Gln Tyr Cys Gln Ala Leu Ala Gln Leu Glu Met Val Tyr His Ser
755 760 765
acc ggc atc aac gaa aac gcc ttc cgt cta ttt gtg aca aaa cca gag 2352
Thr Gly Ile Asn Glu Asn Ala Phe Arg Leu Phe Val Thr Lys Pro Glu
770 775 780
atg ttt ggc gct gca act gga gca gcg ccc gcg cat gat gcc ctt tca 2400
Met Phe Gly Ala Ala Thr Gly Ala Ala Pro Ala His Asp Ala Leu Ser
785 790 795 800
ctg att atg ctg aca cgt ttt gcg gat tgg gtg aac gca cta ggc gaa 2448
Leu Ile Met Leu Thr Arg Phe Ala Asp Trp Val Asn Ala Leu Gly Glu
805 810 815
aaa gcg tcc tcg gtg cta gcg gca ttt gaa gct aac tcg tta acg gca 2496
Lys Ala Ser Ser Val Leu Ala Ala Phe Glu Ala Asn Ser Leu Thr Ala
820 825 830
gaa caa ctg gct gat gcc atg aat ctt gat gct aat ttg ctg ttg caa 2544
Glu Gln Leu Ala Asp Ala Met Asn Leu Asp Ala Asn Leu Leu Leu Gln
835 840 845
gcc agt att caa gca caa aat cat caa cat ctt ccc cca gta act cca 2592
Ala Ser Ile Gln Ala Gln Asn His Gln His Leu Pro Pro Val Thr Pro
850 855 860
gaa aat gcg ttc tcc tgt tgg aca tct atc aat act atc ctg caa tgg 2640
Glu Asn Ala Phe Ser Cys Trp Thr Ser Ile Asn Thr Ile Leu Gln Trp
865 870 875 880
gtt aat gtc gca caa caa ttg aat gtc gcc cca cag ggc gtt tcc gct 2688
Val Asn Val Ala Gln Gln Leu Asn Val Ala Pro Gln Gly Val Ser Ala
885 890 895
ttg gtc ggg ctg gat tat att caa tca atg aaa gag aca ccg acc tat 2736
Leu Val Gly Leu Asp Tyr Ile Gln Ser Met Lys Glu Thr Pro Thr Tyr
900 905 910
gcc cag tgg gaa aac gcg gca ggc gta tta acc gcc ggg ttg aat tca 2784
Ala Gln Trp Glu Asn Ala Ala Gly Val Leu Thr Ala Gly Leu Asn Ser
915 920 925
caa cag gct aat aca tta cac gct ttt ctg gat gaa tct cgc agt gcc 2832
Gln Gln Ala Asn Thr Leu His Ala Phe Leu Asp Glu Ser Arg Ser Ala
930 935 940
gca tta agc acc tac tat atc cgt caa gtc gcc aag gca gcg gcg gct 2880
Ala Leu Ser Thr Tyr Tyr Ile Arg Gln Val Ala Lys Ala Ala Ala Ala
945 950 955 960
att aaa agc cgt gat gac ttg tat caa tac tta ctg att gat aat cag 2928
Ile Lys Ser Arg Asp Asp Leu Tyr Gln Tyr Leu Leu Ile Asp Asn Gln
965 970 975
gtt tct gcg gca ata aaa acc acc cgg atc gcc gaa gcc att gcc agt 2976
Val Ser Ala Ala Ile Lys Thr Thr Arg Ile Ala Glu Ala Ile Ala Ser
980 985 990
att caa ctg tac gtc aac cgg gca ttg gaa aat gtg gaa gaa aat gcc 3024
Ile Gln Leu Tyr Val Asn Arg Ala Leu Glu Asn Val Glu Glu Asn Ala
995 1000 1005
aat tcg ggg gtt atc agc cgc caa ttc ttt atc gac tgg gac aaa 3069
Asn Ser Gly Val Ile Ser Arg Gln Phe Phe Ile Asp Trp Asp Lys
1010 1015 1020
tac aat aaa cgc tac agc act tgg gcg ggt gtt tct caa tta gtt 3114
Tyr Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Gln Leu Val
1025 1030 1035
tac tac ccg gaa aac tat att gat ccg acc atg cgt atc gga caa 3159
Tyr Tyr Pro Glu Asn Tyr Ile Asp Pro Thr Met Arg Ile Gly Gln
1040 1045 1050
acc aaa atg atg gac gca tta ctg caa tcc gtc agc caa agc caa 3204
Thr Lys Met Met Asp Ala Leu Leu Gln Ser Val Ser Gln Ser Gln
1055 1060 1065
tta aac gcc gat acc gtc gaa gat gcc ttt atg tct tat ctg aca 3249
Leu Asn Ala Asp Thr Val Glu Asp Ala Phe Met Ser Tyr Leu Thr
1070 1075 1080
tcg ttt gaa caa gtg gct aat ctt aaa gtt att agc gca tat cac 3294
Ser Phe Glu Gln Val Ala Asn Leu Lys Val Ile Ser Ala Tyr His
1085 1090 1095
gat aat att aat aac gat caa ggg ctg acc tat ttt atc gga ctc 3339
Asp Asn Ile Asn Asn Asp Gln Gly Leu Thr Tyr Phe Ile Gly Leu
1100 1105 1110
agt gaa act gat gcc ggt gaa tat tat tgg cgc agt gtc gat cac 3384
Ser Glu Thr Asp Ala Gly Glu Tyr Tyr Trp Arg Ser Val Asp His
1115 1120 1125
agt aaa ttc aac gac ggt aaa ttc gcg gct aat gcc tgg agt gaa 3429
Ser Lys Phe Asn Asp Gly Lys Phe Ala Ala Asn Ala Trp Ser Glu
1130 1135 1140
tgg cat aaa att gat tgt cca att aac cct tat aaa agc act atc 3474
Trp His Lys Ile Asp Cys Pro Ile Asn Pro Tyr Lys Ser Thr Ile
1145 1150 1155
cgt cca gtg ata tat aaa tcc cgc ctg tat ctg ctc tgg ttg gaa 3519
Arg Pro Val Ile Tyr Lys Ser Arg Leu Tyr Leu Leu Trp Leu Glu
1160 1165 1170
caa aag gag atc acc aaa cag aca gga aat agt aaa gat ggc tat 3564
Gln Lys Glu Ile Thr Lys Gln Thr Gly Asn Ser Lys Asp Gly Tyr
1175 1180 1185
caa act gaa acg gat tat cgt tat gaa cta aaa ttg gcg cat atc 3609
Gln Thr Glu Thr Asp Tyr Arg Tyr Glu Leu Lys Leu Ala His Ile
1190 1195 1200
cgc tat gat ggc act tgg aat acg cca atc acc ttt gat gtc aat 3654
Arg Tyr Asp Gly Thr Trp Asn Thr Pro Ile Thr Phe Asp Val Asn
1205 1210 1215
aaa aaa ata tcc gag cta aaa ctg gaa aaa aat aga gcg ccc gga 3699
Lys Lys Ile Ser Glu Leu Lys Leu Glu Lys Asn Arg Ala Pro Gly
1220 1225 1230
ctc tat tgt gcc ggt tat caa ggt gaa gat acg ttg ctg gtg atg 3744
Leu Tyr Cys Ala Gly Tyr Gln Gly Glu Asp Thr Leu Leu Val Met
1235 1240 1245
ttt tat aac caa caa gac aca cta gat agt tat aaa aac gct tca 3789
Phe Tyr Asn Gln Gln Asp Thr Leu Asp Ser Tyr Lys Asn Ala Ser
1250 1255 1260
atg caa gga cta tat atc ttt gct gat atg gca tcc aaa gat atg 3834
Met Gln Gly Leu Tyr Ile Phe Ala Asp Met Ala Ser Lys Asp Met
1265 1270 1275
acc cca gaa cag agc aat gtt tat cgg gat aat agc tat caa caa 3879
Thr Pro Glu Gln Ser Asn Val Tyr Arg Asp Asn Ser Tyr Gln Gln
1280 1285 1290
ttt gat acc aat aat gtc aga aga gtg aat aac cgc tat gca gag 3924
Phe Asp Thr Asn Asn Val Arg Arg Val Asn Asn Arg Tyr Ala Glu
1295 1300 1305
gat tat gag att cct tcc tcg gta agt agc cgt aaa gac tat ggt 3969
Asp Tyr Glu Ile Pro Ser Ser Val Ser Ser Arg Lys Asp Tyr Gly
1310 1315 1320
tgg gga gat tat tac ctc agc atg gta tat aac gga gat att cca 4014
Trp Gly Asp Tyr Tyr Leu Ser Met Val Tyr Asn Gly Asp Ile Pro
1325 1330 1335
act atc aat tac aaa gcc gca tca agt gat tta aaa atc tat atc 4059
Thr Ile Asn Tyr Lys Ala Ala Ser Ser Asp Leu Lys Ile Tyr Ile
1340 1345 1350
tca cca aaa tta aga att att cat aat gga tat gaa gga cag aag 4104
Ser Pro Lys Leu Arg Ile Ile His Asn Gly Tyr Glu Gly Gln Lys
1355 1360 1365
cgc aat caa tgc aat ctg atg aat aaa tat ggc aaa cta ggt gat 4149
Arg Asn Gln Cys Asn Leu Met Asn Lys Tyr Gly Lys Leu Gly Asp
1370 1375 1380
aaa ttt att gtt tat act agc ttg ggg gtc aat cca aat aac tcg 4194
Lys Phe Ile Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn Ser
1385 1390 1395
tca aat aag ctc atg ttt tac ccc gtc tat caa tat agc gga aac 4239
Ser Asn Lys Leu Met Phe Tyr Pro Val Tyr Gln Tyr Ser Gly Asn
1400 1405 1410
acc agt gga ctc aat caa ggg aga cta cta ttc cac cgt gac acc 4284
Thr Ser Gly Leu Asn Gln Gly Arg Leu Leu Phe His Arg Asp Thr
1415 1420 1425
act tat cca tct aaa gta gaa gct tgg att cct gga gca aaa cgt 4329
Thr Tyr Pro Ser Lys Val Glu Ala Trp Ile Pro Gly Ala Lys Arg
1430 1435 1440
tct cta acc aac caa aat gcc gcc att ggt gat gat tat gct aca 4374
Ser Leu Thr Asn Gln Asn Ala Ala Ile Gly Asp Asp Tyr Ala Thr
1445 1450 1455
gac tct ctg aat aaa ccg gat gat ctt aag caa tat atc ttt atg 4419
Asp Ser Leu Asn Lys Pro Asp Asp Leu Lys Gln Tyr Ile Phe Met
1460 1465 1470
act gac agt aaa ggg act gct act gat gtc tca ggc cca gta gag 4464
Thr Asp Ser Lys Gly Thr Ala Thr Asp Val Ser Gly Pro Val Glu
1475 1480 1485
att aat act gca att tct cca gca aaa gtt cag ata ata gtc aaa 4509
Ile Asn Thr Ala Ile Ser Pro Ala Lys Val Gln Ile Ile Val Lys
1490 1495 1500
gcg ggt ggc aag gag caa act ttt acc gca gat aaa gat gtc tcc 4554
Ala Gly Gly Lys Glu Gln Thr Phe Thr Ala Asp Lys Asp Val Ser
1505 1510 1515
att cag cca tca cct agc ttt gat gaa atg aat tat caa ttt aat 4599
Ile Gln Pro Ser Pro Ser Phe Asp Glu Met Asn Tyr Gln Phe Asn
1520 1525 1530
gcc ctt gaa ata gac ggt tct ggt ctg aat ttt att aac aac tca 4644
Ala Leu Glu Ile Asp Gly Ser Gly Leu Asn Phe Ile Asn Asn Ser
1535 1540 1545
gcc agt att gat gtt act ttt acc gca ttt gcg gag gat ggc cgc 4689
Ala Ser Ile Asp Val Thr Phe Thr Ala Phe Ala Glu Asp Gly Arg
1550 1555 1560
aaa ctg ggt tat gaa agt ttc agt att cct gtt acc ctc aag gta 4734
Lys Leu Gly Tyr Glu Ser Phe Ser Ile Pro Val Thr Leu Lys Val
1565 1570 1575
agt acc gat aat gcc ctg acc ctg cac cat aat gaa aat ggt gcg 4779
Ser Thr Asp Asn Ala Leu Thr Leu His His Asn Glu Asn Gly Ala
1580 1585 1590
caa tat atg caa tgg caa tcc tat cgt acc cgc ctg aat act cta 4824
Gln Tyr Met Gln Trp Gln Ser Tyr Arg Thr Arg Leu Asn Thr Leu
1595 1600 1605
ttt gcc cgc cag ttg gtt gca cgc gcc acc acc gga atc gat aca 4869
Phe Ala Arg Gln Leu Val Ala Arg Ala Thr Thr Gly Ile Asp Thr
1610 1615 1620
att ctg agt atg gaa act cag aat att cag gaa ccg cag tta ggc 4914
Ile Leu Ser Met Glu Thr Gln Asn Ile Gln Glu Pro Gln Leu Gly
1625 1630 1635
aaa ggt ttc tat gct acg ttc gtg ata cct ccc tat aac cta tca 4959
Lys Gly Phe Tyr Ala Thr Phe Val Ile Pro Pro Tyr Asn Leu Ser
1640 1645 1650
act cat ggt gat gaa cgt tgg ttt aag ctt tat atc aaa cat gtt 5004
Thr His Gly Asp Glu Arg Trp Phe Lys Leu Tyr Ile Lys His Val
1655 1660 1665
gtt gat aat aat tca cat att atc tat tca ggc cag cta aca gat 5049
Val Asp Asn Asn Ser His Ile Ile Tyr Ser Gly Gln Leu Thr Asp
1670 1675 1680
aca aat ata aac atc aca tta ttt att cct ctt gat gat gtc cca 5094
Thr Asn Ile Asn Ile Thr Leu Phe Ile Pro Leu Asp Asp Val Pro
1685 1690 1695
ttg aat caa gat tat cac gcc aag gtt tat atg acc ttc aag aaa 5139
Leu Asn Gln Asp Tyr His Ala Lys Val Tyr Met Thr Phe Lys Lys
1700 1705 1710
tca cca tca gat ggt acc tgg tgg ggc cct cac ttt gtt aga gat 5184
Ser Pro Ser Asp Gly Thr Trp Trp Gly Pro His Phe Val Arg Asp
1715 1720 1725
gat aaa gga ata gta aca ata aac cct aaa tcc att ttg acc cat 5229
Asp Lys Gly Ile Val Thr Ile Asn Pro Lys Ser Ile Leu Thr His
1730 1735 1740
ttt gag agc gtc aat gtc ctg aat aat att agt agc gaa cca atg 5274
Phe Glu Ser Val Asn Val Leu Asn Asn Ile Ser Ser Glu Pro Met
1745 1750 1755
gat ttc agc ggc gct aac agc ctc tat ttc tgg gaa ctg ttc tac 5319
Asp Phe Ser Gly Ala Asn Ser Leu Tyr Phe Trp Glu Leu Phe Tyr
1760 1765 1770
tat acc ccg atg ctg gtt gct caa cgt ttg ctg cat gaa cag aac 5364
Tyr Thr Pro Met Leu Val Ala Gln Arg Leu Leu His Glu Gln Asn
1775 1780 1785
ttc gat gaa gcc aac cgt tgg ctg aaa tat gtc tgg agt cca tcc 5409
Phe Asp Glu Ala Asn Arg Trp Leu Lys Tyr Val Trp Ser Pro Ser
1790 1795 1800
ggt tat att gtc cac ggc cag att cag aac tac cag tgg aac gtc 5454
Gly Tyr Ile Val His Gly Gln Ile Gln Asn Tyr Gln Trp Asn Val
1805 1810 1815
cgc ccg tta ctg gaa gac acc agt tgg aac agt gat cct ttg gat 5499
Arg Pro Leu Leu Glu Asp Thr Ser Trp Asn Ser Asp Pro Leu Asp
1820 1825 1830
tcc gtc gat cct gac gcg gta gca cag cac gat cca atg cac tac 5544
Ser Val Asp Pro Asp Ala Val Ala Gln His Asp Pro Met His Tyr
1835 1840 1845
aaa gtt tca act ttt atg cgt acc ttg gat cta ttg ata gca cgc 5589
Lys Val Ser Thr Phe Met Arg Thr Leu Asp Leu Leu Ile Ala Arg
1850 1855 1860
ggc gac cat gct tat cgc caa ctg gaa cga gat aca ctc aac gaa 5634
Gly Asp His Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Asn Glu
1865 1870 1875
gcg aag atg tgg tat atg caa gcg ctg cat cta tta ggt gac aaa 5679
Ala Lys Met Trp Tyr Met Gln Ala Leu His Leu Leu Gly Asp Lys
1880 1885 1890
cct tat cta ccg ctg agt acg aca tgg agt gat cca cga cta gac 5724
Pro Tyr Leu Pro Leu Ser Thr Thr Trp Ser Asp Pro Arg Leu Asp
1895 1900 1905
aga gcc gcg gat atc act acc caa aat gct cac gac agc gca ata 5769
Arg Ala Ala Asp Ile Thr Thr Gln Asn Ala His Asp Ser Ala Ile
1910 1915 1920
gtc gct ctg cgg cag aat ata cct aca ccg gca cct tta tca ttg 5814
Val Ala Leu Arg Gln Asn Ile Pro Thr Pro Ala Pro Leu Ser Leu
1925 1930 1935
cgc agc gct aat acc ctg act gat ctc ttc ctg ccg caa atc aat 5859
Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln Ile Asn
1940 1945 1950
gaa gtg atg atg aat tac tgg cag aca tta gct cag aga gta tac 5904
Glu Val Met Met Asn Tyr Trp Gln Thr Leu Ala Gln Arg Val Tyr
1955 1960 1965
aat ctg cgt cat aac ctc tct atc gac ggc cag ccg tta tat ctg 5949
Asn Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Tyr Leu
1970 1975 1980
cca atc tat gcc aca ccg gcc gat ccg aaa gcg tta ctc agc gcc 5994
Pro Ile Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala
1985 1990 1995
gcc gtt gcc act tct caa ggt gga ggc aag cta ccg gaa tca ttt 6039
Ala Val Ala Thr Ser Gln Gly Gly Gly Lys Leu Pro Glu Ser Phe
2000 2005 2010
atg tcc ctg tgg cgt ttc ccg cac atg ctg gaa aat gcg cgc ggc 6084
Met Ser Leu Trp Arg Phe Pro His Met Leu Glu Asn Ala Arg Gly
2015 2020 2025
atg gtt agc cag ctc acc cag ttc ggc tcc acg tta caa aat att 6129
Met Val Ser Gln Leu Thr Gln Phe Gly Ser Thr Leu Gln Asn Ile
2030 2035 2040
atc gaa cgt cag gac gcg gaa gcg ctc aat gcg tta tta caa aat 6174
Ile Glu Arg Gln Asp Ala Glu Ala Leu Asn Ala Leu Leu Gln Asn
2045 2050 2055
cag gcc gcc gag ctg ata ttg act aac ctg agc att cag gac aaa 6219
Gln Ala Ala Glu Leu Ile Leu Thr Asn Leu Ser Ile Gln Asp Lys
2060 2065 2070
acc att gaa gaa ttg gat gcc gag aaa acg gtg ttg gaa aaa tcc 6264
Thr Ile Glu Glu Leu Asp Ala Glu Lys Thr Val Leu Glu Lys Ser
2075 2080 2085
aaa gcg gga gca caa tcg cgc ttt gat agc tac ggc aaa ctg tac 6309
Lys Ala Gly Ala Gln Ser Arg Phe Asp Ser Tyr Gly Lys Leu Tyr
2090 2095 2100
gat gag aat atc aac gcc ggt gaa aac caa gcc atg acg cta cga 6354
Asp Glu Asn Ile Asn Ala Gly Glu Asn Gln Ala Met Thr Leu Arg
2105 2110 2115
gcg tcc gcc gcc ggg ctt acc acg gca gtt cag gca tcc cgt ctg 6399
Ala Ser Ala Ala Gly Leu Thr Thr Ala Val Gln Ala Ser Arg Leu
2120 2125 2130
gcc ggt gcg gcg gct gat ctg gtg cct aac atc ttc ggc ttt gcc 6444
Ala Gly Ala Ala Ala Asp Leu Val Pro Asn Ile Phe Gly Phe Ala
2135 2140 2145
ggt ggc ggc agc cgt tgg ggg gct atc gct gag gcg aca ggt tat 6489
Gly Gly Gly Ser Arg Trp Gly Ala Ile Ala Glu Ala Thr Gly Tyr
2150 2155 2160
gtg atg gaa ttc tcc gcg aat gtt atg aac acc gaa gcg gat aaa 6534
Val Met Glu Phe Ser Ala Asn Val Met Asn Thr Glu Ala Asp Lys
2165 2170 2175
att agc caa tct gaa acc tac cgt cgt cgc cgt cag gag tgg gag 6579
Ile Ser Gln Ser Glu Thr Tyr Arg Arg Arg Arg Gln Glu Trp Glu
2180 2185 2190
atc cag cgg aat aat gcc gaa gcg gaa ttg aag caa atc gat gct 6624
Ile Gln Arg Asn Asn Ala Glu Ala Glu Leu Lys Gln Ile Asp Ala
2195 2200 2205
cag ctc aaa tca ctc gct gta cgc cgc gaa gcc gcc gta ttg cag 6669
Gln Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gln
2210 2215 2220
aaa acc agt ctg aaa acc caa caa gaa cag acc caa tct caa ttg 6714
Lys Thr Ser Leu Lys Thr Gln Gln Glu Gln Thr Gln Ser Gln Leu
2225 2230 2235
gcc ttc ctg caa cgt aag ttc agc aat cag gcg tta tac aac tgg 6759
Ala Phe Leu Gln Arg Lys Phe Ser Asn Gln Ala Leu Tyr Asn Trp
2240 2245 2250
ctg cgt ggt cga ctg gcg gcg att tac ttc cag ttc tac gat ttg 6804
Leu Arg Gly Arg Leu Ala Ala Ile Tyr Phe Gln Phe Tyr Asp Leu
2255 2260 2265
gcc gtc gcg cgt tgc ctg atg gca gaa caa gct tac cgt tgg gaa 6849
Ala Val Ala Arg Cys Leu Met Ala Glu Gln Ala Tyr Arg Trp Glu
2270 2275 2280
ctc aat gat gac tct gcc cgc ttc att aaa ccg ggc gcc tgg cag 6894
Leu Asn Asp Asp Ser Ala Arg Phe Ile Lys Pro Gly Ala Trp Gln
2285 2290 2295
gga acc tat gcc ggt ctg ctt gca ggt gaa acc ttg atg ctg agt 6939
Gly Thr Tyr Ala Gly Leu Leu Ala Gly Glu Thr Leu Met Leu Ser
2300 2305 2310
ctg gca caa atg gaa gac gct cat ctg aaa cgc gat aaa cgc gca 6984
Leu Ala Gln Met Glu Asp Ala His Leu Lys Arg Asp Lys Arg Ala
2315 2320 2325
tta gag gtt gaa cgc aca gta tcg ctg gcc gaa gtt tat gca gga 7029
Leu Glu Val Glu Arg Thr Val Ser Leu Ala Glu Val Tyr Ala Gly
2330 2335 2340
tta cca aaa gat aac ggt cca ttt tcc ctg gct cag gaa att gac 7074
Leu Pro Lys Asp Asn Gly Pro Phe Ser Leu Ala Gln Glu Ile Asp
2345 2350 2355
aag ctg gtg agt caa ggt tca ggc agt gcc ggc agt ggt aat aat 7119
Lys Leu Val Ser Gln Gly Ser Gly Ser Ala Gly Ser Gly Asn Asn
2360 2365 2370
aat ttg gcg ttc ggc gcc ggc acg gac act aaa acc tct ttg cag 7164
Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys Thr Ser Leu Gln
2375 2380 2385
gca tca gtt tca ttc gct gat ttg aaa att cgt gaa gat tac ccg 7209
Ala Ser Val Ser Phe Ala Asp Leu Lys Ile Arg Glu Asp Tyr Pro
2390 2395 2400
gca tcg ctt ggc aaa att cga cgt atc aaa cag atc agc gtc act 7254
Ala Ser Leu Gly Lys Ile Arg Arg Ile Lys Gln Ile Ser Val Thr
2405 2410 2415
ttg ccc gcg cta ctg gga ccg tat cag gat gta cag gca ata ttg 7299
Leu Pro Ala Leu Leu Gly Pro Tyr Gln Asp Val Gln Ala Ile Leu
2420 2425 2430
tct tac ggc gat aaa gcc gga tta gct aac ggc tgt gaa gcg ctg 7344
Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu
2435 2440 2445
gca gtt tct cac ggt atg aat gac agc ggc caa ttc cag ctc gat 7389
Ala Val Ser His Gly Met Asn Asp Ser Gly Gln Phe Gln Leu Asp
2450 2455 2460
ttc aac gat ggc aaa ttc ctg cca ttc gaa ggc atc gcc att gat 7434
Phe Asn Asp Gly Lys Phe Leu Pro Phe Glu Gly Ile Ala Ile Asp
2465 2470 2475
caa ggc acg ctg aca ctg agc ttc cca aat gca tct atg ccg gag 7479
Gln Gly Thr Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu
2480 2485 2490
aaa ggt aaa caa gcc act atg tta aaa acc ctg aac gat atc att 7524
Lys Gly Lys Gln Ala Thr Met Leu Lys Thr Leu Asn Asp Ile Ile
2495 2500 2505
ttg cat att cgc tac acc att aaa taa 7551
Leu His Ile Arg Tyr Thr Ile Lys
2510 2515
<210>22
<211>4431
<212>DNA
<213>Photorhabdus luminescens
<220>
<221>exon
<222>(1)..(4431)
<400>22
atg cag aat tca caa aca ttc agt gtt acc gag ctg tca tta ccc aaa 48
Met Gln Asn Ser Gln Thr Phe Ser Val Thr Glu Leu Ser Leu Pro Lys
1 5 10 15
ggc ggc ggc gct att acc ggt atg ggt gaa gca tta aca cca gcc ggg 96
Gly Gly Gly Ala Ile Thr Gly Met Gly Glu Ala Leu Thr Pro Ala Gly
20 25 30
ccg gat ggt atg gcc gcc tta tcc ctg cca tta ccc att tcc gcc ggg 144
Pro Asp Gly Met Ala Ala Leu Ser Leu Pro Leu Pro Ile Ser Ala Gly
35 40 45
cgt ggt tac gca ccc tcg ctc act ctg aat tac aac agt gga acc ggt 192
Arg Gly Tyr Ala Pro Ser Leu Thr Leu Asn Tyr Asn Ser Gly Thr Gly
50 55 60
aac agc cca ttt ggt ctc ggt tgg gac tgc ggc gtc atg gca att cgt 240
Asn Ser Pro Phe Gly Leu Gly Trp Asp Cys Gly Val Met Ala Ile Arg
65 70 75 80
cgt cgc acc agt acc ggc gta ccg aat tac gat gaa acc gat act ttt 288
Arg Arg Thr Ser Thr Gly Val Pro Asn Tyr Asp Glu Thr Asp Thr Phe
85 90 95
ctg ggg ccg gaa ggt gaa gtg ttg gtc gta gca tta aat gag gca ggt 336
Leu Gly Pro Glu Gly Glu Val Leu Val Val Ala Leu Asn Glu Ala Gly
100 105 110
caa gct gat atc cgc agt gaa tcc tca ttg cag ggc atc aat ttg ggt 384
Gln Ala Asp Ile Arg Ser Glu Ser Ser Leu Gln Gly Ile Asn Leu Gly
115 120 125
gcg acc ttc acc gtt acc tgt tat cgc tcc cgc cta gaa agc cac ttt 432
Ala Thr Phe Thr Val Thr Cys Tyr Arg Ser Arg Leu Glu Ser His Phe
130 135 140
aac cgg ttg gaa tac tgg caa ccc caa aca acc ggc gca acc gat ttc 480
Asn Arg Leu Glu Tyr Trp Gln Pro Gln Thr Thr Gly Ala Thr Asp Phe
145 150 155 160
tgg ctg ata tac agc ccc gac gga cag gtc cat tta ctg ggc aaa aat 528
Trp Leu Ile Tyr Ser Pro Asp Gly Gln Val His Leu Leu Gly Lys Asn
165 170 175
cct cag gca cgt atc agc aat cca ctc aat gtt aac caa aca gcg caa 576
Pro Gln Ala Arg Ile Ser Asn Pro Leu Asn Val Asn Gln Thr Ala Gln
180 185 190
tgg ctg ttg gaa gcc tcg ata tca tcc cac agc gaa cag att tat tat 624
Trp Leu Leu Glu Ala Ser Ile Ser Ser His Ser Glu Gln Ile Tyr Tyr
195 200 205
caa tat cgc gct gaa gat gaa gca ggt tgt gaa acc gac gag cta gca 672
Gln Tyr Arg Ala Glu Asp Glu Ala Gly Cys Glu Thr Asp Glu Leu Ala
210 215 220
gcc cac ccc agc gca acc gtt cag cgc tac ctg caa aca gta cat tac 720
Ala His Pro Ser Ala Thr Val Gln Arg Tyr Leu Gln Thr Val His Tyr
225 230 235 240
ggg aac ctg acc gcc agc gac gtt ttt cct aca cta aac gga gat gac 768
Gly Asn Leu Thr Ala Ser Asp Val Phe Pro Thr Leu Asn Gly Asp Asp
245 250 255
cca ctt aaa tct ggc tgg atg ttc tgt tta gta ttt gac tac ggt gag 816
Pro Leu Lys Ser Gly Trp Met Phe Cys Leu Val Phe Asp Tyr Gly Glu
260 265 270
cgc aaa aac agc tta tct gaa atg ccg ctg ttt aaa gcc aca ggc aat 864
Arg Lys Asn Ser Leu Ser Glu Met Pro Leu Phe Lys Ala Thr Gly Asn
275 280 285
tgg ctt tgc cga aaa gac cgt ttt tcc cgt tat gag tac ggt ttt gaa 912
Trp Leu Cys Arg Lys Asp Arg Phe Ser Arg Tyr Glu Tyr Gly Phe Glu
290 295 300
ttg cgt act cgc cgc tta tgc cgc caa ata ctg atg ttt cac cgt cta 960
Leu Arg Thr Arg Arg Leu Cys Arg Gln Ile Leu Met Phe His Arg Leu
305 310 315 320
caa acc cta tct ggt cag gca aag ggg gat gat gaa cct gcg cta gtg 1008
Gln Thr Leu Ser Gly Gln Ala Lys Gly Asp Asp Glu Pro Ala Leu Val
325 330 335
tcg cgt ctg ata ctg gat tat gac gaa aac gcg atg gtc agt acg ctc 1056
Ser Arg Leu Ile Leu Asp Tyr Asp Glu Asn Ala Met Val Ser Thr Leu
340 345 350
gtt tct gtc cgc cgg gta ggc cat gag gac aac aac acg gtt acc gcg 1104
Val Ser Val Arg Arg Val Gly His Glu Asp Asn Asn Thr Val Thr Ala
355 360 365
ctg cca cca ctg gaa ctg gcc tat cag cct ttt gag cca gaa caa acc 1152
Leu Pro Pro Leu Glu Leu Ala Tyr Gln Pro Phe Glu Pro Glu Gln Thr
370 375 380
gca ctc tgg caa tca atg gat gta ctg gca aat ttc aac acc att cag 1200
Ala Leu Trp Gln Ser Met Asp Val Leu Ala Asn Phe Asn Thr Ile Gln
385 390 395 400
cgc tgg caa ctg ctt gac ctg aaa gga gaa ggc gtg ccc ggc att ctc 1248
Arg Trp Gln Leu Leu Asp Leu Lys Gly Glu Gly Val Pro Gly Ile Leu
405 410 415
tat cag gat aga aat ggc tgg tgg tat cga tct gcc caa cgt cag gcc 1296
Tyr Gln Asp Arg Asn Gly Trp Trp Tyr Arg Ser Ala Gln Arg Gln Ala
420 425 430
ggg gaa gag atg aat gcg gtc acc tgg ggg aaa atg caa ctc ctt ccc 1344
Gly Glu Glu Met Asn Ala Val Thr Trp Gly Lys Met Gln Leu Leu Pro
435 440 445
atc aca cca gct gtg cag gat aac gcc tca ctg atg gat att aac ggt 1392
Ile Thr Pro Ala Val Gln Asp Asn Ala Ser Leu Met Asp Ile Asn Gly
450 455 460
gac ggg caa ctg gac tgg gtg att acc ggg ccg ggg cta agg ggc tat 1440
Asp Gly Gln Leu Asp Trp Val Ile Thr Gly Pro Gly Leu Arg Gly Tyr
465 470 475 480
cac agc caa cac ccg gat ggc agt tgg acg cgt ttt acg cca tta cat 1488
His Ser Gln His Pro Asp Gly Ser Trp Thr Arg Phe Thr Pro Leu His
485 490 495
gcc ctg ccg ata gaa tat tct cat cct cgc gct caa ctt gcc gat tta 1536
Ala Leu Pro Ile Glu Tyr Ser His Pro Arg Ala Gln Leu Ala Asp Leu
500 505 510
atg gga gcc ggg ctg tcc gat tta gtg cta att ggt ccc aaa agt gtg 1584
Met Gly Ala Gly Leu Ser Asp Leu Val Leu Ile Gly Pro Lys Ser Val
515 520 525
cgc tta tat gtc aat aac cgt gat ggt ttt acc gaa ggg cgg gat gtg 1632
Arg Leu Tyr Val Asn Asn Arg Asp Gly Phe Thr Glu Gly Arg Asp Val
530 535 540
gtg caa tcc ggt gat atc acc ctg ccg cta ccg ggc gcc gat gcc cgt 1680
Val Gln Ser Gly Asp Ile Thr Leu Pro Leu Pro Gly Ala Asp Ala Arg
545 550 555 560
aag tta gtg gca ttt agt gac gta ctg ggt tca ggc caa gca cat ctg 1728
Lys Leu Val Ala Phe Ser Asp Val Leu Gly Ser Gly Gln Ala His Leu
565 570 575
gtt gaa gtt agt gca act caa gtc acc tgc tgg ccg aat ctg ggg cat 1776
Val Glu Val Ser Ala Thr Gln Val Thr Cys Trp Pro Asn Leu Gly His
580 585 590
ggc cgt ttt ggt cag cca atc gta ttg ccg gga ttc agc caa tct gcc 1824
Gly Arg Phe Gly Gln Pro Ile Val Leu Pro Gly Phe Ser Gln Ser Ala
595 600 605
gcc agt ttt aat cct gat cga gtt cat ctg gcc gat ttg gat ggg agc 1872
Ala Ser Phe Asn Pro Asp Arg Val His Leu Ala Asp Leu Asp Gly Ser
610 615 620
ggc cct gcc gat ttg att tat gtt cat gct gac cgt ctg gat att ttc 1920
Gly Pro Ala Asp Leu Ile Tyr Val His Ala Asp Arg Leu Asp Ile Phe
625 630 635 640
agc aat gaa agt ggc aac ggt ttt gca aaa cca ttc aca ctc tct ttt 1968
Ser Asn Glu Ser Gly Asn Gly Phe Ala Lys Pro Phe Thr Leu Ser Phe
645 650 655
cct gac ggc ctg cgt ttt gat gat acc tgc cag ttg caa gta gcc gat 2016
Pro Asp Gly Leu Arg Phe Asp Asp Thr Cys Gln Leu Gln Val Ala Asp
660 665 670
gta caa ggg tta ggc gtt gtc agc ctg atc cta agc gta ccg cat atg 2064
Val Gln Gly Leu Gly Val Val Ser Leu Ile Leu Ser Val Pro His Met
675 680 685
gcg cca cat cat tgg cgc tgc gat ctg acc aac gcg aaa ccg tgg tta 2112
Ala Pro His His Trp Arg Cys Asp Leu Thr Asn Ala Lys Pro Trp Leu
690 695 700
ctc agt gaa acg aac aac aat atg ggg gcc aat cac acc ttg cat tac 2160
Leu Ser Glu Thr Asn Asn Asn Met Gly Ala Asn His Thr Leu His Tyr
705 710 715 720
cgt agc tct gtc cag ttc tgg ctg gat gaa aaa gct gcg gca ttg gct 2208
Arg Ser Ser Val Gln Phe Trp Leu Asp Glu Lys Ala Ala Ala Leu Ala
725 730 735
acc gga caa aca ccg gtc tgt tac ctg ccc ttc ccg gtc cat acc ctt 2256
Thr Gly Gln Thr Pro Val Cys Tyr Leu Pro Phe Pro Val His Thr Leu
740 745 750
tgg caa aca gaa acc gag gat gaa atc agc ggc aat aag tta gtg acc 2304
Trp Gln Thr Glu Thr Glu Asp Glu Ile Ser Gly Asn Lys Leu Val Thr
755 760 765
acg tta cgt tat gct cac ggc gct tgg gat gga cgt gaa cgg gaa ttt 2352
Thr Leu Arg Tyr Ala His Gly Ala Trp Asp Gly Arg Glu Arg Glu Phe
770 775 780
cgt ggc ttt ggt tat gtt gag cag aca gac agc cat caa ctc gct caa 2400
Arg Gly Phe Gly Tyr Val Glu Gln Thr Asp Ser His Gln Leu Ala Gln
785 790 795 800
ggc aat gcg ccg gaa cgt aca cca ccg gca ctc acc aaa agc tgg tat 2448
Gly Asn Ala Pro Glu Arg Thr Pro Pro Ala Leu Thr Lys Ser Trp Tyr
805 810 815
gcc acc gga tta cct gcg gta gat aat gcg tta tcc gcc ggg tat tgg 2496
Ala Thr Gly Leu Pro Ala Val Asp Asn Ala Leu Ser Ala Gly Tyr Trp
820 825 830
cgt ggc gat aag caa gct ttc gcc ggt ttt acg cca cgt ttt act ctc 2544
Arg Gly Asp Lys Gln Ala Phe Ala Gly Phe Thr Pro Arg Phe Thr Leu
835 840 845
tgg aaa gag ggc aaa gat gtt cca ctg aca ccg gaa gat gac cat aat 2592
Trp Lys Glu Gly Lys Asp Val Pro Leu Thr Pro Glu Asp Asp His Asn
850 855 860
cta tac tgg tta aac cgg gcg cta aaa ggt cag cca ctg cgt agt gaa 2640
Leu Tyr Trp Leu Asn Arg Ala Leu Lys Gly Gln Pro Leu Arg Ser Glu
865 870 875 880
ctc tac ggg ctg gat ggc agc gca cag caa cag atc ccc tat aca gtg 2688
Leu Tyr Gly Leu Asp Gly Ser Ala Gln Gln Gln Ile Pro Tyr Thr Val
885 890 895
act gaa tcc cgt cca cag gtg cgc caa tta caa gat ggc gcc acc gtt 2736
Thr Glu Ser Arg Pro Gln Val Arg Gln Leu Gln Asp Gly Ala Thr Val
900 905 910
tcc ccg gtg ctc tgg gcc tca gtc gtg gaa agc cgt agt tat cac tac 2784
Ser Pro Val Leu Trp Ala Ser Val Val Glu Ser Arg Ser Tyr His Tyr
915 920 925
gaa cgt att atc agt gat ccc cag tgc aat cag gat atc acg ttg tcc 2832
Glu Arg Ile Ile Ser Asp Pro Gln Cys Asn Gln Asp Ile Thr Leu Ser
930 935 940
agt gac cta ttc ggg caa cca ctg aaa cag gtt tcc gta caa tat ccc 2880
Ser Asp Leu Phe Gly Gln Pro Leu Lys Gln Val Ser Val Gln Tyr Pro
945 950 955 960
cgc cgc aac aaa cca aca acc aat ccg tat ccc gat acc cta ccg gat 2928
Arg Arg Asn Lys Pro Thr Thr Asn Pro Tyr Pro Asp Thr Leu Pro Asp
965 970 975
acg ctg ttt gcc agc agt tat gac gat caa caa cag cta ttg cga tta 2976
Thr Leu Phe Ala Ser Ser Tyr Asp Asp Gln Gln Gln Leu Leu Arg Leu
980 985 990
acc tgc cga caa tcc agt tgg cac cat ctt att ggt aat gag cta aga 3024
Thr Cys Arg Gln Ser Ser Trp His His Leu Ile Gly Asn Glu Leu Arg
995 1000 1005
gtg ttg gga tta ccg gat ggc aca cgc agt gat gcc ttt act tac 3069
Val Leu Gly Leu Pro Asp Gly Thr Arg Ser Asp Ala Phe Thr Tyr
1010 1015 1020
gat gcc aaa cag gta cct gtc gat ggc tta aat ctg gaa acc ctg 3114
Asp Ala Lys Gln Val Pro Val Asp Gly Leu Asn Leu Glu Thr Leu
1025 1030 1035
tgt gct gaa aat agc ctg att gcc gat gat aaa cct cgc gaa tac 3159
Cys Ala Glu Asn Ser Leu Ile Ala Asp Asp Lys Pro Arg Glu Tyr
1040 1045 1050
ctc aat cag caa cga acg ttc tat acc gac ggg aaa aac caa aca 3204
Leu Asn Gln Gln Arg Thr Phe Tyr Thr Asp Gly Lys Asn Gln Thr
1055 1060 1065
ccg ctg aaa aca ccg aca cga caa gcg tta atc gcc ttt acc gaa 3249
Pro Leu Lys Thr Pro Thr Arg Gln Ala Leu Ile Ala Phe Thr Glu
1070 1075 1080
acg gcg gta tta acg gaa tct ctg tta tcc gcg ttt gat ggc ggt 3294
Thr Ala Val Leu Thr Glu Ser Leu Leu Ser Ala Phe Asp Gly Gly
1085 1090 1095
att acg cca gac gaa tta ccg gga ata ctg aca cag gcc gga tac 3339
Ile Thr Pro Asp Glu Leu Pro Gly Ile Leu Thr Gln Ala Gly Tyr
1100 1105 1110
caa caa gag cct tat ctg ttt cca cgc acc ggc gaa aac aaa gtt 3384
Gln Gln Glu Pro Tyr Leu Phe Pro Arg Thr Gly Glu Asn Lys Val
1115 1120 1125
tgg gta gcg cgt caa ggc tat acc gat tac ggg acg gaa gca caa 3429
Trp Val Ala Arg Gln Gly Tyr Thr Asp Tyr Gly Thr Glu Ala Gln
1130 1135 1140
ttt tgg cgt cct gtc gca caa cgt aac agc ctg tta acc ggg aaa 3474
Phe Trp Arg Pro Val Ala Gln Arg Asn Ser Leu Leu Thr Gly Lys
1145 1150 1155
atg acg tta aaa tgg gat act cac tat tgt gtc atc acc caa acc 3519
Met Thr Leu Lys Trp Asp Thr His Tyr Cys Val Ile Thr Gln Thr
1160 1165 1170
caa gat gct gcc ggc ctc acc gtc tca gcc aat tat gac tgg cgt 3564
Gln Asp Ala Ala Gly Leu Thr Val Ser Ala Asn Tyr Asp Trp Arg
1175 1180 1185
ttt ctc aca cca acg caa ctg act gac atc aac gat aat gtg cat 3609
Phe Leu Thr Pro Thr Gln Leu Thr Asp Ile Asn Asp Asn Val His
1190 1195 1200
ctc atc acc ttg gat gct ctg gga cgc cct gtc acg caa cgt ttc 3654
Leu Ile Thr Leu Asp Ala Leu Gly Arg Pro Val Thr Gln Arg Phe
1205 1210 1215
tgg ggg atc gaa agc ggt gtg gca aca ggt tac tct tca tca gaa 3699
Trp Gly Ile Glu Ser Gly Val Ala Thr Gly Tyr Ser Ser Ser Glu
1220 1225 1230
gaa aaa cca ttc tct cca cca aac gat atc gat acc gct att aat 3744
Glu Lys Pro Phe Ser Pro Pro Asn Asp Ile Asp Thr Ala Ile Asn
1235 1240 1245
cta acc gga cca ctc cct gtc gca cag tgt ctg gtc tat gca ccg 3789
Leu Thr Gly Pro Leu Pro Val Ala Gln Cys Leu Val Tyr Ala Pro
1250 1255 1260
gac agt tgg atg cca cta ttc agt caa gaa acc ttc aac aca tta 3834
Asp Ser Trp Met Pro Leu Phe Ser Gln Glu Thr Phe Asn Thr Leu
1265 1270 1275
acg cag gaa gag cag gag acg ctg cgt gat tca cgt att atc acg 3879
Thr Gln Glu Glu Gln Glu Thr Leu Arg Asp Ser Arg Ile Ile Thr
1280 1285 1290
gaa gat tgg cgt att tgc gca ctg act cgc cgc cgt tgg cta caa 3924
Glu Asp Trp Arg Ile Cys Ala Leu Thr Arg Arg Arg Trp Leu Gln
1295 1300 1305
agt caa aag atc agt aca cca tta gtt aaa ctg tta acc aac agc 3969
Ser Gln Lys Ile Ser Thr Pro Leu Val Lys Leu Leu Thr Asn Ser
1310 1315 1320
att ggt tta cct ccc cat aac ctt acg ctg acc aca gac cgt tat 4014
Ile Gly Leu Pro Pro His Asn Leu Thr Leu Thr Thr Asp Arg Tyr
1325 1330 1335
gac cgc gac tct gag cag caa att cgc caa caa gtc gca ttt agt 4059
Asp Arg Asp Ser Glu Gln Gln Ile Arg Gln Gln Val Ala Phe Ser
1340 1345 1350
gat ggt ttt ggc cgt ctg cta caa gcg tct gta cga cat gag gca 4104
Asp Gly Phe Gly Arg Leu Leu Gln Ala Ser Val Arg His Glu Ala
1355 1360 1365
ggc gaa gcc tgg caa cgt aac caa gac ggt tct ctg gtg aca aaa 4149
Gly Glu Ala Trp Gln Arg Asn Gln Asp Gly Ser Leu Val Thr Lys
1370 1375 1380
gtg gag aat acc aaa acg cgt tgg gcg gtc acg gga cgc acc gaa 4194
Val Glu Asn Thr Lys Thr Arg Trp Ala Val Thr Gly Arg Thr Glu
1385 1390 1395
tat gat aat aaa ggg caa acg ata cgc act tat cag ccc tat ttc 4239
Tyr Asp Asn Lys Gly Gln Thr Ile Arg Thr Tyr Gln Pro Tyr Phe
1400 1405 1410
ctc aac gac tgg cga tat gtc agt gat gac agc gcc aga aaa gaa 4284
Leu Asn Asp Trp Arg Tyr Val Ser Asp Asp Ser Ala Arg Lys Glu
1415 1420 1425
gcc tat gcg gat act cat att tat gat cca att ggg cga gaa atc 4329
Ala Tyr Ala Asp Thr His Ile Tyr Asp Pro Ile Gly Arg Glu Ile
1430 1435 1440
cgg gtt att act gca aaa ggc tgg ctg cgc caa agc caa tat ttc 4374
Arg Val Ile Thr Ala Lys Gly Trp Leu Arg Gln Ser Gln Tyr Phe
1445 1450 1455
ccg tgg ttt acc gtg agt gag gat gag aat gat acg gcc gct gat 4419
Pro Trp Phe Thr Val Ser Glu Asp Glu Asn Asp Thr Ala Ala Asp
1460 1465 1470
gcg ctg gtg taa 4431
Ala Leu Val
1475
<210>23
<211>74
<212>DNA
<213>Artificial sequence
<220>
<223>Forward primer for amplifying the TcdB1 sequence from plasmid pBC-AS4
<400>23
atatagtcga cgaattttaa tctactagta aaaaggagat aaccatgcag aattcacaaa 60
cattcagtgt tacc 74
<210>24
<211>57
<212>DNA
<213>Artificial sequence
<220>
<223>Reverse primer for amplifying the TcdB1 sequence from plasmid pBC-AS4.
<400>24
ataatacgat cgtttctcga gtcattacac cagcgcatca gcggccgtat cattctc 57
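The following is an editorial cross-check, not part of the filed listing. It is a minimal sketch, assuming the reverse primer of SEQ ID NO:24 anneals to the 3' end of the coding sequence listed immediately above (the last eleven codons, ending in the taa stop codon). The helper name `reverse_complement` and the variable names are illustrative only.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    complement = str.maketrans("acgt", "tgca")
    return seq.translate(complement)[::-1]


# Last 33 nt of the coding sequence listed above (11 codons incl. stop):
# gag aat gat acg gcc gct gat gcg ctg gtg taa
cds_3prime_end = "gagaatgatacggccgctgatgcgctggtgtaa"

# SEQ ID NO:24 as listed, with the grouping spaces removed.
reverse_primer = (
    "ataatacgatcgtttctcgagtcattacaccagcgcatcagcggccgtatcattctc"
)

annealing_region = reverse_complement(cds_3prime_end)

# The primer length should agree with the <211> field (57), and the 3' end
# of the primer should equal the reverse complement of the CDS terminus.
length_matches = len(reverse_primer) == 57
primer_matches_cds = reverse_primer.endswith(annealing_region)
print(length_matches, primer_matches_cds)
```

The same check can be repeated for the other reverse primers in this listing by substituting the relevant coding-sequence terminus and primer string.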
<210>25
<211>3132
<212>DNA
<213>Photorhabdus luminescens
<220>
<221>CDS
<222>(1)..(3132)
<400>25
atg agt ccg tct gag act act ctt tat act caa acc cca aca gtc agc 48
Met Ser Pro Ser Glu Thr Thr Leu Tyr Thr Gln Thr Pro Thr Val Ser
1 5 10 15
gtg tta gat aat cgc ggt ctg tcc att cgt gat att ggt ttt cac cgt 96
Val Leu Asp Asn Arg Gly Leu Ser Ile Arg Asp Ile Gly Phe His Arg
20 25 30
att gta atc ggg ggg gat act gac acc cgc gtc acc cgt cac cag tat 144
Ile Val Ile Gly Gly Asp Thr Asp Thr Arg Val Thr Arg His Gln Tyr
35 40 45
gat gcc cgt gga cac ctg aac tac agt att gac cca cgc ttg tat gat 192
Asp Ala Arg Gly His Leu Asn Tyr Ser Ile Asp Pro Arg Leu Tyr Asp
50 55 60
gca aag cag gct gat aac tca gta aag cct aat ttt gtc tgg cag cat 240
Ala Lys Gln Ala Asp Asn Ser Val Lys Pro Asn Phe Val Trp Gln His
65 70 75 80
gat ctg gcc ggt cat gcc ctg cgg aca gag agt gtc gat gct ggt cgt 288
Asp Leu Ala Gly His Ala Leu Arg Thr Glu Ser Val Asp Ala Gly Arg
85 90 95
act gtt gca ttg aat gat att gaa ggt cgt tcg gta atg aca atg aat 336
Thr Val Ala Leu Asn Asp Ile Glu Gly Arg Ser Val Met Thr Met Asn
100 105 110
gcg acc ggt gtt cgt cag acc cgt cgc tat gaa ggc aac acc ttg ccc 384
Ala Thr Gly Val Arg Gln Thr Arg Arg Tyr Glu Gly Asn Thr Leu Pro
115 120 125
ggt cgc ttg tta tct gtg agc gag caa gtt ttc aac caa gag agt gct 432
Gly Arg Leu Leu Ser Val Ser Glu Gln Val Phe Asn Gln Glu Ser Ala
130 135 140
aaa gtg aca gag cgc ttt atc tgg gct ggg aat aca acc tcg gag aaa 480
Lys Val Thr Glu Arg Phe Ile Trp Ala Gly Asn Thr Thr Ser Glu Lys
145 150 155 160
gag tat aac ctc tcc ggt ctg tgt ata cgc cac tac gac aca gcg gga 528
Glu Tyr Asn Leu Ser Gly Leu Cys Ile Arg His Tyr Asp Thr Ala Gly
165 170 175
gtg acc cgg ttg atg agt cag tca ctg gcg ggc gcc atg cta tcc caa 576
Val Thr Arg Leu Met Ser Gln Ser Leu Ala Gly Ala Met Leu Ser Gln
180 185 190
tct cac caa ttg ctg gcg gaa ggg cag gag gct aac tgg agc ggt gac 624
Ser His Gln Leu Leu Ala Glu Gly Gln Glu Ala Asn Trp Ser Gly Asp
195 200 205
gac gaa act gtc tgg cag gga atg ctg gca agt gag gtc tat acg aca 672
Asp Glu Thr Val Trp Gln Gly Met Leu Ala Ser Glu Val Tyr Thr Thr
210 215 220
caa agt acc act aat gcc atc ggg gct tta ctg acc caa acc gat gcg 720
Gln Ser Thr Thr Asn Ala Ile Gly Ala Leu Leu Thr Gln Thr Asp Ala
225 230 235 240
aaa ggc aat att cag cgt ctg gct tat gac att gcc ggt cag tta aaa 768
Lys Gly Asn Ile Gln Arg Leu Ala Tyr Asp Ile Ala Gly Gln Leu Lys
245 250 255
ggg agt tgg ttg acg gtg aaa ggc cag agt gaa cag gtg att gtt aag 816
Gly Ser Trp Leu Thr Val Lys Gly Gln Ser Glu Gln Val Ile Val Lys
260 265 270
tcc ctg agc tgg tca gcc gca ggt cat aaa ttg cgt gaa gag cac ggt 864
Ser Leu Ser Trp Ser Ala Ala Gly His Lys Leu Arg Glu Glu His Gly
275 280 285
aac ggc gtg gtt acg gag tac agt tat gag ccg gaa act caa cgt ctg 912
Asn Gly Val Val Thr Glu Tyr Ser Tyr Glu Pro Glu Thr Gln Arg Leu
290 295 300
ata ggt atc acc acc cgg cgt gcc gaa ggg agt caa tca gga gcc aga 960
Ile Gly Ile Thr Thr Arg Arg Ala Glu Gly Ser Gln Ser Gly Ala Arg
305 310 315 320
gta ttg cag gat cta cgc tat aag tat gat ccg gtg ggg aat gtt atc 1008
Val Leu Gln Asp Leu Arg Tyr Lys Tyr Asp Pro Val Gly Asn Val Ile
325 330 335
agt atc cat aat gat gcc gaa gct acc cgc ttt tgg cgt aat cag aaa 1056
Ser Ile His Asn Asp Ala Glu Ala Thr Arg Phe Trp Arg Asn Gln Lys
340 345 350
gtg gag ccg gag aat cgc tat gtt tat gat tct ctg tat cag ctt atg 1104
Val Glu Pro Glu Asn Arg Tyr Val Tyr Asp Ser Leu Tyr Gln Leu Met
355 360 365
agt gcg aca ggg cgt gaa atg gct aat atc ggt cag caa agc aac caa 1152
Ser Ala Thr Gly Arg Glu Met Ala Asn Ile Gly Gln Gln Ser Asn Gln
370 375 380
ctt ccc tca ccc gtt ata cct gtt cct act gac gac agc act tat acc 1200
Leu Pro Ser Pro Val Ile Pro Val Pro Thr Asp Asp Ser Thr Tyr Thr
385 390 395 400
aat tac ctt cgt acc tat act tat gac cgt ggc ggt aat ttg gtt caa 1248
Asn Tyr Leu Arg Thr Tyr Thr Tyr Asp Arg Gly Gly Asn Leu Val Gln
405 410 415
atc cga cac agt tca ccc gcg act caa aat agt tac acc aca gat atc 1296
Ile Arg His Ser Ser Pro Ala Thr Gln Asn Ser Tyr Thr Thr Asp Ile
420 425 430
acc gtt tca agc cgc agt aac cgg gcg gta ttg agt aca tta acg aca 1344
Thr Val Ser Ser Arg Ser Asn Arg Ala Val Leu Ser Thr Leu Thr Thr
435 440 445
gat cca acc cga gtg gat gcg cta ttt gat tcc ggc ggt cat cag aag 1392
Asp Pro Thr Arg Val Asp Ala Leu Phe Asp Ser Gly Gly His Gln Lys
450 455 460
atg tta ata ccg ggg caa aat ctg gat tgg aat att cgg ggt gaa ttg 1440
Met Leu Ile Pro Gly Gln Asn Leu Asp Trp Asn Ile Arg Gly Glu Leu
465 470 475 480
caa cga gtc aca ccg gtg agc cgt gaa aat agc agt gac agt gaa tgg 1488
Gln Arg Val Thr Pro Val Ser Arg Glu Asn Ser Ser Asp Ser Glu Trp
485 490 495
tat cgc tat agc agt gat ggc atg cgg ctg cta aaa gtg agt gaa cag 1536
Tyr Arg Tyr Ser Ser Asp Gly Met Arg Leu Leu Lys Val Ser Glu Gln
500 505 510
cag acg ggc aac agt act caa gta caa cgg gtg act tat ctg ccg gga 1584
Gln Thr Gly Asn Ser Thr Gln Val Gln Arg Val Thr Tyr Leu Pro Gly
515 520 525
tta gag cta cgg aca act ggg gtt gca gat aaa aca acc gaa gat ttg 1632
Leu Glu Leu Arg Thr Thr Gly Val Ala Asp Lys Thr Thr Glu Asp Leu
530 535 540
cag gtg att acg gta ggt gaa gcg ggt cgc gca cag gta agg gta ttg 1680
Gln Val Ile Thr Val Gly Glu Ala Gly Arg Ala Gln Val Arg Val Leu
545 550 555 560
cac tgg gaa agt ggt aag ccg aca gat att gac aac aat cag gtg cgc 1728
His Trp Glu Ser Gly Lys Pro Thr Asp Ile Asp Asn Asn Gln Val Arg
565 570 575
tac agc tac gat aat ctg ctt ggc tcc agc cag ctt gaa ctg gat agc 1776
Tyr Ser Tyr Asp Asn Leu Leu Gly Ser Ser Gln Leu Glu Leu Asp Ser
580 585 590
gaa ggg cag att ctc agt cag gaa gag tat tat ccg tat ggc ggt acg 1824
Glu Gly Gln Ile Leu Ser Gln Glu Glu Tyr Tyr Pro Tyr Gly Gly Thr
595 600 605
gcg ata tgg gcg gcg aga aat cag aca gaa gcc agc tac aaa ttt att 1872
Ala Ile Trp Ala Ala Arg Asn Gln Thr Glu Ala Ser Tyr Lys Phe Ile
610 615 620
cgt tac tcc ggt aaa gag cgg gat gcc act gga ttg tat tat tac ggc 1920
Arg Tyr Ser Gly Lys Glu Arg Asp Ala Thr Gly Leu Tyr Tyr Tyr Gly
625 630 635 640
tac cgt tat tat caa cct tgg gtg ggt cga tgg ttg agt gct gat ccg 1968
Tyr Arg Tyr Tyr Gln Pro Trp Val Gly Arg Trp Leu Ser Ala Asp Pro
645 650 655
gcg gga acc gtg gat ggg ctg aat ttg tac cga atg gtg agg aat aac 2016
Ala Gly Thr Val Asp Gly Leu Asn Leu Tyr Arg Met Val Arg Asn Asn
660 665 670
ccc atc aca ttg act gac cat gac gga tta gca ccg tct cca aat aga 2064
Pro Ile Thr Leu Thr Asp His Asp Gly Leu Ala Pro Ser Pro Asn Arg
675 680 685
aat cga aat aca ttt tgg ttt gct tca ttt ttg ttt cgt aaa cct gat 2112
Asn Arg Asn Thr Phe Trp Phe Ala Ser Phe Leu Phe Arg Lys Pro Asp
690 695 700
gag gga atg tcc gcg tca atg aga cgg gga caa aaa att ggc aga gcc 2160
Glu Gly Met Ser Ala Ser Met Arg Arg Gly Gln Lys Ile Gly Arg Ala
705 710 715 720
att gcc ggc ggg att gcg att ggc ggt ctt gcg gct acc att gcc gct 2208
Ile Ala Gly Gly Ile Ala Ile Gly Gly Leu Ala Ala Thr Ile Ala Ala
725 730 735
acg gct ggc gcg gct atc ccc gtc att ctg ggg gtt gcg gcc gta ggc 2256
Thr Ala Gly Ala Ala Ile Pro Val Ile Leu Gly Val Ala Ala Val Gly
740 745 750
gcg ggg att ggc gcg ttg atg gga tat aac gtc ggt agc ctg ctg gaa 2304
Ala Gly Ile Gly Ala Leu Met Gly Tyr Asn Val Gly Ser Leu Leu Glu
755 760 765
aaa ggc ggg gca tta ctt gct cga ctc gta cag ggg aaa tcg acg tta 2352
Lys Gly Gly Ala Leu Leu Ala Arg Leu Val Gln Gly Lys Ser Thr Leu
770 775 780
gta cag tcg gcg gct ggc gcg gct gcc gga gcg agt tca gcc gcg gct 2400
Val Gln Ser Ala Ala Gly Ala Ala Ala Gly Ala Ser Ser Ala Ala Ala
785 790 795 800
tat ggc gca cgg gca caa ggt gtc ggt gtt gca tca gcc gcc ggg gcg 2448
Tyr Gly Ala Arg Ala Gln Gly Val Gly Val Ala Ser Ala Ala Gly Ala
805 810 815
gta aca ggg gct gtg gga tca tgg ata aat aat gct gat cgg ggg att 2496
Val Thr Gly Ala Val Gly Ser Trp Ile Asn Asn Ala Asp Arg Gly Ile
820 825 830
ggc ggc gct att ggg gcc ggg agt gcg gta ggc acc att gat act atg 2544
Gly Gly Ala Ile Gly Ala Gly Ser Ala Val Gly Thr Ile Asp Thr Met
835 840 845
tta ggg act gcc tct acc ctt acc cat gaa gtc ggg gca gcg gcg ggt 2592
Leu Gly Thr Ala Ser Thr Leu Thr His Glu Val Gly Ala Ala Ala Gly
850 855 860
ggg gcg gcg ggt ggg atg atc acc ggt acg caa ggg agt act cgg gca 2640
Gly Ala Ala Gly Gly Met Ile Thr Gly Thr Gln Gly Ser Thr Arg Ala
865 870 875 880
ggt atc cat gcc ggt att ggc acc tat tat ggc tcc tgg att ggt ttt 2688
Gly Ile His Ala Gly Ile Gly Thr Tyr Tyr Gly Ser Trp Ile Gly Phe
885 890 895
ggt tta gat gtc gct agt aac ccc gcc gga cat tta gcg aat tac gca 2736
Gly Leu Asp Val Ala Ser Asn Pro Ala Gly His Leu Ala Asn Tyr Ala
900 905 910
gtg ggt tat gcc gct ggt ttg ggt gct gaa atg gct gtc aac aga ata 2784
Val Gly Tyr Ala Ala Gly Leu Gly Ala Glu Met Ala Val Asn Arg Ile
915 920 925
atg ggt ggt gga ttt ttg agt agg ctc tta ggc cgg gtt gtc agc cca 2832
Met Gly Gly Gly Phe Leu Ser Arg Leu Leu Gly Arg Val Val Ser Pro
930 935 940
tat gcc gcc ggt tta gcc aga caa tta gta cat ttc agt gtc gcc aga 2880
Tyr Ala Ala Gly Leu Ala Arg Gln Leu Val His Phe Ser Val Ala Arg
945 950 955 960
cct gtc ttt gag ccg ata ttt agt gtt ctc ggc ggg ctt gtc ggt ggt 2928
Pro Val Phe Glu Pro Ile Phe Ser Val Leu Gly Gly Leu Val Gly Gly
965 970 975
att gga act ggc ctg cac aga gtg atg gga aga gag agt tgg att tcc 2976
Ile Gly Thr Gly Leu His Arg Val Met Gly Arg Glu Ser Trp Ile Ser
980 985 990
aga gcg tta agt gct gcc ggt agt ggt ata gat cat gtc gct ggc atg 3024
Arg Ala Leu Ser Ala Ala Gly Ser Gly Ile Asp His Val Ala Gly Met
995 1000 1005
att ggt aat cag atc aga ggc agg gtc ttg acc aca acc ggg atc 3069
Ile Gly Asn Gln Ile Arg Gly Arg Val Leu Thr Thr Thr Gly Ile
1010 1015 1020
gct aat gcg ata gac tat ggc acc agt gct gtg gga gcc gca cga 3114
Ala Asn Ala Ile Asp Tyr Gly Thr Ser Ala Val Gly Ala Ala Arg
1025 1030 1035
cga gtt ttt tct ttg taa 3132
Arg Val Phe Ser Leu
1040
<210>26
<211>77
<212>DNA
<213>Artificial sequence
<220>
<223>Forward primer for amplifying TccC1 from the pBC KS+ vector
<400>26
gtcgacgcac tactagtaaa aaggagataa ccccatgagc ccgtctgaga ctactcttta 60
tactcaaacc ccaacag 77
<210>27
<211>57
<212>DNA
<213>Artificial sequence
<220>
<223>Reverse primer for amplifying TccC1 from the pBC KS+ vector
<400>27
cggccgcagt cctcgagtca gattaattac aaagaaaaaa ctcgtcgtgc ggctccc 57
<210>28
<211>84
<212>DNA
<213>Artificial sequence
<220>
<223>Forward primer for amplifying XptA2.
<400>28
gtctagacgt gcgtcgacaa gaaggagata taccatgtat agcacggctg tattactcaa 60
taaaatcagt cccactcgcg acgg 84
<210>29
<211>64
<212>DNA
<213>Artificial sequence
<220>
<223>Reverse primer for amplifying XptA2.
<400>29
gctcgagatt aattaagaac gaatggtata gcggatatgc agaatgatat cgctcaggct 60
ctcc 64
<210>30
<211>81
<212>DNA
<213>Artificial sequence
<220>
<223>Forward primer for amplifying XptC1.
<400>30
gtctagacgt gcgtcgacaa gaaggagata taccatgcag ggttcaacac ctttgaaact 60
tgaaataccg tcattgccct c 81
<210>31
<211>64
<212>DNA
<213>Artificial sequence
<220>
<223>Reverse primer for amplifying XptC1.
<400>31
gactcgagag cattaattat gctgtcattt caccggcagt gtcattttca tcttcattca 60
ccac 64
<210>32
<211>86
<212>DNA
<213>Artificial sequence
<220>
<223>Forward primer for amplifying XptB1.
<400>32
gtctagacgt gcgtcgacaa gaaggagata taccatgaag aatttcgttc acagcaatac 60
gccatccgtc accgtactgg acaacc 86
<210>33
<211>66
<212>DNA
<213>Artificial sequence
<220>
<223>Reverse primer for amplifying XptB1.
<400>33
gctcgagcag attaattatg cttcggattc attatgacgt gcagaggcgt taaagaagaa 60
gttatt 66
<210>34
<211>2538
<212>PRT
<213>Xenorhabdus nematophilus
<400>34
Met Tyr Ser Thr Ala Val Leu Leu Asn Lys Ile Ser Pro Thr Arg Asp
1 5 10 15
Gly Gln Thr Met Thr Leu Ala Asp Leu Gln Tyr Leu Ser Phe Ser Glu
20 25 30
Leu Arg Lys Ile Phe Asp Asp Gln Leu Ser Trp Gly Glu Ala Arg His
35 40 45
Leu Tyr His Glu Thr Ile Glu Gln Lys Lys Asn Asn Arg Leu Leu Glu
50 55 60
Ala Arg Ile Phe Thr Arg Ala Asn Pro Gln Leu Ser Gly Ala Ile Arg
65 70 75 80
Leu Gly Ile Glu Arg Asp Ser Val Ser Arg Ser Tyr Asp Glu Met Phe
85 90 95
Gly Ala Arg Ser Ser Ser Phe Val Lys Pro Gly Ser Val Ala Ser Met
100 105 110
Phe Ser Pro Ala Gly Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Lys Asp
115 120 125
Leu His Phe Ser Ser Ser Ala Tyr His Leu Asp Asn Arg Arg Pro Asp
130 135 140
Leu Ala Asp Leu Thr Leu Ser Gln Ser Asn Met Asp Thr Glu Ile Ser
145 150 155 160
Thr Leu Thr Leu Ser Asn Glu Leu Leu Leu Glu His Ile Thr Arg Lys
165 170 175
Thr Gly Gly Asp Ser Asp Ala Leu Met Glu Ser Leu Ser Thr Tyr Arg
180 185 190
Gln Ala Ile Asp Thr Pro Tyr His Gln Pro Tyr Glu Thr Ile Arg Gln
195 200 205
Val Ile Met Thr His Asp Ser Thr Leu Ser Ala Leu Ser Arg Asn Pro
210 215 220
Glu Val Met Gly Gln Ala Glu Gly Ala Ser Leu Leu Ala Ile Leu Ala
225 230 235 240
Asn Ile Ser Pro Glu Leu Tyr Asn Ile Leu Thr Glu Glu Ile Thr Glu
245 250 255
Lys Asn Ala Asp Ala Leu Phe Ala Gln Asn Phe Ser Glu Asn Ile Thr
260 265 270
Pro Glu Asn Phe Ala Ser Gln Ser Trp Ile Ala Lys Tyr Tyr Gly Leu
275 280 285
Glu Leu Ser Glu Val Gln Lys Tyr Leu Gly Met Leu Gln Asn Gly Tyr
290 295 300
Ser Asp Ser Thr Ser Ala Tyr Val Asp Asn Ile Ser Thr Gly Leu Val
305 310 315 320
Val Asn Asn Glu Ser Lys Leu Glu Ala Tyr Lys Ile Thr Arg Val Lys
325 330 335
Thr Asp Asp Tyr Asp Lys Asn Ile Asn Tyr Phe Asp Leu Met Tyr Glu
340 345 350
Gly Asn Asn Gln Phe Phe Ile Arg Ala Asn Phe Lys Val Ser Arg Glu
355 360 365
Phe Gly Ala Thr Leu Arg Lys Asn Ala Gly Pro Ser Gly Ile Val Gly
370 375 380
Ser Leu Ser Gly Pro Leu Ile Ala Asn Thr Asn Phe Lys Ser Asn Tyr
385 390 395 400
Leu Ser Asn Ile Ser Asp Ser Glu Tyr Lys Asn Gly Val Lys Ile Tyr
405 410 415
Ala Tyr Arg Tyr Thr Ser Ser Thr Ser Ala Thr Asn Gln Gly Gly Gly
420 425 430
Ile Phe Thr Phe Glu Ser Tyr Pro Leu Thr Ile Phe Ala Leu Lys Leu
435 440 445
Asn Lys Ala Ile Arg Leu Cys Leu Thr Ser Gly Leu Ser Pro Asn Glu
450 455 460
Leu Gln Thr Ile Val Arg Ser Asp Asn Ala Gln Gly Ile Ile Asn Asp
465 470 475 480
Ser Val Leu Thr Lys Val Phe Tyr Thr Leu Phe Tyr Ser His Arg Tyr
485 490 495
Ala Leu Ser Phe Asp Asp Ala Gln Val Leu Asn Gly Ser Val Ile Asn
500 505 510
Gln Tyr Ala Asp Asp Asp Ser Val Ser His Phe Asn Arg Leu Phe Asn
515 520 525
Thr Pro Pro Leu Lys Gly Lys Ile Phe Glu Ala Asp Gly Asn Thr Val
530 535 540
Ser Ile Asp Pro Asp Glu Glu Gln Ser Thr Phe Ala Arg Ser Ala Leu
545 550 555 560
Met Arg Gly Leu Gly Val Asn Ser Gly Glu Leu Tyr Gln Leu Gly Lys
565 570 575
Leu Ala Gly Val Leu Asp Ala Gln Asn Thr Ile Thr Leu Ser Val Phe
580 585 590
Val Ile Ser Ser Leu Tyr Arg Leu Thr Leu Leu Ala Arg Val His Gln
595 600 605
Leu Thr Val Asn Glu Leu Cys Met Leu Tyr Gly Leu Ser Pro Phe Asn
610 615 620
Gly Lys Thr Thr Ala Ser Leu Ser Ser Gly Glu Leu Pro Arg Leu Val
625 630 635 640
Ile Trp Leu Tyr Gln Val Thr Gln Trp Leu Thr Glu Ala Glu Ile Thr
645 650 655
Thr Glu Ala Ile Trp Leu Leu Cys Thr Pro Glu Phe Ser Gly Asn Ile
660 665 670
Ser Pro Glu Ile Ser Asn Leu Leu Asn Asn Leu Arg Pro Ser Ile Ser
675 680 685
Glu Asp Met Ala Gln Ser His Asn Arg Glu Leu Gln Ala Glu Ile Leu
690 695 700
Ala Pro Phe Ile Ala Ala Thr Leu His Leu Ala Ser Pro Asp Met Ala
705 710 715 720
Arg Tyr Ile Leu Leu Trp Thr Asp Asn Leu Arg Pro Gly Gly Leu Asp
725 730 735
Ile Ala Gly Phe Met Thr Leu Val Leu Lys Glu Ser Leu Asn Ala Asn
740 745 750
Glu Thr Thr Gln Leu Val Gln Phe Cys His Val Met Ala Gln Leu Ser
755 760 765
Leu Ser Val Gln Thr Leu Arg Leu Ser Glu Ala Glu Leu Ser Val Leu
770 775 780
Val Ile Ser Gly Phe Ala Val Leu Gly Ala Lys Asn Gln Pro Ala Gly
785 790 795 800
Gln His Asn Ile Asp Thr Leu Phe Ser Leu Tyr Arg Phe His Gln Trp
805 810 815
Ile Asn Gly Leu Gly Asn Pro Gly Ser Asp Thr Leu Asp Met Leu Arg
820 825 830
Gln Gln Thr Leu Thr Ala Asp Arg Leu Ala Ser Val Met Gly Leu Asp
835 840 845
Ile Ser Met Val Thr Gln Ala Met Val Ser Ala Gly Val Asn Gln Leu
850 855 860
Gln Cys Trp Gln Asp Ile Asn Thr Val Leu Gln Trp Ile Asp Val Ala
865 870 875 880
Ser Ala Leu His Thr Met Pro Ser Val Ile Arg Thr Leu Val Asn Ile
885 890 895
Arg Tyr Val Thr Ala Leu Asn Lys Ala Glu Ser Asn Leu Pro Ser Trp
900 905 910
Asp Glu Trp Gln Thr Leu Ala Glu Asn Met Glu Ala Gly Leu Ser Thr
915 920 925
Gln Gln Ala Gln Thr Leu Ala Asp Tyr Thr Ala Glu Arg Leu Ser Ser
930 935 940
Val Leu Cys Asn Trp Phe Leu Ala Asn Ile Gln Pro Glu Gly Val Ser
945 950 955 960
Leu His Ser Arg Asp Asp Leu Tyr Ser Tyr Phe Leu Ile Asp Asn Gln
965 970 975
Val Ser Ser Ala Ile Lys Thr Thr Arg Leu Ala Glu Ala Ile Ala Gly
980 985 990
Ile Gln Leu Tyr Ile Asn Arg Ala Leu Asn Arg Ile Glu Pro Asn Ala
995 1000 1005
Arg Ala Asp Val Ser Thr Arg Gln Phe Phe Thr Asp Trp Thr Val
1010 1015 1020
Asn Asn Arg Tyr Ser Thr Trp Gly Gly Val Ser Arg Leu Val Tyr
1025 1030 1035
Tyr Pro Glu Asn Tyr Ile Asp Pro Thr Gln Arg Ile Gly Gln Thr
1040 1045 1050
Arg Met Met Asp Glu Leu Leu Glu Asn Ile Ser Gln Ser Lys Leu
1055 1060 1065
Ser Arg Asp Thr Val Glu Asp Ala Phe Lys Thr Tyr Leu Thr Arg
1070 1075 1080
Phe Glu Thr Val Ala Asp Leu Lys Val Val Ser Ala Tyr His Asp
1085 1090 1095
Asn Val Asn Ser Asn Thr Gly Leu Thr Trp Phe Val Gly Gln Thr
1100 1105 1110
Arg Glu Asn Leu Pro Glu Tyr Tyr Trp Arg Asn Val Asp Ile Ser
1115 1120 1125
Arg Met Gln Ala Gly Glu Leu Ala Ala Asn Ala Trp Lys Glu Trp
1130 1135 1140
Thr Lys Ile Asp Thr Ala Val Asn Pro Tyr Lys Asp Ala Ile Arg
1145 1150 1155
Pro Val Ile Phe Arg Glu Arg Leu His Leu Ile Trp Val Glu Lys
1160 1165 1170
Glu Glu Val Ala Lys Asn Gly Thr Asp Pro Val Glu Thr Tyr Asp
1175 1180 1185
Arg Phe Thr Leu Lys Leu Ala Phe Leu Arg His Asp Gly Ser Trp
1190 1195 1200
Ser Ala Pro Trp Ser Tyr Asp Ile Thr Thr Gln Val Glu Ala Val
1205 1210 1215
Thr Asp Lys Lys Pro Asp Thr Glu Arg Leu Ala Leu Ala Ala Ser
1220 1225 1230
Gly Phe Gln Gly Glu Asp Thr Leu Leu Val Phe Val Tyr Lys Thr
1235 1240 1245
Gly Lys Ser Tyr Ser Asp Phe Gly Gly Ser Asn Lys Asn Val Ala
1250 1255 1260
Gly Met Thr Ile Tyr Gly Asp Gly Ser Phe Lys Lys Met Glu Asn
1265 1270 1275
Thr Ala Leu Ser Arg Tyr Ser Gln Leu Lys Asn Thr Phe Asp Ile
1280 1285 1290
Ile His Thr Gln Gly Asn Asp Leu Val Arg Lys Ala Ser Tyr Arg
1295 1300 1305
Phe Ala Gln Asp Phe Glu Val Pro Ala Ser Leu Asn Met Gly Ser
1310 1315 1320
Ala Ile Gly Asp Asp Ser Leu Thr Val Met Glu Asn Gly Asn Ile
1325 1330 1335
Pro Gln Ile Thr Ser Lys Tyr Ser Ser Asp Asn Leu Ala Ile Thr
1340 1345 1350
Leu His Asn Ala Ala Phe Thr Val Arg Tyr Asp Gly Ser Gly Asn
1355 1360 1365
Val Ile Arg Asn Lys Gln Ile Ser Ala Met Lys Leu Thr Gly Val
1370 1375 1380
Asp Gly Lys Ser Gln Tyr Gly Asn Ala Phe Ile Ile Ala Asn Thr
1385 1390 1395
Val Lys His Tyr Gly Gly Tyr Ser Asp Leu Gly Gly Pro Ile Thr
1400 1405 1410
Val Tyr Asn Lys Thr Lys Asn Tyr Ile Ala Ser Val Gln Gly His
1415 1420 1425
Leu Met Asn Ala Asp Tyr Thr Arg Arg Leu Ile Leu Thr Pro Val
1430 1435 1440
Glu Asn Asn Tyr Tyr Ala Arg Leu Phe Glu Phe Pro Phe Ser Pro
1445 1450 1455
Asn Thr Ile Leu Asn Thr Val Phe Thr Val Gly Ser Asn Lys Thr
1460 1465 1470
Ser Asp Phe Lys Lys Cys Ser Tyr Ala Val Asp Gly Asn Asn Ser
1475 1480 1485
Gln Gly Phe Gln Ile Phe Ser Ser Tyr Gln Ser Ser Gly Trp Leu
1490 1495 1500
Asp Ile Asp Thr Gly Ile Asn Asn Thr Asp Ile Lys Ile Thr Val
1505 1510 1515
Met Ala Gly Ser Lys Thr His Thr Phe Thr Ala Ser Asp His Ile
1520 1525 1530
Ala Ser Leu Pro Ala Asn Ser Phe Asp Ala Met Pro Tyr Thr Phe
1535 1540 1545
Lys Pro Leu Glu Ile Asp Ala Ser Ser Leu Ala Phe Thr Asn Asn
1550 1555 1560
Ile Ala Pro Leu Asp Ile Val Phe Glu Thr Lys Ala Lys Asp Gly
1565 1570 1575
Arg Val Leu Gly Lys Ile Lys Gln Thr Leu Ser Val Lys Arg Val
1580 1585 1590
Asn Tyr Asn Pro Glu Asp Ile Leu Phe Leu Arg Glu Thr His Ser
1595 1600 1605
Gly Ala Gln Tyr Met Gln Leu Gly Val Tyr Arg Ile Arg Leu Asn
1610 1615 1620
Thr Leu Leu Ala Ser Gln Leu Val Ser Arg Ala Asn Thr Gly Ile
1625 1630 1635
Asp Thr Ile Leu Thr Met Glu Thr Gln Arg Leu Pro Glu Pro Pro
1640 1645 1650
Leu Gly Glu Gly Phe Phe Ala Asn Phe Val Leu Pro Lys Tyr Asp
1655 1660 1665
Pro Ala Glu His Gly Asp Glu Arg Trp Phe Lys Ile His Ile Gly
1670 1675 1680
Asn Val Gly Gly Asn Thr Gly Arg Gln Pro Tyr Tyr Ser Gly Met
1685 1690 1695
Leu Ser Asp Thr Ser Glu Thr Ser Met Thr Leu Phe Val Pro Tyr
1700 1705 1710
Ala Glu Gly Tyr Tyr Met His Glu Gly Val Arg Leu Gly Val Gly
1715 1720 1725
Tyr Gln Lys Ile Thr Tyr Asp Asn Thr Trp Glu Ser Ala Phe Phe
1730 1735 1740
Tyr Phe Asp Glu Thr Lys Gln Gln Phe Val Leu Ile Asn Asp Ala
1745 1750 1755
Asp His Asp Ser Gly Met Thr Gln Gln Gly Ile Val Lys Asn Ile
1760 1765 1770
Lys Lys Tyr Lys Gly Phe Leu Asn Val Ser Ile Ala Thr Gly Tyr
1775 1780 1785
Ser Ala Pro Met Asp Phe Asn Ser Ala Ser Ala Leu Tyr Tyr Trp
1790 1795 1800
Glu Leu Phe Tyr Tyr Thr Pro Met Met Cys Phe Gln Arg Leu Leu
1805 1810 1815
Gln Glu Lys Gln Phe Asp Glu Ala Thr Gln Trp Ile Asn Tyr Val
1820 1825 1830
Tyr Asn Pro Ala Gly Tyr Ile Val Asn Gly Glu Ile Ala Pro Trp
1835 1840 1845
Ile Trp Asn Cys Arg Pro Leu Glu Glu Thr Thr Ser Trp Asn Ala
1850 1855 1860
Asn Pro Leu Asp Ala Ile Asp Pro Asp Ala Val Ala Gln Asn Asp
1865 1870 1875
Pro Met His Tyr Lys Ile Ala Thr Phe Met Arg Leu Leu Asp Gln
1880 1885 1890
Leu Ile Leu Arg Gly Asp Met Ala Tyr Arg Glu Leu Thr Arg Asp
1895 1900 1905
Ala Leu Asn Glu Ala Lys Met Trp Tyr Val Arg Thr Leu Glu Leu
1910 1915 1920
Leu Gly Asp Glu Pro Glu Asp Tyr Gly Ser Gln Gln Trp Ala Ala
1925 1930 1935
Pro Ser Leu Ser Gly Ala Ala Ser Gln Thr Val Gln Ala Ala Tyr
1940 1945 1950
Gln Gln Asp Leu Thr Met Leu Gly Arg Gly Gly Val Ser Lys Asn
1955 1960 1965
Leu Arg Thr Ala Asn Ser Leu Val Gly Leu Phe Leu Pro Glu Tyr
1970 1975 1980
Asn Pro Ala Leu Thr Asp Tyr Trp Gln Thr Leu Arg Leu Arg Leu
1985 1990 1995
Phe Asn Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Ser
2000 2005 2010
Leu Ala Ile Tyr Ala Glu Pro Thr Asp Pro Lys Ala Leu Leu Thr
2015 2020 2025
Ser Met Val Gln Ala Ser Gln Gly Gly Ser Ala Val Leu Pro Gly
2030 2035 2040
Thr Leu Ser Leu Tyr Arg Phe Pro Val Met Leu Glu Arg Thr Arg
2045 2050 2055
Asn Leu Val Ala Gln Leu Thr Gln Phe Gly Thr Ser Leu Leu Ser
2060 2065 2070
Met Ala Glu His Asp Asp Ala Asp Glu Leu Thr Thr Leu Leu Leu
2075 2080 2085
Gln Gln Gly Met Glu Leu Ala Thr Gln Ser Ile Arg Ile Gln Gln
2090 2095 2100
Arg Thr Val Asp Glu Val Asp Ala Asp Ile Ala Val Leu Ala Glu
2105 2110 2115
Ser Arg Arg Ser Ala Gln Asn Arg Leu Glu Lys Tyr Gln Gln Leu
2120 2125 2130
Tyr Asp Glu Asp Ile Asn His Gly Glu Gln Arg Ala Met Ser Leu
2135 2140 2145
Leu Asp Ala Ala Ala Gly Gln Ser Leu Ala Gly Gln Val Leu Ser
2150 2155 2160
Ile Ala Glu Gly Val Ala Asp Leu Val Pro Asn Val Phe Gly Leu
2165 2170 2175
Ala Cys Gly Gly Ser Arg Trp Gly Ala Ala Leu Arg Ala Ser Ala
2180 2185 2190
Ser Val Met Ser Leu Ser Ala Thr Ala Ser Gln Tyr Ser Ala Asp
2195 2200 2205
Lys Ile Ser Arg Ser Glu Ala Tyr Arg Arg Arg Arg Gln Glu Trp
2210 2215 2220
Glu Ile Gln Arg Asp Asn Ala Asp Gly Glu Val Lys Gln Met Asp
2225 2230 2235
Ala Gln Leu Glu Ser Leu Lys Ile Arg Arg Glu Ala Ala Gln Met
2240 2245 2250
Gln Val Glu Tyr Gln Glu Thr Gln Gln Ala His Thr Gln Ala Gln
2255 2260 2265
Leu Glu Leu Leu Gln Arg Lys Phe Thr Asn Lys Ala Leu Tyr Ser
2270 2275 2280
Trp Met Arg Gly Lys Leu Ser Ala Ile Tyr Tyr Gln Phe Phe Asp
2285 2290 2295
Leu Thr Gln Ser Phe Cys Leu Met Ala Gln Glu Ala Leu Arg Arg
2300 2305 2310
Glu Leu Thr Asp Asn Gly Val Thr Phe Ile Arg Gly Gly Ala Trp
2315 2320 2325
Asn Gly Thr Thr Ala Gly Leu Met Ala Gly Glu Thr Leu Leu Leu
2330 2335 2340
Asn Leu Ala Glu Met Glu Lys Val Trp Leu Glu Arg Asp Glu Arg
2345 2350 2355
Ala Leu Glu Val Thr Arg Thr Val Ser Leu Ala Gln Phe Tyr Gln
2360 2365 2370
Ala Leu Ser Ser Asp Asn Phe Asn Leu Thr Glu Lys Leu Thr Gln
2375 2380 2385
Phe Leu Arg Glu Gly Lys Gly Asn Val Gly Ala Ser Gly Asn Glu
2390 2395 2400
Leu Lys Leu Ser Asn Arg Gln Ile Glu Ala Ser Val Arg Leu Ser
2405 2410 2415
Asp Leu Lys Ile Phe Ser Asp Tyr Pro Glu Ser Leu Gly Asn Thr
2420 2425 2430
Arg Gln Leu Lys Gln Val Ser Val Thr Leu Pro Ala Leu Val Gly
2435 2440 2445
Pro Tyr Glu Asp Ile Arg Ala Val Leu Asn Tyr Gly Gly Ser Ile
2450 2455 2460
Val Met Pro Arg Gly Cys Ser Ala Ile Ala Leu Ser His Gly Val
2465 2470 2475
Asn Asp Ser Gly Gln Phe Met Leu Asp Phe Asn Asp Ser Arg Tyr
2480 2485 2490
Leu Pro Phe Glu Gly Ile Ser Val Asn Asp Ser Gly Ser Leu Thr
2495 2500 2505
Leu Ser Phe Pro Asp Ala Thr Asp Arg Gln Lys Ala Leu Leu Glu
2510 2515 2520
Ser Leu Ser Asp Ile Ile Leu His Ile Arg Tyr Thr Ile Arg Ser
2525 2530 2535
<210>35
<211>3300
<212>DNA
<213>Paenibacillus strain DAS1529
<400>35
atggtgtcaa caacagacaa cacggccggc gtattccggc tcggaaccga agaattaaca 60
gaagcgctta agcagtccgg ttatcggacc gtctttgata ttgtatctga caatcttgcg 120
gaatttcaga aaaacaatcc ggagattccc tcttctgacg cgaaggagat tcatcaatta 180
gccgtccaga ggacagaaaa cttatgcatg ctttataagg cctggcagct gcacaatgat 240
ccggttgtcc agagccttcc caaattatcc gcggataccg gcctgcaagg catgcgtgcc 300
gcgttggagc ggagtcttgg aggcggagcc gattttggag acttgttccc ggagcgatcg 360
ccagagggct atgcggaagc ctcctctata cagtcgcttt tctcgccggg acgttacctg 420
acggtgctgt ataaaattgc gcgggatctc cacgacccaa aagataaact gcatattgac 480
aaccgccgtc cagatttgaa gtcgctgatc ctcaataatg acaatatgaa ccgagaggta 540
tcttctctgg atatccttct ggatgtgctg cagcccgaag gctctgacac gctgacatcc 600
ttgaaggata cctaccatcc gatgaccctt ccctatgatg acgaccttgc gcaaatcaat 660
gccgtggcgg aggcgcgttc atctaatttg ctggggattt gggataccct gctggacacg 720
cagcggactt ccatcctgca gaattccgcc gctgcccgcc ggataagcaa ggcgcggcac 780
tcggcatacg ccaatcagaa agcctccaat gatgagccgg tattcatcac gggagaggaa 840
atctacctgg aaaccggagg taaacggctt tttctggcgc ataaactcga gataggttca 900
actattagcg ctaaaatcaa cattggaccg ccgcaagcgg ccgatatcgc gccggcaaag 960
ttgcaactcg tatattacgg cagaggcggc agagggaact acttcctgcg cgtggcagac 1020
gatgtgtccc tcggtggaaa gctgctgacc aattgttatc tgaccagcga tgacggacag 1080
agcaacaata ttagcgggcc atactgccta atgatcaacc gaggcaccgg cagcatgcct 1140
agcgggactc accttccagt tcagattgaa agagtgaccg atacatccat ccgcattttt 1200
gtgccggatc acggctattt ggggctaggc gaaagccttg ccagcaactg gaatgaaccg 1260
ttggcgctga atctgggctt ggatgaagcg ttgaccttta ccttgagaaa gaaggagacg 1320
ggaaatgaca ccatttccat aatcgacatg ctgccgccgg tagcgaacac gactccgtct 1380
ccgccgacga gggaaacgct ttccttgacg ccaaacagct tccgtctgct ggtcaaccct 1440
gagccgacag cggaggacat cgccaagcac tacaacgtca cgacggtaac ccgggctcct 1500
gccgatctgg cctccgcctt aaatgttgtc gatgatttct gcttgaaaac cggtttgagc 1560
tttaacgaat tgctggattt aaccatgcag aaggattatc agtcaaaaag cagtgagtac 1620
aaaagccgat ttgtaaaatt cggcggcggg gagaatgttc cggtatcaag ctatggcgca 1680
gcctttctga caggagcgga agatactcct ttgtgggtga aacagtataa cagcgtgggg 1740
actgcaacaa gcacccctgt tttaaacttt acgccagata atgttgtggc tttggcagga 1800
agggcggaaa agcttgtccg gctgatgcgc agcacgggtc tttcctttga gcagttggat 1860
tggctgattg ccaatgccag ccgtgccgtt atcgaacacg gtggagagct ttttctggat 1920
aagccggtac tggaagctgt ggccgaattc acaaggctca ataagcgtta tggcgtcaca 1980
tcggatatgt tcgccgcgtt tatcggcgaa gtcaatacgt atacagaagc gggcaaggac 2040
agcttttatc aggcgagttt cagcacggcc gaccattcgg ctaccttacc tttgggcgct 2100
tctttgcaac ttgaggtgag caagcaggat cgatatgaag cgatttgctg cggggctatg 2160
ggggtgaccg ccgatgagtt ctcccgtatc ggcaaatact gctttgggga taaagcacag 2220
caaatcacgg ccaatgaaac aaccgttgcc cagctttatc gtttaggccg aattcctcat 2280
atgctaggct tgcgttttac cgaggcagag ctgttgtgga aattgatggc tgggggcgag 2340
gataccttgc tccgcacgat tggcgcgaac cctcgcagtt tagaagcgtt agagattatt 2400
cgccggacgg aggtcctttt ggactggatg gatgcccatc agctggatgt tgtctccctg 2460
caagccatgg ttaccaatcg gtacagcggc acagccacgc cggagctgta caattttttg 2520
gcacaggtgc atcaatccgc aagcagtgcc gcgaacgtgg ccagagcgga tggtcaggat 2580
acgttgcctg cggacaagct gctccgggca ttggcggcgg gcttcaaact gaaagccaac 2640
gtgatggcgc gagtaatcga ctggatggac aaaaccaata aagcgtttac gctgcgggct 2700
ttctgggaca agcttcaagc gtatttcagc gccgatcatg aagaagaact gaccgccctg 2760
gaaggagaag ccgcaatgct gcagtggtgc cagcagatca gccagtatgc gctcattgtc 2820
cgctggtgcg ggttaagcga gcaggatctg gcgctgctga ccgggaatcc ggagcagctt 2880
ctggacggac aacatacggt gcccgtaccc tcgctgcatc tcctgctggt gctgacccgc 2940
ctgaaggaat ggcagcagcg cgtccaggtt tccagcgagg aggctatgcg ctattttgcc 3000
caggccgatt cgccaaccgt cacgcgcgac gatgcggtta atctgcttgc ccgtatccat 3060
ggctggaatg aagcggatac cgtctcgatg aatgactacc tgctgggaga gaacgaatat 3120
cctaagaact ttgatcagat ctttgcactg gaaagctggg tcaacctggg ccgtcaactg 3180
aacgtgggca gcagaacgct gggagagctg gttgacatgg ctgaagagga taaaaccgcg 3240
gaaaacatgg atctgattac ttcggtggcc catagcctga tggctgcagc gaaagcctga 3300
<210>36
<211>1080
<212>PRT
<213>Paenibacillus strain DAS1529
<400>36
Met Val Ser Thr Thr Asp Asn Thr Ala Gly Val Phe Arg Leu Gly Thr
1 5 10 15
Glu Glu Leu Thr Glu Ala Leu Lys Gln Ser Gly Tyr Arg Thr Val Phe
20 25 30
Asp Ile Val Ser Asp Asn Leu Ala Glu Phe Gln Lys Asn Asn Pro Glu
35 40 45
Ile Pro Ser Ser Asp Ala Lys Glu Ile His Gln Leu Ala Val Gln Arg
50 55 60
Thr Glu Asn Leu Cys Met Leu Tyr Lys Ala Trp Gln Leu His Asn Asp
65 70 75 80
Pro Val Val Gln Ser Leu Pro Lys Leu Ser Ala Asp Thr Gly Leu Gln
85 90 95
Gly Met Arg Ala Ala Leu Glu Arg Ser Leu Gly Gly Gly Ala Asp Phe
100 105 110
Gly Asp Leu Phe Pro Glu Arg Ser Pro Glu Gly Tyr Ala Glu Ala Ser
115 120 125
Ser Ile Gln Ser Leu Phe Ser Pro Gly Arg Tyr Leu Thr Val Leu Tyr
130 135 140
Lys Ile Ala Arg Asp Leu His Asp Pro Lys Asp Lys Leu His Ile Asp
145 150 155 160
Asn Arg Arg Pro Asp Leu Lys Ser Leu Ile Leu Asn Asn Asp Asn Met
165 170 175
Asn Arg Glu Val Ser Ser Leu Asp Ile Leu Leu Asp Val Leu Gln Pro
180 185 190
Glu Gly Ser Asp Thr Leu Thr Ser Leu Lys Asp Thr Tyr His Pro Met
195 200 205
Thr Leu Pro Tyr Asp Asp Asp Leu Ala Gln Ile Asn Ala Val Ala Glu
210 215 220
Ala Arg Ser Ser Asn Leu Leu Gly Ile Trp Asp Thr Leu Leu Asp Thr
225 230 235 240
Gln Arg Thr Ser Ile Leu Gln Asn Ser Ala Ala Ala Arg Arg Ile Ser
245 250 255
Lys Ala Arg His Ser Ala Tyr Ala Asn Gln Lys Ala Ser Asn Asp Glu
260 265 270
Pro Val Phe Ile Thr Gly Glu Glu Ile Tyr Leu Glu Thr Gly Gly Lys
275 280 285
Arg Leu Phe Leu Ala His Lys Leu Glu Ile Gly Ser Thr Ile Ser Ala
290 295 300
Lys Ile Asn Ile Gly Pro Pro Gln Ala Ala Asp Ile Ala Pro Ala Lys
305 310 315 320
Leu Gln Leu Val Tyr Tyr Gly Arg Gly Gly Arg Gly Asn Tyr Phe Leu
325 330 335
Arg Val Ala Asp Asp Val Ser Leu Gly Gly Lys Leu Leu Thr Asn Cys
340 345 350
Tyr Leu Thr Ser Asp Asp Gly Gln Ser Asn Asn Ile Ser Gly Pro Tyr
355 360 365
Cys Leu Met Ile Asn Arg Gly Thr Gly Ser Met Pro Ser Gly Thr His
370 375 380
Leu Pro Val Gln Ile Glu Arg Val Thr Asp Thr Ser Ile Arg Ile Phe
385 390 395 400
Val Pro Asp His Gly Tyr Leu Gly Leu Gly Glu Ser Leu Ala Ser Asn
405 410 415
Trp Asn Glu Pro Leu Ala Leu Asn Leu Gly Leu Asp Glu Ala Leu Thr
420 425 430
Phe Thr Leu Arg Lys Lys Glu Thr Gly Asn Asp Thr Ile Ser Ile Ile
435 440 445
Asp Met Leu Pro Pro Val Ala Asn Thr Thr Pro Ser Pro Pro Thr Arg
450 455 460
Glu Thr Leu Ser Leu Thr Pro Asn Ser Phe Arg Leu Leu Val Asn Pro
465 470 475 480
Glu Pro Thr Ala Glu Asp Ile Ala Lys His Tyr Asn Val Thr Thr Val
485 490 495
Thr Arg Ala Pro Ala Asp Leu Ala Ser Ala Leu Asn Val Val Asp Asp
500 505 510
Phe Cys Leu Lys Thr Gly Leu Ser Phe Asn Glu Leu Leu Asp Leu Thr
515 520 525
Met Gln Lys Asp Tyr Gln Ser Lys Ser Ser Glu Tyr Lys Ser Arg Phe
530 535 540
Val Lys Phe Gly Gly Gly Glu Asn Val Pro Val Ser Ser Tyr Gly Ala
545 550 555 560
Ala Phe Leu Thr Gly Ala Glu Asp Thr Pro Leu Trp Val Lys Gln Tyr
565 570 575
Asn Ser Val Gly Thr Ala Thr Ser Thr Pro Val Leu Asn Phe Thr Pro
580 585 590
Asp Asn Val Val Ala Leu Ala Gly Arg Ala Glu Lys Leu Val Arg Leu
595 600 605
Met Arg Ser Thr Gly Leu Ser Phe Glu Gln Leu Asp Trp Leu Ile Ala
610 615 620
Asn Ala Ser Arg Ala Val Ile Glu His Gly Gly Glu Leu Phe Leu Asp
625 630 635 640
Lys Pro Val Leu Glu Ala Val Ala Glu Phe Thr Arg Leu Asn Lys Arg
645 650 655
Tyr Gly Val Thr Ser Asp Met Phe Ala Ala Phe Ile Gly Glu Val Asn
660 665 670
Thr Tyr Thr Glu Ala Gly Lys Asp Ser Phe Tyr Gln Ala Ser Phe Ser
675 680 685
Thr Ala Asp His Ser Ala Thr Leu Pro Leu Gly Ala Ser Leu Gln Leu
690 695 700
Glu Val Ser Lys Gln Asp Arg Tyr Glu Ala Ile Cys Cys Gly Ala Met
705 710 715 720
Gly Val Thr Ala Asp Glu Phe Ser Arg Ile Gly Lys Tyr Cys Phe Gly
725 730 735
Asp Lys Ala Gln Gln Ile Thr Ala Asn Glu Thr Thr Val Ala Gln Leu
740 745 750
Tyr Arg Leu Gly Arg Ile Pro His Met Leu Gly Leu Arg Phe Thr Glu
755 760 765
Ala Glu Leu Leu Trp Lys Leu Met Ala Gly Gly Glu Asp Thr Leu Leu
770 775 780
Arg Thr Ile Gly Ala Asn Pro Arg Ser Leu Glu Ala Leu Glu Ile Ile
785 790 795 800
Arg Arg Thr Glu Val Leu Leu Asp Trp Met Asp Ala His Gln Leu Asp
805 810 815
Val Val Ser Leu Gln Ala Met Val Thr Asn Arg Tyr Ser Gly Thr Ala
820 825 830
Thr Pro Glu Leu Tyr Asn Phe Leu Ala Gln Val His Gln Ser Ala Ser
835 840 845
Ser Ala Ala Asn Val Ala Arg Ala Asp Gly Gln Asp Thr Leu Pro Ala
850 855 860
Asp Lys Leu Leu Arg Ala Leu Ala Ala Gly Phe Lys Leu Lys Ala Asn
865 870 875 880
Val Met Ala Arg Val Ile Asp Trp Met Asp Lys Thr Asn Lys Ala Phe
885 890 895
Thr Leu Arg Ala Phe Trp Asp Lys Leu Gln Ala Tyr Phe Ser Ala Asp
900 905 910
His Glu Glu Glu Leu Thr Ala Leu Glu Gly Glu Ala Ala Met Leu Gln
915 920 925
Trp Cys Gln Gln Ile Ser Gln Tyr Ala Leu Ile Val Arg Trp Cys Gly
930 935 940
Leu Ser Glu Gln Asp Leu Ala Leu Leu Thr Gly Asn Pro Glu Gln Leu
945 950 955 960
Leu Asp Gly Gln His Thr Val Pro Val Pro Ser Leu His Leu Leu Leu
965 970 975
Val Leu Thr Arg Leu Lys Glu Trp Gln Gln Arg Val Gln Val Ser Ser
980 985 990
Glu Glu Ala Met Arg Tyr Phe Ala Gln Ala Asp Ser Pro Thr Val Thr
995 1000 1005
Arg Asp Asp Ala Val Asn Leu Leu Ala Arg Ile His Gly Trp Asn
1010 1015 1020
Glu Ala Asp Thr Val Ser Met Asn Asp Tyr Leu Leu Gly Glu Asn
1025 1030 1035
Glu Tyr Pro Lys Asn Phe Asp Gln Ile Phe Ala Leu Glu Ser Trp
1040 1045 1050
Val Asn Leu Gly Arg Gln Leu Asn Val Gly Ser Arg Thr Leu Gly
1055 1060 1065
Glu Leu Val Asp Met Ala Glu Glu Asp Lys Thr Ala
1070 1075 1080
<210>37
<211>3627
<212>DNA
<213>Paenibacillus strain DAS1529
<400>37
atgaccaagg aaggtgataa gcatatgtct acttcaaccc tgttgcaatc gattaaagaa 60
gcccgccggg atgcgctggt caaccattat attgctaatc aggttccgac agcgcttgcg 120
gacaagatta cggacgcgga cagcctgtat gagtacttgc tgctggatac caagatcagt 180
gaactcgtaa aaacatcgcc gatagcggag gccatcagca gcgtgcagtt atacatgaac 240
cgctgcgtcg aaggctatga aggcaagttg actccggaaa gtaatactca ttttggccca 300
ggtaaatttc tatataactg ggatacgtac aacaaacgtt tttccacctg ggcaggaaaa 360
gaacgcttga aatattatgc aggcagctat attgagccgt ccttgcgcta caacaaaacc 420
gatccattcc tgaacctgga acagagcatc agccagggaa gaattactga tgataccgta 480
aagaacgcgc tgcaacacta cctgactgaa tatgaagtgt tggcggatct ggattatatc 540
agcgttaata aaggcggcga cgaaagtgtt ttactctttg ttggacgcac caaaaccgta 600
ccgtatgaat actactggcg ccgtttgctt ttaaaaaggg acaataataa taagctagta 660
ccagcagtct ggtctcagtg gaaaaaaatc agtgccaata tcggtgaagc ggttgatagt 720
tatgtggtgc ctcggtggca taaaaaccgg ctacatgtgc aatggtgttc tatagagaaa 780
agtgaaaatg atgccggtga acccattgag aaacgatatt tgaatgactg gttcatggat 840
agttccggag tctggtcttc atttcgaaag attccggttg tggaaaagag tttcgaatat 900
ttggacggaa gcctcgatcc ccgatttgtc gctcttgtta gaaatcaaat attaattgat 960
gagccagaaa tattcagaat tacagtatca gcccctaatc cgatagatgc aaatggaaga 1020
gtagaggtac attttgaaga aaactatgca aacagatata atattaccat taaatatggg 1080
acaacgagtc ttgctattcc tgcagggcag gtagggcatc caaatatctc tattaatgaa 1140
acattaaggg ttgaattcgg caccaggccg gattggtatt atactttcag atatttagga 1200
aatacaatcc aaaactcata cggttcaatt gtcaataatc aattttcacc tccatcagga 1260
agcaatatta aaggtcctat cgaccttacc ctgaaaaata acatcgacct gtcggccttg 1320
ttggatgaga gccttgacgc actgttcgac tataccattc agggcgataa ccaattgggc 1380
ggcttagctg cctttaacgg gccttacgga ctttacttgt gggaaatctt cttccatgtt 1440
ccttttttaa tggcggttcg cttccacacc gagcagcggt atgagttggc ggaacgttgg 1500
tttaaattca tcttcaacag cgcaggatac cgtgatgatt acggcagtct gctgacggat 1560
gacaaaggca acgtgcgtta ctggaacgtg ataccgctgc aagaggacac ggagtgggat 1620
gacacgttgt ccctggcaac gaccgacccg gacgagattg cgatggccga cccgatgcaa 1680
tacaagctgg ctatatttat tcacaccatg gacttcctga tcagccgcgg cgatagcttg 1740
taccggatgc tggagcggga taccctggcc gaagccaaga tgtattacat tcaggccagc 1800
caactgcttg ggccccgccc cgacatccgg ctcaatcaca gttggcctaa tccgaccttg 1860
caaagcgaag cggacgcggt aaccgccgtg ccgacgcgaa gcgattcgcc ggcagcgcca 1920
attttggcct tgcgagcgct tctgacaggc gaaaacggtc atttcctgcc gccttataat 1980
gatgaactgt tcgctttctg ggacaaaatc gatctgcgtt tatacaattt gcgccacaat 2040
ttgagtctgg acggtcagcc gcttcatttg ccgctctttg ccgaaccggt caatccgcgt 2100
gaattgcagg ttcagcatgg cccgggcgat ggcttggggg gaagcgcggg ttccgcccaa 2160
agccgtcaga gtgtctatcg ttttcctctg gtcatcgata aggcgcgcaa tgcggccaac 2220
agtgtcatcc aattcggcaa tgccctggaa aacgcactga ccaagcaaga cagcgaagca 2280
atgaccatgc tgttgcagtc ccagcagcag attgtcctgc agcaaacccg cgatattcag 2340
gagaagaacc tggccgcgct gcaagcaagt ctggaagcaa cgatgacagc gaaagcgggg 2400
gcggagtccc ggaagaccca ttttgccggc ttggcggaca actggatgtc ggacaatgaa 2460
accgcctcac tcgcactgcg taccaccgcg ggaatcatca ataccagctc aaccgtgccg 2520
atcgccatca ccggcggctt ggatatggct ccgaacattt ttggtttcgc agttggaggt 2580
tcccgctggg gagcagccag cgcggctgta gcccaaggat tgcaaatcgc cgccggcgta 2640
atggaacaga cggccaatat tatcgatatt agcgaaagct accgccggcg ccgggaggat 2700
tggctgctgc agcgggatgt tgccgaaaat gaagcggcgc agttggattc gcagattgcg 2760
gccctgcggg aacagatgga tatggcgcgc aagcaacttg cgctggcgga gacggaacag 2820
gcgcacgcgc aagcggtcta cgagctgcaa agcacccgct ttacgaatca agctttgtat 2880
aactggatgg ctggacgtct gtcgtctcta tactatcaaa tgtatgacgc cgcattgccg 2940
ctctgcttga tggcgaagca ggctttagag aaagaaatcg gttcggataa aacggtcgga 3000
gtcttgtccc tcccggcctg gaatgatcta tatcagggat tattggcggg cgaggcgctg 3060
ctgctcgagc ttcagaagct ggagaatctg tggctggagg aagacaagcg cggaatggaa 3120
gccgtaaaaa cagtctctct ggatactctt ctccgcaaaa caaatccgaa ctccgggttt 3180
gcggatctcg tcaaggaggc actggacgaa aacggaaaga cgcctgaccc ggtgagcgga 3240
gtcggcgtac agctgcaaaa caatattttc agcgcaaccc ttgacctctc cgttcttggc 3300
ctggatcgct cttacaatca ggcggaaaag tcccgcagga tcaaaaatat gtcggttacc 3360
ttacctgcgc tattggggcc ttaccaggat atagaggcaa ccttatcgct aggcggcgag 3420
accgttgcgc tgtcccatgg cgtggatgac agcggcttgt tcatcactga tctcaacgac 3480
agccggttcc tgcctttcga gggcatggat ccgttatccg gcacactcgt cctgtcgata 3540
ttccatgccg ggcaagacgg cgaccagcgc ctcctgctgg aaagtctcaa tgacgtcatc 3600
ttccacattc gatatgttat gaaatag 3627
<210>38
<211>1208
<212>PRT
<213>类芽孢杆菌DAS1529株
<400>38
Met Thr Lys Glu Gly Asp Lys His Met Ser Thr Ser Thr Leu Leu Gln
1 5 10 15
Ser Ile Lys Glu Ala Arg Arg Asp Ala Leu Val Asn His Tyr Ile Ala
20 25 30
Asn Gln Val Pro Thr Ala Leu Ala Asp Lys Ile Thr Asp Ala Asp Ser
35 40 45
Leu Tyr Glu Tyr Leu Leu Leu Asp Thr Lys Ile Ser Glu Leu Val Lys
50 55 60
Thr Ser Pro Ile Ala Glu Ala Ile Ser Ser Val Gln Leu Tyr Met Asn
65 70 75 80
Arg Cys Val Glu Gly Tyr Glu Gly Lys Leu Thr Pro Glu Ser Asn Thr
85 90 95
His Phe Gly Pro Gly Lys Phe Leu Tyr Asn Trp Asp Thr Tyr Asn Lys
100 105 110
Arg Phe Ser Thr Trp Ala Gly Lys Glu Arg Leu Lys Tyr Tyr Ala Gly
115 120 125
Ser Tyr Ile Glu Pro Ser Leu Arg Tyr Asn Lys Thr Asp Pro Phe Leu
130 135 140
Asn Leu Glu Gln Ser Ile Ser Gln Gly Arg Ile Thr Asp Asp Thr Val
145 150 155 160
Lys Asn Ala Leu Gln His Tyr Leu Thr Glu Tyr Glu Val Leu Ala Asp
165 170 175
Leu Asp Tyr Ile Ser Val Asn Lys Gly Gly Asp Glu Ser Val Leu Leu
180 185 190
Phe Val Gly Arg Thr Lys Thr Val Pro Tyr Glu Tyr Tyr Trp Arg Arg
195 200 205
Leu Leu Leu Lys Arg Asp Asn Asn Asn Lys Leu Val Pro Ala Val Trp
210 215 220
Ser Gln Trp Lys Lys Ile Ser Ala Asn Ile Gly Glu Ala Val Asp Ser
225 230 235 240
Tyr Val Val Pro Arg Trp His Lys Asn Arg Leu His Val Gln Trp Cys
245 250 255
Ser Ile Glu Lys Ser Glu Asn Asp Ala Gly Glu Pro Ile Glu Lys Arg
260 265 270
Tyr Leu Asn Asp Trp Phe Met Asp Ser Ser Gly Val Trp Ser Ser Phe
275 280 285
Arg Lys Ile Pro Val Val Glu Lys Ser Phe Glu Tyr Leu Asp Gly Ser
290 295 300
Leu Asp Pro Arg Phe Val Ala Leu Val Arg Asn Gln Ile Leu Ile Asp
305 310 315 320
Glu Pro Glu Ile Phe Arg Ile Thr Val Ser Ala Pro Asn Pro Ile Asp
325 330 335
Ala Asn Gly Arg Val Glu Val His Phe Glu Glu Asn Tyr Ala Asn Arg
340 345 350
Tyr Asn Ile Thr Ile Lys Tyr Gly Thr Thr Ser Leu Ala Ile Pro Ala
355 360 365
Gly Gln Val Gly His Pro Asn Ile Ser Ile Asn Glu Thr Leu Arg Val
370 375 380
Glu Phe Gly Thr Arg Pro Asp Trp Tyr Tyr Thr Phe Arg Tyr Leu Gly
385 390 395 400
Asn Thr Ile Gln Asn Ser Tyr Gly Ser Ile Val Asn Asn Gln Phe Ser
405 410 415
Pro Pro Ser Gly Ser Asn Ile Lys Gly Pro Ile Asp Leu Thr Leu Lys
420 425 430
Asn Asn Ile Asp Leu Ser Ala Leu Leu Asp Glu Ser Leu Asp Ala Leu
435 440 445
Phe Asp Tyr Thr Ile Gln Gly Asp Asn Gln Leu Gly Gly Leu Ala Ala
450 455 460
Phe Asn Gly Pro Tyr Gly Leu Tyr Leu Trp Glu Ile Phe Phe His Val
465 470 475 480
Pro Phe Leu Met Ala Val Arg Phe His Thr Glu Gln Arg Tyr Glu Leu
485 490 495
Ala Glu Arg Trp Phe Lys Phe Ile Phe Asn Ser Ala Gly Tyr Arg Asp
500 505 510
Asp Tyr Gly Ser Leu Leu Thr Asp Asp Lys Gly Asn Val Arg Tyr Trp
515 520 525
Asn Val Ile Pro Leu Gln Glu Asp Thr Glu Trp Asp Asp Thr Leu Ser
530 535 540
Leu Ala Thr Thr Asp Pro Asp Glu Ile Ala Met Ala Asp Pro Met Gln
545 550 555 560
Tyr Lys Leu Ala Ile Phe Ile His Thr Met Asp Phe Leu Ile Ser Arg
565 570 575
Gly Asp Ser Leu Tyr Arg Met Leu Glu Arg Asp Thr Leu Ala Glu Ala
580 585 590
Lys Met Tyr Tyr Ile Gln Ala Ser Gln Leu Leu Gly Pro Arg Pro Asp
595 600 605
Ile Arg Leu Asn His Ser Trp Pro Asn Pro Thr Leu Gln Ser Glu Ala
610 615 620
Asp Ala Val Thr Ala Val Pro Thr Arg Ser Asp Ser Pro Ala Ala Pro
625 630 635 640
Ile Leu Ala Leu Arg Ala Leu Leu Thr Gly Glu Asn Gly His Phe Leu
645 650 655
Pro Pro Tyr Asn Asp Glu Leu Phe Ala Phe Trp Asp Lys Ile Asp Leu
660 665 670
Arg Leu Tyr Asn Leu Arg His Asn Leu Ser Leu Asp Gly Gln Pro Leu
675 680 685
His Leu Pro Leu Phe Ala Glu Pro Val Asn Pro Arg Glu Leu Gln Val
690 695 700
Gln His Gly Pro Gly Asp Gly Leu Gly Gly Ser Ala Gly Ser Ala Gln
705 710 715 720
Ser Arg Gln Ser Val Tyr Arg Phe Pro Leu Val Ile Asp Lys Ala Arg
725 730 735
Asn Ala Ala Asn Ser Val Ile Gln Phe Gly Asn Ala Leu Glu Asn Ala
740 745 750
Leu Thr Lys Gln Asp Ser Glu Ala Met Thr Met Leu Leu Gln Ser Gln
755 760 765
Gln Gln Ile Val Leu Gln Gln Thr Arg Asp Ile Gln Glu Lys Asn Leu
770 775 780
Ala Ala Leu Gln Ala Ser Leu Glu Ala Thr Met Thr Ala Lys Ala Gly
785 790 795 800
Ala Glu Ser Arg Lys Thr His Phe Ala Gly Leu Ala Asp Asn Trp Met
805 810 815
Ser Asp Asn Glu Thr Ala Ser Leu Ala Leu Arg Thr Thr Ala Gly Ile
820 825 830
Ile Asn Thr Ser Ser Thr Val Pro Ile Ala Ile Thr Gly Gly Leu Asp
835 840 845
Met Ala Pro Asn Ile Phe Gly Phe Ala Val Gly Gly Ser Arg Trp Gly
850 855 860
Ala Ala Ser Ala Ala Val Ala Gln Gly Leu Gln Ile Ala Ala Gly Val
865 870 875 880
Met Glu Gln Thr Ala Asn Ile Ile Asp Ile Ser Glu Ser Tyr Arg Arg
885 890 895
Arg Arg Glu Asp Trp Leu Leu Gln Arg Asp Val Ala Glu Asn Glu Ala
900 905 910
Ala Gln Leu Asp Ser Gln Ile Ala Ala Leu Arg Glu Gln Met Asp Met
915 920 925
Ala Arg Lys Gln Leu Ala Leu Ala Glu Thr Glu Gln Ala His Ala Gln
930 935 940
Ala Val Tyr Glu Leu Gln Ser Thr Arg Phe Thr Asn Gln Ala Leu Tyr
945 950 955 960
Asn Trp Met Ala Gly Arg Leu Ser Ser Leu Tyr Tyr Gln Met Tyr Asp
965 970 975
Ala Ala Leu Pro Leu Cys Leu Met Ala Lys Gln Ala Leu Glu Lys Glu
980 985 990
Ile Gly Ser Asp Lys Thr Val Gly Val Leu Ser Leu Pro Ala Trp Asn
995 1000 1005
Asp Leu Tyr Gln Gly Leu Leu Ala Gly Glu Ala Leu Leu Leu Glu
1010 1015 1020
Leu Gln Lys Leu Glu Asn Leu Trp Leu Glu Glu Asp Lys Arg Gly
1025 1030 1035
Met Glu Ala Val Lys Thr Val Ser Leu Asp Thr Leu Leu Arg Lys
1040 1045 1050
Thr Asn Pro Asn Ser Gly Phe Ala Asp Leu Val Lys Glu Ala Leu
1055 1060 1065
Asp Glu Asn Gly Lys Thr Pro Asp Pro Val Ser Gly Val Gly Val
1070 1075 1080
Gln Leu Gln Asn Asn Ile Phe Ser Ala Thr Leu Asp Leu Ser Val
1085 1090 1095
Leu Gly Leu Asp Arg Ser Tyr Asn Gln Ala Glu Lys Ser Arg Arg
1100 1105 1110
Ile Lys Asn Met Ser Val Thr Leu Pro Ala Leu Leu Gly Pro Tyr
1115 1120 1125
Gln Asp Ile Glu Ala Thr Leu Ser Leu Gly Gly Glu Thr Val Ala
1130 1135 1140
Leu Ser His Gly Val Asp Asp Ser Gly Leu Phe Ile Thr Asp Leu
1145 1150 1155
Asn Asp Ser Arg Phe Leu Pro Phe Glu Gly Met Asp Pro Leu Ser
1160 1165 1170
Gly Thr Leu Val Leu Ser Ile Phe His Ala Gly Gln Asp Gly Asp
1175 1180 1185
Gln Arg Leu Leu Leu Glu Ser Leu Asn Asp Val Ile Phe His Ile
1190 1195 1200
Arg Tyr Val Met Lys
1205
<210>39
<211>4335
<212>DNA
<213>类芽孢杆菌DAS1529株
<400>39
atgccacaat ctagcaatgc cgatatcaag ctattgtcgc catcgctgcc aaagggcggc 60
ggttccatga agggaatcga agaaaacatc gcggctcccg gctccgacgg catggcacgt 120
tgtaatgtgc cgctgccggt aacctccggc cgctatatta ctcctgatat aagcctgtcc 180
tatgcgagcg gccacggcaa cggcgcttat ggaatgggct ggacgatggg agtgatgagc 240
attagccgga gaacaagccg agggaccccc agttatacat ccgaagacca gttccttggt 300
ccggatgggg aggtgcttgt tccggaaagc aacgaacaag gggagatcat tacccgccac 360
accgatacgg cccaagggat accgttaggc gagacgttta cggttacacg ctattttccc 420
cggatcgaga gcgcttttca tttgctggaa tactgggaag cgcaagcagg aagcgcaaca 480
gcgtcgtttt ggcttattca ctctgccgat ggagtgctgc actgtctggg taaaactgct 540
caggcgagga tagccgcccc tgacgattcc gccaagatcg cagaatggct agtggaggag 600
tccgtctccc ccttcggaga gcatatttat taccaataca aagaagaaga caatcaaggc 660
gtgaatctgg aggaagacaa tcatcaatat ggggcgaacc gctatctgaa atcgattcgc 720
tatggaaata aggttgcctc tccttctctc tatgtctgga agggggaaat tccggcagac 780
ggccaatggc tgtattccgt tatcctggat tatggcgaga acgatacctc agcggatgtt 840
cctcccctat acacgcccca aggggagtgg ctggtgcgcc cggaccgttt ttcccgctat 900
gactacggat ttgaggtccg gacttgccgc ttgtgccgcc aggtcttgat gttccacgtc 960
tttaaggagc ttggcgggga gccggcgctg gtgtggcgga tgcagttgga atacgacgag 1020
aacccggcgg cgtccatgct gagcgcggtc cggcaattgg cttatgaagc agatggggcc 1080
attcgaagct tgccgccgct ggaattcgat tatactccat ttggcatcga gacaacggcc 1140
gattggcagc cttttctgcc tgtgcctgaa tgggcggatg aagaacatta tcagttggtc 1200
gatttgtacg gagaaggcat accgggctta ttatatcaga acaatgacca ctggcattat 1260
cgttcgcccg cccggggcga cacaccggac gggatcgcct ataacagctg gcggccgctt 1320
cctcatatcc ccgtgaactc ccggaacggg atgctgatgg atctgaatgg agacgggtat 1380
ctggaatggt tgcttgcgga acccggggtt gcggggcgct atagcatgaa cccggataag 1440
agctggtccg gttttgtgcc gctccaggca ctgccaacgg aattcttcca tccgcaggca 1500
cagcttgcca atgttaccgg atcgggttta accgacttgg ttatgatcgg tccgaagagc 1560
gtccggtttt atgccggaga agaagcgggc ttcaagcgcg catgtgaagt gtggcagcaa 1620
gtgggcatta ctttgcctgt ggaacgcgtg gataaaaagg aactggtggc attcagcgat 1680
atgctgggat cgggtcagtc tcatctggtg cgcatccggc atgatggcgt tacatgctgg 1740
cctaatctgg ggaacggcgt gttcggggcg ccgttggccc ttcacgggtt tacggcatcg 1800
gagcgggaat tcaatccgga acgtgtatat cttgtggacc ttgatggatc cggcgcttcc 1860
gatatcattt atgcttctcg tgacgctcta ctcatttacc gaaatctttc cggcaatggc 1920
tttgctgatc cggtgcgggt tccgctgcct gacggcgtgc ggtttgataa tctgtgccgg 1980
ctgctgcctg ccgatatccg cgggttaggt gtggccagtc tggtgctgca tgtaccttac 2040
atggcccccc gcagttggaa attagatttc tttgcggcga agccgtattt attgcaaacg 2100
gtcagcaaca atcttggagc ttccagctcg ttttggtacc gaagctccac ccagtattgg 2160
ctggatgaga aacaggcggc ctcatcggct gtctccgctt tgcccttccc gataaacgtg 2220
gtatcggata tgcacacggt ggacgaaatc agcggccgca ccaggactca gaagtatact 2280
taccgccatg gcgtgtatga ccggaccgaa aaggaatttg ccggattcgg ccgcattgac 2340
acatgggaag aggagcggga ttccgaagga accctgagcg tcagcactcc gcccgtgctg 2400
acgcggacct ggtatcatac cgggcaaaag caggatgagg agcgtgccgt gcagcaatat 2460
tggcaaggcg accctgcggc ttttcaggtt aaacccgtcc ggcttactcg attcgatgcg 2520
gcagcggccc aggatctgcc gctagattct aataatgggc agcaagaata ctggctgtac 2580
cgatcattac aagggatgcc gctgcggact gagatttttg cgggagatgt tggcgggtcg 2640
cctccttatc aggtagagag cttccgttat caagtgcgct tggtgcagag catcgattcg 2700
gaatgtgttg ccttgcccat gcagttggag cagcttacgt acaactatga gcaaatcgcc 2760
tctgatccgc agtgttcaca gcagatacag caatggttcg acgaatacgg cgtggcggca 2820
cagagtgtaa caatccaata tccgcgccgg gcacagccgg aggacaatcc gtaccctcgc 2880
acgctgccgg ataccagctg gagcagcagt tatgattcgc agcaaatgct gctgcggttg 2940
accaggcaaa ggcaaaaagc gtaccacctt gcagatcctg aaggctggcg cttgaatatt 3000
ccccatcaga cacgcctgga tgccttcatt tattctgctg acagcgtgcc cgccgaagga 3060
ataagcgccg agctgctgga ggtggacggc acgttacgat cttcggcgct ggaacaggct 3120
tatggcggcc agtcagagat catctatgcg ggcgggggcg aaccggattt gcgagccctg 3180
gtccattaca ccagaagcgc ggttcttgat gaagactgtt tacaagccta tgaaggcgta 3240
ctgagcgata gccaattgaa ctcgcttctt gcctcttccg gctatcaacg aagcgcaaga 3300
atattgggtt cgggcgatga agtggatatt tttgtcgcgg aacaaggatt tacccgttat 3360
gcggatgaac cgaatttttt ccgtattctg gggcaacaat cctctctctt gtccggggaa 3420
caagtattaa catgggatga taatttctgt gcggttacat ccatcgaaga cgcgcttggc 3480
aatcaaattc agattgcata tgattaccgc tttgtggagg ccatccagat taccgatacg 3540
aataataatg tgaatcaggt cgccctggat gctctcggcc gggtcgtata cagccggacc 3600
tggggcacgg aggaagggat aaagaccggc ttccgcccgg aggtggaatt cgcgacgccc 3660
gagacaatgg agcaggcgct tgccctggca tctcccttgc cggttgcatc ctgctgtgta 3720
tatgatgcgc atagctggat gggaacgata actcttgcac aactgtcaga gcttgttcca 3780
gatagtgaaa agcaatggtc gttcttgata gacaatcgct tgattatgcc ggacggcaga 3840
atcagatccc gcggtcggga tccatggtcg cttcaccggc tattgccgcc tgctgtgggc 3900
gaattgctga gcgaggcgga ccgtaaaccg ccgcatacgg taattttggc agcagatcgt 3960
tacccggatg acccatccca gcaaattcag gcgagcatcg tgtttagcga tggctttggg 4020
cgtacgatac aaactgctaa aagagaagat acccgatggg cgattgcgga acgggtggac 4080
tatgacggaa ccggagccgt aatccgcagc tttcagcctt tttatcttga cgactggaat 4140
tatgtgggcg aagaggctgt cagcagctct atgtacgcaa cgatctatta ttatgatgct 4200
ctggcacgac aattaaggat ggtcaacgct aaaggatatg agaggagaac tgctttttac 4260
ccatggttta cagtaaacga agatgaaaat gataccatgg actcatcatt atttgcttca 4320
ccgcctgcgc ggtga 4335
<210>40
<211>1444
<212>PRT
<213>类芽孢杆菌DAS1529株
<400>40
Met Pro Gln Ser Ser Asn Ala Asp Ile Lys Leu Leu Ser Pro Ser Leu
1 5 10 15
Pro Lys Gly Gly Gly Ser Met Lys Gly Ile Glu Glu Asn Ile Ala Ala
20 25 30
Pro Gly Ser Asp Gly Met Ala Arg Cys Asn Val Pro Leu Pro Val Thr
35 40 45
Ser Gly Arg Tyr Ile Thr Pro Asp Ile Ser Leu Ser Tyr Ala Ser Gly
50 55 60
His Gly Asn Gly Ala Tyr Gly Met Gly Trp Thr Met Gly Val Met Ser
65 70 75 80
Ile Ser Arg Arg Thr Ser Arg Gly Thr Pro Ser Tyr Thr Ser Glu Asp
85 90 95
Gln Phe Leu Gly Pro Asp Gly Glu Val Leu Val Pro Glu Ser Asn Glu
100 105 110
Gln Gly Glu Ile Ile Thr Arg His Thr Asp Thr Ala Gln Gly Ile Pro
115 120 125
Leu Gly Glu Thr Phe Thr Val Thr Arg Tyr Phe Pro Arg Ile Glu Ser
130 135 140
Ala Phe His Leu Leu Glu Tyr Trp Glu Ala Gln Ala Gly Ser Ala Thr
145 150 155 160
Ala Ser Phe Trp Leu Ile His Ser Ala Asp Gly Val Leu His Cys Leu
165 170 175
Gly Lys Thr Ala Gln Ala Arg Ile Ala Ala Pro Asp Asp Ser Ala Lys
180 185 190
Ile Ala Glu Trp Leu Val Glu Glu Ser Val Ser Pro Phe Gly Glu His
195 200 205
Ile Tyr Tyr Gln Tyr Lys Glu Glu Asp Asn Gln Gly Val Asn Leu Glu
210 215 220
Glu Asp Asn His Gln Tyr Gly Ala Asn Arg Tyr Leu Lys Ser Ile Arg
225 230 235 240
Tyr Gly Asn Lys Val Ala Ser Pro Ser Leu Tyr Val Trp Lys Gly Glu
245 250 255
Ile Pro Ala Asp Gly Gln Trp Leu Tyr Ser Val Ile Leu Asp Tyr Gly
260 265 270
Glu Asn Asp Thr Ser Ala Asp Val Pro Pro Leu Tyr Thr Pro Gln Gly
275 280 285
Glu Trp Leu Val Arg Pro Asp Arg Phe Ser Arg Tyr Asp Tyr Gly Phe
290 295 300
Glu Val Arg Thr Cys Arg Leu Cys Arg Gln Val Leu Met Phe His Val
305 310 315 320
Phe Lys Glu Leu Gly Gly Glu Pro Ala Leu Val Trp Arg Met Gln Leu
325 330 335
Glu Tyr Asp Glu Asn Pro Ala Ala Ser Met Leu Ser Ala Val Arg Gln
340 345 350
Leu Ala Tyr Glu Ala Asp Gly Ala Ile Arg Ser Leu Pro Pro Leu Glu
355 360 365
Phe Asp Tyr Thr Pro Phe Gly Ile Glu Thr Thr Ala Asp Trp Gln Pro
370 375 380
Phe Leu Pro Val Pro Glu Trp Ala Asp Glu Glu His Tyr Gln Leu Val
385 390 395 400
Asp Leu Tyr Gly Glu Gly Ile Pro Gly Leu Leu Tyr Gln Asn Asn Asp
405 410 415
His Trp His Tyr Arg Ser Pro Ala Arg Gly Asp Thr Pro Asp Gly Ile
420 425 430
Ala Tyr Asn Ser Trp Arg Pro Leu Pro His Ile Pro Val Asn Ser Arg
435 440 445
Asn Gly Met Leu Met Asp Leu Asn Gly Asp Gly Tyr Leu Glu Trp Leu
450 455 460
Leu Ala Glu Pro Gly Val Ala Gly Arg Tyr Ser Met Asn Pro Asp Lys
465 470 475 480
Ser Trp Ser Gly Phe Val Pro Leu Gln Ala Leu Pro Thr Glu Phe Phe
485 490 495
His Pro Gln Ala Gln Leu Ala Asn Val Thr Gly Ser Gly Leu Thr Asp
500 505 510
Leu Val Met Ile Gly Pro Lys Ser Val Arg Phe Tyr Ala Gly Glu Glu
515 520 525
Ala Gly Phe Lys Arg Ala Cys Glu Val Trp Gln Gln Val Gly Ile Thr
530 535 540
Leu Pro Val Glu Arg Val Asp Lys Lys Glu Leu Val Ala Phe Ser Asp
545 550 555 560
Met Leu Gly Ser Gly Gln Ser His Leu Val Arg Ile Arg His Asp Gly
565 570 575
Val Thr Cys Trp Pro Asn Leu Gly Asn Gly Val Phe Gly Ala Pro Leu
580 585 590
Ala Leu His Gly Phe Thr Ala Ser Glu Arg Glu Phe Asn Pro Glu Arg
595 600 605
Val Tyr Leu Val Asp Leu Asp Gly Ser Gly Ala Ser Asp Ile Ile Tyr
610 615 620
Ala Ser Arg Asp Ala Leu Leu Ile Tyr Arg Asn Leu Ser Gly Asn Gly
625 630 635 640
Phe Ala Asp Pro Val Arg Val Pro Leu Pro Asp Gly Val Arg Phe Asp
645 650 655
Asn Leu Cys Arg Leu Leu Pro Ala Asp Ile Arg Gly Leu Gly Val Ala
660 665 670
Ser Leu Val Leu His Val Pro Tyr Met Ala Pro Arg Ser Trp Lys Leu
675 680 685
Asp Phe Phe Ala Ala Lys Pro Tyr Leu Leu Gln Thr Val Ser Asn Asn
690 695 700
Leu Gly Ala Ser Ser Ser Phe Trp Tyr Arg Ser Ser Thr Gln Tyr Trp
705 710 715 720
Leu Asp Glu Lys Gln Ala Ala Ser Ser Ala Val Ser Ala Leu Pro Phe
725 730 735
Pro Ile Asn Val Val Ser Asp Met His Thr Val Asp Glu Ile Ser Gly
740 745 750
Arg Thr Arg Thr Gln Lys Tyr Thr Tyr Arg His Gly Val Tyr Asp Arg
755 760 765
Thr Glu Lys Glu Phe Ala Gly Phe Gly Arg Ile Asp Thr Trp Glu Glu
770 775 780
Glu Arg Asp Ser Glu Gly Thr Leu Ser Val Ser Thr Pro Pro Val Leu
785 790 795 800
Thr Arg Thr Trp Tyr His Thr Gly Gln Lys Gln Asp Glu Glu Arg Ala
805 810 815
Val Gln Gln Tyr Trp Gln Gly Asp Pro Ala Ala Phe Gln Val Lys Pro
820 825 830
Val Arg Leu Thr Arg Phe Asp Ala Ala Ala Ala Gln Asp Leu Pro Leu
835 840 845
Asp Ser Asn Asn Gly Gln Gln Glu Tyr Trp Leu Tyr Arg Ser Leu Gln
850 855 860
Gly Met Pro Leu Arg Thr Glu Ile Phe Ala Gly Asp Val Gly Gly Ser
865 870 875 880
Pro Pro Tyr Gln Val Glu Ser Phe Arg Tyr Gln Val Arg Leu Val Gln
885 890 895
Ser Ile Asp Ser Glu Cys Val Ala Leu Pro Met Gln Leu Glu Gln Leu
900 905 910
Thr Tyr Asn Tyr Glu Gln Ile Ala Ser Asp Pro Gln Cys Ser Gln Gln
915 920 925
Ile Gln Gln Trp Phe Asp Glu Tyr Gly Val Ala Ala Gln Ser Val Thr
930 935 940
Ile Gln Tyr Pro Arg Arg Ala Gln Pro Glu Asp Asn Pro Tyr Pro Arg
945 950 955 960
Thr Leu Pro Asp Thr Ser Trp Ser Ser Ser Tyr Asp Ser Gln Gln Met
965 970 975
Leu Leu Arg Leu Thr Arg Gln Arg Gln Lys Ala Tyr His Leu Ala Asp
980 985 990
Pro Glu Gly Trp Arg Leu Asn Ile Pro His Gln Thr Arg Leu Asp Ala
995 1000 1005
Phe Ile Tyr Ser Ala Asp Ser Val Pro Ala Glu Gly Ile Ser Ala
1010 1015 1020
Glu Leu Leu Glu Val Asp Gly Thr Leu Arg Ser Ser Ala Leu Glu
1025 1030 1035
Gln Ala Tyr Gly Gly Gln Ser Glu Ile Ile Tyr Ala Gly Gly Gly
1040 1045 1050
Glu Pro Asp Leu Arg Ala Leu Val His Tyr Thr Arg Ser Ala Val
1055 1060 1065
Leu Asp Glu Asp Cys Leu Gln Ala Tyr Glu Gly Val Leu Ser Asp
1070 1075 1080
Ser Gln Leu Asn Ser Leu Leu Ala Ser Ser Gly Tyr Gln Arg Ser
1085 1090 1095
Ala Arg Ile Leu Gly Ser Gly Asp Glu Val Asp Ile Phe Val Ala
1100 1105 1110
Glu Gln Gly Phe Thr Arg Tyr Ala Asp Glu Pro Asn Phe Phe Arg
1115 1120 1125
Ile Leu Gly Gln Gln Ser Ser Leu Leu Ser Gly Glu Gln Val Leu
1130 1135 1140
Thr Trp Asp Asp Asn Phe Cys Ala Val Thr Ser Ile Glu Asp Ala
1145 1150 1155
Leu Gly Asn Gln Ile Gln Ile Ala Tyr Asp Tyr Arg Phe Val Glu
1160 1165 1170
Ala Ile Gln Ile Thr Asp Thr Asn Asn Asn Val Asn Gln Val Ala
1175 1180 1185
Leu Asp Ala Leu Gly Arg Val Val Tyr Ser Arg Thr Trp Gly Thr
1190 1195 1200
Glu Glu Gly Ile Lys Thr Gly Phe Arg Pro Glu Val Glu Phe Ala
1205 1210 1215
Thr Pro Glu Thr Met Glu Gln Ala Leu Ala Leu Ala Ser Pro Leu
1220 1225 1230
Pro Val Ala Ser Cys Cys Val Tyr Asp Ala His Ser Trp Met Gly
1235 1240 1245
Thr Ile Thr Leu Ala Gln Leu Ser Glu Leu Val Pro Asp Ser Glu
1250 1255 1260
Lys Gln Trp Ser Phe Leu Ile Asp Asn Arg Leu Ile Met Pro Asp
1265 1270 1275
Gly Arg Ile Arg Ser Arg Gly Arg Asp Pro Trp Ser Leu His Arg
1280 1285 1290
Leu Leu Pro Pro Ala Val Gly Glu Leu Leu Ser Glu Ala Asp Arg
1295 1300 1305
Lys Pro Pro His Thr Val Ile Leu Ala Ala Asp Arg Tyr Pro Asp
1310 1315 1320
Asp Pro Ser Gln Gln Ile Gln Ala Ser Ile Val Phe Ser Asp Gly
1325 1330 1335
Phe Gly Arg Thr Ile Gln Thr Ala Lys Arg Glu Asp Thr Arg Trp
1340 1345 1350
Ala Ile Ala Glu Arg Val Asp Tyr Asp Gly Thr Gly Ala Val Ile
1355 1360 1365
Arg Ser Phe Gln Pro Phe Tyr Leu Asp Asp Trp Asn Tyr Val Gly
1370 1375 1380
Glu Glu Ala Val Ser Ser Ser Met Tyr Ala Thr Ile Tyr Tyr Tyr
1385 1390 1395
Asp Ala Leu Ala Arg Gln Leu Arg Met Val Asn Ala Lys Gly Tyr
1400 1405 1410
Glu Arg Arg Thr Ala Phe Tyr Pro Trp Phe Thr Val Asn Glu Asp
1415 1420 1425
Glu Asn Asp Thr Met Asp Ser Ser Leu Phe Ala Ser Pro Pro Ala
1430 1435 1440
Arg
<210>41
<211>2793
<212>DNA
<213>类芽孢杆菌DAS1529株
<400>41
atgaacacaa cgtccatata taggggcacg cctacgattt cagttgtgga taaccggaac 60
ttggagattc gcattcttca gtataaccgt atcgcggctg aagatccggc agatgagtgt 120
atcctgcgga acacgtatac gccgttaagc tatcttggca gcagcatgga tccccgtttg 180
ttctcgcaat atcaggatga tcgcggaaca ccgccgaata tacgaaccat ggcttccctg 240
agaggcgaag cgctgtgttc ggaaagtgtg gatgccggcc gcaaggcgga gctttttgat 300
atcgaggggc ggcccgtctg gcttatcgat gccaacggca cagagacgac tctcgaatat 360
gatgtcttag gcaggccaac agccgtattc gagcaacagg aaggtacgga ctccccccag 420
tgcagggagc ggtttattta tggtgagaag gaggcggatg cccaggccaa caatttgcgc 480
ggacaactgg ttcgccacta cgataccgcg ggccggatac agaccgacag catctccttg 540
gctggactgc cgttgcgcca aagccgtcaa ctgctgaaaa attgggatga acctggcgac 600
tggagtatgg atgaggaaag cgcctgggcc tcgttgctgg ctgccgaagc ttatgatacg 660
agctggcggt atgacgcgca ggacagggtg ctcgcccaaa ccgacgccaa agggaatctc 720
cagcaactga cttacaatga cgccggccag ccgcaggcgg tcagcctcaa gctgcaaggc 780
caagcggagc aacggatttg gaaccggatc gagtacaacg cggcgggtca agtggatctc 840
gccgaagccg ggaatggaat cgtaacggaa tatacttacg aggaaagcac gcagcggtta 900
atccgaaaaa aagattcccg cggactgtcc tccggggaaa gagaagtgct gcaggattat 960
cgttatgaat atgatccggt aggcaatatc ctttctattt acaatgaagc ggagccggtt 1020
cgttatttcc gcaatcaggc cgttgctccg aaaaggcaat atgcctacga tgccttgtat 1080
cagcttgtat ctagttcggg gcgggaatcc gacgcgcttc ggcagcagac gtcgcttcct 1140
cccttgatca cgcctatccc tctggacgat agccaatacg tcaattacgc tgaaaaatac 1200
agctatgatc aggcgggcaa tttaatcaag cttagccata acggggcaag tcaatataca 1260
acgaatgtgt atgtggacaa aagctcaaac cgggggattt ggcggcaagg ggaagacatc 1320
ccggatatcg cggcttcctt tgacagagca ggcaatcaac aagctttatt cccggggaga 1380
ccgttggaat gggatacacg caatcaatta agccgtgtcc atatggtcgt gcgcgaaggc 1440
ggagacaacg actgggaagg ctatctctat gacagctcgg gaatgcgtat cgtaaaacga 1500
tctacccgca aaacacagac aacgacgcaa acggatacga ccctctattt gccgggcctg 1560
gagctgcgaa tccgccagac cggggaccgg gtcacggaag cattgcaggt cattaccgtg 1620
gatgagggag cgggacaagt gagggtgctg cactgggagg atggaaccga gccgggcggc 1680
atcgccaatg atcagtaccg gtacagcctg aacgatcatc ttacctcctc tttattggaa 1740
gttgacgggc aaggtcagat cattagtaag gaagaatttt atccctatgg cggcacagcc 1800
ctgtggacag cccggtcaga ggtagaggca agctacaaga ccatccgcta ttcaggcaaa 1860
gagcgggatg ccacaggcct gtattattac ggacaccgct actatatgcc atggttgggt 1920
cgctggctga atccggaccc ggccggaatg gtagatggac taaacctgta ccgtatggtc 1980
aggaacaatc ctataggact gatggatccg aatgggaatg cgccaatcaa cgtggcggat 2040
tatagcttcg tgcatggtga tttagtttat ggtcttagta aggaaagagg aagatatcta 2100
aagctattta atccaaactt taatatggaa aaatcagact ctcctgctat ggttatagat 2160
caatataata ataatgttgc attgagtata actaaccaat ataaagtaga agaattgatg 2220
aaatttcaaa aagacccaca aaaagccgca cggaaaataa aggttccaga agggaatcgt 2280
ttatcgagga acgaaaatta tcctttgtgg cacgattata ttaacattgg agaagctaaa 2340
gctgcattta aggcctctca tattttccaa gaagtgaagg ggaattatgg gaaagattat 2400
tatcataaat tattattaga cagaatgata gaatcgccgt tgctgtggaa acgaggcagc 2460
aaactcgggc tagaaatcgc cgctaccaat cagagaacaa aaatacactt tgttcttgac 2520
aatttaaata tcgagcaggt ggttacgaaa gagggtagcg gcggtcagtc aatcacagct 2580
tcggagctcc gttatattta tcgaaatcgc gaaagattga acgggcgtgt cattttctat 2640
agaaataatg aaaggctaga tcaggctcca tggcaagaaa atccggactt atggagcaaa 2700
tatcaaccgg gtcttagaca aagcagcagt tcaagagtca aagaacgagg gattgggaac 2760
tttttccgcc ggttttcaat gaagagaaag tag 2793
<210>42
<211>930
<212>PRT
<213>Paenibacillus strain DAS1529
<400>42
Met Asn Thr Thr Ser Ile Tyr Arg Gly Thr Pro Thr Ile Ser Val Val
1 5 10 15
Asp Asn Arg Asn Leu Glu Ile Arg Ile Leu Gln Tyr Asn Arg Ile Ala
20 25 30
Ala Glu Asp Pro Ala Asp Glu Cys Ile Leu Arg Asn Thr Tyr Thr Pro
35 40 45
Leu Ser Tyr Leu Gly Ser Ser Met Asp Pro Arg Leu Phe Ser Gln Tyr
50 55 60
Gln Asp Asp Arg Gly Thr Pro Pro Asn Ile Arg Thr Met Ala Ser Leu
65 70 75 80
Arg Gly Glu Ala Leu Cys Ser Glu Ser Val Asp Ala Gly Arg Lys Ala
85 90 95
Glu Leu Phe Asp Ile Glu Gly Arg Pro Val Trp Leu Ile Asp Ala Asn
100 105 110
Gly Thr Glu Thr Thr Leu Glu Tyr Asp Val Leu Gly Arg Pro Thr Ala
115 120 125
Val Phe Glu Gln Gln Glu Gly Thr Asp Ser Pro Gln Cys Arg Glu Arg
130 135 140
Phe Ile Tyr Gly Glu Lys Glu Ala Asp Ala Gln Ala Asn Asn Leu Arg
145 150 155 160
Gly Gln Leu Val Arg His Tyr Asp Thr Ala Gly Arg Ile Gln Thr Asp
165 170 175
Ser Ile Ser Leu Ala Gly Leu Pro Leu Arg Gln Ser Arg Gln Leu Leu
180 185 190
Lys Asn Trp Asp Glu Pro Gly Asp Trp Ser Met Asp Glu Glu Ser Ala
195 200 205
Trp Ala Ser Leu Leu Ala Ala Glu Ala Tyr Asp Thr Ser Trp Arg Tyr
210 215 220
Asp Ala Gln Asp Arg Val Leu Ala Gln Thr Asp Ala Lys Gly Asn Leu
225 230 235 240
Gln Gln Leu Thr Tyr Asn Asp Ala Gly Gln Pro Gln Ala Val Ser Leu
245 250 255
Lys Leu Gln Gly Gln Ala Glu Gln Arg Ile Trp Asn Arg Ile Glu Tyr
260 265 270
Asn Ala Ala Gly Gln Val Asp Leu Ala Glu Ala Gly Asn Gly Ile Val
275 280 285
Thr Glu Tyr Thr Tyr Glu Glu Ser Thr Gln Arg Leu Ile Arg Lys Lys
290 295 300
Asp Ser Arg Gly Leu Ser Ser Gly Glu Arg Glu Val Leu Gln Asp Tyr
305 310 315 320
Arg Tyr Glu Tyr Asp Pro Val Gly Asn Ile Leu Ser Ile Tyr Asn Glu
325 330 335
Ala Glu Pro Val Arg Tyr Phe Arg Asn Gln Ala Val Ala Pro Lys Arg
340 345 350
Gln Tyr Ala Tyr Asp Ala Leu Tyr Gln Leu Val Ser Ser Ser Gly Arg
355 360 365
Glu Ser Asp Ala Leu Arg Gln Gln Thr Ser Leu Pro Pro Leu Ile Thr
370 375 380
Pro Ile Pro Leu Asp Asp Ser Gln Tyr Val Asn Tyr Ala Glu Lys Tyr
385 390 395 400
Ser Tyr Asp Gln Ala Gly Asn Leu Ile Lys Leu Ser His Asn Gly Ala
405 410 415
Ser Gln Tyr Thr Thr Asn Val Tyr Val Asp Lys Ser Ser Asn Arg Gly
420 425 430
Ile Trp Arg Gln Gly Glu Asp Ile Pro Asp Ile Ala Ala Ser Phe Asp
435 440 445
Arg Ala Gly Asn Gln Gln Ala Leu Phe Pro Gly Arg Pro Leu Glu Trp
450 455 460
Asp Thr Arg Asn Gln Leu Ser Arg Val His Met Val Val Arg Glu Gly
465 470 475 480
Gly Asp Asn Asp Trp Glu Gly Tyr Leu Tyr Asp Ser Ser Gly Met Arg
485 490 495
Ile Val Lys Arg Ser Thr Arg Lys Thr Gln Thr Thr Thr Gln Thr Asp
500 505 510
Thr Thr Leu Tyr Leu Pro Gly Leu Glu Leu Arg Ile Arg Gln Thr Gly
515 520 525
Asp Arg Val Thr Glu Ala Leu Gln Val Ile Thr Val Asp Glu Gly Ala
530 535 540
Gly Gln Val Arg Val Leu His Trp Glu Asp Gly Thr Glu Pro Gly Gly
545 550 555 560
Ile Ala Asn Asp Gln Tyr Arg Tyr Ser Leu Asn Asp His Leu Thr Ser
565 570 575
Ser Leu Leu Glu Val Asp Gly Gln Gly Gln Ile Ile Ser Lys Glu Glu
580 585 590
Phe Tyr Pro Tyr Gly Gly Thr Ala Leu Trp Thr Ala Arg Ser Glu Val
595 600 605
Glu Ala Ser Tyr Lys Thr Ile Arg Tyr Ser Gly Lys Glu Arg Asp Ala
610 615 620
Thr Gly Leu Tyr Tyr Tyr Gly His Arg Tyr Tyr Met Pro Trp Leu Gly
625 630 635 640
Arg Trp Leu Asn Pro Asp Pro Ala Gly Met Val Asp Gly Leu Asn Leu
645 650 655
Tyr Arg Met Val Arg Asn Asn Pro Ile Gly Leu Met Asp Pro Asn Gly
660 665 670
Asn Ala Pro Ile Asn Val Ala Asp Tyr Ser Phe Val His Gly Asp Leu
675 680 685
Val Tyr Gly Leu Ser Lys Glu Arg Gly Arg Tyr Leu Lys Leu Phe Asn
690 695 700
Pro Asn Phe Asn Met Glu Lys Ser Asp Ser Pro Ala Met Val Ile Asp
705 710 715 720
Gln Tyr Asn Asn Asn Val Ala Leu Ser Ile Thr Asn Gln Tyr Lys Val
725 730 735
Glu Glu Leu Met Lys Phe Gln Lys Asp Pro Gln Lys Ala Ala Arg Lys
740 745 750
Ile Lys Val Pro Glu Gly Asn Arg Leu Ser Arg Asn Glu Asn Tyr Pro
755 760 765
Leu Trp His Asp Tyr Ile Asn Ile Gly Glu Ala Lys Ala Ala Phe Lys
770 775 780
Ala Ser His Ile Phe Gln Glu Val Lys Gly Asn Tyr Gly Lys Asp Tyr
785 790 795 800
Tyr His Lys Leu Leu Leu Asp Arg Met Ile Glu Ser Pro Leu Leu Trp
805 810 815
Lys Arg Gly Ser Lys Leu Gly Leu Glu Ile Ala Ala Thr Asn Gln Arg
820 825 830
Thr Lys Ile His Phe Val Leu Asp Asn Leu Asn Ile Glu Gln Val Val
835 840 845
Thr Lys Glu Gly Ser Gly Gly Gln Ser Ile Thr Ala Ser Glu Leu Arg
850 855 860
Tyr Ile Tyr Arg Asn Arg Glu Arg Leu Asn Gly Arg Val Ile Phe Tyr
865 870 875 880
Arg Asn Asn Glu Arg Leu Asp Gln Ala Pro Trp Gln Glu Asn Pro Asp
885 890 895
Leu Trp Ser Lys Tyr Gln Pro Gly Leu Arg Gln Ser Ser Ser Ser Arg
900 905 910
Val Lys Glu Arg Gly Ile Gly Asn Phe Phe Arg Arg Phe Ser Met Lys
915 920 925
Arg Lys
930
<210>43
<211>953
<212>PRT
<213>Paenibacillus strain DAS1529
<400>43
Met Lys Met Ile Pro Trp Thr His His Tyr Leu Leu His Arg Leu Arg
1 5 10 15
Gly Glu Met Glu Val Lys Pro Met Asn Thr Thr Ser Ile Tyr Arg Gly
20 25 30
Thr Pro Thr Ile Ser Val Val Asp Asn Arg Asn Leu Glu Ile Arg Ile
35 40 45
Leu Gln Tyr Asn Arg Ile Ala Ala Glu Asp Pro Ala Asp Glu Cys Ile
50 55 60
Leu Arg Asn Thr Tyr Thr Pro Leu Ser Tyr Leu Gly Ser Ser Met Asp
65 70 75 80
Pro Arg Leu Phe Ser Gln Tyr Gln Asp Asp Arg Gly Thr Pro Pro Asn
85 90 95
Ile Arg Thr Met Ala Ser Leu Arg Gly Glu Ala Leu Cys Ser Glu Ser
100 105 110
Val Asp Ala Gly Arg Lys Ala Glu Leu Phe Asp Ile Glu Gly Arg Pro
115 120 125
Val Trp Leu Ile Asp Ala Asn Gly Thr Glu Thr Thr Leu Glu Tyr Asp
130 135 140
Val Leu Gly Arg Pro Thr Ala Val Phe Glu Gln Gln Glu Gly Thr Asp
145 150 155 160
Ser Pro Gln Cys Arg Glu Arg Phe Ile Tyr Gly Glu Lys Glu Ala Asp
165 170 175
Ala Gln Ala Asn Asn Leu Arg Gly Gln Leu Val Arg His Tyr Asp Thr
180 185 190
Ala Gly Arg Ile Gln Thr Asp Ser Ile Ser Leu Ala Gly Leu Pro Leu
195 200 205
Arg Gln Ser Arg Gln Leu Leu Lys Asn Trp Asp Glu Pro Gly Asp Trp
210 215 220
Ser Met Asp Glu Glu Ser Ala Trp Ala Ser Leu Leu Ala Ala Glu Ala
225 230 235 240
Tyr Asp Thr Ser Trp Arg Tyr Asp Ala Gln Asp Arg Val Leu Ala Gln
245 250 255
Thr Asp Ala Lys Gly Asn Leu Gln Gln Leu Thr Tyr Asn Asp Ala Gly
260 265 270
Gln Pro Gln Ala Val Ser Leu Lys Leu Gln Gly Gln Ala Glu Gln Arg
275 280 285
Ile Trp Asn Arg Ile Glu Tyr Asn Ala Ala Gly Gln Val Asp Leu Ala
290 295 300
Glu Ala Gly Asn Gly Ile Val Thr Glu Tyr Thr Tyr Glu Glu Ser Thr
305 310 315 320
Gln Arg Leu Ile Arg Lys Lys Asp Ser Arg Gly Leu Ser Ser Gly Glu
325 330 335
Arg Glu Val Leu Gln Asp Tyr Arg Tyr Glu Tyr Asp Pro Val Gly Asn
340 345 350
Ile Leu Ser Ile Tyr Asn Glu Ala Glu Pro Val Arg Tyr Phe Arg Asn
355 360 365
Gln Ala Val Ala Pro Lys Arg Gln Tyr Ala Tyr Asp Ala Leu Tyr Gln
370 375 380
Leu Val Ser Ser Ser Gly Arg Glu Ser Asp Ala Leu Arg Gln Gln Thr
385 390 395 400
Ser Leu Pro Pro Leu Ile Thr Pro Ile Pro Leu Asp Asp Ser Gln Tyr
405 410 415
Val Asn Tyr Ala Glu Lys Tyr Ser Tyr Asp Gln Ala Gly Asn Leu Ile
420 425 430
Lys Leu Ser His Asn Gly Ala Ser Gln Tyr Thr Thr Asn Val Tyr Val
435 440 445
Asp Lys Ser Ser Asn Arg Gly Ile Trp Arg Gln Gly Glu Asp Ile Pro
450 455 460
Asp Ile Ala Ala Ser Phe Asp Arg Ala Gly Asn Gln Gln Ala Leu Phe
465 470 475 480
Pro Gly Arg Pro Leu Glu Trp Asp Thr Arg Asn Gln Leu Ser Arg Val
485 490 495
His Met Val Val Arg Glu Gly Gly Asp Asn Asp Trp Glu Gly Tyr Leu
500 505 510
Tyr Asp Ser Ser Gly Met Arg Ile Val Lys Arg Ser Thr Arg Lys Thr
515 520 525
Gln Thr Thr Thr Gln Thr Asp Thr Thr Leu Tyr Leu Pro Gly Leu Glu
530 535 540
Leu Arg Ile Arg Gln Thr Gly Asp Arg Val Thr Glu Ala Leu Gln Val
545 550 555 560
Ile Thr Val Asp Glu Gly Ala Gly Gln Val Arg Val Leu His Trp Glu
565 570 575
Asp Gly Thr Glu Pro Gly Gly Ile Ala Asn Asp Gln Tyr Arg Tyr Ser
580 585 590
Leu Asn Asp His Leu Thr Ser Ser Leu Leu Glu Val Asp Gly Gln Gly
595 600 605
Gln Ile Ile Ser Lys Glu Glu Phe Tyr Pro Tyr Gly Gly Thr Ala Leu
610 615 620
Trp Thr Ala Arg Ser Glu Val Glu Ala Ser Tyr Lys Thr Ile Arg Tyr
625 630 635 640
Ser Gly Lys Glu Arg Asp Ala Thr Gly Leu Tyr Tyr Tyr Gly His Arg
645 650 655
Tyr Tyr Met Pro Trp Leu Gly Arg Trp Leu Asn Pro Asp Pro Ala Gly
660 665 670
Met Val Asp Gly Leu Asn Leu Tyr Arg Met Val Arg Asn Asn Pro Ile
675 680 685
Gly Leu Met Asp Pro Asn Gly Asn Ala Pro Ile Asn Val Ala Asp Tyr
690 695 700
Ser Phe Val His Gly Asp Leu Val Tyr Gly Leu Ser Lys Glu Arg Gly
705 710 715 720
Arg Tyr Leu Lys Leu Phe Asn Pro Asn Phe Asn Met Glu Lys Ser Asp
725 730 735
Ser Pro Ala Met Val Ile Asp Gln Tyr Asn Asn Asn Val Ala Leu Ser
740 745 750
Ile Thr Asn Gln Tyr Lys Val Glu Glu Leu Met Lys Phe Gln Lys Asp
755 760 765
Pro Gln Lys Ala Ala Arg Lys Ile Lys Val Pro Glu Gly Asn Arg Leu
770 775 780
Ser Arg Asn Glu Asn Tyr Pro Leu Trp His Asp Tyr Ile Asn Ile Gly
785 790 795 800
Glu Ala Lys Ala Ala Phe Lys Ala Ser His Ile Phe Gln Glu Val Lys
805 810 815
Gly Asn Tyr Gly Lys Asp Tyr Tyr His Lys Leu Leu Leu Asp Arg Met
820 825 830
Ile Glu Ser Pro Leu Leu Trp Lys Arg Gly Ser Lys Leu Gly Leu Glu
835 840 845
Ile Ala Ala Thr Asn Gln Arg Thr Lys Ile His Phe Val Leu Asp Asn
850 855 860
Leu Asn Ile Glu Gln Val Val Thr Lys Glu Gly Ser Gly Gly Gln Ser
865 870 875 880
Ile Thr Ala Ser Glu Leu Arg Tyr Ile Tyr Arg Asn Arg Glu Arg Leu
885 890 895
Asn Gly Arg Val Ile Phe Tyr Arg Asn Asn Glu Arg Leu Asp Gln Ala
900 905 910
Pro Trp Gln Glu Asn Pro Asp Leu Trp Ser Lys Tyr Gln Pro Gly Leu
915 920 925
Arg Gln Ser Ser Ser Ser Arg Val Lys Glu Arg Gly Ile Gly Asn Phe
930 935 940
Phe Arg Arg Phe Ser Met Lys Arg Lys
945 950
<210>44
<211>4425
<212>DNA
<213>Photorhabdus luminescens
<220>
<221>CDS
<222>(1)..(4422)
<400>44
atg caa aat tca caa gat ttt agt att acg gaa ctg tca ctg ccc aaa 48
Met Gln Asn Ser Gln Asp Phe Ser Ile Thr Glu Leu Ser Leu Pro Lys
1 5 10 15
ggg ggg ggc gct atc acg gga atg ggt gaa gca tta acc ccc act gga 96
Gly Gly Gly Ala Ile Thr Gly Met Gly Glu Ala Leu Thr Pro Thr Gly
20 25 30
ccg gat ggt atg gcc gcg cta tct cta cca ttg cct att tct gcc ggg 144
Pro Asp Gly Met Ala Ala Leu Ser Leu Pro Leu Pro Ile Ser Ala Gly
35 40 45
cgc ggt tat gct ccc gca ttc act ctg aat tac aac agc ggc gcc ggt 192
Arg Gly Tyr Ala Pro Ala Phe Thr Leu Asn Tyr Asn Ser Gly Ala Gly
50 55 60
aac agt cca ttt ggt ctg ggt tgg gat tgc aac gtt atg act atc cgc 240
Asn Ser Pro Phe Gly Leu Gly Trp Asp Cys Asn Val Met Thr Ile Arg
65 70 75 80
cgc cgc acc cat ttt ggc gtc ccc cat tat gac gaa acc gat acc ttt 288
Arg Arg Thr His Phe Gly Val Pro His Tyr Asp Glu Thr Asp Thr Phe
85 90 95
ttg ggg cca gaa ggc gaa gtg ctg gtg gta gcg gat caa cct cgc gac 336
Leu Gly Pro Glu Gly Glu Val Leu Val Val Ala Asp Gln Pro Arg Asp
100 105 110
gaa tcc aca tta cag ggt atc aat tta ggc gcc acc ttt acc gtt acc 384
Glu Ser Thr Leu Gln Gly Ile Asn Leu Gly Ala Thr Phe Thr Val Thr
115 120 125
ggc tac cgt tcc cgt ctg gaa agc cat ttc agc cga ttg gaa tat tgg 432
Gly Tyr Arg Ser Arg Leu Glu Ser His Phe Ser Arg Leu Glu Tyr Trp
130 135 140
caa ccc aaa aca aca ggt aaa aca gat ttt tgg ttg ata tat agc cca 480
Gln Pro Lys Thr Thr Gly Lys Thr Asp Phe Trp Leu Ile Tyr Ser Pro
145 150 155 160
gat ggg cag gtg cat cta ctg ggt aaa tca ccg caa gcg cgg atc agc 528
Asp Gly Gln Val His Leu Leu Gly Lys Ser Pro Gln Ala Arg Ile Ser
165 170 175
aac cca tcc caa acg aca caa aca gca caa tgg ctg ctg gaa gcc tct 576
Asn Pro Ser Gln Thr Thr Gln Thr Ala Gln Trp Leu Leu Glu Ala Ser
180 185 190
gta tca tca cgt ggc gaa caa att tat tat caa tat cgc gcc gaa gat 624
Val Ser Ser Arg Gly Glu Gln Ile Tyr Tyr Gln Tyr Arg Ala Glu Asp
195 200 205
gac aca ggt tgc gaa gca gat gaa att acg cac cat tta cag gct aca 672
Asp Thr Gly Cys Glu Ala Asp Glu Ile Thr His His Leu Gln Ala Thr
210 215 220
gcg caa cgt tat tta cac atc gtg tat tac ggc aac cgt aca gcc agc 720
Ala Gln Arg Tyr Leu His Ile Val Tyr Tyr Gly Asn Arg Thr Ala Ser
225 230 235 240
gaa aca tta ccc ggt ctg gat ggc agc gcc cca tca caa gca gac tgg 768
Glu Thr Leu Pro Gly Leu Asp Gly Ser Ala Pro Ser Gln Ala Asp Trp
245 250 255
ttg ttc tat ctg gta ttt gat tac ggc gaa cgc agt aac aac ctg aaa 816
Leu Phe Tyr Leu Val Phe Asp Tyr Gly Glu Arg Ser Asn Asn Leu Lys
260 265 270
acg cca cca gca ttt tcg act aca ggt agc tgg ctt tgc cgt cag gac 864
Thr Pro Pro Ala Phe Ser Thr Thr Gly Ser Trp Leu Cys Arg Gln Asp
275 280 285
cgt ttt tcc cgt tat gaa tat ggc ttt gag att cgt acc cgc cgc tta 912
Arg Phe Ser Arg Tyr Glu Tyr Gly Phe Glu Ile Arg Thr Arg Arg Leu
290 295 300
tgc cgt cag gta ttg atg tac cat cac ctg caa gca ctg gat agt aag 960
Cys Arg Gln Val Leu Met Tyr His His Leu Gln Ala Leu Asp Ser Lys
305 310 315 320
ata aca gaa cac aac gga cca acg ctg gtt tca cgc ctg ata ctc aat 1008
Ile Thr Glu His Asn Gly Pro Thr Leu Val Ser Arg Leu Ile Leu Asn
325 330 335
tac gac gaa agc gcg ata gcc agc acg cta gta ttc gtt cgc cga gtg 1056
Tyr Asp Glu Ser Ala Ile Ala Ser Thr Leu Val Phe Val Arg Arg Val
340 345 350
gga cac gag caa gat ggt aat gtc gtc acc ctg ccg cca tta gaa ttg 1104
Gly His Glu Gln Asp Gly Asn Val Val Thr Leu Pro Pro Leu Glu Leu
355 360 365
gca tat cag gat ttt tca ccg cga cat cac gct cac tgg caa cca atg 1152
Ala Tyr Gln Asp Phe Ser Pro Arg His His Ala His Trp Gln Pro Met
370 375 380
gat gta ctg gca aac ttc aat gcc att cag cgc tgg cag cta gtc gat 1200
Asp Val Leu Ala Asn Phe Asn Ala Ile Gln Arg Trp Gln Leu Val Asp
385 390 395 400
cta aaa ggc gaa gga tta ccc ggc ctg tta tat cag gat aaa ggc gct 1248
Leu Lys Gly Glu Gly Leu Pro Gly Leu Leu Tyr Gln Asp Lys Gly Ala
405 410 415
tgg tgg tac cgc tcc gca cag cgt ctg ggc gaa att ggc tca gat gcc 1296
Trp Trp Tyr Arg Ser Ala Gln Arg Leu Gly Glu Ile Gly Ser Asp Ala
420 425 430
gtc act tgg gaa aag atg caa cct tta tcg gtt att cct tct ttg caa 1344
Val Thr Trp Glu Lys Met Gln Pro Leu Ser Val Ile Pro Ser Leu Gln
435 440 445
agt aat gcc tcg ttg gtg gat atc aat gga gac ggc caa ctt gac tgg 1392
Ser Asn Ala Ser Leu Val Asp Ile Asn Gly Asp Gly Gln Leu Asp Trp
450 455 460
gtt atc acc gga ccg gga tta cgg gga tat cat agt caa cgc ccg gat 1440
Val Ile Thr Gly Pro Gly Leu Arg Gly Tyr His Ser Gln Arg Pro Asp
465 470 475 480
ggc agt tgg aca cgt ttt acc cca ctc aac gct ctg ccg gtg gaa tac 1488
Gly Ser Trp Thr Arg Phe Thr Pro Leu Asn Ala Leu Pro Val Glu Tyr
485 490 495
acc cat cca cgc gcg caa ctc gca gat tta atg gga gcc ggg cta tcc 1536
Thr His Pro Arg Ala Gln Leu Ala Asp Leu Met Gly Ala Gly Leu Ser
500 505 510
gat ttg gtg ctg atc ggc cct aag agc gtg cgt tta tat gcc aat acc 1584
Asp Leu Val Leu Ile Gly Pro Lys Ser Val Arg Leu Tyr Ala Asn Thr
515 520 525
cgc gac ggc ttt gcc aaa gga aaa gat gtg gtg caa tcc ggt gat atc 1632
Arg Asp Gly Phe Ala Lys Gly Lys Asp Val Val Gln Ser Gly Asp Ile
530 535 540
aca ctg ccg gtg ccg ggc gcc gat cca cgt aag ttg gtg gcg ttt agt 1680
Thr Leu Pro Val Pro Gly Ala Asp Pro Arg Lys Leu Val Ala Phe Ser
545 550 555 560
gat gta ttg ggt tca ggt caa gcc cat ctg gtt gaa gta agc gcg act 1728
Asp Val Leu Gly Ser Gly Gln Ala His Leu Val Glu Val Ser Ala Thr
565 570 575
aaa gtc acc tgc tgg cct aat ctg ggg cgc gga cgt ttt ggt caa ccc 1776
Lys Val Thr Cys Trp Pro Asn Leu Gly Arg Gly Arg Phe Gly Gln Pro
580 585 590
att acc tta ccg gga ttc agc cag cca gca acc gag ttt aac ccg gct 1824
Ile Thr Leu Pro Gly Phe Ser Gln Pro Ala Thr Glu Phe Asn Pro Ala
595 600 605
caa gtt tat ctg gcc gat ctg gat ggc agc ggt cca acg gat ctg att 1872
Gln Val Tyr Leu Ala Asp Leu Asp Gly Ser Gly Pro Thr Asp Leu Ile
610 615 620
tat gtt cat aca aac cgt ctg gat atc ttc ctg aac aaa agt ggc aat 1920
Tyr Val His Thr Asn Arg Leu Asp Ile Phe Leu Asn Lys Ser Gly Asn
625 630 635 640
ggc ttt gct gaa cca gtg aca tta cgc ttc ccg gaa ggt ctg cgt ttt 1968
Gly Phe Ala Glu Pro Val Thr Leu Arg Phe Pro Glu Gly Leu Arg Phe
645 650 655
gat cat acc tgt cag tta caa atg gcc gat gta caa gga tta ggc gtc 2016
Asp His Thr Cys Gln Leu Gln Met Ala Asp Val Gln Gly Leu Gly Val
660 665 670
gcc agc ctg ata ctg agc gtg ccg cat atg tct ccc cat cac tgg cgc 2064
Ala Ser Leu Ile Leu Ser Val Pro His Met Ser Pro His His Trp Arg
675 680 685
tgc gat ctg acc aac atg aag ccg tgg tta ctc aat gaa atg aac aac 2112
Cys Asp Leu Thr Asn Met Lys Pro Trp Leu Leu Asn Glu Met Asn Asn
690 695 700
aat atg ggg gtc cat cac acc ttg cgt tac cgc agt tcc tcc caa ttc 2160
Asn Met Gly Val His His Thr Leu Arg Tyr Arg Ser Ser Ser Gln Phe
705 710 715 720
tgg ctg gat gaa aaa gcc gcg gcg ctg act acc gga caa aca ccg gtt 2208
Trp Leu Asp Glu Lys Ala Ala Ala Leu Thr Thr Gly Gln Thr Pro Val
725 730 735
tgc tat ctc ccc ttc ccg atc cac acc cta tgg caa acg gaa aca gaa 2256
Cys Tyr Leu Pro Phe Pro Ile His Thr Leu Trp Gln Thr Glu Thr Glu
740 745 750
gat gaa atc agc ggc aac aaa tta gtc aca aca ctt cgt tat gct cgt 2304
Asp Glu Ile Ser Gly Asn Lys Leu Val Thr Thr Leu Arg Tyr Ala Arg
755 760 765
ggc gca tgg gac gga cgc gag cgg gaa ttt cgc gga ttt ggt tat gta 2352
Gly Ala Trp Asp Gly Arg Glu Arg Glu Phe Arg Gly Phe Gly Tyr Val
770 775 780
gag cag aca gac agc cat caa ctg gct caa ggc aac gcg cca gaa cgt 2400
Glu Gln Thr Asp Ser His Gln Leu Ala Gln Gly Asn Ala Pro Glu Arg
785 790 795 800
acg cca ccg gcg ctg acc aaa aac tgg tat gcc acc gga ctg ccg gtg 2448
Thr Pro Pro Ala Leu Thr Lys Asn Trp Tyr Ala Thr Gly Leu Pro Val
805 810 815
ata gat aac gca tta tca acc gag tat tgg cgt gat gat cag gct ttt 2496
Ile Asp Asn Ala Leu Ser Thr Glu Tyr Trp Arg Asp Asp Gln Ala Phe
820 825 830
gcc ggt ttc tca ccg cgc ttt acg act tgg caa gat aac aaa gat gtc 2544
Ala Gly Phe Ser Pro Arg Phe Thr Thr Trp Gln Asp Asn Lys Asp Val
835 840 845
ccg tta aca ccg gaa gat gat aac agt cgt tac tgg ttc aac cgc gcg 2592
Pro Leu Thr Pro Glu Asp Asp Asn Ser Arg Tyr Trp Phe Asn Arg Ala
850 855 860
ttg aaa ggt caa ctg cta cgt agt gaa ctg tac gga ttg gac gat agt 2640
Leu Lys Gly Gln Leu Leu Arg Ser Glu Leu Tyr Gly Leu Asp Asp Ser
865 870 875 880
aca aat aaa cac gtt ccc tat act gtc act gaa ttt cgt tca cag gta 2688
Thr Asn Lys His Val Pro Tyr Thr Val Thr Glu Phe Arg Ser Gln Val
885 890 895
cgt cga tta cag cat acc gac agc cga tac cct gta ctt tgg tca tct 2736
Arg Arg Leu Gln His Thr Asp Ser Arg Tyr Pro Val Leu Trp Ser Ser
900 905 910
gta gtt gaa agc cgc aac tat cac tac gaa cgt atc gcc agc gac ccg 2784
Val Val Glu Ser Arg Asn Tyr His Tyr Glu Arg Ile Ala Ser Asp Pro
915 920 925
caa tgc agt caa aat att acg cta tcc agt gat cga ttt ggt cag ccg 2832
Gln Cys Ser Gln Asn Ile Thr Leu Ser Ser Asp Arg Phe Gly Gln Pro
930 935 940
cta aaa cag ctt tcg gta cag tac ccg cgc cgc cag cag cca gca atc 2880
Leu Lys Gln Leu Ser Val Gln Tyr Pro Arg Arg Gln Gln Pro Ala Ile
945 950 955 960
aat ctg tat cct gat aca ttg cct gat aag ttg tta gcc aac agc tat 2928
Asn Leu Tyr Pro Asp Thr Leu Pro Asp Lys Leu Leu Ala Asn Ser Tyr
965 970 975
gat gac caa caa cgc caa tta cgg ctc acc tat caa caa tcc agt tgg 2976
Asp Asp Gln Gln Arg Gln Leu Arg Leu Thr Tyr Gln Gln Ser Ser Trp
980 985 990
cat cac ctg acc aac aat acc gtt cga gta ttg gga tta ccg gat agt 3024
His His Leu Thr Asn Asn Thr Val Arg Val Leu Gly Leu Pro Asp Ser
995 1000 1005
acc cgc agt gat atc ttt act tat ggc gct gaa aat gtg cct gct 3069
Thr Arg Ser Asp Ile Phe Thr Tyr Gly Ala Glu Asn Val Pro Ala
1010 1015 1020
ggt ggt tta aat ctg gaa ctt ctg agt gat aaa aat agc ctg atc 3114
Gly Gly Leu Asn Leu Glu Leu Leu Ser Asp Lys Asn Ser Leu Ile
1025 1030 1035
gcg gac gat aaa cca cgt gaa tac ctc ggt cag caa aaa acc gct 3159
Ala Asp Asp Lys Pro Arg Glu Tyr Leu Gly Gln Gln Lys Thr Ala
1040 1045 1050
tat acc gat gga caa aat aca acg ccg ttg caa aca cca aca cgg 3204
Tyr Thr Asp Gly Gln Asn Thr Thr Pro Leu Gln Thr Pro Thr Arg
1055 1060 1065
caa gcc ctg att gcc ttt acc gaa aca acg gta ttc aac cag tcc 3249
Gln Ala Leu Ile Ala Phe Thr Glu Thr Thr Val Phe Asn Gln Ser
1070 1075 1080
aca tta tca gcg ttt aac gga agc atc ccg tcc gat aaa tta tca 3294
Thr Leu Ser Ala Phe Asn Gly Ser Ile Pro Ser Asp Lys Leu Ser
1085 1090 1095
acg acg ctg gag caa gct gga tat cag caa aca aat tat cta ttc 3339
Thr Thr Leu Glu Gln Ala Gly Tyr Gln Gln Thr Asn Tyr Leu Phe
1100 1105 1110
cct cgc act gga gaa gat aaa gtt tgg gta gcc cat cac ggc tat 3384
Pro Arg Thr Gly Glu Asp Lys Val Trp Val Ala His His Gly Tyr
1115 1120 1125
acc gat tat ggt aca gcg gca cag ttc tgg cgc ccg caa aaa cag 3429
Thr Asp Tyr Gly Thr Ala Ala Gln Phe Trp Arg Pro Gln Lys Gln
1130 1135 1140
agc aac acc caa ctc acc ggt aaa atc acc ctc atc tgg gat gca 3474
Ser Asn Thr Gln Leu Thr Gly Lys Ile Thr Leu Ile Trp Asp Ala
1145 1150 1155
aac tat tgc gtt gtg gta caa acc cgg gat gct gct gga ctg aca 3519
Asn Tyr Cys Val Val Val Gln Thr Arg Asp Ala Ala Gly Leu Thr
1160 1165 1170
acc tca gcc aaa tat gac tgg cgt ttt ctg acc ccg gtg caa ctc 3564
Thr Ser Ala Lys Tyr Asp Trp Arg Phe Leu Thr Pro Val Gln Leu
1175 1180 1185
acc gat atc aat gac aat cag cac ctt atc aca ctg gat gca ttg 3609
Thr Asp Ile Asn Asp Asn Gln His Leu Ile Thr Leu Asp Ala Leu
1190 1195 1200
ggc cga cca atc aca ttg cgc ttt tgg gga act gaa aac ggc aag 3654
Gly Arg Pro Ile Thr Leu Arg Phe Trp Gly Thr Glu Asn Gly Lys
1205 1210 1215
atg aca ggt tat tcc tca ccg gaa aaa gca tca ttt tct cca cca 3699
Met Thr Gly Tyr Ser Ser Pro Glu Lys Ala Ser Phe Ser Pro Pro
1220 1225 1230
tcc gat gtt aat gcc gct att gag tta aaa aaa ccg ctc cct gta 3744
Ser Asp Val Asn Ala Ala Ile Glu Leu Lys Lys Pro Leu Pro Val
1235 1240 1245
gca cag tgt cag gtc tac gca cca gaa agc tgg atg cca gta tta 3789
Ala Gln Cys Gln Val Tyr Ala Pro Glu Ser Trp Met Pro Val Leu
1250 1255 1260
agt cag aaa acc ttc aat cga ctg gca gaa caa gat tgg caa aag 3834
Ser Gln Lys Thr Phe Asn Arg Leu Ala Glu Gln Asp Trp Gln Lys
1265 1270 1275
tta tat aac gcc cga atc atc acc gaa gat gga cgt atc tgc aca 3879
Leu Tyr Asn Ala Arg Ile Ile Thr Glu Asp Gly Arg Ile Cys Thr
1280 1285 1290
ctg gct tat cgc cgc tgg gta caa agc caa aag gca atc cct caa 3924
Leu Ala Tyr Arg Arg Trp Val Gln Ser Gln Lys Ala Ile Pro Gln
1295 1300 1305
ctc att agc ctg tta aac aac gga ccc cgt tta cct cct cac agc 3969
Leu Ile Ser Leu Leu Asn Asn Gly Pro Arg Leu Pro Pro His Ser
1310 1315 1320
ctg aca ttg acg acg gat cgt tat gat cac gat cct gag caa cag 4014
Leu Thr Leu Thr Thr Asp Arg Tyr Asp His Asp Pro Glu Gln Gln
1325 1330 1335
atc cgt caa cag gtg gta ttc agt gat ggc ttt ggc cgc ttg ctg 4059
Ile Arg Gln Gln Val Val Phe Ser Asp Gly Phe Gly Arg Leu Leu
1340 1345 1350
caa gcc gct gcc cga cat gag gca ggc atg gcc cgg caa cgc aat 4104
Gln Ala Ala Ala Arg His Glu Ala Gly Met Ala Arg Gln Arg Asn
1355 1360 1365
gaa gac ggc tct ttg att ata aat gtc cag cat act gag aac cgt 4149
Glu Asp Gly Ser Leu Ile Ile Asn Val Gln His Thr Glu Asn Arg
1370 1375 1380
tgg gca gtg act gga cga acg gaa tat gac aat aag ggg caa ccg 4194
Trp Ala Val Thr Gly Arg Thr Glu Tyr Asp Asn Lys Gly Gln Pro
1385 1390 1395
ata cgt acc tat cag ccc tat ttc ctc aat gac tgg cga tac gtc 4239
Ile Arg Thr Tyr Gln Pro Tyr Phe Leu Asn Asp Trp Arg Tyr Val
1400 1405 1410
agc aat gat agt gcc cgg cag gaa aaa gaa gct tat gca gat acc 4284
Ser Asn Asp Ser Ala Arg Gln Glu Lys Glu Ala Tyr Ala Asp Thr
1415 1420 1425
cat gtc tat gat ccc ata ggt cga gaa atc aag gtt atc acc gca 4329
His Val Tyr Asp Pro Ile Gly Arg Glu Ile Lys Val Ile Thr Ala
1430 1435 1440
aaa ggt tgg ttc cgt cga acc ttg ttc act ccc tgg ttt act gtc 4374
Lys Gly Trp Phe Arg Arg Thr Leu Phe Thr Pro Trp Phe Thr Val
1445 1450 1455
aat gaa gat gaa aat gac aca gcc gct gag gtg aag aag gta aag 4419
Asn Glu Asp Glu Asn Asp Thr Ala Ala Glu Val Lys Lys Val Lys
1460 1465 1470
atg taa 4425
Met
<210>45
<211>1474
<212>PRT
<213>Photorhabdus luminescens
<400>45
Met Gln Asn Ser Gln Asp Phe Ser Ile Thr Glu Leu Ser Leu Pro Lys
1 5 10 15
Gly Gly Gly Ala Ile Thr Gly Met Gly Glu Ala Leu Thr Pro Thr Gly
20 25 30
Pro Asp Gly Met Ala Ala Leu Ser Leu Pro Leu Pro Ile Ser Ala Gly
35 40 45
Arg Gly Tyr Ala Pro Ala Phe Thr Leu Asn Tyr Asn Ser Gly Ala Gly
50 55 60
Asn Ser Pro Phe Gly Leu Gly Trp Asp Cys Asn Val Met Thr Ile Arg
65 70 75 80
Arg Arg Thr His Phe Gly Val Pro His Tyr Asp Glu Thr Asp Thr Phe
85 90 95
Leu Gly Pro Glu Gly Glu Val Leu Val Val Ala Asp Gln Pro Arg Asp
100 105 110
Glu Ser Thr Leu Gln Gly Ile Asn Leu Gly Ala Thr Phe Thr Val Thr
115 120 125
Gly Tyr Arg Ser Arg Leu Glu Ser His Phe Ser Arg Leu Glu Tyr Trp
130 135 140
Gln Pro Lys Thr Thr Gly Lys Thr Asp Phe Trp Leu Ile Tyr Ser Pro
145 150 155 160
Asp Gly Gln Val His Leu Leu Gly Lys Ser Pro Gln Ala Arg Ile Ser
165 170 175
Asn Pro Ser Gln Thr Thr Gln Thr Ala Gln Trp Leu Leu Glu Ala Ser
180 185 190
Val Ser Ser Arg Gly Glu Gln Ile Tyr Tyr Gln Tyr Arg Ala Glu Asp
195 200 205
Asp Thr Gly Cys Glu Ala Asp Glu Ile Thr His His Leu Gln Ala Thr
210 215 220
Ala Gln Arg Tyr Leu His Ile Val Tyr Tyr Gly Asn Arg Thr Ala Ser
225 230 235 240
Glu Thr Leu Pro Gly Leu Asp Gly Ser Ala Pro Ser Gln Ala Asp Trp
245 250 255
Leu Phe Tyr Leu Val Phe Asp Tyr Gly Glu Arg Ser Asn Asn Leu Lys
260 265 270
Thr Pro Pro Ala Phe Ser Thr Thr Gly Ser Trp Leu Cys Arg Gln Asp
275 280 285
Arg Phe Ser Arg Tyr Glu Tyr Gly Phe Glu Ile Arg Thr Arg Arg Leu
290 295 300
Cys Arg Gln Val Leu Met Tyr His His Leu Gln Ala Leu Asp Ser Lys
305 310 315 320
Ile Thr Glu His Asn Gly Pro Thr Leu Val Ser Arg Leu Ile Leu Asn
325 330 335
Tyr Asp Glu Ser Ala Ile Ala Ser Thr Leu Val Phe Val Arg Arg Val
340 345 350
Gly His Glu Gln Asp Gly Asn Val Val Thr Leu Pro Pro Leu Glu Leu
355 360 365
Ala Tyr Gln Asp Phe Ser Pro Arg His His Ala His Trp Gln Pro Met
370 375 380
Asp Val Leu Ala Asn Phe Asn Ala Ile Gln Arg Trp Gln Leu Val Asp
385 390 395 400
Leu Lys Gly Glu Gly Leu Pro Gly Leu Leu Tyr Gln Asp Lys Gly Ala
405 410 415
Trp Trp Tyr Arg Ser Ala Gln Arg Leu Gly Glu Ile Gly Ser Asp Ala
420 425 430
Val Thr Trp Glu Lys Met Gln Pro Leu Ser Val Ile Pro Ser Leu Gln
435 440 445
Ser Asn Ala Ser Leu Val Asp Ile Asn Gly Asp Gly Gln Leu Asp Trp
450 455 460
Val Ile Thr Gly Pro Gly Leu Arg Gly Tyr His Ser Gln Arg Pro Asp
465 470 475 480
Gly Ser Trp Thr Arg Phe Thr Pro Leu Asn Ala Leu Pro Val Glu Tyr
485 490 495
Thr His Pro Arg Ala Gln Leu Ala Asp Leu Met Gly Ala Gly Leu Ser
500 505 510
Asp Leu Val Leu Ile Gly Pro Lys Ser Val Arg Leu Tyr Ala Asn Thr
515 520 525
Arg Asp Gly Phe Ala Lys Gly Lys Asp Val Val Gln Ser Gly Asp Ile
530 535 540
Thr Leu Pro Val Pro Gly Ala Asp Pro Arg Lys Leu Val Ala Phe Ser
545 550 555 560
Asp Val Leu Gly Ser Gly Gln Ala His Leu Val Glu Val Ser Ala Thr
565 570 575
Lys Val Thr Cys Trp Pro Asn Leu Gly Arg Gly Arg Phe Gly Gln Pro
580 585 590
Ile Thr Leu Pro Gly Phe Ser Gln Pro Ala Thr Glu Phe Asn Pro Ala
595 600 605
Gln Val Tyr Leu Ala Asp Leu Asp Gly Ser Gly Pro Thr Asp Leu Ile
610 615 620
Tyr Val His Thr Asn Arg Leu Asp Ile Phe Leu Asn Lys Ser Gly Asn
625 630 635 640
Gly Phe Ala Glu Pro Val Thr Leu Arg Phe Pro Glu Gly Leu Arg Phe
645 650 655
Asp His Thr Cys Gln Leu Gln Met Ala Asp Val Gln Gly Leu Gly Val
660 665 670
Ala Ser Leu Ile Leu Ser Val Pro His Met Ser Pro His His Trp Arg
675 680 685
Cys Asp Leu Thr Asn Met Lys Pro Trp Leu Leu Asn Glu Met Asn Asn
690 695 700
Asn Met Gly Val His His Thr Leu Arg Tyr Arg Ser Ser Ser Gln Phe
705 710 715 720
Trp Leu Asp Glu Lys Ala Ala Ala Leu Thr Thr Gly Gln Thr Pro Val
725 730 735
Cys Tyr Leu Pro Phe Pro Ile His Thr Leu Trp Gln Thr Glu Thr Glu
740 745 750
Asp Glu Ile Ser Gly Asn Lys Leu Val Thr Thr Leu Arg Tyr Ala Arg
755 760 765
Gly Ala Trp Asp Gly Arg Glu Arg Glu Phe Arg Gly Phe Gly Tyr Val
770 775 780
Glu Gln Thr Asp Ser His Gln Leu Ala Gln Gly Asn Ala Pro Glu Arg
785 790 795 800
Thr Pro Pro Ala Leu Thr Lys Asn Trp Tyr Ala Thr Gly Leu Pro Val
805 810 815
Ile Asp Asn Ala Leu Ser Thr Glu Tyr Trp Arg Asp Asp Gln Ala Phe
820 825 830
Ala Gly Phe Ser Pro Arg Phe Thr Thr Trp Gln Asp Asn Lys Asp Val
835 840 845
Pro Leu Thr Pro Glu Asp Asp Asn Ser Arg Tyr Trp Phe Asn Arg Ala
850 855 860
Leu Lys Gly Gln Leu Leu Arg Ser Glu Leu Tyr Gly Leu Asp Asp Ser
865 870 875 880
Thr Asn Lys His Val Pro Tyr Thr Val Thr Glu Phe Arg Ser Gln Val
885 890 895
Arg Arg Leu Gln His Thr Asp Ser Arg Tyr Pro Val Leu Trp Ser Ser
900 905 910
Val Val Glu Ser Arg Asn Tyr His Tyr Glu Arg Ile Ala Ser Asp Pro
915 920 925
Gln Cys Ser Gln Asn Ile Thr Leu Ser Ser Asp Arg Phe Gly Gln Pro
930 935 940
Leu Lys Gln Leu Ser Val Gln Tyr Pro Arg Arg Gln Gln Pro Ala Ile
945 950 955 960
Asn Leu Tyr Pro Asp Thr Leu Pro Asp Lys Leu Leu Ala Asn Ser Tyr
965 970 975
Asp Asp Gln Gln Arg Gln Leu Arg Leu Thr Tyr Gln Gln Ser Ser Trp
980 985 990
His His Leu Thr Asn Asn Thr Val Arg Val Leu Gly Leu Pro Asp Ser
995 1000 1005
Thr Arg Ser Asp Ile Phe Thr Tyr Gly Ala Glu Asn Val Pro Ala
1010 1015 1020
Gly Gly Leu Asn Leu Glu Leu Leu Ser Asp Lys Asn Ser Leu Ile
1025 1030 1035
Ala Asp Asp Lys Pro Arg Glu Tyr Leu Gly Gln Gln Lys Thr Ala
1040 1045 1050
Tyr Thr Asp Gly Gln Asn Thr Thr Pro Leu Gln Thr Pro Thr Arg
1055 1060 1065
Gln Ala Leu Ile Ala Phe Thr Glu Thr Thr Val Phe Asn Gln Ser
1070 1075 1080
Thr Leu Ser Ala Phe Asn Gly Ser Ile Pro Ser Asp Lys Leu Ser
1085 1090 1095
Thr Thr Leu Glu Gln Ala Gly Tyr Gln Gln Thr Asn Tyr Leu Phe
1100 1105 1110
Pro Arg Thr Gly Glu Asp Lys Val Trp Val Ala His His Gly Tyr
1115 1120 1125
Thr Asp Tyr Gly Thr Ala Ala Gln Phe Trp Arg Pro Gln Lys Gln
1130 1135 1140
Ser Asn Thr Gln Leu Thr Gly Lys Ile Thr Leu Ile Trp Asp Ala
1145 1150 1155
Asn Tyr Cys Val Val Val Gln Thr Arg Asp Ala Ala Gly Leu Thr
1160 1165 1170
Thr Ser Ala Lys Tyr Asp Trp Arg Phe Leu Thr Pro Val Gln Leu
1175 1180 1185
Thr Asp Ile Asn Asp Asn Gln His Leu Ile Thr Leu Asp Ala Leu
1190 1195 1200
Gly Arg Pro Ile Thr Leu Arg Phe Trp Gly Thr Glu Asn Gly Lys
1205 1210 1215
Met Thr Gly Tyr Ser Ser Pro Glu Lys Ala Ser Phe Ser Pro Pro
1220 1225 1230
Ser Asp Val Asn Ala Ala Ile Glu Leu Lys Lys Pro Leu Pro Val
1235 1240 1245
Ala Gln Cys Gln Val Tyr Ala Pro Glu Ser Trp Met Pro Val Leu
1250 1255 1260
Ser Gln Lys Thr Phe Asn Arg Leu Ala Glu Gln Asp Trp Gln Lys
1265 1270 1275
Leu Tyr Asn Ala Arg Ile Ile Thr Glu Asp Gly Arg Ile Cys Thr
1280 1285 1290
Leu Ala Tyr Arg Arg Trp Val Gln Ser Gln Lys Ala Ile Pro Gln
1295 1300 1305
Leu Ile Ser Leu Leu Asn Asn Gly Pro Arg Leu Pro Pro His Ser
1310 1315 1320
Leu Thr Leu Thr Thr Asp Arg Tyr Asp His Asp Pro Glu Gln Gln
1325 1330 1335
Ile Arg Gln Gln Val Val Phe Ser Asp Gly Phe Gly Arg Leu Leu
1340 1345 1350
Gln Ala Ala Ala Arg His Glu Ala Gly Met Ala Arg Gln Arg Asn
1355 1360 1365
Glu Asp Gly Ser Leu Ile Ile Asn Val Gln His Thr Glu Asn Arg
1370 1375 1380
Trp Ala Val Thr Gly Arg Thr Glu Tyr Asp Asn Lys Gly Gln Pro
1385 1390 1395
Ile Arg Thr Tyr Gln Pro Tyr Phe Leu Asn Asp Trp Arg Tyr Val
1400 1405 1410
Ser Asn Asp Ser Ala Arg Gln Glu Lys Glu Ala Tyr Ala Asp Thr
1415 1420 1425
His Val Tyr Asp Pro Ile Gly Arg Glu Ile Lys Val Ile Thr Ala
1430 1435 1440
Lys Gly Trp Phe Arg Arg Thr Leu Phe Thr Pro Trp Phe Thr Val
1445 1450 1455
Asn Glu Asp Glu Asn Asp Thr Ala Ala Glu Val Lys Lys Val Lys
1460 1465 1470
Met
<210>46
<211>2883
<212>DNA
<213>Photorhabdus luminescens
<220>
<221>CDS
<222>(1)..(2880)
<400>46
atg aaa aac att gat ccc aaa ctt tat caa aaa acc cct act gtc agc 48
Met Lys Asn Ile Asp Pro Lys Leu Tyr Gln Lys Thr Pro Thr Val Ser
1 5 10 15
gtt tac gat aac cgt ggt ctg ata atc cgt aac atc gat ttt cat cgt 96
Val Tyr Asp Asn Arg Gly Leu Ile Ile Arg Asn Ile Asp Phe His Arg
20 25 30
act acc gca aat ggt gat ccc gat acc cgt att acc cgc cat caa tac 144
Thr Thr Ala Asn Gly Asp Pro Asp Thr Arg Ile Thr Arg His Gln Tyr
35 40 45
gat att cac gga cac cta aat caa agc atc gat ccg cgc cta tat gaa 192
Asp Ile His Gly His Leu Asn Gln Ser Ile Asp Pro Arg Leu Tyr Glu
50 55 60
gcc aag caa acc aac aat acg atc aaa ccc aat ttt ctt tgg cag tat 240
Ala Lys Gln Thr Asn Asn Thr Ile Lys Pro Asn Phe Leu Trp Gln Tyr
65 70 75 80
gat ttg acc ggt aat ccc cta tgt aca gag agc att gat gca ggt cgc 288
Asp Leu Thr Gly Asn Pro Leu Cys Thr Glu Ser Ile Asp Ala Gly Arg
85 90 95
act gtc acc ttg aat gat att gaa ggc cgt ccg cta cta acg gtg act 336
Thr Val Thr Leu Asn Asp Ile Glu Gly Arg Pro Leu Leu Thr Val Thr
100 105 110
gca aca ggg gtt ata caa act cga caa tat gaa act tct tcc ctg ccc 384
Ala Thr Gly Val Ile Gln Thr Arg Gln Tyr Glu Thr Ser Ser Leu Pro
115 120 125
ggt cgt ctg tta tct gtt gcc gaa caa aca ccc gag gaa aaa aca tcc 432
Gly Arg Leu Leu Ser Val Ala Glu Gln Thr Pro Glu Glu Lys Thr Ser
130 135 140
cgt atc acc gaa cgc ctg att tgg gct ggc aat acc gaa gca gag aaa 480
Arg Ile Thr Glu Arg Leu Ile Trp Ala Gly Asn Thr Glu Ala Glu Lys
145 150 155 160
gac cat aac ctt gcc ggc cag tgc gtg cgt cac tat gac acg gcg gga 528
Asp His Asn Leu Ala Gly Gln Cys Val Arg His Tyr Asp Thr Ala Gly
165 170 175
gtt acc cgg tta gag agt tta tca ctg acc ggt act gtt tta tct caa 576
Val Thr Arg Leu Glu Ser Leu Ser Leu Thr Gly Thr Val Leu Ser Gln
180 185 190
tcc agc caa cta ttg atc gac act caa gag gca aac tgg aca ggt gat 624
Ser Ser Gln Leu Leu Ile Asp Thr Gln Glu Ala Asn Trp Thr Gly Asp
195 200 205
aac gaa acc gtc tgg caa aac atg ctg gct gat gac atc tac aca acc 672
Asn Glu Thr Val Trp Gln Asn Met Leu Ala Asp Asp Ile Tyr Thr Thr
210 215 220
ctg agc acc ttc gat gcc acc ggt gct tta ctg act cag acc gat gcg 720
Leu Ser Thr Phe Asp Ala Thr Gly Ala Leu Leu Thr Gln Thr Asp Ala
225 230 235 240
aaa ggg aac att cag aga ctg gct tat gat gtg gcc ggg cag cta aac 768
Lys Gly Asn Ile Gln Arg Leu Ala Tyr Asp Val Ala Gly Gln Leu Asn
245 250 255
ggg agc tgg cta aca ctc aaa ggc cag acg gaa caa gtg att atc aaa 816
Gly Ser Trp Leu Thr Leu Lys Gly Gln Thr Glu Gln Val Ile Ile Lys
260 265 270
tcc ctg acc tac tcc gcc gcc gga caa aaa tta cgt gag gaa cac ggc 864
Ser Leu Thr Tyr Ser Ala Ala Gly Gln Lys Leu Arg Glu Glu His Gly
275 280 285
aat gat gtt atc acc gaa tac agt tat gaa ccg gaa acc caa cgg ctg 912
Asn Asp Val Ile Thr Glu Tyr Ser Tyr Glu Pro Glu Thr Gln Arg Leu
290 295 300
atc ggt atc aaa acc cgc cgt ccg tca gac act aaa gtg cta caa gac 960
Ile Gly Ile Lys Thr Arg Arg Pro Ser Asp Thr Lys Val Leu Gln Asp
305 310 315 320
ctg cgc tat gaa tat gac ccg gta ggc aat gtc atc agc atc cgt aat 1008
Leu Arg Tyr Glu Tyr Asp Pro Val Gly Asn Val Ile Ser Ile Arg Asn
325 330 335
gac gcg gaa gcc acc cgc ttt tgg cac aat cag aaa gtg atg ccg gaa 1056
Asp Ala Glu Ala Thr Arg Phe Trp His Asn Gln Lys Val Met Pro Glu
340 345 350
aac act tat acc tac gat tcc ctg tat cag ctt atc agc gcc acc ggg 1104
Asn Thr Tyr Thr Tyr Asp Ser Leu Tyr Gln Leu Ile Ser Ala Thr Gly
355 360 365
cgc gaa atg gcg aat ata ggt caa caa agt cac caa ttt ccc tca ccc 1152
Arg Glu Met Ala Asn Ile Gly Gln Gln Ser His Gln Phe Pro Ser Pro
370 375 380
gct cta cct tct gat aac aac acc tat acc aac tat acc cgt act tat 1200
Ala Leu Pro Ser Asp Asn Asn Thr Tyr Thr Asn Tyr Thr Arg Thr Tyr
385 390 395 400
act tat gac cgt ggc ggc aat ctg acc aaa atc cag cac agt tca ccg 1248
Thr Tyr Asp Arg Gly Gly Asn Leu Thr Lys Ile Gln His Ser Ser Pro
405 410 415
gcg acg caa aac aac tac acc acc aat atc acg gtt tca aat cgc agc 1296
Ala Thr Gln Asn Asn Tyr Thr Thr Asn Ile Thr Val Ser Asn Arg Ser
420 425 430
aac cgc gca gta ctc agc aca ttg acc gaa gat ccg gcg caa gta gat 1344
Asn Arg Ala Val Leu Ser Thr Leu Thr Glu Asp Pro Ala Gln Val Asp
435 440 445
gct ttg ttt gat gca ggc gga cat cag aac acc ttg ata tca gga caa 1392
Ala Leu Phe Asp Ala Gly Gly His Gln Asn Thr Leu Ile Ser Gly Gln
450 455 460
aac ctg aac tgg aat act cgt ggt gaa ctg caa caa gta aca ctg gtt 1440
Asn Leu Asn Trp Asn Thr Arg Gly Glu Leu Gln Gln Val Thr Leu Val
465 470 475 480
aaa cgg gac aag ggc gcc aat gat gat cgg gaa tgg tat cgt tat agc 1488
Lys Arg Asp Lys Gly Ala Asn Asp Asp Arg Glu Trp Tyr Arg Tyr Ser
485 490 495
ggt gac gga aga agg atg tta aaa atc aat gaa cag cag gcc agc aac 1536
Gly Asp Gly Arg Arg Met Leu Lys Ile Asn Glu Gln Gln Ala Ser Asn
500 505 510
aac gct caa aca caa cgt gtg act tat ttg ccg aac tta gaa ctt cgt 1584
Asn Ala Gln Thr Gln Arg Val Thr Tyr Leu Pro Asn Leu Glu Leu Arg
515 520 525
cta aca caa aac agc acg gcc aca acc gaa gat ttg caa gtt atc acc 1632
Leu Thr Gln Asn Ser Thr Ala Thr Thr Glu Asp Leu Gln Val Ile Thr
530 535 540
gta ggc gaa gcg ggc cgg gca cag gta cga gta tta cat tgg gag agc 1680
Val Gly Glu Ala Gly Arg Ala Gln Val Arg Val Leu His Trp Glu Ser
545 550 555 560
ggt aaa ccg gaa gat atc gac aat aat cag ttg cgt tat agt tac gat 1728
Gly Lys Pro Glu Asp Ile Asp Asn Asn Gln Leu Arg Tyr Ser Tyr Asp
565 570 575
aat ctt atc ggt tcc agt caa ctt gaa tta gat agc gaa gga caa att 1776
Asn Leu Ile Gly Ser Ser Gln Leu Glu Leu Asp Ser Glu Gly Gln Ile
580 585 590
atc agt gaa gaa gaa tat tat ccc tat ggt gga aca gca tta tgg gcc 1824
Ile Ser Glu Glu Glu Tyr Tyr Pro Tyr Gly Gly Thr Ala Leu Trp Ala
595 600 605
gcc agg aat cag aca gaa gcc agt tat aaa act atc cgt tat tca ggc 1872
Ala Arg Asn Gln Thr Glu Ala Ser Tyr Lys Thr Ile Arg Tyr Ser Gly
610 615 620
aaa gag cgg gat gcc acc ggg cta tat tac tac ggc tat cgg tat tac 1920
Lys Glu Arg Asp Ala Thr Gly Leu Tyr Tyr Tyr Gly Tyr Arg Tyr Tyr
625 630 635 640
caa ccg tgg ata gga cgg tgg tta agc tcc gat ccg gca gga aca atc 1968
Gln Pro Trp Ile Gly Arg Trp Leu Ser Ser Asp Pro Ala Gly Thr Ile
645 650 655
gat ggg ctg aat tta tat cgg atg gtg agg aat aat cca gtt acc ctc 2016
Asp Gly Leu Asn Leu Tyr Arg Met Val Arg Asn Asn Pro Val Thr Leu
660 665 670
ctt gat cct gat gga tta atg cca aca att gca gaa cgc ata gca gca 2064
Leu Asp Pro Asp Gly Leu Met Pro Thr Ile Ala Glu Arg Ile Ala Ala
675 680 685
cta aaa aaa aat aaa gta aca gac tca gcg cct tcg cca gca aat gcc 2112
Leu Lys Lys Asn Lys Val Thr Asp Ser Ala Pro Ser Pro Ala Asn Ala
690 695 700
aca aac gta gcg ata aac atc cgc ccg cct gta gca cca aaa cct agc 2160
Thr Asn Val Ala Ile Asn Ile Arg Pro Pro Val Ala Pro Lys Pro Ser
705 710 715 720
tta ccg aaa gca tca acg agt agc caa cca acc aca cac cct atc gga 2208
Leu Pro Lys Ala Ser Thr Ser Ser Gln Pro Thr Thr His Pro Ile Gly
725 730 735
gct gca aac ata aaa cca acg acg tct ggg tca tct att gtt gct cca 2256
Ala Ala Asn Ile Lys Pro Thr Thr Ser Gly Ser Ser Ile Val Ala Pro
740 745 750
ttg agt cca gta gga aat aaa tct act tct gaa atc tct ctg cca gaa 2304
Leu Ser Pro Val Gly Asn Lys Ser Thr Ser Glu Ile Ser Leu Pro Glu
755 760 765
agc gct caa agc agt tct tca agc act acc tcg aca aat cta cag aaa 2352
Ser Ala Gln Ser Ser Ser Ser Ser Thr Thr Ser Thr Asn Leu Gln Lys
770 775 780
aaa tca ttt act tta tat aga gca gat aac aga tcc ttt gaa gaa atg 2400
Lys Ser Phe Thr Leu Tyr Arg Ala Asp Asn Arg Ser Phe Glu Glu Met
785 790 795 800
caa agt aaa ttc cct gaa gga ttt aaa gcc tgg act cct cta gac act 2448
Gln Ser Lys Phe Pro Glu Gly Phe Lys Ala Trp Thr Pro Leu Asp Thr
805 810 815
aag atg gca agg caa ttt gct agt atc ttt att ggt cag aaa gat aca 2496
Lys Met Ala Arg Gln Phe Ala Ser Ile Phe Ile Gly Gln Lys Asp Thr
820 825 830
tct aat tta cct aaa gaa aca gtc aag aac ata agc aca tgg gga gca 2544
Ser Asn Leu Pro Lys Glu Thr Val Lys Asn Ile Ser Thr Trp Gly Ala
835 840 845
aag cca aaa cta aaa gat ctc tca aat tac ata aaa tat acc aag gac 2592
Lys Pro Lys Leu Lys Asp Leu Ser Asn Tyr Ile Lys Tyr Thr Lys Asp
850 855 860
aaa tct aca gta tgg gtt tct act gca att aat act gaa gca ggt gga 2640
Lys Ser Thr Val Trp Val Ser Thr Ala Ile Asn Thr Glu Ala Gly Gly
865 870 875 880
caa agc tca ggg gct cca ctc cat aaa att gat atg gat ctc tac gag 2688
Gln Ser Ser Gly Ala Pro Leu His Lys Ile Asp Met Asp Leu Tyr Glu
885 890 895
ttt gcc att gat gga caa aaa cta aat cca cta ccg gag ggt aga act 2736
Phe Ala Ile Asp Gly Gln Lys Leu Asn Pro Leu Pro Glu Gly Arg Thr
900 905 910
aaa aac atg gta cct tcc ctt tta ctc gac acc cca caa ata gag aca 2784
Lys Asn Met Val Pro Ser Leu Leu Leu Asp Thr Pro Gln Ile Glu Thr
915 920 925
tca tcc atc att gca ctt aat cat gga ccg gta aat gat gca gaa att 2832
Ser Ser Ile Ile Ala Leu Asn His Gly Pro Val Asn Asp Ala Glu Ile
930 935 940
tca ttt ctg aca aca att ccg ctt aaa aat gta aaa cct cat aag aga 2880
Ser Phe Leu Thr Thr Ile Pro Leu Lys Asn Val Lys Pro His Lys Arg
945 950 955 960
taa 2883
<210>47
<211>960
<212>PRT
<213>Photorhabdus luminescens
<400>47
Met Lys Asn Ile Asp Pro Lys Leu Tyr Gln Lys Thr Pro Thr Val Ser
1 5 10 15
Val Tyr Asp Asn Arg Gly Leu Ile Ile Arg Asn Ile Asp Phe His Arg
20 25 30
Thr Thr Ala Asn Gly Asp Pro Asp Thr Arg Ile Thr Arg His Gln Tyr
35 40 45
Asp Ile His Gly His Leu Asn Gln Ser Ile Asp Pro Arg Leu Tyr Glu
50 55 60
Ala Lys Gln Thr Asn Asn Thr Ile Lys Pro Asn Phe Leu Trp Gln Tyr
65 70 75 80
Asp Leu Thr Gly Asn Pro Leu Cys Thr Glu Ser Ile Asp Ala Gly Arg
85 90 95
Thr Val Thr Leu Asn Asp Ile Glu Gly Arg Pro Leu Leu Thr Val Thr
100 105 110
Ala Thr Gly Val Ile Gln Thr Arg Gln Tyr Glu Thr Ser Ser Leu Pro
115 120 125
Gly Arg Leu Leu Ser Val Ala Glu Gln Thr Pro Glu Glu Lys Thr Ser
130 135 140
Arg Ile Thr Glu Arg Leu Ile Trp Ala Gly Asn Thr Glu Ala Glu Lys
145 150 155 160
Asp His Asn Leu Ala Gly Gln Cys Val Arg His Tyr Asp Thr Ala Gly
165 170 175
Val Thr Arg Leu Glu Ser Leu Ser Leu Thr Gly Thr Val Leu Ser Gln
180 185 190
Ser Ser Gln Leu Leu Ile Asp Thr Gln Glu Ala Asn Trp Thr Gly Asp
195 200 205
Asn Glu Thr Val Trp Gln Asn Met Leu Ala Asp Asp Ile Tyr Thr Thr
210 215 220
Leu Ser Thr Phe Asp Ala Thr Gly Ala Leu Leu Thr Gln Thr Asp Ala
225 230 235 240
Lys Gly Asn Ile Gln Arg Leu Ala Tyr Asp Val Ala Gly Gln Leu Asn
245 250 255
Gly Ser Trp Leu Thr Leu Lys Gly Gln Thr Glu Gln Val Ile Ile Lys
260 265 270
Ser Leu Thr Tyr Ser Ala Ala Gly Gln Lys Leu Arg Glu Glu His Gly
275 280 285
Asn Asp Val Ile Thr Glu Tyr Ser Tyr Glu Pro Glu Thr Gln Arg Leu
290 295 300
Ile Gly Ile Lys Thr Arg Arg Pro Ser Asp Thr Lys Val Leu Gln Asp
305 310 315 320
Leu Arg Tyr Glu Tyr Asp Pro Val Gly Asn Val Ile Ser Ile Arg Asn
325 330 335
Asp Ala Glu Ala Thr Arg Phe Trp His Asn Gln Lys Val Met Pro Glu
340 345 350
Asn Thr Tyr Thr Tyr Asp Ser Leu Tyr Gln Leu Ile Ser Ala Thr Gly
355 360 365
Arg Glu Met Ala Asn Ile Gly Gln Gln Ser His Gln Phe Pro Ser Pro
370 375 380
Ala Leu Pro Ser Asp Asn Asn Thr Tyr Thr Asn Tyr Thr Arg Thr Tyr
385 390 395 400
Thr Tyr Asp Arg Gly Gly Asn Leu Thr Lys Ile Gln His Ser Ser Pro
405 410 415
Ala Thr Gln Asn Asn Tyr Thr Thr Asn Ile Thr Val Ser Asn Arg Ser
420 425 430
Asn Arg Ala Val Leu Ser Thr Leu Thr Glu Asp Pro Ala Gln Val Asp
435 440 445
Ala Leu Phe Asp Ala Gly Gly His Gln Asn Thr Leu Ile Ser Gly Gln
450 455 460
Asn Leu Asn Trp Asn Thr Arg Gly Glu Leu Gln Gln Val Thr Leu Val
465 470 475 480
Lys Arg Asp Lys Gly Ala Asn Asp Asp Arg Glu Trp Tyr Arg Tyr Ser
485 490 495
Gly Asp Gly Arg Arg Met Leu Lys Ile Asn Glu Gln Gln Ala Ser Asn
500 505 510
Asn Ala Gln Thr Gln Arg Val Thr Tyr Leu Pro Asn Leu Glu Leu Arg
515 520 525
Leu Thr Gln Asn Ser Thr Ala Thr Thr Glu Asp Leu Gln Val Ile Thr
530 535 540
Val Gly Glu Ala Gly Arg Ala Gln Val Arg Val Leu His Trp Glu Ser
545 550 555 560
Gly Lys Pro Glu Asp Ile Asp Asn Asn Gln Leu Arg Tyr Ser Tyr Asp
565 570 575
Asn Leu Ile Gly Ser Ser Gln Leu Glu Leu Asp Ser Glu Gly Gln Ile
580 585 590
Ile Ser Glu Glu Glu Tyr Tyr Pro Tyr Gly Gly Thr Ala Leu Trp Ala
595 600 605
Ala Arg Asn Gln Thr Glu Ala Ser Tyr Lys Thr Ile Arg Tyr Ser Gly
610 615 620
Lys Glu Arg Asp Ala Thr Gly Leu Tyr Tyr Tyr Gly Tyr Arg Tyr Tyr
625 630 635 640
Gln Pro Trp Ile Gly Arg Trp Leu Ser Ser Asp Pro Ala Gly Thr Ile
645 650 655
Asp Gly Leu Asn Leu Tyr Arg Met Val Arg Asn Asn Pro Val Thr Leu
660 665 670
Leu Asp Pro Asp Gly Leu Met Pro Thr Ile Ala Glu Arg Ile Ala Ala
675 680 685
Leu Lys Lys Asn Lys Val Thr Asp Ser Ala Pro Ser Pro Ala Asn Ala
690 695 700
Thr Asn Val Ala Ile Asn Ile Arg Pro Pro Val Ala Pro Lys Pro Ser
705 710 715 720
Leu Pro Lys Ala Ser Thr Ser Ser Gln Pro Thr Thr His Pro Ile Gly
725 730 735
Ala Ala Asn Ile Lys Pro Thr Thr Ser Gly Ser Ser Ile Val Ala Pro
740 745 750
Leu Ser Pro Val Gly Asn Lys Ser Thr Ser Glu Ile Ser Leu Pro Glu
755 760 765
Ser Ala Gln Ser Ser Ser Ser Ser Thr Thr Ser Thr Asn Leu Gln Lys
770 775 780
Lys Ser Phe Thr Leu Tyr Arg Ala Asp Asn Arg Ser Phe Glu Glu Met
785 790 795 800
Gln Ser Lys Phe Pro Glu Gly Phe Lys Ala Trp Thr Pro Leu Asp Thr
805 810 815
Lys Met Ala Arg Gln Phe Ala Ser Ile Phe Ile Gly Gln Lys Asp Thr
820 825 830
Ser Asn Leu Pro Lys Glu Thr Val Lys Asn Ile Ser Thr Trp Gly Ala
835 840 845
Lys Pro Lys Leu Lys Asp Leu Ser Asn Tyr Ile Lys Tyr Thr Lys Asp
850 855 860
Lys Ser Thr Val Trp Val Ser Thr Ala Ile Asn Thr Glu Ala Gly Gly
865 870 875 880
Gln Ser Ser Gly Ala Pro Leu His Lys Ile Asp Met Asp Leu Tyr Glu
885 890 895
Phe Ala Ile Asp Gly Gln Lys Leu Asn Pro Leu Pro Glu Gly Arg Thr
900 905 910
Lys Asn Met Val Pro Ser Leu Leu Leu Asp Thr Pro Gln Ile Glu Thr
915 920 925
Ser Ser Ile Ile Ala Leu Asn His Gly Pro Val Asn Asp Ala Glu Ile
930 935 940
Ser Phe Leu Thr Thr Ile Pro Leu Lys Asn Val Lys Pro His Lys Arg
945 950 955 960
<210>48
<211>4521
<212>DNA
<213>Xenorhabdus bovienii
<400>48
atgaaacaag attcacagga catgacagta acacagctgt ccctgcccaa agggggcggt 60
gcgatcagtg gcatgggtga cactatcagc aatgcagggc cggatgggat ggcttcgctt 120
tccgtgcctt tgcctatctc tgccggtcgg gggggcgcac cgaatttatc cctgaactac 180
agtagcggag caggaaacgg gtcatttggt attggctggc aatccagtac catggctatc 240
agccgtcgta ctcaacatgg cgtaccgcaa tatcacggcg aagatacttt tttatgtccg 300
atgggagaag tgatggcggt tgccgtcaat cagagcgggc aacccgatgt gcgtaaaacc 360
gataaactat taggcgggca actgcctgtt acttataccg ttacgcgtca tcagcccaga 420
aatattcagc acttcagcaa acttgaatac tggcagcccc caacggatgt ggaaaccacg 480
cctttttggt taatgtattc acccgatgga caaattcaca ttttcggaaa aactgagcag 540
gctcagatcg ctaacccggc agaggtttca cagattgccc aatggctttt ggaagaaacc 600
gtaacaccag cgggagaaca catttattac cagtatcggg cagaagacga tatcggttgt 660
gatgacagcg aaaaaaatgc ccaccctaat gccagtgctc aacgttattt gactcaggtg 720
aactacggca atattacacc tgaatccagc ctgcttgtgc tgaagaatac gccaccggcg 780
gataacgaat ggctattcca tttggttttt gattatggtg aacgagcgca ggaaataaac 840
acggttcctc ctttcaaagc accttcaaac aactggaaaa tacggccaga ccgtttctcc 900
cgctttgaat atggttttga ggtgcgaacc cgccgcctgt gtcaacaaat tctgatgttc 960
catcgcctga aatcccttgc aggagaacag attgacggag aagaaatccc tgccttggtt 1020
gcccgtctgc ttctcagtta tgacctgaac gacagcgtga caacccttac cgccattcgg 1080
caaatggcgt atgaaactga cgcaacctta atcgctttac cgccactgga gtttgactat 1140
cagccctttg aggcaaaagt cacgcagaaa tggcaggaaa tgcctcaatt ggccggattg 1200
aatgcccaac aaccttacca actcgtcgat ctctatggtg aaggtatctc cggcatcttg 1260
tatcaggaca gacccggagc atggtggtat caggcaccga tccgtcagaa aaacgttgaa 1320
gatattaacg ctgtcaccta tagcccaata aaccccttac ctaagatccc cagccagcag 1380
gacagagcaa cgttgatgga tatcgacggt gatggacatc tggattgggt gatcgctggc 1440
gcaggtattc aggggcggta cagtatgcag ccgaatggag agtggacaca ctttattccc 1500
atttctgcac tgccaacaga atattttcat ccacaggcac aactggcgga tctggtgggg 1560
gccgggttat ctgatttagc gctgattggc cccagaagtg tgcgtttata tgccaacgac 1620
cgaggaaact ggaaagcggg tattaatgtt atgccacctg atggtgtgaa tttgccgata 1680
tttggtggtg atgccagcag tctggtcgca ttttctgaca tgttgggatc gggacagcag 1740
catttggtgg aaattgccgc tcagagcgtc aaatgctggc cgaatctagg acatggccgt 1800
tttggtgcgg ctattttgct gccggggttt agccagccga atggaacatt caatgctaac 1860
caagtttttc tggcagatat cgatggttcc ggcaccgccg acatcatcta tgcacacagt 1920
acgtatctgg atatttacct gaacgaaagc ggcaaccgtt tcagtgcacc cgttcggctt 1980
aatttgccgg aaggggtgat gtttgacaat acctgtcagt tacaggtgtc ggatattcaa 2040
ggattgggcg ctgccagcat tgtactgacc gtacctcata tgacaccgcg ccattggcgt 2100
tatgatttta ctcacaataa accttggctg ctcaatgtca tcaacaacaa tcgtggcgca 2160
gaaaccacgt tgttttaccg tagttctgcc caattctggc tggatgaaaa aagtcagatc 2220
gaagagctgg gaaaatttgc agcgagttat ctgcctttcc ccatacattt gttgtggcgc 2280
aatgaggcgc tggatgaaat tactggtaat cgactgacta aggtcatgaa ttatgcccac 2340
ggtgcatggg atggcagaga gagagaattt tgcggatttg gccgtgtaac gcaaattgat 2400
accgacgaat ttgccaaggg aaccacagag aaagcgccgg atgaaaatat ctatccttcc 2460
cgtagcataa gctggtttgc cacgggttta ccagaagtgg attctcaact tccggcagaa 2520
tactggcgtg gtgacgatca ggcatttgcc ggctttacac cgcgcttcac tcgttatgaa 2580
aaaggtaatg cggggcaaga ggggcaggat accccgatta aagaaccgac cgaaacagaa 2640
gcgtattggc ttaaccgcgc catgaaaggc caattactgc gcagtgaagt ctatggtgac 2700
gacaaaacag aaaaagctaa aattccgtac accgtcacag aagctcgctg tcaggtcaga 2760
ttaattccca gcaatgacga agccgcgccg tcgtcttgga cgtcgatcat tgaaaaccgc 2820
agttatcact atgagcgtat cgtcgtcgat ccgagttgca aacaacaggt cgtgctcaag 2880
gcggatgaat atggcttccc actggcaaaa gtagatatcg cctatccacg gcgcaataaa 2940
ccggcacaga acccttatcc ggattcgtta ccggatactc tgttcgccga tagctatgac 3000
gaccagcaaa aacagttata tctgacaaaa cagcagcaga gctattacca cctgacccag 3060
caggatgatt gggttctggg tttgacggat agccgataca gcgaagttta tcattatgcg 3120
caaactgacg ctcaaagtga catccccaag gcagggctga tattggaaga cctgctgaaa 3180
gttgacggcc tgataggtaa agacaagact tttatctatt tagggcagca gcgagtggct 3240
tatgtgggag gagatgcaga aaaaccgaca cgtcaggtgc gggtggctta tacagaaacc 3300
gctgcttttg atgacaatgc gctgcacgcc tttgatggcg tgattgcccc tgatgaactg 3360
acgcaacagt tgctggcggg tggatacctg ctcgtgccgc agatttctga tgtggcaggc 3420
agtagtgaaa aggtatgggt agctcggcag ggatacaccg aatacggcag tgctgctcaa 3480
ttctaccggc cactcatcca gcgcaaaagc ttgctgaccg gaaaatatac ccttagttgg 3540
gatacgcact attgtgtggt ggtaaaaacc gaagatggtg cgggaatgac cacgcaagcg 3600
aagtacgatt accgcttcct gcttccggcg caattgacag atatcaatga caaccagcac 3660
atcgtgacat ttaatgcatt ggggcaggtg acttccagcc gtttctgggg cacagaaaat 3720
ggcaaaataa gcggttactc gacgccggag agtaaaccgt tcacagtacc cgataccgtc 3780
gaaaaagccc ttgccttgca accgacgatc ccggtttcac agtgcaacat ttatgtgccg 3840
gatagttgga tgcggcttct gccccaacag tctctgactg gccagctaaa agagggggaa 3900
actttgtgga acgcattaca ccgggcgggt gtagtaacgg aagacggttt gatctgtgaa 3960
ctggcctatc gtcgttggat caaacgtcag gcaacgtctt caatgatggc cgtgacatta 4020
cagcaaatct tggctcagac tccacgacaa cctccgcatg ccatgacgat cacgacagat 4080
cgttatgaca gcgattctca gcagcaactt cggcagtcga tagtattgag tgatggtttt 4140
ggtcgcgtat tgcaaagcgc ccagcgtcat gaagcaggag aggcatggca gcgtgcagaa 4200
gatggttctt tggttgtcga taataccggt aaacccgttg ttgctaatac cacaacgcgc 4260
tgggcagtat ccggtcgcac agaatacgac ggcaaagggc aggcgatcag agcttacctg 4320
ccttattatc tcaatgattg gcgctatgtc agtgatgaca gcgcccggga tgacctgtac 4380
gccgataccc atttttacga tcctctgggg cgtgaatatc aggtaaaaac cgcgaaagga 4440
ttttggcgtg aaaacatgtt tatgccgtgg tttgtcgtca atgaagatga aaatgacaca 4500
gcagcacgtt taacatctta a 4521
<210>49
<211>1506
<212>PRT
<213>Xenorhabdus bovienii
<400>49
Met Lys Gln Asp Ser Gln Asp Met Thr Val Thr Gln Leu Ser Leu Pro
1 5 10 15
Lys Gly Gly Gly Ala Ile Ser Gly Met Gly Asp Thr Ile Ser Asn Ala
20 25 30
Gly Pro Asp Gly Met Ala Ser Leu Ser Val Pro Leu Pro Ile Ser Ala
35 40 45
Gly Arg Gly Gly Ala Pro Asn Leu Ser Leu Asn Tyr Ser Ser Gly Ala
50 55 60
Gly Asn Gly Ser Phe Gly Ile Gly Trp Gln Ser Ser Thr Met Ala Ile
65 70 75 80
Ser Arg Arg Thr Gln His Gly Val Pro Gln Tyr His Gly Glu Asp Thr
85 90 95
Phe Leu Cys Pro Met Gly Glu Val Met Ala Val Ala Val Asn Gln Ser
100 105 110
Gly Gln Pro Asp Val Arg Lys Thr Asp Lys Leu Leu Gly Gly Gln Leu
115 120 125
Pro Val Thr Tyr Thr Val Thr Arg His Gln Pro Arg Asn Ile Gln His
130 135 140
Phe Ser Lys Leu Glu Tyr Trp Gln Pro Pro Thr Asp Val Glu Thr Thr
145 150 155 160
Pro Phe Trp Leu Met Tyr Ser Pro Asp Gly Gln Ile His Ile Phe Gly
165 170 175
Lys Thr Glu Gln Ala Gln Ile Ala Asn Pro Ala Glu Val Ser Gln Ile
180 185 190
Ala Gln Trp Leu Leu Glu Glu Thr Val Thr Pro Ala Gly Glu His Ile
195 200 205
Tyr Tyr Gln Tyr Arg Ala Glu Asp Asp Ile Gly Cys Asp Asp Ser Glu
210 215 220
Lys Asn Ala His Pro Asn Ala Ser Ala Gln Arg Tyr Leu Thr Gln Val
225 230 235 240
Asn Tyr Gly Asn Ile Thr Pro Glu Ser Ser Leu Leu Val Leu Lys Asn
245 250 255
Thr Pro Pro Ala Asp Asn Glu Trp Leu Phe His Leu Val Phe Asp Tyr
260 265 270
Gly Glu Arg Ala Gln Glu Ile Asn Thr Val Pro Pro Phe Lys Ala Pro
275 280 285
Ser Asn Asn Trp Lys Ile Arg Pro Asp Arg Phe Ser Arg Phe Glu Tyr
290 295 300
Gly Phe Glu Val Arg Thr Arg Arg Leu Cys Gln Gln Ile Leu Met Phe
305 310 315 320
His Arg Leu Lys Ser Leu Ala Gly Glu Gln Ile Asp Gly Glu Glu Ile
325 330 335
Pro Ala Leu Val Ala Arg Leu Leu Leu Ser Tyr Asp Leu Asn Asp Ser
340 345 350
Val Thr Thr Leu Thr Ala Ile Arg Gln Met Ala Tyr Glu Thr Asp Ala
355 360 365
Thr Leu Ile Ala Leu Pro Pro Leu Glu Phe Asp Tyr Gln Pro Phe Glu
370 375 380
Ala Lys Val Thr Gln Lys Trp Gln Glu Met Pro Gln Leu Ala Gly Leu
385 390 395 400
Asn Ala Gln Gln Pro Tyr Gln Leu Val Asp Leu Tyr Gly Glu Gly Ile
405 410 415
Ser Gly Ile Leu Tyr Gln Asp Arg Pro Gly Ala Trp Trp Tyr Gln Ala
420 425 430
Pro Ile Arg Gln Lys Asn Val Glu Asp Ile Asn Ala Val Thr Tyr Ser
435 440 445
Pro Ile Asn Pro Leu Pro Lys Ile Pro Ser Gln Gln Asp Arg Ala Thr
450 455 460
Leu Met Asp Ile Asp Gly Asp Gly His Leu Asp Trp Val Ile Ala Gly
465 470 475 480
Ala Gly Ile Gln Gly Arg Tyr Ser Met Gln Pro Asn Gly Glu Trp Thr
485 490 495
His Phe Ile Pro Ile Ser Ala Leu Pro Thr Glu Tyr Phe His Pro Gln
500 505 510
Ala Gln Leu Ala Asp Leu Val Gly Ala Gly Leu Ser Asp Leu Ala Leu
515 520 525
Ile Gly Pro Arg Ser Val Arg Leu Tyr Ala Asn Asp Arg Gly Asn Trp
530 535 540
Lys Ala Gly Ile Asn Val Met Pro Pro Asp Gly Val Asn Leu Pro Ile
545 550 555 560
Phe Gly Gly Asp Ala Ser Ser Leu Val Ala Phe Ser Asp Met Leu Gly
565 570 575
Ser Gly Gln Gln His Leu Val Glu Ile Ala Ala Gln Ser Val Lys Cys
580 585 590
Trp Pro Asn Leu Gly His Gly Arg Phe Gly Ala Ala Ile Leu Leu Pro
595 600 605
Gly Phe Ser Gln Pro Asn Gly Thr Phe Asn Ala Asn Gln Val Phe Leu
610 615 620
Ala Asp Ile Asp Gly Ser Gly Thr Ala Asp Ile Ile Tyr Ala His Ser
625 630 635 640
Thr Tyr Leu Asp Ile Tyr Leu Asn Glu Ser Gly Asn Arg Phe Ser Ala
645 650 655
Pro Val Arg Leu Asn Leu Pro Glu Gly Val Met Phe Asp Asn Thr Cys
660 665 670
Gln Leu Gln Val Ser Asp Ile Gln Gly Leu Gly Ala Ala Ser Ile Val
675 680 685
Leu Thr Val Pro His Met Thr Pro Arg His Trp Arg Tyr Asp Phe Thr
690 695 700
His Asn Lys Pro Trp Leu Leu Asn Val Ile Asn Asn Asn Arg Gly Ala
705 710 715 720
Glu Thr Thr Leu Phe Tyr Arg Ser Ser Ala Gln Phe Trp Leu Asp Glu
725 730 735
Lys Ser Gln Ile Glu Glu Leu Gly Lys Phe Ala Ala Ser Tyr Leu Pro
740 745 750
Phe Pro Ile His Leu Leu Trp Arg Asn Glu Ala Leu Asp Glu Ile Thr
755 760 765
Gly Asn Arg Leu Thr Lys Val Met Asn Tyr Ala His Gly Ala Trp Asp
770 775 780
Gly Arg Glu Arg Glu Phe Cys Gly Phe Gly Arg Val Thr Gln Ile Asp
785 790 795 800
Thr Asp Glu Phe Ala Lys Gly Thr Thr Glu Lys Ala Pro Asp Glu Asn
805 810 815
Ile Tyr Pro Ser Arg Ser Ile Ser Trp Phe Ala Thr Gly Leu Pro Glu
820 825 830
Val Asp Ser Gln Leu Pro Ala Glu Tyr Trp Arg Gly Asp Asp Gln Ala
835 840 845
Phe Ala Gly Phe Thr Pro Arg Phe Thr Arg Tyr Glu Lys Gly Asn Ala
850 855 860
Gly Gln Glu Gly Gln Asp Thr Pro Ile Lys Glu Pro Thr Glu Thr Glu
865 870 875 880
Ala Tyr Trp Leu Asn Arg Ala Met Lys Gly Gln Leu Leu Arg Ser Glu
885 890 895
Val Tyr Gly Asp Asp Lys Thr Glu Lys Ala Lys Ile Pro Tyr Thr Val
900 905 910
Thr Glu Ala Arg Cys Gln Val Arg Leu Ile Pro Ser Asn Asp Glu Ala
915 920 925
Ala Pro Ser Ser Trp Thr Ser Ile Ile Glu Asn Arg Ser Tyr His Tyr
930 935 940
Glu Arg Ile Val Val Asp Pro Ser Cys Lys Gln Gln Val Val Leu Lys
945 950 955 960
Ala Asp Glu Tyr Gly Phe Pro Leu Ala Lys Val Asp Ile Ala Tyr Pro
965 970 975
Arg Arg Asn Lys Pro Ala Gln Asn Pro Tyr Pro Asp Ser Leu Pro Asp
980 985 990
Thr Leu Phe Ala Asp Ser Tyr Asp Asp Gln Gln Lys Gln Leu Tyr Leu
995 1000 1005
Thr Lys Gln Gln Gln Ser Tyr Tyr His Leu Thr Gln Gln Asp Asp
1010 1015 1020
Trp Val Leu Gly Leu Thr Asp Ser Arg Tyr Ser Glu Val Tyr His
1025 1030 1035
Tyr Ala Gln Thr Asp Ala Gln Ser Asp Ile Pro Lys Ala Gly Leu
1040 1045 1050
Ile Leu Glu Asp Leu Leu Lys Val Asp Gly Leu Ile Gly Lys Asp
1055 1060 1065
Lys Thr Phe Ile Tyr Leu Gly Gln Gln Arg Val Ala Tyr Val Gly
1070 1075 1080
Gly Asp Ala Glu Lys Pro Thr Arg Gln Val Arg Val Ala Tyr Thr
1085 1090 1095
Glu Thr Ala Ala Phe Asp Asp Asn Ala Leu His Ala Phe Asp Gly
1100 1105 1110
Val Ile Ala Pro Asp Glu Leu Thr Gln Gln Leu Leu Ala Gly Gly
1115 1120 1125
Tyr Leu Leu Val Pro Gln Ile Ser Asp Val Ala Gly Ser Ser Glu
1130 1135 1140
Lys Val Trp Val Ala Arg Gln Gly Tyr Thr Glu Tyr Gly Ser Ala
1145 1150 1155
Ala Gln Phe Tyr Arg Pro Leu Ile Gln Arg Lys Ser Leu Leu Thr
1160 1165 1170
Gly Lys Tyr Thr Leu Ser Trp Asp Thr His Tyr Cys Val Val Val
1175 1180 1185
Lys Thr Glu Asp Gly Ala Gly Met Thr Thr Gln Ala Lys Tyr Asp
1190 1195 1200
Tyr Arg Phe Leu Leu Pro Ala Gln Leu Thr Asp Ile Asn Asp Asn
1205 1210 1215
Gln His Ile Val Thr Phe Asn Ala Leu Gly Gln Val Thr Ser Ser
1220 1225 1230
Arg Phe Trp Gly Thr Glu Asn Gly Lys Ile Ser Gly Tyr Ser Thr
1235 1240 1245
Pro Glu Ser Lys Pro Phe Thr Val Pro Asp Thr Val Glu Lys Ala
1250 1255 1260
Leu Ala Leu Gln Pro Thr Ile Pro Val Ser Gln Cys Asn Ile Tyr
1265 1270 1275
Val Pro Asp Ser Trp Met Arg Leu Leu Pro Gln Gln Ser Leu Thr
1280 1285 1290
Gly Gln Leu Lys Glu Gly Glu Thr Leu Trp Asn Ala Leu His Arg
1295 1300 1305
Ala Gly Val Val Thr Glu Asp Gly Leu Ile Cys Glu Leu Ala Tyr
1310 1315 1320
Arg Arg Trp Ile Lys Arg Gln Ala Thr Ser Ser Met Met Ala Val
1325 1330 1335
Thr Leu Gln Gln Ile Leu Ala Gln Thr Pro Arg Gln Pro Pro His
1340 1345 1350
Ala Met Thr Ile Thr Thr Asp Arg Tyr Asp Ser Asp Ser Gln Gln
1355 1360 1365
Gln Leu Arg Gln Ser Ile Val Leu Ser Asp Gly Phe Gly Arg Val
1370 1375 1380
Leu Gln Ser Ala Gln Arg His Glu Ala Gly Glu Ala Trp Gln Arg
1385 1390 1395
Ala Glu Asp Gly Ser Leu Val Val Asp Asn Thr Gly Lys Pro Val
1400 1405 1410
Val Ala Asn Thr Thr Thr Arg Trp Ala Val Ser Gly Arg Thr Glu
1415 1420 1425
Tyr Asp Gly Lys Gly Gln Ala Ile Arg Ala Tyr Leu Pro Tyr Tyr
1430 1435 1440
Leu Asn Asp Trp Arg Tyr Val Ser Asp Asp Ser Ala Arg Asp Asp
1445 1450 1455
Leu Tyr Ala Asp Thr His Phe Tyr Asp Pro Leu Gly Arg Glu Tyr
1460 1465 1470
Gln Val Lys Thr Ala Lys Gly Phe Trp Arg Glu Asn Met Phe Met
1475 1480 1485
Pro Trp Phe Val Val Asn Glu Asp Glu Asn Asp Thr Ala Ala Arg
1490 1495 1500
Leu Thr Ser
1505
<210>50
<211>2889
<212>DNA
<213>Xenorhabdus bovienii
<400>50
atgaatgttt ttaatccaac tttatatgcc ggtacaccga ctgtcaccgt catggacaat 60
cgagggctgt cagtgcggga tattgcttat caccgtacaa cagcaggaga gcaggctgac 120
actcgcatca cccgccatca atacagtccc cataattttt taatcgagag cattgatcca 180
cgcctttttg atttgcaatc tcagagcacc ataaaaccta atttcaccta ctgtcctgcc 240
ttgaagggtg atgtcctacg gacagagagt gtggatgccg gacaaactgt cattttgagt 300
gacatcgaag gtcgtccgtt actgaatatc agtgcgatgg gtgtcgtcaa acactggcaa 360
tatgaagaga gtacattgcc ggggcgcttg ctcgctgtca gtgaacggaa gaatgaggct 420
tcaacacccc aaattattga acggtttatt tggtcgggaa atagcccatc agaaaaagat 480
cacaatttgg cgggaaaata tcttcgtcat tatgataccg ccggattaaa ccagcttaat 540
gctgtgtctc tgaccagcgt ggatctctca caatcccgtc agttattgca ggatgatgtc 600
acagcagatt ggagcggaag tgacgaatcc cagtggaaga cgcgactgag taacgacata 660
ttcacaaccg aaatcaccgc tgatgcggtt ggcaatttct tgactcagaa tgatgccaaa 720
agcaaccagc aacgattgtc ctatgatgtg gcagggcagt taaaggcaag ctggctgacg 780
ataaaaggcc agaatgagca ggtgatagtt aactccctga cttactccgc cgcagggcag 840
aaactgcgtg aagagcaggg taacggcgtt gtcactgaat actcctatga agcacaaacc 900
tggcgtttga taggtgtaac ggcttaccgt cagtcagata aaaaaagatt gcaggatctt 960
gtctataact atgatccggt cggtaatctc ctgaatattc gcaataatgc agaggcaacc 1020
cgtttctggc gtaatcagat agtagaacca gagaaccact atgcttatga ctcgctttat 1080
caactcatca gtgctagtgg tcgagaaatc gccagtatcg gtcagcaggg cagccggctg 1140
cctgtaccga ttattcctct tcctgccaat gacgatgttt atactcgcta cacccgcaca 1200
tatcactatg atcgcggtgg aaatctctgc cagatccggc attgcgctcc tgctacagat 1260
aataagtaca ccacaaagat caccgtatcg aatcgtagta atcgtgcagt atgggatacc 1320
ttgaccacag atcccgccaa agtggatacc ctgtttgatc atggagggca tcaacttcaa 1380
ctccagtcag gccagacttt atgttggaac tatcggggtg aactacagca aataacaaag 1440
atacagcgtg acgaaaaacc cgcagataaa gagcggtatc gctatggtgt tggggctgcg 1500
cgggtcgtga aaatcagcac acagcaggcg gggggaagca gccatgtgca gcgtgttgtt 1560
tatctgccgg ggttggaact acgcacaact cagcatgatg cgacattaat cgaagactta 1620
caggtgatta tcatgggtga agcaggacgt gctcaggtac gcgtacttca ttgggaaata 1680
ccaccaccgg ataatcttaa caatgactca ctgcgttaca gctacgatag tttgatgggt 1740
tccagtcagc ttgaattgga tggagcaggg cagattatta cgcaggaaga atactacccc 1800
tatggaggta cagcaatatg ggcggcaaga aaccagaccg aagccaatta caaaaccatt 1860
cgctactccg gcaaagagcg tgatgcgacg gggctttatt actacgggca ccgttattat 1920
cagccgtggc tagggcgctg gttgagcgca gatcccgccg gaaccgtgga cggactgaat 1980
ctatatcgaa tggtgaggaa taacccgatt acttaccggg atgcagatgg gcttgcgccg 2040
ataggcgata agatcagcga agggatttat gagcctgagt tgcgagttgg tcttgaacga 2100
gatgacccaa atgtcagaga ttatgaccgg gtttatcctg atacggccaa gacagagatg 2160
atcgaagcaa ctgcgaccac aattgctccc agtcaaatgt tatcggcgca tgcttttgca 2220
tctgtaccta tattgacaga tttgtttaat cctcaaacag caaggctttc tcaaaagaca 2280
acggatattg tattaaacac acaaggtgga ggcgatttaa tctttactgg catgaatatt 2340
aaaggtaagg gaaaagaatt taatgcatta aaaatcgttg atacttatgg cggagaaatg 2400
cctgatagca aaaccgctat ttcagcatat tggcttccgc aaggtgggta tactgatatt 2460
ccgatacatc cgactggaat acaaaagtat ttgtttacgc ctgcgtttag tggttgcact 2520
ctggcagtag ataagcttaa cgaaaataca ttacgggcgt atcacgtcga aggaagtaag 2580
gaagatgctc aatataataa tttagcagtt gcagcgcacg gagagggttt ggtcatggct 2640
atggaatttc ctgactatgg atttcataca gacaaaacag ggcaaagact aaggaacaca 2700
cagggatttg cgtttatgtc ctacaatcaa tcccagaaaa aatgggaaat tcattatcaa 2760
aggcaagcat tgacatcaaa caccggtatc atgaatgtta gtgctaaaaa caagattcga 2820
ttgaatgccc ccagtcatgt aaaaaatagc tcaatcaaag gaactgaaat aatgacgaca 2880
catttttaa 2889
<210>51
<211>962
<212>PRT
<213>Xenorhabdus bovienii
<400>51
Met Asn Val Phe Asn Pro Thr Leu Tyr Ala Gly Thr Pro Thr Val Thr
1 5 10 15
Val Met Asp Asn Arg Gly Leu Ser Val Arg Asp Ile Ala Tyr His Arg
20 25 30
Thr Thr Ala Gly Glu Gln Ala Asp Thr Arg Ile Thr Arg His Gln Tyr
35 40 45
Ser Pro His Asn Phe Leu Ile Glu Ser Ile Asp Pro Arg Leu Phe Asp
50 55 60
Leu Gln Ser Gln Ser Thr Ile Lys Pro Asn Phe Thr Tyr Cys Pro Ala
65 70 75 80
Leu Lys Gly Asp Val Leu Arg Thr Glu Ser Val Asp Ala Gly Gln Thr
85 90 95
Val Ile Leu Ser Asp Ile Glu Gly Arg Pro Leu Leu Asn Ile Ser Ala
100 105 110
Met Gly Val Val Lys His Trp Gln Tyr Glu Glu Ser Thr Leu Pro Gly
115 120 125
Arg Leu Leu Ala Val Ser Glu Arg Lys Asn Glu Ala Ser Thr Pro Gln
130 135 140
Ile Ile Glu Arg Phe Ile Trp Ser Gly Asn Ser Pro Ser Glu Lys Asp
145 150 155 160
His Asn Leu Ala Gly Lys Tyr Leu Arg His Tyr Asp Thr Ala Gly Leu
165 170 175
Asn Gln Leu Asn Ala Val Ser Leu Thr Ser Val Asp Leu Ser Gln Ser
180 185 190
Arg Gln Leu Leu Gln Asp Asp Val Thr Ala Asp Trp Ser Gly Ser Asp
195 200 205
Glu Ser Gln Trp Lys Thr Arg Leu Ser Asn Asp Ile Phe Thr Thr Glu
210 215 220
Ile Thr Ala Asp Ala Val Gly Asn Phe Leu Thr Gln Asn Asp Ala Lys
225 230 235 240
Ser Asn Gln Gln Arg Leu Ser Tyr Asp Val Ala Gly Gln Leu Lys Ala
245 250 255
Ser Trp Leu Thr Ile Lys Gly Gln Asn Glu Gln Val Ile Val Asn Ser
260 265 270
Leu Thr Tyr Ser Ala Ala Gly Gln Lys Leu Arg Glu Glu Gln Gly Asn
275 280 285
Gly Val Val Thr Glu Tyr Ser Tyr Glu Ala Gln Thr Trp Arg Leu Ile
290 295 300
Gly Val Thr Ala Tyr Arg Gln Ser Asp Lys Lys Arg Leu Gln Asp Leu
305 310 315 320
Val Tyr Asn Tyr Asp Pro Val Gly Asn Leu Leu Asn Ile Arg Asn Asn
325 330 335
Ala Glu Ala Thr Arg Phe Trp Arg Asn Gln Ile Val Glu Pro Glu Asn
340 345 350
His Tyr Ala Tyr Asp Ser Leu Tyr Gln Leu Ile Ser Ala Ser Gly Arg
355 360 365
Glu Ile Ala Ser Ile Gly Gln Gln Gly Ser Arg Leu Pro Val Pro Ile
370 375 380
Ile Pro Leu Pro Ala Asn Asp Asp Val Tyr Thr Arg Tyr Thr Arg Thr
385 390 395 400
Tyr His Tyr Asp Arg Gly Gly Asn Leu Cys Gln Ile Arg His Cys Ala
405 410 415
Pro Ala Thr Asp Asn Lys Tyr Thr Thr Lys Ile Thr Val Ser Asn Arg
420 425 430
Ser Asn Arg Ala Val Trp Asp Thr Leu Thr Thr Asp Pro Ala Lys Val
435 440 445
Asp Thr Leu Phe Asp His Gly Gly His Gln Leu Gln Leu Gln Ser Gly
450 455 460
Gln Thr Leu Cys Trp Asn Tyr Arg Gly Glu Leu Gln Gln Ile Thr Lys
465 470 475 480
Ile Gln Arg Asp Glu Lys Pro Ala Asp Lys Glu Arg Tyr Arg Tyr Gly
485 490 495
Val Gly Ala Ala Arg Val Val Lys Ile Ser Thr Gln Gln Ala Gly Gly
500 505 510
Ser Ser His Val Gln Arg Val Val Tyr Leu Pro Gly Leu Glu Leu Arg
515 520 525
Thr Thr Gln His Asp Ala Thr Leu Ile Glu Asp Leu Gln Val Ile Ile
530 535 540
Met Gly Glu Ala Gly Arg Ala Gln Val Arg Val Leu His Trp Glu Ile
545 550 555 560
Pro Pro Pro Asp Asn Leu Asn Asn Asp Ser Leu Arg Tyr Ser Tyr Asp
565 570 575
Ser Leu Met Gly Ser Ser Gln Leu Glu Leu Asp Gly Ala Gly Gln Ile
580 585 590
Ile Thr Gln Glu Glu Tyr Tyr Pro Tyr Gly Gly Thr Ala Ile Trp Ala
595 600 605
Ala Arg Asn Gln Thr Glu Ala Asn Tyr Lys Thr Ile Arg Tyr Ser Gly
610 615 620
Lys Glu Arg Asp Ala Thr Gly Leu Tyr Tyr Tyr Gly His Arg Tyr Tyr
625 630 635 640
Gln Pro Trp Leu Gly Arg Trp Leu Ser Ala Asp Pro Ala Gly Thr Val
645 650 655
Asp Gly Leu Asn Leu Tyr Arg Met Val Arg Asn Asn Pro Ile Thr Tyr
660 665 670
Arg Asp Ala Asp Gly Leu Ala Pro Ile Gly Asp Lys Ile Ser Glu Gly
675 680 685
Ile Tyr Glu Pro Glu Leu Arg Val Gly Leu Glu Arg Asp Asp Pro Asn
690 695 700
Val Arg Asp Tyr Asp Arg Val Tyr Pro Asp Thr Ala Lys Thr Glu Met
705 710 715 720
Ile Glu Ala Thr Ala Thr Thr Ile Ala Pro Ser Gln Met Leu Ser Ala
725 730 735
His Ala Phe Ala Ser Val Pro Ile Leu Thr Asp Leu Phe Asn Pro Gln
740 745 750
Thr Ala Arg Leu Ser Gln Lys Thr Thr Asp Ile Val Leu Asn Thr Gln
755 760 765
Gly Gly Gly Asp Leu Ile Phe Thr Gly Met Asn Ile Lys Gly Lys Gly
770 775 780
Lys Glu Phe Asn Ala Leu Lys Ile Val Asp Thr Tyr Gly Gly Glu Met
785 790 795 800
Pro Asp Ser Lys Thr Ala Ile Ser Ala Tyr Trp Leu Pro Gln Gly Gly
805 810 815
Tyr Thr Asp Ile Pro Ile His Pro Thr Gly Ile Gln Lys Tyr Leu Phe
820 825 830
Thr Pro Ala Phe Ser Gly Cys Thr Leu Ala Val Asp Lys Leu Asn Glu
835 840 845
Asn Thr Leu Arg Ala Tyr His Val Glu Gly Ser Lys Glu Asp Ala Gln
850 855 860
Tyr Asn Asn Leu Ala Val Ala Ala His Gly Glu Gly Leu Val Met Ala
865 870 875 880
Met Glu Phe Pro Asp Tyr Gly Phe His Thr Asp Lys Thr Gly Gln Arg
885 890 895
Leu Arg Asn Thr Gln Gly Phe Ala Phe Met Ser Tyr Asn Gln Ser Gln
900 905 910
Lys Lys Trp Glu Ile His Tyr Gln Arg Gln Ala Leu Thr Ser Asn Thr
915 920 925
Gly Ile Met Asn Val Ser Ala Lys Asn Lys Ile Arg Leu Asn Ala Pro
930 935 940
Ser His Val Lys Asn Ser Ser Ile Lys Gly Thr Glu Ile Met Thr Thr
945 950 955 960
His Phe
<210>52
<211>4595
<212>DNA
<213>Xenorhabdus bovienii
<400>52
tctagaacta gtgtcgacta aagaagaagg agatatacca tgaaacaaga cagccaggac 60
atgacagtaa cacagctgtc cctgcccaaa gggggcggtg cgatcagtgg catgggtgac 120
actatcagca atgcagggcc ggatgggatg gcttcgcttt ccgtgccttt gcctatctct 180
gccggtcggg ggggcgcacc gaatttatcc ctgaactaca gtagcggagc aggaaacggg 240
tcatttggta ttggctggca atccagtacc atggctatca gccgtcgtac tcaacatggc 300
gtaccgcaat atcacggcga agatactttt ttatgtccga tgggagaagt gatggcggtt 360
gccgtcaatc agagcgggca acccgatgtg cgtaaaaccg ataaactatt aggcgggcaa 420
ctgcctgtta cttataccgt tacgcgtcat cagcccagaa atattcagca cttcagcaaa 480
cttgaatact ggcagccccc aacggatgtg gaaaccacgc ctttttggtt aatgtattca 540
cccgatggac aaattcacat tttcggaaaa actgagcagg ctcagatcgc taacccggca 600
gaggtttcac agattgccca atggcttttg gaagaaaccg taacaccagc gggagaacac 660
atttattacc agtatcgggc agaagacgat atcggttgtg atgacagcga aaaaaatgcc 720
caccctaatg ccagtgctca acgttatttg actcaggtga actacggcaa tattacacct 780
gaatccagcc tgcttgtgct gaagaatacg ccaccggcgg ataacgaatg gctattccat 840
ttggtttttg attatggtga acgagcgcag gaaataaaca cggttcctcc tttcaaagca 900
ccttcaaaca actggaaaat acggccagac cgtttctccc gctttgaata tggttttgag 960
gtgcgaaccc gccgcctgtg tcaacaaatt ctgatgttcc atcgcctgaa atcccttgca 1020
ggagaacaga ttgacggaga agaaatccct gccttggttg cccgtctgct tctcagttat 1080
gacctgaacg acagcgtgac aacccttacc gccattcggc aaatggcgta tgaaactgac 1140
gcaaccttaa tcgctttacc gccactggag tttgactatc agccctttga ggcaaaagtc 1200
acgcagaaat ggcaggaaat gcctcaattg gccggattga atgcccaaca accttaccaa 1260
ctcgtcgatc tctatggtga aggtatctcc ggcatcttgt atcaggacag acccggagca 1320
tggtggtatc aggcaccgat ccgtcagaaa aacgttgaag atattaacgc tgtcacctat 1380
agcccaataa accccttacc taagatcccc agccagcagg acagagcaac gttgatggat 1440
atcgacggtg atggacatct ggattgggtg atcgctggcg caggtattca ggggcggtac 1500
agtatgcagc cgaatggaga gtggacacac tttattccca tttctgcact gccaacagaa 1560
tattttcatc cacaggcaca actggcggat ctggtggggg ccgggttatc tgatttagcg 1620
ctgattggcc ccagaagtgt gcgtttatat gccaacgacc gaggaaactg gaaagcgggt 1680
attaatgtta tgccacctga tggtgtgaat ttgccgatat ttggtggtga tgccagcagt 1740
ctggtcgcat tttctgacat gttgggatcg ggacagcagc atttggtgga aattgccgct 1800
cagagcgtca aatgctggcc gaatctagga catggccgtt ttggtgcggc tattttgctg 1860
ccggggttta gccagccgaa tggaacattc aatgctaacc aagtttttct ggcagatatc 1920
gatggttccg gcaccgccga catcatctat gcacacagta cgtatctgga tatttacctg 1980
aacgaaagcg gcaaccgttt cagtgcaccc gttcggctta atttgccgga aggggtgatg 2040
tttgacaata cctgtcagtt acaggtgtcg gatattcaag gattgggcgc tgccagcatt 2100
gtactgaccg tacctcatat gacaccgcgc cattggcgtt atgattttac tcacaataaa 2160
ccttggctgc tcaatgtcat caacaacaat cgtggcgcag aaaccacgtt gttttaccgt 2220
agttctgccc aattctggct ggatgaaaaa agtcagatcg aagagctggg aaaatttgca 2280
gcgagttatc tgcctttccc catacatttg ttgtggcgca atgaggcgct ggatgaaatt 2340
actggtaatc gactgactaa ggtcatgaat tatgcccacg gtgcatggga tggcagagag 2400
agagaatttt gcggatttgg ccgtgtaacg caaattgata ccgacgaatt tgccaaggga 2460
accacagaga aagcgccgga tgaaaatatc tatccttccc gtagcataag ctggtttgcc 2520
acgggtttac cagaagtgga ttctcaactt ccggcagaat actggcgtgg tgacgatcag 2580
gcatttgccg gctttacacc gcgcttcact cgttatgaaa aaggtaatgc ggggcaagag 2640
gggcaggata ccccgattaa agaaccgacc gaaacagaag cgtattggct taaccgcgcc 2700
atgaaaggcc aattactgcg cagtgaagtc tatggtgacg acaaaacaga aaaagctaaa 2760
attccgtaca ccgtcacaga agctcgctgt caggtcagat taattcccag caatgacgaa 2820
gccgcgccgt cgtcttggac gtcgatcatt gaaaaccgca gttatcacta tgagcgtatc 2880
gtcgtcgatc cgagttgcaa acaacaggtc gtgctcaagg cggatgaata tggcttccca 2940
ctggcaaaag tagatatcgc ctatccacgg cgcaataaac cggcacagaa cccttatccg 3000
gattcgttac cggatactct gttcgccgat agctatgacg accagcaaaa acagttatat 3060
ctgacaaaac agcagcagag ctattaccac ctgacccagc aggatgattg ggttctgggt 3120
ttgacggata gccgatacag cgaagtttat cattatgcgc aaactgacgc tcaaagtgac 3180
atccccaagg cagggctgat attggaagac ctgctgaaag ttgacggcct gataggtaaa 3240
gacaagactt ttatctattt agggcagcag cgagtggctt atgtgggagg agatgcagaa 3300
aaaccgacac gtcaggtgcg ggtggcttat acagaaaccg ctgcttttga tgacaatgcg 3360
ctgcacgcct ttgatggcgt gattgcccct gatgaactga cgcaacagtt gctggcgggt 3420
ggatacctgc tcgtgccgca gatttctgat gtggcaggca gtagtgaaaa ggtatgggta 3480
gctcggcagg gatacaccga atacggcagt gctgctcaat tctaccggcc actcatccag 3540
cgcaaaagct tgctgaccgg aaaatatacc cttagttggg atacgcacta ttgtgtggtg 3600
gtaaaaaccg aagatggtgc gggaatgacc acgcaagcga agtacgatta ccgcttcctg 3660
cttccggcgc aattgacaga tatcaatgac aaccagcaca tcgtgacatt taatgcattg 3720
gggcaggtga cttccagccg tttctggggc acagaaaatg gcaaaataag cggttactcg 3780
acgccggaga gtaaaccgtt cacagtaccc gataccgtcg aaaaagccct tgccttgcaa 3840
ccgacgatcc cggtttcaca gtgcaacatt tatgtgccgg atagttggat gcggcttctg 3900
ccccaacagt ctctgactgg ccagctaaaa gagggggaaa ctttgtggaa cgcattacac 3960
cgggcgggtg tagtaacgga agacggtttg atctgtgaac tggcctatcg tcgttggatc 4020
aaacgtcagg caacgtcttc aatgatggcc gtgacattac agcaaatctt ggctcagact 4080
ccacgacaac ctccgcatgc catgacgatc acgacagatc gttatgacag cgattctcag 4140
cagcaacttc ggcagtcgat agtattgagt gatggttttg gtcgcgtatt gcaaagcgcc 4200
cagcgtcatg aagcaggaga ggcatggcag cgtgcagaag atggttcttt ggttgtcgat 4260
aataccggta aacccgttgt tgctaatacc acaacgcgct gggcagtatc cggtcgcaca 4320
gaatacgacg gcaaagggca ggcgatcaga gcttacctgc cttattatct caatgattgg 4380
cgctatgtca gtgatgacag cgcccgggat gacctgtacg ccgataccca tttttacgat 4440
cctctggggc gtgaatatca ggtaaaaacc gcgaaaggat tttggcgtga aaacatgttt 4500
atgccgtggt ttgtcgtcaa tgaagatgaa aatgacacag cagcacgttt aacatcttaa 4560
ttaatgcggc cgcaggcctc tgtaagactc tcgag 4595
<210>53
<211>2947
<212>DNA
<213>Xenorhabdus bovienii
<400>53
tctagaacta gtaggcctta aagaagagag agatatacca tgaatgtttt taatccaact 60
ttatatgccg gtacaccgac tgtcaccgtc atggacaatc gagggctgtc agtgcgggat 120
attgcttatc accgtacaac agcaggagag caggctgaca ctcgcatcac ccgccatcaa 180
tacagtcccc ataatttttt aatcgagagc attgatccac gcctttttga tttgcaatct 240
cagagcacca taaaacctaa tttcacctac tgtcctgcct tgaagggtga tgtcctacgg 300
acagagagtg tggatgccgg acaaactgtc attttgagtg acatcgaagg tcgtccgtta 360
ctgaatatca gtgcgatggg tgtcgtcaaa cactggcaat atgaagagag tacattgccg 420
gggcgcttgc tcgctgtcag tgaacggaag aatgaggctt caacacccca aattattgaa 480
cggtttattt ggtcgggaaa tagcccatca gaaaaagatc acaatttggc gggaaaatat 540
cttcgtcatt atgataccgc cggattaaac cagcttaatg ctgtgtctct gaccagcgtg 600
gatctctcac aatcccgtca gttattgcag gatgatgtca cagcagattg gagcggaagt 660
gacgaatccc agtggaagac gcgactgagt aacgacatat tcacaaccga aatcaccgct 720
gatgcggttg gcaatttctt gactcagaat gatgccaaaa gcaaccagca acgattgtcc 780
tatgatgtgg cagggcagtt aaaggcaagc tggctgacga taaaaggcca gaatgagcag 840
gtgatagtta actccctgac ttactccgcc gcagggcaga aactgcgtga agagcagggt 900
aacggcgttg tcactgaata ctcctatgaa gcacaaacct ggcgtttgat aggtgtaacg 960
gcttaccgtc agtcagataa aaaaagattg caggatcttg tctataacta tgatccggtc 1020
ggtaatctcc tgaatattcg caataatgca gaggcaaccc gtttctggcg taatcagata 1080
gtagaaccag agaaccacta tgcttatgac tcgctttatc aactcatcag tgctagtggt 1140
cgagaaatcg ccagtatcgg tcagcagggc agccggctgc ctgtaccgat tattcctctt 1200
cctgccaatg acgatgttta tactcgctac acccgcacat atcactatga tcgcggtgga 1260
aatctctgcc agatccggca ttgcgctcct gctacagata ataagtacac cacaaagatc 1320
accgtatcga atcgtagtaa tcgtgcagta tgggatacct tgaccacaga tcccgccaaa 1380
gtggataccc tgtttgatca tggagggcat caacttcaac tccagtcagg ccagacttta 1440
tgttggaact atcggggtga actacagcaa ataacaaaga tacagcgtga cgaaaaaccc 1500
gcagataaag agcggtatcg ctatggtgtt ggggctgcgc gggtcgtgaa aatcagcaca 1560
cagcaggcgg ggggaagcag ccatgtgcag cgtgttgttt atctgccggg gttggaacta 1620
cgcacaactc agcatgatgc gacattaatc gaagacttac aggtgattat catgggtgaa 1680
gcaggacgtg ctcaggtacg cgtacttcat tgggaaatac caccaccgga taatcttaac 1740
aatgactcac tgcgttacag ctacgatagt ttgatgggtt ccagtcagct tgaattggat 1800
ggagcagggc agattattac gcaggaagaa tactacccct atggaggtac agcaatatgg 1860
gcggcaagaa accagaccga agccaattac aaaaccattc gctactccgg caaagagcgt 1920
gatgcgacgg ggctttatta ctacgggcac cgttattatc agccgtggct agggcgctgg 1980
ttgagcgcag atcccgccgg aaccgtggac ggactgaatc tatatcgaat ggtgaggaat 2040
aacccgatta cttaccggga tgcagatggg cttgcgccga taggcgataa gatcagcgaa 2100
gggatttatg agcctgagtt gcgagttggt cttgaacgag atgacccaaa tgtcagagat 2160
tatgaccggg tttatcctga tacggccaag acagagatga tcgaagcaac tgcgaccaca 2220
attgctccca gtcaaatgtt atcggcgcat gcttttgcat ctgtacctat attgacagat 2280
ttgtttaatc ctcaaacagc aaggctttct caaaagacaa cggatattgt attaaacaca 2340
caaggtggag gcgatttaat ctttactggc atgaatatta aaggtaaggg aaaagaattt 2400
aatgcattaa aaatcgttga tacttatggc ggagaaatgc ctgatagcaa aaccgctatt 2460
tcagcatatt ggcttccgca aggtgggtat actgatattc cgatacatcc gactggaata 2520
caaaagtatt tgtttacgcc tgcgtttagt ggttgcactc tggcagtaga taagcttaac 2580
gaaaatacat tacgggcgta tcacgtcgaa ggaagtaagg aagatgctca atataataat 2640
ttagcagttg cagcgcacgg agagggtttg gtcatggcta tggaatttcc tgactatgga 2700
tttcatacag acaaaacagg gcaaagacta aggaacacac agggatttgc gtttatgtcc 2760
tacaatcaat cccagaaaaa atgggaaatt cattatcaaa ggcaagcatt gacatcaaac 2820
accggtatca tgaatgttag tgctaaaaac aagattcgat tgaatgcccc cagtcatgta 2880
aaaaatagct caatcaaagg aactgaaata atgacgacac atttttaatt aatgcggccg 2940
cctcgag 2947
<210>54
<211>7508
<212>DNA
<213>Xenorhabdus bovienii
<400>54
tctagaacta gtgtcgacta aagaagaagg agatatacca tgaaacaaga cagccaggac 60
atgacagtaa cacagctgtc cctgcccaaa gggggcggtg cgatcagtgg catgggtgac 120
actatcagca atgcagggcc ggatgggatg gcttcgcttt ccgtgccttt gcctatctct 180
gccggtcggg ggggcgcacc gaatttatcc ctgaactaca gtagcggagc aggaaacggg 240
tcatttggta ttggctggca atccagtacc atggctatca gccgtcgtac tcaacatggc 300
gtaccgcaat atcacggcga agatactttt ttatgtccga tgggagaagt gatggcggtt 360
gccgtcaatc agagcgggca acccgatgtg cgtaaaaccg ataaactatt aggcgggcaa 420
ctgcctgtta cttataccgt tacgcgtcat cagcccagaa atattcagca cttcagcaaa 480
cttgaatact ggcagccccc aacggatgtg gaaaccacgc ctttttggtt aatgtattca 540
cccgatggac aaattcacat tttcggaaaa actgagcagg ctcagatcgc taacccggca 600
gaggtttcac agattgccca atggcttttg gaagaaaccg taacaccagc gggagaacac 660
atttattacc agtatcgggc agaagacgat atcggttgtg atgacagcga aaaaaatgcc 720
caccctaatg ccagtgctca acgttatttg actcaggtga actacggcaa tattacacct 780
gaatccagcc tgcttgtgct gaagaatacg ccaccggcgg ataacgaatg gctattccat 840
ttggtttttg attatggtga acgagcgcag gaaataaaca cggttcctcc tttcaaagca 900
ccttcaaaca actggaaaat acggccagac cgtttctccc gctttgaata tggttttgag 960
gtgcgaaccc gccgcctgtg tcaacaaatt ctgatgttcc atcgcctgaa atcccttgca 1020
ggagaacaga ttgacggaga agaaatccct gccttggttg cccgtctgct tctcagttat 1080
gacctgaacg acagcgtgac aacccttacc gccattcggc aaatggcgta tgaaactgac 1140
gcaaccttaa tcgctttacc gccactggag tttgactatc agccctttga ggcaaaagtc 1200
acgcagaaat ggcaggaaat gcctcaattg gccggattga atgcccaaca accttaccaa 1260
ctcgtcgatc tctatggtga aggtatctcc ggcatcttgt atcaggacag acccggagca 1320
tggtggtatc aggcaccgat ccgtcagaaa aacgttgaag atattaacgc tgtcacctat 1380
agcccaataa accccttacc taagatcccc agccagcagg acagagcaac gttgatggat 1440
atcgacggtg atggacatct ggattgggtg atcgctggcg caggtattca ggggcggtac 1500
agtatgcagc cgaatggaga gtggacacac tttattccca tttctgcact gccaacagaa 1560
tattttcatc cacaggcaca actggcggat ctggtggggg ccgggttatc tgatttagcg 1620
ctgattggcc ccagaagtgt gcgtttatat gccaacgacc gaggaaactg gaaagcgggt 1680
attaatgtta tgccacctga tggtgtgaat ttgccgatat ttggtggtga tgccagcagt 1740
ctggtcgcat tttctgacat gttgggatcg ggacagcagc atttggtgga aattgccgct 1800
cagagcgtca aatgctggcc gaatctagga catggccgtt ttggtgcggc tattttgctg 1860
ccggggttta gccagccgaa tggaacattc aatgctaacc aagtttttct ggcagatatc 1920
gatggttccg gcaccgccga catcatctat gcacacagta cgtatctgga tatttacctg 1980
aacgaaagcg gcaaccgttt cagtgcaccc gttcggctta atttgccgga aggggtgatg 2040
tttgacaata cctgtcagtt acaggtgtcg gatattcaag gattgggcgc tgccagcatt 2100
gtactgaccg tacctcatat gacaccgcgc cattggcgtt atgattttac tcacaataaa 2160
ccttggctgc tcaatgtcat caacaacaat cgtggcgcag aaaccacgtt gttttaccgt 2220
agttctgccc aattctggct ggatgaaaaa agtcagatcg aagagctggg aaaatttgca 2280
gcgagttatc tgcctttccc catacatttg ttgtggcgca atgaggcgct ggatgaaatt 2340
actggtaatc gactgactaa ggtcatgaat tatgcccacg gtgcatggga tggcagagag 2400
agagaatttt gcggatttgg ccgtgtaacg caaattgata ccgacgaatt tgccaaggga 2460
accacagaga aagcgccgga tgaaaatatc tatccttccc gtagcataag ctggtttgcc 2520
acgggtttac cagaagtgga ttctcaactt ccggcagaat actggcgtgg tgacgatcag 2580
gcatttgccg gctttacacc gcgcttcact cgttatgaaa aaggtaatgc ggggcaagag 2640
gggcaggata ccccgattaa agaaccgacc gaaacagaag cgtattggct taaccgcgcc 2700
atgaaaggcc aattactgcg cagtgaagtc tatggtgacg acaaaacaga aaaagctaaa 2760
attccgtaca ccgtcacaga agctcgctgt caggtcagat taattcccag caatgacgaa 2820
gccgcgccgt cgtcttggac gtcgatcatt gaaaaccgca gttatcacta tgagcgtatc 2880
gtcgtcgatc cgagttgcaa acaacaggtc gtgctcaagg cggatgaata tggcttccca 2940
ctggcaaaag tagatatcgc ctatccacgg cgcaataaac cggcacagaa cccttatccg 3000
gattcgttac cggatactct gttcgccgat agctatgacg accagcaaaa acagttatat 3060
ctgacaaaac agcagcagag ctattaccac ctgacccagc aggatgattg ggttctgggt 3120
ttgacggata gccgatacag cgaagtttat cattatgcgc aaactgacgc tcaaagtgac 3180
atccccaagg cagggctgat attggaagac ctgctgaaag ttgacggcct gataggtaaa 3240
gacaagactt ttatctattt agggcagcag cgagtggctt atgtgggagg agatgcagaa 3300
aaaccgacac gtcaggtgcg ggtggcttat acagaaaccg ctgcttttga tgacaatgcg 3360
ctgcacgcct ttgatggcgt gattgcccct gatgaactga cgcaacagtt gctggcgggt 3420
ggatacctgc tcgtgccgca gatttctgat gtggcaggca gtagtgaaaa ggtatgggta 3480
gctcggcagg gatacaccga atacggcagt gctgctcaat tctaccggcc actcatccag 3540
cgcaaaagct tgctgaccgg aaaatatacc cttagttggg atacgcacta ttgtgtggtg 3600
gtaaaaaccg aagatggtgc gggaatgacc acgcaagcga agtacgatta ccgcttcctg 3660
cttccggcgc aattgacaga tatcaatgac aaccagcaca tcgtgacatt taatgcattg 3720
gggcaggtga cttccagccg tttctggggc acagaaaatg gcaaaataag cggttactcg 3780
acgccggaga gtaaaccgtt cacagtaccc gataccgtcg aaaaagccct tgccttgcaa 3840
ccgacgatcc cggtttcaca gtgcaacatt tatgtgccgg atagttggat gcggcttctg 3900
ccccaacagt ctctgactgg ccagctaaaa gagggggaaa ctttgtggaa cgcattacac 3960
cgggcgggtg tagtaacgga agacggtttg atctgtgaac tggcctatcg tcgttggatc 4020
aaacgtcagg caacgtcttc aatgatggcc gtgacattac agcaaatctt ggctcagact 4080
ccacgacaac ctccgcatgc catgacgatc acgacagatc gttatgacag cgattctcag 4140
cagcaacttc ggcagtcgat agtattgagt gatggttttg gtcgcgtatt gcaaagcgcc 4200
cagcgtcatg aagcaggaga ggcatggcag cgtgcagaag atggttcttt ggttgtcgat 4260
aataccggta aacccgttgt tgctaatacc acaacgcgct gggcagtatc cggtcgcaca 4320
gaatacgacg gcaaagggca ggcgatcaga gcttacctgc cttattatct caatgattgg 4380
cgctatgtca gtgatgacag cgcccgggat gacctgtacg ccgataccca tttttacgat 4440
cctctggggc gtgaatatca ggtaaaaacc gcgaaaggat tttggcgtga aaacatgttt 4500
atgccgtggt ttgtcgtcaa tgaagatgaa aatgacacag cagcacgttt aacatcttaa 4560
ttaatgcggc cgcaggcctt aaagaagaga gagatatacc atgaatgttt ttaatccaac 4620
tttatatgcc ggtacaccga ctgtcaccgt catggacaat cgagggctgt cagtgcggga 4680
tattgcttat caccgtacaa cagcaggaga gcaggctgac actcgcatca cccgccatca 4740
atacagtccc cataattttt taatcgagag cattgatcca cgcctttttg atttgcaatc 4800
tcagagcacc ataaaaccta atttcaccta ctgtcctgcc ttgaagggtg atgtcctacg 4860
gacagagagt gtggatgccg gacaaactgt cattttgagt gacatcgaag gtcgtccgtt 4920
actgaatatc agtgcgatgg gtgtcgtcaa acactggcaa tatgaagaga gtacattgcc 4980
ggggcgcttg ctcgctgtca gtgaacggaa gaatgaggct tcaacacccc aaattattga 5040
acggtttatt tggtcgggaa atagcccatc agaaaaagat cacaatttgg cgggaaaata 5100
tcttcgtcat tatgataccg ccggattaaa ccagcttaat gctgtgtctc tgaccagcgt 5160
ggatctctca caatcccgtc agttattgca ggatgatgtc acagcagatt ggagcggaag 5220
tgacgaatcc cagtggaaga cgcgactgag taacgacata ttcacaaccg aaatcaccgc 5280
tgatgcggtt ggcaatttct tgactcagaa tgatgccaaa agcaaccagc aacgattgtc 5340
ctatgatgtg gcagggcagt taaaggcaag ctggctgacg ataaaaggcc agaatgagca 5400
ggtgatagtt aactccctga cttactccgc cgcagggcag aaactgcgtg aagagcaggg 5460
taacggcgtt gtcactgaat actcctatga agcacaaacc tggcgtttga taggtgtaac 5520
ggcttaccgt cagtcagata aaaaaagatt gcaggatctt gtctataact atgatccggt 5580
cggtaatctc ctgaatattc gcaataatgc agaggcaacc cgtttctggc gtaatcagat 5640
agtagaacca gagaaccact atgcttatga ctcgctttat caactcatca gtgctagtgg 5700
tcgagaaatc gccagtatcg gtcagcaggg cagccggctg cctgtaccga ttattcctct 5760
tcctgccaat gacgatgttt atactcgcta cacccgcaca tatcactatg atcgcggtgg 5820
aaatctctgc cagatccggc attgcgctcc tgctacagat aataagtaca ccacaaagat 5880
caccgtatcg aatcgtagta atcgtgcagt atgggatacc ttgaccacag atcccgccaa 5940
agtggatacc ctgtttgatc atggagggca tcaacttcaa ctccagtcag gccagacttt 6000
atgttggaac tatcggggtg aactacagca aataacaaag atacagcgtg acgaaaaacc 6060
cgcagataaa gagcggtatc gctatggtgt tggggctgcg cgggtcgtga aaatcagcac 6120
acagcaggcg gggggaagca gccatgtgca gcgtgttgtt tatctgccgg ggttggaact 6180
acgcacaact cagcatgatg cgacattaat cgaagactta caggtgatta tcatgggtga 6240
agcaggacgt gctcaggtac gcgtacttca ttgggaaata ccaccaccgg ataatcttaa 6300
caatgactca ctgcgttaca gctacgatag tttgatgggt tccagtcagc ttgaattgga 6360
tggagcaggg cagattatta cgcaggaaga atactacccc tatggaggta cagcaatatg 6420
ggcggcaaga aaccagaccg aagccaatta caaaaccatt cgctactccg gcaaagagcg 6480
tgatgcgacg gggctttatt actacgggca ccgttattat cagccgtggc tagggcgctg 6540
gttgagcgca gatcccgccg gaaccgtgga cggactgaat ctatatcgaa tggtgaggaa 6600
taacccgatt acttaccggg atgcagatgg gcttgcgccg ataggcgata agatcagcga 6660
agggatttat gagcctgagt tgcgagttgg tcttgaacga gatgacccaa atgtcagaga 6720
ttatgaccgg gtttatcctg atacggccaa gacagagatg atcgaagcaa ctgcgaccac 6780
aattgctccc agtcaaatgt tatcggcgca tgcttttgca tctgtaccta tattgacaga 6840
tttgtttaat cctcaaacag caaggctttc tcaaaagaca acggatattg tattaaacac 6900
acaaggtgga ggcgatttaa tctttactgg catgaatatt aaaggtaagg gaaaagaatt 6960
taatgcatta aaaatcgttg atacttatgg cggagaaatg cctgatagca aaaccgctat 7020
ttcagcatat tggcttccgc aaggtgggta tactgatatt ccgatacatc cgactggaat 7080
acaaaagtat ttgtttacgc ctgcgtttag tggttgcact ctggcagtag ataagcttaa 7140
cgaaaataca ttacgggcgt atcacgtcga aggaagtaag gaagatgctc aatataataa 7200
tttagcagtt gcagcgcacg gagagggttt ggtcatggct atggaatttc ctgactatgg 7260
atttcataca gacaaaacag ggcaaagact aaggaacaca cagggatttg cgtttatgtc 7320
ctacaatcaa tcccagaaaa aatgggaaat tcattatcaa aggcaagcat tgacatcaaa 7380
caccggtatc atgaatgtta gtgctaaaaa caagattcga ttgaatgccc ccagtcatgt 7440
aaaaaatagc tcaatcaaag gaactgaaat aatgacgaca catttttaat taatgcggcc 7500
gcctcgag 7508
<210>55
<211>2862
<212>DNA
<213>Paenibacillus strain DAS1529
<400>55
atgaaaatga taccgtggac tcaccattat ttgcttcacc gcctgcgcgg tgagatggag 60
gttaaaccta tgaacacaac gtccatatat aggggcacgc ctacgatttc agttgtggat 120
aaccggaact tggagattcg cattcttcag tataaccgta tcgcggctga agatccggca 180
gatgagtgta tcctgcggaa cacgtatacg ccgttaagct atcttggcag cagcatggat 240
ccccgtttgt tctcgcaata tcaggatgat cgcggaacac cgccgaatat acgaaccatg 300
gcttccctga gaggcgaagc gctgtgttcg gaaagtgtgg atgccggccg caaggcggag 360
ctttttgata tcgaggggcg gcccgtctgg cttatcgatg ccaacggcac agagacgact 420
ctcgaatatg atgtcttagg caggccaaca gccgtattcg agcaacagga aggtacggac 480
tccccccagt gcagggagcg gtttatttat ggtgagaagg aggcggatgc ccaggccaac 540
aatttgcgcg gacaactggt tcgccactac gataccgcgg gccggataca gaccgacagc 600
atctccttgg ctggactgcc gttgcgccaa agccgtcaac tgctgaaaaa ttgggatgaa 660
cctggcgact ggagtatgga tgaggaaagc gcctgggcct cgttgctggc tgccgaagct 720
tatgatacga gctggcggta tgacgcgcag gacagggtgc tcgcccaaac cgacgccaaa 780
gggaatctcc agcaactgac ttacaatgac gccggccagc cgcaggcggt cagcctcaag 840
ctgcaaggcc aagcggagca acggatttgg aaccggatcg agtacaacgc ggcgggtcaa 900
gtggatctcg ccgaagccgg gaatggaatc gtaacggaat atacttacga ggaaagcacg 960
cagcggttaa tccgaaaaaa agattcccgc ggactgtcct ccggggaaag agaagtgctg 1020
caggattatc gttatgaata tgatccggta ggcaatatcc tttctattta caatgaagcg 1080
gagccggttc gttatttccg caatcaggcc gttgctccga aaaggcaata tgcctacgat 1140
gccttgtatc agcttgtatc tagttcgggg cgggaatccg acgcgcttcg gcagcagacg 1200
tcgcttcctc ccttgatcac gcctatccct ctggacgata gccaatacgt caattacgct 1260
gaaaaataca gctatgatca ggcgggcaat ttaatcaagc ttagccataa cggggcaagt 1320
caatatacaa cgaatgtgta tgtggacaaa agctcaaacc gggggatttg gcggcaaggg 1380
gaagacatcc cggatatcgc ggcttccttt gacagagcag gcaatcaaca agctttattc 1440
ccggggagac cgttggaatg ggatacacgc aatcaattaa gccgtgtcca tatggtcgtg 1500
cgcgaaggcg gagacaacga ctgggaaggc tatctctatg acagctcggg aatgcgtatc 1560
gtaaaacgat ctacccgcaa aacacagaca acgacgcaaa cggatacgac cctctatttg 1620
ccgggcctgg agctgcgaat ccgccagacc ggggaccggg tcacggaagc attgcaggtc 1680
attaccgtgg atgagggagc gggacaagtg agggtgctgc actgggagga tggaaccgag 1740
ccgggcggca tcgccaatga tcagtaccgg tacagcctga acgatcatct tacctcctct 1800
ttattggaag ttgacgggca aggtcagatc attagtaagg aagaatttta tccctatggc 1860
ggcacagccc tgtggacagc ccggtcagag gtagaggcaa gctacaagac catccgctat 1920
tcaggcaaag agcgggatgc cacaggcctg tattattacg gacaccgcta ctatatgcca 1980
tggttgggtc gctggctgaa tccggacccg gccggaatgg tagatggact aaacctgtac 2040
cgtatggtca ggaacaatcc tataggactg atggatccga atgggaatgc gccaatcaac 2100
gtggcggatt atagcttcgt gcatggtgat ttagtttatg gtcttagtaa ggaaagagga 2160
agatatctaa agctatttaa tccaaacttt aatatggaaa aatcagactc tcctgctatg 2220
gttatagatc aatataataa taatgttgca ttgagtataa ctaaccaata taaagtagaa 2280
gaattgatga aatttcaaaa agacccacaa aaagccgcac ggaaaataaa ggttccagaa 2340
gggaatcgtt tatcgaggaa cgaaaattat cctttgtggc acgattatat taacattgga 2400
gaagctaaag ctgcatttaa ggcctctcat attttccaag aagtgaaggg gaattatggg 2460
aaagattatt atcataaatt attattagac agaatgatag aatcgccgtt gctgtggaaa 2520
cgaggcagca aactcgggct agaaatcgcc gctaccaatc agagaacaaa aatacacttt 2580
gttcttgaca atttaaatat cgagcaggtg gttacgaaag agggtagcgg cggtcagtca 2640
atcacagctt cggagctccg ttatatttat cgaaatcgcg aaagattgaa cgggcgtgtc 2700
attttctata gaaataatga aaggctagat caggctccat ggcaagaaaa tccggactta 2760
tggagcaaat atcaaccggg tcttagacaa agcagcagtt caagagtcaa agaacgaggg 2820
attgggaact ttttccgccg gttttcaatg aagagaaagt aa 2862
<210>56
<211>4458
<212>DNA
<213>Photorhabdus luminescens strain W14
<220>
<221>exon
<222>(1)..(4458)
<400>56
atg cag gat tca cca gaa gta tcg att aca acg ctg tca ctt ccc aaa 48
Met Gln Asp Ser Pro Glu Val Ser Ile Thr Thr Leu Ser Leu Pro Lys
1 5 10 15
ggt ggc ggt gct atc aat ggc atg gga gaa gca ctg aat gct gcc ggc 96
Gly Gly Gly Ala Ile Asn Gly Met Gly Glu Ala Leu Asn Ala Ala Gly
20 25 30
cct gat gga atg gcc tcc cta tct ctg cca tta ccc ctt tcg acc ggc 144
Pro Asp Gly Met Ala Ser Leu Ser Leu Pro Leu Pro Leu Ser Thr Gly
35 40 45
aga ggg acg gct cct gga tta tcg ctg att tac agc aac agt gca ggt 192
Arg Gly Thr Ala Pro Gly Leu Ser Leu Ile Tyr Ser Asn Ser Ala Gly
50 55 60
aat ggg cct ttc ggc atc ggc tgg caa tgc ggt gtt atg tcc att agc 240
Asn Gly Pro Phe Gly Ile Gly Trp Gln Cys Gly Val Met Ser Ile Ser
65 70 75 80
cga cgc acc caa cat ggc att cca caa tac ggt aat gac gac acg ttc 288
Arg Arg Thr Gln His Gly Ile Pro Gln Tyr Gly Asn Asp Asp Thr Phe
85 90 95
cta tcc cca caa ggc gag gtc atg aat atc gcc ctg aat gac caa ggg 336
Leu Ser Pro Gln Gly Glu Val Met Asn Ile Ala Leu Asn Asp Gln Gly
100 105 110
caa cct gat atc cgt caa gac gtt aaa acg ctg caa ggc gtt acc ttg 384
Gln Pro Asp Ile Arg Gln Asp Val Lys Thr Leu Gln Gly Val Thr Leu
115 120 125
cca att tcc tat acc gtg acc cgc tat caa gcc cgc cag atc ctg gat 432
Pro Ile Ser Tyr Thr Val Thr Arg Tyr Gln Ala Arg Gln Ile Leu Asp
130 135 140
ttc agt aaa atc gaa tac tgg caa cct gcc tcc ggt caa gaa gga cgc 480
Phe Ser Lys Ile Glu Tyr Trp Gln Pro Ala Ser Gly Gln Glu Gly Arg
145 150 155 160
gct ttc tgg ctg ata tcg tca ccg gac ggc caa cta cac atc tta ggg 528
Ala Phe Trp Leu Ile Ser Ser Pro Asp Gly Gln Leu His Ile Leu Gly
165 170 175
aaa acc gcg cag gct tgt ctg gca aat ccg caa aat gac caa caa atc 576
Lys Thr Ala Gln Ala Cys Leu Ala Asn Pro Gln Asn Asp Gln Gln Ile
180 185 190
gcc cag tgg ttg ctg gaa gaa act gtg acg cca gcc ggt gaa cat gtc 624
Ala Gln Trp Leu Leu Glu Glu Thr Val Thr Pro Ala Gly Glu His Val
195 200 205
agc tat caa tat cga gcc gaa gat gaa gcc cat tgt gac gac aat gaa 672
Ser Tyr Gln Tyr Arg Ala Glu Asp Glu Ala His Cys Asp Asp Asn Glu
210 215 220
aaa acc gct cat ccc aat gtt acc gca cag cgc tat ctg gta cag gtg 720
Lys Thr Ala His Pro Asn Val Thr Ala Gln Arg Tyr Leu Val Gln Val
225 230 235 240
aac tac ggc aac atc aaa cca caa gcc agc ctg ttc gta ctg gat aac 768
Asn Tyr Gly Asn Ile Lys Pro Gln Ala Ser Leu Phe Val Leu Asp Asn
245 250 255
gca cct ccc gca ccg gaa gag tgg ctg ttt cat ctg gtc ttt gac cac 816
Ala Pro Pro Ala Pro Glu Glu Trp Leu Phe His Leu Val Phe Asp His
260 265 270
ggt gag cgc gat acc tca ctt cat acc gtg cca aca tgg gat gca ggt 864
Gly Glu Arg Asp Thr Ser Leu His Thr Val Pro Thr Trp Asp Ala Gly
275 280 285
aca gcg caa tgg tct gta cgc ccg gat atc ttc tct cgc tat gaa tat 912
Thr Ala Gln Trp Ser Val Arg Pro Asp Ile Phe Ser Arg Tyr Glu Tyr
290 295 300
ggt ttt gaa gtg cgt act cgc cgc tta tgt caa caa gtg ctg atg ttt 960
Gly Phe Glu Val Arg Thr Arg Arg Leu Cys Gln Gln Val Leu Met Phe
305 310 315 320
cac cgc acc gcg ctc atg gcc gga gaa gcc agt acc aat gac gcc ccg 1008
His Arg Thr Ala Leu Met Ala Gly Glu Ala Ser Thr Asn Asp Ala Pro
325 330 335
gaa ctg gtt gga cgc tta ata ctg gaa tat gac aaa aac gcc agc gtc 1056
Glu Leu Val Gly Arg Leu Ile Leu Glu Tyr Asp Lys Asn Ala Ser Val
340 345 350
acc acg ttg att acc atc cgt caa tta agc cat gaa tcg gac ggc agc 1104
Thr Thr Leu Ile Thr Ile Arg Gln Leu Ser His Glu Ser Asp Gly Ser
355 360 365
cca gtc acc cag cca cca cta gaa cta gcc tgg caa cgg ttt gat ctg 1152
Pro Val Thr Gln Pro Pro Leu Glu Leu Ala Trp Gln Arg Phe Asp Leu
370 375 380
gag aaa atg ccg aca tgg caa cgc ttt gac gca cta gat aat ttt aac 1200
Glu Lys Met Pro Thr Trp Gln Arg Phe Asp Ala Leu Asp Asn Phe Asn
385 390 395 400
tcg cag caa cgt tat caa ctg gtt gat ctg cgg gga gaa ggg ttg cca 1248
Ser Gln Gln Arg Tyr Gln Leu Val Asp Leu Arg Gly Glu Gly Leu Pro
405 410 415
ggt atg ctg tat caa gat cga ggc gct tgg tgg tat aaa gct ccg caa 1296
Gly Met Leu Tyr Gln Asp Arg Gly Ala Trp Trp Tyr Lys Ala Pro Gln
420 425 430
cgt cag gaa gac gga gac agc aat gcc gtc act tac gac aaa atc gcc 1344
Arg Gln Glu Asp Gly Asp Ser Asn Ala Val Thr Tyr Asp Lys Ile Ala
435 440 445
cca ctg cct acc cta ccc aat ttg cag gat aat gcc tca ttg atg gat 1392
Pro Leu Pro Thr Leu Pro Asn Leu Gln Asp Asn Ala Ser Leu Met Asp
450 455 460
atc aac gga gac ggc caa ctg gat tgg gtt gtt acc gcc tcc ggt att 1440
Ile Asn Gly Asp Gly Gln Leu Asp Trp Val Val Thr Ala Ser Gly Ile
465 470 475 480
cgc gga tac cat agt cag caa ccc gat gga aag tgg acg cac ttt acg 1488
Arg Gly Tyr His Ser Gln Gln Pro Asp Gly Lys Trp Thr His Phe Thr
485 490 495
cca atc aat gcc ttg ccc gtg gaa tat ttt cat cca agc atc cag ttc 1536
Pro Ile Asn Ala Leu Pro Val Glu Tyr Phe His Pro Ser Ile Gln Phe
500 505 510
gct gac ctt acc ggg gca ggc tta tct gat tta gtg ttg atc ggg ccg 1584
Ala Asp Leu Thr Gly Ala Gly Leu Ser Asp Leu Val Leu Ile Gly Pro
515 520 525
aaa agc gtg cgt cta tat gcc aac cag cga aac ggc tgg cgt aaa gga 1632
Lys Ser Val Arg Leu Tyr Ala Asn Gln Arg Asn Gly Trp Arg Lys Gly
530 535 540
gaa gat gtc ccc caa tcc aca ggt atc acc ctg cct gtc aca ggg acc 1680
Glu Asp Val Pro Gln Ser Thr Gly Ile Thr Leu Pro Val Thr Gly Thr
545 550 555 560
gat gcc cgc aaa ctg gtg gct ttc agt gat atg ctc ggt tcc ggt caa 1728
Asp Ala Arg Lys Leu Val Ala Phe Ser Asp Met Leu Gly Ser Gly Gln
565 570 575
caa cat ctg gtg gaa atc aag gct aat cgc gtc acc tgt tgg ccg aat 1776
Gln His Leu Val Glu Ile Lys Ala Asn Arg Val Thr Cys Trp Pro Asn
580 585 590
cta ggg cat ggc cgt ttc ggt caa cca cta act ctg tca gga ttt agc 1824
Leu Gly His Gly Arg Phe Gly Gln Pro Leu Thr Leu Ser Gly Phe Ser
595 600 605
cag ccc gaa aat agc ttc aat ccc gaa cgg ctg ttt ctg gcg gat atc 1872
Gln Pro Glu Asn Ser Phe Asn Pro Glu Arg Leu Phe Leu Ala Asp Ile
610 615 620
gac ggc tcc ggc acc acc gac ctt atc tat gcg caa tcc ggc tct ttg 1920
Asp Gly Ser Gly Thr Thr Asp Leu Ile Tyr Ala Gln Ser Gly Ser Leu
625 630 635 640
ctc att tat ctc aac caa agt ggt aat cag ttt gat gcc ccg ttg aca 1968
Leu Ile Tyr Leu Asn Gln Ser Gly Asn Gln Phe Asp Ala Pro Leu Thr
645 650 655
tta gcg ttg cca gaa ggc gta caa ttt gac aac act tgc caa ctt caa 2016
Leu Ala Leu Pro Glu Gly Val Gln Phe Asp Asn Thr Cys Gln Leu Gln
660 665 670
gtc gcc gat att cag gga tta ggg ata gcc agc ttg att ctg act gtg 2064
Val Ala Asp Ile Gln Gly Leu Gly Ile Ala Ser Leu Ile Leu Thr Val
675 680 685
cca cat atc gcg cca cat cac tgg cgt tgt gac ctg tca ctg acc aaa 2112
Pro His Ile Ala Pro His His Trp Arg Cys Asp Leu Ser Leu Thr Lys
690 695 700
ccc tgg ttg ttg aat gta atg aac aat aac cgg ggc gca cat cac acg 2160
Pro Trp Leu Leu Asn Val Met Asn Asn Asn Arg Gly Ala His His Thr
705 710 715 720
cta cat tat cgt agt tcc gcg caa ttc tgg ttg gat gaa aaa tta cag 2208
Leu His Tyr Arg Ser Ser Ala Gln Phe Trp Leu Asp Glu Lys Leu Gln
725 730 735
ctc acc aaa gca ggc aaa tct ccg gct tgt tat ctg ccg ttt cca atg 2256
Leu Thr Lys Ala Gly Lys Ser Pro Ala Cys Tyr Leu Pro Phe Pro Met
740 745 750
cat ttg cta tgg tat acc gaa att cag gat gaa atc agc ggc aac cgg 2304
His Leu Leu Trp Tyr Thr Glu Ile Gln Asp Glu Ile Ser Gly Asn Arg
755 760 765
ctc acc agt gaa gtc aac tac agc cac ggc gtc tgg gat ggt aaa gag 2352
Leu Thr Ser Glu Val Asn Tyr Ser His Gly Val Trp Asp Gly Lys Glu
770 775 780
cgg gaa ttc aga gga ttt ggc tgc atc aaa cag aca gat acc aca acg 2400
Arg Glu Phe Arg Gly Phe Gly Cys Ile Lys Gln Thr Asp Thr Thr Thr
785 790 795 800
ttt tct cac ggc acc gcc ccc gaa cag gcg gca ccg tcg ctg agt att 2448
Phe Ser His Gly Thr Ala Pro Glu Gln Ala Ala Pro Ser Leu Ser Ile
805 810 815
agc tgg ttt gcc acc ggc atg gat gaa gta gac agc caa tta gct acg 2496
Ser Trp Phe Ala Thr Gly Met Asp Glu Val Asp Ser Gln Leu Ala Thr
820 825 830
gaa tat tgg cag gca gac acg caa gct tat agc gga ttt gaa acc cgt 2544
Glu Tyr Trp Gln Ala Asp Thr Gln Ala Tyr Ser Gly Phe Glu Thr Arg
835 840 845
tat acc gtc tgg gat cac acc aac cag aca gac caa gca ttt acc ccc 2592
Tyr Thr Val Trp Asp His Thr Asn Gln Thr Asp Gln Ala Phe Thr Pro
850 855 860
aat gag aca caa cgt aac tgg ctg acg cga gcg ctt aaa ggc caa ctg 2640
Asn Glu Thr Gln Arg Asn Trp Leu Thr Arg Ala Leu Lys Gly Gln Leu
865 870 875 880
cta cgc act gag ctc tac ggt ctg gac gga aca gat aag caa aca gtg 2688
Leu Arg Thr Glu Leu Tyr Gly Leu Asp Gly Thr Asp Lys Gln Thr Val
885 890 895
cct tat acc gtc agt gaa tcg cgc tat cag gta cgc tct att ccc gta 2736
Pro Tyr Thr Val Ser Glu Ser Arg Tyr Gln Val Arg Ser Ile Pro Val
900 905 910
aat aaa gaa act gaa tta tct gcc tgg gtg act gct att gaa aat cgc 2784
Asn Lys Glu Thr Glu Leu Ser Ala Trp Val Thr Ala Ile Glu Asn Arg
915 920 925
agc tac cac tat gaa cgt atc atc act gac cca cag ttc agc cag agt 2832
Ser Tyr His Tyr Glu Arg Ile Ile Thr Asp Pro Gln Phe Ser Gln Ser
930 935 940
atc aag ttg caa cac gat atc ttt ggt caa tca ctg caa agt gtc gat 2880
Ile Lys Leu Gln His Asp Ile Phe Gly Gln Ser Leu Gln Ser Val Asp
945 950 955 960
att gcc tgg ccg cgc cgc gaa aaa cca gca gtg aat ccc tac ccg cct 2928
Ile Ala Trp Pro Arg Arg Glu Lys Pro Ala Val Asn Pro Tyr Pro Pro
965 970 975
acc ctg ccg gaa acg cta ttt gac agc agc tat gat gat caa caa caa 2976
Thr Leu Pro Glu Thr Leu Phe Asp Ser Ser Tyr Asp Asp Gln Gln Gln
980 985 990
cta tta cgt ctg gtg aga caa aaa aat agc tgg cat cac ctg act gat 3024
Leu Leu Arg Leu Val Arg Gln Lys Asn Ser Trp His His Leu Thr Asp
995 1000 1005
ggg gaa aac tgg cga tta ggt tta ccg aat gca caa cgc cgt gat 3069
Gly Glu Asn Trp Arg Leu Gly Leu Pro Asn Ala Gln Arg Arg Asp
1010 1015 1020
gtt tat act tat gac cgg agc aaa att cca acc gaa ggg att tcc 3114
Val Tyr Thr Tyr Asp Arg Ser Lys Ile Pro Thr Glu Gly Ile Ser
1025 1030 1035
ctt gaa atc ttg ctg aaa gat gat ggc ctg cta gca gat gaa aaa 3159
Leu Glu Ile Leu Leu Lys Asp Asp Gly Leu Leu Ala Asp Glu Lys
1040 1045 1050
gcg gcc gtt tat ctg gga caa caa cag acg ttt tac acc gcc ggt 3204
Ala Ala Val Tyr Leu Gly Gln Gln Gln Thr Phe Tyr Thr Ala Gly
1055 1060 1065
caa gcg gaa gtc act cta gaa aaa ccc acg tta caa gca ctg gtc 3249
Gln Ala Glu Val Thr Leu Glu Lys Pro Thr Leu Gln Ala Leu Val
1070 1075 1080
gcg ttc caa gaa acc gcc atg atg gac gat acc tca tta cag gcg 3294
Ala Phe Gln Glu Thr Ala Met Met Asp Asp Thr Ser Leu Gln Ala
1085 1090 1095
tat gaa ggc gtg att gaa gag caa gag ttg aat acc gcg ctg aca 3339
Tyr Glu Gly Val Ile Glu Glu Gln Glu Leu Asn Thr Ala Leu Thr
1100 1105 1110
cag gcc ggt tat cag caa gtc gcg cgg ttg ttt aat acc aga tca 3384
Gln Ala Gly Tyr Gln Gln Val Ala Arg Leu Phe Asn Thr Arg Ser
1115 1120 1125
gaa agc ccg gta tgg gcg gca cgg caa ggt tat acc gat tac ggt 3429
Glu Ser Pro Val Trp Ala Ala Arg Gln Gly Tyr Thr Asp Tyr Gly
1130 1135 1140
gac gcc gca cag ttc tgg cgg cct cag gct cag cgt aac tcg ttg 3474
Asp Ala Ala Gln Phe Trp Arg Pro Gln Ala Gln Arg Asn Ser Leu
1145 1150 1155
ctg aca ggg aaa acc aca ctg acc tgg gat acc cat cat tgt gta 3519
Leu Thr Gly Lys Thr Thr Leu Thr Trp Asp Thr His His Cys Val
1160 1165 1170
ata ata cag act caa gat gcc gct gga tta acg acg caa gcc cat 3564
Ile Ile Gln Thr Gln Asp Ala Ala Gly Leu Thr Thr Gln Ala His
1175 1180 1185
tac gat tat cgt ttc ctt aca ccg gta caa ctg aca gat att aat 3609
Tyr Asp Tyr Arg Phe Leu Thr Pro Val Gln Leu Thr Asp Ile Asn
1190 1195 1200
gat aat caa cat att gtg act ctg gac gcg cta ggt cgc gta acc 3654
Asp Asn Gln His Ile Val Thr Leu Asp Ala Leu Gly Arg Val Thr
1205 1210 1215
acc agc cgg ttc tgg ggc aca gag gca gga caa gcc gca ggc tat 3699
Thr Ser Arg Phe Trp Gly Thr Glu Ala Gly Gln Ala Ala Gly Tyr
1220 1225 1230
tcc aac cag ccc ttc aca cca ccg gac tcc gta gat aaa gcg ctg 3744
Ser Asn Gln Pro Phe Thr Pro Pro Asp Ser Val Asp Lys Ala Leu
1235 1240 1245
gca tta acc ggc gca ctc cct gtt gcc caa tgt tta gtc tat gcc 3789
Ala Leu Thr Gly Ala Leu Pro Val Ala Gln Cys Leu Val Tyr Ala
1250 1255 1260
gtt gat agc tgg atg ccg tcg tta tct ttg tct cag ctt tct cag 3834
Val Asp Ser Trp Met Pro Ser Leu Ser Leu Ser Gln Leu Ser Gln
1265 1270 1275
tca caa gaa gag gca gaa gcg cta tgg gcg caa ctg cgt gcc gct 3879
Ser Gln Glu Glu Ala Glu Ala Leu Trp Ala Gln Leu Arg Ala Ala
1280 1285 1290
cat atg att acc gaa gat ggg aaa gtg tgt gcg tta agc ggg aaa 3924
His Met Ile Thr Glu Asp Gly Lys Val Cys Ala Leu Ser Gly Lys
1295 1300 1305
cga gga aca agc cat cag aac ctg acg att caa ctt att tcg cta 3969
Arg Gly Thr Ser His Gln Asn Leu Thr Ile Gln Leu Ile Ser Leu
1310 1315 1320
ttg gca agt att ccc cgt tta ccg cca cat gta ctg ggg atc acc 4014
Leu Ala Ser Ile Pro Arg Leu Pro Pro His Val Leu Gly Ile Thr
1325 1330 1335
act gat cgc tat gat agc gat ccg caa cag cag cac caa cag acg 4059
Thr Asp Arg Tyr Asp Ser Asp Pro Gln Gln Gln His Gln Gln Thr
1340 1345 1350
gtg agc ttt agt gac ggt ttt ggc cgg tta ctc cag agt tca gct 4104
Val Ser Phe Ser Asp Gly Phe Gly Arg Leu Leu Gln Ser Ser Ala
1355 1360 1365
cgt cat gag tca ggt gat gcc tgg caa cgt aaa gag gat ggc ggg 4149
Arg His Glu Ser Gly Asp Ala Trp Gln Arg Lys Glu Asp Gly Gly
1370 1375 1380
ctg gtc gtg gat gca aat ggc gtt ctg gtc agt gcc cct aca gac 4194
Leu Val Val Asp Ala Asn Gly Val Leu Val Ser Ala Pro Thr Asp
1385 1390 1395
acc cga tgg gcc gtt tcc ggt cgc aca gaa tat gac gac aaa ggc 4239
Thr Arg Trp Ala Val Ser Gly Arg Thr Glu Tyr Asp Asp Lys Gly
1400 1405 1410
caa cct gtg cgt act tat caa ccc tat ttt cta aat gac tgg cgt 4284
Gln Pro Val Arg Thr Tyr Gln Pro Tyr Phe Leu Asn Asp Trp Arg
1415 1420 1425
tac gtt agt gat gac agc gca cga gat gac ctg ttt gcc gat acc 4329
Tyr Val Ser Asp Asp Ser Ala Arg Asp Asp Leu Phe Ala Asp Thr
1430 1435 1440
cac ctt tat gat cca ttg gga cgg gaa tac aaa gtc atc act gct 4374
His Leu Tyr Asp Pro Leu Gly Arg Glu Tyr Lys Val Ile Thr Ala
1445 1450 1455
aag aaa tat ttg cga gaa aag ctg tac acc ccg tgg ttt att gtc 4419
Lys Lys Tyr Leu Arg Glu Lys Leu Tyr Thr Pro Trp Phe Ile Val
1460 1465 1470
agt gag gat gaa aac gat aca gca tca aga acc cca tag 4458
Ser Glu Asp Glu Asn Asp Thr Ala Ser Arg Thr Pro
1475 1480 1485
<210>57
<211>2817
<212>DNA
<213>Photorhabdus luminescens strain W14
<220>
<221>exon
<222>(1)..(2817)
<400>57
atg gaa aac att gac cca aaa ctt tat cac cat acg cct acc gtc agt 48
Met Glu Asn Ile Asp Pro Lys Leu Tyr His His Thr Pro Thr Val Ser
1 5 10 15
gtt cac gat aac cgt gga cta gct atc cgt aat att agt ttt cac cgc 96
Val His Asp Asn Arg Gly Leu Ala Ile Arg Asn Ile Ser Phe His Arg
20 25 30
act acc gca gaa gca aat acc gat acc cgt att acc cgc cat caa tat 144
Thr Thr Ala Glu Ala Asn Thr Asp Thr Arg Ile Thr Arg His Gln Tyr
35 40 45
aat gcc ggc gga tat ttg aac caa agc att gat cct cgc ctg tat gac 192
Asn Ala Gly Gly Tyr Leu Asn Gln Ser Ile Asp Pro Arg Leu Tyr Asp
50 55 60
gcc aaa cag act aac aac gct gta caa ccg aat ttt atc tgg cga cat 240
Ala Lys Gln Thr Asn Asn Ala Val Gln Pro Asn Phe Ile Trp Arg His
65 70 75 80
aat ttg acc ggc aat atc ctg cga aca gag agc gtc gat gcc ggt cgg 288
Asn Leu Thr Gly Asn Ile Leu Arg Thr Glu Ser Val Asp Ala Gly Arg
85 90 95
acg att acc ctc aac gat att gaa ggc cgc ccg gtg ttg acc atc aat 336
Thr Ile Thr Leu Asn Asp Ile Glu Gly Arg Pro Val Leu Thr Ile Asn
100 105 110
gca gcc ggt gtc cgg caa aac cat cgc tac gaa gat aac acc ctg ccc 384
Ala Ala Gly Val Arg Gln Asn His Arg Tyr Glu Asp Asn Thr Leu Pro
115 120 125
ggt cgc ctg ctc gct atc agc gaa caa gga cag gca gaa gag aaa acg 432
Gly Arg Leu Leu Ala Ile Ser Glu Gln Gly Gln Ala Glu Glu Lys Thr
130 135 140
acc gag cgc ctt atc tgg gcc ggc aat acg ccg caa gaa aaa gac cac 480
Thr Glu Arg Leu Ile Trp Ala Gly Asn Thr Pro Gln Glu Lys Asp His
145 150 155 160
aac ctt gcc ggt cag tgc gtc cgc cat tac gat acc gca gga ctc act 528
Asn Leu Ala Gly Gln Cys Val Arg His Tyr Asp Thr Ala Gly Leu Thr
165 170 175
caa ctc aac agc ctt gcc ctg acc ggc gcc gtt cta tca caa tct caa 576
Gln Leu Asn Ser Leu Ala Leu Thr Gly Ala Val Leu Ser Gln Ser Gln
180 185 190
caa ctg ctt acc gat aac cag gat gcc gac tgg aca ggt gaa gac cag 624
Gln Leu Leu Thr Asp Asn Gln Asp Ala Asp Trp Thr Gly Glu Asp Gln
195 200 205
agc ctc tgg caa caa aaa ctg agt agt gat gtc tat atc acc caa agt 672
Ser Leu Trp Gln Gln Lys Leu Ser Ser Asp Val Tyr Ile Thr Gln Ser
210 215 220
aac act gat gcc acc ggg gct tta ctg acc cag acc gat gcc aaa ggc 720
Asn Thr Asp Ala Thr Gly Ala Leu Leu Thr Gln Thr Asp Ala Lys Gly
225 230 235 240
aac att cag cgg ctg gcc tat gat gtg gcc ggg cag cta aaa ggg agt 768
Asn Ile Gln Arg Leu Ala Tyr Asp Val Ala Gly Gln Leu Lys Gly Ser
245 250 255
tgg tta aca ctc aaa ggt cag gcg gaa cag gtg att atc aaa tcg cta 816
Trp Leu Thr Leu Lys Gly Gln Ala Glu Gln Val Ile Ile Lys Ser Leu
260 265 270
acc tac tcc gcc gcc ggg caa aaa tta cgt gaa gag cac ggt aac ggg 864
Thr Tyr Ser Ala Ala Gly Gln Lys Leu Arg Glu Glu His Gly Asn Gly
275 280 285
att gtc act gaa tac agc tac gaa ccg gaa acc caa cgg ctt atc ggc 912
Ile Val Thr Glu Tyr Ser Tyr Glu Pro Glu Thr Gln Arg Leu Ile Gly
290 295 300
att acc act cgc cgt cca tca gac gcc aag gtg ttg caa gac cta cgc 960
Ile Thr Thr Arg Arg Pro Ser Asp Ala Lys Val Leu Gln Asp Leu Arg
305 310 315 320
tat caa tat gac cca gta ggc aat gtc att agt atc cgt aat gat gcg 1008
Tyr Gln Tyr Asp Pro Val Gly Asn Val Ile Ser Ile Arg Asn Asp Ala
325 330 335
gaa gcc act cgc ttt tgg cgc aat cag aaa gta gcc ccg gag aat agc 1056
Glu Ala Thr Arg Phe Trp Arg Asn Gln Lys Val Ala Pro Glu Asn Ser
340 345 350
tat acc tac gat tcc ctg tat cag ctt atc agc gcc acc ggg cgc gag 1104
Tyr Thr Tyr Asp Ser Leu Tyr Gln Leu Ile Ser Ala Thr Gly Arg Glu
355 360 365
atg gcc aat atc ggt cag caa agc aac caa ctt ccc tct ccg gcg cta 1152
Met Ala Asn Ile Gly Gln Gln Ser Asn Gln Leu Pro Ser Pro Ala Leu
370 375 380
cct tct gat aac aat acc tac acc aac tat act cgc act tat act tat 1200
Pro Ser Asp Asn Asn Thr Tyr Thr Asn Tyr Thr Arg Thr Tyr Thr Tyr
385 390 395 400
gac cgt ggc ggc aat ttg acg aaa att cag cat agt tca cca gcc gcg 1248
Asp Arg Gly Gly Asn Leu Thr Lys Ile Gln His Ser Ser Pro Ala Ala
405 410 415
caa aat aac tac acg acg gat ata acg gtt tca aat cgc agc aac cgc 1296
Gln Asn Asn Tyr Thr Thr Asp Ile Thr Val Ser Asn Arg Ser Asn Arg
420 425 430
gcg gta ctc agc aca ttg acc gca gat cca act caa gtc gat gcc tta 1344
Ala Val Leu Ser Thr Leu Thr Ala Asp Pro Thr Gln Val Asp Ala Leu
435 440 445
ttt gat gcg gga ggc cat caa acc agc ttg tta tcc ggc caa gtt cta 1392
Phe Asp Ala Gly Gly His Gln Thr Ser Leu Leu Ser Gly Gln Val Leu
450 455 460
act tgg aca ccg cga ggc gaa ttg aaa caa gcc aac aat agc gca gga 1440
Thr Trp Thr Pro Arg Gly Glu Leu Lys Gln Ala Asn Asn Ser Ala Gly
465 470 475 480
aat gag tgg tat cgc tac gat agc aac ggc ata cgc cag cta aaa gtg 1488
Asn Glu Trp Tyr Arg Tyr Asp Ser Asn Gly Ile Arg Gln Leu Lys Val
485 490 495
aat gaa caa caa act cag aat atc ccg caa caa caa agg gta act tat 1536
Asn Glu Gln Gln Thr Gln Asn Ile Pro Gln Gln Gln Arg Val Thr Tyr
500 505 510
cta ccg ggg ctg gaa ata cgt aca acc cag aac aac gcc aca aca aca 1584
Leu Pro Gly Leu Glu Ile Arg Thr Thr Gln Asn Asn Ala Thr Thr Thr
515 520 525
gaa gag tta cac gtt atc aca ctc ggt aaa gcc ggc cgc gcg caa gtc 1632
Glu Glu Leu His Val Ile Thr Leu Gly Lys Ala Gly Arg Ala Gln Val
530 535 540
cga gta ttg cat tgg gag agc ggt aaa cca gaa gat att aat aac aat 1680
Arg Val Leu His Trp Glu Ser Gly Lys Pro Glu Asp Ile Asn Asn Asn
545 550 555 560
cag ctt cgt tac agc tac gat aat ctt att ggc tcc agc caa ctt caa 1728
Gln Leu Arg Tyr Ser Tyr Asp Asn Leu Ile Gly Ser Ser Gln Leu Gln
565 570 575
tta gat agc gac gga caa att atc agt gaa gaa gaa tat tat cca ttt 1776
Leu Asp Ser Asp Gly Gln Ile Ile Ser Glu Glu Glu Tyr Tyr Pro Phe
580 585 590
ggt ggt aca gcg ctg tgg gcg gca agg aat caa acc gaa gcc agc tat 1824
Gly Gly Thr Ala Leu Trp Ala Ala Arg Asn Gln Thr Glu Ala Ser Tyr
595 600 605
aaa acc att cgt tat tct ggt aaa gag cgg gat gtt acc ggg ctg tat 1872
Lys Thr Ile Arg Tyr Ser Gly Lys Glu Arg Asp Val Thr Gly Leu Tyr
610 615 620
tat tat ggc tac cgt tat tac caa ccg tgg gcg ggc aga tgg tta ggt 1920
Tyr Tyr Gly Tyr Arg Tyr Tyr Gln Pro Trp Ala Gly Arg Trp Leu Gly
625 630 635 640
gca gac ccg gca gga acc att gat gga ctg aat tta tat cgc atg gtg 1968
Ala Asp Pro Ala Gly Thr Ile Asp Gly Leu Asn Leu Tyr Arg Met Val
645 650 655
aga aat aac ccg gtg acg caa ttt gat gtt cag gga tta tca ccg gcc 2016
Arg Asn Asn Pro Val Thr Gln Phe Asp Val Gln Gly Leu Ser Pro Ala
660 665 670
aac aga aca gaa gaa gcg ata ata aaa cag ggt tcc ttt acg gga atg 2064
Asn Arg Thr Glu Glu Ala Ile Ile Lys Gln Gly Ser Phe Thr Gly Met
675 680 685
gaa gaa gct gtt tat aaa aaa atg gct aaa cct caa act ttc aaa cgc 2112
Glu Glu Ala Val Tyr Lys Lys Met Ala Lys Pro Gln Thr Phe Lys Arg
690 695 700
caa aga gct atc gct gcc caa aca gag caa gaa gcc cat gaa tca ttg 2160
Gln Arg Ala Ile Ala Ala Gln Thr Glu Gln Glu Ala His Glu Ser Leu
705 710 715 720
acc aac aac cct agt gta gat att agc cca att aaa aac tac acc aca 2208
Thr Asn Asn Pro Ser Val Asp Ile Ser Pro Ile Lys Asn Tyr Thr Thr
725 730 735
gat agc tca caa att aat gcc gcg ata agg gaa aat cgt att acg cca 2256
Asp Ser Ser Gln Ile Asn Ala Ala Ile Arg Glu Asn Arg Ile Thr Pro
740 745 750
gca gtg gaa agt tta gac gcc aca tta tct tcc cta caa gat aga caa 2304
Ala Val Glu Ser Leu Asp Ala Thr Leu Ser Ser Leu Gln Asp Arg Gln
755 760 765
atg agg gta act tat cgg gtg atg acc tat gta gat aat tcc acg cca 2352
Met Arg Val Thr Tyr Arg Val Met Thr Tyr Val Asp Asn Ser Thr Pro
770 775 780
tcg cct tgg cac tcg cca cag gaa gga aat agt att aat gtt ggt gat 2400
Ser Pro Trp His Ser Pro Gln Glu Gly Asn Ser Ile Asn Val Gly Asp
785 790 795 800
atc gtt tcg gat aac gct tat tta tca aca tcg gcc cat cgt ggt ttt 2448
Ile Val Ser Asp Asn Ala Tyr Leu Ser Thr Ser Ala His Arg Gly Phe
805 810 815
ctg aat ttt gtt cac aaa aaa gaa acc agt gaa act cga tac gtc aag 2496
Leu Asn Phe Val His Lys Lys Glu Thr Ser Glu Thr Arg Tyr Val Lys
820 825 830
atg gca ttt tta acg aat gcg ggt gtc aat gtc cca gca gca tct atg 2544
Met Ala Phe Leu Thr Asn Ala Gly Val Asn Val Pro Ala Ala Ser Met
835 840 845
tat aat aat gct ggc gag gag caa gta ttt aaa atg gat tta aac gat 2592
Tyr Asn Asn Ala Gly Glu Glu Gln Val Phe Lys Met Asp Leu Asn Asp
850 855 860
tca aga aaa agc ctt gct gaa aaa tta aaa cta aga gtc agt gga cca 2640
Ser Arg Lys Ser Leu Ala Glu Lys Leu Lys Leu Arg Val Ser Gly Pro
865 870 875 880
caa tcg gga caa gcg gaa ata tta cta cct agg gaa aca cag ttc gaa 2688
Gln Ser Gly Gln Ala Glu Ile Leu Leu Pro Arg Glu Thr Gln Phe Glu
885 890 895
gtt gtt tca atg aaa cat caa ggc aga gat acc tat gta tta ttg caa 2736
Val Val Ser Met Lys His Gln Gly Arg Asp Thr Tyr Val Leu Leu Gln
900 905 910
gat att aac caa tcc gca gcc act cat aga aat gta cgt aac act tac 2784
Asp Ile Asn Gln Ser Ala Ala Thr His Arg Asn Val Arg Asn Thr Tyr
915 920 925
acc ggt aat ttc aaa tca tcc agt gca aat taa 2817
Thr Gly Asn Phe Lys Ser Ser Ser Ala Asn
930 935
<210>58
<211>915
<212>PRT
<213>Photorhabdus luminescens
<400>58
Met Ser Ser Tyr Asn Ser Ala Ile Asp Gln Lys Thr Pro Ser Ile Lys
1 5 10 15
Val Leu Asp Asn Arg Lys Leu Asn Val Arg Thr Leu Glu Tyr Leu Arg
20 25 30
Thr Gln Ala Asp Glu Asn Ser Asp Glu Leu Ile Thr Phe Tyr Glu Phe
35 40 45
Asn Ile Pro Gly Phe Gln Val Lys Ser Thr Asp Pro Arg Lys Asn Lys
50 55 60
Asn Gln Ser Gly Pro Asn Phe Ile Arg Val Phe Asn Leu Ala Gly Gln
65 70 75 80
Val Leu Arg Glu Glu Ser Val Asp Ala Gly Arg Thr Ile Thr Leu Asn
85 90 95
Asp Ile Glu Ser Arg Pro Val Leu Ile Ile Asn Ala Thr Gly Val Arg
100 105 110
Gln Asn His Arg Tyr Glu Asp Asn Thr Leu Pro Gly Arg Leu Leu Ala
115 120 125
Ile Thr Glu Gln Val Gln Ala Gly Glu Lys Thr Thr Glu Arg Leu Ile
130 135 140
Trp Ala Gly Asn Thr Pro Gln Glu Lys Asp Tyr Asn Leu Ala Gly Gln
145 150 155 160
Cys Val Arg His Tyr Asp Thr Ala Gly Leu Thr Gln Leu Asn Ser Leu
165 170 175
Ser Leu Ala Gly Val Val Leu Ser Gln Ser Gln Gln Leu Leu Val Asp
180 185 190
Asp Lys Asn Ala Asp Trp Thr Gly Glu Asp Gln Ser Leu Trp Gln Gln
195 200 205
Lys Leu Ser Ser Asp Val Tyr Thr Thr Gln Asn Lys Ala Asp Ala Thr
210 215 220
Gly Ala Leu Leu Thr Gln Thr Asp Ala Lys Gly Asn Ile Gln Arg Leu
225 230 235 240
Ala Tyr Asp Val Ala Gly Gln Leu Lys Gly Cys Trp Leu Thr Leu Lys
245 250 255
Gly Gln Ala Glu Gln Val Ile Ile Lys Ser Leu Thr Tyr Ser Ala Ala
260 265 270
Gly Gln Lys Leu Arg Glu Glu His Gly Asn Gly Val Ile Thr Glu Tyr
275 280 285
Ser Tyr Glu Pro Glu Thr Gln Arg Leu Ile Gly Ile Ala Thr Arg Arg
290 295 300
Pro Ser Asp Ala Lys Val Leu Gln Asp Leu Arg Tyr Gln Tyr Asp Pro
305 310 315 320
Val Gly Asn Val Ile Asn Ile Arg Asn Asp Ala Glu Ala Thr Arg Phe
325 330 335
Trp Arg Asn Gln Lys Val Val Pro Glu Asn Ser Tyr Thr Tyr Asp Ser
340 345 350
Leu Tyr Gln Leu Ile Ser Ala Thr Gly Arg Glu Met Ala Asn Ile Gly
355 360 365
Gln Gln Asn Asn Gln Leu Pro Ser Pro Ala Leu Pro Ser Asp Asn Asn
370 375 380
Thr Tyr Thr Asn Tyr Thr Arg Ser Tyr Ser Tyr Asp His Ser Gly Asn
385 390 395 400
Leu Thr Gln Ile Arg His Ser Ser Pro Ala Thr Gln Asn Asn Tyr Thr
405 410 415
Val Ala Ile Thr Leu Ser Asn Arg Ser Asn Arg Gly Val Leu Ser Thr
420 425 430
Leu Thr Thr Asp Pro Asn Gln Val Asp Thr Leu Phe Asp Ala Gly Gly
435 440 445
His Gln Thr Ser Leu Leu Pro Gly Gln Thr Leu Ile Trp Thr Pro Arg
450 455 460
Gly Glu Leu Lys Gln Val Asn Asn Gly Pro Gly Asn Glu Trp Tyr Arg
465 470 475 480
Tyr Asp Ser Asn Gly Met Arg Gln Leu Lys Val Ser Glu Gln Pro Thr
485 490 495
Gln Asn Thr Thr Gln Gln Gln Arg Val Ile Tyr Leu Pro Gly Leu Glu
500 505 510
Leu Arg Thr Thr Gln Ser Asn Ala Thr Thr Thr Glu Glu Leu His Val
515 520 525
Ile Thr Leu Gly Glu Ala Gly Arg Ala Gln Val Arg Val Leu His Trp
530 535 540
Glu Ser Gly Lys Pro Glu Asp Val Asn Asn Asn Gln Leu Arg Tyr Ser
545 550 555 560
Tyr Asp Asn Leu Ile Gly Ser Ser Gln Leu Glu Leu Asp Asn Gln Gly
565 570 575
Gln Ile Ile Ser Glu Glu Glu Tyr Tyr Pro Phe Gly Gly Thr Ala Leu
580 585 590
Trp Ala Ala Asn Ser Gln Thr Glu Ala Ser Tyr Lys Thr Ile Arg Tyr
595 600 605
Ser Gly Lys Glu Arg Asp Ala Thr Gly Leu Tyr Tyr Tyr Gly Tyr Arg
610 615 620
Tyr Tyr Gln Pro Trp Ala Gly Arg Trp Leu Ser Ala Asp Pro Ala Gly
625 630 635 640
Thr Ile Asp Gly Leu Asn Leu Tyr Arg Met Val Arg Asn Asn Pro Val
645 650 655
Ser Leu Gln Asp Glu Asn Gly Leu Ala Pro Glu Lys Gly Lys Tyr Thr
660 665 670
Lys Glu Val Asn Phe Phe Asp Glu Leu Lys Phe Lys Leu Ala Ala Lys
675 680 685
Ser Ser His Val Val Lys Trp Asn Glu Lys Glu Ser Ser Tyr Thr Lys
690 695 700
Asn Lys Ser Leu Lys Val Val Arg Val Gly Asp Ser Asp Pro Ser Gly
705 710 715 720
Tyr Leu Leu Ser His Glu Glu Leu Leu Lys Gly Ile Glu Lys Ser Gln
725 730 735
Ile Ile Tyr Ser Arg Leu Glu Glu Asn Ser Ser Leu Ser Glu Lys Ser
740 745 750
Lys Thr Asn Leu Ser Leu Gly Ser Glu Ile Ser Gly Tyr Met Ala Arg
755 760 765
Thr Ile Gln Asp Thr Ile Ser Glu Tyr Ala Glu Glu His Lys Tyr Arg
770 775 780
Ser Asn His Pro Asp Phe Tyr Ser Glu Thr Asp Phe Phe Ala Leu Met
785 790 795 800
Asp Lys Ser Glu Lys Asn Asp Tyr Ser Gly Glu Arg Lys Ile Tyr Ala
805 810 815
Ala Met Glu Val Lys Val Tyr His Asp Leu Lys Asn Lys Gln Ser Glu
820 825 830
Leu His Val Asn Tyr Ala Leu Ala His Pro Tyr Thr Gln Leu Ser Asn
835 840 845
Glu Glu Arg Ala Leu Leu Gln Glu Thr Glu Pro Ala Ile Ala Ile Asp
850 855 860
Arg Glu Tyr Asn Phe Lys Gly Val Gly Lys Phe Leu Thr Met Lys Ala
865 870 875 880
Ile Lys Lys Ser Leu Lys Gly His Lys Ile Asn Arg Ile Ser Thr Glu
885 890 895
Ala Ile Asn Ile Arg Ser Ala Ala Ile Ala Glu Asn Leu Gly Met Arg
900 905 910
Arg Thr Ser
915
<210>59
<211>2504
<212>PRT
<213>Photorhabdus luminescens
<400>59
Met Gln Asn Ser Leu Ser Ser Thr Ile Asp Thr Ile Cys Gln Lys Leu
1 5 10 15
Gln Leu Thr Cys Pro Ala Glu Ile Ala Leu Tyr Pro Phe Asp Thr Phe
20 25 30
Arg Glu Lys Thr Arg Gly Met Val Asn Trp Gly Glu Ala Lys Arg Ile
35 40 45
Tyr Glu Ile Ala Gln Ala Glu Gln Asp Arg Asn Leu Leu His Glu Lys
50 55 60
Arg Ile Phe Ala Tyr Ala Asn Pro Leu Leu Lys Asn Ala Val Arg Leu
65 70 75 80
Gly Thr Arg Gln Met Leu Gly Phe Ile Gln Gly Tyr Ser Asp Leu Phe
85 90 95
Gly Asn Arg Ala Asp Asn Tyr Ala Ala Pro Gly Ser Val Ala Ser Met
100 105 110
Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Lys Asn
115 120 125
Leu His Asp Ser Ser Ser Ile Tyr Tyr Leu Asp Lys Arg Arg Pro Asp
130 135 140
Leu Ala Ser Leu Met Leu Ser Gln Lys Asn Met Asp Glu Glu Ile Ser
145 150 155 160
Thr Leu Ala Leu Ser Asn Glu Leu Cys Leu Ala Gly Ile Glu Thr Lys
165 170 175
Thr Gly Lys Ser Gln Asp Glu Val Met Asp Met Leu Ser Thr Tyr Arg
180 185 190
Leu Ser Gly Glu Thr Pro Tyr His His Ala Tyr Glu Thr Val Arg Glu
195 200 205
Ile Val His Glu Arg Asp Pro Gly Phe Arg His Leu Ser Gln Ala Pro
210 215 220
Ile Val Ala Ala Lys Leu Asp Pro Val Thr Leu Leu Gly Ile Ser Ser
225 230 235 240
His Ile Ser Pro Glu Leu Tyr Asn Leu Leu Ile Glu Glu Ile Pro Glu
245 250 255
Lys Asp Glu Ala Ala Leu Asp Thr Leu Tyr Lys Thr Asn Phe Gly Asp
260 265 270
Ile Thr Thr Ala Gln Leu Met Ser Pro Ser Tyr Leu Ala Arg Tyr Tyr
275 280 285
Gly Val Ser Pro Glu Asp Ile Ala Tyr Val Thr Thr Ser Leu Ser His
290 295 300
Val Gly Tyr Ser Ser Asp Ile Leu Val Ile Pro Leu Val Asp Gly Val
305 310 315 320
Gly Lys Met Glu Val Val Arg Val Thr Arg Thr Pro Ser Asp Asn Tyr
325 330 335
Thr Ser Gln Thr Asn Tyr Ile Glu Leu Tyr Pro Gln Gly Gly Asp Asn
340 345 350
Tyr Leu Ile Lys Tyr Asn Leu Ser Asn Ser Phe Gly Leu Asp Asp Phe
355 360 365
Tyr Leu Gln Tyr Lys Asp Gly Ser Ala Asp Trp Thr Glu Ile Ala His
370 375 380
Asn Pro Tyr Pro Asp Met Val Ile Asn Gln Lys Tyr Glu Ser Gln Ala
385 390 395 400
Thr Ile Lys Arg Ser Asp Ser Asp Asn Ile Leu Ser Ile Gly Leu Gln
405 410 415
Arg Trp His Ser Gly Ser Tyr Asn Phe Ala Ala Ala Asn Phe Lys Ile
420 425 430
Asp Gln Tyr Ser Pro Lys Ala Phe Leu Leu Lys Met Asn Lys Ala Ile
435 440 445
Arg Leu Leu Lys Ala Thr Gly Leu Ser Phe Ala Thr Leu Glu Arg Ile
450 455 460
Val Asp Ser Val Asn Ser Thr Lys Ser Ile Thr Val Glu Val Leu Asn
465 470 475 480
Lys Val Tyr Arg Val Lys Phe Tyr Ile Asp Arg Tyr Gly Ile Ser Glu
485 490 495
Glu Thr Ala Ala Ile Leu Ala Asn Ile Asn Ile Ser Gln Gln Ala Val
500 505 510
Gly Asn Gln Leu Ser Gln Phe Glu Gln Leu Phe Asn His Pro Pro Leu
515 520 525
Asn Gly Ile Arg Tyr Glu Ile Ser Glu Asp Asn Ser Lys His Leu Pro
530 535 540
Asn Pro Asp Leu Asn Leu Lys Pro Asp Ser Thr Gly Asp Asp Gln Arg
545 550 555 560
Lys Ala Val Leu Lys Arg Ala Phe Gln Val Asn Ala Ser Glu Leu Tyr
565 570 575
Gln Met Leu Leu Ile Thr Asp Arg Lys Glu Asp Gly Val Ile Lys Asn
580 585 590
Asn Leu Glu Asn Leu Ser Asp Leu Tyr Leu Val Ser Leu Leu Ala Gln
595 600 605
Ile His Asn Leu Thr Ile Ala Glu Leu Asn Ile Leu Leu Val Ile Cys
610 615 620
Gly Tyr Gly Asp Thr Asn Ile Tyr Gln Ile Thr Asp Asp Asn Leu Ala
625 630 635 640
Lys Ile Val Glu Thr Leu Leu Trp Ile Thr Gln Trp Leu Lys Thr Gln
645 650 655
Lys Trp Thr Val Thr Asp Leu Phe Leu Met Thr Thr Ala Thr Tyr Ser
660 665 670
Thr Thr Leu Thr Pro Glu Ile Ser Asn Leu Thr Ala Thr Leu Ser Ser
675 680 685
Thr Leu His Gly Lys Glu Ser Leu Ile Gly Glu Asp Leu Lys Arg Ala
690 695 700
Met Ala Pro Cys Phe Thr Ser Ala Leu His Leu Thr Ser Gln Glu Val
705 710 715 720
Ala Tyr Asp Leu Leu Leu Trp Ile Asp Gln Ile Gln Pro Ala Gln Ile
725 730 735
Thr Val Asp Gly Phe Trp Glu Glu Val Gln Thr Thr Pro Thr Ser Leu
740 745 750
Lys Val Ile Thr Phe Ala Gln Val Leu Ala Gln Leu Ser Leu Ile Tyr
755 760 765
Arg Arg Ile Gly Leu Ser Glu Thr Glu Leu Ser Leu Ile Val Thr Gln
770 775 780
Ser Ser Leu Leu Val Ala Gly Lys Ser Ile Leu Asp His Gly Leu Leu
785 790 795 800
Thr Leu Met Ala Leu Glu Gly Phe His Thr Trp Val Asn Gly Leu Gly
805 810 815
Gln His Ala Ser Leu Ile Leu Ala Ala Leu Lys Asp Gly Ala Leu Thr
820 825 830
Val Thr Asp Val Ala Gln Ala Met Asn Lys Glu Glu Ser Leu Leu Gln
835 840 845
Met Ala Ala Asn Gln Val Glu Lys Asp Leu Thr Lys Leu Thr Ser Trp
850 855 860
Thr Gln Ile Asp Ala Ile Leu Gln Trp Leu Gln Met Ser Ser Ala Leu
865 870 875 880
Ala Val Ser Pro Leu Asp Leu Ala Gly Met Met Ala Leu Lys Tyr Gly
885 890 895
Ile Asp His Asn Tyr Ala Ala Trp Gln Ala Ala Ala Ala Ala Leu Met
900 905 910
Ala Asp His Ala Asn Gln Ala Gln Lys Lys Leu Asp Glu Thr Phe Ser
915 920 925
Lys Ala Leu Cys Asn Tyr Tyr Ile Asn Ala Val Val Asp Ser Ala Ala
930 935 940
Gly Val Arg Asp Arg Asn Gly Leu Tyr Thr Tyr Leu Leu Ile Asp Asn
945 950 955 960
Gln Val Ser Ala Asp Val Ile Thr Ser Arg Ile Ala Glu Ala Ile Ala
965 970 975
Gly Ile Gln Leu Tyr Val Asn Arg Ala Leu Asn Arg Asp Glu Gly Gln
980 985 990
Leu Ala Ser Asp Val Ser Thr Arg Gln Phe Phe Thr Asp Trp Glu Arg
995 1000 1005
Tyr Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Glu Leu Val
1010 1015 1020
Tyr Tyr Pro Glu Asn Tyr Val Asp Pro Thr Gln Arg Ile Gly Gln
1025 1030 1035
Thr Lys Met Met Asp Ala Leu Leu Gln Ser Ile Asn Gln Ser Gln
1040 1045 1050
Leu Asn Ala Asp Thr Val Glu Asp Ala Phe Lys Thr Tyr Leu Thr
1055 1060 1065
Ser Phe Glu Gln Val Ala Asn Leu Lys Val Ile Ser Ala Tyr His
1070 1075 1080
Asp Asn Val Asn Val Asp Gln Gly Leu Thr Tyr Phe Ile Gly Ile
1085 1090 1095
Asp Gln Ala Ala Pro Gly Thr Tyr Tyr Trp Arg Ser Val Asp His
1100 1105 1110
Ser Lys Cys Glu Asn Gly Lys Phe Ala Ala Asn Ala Trp Gly Glu
1115 1120 1125
Trp Asn Lys Ile Thr Cys Ala Val Asn Pro Trp Lys Asn Ile Ile
1130 1135 1140
Arg Pro Val Val Tyr Met Ser Arg Leu Tyr Leu Leu Trp Leu Glu
1145 1150 1155
Gln Gln Ser Lys Lys Ser Asp Asp Gly Lys Thr Thr Ile Tyr Gln
1160 1165 1170
Tyr Asn Leu Lys Leu Ala His Ile Arg Tyr Asp Gly Ser Trp Asn
1175 1180 1185
Thr Pro Phe Thr Phe Asp Val Thr Glu Lys Val Lys Asn Tyr Thr
1190 1195 1200
Ser Ser Thr Asp Ala Ala Glu Ser Leu Gly Leu Tyr Cys Thr Gly
1205 1210 1215
Tyr Gln Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Ser Met Gln
1220 1225 1230
Ser Ser Tyr Ser Ser Tyr Thr Asp Asn Asn Ala Pro Val Thr Gly
1235 1240 1245
Leu Tyr Ile Phe Ala Asp Met Ser Ser Asp Asn Met Thr Asn Ala
1250 1255 1260
Gln Ala Thr Asn Tyr Trp Asn Asn Ser Tyr Pro Gln Phe Asp Thr
1265 1270 1275
Val Met Ala Asp Pro Asp Ser Asp Asn Lys Lys Val Ile Thr Arg
1280 1285 1290
Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr Glu Ile Pro Ser Ser
1295 1300 1305
Val Thr Ser Asn Ser Asn Tyr Ser Trp Gly Asp His Ser Leu Thr
1310 1315 1320
Met Leu Tyr Gly Gly Ser Val Pro Asn Ile Thr Phe Glu Ser Ala
1325 1330 1335
Ala Glu Asp Leu Arg Leu Ser Thr Asn Met Ala Leu Ser Ile Ile
1340 1345 1350
His Asn Gly Tyr Ala Gly Thr Arg Arg Ile Gln Cys Asn Leu Met
1355 1360 1365
Lys Gln Tyr Ala Ser Leu Gly Asp Lys Phe Ile Ile Tyr Asp Ser
1370 1375 1380
Ser Phe Asp Asp Ala Asn Arg Phe Asn Leu Val Pro Leu Phe Lys
1385 1390 1395
Phe Gly Lys Asp Glu Asn Ser Asp Asp Ser Ile Cys Ile Tyr Asn
1400 1405 1410
Glu Asn Pro Ser Ser Glu Asp Lys Lys Trp Tyr Phe Ser Ser Lys
1415 1420 1425
Asp Asp Asn Lys Thr Ala Asp Tyr Asn Gly Gly Thr Gln Cys Ile
1430 1435 1440
Asp Ala Gly Thr Ser Asn Lys Asp Phe Tyr Tyr Asn Leu Gln Glu
1445 1450 1455
Ile Glu Val Ile Ser Val Thr Gly Gly Tyr Trp Ser Ser Tyr Lys
1460 1465 1470
Ile Ser Asn Pro Ile Asn Ile Asn Thr Gly Ile Asp Ser Ala Lys
1475 1480 1485
Val Lys Val Thr Val Lys Ala Gly Gly Asp Asp Gln Ile Phe Thr
1490 1495 1500
Ala Asp Asn Ser Thr Tyr Val Pro Gln Gln Pro Ala Pro Ser Phe
1505 1510 1515
Glu Glu Met Ile Tyr Gln Phe Asn Asn Leu Thr Ile Asp Cys Lys
1520 1525 1530
Asn Leu Asn Phe Ile Asp Asn Gln Ala His Ile Glu Ile Asp Phe
1535 1540 1545
Thr Ala Thr Ala Gln Asp Gly Arg Phe Leu Gly Ala Glu Thr Phe
1550 1555 1560
Ile Ile Pro Val Thr Lys Lys Val Leu Gly Thr Glu Asn Val Ile
1565 1570 1575
Ala Leu Tyr Ser Glu Asn Asn Gly Val Gln Tyr Met Gln Ile Gly
1580 1585 1590
Ala Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Gln Gln Leu Val
1595 1600 1605
Ser Arg Ala Asn Arg Gly Ile Asp Ala Val Leu Ser Met Glu Thr
1610 1615 1620
Gln Asn Ile Gln Glu Pro Gln Leu Gly Ala Gly Thr Tyr Val Gln
1625 1630 1635
Leu Val Leu Asp Lys Tyr Asp Glu Ser Ile His Gly Thr Asn Lys
1640 1645 1650
Ser Phe Ala Ile Glu Tyr Val Asp Ile Phe Lys Glu Asn Asp Ser
1655 1660 1665
Phe Val Ile Tyr Gln Gly Glu Leu Ser Glu Thr Ser Gln Thr Val
1670 1675 1680
Val Lys Val Phe Leu Ser Tyr Phe Ile Glu Ala Thr Gly Asn Lys
1685 1690 1695
Asn His Leu Trp Val Arg Ala Lys Tyr Gln Lys Glu Thr Thr Asp
1700 1705 1710
Lys Ile Leu Phe Asp Arg Thr Asp Glu Lys Asp Pro His Gly Trp
1715 1720 1725
Phe Leu Ser Asp Asp His Lys Thr Phe Ser Gly Leu Ser Ser Ala
1730 1735 1740
Gln Ala Leu Lys Asn Asp Ser Glu Pro Met Asp Phe Ser Gly Ala
1745 1750 1755
Asn Ala Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro Met Met
1760 1765 1770
Met Ala His Arg Leu Leu Gln Glu Gln Asn Phe Asp Ala Ala Asn
1775 1780 1785
His Trp Phe Arg Tyr Val Trp Ser Pro Ser Gly Tyr Ile Val Asp
1790 1795 1800
Gly Lys Ile Ala Ile Tyr His Trp Asn Val Arg Pro Leu Glu Glu
1805 1810 1815
Asp Thr Ser Trp Asn Ala Gln Gln Leu Asp Ser Thr Asp Pro Asp
1820 1825 1830
Ala Val Ala Gln Asp Asp Pro Met His Tyr Lys Val Ala Thr Phe
1835 1840 1845
Met Ala Thr Leu Asp Leu Leu Met Ala Arg Gly Asp Ala Ala Tyr
1850 1855 1860
Arg Gln Leu Glu Arg Asp Thr Leu Ala Glu Ala Lys Met Trp Tyr
1865 1870 1875
Thr Gln Ala Leu Asn Leu Leu Gly Asp Glu Pro Gln Val Met Leu
1880 1885 1890
Ser Thr Thr Trp Ala Asn Pro Thr Leu Gly Asn Ala Ala Ser Lys
1895 1900 1905
Thr Thr Gln Gln Val Arg Gln Gln Val Leu Thr Gln Leu Arg Leu
1910 1915 1920
Asn Ser Arg Val Lys Thr Pro Leu Leu Gly Thr Ala Asn Ser Leu
1925 1930 1935
Thr Ala Leu Phe Leu Pro Gln Glu Asn Ser Lys Leu Lys Gly Tyr
1940 1945 1950
Trp Arg Thr Leu Ala Gln Arg Met Phe Asn Leu Arg His Asn Leu
1955 1960 1965
Ser Ile Asp Gly Gln Pro Leu Ser Leu Pro Leu Tyr Ala Lys Pro
1970 1975 1980
Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser Ala Ser Gln
1985 1990 1995
Gly Gly Ala Asp Leu Pro Lys Ala Pro Leu Thr Ile His Arg Phe
2000 2005 2010
Pro Gln Met Leu Glu Gly Ala Arg Gly Leu Val Asn Gln Leu Ile
2015 2020 2025
Gln Phe Gly Ser Ser Leu Leu Gly Tyr Ser Glu Arg Gln Asp Ala
2030 2035 2040
Glu Ala Met Ser Gln Leu Leu Gln Thr Gln Ala Ser Glu Leu Ile
2045 2050 2055
Leu Thr Ser Ile Arg Met Gln Asp Asn Gln Leu Ala Glu Leu Asp
2060 2065 2070
Ser Glu Lys Thr Ala Leu Gln Val Ser Leu Ala Gly Val Gln Gln
2075 2080 2085
Arg Phe Asp Ser Tyr Ser Gln Leu Tyr Glu Glu Asn Ile Asn Ala
2090 2095 2100
Gly Glu Gln Arg Ala Leu Ala Leu Arg Ser Glu Ser Ala Ile Glu
2105 2110 2115
Ser Gln Gly Ala Gln Ile Ser Arg Met Ala Gly Ala Gly Val Asp
2120 2125 2130
Met Ala Pro Asn Ile Phe Gly Leu Ala Asp Gly Gly Met His Tyr
2135 2140 2145
Gly Ala Ile Ala Tyr Ala Ile Ala Asp Gly Ile Glu Leu Ser Ala
2150 2155 2160
Ser Ala Lys Met Val Asp Ala Glu Lys Val Ala Gln Ser Glu Ile
2165 2170 2175
Tyr Arg Arg Arg Arg Gln Glu Trp Lys Ile Gln Arg Asp Asn Ala
2180 2185 2190
Gln Ala Glu Ile Asn Gln Leu Asn Ala Gln Leu Glu Ser Leu Ser
2195 2200 2205
Ile Arg Arg Glu Ala Ala Glu Met Gln Lys Glu Tyr Leu Lys Thr
2210 2215 2220
Gln Gln Ala Gln Ala Gln Ala Gln Leu Thr Phe Leu Arg Ser Lys
2225 2230 2235
Phe Ser Asn Gln Ala Leu Tyr Ser Trp Leu Arg Gly Arg Leu Ser
2240 2245 2250
Gly Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val Ser Arg Cys Leu
2255 2260 2265
Met Ala Glu Gln Ser Tyr Gln Trp Glu Ala Asn Asp Asn Ser Ile
2270 2275 2280
Ser Phe Val Lys Pro Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu
2285 2290 2295
Leu Cys Gly Glu Ala Leu Ile Gln Asn Leu Ala Gln Met Glu Glu
2300 2305 2310
Ala Tyr Leu Lys Trp Glu Ser Arg Ala Leu Glu Val Glu Arg Thr
2315 2320 2325
Val Ser Leu Ala Val Val Tyr Asp Ser Leu Glu Gly Asn Asp Arg
2330 2335 2340
Phe Asn Leu Ala Glu Gln Ile Pro Ala Leu Leu Asp Lys Gly Glu
2345 2350 2355
Gly Thr Ala Gly Thr Lys Glu Asn Gly Leu Ser Leu Ala Asn Ala
2360 2365 2370
Ile Leu Ser Ala Ser Val Lys Leu Ser Asp Leu Lys Leu Gly Thr
2375 2380 2385
Asp Tyr Pro Asp Ser Ile Val Gly Ser Asn Lys Val Arg Arg Ile
2390 2395 2400
Lys Gln Ile Ser Val Ser Leu Pro Ala Leu Val Gly Pro Tyr Gln
2405 2410 2415
Asp Val Gln Ala Met Leu Ser Tyr Gly Gly Ser Thr Gln Leu Pro
2420 2425 2430
Lys Gly Cys Ser Ala Leu Ala Val Ser His Gly Thr Asn Asp Ser
2435 2440 2445
Gly Gln Phe Gln Leu Asp Phe Asn Asp Gly Lys Tyr Leu Pro Phe
2450 2455 2460
Glu Gly Ile Ala Leu Asp Asp Gln Gly Thr Leu Asn Leu Gln Phe
2465 2470 2475
Pro Asn Ala Thr Asp Lys Gln Lys Ala Ile Leu Gln Thr Met Ser
2480 2485 2490
Asp Ile Ile Leu His Ile Arg Tyr Thr Ile Arg
2495 2500
<210>60
<211>1428
<212>PRT
<213>Serratia entomophila
<400>60
Met Gln Asn His Gln Asp Met Ala Ile Thr Ala Pro Thr Leu Pro Ser
1 5 10 15
Gly Gly Gly Ala Val Thr Gly Leu Lys Gly Asp Ile Ala Ala Ala Gly
20 25 30
Pro Asp Gly Ala Ala Thr Leu Ser Ile Pro Leu Pro Val Ser Pro Gly
35 40 45
Arg Gly Tyr Ala Pro Thr Gly Ala Leu Asn Tyr His Ser Arg Ser Gly
50 55 60
Asn Gly Pro Phe Gly Ile Gly Trp Gly Ile Gly Gly Ala Ala Val Gln
65 70 75 80
Arg Arg Thr Arg Asn Gly Ala Pro Thr Tyr Asp Asp Thr Asp Glu Phe
85 90 95
Thr Gly Pro Asp Gly Glu Val Leu Val Pro Ala Leu Thr Ala Ala Gly
100 105 110
Thr Gln Glu Ala Arg Gln Ala Thr Ser Leu Leu Gly Ile Asn Pro Gly
115 120 125
Gly Ser Phe Asn Val Gln Val Tyr Arg Ser Arg Thr Glu Gly Ser Leu
130 135 140
Ser Arg Leu Glu Arg Trp Leu Pro Ala Asp Glu Thr Glu Thr Glu Phe
145 150 155 160
Trp Val Leu Tyr Thr Pro Asp Gly Gln Val Ala Leu Leu Gly Arg Asn
165 170 175
Ala Gln Ala Arg Ile Ser Asn Pro Thr Ala Pro Thr Gln Thr Ala Val
180 185 190
Trp Leu Met Glu Ser Ser Val Ser Leu Thr Gly Glu Gln Met Tyr Tyr
195 200 205
Gln Tyr Arg Ala Glu Asp Asp Asp Gly Cys Asp Glu Ala Glu Arg Asp
210 215 220
Ala His Pro Gln Ala Gly Ala Gln Arg Tyr Pro Val Ala Val Trp Tyr
225 230 235 240
Gly Asn Arg Gln Ala Ala Arg Thr Leu Pro Ala Leu Val Ser Thr Pro
245 250 255
Ser Met Asp Ser Trp Leu Phe Ile Leu Val Phe Asp Tyr Gly Glu Arg
260 265 270
Ser Ser Val Leu Ser Glu Ala Pro Ala Trp Gln Thr Pro Gly Ser Gly
275 280 285
Glu Trp Leu Cys Arg Gln Asp Cys Phe Ser Gly Tyr Glu Phe Gly Phe
290 295 300
Asn Leu Arg Thr Arg Arg Leu Cys Arg Gln Val Leu Met Phe His Tyr
305 310 315 320
Leu Gly Val Leu Ala Gly Ser Ser Gly Ala Asn Asp Ala Pro Ala Leu
325 330 335
Ile Ser Arg Leu Leu Leu Asp Tyr Arg Glu Ser Pro Ser Leu Ser Leu
340 345 350
Leu Glu Asn Val His Gln Val Ala Tyr Glu Ser Asp Gly Thr Ser Cys
355 360 365
Ala Leu Pro Ala Leu Ala Leu Gly Trp Gln Thr Phe Thr Pro Pro Thr
370 375 380
Leu Ser Ala Trp Gln Thr Arg Asp Asp Met Gly Lys Leu Ser Leu Leu
385 390 395 400
Gln Pro Tyr Gln Leu Val Asp Leu Asn Gly Glu Gly Val Val Gly Ile
405 410 415
Leu Tyr Gln Asp Ser Gly Ala Trp Trp Tyr Arg Glu Pro Val Arg Gln
420 425 430
Ser Gly Asp Asp Pro Asp Ala Val Thr Trp Gly Ala Ala Ala Ala Leu
435 440 445
Pro Thr Met Pro Ala Leu His Asn Ser Gly Ile Leu Ala Asp Leu Asn
450 455 460
Gly Asp Gly Arg Leu Glu Trp Val Val Thr Ala Pro Gly Val Ala Gly
465 470 475 480
Met Tyr Asp Arg Thr Pro Gly Arg Asp Trp Leu His Phe Thr Pro Leu
485 490 495
Ser Ala Leu Pro Val Glu Tyr Ala His Pro Lys Ala Val Leu Ala Asp
500 505 510
Ile Leu Gly Ala Gly Leu Thr Asp Met Val Leu Ile Gly Pro Arg Ser
515 520 525
Val Arg Leu Tyr Ser Gly Lys Asn Asp Gly Trp Asn Lys Gly Glu Thr
530 535 540
Val Gln Gln Thr Glu Arg Leu Thr Leu Pro Val Pro Gly Val Asp Pro
545 550 555 560
Arg Thr Leu Val Ala Phe Ser Asp Met Ala Gly Ser Gly Gln Gln His
565 570 575
Leu Thr Glu Val Arg Ala Asn Gly Val Arg Tyr Trp Pro Asn Leu Gly
580 585 590
His Gly Arg Phe Gly Gln Pro Val Asn Ile Pro Gly Phe Ser Gln Ser
595 600 605
Val Thr Thr Phe Asn Pro Asp Gln Ile Leu Leu Ala Asp Thr Asp Gly
610 615 620
Ser Gly Thr Thr Asp Leu Ile Tyr Ala Met Ser Asp Arg Leu Val Ile
625 630 635 640
Tyr Phe Asn Gln Ser Gly Asn Tyr Phe Ala Glu Pro His Thr Leu Leu
645 650 655
Leu Pro Lys Gly Val Arg Tyr Asp Arg Thr Cys Ser Leu Gln Val Ala
660 665 670
Asp Ile Gln Gly Leu Gly Val Pro Ser Leu Leu Leu Thr Val Pro His
675 680 685
Val Ala Pro His His Trp Val Cys His Leu Ser Ala Asp Lys Pro Trp
690 695 700
Leu Leu Asn Gly Met Asn Asn Asn Met Gly Ala Arg His Ala Leu His
705 710 715 720
Tyr Arg Ser Ser Val Gln Phe Trp Leu Asp Glu Lys Ala Glu Ala Leu
725 730 735
Ala Ala Gly Ser Ser Pro Ala Cys Tyr Leu Pro Phe Thr Leu His Thr
740 745 750
Leu Trp Arg Ser Val Val Gln Asp Glu Ile Thr Gly Asn Arg Leu Val
755 760 765
Ser Asp Val Leu Tyr Arg His Gly Val Trp Asp Gly Gln Glu Arg Glu
770 775 780
Phe Arg Gly Phe Gly Phe Val Glu Ile Arg Asp Thr Asp Thr Leu Ala
785 790 795 800
Ser Gln Gly Thr Ala Thr Glu Leu Ser Met Pro Ser Val Ser Arg Asn
805 810 815
Trp Tyr Ala Thr Gly Val Pro Ala Val Asp Glu Arg Leu Pro Glu Thr
820 825 830
Tyr Trp Gln Asn Asp Ala Ala Ala Phe Ala Asp Phe Ala Thr Arg Phe
835 840 845
Thr Val Gly Ser Gly Glu Asp Glu Gln Thr Tyr Thr Pro Asp Asp Ser
850 855 860
Lys Thr Phe Trp Leu Gln Arg Ala Leu Lys Gly Ile Leu Leu Arg Ser
865 870 875 880
Glu Leu Tyr Gly Ala Asp Gly Ser Ser Gln Ala Asp Ile Pro Tyr Ser
885 890 895
Val Thr Glu Ser Arg Pro Gln Val Arg Leu Val Glu Ala Asn Gly Asp
900 905 910
Tyr Pro Val Val Trp Pro Met Gly Ala Glu Ser Arg Thr Ser Val Tyr
915 920 925
Glu Arg Tyr His Asn Asp Pro Gln Cys Gln Gln Gln Ala Val Leu Leu
930 935 940
Ser Asp Glu Tyr Gly Phe Pro Leu Arg Gln Val Ser Val Asn Tyr Pro
945 950 955 960
Arg Arg Pro Pro Ser Ala Asp Asn Pro Tyr Pro Ala Ser Leu Pro Ala
965 970 975
Thr Leu Phe Ala Asn Ser Tyr Asp Glu Gln Gln Gln Ile Leu Arg Leu
980 985 990
Gly Leu Gln Gln Ser Ser Ala His His Leu Val Ser Leu Ser Glu Gly
995 1000 1005
His Trp Leu Leu Gly Leu Ala Glu Ala Ser Arg Asp Asp Val Phe
1010 1015 1020
Thr Tyr Ser Ala Asp Asn Val Pro Glu Gly Gly Leu Thr Leu Glu
1025 1030 1035
His Leu Leu Ala Pro Glu Ser Leu Val Ser Asp Ser Gln Val Gly
1040 1045 1050
Thr Leu Ala Gly Gln Gln Gln Val Trp Tyr Leu Asp Ser Gln Asp
1055 1060 1065
Val Ala Thr Val Ala Ala Pro Pro Leu Pro Pro Lys Val Ala Phe
1070 1075 1080
Ile Glu Thr Ala Val Leu Asp Glu Gly Met Val Ser Ser Leu Ala
1085 1090 1095
Ala Tyr Ile Val Asp Glu His Leu Glu Gln Ala Gly Tyr Arg Gln
1100 1105 1110
Ser Gly Tyr Leu Phe Pro Arg Gly Arg Glu Ala Glu Gln Ala Leu
1115 1120 1125
Trp Thr Gln Cys Gln Gly Tyr Val Thr Tyr Ala Gly Ala Glu His
1130 1135 1140
Phe Trp Leu Pro Leu Ser Phe Arg Asp Ser Met Leu Thr Gly Pro
1145 1150 1155
Val Thr Val Thr Arg Asp Ala Tyr Asp Cys Val Ile Thr Gln Trp
1160 1165 1170
Gln Asp Ala Ala Gly Ile Val Thr Thr Ala Asp Tyr Asp Trp Arg
1175 1180 1185
Phe Leu Thr Pro Val Arg Val Thr Asp Pro Asn Asp Asn Leu Gln
1190 1195 1200
Ser Val Thr Leu Asp Ala Leu Gly Arg Val Thr Thr Leu Arg Phe
1205 1210 1215
Trp Gly Thr Glu Asn Gly Ile Ala Thr Gly Tyr Ser Asp Ala Thr
1220 1225 1230
Leu Ser Val Pro Asp Gly Ala Ala Ala Ala Leu Ala Leu Thr Ala
1235 1240 1245
Pro Leu Pro Val Ala Gln Cys Leu Val Tyr Val Thr Asp Ser Trp
1250 1255 1260
Gly Asp Asp Asp Asn Glu Lys Met Pro Pro His Val Val Val Leu
1265 1270 1275
Ala Thr Asp Arg Tyr Asp Ser Asp Thr Gly Gln Gln Val Arg Gln
1280 1285 1290
Gln Val Thr Phe Ser Asp Gly Phe Gly Arg Glu Leu Gln Ser Ala
1295 1300 1305
Thr Arg Gln Ala Glu Gly Asn Ala Trp Gln Arg Gly Arg Asp Gly
1310 1315 1320
Lys Leu Val Thr Ala Ser Asp Gly Leu Pro Val Thr Val Ala Thr
1325 1330 1335
Asn Phe Arg Trp Ala Val Thr Gly Arg Ala Glu Tyr Asp Asn Lys
1340 1345 1350
Gly Leu Pro Val Arg Val Tyr Gln Pro Tyr Phe Leu Asp Ser Trp
1355 1360 1365
Gln Tyr Val Ser Asp Asp Ser Ala Arg Gln Asp Leu Tyr Ala Asp
1370 1375 1380
Thr His Phe Tyr Asp Pro Thr Ala Arg Glu Trp Gln Val Ile Thr
1385 1390 1395
Ala Lys Gly Glu Arg Arg Gln Val Leu Tyr Thr Pro Trp Phe Val
1400 1405 1410
Val Ser Glu Asp Glu Asn Asp Thr Val Gly Leu Asn Asp Ala Ser
1415 1420 1425
<210>61
<211>973
<212>PRT
<213>Serratia entomophila
<400>61
Met Ser Thr Ser Leu Phe Ser Ser Thr Pro Ser Val Ala Val Leu Asp
1 5 10 15
Asn Arg Gly Leu Leu Val Arg Glu Leu Gln Tyr Tyr Arg His Pro Asp
20 25 30
Thr Pro Glu Glu Thr Asp Glu Arg Ile Thr Cys His Gln His Asp Glu
35 40 45
Arg Gly Ser Leu Ser Gln Ser Ala Asp Pro Arg Leu His Ala Ala Gly
50 55 60
Leu Thr Asn Phe Thr Tyr Leu Asn Ser Leu Thr Gly Thr Val Leu Gln
65 70 75 80
Ser Val Ser Ala Asp Ala Gly Thr Ser Leu Glu Leu Ser Asp Ala Ala
85 90 95
Gly Arg Ala Phe Leu Ala Val Thr Gly Ala Gly Thr Glu Asp Ala Val
100 105 110
Thr Arg Thr Trp Gln Tyr Glu Asp Asp Thr Leu Pro Gly Arg Pro Leu
115 120 125
Ser Ile Thr Glu Gln Val Thr Gly Glu Ala Ala Gln Ile Thr Glu Arg
130 135 140
Phe Val Tyr Ala Gly Asn Thr Asp Ala Glu Lys Ile Leu Asn Leu Ala
145 150 155 160
Gly Gln Cys Val Ser His Tyr Asp Thr Ala Gly Leu Val Gln Thr Asp
165 170 175
Ser Ile Ala Leu Ser Gly Val Pro Leu Ala Val Thr Arg Gln Leu Leu
180 185 190
Pro Asp Ala Ala Gly Ala Asn Trp Met Gly Glu Asp Ala Ser Ala Trp
195 200 205
Asn Asp Leu Leu Asp Gly Glu Thr Phe Phe Thr Gln Thr His Ala Asp
210 215 220
Ala Thr Gly Ala Val Leu Ser Ile Thr Asp Ala Lys Gly Asn Leu Gln
225 230 235 240
Arg Val Ala Tyr Asp Val Ala Gly Leu Leu Ser Gly Ser Trp Leu Thr
245 250 255
Leu Lys Asp Gly Thr Glu Gln Val Ile Val Ala Ser Leu Thr Tyr Ser
260 265 270
Ala Ala Gly Lys Lys Leu Arg Glu Glu His Gly Asn Gly Val Val Thr
275 280 285
Ser Tyr Ile Tyr Glu Pro Glu Thr Gln Arg Leu Thr Gly Ile Lys Thr
290 295 300
Glu Arg Pro Ser Gly His Val Ala Gly Ala Lys Val Leu Gln Asp Leu
305 310 315 320
Arg Tyr Thr Tyr Asp Pro Val Gly Asn Val Leu Ser Val Asn Asn Asp
325 330 335
Ala Glu Glu Thr Arg Phe Trp Arg Asn Gln Lys Val Val Pro Glu Asn
340 345 350
Thr Tyr Ile Tyr Asp Ser Leu Tyr Gln Leu Val Ser Ala Thr Gly Arg
355 360 365
Glu Met Ala Asn Ala Gly Gln Gln Gly Asn Asp Leu Pro Ser Ala Thr
370 375 380
Ala Pro Leu Pro Thr Asp Ser Ser Ala Tyr Thr Asn Tyr Thr Arg Thr
385 390 395 400
Tyr Arg Tyr Asp Arg Gly Gly Asn Leu Thr Gln Met Arg His Ser Ala
405 410 415
Pro Ala Thr Asn Asn Asn Tyr Thr Thr Asp Ile Thr Val Ser Asp Arg
420 425 430
Ser Asn Arg Ala Val Leu Ser Thr Leu Ala Glu Val Pro Ser Asp Val
435 440 445
Asp Met Leu Phe Ser Ala Gly Gly His Gln Lys His Leu Gln Pro Gly
450 455 460
Gln Ala Leu Val Trp Thr Pro Arg Gly Glu Leu Gln Lys Val Thr Pro
465 470 475 480
Val Val Arg Asp Gly Gly Ala Asp Asp Ser Glu Ser Tyr Arg Tyr Asp
485 490 495
Ala Gly Ser Gln Arg Ile Ile Lys Thr Gly Thr Arg Gln Thr Gly Asn
500 505 510
Asn Val Gln Thr Gln Arg Val Val Tyr Leu Pro Gly Leu Glu Leu Arg
515 520 525
Ile Met Ala Asn Gly Val Thr Glu Lys Glu Ser Leu Gln Val Ile Thr
530 535 540
Val Gly Glu Ala Gly Arg Ala Gln Val Arg Val Leu His Trp Glu Ile
545 550 555 560
Gly Lys Pro Asp Asp Leu Asp Glu Asp Ser Val Arg Tyr Ser Tyr Asp
565 570 575
Asn Leu Val Gly Ser Ser Gln Leu Glu Leu Asp Arg Glu Gly Tyr Leu
580 585 590
Ile Ser Glu Glu Glu Phe Tyr Pro Tyr Gly Gly Thr Ala Val Leu Thr
595 600 605
Ala Arg Ser Glu Val Glu Ala Asp Tyr Lys Thr Ile Arg Tyr Ser Gly
610 615 620
Lys Glu Arg Asp Ala Thr Gly Leu Asp Tyr Tyr Gly Tyr Arg Tyr Tyr
625 630 635 640
Gln Pro Trp Ala Gly Arg Trp Leu Ser Thr Asp Pro Ala Gly Thr Val
645 650 655
Asp Gly Leu Asn Leu Phe Arg Met Val Arg Asn Asn Pro Val Thr Leu
660 665 670
Phe Asp Ser Asn Gly Arg Ile Ser Thr Gly Gln Glu Ala Arg Arg Leu
675 680 685
Val Gly Glu Ala Phe Val His Pro Leu His Met Pro Val Phe Glu Arg
690 695 700
Ile Ser Val Glu Arg Lys Ile Ser Met Ser Val Arg Glu Ala Gly Ile
705 710 715 720
Tyr Thr Ile Ser Ala Leu Gly Glu Gly Ala Ala Ala Lys Gly His Asn
725 730 735
Ile Leu Glu Lys Thr Ile Lys Pro Gly Ser Leu Lys Ala Ile Tyr Gly
740 745 750
Asp Lys Ala Glu Ser Ile Leu Gly Leu Ala Lys Arg Ser Gly Leu Val
755 760 765
Gly Arg Val Gly Gln Trp Asp Ala Ser Gly Val Arg Gly Ile Tyr Ala
770 775 780
His Asn Arg Pro Gly Gly Glu Asp Leu Val Tyr Pro Val Ser Leu Gln
785 790 795 800
Asn Thr Ser Ala Asn Glu Ile Val Asn Ala Trp Ile Lys Phe Lys Ile
805 810 815
Ile Thr Pro Tyr Thr Gly Asp Tyr Asp Met His Asp Ile Ile Lys Phe
820 825 830
Ser Asp Gly Lys Gly His Val Pro Thr Ala Glu Ser Ser Glu Glu Arg
835 840 845
Gly Val Lys Asp Leu Ile Asn Lys Gly Val Ala Glu Val Asp Pro Ser
850 855 860
Arg Pro Phe Glu Tyr Thr Ala Met Asn Val Ile Arg His Gly Pro Gln
865 870 875 880
Val Asn Phe Val Pro Tyr Met Trp Glu His Glu His Asp Lys Val Val
885 890 895
Asn Asp Asn Gly Tyr Leu Gly Val Val Ala Ser Pro Gly Pro Phe Pro
900 905 910
Val Ala Met Val His Gln Gly Glu Trp Thr Val Phe Asp Asn Ser Glu
915 920 925
Glu Leu Phe Asn Phe Tyr Lys Ser Thr Asn Thr Pro Leu Pro Glu His
930 935 940
Trp Ser Gln Asp Phe Met Asp Arg Gly Lys Gly Ile Val Ala Thr Pro
945 950 955 960
Arg His Ala Glu Leu Leu Asp Lys Arg Arg Val Met Tyr
965 970
<210>62
<211>2499
<212>PRT
<213>Photorhabdus luminescens
<400>62
Met Asn Thr Leu Lys Ser Glu Tyr Gln Gln Ala Leu Gly Ala Gly Phe
1 5 10 15
Asn Asn Leu Thr Asp Ile Cys His Leu Ser Phe Asp Glu Leu Arg Lys
20 25 30
Lys Val Lys Asp Lys Leu Ser Trp Ser Gln Thr Gln Ser Leu Tyr Leu
35 40 45
Glu Ala Gln Gln Val Gln Lys Asp Asn Leu Leu His Glu Ala Arg Ile
50 55 60
Leu Lys Arg Ala Asn Pro His Leu Gln Ser Ala Val His Leu Ala Leu
65 70 75 80
Thr Ala Pro His Ala Asp Gln Gln Gly Tyr Asn Ser Arg Phe Gly Asn
85 90 95
Arg Ala Ser Lys Tyr Ala Ala Pro Gly Ala Ile Ser Ser Met Phe Ser
100 105 110
Leu Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Gln Ala Arg Asn Leu His
115 120 125
Ala Glu Gly Ser Ile Tyr His Leu Asp Thr Arg Arg Pro Asp Leu Lys
130 135 140
Ser Leu Val Leu Ser Gln Lys Asn Met Asn Thr Glu Ile Ser Thr Leu
145 150 155 160
Ser Leu Ser Asn Asn Met Leu Leu Asn Ser Ile Lys Thr Gln Pro Asn
165 170 175
Leu Asn Ser His Ala Lys Val Met Glu Lys Leu Ser Thr Phe Arg Thr
180 185 190
Ser Gly Ser Met Pro Tyr His Asp Ala Tyr Glu Ser Val Arg Lys Ile
195 200 205
Ile Gln Leu Gln Ala Pro Val Phe Glu Gln Ser Ser Thr Leu Thr Asp
210 215 220
Thr Pro Ile Thr Lys Leu Met Tyr Gln Ile Ser Leu Leu Gly Ile Asn
225 230 235 240
Ala Ser Val Ser Pro Glu Leu Phe Thr Ile Leu Thr Gln Lys Ile Lys
245 250 255
Pro Ala Thr Asn Ala Asp Asn Thr Asn Glu Leu Lys Lys Leu Tyr Lys
260 265 270
Lys Asn Phe Gly Glu Ile Lys Ser Ile Gln Met Ala Arg Ala Glu Tyr
275 280 285
Leu Lys Ser Tyr Tyr Asn Leu Thr Asp Lys Glu Leu Asn Gln Phe Ser
290 295 300
Lys Lys Ile Lys Gln Ile Asp Ser Leu Trp Asn Ile Gly Asp Glu Ile
305 310 315 320
Thr Gln Tyr His Leu Leu Lys Phe Asn Lys Ala Ile Asn Leu Ser Arg
325 330 335
Ser Thr Glu Leu Ser Pro Ile Ile Leu Asn Ser Ile Ala Ile Asp Ile
340 345 350
Leu Lys Lys Thr Pro Pro Glu Asp Asp Ser Asp Asn Pro Phe Arg Asp
355 360 365
Asp Pro Asp Tyr Leu Glu Ser Phe Gln Asp Leu Asp Leu Ser Asp Glu
370 375 380
Pro Asp Ile Asp Glu Asp Val Leu Arg Glu Ala Leu Arg Val Lys Asp
385 390 395 400
Tyr Met Gln Arg Tyr Gly Ile Asp Ala Glu Thr Ala Leu Ile Leu Cys
405 410 415
Lys Ala Pro Ile Ser Glu Asn Pro Ser His Pro Asp Leu Ser Lys Leu
420 425 430
Leu Ala Asp Ile His Gln Leu Thr Ile Asp Glu Leu Gly Val Leu Leu
435 440 445
Val Ala Ile Asp Glu Gly Lys Thr Asp Leu Ser Gln Ile Thr His Asp
450 455 460
Asn Leu Ala Val Leu Ile Ser Lys Leu Tyr Ser Val Thr Asn Trp Leu
465 470 475 480
Arg Thr Arg Lys Trp Ser Val Tyr Gln Leu Phe Val Met Thr Thr Asp
485 490 495
Lys Tyr Asn Lys Thr Leu Thr Pro Glu Ile Asn Asn Leu Leu Asp Thr
500 505 510
Val Tyr Asn Gly Leu Gln Asn Phe Tyr Lys Asp Asn Leu Leu Lys Ile
515 520 525
Lys Asp Asn Leu Leu Lys Ala Lys Glu Ser Leu Pro Glu Asp Lys Asp
530 535 540
Asn Leu Pro Lys Ala Glu Gln Tyr Leu Leu Glu Ala Glu Lys Tyr Leu
545 550 555 560
Leu Ala Ala Glu Lys Tyr Leu Leu Ala Ala Glu Lys Tyr Leu Leu Glu
565 570 575
Ala Asn Lys Asn Pro Leu Glu Ala Lys Lys Ala Leu Lys Glu Tyr Glu
580 585 590
Lys Asn Gln Glu Ala Tyr Glu Lys Asn Leu Lys Glu His Glu Lys Tyr
595 600 605
Leu Leu Lys Ala Gly Glu Asn Leu Pro Ala Ile Lys Glu Asn Leu Leu
610 615 620
Lys Ile Lys Glu Asn Leu Pro Lys Ala Ile Ser Pro Tyr Ile Ala Ala
625 630 635 640
Ala Leu Gln Leu Pro Ser Glu Asn Val Ala Leu Ser Val Leu Ala Trp
645 650 655
Ala Asp Lys Leu Asn Ser Gly Lys Glu Asn Lys Met Thr Ala Asp Ser
660 665 670
Phe Trp Asn Trp Leu Arg Lys Lys Pro Ile Glu Thr Gln Ser Lys Thr
675 680 685
Thr Glu Ala Thr Glu Ala Thr Glu Ala Thr Glu Ala Thr Glu Ala Thr
690 695 700
Glu Ala Thr Glu Lys Thr Thr Leu Ile Gln Gln Ala Val Gln Tyr Cys
705 710 715 720
Gln Cys Leu Ala Gln Leu Ala Leu Ile Tyr Arg Ser Thr Gly Leu Ser
725 730 735
Glu Ser Thr Leu Arg Leu Phe Val Thr Asn Pro Gln Ile Phe Gly Leu
740 745 750
Thr Ala Lys Thr Thr Ser Thr His Asn Val Leu Ser Leu Ile Met Leu
755 760 765
Thr Arg Phe Thr Asp Trp Val Asn Ser Leu Gly Glu Asn Ala Ser Ser
770 775 780
Val Leu Thr Glu Phe Glu Lys Gly Thr Leu Thr Ala Glu Leu Leu Ala
785 790 795 800
Asn Ala Met Asn Leu Asp Lys Asn Leu Leu Glu Gln Ala Ser Thr Gln
805 810 815
Ala Gln Ala Asp Phe Ser Asn Trp Pro Ser Ile Asp Asn Leu Leu Gln
820 825 830
Trp Ile Asn Ile Ser Arg Gln Leu Asn Ile Ser Pro Gln Gly Val Ser
835 840 845
Glu Leu Ala Lys Ile Leu Asp Ile Glu Ser Ser Thr Asn Tyr Ala Gln
850 855 860
Trp Glu Asn Val Ala Ser Ile Leu Thr Ala Gly Leu Asp Thr Gln Lys
865 870 875 880
Ala Asn Thr Leu His Ala Phe Leu Gly Glu Ser Arg Ser Thr Ala Leu
885 890 895
Ser Thr Tyr Tyr Ile Tyr Ser His Asn Gln Lys Asp Arg Glu Glu Arg
900 905 910
Lys His Thr Val Ile Lys Asp Arg Asp Asp Leu Tyr Gln Tyr Leu Leu
915 920 925
Ile Asp Asn Gln Val Ser Ala Ala Ile Lys Thr Thr Glu Ile Ala Glu
930 935 940
Ala Ile Ala Ser Ile Gln Leu Tyr Ile Asn Arg Ala Leu Lys Asn Met
945 950 955 960
Glu Gly Asp Thr Asp Thr Ser Val Thr Ser Arg Leu Phe Phe Thr Asn
965 970 975
Trp Asp Lys Tyr Asn Lys Arg Tyr Ser Thr Trp Ala Gly Ile Thr Lys
980 985 990
Leu Leu Tyr Tyr Pro Glu Asn Tyr Ile Asp Pro Thr Leu Arg Ile Gly
995 1000 1005
Gln Thr Lys Met Met Asp Thr Leu Leu Gln Ser Ile Ser Gln Ser
1010 1015 1020
Gln Leu Asn Thr Asp Thr Val Glu Asp Ala Phe Lys Ser Tyr Leu
1025 1030 1035
Thr Ser Phe Glu Gln Val Ala Asn Leu Glu Val Ile Ser Ala Tyr
1040 1045 1050
His Asp Asn Ile Asn Asn Asp Gln Gly Leu Thr Tyr Phe Ile Gly
1055 1060 1065
Arg Ser Lys Thr Glu Val Asn Gln Tyr Tyr Trp Arg Ser Val Asp
1070 1075 1080
His Asn Lys Phe Ser Glu Gly Lys Phe Pro Ala Asn Ala Trp Ser
1085 1090 1095
Glu Trp His Lys Ile Asp Cys Pro Ile Asn Pro Tyr Glu Asp Thr
1100 1105 1110
Ile Arg Pro Val Val Tyr Gln Ser Arg Leu Tyr Ile Ile Trp Leu
1115 1120 1125
Glu Gln Lys Lys Val Thr Asn Arg Ala Glu Gly Glu Ala Ile Lys
1130 1135 1140
Gln Gly Ser Lys Thr Thr Thr Ser Tyr His Tyr Glu Leu Lys Leu
1145 1150 1155
Ala His Ile Arg Tyr Asp Gly Thr Trp Asn Thr Pro Ile Thr Phe
1160 1165 1170
Asp Val Asp Glu Lys Ile Ser Gly Leu Asn Leu Glu Leu Asn Lys
1175 1180 1185
Ala Leu Gly Leu Tyr Cys Ala Ser Tyr Gln Gly Lys Asp Lys Leu
1190 1195 1200
Leu Val Met Phe Tyr Lys Lys Gln Glu Gln Leu Asn Asn Tyr Thr
1205 1210 1215
Glu Lys Thr Gly Asn Thr Tyr Thr Ala Pro Ile Lys Gly Leu Tyr
1220 1225 1230
Ile Thr Ser Asn Met Ser Pro Glu Glu Met Thr Pro Glu Ser Tyr
1235 1240 1245
Arg Leu Asn Ala His Lys Gln Phe Asp Thr Asn Asn Val Val Arg
1250 1255 1260
Val Asn Asn Arg Tyr Ala Glu Ser Tyr Glu Ile Pro Ser Ser Val
1265 1270 1275
Asn Ser Asn Asn Gly Tyr Asp Trp Gly Glu Gly Tyr Leu Ser Met
1280 1285 1290
Val Tyr Gly Gly Ser Ile Leu Ile Thr Arg Asp Pro Ser Asp Asn
1295 1300 1305
Ser Lys Ile Gln Ile Ser Pro Lys Leu Arg Ile Ile His Asn Gly
1310 1315 1320
Tyr Glu Gly Arg Gln Arg Asn Gln Cys Asn Leu Met Lys Lys Tyr
1325 1330 1335
Gly Lys Leu Gly Asp Lys Phe Ile Ile Tyr Thr Thr Leu Gly Ile
1340 1345 1350
Asn Pro Asn Asn Leu Ser Asn Lys Lys Leu Ile Tyr Pro Val Tyr
1355 1360 1365
Gln Tyr Glu Gly Asn Glu Ser Lys Leu Ser Gln Gly Arg Leu Leu
1370 1375 1380
Phe Tyr Arg Asp Ser Thr Thr Asn Phe Thr Arg Ala Trp Phe Pro
1385 1390 1395
Asn Leu Ser Ser Asp Ser Lys Glu Met Ser Ile Thr Thr Gly Gly
1400 1405 1410
Asn Ile Ser Gly Asn Tyr Gly Tyr Ile Asp Asn Lys His Ser Asp
1415 1420 1425
Asn Lys Pro Phe Glu Glu Tyr Phe Tyr Met Asp Asp His Gly Gly
1430 1435 1440
Ile Asp Thr Asp Val Ser Glu Pro Ile Phe Ile Asn Thr Lys Ile
1445 1450 1455
Gln Pro Ser Asn Val Lys Ile Ile Val Lys Thr Val Lys Asp Asp
1460 1465 1470
Gly Lys Leu Asp Ser Lys Pro Tyr Ile Ala Glu Asp Lys Val Ser
1475 1480 1485
Val Lys Pro Thr Pro Asn Phe Glu Glu Met Cys Tyr Gln Phe Asn
1490 1495 1500
Asn Leu Asp Gln Ile Asp Val Ser Thr Leu Val Phe Lys Asn Asn
1505 1510 1515
Glu Ala Ser Ile Asp Ile Thr Phe Thr Ala Ser Ala Asp Ala Phe
1520 1525 1530
Glu Ser Gly Lys Glu Gln Arg Asn Leu Gly Glu Glu His Phe Ser
1535 1540 1545
Ile Arg Ile Ile Lys Lys Ala Asn Val Asn Asp Val Leu Thr Leu
1550 1555 1560
His His Asp Pro Ser Gly Ala Gln Tyr Met Gln Trp Gly Ala Tyr
1565 1570 1575
Arg Thr Arg Leu Asn Thr Leu Phe Ala Arg Lys Leu Ile Ser Arg
1580 1585 1590
Ala Asn Ala Gly Ile Asp Thr Ile Leu Ser Met Glu Thr Gln Asn
1595 1600 1605
Ile Gln Glu Pro Gln Leu Gly Lys Gly Phe Tyr Val Asn Phe Thr
1610 1615 1620
Leu Pro Lys Tyr Asp Gln Asn Thr His Gly Asn Glu Arg Gln Phe
1625 1630 1635
Lys Ile His Ile Gly Asn Ile Ala Gly Asp Asn Thr Met Arg Pro
1640 1645 1650
Tyr Tyr Gln Gly Ile Leu Ala Asp Thr Glu Thr Ser Val Val Leu
1655 1660 1665
Phe Val Pro Tyr Glu Lys Gln Ser Tyr Thr Asn Glu Gly Val Arg
1670 1675 1680
Leu Gly Val Glu Tyr Lys Lys Val Ser Tyr Leu Gly Val Trp Glu
1685 1690 1695
Pro Ala Phe Phe Tyr Phe Asn Glu Ile Gln Gln Lys Phe Ile Leu
1700 1705 1710
Ile Asn Asp Ala Asp His Asn Ser Ala Met Thr Gln Ser Gly Glu
1715 1720 1725
Lys Thr Gly Ile Lys Lys Tyr Lys Gly Phe Leu Asp Val Ser Ile
1730 1735 1740
Leu Ile Asp His Gln His Thr Glu Pro Met Asp Phe Asn Gly Ala
1745 1750 1755
Asn Ser Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro Met Leu
1760 1765 1770
Ile Ala Gln Arg Leu Leu His Glu Gln Asn Phe Asp Glu Ala Asn
1775 1780 1785
Arg Trp Leu Lys Tyr Val Trp Asn Pro Ser Gly His Ile Ala Asn
1790 1795 1800
Gly Gln Lys Gln His Pro His Asn Trp Asn Val Arg Pro Leu Gln
1805 1810 1815
Glu Asp Thr Ser Trp Asn Asp Asp Pro Leu Asp Thr Phe Asp Pro
1820 1825 1830
Asp Ala Ile Ala Gln His Asp Pro Met His Tyr Lys Val Ala Thr
1835 1840 1845
Phe Met Cys Ala Leu Asp Leu Leu Ile Glu Gln Gly Asp Tyr Ala
1850 1855 1860
Tyr Arg Gln Leu Glu Arg Asp Thr Leu Ala Glu Ala Lys Met Trp
1865 1870 1875
Tyr Met Gln Ala Leu His Leu Leu Gly Asp Lys Pro His Leu Leu
1880 1885 1890
Leu Ser Ser Thr Trp Ser Asp Pro Glu Leu Lys Glu Ala Ala Asp
1895 1900 1905
Leu Glu Lys Gln Gln Ala His Ala Lys Ala Ile Ala Asp Leu Arg
1910 1915 1920
Gln Gly Gln Pro Lys Asp Gly Ser Asn Thr Asp Leu Phe Leu Pro
1925 1930 1935
Gln Val Asn Glu Val Met Leu Ser Tyr Trp Gln Lys Leu Glu Gln
1940 1945 1950
Arg Leu Tyr Asn Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro
1955 1960 1965
Leu His Leu Pro Ile Phe Ala Thr Pro Ala Asp Pro Lys Ala Leu
1970 1975 1980
Leu Ser Ala Ala Val Ala Ser Ser Gln Gly Gly Ser Asn Leu Pro
1985 1990 1995
Ser Glu Phe Ile Ser Val Trp Arg Phe Pro His Met Leu Glu Asn
2000 2005 2010
Ala Arg Ser Met Val Ser Gln Leu Thr Gln Phe Gly Ser Thr Leu
2015 2020 2025
Gln Asn Ile Ile Glu Arg Gln Asp Ala Glu Ala Leu Asn Thr Leu
2030 2035 2040
Leu Gln Asn Gln Ala Ala Glu Leu Ile Leu Thr Asn Leu Ser Ile
2045 2050 2055
Gln Asp Lys Thr Ile Glu Glu Leu Asp Val Glu Lys Thr Val Leu
2060 2065 2070
Glu Lys Thr Arg Ala Gly Ala Lys Ser Arg Phe Asp Ser Tyr Ser
2075 2080 2085
Lys Phe Tyr Asp Glu Asp Ile Asn Ala Gly Glu Lys Gln Ala Met
2090 2095 2100
Ala Leu Arg Ala Ser Val Ala Gly Ile Ser Thr Ala Leu Gln Ala
2105 2110 2115
Ser His Leu Ala Gly Ala Ala Leu Asp Leu Ala Pro Asn Ile Phe
2120 2125 2130
Gly Phe Ala Asp Gly Gly Ser His Trp Gly Ala Ile Ala Gln Ala
2135 2140 2145
Thr Ser Asn Val Met Glu Phe Ser Ala Ser Val Met Ser Thr Glu
2150 2155 2160
Ala Asp Lys Ile Ser Gln Ser Glu Ala Tyr Arg Arg Arg Arg Gln
2165 2170 2175
Glu Trp Lys Ile Gln Arg Asn Asn Ala Asp Ala Glu Leu Lys Gln
2180 2185 2190
Ile Asp Ala Gln Leu Gln Ser Leu Val Val Arg Arg Glu Ala Ala
2195 2200 2205
Val Leu Gln Lys Thr Ser Leu Lys Thr Gln Gln Glu Gln Thr His
2210 2215 2220
Ala Gln Leu Thr Phe Leu Gln His Lys Phe Ser Asn Gln Ala Leu
2225 2230 2235
Tyr Asn Trp Leu Arg Gly Arg Leu Ser Ala Ile Tyr Phe Gln Phe
2240 2245 2250
Tyr Asp Leu Ala Val Ala Arg Cys Leu Met Ala Glu Met Ala Tyr
2255 2260 2265
Arg Trp Glu Thr Asn Asp Ala Ala Ala Arg Phe Ile Lys Pro Gly
2270 2275 2280
Ala Trp Gln Gly Thr His Ala Gly Leu Leu Ala Gly Glu Thr Leu
2285 2290 2295
Met Leu Asn Leu Ala Gln Met Glu Asp Ala His Leu Lys Gln Glu
2300 2305 2310
Gln Arg Val Leu Glu Val Glu Arg Thr Val Ser Leu Ala Glu Val
2315 2320 2325
Tyr Lys Glu Lys Gly Gln Phe Ser Leu Thr Lys Lys Ile Ala Glu
2330 2335 2340
Leu Val Asn Lys Lys Pro Asp Thr Thr Ser Ser Arg Asn Asn Thr
2345 2350 2355
Leu Asn Phe Gly Glu Gly Asn Ala Lys Thr Ser Leu Gln Ala Ser
2360 2365 2370
Ile Ser Leu Ala Asp Leu Gln Ile Arg His Asp Tyr Pro Glu Asn
2375 2380 2385
Ser Gly Ala Gly Asn Val Arg Arg Ile Lys Gln Ile Ser Val Thr
2390 2395 2400
Leu Pro Ala Leu Leu Gly Pro Tyr Gln Asp Val Gln Ala Ile Leu
2405 2410 2415
Ser Tyr Gly Gly Asp Ala Thr Gly Leu Ala Lys Gly Cys Lys Ala
2420 2425 2430
Leu Ala Val Ser His Gly Met Asn Asp Ser Gly Gln Phe Gln Leu
2435 2440 2445
Asp Phe Asn Asp Gly Lys Phe Leu Pro Phe Glu Gly Ile Glu Ile
2450 2455 2460
Asp Lys Gly Thr Leu Thr Leu Ser Phe Pro Asn Ala Thr Glu Lys
2465 2470 2475
Gln Lys Thr Met Leu Glu Ser Ile Ser Asp Ile Ile Leu His Ile
2480 2485 2490
Arg Tyr Thr Ile Arg Gln
2495
<210>63
<211>2381
<212>PRT
<213>Photorhabdus luminescens
<400>63
Met Asn Ser Tyr Val Lys Glu Ile Pro Asp Val Leu Gln Ser Gln Tyr
1 5 10 15
Gly Ile Asn Cys Leu Thr Asp Ile Cys His Tyr Ser Phe Asn Glu Phe
20 25 30
Arg Gln Gln Val Ser Asp His Leu Ser Trp Ser Glu Thr Asn Arg Leu
35 40 45
Tyr Arg Asp Ala Gln Gln Glu Gln Lys Glu Asn Gln Leu Tyr Glu Ala
50 55 60
Arg Ile Leu Lys Arg Ala Asn Pro Gln Leu Gln Asn Ala Val His Leu
65 70 75 80
Gly Ile Thr Leu Pro His Ala Glu Leu Arg Gly Tyr Asn Ser Glu Phe
85 90 95
Gly Gly Arg Ala Ser Gln Tyr Val Ala Pro Gly Ser Val Ser Ser Met
100 105 110
Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Arg Asn
115 120 125
Leu His Ala Ser Asp Ser Val Tyr His Leu Asp Glu Arg Arg Pro Asp
130 135 140
Leu Gln Ser Met Thr Leu Ser Gln Gln Asn Met Asp Thr Glu Leu Ser
145 150 155 160
Thr Leu Ser Leu Ser Asn Glu Ile Leu Leu Lys Gly Ile Lys Ala Asn
165 170 175
Gln Ser Asn Leu Asp Ser Asp Thr Lys Val Met Glu Met Leu Ser Thr
180 185 190
Phe Arg Pro Ser Gly Thr Ile Pro Tyr His Asp Ala Tyr Glu Asn Val
195 200 205
Arg Lys Ala Ile Gln Leu Gln Asp Pro Lys Leu Glu Gln Phe Gln Lys
210 215 220
Ser Pro Ala Val Ala Gly Leu Met His Gln Ala Ser Leu Leu Gly Ile
225 230 235 240
Asn Asn Ser Ile Ser Pro Glu Leu Phe Asn Ile Leu Thr Glu Glu Ile
245 250 255
Thr Glu Ala Asn Ala Glu Ala Ile Tyr Lys Gln Asn Phe Gly Asp Ile
260 265 270
Asp Pro Ala Cys Leu Ala Met Pro Glu Tyr Leu Lys Ser Tyr Tyr Asn
275 280 285
Phe Ser Asp Glu Glu Leu Ser Gln Phe Ile Arg Lys Tyr Pro Asp Asn
290 295 300
Glu Leu Asn Thr Gln Lys Ile His Leu Leu Lys Ile Asn Lys Ile Ile
305 310 315 320
Leu Leu Ser Gln Ala Val Asn Leu Pro Phe Leu Lys Leu Asp Glu Ile
325 330 335
Ile Pro Glu Gln Asn Ile Thr Pro Thr Val Leu Gly Lys Ile Phe Leu
340 345 350
Val Lys Tyr Tyr Met Gln Lys Tyr Asn Ile Gly Thr Glu Thr Ala Leu
355 360 365
Ile Leu Cys Asn Asp Ser Ile Ser Gln Tyr Ser Tyr Ser Asn Gln Pro
370 375 380
Ser Gln Phe Asp Arg Leu Phe Asn Thr Ser Pro Leu Asn Gly Gln Tyr
385 390 395 400
Phe Val Ile Glu Asp Thr Asn Ile Asp Leu Ser Leu Asn Ser Thr Asp
405 410 415
Asn Trp His Lys Ala Val Leu Lys Arg Ala Phe Asn Val Asp Asp Ile
420 425 430
Ser Leu Tyr Arg Leu Leu His Ile Ala Asn His Asn Asn Thr Asp Gly
435 440 445
Lys Ile Ala Asn Asn Ile Lys Asn Leu Ser Asn Leu Tyr Met Thr Lys
450 455 460
Leu Leu Ala Asp Ile His Gln Leu Thr Ile Asp Glu Leu Tyr Leu Leu
465 470 475 480
Leu Ile Thr Ile Gly Glu Asp Lys Ile Asn Leu Tyr Asp Ile Asp Asp
485 490 495
Lys Glu Leu Glu Lys Leu Ile Asn Arg Leu Asp Thr Leu Ser Asn Trp
500 505 510
Leu His Thr Gln Lys Trp Ser Ile Tyr Gln Leu Phe Leu Met Thr Thr
515 520 525
Thr Asn Tyr Asp Lys Thr Leu Thr Pro Glu Ile Gln Asn Leu Leu Asp
530 535 540
Thr Val Tyr Asn Gly Leu Gln Asn Phe Asp Lys Asn Lys Thr Lys Leu
545 550 555 560
Leu Ala Ala Ile Ala Pro Tyr Ile Ala Ala Thr Leu Gln Leu Pro Ser
565 570 575
Glu Asn Val Ala His Ser Ile Leu Leu Trp Ala Asp Lys Ile Lys Pro
580 585 590
Ser Glu Asn Lys Ile Thr Ala Glu Lys Phe Trp Ile Trp Leu Gln Asn
595 600 605
Arg Asp Thr Thr Glu Leu Ser Lys Pro Pro Glu Met Gln Glu Gln Ile
610 615 620
Ile Gln Tyr Cys His Cys Leu Ala Gln Leu Thr Met Ile Tyr Arg Ser
625 630 635 640
Ser Gly Ile Asn Glu Asn Ala Phe Arg Leu Phe Ile Glu Lys Pro Thr
645 650 655
Ile Phe Gly Ile Pro Asp Glu Pro Asn Lys Ala Thr Pro Ala His Asn
660 665 670
Ala Pro Thr Leu Ile Ile Leu Thr Arg Phe Ala Asn Trp Val Asn Ser
675 680 685
Leu Gly Glu Lys Ala Ser Pro Ile Leu Thr Ala Phe Glu Asn Lys Thr
690 695 700
Leu Thr Ala Glu Lys Leu Ala Asn Ala Met Asn Leu Asp Ala Asn Leu
705 710 715 720
Leu Glu Gln Ala Ser Ile Gln Ala Gln Asn Tyr Lys Gln Val Thr Lys
725 730 735
Glu Asn Thr Phe Ser Asn Trp Gln Ser Ile Asp Ile Ile Leu Gln Trp
740 745 750
Thr Asn Ile Ala Ser Asn Leu Asn Ile Ser Pro Gln Gly Ile Ser Pro
755 760 765
Leu Ile Ala Leu Asp Tyr Ile Lys Pro Ala Gln Lys Thr Pro Thr Tyr
770 775 780
Ala Gln Trp Glu Asn Ala Ala Ile Ala Leu Thr Ala Gly Leu Asp Thr
785 790 795 800
Gln Gln Thr His Thr Leu His Val Phe Leu Asp Glu Ser Arg Ser Thr
805 810 815
Ala Leu Ser Asn Tyr Tyr Ile Gly Lys Val Ala Asn Arg Ala Ala Ser
820 825 830
Ile Lys Ser Arg Asp Asp Leu Tyr Gln Tyr Leu Leu Ile Asp Asn Gln
835 840 845
Val Ser Ala Glu Ile Lys Thr Thr Arg Ile Ala Glu Ala Ile Ala Ser
850 855 860
Ile Gln Leu Tyr Val Asn Arg Ala Leu Glu Asn Ile Glu Ile His Ala
865 870 875 880
Val Ser Asp Val Ile Thr Arg Gln Phe Phe Ile Asp Trp Asp Lys Tyr
885 890 895
Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Gln Leu Val Tyr Tyr
900 905 910
Pro Glu Asn Tyr Ile Asp Pro Thr Met Arg Ile Gly Gln Thr Lys Met
915 920 925
Met Asp Thr Leu Leu Gln Ser Val Ser Gln Ser Gln Leu Asn Ala Asp
930 935 940
Thr Val Glu Asp Ala Phe Lys Ser Tyr Leu Thr Ser Phe Glu Gln Val
945 950 955 960
Ala Asn Leu Glu Val Ile Ser Ala Tyr His Asp Asn Val Asn Asn Asp
965 970 975
Gln Gly Leu Thr Tyr Phe Ile Gly Asn Ser Lys Thr Glu Val Asn Gln
980 985 990
Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Phe Asn Asp Gly Lys Phe
995 1000 1005
Ala Ala Asn Ala Trp Ser Glu Trp His Lys Ile Asp Cys Ala Ile
1010 1015 1020
Asn Pro Tyr Gln Ser Thr Ile Arg Pro Val Ile Tyr Lys Ser Arg
1025 1030 1035
Leu Tyr Leu Ile Trp Leu Glu Gln Lys Glu Thr Ala Lys Gln Lys
1040 1045 1050
Glu Asp Asn Lys Val Thr Thr Asp Tyr His Tyr Glu Leu Lys Leu
1055 1060 1065
Ala His Ile Arg Tyr Asp Gly Thr Trp Asn Val Pro Ile Thr Phe
1070 1075 1080
Asp Val Asp Glu Lys Ile Leu Ala Leu Glu Leu Thr Lys Ser Gln
1085 1090 1095
Ala Pro Gly Leu Tyr Cys Ala Gly Tyr Gln Gly Glu Asp Thr Leu
1100 1105 1110
Leu Ile Met Phe Tyr Arg Lys Lys Glu Lys Leu Asp Asp Tyr Lys
1115 1120 1125
Thr Ala Pro Met Gln Gly Phe Tyr Ile Phe Ser Asp Met Ser Ser
1130 1135 1140
Lys Asp Met Thr Asn Glu Gln Cys Asn Ser Tyr Arg Asp Asn Gly
1145 1150 1155
Tyr Thr His Phe Asp Thr Asn Ser Asp Thr Asn Ser Val Ile Arg
1160 1165 1170
Ile Asn Asn Arg Tyr Ala Glu Asp Tyr Glu Ile Pro Ser Leu Ile
1175 1180 1185
Asn His Ser Asn Ser His Asp Trp Gly Glu Tyr Asn Leu Ser Gln
1190 1195 1200
Val Tyr Gly Gly Asn Ile Val Ile Asn Tyr Lys Val Thr Ser Asn
1205 1210 1215
Asp Leu Lys Ile Tyr Ile Ser Pro Lys Leu Arg Ile Ile His Asp
1220 1225 1230
Gly Lys Glu Gly Arg Glu Arg Ile Gln Ser Asn Leu Ile Lys Lys
1235 1240 1245
Tyr Gly Lys Leu Gly Asp Lys Phe Ile Ile Tyr Thr Ser Leu Gly
1250 1255 1260
Ile Asn Pro Asn Asn Ser Ser Asn Arg Phe Met Phe Tyr Pro Val
1265 1270 1275
Tyr Gln Tyr Asn Gly Asn Thr Ser Gly Leu Ala Gln Gly Arg Leu
1280 1285 1290
Leu Phe His Arg Asp Thr Ser Tyr Ser Ser Lys Val Ala Ala Trp
1295 1300 1305
Ile Pro Gly Ala Gly Arg Ser Leu Ile Asn Glu Asn Ala Asn Ile
1310 1315 1320
Gly Asp Asp Cys Ala Glu Asp Ser Val Asn Lys Pro Asp Asp Leu
1325 1330 1335
Lys Gln Tyr Ile Tyr Met Thr Asp Ser Lys Gly Thr Ala Thr Asp
1340 1345 1350
Val Ser Gly Pro Val Asp Ile Asn Thr Ala Ile Ser Ser Glu Lys
1355 1360 1365
Val Gln Ile Thr Ile Lys Ala Gly Lys Glu Tyr Ser Leu Thr Ala
1370 1375 1380
Asn Lys Asp Val Ser Val Gln Pro Ser Pro Ser Phe Glu Glu Met
1385 1390 1395
Cys Tyr Gln Phe Asn Ala Leu Glu Ile Asp Gly Ser Asn Leu Asn
1400 1405 1410
Phe Thr Asn Asn Ser Ala Ser Ile Asp Val Thr Phe Thr Ala Leu
1415 1420 1425
Ala Asp Asp Gly Arg Lys Leu Gly Tyr Glu Ile Phe Asn Ile Pro
1430 1435 1440
Val Ile Gln Lys Val Lys Thr Asp Asn Ala Leu Thr Leu Phe His
1445 1450 1455
Asp Glu Asn Gly Ala Gln Tyr Met Gln Trp Gly Ala Tyr Arg Ile
1460 1465 1470
Arg Leu Asn Thr Leu Phe Ala Arg Gln Leu Val Glu Arg Ala Asn
1475 1480 1485
Thr Gly Ile Asp Thr Ile Leu Ser Met Glu Thr Gln Asn Ile Gln
1490 1495 1500
Glu Pro Met Met Gly Ile Gly Ala Tyr Ile Glu Leu Ile Leu Asp
1505 1510 1515
Lys Tyr Asn Pro Asp Ile His Gly Thr Asn Lys Ser Phe Lys Ile
1520 1525 1530
Ile Tyr Gly Asp Ile Phe Lys Ala Gly Asp His Phe Pro Ile Tyr
1535 1540 1545
Gln Gly Ala Leu Ser Asp Ile Thr Gln Thr Thr Val Lys Leu Phe
1550 1555 1560
Leu Pro Arg Val Asp Asn Ala Tyr Gly Asn Lys Asn Asn Leu Tyr
1565 1570 1575
Val Tyr Ala Ala Tyr Gln Lys Val Glu Thr Asn Phe Ile Arg Phe
1580 1585 1590
Val Lys Glu Asp Asn Asn Lys Pro Ala Thr Phe Asp Thr Thr Tyr
1595 1600 1605
Lys Asn Gly Thr Phe Pro Gly Leu Ala Ser Ala Arg Val Ile Gln
1610 1615 1620
Thr Val Ser Glu Pro Met Asp Phe Ser Gly Ala Asn Ser Leu Tyr
1625 1630 1635
Phe Trp Glu Leu Phe Tyr Tyr Thr Pro Met Met Val Ala Gln Arg
1640 1645 1650
Leu Leu His Glu Gln Asn Phe Asp Glu Ala Asn Arg Trp Leu Lys
1655 1660 1665
Tyr Val Trp Ser Pro Ser Gly Tyr Ile Val Arg Gly Gln Ile Lys
1670 1675 1680
Asn Tyr His Trp Asn Val Arg Pro Leu Leu Glu Asn Thr Ser Trp
1685 1690 1695
Asn Ser Asp Pro Leu Asp Ser Val Asp Pro Asp Ala Val Ala Gln
1700 1705 1710
His Asp Pro Met His Tyr Lys Val Ala Thr Phe Met Arg Thr Leu
1715 1720 1725
Asp Leu Leu Met Ala Arg Gly Asp His Ala Tyr Arg Gln Leu Glu
1730 1735 1740
Arg Asp Thr Leu Asn Glu Ala Lys Met Trp Tyr Met Gln Ala Leu
1745 1750 1755
His Leu Leu Gly Asn Lys Pro Tyr Leu Pro Leu Ser Ser Val Trp
1760 1765 1770
Asn Asp Pro Arg Leu Asp Asn Ala Ala Ala Thr Thr Thr Gln Lys
1775 1780 1785
Ala His Ala Tyr Ala Ile Thr Ser Leu Arg Gln Gly Thr Gln Thr
1790 1795 1800
Pro Ala Leu Leu Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe
1805 1810 1815
Leu Pro Gln Ile Asn Asp Val Met Leu Ser Tyr Trp Asn Lys Leu
1820 1825 1830
Glu Leu Arg Leu Tyr Asn Leu Arg His Asn Leu Ser Ile Asp Gly
1835 1840 1845
Gln Pro Leu His Leu Pro Ile Tyr Ala Thr Pro Ala Asp Pro Lys
1850 1855 1860
Ala Leu Leu Ser Ala Ala Val Ala Thr Ser Gln Gly Gly Gly Lys
1865 1870 1875
Leu Pro Glu Ser Phe Ile Ser Leu Trp Arg Phe Pro His Met Leu
1880 1885 1890
Glu Asn Ala Arg Ser Met Val Thr Gln Leu Ile Gln Phe Gly Ser
1895 1900 1905
Thr Leu Gln Asn Ile Ile Glu Arg Gln Asp Ala Glu Ser Leu Asn
1910 1915 1920
Ala Leu Leu Gln Asn Gln Ala Lys Glu Leu Ile Leu Thr Thr Leu
1925 1930 1935
Ser Ile Gln Asp Lys Thr Ile Glu Glu Ile Asp Ala Glu Lys Thr
1940 1945 1950
Val Leu Glu Lys Ser Lys Ala Gly Ala Lys Ser Arg Phe Asp Asn
1955 1960 1965
Tyr Ser Lys Leu Tyr Asp Glu Asp Val Asn Ala Gly Glu Arg Gln
1970 1975 1980
Ala Leu Asp Met Arg Ile Ala Ser Gln Ser Ile Thr Ser Gly Leu
1985 1990 1995
Lys Gly Leu His Met Ala Ala Ala Ala Leu Glu Met Val Pro Asn
2000 2005 2010
Ile Tyr Gly Phe Ala Val Gly Gly Thr Arg Tyr Gly Ala Ile Ala
2015 2020 2025
Asn Ala Ile Ala Ile Gly Gly Gly Ile Ala Ala Glu Gly Leu Leu
2030 2035 2040
Ile Glu Ala Glu Lys Val Ser Gln Ser Glu Ile Trp Arg Arg Arg
2045 2050 2055
Arg Gln Glu Trp Glu Ile Gln Arg Asn Asn Ala Glu Ala Glu Met
2060 2065 2070
Lys Gln Ile Asp Ala Gln Leu Lys Ser Leu Thr Val Arg Arg Glu
2075 2080 2085
Ala Ala Val Leu Gln Lys Thr Gly Leu Lys Thr Gln Gln Glu Gln
2090 2095 2100
Thr Gln Ala Gln Leu Ala Phe Leu Gln Arg Lys Phe Ser Asn Gln
2105 2110 2115
Ala Leu Tyr Asn Trp Leu Arg Gly Arg Leu Ala Ala Ile Tyr Phe
2120 2125 2130
Gln Phe Tyr Asp Leu Val Val Ala Arg Cys Leu Met Ala Glu Gln
2135 2140 2145
Ala Tyr Arg Trp Glu Thr Asn Asp Ser Ser Ala Arg Phe Ile Lys
2150 2155 2160
Pro Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu Leu Ala Gly Glu
2165 2170 2175
Thr Leu Met Leu Asn Leu Ala Gln Met Glu Asp Ala His Leu Lys
2180 2185 2190
Gln Glu Gln Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu Ala
2195 2200 2205
Gln Val Tyr Gln Ser Leu Gly Glu Lys Ser Phe Ala Leu Lys Asp
2210 2215 2220
Lys Ile Glu Ala Leu Leu Gln Gly Asp Lys Glu Thr Ser Ala Gly
2225 2230 2235
Asn Asp Gly Asn Gln Leu Lys Leu Thr Asn Asn Thr Leu Ser Ala
2240 2245 2250
Thr Leu Thr Leu Gln Asp Leu Lys Leu Lys Asp Asp Tyr Pro Glu
2255 2260 2265
Glu Met Gln Leu Gly Lys Thr Arg Arg Ile Lys Gln Ile Ser Val
2270 2275 2280
Ser Leu Pro Ala Leu Leu Gly Pro Tyr Gln Asp Val Gln Ala Val
2285 2290 2295
Leu Ser Tyr Gly Gly Asp Ala Thr Gly Leu Ala Lys Gly Cys Lys
2300 2305 2310
Ala Leu Ala Val Ser His Gly Leu Asn Asp Asn Gly Gln Phe Gln
2315 2320 2325
Leu Asp Phe Asn Asp Gly Lys Phe Leu Pro Phe Glu Gly Ile Asp
2330 2335 2340
Ile Asn Asp Lys Gly Thr Phe Thr Leu Ser Phe Pro Asn Ala Ala
2345 2350 2355
Ser Lys Gln Lys Asn Ile Leu Gln Met Leu Thr Asp Ile Ile Leu
2360 2365 2370
His Ile Arg Tyr Thr Ile Leu Glu
2375 2380
<210>64
<211>949
<212>PRT
<213>Photorhabdus luminescens
<400>64
Met Lys Asn Ile Asp Pro Lys Leu Tyr Gln His Thr Pro Thr Val Asn
1 5 10 15
Val Tyr Asp Asn Arg Gly Leu Thr Ile Arg Asn Ile Asp Phe His Arg
20 25 30
Asp Val Ala Gly Gly Asp Thr Asp Thr Arg Ile Thr Arg His Gln Tyr
35 40 45
Asp Thr Arg Gly His Leu Ser Gln Ser Ile Asp Pro Arg Leu Tyr Asp
50 55 60
Ala Lys Gln Thr Asn Asn Ser Thr Asn Pro Asn Phe Leu Trp Gln Tyr
65 70 75 80
Asn Leu Thr Gly Asp Thr Leu Arg Thr Glu Ser Val Asp Ala Gly Arg
85 90 95
Thr Val Ala Leu Asn Asp Ile Glu Gly Arg Gln Val Leu Ile Val Thr
100 105 110
Ala Thr Gly Ala Ile Gln Thr Arg Gln Tyr Glu Ala Asn Thr Leu Pro
115 120 125
Gly Arg Leu Leu Ser Val Ser Glu Gln Ala Pro Gly Glu Gln Thr Pro
130 135 140
Arg Val Thr Glu His Phe Ile Trp Ala Gly Asn Thr Gln Ala Glu Lys
145 150 155 160
Asp His Asn Leu Ala Gly Gln Tyr Val Arg His Tyr Asp Thr Ala Gly
165 170 175
Val Thr Gln Leu Glu Ser Leu Ser Leu Thr Glu Asn Ile Leu Ser Gln
180 185 190
Ser Arg Gln Leu Leu Ala Asp Gly Gln Glu Ala Asp Trp Thr Gly Asn
195 200 205
Asp Glu Thr Leu Trp Gln Thr Lys Leu Asn Ser Glu Thr Tyr Thr Thr
210 215 220
Gln Ser Thr Phe Asp Ala Thr Gly Ala Leu Leu Thr Gln Thr Asp Ala
225 230 235 240
Lys Gly Asn Met Gln Arg Leu Ala Tyr Asn Val Ala Gly Gln Leu Gln
245 250 255
Gly Ser Trp Leu Thr Leu Lys Asn Gln Ser Glu Gln Val Ile Val Lys
260 265 270
Ser Leu Thr Tyr Ser Ala Ala Gly Gln Lys Leu Arg Glu Glu His Gly
275 280 285
Asn Gly Val Ile Thr Glu Tyr Ser Tyr Glu Pro Glu Thr Leu Arg Leu
290 295 300
Ile Gly Thr Thr Thr Arg Arg Gln Ser Asp Ser Lys Val Leu Gln Asp
305 310 315 320
Leu Arg Tyr Glu His Asp Pro Val Gly Asn Ile Ile Ser Val Arg Asn
325 330 335
Asp Ala Glu Ala Thr Arg Phe Trp Arg Asn Gln Lys Ile Val Pro Glu
340 345 350
Asn Thr Tyr Thr Tyr Asp Ser Leu Tyr Gln Leu Ile Ser Ala Thr Gly
355 360 365
Arg Glu Met Ala Asn Ile Gly Gln Gln Ser Asn Gln Leu Pro Ser Pro
370 375 380
Ile Ile Pro Leu Pro Thr Asp Glu Asn Ser Tyr Thr Asn Tyr Thr Arg
385 390 395 400
Ser Tyr Asn Tyr Asp Arg Gly Gly Asn Leu Val Gln Ile Arg His Ser
405 410 415
Ser Pro Ala Ala Gln Asn Asn Tyr Thr Thr Asp Ile Thr Val Ser Asn
420 425 430
Arg Ser Asn Arg Ala Val Leu Ser Ser Leu Thr Ser Asp Pro Thr Gln
435 440 445
Val Glu Ala Leu Phe Asp Ala Gly Gly His Gln Thr Lys Leu Leu Pro
450 455 460
Gly Gln Glu Leu Ser Trp Asn Thr Arg Gly Glu Leu Lys Gln Val Thr
465 470 475 480
Pro Val Ser Arg Glu Ser Ala Ser Asp Arg Glu Trp Tyr Arg Tyr Gly
485 490 495
Asn Asp Gly Met Arg Arg Leu Lys Val Ser Glu Gln Gln Thr Gly Asn
500 505 510
Ser Thr Gln Gln Gln Arg Val Thr Tyr Leu Pro Asp Leu Glu Leu Arg
515 520 525
Thr Thr Gln Asn Gly Thr Thr Thr Ser Glu Asp Leu His Ala Ile Thr
530 535 540
Val Gly Ala Ala Gly His Ala Gln Val Arg Val Leu His Trp Glu Thr
545 550 555 560
Thr Pro Pro Ala Gly Ile Asn Asn Asn Gln Leu Arg Tyr Ser Tyr Asp
565 570 575
Asn Leu Ile Gly Ser Ser Gln Leu Glu Leu Asp Asn Ala Gly Gln Ile
580 585 590
Ile Ser Gln Glu Glu Tyr Tyr Pro Phe Gly Gly Thr Ala Leu Trp Ala
595 600 605
Ala Arg Asn Gln Ile Glu Ala Ser Tyr Lys Ile Leu Arg Tyr Ser Gly
610 615 620
Lys Glu Arg Asp Ala Thr Gly Leu Tyr Tyr Tyr Gly Tyr Arg Tyr Tyr
625 630 635 640
Gln Pro Trp Val Gly Arg Trp Leu Ser Ala Asp Pro Ala Gly Thr Ile
645 650 655
Asp Gly Leu Asn Leu Tyr Arg Met Val Arg Asn Asn Pro Ser Thr Leu
660 665 670
Val Asp Ile Ser Gly Leu Ala Pro Thr Lys Tyr Asn Ile Pro Gly Phe
675 680 685
Asp Phe Asp Val Glu Ile Asp Glu Gln Lys Arg Ser Lys Leu Lys Pro
690 695 700
Thr Leu Ile Arg Ile Lys Asp Glu Phe Leu His Tyr Gly Pro Val Asp
705 710 715 720
Lys Leu Leu Glu Glu Lys Lys Pro Gly Leu Asn Val Pro Glu Glu Leu
725 730 735
Phe Asp Arg Gly Pro Ser Glu Asn Gly Val Ser Thr Leu Thr Phe Lys
740 745 750
Lys Asp Leu Pro Ile Ser Cys Ile Ser Asn Thr Glu Tyr Thr Leu Asp
755 760 765
Ile Leu Tyr Asn Lys His Glu Thr Lys Pro Phe Pro Tyr Glu Asn Glu
770 775 780
Ala Thr Val Gly Ala Asp Leu Gly Val Ile Met Ser Val Glu Phe Gly
785 790 795 800
Asn Lys Ser Ile Gly Asn Ala Ser Asp Glu Asp Leu Lys Glu Glu His
805 810 815
Leu Pro Leu Gly Lys Ser Thr Met Asp Lys Thr Asp Leu Pro Asp Leu
820 825 830
Lys Gln Gly Leu Met Ile Ala Glu Lys Ile Lys Ser Gly Lys Gly Ala
835 840 845
Tyr Pro Phe His Phe Gly Ala Ala Ile Ala Val Val Tyr Gly Glu Asp
850 855 860
Lys Lys Val Ala Ala Ser Ile Leu Thr Asp Leu Ser Glu Pro Lys Arg
865 870 875 880
Asp Glu Gly Glu Tyr Leu Gln Ser Thr Arg Lys Val Ser Ala Met Phe
885 890 895
Ile Thr Asn Val Asn Glu Phe Arg Gly His Asp Tyr Pro Lys Ser Lys
900 905 910
Tyr Ser Ile Gly Leu Val Thr Ala Glu Lys Arg Gln Pro Val Ile Ser
915 920 925
Lys Lys Arg Ala Asn Pro Glu Glu Ala Pro Ser Ser Ser Arg Asn Lys
930 935 940
Lys Leu His Val His
945
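A sequence listing in this layout alternates lines of three-letter residue codes with lines of position numbers. The following is a minimal sketch, not part of the patent, of how such a listing could be converted to a one-letter string and sanity-checked. The function names `parse_listing` and `approx_mass_kda` are hypothetical, and the figure of 110 Da per residue is a common rule of thumb assumed here, not a value from the source. With the `<211>` length of 949 residues given for SEQ ID NO:64, the estimate is about 104 kDa, which is consistent with the 90-120 kDa range the claims give for protein C.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def parse_listing(text):
    """Convert the residue lines of a sequence listing to a one-letter string.

    Header lines starting with '<' and lines made only of position numbers
    are skipped. An unknown token raises ValueError, which surfaces
    extraction damage such as a misspelled residue code.
    """
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens or line.lstrip().startswith("<"):
            continue  # blank line or <210>/<211>/... header
        if all(tok.isdigit() for tok in tokens):
            continue  # position-number ruler line
        for tok in tokens:
            if tok not in THREE_TO_ONE:
                raise ValueError(f"unrecognized residue token: {tok!r}")
            residues.append(THREE_TO_ONE[tok])
    return "".join(residues)


def approx_mass_kda(n_residues, avg_residue_da=110.0):
    """Rough mass estimate; 110 Da per residue is an assumed average."""
    return n_residues * avg_residue_da / 1000.0
```

Raising on unknown tokens, rather than skipping them, keeps the residue count honest: a silently dropped residue would shift every downstream position.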
Claims (20)
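1. A method of controlling or inhibiting an insect, which comprises contacting said insect with effective amounts of a protein A, a protein B, and a protein C, wherein:
(i) each of said proteins A, B, and C is encoded by a naturally occurring gene, or has an amino acid sequence that differs from the product encoded by a naturally occurring gene only by truncation or conservative amino acid changes;
(ii) said protein A is a 230-290 kDa toxin complex insect toxin that is derived from a first taxonomic species, has stand-alone insecticidal activity, and has an amino acid sequence at least 40% identical to a sequence selected from SEQ ID NO:14 (XptA1Xwi), SEQ ID NO:34 (XptA2Xwi), SEQ ID NO:21 (TcdA), SEQ ID NO:62 (TcdA2), SEQ ID NO:63 (TcdA4), and SEQ ID NO:59 (TcbA);
(iii) said protein B is a 130-180 kDa toxin complex potentiator having an amino acid sequence at least 40% identical to a sequence selected from SEQ ID NO:22 (TcdB1), SEQ ID NO:45 (TcdB2), SEQ ID NO:56 (TcaC), SEQ ID NO:18 (XptC1Xwi), SEQ ID NO:49 (XptB1xb), SEQ ID NO:40 (PptB1(orf5)), and SEQ ID NO:60 (SepB);
(iv) said protein C is a 90-120 kDa toxin complex potentiator having an amino acid sequence at least 35% identical to a sequence selected from SEQ ID NO:25 (TccC1), SEQ ID NO:58 (TccC2), SEQ ID NO:47 (TccC3), SEQ ID NO:64 (TccC4), SEQ ID NO:57 (TccC5), SEQ ID NO:16 (XptB1Xwi), SEQ ID NO:51 (XptC1xb), SEQ ID NO:43 (PptC1(orf6 long)), SEQ ID NO:42 (PptC1(orf6 short)), and SEQ ID NO:61 (SepC);
(v) at least one of said protein B and said protein C is derived from a second taxonomic species that is different from said first taxonomic species;
(vi) if said protein B is derived from said second taxonomic species, the amino acid sequence of said protein B is less than 75% identical to the amino acid sequence of any protein known to be produced by said first taxonomic species; and
(vii) if said protein C is derived from said second taxonomic species, the amino acid sequence of said protein C is less than 75% identical to the amino acid sequence of any protein known to be produced by said first taxonomic species;
wherein a given protein is considered to be derived from a given species if the genome of that species contains a gene encoding the protein, or if the protein was designed by truncation of, or conservative amino acid changes to, the amino acid sequence of a protein encoded by a gene contained in the genome of that species.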
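2. The method of claim 1, wherein said first and second taxonomic species are from different genera.
3. The method of claim 2, wherein said protein A is derived from a Photorhabdus species and at least one of said protein B and protein C is derived from a species of Xenorhabdus, Paenibacillus, Serratia, or Pseudomonas.
4. The method of claim 2, wherein said protein A is derived from a Xenorhabdus species and at least one of said protein B and protein C is derived from a species of Photorhabdus, Paenibacillus, Serratia, or Pseudomonas.
5. The method of claim 2, wherein at least one of said protein A, protein B, and protein C is derived from a Xenorhabdus species and at least one of said protein A, protein B, and protein C is derived from a Photorhabdus species.
6. The method of claim 2, wherein at least one of said protein A, protein B, and protein C is derived from a Photorhabdus species and at least one of said protein A, protein B, and protein C is derived from a Xenorhabdus species.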
7. The method of claim 1, wherein:
(i) protein A is SEQ ID NO:34 (XptA2Xwi) or SEQ ID NO:21 (TcdA),
(ii) protein B is SEQ ID NO:45 (TcdB2), SEQ ID NO:40 (PptB11529), or SEQ ID NO:49 (XptB1Xb), and
(iii) protein C is SEQ ID NO:47 (TccC3), SEQ ID NO:42 (PptC11529-short), SEQ ID NO:43 (PptC11529-long), or SEQ ID NO:51 (XptC1xb).
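8. The method of claim 5, wherein said protein A is at least 40% identical to a protein selected from SEQ ID NO:14 (XptA1Xwi) and SEQ ID NO:34 (XptA2Xwi).
9. The method of claim 6, wherein said protein A is at least 40% identical to a protein selected from SEQ ID NO:21 (TcdA) and SEQ ID NO:59 (TcbA).
10. The method of claim 8, wherein protein B is at least 40% identical to an amino acid sequence selected from SEQ ID NO:22, SEQ ID NO:40, SEQ ID NO:45, and SEQ ID NO:49, or protein C is at least 35% identical to an amino acid sequence selected from SEQ ID NO:25, SEQ ID NO:42, SEQ ID NO:47, and SEQ ID NO:51.
11. The method of claim 9, wherein protein B is at least 40% identical to an amino acid sequence selected from SEQ ID NO:18 and SEQ ID NO:40, or protein C is at least 35% identical to an amino acid sequence selected from SEQ ID NO:16 (XptB1wi) and SEQ ID NO:42.
12. The method of claim 5, wherein said protein A is encoded by a polynucleotide that hybridizes under stringent conditions with a probe, wherein said probe is fully complementary to a nucleic acid sequence encoding an amino acid sequence selected from SEQ ID NO:14 (XptA1wi) and SEQ ID NO:20 (XptA2wi).
13. The method of claim 6, wherein said protein A is encoded by a polynucleotide that hybridizes under stringent conditions with a probe, wherein said probe is fully complementary to a nucleic acid sequence encoding the amino acid sequence of SEQ ID NO:21 (TcdA).
14. The method of claim 8, wherein said protein B is encoded by a polynucleotide that hybridizes under stringent conditions with a probe, wherein said probe is fully complementary to a nucleic acid sequence encoding an amino acid sequence selected from SEQ ID NO:22, SEQ ID NO:40, SEQ ID NO:45, and SEQ ID NO:49, or said protein C is encoded by a polynucleotide that hybridizes under stringent conditions with a probe, wherein said probe is fully complementary to a nucleic acid sequence encoding an amino acid sequence selected from SEQ ID NO:25, SEQ ID NO:42, SEQ ID NO:47, and SEQ ID NO:51.
15. The method of claim 9, wherein said protein B is encoded by a polynucleotide that hybridizes under stringent conditions with a probe, wherein said probe is fully complementary to a nucleic acid sequence encoding an amino acid sequence selected from SEQ ID NO:18 and SEQ ID NO:40, or said protein C is encoded by a polynucleotide that hybridizes under stringent conditions with a probe, wherein said probe is fully complementary to a nucleic acid sequence encoding an amino acid sequence selected from SEQ ID NO:16 (XptB1wi) and SEQ ID NO:42.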
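16. A transgenic plant or plant cell that produces a protein A, a protein B, and a protein C, wherein:
(i) each of said proteins A, B, and C is encoded by a naturally occurring gene, or has an amino acid sequence that differs from the product encoded by a naturally occurring gene only by truncation or conservative amino acid changes;
(ii) said protein A is a 230-290 kDa toxin complex insect toxin that is derived from a first taxonomic species, has stand-alone insecticidal activity, and has an amino acid sequence at least 40% identical to a sequence selected from SEQ ID NO:14 (XptA1Xwi), SEQ ID NO:34 (XptA2Xwi), SEQ ID NO:21 (TcdA), SEQ ID NO:62 (TcdA2), SEQ ID NO:63 (TcdA4), and SEQ ID NO:59 (TcbA);
(iii) said protein B is a 130-180 kDa toxin complex potentiator having an amino acid sequence at least 40% identical to a sequence selected from SEQ ID NO:22 (TcdB1), SEQ ID NO:45 (TcdB2), SEQ ID NO:56 (TcaC), SEQ ID NO:18 (XptC1Xwi), SEQ ID NO:49 (XptB1xb), SEQ ID NO:40 (PptB1(orf5)), and SEQ ID NO:60 (SepB);
(iv) said protein C is a 90-120 kDa toxin complex potentiator having an amino acid sequence at least 35% identical to a sequence selected from SEQ ID NO:25 (TccC1), SEQ ID NO:58 (TccC2), SEQ ID NO:47 (TccC3), SEQ ID NO:64 (TccC4), SEQ ID NO:57 (TccC5), SEQ ID NO:16 (XptB1Xwi), SEQ ID NO:51 (XptC1xb), SEQ ID NO:43 (PptC1(orf6 long)), SEQ ID NO:42 (PptC1(orf6 short)), and SEQ ID NO:61 (SepC);
(v) at least one of said protein B and said protein C is derived from a second taxonomic species that is different from said first taxonomic species;
(vi) if said protein B is derived from said second taxonomic species, the amino acid sequence of said protein B is less than 75% identical to the amino acid sequence of any protein known to be produced by said first taxonomic species; and
(vii) if said protein C is derived from said second taxonomic species, the amino acid sequence of said protein C is less than 75% identical to the amino acid sequence of any protein known to be produced by said first taxonomic species;
wherein a given protein is considered to be derived from a given species if the genome of that species contains a gene encoding the protein, or if the protein was designed by truncation of, or conservative amino acid changes to, the amino acid sequence of a protein encoded by a gene contained in the genome of that species.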
17. The transgenic plant or plant cell of claim 16, wherein:
(i) protein A is SEQ ID NO:34 (XptA2Xwi) or SEQ ID NO:21 (TcdA),
(ii) protein B is SEQ ID NO:45 (TcdB2), SEQ ID NO:40 (PptB11529), or SEQ ID NO:49 (XptB1xb), and
(iii) protein C is SEQ ID NO:47 (TccC3), SEQ ID NO:42 (PptC11529-short), SEQ ID NO:43 (PptC11529-long), or SEQ ID NO:51 (XptC1xb).
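18. The method of claim 1, wherein said insect is contacted with a second protein A, the second protein A being derived from a taxonomic species different from said first taxonomic species.
19. The method of claim 18, wherein said insect is contacted with a first protein A selected from SEQ ID NO:14 (XptA1Xwi) and SEQ ID NO:34 (XptA2Xwi) and a second protein A selected from SEQ ID NO:21 (TcdA) and SEQ ID NO:59 (TcbA).
20. The method of claim 18, wherein said first protein A is SEQ ID NO:14 (XptA1Xwi), said second protein A is SEQ ID NO:21 (TcdA), said protein B is SEQ ID NO:45 (TcdB2), and said protein C is SEQ ID NO:47 (TccC3).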
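The numeric tests recited in the independent claims (size ranges, minimum identity to a listed reference sequence, and the 75% ceiling for a protein B or C taken from a second species) can be sketched as a simple screen. This is an illustrative sketch only, not the applicants' procedure. The names `percent_identity`, `Candidate`, and `meets_numeric_tests` are hypothetical. The claims do not state how identity is computed, so the sketch assumes a pre-computed pairwise alignment with `-` as the gap character and counts matches over all alignment columns, which is one of several conventions. It does not test item (i) or stand-alone insecticidal activity.

```python
from dataclasses import dataclass

# Thresholds taken from the claim text: mass ranges in kDa and the
# minimum percent identity to a listed reference sequence, per role.
MASS_RANGE_KDA = {"A": (230.0, 290.0), "B": (130.0, 180.0), "C": (90.0, 120.0)}
MIN_REFERENCE_IDENTITY = {"A": 40.0, "B": 40.0, "C": 35.0}
MAX_CROSS_SPECIES_IDENTITY = 75.0  # strict upper bound, items (vi) and (vii)


def percent_identity(aligned_a, aligned_b):
    """Percent identity over a pre-aligned pair (assumed convention:
    matching non-gap columns divided by total alignment columns)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


@dataclass
class Candidate:
    species: str                   # species the protein is derived from
    mass_kda: float                # molecular mass of the protein
    reference_identity: float      # best % identity to a listed SEQ ID NO
    first_species_identity: float  # best % identity to any protein known
                                   # from protein A's species


def meets_numeric_tests(a, b, c):
    """Apply items (ii)-(vii) as numeric checks to proteins A, B, and C."""
    for role, protein in (("A", a), ("B", b), ("C", c)):
        low, high = MASS_RANGE_KDA[role]
        if not low <= protein.mass_kda <= high:
            return False
        if protein.reference_identity < MIN_REFERENCE_IDENTITY[role]:
            return False
    # Item (v): at least one of B and C comes from a different species.
    heterologous = [p for p in (b, c) if p.species != a.species]
    if not heterologous:
        return False
    # Items (vi)/(vii): each such protein is under 75% identical to
    # every protein known from the first species.
    return all(p.first_species_identity < MAX_CROSS_SPECIES_IDENTITY
               for p in heterologous)
```

For example, with hypothetical values, `meets_numeric_tests(Candidate("sp1", 280.0, 100.0, 100.0), Candidate("sp2", 170.0, 100.0, 50.0), Candidate("sp2", 110.0, 100.0, 45.0))` passes, while the same call with all three proteins from `"sp1"` fails item (v).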
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US44172303P | 2003-01-21 | 2003-01-21 | |
US60/441,723 | 2003-01-21 |
Publications (1)
Publication Number | Publication Date |
---|---|
CN1751125A (zh) | 2006-03-22 |
Family
ID=32825171
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNA2004800046014A Pending CN1751125A (zh) | 2003-01-21 | 2004-01-07 | Mixing and matching TC proteins for pest control |
Country Status (9)
Country | Link |
---|---|
US (2) | US7491698B2 (zh) |
EP (1) | EP1585819A2 (zh) |
CN (1) | CN1751125A (zh) |
AR (1) | AR043328A1 (zh) |
AU (1) | AU2004208093A1 (zh) |
BR (1) | BRPI0406856A (zh) |
CA (1) | CA2514041A1 (zh) |
MX (1) | MXPA05007411A (zh) |
WO (1) | WO2004067727A2 (zh) |
Families Citing this family (60)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7569748B2 (en) * | 1993-05-18 | 2009-08-04 | Wisconsin Alumni Research Foundation | Nucleic acid encoding an insecticidal protein toxin from photorhabdus |
US7792746B2 (en) * | 2003-07-25 | 2010-09-07 | Oracle International Corporation | Method and system for matching remittances to transactions based on weighted scoring and fuzzy logic |
US7285632B2 (en) * | 2004-01-07 | 2007-10-23 | Dow Agrosciences Llc | Isolated toxin complex proteins from Xenorhabus bovienii |
AU2005218598A1 (en) | 2004-03-02 | 2005-09-15 | Dow Agrosciences Llc | Insecticidal toxin complex fusion proteins |
AR052064A1 (es) * | 2004-12-22 | 2007-02-28 | Dow Agrosciences Llc | Xenorhabdus toxin complex |
MX2007010727A (es) * | 2005-03-02 | 2007-11-13 | Dow Agrosciences Llc | Novel sources for, and types of, insecticidally active proteins and polynucleotides encoding the proteins. |
BR112014003911A2 (pt) * | 2011-08-19 | 2017-03-14 | Synthetic Genomics Inc | Integrated method for high-throughput identification of novel pesticidal compositions and their use |
KR101151861B1 (ko) | 2012-01-10 | 2012-06-08 | ㈜엠알이노베이션 | Composition for improving acne-prone skin |
US9475847B2 (en) | 2012-07-26 | 2016-10-25 | Pioneer Hi-Bred International, Inc. | Insecticidal proteins and methods for their use |
AR091910A1 (es) * | 2012-07-26 | 2015-03-11 | Pioneer Hi Bred Int | Insecticidal proteins and methods of use |
WO2014071182A1 (en) | 2012-11-01 | 2014-05-08 | Massachusetts Institute Of Technology | Directed evolution of synthetic gene cluster |
MX369750B (es) | 2013-03-14 | 2019-11-20 | Pioneer Hi Bred Int | Compositions and methods to control insect pests. |
EP2971000A4 (en) | 2013-03-15 | 2016-11-23 | Pioneer Hi Bred Int | PHI-4 POLYPEPTIDES AND METHOD FOR THEIR USE |
US10006045B2 (en) | 2013-08-16 | 2018-06-26 | Pioneer Hi-Bred International, Inc. | Insecticidal proteins and methods for their use |
CA3175967A1 (en) | 2013-09-13 | 2015-03-19 | Pioneer Hi-Bred International, Inc. | Insecticidal proteins and methods for their use |
WO2015120270A1 (en) | 2014-02-07 | 2015-08-13 | Pioneer Hi Bred International, Inc. | Insecticidal proteins and methods for their use |
CA2939156A1 (en) | 2014-02-07 | 2015-08-13 | Pioneer Hi-Bred International, Inc. | Insecticidal proteins and methods for their use |
TW201538518A (zh) | 2014-02-28 | 2015-10-16 | Dow Agrosciences Llc | Root-specific expression conferred by chimeric gene regulatory elements |
WO2016000237A1 (en) | 2014-07-03 | 2016-01-07 | Pioneer Overseas Corporation | Plants having enhanced tolerance to insect pests and related constructs and methods involving insect tolerance genes |
WO2016044092A1 (en) | 2014-09-17 | 2016-03-24 | Pioneer Hi Bred International Inc | Compositions and methods to control insect pests |
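CN113372421A (zh) | 2014-10-16 | 2021-09-10 | Pioneer Hi-Bred International, Inc. | Insecticidal proteins and methods for their use |
CN107529763B (zh) | 2015-03-11 | 2021-08-20 | Pioneer Hi-Bred International, Inc. | Insecticidal combinations of PIP-72 and methods of use |
CN116333064A (zh) | 2015-05-19 | 2023-06-27 | Pioneer Hi-Bred International, Inc. | Insecticidal proteins and methods for their use |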
US10647995B2 (en) | 2015-06-16 | 2020-05-12 | Pioneer Hi-Bred International, Inc. | Compositions and methods to control insect pests |
KR102461443B1 (ko) | 2015-07-13 | 2022-10-31 | Pivot Bio, Inc. | Methods and compositions for improving plant traits |
US11198709B2 (en) | 2015-08-06 | 2021-12-14 | E. I. Du Pont De Nemours And Company | Plant derived insecticidal proteins and methods for their use |
CA2992488A1 (en) | 2015-08-28 | 2017-03-09 | Pioneer Hi-Bred International, Inc. | Ochrobactrum-mediated transformation of plants |
US11479516B2 (en) | 2015-10-05 | 2022-10-25 | Massachusetts Institute Of Technology | Nitrogen fixation using refactored NIF clusters |
EP3390431A1 (en) | 2015-12-18 | 2018-10-24 | Pioneer Hi-Bred International, Inc. | Insecticidal proteins and methods for their use |
US11781151B2 (en) | 2016-04-14 | 2023-10-10 | Pioneer Hi-Bred International, Inc. | Insecticidal CRY1B variants having improved activity spectrum and uses thereof |
CA3021391A1 (en) | 2016-04-19 | 2017-10-26 | Pioneer Hi-Bred International, Inc. | Insecticidal combinations of polypeptides having improved activity spectrum and uses thereof |
MX2018013249A (es) | 2016-05-04 | 2019-02-13 | Pioneer Hi Bred Int | Insecticidal proteins and methods for their use. |
US20190185867A1 (en) | 2016-06-16 | 2019-06-20 | Pioneer Hi-Bred International, Inc. | Compositions and methods to control insect pests |
CN116334123A (zh) | 2016-06-24 | 2023-06-27 | Pioneer Hi-Bred International, Inc. | Plant regulatory elements and methods of use thereof |
US11155829B2 (en) | 2016-07-01 | 2021-10-26 | Pioneer Hi-Bred International, Inc. | Insecticidal proteins from plants and methods for their use |
WO2018013333A1 (en) | 2016-07-12 | 2018-01-18 | Pioneer Hi-Bred International, Inc. | Compositions and methods to control insect pests |
US11021716B2 (en) | 2016-11-01 | 2021-06-01 | Pioneer Hi-Bred International, Inc. | Insecticidal proteins and methods for their use |
CA3044404A1 (en) | 2016-12-14 | 2018-06-21 | Pioneer Hi-Bred International, Inc. | Insecticidal proteins and methods for their use |
WO2018118811A1 (en) | 2016-12-22 | 2018-06-28 | Pioneer Hi-Bred International, Inc. | Insecticidal proteins and methods for their use |
CA3049258A1 (en) | 2017-01-12 | 2018-07-19 | Pivot Bio, Inc. | Methods and compositions for improving plant traits |
WO2018140214A1 (en) | 2017-01-24 | 2018-08-02 | Pioneer Hi-Bred International, Inc. | Nematicidal protein from pseudomonas |
US20190390219A1 (en) | 2017-02-08 | 2019-12-26 | Pioneer Hi-Bred International, Inc. | Insecticidal combinations of plant derived insecticidal proteins and methods for their use |
CN110621780B (zh) | 2017-05-11 | 2024-03-19 | Pioneer Hi-Bred International, Inc. | Insecticidal proteins and methods for their use |
BR112019024827A2 (pt) | 2017-05-26 | 2020-06-16 | Pioneer Hi-Bred International, Inc. | DNA construct, transgenic plant or progeny thereof, composition, and method for controlling an insect pest population |
US20200165626A1 (en) | 2017-10-13 | 2020-05-28 | Pioneer Hi-Bred International, Inc. | Virus-induced gene silencing technology for insect control in maize |
EP3701040A4 (en) | 2017-10-25 | 2021-08-25 | Pivot Bio, Inc. | METHODS AND COMPOSITIONS FOR IMPROVING GENETICALLY MODIFIED MICROBES THAT BIND NITROGEN |
WO2019169150A1 (en) | 2018-03-02 | 2019-09-06 | Pioneer Hi-Bred International, Inc. | Plant health assay |
BR112020023800A2 (pt) | 2018-05-22 | 2021-02-23 | Pioneer Hi-Bred International, Inc. | Plant regulatory elements and methods of use thereof |
EP3814302A4 (en) | 2018-06-27 | 2022-06-29 | Pivot Bio, Inc. | Agricultural compositions comprising remodeled nitrogen fixing microbes |
AU2019332792A1 (en) | 2018-08-29 | 2021-01-28 | Pioneer Hi-Bred International, Inc. | Insecticidal proteins and methods for their use |
WO2021076346A1 (en) | 2019-10-18 | 2021-04-22 | Pioneer Hi-Bred International, Inc. | Maize event dp-202216-6 and dp-023211-2 stack |
WO2021221690A1 (en) | 2020-05-01 | 2021-11-04 | Pivot Bio, Inc. | Modified bacterial strains for improved fixation of nitrogen |
TW202142114A (zh) | 2020-02-04 | 2021-11-16 | Dow AgroSciences LLC (US) | Compositions having pesticidal utility and methods related thereto |
CA3172322A1 (en) | 2020-05-01 | 2021-11-04 | Karsten TEMME | Modified bacterial strains for improved fixation of nitrogen |
AU2021265277A1 (en) | 2020-05-01 | 2022-12-08 | Vestaron Corporation | Insecticidal combinations |
US20230235352A1 (en) | 2020-07-14 | 2023-07-27 | Pioneer Hi-Bred International, Inc. | Insecticidal proteins and methods for their use |
CA3189603A1 (en) | 2020-08-10 | 2022-01-17 | Pioneer Hi-Bred International, Inc. | Plant regulatory elements and methods of use thereof |
US11891613B2 (en) | 2020-12-17 | 2024-02-06 | Pioneer Hi-Bred International, Inc. | Bacterial strains with toxin complex for insect control |
CA3218556A1 (en) | 2021-07-02 | 2023-01-05 | Pivot Bio, Inc. | Genetically-engineered bacterial strains for improved fixation of nitrogen |
TW202345696A (zh) | 2022-05-18 | 2023-12-01 | Corteva Agriscience LLC (US) | Compositions having pesticidal utility and methods related thereto |
Family Cites Families (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5972687A (en) | 1993-06-25 | 1999-10-26 | Commonwealth Scientific And Industrial Research Organisation | Toxin gene from Xenorhabdus nematophilus |
SK93197A3 (en) | 1995-11-06 | 1998-05-06 | Wisconsin Alumni Res Found | Insecticidal protein toxins from photorhabdus |
WO1998008932A1 (en) | 1996-08-29 | 1998-03-05 | Dow Agrosciences Llc | Insecticidal protein toxins from Photorhabdus |
GB9618083D0 (en) | 1996-08-29 | 1996-10-09 | Mini Agriculture & Fisheries | Pesticidal agents |
TR199900552T1 (xx) | 1997-05-05 | 1999-10-21 | Dow Agrosciences Llc | Insecticidal protein toxins obtained from Xenorhabdus. |
AUPO808897A0 (en) | 1997-07-17 | 1997-08-14 | Commonwealth Scientific And Industrial Research Organisation | Toxin genes from the bacteria xenorhabdus nematophilus and photohabdus luminescens |
US6281413B1 (en) * | 1998-02-20 | 2001-08-28 | Syngenta Participations Ag | Insecticidal toxins from Photorhabdus luminescens and nucleic acid sequences coding therefor |
JP2002504336A (ja) | 1998-02-20 | 2002-02-12 | Novartis AG | Insecticidal toxins from Photorhabdus |
IL139075A0 (en) | 1998-04-21 | 2001-11-25 | Novartis Ag | Novel insecticidal toxins from xenorhabdus nematophilus and nucleic acid sequences coding therefor |
US6174860B1 (en) | 1999-04-16 | 2001-01-16 | Novartis Ag | Insecticidal toxins and nucleic acid sequences coding therefor |
GB9825418D0 (en) | 1998-11-19 | 1999-01-13 | Horticulture Res Int | Insecticidal agents |
GB9901499D0 (en) | 1999-01-22 | 1999-03-17 | Horticulture Res Int | Biological control |
AR025097A1 (es) | 1999-08-11 | 2002-11-06 | Dow Agrosciences Llc | Transgenic plants expressing Photorhabdus toxin |
US6639129B2 (en) | 2000-03-24 | 2003-10-28 | Wisconsin Alumni Research Foundation | DNA sequences from photorhabdus luminescens |
US20070020625A1 (en) | 2001-02-07 | 2007-01-25 | Eric Duchaud | Sequence of the photorhabdus luminescens strain tt01 genome and uses |
ES2311731T3 (es) | 2002-06-28 | 2009-02-16 | Dow Agrosciences Llc | Pesticidally active proteins and polynucleotides obtainable from Paenibacillus species. |
MXPA05005120A (es) * | 2002-11-12 | 2006-02-22 | Univ Bath | DNA sequences from the tcd genomic region of Photorhabdus luminescens. |
US7359282B2 (en) | 2003-05-16 | 2008-04-15 | Schlumberger Technology Corporation | Methods and apparatus of source control for borehole seismic |
-
2004
- 2004-01-07 AR ARP040100032A patent/AR043328A1/es unknown
- 2004-01-07 BR BR0406856-4A patent/BRPI0406856A/pt not_active IP Right Cessation
- 2004-01-07 AU AU2004208093A patent/AU2004208093A1/en not_active Abandoned
- 2004-01-07 MX MXPA05007411A patent/MXPA05007411A/es unknown
- 2004-01-07 WO PCT/US2004/000394 patent/WO2004067727A2/en active Application Filing
- 2004-01-07 CA CA002514041A patent/CA2514041A1/en not_active Abandoned
- 2004-01-07 EP EP04700609A patent/EP1585819A2/en not_active Withdrawn
- 2004-01-07 US US10/754,115 patent/US7491698B2/en not_active Expired - Fee Related
- 2004-01-07 CN CNA2004800046014A patent/CN1751125A/zh active Pending
-
2009
- 2009-02-16 US US12/371,825 patent/US8084418B2/en not_active Expired - Fee Related
Also Published As
Publication number | Publication date |
---|---|
EP1585819A2 (en) | 2005-10-19 |
WO2004067727A8 (en) | 2005-08-18 |
MXPA05007411A (es) | 2006-02-10 |
WO2004067727A2 (en) | 2004-08-12 |
US7491698B2 (en) | 2009-02-17 |
AR043328A1 (es) | 2005-07-27 |
AU2004208093A1 (en) | 2004-08-12 |
WO2004067727A3 (en) | 2005-03-17 |
AU2004208093A2 (en) | 2005-08-25 |
US20090221501A1 (en) | 2009-09-03 |
BRPI0406856A (pt) | 2005-12-27 |
US20040208907A1 (en) | 2004-10-21 |
CA2514041A1 (en) | 2004-08-12 |
US8084418B2 (en) | 2011-12-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
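CN1751125A (zh) | Mixing and matching TC proteins for pest control | |
CN1255539C (zh) | Novel pesticidal proteins and strains | |
CN1849397A (zh) | Insecticidal proteins secreted by Bacillus thuringiensis and uses thereof | |
CN1256712A (zh) | Plant disease control | |
CN1761753A (zh) | Delta-endotoxin genes and methods for their use | |
CN1527663A (zh) | Novel insecticidal toxins | |
CN1314911A (zh) | Optimization of pest resistance genes using DNA shuffling | |
CN1332800A (zh) | Methods for transforming plants to express Bacillus thuringiensis delta-endotoxins | |
CN1119877A (zh) | Novel pesticidal proteins and strains | |
CN1073717A (zh) | Gene encoding an insecticidal protein, gramineous plants transformed with the gene, and methods for producing them | |
CN1886513A (zh) | Insect-resistant cotton plants and methods for detecting them | |
CN1334874A (zh) | Improved expression of Cry3B insecticidal protein in plants | |
CN1708588A (zh) | COT102 insecticidal cotton | |
CN1390259A (zh) | Lepidopteran-active Bacillus thuringiensis delta-endotoxin compositions and methods of use | |
CN1933724A (zh) | Corn event MIR604 | |
CN1296482C (zh) | Pesticidal toxins and nucleotide sequences encoding these toxins | |
CN1626662A (zh) | Genes for the synthesis of antipathogenic substances | |
CN1721434A (zh) | Novel proteins, genes encoding them, and methods for their use | |
CN1642977A (zh) | Novel Bacillus thuringiensis insecticidal proteins | |
CN1555414A (zh) | Plant-derived resistance genes | |
CN1408023A (zh) | Insecticidal proteins | |
CN1163146C (zh) | Method of controlling insect pests | |
CN1625562A (zh) | Chimeric delta-endotoxin protein of Cry1Ea and Cry1Ca | |
CN1684974A (zh) | Pesticidally active proteins and polynucleotides obtainable from Paenibacillus species | |
CN1195063C (zh) | Protease inhibitor fusion proteins |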
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C02 | Deemed withdrawal of patent application after publication (patent law 2001) | ||
WD01 | Invention patent application deemed withdrawn after publication |
Open date: 20060322 |