CA2412627A1 - Genes and proteins involved in the biosynthesis of lipopeptides - Google Patents

Genes and proteins involved in the biosynthesis of lipopeptides

Info

Publication number
CA2412627A1
Authority
CA
Canada
Prior art keywords
ala
leu
arg
gly
val
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002412627A
Other languages
French (fr)
Inventor
Chris M. Farnet
Alfredo Staffa
Emmanuel Zazopoulos
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cubist Pharmaceuticals LLC
Original Assignee
Ecopia Biosciences Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ecopia Biosciences Inc
Priority to CA002450691A (patent CA2450691C)
Priority to CA002412627A (patent CA2412627A1)
Publication of CA2412627A1
Legal status: Abandoned

Classifications

    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P20/00 Technologies relating to chemical industry
    • Y02P20/50 Improvements relating to the production of bulk chemicals
    • Y02P20/52 Improvements relating to the production of bulk chemicals using catalysts, e.g. selective catalysts

Abstract

Genes and proteins involved in the biosynthesis of lipopeptides by microorganisms, in particular the nucleic acids forming the biosynthetic locus for the A54145 lipopeptide from Streptomyces fradiae and the A54145-like lipopeptide from Streptomyces refuineus. These nucleic acids can be used to make expression constructs and transformed host cells for the production of lipopeptides. The genes and proteins allow direct manipulation of lipopeptides and related chemical structures via chemical engineering of the proteins involved in the biosynthesis of A54145.

Description

JUMBO APPLICATIONS / PATENTS
THIS SECTION OF THE APPLICATION / PATENT CONTAINS MORE THAN ONE VOLUME.
THIS IS VOLUME ,~ OF
NOTE: For additional volumes please contact the Canadian Patent Office.

TITLE OF INVENTION: Genes and proteins involved in the biosynthesis of lipopeptides
CROSS-REFERENCING TO RELATED APPLICATION:
This application claims benefit of provisional application USSN 60/342,133 filed on December 26, 2001 and of USSN 60/372,789 filed on April 17, 2002. The application is also a continuation-in-part of USSN 10/232,370 filed on September 3, 2002. The teachings of the above applications are hereby incorporated by reference in their entirety for all purposes.
FIELD OF INVENTION:
The present invention relates to the genes and proteins that direct the synthesis of lipopeptides; in particular, the invention relates to the biosynthetic locus for A54145 from Streptomyces fradiae ATCC 18158 and the biosynthetic locus for a lipopeptide natural product from Streptomyces refuineus NRRL 3143. The present invention is also directed to the use of genes and proteins to produce compounds exhibiting antibiotic activity based on the lipopeptide structure.
BACKGROUND:
Lipopeptides are natural products that exhibit potent, broad-spectrum antibiotic activity with a high potential for biotechnological and pharmaceutical applications as antimicrobial, antifungal, or antiviral agents. A single microorganism may produce a mixture of related lipopeptides that differ in the lipid moiety that is attached to the peptide core via a free amine, usually the N-terminal amine of the peptide core. The lipid moiety can have a major influence on the biological properties of lipopeptide natural products. The A54145 antibiotics produced by S. fradiae are a group of lipopeptides comprising at least eight microbiologically active, related factors A, A1, B, B1, C, D, E, and F. Each A54145 factor bears a cyclic 13-amino acid, acidic polypeptide core and a fatty acyl group attached to the N-terminal amine. The eight A54145 factors differ in the identity of the amino acid residues at positions 12 and 13 of the peptide core, as well as the identity of the fatty acid (see Figure 1).
Many low molecular weight peptides produced by bacteria are synthesized nonribosomally on large multifunctional proteins termed nonribosomal peptide synthetases (NRPSs) (Doekel and Marahiel, 2001, Metabolic Engineering, Vol. 3, pp. 64-77). NRPSs are modular proteins that consist of one or more polyfunctional polypeptides each of which is made up of modules. The amino-terminal to carboxy-terminal order and specificities of the individual modules correspond to the sequential order and identity of the amino acid residues of the peptide product. Each NRPS module recognizes a specific amino acid substrate and catalyzes the stepwise condensation to form a growing peptide chain. The identity of the amino acid recognized by a particular unit can be determined by comparison with other units of known specificity (Challis and Ravel, 2000, FEMS Microbiology Letters, Vol. 187, pp. 111-114). In many peptide synthetases, there is a strict correlation between the order of repeated units in a peptide synthetase and the order in which the respective amino acids appear in the peptide product, making it possible to correlate peptides of known structure with putative genes encoding their synthesis, as demonstrated by the identification of the mycobactin biosynthetic gene cluster from the genome of Mycobacterium tuberculosis (Quadri et al., 1998, Chem. Biol. Vol. 5, pp. 631-645).
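The correlation between module order and residue order described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the two-protein layout and the substrate assignments are hypothetical placeholders and are not specificities taken from the A541 or 024A loci.

```python
from typing import List, NamedTuple


class Module(NamedTuple):
    """One NRPS module; 'substrate' is the amino acid it is predicted to incorporate."""
    name: str
    substrate: str


class Synthetase(NamedTuple):
    """One polyfunctional NRPS polypeptide, with modules listed N- to C-terminal."""
    name: str
    modules: List[Module]


def predict_peptide(synthetases: List[Synthetase]) -> List[str]:
    """Read the modules in order; the module order gives the residue order."""
    return [m.substrate for s in synthetases for m in s.modules]


# Hypothetical two-protein system (substrates chosen for illustration only).
EXAMPLE = [
    Synthetase("protein 1", [Module("module 1", "Ala"), Module("module 2", "Leu")]),
    Synthetase("protein 2", [Module("module 3", "Gly"), Module("module 4", "Val")]),
]

if __name__ == "__main__":
    print("-".join(predict_peptide(EXAMPLE)))
```

In practice the substrate of each module would first have to be inferred, for example by comparison with units of known specificity as noted in the text; the sketch takes those assignments as given.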
The modules of a peptide synthetase are composed of smaller units or "domains" that each carry out a specific role in the recognition, activation, modification and joining of amino acid precursors to form the peptide product. One type of domain, the adenylation (A) domain, is responsible for selectively recognizing and activating the amino acid that is to be incorporated by a particular unit of the peptide synthetase. This activation step is ATP-dependent and involves the transient formation of an amino-acyl-adenylate. The activated amino acid is covalently attached to the peptide synthetase through another type of domain, the thiolation (T) domain, that is generally located adjacent to the A domain. The T domain is post-translationally modified by the covalent attachment of a phosphopantetheinyl prosthetic arm to a conserved serine residue. The activated amino acid substrates are tethered onto the nonribosomal peptide synthetase via a thioester bond to the phosphopantetheinyl prosthetic arm of the respective T domains. Amino acids joined to successive units of the peptide synthetase are subsequently covalently linked together by the formation of amide bonds catalyzed by another type of domain, the condensation (C) domain. NRPS modules can also occasionally contain additional functional domains that carry out auxiliary reactions, the most common being epimerization of an amino acid substrate from the L- to the D-form. This reaction is catalyzed by a domain referred to as an epimerization (E) domain that is generally located adjacent to the T domain of a given NRPS module. Thus, a typical NRPS module has the following domain organization: C-A-T-(E).
Product assembly by NRPSs involves three distinct phases, namely chain initiation, chain elongation, and chain termination (Keating and Walsh, 1999, Curr. Opin. Chem. Biol., Vol. 3, pp. 598-606). Polypeptide chain initiation is carried out by specialized modules termed "starter modules" that comprise an A domain and a T domain. Elongation modules have, in addition, a C domain that is located upstream of the A domain. It has been experimentally demonstrated that such elongation modules cannot initiate peptide bond formation due to interference by the C domain (Linne and Marahiel, 2000, Biochemistry, Vol. 39, pp. 10439-10447). All the growing peptide intermediates are covalently tethered to the NRPS during translocations as an elongating series of acyl-S-enzyme intermediates. To release the mature peptide product from the NRPS, the terminal acyl-S-enzyme bond must be broken. This process is the chain termination step and is usually catalyzed by a C-terminal thioesterase (TE) domain. Thioesterase-mediated release of the mature peptide from the NRPS enzyme involves the transient formation of an acyl-O-TE intermediate that is then hydrolyzed or hydrolyzed and concomitantly cyclized to release the mature peptide (Keating et al., 2001, Chembiochem, Vol. 2, pp. 99-107).
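The domain organization and the three assembly phases described above can be expressed as a simple layout check. The following Python sketch is a minimal illustration only: the function names and the dash-separated domain notation (for example "C-A-T-E") are assumptions made for this example, and the check encodes nothing beyond the general background model (A-T starter module, C-A-T elongation modules, C-terminal TE domain).

```python
from typing import List


def parse_module(spec: str) -> List[str]:
    """Split a domain string such as 'C-A-T-E' into ['C', 'A', 'T', 'E']."""
    return [d.strip() for d in spec.split("-") if d.strip()]


def classify_module(domains: List[str]) -> str:
    """Classify a module by the order of its core C, A and T domains."""
    core = [d for d in domains if d in ("C", "A", "T")]
    if core == ["A", "T"]:
        return "starter"
    if core == ["C", "A", "T"]:
        return "elongation"
    return "unrecognized"


def check_assembly_line(specs: List[str]) -> List[str]:
    """Return a list of deviations from the simple model; empty means none found."""
    problems = []
    modules = [parse_module(s) for s in specs]
    kinds = [classify_module(m) for m in modules]
    # Chain initiation: the first module should be an A-T starter module.
    if not kinds or kinds[0] != "starter":
        problems.append("first module is not an A-T starter module")
    # Chain elongation: every later module should carry a C domain upstream of A.
    for i, kind in enumerate(kinds[1:], start=2):
        if kind != "elongation":
            problems.append(f"module {i} is not a C-A-T elongation module")
    # Chain termination: the last module should end with a thioesterase domain.
    if not modules or not modules[-1] or modules[-1][-1] != "TE":
        problems.append("last module lacks a C-terminal TE domain")
    return problems


if __name__ == "__main__":
    print(check_assembly_line(["A-T", "C-A-T-E", "C-A-T-TE"]))
```

A system whose first module begins with a C domain, such as the acyl-specific N-terminal C-domains referred to in the description of Figure 17, would be reported by this sketch as a deviation from the simple model rather than as an error in the sequence.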
SUMMARY OF THE INVENTION:
The present invention advantageously provides genes and proteins involved in the production of lipopeptides. Specific embodiments of the genes and proteins are provided in the accompanying sequence listing. SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31 and 33 provide nucleic acids responsible for biosynthesis of the lipopeptide A54145. SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30 and 32 provide amino acid sequences for proteins responsible for biosynthesis of the lipopeptide A54145. SEQ ID NOS: 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64 and 66 provide nucleic acid sequences for genes responsible for biosynthesis of an A54145-like lipopeptide. SEQ ID NOS: 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63 and 65 provide amino acid sequences for proteins responsible for biosynthesis of the A54145-like lipopeptide. The genes and proteins of the invention provide the machinery for producing novel lipopeptide-related compounds based on A54145 compounds.
The invention discloses NRPS genes, namely A541 ORFs 2, 3, 4, 5 and 6 (SEQ ID NOS: 5, 8, 10, 12 and 14) and 024A ORFs 4, 5, 6 and 7 (SEQ ID NOS: 42, 44, 46 and 48), and their corresponding gene products (SEQ ID NOS: 4, 7, 9, 11, 13, 41, 43, 45 and 47, respectively), that can be used to produce a variety of lipopeptides, some of which are now produced only by fermentation, others of which are now produced by fermentation and chemical modification, and still others of which are novel lipopeptides which are now not produced either by fermentation or chemical modification.
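The correspondence between ORF numbers and SEQ ID NOS used throughout this summary can be captured in a small lookup. The Python sketch below assumes that the n-th entry of each SEQ ID list corresponds to ORF n of the respective locus, which is consistent with the ORF and SEQ ID pairs recited for the NRPS genes above; the function and variable names are hypothetical.

```python
from typing import Dict, List, Tuple

# SEQ ID NOS as listed in the summary, in ORF order (assumed: entry n belongs to ORF n).
NUCLEIC: Dict[str, List[int]] = {
    "A541": [3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33],
    "024A": list(range(36, 67, 2)),  # 36, 38, ..., 66
}
PROTEIN: Dict[str, List[int]] = {
    "A541": [2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32],
    "024A": list(range(35, 66, 2)),  # 35, 37, ..., 65
}


def seq_ids(locus: str, orf: int) -> Tuple[int, int]:
    """Return (nucleic acid SEQ ID NO, polypeptide SEQ ID NO) for a 1-based ORF number."""
    nucleic, protein = NUCLEIC[locus], PROTEIN[locus]
    if not 1 <= orf <= len(nucleic):
        raise ValueError(f"{locus} has ORFs 1 to {len(nucleic)}, got {orf}")
    return nucleic[orf - 1], protein[orf - 1]


if __name__ == "__main__":
    for orf in (2, 3, 4, 5, 6):
        print("A541 ORF", orf, seq_ids("A541", orf))
```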
The invention allows direct manipulation of A54145 and related chemical structures via chemical engineering of the enzymes of A541 and 024A, modifications which are presently not possible by chemical methodology because of the complexity of the structures.
The invention can also be used to introduce "chemical handles" into normally inert positions that permit subsequent chemical modifications. Several general approaches to achieve the development of novel lipopeptides are facilitated by the methods and reagents of the present invention. For example, molecular modeling can be used to predict optimal structures. Various polypeptide structures can be generated by genetic manipulation of the A541 and 024A gene clusters in accordance with the methods of the invention. The invention can be used to generate a focused library of analogs around a lipopeptide lead candidate to fine-tune the compound for optimal properties.
Genetic engineering methods of the invention can be directed to modify positions of the molecule previously inert to chemical modifications. Known techniques allow one to manipulate a known NRPS gene cluster either to produce the lipopeptide synthesized by that NRPS at higher levels than occur in nature or in hosts that otherwise do not produce the lipopeptide. Known techniques allow one to produce molecules that are structurally related to, but distinct from the lipopeptides produced from known lipopeptide gene clusters.
Thus the invention provides an isolated, purified or enriched nucleic acid comprising a nucleic acid sequence selected from the group consisting of: (a) SEQ ID NOS: 1, 6, and 17 and coding regions thereof; (b) a nucleic acid having at least 75% identity to a nucleic acid of (a); and (c) a nucleic acid complementary to a nucleic acid of (a) or (b). In a related aspect, the invention provides a nucleic acid selected from the group consisting of: (a) a nucleic acid of SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33; (b) a nucleic acid encoding a polypeptide of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32; (c) a nucleic acid having at least 75% homology to a nucleic acid of (a) or (b); and (d) a nucleic acid complementary to a nucleic acid of (a), (b) or (c). In a further aspect, the invention provides an isolated, purified or enriched nucleic acid capable of hybridizing to the above nucleic acids under conditions of high stringency. In one embodiment, the nucleic acid comprises the sequence of at least two nucleic acids of the above nucleic acids. In another embodiment, the nucleic acid comprises the sequence of at least three of the above nucleic acids.
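The notions of percent identity and of a complementary nucleic acid recited above can be illustrated with a short Python sketch. It is offered under explicit assumptions: identity is counted here over the columns of a pre-computed pairwise alignment, ignoring columns in which both sequences have a gap, which is one common convention and not necessarily the definition adopted elsewhere in the specification; the function names and test sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical residues ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # a column with gaps in both sequences carries no information
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 75.0) -> bool:
    """True if the two aligned sequences share at least 'threshold' percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


def reverse_complement(dna: str) -> str:
    """Return the complementary strand of a DNA sequence, written 5' to 3'."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


if __name__ == "__main__":
    print(percent_identity("ACGTACGT", "ACGTACGA"))
    print(reverse_complement("ATGC"))
```

The choice of denominator (alignment columns, shorter sequence, or longer sequence) changes the reported value, so any comparison against the 75% figure should state which convention was used.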
The invention also provides an isolated, purified or enriched nucleic acid that hybridizes under stringent conditions to any one of A541 ORFs 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15 (SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33) and can substitute for the ORF to which it specifically hybridizes to direct the synthesis of an A54145 compound or analogue.
The invention also provides an isolated gene cluster comprising ORFs encoding polypeptides sufficient to direct the synthesis of an A54145 compound or analogue. In one embodiment, the isolated gene cluster is present in a bacterium. In another embodiment, the isolated gene cluster contains a nucleic acid of any one of A541 ORFs 1 to 15 (SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33) present in the E. coli strains DH10B having accession nos. IDAC 260202-1, 260202-2 and 260202-3.
The invention also provides an isolated polypeptide comprising a polypeptide sequence selected from any one of: (a) a polypeptide of any one of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32; and (b) a polypeptide which is at least 75% identical in amino acid sequence to a polypeptide of any one of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32. In one embodiment, the polypeptide comprises at least two of the above polypeptides. In another embodiment, the polypeptide comprises at least three of the above polypeptides. In still another embodiment, the polypeptide comprises at least five or more of the above polypeptides.
The invention also provides an expression vector comprising the above nucleic acids. In a related aspect, the invention provides a host cell transformed with the expression vector. In one embodiment the host cell is transformed with an exogenous nucleic acid comprising a gene cluster encoding polypeptides sufficient to direct the assembly of an A54145 compound or analogue.
The invention provides a method of chemically modifying a biological molecule that is a substrate for a polypeptide encoded by a gene product of A541 ORFs 1 to 15 comprising contacting the biological molecule with a gene product of A541 ORFs 1 to 15 wherein said polypeptide chemically modifies said biological molecule. In another aspect, the invention provides a method of chemically modifying a biological molecule that is a substrate for a polypeptide encoded by an A54145 biosynthesis gene cluster, said method comprising contacting the biological molecule with at least two different polypeptides described above.
The invention also provides an isolated or purified antibody capable of specifically binding to a polypeptide having a sequence selected from the group consisting of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32. The invention provides a method of making a polypeptide having a sequence selected from the group consisting of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 comprising introducing a nucleic acid encoding said polypeptide, said nucleic acid being operably linked to a promoter, into a host cell. The invention also provides a method of making an A54145 compound or analogue comprising the step of providing a bacterium containing a gene cluster with sufficient genes to produce an A54145 compound or analogue and culturing the bacterium under conditions allowing for expression of the sufficient genes to produce an A54145 compound, wherein the gene cluster contains at least one of the nucleic acids referred to above. In one embodiment the method comprises culturing a Streptomyces fradiae bacterium under conditions allowing for expression of A541 ORFs 1 to 15 (SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33) present in the E. coli strains DH10B having accession nos. IDAC 260202-1, 260202-2 and 260202-3.
Thus the invention provides an isolated, purified or enriched nucleic acid comprising a nucleic acid sequence selected from the group consisting of: (a) SEQ ID NO: 34, and coding regions thereof; (b) a nucleic acid having at least 75% identity to a nucleic acid of (a); and (c) a nucleic acid complementary to a nucleic acid of (a) or (b).
In a related aspect, the invention provides a nucleic acid selected from the group consisting of: (a) a nucleic acid of SEQ ID NOS: 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64 and 66; (b) a nucleic acid encoding a polypeptide of SEQ ID NOS: 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, and 65; (c) a nucleic acid having at least 75% homology to a nucleic acid of (a) or (b); and (d) a nucleic acid complementary to a nucleic acid of (a), (b) or (c). In a further aspect, the invention provides an isolated, purified or enriched nucleic acid capable of hybridizing to the above nucleic acids under conditions of high stringency. In one embodiment, the nucleic acid comprises the sequence of at least two nucleic acids of the above nucleic acids. In another embodiment, the nucleic acid comprises the sequence of at least three of the above nucleic acids.
The invention also provides an isolated, purified or enriched nucleic acid that hybridizes under stringent conditions to any one of 024A ORFs 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 or 16 (SEQ ID NOS: 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64 and 66) and can substitute for the ORF to which it specifically hybridizes to direct the synthesis of an A54145-like compound or analogue.
The invention also provides an isolated gene cluster comprising ORFs encoding polypeptides sufficient to direct the synthesis of an 024A A54145-like compound or analogue. In one embodiment, the isolated gene cluster is present in a bacterium. In another embodiment, the isolated gene cluster contains a nucleic acid of any one of 024A ORFs 1 to 16 (SEQ ID NOS: 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64 and 66) present in the E. coli strains DH10B having accession nos. IDAC

and IDAC 260202-5.
The invention also provides an isolated polypeptide comprising a polypeptide sequence selected from any one of: (a) a polypeptide of any one of SEQ ID NOS: 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, and 65; and (b) a polypeptide which is at least 75% identical in amino acid sequence to a polypeptide of any one of SEQ ID NOS: 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, and 65. In one embodiment, the polypeptide comprises at least two of the above polypeptides. In another embodiment, the polypeptide comprises at least three of the above polypeptides. In still another embodiment, the polypeptide comprises at least five or more of the above polypeptides.
The invention also provides an expression vector comprising one of the above nucleic acids. In a related aspect, the invention provides a host cell transformed with the expression vector. In one embodiment the host cell is transformed with an exogenous nucleic acid comprising a gene cluster encoding polypeptides sufficient to direct the assembly of an 024A A54145-like compound or analogue.
The invention provides a method of chemically modifying a biological molecule that is a substrate for a polypeptide encoded by a gene product of 024A ORFs 1 to 16 (SEQ ID NOS: 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63 and 65) comprising contacting the biological molecule with a gene product of 024A ORFs 1 to 16 (SEQ ID NOS: 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63 and 65), wherein said polypeptide chemically modifies said biological molecule. In another aspect, the invention provides a method of chemically modifying a biological molecule that is a substrate for a polypeptide encoded by an 024A biosynthesis gene, said method comprising contacting the biological molecule with at least two of the above polypeptides.
The invention also provides an isolated or purified antibody capable of specifically binding to a polypeptide having a sequence selected from the group consisting of SEQ ID NOS: 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, and 65. The invention provides a method of making a polypeptide having a sequence selected from the group consisting of SEQ ID NOS: 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, and 65 comprising introducing a nucleic acid encoding said polypeptide, said nucleic acid being operably linked to a promoter, into a host cell. The invention also provides a method of making a 024A compound or analogue comprising the step of providing a bacterium containing a gene cluster with sufficient genes to produce a 024A compound or analogue and culturing the bacterium under conditions allowing for expression of the sufficient genes to produce a 024A compound, wherein the gene cluster contains at least one of the 024A nucleic acids. In one embodiment the method comprises culturing a Streptomyces bacterium under conditions allowing for expression of A541 ORFs 1 to 15 (SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33) present in the E. coli strains DH10B having accession nos. IDAC 260202-1, 260202-2 and 260202-3.

BRIEF DESCRIPTION OF THE DRAWINGS:
The present invention will be further understood from the following description with reference to the following figures:
Figure 1 is a graphical depiction of the A541 biosynthetic locus from Streptomyces fradiae ATCC 18158 showing, at the top of the figure, a scale in base pairs; followed by the coverage of the locus by the three continuous DNA sequences (SEQ ID NOS: 1, 6 and 17); the relative positioning and orientation of the 15 ORFs referred to by ORF number (SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31 and 33 respectively); the regions of the locus covered by the deposited cosmid clones 184CM, 184CA and 184CJ; and the structure of an A54145 compound and all A54145 factors produced by A541.
Figure 2 is a graphical depiction of the 024A biosynthetic locus from Streptomyces refuineus NRRL 3143 showing, at the top of the figure, a scale in base pairs; the single continuous DNA sequence (SEQ ID NO: 34) represented by a continuous black line; the relative positioning and orientation of the 16 open reading frames by ORF numbers (SEQ ID NOS: 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64 and 66); the regions covered by the deposited cosmid clones 024CC and 024CK; and a structure of the lipopeptide backbone and product of 024A.
Figure 3a, 3b and 3c are an amino acid alignment of C-domains from A541 ORFs 2, 3, 4, 5 and 6 (SEQ ID NOS: 4, 7, 9, 11 and 13) highlighting conserved motifs characteristic of condensation domains. In this and other amino acid alignments of the specification, a line above the alignment is used to mark strongly conserved positions. In addition, three characters, namely * (asterisk), : (colon) and . (period) are used, wherein "*" indicates positions which have a single, fully conserved residue; ":" indicates that one of the following strong groups is fully conserved: STA, NEQK, NHQK, NDEQ, QHRK, MILV, MILF, HY, and FYW; and "." indicates that one of the following weaker groups is fully conserved: CSA, ATV, SAG, STNK, STPA, SGND, SNDEQK, NDEQHK, NEQHRK, FVLIM, and HFY.
Figure 4a, 4b, 4c, 4d and 4e are an amino acid alignment of A-domains and an A/N-methyltransferase domain fusion from A541 ORFs 2, 3, 4, 5 and 6 (SEQ ID NOS: 4, 7, 9, 11 and 13) highlighting conserved motifs characteristic of adenylation domains and methyltransferase motifs.

Figure 5 is an amino acid alignment of T domains from A541 ORFs 2, 3, 4, 5 and 6 (SEQ ID NOS: 4, 7, 9, 11 and 13) highlighting the conserved residue of the thiolation domain to which a phosphopantetheine group is covalently attached post-translationally.
Figure 6 is an amino acid alignment of E-domains from A541 ORFs 2 and 5 (SEQ ID NOS: 4 and 11) highlighting conserved motifs characteristic of epimerization domains.
Figure 7 is an amino acid alignment of the Te domain from A541 ORF 6 (SEQ ID NO: 13) as compared with the corresponding sequence in CADA highlighting the conserved residues characteristic of thioesterase domains.
Figure 8a, 8b and 8c is an amino acid alignment of C-domains in the 024A ORFs 4, 5, 6 and 7 (SEQ ID NOS: 41, 43, 45 and 47) highlighting conserved motifs characteristic of condensation domains.
Figure 9a, 9b, 9c, 9d and 9e is an amino acid alignment of A-domains and an A-domain having an insertion of an N-methyltransferase domain from 024A ORFs 4, 5, 6 and 7 (SEQ ID NOS: 41, 43, 45 and 47) highlighting conserved motifs characteristic of adenylation domains and methyltransferase motifs.
Figure 10 is an amino acid alignment of T domains from 024A ORFs 4, 5, 6 and 7 (SEQ ID NOS: 41, 43, 45 and 47) highlighting the conserved residue of the thiolation domain to which a phosphopantetheine group is covalently attached post-translationally.
Figure 11 is an amino acid alignment of E-domains in 024A ORFs 4 and 6 (SEQ ID NOS: 41 and 45) highlighting conserved motifs characteristic of epimerization domains.
Figure 12 is an amino acid alignment of the Te domain from 024A ORF 7 (SEQ ID NO: 47) as compared with the corresponding sequence in CADA highlighting the conserved residues characteristic of thioesterase domains.
Figure 13a and 13b show corresponding NRPS proteins found in 024A and A541, the modules and domains forming each NRPS, and the biosynthetic pathway by which the respective 024A and A541 NRPS complexes assemble their products.
Figure 14a and 14b is an amino acid alignment of ADLE proteins from 024A ORF 2 (SEQ ID NO: 37), A541 ORF 1 (SEQ ID NO: 2) and the ADLE proteins from RAMO, DAPT and A410, highlighting conserved motifs of acyl CoA ligases. For SEQ ID NO: 2 only amino acid residues 1 to 648 corresponding to the ADLE domain were used in the alignment.

Figure 15 is an amino acid alignment of ACPH proteins from 024A ORF 3 (SEQ ID NO: 39), A541 ORF 1 (SEQ ID NO: 2) and the ACPH proteins from RAMO, DAPT and A410, highlighting conserved serine residues of the thiolation domain to which a phosphopantetheine group is covalently attached post-translationally. For SEQ ID NO: 2 only amino acid residues 649 to 723 corresponding to the ACPH domain were used for the alignment.
Figure 16 is a dendrogram showing the evolutionary relatedness of C domains from various lipopeptide NRPSs with a clearly branching cluster of C domains involved in N-acylation highlighted in gray.
Figure 17a and 17b is an amino acid alignment of the unusual (acyl-specific) N-terminal C-domain from NRPSs of 024A ORF 4 (SEQ ID NO: 41), A541 ORF 2 (SEQ ID NO: 4), and the acyl-specific C-domains from NRPSs of RAMO, DAPT and A410, highlighting conserved motifs.
Figures 18a and 18b illustrate a mechanism for formation of N-acyl peptide linkage in lipopeptides. Figure 18c illustrates the N-acylation mechanism specific for A54145 formation and corresponding mechanism describing the A54145-like compound generated by 024A. The fatty acid structure in brackets indicates that alternative fatty acids may be incorporated.
Figure 19 is an amino acid alignment of the MTFZ C-methyltransferase from 024A ORF 16 (SEQ ID NO: 65) and A541 ORF 15 (SEQ ID NO: 32) and the MTFZ C-methyltransferase from DAPT and CADA, which MTFZ C-methyltransferases are involved in generating the 3-methyl-glutamate residue of A54145, the lipopeptide of 024A, A-21978C (daptomycin), and "calcium-dependent antibiotic" of S. coelicolor respectively. Conserved methyltransferase motifs are highlighted.
Figure 20a and 20b are photographs of plates generated in the bioassay of anionic lipopeptide isolation experiments described herein, which plates illustrate an enrichment of activity, based on IRA67 anion exchange chromatography of lipopeptides from Streptomyces fradiae and Streptomyces refuineus subsp. thermotolerans.
Figure 21a illustrates use of NRPS biosynthetic machinery of a nonlipopeptide natural product, complestatin, to produce an N-acylated analogue of complestatin.
Figure 21b illustrates a rationally designed recombinant NRPS system that gives rise to N-acylated complestatin analogue(s).
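The conservation characters described for the amino acid alignments of Figure 3 and the other alignment figures can be derived column by column from the strong and weaker residue groups listed there. The Python sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, any column containing a gap ("-") is treated as unconserved, and the toy sequences are invented for the example and are not taken from the sequence listing.

```python
from typing import List

# Residue groups as listed in the description of Figure 3.
STRONG = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
        "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def column_symbol(column: str) -> str:
    """Return '*', ':', '.' or ' ' for one alignment column."""
    residues = set(column.upper())
    if not residues or "-" in residues:
        return " "  # assumption: gapped columns are reported as unconserved
    if len(residues) == 1:
        return "*"  # single, fully conserved residue
    if any(residues <= set(group) for group in STRONG):
        return ":"  # all residues fall within one strong group
    if any(residues <= set(group) for group in WEAK):
        return "."  # all residues fall within one weaker group
    return " "


def consensus_line(sequences: List[str]) -> str:
    """Build the conservation line for a list of equal-length aligned sequences."""
    if not sequences:
        return ""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("aligned sequences must have the same length")
    return "".join(column_symbol("".join(col)) for col in zip(*sequences))


if __name__ == "__main__":
    toy = ["MSTK", "MTAR", "MSSG"]  # invented sequences for illustration
    for s in toy:
        print(s)
    print(consensus_line(toy))
```

Strong groups are tested before weaker groups so that a column satisfying both is reported with the stronger character.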

DETAILED DESCRIPTION OF THE INVENTION:
Throughout the description and the figures, the biosynthetic locus for A54145 from Streptomyces fradiae ATCC 18158 is sometimes referred to as "A541" and the biosynthetic locus for a lipopeptide natural product from Streptomyces refuineus NRRL 3143 is sometimes referred to as "024A". In addition, reference is sometimes made in the description and in the figures to other lipopeptide biosynthetic loci, wherein "RAMO" refers to the biosynthetic locus for ramoplanin from Actinoplanes sp. ATCC 33076, "DAPT" refers to the biosynthetic locus for A21978C from Streptomyces roseosporus NRRL 11379, "A410" refers to the biosynthetic locus for a lipopeptide natural product from Actinoplanes nipponensis FD 24834 ATCC 31145, and "CADA" refers to the biosynthetic locus for the calcium-dependent antibiotic from Streptomyces coelicolor A3(2) (Bentley et al., 2002, Nature, vol. 417, pp 141-147).
The ORFs in A541 and 024A are assigned a putative function sometimes referred to throughout the description and figures by reference to a four-letter designation, as indicated in Table 1.
Table 1
Family Descriptions

Families     Proposed Function
ABCD         ABC transporter; ATP-binding cassette transmembrane transporter; includes proteins with similarity to Mdr proteins of mammalian tumor cells that confer resistance to daunorubicin, doxorubicin and some other structurally unrelated chemotherapeutic agents; DrrA-type proteins cooperate with a transmembrane component to confer resistance
ACPH         acyl carrier protein, unusual
ADLE         similar to acyl-CoA ligase, involved in fatty acyl transfer; usually associated with a free acyl carrier protein
ADLF         natural fusion of ADLE and ACPH; acyl-CoA ligase activates and tethers fatty acids
EATB         esterase/acetyltransferase/lipase/haloperoxidase; alpha/beta hydrolase fold; includes aryl esterases, bifunctional enzymes capable of both ester hydrolysis and halogenation, act on many phenolic esters
MEMD         membrane protein; includes DrrB daunorubicin resistance transmembrane protein and related proteins that act with an ABCD component to confer resistance
MEMT         membrane protein
MTAG         methyltransferase
MTFZ         Structurally related to the ubiE/COQ5 family of C-methyltransferases (pfam01209). Apart from the ubiquinone/menaquinone biosynthesis C-methyltransferases, this family also includes other methyltransferases involved in biotin and sterol biosynthesis and in phosphatidylethanolamine methylation.
OXAB         oxidoreductase, putative; weak homology to alpha-ketoglutarate dependent dioxygenases.
OXAU         oxidoreductase, flavoprotein-dependent; homology to acyl CoA dehydrogenases; possibly membrane-associated; includes proteins that may be responsible for generating the unsaturated fatty acyl moiety of ramoplanin.
OXDD         putative oxygenase, domain homology to clavaminate synthases Cas1, Cas2
PPST/NRPS    non-ribosomal peptide synthetase
UNKC         unknown; includes MbtH involved in mycobactin synthesis; usually associated with non-ribosomal peptide synthetases; may be an unusual ACP

The terms "lipopeptide producer" and "lipopeptide-producing organism" refer to a microorganism that carries the genetic information necessary to produce a lipopeptide compound, whether or not the organism is known to produce a lipopeptide compound.
The terms apply equally to organisms in which the genetic information to produce the lipopeptide compound is found in the organism as it exists in its natural environment, and to organisms in which the genetic information is introduced by recombinant techniques. For the sake of particularity, specific organisms contemplated herein include organisms of the family Micromonosporaceae, of which preferred genera include Micromonospora, Actinoplanes and Dactylosporangium; the family Streptomycetaceae, of which preferred genera include Streptomyces and Kitasatospora; the family Pseudonocardiaceae, of which preferred genera are Amycolatopsis and Saccharopolyspora; and the family Actinosynnemataceae, of which preferred genera include Saccharothrix and Actinosynnema; however, the terms are intended to encompass all organisms containing genetic information necessary to produce a lipopeptide compound.
The term "lipopeptide biosynthetic gene product" refers to any enzyme or polypeptide involved in the biosynthesis of a lipopeptide product. For the sake of particularity, the lipopeptide biosynthetic pathways are associated with Streptomyces fradiae in the case of A541 and with Streptomyces refuineus in the case of 024A.
However, it should be understood that this term encompasses lipopeptide biosynthetic enzymes (and genes encoding such enzymes) isolated from any microorganism of the genus Streptomyces, and furthermore that these genes may have novel homologues in related actinomycete microorganisms or non-actinomycete microorganisms that fall within the scope of the invention. Representative lipopeptide biosynthetic gene products include the polypeptides listed in SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65 or homologues thereof.
The term "isolated" means that the material is removed from its original environment, e.g. the natural environment if it is naturally-occurring. For example, a naturally-occurring polynucleotide or polypeptide present in a living organism is not isolated, but the same polynucleotide or polypeptide, separated from some or all of the coexisting materials in the natural system, is isolated. Such polynucleotides could be part of a vector and/or such polynucleotides or polypeptides could be part of a composition, and still be isolated in that such vector or composition is not part of its natural environment.
The term "purified" does not require absolute purity; rather, it is intended as a relative definition. Individual nucleic acids obtained from a library have been conventionally purified to electrophoretic homogeneity. The purified nucleic acids of the present invention have been purified from the remainder of the genomic DNA in the organism by at least 10^4 to 10^6 fold. However, the term "purified" also includes nucleic acids which have been purified from the remainder of the genomic DNA or from other sequences in a library or other environment by at least one order of magnitude, preferably two or three orders of magnitude, and more preferably four or five orders of magnitude.
"Recombinant" means that the nucleic acid is adjacent to "backbone" nucleic acid to which it is not adjacent in its natural environment. "Enriched" nucleic acids represent 5% or more of the number of nucleic acid inserts in a population of nucleic acid backbone molecules. "Backbone" molecules include nucleic acids such as expression vectors, self-replicating nucleic acids, viruses, integrating nucleic acids, and other vectors or nucleic acids used to maintain or manipulate a nucleic acid of interest.
Preferably, the enriched nucleic acids represent 15% or more, more preferably 50% or more, and most preferably 90% or more, of the number of nucleic acid inserts in the population of recombinant backbone molecules.
"Recombinant" polypeptides or proteins refer to polypeptides or proteins produced by recombinant DNA techniques, i.e. produced from cells transformed by an exogenous DNA construct encoding the desired polypeptide or protein.
"Synthetic" polypeptides or proteins are those prepared by chemical synthesis.
The term "gene" means the segment of DNA involved in producing a polypeptide chain; it includes regions preceding and following the coding region (leader and trailer) as well as, where applicable, intervening regions (introns) between individual coding segments (exons).
A DNA or nucleotide "coding sequence" or "sequence encoding" a particular polypeptide or protein is a DNA sequence which is transcribed and translated into a polypeptide or protein when placed under the control of appropriate regulatory sequences.
"Oligonucleotide" refers to a nucleic acid, generally of at least 10, preferably 15 and more preferably at least 20 nucleotides, preferably no more than 100 nucleotides, that is hybridizable to a genomic DNA molecule, a cDNA molecule, or an mRNA molecule encoding a gene, mRNA, cDNA or other nucleic acid of interest.

A promoter sequence is "operably linked to" a coding sequence when RNA polymerase that initiates transcription at the promoter transcribes the coding sequence into mRNA.
"Plasmids" are designated herein by a lower case p preceded or followed by capital letters and/or numbers. The starting plasmids herein are commercially available, publicly available on an unrestricted basis, or can be constructed from available plasmids in accord with published procedures. In addition, equivalent plasmids to those described herein are known in the art and will be apparent to the skilled artisan.
"Digestion" of DNA refers to enzymatic cleavage of the DNA with a restriction enzyme that acts only at certain sequences in the DNA. The various restriction enzymes used herein are commercially available and their reaction conditions, cofactors and other requirements were used as would be known to the ordinary skilled artisan.
For analytical purposes, typically 1 µg of plasmid or DNA fragment is used with about 2 units of enzyme in about 20 µl of buffer solution. For the purpose of isolating DNA fragments for plasmid construction, typically 5 to 50 µg of DNA are digested with 20 to 250 units of enzyme in a larger volume. Appropriate buffers and substrate amounts for particular enzymes are specified by the manufacturer. Incubation times of about 1 hour at 37°C are ordinarily used, but may vary in accordance with the supplier's instructions.
After digestion, gel electrophoresis may be performed to isolate the desired fragment.
We have now discovered the genes and proteins involved in the biosynthesis of the lipopeptide A54145. Nucleic acid sequences encoding proteins involved in the biosynthesis of the A54145 compound are provided in the accompanying sequence listing as SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33.
Polypeptides involved in the biosynthesis of the A54145 compound are provided in the accompanying sequence listing as SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32.
We have also discovered the genes and proteins involved in the biosynthesis of an A54145-like compound from an organism not previously reported to produce a lipopeptide. Nucleic acid sequences encoding proteins involved in the biosynthesis of the A54145-like compound are provided in the accompanying sequence listing as SEQ ID NOS: 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66.
Polypeptides involved in the biosynthesis of the A54145-like compound are provided in the accompanying sequence listing as SEQ ID NOS: 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65.
One aspect of the present invention is an isolated, purified, or enriched nucleic acid comprising one of the sequences of SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33 and 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, the sequences complementary thereto, or a fragment comprising at least 100, 200, 300, 400, 500, 600, 700, 800 or more consecutive bases of one of the sequences of SEQ ID
NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33 and 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66 or the sequences complementary thereto.
The isolated, purified or enriched nucleic acids may comprise DNA, including cDNA, genomic DNA, and synthetic DNA. The DNA may be double stranded or single stranded, and if single stranded may be the coding (sense) or non-coding (anti-sense) strand. Alternatively, the isolated, purified or enriched nucleic acids may comprise RNA.
As discussed in more detail below, the isolated, purified or enriched nucleic acids of one of SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33 and 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66 may be used to prepare one of the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65 respectively or fragments comprising at least 50, 75, 100, 200, 300, 500 or more consecutive amino acids of one of the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65.
Accordingly, another aspect of the present invention is an isolated, purified or enriched nucleic acid which encodes one of the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65 or fragments comprising at least 50, 75, 100, 150, 200, 300 or more consecutive amino acids of one of the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65. The coding sequences of these nucleic acids may be identical to one of the coding sequences of one of the nucleic acids of SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33 and 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66 or a fragment thereof or may be different coding sequences which encode one of the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65 or fragments comprising at least 50, 75, 100, 150, 200, 300 consecutive amino acids of one of the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65 as a result of the redundancy or degeneracy of the genetic code. The genetic code is well known to those of skill in the art and can be obtained, for example, from Stryer, Biochemistry, 3rd edition, W. H. Freeman & Co., New York.
The isolated, purified or enriched nucleic acid which encodes one of the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, may include, but is not limited to: (1) only the coding sequences of one of SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33 and 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66; (2) the coding sequences of SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33 and 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66 and additional coding sequences, such as leader sequences or proprotein; and (3) the coding sequences of SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33 and 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66 and non-coding sequences, such as introns or non-coding sequences 5' and/or 3' of the coding sequence. Thus, as used herein, the term "polynucleotide encoding a polypeptide" encompasses a polynucleotide that includes only coding sequence for the polypeptide as well as a polynucleotide that includes additional coding and/or non-coding sequence.
The invention relates to polynucleotides based on SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33 and 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66 but having polynucleotide changes that are "silent", for example changes which do not alter the amino acid sequence encoded by the polynucleotides of SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33 and 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66. The invention also relates to polynucleotides which have nucleotide changes which result in amino acid substitutions, additions, deletions, fusions and truncations of the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65. Such nucleotide changes may be introduced using techniques such as site directed mutagenesis, random chemical mutagenesis, exonuclease III deletion, and other recombinant DNA techniques.
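The notion of a "silent" change can be illustrated with a minimal Python sketch. It assumes the standard genetic code and uses short made-up sequences, not any sequence from the sequence listing; the function names `translate` and `is_silent` are hypothetical and chosen only for this illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_silent(original: str, variant: str) -> bool:
    """True if the variant differs in nucleotides but encodes the same protein."""
    return (original.upper() != variant.upper()
            and translate(original) == translate(variant))


# Made-up example: GCT -> GCC and CTG -> TTA keep Ala and Leu unchanged.
reference = "ATGGCTCTG"
silent_variant = "ATGGCCTTA"
missense_variant = "ATGGTTCTG"  # GCT -> GTT changes Ala to Val
```

Here `is_silent(reference, silent_variant)` holds because both sequences translate to the same three residues, while `missense_variant` alters the encoded polypeptide and so falls under the substitutions described in the second sentence of the paragraph.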

The isolated, purified or enriched nucleic acids of SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33 and 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, the sequences complementary thereto, or a fragment comprising at least 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400 or 500 consecutive bases of one of the sequences of SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33 and 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, or the sequences complementary thereto may be used as probes to identify and isolate DNAs encoding the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65 respectively. In such procedures, a genomic DNA library is constructed from a sample microorganism or a sample containing a microorganism capable of producing a lipopeptide. The genomic DNA library is then contacted with a probe comprising a coding sequence or a fragment of the coding sequence, encoding one of the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, or a fragment thereof under conditions which permit the probe to specifically hybridize to sequences complementary thereto. In a preferred embodiment, the probe is an oligonucleotide of about 10 to about 30 nucleotides in length designed based on a nucleic acid of SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33 and 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66. Genomic DNA clones which hybridize to the probe are then detected and isolated. Procedures for preparing and identifying DNA clones of interest are disclosed in Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons, Inc. 1997; and Sambrook et al., Molecular Cloning: A Laboratory Manual 2d Ed., Cold Spring Harbor Laboratory Press, 1989. In another embodiment, the probe is a restriction fragment or a PCR amplified nucleic acid derived from SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33 and 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66.
The isolated, purified or enriched nucleic acids of SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33 and 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, the sequences complementary thereto, or a fragment comprising at least 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400 or 500 consecutive bases of one of the sequences of SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33 and 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, or the sequences complementary thereto may be used as probes to identify and isolate related nucleic acids. In some embodiments, the related nucleic acids may be genomic DNAs (or cDNAs) from potential lipopeptide producers. In such procedures, a nucleic acid sample containing nucleic acids from a potential lipopeptide-producer is contacted with the probe under conditions that permit the probe to specifically hybridize to related sequences. The nucleic acid sample may be a genomic DNA (or cDNA) library from the potential lipopeptide-producer. Hybridization of the probe to nucleic acids is then detected using any of the methods described above.
Hybridization may be carried out under conditions of low stringency, moderate stringency or high stringency. As an example of nucleic acid hybridization, a polymer membrane containing immobilized denatured nucleic acids is first prehybridized for 30 minutes at 45 °C in a solution consisting of 0.9 M NaCl, 50 mM NaH2PO4, pH 7.0, 5.0 mM Na2EDTA, 0.5% SDS, 10X Denhardt's, and 0.5 mg/ml polyriboadenylic acid.
Approximately 2 x 10^7 cpm (specific activity 4-9 x 10^8 cpm/µg) of 32P end-labeled oligonucleotide probe are then added to the solution. After 12-16 hours of incubation, the membrane is washed for 30 minutes at room temperature in 1X SET (150 mM NaCl, 20 mM Tris hydrochloride, pH 7.8, 1 mM Na2EDTA) containing 0.5% SDS, followed by a 30 minute wash in fresh 1X SET at Tm-10°C for the oligonucleotide probe, where Tm is the melting temperature. The membrane is then exposed to autoradiographic film for detection of hybridization signals.
By varying the stringency of the hybridization conditions used to identify nucleic acids, such as genomic DNAs or cDNAs, which hybridize to the detectable probe, nucleic acids having different levels of homology to the probe can be identified and isolated. Stringency may be varied by conducting the hybridization at varying temperatures below the melting temperatures of the probes. The melting temperature of the probe may be calculated using the following formulas:
For oligonucleotide probes between 14 and 70 nucleotides in length the melting temperature (Tm) in degrees Celsius may be calculated using the formula:
Tm = 81.5 + 16.6(log [Na+]) + 0.41(fraction G+C) - (600/N), where N is the length of the oligonucleotide.
If the hybridization is carried out in a solution containing formamide, the melting temperature may be calculated using the equation:
Tm = 81.5 + 16.6(log [Na+]) + 0.41(fraction G+C) - (0.63% formamide) - (600/N), where N is the length of the probe.
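As a worked illustration of the two formulas above, the following Python sketch computes Tm for a probe sequence. Two points are assumptions rather than statements from the text: the logarithm is taken as base 10 with [Na+] in mol/L, and the G+C term is entered as a percentage (0 to 100), which is how this empirical formula is commonly applied, although the text writes "fraction G+C". The function name and the example probe are hypothetical.

```python
import math


def melting_temperature(probe: str, na_molar: float,
                        formamide_percent: float = 0.0) -> float:
    """Estimate Tm in degrees Celsius for an oligonucleotide probe.

    Assumptions (not stated in the text): log is base 10, [Na+] is in mol/L,
    and G+C content is expressed as a percentage of the probe length.
    """
    seq = probe.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("probe must not be empty")
    if na_molar <= 0:
        raise ValueError("sodium concentration must be positive")
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / n
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent   # zero when no formamide is present
            - 600.0 / n)


# Made-up 20-mer with 50% G+C, hybridized at 1 M Na+.
example_probe = "ATGC" * 5
tm_aqueous = melting_temperature(example_probe, na_molar=1.0)
tm_formamide = melting_temperature(example_probe, na_molar=1.0,
                                   formamide_percent=50.0)
```

With these inputs the salt term vanishes (log of 1 is 0), so the aqueous estimate reduces to 81.5 + 20.5 - 30, and adding 50% formamide lowers it by a further 31.5 degrees. The formula is stated for probes of 14 to 70 nucleotides; the sketch does not enforce that range.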

Prehybridization may be carried out in 6X SSC, 5X Denhardt's reagent, 0.5% SDS, 0.1 mg/ml denatured fragmented salmon sperm DNA or 6X SSC, 5X Denhardt's reagent, 0.5% SDS, 0.1 mg/ml denatured fragmented salmon sperm DNA, 50% formamide. The compositions of the SSC and Denhardt's solutions are listed in Sambrook et al., supra.
Hybridization is conducted by adding the detectable probe to the hybridization solutions listed above. Where the probe comprises double stranded DNA, it is denatured by incubating at elevated temperatures and quickly cooling before addition to the hybridization solution. It may also be desirable to similarly denature single stranded probes to eliminate or diminish formation of secondary structures or oligomerization.
The filter is contacted with the hybridization solution for a sufficient period of time to allow the probe to hybridize to cDNAs or genomic DNAs containing sequences complementary thereto or homologous thereto. For probes over 200 nucleotides in length, the hybridization may be carried out at 15-25 °C below the Tm.
For shorter probes, such as oligonucleotide probes, the hybridization may be conducted at 5-10 °C below the Tm. Preferably, the hybridization is conducted in 6X SSC for shorter probes.
Preferably, the hybridization is conducted in 50% formamide-containing solutions for longer probes.
All the foregoing hybridizations would be considered to be examples of hybridization performed under conditions of high stringency.
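The temperature offsets described above can be expressed as a small helper. This is a sketch only: the text distinguishes probes "over 200 nucleotides" from "shorter probes, such as oligonucleotide probes", and the code assumes that every probe of 200 nucleotides or fewer falls in the second group. The function name is hypothetical.

```python
def hybridization_window(tm_celsius: float, probe_length: int) -> tuple:
    """Return (lowest, highest) hybridization temperature in degrees Celsius.

    Probes over 200 nt: 15-25 degrees below Tm.
    Shorter probes (assumed here to mean 200 nt or fewer): 5-10 degrees below Tm.
    """
    if probe_length <= 0:
        raise ValueError("probe length must be positive")
    if probe_length > 200:
        smallest_offset, largest_offset = 15.0, 25.0
    else:
        smallest_offset, largest_offset = 5.0, 10.0
    return (tm_celsius - largest_offset, tm_celsius - smallest_offset)


# A 20-mer with an estimated Tm of 72 C, and a 500 nt probe with Tm of 85 C.
oligo_window = hybridization_window(72.0, 20)
long_probe_window = hybridization_window(85.0, 500)
```

The Tm passed in would come from whichever of the preceding formulas matches the hybridization solution in use.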
Following hybridization, the filter is washed for at least 15 minutes in 2X SSC, 0.1% SDS at room temperature or higher, depending on the desired stringency.
The filter is then washed with 0.1X SSC, 0.5% SDS at room temperature (again) for minutes to 1 hour.
Nucleic acids which have hybridized to the probe are identified by conventional autoradiography and non-radioactive detection methods.
The above procedure may be modified to identify nucleic acids having decreasing levels of homology to the probe sequence. For example, to obtain nucleic acids of decreasing homology to the detectable probe, less stringent conditions may be used. For example, the hybridization temperature may be decreased in increments of °C from 68 °C to 42 °C in a hybridization buffer having a Na+ concentration of approximately 1 M. Following hybridization, the filter may be washed with 2X SSC, 0.5% SDS at the temperature of hybridization. These conditions are considered to be "moderate stringency" conditions above 50°C and "low stringency" conditions below 50°C. A specific example of "moderate stringency" hybridization conditions is when the above hybridization is conducted at 55°C. A specific example of "low stringency" hybridization conditions is when the above hybridization is conducted at 45°C.
Alternatively, the hybridization may be carried out in buffers, such as 6X SSC, containing formamide at a temperature of 42 °C. In this case, the concentration of formamide in the hybridization buffer may be reduced in 5% increments from 50% to 0% to identify clones having decreasing levels of homology to the probe.
Following hybridization, the filter may be washed with 6X SSC, 0.5% SDS at 50 °C. These conditions are considered to be "moderate stringency" conditions above 25% formamide and "low stringency" conditions below 25% formamide. A specific example of "moderate stringency" hybridization conditions is when the above hybridization is conducted at 30% formamide. A specific example of "low stringency" hybridization conditions is when the above hybridization is conducted at 10% formamide.
Nucleic acids which have hybridized to the probe are identified by conventional autoradiography and non-radioactive detection methods.
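The two threshold rules above (50°C for the temperature series, 25% for the formamide series) can be summarized in a short Python sketch. The text does not say how a value exactly at the threshold is classified, so the code returns "unspecified" in that case; that label, and the function names, are assumptions made for this illustration.

```python
def stringency_by_temperature(temp_celsius: float) -> str:
    """Classify reduced-stringency hybridization at about 1 M Na+ by temperature."""
    if temp_celsius > 50:
        return "moderate"
    if temp_celsius < 50:
        return "low"
    return "unspecified"  # exactly 50 C is not classified in the text


def stringency_by_formamide(formamide_percent: float) -> str:
    """Classify hybridization at 42 C by formamide concentration."""
    if formamide_percent > 25:
        return "moderate"
    if formamide_percent < 25:
        return "low"
    return "unspecified"  # exactly 25% is not classified in the text


# Formamide series from 50% down to 0% in 5% steps, as described above.
formamide_series = list(range(50, -1, -5))
series_labels = [(pct, stringency_by_formamide(pct)) for pct in formamide_series]
```

The specific examples given in the text come out as expected under this reading: 55°C and 30% formamide are labeled moderate, 45°C and 10% formamide are labeled low.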
For example, the preceding methods may be used to isolate nucleic acids having a sequence with at least 97%, at least 95%, at least 90%, at least 85%, at least 80%, or at least 70% homology to a nucleic acid sequence selected from the group consisting of the sequences of SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33 and 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, fragments comprising at least 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400, or 500 consecutive bases thereof, and the sequences complementary thereto. Homology may be measured using BLASTN version 2.0 with the default parameters. For example, the homologous polynucleotides may have a coding sequence that is a naturally occurring allelic variant of one of the coding sequences described herein. Such allelic variant may have a substitution, deletion or addition of one or more nucleotides when compared to the nucleic acids of SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33 and 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, or the sequences complementary thereto.
Additionally, the above procedures may be used to isolate nucleic acids which encode polypeptides having at least 99%, at least 95%, at least 90%, at least 85%, at least 80%, or at least 70% homology to a polypeptide having the sequence of one of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, or fragments comprising at least 50, 75, 100, 150, 200, 300 consecutive amino acids thereof as determined using the BLASTP version 2.2.2 algorithm with default parameters.
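To make the percentage thresholds concrete, the following Python sketch computes a simple percent identity over two sequences that have already been aligned, and reports the highest listed threshold that the value meets. This is not BLASTN or BLASTP: those programs compute local alignments with their own scoring and statistics, whereas the sketch assumes a pre-computed global alignment with "-" as the gap character and treats "homology" as plain percent identity. All names and example sequences are hypothetical.

```python
# Thresholds named in the two preceding paragraphs, highest first.
THRESHOLDS = (99, 97, 95, 90, 85, 80, 70)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical residues ('-' is a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns where both sequences carry a gap.
    columns = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity: float):
    """Return the largest listed threshold that identity reaches, or None."""
    for threshold in THRESHOLDS:
        if identity >= threshold:
            return threshold
    return None


# Made-up aligned 10-mers differing at one position.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
```

In this example nine of ten columns match, so the pair meets the "at least 90%" level but not the 95% level.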
Another aspect of the present invention is an isolated or purified polypeptide comprising the sequence of one of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65 or fragments comprising at least 50, 75, 100, 150, 200 or 300 consecutive amino acids thereof. As discussed herein, such polypeptides may be obtained by inserting a nucleic acid encoding the polypeptide into a vector such that the coding sequence is operably linked to a sequence capable of driving the expression of the encoded polypeptide in a suitable host cell. For example, the expression vector may comprise a promoter, a ribosome binding site for translation initiation and a transcription terminator. The vector may also include appropriate sequences for modulating expression levels, an origin of replication and a selectable marker.
Promoters suitable for expressing the polypeptide or fragment thereof in bacteria include the E. coli lac or trp promoters, the lacI promoter, the lacZ promoter, the T3 promoter, the T7 promoter, the gpt promoter, the lambda PR promoter, the lambda PL promoter, promoters from operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), and the acid phosphatase promoter. Fungal promoters include the α factor promoter. Eukaryotic promoters include the CMV immediate early promoter, the HSV thymidine kinase promoter, heat shock promoters, the early and late SV40 promoter, LTRs from retroviruses, and the mouse metallothionein-I promoter. Other promoters known to control expression of genes in prokaryotic or eukaryotic cells or their viruses may also be used.
Mammalian expression vectors may also comprise an origin of replication, any necessary ribosome binding sites, a polyadenylation site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking nontranscribed sequences.
In some embodiments, DNA sequences derived from the SV40 splice and polyadenylation sites may be used to provide the required nontranscribed genetic elements.

Vectors for expressing the polypeptide or fragment thereof in eukaryotic cells may also contain enhancers to increase expression levels. Enhancers are cis-acting elements of DNA, usually from about 10 to about 300 bp in length, that act on a promoter to increase its transcription. Examples include the SV40 enhancer on the late side of the replication origin bp 100 to 270, the cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side of the replication origin, and the adenovirus enhancers.
In addition, the expression vectors preferably contain one or more selectable marker genes to permit selection of host cells containing the vector. Examples of selectable markers that may be used include genes encoding dihydrofolate reductase or genes conferring neomycin resistance for eukaryotic cell culture, genes conferring tetracycline or ampicillin resistance in E. coli, and the S. cerevisiae TRP1 gene.
In some embodiments, the nucleic acid encoding one of the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, or fragments comprising at least 50, 75, 100, 150, 200 or 300 consecutive amino acids thereof is assembled in appropriate phase with a leader sequence capable of directing secretion of the translated polypeptides or fragments thereof. Optionally, the nucleic acid can encode a fusion polypeptide in which one of the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65 or fragments comprising at least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 consecutive amino acids thereof is fused to heterologous peptides or polypeptides, such as N-terminal identification peptides which impart desired characteristics such as increased stability or simplified purification or detection.
The appropriate DNA sequence may be inserted into the vector by a variety of procedures. In general, the DNA sequence is ligated to the desired position in the vector following digestion of the insert and the vector with appropriate restriction endonucleases. Alternatively, appropriate restriction enzyme sites can be engineered into a DNA sequence by PCR. A variety of cloning techniques are disclosed in Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons, Inc. 1997 and Sambrook et al., Molecular Cloning: A Laboratory Manual 2d Ed., Cold Spring Harbor Laboratory Press, 1989. Such procedures and others are deemed to be within the scope of those skilled in the art.

The vector may be, for example, in the form of a plasmid, a viral particle, or a phage. Other vectors include derivatives of chromosomal, nonchromosomal and synthetic DNA sequences, viruses, bacterial plasmids, phage DNA, baculovirus, yeast plasmids, vectors derived from combinations of plasmids and phage DNA, viral DNA such as vaccinia, adenovirus, fowl pox virus, and pseudorabies. A variety of cloning and expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, N.Y., (1989).
Particular bacterial vectors which may be used include the commercially available plasmids comprising genetic elements of the well known cloning vector pBR322 (ATCC 37017), pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden), pGEM1 (Promega Biotec, Madison, WI, USA), pQE70, pQE60, pQE-9 (Qiagen), pD10, phiX174, pBluescript II KS, pNH8A, pNH16a, pNH18A, pNH46A (Stratagene), ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia), pKK232-8 and pCM7. Particular eukaryotic vectors include pSV2CAT, pOG44, pXT1, pSG (Stratagene), pSVK3, pBPV, pMSG, and pSVL (Pharmacia). However, any other vector may be used as long as it is replicable and stable in the host cell.
The host cell may be any of the host cells familiar to those skilled in the art, including prokaryotic cells or eukaryotic cells. As representative examples of appropriate hosts, there may be mentioned: bacterial cells, such as E. coli, Streptomyces, Bacillus subtilis, Salmonella typhimurium and various species within the genera Pseudomonas, Streptomyces, and Staphylococcus; fungal cells, such as yeast; insect cells such as Drosophila S2 and Spodoptera Sf9; animal cells such as CHO, COS or Bowes melanoma; and adenoviruses. The selection of an appropriate host is within the abilities of those skilled in the art.
The vector may be introduced into the host cells using any of a variety of techniques, including electroporation, transformation, transfection, transduction, viral infection, gene guns, or Ti-mediated gene transfer. Where appropriate, the engineered host cells can be cultured in conventional nutrient media modified as appropriate for activating promoters, selecting transformants or amplifying the genes of the present invention. Following transformation of a suitable host strain and growth of the host strain to an appropriate cell density, the selected promoter may be induced by appropriate means (e.g., temperature shift or chemical induction) and the cells may be cultured for an additional period to allow them to produce the desired polypeptide or fragment thereof.
Cells are typically harvested by centrifugation, disrupted by physical or chemical means, and the resulting crude extract is retained for further purification.
Microbial cells employed for expression of proteins can be disrupted by any convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents. Such methods are well known to those skilled in the art. The expressed polypeptide or fragment thereof can be recovered and purified from recombinant cell cultures by methods including ammonium sulfate or ethanol precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic interaction chromatography, affinity chromatography, hydroxylapatite chromatography and lectin chromatography. Protein refolding steps can be used, as necessary, in completing configuration of the polypeptide. If desired, high performance liquid chromatography (HPLC) can be employed for final purification steps.
Various mammalian cell culture systems can also be employed to express recombinant protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney fibroblasts (described by Gluzman, Cell, 23:175 (1981)), and other cell lines capable of expressing proteins from a compatible vector, such as the C127, 3T3, CHO, HeLa and BHK cell lines.
The constructs in host cells can be used in a conventional manner to produce the gene product encoded by the recombinant sequence. Depending upon the host employed in a recombinant production procedure, the polypeptide produced by host cells containing the vector may be glycosylated or may be non-glycosylated.
Polypeptides of the invention may or may not also include an initial methionine amino acid residue.
Alternatively, the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, or fragments comprising at least 50, 75, 100, 150, 200 or 300 consecutive amino acids thereof can be synthetically produced by conventional peptide synthesizers. In other embodiments, fragments or portions of the polynucleotides may be employed for producing the corresponding full-length polypeptide by peptide synthesis; therefore, the fragments may be employed as intermediates for producing the full-length polypeptides.

Cell-free translation systems can also be employed to produce one of the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, or fragments comprising at least 50, 75, 100, 150, 200 or 300 consecutive amino acids thereof using mRNAs transcribed from a DNA construct comprising a promoter operably linked to a nucleic acid encoding the polypeptide or fragment thereof. In some embodiments, the DNA construct may be linearized prior to conducting an in vitro transcription reaction. The transcribed mRNA is then incubated with an appropriate cell-free translation extract, such as a rabbit reticulocyte extract, to produce the desired polypeptide or fragment thereof.
The present invention also relates to variants of the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, or fragments comprising at least 50, 75, 100, 150, 200 or 300 consecutive amino acids thereof. The term "variant" includes derivatives or analogs of these polypeptides. In particular, the variants may differ in amino acid sequence from the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, by one or more substitutions, additions, deletions, fusions and truncations, which may be present in any combination.
The variants may be naturally occurring or created in vitro. In particular, such variants may be created using genetic engineering techniques such as site directed mutagenesis, random chemical mutagenesis, Exonuclease III deletion procedures, and standard cloning techniques. Alternatively, such variants, fragments, analogs, or derivatives may be created using chemical synthesis or modification procedures.
Other methods of making variants are also familiar to those skilled in the art.
These include procedures in which nucleic acid sequences obtained from natural isolates are modified to generate nucleic acids that encode polypeptides having characteristics which enhance their value in industrial or laboratory applications. In such procedures, a large number of variant sequences having one or more nucleotide differences with respect to the sequence obtained from the natural isolate are generated and characterized. Preferably, these nucleotide differences result in amino acid changes with respect to the polypeptides encoded by the nucleic acids from the natural isolates.

For example, variants may be created using error prone PCR. In error prone PCR, DNA amplification is performed under conditions where the fidelity of the DNA
polymerase is low, such that a high rate of point mutation is obtained along the entire length of the PCR product. Error prone PCR is described in Leung, D.W., et al., Technique, 1:11-15 (1989) and Caldwell, R. C. & Joyce G.F., PCR Methods Applic., 2:28-33 (1992). Variants may also be created using site directed mutagenesis to generate site-specific mutations in any cloned DNA segment of interest.
Oligonucleotide mutagenesis is described in Reidhaar-Olson, J.F. & Sauer, R.T., et al., Science, 241:53-57 (1988). Variants may also be created using directed evolution strategies such as those described in US patent nos. 6,361,974 and 6,372,497.
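As a rough illustration of the idea behind error prone PCR, the following Python sketch copies a template while substituting bases at random at a fixed per-base rate, so that point mutations are scattered along the entire length of the product. It is a toy model and not a simulation of any particular polymerase or protocol: the function names, the template sequence, the 2% error rate and the uniform choice of replacement base are all assumptions made for the example.

```python
import random

BASES = "ACGT"


def error_prone_copy(template, error_rate, rng):
    """Copy a DNA string; each base is replaced by a different base
    with probability error_rate (hypothetical, uniform error model)."""
    product = []
    for base in template:
        if rng.random() < error_rate:
            # Pick one of the three other bases at random.
            product.append(rng.choice([b for b in BASES if b != base]))
        else:
            product.append(base)
    return "".join(product)


def count_mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


# Arbitrary illustrative template, not a sequence from the sequence listing.
template = "ATGGCTCTGCGTGGTGTT" * 10
rng = random.Random(0)  # fixed seed so the run is repeatable

# A small "library" of variants, each carrying its own random point mutations.
library = [error_prone_copy(template, 0.02, rng) for _ in range(5)]
for variant in library:
    print(count_mismatches(template, variant), "point mutations")
```

Raising `error_rate` in this sketch plays the role of lowering polymerase fidelity: more positions differ from the template in each copy.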
The variants of the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, may be (i) variants in which one or more of the amino acid residues of the polypeptides of SEQ ID
NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33 and 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, are substituted with a conserved or non-conserved amino acid residue (preferably a conserved amino acid residue) and such substituted amino acid residue may or may not be one encoded by the genetic code.
Conservative substitutions are those that substitute a given amino acid in a polypeptide by another amino acid of like characteristics. Typically seen as conservative substitutions are the following replacements: replacement of an aliphatic amino acid such as Ala, Val, Leu and Ile with another aliphatic amino acid;
replacement of a Ser with a Thr or vice versa; replacement of an acidic residue such as Asp or Glu with another acidic residue; replacement of a residue bearing an amide group, such as Asn or Gln, with another residue bearing an amide group; exchange of a basic residue such as Lys or Arg with another basic residue; and replacement of an aromatic residue such as Phe or Tyr with another aromatic residue. Other variants are those in which one or more of the amino acid residues of the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65 include a substituent group.
Still other variants are those in which the polypeptide is associated with another compound, such as a compound to increase the half-life of the polypeptide (for example, polyethylene glycol).
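The conservative replacement classes listed above can be written down as a small lookup, which may help when checking whether a given substitution falls within one of them. The sketch below is illustrative only: the group labels (in particular "hydroxyl" for the Ser/Thr pair) and the function name are assumptions, and the groups contain only the residues named in the text.

```python
# Replacement classes taken from the text; the dictionary keys are
# labels chosen for this example.
CONSERVATIVE_GROUPS = {
    "aliphatic": {"Ala", "Val", "Leu", "Ile"},
    "hydroxyl": {"Ser", "Thr"},
    "acidic": {"Asp", "Glu"},
    "amide": {"Asn", "Gln"},
    "basic": {"Lys", "Arg"},
    "aromatic": {"Phe", "Tyr"},
}


def is_conservative(original, replacement):
    """True when both residues differ and belong to the same class."""
    if original == replacement:
        return False  # no substitution took place
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS.values()
    )


print(is_conservative("Ala", "Leu"))  # aliphatic for aliphatic
print(is_conservative("Asp", "Lys"))  # acidic for basic
```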

Additional variants are those in which additional amino acids are fused to the polypeptide, such as a leader sequence, a secretory sequence, a proprotein sequence or a sequence which facilitates purification, enrichment, or stabilization of the polypeptide.
In some embodiments, the fragments, derivatives and analogs retain the same biological function or activity as the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65. In other embodiments, the fragment, derivative or analogue includes a fused heterologous sequence which facilitates purification, enrichment, detection, stabilization or secretion of the polypeptide that can be enzymatically cleaved, in whole or in part, away from the fragment, derivative or analogue.
Another aspect of the present invention relates to polypeptides or fragments thereof which have at least 70%, at least 80%, at least 85%, at least 90%, or more than 95% homology to one of the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, or a fragment comprising at least 50, 75, 100, 150, 200 or 300 consecutive amino acids thereof. Homology may be determined using a program, such as BLASTP version 2 with the default parameters, or other like programs which align the polypeptides or fragments being compared and determine the extent of amino acid identity or similarity between them. It will be appreciated that amino acid "homology" includes conservative substitutions such as those described above.
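To make the distinction between identity and similarity concrete, the following sketch scores two sequences that have already been aligned (gaps written as "-"). It is a simplified illustration and not the BLASTP calculation: the function name, the example sequences, the choice to count every alignment column in the denominator, and the use of only the conservative classes named earlier as the definition of "similar" are all assumptions.

```python
# One-letter versions of the conservative classes named in the text.
SIMILAR_GROUPS = ["AVLI", "ST", "DE", "NQ", "KR", "FY"]


def percent_identity_and_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) over all columns
    of a pre-computed pairwise alignment. Gap columns count as mismatches."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = 0
    similar = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # a gap is neither identical nor similar
        if x == y:
            identical += 1
            similar += 1
        elif any(x in group and y in group for group in SIMILAR_GROUPS):
            similar += 1
    columns = len(aligned_a)
    return 100.0 * identical / columns, 100.0 * similar / columns


# Hypothetical aligned fragments: 4 identical columns, 2 conservative
# replacements (K/R and V/I) and 1 non-conservative replacement (E/Q).
identity, similarity = percent_identity_and_similarity("MKALVDE", "MRALIDQ")
print(f"identity {identity:.1f}%, similarity {similarity:.1f}%")
```

Different programs use different denominators (alignment length, shorter sequence, or ungapped columns), so figures from a sketch like this are not interchangeable with values reported by an alignment program.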
The polypeptides or fragments having homology to one of the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, or a fragment comprising at least 50, 75, 100, 150, 200 or 300 consecutive amino acids thereof may be obtained by isolating the nucleic acids encoding them using the techniques described above.
Alternatively, the homologous polypeptides or fragments may be obtained through biochemical enrichment or purification procedures. The sequence of potentially homologous polypeptides or fragments may be determined by proteolytic digestion, gel electrophoresis and/or microsequencing. The sequence of the prospective homologous polypeptide or fragment can be compared to one of the polypeptides of SEQ ID
NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, or a fragment comprising at least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 consecutive amino acids thereof using a program such as BLASTP version 2 with the default parameters.
The polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, or fragments, derivatives or analogs thereof comprising at least 40, 50, 75, 100, 150, 200 or 300 consecutive amino acids thereof may be used in a variety of applications. For example, the polypeptides or fragments, derivatives or analogs thereof may be used to catalyze certain biochemical reactions. In particular, the polypeptide of the OXAIJ family, namely SEQ ID NO: 35 or fragments, derivatives or analogs thereof may be used, in vitro or in vivo, to catalyze oxidation reactions to modify acyl fatty acid precursors that are either endogenously produced by the host, supplemented to the growth medium, or are added to a cell-free, purified or enriched preparation of said polypeptide; the ADLF and ADLE families, namely SEQ ID NOS: 2 and 37 or fragments, derivatives or analogs thereof may be used, in vitro or in vivo, to catalyze activation and tethering to acyl carrier proteins, or to themselves (ADLF; SEQ ID NO: 2) of acyl fatty acids that are either endogenously produced by the host, supplemented to the growth medium, or are added to a cell-free, purified or enriched preparation of said polypeptide; the ACPH family, namely SEQ ID NO: 39 or fragments, derivatives or analogs thereof may be used, in vitro or in vivo, to be loaded with activated acyl fatty acids that are either endogenously produced by the host, supplemented to the growth medium, or are added to a cell-free, purified or enriched preparation of said polypeptide.
Polypeptides of the PPST family, namely SEQ ID NOS: 4, 7, 9, 11, 13, 41, 43, 45 and 47, or fragments, derivatives or analogs thereof may be used in any combination, in vitro or in vivo, to direct the synthesis of peptides of determined amino acid composition either in their natural context or in hybrid polypeptide synthetase systems originating from different nonribosomal peptide biosynthetic loci. Families OXAB, namely SEQ ID NOS: 20 and 53, and OXDD, namely SEQ ID NOS: 24 and 57, or fragments, derivatives or analogs thereof may be used in any combination, in vitro or in vivo, to catalyze oxidation reactions that modify compounds that are either endogenously produced by the host, supplemented to the growth medium, or are added to a cell-free, purified or enriched preparation of said polypeptide. Families MTAG, namely SEQ ID NOS: 22 and 55, and MTFZ, namely SEQ ID NOS: 32 and 65, or fragments, derivatives or analogs thereof may be used in any combination, in vitro or in vivo, to catalyze transfer of methyl groups modifying compounds that are either endogenously produced by the host, supplemented to the growth medium, or are added to a cell-free, purified or enriched preparation of said polypeptide. Polypeptides of the families ABCD, namely SEQ ID NOS: 26 and 59, MEMD, namely SEQ ID NOS: 28 and 61 and MEMT, namely SEQ ID NOS: 30 and 63, or fragments, derivatives or analogs thereof may be used in any combination, to confer to microorganisms or eukaryotic cells resistance to lipopeptides or to increase the yield of lipopeptides in either naturally producing organisms or heterologously producing recombinant organisms.
The polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, or fragments, derivatives or analogues thereof comprising at least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 consecutive amino acids thereof, may also be used to generate antibodies which bind specifically to the polypeptides or fragments, derivatives or analogues. The antibodies generated from SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 may be used to determine whether a biological sample contains Streptomyces fradiae or a related microorganism. The antibodies generated from SEQ ID NOS: 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65 may be used to determine whether a biological sample contains Streptomyces refuineus or a related microorganism.
In such procedures, a biological sample is contacted with an antibody capable of specifically binding to one of the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, or fragments comprising at least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 consecutive amino acids thereof. The ability of the biological sample to bind to the antibody is then determined. For example, binding may be determined by labeling the antibody with a detectable label such as a fluorescent agent, an enzymatic label, or a radioisotope. Alternatively, binding of the antibody to the sample may be detected using a secondary antibody having such a detectable label thereon. A variety of assay protocols which may be used to detect the presence of a lipopeptide-producer, a Streptomyces fradiae organism, a Streptomyces refuineus organism or polypeptides related to SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, in a sample are familiar to those skilled in the art. Particular assays include ELISA assays, sandwich assays, radioimmunoassays, and Western Blots. Alternatively, antibodies generated from SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65 may be used to determine whether a biological sample contains related polypeptides that may be involved in the biosynthesis of A54145-type natural products or other lipopeptides.
Polyclonal antibodies generated against the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, or fragments comprising at least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 consecutive amino acids thereof can be obtained by direct injection of the polypeptides into an animal or by administering the polypeptides to an animal, preferably a nonhuman. The antibody so obtained will then bind the polypeptide itself.
In this manner, even a sequence encoding only a fragment of the polypeptide can be used to generate antibodies that may bind to the whole native polypeptide.
Such antibodies can then be used to isolate the polypeptide from cells expressing that polypeptide.
For preparation of monoclonal antibodies, any technique which provides antibodies produced by continuous cell line cultures can be used. Examples include the hybridoma technique (Kohler and Milstein, 1975, Nature, 256:495-497), the trioma technique, the human B-cell hybridoma technique (Kozbor et al., 1983, Immunology Today 4:72), and the EBV-hybridoma technique (Cole, et al., 1985, in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Techniques described for the production of single chain antibodies (U.S. Patent 4,945,778) can be adapted to produce single chain antibodies to the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, or fragments comprising at least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 consecutive amino acids thereof. Alternatively, transgenic mice may be used to express humanized antibodies to these polypeptides or fragments thereof.
Antibodies generated against the polypeptides of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32 and 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, or fragments comprising at least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 consecutive amino acids thereof may be used in screening for similar polypeptides from a sample containing organisms or cell-free extracts thereof. In such techniques, polypeptides from the sample are contacted with the antibodies and those polypeptides which specifically bind the antibody are detected. Any of the procedures described above may be used to detect antibody binding. One such screening assay is described in "Methods for measuring Cellulase Activities", Methods in Enzymology, Vol 160, pp. 87-116.
The present invention will be further described with reference to the following examples; however, it is to be understood that the present invention is not limited to such examples.
Example 1: Identification and sequencing of A541
Streptomyces fradiae strain NRRL 18158 was known to express the lipopeptide antibiotic complex A-54145. The structure of lipopeptide antibiotic complex A-54145 is known as shown in Figure 1. The peptide backbone of the chemical structure of A54145 clearly implicates the presence of NRPS enzymes in the biosynthesis of this compound. A DNA library was constructed using Streptomyces fradiae strain NRRL 18158 genomic DNA. Cosmids were selected by hybridization with NRPS specific oligonucleotide probes. The selected cosmids were screened by DNA sequencing and analyzed for the presence of NRPS encoding genes. Three overlapping cosmid clones shown to have a substantial NRPS gene content were selected for further studies. DNA sequence analysis of these cosmids revealed the presence of one partial NRPS gene, namely A541 ORFs 2 and 3 (SEQ ID NOS: 4 and 7), and three complete NRPS genes, namely A541 ORFs 4, 5 and 6 (SEQ ID NOS: 9, 11 and 13). Analysis of these ORFs demonstrated the presence of conserved typical NRPS domains involved in the recognition, activation, modification and condensation of amino acids. A total of 13 modules responsible for the condensation of 13 amino acid residues were identified.
A54145 is composed of 13 amino acids, providing an indication that the cloned locus might be the one responsible for A54145 biosynthesis. The adenylation domains were further examined for the specificity of the amino acids that they activate and tether to the PCP domain of the NRPS. The predicted specificities clearly corresponded to the nature and order of the amino acid residues found in the A54145 chemical structure, providing conclusive evidence for the role of the cloned locus in the biosynthesis of the A54145 components (Figures 1 and 13). Further evidence was provided by the presence of a methylation domain found in ORF 3, module 5, specifying the amino acid glycine. Chemical characterization of A54145 showed that the amino acid incorporated in the fifth position is an N-methylated glycine (sarcosine) (Figures 1 and 13).
The nature and order of the amino acids specified by the NRPS genes as well as the presence of domains involved in the modification of some of the amino acids clearly demonstrate that the Streptomyces fradiae locus cloned and analyzed is involved in the expression of the lipopeptide complex A54145.
Example 2: Genes and proteins of A541
A541 is formed of three DNA contiguous sequences (SEQ ID NOS: 1, 6 and 17) arranged such that, as found within the A54145 biosynthetic locus, DNA contig 1 (SEQ ID NO: 1) is adjacent to the 5' end of DNA contig 2 (SEQ ID NO: 6) which in turn is adjacent to DNA contig 3 (SEQ ID NO: 17). More than 19 kilobases of DNA sequence were analyzed on each side of the A54145 locus and these regions contain primary metabolic genes. The order and relative position of the 15 ORFs representing the proteins of A541 are provided in Figure 1. Contiguous nucleotide sequences and deduced amino acid sequences of A541 are provided in the accompanying sequence listing.
Contig 1 is formed of the 13315 base pairs provided in SEQ ID NO: 1 and contains ORFs 1 and 2 of A541. The gene product of A541 ORF 1 (SEQ ID NO: 2) is the 723 amino acids deduced from the nucleic acid sequence of SEQ ID NO: 3 which is drawn from residues 1 to 2172 (sense strand) of contig 1 (SEQ ID NO: 1). The gene product of A541 ORF 2 (SEQ ID NO: 4) is the 3700 amino acids representing the N-terminus of the polypeptide deduced from the nucleic acid sequence of SEQ ID NO: 5 which is drawn from residues 2216 to 13315 (sense strand) of contig 1 (SEQ ID NO: 1).
Contig 2 is formed of the 37360 base pairs provided in SEQ ID NO: 6 and contains ORFs 3-7 of A541. The gene product of A541 ORF 3 (SEQ ID NO: 7) is the 2595 amino acids representing the C-terminus of the polypeptide deduced from the nucleic acid sequence of SEQ ID NO: 8 which is drawn from residues 2 to 7789 (sense strand) of contig 2 (SEQ ID NO: 6). The gene product of A541 ORF 4 (SEQ ID NO: 9) is the 2143 amino acids deduced from the nucleic acid sequence of SEQ ID NO: 10 which is drawn from residues 7786 to 14217 (sense strand) of contig 2 (SEQ ID NO: 6). The gene product of A541 ORF 5 (SEQ ID NO: 11) is the 5245 amino acids deduced from the nucleic acid sequence of SEQ ID NO: 12 which is drawn from residues 14217 to 29954 (sense strand) of contig 2 (SEQ ID NO: 6). The gene product of A541 ORF 6 (SEQ ID NO: 13) is the 2384 amino acids deduced from the nucleic acid sequence of SEQ ID NO: 14 which is drawn from residues 29954 to 37108 (sense strand) of contig 2 (SEQ ID NO: 6). The gene product of A541 ORF 7 (SEQ ID NO: 15) is the 78 amino acids deduced from SEQ ID NO: 16 which is drawn from residues 37111 to 37347 of contig 2 (SEQ ID NO: 6).
Contig 3 (SEQ ID NO: 17) is formed of 8321 base pairs provided in SEQ ID NO: 17 and contains ORFs 8-15 of A541. The gene product of ORF 8 (SEQ ID NO: 18) is the 264 amino acids deduced from SEQ ID NO: 19 which is drawn from residues 57 to 851 of contig 3 (SEQ ID NO: 17). The gene product of ORF 9 (SEQ ID NO: 20) is the 331 amino acids of SEQ ID NO: 21 which is drawn from residues 863-1858 of contig 3 (SEQ ID NO: 17). The gene product of A541 ORF 10 (SEQ ID NO: 22) is the 262 amino acids deduced from SEQ ID NO: 23 which is drawn from residues 1855 to 2643 of contig 3 (SEQ ID NO: 17). The gene product of ORF 11 (SEQ ID NO: 24) is the 319 amino acids deduced from SEQ ID NO: 25 which is drawn from residues 2713 to 3672 (sense strand) of contig 3 (SEQ ID NO: 17). The gene product of A541 ORF 12 (SEQ ID NO: 26) is the 353 amino acids deduced from SEQ ID NO: 27 which is drawn from residues 3672 to 4733 (sense strand) of contig 3 (SEQ ID NO: 17). The gene product of A541 ORF 13 (SEQ ID NO: 28) is the 283 amino acids of SEQ ID NO: 29 which is drawn from residues 4730 to 5578 (sense strand) of contig 3 (SEQ ID NO: 17). The gene product of A541 ORF 14 (SEQ ID NO: 30) is the 206 amino acids of SEQ ID NO: 31 which is drawn from residues 6263 to 5643 (anti-sense strand) of contig 3 (SEQ ID NO: 17). The gene product of A541 ORF 15 (SEQ ID NO: 32) is the 352 amino acids deduced from SEQ ID NO: 33 which is drawn from residues 7093 to 8151 (sense strand) of contig 3 (SEQ ID NO: 17).
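The relationship between the residue ranges and the amino acid counts quoted above can be checked with simple arithmetic. The sketch below assumes that the quoted coordinates are inclusive and that the range of a complete ORF ends with its stop codon; both are assumptions, although they agree with the figures quoted for the four ORFs used here. The function name and the `has_stop_codon` flag are introduced only for the example.

```python
def orf_length_aa(start, end, has_stop_codon=True):
    """Amino acid count implied by an inclusive nucleotide range.

    Works for either strand because only the span matters. A complete
    ORF is assumed to include one stop codon that is not translated.
    """
    span = abs(end - start) + 1
    if span % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    codons = span // 3
    return codons - 1 if has_stop_codon else codons


# (start, end, has_stop_codon, amino acids quoted in the text)
examples = {
    "ORF 1, contig 1": (1, 2172, True, 723),
    "ORF 2, contig 1 (partial, runs to the end of the contig)": (2216, 13315, False, 3700),
    "ORF 4, contig 2": (7786, 14217, True, 2143),
    "ORF 14, contig 3 (anti-sense strand)": (6263, 5643, True, 206),
}

for name, (start, end, has_stop, quoted) in examples.items():
    computed = orf_length_aa(start, end, has_stop)
    print(f"{name}: computed {computed}, quoted {quoted}")
```

For the partial ORF 2, which represents only the N-terminus of its polypeptide, the range is treated as lacking a stop codon, so every codon in the range contributes an amino acid.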
Some open reading frames listed herein initiate with non-standard initiation codons (e.g. GTG - Valine or CTG - Leucine) rather than the standard initiation codon ATG, namely ORFs 1, 2, 4 and 13 (SEQ ID NOS: 2, 4, 9 and 28). All ORFs are listed with the appropriate M, V or L amino acids at the amino-terminal position to indicate the specificity of the first codon of the ORF. It is expected, however, that in all cases the biosynthesized protein will contain a methionine residue, and more specifically a formylmethionine residue, at the amino terminal position, in keeping with the widely accepted principle that protein synthesis in bacteria initiates with methionine (formylmethionine) even when the encoding gene specifies a non-standard initiation codon (e.g. Stryer, Biochemistry 3rd edition, 1998, W.H. Freeman and Co., New York, pp. 752-754).
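The difference between the listed sequence (first residue shown as M, V or L according to the first codon) and the expected biosynthesized protein (methionine at the amino terminus) can be shown with a short translation sketch. The function name, the `initiator_as_met` flag and the two short test sequences are hypothetical; the codon table is the standard genetic code.

```python
# Standard genetic code, built from the conventional TCAG ordering.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def translate(orf, initiator_as_met=False):
    """Translate an ORF up to its first stop codon.

    With initiator_as_met=False the first codon is read literally
    (GTG -> V, CTG -> L), as in the listed sequences. With
    initiator_as_met=True the first residue is reported as M,
    reflecting initiation with (formyl)methionine.
    """
    protein = []
    for index in range(0, len(orf) - len(orf) % 3, 3):
        residue = CODON_TABLE[orf[index:index + 3]]
        if residue == "*":
            break
        if index == 0 and initiator_as_met:
            residue = "M"
        protein.append(residue)
    return "".join(protein)


# Hypothetical 4-codon ORF starting with GTG (not from the sequence listing).
print(translate("GTGGCTCTGTAA"))                         # as listed
print(translate("GTGGCTCTGTAA", initiator_as_met=True))  # as biosynthesized
```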
Three deposits, namely E. coli DH10B (184CM) strain, E. coli DH10B (184CA) strain and E. coli DH10B (184CJ) strain harbouring the cosmid clone referred to in parenthesis which together span the biosynthetic locus for the A54145 compound from Streptomyces fradiae have been deposited with the International Depositary Authority of Canada, Bureau of Microbiology, Health Canada, 1015 Arlington Street, Winnipeg, Manitoba, Canada R3E 3R2 on February 26, 2002 and were assigned deposit accession numbers IDAC 260202-1, 260202-2 and 260202-3 respectively. The E. coli strain deposits are referred to herein as "the deposited strains". The part of the A541 locus covered by each of the deposited cosmids 184CM, 184CA and 184CJ is indicated in Figure 1.
In order to identify the function of the proteins in A541, SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30 and 32 were compared, using the BLASTP version 2.2.1 algorithm with the default parameters, to sequences in the National Center for Biotechnology Information (NCBI) nonredundant proteins database and the DECIPHER® database of microbial genes, pathways and natural products (Ecopia BioSciences Inc., St.-Laurent, QC, Canada).
The accession numbers of the top GenBank hits of this BLAST analysis are presented in Table 2 along with the corresponding E value. The E value relates the expected number of chance alignments with an alignment score at least equal to the observed alignment score. An E value of 0.00 indicates a perfect homolog or nearly perfect homolog. The E values are calculated as described in Altschul et al., J. Mol. Biol., 1990 October 5; 215(3): 403-10. The E value assists in the determination of whether two sequences display sufficient similarity to justify an inference of homology.
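For orientation, the E value of a BLAST hit can be approximated from its bit score S' and the sizes of the search space as E = m x n x 2^(-S'), where m is the query length and n is the total length of the database. The sketch below applies that relation; it ignores the edge-effect corrections that BLAST applies to m and n, and the function name and the numbers used are hypothetical rather than values from Table 2.

```python
def expect_value(bit_score, query_length, database_length):
    """Approximate E value from a bit score: E = m * n * 2**(-S').

    Simplified form without BLAST's effective-length corrections.
    """
    return query_length * database_length * 2.0 ** (-bit_score)


# Hypothetical search: a 500-residue query against 10**9 database residues.
for score in (30, 60, 120):
    print(score, expect_value(score, 500, 10**9))
```

Each additional bit of score halves the expected number of chance alignments, which is why strong hits have E values so small that they are displayed as zero.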

Table 2 [table contents not recoverable from the source; surviving column headers: ORF, SEQ ID NO, Family; one legible row: ORF 4, SEQ ID NO: 9, PPST; ORF 4, SEQ ID NO: 41, PPST]
Example 3: Identification and sequencing of the 024A locus
024A was identified as a secondary metabolic biosynthetic locus using the genome scanning method described in detail in USSN 10/232,370, the contents of which are hereby incorporated by reference. The sequence information for 024A was then deposited into the DECIPHER® database of natural product biosynthetic genes, loci and products (Ecopia BioSciences Inc., St.-Laurent, Canada). 024A was identified from the DECIPHER® database as a lipopeptide biosynthetic locus using the method described in detail in co-pending application USSN 10/XXX,XXX entitled Compositions, Methods and Systems for the Discovery of Lipopeptides, filed concurrently with the present application and also claiming priority from USSN 60/342,133 and USSN 60/372,789. The contents of USSN 10/XXX,XXX are incorporated herein in their entirety for all purposes.
Example 4: Genes and Proteins for 024A
The 024A locus includes the 61944 contiguous base pairs provided in SEQ ID NO: 34 and contains the 16 ORFs provided in SEQ ID NOS: 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63 and 65. More than 16 kilobases of DNA sequence were analyzed on each side of the 024A locus and these regions contain primary metabolic genes. The order and relative position of the 16 ORFs representing the genes of 024A are provided in Figure 2. The accompanying sequence listing provides the nucleotide sequence of the 16 ORFs and the corresponding deduced polypeptides.
The gene product of 024A ORF 1 (SEQ ID NO: 35) is the 573 amino acids deduced from SEQ ID NO: 36, which is drawn from residues 1 to 1722 (sense strand) of SEQ ID NO: 34. The gene product of 024A ORF 2 (SEQ ID NO: 37) is the 601 amino acids deduced from SEQ ID NO: 38, which is drawn from residues 2666 to 4471 (sense strand) of SEQ ID NO: 34. The gene product of 024A ORF 3 (SEQ ID NO: 39) is the 99 amino acids deduced from SEQ ID NO: 40, which is drawn from residues 4637 to 4936 (sense strand) of SEQ ID NO: 34. The gene product of 024A ORF 4 (SEQ ID NO: 41) is the 6291 amino acids deduced from SEQ ID NO: 42, which is drawn from residues 5061 to 23936 (sense strand) of SEQ ID NO: 34. The gene product of 024A ORF 5 (SEQ ID NO: 43) is the 2135 amino acids deduced from SEQ ID NO: 44, which is drawn from residues 23933 to 30340 (sense strand) of SEQ ID NO: 34. The gene product of 024A ORF 6 (SEQ ID NO: 45) is the 5245 amino acids deduced from SEQ ID NO: 46, which is drawn from residues 30337 to 46074 (sense strand) of SEQ ID NO: 34. The gene product of 024A ORF 7 (SEQ ID NO: 47) is the 2394 amino acids of SEQ ID NO: 48, which is drawn from residues 46074 to 53258 (sense strand) of SEQ ID NO: 34. The gene product of 024A ORF 8 (SEQ ID NO: 49) is the 78 amino acids deduced from SEQ ID NO: 50, which is drawn from residues 53262 to 53498 (sense strand) of SEQ ID NO: 34. The gene product of 024A ORF 9 (SEQ ID NO: 51) is the 271 amino acids deduced from SEQ ID NO: 52, which is drawn from residues 53687 to 54502 (sense strand) of SEQ ID NO: 34. The gene product of 024A ORF 10 (SEQ ID NO: 53) is the 318 amino acids deduced from SEQ ID NO: 54, which is drawn from residues 54499 to 55455 (sense strand) of SEQ ID NO: 34. The gene product of 024A ORF 11 (SEQ ID NO: 55) is the 269 amino acids deduced from SEQ ID NO: 56, which is drawn from residues 55540 to 56349 (sense strand) of SEQ ID NO: 34. The gene product of 024A ORF 12 (SEQ ID NO: 57) is the 319 amino acids deduced from SEQ ID NO: 58, which is drawn from residues 56448 to 57407 (sense strand) of SEQ ID NO: 34. The gene product of 024A ORF 13 (SEQ ID NO: 59) is the 340 amino acids deduced from SEQ ID NO: 60, which is drawn from residues 57407 to 58429 (sense strand) of SEQ ID NO: 34. The gene product of 024A ORF 14 (SEQ ID NO: 61) is the 282 amino acids deduced from SEQ ID NO: 62, which is drawn from residues 58426 to 59274 (sense strand) of SEQ ID NO: 34. The gene product of 024A ORF 15 (SEQ ID NO: 63) is the 205 amino acids deduced from SEQ ID NO: 64, which is drawn from residues 59924 to 59307 (antisense strand) of SEQ ID NO: 34. The gene product of 024A ORF 16 (SEQ ID NO: 65) is the 205 amino acids of SEQ ID NO: 66, which is drawn from residues 60814 to 61944 (sense strand) of SEQ ID NO: 34.
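The relationship between the residue coordinates and the reported polypeptide lengths can be checked arithmetically. The following is a minimal sketch, assuming (this is an assumption, not a statement from the sequence listing) that each coordinate range is inclusive and ends with a stop codon; the function name and the selection of ORFs are illustrative only.

```python
# Coordinates and reported lengths transcribed from the text above:
# ORF number -> (first residue, last residue, reported amino acids).
# Assumption: every range is inclusive and includes the stop codon.
ORFS = {
    1: (1, 1722, 573),
    2: (2666, 4471, 601),
    4: (5061, 23936, 6291),
    5: (23933, 30340, 2135),
    15: (59924, 59307, 205),  # antisense strand: first residue > last residue
}


def expected_residues(start: int, end: int) -> int:
    """Amino acids encoded by an inclusive nucleotide range ending in a stop codon."""
    span = abs(end - start) + 1
    if span % 3 != 0:
        raise ValueError(f"span of {span} nt is not a whole number of codons")
    return span // 3 - 1  # subtract the stop codon


for orf, (start, end, reported) in ORFS.items():
    calculated = expected_residues(start, end)
    print(f"ORF {orf}: {calculated} aa calculated, {reported} aa reported")
```

For the ORFs listed in the sketch, the calculated and reported lengths agree under the stated assumption.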
Some open reading frames listed herein, namely ORFs 2, 5, 6 and 14 (SEQ ID NOS: 37, 43, 45 and 61), initiate with non-standard initiation codons (e.g. GTG - Valine or CTG - Leucine) rather than the standard initiation codon ATG. All ORFs are listed with the appropriate M, V or L amino acid at the amino-terminal position to indicate the specificity of the first codon of the ORF. It is expected, however, that in all cases the biosynthesized protein will contain a methionine residue, and more specifically a formylmethionine residue, at the amino-terminal position, in keeping with the widely accepted principle that protein synthesis in bacteria initiates with methionine (formylmethionine) even when the encoding gene specifies a non-standard initiation codon (e.g. Stryer, Biochemistry 3rd edition, 1998, W.H. Freeman and Co., New York, pp. 752-754).
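The distinction between the residue listed for the first codon and the residue expected in the biosynthesized protein can be sketched as follows. This is an illustration only; the function name and return convention are hypothetical and are not part of the sequence listing.

```python
# Initiation codons named in the text and the residue each one would
# specify if it were read as an internal codon.
START_CODONS = {"ATG": "M", "GTG": "V", "CTG": "L"}


def first_residue(codon: str) -> tuple:
    """Return (residue listed for the first codon, residue expected in the protein).

    The second element is always "M" because bacterial translation is
    expected to initiate with (formyl)methionine whatever the start codon.
    """
    codon = codon.upper()
    if codon not in START_CODONS:
        raise ValueError(f"{codon!r} is not one of the initiation codons considered here")
    return START_CODONS[codon], "M"


for codon in ("ATG", "GTG", "CTG"):
    listed, expected = first_residue(codon)
    print(f"{codon}: listed as {listed}, expected in the protein as {expected}")
```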
Two deposits, namely E. coli DH10B (024CC) strain and E. coli DH10B (024CK) strain, each harbouring the cosmid clone referred to in parentheses, which together span the biosynthetic locus for the A54145-like compound from Streptomyces refuineus, have been deposited with the International Depositary Authority of Canada, Bureau of Microbiology, Health Canada, 1015 Arlington Street, Winnipeg, Manitoba, Canada 3R2 on February 26, 2002 and were assigned deposit accession numbers IDAC 260202-4 and 260202-5. The E. coli strain deposits are referred to herein as "the deposited strains". The part of the 024A locus covered by each of the deposited cosmids 024CC and 024CK is indicated in Figure 2.
The deposited cosmids 184CM, 184CA, 184CJ, 024CC and 024CK span A541 and 024A. The sequence of the polynucleotides comprised in the deposited strains, as well as the amino acid sequence of any polypeptide encoded thereby, are controlling in the event of any conflict with any description of sequences herein. The deposit of the deposited strains has been made under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for Purposes of Patent Procedure. The deposited strains will be irrevocably and without restriction or conditions released to the public upon the issuance of a patent. A license may be required to make, use or sell the deposited strains and any compounds therefrom, and no such license is hereby granted.
In order to identify the function of the proteins in 024A, SEQ ID NOS: 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61 and 63 were compared, using the BLASTP version 2.2.2 algorithm with the default parameters, to sequences in the National Center for Biotechnology Information (NCBI) non-redundant protein database and the DECIPHER® database of microbial genes, pathways and natural products (Ecopia BioSciences Inc., St.-Laurent, QC, Canada). The accession numbers of the top GenBank hits of this BLAST analysis are presented in Table 4 along with the corresponding E values. The E value is the number of alignments expected by chance with an alignment score at least equal to the observed alignment score.
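For orientation, the E value is conventionally related to the normalised bit score S' by E = m x n x 2^(-S'), where m is the query length and n the total database length. The sketch below applies that textbook relationship only; it ignores the effective-length corrections that BLAST applies, and the numerical inputs are hypothetical rather than values taken from Table 4.

```python
def expect_value(bit_score: float, query_length: int, database_length: float) -> float:
    """Expected number of chance alignments scoring at least `bit_score` bits.

    Uses E = m * n * 2**(-S'); edge-effect corrections are deliberately omitted.
    """
    return query_length * database_length * 2.0 ** (-bit_score)


# Hypothetical example: a 600-residue query against 5e8 database residues.
for score in (30.0, 50.0, 100.0):
    print(f"bit score {score:5.1f} -> E = {expect_value(score, 600, 5e8):.3g}")
```

Each additional bit of score halves the E value, which is why strong hits are reported with very small E values.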
Table 4: Top GenBank hits and corresponding E values for the 024A ORFs [table contents not recoverable]

Example 5: Biosynthesis of the A54145 peptide core structure
While not intending to be limited to any particular mode of action or biosynthetic scheme, the gene products of the invention can explain the synthesis of A54145.
Five proteins, encoded by ORFs 2, 3, 4, 5 and 6 (SEQ ID NOS: 4, 7, 9, 11 and 13), are likely to be involved in the formation of the peptide core structure of A54145. These ORFs show significant similarity to non-ribosomal peptide synthetases (NRPSs) or peptide synthetase domains. Table 5 shows the modules and the approximate boundaries of their domains as found in the 5 NRPS ORFs. Each module is composed of a condensation domain, an adenylation domain and a thiolation domain. In module 5, found in ORF 3, the adenylation domain is modified by the insertion of an N-methyltransferase domain, commonly found in NRPS ORFs and responsible for methylation of the alpha-amino position of the amino acid activated by the module. Module 2, found in ORF 2, as well as modules 8 and 11, found in ORF 5, contain an additional domain responsible for epimerization of the amino acids activated by these modules, converting their stereochemistry from the L- to the D-form. The ultimate module 13, included in ORF 6, ends with a thioesterase domain catalyzing cyclization and release of the mature peptide core structure from the NRPS enzyme.
Table 5: A54145 NRPS domain coordinates

ORF no.  Amino acid coordinates  Homology                             Module no.
1        41 - 648                adenylating enzyme ("ADLE")          loading
         649 - 723               acyl carrier protein ("ACPH")        loading
2*       28 - 480                condensation domain                  1
         514 - 1003              adenylation domain                   1
         1007 - 1074             thiolation domain                    1
         1089 - 1544             condensation domain                  2
         1572 - 2080             adenylation domain                   2
         2084 - 2150             thiolation domain                    2
         2158 - 2665             epimerization domain                 2
         2667 - 3104             condensation domain                  3
         3129 - 3629             adenylation domain                   3
         3633 - 3698             thiolation domain                    3
3**      17 - 470                condensation domain                  4
         498 - 1011              adenylation domain                   4
         1012 - 1079             thiolation domain                    4
         1093 - 1549             condensation domain                  5
         1607 - 2482             adenylation / N-methylation domains  5
         2487 - 2553             thiolation domain                    5
4        9 - 451                 condensation domain                  6
         473 - 974               adenylation domain                   6
         978 - 1045              thiolation domain                    6
         1060 - 1541             condensation domain                  7
         1566 - 2054             adenylation domain                   7
         2058 - 2125             thiolation domain                    7
5        1 - 455                 condensation domain                  8
         491 - 998               adenylation domain                   8
         1002 - 1068             thiolation domain                    8
         1071 - 1570             epimerization domain                 8
         1572 - 2014             condensation domain                  9
         2040 - 2534             adenylation domain                   9
         2538 - 2605             thiolation domain                    9
         2620 - 3080             condensation domain                  10
         3105 - 3614             adenylation domain                   10
         3618 - 3685             thiolation domain                    10
         3700 - 4161             condensation domain                  11
         4190 - 4679             adenylation domain                   11
         4683 - 4749             thiolation domain                    11
         4752 - 5245             epimerization domain                 11
6        6 - 450                 condensation domain                  12
         475 - 975               adenylation domain                   12
         979 - 1046              thiolation domain                    12
         1060 - 1520             condensation domain                  13
         1545 - 2034             adenylation domain                   13
         2038 - 2105             thiolation domain                    13
         2135 - 2383             thioesterase domain                  13

*  Partial ORF; N-terminus
** Partial ORF; C-terminus

Clustal alignment analysis of the NRPS domains revealed that all domains were complete and contained known motifs and conserved amino acid residues required for activity (Figs 3 to 7).
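The module organisation described above (condensation, adenylation and thiolation domains, with an optional extra domain) can be checked mechanically from the domain coordinates. The sketch below uses the A54145 ORF 6 coordinates from the domain coordinate table; the one-letter domain codes, the function names and the rule that each condensation domain opens a new module are illustrative assumptions.

```python
from typing import List, Tuple

Domain = Tuple[int, int, str]  # (first residue, last residue, domain type)

# A54145 ORF 6 domains: C = condensation, A = adenylation,
# T = thiolation, TE = thioesterase.
ORF6_DOMAINS: List[Domain] = [
    (6, 450, "C"), (475, 975, "A"), (979, 1046, "T"),
    (1060, 1520, "C"), (1545, 2034, "A"), (2038, 2105, "T"),
    (2135, 2383, "TE"),
]


def split_into_modules(domains: List[Domain]) -> List[List[str]]:
    """Group domains into modules, starting a new module at each condensation domain."""
    modules: List[List[str]] = []
    previous_end = 0
    for start, end, kind in sorted(domains):
        if start <= previous_end:
            raise ValueError(f"{kind} domain {start}-{end} overlaps the preceding domain")
        if kind == "C" or not modules:
            modules.append([])
        modules[-1].append(kind)
        previous_end = end
    return modules


def has_core_domains(module: List[str]) -> bool:
    """True when the module begins with the C-A-T core described in the text."""
    return module[:3] == ["C", "A", "T"]


modules = split_into_modules(ORF6_DOMAINS)
for number, module in zip((12, 13), modules):
    print(f"module {number}: {'-'.join(module)} core present: {has_core_domains(module)}")
```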
Analysis of the adenylation domains found in the NRPS ORFs allows the amino acid that is incorporated by each unit to be identified (see Table 6 and Fig. 13). The following amino acid specificities are consistent with these comparisons: ORF 2, module 1: tryptophan (Trp); ORF 2, module 2: glutamic acid (Glu); ORF 2, module 3: hydroxy-asparagine (HO-Asn) / asparagine (Asn); ORF 3, module 4: threonine (Thr); ORF 3, module 5: glycine (Gly); ORF 4, module 6: alanine (Ala); ORF 4, module 7: aspartic acid (Asp); ORF 5, module 8: lysine (Lys); ORF 5, module 9: O-methylated aspartic acid (OCH3-Asp) / aspartic acid (Asp); ORF 5, module 10: glycine (Gly); ORF 5, module 11: asparagine (Asn); ORF 6, module 12: 3-methyl glutamic acid (3meGlu); and ORF 6, module 13: isoleucine (Ile).
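The module specificities listed above, combined with the epimerization domains reported for modules 2, 8 and 11, give a predicted linear backbone. The following is a minimal sketch of that bookkeeping; it uses only the parent amino acids and deliberately ignores the hydroxylation and methylation modifications, and the names are illustrative.

```python
# Module number -> parent amino acid, transcribed from the list above.
MODULE_SUBSTRATES = {
    1: "Trp", 2: "Glu", 3: "Asn", 4: "Thr", 5: "Gly", 6: "Ala", 7: "Asp",
    8: "Lys", 9: "Asp", 10: "Gly", 11: "Asn", 12: "Glu", 13: "Ile",
}

# Modules reported to contain an epimerization domain (L- to D-form).
EPIMERIZATION_MODULES = {2, 8, 11}


def predicted_backbone(substrates: dict, epimerized: set) -> list:
    """Return the residues in module order, prefixing epimerized residues with 'D-'."""
    residues = []
    for module in sorted(substrates):
        prefix = "D-" if module in epimerized else ""
        residues.append(prefix + substrates[module])
    return residues


backbone = predicted_backbone(MODULE_SUBSTRATES, EPIMERIZATION_MODULES)
print(", ".join(backbone))
```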
Table 6

Adenylation domain            235  236  239  278  299  301  322  330   Amino acid
ORF4_nAD03|024A|M3|Asn         D    L    T    K    V    G    D    V    (OH)Asn/Asn
ORF2_nAD03|A541|M3|Asn         D    L    T    K    V    G    D    V
ORF6_nAD04|024A|M11|Asn        D    L    T    K    V    G    D    V
ORF5_nAD04|A541|M11|Asn        D    L    T    K    V    G    D    V
emb|CAB38517.1|Cda2|M3|Asn     D    L    T    K    V    G    E    V
gb|AAF08797.1|MycC|M2|Asn      D    L    T    K    I    G    E    V
gb|AAC45930.1|TycC|M1|Asn      D    L    T    K    I    G    E    V

ORF4_nAD04|024A|M4|Thr         D    F    W    S    V    G    M    V    Thr
ORF3_nAD01|A541|M4|Thr         D    F    W    S    V    G    M    V
emb|CAA72311.1|SnbC|M1|Thr     D    F    W    N    V    G    M    V
gb|AAC80285.1|SyrE|M7|Thr      D    F    W    N    V    G    M    V
emb|CAB38518.1|Cda1|M2|Thr     D    F    W    N    V    G    M    V

ORF4_nAD05|024A|M5|Gly         D    I    L    Q    V    G    V    I    Gly (Sar)
ORF3_nAD02|A541|M5|Gly         D    I    L    Q    V    G    V    I
emb|CAB38517.1|Cda2|M2|Gly     D    I    L    Q    V    G    L    I
gb|AAF17280.1|NosC|M2|Gly      D    I    L    Q    L    G    L    I

ORF5_nAD02|024A|M7|Asp         D    L    T    K    V    G    A    V    Asp
ORF4_nAD02|A541|M7|Asp         D    L    T    K    V    G    A    V
emb|CAB38518.1|Cda1|M5|Asp     D    L    T    K    I    G    A    V
gb|AAF08797.1|MycC|M2|Asn      D    L    T    K    I    G    E    V
gb|AAC06348.1|BacC|M4|Asp      D    L    T    K    V    G    H    I

ORF6_nAD02|024A|M9|Asp         D    L    T    K    I    G    A    V    (OCH3)Asp
ORF5_nAD02|A541|M9|Asp         D    L    T    K    I    G    A    V
emb|CAB38517.1|Cda2|M1|Asp     D    L    T    K    I    G    A    V
gb|AAF08796.1|MycB|M2|Asn      D    L    T    K    I    G    E    V
gb|AAC06348.1|BacC|M5|Asn      D    L    T    K    I    G    E    V

ORF6_nAD03|024A|M10|Gly        D    I    L    Q    L    G    L    V    Gly
ORF5_nAD03|A541|M10|Gly        D    I    L    Q    L    G    L    V
gb|AAF17280.1|NosC|M2|Gly      D    I    L    Q    L    G    L    I
emb|CAB38517.1|Cda2|M2|Gly     D    I    L    Q    V    G    L    I

ORF7_nAD02|024A|M13|Ile        D    G    L    F    V    G    V    A    Ile
ORF6_nAD02|A541|M13|Ile        D    G    L    F    V    G    I    A
gb|AAC06346.1|BacA|M5|Ile      D    G    F    F    L    G    V    I
gb|AAC06346.1|BacA|M1|Ile      D    G    F    F    L    G    V    I
emb|CAA06325.1|LchAC|M1|Ile    D    G    F    F    L    G    V    V

ORF4_nAD01|024A|M1|Trp         D    V    A    L    A    G    V    V    Trp
ORF2_nAD01|A541|M1|Trp         D    V    A    L    V    G    V    V

ORF4_nAD02|024A|M2|Glu         D    L    A    K    V    A    S    V    Glu
ORF2_nAD02|A541|M2|Glu         D    L    V    K    V    A    S    V

ORF5_nAD01|024A|M6|Ala         D    V    F    N    L    A    L    V    Ala
ORF4_nAD01|A541|M6|Ala         D    V    F    A    L    A    L    V

ORF6_nAD01|A541|M8|Lys         D    A    W    D    A    G    T    V    Lys
ORF5_nAD01|024A|M8|Lys         D    A    W    D    A    G    T    V

ORF7_nAD01|024A|M12|Glu        D    L    G    K    T    G    V    V    3-methylGlu
ORF6_nAD01|A541|M12|Glu        D    L    G    K    T    G    V    V

Module 5 contains an adenylation-N-methyltransferase domain responsible for activation and tethering of glycine that is subsequently N-methylated to give the amino acid sarcosine (Sar) found at amino acid position 5 in the A54145 mature peptide.
In the mature A54145 peptide structure, glutamic acid as well as 3-methyl glutamic acid are found in position 12, indicating that module 12 is able to recognize and activate both amino acid structures. Alternatively, only glutamic acid is incorporated by module 12 and is subsequently methylated to form 3-methyl glutamic acid as seen in the mature A54145 structure. Module 13 activates and incorporates two related amino acids, isoleucine and valine (Val), indicating that the adenylation domain contained in this module displays a certain flexibility for recognizing and activating both amino acids.
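The comparison of specificity-conferring residues that underlies these assignments can be illustrated by scoring a query code against reference codes. The sketch below is a simplified illustration, not the analysis actually performed: it uses a handful of eight-residue codes transcribed from the table of adenylation domain residues, a plain identity count, and illustrative names.

```python
# Reference codes (GenBank entries) transcribed from the table; keys are
# "protein|module|amino acid".
REFERENCE_CODES = {
    "Cda2|M3|Asn": "DLTKVGEV",
    "SyrE|M7|Thr": "DFWNVGMV",
    "NosC|M2|Gly": "DILQLGLI",
    "Cda1|M5|Asp": "DLTKIGAV",
    "BacA|M5|Ile": "DGFFLGVI",
}


def identity(code_a: str, code_b: str) -> int:
    """Number of positions at which two equal-length codes carry the same residue."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same length")
    return sum(a == b for a, b in zip(code_a, code_b))


def best_match(code: str) -> tuple:
    """Return (reference name, identity count) for the closest reference code.

    Ties are resolved in favour of the first reference listed, so a tie
    should be read as an ambiguous prediction.
    """
    name = max(REFERENCE_CODES, key=lambda ref: identity(code, REFERENCE_CODES[ref]))
    return name, identity(code, REFERENCE_CODES[name])


# Codes of 024A module 4 and module 13, from the same table.
for label, code in (("024A M4", "DFWSVGMV"), ("024A M13", "DGLFVGVA")):
    print(label, code, "->", best_match(code))
```

A code such as that of module 7 (DLTKVGAV) scores equally against the Asn and Asp references used here, which illustrates why closely related substrates can be difficult to distinguish by this kind of comparison alone.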
The mature peptide is released from the NRPS enzyme (ORF 6) through the action of the thioesterase domain in module 13, with concomitant cyclization through esterification between the hydroxyl group of Thr at position 4 and the carbonyl group of the Ile/Val residue at position 13 (Fig. 13b).
The order of modules as well as the predicted amino acid substrate specificities of the peptide synthetase repeating units are in precise agreement with the structure of the A54145 peptide core, providing conclusive evidence that the genetic locus described here is responsible for the biosynthesis of A54145 (figures 1 and 13).
Example 6: Biosynthesis of the 024A peptide core structure
While not intending to be limited to any particular mode of action or biosynthetic scheme, the gene products of the invention can explain the synthesis of the 024A product.
Four proteins, encoded by ORFs 4, 5, 6 and 7 (SEQ ID NOS: 41, 43, 45 and 47), are likely to be involved in the formation of the peptide core structure of the 024A product. These ORFs show significant similarity to non-ribosomal peptide synthetases (NRPSs) or peptide synthetase domains. Table 7 shows the modules and the approximate boundaries of their domains as found in the 4 NRPS ORFs. Each module is composed of a condensation domain, an adenylation domain and a thiolation domain. In module 5, found in ORF 4, the adenylation domain is modified by the insertion of an N-methyltransferase domain, commonly found in NRPS ORFs and responsible for methylation of the alpha-amino position of the amino acid activated by the module. Module 2, found in ORF 4, as well as modules 8 and 11, found in ORF 6, contain an additional domain responsible for epimerization of the amino acids activated by these modules, converting their stereochemistry from the L- to the D-form. The ultimate module 13, included in ORF 7, ends with a thioesterase domain catalyzing cyclization and release of the mature peptide core structure from the NRPS enzyme.
Table 7: 024A NRPS domain coordinates

ORF no.  Amino acid coordinates  Homology                             Module no.
2        NA                      adenylating enzyme ("ADLE")          loading
3        NA                      acyl carrier protein ("ACPH")        loading
4        6 - 443                 condensation domain                  1
         477 - 970               adenylation domain                   1
         974 - 1041              thiolation domain                    1
         1056 - 1513             condensation domain                  2
         1541 - 2047             adenylation domain                   2
         2048 - 2114             thiolation domain                    2
         2127 - 2627             epimerization domain                 2
         2629 - 3071             condensation domain                  3
         3096 - 3597             adenylation domain                   3
         3601 - 3666             thiolation domain                    3
         3705 - 4164             condensation domain                  4
         4193 - 4706             adenylation domain                   4
         4707 - 4774             thiolation domain                    4
         4788 - 5244             condensation domain                  5
         5302 - 6179             adenylation / N-methylation domains  5
         6184 - 6250             thiolation domain                    5
5        9 - 449                 condensation domain                  6
         475 - 974               adenylation domain                   6
         978 - 1045              thiolation domain                    6
         1060 - 1533             condensation domain                  7
         1558 - 2046             adenylation domain                   7
         2050 - 2117             thiolation domain                    7
6        2 - 456                 condensation domain                  8
         489 - 996               adenylation domain                   8
         1000 - 1066             thiolation domain                    8
         1074 - 1569             epimerization domain                 8
         1571 - 2010             condensation domain                  9
         2036 - 2530             adenylation domain                   9
         2534 - 2601             thiolation domain                    9
         2616 - 3076             condensation domain                  10
         3101 - 3608             adenylation domain                   10
         3612 - 3679             thiolation domain                    10
         3694 - 4156             condensation domain                  11
         4189 - 4678             adenylation domain                   11
         4682 - 4748             thiolation domain                    11
         4756 - 5245             epimerization domain                 11
7        6 - 450                 condensation domain                  12
         475 - 985               adenylation domain                   12
         986 - 1053              thiolation domain                    12
         1067 - 1527             condensation domain                  13
         1552 - 2046             adenylation domain                   13
         2050 - 2117             thiolation domain                    13
         2147 - 2392             thioesterase domain                  13

Clustal alignment analysis of the NRPS domains revealed that all domains were complete and contained known motifs and conserved amino acid residues required for activity (Figs 8 to 12).
Analysis of the adenylation domains found in the NRPS ORFs allows the amino acid that is incorporated by each unit to be identified (see Table 6 and Fig. 13). The following amino acid specificities are consistent with these comparisons: ORF 4, module 1: tryptophan (Trp); ORF 4, module 2: glutamic acid (Glu); ORF 4, module 3: hydroxy-asparagine (HO-Asn) / asparagine (Asn); ORF 4, module 4: threonine (Thr); ORF 4, module 5: glycine (Gly); ORF 5, module 6: alanine (Ala); ORF 5, module 7: aspartic acid (Asp); ORF 6, module 8: lysine (Lys); ORF 6, module 9: O-methylated aspartic acid (OCH3-Asp) / aspartic acid (Asp); ORF 6, module 10: glycine (Gly); ORF 6, module 11: asparagine (Asn); ORF 7, module 12: 3-methyl glutamic acid (3meGlu); and ORF 7, module 13: isoleucine (Ile).
Module 5 contains an adenylation-N-methyltransferase domain responsible for activation and tethering of glycine that is subsequently N-methylated to give the amino acid sarcosine (Sar). The adenylation domain in module 12 recognizes and activates the same amino acid residue as the corresponding module in A54145 (Table 6). This observation indicates that a glutamic or 3-methyl glutamic acid residue could be found at position 12 in the structure of the 024A compound. Module 13 is highly homologous to the corresponding module in A54145, indicating that Ile and Val could be incorporated at this position in the 024A compound.
The mature peptide is released from the NRPS enzyme (ORF 7) through the action of the thioesterase domain in module 13, with possibly concomitant cyclization through esterification between the hydroxyl group of Thr at position 4 and the carbonyl group of the Ile/Val residue at position 13 (Fig. 13b).
The order of modules as well as the predicted amino acid substrate specificities of the peptide synthetase repeating units are the same as in the A54145 locus providing evidence that the peptide core structure of the 024A compound is closely similar, if not identical, to the A54145 peptide core structure (Figures 1 and 13).
Example 7: Activation of fatty acid moieties in A54145 and 024A compounds
Amino acid sequence homology analysis indicated that ORF 1 (SEQ ID NO: 2) in locus A541 as well as ORF 2 (SEQ ID NO: 37) in locus 024A are similar to acyl CoA ligases (ADLE), enzymes that activate acyl fatty groups and tether them to acyl carrier proteins (ACPH) (Tables 3 and 4). In A541, ADLE and ACPH family proteins are fused in one polypeptide (ADLF), as found in ORF 1 (SEQ ID NO: 2), whereas in 024A the ADLE and ACPH enzymes are separate (ORFs 2 and 3, with SEQ ID NOS: 37 and 39 respectively).
Conserved families of activating enzymes (ADLE) and acyl carrier proteins (ACPH) were also found in the ramoplanin locus (RAMO) from Actinoplanes sp. ATCC 33076 (ORF 26 and ORF 11 respectively) as described in detail in PCT/CA01/01462, in the A21978C locus (DAPT) from Streptomyces roseosporus NRRL 11379 (SEQ ID NOS: 26 and 40 respectively) and in the A41012 locus (A410) from Actinoplanes nipponensis FD 24834 ATCC 31145 (SEQ ID NOS: 34 and 48 respectively) as disclosed in detail in USSN 60/342,133 and in co-pending USSN 10/XXX,XXX entitled Genes and Proteins Involved in the Biosynthesis of Lipopeptides, filed concurrently with the present application and also claiming priority from USSN 60/342,133. RAMO and DAPT direct the synthesis of lipodepsipeptides similar in structure to that of A54145 (U.S. 4,427,656 and U.S. 4,208,403 respectively), whereas A410 directs the synthesis of a lipopeptide of unknown structure (U.S. 4,001,397). The only structural feature common to ramoplanin, A21978C and A54145 is a peptide backbone appended with a fatty acyl group at the N-terminal amino acid residue. Based on these correlations, ORF 1 (ADLF) in A541 and ORFs 2 and 3 in 024A are predicted to activate acyl fatty acids that are subsequently attached onto the peptide core structures to form the mature lipopeptide product.
The biological function of the ADLE, ADLF and ACPH ORFs was assessed by amino acid sequence similarity analysis. Clustal alignment of ADLE ORFs shows the conservation of domains and residues important for their enzymatic function (Fig. 14). Domain I, involved in AMP binding, and domains II and III, proposed to be involved in the formation of a hydrophobic pocket for the fatty acyl moiety, are highlighted (Fitzmaurice and Kolattukudy (1997) J. Bact. Vol. 179 pp. 2608-2615). Alignment of ACPH ORFs shows their overall sequence conservation and the absolute conservation of the serine residue that is modified by phosphopantetheinylation to form the active holo-acyl carrier protein (Fig. 15).
Example 8: Incorporation of fatty acid moieties in A54145 and 024A compounds
Closer examination of the module 1 organization of ORF 2 (SEQ ID NO: 4) and ORF 4 (SEQ ID NO: 41) in A541 and 024A respectively shows that both NRPS modules begin with a condensation domain instead of the conventional adenylation-thiolation domains (Tables 5 and 7, and Fig. 13). A similar unusual organization is also found in the ramoplanin, A21978C and A41012 lipopeptide-specifying loci (supra). Such modules would generally be considered not to be capable of initiating peptide assembly, on the assumption that the C-domain would likely interfere with this initiation process (see, for example, Linne and Marahiel, 2000, Biochemistry, Vol. 39, pp. 10439-10447).
The nucleotide sequences of the members of the conserved family of unusual NRPS C-domains in RAMO, DAPT and A541, disclosed in detail as SEQ ID NOS: 5, 7 and 9 respectively in co-pending USSN 10/XXX,XXX entitled Genes and Proteins Involved in the Biosynthesis of Lipopeptides, as well as N-terminal C-domains from module 1 of ORF 2 in A541 and ORF 4 in 024A (Tables 5 and 7) were compared to a collection of condensation domains derived from various lipopeptide NRPSs obtained from GenBank or disclosed herein. Figure 16 shows the evolutionary relatedness of these C-domains.
Apart from RAMO, DAPT, A541 and 024A, Figure 16 refers to additional lipopeptide biosynthetic loci by way of four-letter designations, wherein CADA is the biosynthetic locus for the calcium-dependent antibiotic, FENG is the biosynthetic locus for fengycin, SURF is the biosynthetic locus for surfactin, SYRI is the biosynthetic locus for syringomycin, SERR is the biosynthetic locus for serrawettin, LICH is the biosynthetic locus for lichenysin, ITUR is the biosynthetic locus for iturin, and MYSU is the biosynthetic locus for mycosubtilin. All C-domains included in this analysis are full-length C domains. The convention used to identify and distinguish C domains in Figure 16 is as follows. Those NRPS C-domain sequences that were obtained from the GenBank database are denoted by accessions that begin with three letters followed by digits (usually five). These first eight characters identifying each of the C domains correspond to the GenBank accession number. The lower case "n" serves to denote "NRPS domain", and the "CD" followed by two digits denotes "C domain" and its number relative to the other C domains contained on that polypeptide sequence. For example, "AAC80285nCD06_SYRI" represents the amino acid sequence corresponding to the sixth C domain contained on the GenBank entry AAC80285 for an NRPS from the syringomycin biosynthetic locus. The NRPS C domain sequences that are disclosed for the first time in this application, in U.S. provisional patent application USSN 60/342,133 or U.S. patent application USSN 09/976,059 follow a similar nomenclature (nCD00) but are denoted by nine-character accessions beginning with three numbers.
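By way of illustration only, the labelling convention described above can be decoded mechanically. The following Python sketch is not part of the disclosed method; the function name, the regular expression, and the assumption that exactly one separator character of any kind precedes the four-character locus code are all introduced here for illustration. The composition of the last six characters of the nine-character accessions is likewise assumed to be alphanumeric.

```python
import re
from typing import NamedTuple


class CDomainLabel(NamedTuple):
    accession: str     # eight-character GenBank accession or nine-character in-house accession
    domain_index: int  # position of this C domain among the C domains on the polypeptide
    locus: str         # four-character locus designation, e.g. SYRI or 024A


# Assumptions (not stated in the text): one separator character of any kind sits
# between the two-digit domain number and the locus code, and the in-house
# accessions are three digits followed by six alphanumeric characters.
_LABEL = re.compile(
    r"^(?P<acc>[A-Z]{3}\d{5}|\d{3}[A-Za-z0-9]{6})"  # accession
    r"nCD(?P<num>\d{2})"                            # 'n' = NRPS domain, 'CD' + number
    r".(?P<locus>[A-Z0-9]{4})$"                     # separator + locus code
)


def parse_label(label: str) -> CDomainLabel:
    """Split a Figure 16 style C-domain label into its three parts."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a recognised C-domain label: {label!r}")
    return CDomainLabel(match["acc"], int(match["num"]), match["locus"])


if __name__ == "__main__":
    print(parse_label("AAC80285nCD06_SYRI"))
```

Because the separator is matched loosely, the same function accepts a label regardless of which single character separates the domain number from the locus code.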
Analysis of a clustal alignment of the C-domains clearly shows that these domains are evolutionarily related to C-domains found in the starter modules of known N-acylated lipopeptides such as calcium-dependent antibiotic (CADA), surfactin (SURF), syringomycin (SYRI) and mycosubtilin (MYCO) among others (Fig. 16).
Moreover, these special C-domains are significantly evolutionarily distant from regular condensation domains found in NRPSs that catalyze amide bond formation and condensation between two adjacent amino acids (Fig. 16). Alignment of these unusual C-domains demonstrates the conservation of motifs and specific amino acid residues important for their catalytic activity (Fig. 17). The conserved motifs C1 to C6, characteristic of condensation domains, are highlighted. The C7 motif in this specialized group of C domains differs from that previously reported and as such it is labeled C7' in Figure 17. Based on these observations, the unusual C-domains are considered to catalyze N-acyl peptide linkages between a fatty acid and the amino terminal group of an amino acid.
Example 9: Biosynthesis of N-acylated peptides:
Despite the significant overall evolutionary distance between the lipopeptide-producing microorganisms described in this invention, they all contain closely related C-domains that are used for peptide N-acylation, a step which doubles as the peptide chain initiation step. Without intending to be limited to any particular biosynthetic scheme or mechanism of action, the ADLE/ADLF, ACPH and unusual NRPS C-domain, as exemplified by the first condensation domain in module 1 of A541 and 024A, of the present invention can explain formation of the N-acyl peptide linkage found in lipopeptides.
Figure 18a,b illustrates a mechanism for NRPS chain initiation in which the fatty acyl group primes the synthesis of the peptide by the NRPS. CoA-linked fatty acyl precursors are channeled from the primary metabolic pool and modified while still attached to CoA by accessory enzymes such as oxidoreductases, epoxidases, desaturases, etc. encoded by genes of primary metabolism or by genes within the biosynthetic locus. The mature fatty acyl-CoA intermediate is then recognized by the cognate adenylating enzyme and transferred onto the phosphopantetheinyl prosthetic arm of the free holo-ACP, releasing CoA-SH and utilizing ATP in the process.
It is alternatively contemplated that the adenylating enzyme may recognize free fatty acyl substrate(s) and transfer them onto the phosphopantetheinyl prosthetic arm of the free holo-ACP, utilizing ATP in the process. Once the fatty acyl group is tethered onto the free holo-ACP, the C domain of the first module carries out a reaction in which the carbonyl group of the activated fatty acyl is condensed with the amino group of the amino acid substrate that had been previously activated and tethered by the first module of the NRPS. Hence, peptide chain initiation and N-acylation are closely coupled. Subsequent peptide elongation and termination steps can then proceed as with typical NRPS modules.
Figure 18c illustrates the above-described amino acid N-acylation mechanism using specific examples in A541 and 024A lipopeptide biosynthetic pathways. In A54145, biosynthesis of the acylated peptide chain is initiated by activation and tethering of specific fatty acid units onto the ACPH component of the ADLF
protein disclosed herein as ORF 1 (SEQ ID NO: 2). ADLF represents the fusion of the two protein families, ADLE and ACPH, required for activation of fatty acids in lipopeptide biosynthesis. Once the fatty acid is activated, the acyl-specific C-domain of the first module of ORF 2 (SEQ ID NO: 4) catalyzes the condensation of the carbonyl group of the fatty acyl and the amino group of the tryptophan residue (Trp) that had been previously activated by and tethered to the first module of the NRPS (Figs 13 and 18c).
The A54145 factors vary with respect to various permutations of the identity of the fatty acyl moiety attached to the N-terminal amine of the peptide core (Fig. 1).
The A54145 complex has eight factors composed of four different cyclic peptide cores and three different lipid side chains. Thus, eight of the possible twelve permutations of A54145 factors have been detected; presumably, the remaining four were present in such low amounts that they were not observed by the high-performance liquid chromatography (HPLC) system used. The variability in the fatty acyl group likely arises due to substrate flexibility in the adenylating enzyme/acyl carrier protein (ADLF) as well as the unusual C-domain in the first module of the A54145 lipopeptide NRPS.
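The count of twelve possible factors follows from pairing each of the four peptide cores with each of the three lipid side chains. The short Python sketch below merely illustrates that arithmetic; the core and side-chain names are placeholders introduced here, since only the counts are relied upon.

```python
from itertools import product

# Placeholder names: only the counts (four cores, three side chains) come from the text.
peptide_cores = ["core_1", "core_2", "core_3", "core_4"]
lipid_side_chains = ["lipid_1", "lipid_2", "lipid_3"]

# Every pairing of one peptide core with one lipid side chain.
possible_factors = list(product(peptide_cores, lipid_side_chains))

observed_count = 8  # number of factors detected, as stated in the text
undetected_count = len(possible_factors) - observed_count

if __name__ == "__main__":
    print(len(possible_factors), "possible;", undetected_count, "not detected")
```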
In 024A compound biosynthesis, the ADLE enzyme (ORF 2; SEQ ID NO: 37) activates specific fatty acid moieties and subsequently tethers them onto the phosphopantetheinyl prosthetic arm of the ACPH (ORF 3; SEQ ID NO: 39). The carbonyl group of the activated fatty acyl is then condensed to the amino group of the tryptophan residue (Trp) that had been previously activated by and tethered to the first module of the NRPS. The condensation reaction is catalyzed by the acyl-specific C-domain of module 1 in ORF 4 (SEQ ID NO: 41) (Figs 13 and 18c).
The same mechanism for peptide N-acylation may be present in other microorganisms. Evidence supporting this hypothesis includes the fact that other lipopeptide NRPS enzymes that have been identified in very diverse microorganisms contain a specialized C domain in the first module. Examples include the syringomycin biosynthetic locus from Pseudomonas syringae pv. syringae (Guenzi et al. (1998) J. Biol. Chem. Vol. 273, pp. 32857-32863); the serrawettin W2 biosynthetic locus from Serratia liquefaciens MG1 (Lindum et al. (1998) Vol. 180, pp. 6384-6388); the fengycin biosynthetic loci from Bacillus subtilis b213 and A1/3 (Steller et al. (1999) Chem. Biol. Vol. 6, pp. 31-41); the surfactin biosynthetic locus from Bacillus subtilis; the lichenysin biosynthetic locus from Bacillus licheniformis (Konz et al. (1999) J. Bact. Vol. 181, pp. 133-140); and the "calcium-dependent antibiotic" (CADA) biosynthetic locus from Streptomyces coelicolor A3(2) (Hojati et al. (2002) Chem. Biol. Vol. 9, pp. 1175-1187).
The CADA biosynthetic locus does not apparently have an adenylating enzyme homologue but it does contain a free acyl carrier protein that may participate together with the unusual C domain of the first NRPS module in the N-acylation mechanism.
Therefore, certain fatty acids may require specialized enzymes to transfer the fatty acyl moiety onto the acyl carrier protein, but once tethered onto the free acyl carrier protein the mechanism is analogous to that outlined in Figure 18. It is noteworthy that the fatty acyl moiety of CDA is unique in that it contains an epoxy modification.
Hence such fatty acids may be transferred onto the ACP by some other specialized enzyme.
It is possible that the N-acylation mechanism of the present invention extends beyond bacteria to even more diverse microorganisms such as lower eukaryotes and other organisms. For example, the fungi Aspergillus nidulans var. roseus, Glarea lozoyensis, and Aspergillus japonicus var. aculeatus are known to produce the antifungal lipopeptides echinocandin B, pneumocandin B0, and aculeacin A, respectively (Hino et al. (2001) Journal of Industrial Microbiology and Biotechnology Vol. 27, pp. 157-162). Based on the overall similarity between fungal and bacterial NRPS systems and on the fact that we have shown that very diverse NRPS systems employ the same mechanism of N-acylation, the mechanism of peptide N-acylation described in this invention is likely to be operative in these and/or other lipopeptide-producing lower eukaryotes as well.
Although the disclosed mechanism for peptide N-acylation is apparently widespread among very diverse microorganisms, it is not the only means by which lipopeptides can be generated. For example, the lipopeptides mycosubtilin and iturin A
produced by Bacillus subtilis ATCC and RB14, respectively, are each assembled by multifunctional hybrid polypeptides comprising fused fatty acid synthase, amino transferase, and NRPS activities (Duitman et al. (1999) Proc. Natl. Acad. Sci. USA Vol. 96, pp. 13294-13299; Tsuge et al. (2001) J. Bact. Vol. 183, pp. 6265-6273).
This alternative mechanism of peptide N-acylation may be more evolutionarily restricted as, to the best of our knowledge, it has been identified only in members of the genus Bacillus, and the lipopeptides produced by these biosynthetic loci are members of a distinct sub-group of lipopeptides that contain a β-amino fatty acyl moiety linked to the amino terminus of the peptide core. Despite the fact that this mechanism of N-acylation does not involve the action of ADLE and ACPH homologues, the C-domains that condense the β-amino fatty acyl moiety to the first amino acid of both mycosubtilin and iturin are found to cluster within the highlighted group of acyl-specific C-domains as shown in Figure 16.
The widespread N-acylation mechanism for peptide natural products provides a knowledge-based approach for discovery and identification of lipopeptide biosynthetic loci in microorganisms. The highly conserved nucleotide sequences that are distinguishing signatures of the adenylating enzyme, the acyl carrier protein, and/or the specialized C-domain involved in the N-acylation mechanism can be identified and utilized as probes to screen libraries of microbial genomic DNA for the purpose of rapidly identifying, isolating, and characterizing lipopeptide biosynthetic loci in microorganisms of interest. The sequences of ADLE, ACPH proteins and the acyl-specific C-domain can also be used for in silico screening of large collections of microorganisms. Such a genetic-based screen has the added advantage over traditional fermentation approaches in that organisms having the genetic potential to produce lipopeptide natural products can be identified without the laborious fermentation, isolation, and characterization of the lipopeptide natural product. In addition, those organisms that normally produce lipopeptides only at very low or undetectable amounts or those organisms that only produce lipopeptides under very specialized growth conditions can nevertheless be readily identified using this genetic approach.
Example 10: Methylation of glutamic acid at position 12 of A54145 and 024A compounds:
The amino acid in the 12th position of the A54145 peptide core can be either glutamate or 3-methyl-glutamate. Four of the eight known A54145 factors, A, A1, D, and F, contain glutamate and the other four, B, B1, C, and E, contain 3-methyl-glutamate in the 12th position. Based on our in silico analyses, ORF 15 (SEQ ID NO: 32) is predicted to be responsible for the formation of the 3-methyl-glutamate-containing A54145 factors. ORF 15 is structurally related to the S-adenosylmethionine-dependent ubiquinone (coenzyme Q)/menaquinone (vitamin K2) family of C-methyltransferases (pfam01209) (Table 3).

An equivalent methyltransferase is found in locus 024A (ORF 16, SEQ ID NO:
65) indicating that a similarly modified amino acid is found in the structure of the 024A
compound (Table 4 and Figure 13).
A search of the NCBI gene database identified a homologue with 35% identity to ORF 15 in Streptomyces coelicolor A3(2), hypothetical protein SCE8.08c (GenBank accession CAB38586). Further inspection of the genetic context of the gene encoding SCE8.08c revealed that it is located approximately 20 kilobasepairs upstream of the NRPS genes that are responsible for the production of the "calcium-dependent antibiotic" (CADA) of S. coelicolor and less than 3.5 kilobasepairs upstream of the gene encoding the CdaR transcriptional activator protein for CADA biosynthesis.
CADA is an example of an N-acylated lipopeptide and, significantly, it too varies at one position of the peptide core in that either glutamate or 3-methyl-glutamate is found in the 10th position of the eleven amino acid core. In an elegant study using microarray expression profile analysis, Huang and coworkers recently demonstrated that the gene encoding hypothetical protein SCE8.08c is among those that are expressed coordinately along with the CADA NRPS cluster (Huang et al. (2001) Genes Dev. Vol. 15, pp. 3183-3192).
This finding supports our hypothesis implicating hypothetical protein SCE8.08c in the formation of 3-methyl-glutamate-containing CADA compounds. In contrast to the function which we propose here for hypothetical protein SCE8.08c, Ryding and coworkers have recently suggested that it is involved in the synthesis of tryptophan, a precursor used in the biosynthesis of CADA which is incorporated at both the third and eleventh positions. Their conclusion was based merely on the fact that the SCE8.08c gene is one of six genes, most of which are homologues of known tryptophan biosynthetic genes, that are expressed as an operon transcribed from a single promoter known as p7 (Ryding et al. (2002) J. Bact. Vol. 184, pp. 794-805). We disagree with these authors' proposed function for SCE8.08c as no C-methyltransferase is required in the tryptophan biosynthetic pathway.
The lipopeptide antibiotic A-21978C complex (daptomycin is one of the factors in this complex) produced by S. roseosporus is yet another example of a lipopeptide natural product that contains a 3-methyl-glutamate in the peptide core and shares the common features described above for A54145 and CADA. As expected from our predictions, a homologue with 38% identity to ORF 15 has been identified in S. roseosporus and the gene encoding this polypeptide is located less than 3 kilobasepairs downstream of the A-21978C NRPS biosynthetic genes (data not shown). However, to our knowledge no variants of A-21978C containing glutamate instead of 3-methyl-glutamate have been isolated from cultures of S. roseosporus. Perhaps this indicates a tighter coupling between expression and/or activity levels of the C-methyltransferase and the NRPS machinery in S. roseosporus than in either S. fradiae or S. coelicolor.
Alternatively, it is possible that S. roseosporus does produce variants of A-21978C containing glutamate instead of 3-methyl-glutamate but that the extraction processes used have failed to recover these compounds.
Therefore, we propose that ORF 15 of the A54145 locus, ORF 16 of the 024A locus and their homologues in S. coelicolor and S. roseosporus constitute a novel family of C-methyltransferases (herein termed MTFZ) that give rise to NRPS-generated peptides containing 3-methyl-glutamate. Figure 19 is an amino acid alignment of ORF 15 from the A54145 locus and ORF 16 from the 024A locus together with the CADA-associated homologue of S. coelicolor and the A-21978C-associated homologue of S. roseosporus. Three motifs of sequence similarity in S-adenosylmethionine-dependent methyltransferases (Kagan and Clarke (1994) Arch. Biochem. Biophys. Vol. 310, pp. 417-427) are highlighted. The crystal structure determination of catechol O-methyltransferase has identified the amino acids immediately following motif II as playing an important role in the binding of ligands and in forming the enzymatic active site (Vidgren et al. (1994) Nature Vol. 368, pp. 354-358). The post-motif II region among the members of the MTFZ family includes a highly conserved motif, AYGTHH, which may play an analogously important role in the binding of ligands and in forming the enzymatic active site. Moreover, this highly conserved post-motif II region may be diagnostic of this novel class of C-methyltransferases.
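As a non-limiting illustration of how the AYGTHH signature might be used diagnostically, the following Python sketch scans a set of protein sequences for exact occurrences of the motif. The function names and the toy sequences are hypothetical and are introduced here only for illustration; a practical screen would more likely rely on profile or alignment-based searching, which tolerates substitutions.

```python
from typing import Dict, List

MTFZ_MOTIF = "AYGTHH"  # conserved post-motif II signature named in the text


def find_motif(sequence: str, motif: str = MTFZ_MOTIF) -> List[int]:
    """Return the 1-based start position of every exact occurrence of motif."""
    seq = sequence.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start + 1)          # convert to 1-based numbering
        start = seq.find(motif, start + 1)   # continue searching after this hit
    return positions


def screen(proteins: Dict[str, str]) -> Dict[str, List[int]]:
    """Keep only those proteins that contain the motif at least once."""
    hits = {}
    for name, seq in proteins.items():
        found = find_motif(seq)
        if found:
            hits[name] = found
    return hits


if __name__ == "__main__":
    # Toy sequences, invented for this example only.
    candidates = {"toy_with_motif": "MKAYGTHHLL", "toy_without": "MKLLGGSSTT"}
    print(screen(candidates))
```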
The exact substrate and the timing with which ORF 15 in the A541 locus methylates it have yet to be determined. It has been shown, however, that factors containing glutamate at position 12 accumulate more rapidly and earlier during fermentation than those containing 3-methyl-glutamate at position 12.
Moreover, varying the temperature of the fermentation can modulate the ratios of glutamate- to 3-methyl-glutamate-containing factors (U.S. 4,994,270). At lower temperatures (21 degrees Celsius), the majority of the products were factors containing glutamate at position 12 whereas at higher temperatures (33 degrees Celsius) the majority of the products were factors containing 3-methyl-glutamate at position 12. One explanation for this temperature-dependent variation of the residue at position 12 is that the catalytic activity of ORF 15 is higher at elevated temperatures. Alternatively, expression levels of ORF 15 may be higher at elevated temperatures. For example, one possibility for the latter scenario is if a transcriptional repressor regulates expression of ORF
15 and this repressor is, in turn, temperature sensitive such that its function is compromised at elevated temperatures.
Having identified the functional relevance of this novel class of C-methyltransferase, one skilled in the art may engineer strains, by means of traditional strain improvement or by targeted genetic modification, to enrich or produce exclusively A54145 factors that are more desirable. For example, if A54145 factors containing glutamate at position 12 are desired over those containing 3-methyl-glutamate at position 12, one could genetically engineer a recombinant strain in which the ORF 15 gene is disrupted so as to eliminate the methylation step.
Conversely, if A54145 factors containing 3-methyl-glutamate at position 12 are desired over those containing glutamate at position 12, one could genetically engineer a recombinant strain that overexpresses the ORF 15 gene (for example, by introducing a second copy of the gene on a high copy number plasmid) so as to increase the efficiency of the methylation step.
Example 11: Biosynthesis of an N-acylated lipopeptide by locus 024A:
Locus 024A in Streptomyces refuineus subsp. thermotolerans NRRL 3143 was shown to possess several characteristics of an N-acylated lipopeptide encoding locus, namely the presence of an acyl-specific C-domain in module 1 of ORF 4 (Table 7), located at the N-terminus of the first NRPS ORF involved in the assembly of the polypeptide; ADLE (ORF 2) and ACPH (ORF 3) family proteins (SEQ ID NOS: 37 and 39 respectively); as well as an NRPS multienzymatic system composed of 13 modules (see Table 7 and Fig 13). The high homology of the NRPS systems found in loci 024A and A541 suggests that the 024A polypeptide scaffold is identical to that of A54145 (Fig 13).
Based on these observations and on the fact that there are known growth conditions for expressing lipopeptide A54145 in Streptomyces fradiae (US
4,977,083), Streptomyces refuineus subsp. thermotolerans was grown under identical culture conditions to assess possible induction of locus 024A and determine the nature of the specified product.
Streptomyces fradiae and Streptomyces refuineus subsp. thermotolerans were grown at 30 °C for 48 hours on a rotary shaker in 25 mL of a seed medium consisting of glucose (10 g/L), potato starch (30 g/L), soy flour (20 g/L), Pharmamedia (20 g/L), and CaCO3 (2 g/L) in tap water. Five mL of this seed culture was used to inoculate 500 mL of production medium in a 4 L baffled flask. The production medium consisted of glucose (25 g/L), soy grits (18.75 g/L), blackstrap molasses (3.75 g/L), casein (1.25 g/L), sodium acetate (8 g/L), and CaCO3 (3.13 g/L) in tap water, and fermentation proceeded for 7 days at 30 °C on a rotary shaker. The production culture was centrifuged and filtered to remove mycelia and solid matter. The pH was adjusted to 6.4 and 46 mL of Diaion HP20 was added and stirred for 30 minutes. HP20 resin was collected by Buchner filtration and washed successively with 140 mL water and 90 mL 15% CH3CN/H2O, and the wash was discarded. HP20 resin was then eluted with 140 mL 50% CH3CN/H2O (fraction HP20 E2). This pool was passed over a 5 mL Amberlite IRA68 column (acetate cycle) and the flow through (fraction IRA FT) was reserved for bioassay. The column was washed with 25 mL 50% CH3CN/H2O and eluted with 25 mL 50% CH3CN/H2O containing 0.1 N HOAc (fraction IRA E1), and then eluted with 25 mL 50% CH3CN/H2O containing 1.0 N HOAc (fraction IRA E2). Biological activity was followed during purification by bioassay with Micrococcus luteus in Nutrient Agar containing 5 mM CaCl2.
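For convenience in preparing the media at the stated volumes, the per-litre concentrations given above can be converted to the mass of each component actually weighed out. The following Python sketch is a bookkeeping aid only; the function names are introduced here for illustration, and the concentrations and volumes are those recited in this Example.

```python
from typing import Dict

# Concentrations in g/L, as recited in this Example.
SEED_MEDIUM = {
    "glucose": 10.0, "potato starch": 30.0, "soy flour": 20.0,
    "Pharmamedia": 20.0, "CaCO3": 2.0,
}
PRODUCTION_MEDIUM = {
    "glucose": 25.0, "soy grits": 18.75, "blackstrap molasses": 3.75,
    "casein": 1.25, "sodium acetate": 8.0, "CaCO3": 3.13,
}


def grams_needed(recipe_g_per_l: Dict[str, float], volume_ml: float) -> Dict[str, float]:
    """Mass of each component (g) required for the given culture volume (mL)."""
    return {name: round(conc * volume_ml / 1000.0, 4)
            for name, conc in recipe_g_per_l.items()}


def inoculum_percent(inoculum_ml: float, culture_ml: float) -> float:
    """Inoculum size expressed as percent (v/v) of the production volume."""
    return 100.0 * inoculum_ml / culture_ml


if __name__ == "__main__":
    print(grams_needed(SEED_MEDIUM, 25))         # 25 mL seed culture
    print(grams_needed(PRODUCTION_MEDIUM, 500))  # 500 mL production culture
    print(inoculum_percent(5, 500))              # 5 mL seed into 500 mL
```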
Figure 20 is a photograph of a plate generated during extraction of an anionic lipopeptide from Streptomyces fradiae. Figure 20a shows an enrichment of activity based on IRA67 anion exchange chromatography consistent with expression of an acidic lipopeptide. This activity is concentrated during the extraction procedure as indicated by the increased diameter of lysis rings. A54145 was detected via HPLC/MS
in fraction IRA E2 as evidenced by mass ion ES2+= 830.5 consistent with the structures of A54145C,D (US 4,994,270).
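The relationship between an observed doubly charged ion and the mass of the neutral compound can be sketched as follows. This Python fragment assumes, for illustration only, that the reported ion is a doubly protonated species of the form [M+2H]2+ observed by positive-mode electrospray; the adduct type is not stated explicitly above, and the function names are introduced here.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral mass M from the m/z of an [M + zH]z+ ion (assumed adduct type)."""
    return charge * (mz - PROTON_MASS)


def mz_from_mass(mass: float, charge: int) -> float:
    """Inverse relation: expected m/z of the [M + zH]z+ ion for a neutral mass."""
    return mass / charge + PROTON_MASS


if __name__ == "__main__":
    # Observed doubly charged ion reported in this Example.
    print(round(neutral_mass(830.5, 2), 3))
```

Under that assumption the observed ion at 830.5 corresponds to a neutral mass of roughly 1659 Da.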
Figure 20b is a photograph of a plate generated during a similar extraction scheme performed on extracts from Streptomyces refuineus subsp. thermotolerans. Figure 20b shows a similar enrichment of activity based on IRA67 anion exchange chromatography consistent with expression of an acidic lipopeptide. This activity is concentrated during the extraction procedure as indicated by the increased diameter of lysis rings. A mass ion of ES2+ = 830.5, identical to that of A54145, was present in fraction IRA

confirming that an N-acylated acidic lipopeptide, identical to A54145C,D, is produced by 024A in Streptomyces refuineus subsp. thermotolerans.
Example 12: Use of the N-acyl capping cassette to engineer peptide synthetases capable of producing novel lipopeptides:
The availability and understanding of lipopeptide N-acyl capping components increases the potential of redesigning (un)natural products by engineered peptide synthetases. It has been demonstrated that, using known molecular biology techniques, functional hybrid peptide synthetases may be engineered that are capable of producing rationally designed peptide products (Mootz et al. (2000) Proc. Natl. Acad. Sci. U S A Vol. 97, pp. 5848-5853). Moreover, it has been postulated that through domain swapping, change-of-substrate specificity by mutagenesis, and an induced termination to achieve release of a defined shortened product, it may be possible to obtain a recombinant NRPS system that produces antipain, a potent cathepsin inhibitor produced by Streptomyces roseus and whose biosynthetic machinery is unknown (Doekel S, Marahiel MA. (2001) Metab. Eng. Vol. 3, pp. 64-77). Mootz et al. (supra) described genetic engineering using an NRPS system to produce a peptide product that is not a naturally occurring product, and Doekel and Marahiel (supra) described a prophetic example of engineering an NRPS system to make the known natural product antipain.
The following outlines a strategy whereby the NRPS biosynthetic machinery of a nonlipopeptide natural product, complestatin, can be modified so as to produce an N-acylated analogue of complestatin (Fig. 21).
Streptomyces lavendulae produces complestatin, a cyclic peptide natural product that antagonizes pharmacologically relevant protein-protein interactions including formation of the C4b,2b complex in the complement cascade and gp120-binding in the HIV life cycle. Complestatin, a member of the vancomycin group of natural products, consists of an alpha-ketoacyl hexapeptide backbone modified by oxidative phenolic couplings and halogenations. The entire complestatin biosynthetic and regulatory gene cluster spanning ca. 50 kb was cloned and sequenced (Chiu et al. (2001) Proc. Natl. Acad. Sci. U S A Vol. 98, pp. 8548-8553). It includes four NRPS genes, comA, comB, comC, and comD (Fig. 10, panel a). The comA gene encodes an NRPS that is composed of a loading module that incorporates hydroxyphenylglycine (HPG; or a derivative thereof) followed by a module that incorporates tryptophan (Trp), the first two residues of complestatin. Through domain swapping, the loading module and the C domain of the tryptophan-incorporating module can be replaced by one of the acyl-specific C-domains disclosed herein. Preferably, the acyl-specific C-domain of A541 (in module 1 of ORF 2 - SEQ ID NO: 4), DAPT, or 024A (in module 1 of ORF 4 - SEQ ID NO: 41) would be used, as these domains are naturally specific for condensing an acyl moiety to a tryptophan residue. In addition to this domain swapping, the ADLE and ACPH genes would also be introduced into the system so as to provide a means to generate activated acyl substrates that can be used by the acyl-specific C domain.
Thus, Figure 21b depicts a rationally designed recombinant NRPS system that should give rise to N-acylated complestatin analogue(s). The recombinant NRPS system depicted in Figure 21b could be employed either in vivo, using an appropriate recombinant host, or in vitro using purified enzymes supplemented with the appropriate substrates.
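The domain-swapping design may be pictured schematically as an operation on an ordered list of modules. The Python sketch below is a bookkeeping model only and does not describe any cloning procedure. The class and function names are hypothetical, and the domain compositions shown for the ComA modules (adenylation-thiolation for the loading module; condensation-adenylation-thiolation for the tryptophan module) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Module:
    name: str
    substrate: Optional[str]
    domains: Tuple[str, ...]  # ordered domain labels, e.g. ("C", "A", "T")


def n_acyl_cap(nrps: List[Module], acyl_c_domain: str) -> List[Module]:
    """Remove the loading module and swap the C domain of the next module.

    Mirrors the design in the text: the loading module and the C domain of the
    following (tryptophan-incorporating) module are replaced by one
    acyl-specific C domain.
    """
    if len(nrps) < 2:
        raise ValueError("need a loading module followed by at least one module")
    _loading, first_elongation, *rest = nrps
    if not first_elongation.domains or first_elongation.domains[0] != "C":
        raise ValueError("second module is expected to start with a C domain")
    capped = Module(
        first_elongation.name,
        first_elongation.substrate,
        (acyl_c_domain,) + first_elongation.domains[1:],
    )
    return [capped, *rest]


if __name__ == "__main__":
    # Assumed domain layout of ComA, for illustration only.
    com_a = [
        Module("loading", "HPG", ("A", "T")),
        Module("module 2", "Trp", ("C", "A", "T")),
    ]
    print(n_acyl_cap(com_a, "C_acyl"))
```

In this model the activated acyl substrate would be supplied separately by the ADLE and ACPH components, which is why they do not appear in the module list.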
One approach whereby N-acylated complestatin analogue(s) could be generated in vivo would involve the use of Streptomyces lavendulae, the complestatin producer, as the host strain. Briefly, the N-acyl capping cassette would replace the comA gene. This could be accomplished either by inactivation of the comA gene on the Streptomyces lavendulae chromosome followed by the introduction of a plasmid expressing the ADLE, ACPH, and the recombinant ComA derivative, or by physically replacing, by way of a double recombination (Keiser et al., supra), the comA gene on the Streptomyces lavendulae chromosome by a cassette containing genes encoding the ADLE, ACPH, and the recombinant ComA derivative. The resulting recombinant strains could be further modified to include genes involved in the biosynthesis of the acyl moieties and/or could be provided acyl moieties or precursors thereof in the fermentation medium.
One approach whereby N-acylated complestatin analogue(s) could be generated in vitro would involve the over-expression of the ADLE, ACPH, recombinant ComA, ComB, ComC, and ComD polypeptides in an appropriate host, for example E. coli, followed by the preparation of an extract or purified fraction thereof and use of said preparation together with appropriate substrates as outlined in Mootz et al. (2000). It is expected that, in the absence of accessory proteins, the product produced by this in vitro system might not contain certain modifications such as the cross-linking of residues that is catalyzed by specific complestatin cytochrome P450 enzymes.

SEQUENCE LISTING
Applicant name: ECOPIA BIOSCIENCES INC.
FARNET, Chris
ZAZOPOULOS, Emmanuel
STAFFA, Alfredo
Title of invention: GENES AND PROTEINS INVOLVED IN THE BIOSYNTHESIS OF LIPOPEPTIDES
Correspondence address: 7290 Frederick-Banting, Saint-Laurent, Quebec, H4S 2A1
Current Application Data
Expected filing Date: December 24, 2002
Patent Agent Information
Name: Ywe J. Looper
Reference Number: 10961
File reference: 3002-13CA
Number of SEQ ID NOS: 66
Software: PatentIn version 3.0
Information for SEQ ID NO: 1
Length: 13315
Type: DNA
Organism: Streptomyces fradiae Sequence: 1 ttgaccgtcc gggcggagca ccggaaagcg tcgaccctgc cgccggggaa cccggccgtc 60 agcagcggcg actccgcgtc ccgccgggag aagagggccg ccgctgggag cagttcttcg 120 gcggacccgc tggccggccc ccacctggtg gccgcgatct ccgcgacggc cgaggccgac 180 ccggggcgca aggccgtcgg tctcgtccgg gatccggagc gcgagggcga ggaggcgctg 240 cggagctacg cctggctcga cgacaccgcc cgccgcatcg ccgtcctcct gcgtgcggcc 300 gggctggaaa cgggcgcacg cgtgctgctg ctcttcccgc agtccgcgga gttcgcggcg 360 gcctacgccg ggtgtctcta cgcgggcatg gttgccgtcc ccgcgcccct tccgaccggc 420 acctcccatg aggccgcacg cgtcgtcggc atcgcgaagg actccg<~ggc aggcgccgtc 480 ctcaccgtct ccgaaaccga ggcggacgtc cggcaatggg cggcccgcac cggcctgggc 540 gcgctgcccc tccactgcgt cgacgaactg cccggcgacg ccgaccccga cacgtggcgg 600 gaaccggaga tccgggccga caccgtggcg gtcctccagt acacctccgg ctccaccggc 660 agccccaagg gggtcgtcgt cacccacggc gcgctcgccg acaacgtgcg cagcctgctc 720 acgggcttcg atctgggatc cggcgcccgg ctgggcggct ggctgccgat gtaccacgac 780 atggggctgt tcggcctgct gagcccggca ctgttcagcg gcggagccgc cgtgctgatg 840 agcggcagcgccttcctgcgccgcccgtcccagtggctgaggctgatcgaccgcttcggc900 ctcgtcttctcggcggcgcccgacttcgcctacgactactgcgtacggcgggtgagaccc960 gaggagacggacgggctcgacctgtcgcgctggcgctgggcggccaacggctccgagccc1020 atccgcgccgagacgctgcgcgccttcgccaaggagttcgccccggccggactccacccg1080 aacgccaccaccccttgctacggactggccgaggcgaccctgctggtgtccctgcccacg1140 ggtgagctgcgcacccgacgggtggacgtcgcggaactggagaaccaccgcttcgtcgaa1200 gcggccgtgggacgcccctcccgcgagatcgtgtcctgcggccggcccccgtccctggag1260 atccgcgtcgtcgaccccgcgaccggcaagtccgtcacgggcggcgacggagccggcgag1320 accagggtgggcgagatcagagtgcgcggcgcgagcgtcgccaggggctactggcagaaa1380 ccggaggcgaccgccgagacgttcgtcatggacgcggacggctccgggccctggctgcgc1440 accggcgacctcggcgctctgtacgagggcgagctgtacgtcaccggccgtatcaaggaa1500 ctcctcatcgtgcacggccgcaacatctacccccatgacatcgagcacgaactgcgcgcc1560 cgccacgccgaactcggcgctgtcggggccgccttctccctcagcaccgaatcgggcgag1620 gttgtggtcgtcacccatgaggtgaaccccaccgtccggcccgagc<~gggtcccgagctg1680 gtgaccgccc tgcgtgcgac gctcgcgcgg gagttcggcc tcgccccggc cggggtggtg 
1740 ctggtgcgccgcggccgcatcccgcgcaccagcagcggcaaggtgcaacgccgcctgacc1800 gcccggctgttcagcacgggggaactcgcccaggtccatgccgaccccggcgcccaccgc1860 ctcctggcggaactcagggaggcgcacgaccgcggcggcgccttcccgcccccctccccg1920 cccgccagccaggaccccgaggccctgcggcagcggctgcgcgagctgtgcgccgactgt1980 ctcggcgtccccgtggactccctcgccacggacgcccccctcaccgactacgggatgacc2040 tccgtcaccggcaccgccctgtgcgggatggtggaggagtacctggacgtcgaatgcgac2100 ctggaactgctctggcaggagccgacgatcgacgggctcgcctcccggctggcctcgcgc2160 accgtgcgctgaccgtcgccggccccctccgcaccacgcgcccgtgccccggcacgtgtg2220 cgcccggacacctccgcgcgtccggcggtgccgcccccttcagcccgagagcgggagaag2280 catgttggagtcctcggcacaccgtgtggccgccacgtcggcccagaccgggatctggac2340 ggcgcagcgtctgcgcggggacgacaggctctacgcctgcggcctcttcctcgaactcga2400 ccacgtggtggaggaggtgctgagcgaggcgatccgccgcgccgtcgccgacaccgaggc2460 gctgcgcaccgcgttccgggaggacgcggacggcgcgctggagcagcacgtcctcgcccg2520 gccgccgagcacgcagacccgcctcttccacgccgacccgagcggcggaaccccctcccg2580 ctccgcgtccctggactggatggaccggcaacgggcgcaaccctggg~acctcgcgtcggg2640 cgacacctgccgtcataccctgatccccctcggcggcgaccgctcgctgctgcacctgcg2700 ttaccaccac ctcgccctgg acgggtacgg cgccgcgctc tatctggacc ggctcgcggc 2760 ggtctaccgc gcgctgcgca ccggccatca accgcccccc tgcgcgttcg cgccgctggc 2820 ccgcctggtc gaggaggacc acgcctaccg gaactccgcc cgtcaccgcg cggacgccaa 2880 tcactggcgc gaccgcttcg cggacctccc gcgccccacc agcctcgccg acgccaccac 2940 gcccgcggcg cccaccacgc ccgccacgcc cgccgcgccc gccgcgcccg acgaactgcg 3000 gcgcaccgtg cgcctgtccg ccgcccggtc cgccgcgctg cgccgtgcct cggaccggag 3060 cggccgaccc tggcccgtgt acgccacggc cgcggtggcc gccttcctga gccgactcgc 3120 gccgggggag gaggtcgtcg tcggcctccc ggtcaccgcc agggtgaccc ccgccgcggt 3180 gcgcacaccg gggatgctcg ccaacgtcgt accgcttcgc ctgcccgtcc ggcagggcat 3240 gtcgacggcg gagctgctgg agctgaccgc ggccgagatc agcaccacac tgcgccacca 3300 gcgccaccgc accgaggaca tcgggcgggc gctcggactc cacggcgctc cgccagccac 3360 cacactcgtg aacgtcatgg cgttcgcccc ggtcctcgac ttcggcgact gccgggcccc 3420 ggtgcaccag ctctcggccg gaccggtgga ggacctggtc gtcaacctcc tcggcacccc 3480 gggcgacggc ggcgagagcg 
acggcaccga gctggagatc actgtcgccg ccaacccccg 3540 cctccactcg gcggacgcgg tggcctcgct ggccgcgcgg ctcgcggagt tcctcacgca 3600 catggggcag gacgccgagg cgcccctcgg ccggacccgg ctgctcgacg cggaggagga 3660 ggtcgcggcg gtggcccgcg ggcacagtcc ccgacgcgac ctgcgcgccc ggaccctgcc 3720 cgagctcttc gcccggcagg tcgcccgcac cccggacgcc cccgccgtct cctcggaccg 3780 cgccacctgg acgtacgcCC aactcgacgc ccacgccgag agagtggccc ggcggctcgc 3840 cgcgcggggc gtgggaccgg agagcctcgt cgccctcgcg gtgccgcggg gcgtcgagct 3900 ggcggcgctg atcctcggga tccagcgggc cggaggggcc tacctcccca tcgacccgga 3960 gtacccggcg gaacgcgtcg gtttcctgct gcgcgacgcc cgccccgccc tgctcgtcgg 4020 cgggacgggg accgagccct ccgccgccga ctgcccgcgc gtgccggccg aagagctcct 4080 cgacgccggg gcgtgccgcg ccgaggccga cgtgcccccg cccggaagcc tcccggtgga 4140 ccttccggcg tacgtcgtcc acacctccgg ctcgaccggg cggcccaagg gggtggtggt 4200 cacccacgcg ggcatcgccg ccctggcggc cgagcagatc gaacgctacc gactgggacc 4260 cggctccagg gtggcgcagc tggcggccct cgggttcgac gtcgcggtcg ccgaactcgt 4320 gatggcgctg gcgtcgggga gctgtctcgt cctcccgccg cacggcctcg ccggcgacga 4380 gctggcctcc ttcctgcgcg accggcgtat caccacggcc ctcgccccgg ccgccgtact 4440 ggccaccctg ccccccggcg acctccccga cctgaccgat ctggtcaccg gaggcgagca 4500 gccaccgccc gcgctgatcg cccgctgggc acccggccgg cggatgttca acgtctacgg 4560 gccgacggag gccaccgtcc aggccacctc cgggcggtgc gcggccgacg gcgaccggtc 4620 gccggacatc gggaaccccg aggccggagt ggacgcctac gttctggacg ccgcgctgcg 4680 acccgtgccc gacggggtga cgggcgagct ctacctgcgt ggcaggggcc tggcccgcgg 4740 ctacctcggc cgccccggcc tcaccgccgg ccggttcgtc gccgaccccc acaccgggac 4800 gggcgagcgg atgtacagga ccggggacct ggtgcgccgg gtgcccggcg acggccgcac 4860 cgtgctgagg ttcgtcggcc gggccgacga ccaggtgaag atccggggct tccgggtcga 4920 gccgggcgag gtcgaggccg ccctcgccga actcgacggc gtcgagcagg ccctggtgac 4980 cgtccgggag gagcggcccg gcgaccgcag gctcgtcggc tacctgacac ccgcccccgg 5040 gcaccgggga tcactggacg tcgagcgcct gcgccgcgtg ctcgccgacc ggctccccgc 5100 ccacctcgtc ccctccctcc tgatggagct ggcggagatc ccgcgcaccg ccaacggcaa 5160 ggtggaccgt gcggcgctgc cggaccccgc 
tcccctggcc ccgaccgccg ggagggcgcc 5220 gcgcgacgcc cgagaagagg ccctgtgcgc gctcttcgcc gaggtgctgg gcgtcgagga 5280 ggtcggcgtc gacctcgact tcttcgcgct gggcggtgac tcgctgctgg ccgcccgact 5340 ggcgagccgc atccgtggcc ggctgggcaa ggcggtcacc gtacgggagg tcttccggtc 5400 cccgaccgtc gcccgcctcg cggaggaact gggcgacggg gccgtgccgg acgaccacgt 5460 ccgtcccgtc cggccccgcc cggagcggct gccgctgtcg tccgcgcagc gccggctgtg 5520 gttcatcgac gaactcaacg gcgcgtcggc ggcctacaac atcccgaccg tcctccacct 5580 ggagggaccg ctcgacgtcc ccgcgctgca cgccgcgctc ggcgacgtga cggaccggca 5640 cgaaacgctc cgcaccaccc tgcggccccc cgcggacgac ggctcggcgg gcgcacccga 5700 gcagcacatc gcgccccccg gcggccaccg gccaccgctg ccggtgctcg acgtcgctcc 5760 cgaagcgctc gccggggagc tgcgcgccgc agcgggccac gtcttcgacc tcgcgcggga 5820 ccttccggtg cgcgccacgc tctaccgcac cggcgagctg gagcacgcgc tgctgctgct 5880 ggtccatcac gtcgccgccg acggcgcgtc gatgggcccc ctgatcggcg acctggccac 5940 cgcctacacg gcccggctcg cgggacgcgc acccgtcctc ccgccacccg aggtgacgta 6000 cgccgacttc gcgctgtggg agcgggggag ccgggagcgc tccgccgcgc aggccgaggg 6060 gatcgactac tggcgccggg ccctggccgg gctgccggac cacatcc:ggc tccccgccga 6120 ccgcccccgc tcgcaggaac cggtccgccg cggcgggatc gcccggttcg aggtgccgcc 6180 cgccctgtac gccgggctcg tggagctggc ccaaggcgtc ggcgccaccc cgttcatggt 6240 gctccagacc gcgatagccg tcctgctcag ccggatggga gccggaaccg acatccccct 6300 gggcacgccg gtcgccggcc gccaggacga ggcgctcgac ggactcgtcg gctgcttcgt 6360 caacaccgtcgtcctgcgcaccgacgtctcgggtgaccccaccacgaccgaactgctggc6420 ccggacgcgcgacggcgacctggaagccctcgcccaccaggacgtgcccttcgaccgggt6480 cgtggaggcggtcaaccccgtccgttccacctcacggcaccccctcttccaggtcatgct6540 gatcctcaacggttctgatcaggaccggcaccgggcgcggttcccccgactggccgaccg6600 ggtcgagacggtggagccgggggagaccaagttcgacctctcctggcacttcacccaccg6660 cgacgggccggacaggtcgctggagggggccctcgtccacgccgccgacctgttcgacga6720 cgacaccgcgcaccggctcaccgcacgcctgctcgacgtcctgaccgccatggtcgacga6780 ccccgcccggcccgtcgggaccatcgacgtcctcagcgccgccgagcaccgcctcgtgcg6840 cgcgtgggggaccggcacgccccgcccccctggcagccgtccggagccggtcgccgcgag6900 
gatcgccggccaggccgcccgcacaccggacgcccccgcggtgaccgagcccggacgggt6960 ctggagctacgccgaactcgacgcccgcgccgaccgggtggcggcggccctggccgcacg7020 cggaatcggcgccgaggacctcgtcgccgtactcctgccccgcggcgcggaactggtcgc7080 caccctgctggggatcctccgggcaggcgccgcctacctcccgctcgacaccggacaccc7140 cgccgaccgcaaccgccgggccctctccgactccgccccggcactgctggtgaccgacgc7200 cgggcggtcgcgcacgctccgaggagagaccgggtgcgccgcgctggtcctgggtgcgga7260 ggacaccgagcgggaactggcggaccgcgcccccctcccgcgggacggcgccggcctcgt7320 acgcccggtgaccggggacaacgccgcctacacgatcctcacctccggttcgacgggccg7380 ccccaaggccgtcgtcgtgacgcgggacgcgctggacgcgttcgtcgatcgcgccctgga7440 cacctacggcgacgcgctgggcggagaggctctgctgcactccccggtcgccttcgacct7500 cacggtcgtcaccctgtacgggccgctggccgcgggcgggcgcgtccgggtcggcgacct7560 cgacgagtccgggatcgcccggtgggagaaggagcgcccggccttcqtcaaggccacgcc7620 ctcccacctcgcgctgctgacggagttcggcggctccacggcccccggaacggtcgtcct7680 ggcgggcgagcaactcatcggcgcacggctggaccgctggcggactcgcctgggcgcctc7740 cggcaccaccgtcctcaacagctacgggcccaccgagaccaccgtcaactgcctggagca7800 caggatcgccccggacgccgacgtgccctcgggacccgtgccggtgggccggccggtgcc7860 cggggtacgggtgctgctcctcgacgaccgcctgcgccccgtcgccccgggcgtcacggg7920 cgaactgtatgtctgcgggcccggcgtcgcccgcggataccgtgccaggccggccgccac7980 cgccgaacggttcgtcgcctgtccgcaaggacggccgggagagcggatgtaccgcaccgg8040 cgacctgatgcgctggaccgccgacggcgaactggtctacgagggacgggccgacgccca8100 ggtgaaggtgcgcggcttccgggtcgagcccggcgaggtggaggccgcgctgctcggcct8160 ccccggcgtc cgcgaggccg ccgtcaccct cctggaaggg ccggaaggaa cggaagggcc 8220 ggaaggggcg cccgggaggg tggccgcccc cgcccgcctc gtcggctacg tcgtcggcgc 8280 cagcgaggaa ccggccgccc tcctcgaacg gctgcgcgtc aggttgcccg accacatggt 8340 gcccgccgcc ctcgtggacc tcgacgccct gccgctcacc ccgaacggca agctcgaccg 8400 ccgcgccctg cccgcccccg acttcggccg ccacgcgggc cgccgcgcac ccagcgggcc 8460 ggaggaggaa gcgctctgcg cgctcttcgc cgacgtgctg ggagtgcccg aggtcggcgc 8520 ggacgacagc ttcttcacgc tcgggggcga cagcatcgtc agcatcr_agc tcgtcggccg 8580 cgcccgggga gccggactgc acctcacggt gcgcgacgtc ttcgagc:acc ccacggccgc 8640 cgggetggcg accgtcgtcc ggtccgccgg accggacgcg gacgcggagc 
gtcccgcgcc 8700 gcaggcgctc gccccgagcg ggacgctgcc ctacgtcccg gccgcggcgc ggctcgtggc 8760 cagaaccggc tcgatgcgcg cccgcggcgc cgaccgcttc caccagt=cgg tggtcctcac 8820 caccccggcg aacgcggcgg ccgacgacgt ccggcgcgtg ctccagacgc tgatcgacca 8880 tcacggggca ctccgcctgc ggaccgccgc cgaccgggac ggatcgccgg acggcctggt 8940 gatcaccgaa ccggggacgg tcgccgccgc cggcctgctg cgctgccgcg acgccgccgg 9000 actccagggc gtggctctgc gggaggcggt ggagcgggag gccgggcacg cacgcgacgc 9060 cctcgacccg agcaccgggg ccgtgctgcg cgcggcctgg ctggaccggg ggaaggaccg 9120 gctcggcctg ctggtcctgg tggcccacca cctcagcgtg gacggccfitct cctggcgcat 9180 cctggcggac gacctccgcc aggcatggac cccggccgac gcccccgcca cgacggcgac 9240 cctgccgccg gagggcgctt ccctgcggga gtgggccacc cggatcgccc ggcgcgccac 9300 cgagaccgcc gtgaccggcc ggcttcccca ctggcgcgcg accctggccg gcctcgacga 9360 cccggacggc gacgtggtcg ccctggagac ccggctcgac cccgaggccg acacccacgg 9420 cacggcgcgc gagcacgcgc acggcctgtc acccgacctg accgacgcgc tcgtgcgcac 9480 cgcgcccgcg gcgctgcgcg cggacaccgg cgaactgctg ctggccgcct acgcgctcgc 9540 cgcgtcgcgg accctgggcg accggcccgt gttcgtggcg gagaccgaga gccacggccg 9600 ccaggacgcc ctcctgcccg gtgtcgacct gacccgcacc gtgggctggt tcacctctgt 9660 ccacccggtg cgcctgcggc ccggagccgg ggccgaacgg ctgctgaagg agaccaagga 9720 gcggctgcgc accgtgccgg aggccggtct cggccacgac ctgctccgcc tcggcggcgc 9780 ggcgtcctcc ccgcgggaga gcggccggga actgccccgc ccgcagttcg gattcaacta 9840 cctcggccgg gtcgccgtgg ccgaagccct caccgacgag ggcacggagc ccgccggggc 9900 ctgggcgttc gcgggccaca gcctcaccgc gcaacccccc gaactgccgc tcacacacga 9960 ggtggaactg accgtcgtcc tggaggacgg cccgcgggga ccggttctcc gagcccgctg 10020 gaacgcctcc gcccgctgcc tgaccggggc gcgcctgacc gcgctggcgc aggagtggga 10080 gaaggcgctg cacgagctga ccgccctggc cggcgtcgcc gggaccgccg gcctgatccc 10140 ctcggagacc ggcgccggcg acctggacca ggacgcgatc gaggagtgcg aggcggcggc 7.0200 ggacttcgag gtcgcggacc tgctcgccct cgcgcccgcc caggagggcc tgctcttcca 10260 cagcaccttc gacgacgcgg cggaggacgt gtacgtcggc cagctggcgc tggagttcca 10320 cggcgagctg tcgggcgcac ggatgcggga ggcggcccag caggtcctcg 
accggcacga 10380 cgtgctgcgc gcggccttcc tccagcgccg ctccggcgag tggagccagg cgatcgccgc 7.0440 caggacgccg gtgccctggg aggagcacga cctgtccgcc ctggccgggg aggagcggga 10500 gcggcggctg gacgccctgc tggccgggca ccgcacccgc cggttcgacc tcgcacggcc 10560 gccgctggtg cgcttcctgc tcgtcacgac ggcggcggac cggcacgtgc tggccgtgac 10620 caaccaccac ctggtactgg acggctggtc gctgccgttg gtggtgcgcg acctcatggc 10680 cctgtacggc acggacggcg ccgccctgcc cgccgtacgc ccgtaccgcg actacctcgc 10740 ctggctcgcc ggccaggacg cggacgcggc ccacgccgcc tgggcacagg cgctcgccgg 7.0800 gctccagccg tcactgatgg CCCCggaCgC CCCCCgCgaC ggcgcggccc cgctcgcgca 10860 ccaccgcacc atggaccccg acgtcgtctc ccgcctcacc gcctggtccc gccgcctggg 10920 cgtcaccctc aactccgtgg tggagaccgc gtgggccctc ctcctgggcc ggctcaccgg 10980 ccgcgacgac gtcagtttcg gcatcgccgc ctccggccgc cccaccgatc tgcccggcgc 11040 gggggagatc gtcggcctgc tgatgaacac cgtgccggtg cgcgtcaccc tggaccccgc 11100 cgaaccgctg gaagcactcg tccggcgcgt gcaaagggaa caggccgccc tcctcgacca 11160 ccagttcctc cccctggcac aggtgcagca gagcctcgga gcgggcgacc tcttcgacac 11220 cacgctcgtc ttcgagaact acccgctggc ccccgccgac ggcctcggcg acggcgacgg 11280 gctccgcctg cagggggcgc gcggccacga cggcaaccac taccccctca gcgtcaccgt 11340 cggccccgca cccgacctcc agctccgctt catccaccgc cccgacctgt tcacaccccc 11400 gtgggtggag gacctggcgg cgcggttcga gcaggtgctc gacgccatgg ccacgtccgg 11460 cgacaccccg gccggccgac tggacatcct gctgccgcgc gaacacgaca cgctcctggg 11520 cgactgggcg cgcggtgagg cggcgagcgc acgggagtgc cccgtcgccc tgttcgagga 11580 acaggtcgac cgcacccccg atgtcctcgc cctggtcgag ggcggtgacg gcgcccggct 11640 gagctacgcc gagttcgacg cccgcgccaa ccgcatggcg cgcttcctca tcgcccgcgg 11700 gctcggggcc gaggacctgg tcggcctggt cttcccccgc ggcgccgacc tgctcaccgg 11760 tctgtggggg gcgctcaagg ccggtgcggc ctacctgccg gtggacgtgg actacccggc 11820 cgaacggatc gcgctgctcc tcggcgacgg gaaccccgcc ctcgtcctca ccacctccgc 11880 ccacgcccac ctggtgcccg aggcgccggg gcggcagatc ctctgcgtcg acctgcccgg 11940 ccccgcggac gaactcgccc gcgccgcgga aggaagggtg accgacagcg agctgccgcg 12000 cccggtcggg cccgacaccc tcgcctacgt 
cctctacacc tccggctcca ccggccgccc 12060 caagggcgtg gcggtcggcc ggggttcgct ggccgcgcac gccgtccgct cccgcgaccg 12120 ctacccggac gcggccgggg tgtcgctgct gcactcaccg gtcgcgttcg acctgacggt 12180 gaccgccctg ttcaccacac tgatctccgg cggcaccctc ctcctcgcgg aactggacga 12240 acacgcccag gactccggcg tcacctacgt caagggcacg ccctcccacg tcgccctcct 12300 gaacgagctg cccggcgtcc tcgacgccac cgcggagcgc cccggcacgc tcgtgctcgg 12360 cggcgagccg ctcaccggag agatgctgga gcgctggcgc gcccaccacc cgcaggcccg 12420 ggtcttcaac gactacgggc cgtcggagac cagcgtcaac tgctccgacc tgctcctcga 12480 acccggagcc gaggtaccgg agggcctgct gccgatcggt cgcccgctgc ccggcaacca 12540 catgttcgtc ctcgaccacc tgctccagcc cgtaccggtc ggcgtcgtcg gagagatcta 12600 cgtctccggc gtcggcgtgg cccgcggcta ccacggccgg cccggcctga ccgccgagcg 12660 cttcctgccc tgcccctacg acgcaccggg cgcccggatg taccgc<~ccg gggacctggg 12720 gcgctggcgg cccgacggga tcatggagtg cctgggccga accgacgacc aggtcaaggt 12780 gcggggcttc cgggtggagc tgggcgaggt ggaggccgcg ctcgccgccc gctccgacgt 12840 cgcccgcgcc accgtcgtcg tgcgcgagga cgagccgggg gacaggcggc tgacgggcta 12900 cgtggtcccc gaagggggac cggacgcgga cttcgacccg gcggccgcgc tgcgcgacct 12960 ggcggccgcc ctgccgccgt acatggtccc ggccgcgatc gtggtcctct ccgaactgcc 13020 gcgtaccgag aacggcaagc tcgaccgcag ggcgctgccc gcgcccgact acggcaccgc 13080 ctccgtcgga cgtgcgccgc gcaccgccct ggagaccgac ctgtgcgccc tgttcgccga 13140 cgtgctcggc gtgcccggca tcaccctcga cgacgacttc ttcgccr_tgg gcggccactc 13200 cctgctcgcc gtccggctcg ccggccgcat ccgggccgag ctcggactgc ggctcgacat 13260 ccggacgatc ttcgaccacc gcacggtcgc ggacctcctc gccgatc cga atcct 13315 Information for SEQ ID N0: 2 Length: 723 Type: PRT
Organism: Streptomyces fradiae Sequence: 2 Leu Thr Val Arg Ala Glu His Arg Lys Ala Ser Thr Leu Pro Pro Gly Asn Pro Ala Val Ser Ser Gly Asp Ser Ala Ser Arg Arg Glu Lys Arg Ala Ala Ala Gly Ser Ser Ser Ser Ala Asp Pro Leu Ala Gly Pro His Leu Val Ala Ala Ile Ser Ala Thr Ala Glu Ala Asp Pro Gly Arg Lys Ala Val Gly Leu Val Arg Asp Pro Glu Arg Glu Gly Glu Glu Ala Leu Arg Ser Tyr Ala Trp Leu Asp Asp Thr Ala Arg Arg Ile Ala Val Leu Leu Arg Ala Ala Gly Leu Glu Thr Gly Ala Arg Val Leu Leu Leu Phe Pro Gln Ser Ala Glu Phe Ala Ala Ala Tyr Ala Gly Cys Leu Tyr Ala Gly Met Val Ala Val Pro Ala Pro Leu Pro Thr Gly Thr Ser His Glu Ala Ala Arg Val Val Gly Ile Ala Lys Asp Ser Glu Ala Gly Ala Val Leu Thr Val Ser Glu Thr Glu Ala Asp Val Arg Gln Trp Ala Ala Arg Thr Gly Leu Gly Ala Leu Pro Leu His Cys Val Asp Glu Leu Pro Gly Asp Ala Asp Pro Asp Thr Trp Arg Glu Pro Glu Ile Arg Ala Asp Thr Val Ala Val Leu Gln Tyr Thr Ser Gly Ser Thr Gly Ser Pro Lys Gly Val Val Val Thr His Gly Ala Leu Ala Asp Asn Val Arg Ser Leu Leu Thr Gly Phe Asp Leu Gly Ser Gly Ala Arg Leu Gly Gly Trp Leu Pro Met Tyr His Asp Met Gly Leu Phe Gly Leu Leu Ser Pro Ala Leu Phe Ser Gly Gly Ala Ala Val Leu Met Ser Gly Ser Ala Phe Leu Arg Arg Pro Ser Gln Trp Leu Arg Leu Ile Asp Arg Phe Gly Leu Val Phe Ser Ala Ala Pro Asp Phe Ala Tyr Asp Tyr Cys Val Arg Arg Val Arg Pro Glu Glu Thr Asp Gly Leu Asp Leu Ser Arg Trp Arg Trp Ala Ala Asn Gly Ser Glu Pro Ile Arg Ala Glu Thr Leu Arg Ala Phe Ala Lys Glu Phe Ala Pro Ala Gly Leu His Pro Asn Ala Thr Thr Pro Cys Tyr Gly Leu Ala Glu Ala Thr Leu Leu Val Ser Leu Pro Thr Gly Glu Leu Arg Thr Arg Arg Val Asp Val Ala Glu Leu Glu Asn His Arg Phe Val Glu Ala Ala Val Gly Arg Pro Ser Arg Glu Ile Val Ser Cys Gly Arg Pro Pro Ser Leu Glu Ile Arg Val Val Asp Pro Ala Thr Gly Lys Ser Val Thr Gly Gly Asp Gly Ala Gly Glu Thr Arg Val Gly Glu Ile Arg Val Arg Gly Ala Ser Val Ala Arg Gly Tyr Trp Gln Lys Pro Glu Ala Thr Ala Glu Thr Phe Val Met Asp Ala Asp Gly Ser Gly Pro Trp Leu Arg Thr
Gly Asp Leu Gly Ala Leu Tyr Glu Gly Glu Leu Tyr Val Thr Gly Arg Ile Lys Glu Leu Leu Ile Val His Gly Arg Asn Ile Tyr Pro His Asp Ile Glu His Glu Leu Arg Ala Arg His Ala Glu Leu Gly Ala Val Gly Ala Ala Phe Ser Leu Ser Thr Glu Ser Gly Glu Val Val Val Val Thr His Glu Val Asn Pro Thr Val Arg Pro Glu Gln Gly Pro Glu Leu Val Thr Ala Leu Arg Ala Thr Leu Ala Arg Glu Phe Gly Leu Ala Pro Ala Gly Val Val Leu Val Arg Arg Gly Arg Ile Pro Arg Thr Ser Ser Gly Lys Val Gln Arg Arg Leu Thr Ala Arg Leu Phe Ser Thr Gly Glu Leu Ala Gln Val His Ala Asp Pro Gly Ala His Arg Leu Leu Ala Glu Leu Arg Glu Ala His Asp Arg Gly Gly Ala Phe Pro Pro Pro Ser Pro Pro Ala Ser Gln Asp Pro Glu Ala Leu Arg Gln Arg Leu Arg Glu Leu Cys Ala Asp Cys Leu Gly Val Pro Val Asp Ser Leu Ala Thr Asp Ala Pro Leu Thr Asp Tyr Gly Met Thr Ser Val Thr Gly Thr Ala Leu Cys Gly Met Val Glu Glu Tyr Leu Asp Val Glu Cys Asp Leu Glu Leu Leu Trp Gln Glu Pro Thr Ile Asp Gly Leu Ala Ser Arg Leu Ala Ser Arg Thr Val Arg Information for SEQ ID NO: 3

Length: 2172 Type: DNA

Organism:Streptomyces fradiae Sequence:3 ttgaccgtccgggcggagcaccggaaagcgtcgaccctgccgccggggaacccggccgtc60 agcagcggcgactccgcgtcccgccgggagaagagggccgccgctgggagcagttcttcg120 gcggacccgctggccggcccccacctggtggccgcgatctccgcgacggccgaggccgac180 ccggggcgcaaggccgtcggtctcgtccgggatccggagcgcgagggcgaggaggcgctg240 cggagctacgcctggctcgacgacaccgcccgccgcatcgccgtcctcctgcgtgcggcc300 gggctggaaacgggcgcacgcgtgctgctgctcttcccgcagtccgcggagttcgcggcg360 gcctacgccgggtgtctctacgcgggcatggttgccgtccccgcgccccttccgaccggc420 acctcccatgaggccgcacgcgtcgtcggcatcgcgaaggactccgaggcaggcgccgtc480 ctcaccgtctccgaaaccgaggcggacgtccggcaatgggcggcccgcaccggcctgggc540 gcgctgcccctccactgcgtcgacgaactgcccggcgacgccgaccccgacacgtggcgg600 gaaccggagatccgggccgacaccgtggcggtcctccagtacacctccggctccaccggc660 agccccaagggggtcgtcgtcacccacggcgcgctcgccgacaacgtgcgcagcctgctc720 acgggcttcgatctgggatccggcgcccggctgggcggctggctgccgatgtaccacgac780 atggggctgttcggcctgctgagcccggcactgttcagcggcggag~~cgccgtgctgatg840 agcggcagcgccttcctgcgccgcccgtcccagtggctgaggctgatcgaccgcttcggc900 ctcgtcttctcggcggcgcccgacttcgcctacgactactgcgtacggcgggtgagaccc960 gaggagacggacgggctcgacctgtcgcgctggcgctgggcggccaacggctccgagccc1020 atccgcgccgagacgctgcgcgccttcgccaaggagttcgccccgg~~cggactccacccg1080 aacgccaccaccccttgctacggactggccgaggcgaccctgctggtgtccctgcccacg1140 ggtgagctgcgcacccgacgggtggacgtcgcggaactggagaaccaccgcttcgtcgaa1200 gcggccgtgggacgcccctcccgcgagatcgtgtcctgcggccggcccccgtccctggag1260 atccgcgtcg tcgaccccgc gaccggcaag tccgtcacgg gcggcgacgg agccggcgag 1320 accagggtgg gcgagatcag agtgcgcggc gcgagcgtcg ccaggggcta ctggcagaaa 1380 ccggaggcga ccgccgagac gttcgtcatg gacgcggacg gctccgggcc ctggctgcgc ' 1440 accggcgacc tcggcgctct gtacgagggc gagctgtacg tcaccggccg tatcaaggaa 1500 ctcctcatcg tgcacggccg caacatctac ccccatgaca tcgagcacga actgcgcgcc 1560 cgccacgccg aactcggcgc tgtcggggcc gccttctccc tcagcaccga atcgggcgag 1620 gttgtggtcg tcacccatga ggtgaacccc accgtccggc ccgagcaggg tcccgagctg 1680 gtgaccgccc tgcgtgcgac gctcgcgcgg gagttcggcc tcgccccggc cggggtggtg 1740 ctggtgcgcc gcggccgcat cccgcgcacc 
agcagcggca aggtgcaacg ccgcctgacc 1800 gcccggctgt tcagcacggg ggaactcgcc caggtccatg ccgaccccgg cgcccaccgc 1860 ctcctggcgg aactcaggga ggcgcacgac cgcggcggcg ccttcccgcc cccctccccg 1920 cccgccagcc aggaccccga ggccctgcgg cagcggctgc gcgagctgtg cgccgactgt 1980 ctcggcgtcc ccgtggactc cctcgccacg gacgcccccc tcaccgacta cgggatgacc 2040 tccgtcaccg gcaccgccct gtgcgggatg gtggaggagt acctggacgt cgaatgcgac 2100 ctggaactgc tctggcagga gccgacgatc gacgggctcg cctcccggct ggcctcgcgc 2160 accgtgcgct ga 2172 Information for SEQ ID NO. 4 Length: 3700 Type: PRT
Organism: Streptomyces fradiae Sequence: 4 Val Cys Ala Arg Thr Pro Pro Arg Val Arg Arg Cys Arg Pro Leu Gln Pro Glu Ser Gly Arg Ser Met Leu Glu Ser Ser Ala His Arg Val Ala Ala Thr Ser Ala Gln Thr Gly Ile Trp Thr Ala Gln Arg Leu Arg Gly Asp Asp Arg Leu Tyr Ala Cys Gly Leu Phe Leu Glu Leu Asp His Val Val Glu Glu Val Leu Ser Glu Ala Ile Arg Arg Ala Val Ala Asp Thr Glu Ala Leu Arg Thr Ala Phe Arg Glu Asp Ala Asp Gly Ala Leu Glu Gln His Val Leu Ala Arg Pro Pro Ser Thr Gln Thr Arg Leu Phe His Ala Asp Pro Ser Gly Gly Thr Pro Ser Arg Ser Ala Ser Leu Asp Trp Met Asp Arg Gln Arg Ala Gln Pro Trp Asp Leu Ala Ser Gly Asp Thr Cys Arg His Thr Leu Ile Pro Leu Gly Gly Asp Arg Ser Leu Leu His Leu Arg Tyr His His Leu Ala Leu Asp Gly Tyr Gly Ala Ala Leu Tyr Leu Asp Arg Leu Ala Ala Val Tyr Arg Ala Leu Arg Thr Gly His Gln Pro Pro Pro Cys Ala Phe Ala Pro Leu Ala Arg Leu Val Glu Glu Asp His Ala Tyr Arg Asn Ser Ala Arg His Arg Ala Asp Ala Asn His Trp Arg Asp Arg Phe Ala Asp Leu Pro Arg Pro Thr Ser Leu Ala Asp Ala Thr Thr Pro Ala Ala Pro Thr Thr Pro Ala Thr Pro Ala Ala Pro Ala Ala Pro Asp Glu Leu Arg Arg Thr Val Arg Leu Ser Ala Ala Arg Ser Ala Ala Leu Arg Arg Ala Ser Asp Arg Ser Gly Arg Pro Trp Pro Val Tyr Ala Thr Ala Ala Val Ala Ala Phe Leu Ser Arg Leu Ala Pro Gly Glu Glu Val Val Val Gly Leu Pro Val Thr Ala Arg Val Thr Pro Ala Ala Val Arg Thr Pro Gly Met Leu Ala Asn Val Val Pro Leu Arg Leu Pro Val Arg Gln Gly Met Ser Thr Ala Glu Leu Leu Glu Leu Thr Ala Ala Glu Ile Ser Thr Thr Leu Arg His Gln Arg His Arg Thr Glu Asp Ile Gly Arg Ala Leu Gly Leu His Gly Ala Pro Pro Ala Thr Thr Leu Val Asn Val Met Ala Phe Ala Pro Val Leu Asp Phe Gly Asp Cys Arg Ala Pro Val His Gln Leu Ser Ala Gly Pro Val Glu Asp Leu Val Val Asn Leu Leu Gly Thr Pro Gly Asp Gly Gly Glu Ser Asp Gly Thr Glu Leu Glu Ile Thr Val Ala Ala Asn Pro Arg Leu His Ser Ala Asp Ala Val Ala Ser Leu Ala Ala Arg Leu Ala Glu Phe Leu Thr His Met Gly Gln Asp Ala Glu Ala Pro Leu Gly Arg Thr Arg Leu Leu Asp Ala Glu Glu Glu Val Ala Ala
Val Ala Arg Gly His Ser Pro Arg Arg Asp Leu Arg Ala Arg Thr Leu Pro Glu Leu Phe Ala Arg Gln Val Ala Arg Thr Pro Asp Ala Pro Ala Val Ser Ser Asp Arg Ala Thr Trp Thr Tyr Ala Gln Leu Asp Ala His Ala Glu Arg Val Ala Arg Arg Leu Ala Ala Arg Gly Val Gly Pro Glu Ser Leu Val Ala Leu Ala Val Pro Arg Gly Val Glu Leu Ala Ala Leu Ile Leu Gly Ile Gln Arg Ala Gly Gly Ala Tyr Leu Pro Ile Asp Pro Glu Tyr Pro Ala Glu Arg Val Gly Phe Leu Leu Arg Asp Ala Arg Pro Ala Leu Leu Val Gly Gly Thr Gly Thr Glu Pro Ser Ala Ala Asp Cys Pro Arg Val Pro Ala Glu Glu Leu Leu Asp Ala Gly Ala Cys Arg Ala Glu Ala Asp Val Pro Pro Pro Gly Ser Leu Pro Val Asp Leu Pro Ala Tyr Val Val His Thr Ser Gly Ser Thr Gly Arg Pro Lys Gly Val Val Val Thr His Ala Gly Ile Ala Ala Leu Ala Ala Glu Gln Ile Glu Arg Tyr Arg Leu Gly Pro Gly Ser Arg Val Ala Gln Leu Ala Ala Leu Gly Phe Asp Val Ala Val Ala Glu Leu Val Met Ala Leu Ala Ser Gly Ser Cys Leu Val Leu Pro Pro His Gly Leu Ala Gly Asp Glu Leu Ala Ser Phe Leu Arg Asp Arg Arg Ile Thr Thr Ala Leu Ala Pro Ala Ala Val Leu Ala Thr Leu Pro Pro Gly Asp Leu Pro Asp Leu Thr Asp Leu Val Thr Gly Gly Glu Gln Pro Pro Pro Ala Leu Ile Ala Arg Trp Ala Pro Gly Arg Arg Met Phe Asn Val Tyr Gly Pro Thr Glu Ala Thr Val Gln Ala Thr Ser Gly Arg Cys Ala Ala Asp Gly Asp Arg Ser Pro Asp Ile Gly Asn Pro Glu Ala Gly Val Asp Ala Tyr Val Leu Asp Ala Ala Leu Arg Pro Val Pro Asp Gly Val Thr Gly Glu Leu Tyr Leu Arg Gly Arg Gly Leu Ala Arg Gly Tyr Leu Gly Arg Pro Gly Leu Thr Ala Gly Arg Phe Val Ala Asp Pro His Thr Gly Thr Gly Glu Arg Met Tyr Arg Thr Gly Asp Leu Val Arg Arg Val Pro Gly Asp Gly Arg Thr Val Leu Arg Phe Val Gly Arg Ala Asp Asp Gln Val Lys Ile Arg Gly Phe Arg Val Glu Pro Gly Glu Val Glu Ala Ala Leu Ala Glu Leu Asp Gly Val Glu Gln Ala Leu Val Thr Val Arg Glu Glu Arg Pro Gly Asp Arg Arg Leu Val Gly Tyr Leu Thr Pro Ala Pro Gly His Arg Gly Ser Leu Asp Val Glu Arg Leu Arg Arg Val Leu Ala Asp Arg Leu Pro Ala His Leu Val Pro Ser Leu Leu Met Glu Leu Ala Glu Ile Pro Arg Thr Ala Asn
Gly Lys Val Asp Arg Ala Ala Leu Pro Asp Pro Ala Pro Leu Ala Pro Thr Ala Gly Arg Ala Pro Arg Asp Ala Arg Glu Glu Ala Leu Cys Ala Leu Phe Ala Glu Val Leu Gly Val Glu Glu Val Gly Val Asp Leu Asp Phe Phe Ala Leu Gly Gly Asp Ser Leu Leu Ala Ala Arg Leu Ala Ser Arg Ile Arg Gly Arg Leu Gly Lys Ala Val Thr Val Arg Glu Val Phe Arg Ser Pro Thr Val Ala Arg Leu Ala Glu Glu Leu Gly Asp Gly Ala Val Pro Asp Asp His Val Arg

Pro Val Arg Pro Arg Pro Glu Arg Leu Pro Leu Ser Ser Ala Gln Arg Arg Leu Trp Phe Ile Asp Glu Leu Asn Gly Ala Ser Ala Ala Tyr Asn Ile Pro Thr Val Leu His Leu Glu Gly Pro Leu Asp Val Pro Ala Leu His Ala Ala Leu Gly Asp Val Thr Asp Arg His Glu Thr Leu Arg Thr Thr Leu Arg Pro Pro Ala Asp Asp Gly Ser Ala Gly Ala Pro Glu Gln His Ile Ala Pro Pro Gly Gly His Arg Pro Pro Leu Pro Val Leu Asp Val Ala Pro Glu Ala Leu Ala Gly Glu Leu Arg Ala Ala Ala Gly His Val Phe Asp Leu Ala Arg Asp Leu Pro Val Arg Ala Thr Leu Tyr Arg Thr Gly Glu Leu Glu His Ala Leu Leu Leu Leu Val His His Val Ala Ala Asp Gly Ala Ser Met Gly Pro Leu Ile Gly Asp Leu Ala Thr Ala Tyr Thr Ala Arg Leu Ala Gly Arg Ala Pro Val Leu Pro Pro Pro Glu Val Thr Tyr Ala Asp Phe Ala Leu Trp Glu Arg Gly Ser Arg Glu Arg Ser Ala Ala Gln Ala Glu Gly Ile Asp Tyr Trp Arg Arg Ala Leu Ala Gly Leu Pro Asp His Ile Arg Leu Pro Ala Asp Arg Pro Arg Ser Gln Glu Pro Val Arg Arg Gly Gly Ile Ala Arg Phe Glu Val Pro Pro Ala Leu Tyr Ala Gly Leu Val Glu Leu Ala Gln Gly Val Gly Ala Thr Pro Phe Met Val Leu Gln Thr Ala Ile Ala Val Leu Leu Ser Arg Met Gly Ala Gly Thr Asp Ile Pro Leu Gly Thr Pro Val Ala Gly Arg Gln Asp Glu Ala Leu Asp Gly Leu Val Gly Cys Phe Val Asn Thr Val Val Leu Arg Thr Asp Val Ser Gly Asp Pro Thr Thr Thr Glu Leu Leu Ala Arg Thr Arg Asp Gly Asp Leu Glu Ala Leu Ala His Gln Asp Val Pro Phe Asp Arg Val Val Glu Ala Val Asn Pro Val Arg Ser Thr Ser Arg His Pro Leu Phe Gln Val Met Leu Ile Leu Asn Gly Ser Asp Gln Asp Arg His Arg Ala Arg Phe Pro Arg Leu Ala Asp Arg Val Glu Thr Val Glu Pro Gly Glu Thr Lys Phe Asp Leu Ser Trp His Phe Thr His Arg Asp Gly Pro Asp Arg Ser Leu Glu Gly Ala Leu Val His Ala Ala Asp Leu Phe Asp Asp Asp Thr Ala His Arg Leu Thr Ala Arg Leu Leu Asp Val Leu Thr Ala Met Val Asp Asp Pro Ala Arg Pro Val Gly Thr Ile Asp Val Leu Ser Ala Ala Glu His Arg Leu Val Arg Ala Trp Gly Thr Gly Thr Pro Arg Pro Pro Gly Ser Arg Pro Glu Pro Val Ala Ala Arg Ile Ala Gly Gln Ala Ala Arg Thr Pro Asp Ala Pro Ala Val Thr Glu Pro Gly Arg Val Trp
Ser Tyr Ala Glu Leu Asp Ala Arg Ala Asp Arg Val Ala Ala Ala Leu Ala Ala Arg Gly Ile Gly Ala Glu Asp Leu Val Ala Val Leu Leu Pro Arg Gly Ala Glu Leu Val Ala Thr Leu Leu Gly Ile Leu Arg Ala Gly Ala Ala Tyr Leu Pro Leu Asp Thr Gly His Pro Ala Asp Arg Asn Arg Arg Ala Leu Ser Asp Ser Ala Pro Ala Leu Leu Val Thr Asp Ala Gly Arg Ser Arg Thr Leu Arg Gly Glu Thr Gly Cys Ala Ala Leu Val Leu Gly Ala Glu Asp Thr Glu Arg Glu Leu Ala Asp Arg Ala Pro Leu Pro Arg Asp Gly Ala Gly Leu Val Arg Pro Val Thr Gly Asp Asn Ala Ala Tyr Thr Ile Leu Thr Ser Gly Ser Thr Gly Arg Pro Lys Ala Val Val Val Thr Arg Asp Ala Leu Asp Ala Phe Val Asp Arg Ala Leu Asp Thr Tyr Gly Asp Ala Leu Gly Gly Glu Ala Leu Leu His Ser Pro Val Ala Phe Asp Leu Thr Val Val Thr Leu Tyr Gly Pro Leu Ala Ala Gly Gly Arg Val Arg Val Gly Asp Leu Asp Glu Ser Gly Ile Ala Arg Trp Glu Lys Glu Arg Pro Ala Phe Val Lys Ala Thr Pro Ser His Leu Ala Leu Leu Thr Glu Phe Gly Gly Ser Thr Ala Pro Gly Thr Val Val Leu Ala Gly Glu Gln Leu Ile Gly Ala Arg Leu Asp Arg Trp Arg Thr Arg Leu Gly Ala Ser Gly Thr Thr Val Leu Asn Ser Tyr Gly Pro Thr Glu Thr Thr Val Asn Cys Leu Glu His Arg Ile Ala Pro Asp Ala Asp Val Pro Ser Gly Pro Val Pro Val Gly Arg Pro Val Pro Gly Val Arg Val Leu Leu Leu Asp Asp Arg Leu Arg Pro Val Ala Pro Gly Val Thr Gly Glu Leu Tyr Val Cys Gly Pro Gly Val Ala Arg Gly Tyr Arg Ala Arg Pro Ala Ala Thr Ala Glu Arg Phe Val Ala Cys Pro Gln Gly Arg Pro Gly Glu Arg Met Tyr Arg Thr Gly Asp Leu Met Arg Trp Thr Ala Asp Gly Glu Leu Val Tyr Glu Gly Arg Ala Asp Ala Gln Val Lys Val Arg Gly Phe Arg Val Glu Pro Gly Glu Val Glu Ala Ala Leu Leu Gly Leu Pro Gly Val Arg Glu Ala Ala Val Thr Leu Leu Glu Gly Pro Glu Gly Thr Glu Gly Pro Glu Gly Ala Pro Gly Arg Val Ala Ala Pro Ala Arg Leu Val Gly Tyr Val Val Gly Ala Ser Glu Glu Pro Ala Ala Leu Leu Glu Arg Leu Arg Val Arg Leu Pro Asp His Met Val Pro Ala Ala Leu Val Asp Leu Asp Ala Leu Pro Leu Thr Pro Asn Gly Lys Leu Asp Arg Arg Ala Leu Pro Ala Pro Asp Phe Gly Arg His Ala Gly Arg Arg Ala Pro Ser Gly Pro Glu
Glu Glu Ala Leu Cys Ala Leu Phe Ala Asp Val Leu Gly Val Pro Glu Val Gly Ala Asp Asp Ser Phe Phe Thr Leu Gly Gly Asp Ser Ile Val Ser Ile Gln Leu Val Gly Arg Ala Arg Gly Ala Gly Leu His Leu Thr Val Arg Asp Val Phe Glu His Pro Thr Ala Ala Gly Leu Ala Thr Val Val Arg Ser Ala Gly Pro Asp Ala Asp Ala Glu Arg Pro Ala Pro Gln Ala Leu Ala Pro Ser Gly Thr Leu Pro Tyr Val Pro Ala Ala Ala Arg Leu Val Ala Arg Thr Gly Ser Met Arg Ala Arg Gly Ala Asp Arg Phe His Gln Ser Val Val Leu Thr Thr Pro Ala Asn Ala Ala Ala Asp Asp Val Arg Arg Val Leu Gln Thr Leu Ile Asp His His Gly Ala Leu Arg Leu Arg Thr Ala Ala Asp Arg Asp Gly Ser Pro Asp Gly Leu Val Ile Thr Glu Pro Gly Thr Val Ala Ala Ala Gly Leu Leu Arg Cys Arg Asp Ala Ala Gly Leu Gln Gly Val Ala Leu Arg Glu Ala Val Glu Arg Glu Ala Gly His Ala Arg Asp Ala Leu Asp Pro Ser Thr Gly Ala Val Leu Arg Ala Ala Trp Leu Asp Arg Gly Lys Asp Arg Leu Gly Leu Leu Val Leu Val Ala His His Leu Ser Val Asp Gly Val Ser Trp Arg Ile Leu Ala Asp Asp Leu Arg Gln Ala Trp Thr Pro Ala Asp Ala Pro Ala Thr Thr Ala Thr Leu Pro Pro Glu Gly Ala Ser Leu Arg Glu Trp Ala Thr Arg Ile Ala Arg Arg Ala Thr Glu Thr Ala Val Thr Gly Arg Leu Pro His Trp Arg Ala Thr Leu Ala Gly Leu Asp Asp Pro Asp Gly Asp Val Val Ala Leu Glu Thr Arg Leu Asp Pro Glu Ala Asp Thr His Gly Thr Ala Arg Glu His Ala His Gly Leu Ser Pro Asp Leu Thr Asp Ala Leu Val Arg Thr Ala Pro Ala Ala Leu Arg Ala Asp Thr Gly Glu Leu Leu Leu Ala Ala Tyr Ala Leu Ala Ala Ser Arg Thr Leu Gly Asp Arg Pro Val Phe Val Ala Glu Thr Glu Ser His Gly Arg Gln Asp Ala Leu Leu Pro Gly Val Asp Leu Thr Arg Thr Val Gly Trp Phe Thr Ser Val His Pro Val Arg Leu Arg Pro Gly Ala Gly Ala Glu Arg Leu Leu Lys Glu Thr Lys Glu Arg Leu Arg Thr Val Pro Glu Ala Gly Leu Gly His Asp Leu Leu Arg Leu Gly Gly Ala Ala Ser Ser Pro Arg Glu Ser Gly Arg Glu Leu Pro Arg Pro Gln Phe Gly Phe Asn Tyr Leu Gly Arg Val Ala Val Ala Glu Ala Leu Thr Asp Glu Gly Thr Glu Pro Ala Gly Ala Trp Ala Phe Ala Gly His Ser Leu Thr Ala Gln Pro Pro Glu Leu Pro Leu Thr His Glu Val 
Glu Leu Thr Val Val Leu Glu Asp Gly Pro Arg Gly Pro Val Leu Arg Ala Arg Trp Asn Ala Ser Ala Arg Cys Leu Thr Gly Ala Arg Leu Thr Ala Leu Ala Gln Glu Trp Glu Lys Ala Leu His Glu Leu Thr Ala Leu Ala Gly Val Ala Gly Thr Ala Gly Leu Ile Pro Ser Glu Thr Gly Ala Gly Asp Leu Asp Gln Asp Ala Ile Glu Glu Cys Glu Ala Ala Ala Asp Phe Glu Val Ala Asp Leu Leu Ala Leu Ala Pro Ala Gln Glu Gly Leu Leu Phe His Ser Thr Phe Asp Asp Ala Ala Glu Asp Val Tyr Val Gly Gln Leu Ala Leu Glu Phe His Gly Glu Leu Ser Gly Ala Arg Met Arg Glu Ala Ala Gln Gln Val Leu Asp Arg His Asp Val Leu Arg Ala Ala Phe Leu Gln Arg Arg Ser Gly Glu Trp Ser Gln Ala Ile Ala Ala Arg Thr Pro Val Pro Trp Glu Glu His Asp Leu Ser Ala Leu Ala Gly Glu Glu Arg Glu Arg Arg Leu Asp Ala Leu Leu Ala Gly His Arg Thr Arg Arg Phe Asp Leu Ala Arg Pro Pro Leu Val Arg Phe Leu Leu Val Thr Thr Ala Ala Asp Arg His Val Leu Ala Val Thr Asn His His Leu Val Leu Asp Gly Trp Ser Leu Pro Leu Val Val Arg Asp Leu Met Ala Leu Tyr Gly Thr Asp Gly Ala Ala Leu Pro Ala Val Arg Pro Tyr Arg Asp Tyr Leu Ala Trp Leu Ala Gly Gln Asp Ala Asp Ala Ala His Ala Ala Trp Ala Gln Ala Leu Ala Gly Leu Gln Pro Ser Leu Met Ala Pro Asp Ala Pro Arg Asp Gly Ala Ala Pro Leu Ala His His Arg Thr Met Asp Pro Asp Val Val Ser Arg Leu Thr Ala Trp Ser Arg Arg Leu Gly Val Thr Leu Asn Ser Val Val Glu Thr Ala Trp Ala Leu Leu Leu Gly Arg Leu Thr Gly Arg Asp Asp Val Ser Phe Gly Ile Ala Ala Ser Gly Arg Pro Thr Asp Leu Pro Gly Ala Gly Glu Ile Val Gly Leu Leu Met Asn Thr Val Pro Val Arg Val Thr Leu Asp Pro Ala Glu Pro Leu Glu Ala Leu Val Arg Arg Val Gln Arg Glu Gln Ala Ala Leu Leu Asp His Gln Phe Leu Pro Leu Ala Gln Val Gln Gln Ser Leu Gly Ala Gly Asp Leu Phe Asp Thr Thr Leu Val Phe Glu Asn Tyr Pro Leu Ala Pro Ala Asp Gly Leu Gly Asp Gly Asp Gly Leu Arg Leu Gln Gly Ala Arg Gly His Asp Gly Asn His Tyr Pro Leu Ser Val Thr Val Gly Pro Ala Pro Asp Leu Gln Leu Arg Phe Ile His Arg Pro Asp Leu Phe Thr Pro Pro Trp Val Glu Asp Leu Ala Ala Arg Phe Glu Gln Val Leu Asp Ala Met Ala Thr Ser Gly Asp
Thr Pro Ala Gly Arg Leu Asp Ile Leu Leu Pro Arg Glu His Asp Thr Leu Leu Gly Asp Trp Ala Arg Gly Glu Ala Ala Ser Ala Arg Glu Cys Pro Val Ala Leu Phe Glu Glu Gln Val Asp Arg Thr Pro Asp Val Leu Ala Leu Val Glu Gly Gly Asp Gly Ala Arg Leu Ser Tyr Ala Glu Phe Asp Ala Arg Ala Asn Arg Met Ala Arg Phe Leu Ile Ala Arg Gly Leu Gly Ala Glu Asp Leu Val Gly Leu Val Phe Pro Arg Gly Ala Asp Leu Leu Thr Gly Leu Trp Gly Ala Leu Lys Ala Gly Ala Ala Tyr Leu Pro Val Asp Val Asp Tyr Pro Ala Glu Arg Ile Ala Leu Leu Leu Gly Asp Gly Asn Pro Ala Leu Val Leu Thr Thr Ser Ala His Ala His Leu Val Pro Glu Ala Pro Gly Arg Gln Ile Leu Cys Val Asp Leu Pro Gly Pro Ala Asp Glu Leu Ala Arg Ala Ala Glu Gly Arg Val Thr Asp Ser Glu Leu Pro Arg Pro Val Gly Pro Asp Thr Leu Ala Tyr Val Leu Tyr Thr Ser Gly Ser Thr Gly Arg Pro Lys Gly Val Ala Val Gly Arg Gly Ser Leu Ala Ala His Ala Val Arg Ser Arg Asp Arg Tyr Pro Asp Ala Ala Gly Val Ser Leu Leu His Ser Pro Val Ala Phe Asp Leu Thr Val Thr Ala Leu Phe Thr Thr Leu Ile Ser Gly Gly Thr Leu Leu Leu Ala Glu Leu Asp Glu His Ala Gln Asp Ser Gly Val Thr Tyr Val Lys Gly Thr Pro Ser His Val Ala Leu Leu Asn Glu Leu Pro Gly Val Leu Asp Ala Thr Ala Glu Arg Pro Gly Thr Leu Val Leu Gly Gly Glu Pro Leu Thr Gly Glu Met Leu Glu Arg Trp Arg Ala His His Pro Gln Ala Arg Val Phe Asn Asp Tyr Gly Pro Ser Glu Thr Ser Val Asn Cys Ser Asp Leu Leu Leu Glu Pro Gly Ala Glu Val Pro Glu Gly Leu Leu Pro Ile Gly Arg Pro Leu Pro Gly Asn His Met Phe Val Leu Asp His Leu Leu Gln Pro Val Pro Val Gly Val Val Gly Glu Ile Tyr Val Ser Gly Val Gly Val Ala Arg Gly Tyr His Gly Arg Pro Gly Leu Thr Ala Glu Arg Phe Leu Pro Cys Pro Tyr Asp Ala Pro Gly Ala Arg Met Tyr Arg Thr Gly Asp Leu Gly Arg Trp Arg Pro Asp Gly Ile Met Glu Cys Leu Gly Arg Thr Asp Asp Gln Val Lys Val Arg Gly Phe Arg Val Glu Leu Gly Glu Val Ala Ala Leu Ala Ser Val Ala Glu Ala Arg Asp Arg Ala Thr Val Val Arg Glu Asp Gly Arg Arg Val Glu Pro Asp Leu Thr Gly Tyr Val Pro Glu Gly Asp Asp Phe Val Gly Pro Ala Asp Pro Ala Ala Leu Arg Asp Leu Ala Pro
Pro Ala Ala Ala Leu Tyr Met Val Pro Ala Ile Val Val Glu Pro Arg Ala Leu Ser Leu Thr Glu Asn Gly Leu Asp Arg Arg Pro Pro Asp Lys Ala Leu Ala Tyr Gly Thr Ala Val Gly Arg Ala Thr Leu Glu Ser Pro Arg Ala Thr Asp Leu Cys Leu Phe Ala Asp Gly Pro Gly Ala Val Leu Val Ile Thr Leu Asp Asp Phe Phe Ala Gly Ser Leu Asp Leu Gly His Leu Ala Val Arg Ala Gly Arg Ile Glu Gly Leu Leu Arg Ala Leu Arg Leu Asp Ile Thr Ile Phe Asp Thr Ala Asp Arg His Arg Val Leu Leu Ala Asp Asn Pro Pro Information for SEQ ID NO: 5

Length: 11100 Type: DNA

Organism: Streptomyces fradiae Sequence: 5 gtgtgcgccc ggacacctccgcgcgtccgg cggtgccgcc cgagagcggg60 cccttcagcc agaagcatgt tggagtcctcggcacaccgt gtggccgcca gaccgggatc120 cgtcggccca tggacggcgc agcgtctgcgcggggacgac aggctctacg cttcctcgaa180 cctgcggcct ctcgaccacg tggtggaggaggtgctgagc gaggcgatcc cgccgacacc240 gccgcgccgt gaggcgctgc gcaccgcgttccgggaggac gcggacggcg gcacgtcctc300 cgctggagca gcccggccgc cgagcacgcagacccgcctc ttccacgccg cggaaccccc360 acccgagcgg tcccgctccg cgtccctggactggatggac cggcaacggg ggacctcgcg420 cgcaaccctg tcgggcgaca cctgccgtcataCCCtgatC CCCCtCggCg gctgctgcac480 gcgaccgctc ctgcgttacc accacctcgccctggacggg tacggcgccg ggaccggctc540 cgctctatct gcggcggtctaccgcgcgctgcgcaccggccatcaaccgcccccctgcgcgttcgcgccg600 ctggcccgcctggtcgaggaggaccacgcctaccggaactccgcccgtcaccgcgcggac660 gccaatcactggcgcgaccgcttcgcggacctcccgcgccccaccagcctcgccgacgcc720 accacgcccgcggcgcccaccacgcccgccacgcccgccgcgcccgccgcgcccgacgaa780 ctgcggcgcaccgtgcgcctgtccgccgcccggtccgccgcgctgcgccgtgcctcggac840 cggagcggccgaccctggcccgtgtacgccacggccgcggtggccgccttcctgagccga900 ctcgcgccgggggaggaggtcgtcgtcggcctcccggtcaccgccagggtgacccccgcc960 gcggtgcgcacaccggggatgctcgccaacgtcgtaccgcttcgcctgcccgtccggcag1020 ggcatgtcgacggcggagctgctggagctgaccgcggccgagatcagcaccacactgcgc1080 caccagcgccaccgcaccgaggacatcgggcgggcgctcggactccacggcgctccgcca1140 gccaccacactcgtgaacgtcatggcgttcgccccggtcctcgacttcggcgactgccgg1200 gccccggtgcaccagctctcggccggaccggtggaggacctggtcgtcaacctcctcggc1260 accccgggcgacggcggcgagagcgacggcaccgagctggagatcactgtcgccgccaac1320 ccccgcctccactcggcggacgcggtggcctcgctggccgcgcggctcgcggagttcctc1380 acgcacatggggcaggacgccgaggcgcccctcggccggacccggctgctcgacgcggag1440 gaggaggtcgcggcggtggcccgcgggcacagtccccgacgcgacctgcgcgcccggacc1500 ctgcccgagctcttcgcccggcaggtcgcccgcaccccggacgcccccgccgtctcctcg1560 gaccgcgccacctggacgtacgcccaactcgacgcccacgccgagagagtggcccggcgg1620 ctcgccgcgcggggcgtgggaccggagagcctcgtcgccctcgcggtgccgcggggcgtc1680 gagctggcggcgctgatcctcgggatccagcgggccggaggggcctacctccccatcgac1740 
ccggagtacccggcggaacgcgtcggtttcctgctgcgcgacgcccgccccgccctgctc1800 gtcggcgggacggggaccgagccctccgccgccgactgcccgcgcgtgccggccgaagag1860 ctcctcgacgccggggcgtgccgcgccgaggccgacgtgcccccgcccggaagcctcccg1920 gtggaccttccggcgtacgtcgtccacacctccggctcgaccgggcggcccaagggggtg1980 gtggtcacccacgcgggcatcgccgccctggcggccgagcagatcgaacgctaccgactg2040 ggacccggctccagggtggcgcagctggcggccctcgggttcgacgt=cgcggtcgccgaa2100 ctcgtgatggcgctggcgtcggggagctgtctcgtcctcccgccgcacggcctcgccggc2160 gacgagctggcctccttcctgcgcgaccggcgtatcaccacggccct:cgccccggccgcc2220 gtactggccaccctgccccccggcgacctccccgacctgaccgatctggtcaccggaggc2280 gagcagccaccgcccgcgctgatcgcccgctgggcacccggccggcggatgttcaacgtc2340 tacgggccgacggaggccaccgtccaggccacctccgggcggtgcgcggccgacggcgac2400 cggtcgccggacatcgggaaccccgaggccggagtggacgcctacgttctggacgccgcg2460 ctgcgacccgtgcccgacggggtgacgggcgagctctacctgcgtggcaggggcctggcc2520 cgcggctacctcggccgccccggcctcaccgccggccggttcgtcg~~cgacccccacacc2580 gggacgggcgagcggatgtacaggaccggggacctggtgcgccgggtgcccggcgacggc2640 cgcaccgtgctgaggttcgtcggccgggccgacgaccaggtgaagatccggggcttccgg2700 gtcgagccgggcgaggtcgaggccgccctcgccgaactcgacggcgtcgagcaggccctg2760 gtgaccgtccgggaggagcggcccggcgaccgcaggctcgtcggctacctgacacccgcc2820 cccgggcaccggggatcactggacgtcgagcgcctgcgccgcgtgctcgccgaccggctc2880 cccgcccacctcgtcccctccctcctgatggagctggcggagatcccgcgcaccgccaac2940 ggcaaggtggaccgtgcggcgctgccggaccccgctcccctggccccgaccgccgggagg3000 gcgccgcgcgacgcccgagaagaggccctgtgcgcgctcttcgccgaggtgctgggcgtc3060 gaggaggtcggcgtcgacctcgacttcttcgcgctgggcggtgactcgctgctggccgcc3120 cgactggcgagccgcatccgtggccggctgggcaaggcggtcaccgtacgggaggtcttc3180 cggtccccgaccgtcgcccgcctcgcggaggaactgggcgacggggccgtgccggacgac3240 cacgtccgtcccgtccggccccgcccggagcggctgccgctgtcgtccgcgcagcgccgg3300 ctgtggttcatcgacgaactcaacggcgcgtcggcggcctacaacatcccgaccgtcctc3360 cacctggagggaccgctcgacgtccccgcgctgcacgccgcgctcggcgacgtgacggac3420 cggcacgaaacgctccgcaccaccctgcggccccccgcggacgacggctcggcgggcgca3480 cccgagcagcacatcgcgccccccggcggccaccggccaccgctgccggtgctcgacgtc3540 
gctcccgaagcgctcgccggggagctgcgcgccgcagcgggccacgtcttcgacctcgcg3600 cgggaccttccggtgcgcgccacgctctaccgcaccggcgagctggagcacgcgctgctg3660 ctgctggtccatcacgtcgccgccgacggcgcgtcgatgggccccci=gatcggcgacctg3720 gccaccgcctacacggcccggctcgcgggacgcgcacccgtcctcccgccacccgaggtg3780 acgtacgccgacttcgcgctgtgggagcgggggagccgggagcgctccgccgcgcaggcc3840 gaggggatcgactactggcgccgggccctggccgggctgccggaccacatccggctcccc3900 gccgaccgcccccgctcgcaggaaccggtccgccgcggcgggatcgcccggttcgaggtg3960 ccgcccgccctgtacgccgggctcgtggagctggcccaaggcgtcggcgccaccccgttc4020 atggtgctccagaccgcgatagccgtcctgctcagccggatgggagccggaaccgacatc4080 cccctgggcacgccggtcgccggccgccaggacgaggcgctcgacggactcgtcggctgc4140 ttcgtcaacaccgtcgtcctgcgcaccgacgtctcgggtgaccccaccacgaccgaactg4200 ctggcccgga cgcgcgacgg cgacctggaa gccctcgccc accaggacgt gcccttcgac 4260 cgggtcgtgg aggcggtcaa ccccgtccgt tccacctcac ggcaccccct cttccaggtc 4320 atgctgatcc tcaacggttc tgatcaggac cggcaccggg cgcggttccc ccgactggcc 4380 gaccgggtcg agacggtgga gccgggggag accaagttcg acctctcctg gcacttcacc 4440 caccgcgacg ggccggacag gtcgctggag ggggccctcg tccacgccgc cgacctgttc 4500 gacgacgaca ccgcgcaccg gctcaccgca cgcctgctcg acgtcctgac cgccatggtc 4560 gacgaccccg cccggcccgt cgggaccatc gacgtcctca gcgccg~~cga gcaccgcctc 4620 gtgcgcgcgt gggggaccgg cacgccccgc ccccctggca gccgtccgga gccggtcgcc 4680 gcgaggatcgccggccaggccgcccgcacaccggacgcccccgcggtgaccgagcccgga4740 cgggtctggagctacgccgaactcgacgcccgcgccgaccgggtggcggcggccctggcc4800 gcacgcggaatcggcgccgaggacctcgtcgccgtactcctgccccgcggcgcggaactg4860 gtcgccaccctgctggggatcctccgggcaggcgccgcctacctcccgctcgacaccgga4920 caccccgccgaccgcaaccgccgggccctctccgactccgccccggcactgctggtgacc4980 gacgccgggcggtcgcgcacgctccgaggagagaccgggtgcgccgcgctggtcctgggt5040 gcggaggacaccgagcgggaactggcggaccgcgcccccctcccgcgggacggcgccggc5100 ctcgtacgcccggtgaccggggacaacgccgcctacacgatcctcacctccggttcgacg5160 ggccgccccaaggccgtcgtcgtgacgcgggacgcgctggacgcgttcgtcgatcgcgcc5220 ctggacacctacggcgacgcgctgggcggagaggctctgctgcactr_cccggtcgccttc5280 
gacctcacggtcgtcaccctgtacgggccgctggccgcgggcgggcgcgtccgggtcggc5340 gacctcgacgagtccgggatcgcccggtgggagaaggagcgcccggccttcgtcaaggcc5400 acgccctcccacctcgcgctgctgacggagttcggcggctccacggcccccggaacggtc5460 gtcctggcgggcgagcaactcatcggcgcacggctggaccgctggcggactcgcctgggc5520 gcctccggcaccaccgtcctcaacagctacgggcccaccgagaccar_cgtcaactgcctg5580 gagcacaggatcgccccggacgccgacgtgccctcgggacccgtgccggtgggccggccg5640 gtgcccggggtacgggtgctgctcctcgacgaccgcctgcgccccgtcgccccgggcgtc5700 acgggcgaactgtatgtctgcgggcccggcgtcgcccgcggataccgtgccaggccggcc5760 gccaccgccgaacggttcgtcgcctgtccgcaaggacggccgggagagcggatgtaccgc5820 accggcgacctgatgcgctggaccgccgacggcgaactggtctacgagggacgggccgac5880 gcccaggtga aggtgcgcgg cttccgggtc gagcccggcg aggtggaggc cgcgctgctc 5940 ggcctccccg gcgtccgcga ggccgccgtc accctcctgg aagggccgga aggaacggaa 6000 gggccggaag gggcgcccgg gagggtggcc gcccccgccc gcctcgtcgg ctacgtcgtc 6060 ggcgccagcgaggaaccggccgccctcctcgaacggctgcgcgtcaggttgcccgaccac6120 atggtgcccgccgccctcgtggacctcgacgccctgccgctcaccccgaacggcaagctc6180 gaccgccgcgccctgcccgcccccgacttcggccgccacgcgggccgccgcgcacccagc6240 gggccggaggaggaagcgctctgcgcgctcttcgccgacgtgctgggagtgcccgaggtc6300 ggcgcggacgacagcttcttcacgctcgggggcgacagcatcgtcagcatccagctcgtc6360 ggccgcgcccggggagccggactgcacctcacggtgcgcgacgtcttcgagcaccccacg6420 gccgccgggctggcgaccgtcgtccggtccgccggaccggacgcggacgcggagcgtccc6480 gcgccgcaggcgctcgccccgagcgggacgctgccctacgtcccggccgcggcgcggctc6540 gtggccagaaccggctcgatgcgcgcccgcggcgccgaccgcttccaccagtcggtggtc6600 ctcaccaccccggcgaacgcggcggccgacgacgtccggcgcgtgctccagacgctgatc6660 gaccatcacggggcactccgcctgcggaccgccgccgaccgggacggatcgccggacggc6720 ctggtgatcaccgaaccggggacggtcgccgccgccggcctgctgcgctgccgcgacgcc6780 gccggactccagggcgtggctctgcgggaggcggtggagcgggaggccgggcacgcacgc6840 gacgccctcgacccgagcaccggggccgtgctgcgcgcggcctggctggaccgggggaag6900 gaccggctcggcctgctggtcctggtggcccaccacctcagcgtggacggcgtctcctgg6960 cgcatcctggcggacgacctccgccaggcatggaccccggccgacgcccccgccacgacg7020 gcgaccctgccgccggagggcgcttccctgcgggagtgggccacccggatcgcccggcgc7080 
gccaccgagaccgccgtgaccggccggcttccccactggcgcgcgaccctggccggcctc7140 gacgacccggacggcgacgtggtcgccctggagacccggctcgaccccgaggccgacacc7200 cacggcacggcgcgcgagcacgcgcacggcctgtcacccgacctgaccgacgcgctcgtg7260 cgcaccgcgcccgcggcgctgcgcgcggacaccggcgaactgctgctggccgcctacgcg7320 ctcgccgcgtcgcggaccctgggcgaccggcccgtgttcgtggcggagaccgagagccac7380 ggccgccaggacgccctcctgcccggtgtcgacctgacccgcaccgtgggctggttcacc7440 tctgtccacccggtgcgcctgcggcccggagccggggccgaacggctgctgaaggagacc7500 aaggagcggctgcgcaccgtgccggaggccggtctcggccacgaccl=gctccgcctcggc7560 ggcgcggcgtcctccccgcgggagagcggccgggaactgccccgcccgcagttcggattc7620 aactacctcggccgggtcgccgtggccgaagccctcaccgacgagggcacggagcccgcc7680 ggggcctgggcgttcgcgggccacagcctcaccgcgcaaccccccgaactgccgctcaca7740 cacgaggtggaactgaccgtcgtcctggaggacggcccgcggggaccggttctccgagcc7800 cgctggaacgcctccgcccgctgcctgaccggggcgcgcctgaccgcgctggcgcaggag7860 2g tgggagaaggcgctgcacgagctgaccgccctggccggcgtcgccgggaccgccggcctg7920 atcccctcggagaccggcgccggcgacctggaccaggacgcgatcgaggagtgcgaggcg7980 gcggcggacttegaggtcgcggacctgctcgccctcgcgcccgcccaggagggcctgctc8040 ttccacagcaccttcgacgacgcggcggaggacgtgtacgtcggccagctggcgctggag8100 ttccacggcgagctgtcgggcgcacggatgcgggaggcggcccagcaggtcctcgaccgg8160 cacgacgtgctgcgcgcggccttcctccagcgccgctccggcgagtggagccaggcgatc8220 gccgccaggacgccggtgccctgggaggagcacgacctgtccgccctggccggggaggag8280 cgggagcggcggctggacgccctgctggccgggcaccgcacccgccc~gttcgacctcgca8340 cggccgccgctggtgcgcttcctgctcgtcacgacggcggcggaccggcacgtgctggcc8400 gtgaccaaccaccacctggtactggacggctggtcgctgccgttggtggtgcgcgacctc8460 atggccctgtacggcacggacggcgccgccctgcccgccgtacgcccgtaccgcgactac8520 ctcgcctggctcgccggccaggacgcggacgcggcccacgccgcctgggcacaggcgctc8580 gccgggctccagccgtcactgatggccccggacgccccccgcgacgqcgcggccccgctc8640 gcgcaccaccgcaccatggaccccgacgtcgtctcccgcctcaccgcctggtcccgccgc8700 ctgggcgtcaccctcaactccgtggtggagaccgcgtgggccctcct;cctgggccggctc8760 accggccgcgacgacgtcagtttcggcatcgccgcctccggccgccecaccgatctgccc8820 ggcgcgggggagatcgtcggcctgctgatgaacaccgtgccggtgcgcgtcaccctggac8880 
cccgccgaaccgctggaagcactcgtccggcgcgtgcaaagggaacaggccgccctcctc8940 gaccaccagttcctccccctggcacaggtgcagcagagcctcggagcgggcgacctcttc9000 gacaccacgctcgtcttcgagaactacccgctggcccccgccgacgc~cctcggcgacggc9060 gacgggctccgcctgcagggggcgcgcggccacgacggcaaccacta.ccccctcagcgtc9120 aCCgtCggCCccgcacccgacctccagctcCgCttCatCCaCCgCCCCgaCCtgttCdCa9180 cccccgtgggtggaggacctggcggcgcggttcgagcaggtgctcga.cgccatggccacg9240 tccggcgacaccccggccggccgactggacatcctgctgccgcgcga.acacgacacgctc9300 ctgggcgactgggcgcgcggtgaggcggcgagcgcacgggagtgccccgtcgccctgttc9360 gaggaacaggtcgaccgcacccccgatgtcctcgccctggtcgagggcggtgacggcgcc9420 cggctgagctacgccgagttcgacgcccgcgccaaccgcatggcgcgcttcctcatcgcc9480 cgcgggctcggggccgaggacctggtcggcctggtcttcccccgcggcgccgacctgctc9540 accggtctgtggggggcgctcaaggccggtgcggcctacctgccggtggacgtggactac9600 ccggccgaacggatcgcgctgctcctcggcgacgggaaccccgccctcgtcctcaccacc9660 tccgcccacgcccacctggtgcccgaggcgccggggcggcagatcctctgcgtcgacctg9720 cccggccccg cggacgaact cgcccgcgcc gcggaaggaa gggtgaccga cagcgagctg 9780 ccgcgcccgg tcgggcccga caccctcgcc tacgtcctct acacctccgg ctccaccggc 9840 cgccccaagg gcgtggcggt cggccggggt tcgctggccg cgcacgccgt ccgctcccgc 9900 gaccgctacc cggacgcggc cggggtgtcg ctgctgcact caccggt:cgc gttcgacctg 9960 acggtgaccg ccctgttcac cacactgatc tccggcggca ccctcctcct cgcggaactg 10020 gacgaacacg cccaggactc cggcgtcacc tacgtcaagg gcacgccctc ccacgtcgcc 10080 ctcctgaacg agctgcccgg cgtcctcgac gccaccgcgg agcgccccgg cacgctcgtg 10140 ctcggcggcg agccgctcac cggagagatg ctggagcgct ggcgcgccca ccacccgcag 10200 gcccgggtct tcaacgacta cgggccgtcg gagaccagcg tcaactgctc cgacctgctc 10260 ctcgaacccg gagccgaggt accggagggc ctgctgccga tcggtcgccc gctgcccggc 10320 aaccacatgt tcgtcctcga ccacctgctc cagcccgtac cggtcggcgt cgtcggagag 10380 atctacgtct ccggcgtcgg cgtggcccgc ggctaccacg gccggcc:cgg cctgaccgcc 10440 gagcgcttcc tgccctgccc ctacgacgca ccgggcgccc ggatgtaccg caccggggac 10500 ctggggcget ggcggcccga cgggatcatg gagtgcctgg gccgaaccga cgaccaggtc 10560 aaggtgcggg gcttccgggt ggagctgggc gaggtggagg ccgcgct:cgc cgcccgctcc 10620 
gacgtcgccc gcgccaccgt cgtcgtgcgc gaggacgagc cgggggacag gcggctgacg 10680 ggctacgtgg tccccgaagg gggaccggac gcggacttcg acccggcggc cgcgctgcgc 10740 gacctggcgg ccgccctgcc gccgtacatg gtcccggccg cgatcgtggt cctctccgaa 10800 ctgccgcgta ccgagaacgg caagctcgac cgcagggcgc tgcccgcgcc cgactacggc 10860 accgcctccg tcggacgtgc gccgcgcacc gccctggaga ccgacct:gtg cgccctgttc 10920 gccgacgtgc tcggcgtgcc cggcatcacc ctcgacgacg acttctt:cgc cctgggcggc 10980 cactccctgc tcgccgtccg gctcgccggc cgcatccggg ccgagctcgg actgcggctc 11040 gacatccgga cgatcttcga ccaccgcacg gtcgcggacc tcctcgccga tccgaatcct 11100 Information for SEQ ID NO: 6 Length: 37360 Type: DNA
Organism: Streptomyces fradiae Sequence: 6 gcccagggtc ccggcgcggc ccgcggcgac ccccgacccg cgggcgctgc ccgagcggct 60 gcccctgtcg cccgcccagc gcaggctgtg gttcctcaac cgctacgaca gggaggccgg 120 cggctaccac atcagcgtcg cgctgcggct caccggcgat ctcgacc~tcg acgccctcca 180 cgcggcactg ggcgacctga ccgcccggca cgagagcctg cgcaccgtct tccgcgagga 240 cgaacaggggccgcaccaggtcgtcctggacccgggggccccgcccgcacccgccgtcgt300 cccggctgccgcccaccgcatcgacgccctggtgcgcgaagccgtccgccgccccttcga360 cctggccgacgacatcccgctgcgccacaccctcttcacgctcccggacggcgaacacgt420 cctgctcctggtcatccaccacatcgccgccgacggctggtcgatggggccgctggcacg480 ggacctggccgccgcctaccgcgcccgcgccgccggccgcgcgcccgactggcccgcccc540 ggccgcccgccccgccgcacacccgcccggacagcacggcgacgacgtggacgacacggt600 ggaccgccgcctcgcccactgggccgaggaactgcgcggactgcccgacgaactcgcgct660 gccctacgaccggccgcgccccacgacaccccccggctacgccgagcgggtccccttccg720 cgtcgacgccgcgctgtaccgggacgtgcgggcgctggcggcccgccaccgggccacccc780 gttcatggtcctccacgccgccctggccgccctgtggcaccggctcggcgccggccccga840 catccccgtgggcaccccgtccgccggccgcgaccggcccgagaccgccgacctcgtcgg900 cttcctggtcaacaccctggtcctgcgcaccgacacctcgggcgacccggccttcgccga960 actgctcgaccgggtgcgcgagaccgacctgcgggcctacgcccaccaggacgtgccctt1020 cgagcggctggtggaggcggtcaaccccgcccgctcgcccagcaggcacccgctcgtcca1080 gacgatgctcaccttcgacaacgccgcccacggagcgctcgatcacctcctggacctgcc1140 gggcgtgcgcgcggaactgctgccgaccgccgagggcaccgcccacaccgacatcgaact1200 gaccttcaccgagaccacggcggacaccgacggcgacggactcgacgcgtccctgcgcta1260 ccgccccgacctgttcgaccgcacgaccgcgcgggctctggcggagcggttcatggcgct1320 gctccgcaccgtgacgcgcgagcccgccctgcggctcgggcagctcgacgtcaccaccgc1380 cggggaacgccggcggctggccgacgcggacgccgcggcacgggcgaggacggccgccac1440 cgccgtcgccgtcctgcccgccctcttcgccgcctccgcccatcgcacgcccgccgcccc1500 cgccctcaccgacggcccggcgaccctggactacgcggaactcgacgcccgctccaaccg1560 tctcgcccgggccctgctcggactcggcgtggggccggaggacttcgtcgccctggcggt1620 gccccgctcggcggacctggtggtggccgtcctcgcggtgctgaagt:cgggcgccgccta1680 cctcgccgtcgaccccgaccacccggccgagcgcacctcgtacatcctccacgactgccg1740 
gcccgtcgccgtcctctccacgaccgccgtccgcgagaccctgcacggcacggtgggcga1800 ggcggtcggcgaggtcccgtggctgctgctcgacgagcccgccaccggcggcgcgacggc1860 cggccactcggccgcaccggtcaccgacgccgaccgccggtcgcccca tccccgacca1920 gc cccggcctacaccatctacacctccggctcgaccggacggcccaagggcgtcgtcgtcag1980 tcacgccaacgtctcacgactgctgaccgcctgccgcgcggccgtggacttcgggcccga2040 cgacgtctggacgctcttccactccagcgccttcgacttctcggtgtgggagatgtgggg2100 gccgctggcgcacggcggccggctggtcgtcgtcccgcacgacgtggccagatcacccgg2160 cgacctcctggacctgctgggccgcgagcgcgtcacggtgctcagccagacgccctccgc2220 cttcctccagctcctgcgggcggagtccgacctcggcgtccccccgaggaccaccgcggc2280 gctgcggtacgtcgtcttcggcggagaagcgctggacaccgcccaactcgccccctggcg2340 gggccgcccggtccgcctggtcaacatgtacgggatcaccgagacgaccgtccacgtcac2400 ccacctggagctggacgacgccgccgtggaccgcggcggcagcccgatcggcacacccct2460 gaacgacctgcgcgcccacgtgctcgaccaggggctgcttcccgtg<:cggtgggcgtcgt2520 gggcgagctgtacgtcgccggccccggcctggcccgcggctaccgccgccgccccggcct2580 gagcgccacccgcttcgtcgccgacccgttcgacaccggcggccggatgtaccggaccgg2640 cgacctcgtccggcgcacccaggacggcggcctccactacgtcggccggtccgactccca2700 ggtgaaactgcgcggctaccgcatcgagcccggcgagatcgaggccgccgcccgccgcca2760 cccggacgtcgcccaggcggccaccgccgtgcacggcgaaggaccgc;aggaccggtacct2820 ggtctgctacgtggtgccggcggccgacaccgaccccgacccgcaccaggtgcgcgccca2880 cctggccgacgccctgcccggctatatggtccccgccgccgtggtgc;cgctgaccgccct2940 gccgctgacc cccaacggca agctggaccg agcggcgctg cccgcccccg accgggcggc 3000 gtgggccacc ggcggcgccc cgaccggacc gcgcgaggaa gcgctctgcg ccgccttcgc 3060 cgacgtcctc ggcgtccagg aggtcagccg cgacgccgac ttcttcgccc tgggaggcca 3120 ttccctctcg gcggtccggc tcatcagccg gatcaggtcg gcgctcggag tggagatcgg 3180 catccgcacg ctcttcgagg cgcccacgcc cgccgcgctg tcccggcgcc tcgacaccgc 3240 cgggaccgga cggccccgcc tcctgccgcg ccgccgaccg gaccgcgtcc cgctctcctc 3300 cgcccagcgc aggctgtggt tcctcggaga actggaagga ccgagcgcca cctacaacat 3360 cccgctcgcc ctgcgcctgc gcggccgtct cgacgtcgac gccctgcgca ccgccctggc 3420 cgacgtggtg ggccggcacg aggccctgcg caccgtcttc ccgtccgagg acggcgcccc 3480 ctaccagcag gtggtcgcgg ccgaacgggc cgcgcccgcc ctcgacgtcg 
tggacgtcac 3540 cgagaaggag ctgcccgccg ccctcgccga ggcccgcgca cacgccttca ccctcaccga 3600 ggaccttccg ctgcgggccg tactgctgcg gaccggcccc gccgaccacg tgctctccct 3660 cgtcctccac cacatcgcgg gcgacggctg gtcgctggcc ccgctcgccc gcgacctcag 3720 caccgcctac gccgcacgtc gggagggccg cgccccgcag tggcggcccc tgccggtgca 3780 gtacgccgac cacaccctct ggaaagagga gttgctcggc gcggcggacg accccgagag 3840 cctcctcgcc cgccaactcg ccttctggcg cgaggcgctg gagggcgcgc cggaacagat 3900 cgagctaccc accgaccggc cgcgccccgc catggagagc caccgcggcg cgatccaccg 3960 cttcaccctc cccgcgtcac tgcgcgaccg gctgcgtgac ctcgcgcacg cgcggcgggc 4020 caccctcttc atggccctcc aggccggact cgccgcactg ttcgccaccc tgggggccgg 4080 ccgggacatc gtcctcggca cgcctgtcgc cggccgcgcc gacgaggcgg ccgacgacct 4140 cgtcggcttc ttcgtcaaca ccctggcgct ccgcaccgac ctcggcggcg accccacctt 4200 cgaggaactg ctcgaccgcg tcagggaagc cgacctgtcc gccttcgccc accaggacat 4260 cccgttcgag caactggtgg aggcgctcaa ccccacccgc tccctctcca ggcaccccgt 4320 cttccaggtg ctgctggccc tccagaacaa cgagcgcggc gaggccgtca tgccgggcct 4380 ggaggtcacc gtcgaacgcc ccgcccaggc ggcggccaag tacgacctct tcgtcaacct 4440 cgtggagtcc cggaacgagg aggacggaac gaccgccgtc gagggagcgg tcgagtacgc 4500 caccgacctc ttcgacgccc gtaccgtcgc ccggctcacc gagcgctacc acgacctgct 4560 cctggccgcc gtcgaggagc ccacgacacg gctcagccgg atgcccatgc tcgacacggc 4620 ggaacgcgac agactcacgg ccgaatgggg cgccgccgcc gcgggcccgg ccgaggacct 4680 ggtcgccctc ttccgtgccc gcgccgccga gacacccggc gcggtggcgg tccgcggcgc 4740 cggggacagc ctcacctacg cccagctcga cgagcgggcc ggacggatcg cggcggccct 4800 cgcccggcac ggcgccggcc ccgagagcag ggtcgcggtg tgtctgccgc gcaccgccga 4860 cctggtggcc gcgctgctcg gcgtcctacg ggccggcgcc gcctacgtac cgctcgaccc 4920 ggagtacccg gacgagcgcg tcgccgcgat cctggccgac acccgcccgg tggcgctgct 4980 caccacggcg gactgccgcc ccgcgatcac cggggccgcg accgccgccg gcggagccgt 5040 cctcctcgcg gccgacgccg cacacggcgc gggccccgtg cccgagcccc ccgccccgct 5100 gcccgaccag gccgcgtacg tcctgcacac ctcgggctcc accggacgcc ccaagggcgt 5160 cgtcgtcagc cggggcaacc tcgccaacct cctggccgac atgcgggacc ggctgcgccc 
5220 caccgccgac gaccggctgg tcgccgtcac cacggtcagc ttcgacatcg ccgcgctgga 5280 actcttcctc ccgctggtca ccggcgccgg actggtcctg gccgaccgcg gcgccgcacg 5340 ggcccccgag gaactggccg ccctgctcac cgcgagcggt gccaccctcc tccaggccac 5400 cccgaccacc tggcagttgc tggccgagac cgcccccgac gccctgcgcg ggctgcgcaa 5460 gctggtcggc ggcgaagccc tccccgcctc cctggcctcc cgactgcgcg agctgggcgg 5520 cgaactcgtc aacgtctacg ggcccaccga gaccaccatc tggtcgaccg ccgcccacct 5580 cgaccgggtc accggcagcg ccccgcccat cggccgggcg ctgcgcggca cccgcgccta 5640 cgtgctggac gagtggctga acccgcgccc cgagaacgtc cccggcgagc tgtacctggc 5700 cggcgccggcgtggcccgcggctacctgggacgcggtggcctgaccgcggagcgcttcac5760 cgccgaccccttcggcgcgcccggcagccgcatgtaccgcacgggcgacctggtccgccg5820 ccgcgcggacggggagctggaattcctcggacgcaccgaccaccaggtcaaggtccgggg5880 cttccgcatcgaactgggcgagatcgagacggcactcggcgcgcacccgcacgtcgccgg5940 ggcggtcgtggtcgcccgcgcggcgtccggcgcggccctcgtgccggacgcgccggcccc6000 acggcgactggtggcctacgtggtccccgagccccaccgcgccgcc~~ccgacgacggccg6060 ggagcagaaccggctcgacgagtggcgggaggcctacgacaccctctacggcagctccgc6120 acccgcccccctcggccaggacttcggcatctggcgcagcagtcacgacgggcagcccat6180 ccccctggacgagatgcaccagtggcgggcggccacggtggaccgcatccgggcgctgcg6240 gcccacgcgggtgctggagatcggagtcggcaccgggctgctgctctcggagctggcgga6300 ggactgcaccgcctaccacggcaccgacctgtccgcgcgggcgatcgagacgctgcgcgc6360 acaggtcgacgccgaacccgcgctgaaggagagggtcgagctgcacgtccgcccggccca6420 cgacttcgacggcctgcgccggggcttctacgacaccatcgtgctcaactccgtcgtcca6480 gtacttccccgacgccgaccacctcacccgcgtactgcgcggcgcgctcgacctgctcgc6540 ccccggcgggcggctcttcgtcggcgacgtccgcagcctggcactgctgcgcgccttccg6600 cgcctcggtggagaccggcaacagcgcggtctccgaaactcccgccgccgtacttgccgc6660 cgccgaccgcaggacggccgcggagaacgaactcgtcatcgcccccgactacttcgcgcg6720 gctgcggcgggaggcccgcgaaccgctcctgctggacgtgcgcatccggcgcggacggcc6780 gtacaacgagctgacgcgctaccgctacgacgtcctgctggtcaaacaggagaccggagc6840 cgcgccctccgccctgcccccggccaccgaactgcgctggacgccggagaccggcgatgc6900 cgggcggctggccgagatctgcgcggcgcaccccggcgcgctgcgcgtcaccgcgatccc6960 
caacgcccgcgtgcggcgcgagaccaccgccctcgccgccctggaggacgggcgaccggt7020 caccgaggtgcgccggctgctggagcaacccggtgacggagtcgatccggaggacctgta7080 cgacgccgcgaccgccgccggacgcaccgcctgggtgacctggtcggccgacggaccgcc7140 ggacaccgtggacctcgtgctggccccggcgggcggggacggcgtgccgccggtggcgcc7200 gccggccgagctgtggcccggcgcgccggccgctgaccggccggagacgaacgacccgac7260 cgccgggtcgcaccaccgcgaactggccgcccggctccgctcccacctggccgaacggct7320 gccggactacatggtcccctcggccgtcgtcgtcctcgacgccctcc:cgctgaccgccaa7380 cgggaaggtggaccgcaacgcgctgcccgaccccgacccggcgggcacggacgccggccg7440 cccgccgcgcacgccccgggaggaactgctctgcacgctcttcgccgacctgctgggcct7500 gggccgggtcggagtccaggacagcttcttcggcctcggcggcgacagcgtcctgtccgt7560 ccgcctcgtc agccgcgccc gcgcacacgg gctgccgctg accacccgcg acgtcttcga 7620 gcaccacacc gccgccgcgc tggcggcggc cctggacggc agggaaccgg agagcgaacc 7680 ggacggcggc ccgccgggcc ccgacgccac cgcggcgcgg cccatcaccc tcgacgaact 7740 cgccgagctc gaggccgagt tcggcacgga ctgggaggag acacagtgaa cggtccgcag 7800 cgcatggtcg aggaggtcct ggcggtcacc ccgctccagg aggggctgct cttccacgcc 7860 gtcttcgacg agaacgtccc cgacgcctac gtcagccggc tggtcctggc cctttccgga 7920 gagctggacg ccgaccggct gcgacaggcc gcccaggcgc tggtggcacg ccacccggcg 7980 ctgcgctcgg ccttccgcca gcggcgctcg ggggagtggt tccaactggt cgccacccgc 8040 cccgcggtgc cctggcagga gctcgacctg cggccgtcgg ggagcccggc ggaggcggac 8100 aagcacctgg aggcgctgct ggacgagcac caccgcaccg ggttcgacct cggccggccg 8160 cccctgctgc gcttcctgct ggccaggacc ggcgacgacc accaccggtt ggccgtgacc 8220 tatcaccacc tcgtcctcga cggctggtcc atgcccatcc tgatgcggga actggccgtg 8280 ctgtacggca gcggcggcga cccgtccgcc ctcccgcccg tccgcccgca ccgcgaccac 8340 ctcgactggc tggcccgccg cccgtccgag cggagcgccc gcgcctggcg gcaggcgctg 8400 gcgggactgc ccggccccac gctgatcgcc ccggacgccg accgcaacgg gccgctcccc 8460 gggtcggtgt ggacccggct cggcgagcgg gacacccatg ccctcggggc gtgggcgcgg 8520 gcccgcggcg tgacggtgaa ctcggcggtg caggccgcct gggccaccgt gatcggccgc 8580 ctcaccggcc gcgacgacgt cgtcttcggc acgacggtct cggggcggcc gccggatctg 8640 cccggcagcg aggacatggt cgggttcttc atcaacaccg tgccgacgcg cgtgcggatg 8700 
aggccggccg agccgatcgg cgacctcgtc gtgcggatcc agcgcgagca gaccgccctc 8760 atggagcacc agcacgtccg gctctccgac atccagcgct ggtccgaccg gaccgtgctc 8820 ttcgacacct ccaccgcgtt cgaaaactac cccgccgacg acctgtccgc cgtcagctcc 8880 gccgggcacg cgggactgcg tatcgaggga ggctccggcc gcaccaccaa ccacttcccg 8940 ctctccctct acgcgctccc cggcccggcg ctccgcctgc gcctggacca ccgccccgac 9000 gccgtggacg acgtcaccac acgccacgcg gcggatctgc tggaacgtgc tctgaccgcc 9060 gtccacagcg ccccggccac cccgaccgcc gcgctcgccg ccacccccgc gacggcacgc 9120 gcggccgcac cccgcgccgc cgggccgggc gccccggcca cgatcgtgga cgcgttcgag 9180 gcgcgtgtgc gggcgacccc cgaggccccc gccgtcctcg ccggcggcga ggagctgacc 9240 tacgccgaac tcgacgcccg ggcgaaccgg ctggcgcgcc tgctgctgga gcgaggggtc 9300 ggacccgaga gccgggtcgc cctcaccgtc tcccggaacg cctggctgcc cgtcgccgtg 9360 ctcggcatcc tcaaggcggg cggctgctac gtccccgtgg gcgccacgct gccgcgggag 9420 cgcgccgccc gcatcctccg cgagaccgca ccggtctgtc tgctcaccga ccccgacgcc 9480 gaggccgccc ggacccgccg caccgccccc acgggagacg accgggacga gaacgcgccg 9540 ggcggcgtcg agcgcgtcgt gctgaccggc gccctcctgg ccgcgttcga cccggccccg 9600 ccgaccgacg ccgaacgggc cggacccctg ctccccggcc atctcgccta cctcctccac 9660 acctccggct ccagcggccg gcccaaaggg gtcgccgtcg aacacgccca ggtgaccgcc 9720 ctgctgtcct gggccggcac cggcgtcgga gccgaccgtc tgcaccggac cgtggcctcc 9780 acctcggaga gcttcgacgt gtcggtcttc gaCaCCCtCg tCCCgCtgCt caccggcggc 9840 cgcatcgaga tcgtggagaa caccctggcc gtcgccgacc ggaccggcgg cgaaccctcc 9900 ctcctgaacg ccgtcccctc ggccctgcag gcgctgctgg agcgcggcga gccgctcgcc 9960 gtccacacct tcctctgcgc cggcgaaccc ttccccgccc cactggcccg cagcctgcgc 10020 gccgccttcc cgcgggcgcg cgtggccaac ctctacggac cgaccgagac gaccgtcttc 10080 gtcaccgccc acttcctgga cggaaccgac gacggcgcgc cccccgtcgg ccgcccgctg 10140 cccggtgtgc gcgtccatat cctcgacccc tggctccgtc ccgtgccgga cggcgtcgtc 10200 ggggagctgt acctcgccgg ggaacacgtc acccgcggct actggcagcg cccggcgacg 10260 acggccgaac gctacgtcgc cgacatcttc ggcgcgcccg gcgcccgcat gtaccgcagc 10320 ggcgacctcg gacggctccg ccccgacggg gagatcgacc tggtcggccg ggcggacgac 7.0380 
caggtaaagg tgcgcggcca ccgggtcgag ctgggagagg tggaggccgc cctggcctcc 10440 cacccggacg tcctgcgggc cgcagccgcc gtgcacgacg gcaaacccgc cggaccgcgc 10500 ctggtgggct acgtcgtgcc ccgcgggccg gcgcccgaca ccgccgccgt cctggaccac 10560 gtgcgccgcg aggtgccccc ttacatggtg ccctcggcgc tcgtggtgct ggacgagctg 10620 ccgctgaccg tcaacggcaa gcgggaccgc gccgcactgc cgcccccgcc cgaccggagc 10680 gacaccacgc gggcgcgcgc cccccgaggc ccgcacgaga cgatcctgct cggactgttc 10740 gccgaggtgc tcggggtacg cccggtcggc atcgacgacg acttctt:cgc cctgggcggc 10800 cactcgctgc tcgccacccg cctggtcagc cgggtgcgca ccaccct:cgg agccgaactg 10860 gcggtgcggg acctcttcga acaccccacg gtggccggtc tgtacgccag gatcgcccgc 10920 gccggggcgg cccggccgcc ggtctcccgc gtccacgcgc ggcccgaiccg cgtccccctc 10980 tcgttcgccc agcggcggtt gtggttcctc caccgcctcc agggccacag cgccgcctac 11040 cacgtcccgc tcgcgctccg cctcaccgga cgcctcgacc cggccgcgct gcgcggcgcg 11100 atcgccgaca cggtcgcccg gcacggaagc ctgcgcaccg tcttccacga ggacgccgag 11160 ggcgtccgcc agatcgtcca ggacgcctcg gccgccgccc ggctgatcac cctgatcccg 11220 gagcccgtcg aggacccgct gcgggcggcg gaggaggcgg tggcagaacc cttcgacctg 11280 acggccggac cgcccctgcg gtgcaggctg ttcacccggt ccgcggaccc ggcggacccg 11340 cgcgcgggcg ccggccagga gccgcaagaa cacctgttcc tcctggtggt gcaccacatc 11400 gcggccgacg gctggtcgct gcggatcatc gcccgggacg tggccgccgc gtacgccgcc 11460 cgcgtgcgcg gtgaggactt cgcgcccgcc CCgCCCCCCg tCgaCtaCgt cgaccacacc 11520 ctctggcagc accgggtgct cggcgacccc gacgcggacg gcggccccga cacggagggc 11580 ggccacgcca cggagggcgg cccgctcgac acccagctcg cgcactggcg gcggcggctg 11640 gccggcctgc cgcaggagat cgcgctgccg gccgaccggc agcgcccggc cgcctcctcc 11700 caccggggcg cggacgtgga cttcaccgtc cccgccgccg cggccgaacg gatcaggcaa 11760 ctggcgggaa ccaccggcac cacgcccttc atggtcctcc aggccgcact ggcggtcctg 11820 ctgcaccgca tgggggcggg gaccgacatc ccgctgggca ccccggtcgc cgggcgcacc 11880 gacagcgcgg tcgagggagt cgtcggactc ttcgtcaaca ccctggtcct gcgcaccgac 11940 ctgagcggct cgcccacctt cgcccagctc ctgggccggg tccgggccac cgccctggac 12000 gcctacgccc accaggacgt gccgttcgaa cggctggtcg 
aggtgctcgc ccccgagcgc 12060 tccctggccc gccaccccct cttccaggtc tccctcgtcc tgcagaacct cgacgaggcg 12120 gcggcgccgg tggacggact gcccgggctg cgcgccgaaa cggtccgcac ccggcgcgac 12180 ggcgcgaagg tcgacctgtc cttcgtgctg gctcccggcg gaccggaagg cggggacatg 12240 cccggagtcc tcacctacag cgccgacctc ttcgaccacg cgaccgr_gag agggctcgtg 12300 gaccggctgc tgcgggtgct ggaccaggtg ctcgccgccc ccgccacgcc tgtggggcgg 12360 gtggacgtgc tcctgcccgg cgaggcgcgg cgcgagctgg agcacagccg cggaccgggc 12420 gcggcgggag acggggacga accgctggca cgcttcgaga agtgggcggc gaccaccccc 12480 gacgcccccg ccctgcggtg ggacggcggc cgtctgacct acgccgagct ggaccggaag 12540 gcggacgcgg tggcccgcgc gctcgtcggg cgctccctcg ggcccgagga cgtggtcgcg 12600 gtcgtcgctc cgcgcgaccc ggacgtggtg gccgcactcc tcggggt:gct caggtgcggc 12660 gccgcgtacc tcccgatcga cgaggcatgg ccgcccgcac ggatccggcg gacgaccacc 12720 gacgccggcg cgcgcctgct cctggcgccg ggcgacaccg acgccgcccg gaccgccttc 12780 ggccccgcct gcggcccgga caccgacatc ctcggcctcg aggacccggc cttccgggcc 12840 accggcggtc cggcccttcc ggccgggcgg aaccacccgc gctcgct:ggc gtacgtcctc 12900 tatacctccg gctccaccgg gcgccccaag ggcgtgggcg tggagcgccg ggcactcgcg 12960 cactacgtgg aaggggcagt ccaccgctac ccggacgcgg cggcgacgac cctgctccac 13020 tccccgctga ccttcgacct cagcgccacc gccctgttca cccccctcgc ctcgggcggc 13080 tgcgtcgtcc tgggcgaggt ggaccgtgcg gcggaggccc acccggtgga cttcgtcaag 13140 gcgaccccgt cccacctgcc cctgctggaa cggcgtcccg gactgctcgg ggagaacggc 13200 accctcgtcc tgggcgggga agccctcgac gggcgggccc tgcgcgcctg gcgggccgcc 13260 cacccgcacg ccgaggtcgt caacgcctac ggccccacgg agctgaccgt caactgcgcc 13320 gagcaccgca tcgccgccgg cgaaccggtg ccggacgggc cggtaccgat cggccgcccg 13380 ttcgccggcg tccgcgcgat ggtgctcgac acggcacttg cccccgcacc cccgggcgtg 13440 gccggggagc tgtacgtcac ggggcccgga gtggcccgcg gctacctggg gcagcgcgcc 13500 ctgaccgccg agcggttcgt ggcctgcccg ttcggggagc cgggggagcg gatgtaccgc 13560 accggcgacc tcgtccgccg ccttcccggc ggcgaactgg agtacgtggg ccgaacggac 13620 gagcaggtga agctgcgggg cttccggata gaactgcccg aggtggcgcg caccctggcc 13680 gccgacgagt cggtcgcgcg 
cgcggtcgtc gtcgtacggg aggaccgtcc gggcgaccgg 13740 cggctgaccg gctacgtggt cccggcggcg ggagtccgcc cgcacgagga cgaactgcgc 13800 ggcgcggtgg cccgcacgct gcccgactac atggtgccct ccgccgtcgt cgtectcgac 13860 gaactgccca ccacgcccca cggaaaactc gaccggcgcg cgctccccgc cccggcacac 13920 cgctcgcggg gcggccgccc gccgcgcgac cagcgcgagc gggacctgtg ccggatctac 13980 gccgacgtgc tgggcctgcc cgaggtgggc gccgaggacg acttcttcgc cctcggcggc 14040 cactccctgc tcgccacccg gctggtcaac cggatccggg ccgaactcgc cgaagaactc 14100 gacgtacgga ccgtgttcga ggcccgaacg gtcgccgcgc tggcggcccg gctgcggacc 14160 gcccgccctg acacccgccc cgcgttgcga cggatgtcgc ggtcggagga cttgtgatgc 14220 ttcccctctc cctcgcccag cagcggctgt ggttcctcca cacgatggac ggccccagct 14280 ccacctacaa catccccacg gcgttgcgga tgaccggccc gctggacgtc accgcgctgg 14340 gcgaggccct gcgcgacgtc gtacggcgcc acgagacgct tcgcaccgtc ttccccgaca 14400 ccggcgacgg cgcccggcag cacgtcctgc ccgccgacgg gaccgccgtc gagctggccg 14460 tcacccgttc caccgagcac gaactgcccg ccgcgctggc ccacgaggcc ggccacgcct 14520 tcgacctggc ccgcgaagtc ccgatcagag cgaggctgtt cgtgctcggc gagcgggagc 14580 acgtgctctg cctggtgatc catcacatcg ccagcgacgg ctggtcgcgc accccgctcg 14640 cccgcgacct cgccaccgcc tacgccgccc gcggcgccgg gcacgcc:ccg cggtgggagg 14700 aactccccgt ccagtacggc gactacaccc tctggcagcg cgagct<:ctc ggttcgcagg 14760 acgaccccga aagcctgctc agccgccaga cggcgtactg gaagcagcgg ctcgcgggcc 14820 tgccggacgc catcgaactg cccctcgacc gtcctcgccc gccgat<:gcc ggccaccgcg 14880 gcgacaccgt ccccttcacc ctcccgcccg cgacccacga gcgggtcgcc gcgctcgccg 14940 cccgccacgg cgcgaccacc ttcatggtgg tgcaggcggc cctggccggc ctgctgtccc 15000 ggctgggcgc gggcaccgac atccccctgg gcaccccggt ggccggacgc accgacgcgg 15060 cgctggaggg gctgatcggc ttcttcgtca acaccctggt gctgcgcacg gacacctcgg 15120 ggaaccccac cttcgacgaa ctggtcgaac gggcccgcgc ctgcgccctg gacgcctacg 15180 cccaccagga cgtgccgttc gagcgactgg tggagacgct cgcccccgag cgctccctgg 15240 cccgccaccc gctcttccag gtgagcctga gcctccagca cgccaccgac cacacggccc 15300 tcctgaacgg tctggagatc gcccccctgg acaccggatg gcgggcggcc aagttcgacc 15360 
tctCCttCga CCtCCtggag aagcgcggcc ccgacggccg cccggacggc atcgccggca 15420 ccgtcgagta ctccaccgac gtcctcgacg ccgccaccgt ccgcgggctc ggggaacgcc 15480 tcgtccgcct gctggaggcc ggcaccgccg cccccgaggc gcggctgctc tcgatcgacc 15540 tgctctccgc cgaggaacgg cgccgcgtgc tggaggagtt cgccgccgag cccgcagccg 15600 acgagcccgc agccgccgag cccgcggccg acgaggggct ggaggccgtg tgcgacacct 15660 tcgcccgcca ggcggcggcc acccccgagg ccccggccgt cgtcggcggt ccggtcgccc 15720 tcaccttcgc ggaggccgac gcccgcgtct cccgcctggc ccggctgctg atctcccggg 15780 gcgccggccc cgaggtccgc gtcgccgtct gcctggaccg caacgcr_ctg tggccgacga 15840 ccgtgctggc cgtgctgcgc agcggcgccg tccacgtacc gctggaccca cgctccccgc 15900 acgagcggct ggccgccgtc gaacgcgacg tcgcccccct gctcgtcctc gccgagcgcg 15960 ccaccgaggc cgccgtcgcc gacctcgccg ccccggtcct cgtcctggac gacccgagca 16020 ccgaggccgc gatcgacgcc ctggacccgg gcccggtcac cgacgc<:gac cgcaccgcgc 16080 ccctcctgcc cgggcacgcc gcctacgtca tccacacctc gggttccacc ggcaggccca 16140 agggggtcac ggtggaccac cggggcctgt cgcggctgct ccaggcgcac cgccgggtca 16200 ccttctcccg catccgtccc tccgcaggcg gccccggccg cgccgcc cac gtctcctcct 16260 tctccttcga cgcctcgtgg gacccgctgc tcgcgatggt cgccggccac gaactgcaca 16320 tgatcgacga ggacctgcgg ttcgacccgc cgggcgtggt ggcctacttc cgcgaccgcc 16380 gcatcgacta cgtcgacctc acccccacct acttccgcag cctgctcgac gccggactgc 16440 tggaggaagg cttcccctgc ccgtccctcg ttgccctggg cggcgaggcg atggacggcg 16500 aactgtggga gcggctgcgg gcggccgccc cccgcgtgac cgcgatgaac acctacggtc 16560 ccaccgagac cgccgtcgac gccgtggtga ccgtactggg cgacctgccc ccgggcacga 16620 tcggccggcc cgtgccccgc tggcgggcct acgtcctcga cgcgggactg cggccggtcc 16680 cgcccggcgt gctgggcgag ctgtacctcg ccggacccgg agtcgcccgc ggctacctgg 16740 ggcagcacgc cctgaccgcc gagcggttcg tggcctgccc gttcgggaag ccgggggagc 16800 ggatgtaccg caccggcgac ctggcgcggt ggctccccga cggccacctg gtctatgtcg 16860 gacgcggcga cgagcaggtc aagatccgag ggttccgcat cgagcccggg gaggtggagg 16920 ccgcactgcg ggaactggag ggcgtcgcgg ccgccgccgt gaccgtccgt gaggacaccc 16980 ccggaacacg cagactggtg gggtacgtcg tcggtacccc 
cgacgccgac gacgcccggc 17040 tccggcccgc cgaggtgctg gcacgcctgc gcgaccgact gcccgaccac ctggtgccct 17100 cggcgttcgt ccgcctccgt gaactgcccg tcaacaccag cggcaaactg gaccgggccg 17160 cgctcccggc ccccgacccc gcggacttcc ccgccggccg gcgaccgcgc accgccctgg 17220 agcgggaggt gtgcgcgctg ttcgcggagg tcctcggcgc cgggagcgtc ggcatcgacg 17280 acgacttctt cggccggggc ggcgacagca tcctctccat ccaactggtg ggcagcgccc 17340 gccgggcggg cctcacgttc accgtccggc aggtcttcga gctgcgcacc cccgcggccc 17400 tggccgccgc cgcccgcagg accgacgcgg caggcgacga ggaccccgct ctcgccgtcg 17460 gaccgctgcc gctccttccc gtggtcgccg agaccctcgc ggccggcggg ccggtccact 17520 cgtacaacca gtcggtcgtc ctcgcgtccc cgccggacgc cgcacccgac gacgtacgcg 17580 acgcgctcca ggccctcctc gaccggcacg acgcgctgcg cgtccacgcc gccccggcgg 17640 ccggccccgg ccgcctctgg gacctccggg tggaggaggc cggcacggtc gcggccgagc 17700 ggtgcctgcg ccggatcgac gcgaccggca tgtccgacga ggaactggcg cgggcgcagg 17760 ccgccgaggc cgtcacggcg cgcgcctgcc tcgaccccct cgccggggcc ctcgtcagcg 17820 ccgtctggtt cgaccggggc gaccggccgg gccggctcgt gctggtqatc caccacctcg 17880 ccgtcgacgg cgtctcctgg cgcatcctcc tcggcgacct ccgtgaggca tggcgggcgt 17940 tgcgcgccgg ccgccgcccc gaactccccc gtacgggcac ctcgctgcgc acctgggcca 18000 cccggctcac cgaacgggcc accgacccgg ccgtcaccgc ccaactggac cactggacgg 1.8060 ccacgctcgc cgacggcccc gcaccgggca gccggccgct ggaccggacc cgggacaccg 18120 tggccacctc cgccgtcctc agcggcgaac tgcccgcgtc cctcaccacc gacctgctcg 18180 gtccggcccc ggcggccttc cgtgccgggg tgaacgacct gctgctgacc gctttcgccc 18240 tcgccgtcgc ccactggcgg ggcgaggagg acgcaccggt cctggtggac ctggagagcc 18300 acggccggac cgaggaactg gtgccggggg ctgacctgtc ccgcacc:gtc ggctggttca 18360 cctccgtcca cccggtgcgg ctcgccgccg gcagggtcac cgccgc<:gac ctcgccgagc 18420 gcgccccggc cgtcggcgac gcgatcaaac ggatcaagga gcaactgcgc gccgtccccg 18480 acggagggct ggggcacggt ctgctgcgcc acctgaaccc cgacaccgcc ccccgcctcc 18540 gaggcctcgc ccgcgcgcgg ttcggcttca actacctggg ccggttcgcc gccgagcagg 18600 gcgcgggcga ggacagctgg ccgctgctcg gcagcggccc cgcgggccag catccggaca 18660 ccccgctcga ccacgagatc 
gaggtcaacg tcgtcacggc cgagggtccg gacgggcccc 18720 ggctgatcac ccggtggacc tacgccaccg gtctgctcac cgaggaggag gtgcgccgcc 18780 tcacgcggtc ctggtcgctg gcgctgcacg ccgtcgtcgg ccacgccacc gccgagggag 18840 cgggcggcct cagcccctcc gacgtggccg ttcccgacct cggccaggcc gagatcgaac 18900 agctcgaacg ccgcaccggc accgccttgg aggacatcct gccggtcgcc cccctccagg 18960 agggcctgct cttccacagc gtgtacgacc ggcgcgccct ggacgtctac gtcggccagc 19020 tcgccttccg cctggaggga gagatcgacc aggacgccct gcggacggcc gccgccgcgc 19080 tgctcgcccg ccacaccagc ctgcggaccg gcttccacca acgggagtcc ggccagtggg 19140 tgcaggccgt ggcccggtcg gtggagctgc cgtggcagtt ccacgacctg ctcgacccgc 19200 acggcgccgg cggggccgcc ggtgccgcgg acgccgggtc cgggcgacga tgggaggagc 19260 tggccgcggc cgaacgcgtc gagcggttcg acctcacccg ccccccgctc gtccgcttcc 19320 tcctggcccg caccgccccc gagcggtacc agttcgtgat caccacccac cacacgatcg 19380 tcgacggctg gtccatcccc atcctgctgc gcgagctgct cgcgctctac ggcggggacc 19440 cgctgccccc ggcccccggt caccgcctcc acgccgactg gctggccgca cgcgacctgg 19500 tggcggcgcg cgaggcgtgg acgcgggcgc tggcggacac cgaggggccc accctgctcg 19560 cgcccggcgc gccgcgcgtc ggagaagtgc cccggtcggt acggctgaac ctgcccgagg 19620 aggtctccgc acggctgctg acccgcgecc gcgaggccgg ggccaccctc aactcggtcg 19680 tccaggccgt ctgggccctc gtcctcgccc aggagaccgg ccgctcggac gtcacgttcg 19740 gcatcaccgt ctcgggtcgc ccggcggaac tccccggggc cgagaacctg gtcggcatgc 19800 tggtgaacaa ggtcccgctg cgcgtccgtc tccgcccggc cgaacccctc atggaactgg 19860 cccggcggct ggagagggaa cagctggaac tcctggagca ccagcacgtc ccgctcacca 19920 ccctgcaccg ctggagcggc ctgcccgaac tcttcgacac caccatggtg ttcgagaact 19980 acccggcgga ggtcaccgcc cggcaggcgc ccttccgcgc gtcgggcacg gccagttaca 20040 gccgcaacca ctacccgctc acgctggtcg gagccatgcg cgggaccgag ctgaccgtcc 20100 gtgtcgacca ccgccccgac ctcttcgacg aggacttcgc ccgctccctg ggcgagcggg 20160 tgatcgccgc cctcaccgag gccgccgacc accccttcgt ccccgccggc acgctcgacc 20220 tgctcggtgc cgaggagcgc gcccgcctcc tggagtgggg caccggcccc gcaccggagg 20280 acgccccacg cacctatgtc gacctgttcg aggagcaggc cgcccgcacc cccgacgcgc 20340 
cggcggtcat ctcgtccgac ggtgtcctca cctacgccga gctggaccgg caggcgaacg 20400 gcgtcgcccg gtggctggcc ggccgggccg gatccgccgg tggcgccgag gtccacatcg 20460 gtgtgctggc cccacgccgc cccgaagtgc tcgccgtcct gctcggcgtc ctcaagtcgg 20520 gcgccgccta cgtccccctg gacgagcagt ggccggccga acgcct~~cgc acggtcctgg 20580 aggactgccg ccccgcgctc gtgctggccc cgacggccgc caggagcgat gccgcgcggg 20640 agtccggcgc gacggtgctc cccgtcgacc cggccgccct cgccgcacac ggtccccaga 20700 ccccgaccga cgccgagcgg atacgtcccc tgacgcccgg cgcagccgcg tacgccctct 20760 acacctcggg atccaccggc cgccccaagg gcgtggtgat cgaccacagc gccctggccg 20820 cgtacgtcgg cggcgcgcgc cgccgctacc ccgacgcggc cgggacctcg ctggcccaca 20880 cctcgctcgc cttcgacctc accgtcacca ccctcctcac cccgctc acc gcgggcggcg 20940 ccgtgcgcct gggcgaactg gacgagaccg cccgggacgc cggggccacc ctggtcaagg 21000 cgacgccctc gcacctgccc ctgctgagcg agctgcccgg agccctgaac gacgggggca 21060 ccctgatcct cggcggcgag gcgctgaccg gcggccggct gcgcccctgg cgcgaactgc 21120 accccgacgc ccaggtcgtc aacgcctacg gtccgacgga actcacggtc aactgcaccg 21180 agtaccggct gccgaaggga gaaccggtcg gcgaagggcc ggtgcccatc ggccgcccgt 21240 tcgccggggt acgggtccac gtgctcggcc ccggcctgcg cccggtcccc gccgaggtcc 21300 ccggcgagct gtacgtcagc ggcgtcgggg tggcccgggg ctatctgggc cggccggccc 21360 tgaccgccga gcggttcgtg gcctgcccgt tcggggagcc gggggagcgg atgtaccgca 21420 ccggcgacct cgtccgctgg cggagcgacg gccaactgga gtacgtcggc cgaagcgacg 21480 accaggtcaa actgcgcgga ttccgcgtcg agaccgcgga ggtcgcccgc gccctggaga 21540 cctgcccctc cgtcggaagc gcgatggtgg tgctgcgcga ggaccagccg ggcgaccagc 21600 gcctggtcgg ctacctcgta ccggccgccg gaagcggcgc gctcgacaag gaggccgtgt 21660 cggacgcggt ccgggcggtc ctgcccgagt acatggtccc ctcggcactg gtggtgctgg 21720 aagacggacc gccgctgacg gtcaacggca aggtcgaccg gagcgcgctg ccggcgccgg 21780 aggcggagcc ggcccgcagc gcgggccggg cgccgcgcgg gccgcgcgag gagatcctgt 21840 gcgggctctt cgccgacgtg ctgggcgtgc gagcggtcgg cgtggacgac gacttcttcg 21900 ccctgggcgg ccactccctg ctcgccatcg tcgtgatcag ccggatcagg gccctgctcg 21960 acgtggacgt ggccatcgac gcgctcttcg aggcccccac 
ggtggcc:cgg ctggccgccc 22020 acctcgacgg gcccggacgc ggtcacggcg cggtgcgccc ggccgtgcca cgccccggac 22080 gcctcccgct ctcctacgcc cagctccgcc tgtggctcct ccaccagatc gaggggccga 22140 gcgccaccta caccatcccg ctggcgctgc gcctgaccgg tccgctggac gtggcggcgc 22200 tgcgggccgc gctgggggac gtggtcgccc ggcacgagag cctgcgcacc gtcttcgccg 22260 aggacgagca cggcccgcac cagatcgtcc tcgcgcccgg ggacgccgaa cccggcctca 22320 aggcggtccc caccacggag gaccgtctga ggtccgacct ggaggccgag gccgcccgcc 22380 ccttcgacct cggccaggca ccgccggtcc acgcccgcct cttcgtcctc gacgaacgca 22440 cccacgtcct gctgctggcg gtccaccaca tcgcgatgga cggctggtcg gtccgccctc 22500 tggtgcgcga cctggcgtcc gcctacgcgg cccgccgccg aggcgcctcc ctggacctgc 22560 ccgcacttcc cgtgcagtac gccgactaca ccctgtggca gcacgaggag ctgggctccg 22620 aggacgaccc ggacagtccc ctcgccgcgc aactgcggta ctggcgccgg accctggacg 22680 gcctgccgca ggagtccgcg ccggccgccg accggccccg tcccgccacc ccctcgtacc 22740 ggggcggccg cgtcgccctc accgtcccgc cggaactgca cgggcgggtg gtggagttgg 22800 cgcgggagtt ccgggcgacg ccgttcatgg tggtgcacgc ggcgttggcg gcgttgctga 22860 cgcggttggg cgcgggcacg gacgtgccga tcggttcgcc ggtggccggg cgggtcgacg 22920 acgcgctgga ggacctggtg gggttcttcg tcaacacgct ggtgctgcgc acggacacct 22980 cgggcgaccc gaccttcggg gagttgctgg aacgggtgcg ggccaccgac ctgggggcct 23040 acgcccacca ggacctcccc ttcgaacgcc tggtggaagt gctcaatccg gagcgctccc 23100 tcgcccgcca cccgctcttc cagatcctgc tggccttcaa caacggcgcg gcgcccgacg 23160 aaggacccgc cgaccgggcg tcggacgtcc tggtgcggcc ggagacggtg gagatcgcgg 23220 cggccaagtt cgacctgtcg ctgtccttca acgaggaccg ggcggccgac ggcaccgcgg 23280 ccgggatgcg gggcgtgctg gagtacagcg ccgacctgta cgacgagagc acggcccgca 23340 ggatggccga acgctacctc cggctgctcg aagcggcggt cgcggagccc cgcaccccgc 23400 tgagccgcat tcccgtcctg agcgaggccg agctgcacga cgtcctcgtc cggcgcaacg 23460 acactggtcg cacccggccc gactcctccc cactgcgacg gttcgaggcg caggcggcca 23520 cgactccccg ggccacagcc ctggtcgtgg gtgaggagcg gctcgactac gccgaactcg 23580 acgcacgggc cgagcggctc gccaccctgc tgtcccggag caccgccggg cgcggcggac 23640 ccgtcgccgt cgccctgccg 
cgcggtgtca tgcttccggt ggccctgctc gccgtctgga 23700 aggcgggcct gcactacctg ccgctggacc ccgaccaccc gaggag<:cgc ctggcggacg 23760 tcctcgccga ctccgcgccc ggctgcgtca tcacgacgac cgacctc:gcg cgccgcctcc 23820 cgccggtacc cgccccgctg ctcgtcctgg acgatccggc caccgccgca cgcctggccg 23880 ccaccaccgc cacagccctg gccgaggacc cgcgggagca gaacggggag tggggggagg 23940 aactggcgta caccatctac acctccggct ccaccggccg tcccaagggc gtcatggtga 24000 cccggtcggc cgtggcgaac ttcctcgccg acatgaacga acggctggaa ctgggccccg 24060 gcgaccggtt gctggcggtc accacggtct ccttcgacat cgccgtcctc gaactcctcg 24120 ccccgctgct caccggcggc acggtcgtcc tcgccgacgc caccacccag cgcgaccccg 24180 cggccgtgag gtccctctgc gcccgcgagg gcgtgacggt gatccaggcc acccccagct 24240 ggtggcacgc catggccgtg gacggcggcc tcgacctcac ggccctgcgc gtgctggtgg 24300 gcggcgaggc actgccgccc gccctcgccc gcaccctcct ggaacccggc cgcgcgccgc 24360 tgggcgatta cctgctcaac ctgtacggac ccacggagac caccgtctgg tccaccgtcg 24420 cgcggatcac cgccgattcc ttggaggcgc acggcggcgc cgtgcccacg gggacgccga 24480 tcgcccgcac cgccgcctac gtgctcgacg CCgCgCtgCg gCCCgtgCCC gacggagtgc 24540 cgggcgagct gtacctggcc ggcgccgggc tggcccgggg ctatctgggc cggccgggaa 24600 tgaccgccga gcggttcgtg gcctgcccgt tcggggagcc gggggagcgg atgtaccgca 24660 ccggcgacct cgcccgctgg cgggccgacg gcaacctgga acacctgggc aggaccgacg 24720 accaggtcaa ggtccgcggg ttcaggatcg aactgggcga ggtgga<~aga gccctgacgc 24780 aggcccacgg cgtcggccgg gccgccgccg ccgtccaccc cgacgccgcc ggctccgccc 24840 gactggtcgg ctatctggta ccggccggcg gcagcggcgc actcgacgag aaggccgtcg 24900 ccgacgccgt gcgggcggtg ctgcccgcgt acatggtccc ctcggcgctg gtggtgctgg 24960 acggcggcct gccgctgacc gcgaacggca agctggaccg ggccgcgctt cccgcgcccg 25020 aggcgacgac cggccgcggc cccggccggg cgccgcgcgg gccgcgcgag gagatcctgt 25080 gcgggctctt cgccgacgta ctgggcgtgc ccgcggtcgg cgtggacgac gacttcttcg 25140 ccctgggcgg ccactccctg ctcgccaccc ggctcatcgc ccgggtccgc ggcacactcg 25200 gcgtcgaact cggcgtccga gaggtcttcg agacaccgac cgtggccggt ctcgccgccg 25260 cgctctccgc ggcgggcgag gccggacccc ggctgcgccc cgccgacccg cgccccgagc 25320 
gcctgcccct gtcccacgcc cagcgccgcc tgtggttcgt ccggcaactg gaggggccga 25380 gtgccaccta caacgtcccg tgggcgctgc gcctgaccgg tccgctggac gtggcggcgc 25440 tgcgggccgc gctgggggac gtggtcgccc ggcacgagag cctgcgcacc gtcttcgccg 25500 aggacgagca cggcccgcac caggtcgtcc tgtccgccga cggcccggcc ccgctcagcg 25560 ggcccgtccg gaccgacgag gacgcactgc cccgcctgct gcgggaagcg gccgaccacg 25620 ccttccggct ggacgccgaa ccgccgctgc gcgcccacct gttcgccacc gcgccggagg 25680 accacaccct gctcctggtc atgcaccaca tcgccaccga cgcctggtcg cagcggccgt 25740 tgatcgccga tctggccgcg gcctacgccg cccgccacgc cggccgggtc ccgacgctgc 25800 cgccgctgcc ggtcgcctac gccgactacg ccctgtggca gcaggcc:cgc ctgggcgacg 25860 aacgggagaa ggacagcgcg ctgtccgccc aactcgccta ctggcgcgac gcgctggcgg 25920 gctccccgga ggagctcgcg ctgcccgccg aCCggCCCCg gCCCgCCgtC CCCtCgCdCC 25980 ggggggacag cgtgcccctc accgtcccgc cggaactgca cgggcgggtg gtggagttgg 26040 cgcgggagtt ccgggcgacg ccgttcatgg tggtgcacgc ggcgttggcg gcgttgctga 26100 cgcggttggg cgcgggcacg gacgtgccga tcggttcacc ggtggccggg cgggtcgacg 26160 acgcgctgga ggacctggtg gggttcttcg tcaacacgct ggtgctgcgc acggacacct 26220 cgggcgaccc gaccttcggg gagttgctgg aacgcgtacg ggccaccgac ctgggggcct 26280 acgcccacca ggacctcccc ttcgaacgcc tggtggagct ccgcgacccg gaacgctcgc 26340 tggcccgcca cccgctcttc caggtctcgc tgaactacga cacggccgag acggcccgag 26400 cacgcgatgc cgcaccggaa ctggacgggc tgaccgtgag cgggcgaccg ctcggcgtca 26460 ccacgtccaa gttcgacctc accttcgcgc tcaccgagac ccgcgcccac gacggcggcc 26520 ccgccggact gcgcggcgcg ctggagtaca gcaccgacct gttcgaccgt ggcaccgccg 26580 agcgcctggc ggagcggttc gcacgggtcc tccaggccgc ggtggccgcc cccggcacca 26640 ggctcgacca gatcgacgtg ctgctgccgg gcgaacgcgc gctcctggag ggcgagtgga 26700 gcaggcccga gcccggaccc gtcgccccca cggacgacgc ccgcttcccg gacctcttcg 26760 aggcgcaggc cgcccgcacc ccgcacgccc ccgccgtccg cgacggi:gac cgggagctct 26820 cctacgccga gctgaacgac cgggccaacc ggctggcccg gttcctcgcc gctcgcggag 26880 cgggccccga ggacaccgtc gccgtcctgc tgccgcgcgg ccccgagctg atcaccgccc 26940 tggtggccgt ccagaaggcc ggggccgcct acgtccccat 
ggacgccgag ctgcccgccg 27000 agcggatcgc ccacatgctg gagaacgccc gcccggtgct cgtcctcgcc cacaccgcaa 27060 cccaggacgc cctcccggag ggggccggcc ccgtggtccg cctcgacgcc ccggccatcg 27120 aggcggcgct cgccgggctc gacggcggcg actgcaccga cgccgaccgc cgcgcaccgg 27180 ccacgcacca cgacccggcc tacgtcgtct acacctccgg gtccaccggt acgcccaagg 27240 gcgtcgtggt cgaacagcgc tccctcgccg ccttcctggt ccgctcggcc gcccggtacc 27300 gcggagccgc cggaaccgcg ctgctgcacg gctcgccggc cttcgacctc acggtcacca 27360 ccctgttcac cccgctgatc gccggaggct gcatcgtggt ggcggacctc gacgctccgg 27420 agcgggacgc cccggcccgc cccgacctgc tcaaggtcac tccctcr_cac ctcgccctcc 27480 tggacacgat cgcctcctgg gcgacacccg cggccgacct ggtcgtcggg ggcgagcaac 27540 tgaccgcgtc ccgtctcgcc cggctgcgcc gggcacaccc ggacatgcgc gtcttcaacg 27600 actacggtcc caccgaagcc accgtgagct gcgccgactt cgtcctggaa ccgggcgacg 27660 caccgcccac cgacaccgtg ccgatcggac gccccctggc gggacaccgg ctgttcgtcc 27720 tggacgatcg cctgcgcccg gtgcccgcca acgtccccgg cgagctgtac gtcagcggcg 27780 tcggggtggc ccggggctat ctgggccggc cgggaatgac cgccgagcgg ttcgtggcct 27840 gcccgttcgg ggagccgggg gagcggatgt accgcaccgg cgacctcgcc cgccggcggg 27900 ccgacggaaa cctggagtac ctgggccgcc gcgacggcca ggtgaaggtg cgcggattcc 27960 gcgtcgagac gggcgagatc gagaccgccc tgctcgaccg cccggagatc ggccaggccg 28020 ccgtcgtcct gcgcggcgaa cgcctcctcg cctacgtcgc ggccccgccg gagcggttcg 28080 acccggacgc gctccgccag gcgctcgcgt cccggctgcc ccggtacatg gtccccgccg 28140 cgttcgtccg gctggacgcc ctgccgctgg ctccgggagg caagctcgac caccgggcgc 28200 tgcccgagcc gccggcgccc gccgacgccc cgcacgggag caggccgccg cgcgacgcgt 28260 gggaaggcgt gctgtgcgag gcgttcggcg aggtgctggg gatcgcggag gtcggggccg 28320 acgacgactt cttcgccctc ggcggcgaca gcatcggctc catccggctc gtcggccggg 28380 tgcgcgcggc gggcggccgg atgaccgtcc gcgacatctt cgaacagcgc acgcccgccg 28440 ccctcgccgg CCgCtCgCgC cccggcggtc cggcgaccga ggtactcggc ggtcgcggga 28500 ccgggccggt ggagccgacg ccgatcagct cctggctggc cgagctgggc ggcgcggtcg 28560 acgggtacaa ccagtccgtg ctgctgcgcg tccccgccga ggccgacgcg gccgtcgtga 28620 ccggcgccct ccagacactg 
ctggaccacc acgacgcgct gcggatgcgg gccgaaccgg 28680 aggacggtca ctggcggatg gagatcgccg aggcgggcgc ggtggacgcg gccaccgtgc 28740 tggagcgggt ggacgcggcg ggcgccgatc aaggggagct ggaccgc~ctg gtgcggacgc 28800 actgcgccgc ggcccgtgac cggctcgccc cgcagaaggg ctccgtcctg cgtgccgtct 28860 ggttcgacgg cgggccacgg gagccgggac acctcgcgct cgtcgcccac cacctcgtcg 28920 tggacggagt ctcctggcgc atcctcaccg ccgacctcgg cagcgcgtgg caggccctcg 28980 ccgagggccg ggaaccccac ctcgacccgg tgggcacccc gctgaggatc tgggcccggc 29040 acctggcgga gctggccgcc gacccgcgcc gcgccgagcg gtgcgcccac tgggaggagc 29100 agtcgccgcg gccctgggag accggcggtc tcgaccccgc cctcgacgac cggagcaccg 29160 aggaggccct ttccctgacc ctcccggccg ccgccacccg cgccgtgctc ggcccggtgc 29220 ccgccgcgct cggcgtcggg gtgagcgaag tcctgctcgg aacgttcgcc gccgccgtgc 29280 ggcgccggcg tcccgcggag gccgcggacg gcgtcacggt ggacctggaa ggccacggcc 29340 gggaggagga cgtcgtcccc ggagcggacc tctcccgcac ggtcggctgg ttcaccgccg 29400 cccacccggt ccgggtgccc gccgcgcggc ccgacgagga ccggaccggc gcgctgcggg 29460 cgctggccgc gatcctggac cgggcgcccg acgccggcct cggctacggc ctcctgcgct 29520 acctcaaccc gcgcacccgg caacggctcg ccgccctgcc cgccccgcgc tacggcttca 29580 actacctggg cagattcggt ggttcgggag aggaccggga cgcggacgac cggacgtcga 29640 actggtcgcc ggtggccgcc gggctcgccg gccagcccgc ccagct.gccc ctggcccacg 29700 agatcgaggt caccgcggtc gccgtcgagg ggcccgaggg cccgcgtctg atcgccacct 29760 ggtcctgggc cggccggctc caccgggagc gggacgtccg cgaactggcc gaactctggt 29820 tccgcgagct ggaggaactc gcgtccgccg aaccgccccc ggccggcccc gcgcccctgt 29880 ccgacccgcc ccccctggtc gaactcaccg acaccgaact cgaccagctc gaagcagagt 29940 ggaaggccgg ctgatgcgcg gatccctaca ggacgtcctg cccctctccc cgctgcagga 30000 gggcctcctc ttccacagcg aatacgtcgg ggacgaggcc gtcgacgtct acaccgtcca 30060 gaccgaggtg gagctgcggg gaccgctgga cgtcccggcg ctgcgcgcgg cggccgaggc 30120 cgtgctccgc cgccacgaca acctgcgggc cggcttcgcg acccgcgccc tgaaggaccc 30180 ggtgcagttc gtcccccgcg aggtcgaact cccctgggag gaggccgacc tgcgcgcggc 30240 cgacgacccg gacgcggagg cggcccgccg actggaggaa caccgctggc gccgtttcag 30300 
gccctccaag ccgcccctgg tgcggttcct gctgctgcgg acggcccacg accgccaccg 30360 tttcgctctc accaaccacc acatcctgct ggacggttgg tcgatgccga tgctgctgcg 30420 cgagctcatg ctgctctacc gcaccggggg cgacgcctcc gccctgccgc cggtgcgccg 30480 ctaccgggac tacctggcct ggctgggggg ccgcgaccgc gggaccgcac gggaggcctg 30540 gcgcgccgct ctggcgggtc tggaggcgcc caccctcatc gccccgcggg ccgaccgggc 30600 cgcggaggcg ccgacgtggc tcgacttcac gctgtcggag accgcctccg ccggtctctc 30660 cgcggccgcg cgagccgccg gcctcacgct caacacggtc gtgcaagggc tgtgggccct 30720 gaccctggca cgcaccaccg gcagccagga cgtggtgtac ggggtggtcg tctccggacg 30780 tccgccggag ctggacggcg tcgagtccat gatcggcctg ttcgcgaaca cggtcccgct 30840 gcgggcccgg atgcccgtcg acgaacccct gacggatttc ctgcggcggc tgcagcgcga 30900 gcagagcgcg ctcctggacc atcagcacct acggctggcc gacatc~~agc ggctggccgg 30960 ccagggcgag ttgttcgatt ccgtgatggc gttcgagaac tacccggacg gccccgccga 31020 cgagccctcc ggcgcttccg ccgacacgcc gggacacgtc cgcgtggtgg cctcccggat 31080 gcgcgacgcc atgcactacc cgctcggcct cctggcgtcc cccggaactc ggatgcgctt 31140 ccgcctcggc caccggccca gcgcggtcac gccgcgcctg gccgccgccc tgcgcgaccg 31200 cctgctgcgg ctcgtcgacg ccttcctcgc caccccgcac ctgccgctgg gcaggttcga 31260 cgtcctcgac gacgccgaac gcgccctggt actggacacg ttcaacgaca ccgcgcacga 31320 ggtcgaggac accaccgccg tcgagctgtt cctccggcag gccgcccgca cccccgcccg 31380 gatcgccgtg gagacggccg accgctccgt cgactacgcc cggctc:gccg accgctccgg 31440 ccgcctggcc cgcctgctgg cggagcacgg ggcgcgggcc gagcggttcg tcgccctcgt 31500 gctgccgcgc tcgcccgaac tggtcgaaac cgcgctcgcc gtgtgcrcaga ccggagccgc 31560 ctacgtcccg gtggaccccg cccacccggc cgaccggatg gcccggctgc tgcgggaggc 31620 cgaccccgtc ctcaccgtca ccaccgccga cctggccgac cggctgccgg ccgggctccc 31680 tctgctggtc ctggacggcc cgagcaccgc cgccgccctc caggccctgc ccggcggccc 31740 gctgaccgcg agtgagctcc ccgcgcccgt ggacccccgg aacgccgcct acgcgctcta 31800 cacctccggg tccaccggcc gccccaaggg cgtggtcgcc acccaccgct ccctcgtcgg 31860 ctacctgctg cgcggctcgg cccagtaccc gtccgacgga cgctccctgg tgcactcgcc 31920 ggtctccttc gacctcaccg tcggcgccct gtacgtcccg 
ctgatcagcg gcggcaccgt 31980 gcgcctcgcc tccctggacg acgaaccggt cctgcgcccc ggcgagacgc cccccgactt 32040 cgtgaaggtg acccccagcc acctgcccgt cctcgaaggg ctgccgggcg aggtcagccc 32100 gaccggggcg atcaccttcg gcggcgaaca gctcaccggc cgccacctgc ggcgctggcg 32160 cgccgaccac ccggacgtca ccgtctacaa cgtctacggg cccaccgaga cgaccgtcaa 32220 ctgctccgag caccgcatcg ccccccgtga cccggtcggc gacgggccgg tccccatcgg 32280 acggccgctg tggaacaccc gcctgttcgt cctgggcccc ggcctcgccc cggtgccggt 32340 cggcgtgccg ggcgagctgt acgtcgccgg cgccggcctg acccgcggct acctccgcga 32400 tccgggcagg accgccgagc gcttcgtcgc ctgcccctac gccgccgggc aacggatgta 32460 ccgaaccggc gacctcgtcc gctggaacga ggacgggctg ctggagtacc tgggcagggt 32520 ggacgaccag atcagcctgc gcggcttccg ggtggagccc ggcgaggtgg aggcggcgct 32580 ggcggcccac cccgcggtcc gccgcgccgc ggtggtgctg cgggaggaca cgcccggcga 32640 cgcccggctg gtcgcctacg ccgtccccgc cgagccggaa ggagcg~~gga gcacgccgcc 32700 gtccccgctc cccaccgagc agatcctgga acacctgcgc cggaccctgc cgccctacat 32760 ggtccccgcg cacctcgtgg aactgcccgc gctgcccgtc acgccccacg gcaagatcga 32820 ccgggccgcg ctgccggaac cctccgtcgc cggcgccccg gccggaggag cgccccgctc 32880 cccccgggag gagatcctgt gcggcatctt cgccgaggtg ctgcggcgcc cgcgggtctc 32940 catcgacgac gacttcttcg ccctgggcgg gcactccctg ctggccaccc ggctggccag 33000 cagggtgcgg gcggccctgg acacggagct gccggtgcgc cgcctcttcg aacaccccac 33060 ggtgcgctcc ctgtccgcac tgctggaccc cgacgccggc aggcgccctg cggtgacgcc 33120 cgcacggcga cctgagcacg tcccgctctc cttctcccag cagcggctgt ggatcatgca 33180 ccggctcacc ggccccgacg ccacgtataa catccaccgg gccctgcggc tcgacggcga 33240 cctcgacgtc ccggcgctgg aggccgcgct gcacgacgtg accgaacggc acgagacgct 33300 gcgcaccgtc ttccccgagg gccccgaggg cccgtaccag aaggtcctcc cggcccgacg 33360 ggaggacggg accctcaccg tcctcccggt cgccgaccgg gaggtcgacc gcaccctcgc 33420 cgagctggcg gcccaccgct tcgacctgga gtccgaaccg ccgaagcgcg cctggctcct 33480 ggagagcggt ccgcgcagcc gggtcctcgt cctggtgctc caccacatcg ccagcgacgg 33540 ctggtcgggc aggcggctcc tgcgcgacct gttcaccgcc tacaccgcgc gccgcgcggg 33600 ccgggcgccc caatggcgac 
cgctgccggt gcagtacgcg gactacgccc tgtggcagcg 33660 gcgccacctc ggcgaccccg cggaccccgc cagtcccgcc gccgtccaag gggagtactg 33720 ggagaagcag ttggccggac tccccgagga actgcggctg cccgccgacc ggccgcgccc 33780 ggcgcgcccg acccgcaccg gcggccaggt gtggctgacg ctcccggcga cggcccacgc 33840 cgccgtggcc gagctggcca gaaccagccg ggccagcgtg ttcatggtcg tccaagccgc 33900 cgtggccgcc ttcctcaccc gcatgggcgc cggggaggac atccccatcg gcgccccggt 33960 cgccgggcgc accgacgaag cggtggagga actggtcgga ttcttcgtca gcaccctggt 34020 cctgcggacc gaCaCCtCCg gtgaCCCCtC gttcaccgaa ctcgtcggcc gggtccggga 34080 aaccgcgctg gccgcctacg cccaccagga cctgcccttc gagtacgtgg tggagcggct 34140 cagcccgacc cggtccctcg gccggcaccc cctcttccag gtcgccctgt cctgcaacaa 34200 caccgaggag cagctgggcc gccagggcgc cccgcccccc gggctctccg tcacaccgca 34260 ccaggtggac gccgcccgct cgaagttcga cctgatgttc accttcctgg agaaccacgg 34320 cgaggacggc cagcccacgg gcatcgagac cgccctcgaa tacagcgccg acctgttcga 34380 ccgggagacc gcgcagagcc tcctcgaccg cttcgcccgg atgctggcga tctgggcggc 34440 ggaaccggcc gccgccatcg gcgctcgcga actcctggcg gccgacgagc ggcacacggt 34500 ggtcaccgcg tggaacgcca cccggcgcgc ggacctggtc gcgacactcc cgcggatgtt 34560 cgaggagcag gtcgcccgca ccccgcacgc cacagccctc gaacacgccg gccaccacct 34620 gacgtacgcc gaactcaacg cccgagccaa ccggttggcc agagtgctgg tgcgccgcgg 34680 catccgcccc gaacaccgcg tcgccatcct gatgccgcgc tccgtcgagc agatcaccgc 34740 cctgctggcc atcaccaagg ccggcggcgc cgccgtaccg gtcgatcccg gccaccccgg 34800 acaacgcatc gccttcatgc tgcgcgacag cgcctgcgcc ctgatcctgg cggaccaccc 34860 gcacgcggcg ggacgtgagg agatcgccgg cgtcccggtc ctcgtccccg ccgacgaacc 34920 ggccccggaa cgggccaccg acctcgccga cggcgaccgc aacgcccccc tcaccgccgg 34980 ccacgccgcc tacgtcgtct acacctccgg ttccacgggc cgccccaagg gcgtggtgac 35040 cgaacaccgc ggcctgctgt cactggccac ggcacagcgt gagcga.tacc cggtggggcc 35100 cggcagccgg gtgctgcaac tcgcctcacc gtccttcgac ggcgccgtac tggaactgct 35160 catggccctc accaccggag gaaccctcgt cctgcccgac gggcccctcc tcgccgggca 35220 accgctcgcc gacatgctgg ccgagcaccg catcagccac gccttcatcc ccccggcggt 35280 
gctgagcggc cttccctccg aagggctgga gggcctgcgc tgcctcgtcg tcggcggcga 35340 ggcggtcacc gcgcccctca cggaccgctg ggcgcccggc cgtcgcatgc tcaacatcta 35400 cggccccacc gagaccaccg ccgtcaccct gaccagcgaa gccctgaccc ccggcggccc 35460 accgcccgcc atcggcaccc ccgtacccaa caccagggcc cacgtgctcg acgaccggct 35520 gcgccccgtc ccgcccggcg tgacgggcga gctgtacctg gccggcgcgt cactggcgcg 35580 cggctacggc cgccgcccgg cgctcaccgc cagccgctac gtcggctgcc cgttcggagc 35640 gccgggggag cggatgtacc gcaccggcga cctggcgcgc ctggaccggg agggccgcgt 35700 ccaccacatg ggccgcaccg acgagcagat caagctgcgc ggcttccgcg tcgagcccgg 35760 tgagatccgg gcccggctca ccgagcatcc cgccgtgcgg gaggcggcgg tcgtcctgcg 35820 cgacgacggg ccgggcggac gcgcgctggt ggcctacgcg gtaccggccg acggcccgcc 35880 ccgccccacc gcggcccagc tccgcgcaca cctgaacgcc ctcctcccgc cctacatggt 35940 gcccgccgcc ttcctggtgc tggacgcgct gccgaccacc cccaacggca agctcgaccg 36000 ggaggccctg cccgccccgc aaccgcacgc cgaggagacc ggccgtccgc cgcgcgacga 36060 acgcgaggcc gccctgtgcg aggtgttcgc cgaggtcctg gagcgcacgt cgctcggcgc 36120 cgacgacggc ttcttcgaga acgggggaca ctcgctgctc gccgtccggc tggtcgccag 36180 ggtccgcgag cgcctgggcg tgcccctggc cgcacgggac ctgttcgagg ctcccacccc 36240 ggccgccctg gcggagcgcc tggcccgcgg cgccgaacgc cgcgccccgg cgcccctgct 36300 caccctgcgg ggccgcggcg accggccccc cttgttctgc gtccacccgg ccgtcggcct 36360 gggatgggcg tacgcgagcc tcctgccgtg gctcccggcc gacgtcc ccc tccacgcgct 36420 gcaggcccgc acgcccgcgg acggcgccgg tctgccgggg agcgtcgagg agatggccga 36480 ggactacgtc cggctgatcc gccgcgtccg cccccacggc ccctaccggc tgctcggctg 36540 gtcgctggga gcccacgtgg cccacaccgc ggcggccctg ctggagcgcg acggacagcg 36600 ggtggacctg ctcgccatgc tggacgccta tcctccccac cgcaccgggg accccggcgg 36660 acgggccgag gagtccgagg ccgagatcgt ggcggccaac ctgcgggagt cggggttcgc 36720 gtgggacgag gacgagcagc gcgcgggacg cttcccgctg gagcgct~tcc gcgcccacct 36780 gcgccgggtg gacagctcgc tcggccacct cgacgacgcc gaactgacgg cggccaagga 36840 SO

cgtctacgtc aacaacgtcc ggctcatgcg ctccttcacc cccggccgtg tccggtgcgg 36900 gatcgtcctg atgaccgcgg aacgcacccg cagcctcgat ccggcggcgt gggagccgca 36960 caccgaagga ggcgtcgagg tgcaccggct ggacgcctcc cacatgtcca tgctgaccga 37020 accggcgtcg gtcgcggcag ccggccgtct cctgacccac cgactggagt ccctgcgggg 37080 agccaccacg aagaaacgag aggtatgacg atgaccaacc ccttcgacga caccgagggc 37140 gtcttccacg tcctggtcaa cgacgagaac cagcactcgc tgtggcccca cttcgtcgag 37200 atccccgacg gctggcgggc cgtggtgcgc gagcgtccgc gccaggagtg cctggactac 37260 atcgaggcga actggaccga catgcgcccg cagagcctca tcgacg~ccat ggaggcacac 37320 gagaagtccg agggcgcgat ccggtgaccc aagggcgcgg 37360
Information for SEQ ID NO: 7 Length: 2595 Type: PRT
Organism: Streptomyces fradiae Sequence: 7 Pro Arg Val Pro Ala Arg Pro Ala Ala Thr Pro Asp Pro Arg Ala Leu Pro Glu Arg Leu Pro Leu Ser Pro Ala Gln Arg Arg Leu Trp Phe Leu Asn Arg Tyr Asp Arg Glu Ala Gly Gly Tyr His Ile Ser Val Ala Leu Arg Leu Thr Gly Asp Leu Asp Val Asp Ala Leu His Ala Ala Leu Gly Asp Leu Thr Ala Arg His Glu Ser Leu Arg Thr Val Phe Arg Glu Asp 65 70 75 80 Glu Gln Gly Pro His Gln Val Val Leu Asp Pro Gly Ala Pro Pro Ala Pro Ala Val Val Pro Ala Ala Ala His Arg Ile Asp Ala Leu Val Arg Glu Ala Val Arg Arg Pro Phe Asp Leu Ala Asp Asp Ile Pro Leu Arg His Thr Leu Phe Thr Leu Pro Asp Gly Glu His Val Leu Leu Leu Val Ile His His Ile Ala Ala Asp Gly Trp Ser Met Gly Pro Leu Ala Arg Asp Leu Ala Ala Ala Tyr Arg Ala Arg Ala Ala Gly Arg Ala Pro Asp 165 170 175 Trp Pro Ala Pro Ala Ala Arg Pro Ala Ala His Pro Pro Gly Gln His Gly Asp Asp Val Asp Asp Thr Val Asp Arg Arg Leu Ala His Trp Ala Glu Glu Leu Arg Gly Leu Pro Asp Glu Leu Ala Leu Pro Tyr Asp Arg Pro Arg Pro Thr Thr Pro Pro Gly Tyr Ala Glu Arg Val Pro Phe Arg Val Asp Ala Ala Leu Tyr Arg Asp Val Arg Ala Leu Ala Ala Arg His Arg Ala Thr Pro Phe Met Val Leu His Ala Ala Leu Ala Ala Leu Trp His Arg Leu Gly Ala Gly Pro Asp Ile Pro Val Gly Thr Pro Ser Ala Gly Arg Asp Arg Pro Glu Thr Ala Asp Leu Val Gly Phe Leu Val Asn Thr Leu Val Leu Arg Thr Asp Thr Ser Gly Asp Pro Ala Phe Ala Glu Leu Leu Asp Arg Val Arg Glu Thr Asp Leu Arg Ala Tyr Ala His Gln Asp Val Pro Phe Glu Arg Leu Val Glu Ala Val Asn Pro Ala Arg Ser Pro Ser Arg His Pro Leu Val Gln Thr Met Leu Thr Phe Asp Asn Ala Ala His Gly Ala Leu Asp His Leu Leu Asp Leu Pro Gly Val Arg Ala Glu Leu Leu Pro Thr Ala Glu Gly Thr Ala His Thr Asp Ile Glu Leu Thr Phe Thr Glu Thr Thr Ala Asp Thr Asp Gly Asp Gly Leu Asp Ala Ser Leu Arg Tyr Arg Pro Asp Leu Phe Asp Arg Thr Thr Ala Arg Ala Leu Ala Glu Arg Phe Met Ala Leu Leu Arg Thr Val Thr Arg Glu Pro Ala Leu Arg Leu Gly Gln Leu Asp Val Thr Thr Ala Gly Glu Arg Arg Arg Leu Ala Asp Ala Asp Ala Ala Ala Arg Ala Arg Thr Ala Ala Thr Ala Val
Ala Val Leu Pro Ala Leu Phe Ala Ala Ser Ala His Arg Thr Pro Ala Ala Pro Ala Leu Thr Asp Gly Pro Ala Thr Leu Asp Tyr Ala Glu Leu Asp Ala Arg Ser Asn Arg Leu Ala Arg Ala Leu Leu Gly Leu Gly Val Gly Pro Glu Asp Phe Val Ala Leu Ala Val Pro Arg Ser Ala Asp Leu Val Val Ala Val Leu Ala Val Leu Lys Ser Gly Ala Ala Tyr Leu Ala Val Asp Pro Asp His Pro Ala Glu Arg Thr Ser Tyr Ile Leu His Asp Cys Arg Pro Val Ala Val Leu Ser Thr Thr Ala Val Arg Glu Thr Leu His Gly Thr Val Gly Glu Ala Val Gly Glu Val Pro Trp Leu Leu Leu Asp Glu Pro Ala Thr Gly Gly Ala Thr Ala Gly His Ser Ala Ala Pro Val Thr Asp Ala Asp Arg Arg Ser Pro Leu Leu Pro Asp His Pro Ala Tyr Thr Ile Tyr Thr Ser Gly Ser Thr Gly Arg Pro Lys Gly Val Val Val Ser His Ala Asn Val Ser Arg Leu Leu Thr Ala Cys Arg Ala Ala Val Asp Phe Gly Pro Asp Asp Val Trp Thr Leu Phe His Ser Ser Ala Phe Asp Phe Ser Val Trp Glu Met Trp Gly Pro Leu Ala His Gly Gly Arg Leu Val Val Val Pro His Asp Val Ala Arg Ser Pro Gly Asp Leu Leu Asp Leu Leu Gly Arg Glu Arg Val Thr Val Leu Ser Gln Thr Pro Ser Ala Phe Leu Gln Leu Leu Arg Ala Glu Ser Asp Leu Gly Val Pro Pro Arg Thr Thr Ala Ala Leu Arg Tyr Val Val Phe Gly Gly Glu Ala Leu Asp Thr Ala Gln Leu Ala Pro Trp Arg Gly Arg Pro Val Arg Leu Val Asn Met Tyr Gly Ile Thr Glu Thr Thr Val His Val Thr 785 790 795 800 His Leu Glu Leu Asp Asp Ala Ala Val Asp Arg Gly Gly Ser Pro Ile Gly Thr Pro Leu Asn Asp Leu Arg Ala His Val Leu Asp Gln Gly Leu Leu Pro Val Pro Val Gly Val Val Gly Glu Leu Tyr Val Ala Gly Pro 835 840 845 Gly Leu Ala Arg Gly Tyr Arg Arg Arg Pro Gly Leu Ser Ala Thr Arg Phe Val Ala Asp Pro Phe Asp Thr Gly Gly Arg Met Tyr Arg Thr Gly Asp Leu Val Arg Arg Thr Gln Asp Gly Gly Leu His Tyr Val Gly Arg Ser Asp Ser Gln Val Lys Leu Arg Gly Tyr Arg Ile Glu Pro Gly Glu Ile Glu Ala Ala Ala Arg Arg His Pro Asp Val Ala Gln Ala Ala Thr Ala Val His Gly Glu Gly Pro Gln Asp Arg Tyr Leu Val Cys Tyr Val Val Pro Ala Ala Asp Thr Asp Pro Asp Pro His Gln Val Arg Ala His Leu Ala Asp Ala Leu Pro Gly Tyr Met Val Pro Ala Ala
Val Val Pro Leu Thr Ala Leu Pro Leu Thr Pro Asn Gly Lys Leu Asp Arg Ala Ala Leu Pro Ala Pro Asp Arg Ala Ala Trp Ala Thr Gly Gly Ala Pro Thr Gly Pro Arg Glu Glu Ala Leu Cys Ala Ala Phe Ala Asp Val Leu Gly Val Gln Glu Val Ser Arg Asp Ala Asp Phe Phe Ala Leu Gly Gly His Ser Leu Ser Ala Val Arg Leu Ile Ser Arg Ile Arg Ser Ala Leu Gly Val Glu Ile Gly Ile Arg Thr Leu Phe Glu Ala Pro Thr Pro Ala Ala Leu Ser Arg Arg Leu Asp Thr Ala Gly Thr Gly Arg Pro Arg Leu Leu Pro Arg Arg Arg Pro Asp Arg Val Pro Leu Ser Ser Ala Gln Arg Arg Leu Trp Phe Leu Gly Glu Leu Glu Gly Pro Ser Ala Thr Tyr Asn Ile Pro Leu Ala Leu Arg Leu Arg Gly Arg Leu Asp Val Asp Ala Leu Arg Thr Ala Leu Ala Asp Val Val Gly Arg His Glu Ala Leu Arg Thr Val Phe Pro Ser Glu Asp Gly Ala Pro Tyr Gln Gln Val Val Ala Ala Glu Arg Ala Ala Pro Ala Leu Asp Val Val Asp Val Thr Glu Lys Glu Leu Pro Ala Ala Leu Ala Glu Ala Arg Ala His Ala Phe Thr Leu Thr Glu Asp Leu Pro Leu Arg Ala Val Leu Leu Arg Thr Gly Pro Ala Asp His Val Leu Ser Leu Val Leu His His Ile Ala Gly Asp Gly Trp Ser Leu Ala Pro Leu Ala Arg Asp Leu Ser Thr Ala Tyr Ala Ala Arg Arg Glu Gly Arg Ala Pro Gln Trp Arg Pro Leu Pro Val Gln Tyr Ala Asp His Thr Leu Trp Lys Glu Glu Leu Leu Gly Ala Ala Asp Asp Pro Glu Ser Leu Leu Ala Arg Gln Leu Ala Phe Trp Arg Glu Ala Leu Glu Gly Ala Pro Glu Gln Ile Glu Leu Pro Thr Asp Arg Pro Arg Pro Ala Met Glu Ser His Arg Gly Ala Ile His Arg Phe Thr Leu Pro Ala Ser Leu Arg Asp Arg Leu Arg Asp Leu Ala His Ala Arg Arg Ala Thr Leu Phe Met Ala Leu Gln Ala Gly Leu Ala Ala Leu Phe Ala Thr Leu Gly Ala Gly Arg Asp Ile Val Leu Gly Thr Pro Val Ala Gly Arg Ala Asp Glu Ala Ala Asp Asp Leu Val Gly Phe Phe Val Asn Thr Leu Ala Leu Arg Thr Asp Leu Gly Gly Asp Pro Thr Phe Glu Glu Leu Leu Asp Arg Val Arg Glu Ala Asp Leu Ser Ala Phe Ala His Gln Asp Ile Pro Phe Glu Gln Leu Val Glu Ala Leu Asn Pro Thr Arg Ser Leu Ser Arg His Pro Val Phe Gln Val Leu Leu Ala Leu Gln Asn Asn Glu Arg Gly Glu Ala Val Met Pro Gly Leu Glu Val Thr Val Glu Arg Pro Ala Gln Ala Ala Ala Lys
Tyr Asp Leu Phe Val Asn Leu Val Glu Ser Arg Asn Glu Glu Asp Gly Thr Thr Ala Val Glu Gly Ala Val Glu Tyr Ala Thr Asp Leu Phe Asp Ala Arg Thr Val Ala Arg Leu Thr Glu Arg Tyr His Asp Leu Leu Leu Ala Ala Val Glu Glu Pro Thr Thr Arg Leu Ser Arg Met Pro Met Leu Asp Thr Ala Glu Arg Asp Arg Leu Thr Ala Glu Trp Gly Ala Ala Ala Ala Gly Pro Ala Glu Asp Leu Val Ala Leu Phe Arg Ala Arg Ala Ala Glu Thr Pro Gly Ala Val Ala Val Arg Gly Ala Gly Asp Ser Leu Thr Tyr Ala Gln Leu Asp Glu Arg Ala Gly Arg Ile Ala Ala Ala Leu Ala Arg His Gly Ala Gly Pro Glu Ser Arg Val Ala Val Cys Leu Pro Arg Thr Ala Asp Leu Val Ala Ala Leu Leu Gly Val Leu Arg Ala Gly Ala Ala Tyr Val Pro Leu Asp Pro Glu Tyr Pro Asp Glu Arg Val Ala Ala Ile Leu Ala Asp Thr Arg Pro Val Ala Leu Leu Thr Thr Ala Asp Cys Arg Pro Ala Ile Thr Gly Ala Ala Thr Ala Ala Gly Gly Ala Val Leu Leu Ala Ala Asp Ala Ala His Gly Ala Gly Pro Val Pro Glu Pro Pro Ala Pro Leu Pro Asp Gln Ala Ala Tyr Val Leu His Thr Ser Gly Ser Thr Gly Arg Pro Lys Gly Val Val Val Ser Arg Gly Asn Leu Ala Asn Leu Leu Ala Asp Met Arg Asp Arg Leu Arg Pro Thr Ala Asp Asp Arg Leu Val Ala Val Thr Thr Val Ser Phe Asp Ile Ala Ala Leu Glu Leu Phe Leu Pro Leu Val Thr Gly Ala Gly Leu Val Leu Ala Asp Arg Gly Ala Ala Arg Ala Pro Glu Glu Leu Ala Ala Leu Leu Thr Ala Ser Gly Ala Thr Leu Leu Gln Ala Thr Pro Thr Thr Trp Gln Leu Leu Ala Glu Thr Ala Pro Asp Ala Leu Arg Gly Leu Arg Lys Leu Val Gly Gly Glu Ala Leu Pro Ala Ser Leu Ala Ser Arg Leu Arg Glu Leu Gly Gly Glu Leu Val Asn Val Tyr Gly Pro Thr Glu Thr Thr Ile Trp Ser Thr Ala Ala His Leu Asp Arg Val Thr Gly Ser Ala Pro Pro Ile Gly Arg Ala Leu Arg Gly Thr Arg Ala Tyr Val Leu Asp Glu Trp Leu Asn Pro Arg Pro Glu Asn Val Pro Gly Glu Leu Tyr Leu Ala Gly Ala Gly Val Ala Arg Gly Tyr Leu Gly Arg Gly Gly Leu Thr Ala Glu Arg Phe Thr Ala Asp Pro Phe Gly Ala Pro Gly Ser Arg Met Tyr Arg Thr Gly Asp Leu Val Arg Arg Arg Ala Asp Gly Glu Leu Glu Phe Leu Gly Arg Thr Asp His Gln Val Lys Val Arg Gly Phe Arg Ile Glu Leu Gly Glu Ile Glu Thr Ala Leu Gly
Ala His Pro His Val Ala Gly Ala Val Val Val Ala Arg Ala Ala Ser Gly Ala Ala Leu Val Pro Asp Ala Pro Ala Pro Arg Arg Leu Val Ala Tyr Val Val Pro Glu Pro His Arg Ala Ala Pro Asp Asp Gly Arg Glu Gln Asn Arg Leu Asp Glu Trp Arg Glu Ala Tyr Asp Thr Leu Tyr Gly Ser Ser Ala Pro Ala Pro Leu Gly Gln Asp Phe Gly Ile Trp Arg Ser Ser His Asp Gly Gln Pro Ile Pro Leu Asp Glu Met His Gln Trp Arg Ala Ala Thr Val Asp Arg Ile Arg Ala Leu Arg Pro Thr Arg Val Leu Glu Ile Gly Val Gly Thr Gly Leu Leu Leu Ser Glu Leu Ala Glu Asp Cys Thr Ala Tyr His Gly Thr Asp Leu Ser Ala Arg Ala Ile Glu Thr Leu Arg Ala Gln Val Asp Ala Glu Pro Ala Leu Lys Glu Arg Val Glu Leu His Val Arg Pro Ala His Asp Phe Asp Gly Leu Arg Arg Gly Phe Tyr Asp Thr Ile Val Leu Asn Ser Val Val Gln Tyr Phe Pro Asp Ala Asp His Leu Thr Arg Val Leu Arg Gly Ala Leu Asp Leu Leu Ala Pro Gly Gly Arg Leu Phe Val Gly Asp Val Arg Ser Leu Ala Leu Leu Arg Ala Phe Arg Ala Ser Val Glu Thr Gly Asn Ser Ala Val Ser Glu Thr Pro Ala Ala Val Leu Ala Ala Ala Asp Arg Arg Thr Ala Ala Glu Asn Glu Leu Val Ile Ala Pro Asp Tyr Phe Ala Arg Leu Arg Arg Glu Ala Arg Glu Pro Leu Leu Leu Asp Val 2240 2245 2250 Arg Ile Arg Arg Gly Arg Pro Tyr Asn Glu Leu Thr Arg Tyr Arg Tyr Asp Val Leu Leu Val Lys Gln Glu Thr Gly Ala Ala Pro Ser Ala Leu Pro Pro Ala Thr Glu Leu Arg Trp Thr Pro Glu Thr Gly Asp Ala Gly Arg Leu Ala Glu Ile Cys Ala Ala His Pro Gly Ala Leu Arg Val Thr Ala Ile Pro Asn Ala Arg Val Arg Arg Glu Thr Thr Ala Leu Ala Ala Leu Glu Asp Gly Arg Pro Val Thr Glu Val Arg Arg Leu Leu Glu Gln Pro Gly Asp Gly Val Asp Pro Glu Asp Leu Tyr Asp Ala Ala Thr Ala Ala Gly Arg Thr Ala Trp Val Thr J~8 Trp Ser Ala Asp Gly Pro Pro Asp Thr Val Asp Leu Val Leu Ala Pro Gly Gly Gly ProProValAlaPro ProAla Glu Ala Asp Val Leu Pro Gly Pro AlaAspArgProGlu ThrAsn Asp Trp Ala Ala Pro Ala Gly His ArgGluLeuAlaAla ArgLeu Arg Thr Ser His Ser Leu Ala Arg ProAspTyrMetVal ProSer Ala His Glu Leu Val Val Leu Ala ProLeuThrAlaAsn GlyLys Val Val Asp Leu Asp Asn Ala Pro ProAspProAlaGly ThrAsp Ala
Arg Leu Asp Gly Pro Pro Thr ArgGluGluLeuLeu CysThr Leu Arg Arg Pro Phe Asp Leu Gly GlyArgValGlyVal GlnAsp Ser Ala Leu Leu Phe Gly Leu Gly SerValLeuSerVal ArgLeu Val Phe Gly Asp Ser Ala Arg His LeuProLeuThrThr ArgAsp Val Arg Ala Gly Phe His His Ala AlaLeuAlaAlaAla LeuAsp Gly Glu Thr Ala Arg Pro Glu Glu AspGlyGlyProPro GlyPro Asp Glu Ser Pro Ala Ala Ala Pro ThrLeuAspGluLeu AlaGlu Leu Thr Arg Ile Glu Glu Phe Thr TrpGluGluThrGln Ala Gly Asp Information for D NO:

Length: 7788 Type: DNA

Organism:Streptomyces fradiae Sequence:8 cccagggtcc cggcgcggcccgcggcgacc cgagcggctg 60 cccgacccgc gggcgctgcc cccctgtcgc ccgcccagcgcaggctgtgg ggaggccggc 120 ttcctcaacc gctacgacag ggctaccaca tcagcgtcgcgctgcggctc ccggcgatc cgccctccac 180 a tcgacgtcga gcggcactgg gcgacctgaccgcccggcac ccgcgaggac 240 gagagcctgc gcaccgtctt gaacaggggc cgcaccaggtcgtcctggac cgccgtcgtc 300 ccgggggccc cgcccgcacc ccggctgccgcccaccgcatcgacgccctggtgcgcgaagccgtccgccgccccttcgac360 ctggccgacgacatcccgctgcgccacaccctcttcacgctcccggacggcgaacacgtc420 ctgctcctggtcatccaccacatcgccgccgacggctggtcgatggggccgctggcacgg480 gacctggccgccgcctaccgcgcccgcgccgccggccgcgcgcccgactggcccgccccg540 gccgcccgccccgccgcacacccgcccggacagcacggcgacgacgtggacgacacggtg600 gaccgccgcctcgcccactgggccgaggaactgcgcggactgcccgacgaactcgcgctg660 ccctacgaccggccgcgccccacgacaccccccggctacgccgagcgggtccccttccgc720 gtcgacgccgcgctgtaccgggacgtgcgggcgctggcggcccgccaccgggccaccccg780 ttcatggtcctccacgccgccctggccgccctgtggcaccggctcggcgccggccccgac840 atccccgtgggcaccccgtccgccggccgcgaccggcccgagaccgccgacctcgtcggc900 ttcctggtcaacaccctggtcctgcgcaccgacacctcgggcgacccggccttcgccgaa960 ctgctcgaccgggtgcgcgagaccgacctgcgggcctacgcccaccaggacgtgcccttc1020 gagcggctggtggaggcggtcaaccccgcccgctcgcccagcaggcacccgctcgtccag1080 acgatgctcaccttcgacaacgccgcccacggagcgctcgatcacctcctggacctgccg1140 ggcgtgcgcgcggaactgctgccgaccgccgagggcaccgcccacaccgacatcgaactg1200 accttcaccgagaccacggcggacaccgacggcgacggactcgacgcgtccctgcgctac1260 cgccccgacctgttcgaccgcacgaccgcgcgggctctggcggagcggttcatggcgctg1320 ctccgcaccgtgacgcgcgagcccgccctgcggctcgggcagctcgacgtcaccaccgcc1380 ggggaacgccggcggctggccgacgcggacgccgcggcacgggcgaggacggccgccacc1440 gccgtcgccgtcctgcccgccctcttcgccgcctccgcccatcgcacgcccgccgccccc1500 gccctcaccgacggcccggcgaccctggactacgcggaactcgacgcccgctccaaccgt1560 ctcgcccgggccctgctcggactcggcgtggggccggaggacttcgtcgccctggcggtg1620 ccccgctcggcggacctggtggtggccgtcctcgcggtgctgaagtcgggcgccgcctac1680 ctcgccgtcgaccccgaccacccggccgagcgcacctcgtacatcctccacgactgccgg1740 cccgtcgccgtcctctccacgaccgccgtccgcgagaccctgcacggcacggtgggcgag1800 
gcggtcggcgaggtcccgtggctgctgctcgacgagcccgccaccggcggcgcgacggcc1860 ggccactcggccgcaccggtcaccgacgccgaccgccggtcgcccctgctccccgaccac1920 ccggcctacaccatctacacctccggctcgaccggacggcccaagggcgtcgtcgtcagt1980 cacgccaacgtctcacgactgctgaccgcctgccgcgcggccgtggacttcgggcccgac2040 gacgtctggacgctcttccactccagcgccttcgacttctcggtgtgggagatgtggggg2100 ccgctggcgcacggcggccggctggtcgtcgtcccgcacgacgtggccagatcacccggc2160 gacctcctggacctgctgggccgcgagcgcgtcacggtgctcagccagacgccctccgcc2220 ttcctccagctcctgcgggcggagtccgacctcggcgtccccccgaggaccaccgcggcg2280 ctgcggtacgtcgtcttcggcggagaagcgctggacaccgcccaactcgccccctggcgg2340 ggccgcccggtccgcctggtcaacatgtacgggatcaccgagacgaccgtccacgtcacc2400 cacctggagctggacgacgccgccgtggaccgcggcggcagcccgatcggcacacccctg2460 aacgacctgcgcgcccacgtgctcgaccaggggctgcttcccgtgccggtgggcgtcgtg2520 ggcgagctgt acgtcgccgg ccccggcctg gcccgcggct accgccgccg ccccggcctg 2580 agcgccaccc gcttcgtcgc cgacccgttc gacaccggcg gccggatgta ccggaccggc 2640 gacctcgtcc ggcgcaccca ggacggcggc ctccactacg tcggccggtc cgactcccag 2700 gtgaaactgc gcggctaccg catcgagccc ggcgagatcg aggccgccgc ccgccgccac 2760 ccggacgtcg cccaggcggc caccgccgtg cacggcgaag gaccgcagga ccggtacctg 2820 gtctgctacg tggtgccggc ggccgacacc gaccccgacc cgcaccaggt gcgcgcccac 2880 ctggccgacg ccctgcccgg ctatatggtc cccgccgccg tggtgccgct gaccgccctg 2940 ccgctgaccc ccaacggcaa gctggaccga gcggcgctgc ccgcccccga ccgggcggcg 3000 tgggccaccg gcggcgcccc gaccggaccg cgcgaggaag cgctctgcgc cgccttcgcc 3060 gacgtcctcg gcgtccagga ggtcagccgc gacgccgact tcttcgccct gggaggccat 3120 tccctctcgg cggtccggct catcagccgg atcaggtcgg cgctcggagt ggagatcggc 3180 atccgcacgc tcttcgaggc gcccacgccc gccgcgctgt cccggcgcct cgacaccgcc 3240 gggaccggac ggccccgcct cctgccgcgc cgccgaccgg accgcgtccc gctctcctcc 3300 gcccagcgca ggctgtggtt cctcggagaa ctggaaggac cgagcgccac ctacaacatc 3360 ccgctcgccc tgcgcctgcg cggccgtctc gacgtcgacg ccctgcgcac cgccctggcc 3420 gacgtggtgg gccggcacga ggccctgcgc accgtcttcc cgtccgagga cggcgccccc 3480 taccagcagg tggtcgcggc cgaacgggcc gcgcccgccc tcgacgtcgt ggacgtcacc 3540 gagaaggagc 
tgcccgccgc cctcgccgag gcccgcgcac acgccttcac cctcaccgag 3600 gaccttccgc tgcgggccgt actgctgcgg accggccccg ccgaccacgt gctctccctc 3660 gtcctccacc acatcgcggg cgacggctgg tcgctggccc cgctcgcccg cgacctcagc 3720 accgcctacg ccgcacgtcg ggagggccgc gccccgcagt ggcggcccct gccggtgcag 3780 tacgccgacc acaccctctg gaaagaggag ttgctcggcg cggcggacga ccccgagagc 3840 ctcctcgccc gccaactcgc cttctggcgc gaggcgctgg agggcgcgcc ggaacagatc 3900 gagctaccca ccgaccggcc gcgccccgcc atggagagcc accgcggcgc gatccaccgc 3960 ttcaccctcc ccgcgtcact gcgcgaccgg ctgcgtgacc tcgcgcacgc gcggcgggcc 4020 accctcttca tggccctcca ggccggactc gccgcactgt tcgccaccct gggggccggc 4080 cgggacatcg tcctcggcac gcctgtcgcc ggccgcgccg acgaggcggc cgacgacctc 4140 gtcggcttct tcgtcaacac cctggcgctc cgcaccgacc tcggcggcga ccccaccttc 4200 gaggaactgc tcgaccgcgt cagggaagcc gacctgtccg ccttcgccca ccaggacatc 4260 ccgttcgagc aactggtgga ggcgctcaac cccacccgct ccctctccag gcaccccgtc 4320 ttccaggtgc tgctggccct ccagaacaac gagcgcggcg aggccgtcat gccgggcctg 4380 gaggtcaccg tcgaacgccc cgcccaggcg gcggccaagt acgacctctt cgtcaacctc 4440 gtggagtccc ggaacgagga ggacggaacg accgccgtcg agggagcggt cgagtacgcc 4500 accgacctct tcgacgcccg taccgtcgcc cggctcaccg agcgctacca cgacctgctc 4560 ctggccgccg tcgaggagcc cacgacacgg ctcagccgga tgcccatgct cgacacggcg 4620 gaacgcgaca gactcacggc cgaatggggc gccgccgccg cgggcccggc cgaggacctg 4680 gtcgccctct tccgtgcccg cgccgccgag acacccggcg cggtggcggt ccgcggcgcc 4740 ggggacagcc tcacctacgc ccagctcgac gagcgggccg gacggatcgc ggcggccctc 4800 gcccggcacg gcgccggccc cgagagcagg gtcgcggtgt gtctgccgcg caccgccgac 4860 ctggtggccg cgctgctcgg cgtcctacgg gccggcgccg cctacgtacc gctcgacccg 4920 gagtacccgg acgagcgcgt cgccgcgatc ctggccgaca cccgcccggt ggcgctgctc 4980 accacggcgg actgccgccc cgcgatcacc ggggccgcga ccgccgccgg cggagccgtc 5040 ctcctcgcgg ccgacgccgc acacggcgcg ggccccgtgc ccgagccccc cgccccgctg 5100 cccgaccagg ccgcgtacgt cctgcacacc tcgggctcca ccggacgccc caagggcgtc 5160 gtcgtcagcc ggggcaacct cgccaacctc ctggccgaca tgcgggaccg gctgcgcccc 5220 accgccgacg accggctggt 
cgccgtcacc acggtcagct tcgacatcgc cgcgctggaa 5280 ctcttcctcc cgctggtcac cggcgccgga ctggtcctgg ccgaccgcgg cgccgcacgg 5340 gcccccgagg aactggccgc cctgctcacc gcgagcggtg CC3CCC'tCCt ccaggccacc 5400 ccgaccacct ggcagttgct ggccgagacc gcccccgacg ccctgcgcgg gctgcgcaag 5460 ctggtcggcg gcgaagccct ccccgcctcc ctggcctccc gactgcgcga gctgggcggc 5520 gaactcgtca acgtctacgg gcccaccgag accaccatct ggtcgaccgc cgcccacctc 5580 gaccgggtca ccggcagcgc cccgcccatc ggccgggcgc tgcgcggcac ccgcgcctac 5640 gtgctggacg agtggctgaa cccgcgcccc gagaacgtcc ccggcgagct gtacctggcc 5700 ggcgccggcg tggcccgcgg ctacctggga cgcggtggcc tgaccgcgga gcgcttcacc 5760 gccgacccct tcggcgcgcc cggcagccgc atgtaccgca cgggcgacct ggtccgccgc 5820 cgcgcggacg gggagctgga attcctcgga cgcaccgacc accaggtcaa ggtccggggc 5880 ttccgcatcg aactgggcga gatcgagacg gcactcggcg cgcacccgca cgtcgccggg 5940 gcggtcgtgg tcgcccgcgc ggcgtccggc gcggccctcg tgccggacgc gccggcccca 6000 cggcgactgg tggcctacgt ggtccccgag ccccaccgcg ccgcccccga cgacggccgg 6060 gagcagaacc ggctcgacga gtggcgggag gcctacgaca ccctctacgg cagctccgca 6120 cccgcccccc tcggccagga cttcggcatc tggcgcagca gtcacgacgg gcagcccatc 6180 cccctggacg agatgcacca gtggcgggcg gccacggtgg accgcatccg ggcgctgcgg 6240 cccacgcggg tgctggagat cggagtcggc accgggctgc tgctctcgga gctggcggag 6300 gactgcaccg cctaccacgg caccgacctg tccgcgcggg cgatcgagac gctgcgcgca 6360 caggtcgacg ccgaacccgc gctgaaggag agggtcgagc tgcacgtccg cccggcccac 6420 gacttcgacg gcctgcgccg gggcttctac gacaccatcg tgctcaactc cgtcgtccag 6480 tacttccccg acgccgacca cctcacccgc gtactgcgcg gcgcgctcga cctgctcgcc 6540 cccggcgggc ggctcttcgt cggcgacgtc cgcagcctgg cactgctgcg cgccttccgc 6600 gcctcggtgg agaccggcaa cagcgcggtc tccgaaactc ccgccgccgt acttgccgcc 6660 gccgaccgca ggacggccgc ggagaacgaa ctcgtcatcg cccccgacta cttcgcgcgg 6720 ctgcggcggg aggcccgcga accgctcctg ctggacgtgc gcatccggcg cggacggccg 6780 tacaacgagc tgacgcgcta ccgctacgac gtcctgctgg tcaaacagga gaccggagcc 6840 gcgccctccg ccctgccccc ggccaccgaa ctgcgctgga cgccggagac cggcgatgcc 6900 gggcggctgg ccgagatctg cgcggcgcac 
cccggcgcgc tgcgcgtcac cgcgatcccc 6960 aacgcccgcg tgcggcgcga gaccaccgcc ctcgccgccc tggaggacgg gcgaccggtc 7020 accgaggtgc gccggctgct ggagcaaccc ggtgacggag tcgatccgga ggacctgtac 7080 gacgccgcga ccgccgccgg acgcaccgcc tgggtgacct ggtcggccga cggaccgccg 7140 gacaccgtgg acctcgtgct ggccccggcg ggcggggacg gcgtgccgcc ggtggcgccg 7200 ccggccgagc tgtggcccgg cgcgccggcc gctgaccggc cggagacgaa cgacccgacc 7260 gccgggtcgc accaccgcga actggccgcc cggctccgct cccacctggc cgaacggctg 7320 ccggactaca tggtcccctc ggccgtcgtc gtcctcgacg ccctcccgct gaccgccaac 7380 gggaaggtgg accgcaacgc gctgcccgac cccgacccgg cgggcacgga cgccggccgc 7440 ccgccgcgca cgccccggga ggaactgctc tgcacgctct tcgccgacct gctgggcctg 7500 ggccgggtcg gagtccagga cagcttcttc ggcctcggcg gcgacagcgt cctgtccgtc 7560 cgcctcgtca gccgcgcccg cgcacacggg ctgccgctga ccacccgcga cgtcttcgag 7620 caccacaccg ccgccgcgct ggcggcggcc ctggacggca gggaaccgga gagcgaaccg 7680 gacggcggcc cgccgggccc cgacgccacc gcggcgcggc ccatcaccct cgacgaactc 7740 gccgagctcg aggccgagtt cggcacggac tgggaggaga cacagtga 7788
Information for SEQ ID NO: 9 Length: 2143 Type: PRT
Organism: Streptomyces fradiae Sequence: 9 Val Asn Gly Pro Gln Arg Met Val Glu Glu Val Leu Ala Val Thr Pro Leu Gln Glu Gly Leu Leu Phe His Ala Val Phe Asp Glu Asn Val Pro Asp Ala Tyr Val Ser Arg Leu Val Leu Ala Leu Ser Gly Glu Leu Asp Ala Asp Arg Leu Arg Gln Ala Ala Gln Ala Leu Val Ala Arg His Pro Ala Leu Arg Ser Ala Phe Arg Gln Arg Arg Ser Gly Glu Trp Phe Gln Leu Val Ala Thr Arg Pro Ala Val Pro Trp Gln Glu Leu Asp Leu Arg Pro Ser Gly Ser Pro Ala Glu Ala Asp Lys His Leu Glu Ala Leu Leu Asp Glu His His Arg Thr Gly Phe Asp Leu Gly Arg Pro Pro Leu Leu Arg Phe Leu Leu Ala Arg Thr Gly Asp Asp His His Arg Leu Ala Val Thr Tyr His His Leu Val Leu Asp Gly Trp Ser Met Pro Ile Leu Met Arg Glu Leu Ala Val Leu Tyr Gly Ser Gly Gly Asp Pro Ser Ala Leu Pro Pro Val Arg Pro His Arg Asp His Leu Asp Trp Leu Ala Arg Arg Pro Ser Glu Arg Ser Ala Arg Ala Trp Arg Gln Ala Leu Ala Gly Leu Pro Gly Pro Thr Leu Ile Ala Pro Asp Ala Asp Arg Asn Gly Pro Leu Pro Gly Ser Val Trp Thr Arg Leu Gly Glu Arg Asp Thr His Ala Leu Gly Ala Trp Ala Arg Ala Arg Gly Val Thr Val Asn Ser Ala Val Gln Ala Ala Trp Ala Thr Val Ile Gly Arg Leu Thr Gly Arg Asp Asp Val Val Phe Gly Thr Thr Val Ser Gly Arg Pro Pro Asp Leu Pro Gly Ser Glu Asp Met Val Gly Phe Phe Ile Asn Thr Val Pro Thr Arg Val Arg Met Arg Pro Ala Glu Pro Ile Gly Asp Leu Val Val Arg Ile Gln Arg Glu Gln Thr Ala Leu Met Glu His Gln His Val Arg Leu Ser Asp Ile Gln Arg Trp Ser Asp Arg Thr Val Leu Phe Asp Thr Ser Thr Ala Phe Glu Asn Tyr Pro Ala Asp Asp Leu Ser Ala Val Ser Ser Ala Gly His Ala Gly Leu Arg Ile Glu Gly Gly Ser Gly Arg Thr Thr Asn His Phe Pro Leu Ser Leu Tyr Ala Leu Pro Gly Pro Ala Leu Arg Leu Arg Leu Asp His Arg Pro Asp Ala Val Asp Asp Val Thr Thr Arg His Ala Ala Asp Leu Leu Glu Arg Ala Leu Thr Ala Val His Ser Ala Pro Ala Thr Pro Thr Ala Ala Leu Ala Ala Thr Pro Ala Thr Ala Arg Ala Ala Ala Pro Arg Ala Ala Gly Pro Gly Ala Pro Ala Thr Ile Val Asp Ala Phe Glu Ala Arg Val Arg Ala Thr Pro Glu Ala Pro Ala Val Leu Ala Gly Gly Glu Glu Leu Thr Tyr Ala Glu Leu
Asp Ala Arg Ala Asn Arg Leu Ala Arg Leu Leu Leu Glu Arg Gly Val Gly Pro Glu Ser Arg Val Ala Leu Thr Val Ser Arg Asn Ala Trp Leu Pro Val Ala Val Leu Gly Ile Leu Lys Ala Gly Gly Cys Tyr Val Pro Val Gly Ala Thr Leu Pro Arg Glu Arg Ala Ala Arg Ile Leu Arg Glu Thr Ala Pro Val Cys Leu Leu Thr Asp Pro Asp Ala Glu Ala Ala Arg Thr Arg Arg Thr Ala Pro Thr

Gly Asp Asp Arg Asp Glu Asn Ala Pro Gly Gly Val Glu Arg Val Val Leu Thr Gly Ala Leu Leu Ala Ala Phe Asp Pro Ala Pro Pro Thr Asp Ala Glu Arg Ala Gly Pro Leu Leu Pro Gly His Leu Ala Tyr Leu Leu His Thr Ser Gly Ser Ser Gly Arg Pro Lys Gly Val Ala Val Glu His Ala Gln Val Thr Ala Leu Leu Ser Trp Ala Gly Thr Gly Val Gly Ala Asp Arg Leu His Arg Thr Val Ala Ser Thr Ser Glu Ser Phe Asp Val Ser Val Phe Asp Thr Leu Val Pro Leu Leu Thr Gly Gly Arg Ile Glu Ile Val Glu Asn Thr Leu Ala Val Ala Asp Arg Thr Gly Gly Glu Pro Ser Leu Leu Asn Ala Val Pro Ser Ala Leu Gln Ala Leu Leu Glu Arg Gly Glu Pro Leu Ala Val His Thr Phe Leu Cys Ala Gly Glu Pro Phe Pro Ala Pro Leu Ala Arg Ser Leu Arg Ala Ala Phe Pro Arg Ala Arg Val Ala Asn Leu Tyr Gly Pro Thr Glu Thr Thr Val Phe Val Thr Ala His Phe Leu Asp Gly Thr Asp Asp Gly Ala Pro Pro Val Gly Arg Pro Leu Pro Gly Val Arg Val His Ile Leu Asp Pro Trp Leu Arg Pro Val Pro Asp Gly Val Val Gly Glu Leu Tyr Leu Ala Gly Glu His Val Thr Arg Gly Tyr Trp Gln Arg Pro Ala Thr Thr Ala Glu Arg Tyr Val Ala Asp Ile Phe Gly Ala Pro Gly Ala Arg Met Tyr Arg Ser Gly Asp Leu Gly Arg Leu Arg Pro Asp Gly Glu Ile Asp Leu Val Gly Arg Ala Asp Asp Gln Val Lys Val Arg Gly His Arg Val Glu Leu Gly Glu Val Glu Ala Ala Leu Ala Ser His Pro Asp Val Leu Arg Ala Ala Ala Ala Val His Asp Gly Lys Pro Ala Gly Pro Arg Leu Val Gly Tyr Val Val Pro Arg Gly Pro Ala Pro Asp Thr Ala Ala Val Leu Asp His Val Arg Arg Glu Val Pro Pro Tyr Met Val Pro Ser Ala Leu Val Val Leu Asp Glu Leu Pro Leu Thr Val Asn Gly Lys Arg Asp Arg Ala Ala Leu Pro Pro Pro Pro Asp Arg Ser Asp Thr Thr Arg Ala Arg Ala Pro Arg Gly Pro His Glu Thr Ile Leu Leu Gly Leu Phe Ala Glu Val Leu Gly Val Arg Pro Val Gly Ile Asp Asp Asp Phe Phe Ala Leu Gly Gly His Ser Leu Leu Ala Thr Arg Leu Val Ser Arg Val Arg Thr Thr Leu Gly Ala Glu Leu Ala Val Arg Asp Leu Phe Glu His Pro Thr Val Ala Gly Leu Tyr Ala Arg Ile Ala Arg Ala Gly Ala Ala Arg Pro Pro Val Ser Arg Val His Ala Arg Pro Asp Arg Val Pro Leu Ser Phe Ala Gln Arg Arg Leu Trp Phe
Leu His Arg Leu Gln Gly His Ser Ala Ala Tyr His Val Pro Leu Ala Leu Arg Leu Thr Gly Arg Leu Asp Pro Ala Ala Leu Arg Gly Ala Ile Ala Asp Thr Val Ala Arg His Gly Ser Leu Arg Thr Val Phe His Glu Asp Ala Glu Gly Val Arg Gln Ile Val Gln Asp Ala Ser Ala Ala Ala Arg Leu Ile Thr Leu Ile Pro Glu Pro Val Glu Asp Pro Leu Arg Ala Ala Glu Glu Ala Val Ala Glu Pro Phe Asp Leu Thr Ala Gly Pro Pro Leu Arg Cys Arg Leu Phe Thr Arg Ser Ala Asp Pro Ala Asp Pro Arg Ala Gly Ala Gly Gln Glu Pro Gln Glu His Leu Phe Leu Leu Val Val His His Ile Ala Ala Asp Gly Trp Ser Leu Arg Ile Ile Ala Arg Asp Val Ala Ala Ala Tyr Ala Ala Arg Val Arg Gly Glu Asp Phe Ala Pro Ala Pro Pro Pro Val Asp Tyr Val Asp His Thr Leu Trp Gln His Arg Val Leu Gly Asp Pro Asp Ala Asp Gly Gly Pro Asp Thr Glu Gly Gly His Ala Thr Glu Gly Gly Pro Leu Asp Thr Gln Leu Ala His Trp Arg Arg Arg Leu Ala Gly Leu Pro Gln Glu Ile Ala Leu Pro Ala Asp Arg Gln Arg Pro Ala Ala Ser Ser His Arg Gly Ala Asp Val Asp Phe Thr Val Pro Ala Ala Ala Ala Glu Arg Ile Arg Gln Leu Ala Gly Thr Thr Gly Thr Thr Pro Phe Met Val Leu Gln Ala Ala Leu Ala Val Leu Leu His Arg Met Gly Ala Gly Thr Asp Ile Pro Leu Gly Thr Pro Val Ala Gly Arg Thr Asp Ser Ala Val Glu Gly Val Val Gly Leu Phe Val Asn Thr Leu Val Leu Arg Thr Asp Leu Ser Gly Ser Pro Thr Phe Ala Gln Leu Leu Gly Arg Val Arg Ala Thr Ala Leu Asp Ala Tyr Ala His Gln Asp Val Pro Phe Glu Arg Leu Val Glu Val Leu Ala Pro Glu Arg Ser Leu Ala Arg His Pro Leu Phe Gln Val Ser Leu Val Leu Gln Asn Leu Asp Glu Ala Ala Ala Pro Val Asp Gly Leu Pro Gly Leu Arg Ala Glu Thr Val Arg Thr Arg Arg Asp Gly Ala Lys Val Asp Leu Ser Phe Val Leu Ala Pro Gly Gly Pro Glu Gly Gly Asp Met Pro Gly Val Leu Thr Tyr Ser Ala Asp Leu Phe Asp His Ala Thr Ala Arg Gly Leu Val Asp Arg Leu Leu Arg Val Leu Asp Gln Val Leu Ala Ala Pro Ala Thr Pro Val Gly Arg Val Asp Val Leu Leu Pro Gly Glu Ala Arg Arg Glu Leu Glu His Ser Arg Gly Pro Gly Ala Ala Gly Asp Gly Asp Glu Pro Leu Ala Arg Phe Glu Lys Trp Ala Ala Thr Thr Pro Asp Ala Pro Ala Leu Arg Trp Asp Gly 
Gly Arg Leu Thr Tyr Ala Glu Leu Asp Arg Lys Ala Asp Ala Val Ala Arg Ala Leu Val Gly Arg Ser Leu Gly Pro Glu Asp Val Val Ala Val Val Ala Pro Arg Asp Pro Asp Val Val Ala Ala Leu Leu Gly Val Leu Arg Cys Gly Ala Ala Tyr Leu Pro Ile Asp Glu Ala Trp Pro Pro Ala Arg Ile Arg Arg Thr Thr Thr Asp Ala Gly Ala Arg Leu Leu Leu Ala Pro Gly Asp Thr Asp Ala Ala Arg Thr Ala Phe Gly Pro Ala Cys Gly Pro Asp Thr Asp Ile Leu Gly Leu Glu Asp Pro Ala Phe Arg Ala Thr Gly Gly Pro Ala Leu Pro Ala Gly Arg Asn His Pro Arg Ser Leu Ala Tyr Val Leu Tyr Thr Ser Gly Ser Thr Gly Arg Pro Lys Gly Val Gly Val Glu Arg Arg Ala Leu Ala His Tyr Val Glu Gly Ala Val His Arg Tyr Pro Asp Ala Ala Ala Thr Thr Leu Leu His Ser Pro Leu Thr Phe Asp Leu Ser Ala Thr Ala Leu Phe Thr Pro Leu Ala Ser Gly Gly Cys Val Val Leu Gly Glu Val Asp Arg Ala Ala Glu Ala His Pro Val Asp Phe Val Lys Ala Thr Pro Ser His Leu Pro Leu Leu Glu Arg Arg Pro Gly Leu Leu Gly Glu Asn Gly Thr Leu Val Leu Gly Gly Glu Ala Leu Asp Gly Arg Ala Leu Arg Ala Trp Arg Ala Ala His Pro His Ala Glu Val Val Asn Ala Tyr Gly Pro Thr Glu Leu Thr Val Asn Cys Ala Glu His Arg Ile Ala Ala Gly Glu Pro Val Pro Asp Gly Pro Val Pro Ile Gly Arg Pro Phe Ala Gly Val Arg Ala Met Val Leu Asp Thr Ala Leu Ala Pro Ala Pro Pro Gly Val Ala Gly Glu Leu Tyr Val Thr Gly Pro Gly Val Ala Arg Gly Tyr Leu Gly Gln Arg Ala Leu Thr Ala Glu Arg Phe Val Ala Cys Pro Phe Gly Glu Pro Gly Glu Arg Met Tyr Arg Thr Gly Asp Leu Val Arg Arg Leu Pro Gly Gly Glu Leu Glu Tyr Val Gly Arg Thr Asp Glu Gln Val Lys Leu Arg Gly Phe Arg Ile Glu Leu Pro Glu Val Ala Arg Thr Leu Ala Ala Asp Glu Ser Val Ala Arg Ala Val Val Val Val Arg Glu Asp Arg Pro Gly Asp Arg Arg Leu Thr Gly Tyr Val Val Pro Ala Ala Gly Val Arg Pro His Glu Asp Glu Leu Arg Gly Ala Val Ala Arg Thr Leu Pro Asp Tyr Met Val Pro Ser Ala Val Val Val Leu Asp Glu Leu Pro Thr Thr Pro His Gly Lys Leu Asp Arg Arg Ala Leu Pro Ala Pro Ala His Arg Ser Arg Gly Gly Arg Pro Pro Arg Asp Gln Arg Glu Arg Asp Leu Cys Arg Ile Tyr Ala Asp Val Leu Gly Leu Pro Glu Val
Gly Ala Glu Asp Asp Phe Phe Ala Leu Gly Gly His Ser Leu Leu Ala Thr Arg Leu Val Asn Arg Ile Arg Ala Glu Leu Ala Glu Glu Leu Asp Val Arg Thr Val Phe Glu Ala Arg Thr Val Ala Ala Leu Ala Ala Arg Leu Arg Thr Ala Arg Pro Asp Thr Arg Pro Ala Leu Arg Arg Met Ser Arg Ser Glu Asp Leu
Information for SEQ ID NO: 10 Length: 6432 Type: DNA

Organism:Streptomyces fradiae Sequence:10 gtgaacggtccgcagcgcatggtcgaggaggtcctggcggtcaccccgctccaggagggg60 ctgctcttccacgccgtcttcgacgagaacgtccccgacgcctacgtcagccggctggtc120 ctggccctttccggagagctggacgccgaccggctgcgacaggccgcccaggcgctggtg180 gcacgccacccggcgctgcgctcggccttccgccagcggcgctcgggggagtggttccaa240 ctggtcgccacccgccccgcggtgccctggcaggagctcgacctgcggccgtcggggagc300 ccggcggaggcggacaagcacctggaggcgctgctggacgagcaccaccgcaccgggttc360 gacctcggccggccgcccctgctgcgcttcctgctggccaggaccggcgacgaccaccac420 cggttggccgtgacctatcaccacctcgtcctcgacggctggtccatgcccatcctgatg480 cgggaactggccgtgctgtacggcagcggcggcgacccgtccgccct~cccgcccgtccgc540 ccgcaccgcgaccacctcgactggctggcccgccgcccgtccgagcggagcgcccgcgcc600 tggcggcaggcgctggcgggactgcccggccccacgctgatcgccccggacgccgaccgc660 aacgggccgctccccgggtcggtgtggacccggctcggcgagcgggacacccatgccctc720 ggggcgtgggcgcgggcccgcggcgtgacggtgaactcggcggtgcaggccgcctgggcc780 accgtgatcggccgcctcaccggccgcgacgacgtcgtcttcggcacgacggtctcgggg840 cggccgccggatctgcccggcagcgaggacatggtcgggttcttcat:caacaccgtgccg900 acgcgcgtgcggatgaggccggccgagccgatcggcgacctcgtcgt:gcggatccagcgc960 gagcagaccgccctcatggagcaccagcacgtccggctctccgacat-ccagcgctggtcc1020 gaccggaccgtgctcttcgacacctccaccgcgttcgaaaactaccc:cgccgacgacctg1080 tccgccgtcagctccgccgggcacgcgggactgcgtatcgagggaggctccggccgcacc1140 accaaccacttcccgctctccctctacgcgctccccggcccggcgctccgcctgcgcctg1200 gaccaccgccccgacgccgtggacgacgtcaccacacgccacgcggcggatctgctggaa1260 cgtgctctgaccgccgtccacagcgccccggccaccccgaccgccgcgctcgccgccacc1320 cccgcgacggcacgcgcggccgcaccccgcgccgccgggccgggcg<:cccggccacgatc1380 gtggacgcgttcgaggcgcgtgtgcgggcgacccccgaggcccccgccgtcctcgccggc1440 ggcgaggagctgacctacgccgaactcgacgcccgggcgaaccggctggcgcgcctgctg1500 ctggagcgag gggtcggacc cgagagccgg gtcgccctca ccgtctcccg gaacgcctgg 1560 ctgcccgtcg ccgtgctcgg catcctcaag gcgggcggct gctacgt ccc cgtgggcgcc 1620 acgctgccgc gggagcgcgc cgcccgcatc ctccgcgaga ccgcaccggt ctgtctgctc 1680 accgaccccgacgccgaggccgcccggacccgccgcaccgcccccacgggagacgaccgg1740 gacgagaacgcgccgggcggcgtcgagcgcgtcgtgctgaccggcgccctcctggccgcg1800 
ttcgacccggccccgccgaccgacgccgaacgggccggacccctgctccccggccatctc1860 gcctacctcctccacacctccggctccagcggccggcccaaaggggtcgccgtcgaacac1920 gcccaggtgaccgccctgctgtcctgggccggcaccggcgtcggagccgaccgtctgcac1980 cggaccgtggcctccacctcggagagcttcgacgtgtcggtcttcgacaccctcgtcccg2040 ctgctcaccggcggccgcatcgagatcgtggagaacaccctggccgtcgccgaccggacc2100 ggcggcgaaccctccctcctgaacgccgtcccctcggccctgcaggcgctgctggagcgc2160 ggcgagccgctcgccgtccacaccttcctctgcgccggcgaacccttccccgccccactg2220 gcccgcagcctgcgcgccgccttcccgcgggcgcgcgtggccaacctctacggaccgacc2280 gagacgaccgtcttcgtcaccgcccacttcctggacggaaccgacgacggcgcgcccccc2340 gtcggccgcccgctgcccggtgtgcgcgtccatatcctcgacccctggctccgtcccgtg2400 ccggacggcgtcgtcggggagctgtacctcgccggggaacacgtcacccgcggctactgg2460 cagcgcccggcgacgacggccgaacgctacgtcgccgacatcttcggcgcgcccggcgcc2520 cgcatgtaccgcagcggcgacctcggacggctccgccccgacggggagatcgacctggtc2580 ggccgggcggacgaccaggtaaaggtgcgcggccaccgggtcgagctgggagaggtggag2640 gccgccctggcctcccacccggacgtcctgcgggccgcagccgccgtgcacgacggcaaa2700 cccgccggaccgcgcctggtgggctacgtcgtgccccgcgggccggcgcccgacaccgcc2760 gccgtcctggaccacgtgcgccgcgaggtgcccccttacatggtgccctcggcgctcgtg2820 gtgctggacgagctgccgctgaccgtcaacggcaagcgggaccgcgccgcactgccgccc2880 ccgcccgaccggagcgacaccacgcgggcgcgcgccccccgaggcccgcacgagacgatc2940 ctgctcggactgttcgccgaggtgctcggggtacgcccggtcggcatcgacgacgacttc3000 ttcgccctgggcggccactcgctgctcgccacccgcctggtcagccgggtgcgcaccacc3060 ctcggagccgaactggcggtgcgggacctcttcgaacaccccacggtggccggtctgtac3120 gccaggatcg cccgcgccgg ggcggcccgg ccgccggtct cccgcgtcca cgcgcggccc 3180 gaccgcgtcc ccctctcgtt cgcccagcgg cggttgtggt tcctccaccg cctccagggc 3240 cacagcgccg cctaccacgt cccgctcgcg ctccgcctca ccggacgcct cgacccggcc 3300 gcgctgcgcg gcgcgatcgc cgacacggtc gcccggcacg gaagcctgcg caccgtcttc 3360 cacgaggacgccgagggcgtccgccagatcgtccaggacgcctcggccgccgcccggctg3420 atcaccctgatcccggagcccgtcgaggacccgctgcgggcggcggaggaggcggtggca3480 gaacccttcgacctgacggccggaccgcccctgcggtgcaggctgttcacccggtccgcg3540 gacccggcggacccgcgcgcgggcgccggccaggagccgcaagaacacctgttcctcctg3600 
gtggtgcaccacatcgcggccgacggctggtcgctgcggatcatcgcccgggacgtggcc3660 gccgcgtacgccgcccgcgtgcgcggtgaggacttcgcgcccgccccgccccccgtcgac3720 tacgtcgaccacaccctctggcagcaccgggtgctcggcgaccccgacgcggacggcggc3780 cccgacacgg agggcggcca cgccacggag ggcggcccgc tcgacaccca gctcgcgcac 3840 tggcggcggc ggctggccgg cctgccgcag gagatcgcgc tgccggccga ccggcagcgc 3900 ccggccgcct cctcccaccg gggcgcggac gtggacttca ccgtccccgc cgccgcggcc 3960 gaacggatca ggcaactggc gggaaccacc ggcaccacgc ccttcatggt cctccaggcc 4020 gcactggcgg tcctgctgca ccgcatgggg gcggggaccg acatcccgct gggcaccccg 4080 gtcgccgggc gcaccgacag cgcggtcgag ggagtcgtcg gactcttcgt caacaccctg 4140 gtcctgcgca ccgacctgag cggctcgccc accttcgccc agctcctggg ccgggtccgg 4200 gccaccgccc tggacgccta cgcccaccag gacgtgccgt tcgaacggct ggtcgaggtg 4260 ctcgcccccg agcgctccct ggcccgccac cccctcttcc aggtctccct cgtcctgcag 4320 aacctcgacg aggcggcggc gccggtggac ggactgcccg ggctgcgcgc cgaaacggtc 4380 cgcacccggc gcgacggcgc gaaggtcgac ctgtccttcg tgctggctcc cggcggaccg 4440 gaaggcgggg acatgcccgg agtcctcacc tacagcgccg acctcttcga ccacgcgacc 4500 gcgagagggc tcgtggaccg gctgctgcgg gtgctggacc aggtgctcgc cgcccccgcc 4560 acgcctgtgg ggcgggtgga cgtgctcctg cccggcgagg cgcggcgcga gctggagcac 4620 agccgcggac cgggcgcggc gggagacggg gacgaaccgc tggcacgctt cgagaagtgg 4680 gcggcgacca cccccgacgc ccccgccctg cggtgggacg gcggccgtct gacctacgcc 4740 gagctggacc ggaaggcgga cgcggtggcc cgcgcgctcg tcgggcgctc cctcgggccc 4800 gaggacgtgg tcgcggtcgt cgctccgcgc gacccggacg tggtggccgc actcctcggg 4860 gtgctcaggt gcggcgccgc gtacctcccg atcgacgagg catggccgcc cgcacggatc 4920 cggcggacga ccaccgacgc cggcgcgcgc ctgctcctgg cgccgggcga caccgacgcc 4980 gcccggaccg ccttcggccc cgcctgcggc ccggacaccg acatcctcgg cctcgaggac 5040 ccggccttcc gggccaccgg cggtccggcc cttccggccg ggcggaacca cccgcgctcg 5100 ctggcgtacg tcctctatac ctccggctcc accgggcgcc ccaagggcgt gggcgtggag 5160 cgccgggcac tcgcgcacta cgtggaaggg gcagtccacc gctacccgga cgcggcggcg 5220 acgaccctgc tccactcccc gctgaccttc gacctcagcg ccaccgccct gttcaccccc 5280 ctcgcctcgg gcggctgcgt 
cgtcctgggc gaggtggacc gtgcggcgga ggcccacccg 5340 gtggacttcg tcaaggcgac cccgtcccac ctgcccctgc tggaacggcg tcccggactg 5400 ctcggggaga acggcaccct cgtcctgggc ggggaagccc tcgacgggcg ggccctgcgc 5460 gcctggcggg ccgcccaccc gcacgccgag gtcgtcaacg cctacggccc cacggagctg 5520 accgtcaact gcgccgagca ccgcatcgcc gccggcgaac cggtgccgga cgggccggta 5580 ccgatcggcc gcccgttcgc cggcgtccgc gcgatggtgc tcgacacggc acttgccccc 5640 gcacccccgg gcgtggccgg ggagctgtac gtcacggggc ccggagtggc ccgcggctac 5700 ctggggcagc gcgccctgac cgccgagcgg ttcgtggcct gcccgttcgg ggagccgggg 5760 gagcggatgt accgcaccgg cgacctcgtc cgccgccttc ccggcggcga actggagtac 5820 gtgggccgaa cggacgagca ggtgaagctg cggggcttcc ggatagaact gcccgaggtg 5880 gcgcgcaccc tggccgccga cgagtcggtc gcgcgcgcgg tcgtcgtcgt acgggaggac 5940 cgtccgggcg accggcggct gaccggctac gtggtcccgg cggcgggagt ccgcccgcac 6000 gaggacgaac tgcgcggcgc ggtggcccgc acgctgcccg actacatggt gccctccgcc 6060 gtcgtcgtcc tcgacgaact gcccaccacg ccccacggaa aactogaccg gcgcgcgctc 6120 cccgccccgg cacaccgctc gcggggcggc cgcccgccgc gcgaccagcg cgagcgggac 6180 ctgtgccgga tctacgccga cgtgctgggc ctgcccgagg tgggcgccga ggacgacttc 6240 ttcgccctcg gcggccactc cctgctcgcc acccggctgg tcaaccggat ccgggccgaa 6300 ctcgccgaag aactcgacgt acggaccgtg ttcgaggccc gaacggtcgc cgcgctggcg 6360 gcccggctgc ggaccgcccg ccctgacacc cgccccgcgt tgcgacggat gtcgcggtcg 6420 gaggacttgt ga 6432 Information for SEQ ID NO: 11 Length: 5245 Type: PRT
Organism: Streptomyces fradiae Sequence: 11 Met Leu Pro Leu Ser Leu Ala Gln Gln Arg Leu Trp Phe Leu His Thr Met Asp Gly Pro Ser Ser Thr Tyr Asn Ile Pro Thr Ala Leu Arg Met Thr Gly Pro Leu Asp Val Thr Ala Leu Gly Glu Ala Leu Arg Asp Val Val Arg Arg His Glu Thr Leu Arg Thr Val Phe Pro Asp Thr Gly Asp Gly Ala Arg Gln His Val Leu Pro Ala Asp Gly Thr Ala Val Glu Leu Ala Val Thr Arg Ser Thr Glu His Glu Leu Pro Ala Ala Leu Ala His Glu Ala Gly His Ala Phe Asp Leu Ala Arg Glu Val Pro Ile Arg Ala Arg Leu Phe Val Leu Gly Glu Arg Glu His Val Leu Cys Leu Val Ile His His Ile Ala Ser Asp Gly Trp Ser Arg Thr Pro Leu Ala Arg Asp Leu Ala Thr Ala Tyr Ala Ala Arg Gly Ala Gly His Ala Pro Arg Trp Glu Glu Leu Pro Val Gln Tyr Gly Asp Tyr Thr Leu Trp Gln Arg Glu Leu Leu Gly Ser Gln Asp Asp Pro Glu Ser Leu Leu Ser Arg Gln Thr Ala Tyr Trp Lys Gln Arg Leu Ala Gly Leu Pro Asp Ala Ile Glu Leu Pro Leu Asp Arg Pro Arg Pro Pro Ile Ala Gly His Arg Gly Asp Thr Val Pro Phe Thr Leu Pro Pro Ala Thr His Glu Arg Val Ala Ala Leu Ala Ala Arg His Gly Ala Thr Thr Phe Met Val Val Gln Ala Ala Leu Ala Gly Leu Leu Ser Arg Leu Gly Ala Gly Thr Asp Ile Pro Leu Gly Thr Pro Val Ala Gly Arg Thr Asp Ala Ala Leu Glu Gly Leu Ile Gly Phe Phe Val Asn Thr Leu Val Leu Arg Thr Asp Thr Ser Gly Asn Pro Thr Phe Asp Glu Leu Val Glu Arg Ala Arg Ala Cys Ala Leu Asp Ala Tyr Ala His Gln Asp Val Pro Phe Glu Arg Leu Val Glu Thr Leu Ala Pro Glu Arg Ser Leu Ala Arg His Pro Leu Phe Gln Val Ser Leu Ser Leu Gln His Ala Thr Asp His Thr Ala Leu Leu Asn Gly Leu Glu Ile Ala Pro Leu Asp Thr Gly Trp Arg Ala Ala Lys Phe Asp Leu Ser Phe Asp Leu Leu Glu Lys Arg Gly Pro Asp Gly Arg Pro Asp Gly Ile Ala Gly Thr Val Glu Tyr Ser Thr Asp Val Leu Asp Ala Ala Thr Val Arg Gly Leu Gly Glu Arg Leu Val Arg Leu Leu Glu Ala Gly Thr Ala Ala Pro Glu Ala Arg Leu Leu Ser Ile Asp Leu Leu Ser Ala Glu Glu Arg Arg Arg Val Leu Glu Glu Phe Ala Ala Glu Pro Ala Ala Asp Glu Pro Ala Ala Ala Glu Pro Ala Ala Asp Glu Gly Leu Glu Ala Val Cys Asp Thr Phe Ala Arg Gln Ala Ala Ala
Thr Pro Glu Ala Pro Ala Val Val Gly Gly Pro Val Ala Leu Thr Phe Ala Glu Ala Asp Ala Arg Val Ser Arg Leu Ala Arg Leu Leu Ile Ser Arg Gly Ala Gly Pro Glu Val Arg Val Ala Val Cys Leu Asp Arg Asn Ala Leu Trp Pro Thr Thr Val Leu Ala Val Leu Arg Ser Gly Ala Val His Val Pro Leu Asp Pro Arg Ser Pro His Glu Arg Leu Ala Ala Val Glu Arg Asp Val Ala Pro Leu Leu Val Leu Ala Glu Arg Ala Thr Glu Ala Ala Val Ala Asp Leu Ala Ala Pro Val Leu Val Leu Asp Asp Pro Ser Thr Glu Ala Ala Ile Asp Ala Leu Asp Pro Gly Pro Val Thr Asp Ala Asp Arg Thr Ala Pro Leu Leu Pro Gly His Ala Ala Tyr Val Ile His Thr Ser Gly Ser Thr Gly Arg Pro Lys Gly Val Thr Val Asp His Arg Gly Leu Ser Arg Leu Leu Gln Ala His Arg Arg Val Thr Phe Ser Arg Ile Arg Pro Ser Ala Gly Gly Pro Gly Arg Ala Ala His Val Ser Ser Phe Ser Phe Asp Ala Ser Trp Asp Pro Leu Leu Ala Met Val Ala Gly His Glu Leu His Met Ile Asp Glu Asp Leu Arg Phe Asp Pro Pro Gly Val Val Ala Tyr Phe Arg Asp Arg Arg Ile Asp Tyr Val Asp Leu Thr Pro Thr Tyr Phe Arg Ser Leu Leu Asp Ala Gly Leu Leu Glu Glu Gly Phe Pro Cys Pro Ser Leu Val Ala Leu Gly Gly Glu Ala Met Asp Gly Glu Leu Trp Glu Arg Leu Arg Ala Ala Ala Pro Arg Val Thr Ala Met Asn Thr Tyr Gly Pro Thr Glu Thr Ala Val Asp Ala Val Val Thr Val Leu Gly Asp Leu Pro Pro Gly Thr Ile Gly Arg Pro Val Pro Arg Trp Arg Ala Tyr Val Leu Asp Ala Gly Leu Arg Pro Val Pro Pro Gly Val Leu Gly Glu Leu Tyr Leu Ala Gly Pro Gly Val Ala Arg Gly Tyr Leu Gly Gln His Ala Leu Thr Ala Glu Arg Phe Val Ala Cys Pro Phe Gly Lys Pro Gly Glu Arg Met Tyr Arg Thr Gly Asp Leu Ala Arg Trp Leu Pro Asp Gly His Leu Val Tyr Val Gly Arg Gly Asp Glu Gln Val Lys Ile Arg Gly Phe Arg Ile Glu Pro Gly Glu Val Glu Ala Ala Leu Arg Glu Leu Glu Gly Val Ala Ala Ala Ala Val Thr Val Arg Glu Asp Thr Pro Gly Thr Arg Arg Leu Val Gly Tyr Val Val Gly Thr Pro Asp Ala Asp Asp Ala Arg Leu Arg Pro Ala Glu Val Leu Ala Arg Leu Arg Asp Arg Leu Pro Asp His Leu Val Pro Ser Ala Phe Val Arg Leu Arg Glu Leu Pro Val Asn Thr Ser Gly Lys Leu Asp Arg Ala Ala Leu Pro Ala Pro Asp Pro
Ala Asp Phe Pro Ala Gly Arg Arg Pro Arg Thr Ala Leu Glu Arg Glu Val Cys Ala Leu Phe Ala Glu Val Leu Gly Ala Gly Ser Val Gly Ile Asp Asp Asp Phe Phe Gly Arg Gly Gly Asp Ser Ile Leu Ser Ile Gln Leu Val Gly Ser Ala Arg Arg Ala Gly Leu Thr Phe Thr Val Arg Gln Val Phe Glu Leu Arg Thr Pro Ala Ala Leu Ala Ala Ala Ala Arg Arg Thr Asp Ala Ala Gly Asp Glu Asp Pro Ala Leu Ala Val Gly Pro Leu Pro Leu Leu Pro Val Val Ala Glu Thr Leu Ala Ala Gly Gly Pro Val His Ser Tyr Asn Gln Ser Val Val Leu Ala Ser Pro Pro Asp Ala Ala Pro Asp Asp Val Arg Asp Ala Leu Gln Ala Leu Leu Asp Arg His Asp Ala Leu Arg Val His Ala Ala Pro Ala Ala Gly Pro Gly Arg Leu Trp Asp Leu Arg Val Glu Glu Ala Gly Thr Val Ala Ala Glu Arg Cys Leu Arg Arg Ile Asp Ala Thr Gly Met Ser Asp Glu Glu Leu Ala Arg Ala Gln Ala Ala Glu Ala Val Thr Ala Arg Ala Cys Leu Asp Pro Leu Ala Gly Ala Leu Val Ser Ala Val Trp Phe Asp Arg Gly Asp Arg Pro Gly Arg Leu Val Leu Val Ile His His Leu Ala Val Asp Gly Val Ser Trp Arg Ile Leu Leu Gly Asp Leu Arg Glu Ala Trp Arg Ala Leu Arg Ala Gly Arg Arg Pro Glu Leu Pro Arg Thr Gly Thr Ser Leu Arg Thr Trp Ala Thr Arg Leu Thr Glu Arg Ala Thr Asp Pro Ala Val Thr Ala Gln Leu Asp His Trp Thr Ala Thr Leu Ala Asp Gly Pro Ala Pro Gly Ser Arg Pro Leu Asp Arg Thr Arg Asp Thr Val Ala Thr Ser Ala Val Leu Ser Gly Glu Leu Pro Ala Ser Leu Thr Thr Asp Leu Leu Gly Pro Ala Pro Ala Ala Phe Arg Ala Gly Val Asn Asp Leu Leu Leu Thr Ala Phe Ala Leu Ala Val Ala His Trp Arg Gly Glu Glu Asp Ala Pro Val Leu Val Asp Leu Glu Ser His Gly Arg Thr Glu Glu Leu Val Pro Gly Ala Asp Leu Ser Arg Thr Val Gly Trp Phe Thr Ser Val His Pro Val Arg Leu Ala Ala Gly Arg Val Thr Ala Ala Asp Leu Ala Glu Arg Ala Pro Ala Val Gly Asp Ala Ile Lys Arg Ile Lys Glu Gln Leu Arg Ala Val Pro Asp Gly Gly Leu Gly His Gly Leu Leu Arg His Leu Asn Pro Asp Thr Ala Pro Arg Leu Arg Gly Leu Ala Arg Ala Arg Phe Gly Phe Asn Tyr Leu Gly Arg Phe Ala Ala Glu Gln Gly Ala Gly Glu Asp Ser Trp Pro Leu Leu Gly Ser Gly Pro Ala Gly Gln His Pro Asp Thr Pro Leu Asp His Glu Ile
Glu Val Asn Val Val Thr Ala Glu Gly Pro Asp Gly Pro Arg Leu Ile Thr Arg Trp Thr Tyr Ala Thr Gly Leu Leu Thr Glu Glu Glu Val Arg Arg Leu Thr Arg Ser Trp Ser Leu Ala Leu His Ala Val Val Gly His Ala Thr Ala Glu Gly Ala Gly Gly Leu Ser Pro Ser Asp Val Ala Val Pro Asp Leu Gly Gln Ala Glu Ile Glu Gln Leu Glu Arg Arg Thr Gly Thr Ala Leu Glu Asp Ile Leu Pro Val Ala Pro Leu Gln Glu Gly Leu Leu Phe His Ser Val Tyr Asp Arg Arg Ala Leu Asp Val Tyr Val Gly Gln Leu Ala Phe Arg Leu Glu Gly Glu Ile Asp Gln Asp Ala Leu Arg Thr Ala Ala Ala Ala Leu Leu Ala Arg His Thr Ser Leu Arg Thr Gly Phe His Gln Arg Glu Ser Gly Gln Trp Val Gln Ala Val Ala Arg Ser Val Glu Leu Pro Trp Gln Phe His Asp Leu Leu Asp Pro His Gly Ala Gly Gly Ala Ala Gly Ala Ala Asp Ala Gly Ser Gly Arg Arg Trp Glu Glu Leu Ala Ala Ala Glu Arg Val Glu Arg Phe Asp Leu Thr Arg Pro Pro Leu Val Arg Phe Leu Leu Ala Arg Thr Ala Pro Glu Arg Tyr Gln Phe Val Ile Thr Thr His His Thr Ile Val Asp Gly Trp Ser Ile Pro Ile Leu Leu Arg Glu Leu Leu Ala Leu Tyr Gly Gly Asp Pro Leu Pro Pro Ala Pro Gly His Arg Leu His Ala Asp Trp Leu Ala Ala Arg Asp Leu Val Ala Ala Arg Glu Ala Trp Thr Arg Ala Leu Ala Asp Thr Glu Gly Pro Thr Leu Leu Ala Pro Gly Ala Pro Arg Val Gly Glu Val Pro Arg Ser Val Arg Leu Asn Leu Pro Glu Glu Val Ser Ala Arg Leu Leu Thr Arg Ala Arg Glu Ala Gly Ala Thr Leu Asn Ser Val Val Gln Ala Val Trp Ala Leu Val Leu Ala Gln Glu Thr Gly Arg Ser Asp Val Thr Phe Gly Ile Thr Val Ser Gly Arg Pro Ala Glu Leu Pro Gly Ala Glu Asn Leu Val Gly Met Leu Val Asn Lys Val Pro Leu Arg Val Arg Leu Arg Pro Ala Glu Pro Leu Met Glu Leu Ala Arg Arg Leu Glu Arg Glu Gln Leu Glu Leu Leu Glu His Gln His Val Pro Leu Thr Thr Leu His Arg Trp Ser Gly Leu Pro Glu Leu Phe Asp Thr Thr Met Val Phe Glu Asn Tyr Pro Ala Glu Val Thr Ala Arg Gln Ala Pro Phe Arg Ala Ser Gly Thr Ala Ser Tyr Ser Arg Asn His Tyr Pro Leu Thr Leu Val Gly Ala Met Arg Gly Thr Glu Leu Thr Val Arg Val Asp His Arg Pro Asp Leu Phe Asp Glu Asp Phe Ala Arg Ser Leu Gly Glu Arg Val Ile Ala Ala Leu Thr Glu
Ala Ala Asp His Pro Phe Val Pro Ala Gly Thr Leu Asp Leu Leu Gly Ala Glu Glu Arg Ala Arg Leu Leu Glu Trp Gly Thr Gly Pro Ala Pro Glu Asp Ala Pro Arg Thr Tyr Val Asp Leu Phe Glu Glu Gln Ala Ala Arg Thr Pro Asp Ala Pro Ala Val Ile Ser Ser Asp Gly Val Leu Thr Tyr Ala Glu Leu Asp Arg Gln Ala Asn Gly Val Ala Arg Trp Leu Ala Gly Arg Ala Gly Ser Ala Gly Gly Ala Glu Val His Ile Gly Val Leu Ala Pro Arg Arg Pro Glu Val Leu Ala Val Leu Leu Gly Val Leu Lys Ser Gly Ala Ala Tyr Val Pro Leu Asp Glu Gln Trp Pro Ala Glu Arg Leu Arg Thr Val Leu Glu Asp Cys Arg Pro Ala Leu Val Leu Ala Pro Thr Ala Ala Arg Ser Asp Ala Ala Arg Glu Ser Gly Ala Thr Val Leu Pro Val Asp Pro Ala Ala Leu Ala Ala His Gly Pro Gln Thr Pro Thr Asp Ala Glu Arg Ile Arg Pro Leu Thr Pro Gly Ala Ala Ala Tyr Ala Leu Tyr Thr Ser Gly Ser Thr Gly Arg Pro Lys Gly Val Val Ile Asp His Ser Ala Leu Ala Ala Tyr Val Gly Gly Ala Arg Arg Arg Tyr Pro Asp Ala Ala Gly Thr Ser Leu Ala His Thr Ser Leu Ala Phe Asp Leu Thr Val Thr Thr Leu Leu Thr Pro Leu Thr Ala Gly Gly Ala Val Arg Leu Gly Glu Leu Asp Glu Thr Ala Arg Asp Ala Gly Ala Thr Leu Val Lys Ala Thr Pro Ser His Leu Pro Leu Leu Ser Glu Leu Pro Gly Ala Leu Asn Asp Gly Gly Thr Leu Ile Leu Gly Gly Glu Ala Leu Thr Gly Gly Arg Leu Arg Pro Trp Arg Glu Leu His Pro Asp Ala Gln Val Val Asn Ala Tyr Gly Pro Thr Glu Leu Thr Val Asn Cys Thr Glu Tyr Arg Leu Pro Lys Gly Glu Pro Val Gly Glu Gly Pro Val Pro Ile Gly Arg Pro Phe Ala Gly Val Arg Val His Val Leu Gly Pro Gly Leu Arg Pro Val Pro Ala Glu Val Pro Gly Glu Leu Tyr Val Ser Gly Val Gly Val Ala Arg Gly Tyr Leu Gly Arg Pro Ala Leu Thr Ala Glu Arg Phe Val Ala Cys Pro Phe Gly Glu Pro Gly Glu Arg Met Tyr Arg Thr Gly Asp Leu Val Arg Trp Arg Ser Asp Gly Gln Leu Glu Tyr Val Gly Arg Ser Asp Asp Gln Val Lys Leu Arg Gly Phe Arg Val Glu Thr Ala Glu Val Ala Arg Ala Leu Glu Thr Cys Pro Ser Val Gly Ser Ala Met Val Val Leu Arg Glu Asp Gln Pro Gly Asp Gln Arg Leu Val Gly Tyr Leu Val Pro Ala Ala Gly Ser Gly Ala Leu Asp Lys Glu Ala Val Ser Asp Ala Val Arg Ala
Val Leu Pro Glu Tyr Met Val Pro Ser Ala Leu Val Val Leu Glu Asp Gly Pro Pro Leu Thr Val Asn Gly Lys Val Asp Arg Ser Ala Leu Pro Ala Pro Glu Ala Glu Pro Ala Arg Ser Ala Gly Arg Ala Pro Arg Gly Pro Arg Glu Glu Ile Leu Cys Gly Leu Phe Ala Asp Val Leu Gly Val Arg Ala Val Gly Val Asp Asp Asp Phe Phe Ala Leu Gly Gly His Ser Leu Leu Ala Ile Val Val Ile Ser Arg Ile Arg Ala Leu Leu Asp Val Asp Val Ala Ile Asp Ala Leu Phe Glu Ala Pro Thr Val Ala Arg Leu Ala Ala His Leu Asp Gly Pro Gly Arg Gly His Gly Ala Val Arg Pro Ala Val Pro Arg Pro Gly Arg Leu Pro Leu Ser Tyr Ala Gln Leu Arg Leu Trp Leu Leu His Gln Ile Glu Gly Pro Ser Ala Thr Tyr Thr Ile Pro Leu Ala Leu Arg Leu Thr Gly Pro Leu Asp Val Ala Ala Leu Arg Ala Ala Leu Gly Asp Val Val Ala Arg His Glu Ser Leu Arg Thr Val Phe Ala Glu Asp Glu His Gly Pro His Gln Ile Val Leu Ala Pro Gly Asp Ala Glu Pro Gly Leu Lys Ala Val Pro Thr Thr Glu Asp Arg Leu Arg Ser Asp Leu Glu Ala Glu Ala Ala Arg Pro Phe Asp Leu Gly Gln Ala Pro Pro Val His Ala Arg Leu Phe Val Leu Asp Glu Arg Thr His Val Leu Leu Leu Ala Val His His Ile Ala Met Asp Gly Trp Ser Val Arg Pro Leu Val Arg Asp Leu Ala Ser Ala Tyr Ala Ala Arg Arg Arg Gly Ala Ser Leu Asp Leu Pro Ala Leu Pro Val Gln Tyr Ala Asp Tyr Thr Leu Trp Gln His Glu Glu Leu Gly Ser Glu Asp Asp Pro Asp Ser Pro Leu Ala Ala Gln Leu Arg Tyr Trp Arg Arg Thr Leu Asp Gly Leu Pro Gln Glu Ser Ala Pro Ala Ala Asp Arg Pro Arg Pro Ala Thr Pro Ser Tyr Arg Gly Gly Arg Val Ala Leu Thr Val Pro Pro Glu Leu His Gly Arg Val Val Glu Leu Ala Arg Glu Phe Arg Ala Thr Pro Phe Met Val Val His Ala Ala Leu Ala Ala Leu Leu Thr Arg Leu Gly Ala Gly Thr Asp Val Pro Ile Gly Ser Pro Val Ala Gly Arg Val Asp Asp Ala Leu Glu Asp Leu Val Gly Phe Phe Val Asn Thr Leu Val Leu Arg Thr Asp Thr Ser Gly Asp Pro Thr Phe Gly Glu Leu Leu Glu Arg Val Arg Ala Thr Asp Leu Gly Ala Tyr Ala His Gln Asp Leu Pro Phe Glu Arg Leu Val Glu Val Leu Asn Pro Glu Arg Ser Leu Ala Arg His Pro Leu Phe Gln Ile Leu Leu Ala Phe Asn Asn Gly Ala Ala Pro Asp Glu Gly Pro Ala Asp Arg
Ala Ser Asp Val Leu Val Arg Pro Glu Thr Val Glu Ile Ala Ala Ala Lys Phe Asp Leu Ser Leu Ser Phe Asn Glu Asp Arg Ala Ala Asp Gly Thr Ala Ala Gly Met Arg Gly Val Leu Glu Tyr Ser Ala Asp Leu Tyr Asp Glu Ser Thr Ala Arg Arg Met Ala Glu Arg Tyr Leu Arg Leu Leu Glu Ala Ala Val Ala Glu Pro Arg Thr Pro Leu Ser Arg Ile Pro Val Leu Ser Glu Ala Glu Leu His Asp Val Leu Val Arg Arg Asn Asp Thr Gly Arg Thr Arg Pro Asp Ser Ser Pro Leu Arg Arg Phe Glu Ala Gln Ala Ala Thr Thr Pro Arg Ala Thr Ala Leu Val Val Gly Glu Glu Arg Leu Asp Tyr Ala Glu Leu Asp Ala Arg Ala Glu Arg Leu Ala Thr Leu Leu Ser Arg Ser Thr Ala Gly Arg Gly Gly Pro Val Ala Val Ala Leu Pro Arg Gly Val Met Leu Pro Val Ala Leu Leu Ala Val Trp Lys Ala Gly Leu His Tyr Leu Pro Leu Asp Pro Asp His Pro Arg Ser Arg Leu Ala Asp Val Leu Ala Asp Ser Ala Pro Gly Cys Val Ile Thr Thr Thr Asp Leu Ala Arg Arg Leu Pro Pro Val Pro Ala Pro Leu Leu Val Leu Asp Asp Pro Ala Thr Ala Ala Arg Leu Ala Ala Thr Thr Ala Thr Ala Leu Ala Glu Asp Pro Arg Glu Gln Asn Gly Glu Trp Gly Glu Glu Leu Ala Tyr Thr Ile Tyr Thr Ser Gly Ser Thr Gly Arg Pro Lys Gly Val Met Val Thr Arg Ser Ala Val Ala Asn Phe Leu Ala Asp Met Asn Glu Arg Leu Glu Leu Gly Pro Gly Asp Arg Leu Leu Ala Val Thr Thr Val Ser Phe Asp Ile Ala Val Leu Glu Leu Leu Ala Pro Leu Leu Thr Gly Gly Thr Val Val Leu Ala Asp Ala Thr Thr Gln Arg Asp Pro Ala Ala Val Arg Ser Leu Cys Ala Arg Glu Gly Val Thr Val Ile Gln Ala Thr Pro Ser Trp Trp His Ala Met Ala Val Asp Gly Gly Leu Asp Leu Thr Ala Leu Arg Val Leu Val Gly Gly Glu Ala Leu Pro Pro Ala Leu Ala Arg Thr Leu Leu Glu Pro Gly Arg Ala Pro Leu Gly Asp Tyr Leu Leu Asn Leu Tyr Gly Pro Thr Glu Thr Thr Val Trp Ser Thr Val Ala Arg Ile Thr Ala Asp Ser Leu Glu Ala His Gly Gly Ala Val Pro Thr Gly Thr Pro Ile Ala Arg Thr Ala Ala Tyr Val Leu Asp Ala Ala Leu Arg Pro Val Pro Asp Gly Val Pro Gly Glu Leu Tyr Leu Ala Gly Ala Gly Leu Ala Arg Gly Tyr Leu Gly Arg Pro Gly Met Thr Ala Glu Arg Phe Val Ala Cys Pro Phe Gly Glu Pro Gly Glu Arg Met Tyr Arg Thr Gly Asp Leu Ala
Arg Trp Arg Ala Asp Gly Asn Leu Glu His Leu Gly Arg Thr Asp Asp Gln Val Lys Val Arg Gly Phe Arg Ile Glu Leu Gly Glu Val Glu Arg Ala Leu Thr Gln Ala His Gly Val Gly Arg Ala Ala Ala Ala Val His Pro Asp Ala Ala Gly Ser Ala Arg Leu Val Gly Tyr Leu Val Pro Ala Gly Gly Ser Gly Ala Leu Asp Glu Lys Ala Val Ala Asp Ala Val Arg Ala Val Leu Pro Ala Tyr Met Val Pro Ser Ala Leu Val Val Leu Asp Gly Gly Leu Pro Leu Thr Ala Asn Gly Lys Leu Asp Arg Ala Ala Leu Pro Ala Pro Glu Ala Thr Thr Gly Arg Gly Pro Gly Arg Ala Pro Arg Gly Pro Arg Glu Glu Ile Leu Cys Gly Leu Phe Ala Asp Val Leu Gly Val Pro Ala Val Gly Val Asp Asp Asp Phe Phe Ala Leu Gly Gly His Ser Leu Leu Ala Thr Arg Leu Ile Ala Arg Val Arg Gly Thr Leu Gly Val Glu Leu Gly Val Arg Glu Val Phe Glu Thr Pro Thr Val Ala Gly Leu Ala Ala Ala Leu Ser Ala Ala Gly Glu Ala Gly Pro Arg Leu Arg Pro Ala Asp Pro Arg Pro Glu Arg Leu Pro Leu Ser His Ala Gln Arg Arg Leu Trp Phe Val Arg Gln Leu Glu Gly Pro Ser Ala Thr Tyr Asn Val Pro Trp Ala Leu Arg Leu Thr Gly Pro Leu Asp Val Ala Ala Leu Arg Ala Ala Leu Gly Asp Val Val Ala Arg His Glu Ser Leu Arg Thr Val Phe Ala Glu Asp Glu His Gly Pro His Gln Val Val Leu Ser Ala Asp Gly Pro Ala Pro Leu Ser Gly Pro Val Arg Thr Asp Glu Asp Ala Leu Pro Arg Leu Leu Arg Glu Ala Ala Asp His Ala Phe Arg Leu Asp Ala Glu Pro Pro Leu Arg Ala His Leu Phe Ala Thr Ala Pro Glu Asp His Thr Leu Leu Leu Val Met His His Ile Ala Thr Asp Ala Trp Ser Gln Arg Pro Leu Ile Ala Asp Leu Ala Ala Ala Tyr Ala Ala Arg His Ala Gly Arg Val Pro Thr Leu Pro Pro Leu Pro Val Ala Tyr Ala Asp Tyr Ala Leu Trp Gln Gln Ala Arg Leu Gly Asp Glu Arg Glu Lys Asp Ser Ala Leu Ser Ala Gln Leu Ala Tyr Trp Arg Asp Ala Leu Ala Gly Ser Pro Glu Glu Leu Ala Leu Pro Ala Asp Arg Pro Arg Pro Ala Val Pro Ser His Arg Gly Asp Ser Val Pro Leu Thr Val Pro Pro Glu Leu His Gly Arg Val Val Glu Leu Ala Arg Glu Phe Arg Ala Thr Pro Phe Met Val Val His Ala Ala Leu Ala Ala Leu Leu Thr Arg Leu Gly Ala Gly Thr Asp Val Pro Ile Gly Ser Pro Val Ala Gly Arg Val Asp Asp Ala Leu Glu Asp
Leu Val Gly Phe Phe Val Asn Thr Leu Val Leu Arg Thr Asp Thr Ser Gly Asp Pro Thr Phe Gly Glu Leu Leu Glu Arg Val Arg Ala Thr Asp Leu Gly Ala Tyr Ala His Gln Asp Leu Pro Phe Glu Arg Leu Val Glu Leu Arg Asp Pro Glu Arg Ser Leu Ala Arg His Pro Leu Phe Gln Val Ser Leu Asn Tyr Asp Thr Ala Glu Thr Ala Arg Ala Arg Asp Ala Ala Pro Glu Leu Asp Gly Leu Thr Val Ser Gly Arg Pro Leu Gly Val Thr Thr Ser Lys Phe Asp Leu Thr Phe Ala Leu Thr Glu Thr Arg Ala His Asp Gly Gly Pro Ala Gly Leu Arg Gly Ala Leu Glu Tyr Ser Thr Asp Leu Phe Asp Arg Gly Thr Ala Glu Arg Leu Ala Glu Arg Phe Ala Arg Val Leu Gln Ala Ala Val Ala Ala Pro Gly Thr Arg Leu Asp Gln Ile Asp Val Leu Leu Pro Gly Glu Arg Ala Leu Leu Glu Gly Glu Trp Ser Arg Pro Glu Pro Gly Pro Val Ala Pro Thr Asp Asp Ala Arg Phe Pro Asp Leu Phe Glu Ala Gln Ala Ala Arg Thr Pro His Ala Pro Ala Val Arg Asp Gly Asp Arg Glu Leu Ser Tyr Ala Glu Leu Asn Asp Arg Ala Asn Arg Leu Ala Arg Phe Leu Ala Ala Arg Gly Ala Gly Pro Glu Asp Thr Val Ala Val Leu Leu Pro Arg Gly Pro Glu Leu Ile Thr Ala Leu Val Ala Val Gln Lys Ala Gly Ala Ala Tyr Val Pro Met Asp Ala Glu Leu Pro Ala Glu Arg Ile Ala His Met Leu Glu Asn Ala Arg Pro Val Leu Val Leu Ala His Thr Ala Thr Gln Asp Ala Leu Pro Glu Gly Ala Gly Pro Val Val Arg Leu Asp Ala Pro Ala Ile Glu Ala Ala Leu Ala Gly Leu Asp Gly Gly Asp Cys Thr Asp Ala Asp Arg Arg Ala Pro Ala Thr His His Asp Pro Ala Tyr Val Val Tyr Thr Ser Gly Ser Thr Gly Thr Pro Lys Gly Val Val Val Glu Gln Arg Ser Leu Ala Ala Phe Leu Val Arg Ser Ala Ala Arg Tyr Arg Gly Ala Ala Gly Thr Ala Leu Leu His Gly Ser Pro Ala Phe Asp Leu Thr Val Thr Thr Leu Phe Thr Pro Leu Ile Ala Gly Gly Cys Ile Val Val Ala Asp Leu Asp Ala Pro Glu Arg Asp Ala Pro Ala Arg Pro Asp Leu Leu Lys Val Thr Pro Ser His Leu Ala Leu Leu Asp Thr Ile Ala Ser Trp Ala Thr Pro Ala Ala Asp Leu Val Val Gly Gly Glu Gln Leu Thr Ala Ser Arg Leu Ala Arg Leu Arg Arg Ala His Pro Asp Met Arg Val Phe Asn Asp Tyr Gly Pro Thr Glu Ala Thr Val Ser Cys Ala Asp Phe Val Leu Glu Pro Gly Asp Ala Pro Pro Thr
Asp Thr Val Pro Ile Gly Arg Pro Leu Ala Gly His Arg Leu Phe Val Leu Asp Asp Arg Leu Arg Pro Val Pro Ala Asn Val Pro Gly Glu Leu Tyr Val Ser Gly Val Gly Val Ala Arg Gly Tyr Leu Gly Arg Pro Gly Met Thr Ala Glu Arg Phe Val Ala Cys Pro Phe Gly Glu Pro Gly Glu Arg Met Tyr Arg Thr Gly Asp Leu Ala Arg Arg Arg Ala Asp Gly Asn Leu Glu Tyr Leu Gly Arg Arg Asp Gly Gln Val Lys Val Arg Gly Phe Arg Val Glu Thr Gly Glu Ile Glu Thr Ala Leu Leu Asp Arg Pro Glu Ile Gly Gln Ala Ala Val Val Leu Arg Gly Glu Arg Leu Leu Ala Tyr Val Ala Ala Pro Pro Glu Arg Phe Asp Pro Asp Ala Leu Arg Gln Ala Leu Ala Ser Arg Leu Pro Arg Tyr Met Val Pro Ala Ala Phe Val Arg Leu Asp Ala Leu Pro Leu Ala Pro Gly Gly Lys Leu Asp His Arg Ala Leu Pro Glu Pro Pro Ala Pro Ala Asp Ala Pro His Gly Ser Arg Pro Pro Arg Asp Ala Trp Glu Gly Val Leu Cys Glu Ala Phe Gly Glu Val Leu Gly Ile Ala Glu Val Gly Ala Asp Asp Asp Phe Phe Ala Leu Gly Gly Asp Ser Ile Gly Ser Ile Arg Leu Val Gly Arg Val Arg Ala Ala Gly Gly Arg Met Thr Val Arg Asp Ile Phe Glu Gln Arg Thr Pro Ala Ala Leu Ala Gly Arg Ser Arg Pro Gly Gly Pro Ala Thr Glu Val Leu Gly Gly Arg Gly Thr Gly Pro Val Glu Pro Thr Pro Ile Ser Ser Trp Leu Ala Glu Leu Gly Gly Ala Val Asp Gly Tyr Asn Gln Ser Val Leu Leu Arg Val Pro Ala Glu Ala Asp Ala Ala Val Val Thr Gly Ala Leu Gln Thr Leu Leu Asp His His Asp Ala Leu Arg Met Arg Ala Glu Pro Glu Asp Gly His Trp Arg Met Glu Ile Ala Glu Ala Gly Ala Val Asp Ala Ala Thr Val Leu Glu Arg Val Asp Ala Ala Gly Ala Asp Gln Gly Glu Leu Asp Arg Leu Val Arg Thr His Cys Ala Ala Ala Arg Asp Arg Leu Ala Pro Gln Lys Gly Ser Val Leu Arg Ala Val Trp Phe Asp Gly Gly Pro Arg Glu Pro Gly His Leu Ala Leu Val Ala His His Leu Val Val Asp Gly Val Ser Trp Arg Ile Leu Thr Ala Asp Leu Gly Ser Ala Trp Gln Ala Leu Ala Glu Gly Arg Glu Pro His Leu Asp Pro Val Gly Thr Pro Leu Arg Ile Trp Ala Arg His Leu Ala Glu Leu Ala Ala Asp Pro Arg Arg Ala Glu Arg Cys Ala His Trp Glu Glu Gln Ser Pro Arg Pro Trp Glu Thr Gly Gly Leu Asp Pro Ala Leu Asp Asp Arg Ser Thr Glu Glu Ala
Leu Ser Leu Thr Leu Pro Ala Ala Ala Thr Arg Ala Val Leu Gly Pro Val Pro Ala Ala Leu Gly Val Gly Val Ser Glu Val Leu Leu Gly Thr Phe Ala Ala Ala Val Arg Arg Arg Arg Pro Ala Glu Ala Ala Asp Gly Val Thr Val Asp Leu Glu Gly His Gly Arg Glu Glu Asp Val Val Pro Gly Ala Asp Leu Ser Arg Thr Val Gly Trp Phe Thr Ala Ala His Pro Val Arg Val Pro Ala Ala Arg Pro Asp Glu Asp Arg Thr Gly Ala Leu Arg Ala Leu Ala Ala Ile Leu Asp Arg Ala Pro Asp Ala Gly Leu Gly Tyr Gly Leu Leu Arg Tyr Leu Asn Pro Arg Thr Arg Gln Arg Leu Ala Ala Leu Pro Ala Pro Arg Tyr Gly Phe Asn Tyr Leu Gly Arg Phe Gly Gly Ser Gly Glu Asp Arg Asp Ala Asp Asp Arg Thr Ser Asn Trp Ser Pro Val Ala Ala Gly Leu Ala Gly Gln Pro Ala Gln Leu Pro Leu Ala His Glu Ile Glu Val Thr Ala Val Ala Val Glu Gly Pro Glu Gly Pro Arg Leu Ile Ala Thr Trp Ser Trp Ala Gly Arg Leu His Arg Glu Arg Asp Val Arg Glu Leu Ala Glu Leu Trp Phe Arg Glu Leu Glu Glu Leu Ala Ser Ala Glu Pro Pro Pro Ala Gly Pro Ala Pro Leu Ser Asp Pro Pro Pro Leu Val Glu Leu Thr Asp Thr Glu Leu Asp Gln Leu Glu Ala Glu Trp Lys Ala Gly Information for SEQ ID NO: 12 Length: 15738 Type: DNA
Organism: Streptomyces fradiae Sequence: 12 atgcttcccc tctccctcgc ccagcagcgg ctgtggttcc tccacaagat ggacggcccc 60 agctccacctacaacatccccacggcgttgcggatgaccggcccgctggacgtcaccgcg120 ctgggcgaggccctgcgcgacgtcgtacggcgccacgagacgcttcgcaccgtcttcccc180 gacaccggcgacggcgcccggcagcacgtcctgcccgccgacgggaccgccgtcgagctg240 gccgtcacccgttccaccgagcacgaactgcccgccgcgctggcccacgaggccggccac300 gccttcgacctggcccgcgaagtcccgatcagagcgaggctgttcgtgctcggcgagcgg360 gagcacgtgctctgcctggtgatccatcacatcgccagcgacggctggtcgcgcaccccg420 ctcgcccgcgacctcgccaccgcctacgccgcccgcggcgccgggcacgccccgcggtgg480 gaggaactccccgtccagtacggcgactacaccctctggcagcgcgagctcctcggttcg540 caggacgaccccgaaagcctgctcagccgccagacggcgtactggaagcagcggctcgcg600 ggcctgccggacgccatcgaactgcccctcgaccgtcctcgcccgccgatcgccggccac660 cgcggcgacaccgtccccttcaccctcccgcccgcgacccacgagcgggtcgccgcgctc720 gccgcccgccacggcgcgaccaccttcatggtggtgcaggcggccctggccggcctgctg780 tcccggctgggcgcgggcaccgacatccccctgggcaccccggtggccggacgcaccgac840 gcggcgctggaggggctgatcggcttcttcgtcaacaccctggtgctgcgcacggacacc900 tcggggaaccccaccttcgacgaactggtcgaacgggcccgcgcctgcgccctggacgcc960 tacgcccaccaggacgtgccgttcgagcgactggtggagacgctcgcccccgagcgctcc1020 ctggcccgccacccgctcttccaggtgagcctgagcctccagcacgccaccgaccacacg1080 gccctcctgaacggtctggagatcgcccccctggacaccggatggcgggcggccaagttc1140 gacctctcct tcgacctcct ggagaagcgc ggccccgacg gccgcccgga cggcatcgcc 1200 ggcaccgtcg agtactccac cgacgtcctc gacgccgcca ccgtccgcgg gctcggggaa 1260 cgcctcgtcc gcctgctgga ggccggcacc gccgcccccg aggcgcggct gctctcgatc 1320 gacctgctct ccgccgagga acggcgccgc gtgctggagg agttcgccgc cgagcccgca 1380 gccgacgagcccgcagccgccgagcccgcggccgacgaggggctggaggccgtgtgcgac1440 accttcgcccgccaggcggcggccacccccgaggccccggccgtcgtcggcggtccggtc1500 gccctcaccttcgcggaggccgacgcccgcgtctcccgcctggcccggctgctgatctcc1560 cggggcgccggccccgaggtccgcgtcgccgtctgcctggaccgcaacgccctgtggccg1620 acgaccgtgctggccgtgctgcgcagcggcgccgtccacgtaccgctggacccacgctcc1680 ccgcacgagcggctggccgccgtcgaacgcgacgtcgcccccctgctcgtcctcgccgag1740
cgcgccaccgaggccgccgtcgccgacctcgccgccccggtcctcgtcctggacgacccg1800 agcaccgaggccgcgatcgacgccctggacccgggccaggtcaccgacgccgaccgcacc1860 gcgcccctcctgcccgggcacgccgcctacgtcatccacacctcgggttccaccggcagg1920 cccaaggggg tcacggtgga ccaccggggc ctgtcgcggc tgctccaggc gcaccgccgg 1980 gtcaccttct cccgcatccg tccctccgca ggcggccccg gccgcgccgc ccacgtctcc 2040 tccttctcct tcgacgcctc gtgggacccg ctgctcgcga tggtcgccgg ccacgaactg 2100 cacatgatcg acgaggacct gcggttcgac ccgccgggcg tggtggccta cttccgcgac 2160 cgccgcatcg actacgtcga cctcaccccc acctacttcc gcagcctgct cgacgccgga 2220 ctgctggagg aaggcttccc ctgcccgtcc ctcgttgccc tgggcggcga ggcgatggac 2280 ggcgaactgt gggagcggct gcgggcggcc gccccccgcg tgaccgcgat gaacacctac 2340 ggtcccaccg agaccgccgt cgacgccgtg gtgaccgtac tgggcgacct gcccccgggc 2400 acgatcggcc ggcccgtgcc ccgctggcgg gcctacgtcc tcgacgcggg actgcggccg 2460 gtcccgcccg gcgtgctggg cgagctgtac ctcgccggac ccggagtcgc ccgcggctac 2520 ctggggcagc acgccctgac cgccgagcgg ttcgtggcct gcccgttcgg gaagccgggg 2580 gagcggatgt accgcaccgg cgacctggcg cggtggctcc ccgacggcca cctggtctat 2640 gtcggacgcg gcgacgagca ggtcaagatc cgagggttcc gcatcgagcc cggggaggtg 2700 gaggccgcac tgcgggaact ggagggcgtc gcggccgccg ccgtgaccgt ccgtgaggac 2760 acccccggaa cacgcagact ggtggggtac gtcgtcggta cccccgacgc cgacgacgcc 2820 cggctccggc ccgccgaggt gctggcacgc ctgcgcgacc gactgcccga ccacctggtg 2880 ccctcggcgt tcgtccgcct ccgtgaactg cccgtcaaca ccagcggcaa actggaccgg 2940 gccgcgctcc cggcccccga ccccgcggac ttccccgccg gccggcgacc gcgcaccgcc 3000 ctggagcggg aggtgtgcgc gctgttcgcg gaggtcctcg gcgccgggag cgtcggcatc 3060 gacgacgact tcttcggccg gggcggcgac agcatcctct ccatccaact ggtgggcagc 3120 gcccgccggg cgggcctcac gttcaccgtc cggcaggtct tcgagctgcg cacccccgcg 3180 gccctggccg ccgccgcccg caggaccgac gcggcaggcg acgaggaccc cgctctcgcc 3240 gtcggaccgc tgccgctcct tcccgtggtc gccgagaccc tcgcggccgg cgggccggtc 3300 cactcgtaca accagtcggt cgtcctcgcg tccccgccgg acgccgcacc cgacgacgta 3360 cgcgacgcgc tccaggccct cctcgaccgg cacgacgcgc tgcgcgtcca cgccgccccg 3420 gcggccggcc ccggccgcct
ctgggacctc cgggtggagg aggccggcac ggtcgcggcc 3480 gagcggtgcc tgcgccggat cgacgagacc ggcatgtccg acgaggaact ggcgcgggcg 3540 caggccgccg aggccgtcac ggcgcgcgcc tgcctcgacc ccctcgccgg ggccctcgtc 3600 agcgccgtct ggttcgaccg gggcgaccgg ccgggccggc tcgtgctggt gatccaccac 3660 ctcgccgtcg acggcgtctc ctggcgcatc ctcctcggcg acctccgtga ggcatggcgg 3720 gcgttgcgcg ccggccgccg ccccgaactc ccccgtacgg gcacctcgct gcgcacctgg 3780 gccacccggc tcaccgaacg ggccaccgac ccggccgtca ccgcccaact ggaccactgg 3840 acggccacgc tcgccgacgg ccccgcaccg ggcagccggc cgctggaccg gacccgggac 3900 accgtggcca cctccgccgt cctcagcggc gaactgcccg cgtccctcac caccgacctg 3960 ctcggtccgg ccccggcggc cttccgtgcc ggggtgaacg acctgctgct gaccgctttc 4020 gccctcgccg tcgcccactg gcggggcgag gaggacgcac cggtcctggt ggacctggag 4080 agccacggcc ggaccgagga actggtgccg ggggctgacc tgtcccgcac cgtcggctgg 4140 ttcacctccg tccacccggt gcggctcgcc gccggcaggg tcaccgccgc cgacctcgcc 4200 gagcgcgccc cggccgtcgg cgacgcgatc aaacggatca aggagcaact gcgcgccgtc 4260 cccgacggag ggctggggca cggtctgctg cgccacctga accccgacac cgccccccgc 4320 ctccgaggcc tcgcccgcgc gcggttcggc ttcaactacc tgggccggtt cgccgccgag 4380 cagggcgcgg gcgaggacag ctggccgctg ctcggcagcg gccccgcggg ccagcatccg 4440 gacaccccgc tcgaccacga gatcgaggtc aacgtcgtca cggccgaggg tccggacggg 4500 ccccggctga tcacccggtg gacctacgcc accggtctgc tcaccgagga ggaggtgcgc 4560 cgcctcacgc ggtcctggtc gctggcgctg cacgccgtcg tcggccacgc caccgccgag 4620 ggagcgggcg gcctcagccc ctccgacgtg gccgttcccg acctcggcca ggccgagatc 4680 gaacagctcg aacgccgcac cggcaccgcc ttggaggaca tcctgccggt cgcccccctc 4740 caggagggcc tgctcttcca cagcgtgtac gaccggcgcg ccctggacgt ctacgtcggc 4800 cagctcgcct tccgcctgga gggagagatc gaccaggacg ccctgcggac ggccgccgcc 4860 gcgctgctcg cccgccacac cagcctgcgg accggcttcc accaacggga gtccggccag 4920 tgggtgcagg ccgtggcccg gtcggtggag ctgccgtggc agttccacga cctgctcgac 4980 ccgcacggcg ccggcggggc cgccggtgcc gcggacgccg ggtccgggcg acgatgggag 5040 gagctggccg cggccgaacg cgtcgagcgg ttcgacctca cccgcccccc gctcgtccgc 5100 ttcctcctgg cccgcaccgc
ccccgagcgg taccagttcg tgatcaccac ccaccacacg 5160 atcgtcgacg gctggtccat ccccatcctg ctgcgcgagc tgctcgcgct ctacggcggg 5220 gacccgctgc ccccggcccc cggtcaccgc ctccacgccg actggctggc cgcacgcgac 5280 ctggtggcgg cgcgcgaggc gtggacgcgg gcgctggcgg acaccgaggg gcccaccctg 5340 ctcgcgcccg gcgcgccgcg cgtcggagaa gtgccccggt cggtacggct gaacctgccc 5400 gaggaggtct ccgcacggct gctgacccgc gcccgcgagg ccggggccac cctcaactcg 5460 gtcgtccagg ccgtctgggc cctcgtcctc gcccaggaga ccggccgctc ggacgtcacg 5520 ttcggcatca ccgtctcggg tcgcccggcg gaactccccg gggccgagaa cctggtcggc 5580 atgctggtga acaaggtccc gctgcgcgtc cgtctccgcc cggccgaacc cctcatggaa 5640 ctggcccggc ggctggagag ggaacagctg gaactcctgg agcaccagca cgtcccgctc 5700 accaccctgc accgctggag cggcctgccc gaactcttcg acaccaccat ggtgttcgag 5760 aactacccgg cggaggtcac cgcccggcag gcgcccttcc gcgcgtcggg cacggccagt 5820 tacagccgca accactaccc gctcacgctg gtcggagcca tgcgcgggac cgagctgacc 5880 gtccgtgtcg accaccgccc cgacctcttc gacgaggact tcgcccgctc cctgggcgag 5940 cgggtgatcg ccgccctcac cgaggccgcc gaccacccct tcgtccccgc cggcacgctc 6000 gacctgctcg gtgccgagga gcgcgcccgc ctcctggagt ggggcaccgg ccccgcaccg 6060 gaggacgccc cacgcaccta tgtcgacctg ttcgaggagc aggccgcccg cacccccgac 6120 gcgccggcgg tcatctcgtc cgacggtgtc ctcacctacg ccgagctgga ccggcaggcg 6180 aacggcgtcg cccggtggct ggccggccgg gccggatccg ccggtggcgc cgaggtccac 6240 atcggtgtgc tggccccacg ccgccccgaa gtgctcgccg tcctgctcgg cgtcctcaag 6300 tcgggcgccg cctacgtccc cctggacgag cagtggccgg ccgaacgcct ccgcacggtc 6360 ctggaggact gccgccccgc gctcgtgctg gccccgacgg ccgccaggag cgatgccgcg 6420 cgggagtccg gcgcgacggt gctccccgtc gacccggccg ccctcgccgc acacggtccc 6480 cagaccccga ccgacgccga gcggatacgt cccctgacgc ccggcgcagc cgcgtacgcc 6540 ctctacacct cgggatccac cggccgcccc aagggcgtgg tgatcgacca cagcgccctg 6600 gccgcgtacg tcggcggcgc gcgccgccgc taccccgacg cggccgggac ctcgctggcc 6660 cacacctcgc tcgccttcga cctcaccgtc accaccctcc tcaccccgct caccgcgggc 6720 ggcgccgtgc gcctgggcga actggacgag accgcccggg acgccggggc caccctggtc 6780 aaggcgacgc cctcgcacct gcccctgctg
agcgagctgc ccggagccct gaacgacggg 6840 ggcaccctga tcctcggcgg cgaggcgctg accggcggcc ggctgcgccc ctggcgcgaa 6900 ctgcaccccg acgcccaggt cgtcaacgcc tacggtccga cggaactcac ggtcaactgc 6960 accgagtacc ggctgccgaa gggagaaccg gtcggcgaag ggccggtgcc catcggccgc 7020 ccgttcgccg gggtacgggt ccacgtgctc ggccccggcc tgcgcccggt ccccgccgag 7080 gtccccggcg agctgtacgt cagcggcgtc ggggtggccc ggggctatct gggccggccg 7140 gccctgaccg ccgagcggtt cgtggcctgc ccgttcgggg agccggggga gcggatgtac 7200 cgcaccggcg acctcgtccg ctggcggagc gacggccaac tggagtacgt cggccgaagc 7260 gacgaccagg tcaaactgcg cggattccgc gtcgagaccg cggaggtcgc ccgcgccctg 7320 gagacctgcc cctccgtcgg aagcgcgatg gtggtgctgc gcgaggacca gccgggcgac 7380 cagcgcctgg tcggctacct cgtaccggcc gccggaagcg gcgcgctcga caaggaggcc 7440 gtgtcggacg cggtccgggc ggtcctgccc gagtacatgg tcccctcggc actggtggtg 7500 ctggaagacg gaccgccgct gacggtcaac ggcaaggtcg accggagcgc gctgccggcg 7560 ccggaggcgg agccggcccg cagcgcgggc cgggcgccgc gcgggccgcg cgaggagatc 7620 ctgtgcgggc tcttcgccga cgtgctgggc gtgcgagcgg tcggcgtgga cgacgacttc 7680 ttcgccctgg gcggccactc cctgctcgcc atcgtcgtga tcagccggat cagggccctg 7740 ctcgacgtgg acgtggccat cgacgcgctc ttcgaggccc ccacggtggc ccggctggcc 7800 gcccacctcg acgggcccgg acgcggtcac ggcgcggtgc gcccggccgt gccacgcccc 7860 ggacgcctcc cgctctccta cgcccagctc cgcctgtggc tcctccacca gatcgagggg 7920 ccgagcgcca cctacaccat cccgctggcg ctgcgcctga ccggtccgct ggacgtggcg 7980 gcgctgcggg ccgcgctggg ggacgtggtc gcccggcacg agagcctgcg caccgtcttc 8040 gccgaggacg agcacggccc gcaccagatc gtcctcgcgc ccggggacgc cgaacccggc 8100 ctcaaggcgg tccccaccac ggaggaccgt ctgaggtccg acctggaggc cgaggccgcc 8160 cgccccttcg acctcggcca ggcaccgccg gtccacgccc gcctcttcgt cctcgacgaa 8220 cgcacccacg tcctgctgct ggcggtccac cacatcgcga tggacggctg gtcggtccgc 8280 cctctggtgc gcgacctggc gtccgcctac gcggcccgcc gccgaggcgc ctccctggac 8340 ctgcccgcac ttcccgtgca gtacgccgac tacaccctgt ggcagcacga ggagctgggc 8400 tccgaggacg acccggacag tcccctcgcc gcgcaactgc ggtactggcg ccggaccctg 8460 gacggcctgc cgcaggagtc cgcgccggcc gccgaccggc 
cccgtcccgc caccccctcg 8520 taccggggcg gccgcgtcgc cctcaccgtc ccgccggaac tgcacgggcg ggtggtggag 8580 ttggcgcggg agttccgggc gacgccgttc atggtggtgc acgcggcgtt ggcggcgttg 8640 ctgacgcggt tgggcgcggg cacggacgtg ccgatcggtt cgccggtggc cgggcgggtc 8700 gacgacgcgc tggaggacct ggtggggttc ttcgtcaaca cgctggtgct gcgcacggac 8760 acctcgggcg acccgacctt cggggagttg ctggaacggg tgcgggccac cgacctgggg 8820 gcctacgccc accaggacct ccccttcgaa cgcctggtgg aagtgctcaa tccggagcgc 8880 tccctcgccc gccacccgct cttccagatc ctgctggcct tcaacaacgg cgcggcgccc 8940 gacgaaggacccgccgaccgggcgtcggacgtcctggtgcggccggagacggtggagatc9000 gcggcggccaagttcgacctgtcgctgtccttcaacgaggaccgggcggccgacggcacc9060 gcggccgggatgcggggcgtgctggagtacagcgccgacctgtacgacgagagcacggcc9120 cgcaggatggccgaacgctacctccggctgctcgaagcggcggtcgcggagccccgcacc9180 ccgctgagccgcattcccgtcctgagcgaggccgagctgcacgacgtcctcgtccggcgc9240 aacgacactggtcgcacccggcccgactcctccccactgcgacggttcgaggcgcaggcg9300 gccacgactccccgggccacagccctggtcgtgggtgaggagcggctcgactacgccgaa9360 ctcgacgcacgggccgagcggctcgccaccctgctgtcccggagcaccgccgggcgcggc9420 ggacccgtcgccgtcgccctgccgcgcggtgtcatgcttccggtggccctgctcgccgtc9480 tggaaggcgggcctgcactacctgccgctggaccccgaccacccgaggagccgcctggcg9540 gacgtcctcgccgactccgcgcccggctgcgtcatcacgacgaccgacctcgcgcgccgc9600 ctcccgccggtacccgccccgctgctcgtcctggacgatccggccaccgccgcacgcctg9660 gccgccacca ccgccacagc cctggccgag gacccgcggg agcagaacgg ggagtggggg 9720 gaggaactgg cgtacaccat ctacacctcc ggctccaccg gccgtcccaa gggcgtcatg 9780 gtgacccggt cggccgtggc gaacttcctc gccgacatga acgaacggct ggaactgggc 9840 cccggcgacc ggttgctggc ggtcaccacg gtctccttcg acatcgccgt cctcgaactc 9900 ctcgccccgc tgctcaccgg cggcacggtc gtcctcgccg acgccaccac ccagcgcgac 9960 cccgcggccg tgaggtccct ctgcgcccgc gagggcgtga cggtgatcca ggccaccccc 10020 agctggtggc acgccatggc cgtggacggc ggcctcgacc tcacggccct gcgcgtgctg 10080 gtgggcggcg aggcactgcc gcccgccctc gcccgcaccc tcctggaacc cggccgcgcg 10140 ccgctgggcg attacctgct caacctgtac ggacccacgg agaccaccgt ctggtccacc 10200 gtcgcgcgga tcaccgccga ttccttggag gcgcacggcg 
gcgccgtgcc cacggggacg 10260 ccgatcgccc gcaccgccgc ctacgtgctc gacgccgcgc tgcggcccgt gcccgacgga 10320 gtgccgggcg agctgtacct ggccggcgcc gggctggccc ggggctatct gggccggccg 10380 ggaatgaccg ccgagcggtt cgtggcctgc ccgttcgggg agccggggga gcggatgtac 10440 cgcaccggcg acctcgcccg ctggcgggcc gacggcaacc tggaacacct gggcaggacc 10500 gacgaccagg tcaaggtccg cgggttcagg atcgaactgg gcgaggtgga aagagccctg 10560 acgcaggccc acggcgtcgg ccgggccgcc gccgccgtcc accccgacgc cgccggctcc 10620 gcccgactgg tcggctatct ggtaccggcc ggcggcagcg gcgcactcga cgagaaggcc 10680 gtcgccgacg ccgtgcgggc ggtgctgccc gcgtacatgg tcccctcggc gctggtggtg 10740 ctggacggcg gcctgccgct gaccgcgaac ggcaagctgg accgggccgc gcttcccgcg 10800 cccgaggcga cgaccggccg cggccccggc cgggcgccgc gcgggccgcg cgaggagatc 10860 ctgtgcgggc tcttcgccga cgtactgggc gtgcccgcgg tcggcgtgga cgacgacttc 10920 ttcgccctgg gcggccactc cctgctcgcc acccggctca tcgcccgggt ccgcggcaca 10980 ctcggcgtcg aactcggcgt ccgagaggtc ttcgagacac cgaccgtggc cggtctcgcc 11040 gccgcgctct ccgcggcggg cgaggccgga ccccggctgc gccccgccga cccgcgcccc 11100 gagcgcctgc ccctgtccca cgcccagcgc cgcctgtggt tcgtccggca actggagggg 11160 ccgagtgcca cctacaacgt cccgtgggcg ctgcgcctga ccggtccgct ggacgtggcg 11220 gcgctgcggg ccgcgctggg ggacgtggtc gcccggcacg agagcctgcg caccgtcttc 11280 gccgaggacg agcacggccc gcaccaggtc gtcctgtccg ccgacggccc ggccccgctc 11340 agcgggcccg tccggaccga cgaggacgca ctgccccgcc tgctgcggga agcggccgac 11400 cacgccttcc ggctggacgc cgaaccgccg ctgcgcgccc acctgttcgc caccgcgccg 11460 gaggaccaca ccctgctcct ggtcatgcac cacatcgcca ccgacgcctg gtcgcagcgg 11520 ccgttgatcg ccgatctggc cgcggcctac gccgcccgcc acgccggccg ggtcccgacg 11580 ctgccgccgc tgccggtcgc ctacgccgac tacgccctgt ggcagcaggc ccgcctgggc 11640 gacgaacggg agaaggacag cgcgctgtcc gcccaactcg cctactggcg cgacgcgctg 11700 gcgggctccc cggaggagct cgcgctgccc gccgaccggc cccggcccgc cgtcccctcg 11760 caccgggggg acagcgtgcc cctcaccgtc ccgccggaac tgcacgggcg ggtggtggag 11820 ttggcgcggg agttccgggc gacgccgttc atggtggtgc acgcggcgtt ggcggcgttg 11880 ctgacgcggt tgggcgcggg 
cacggacgtg ccgatcggtt caccggtggc cgggcgggtc 11940 gacgacgcgc tggaggacct ggtggggttc ttcgtcaaca cgctggtgct gcgcacggac 12000 acctcgggcg acccgacctt cggggagttg ctggaacgcg tacgggccac cgacctgggg 12060 gcctacgccc accaggacct ccccttcgaa cgcctggtgg agctccgcga cccggaacgc 12120 tcgctggccc gccacccgct cttccaggtc tcgctgaact acgacacggc cgagacggcc 12180 cgagcacgcg atgccgcacc ggaactggac gggctgaccg tgagcgggcg accgctcggc 12240 gtcaccacgt ccaagttcga cctcaccttc gcgctcaccg agacccgcgc ccacgacggc 12300 ggccccgccg gactgcgcgg cgcgctggag tacagcaccg acctgttcga ccgtggcacc 12360 gccgagcgcc tggcggagcg gttcgcacgg gtcctccagg ccgcggtggc cgcccccggc 12420 accaggctcg accagatcga cgtgctgctg ccgggcgaac gcgcgctcct ggagggcgag 12480 tggagcaggc ccgagcccgg acccgtcgcc cccacggacg acgcccgctt cccggacctc 12540 ttcgaggcgc aggccgcccg caccccgcac gcccccgccg tccgcgacgg tgaccgggag 12600 ctctcctacg ccgagctgaa cgaccgggcc aaccggctgg cccggttcct cgccgctcgc 12660 ggagcgggcc ccgaggacac cgtcgccgtc ctgctgccgc gcggccccga gctgatcacc 12720 gccctggtgg ccgtccagaa ggccggggcc gcctacgtcc ccatggacgc cgagctgccc 12780 gccgagcgga tcgcccacat gctggagaac gcccgcccgg tgctcgtcct cgcccacacc 12840 gcaacccagg acgccctccc ggagggggcc ggccccgtgg tccgcctcga cgccccggcc 12900 atcgaggcgg cgctcgccgg gctcgacggc ggcgactgca ccgacgccga ccgccgcgca 12960 ccggccacgc accacgaccc ggcctacgtc gtctacacct ccgggtccac cggtacgccc 13020 aagggcgtcg tggtcgaaca gcgctccctc gccgccttcc tggtccgctc ggccgcccgg 13080 taccgcggag ccgccggaac cgcgctgctg cacggctcgc cggccttcga cctcacggtc 13140 accaccctgt tcaccccgct gatcgccgga ggctgcatcg tggtggcgga cctcgacgct 13200 ccggagcggg acgccccggc ccgccccgac ctgctcaagg tcactccctc ccacctcgcc 13260 ctcctggaca cgatcgcctc ctgggcgaca cccgcggccg acctggtcgt cgggggcgag 13320 caactgaccg cgtcccgtct cgcccggctg cgccgggcac acccggacat gcgcgtcttc 13380 aacgactacg gtcccaccga agccaccgtg agctgcgccg acttcgtcct ggaaccgggc 13440 gacgcaccgc ccaccgacac cgtgccgatc ggacgccccc tggcgggaca ccggctgttc 13500 gtcctggacg atcgcctgcg cccggtgccc gccaacgtcc ccggcgagct gtacgtcagc 13560
ggcgtcgggg tggcccgggg ctatctgggc cggccgggaa tgaccgccga gcggttcgtg 13620 gcctgcccgt tcggggagcc gggggagcgg atgtaccgca ccggcgacct cgcccgccgg 13680 cgggccgacg gaaacctgga gtacctgggc cgccgcgacg gccaggtgaa ggtgcgcgga 13740 ttccgcgtcg agacgggcga gatcgagacc gccctgctcg accgcccgga gatcggccag 13800 gccgccgtcg tcctgcgcgg cgaacgcctc ctcgcctacg tcgcggcccc gccggagcgg 13860 ttcgacccgg acgcgctccg ccaggcgctc gcgtcccggc tgccccggta catggtcccc 13920 gccgcgttcg tccggctgga cgccctgccg ctggctccgg gaggcaagct cgaccaccgg 13980 gcgctgcccg agccgccggc gcccgccgac gccccgcacg ggagcaggcc gccgcgcgac 14040 gcgtgggaag gcgtgctgtg cgaggcgttc ggcgaggtgc tggggatcgc ggaggtcggg 14100 gccgacgacg acttcttcgc cctcggcggc gacagcatcg gctccatccg gctcgtcggc 14160 cgggtgcgcg cggcgggcgg ccggatgacc gtccgcgaca tcttcgaaca gcgcacgccc 14220 gccgccctcg ccggccgctc gcgccccggc ggtccggcga ccgaggtact cggcggtcgc 14280 gggaccgggc cggtggagcc gacgccgatc agctcctggc tggccgagct gggcggcgcg 14340 gtcgacgggt acaaccagtc cgtgctgctg cgcgtccccg ccgaggccga cgcggccgtc 14400 gtgaccggcg ccctccagac actgctggac caccacgacg cgctgcggat gcgggccgaa 14460 ccggaggacg gtcactggcg gatggagatc gccgaggcgg gcgcggtgga cgcggccacc 14520 gtgctggagc gggtggacgc ggcgggcgcc gatcaagggg agctggaccg gctggtgcgg 14580 acgcactgcg ccgcggcccg tgaccggctc gccccgcaga agggctccgt cctgcgtgcc 14640 gtctggttcg acggcgggcc acgggagccg ggacacctcg cgctcgtcgc ccaccacctc 14700 gtcgtggacg gagtctcctg gcgcatcctc accgccgacc tcggcagcgc gtggcaggcc 14760 ctcgccgagg gccgggaacc ccacctcgac ccggtgggca ccccgctgag gatctgggcc 14820 cggcacctgg cggagctggc cgccgacccg cgccgcgccg agcggtgcgc ccactgggag 14880 gagcagtcgc cgcggccctg ggagaccggc ggtctcgacc ccgccctcga cgaccggagc 14940 accgaggagg ccctttccct gaccctcccg gccgccgcca cccgcgccgt gctcggcccg 15000 gtgcccgccg cgctcggcgt cggggtgagc gaagtcctgc tcggaacgtt cgccgccgcc 15060 gtgcggcgcc ggcgtcccgc ggaggccgcg gacggcgtca cggtggacct ggaaggccac 15120 ggccgggagg aggacgtcgt ccccggagcg gacctctccc gcacggtcgg ctggttcacc 15180 gccgcccacc cggtccgggt gcccgccgcg cggcccgacg aggaccggac
cggcgcgctg 15240 cgggcgctgg ccgcgatcct ggaccgggcg cccgacgccg gcctcggcta cggcctcctg 15300 cgctacctca acccgcgcac ccggcaacgg ctcgccgccc tgcccgcccc gcgctacggc 15360 ttcaactacc tgggcagatt cggtggttcg ggagaggacc gggacgcgga cgaccggacg 15420 tcgaactggt cgccggtggc cgccgggctc gccggccagc ccgcccagct gcccctggcc 15480 cacgagatcg aggtcaccgc ggtcgccgtc gaggggcccg agggcccgcg tctgatcgcc 15540 acctggtcct gggccggccg gctccaccgg gagcgggacg tccgcgaact ggccgaactc 15600 tggttccgcg agctggagga actcgcgtcc gccgaaccgc ccccggccgg ccccgcgccc 15660 ctgtccgacc cgccccccct ggtcgaactc accgacaccg aactcgacca gctcgaagca 15720 gagtggaagg ccggctga 15738 Information for SEQ ID NO: 13 Length: 2384 Type: PRT
Organism: Streptomyces fradiae Sequence: 13 Met Arg Gly Ser Leu Gln Asp Val Leu Pro Leu Ser Pro Leu Gln Glu Gly Leu Leu Phe His Ser Glu Tyr Val Gly Asp Glu Ala Val Asp Val Tyr Thr Val Gln Thr Glu Val Glu Leu Arg Gly Pro Leu Asp Val Pro Ala Leu Arg Ala Ala Ala Glu Ala Val Leu Arg Arg His Asp Asn Leu Arg Ala Gly Phe Ala Thr Arg Ala Leu Lys Asp Pro Val Gln Phe Val Pro Arg Glu Val Glu Leu Pro Trp Glu Glu Ala Asp Leu Arg Ala Ala Asp Asp Pro Asp Ala Glu Ala Ala Arg Arg Leu Glu Glu His Arg Trp Arg Arg Phe Arg Pro Ser Lys Pro Pro Leu Val Arg Phe Leu Leu Leu Arg Thr Ala His Asp Arg His Arg Phe Ala Leu Thr Asn His His Ile Leu Leu Asp Gly Trp Ser Met Pro Met Leu Leu Arg Glu Leu Met Leu Leu Tyr Arg Thr Gly Gly Asp Ala Ser Ala Leu Pro Pro Val Arg Arg Tyr Arg Asp Tyr Leu Ala Trp Leu Gly Gly Arg Asp Arg Gly Thr Ala Arg Glu Ala Trp Arg Ala Ala Leu Ala Gly Leu Glu Ala Pro Thr Leu Ile Ala Pro Arg Ala Asp Arg Ala Ala Glu Ala Pro Thr Trp Leu Asp Phe Thr Leu Ser Glu Thr Ala Ser Ala Gly Leu Ser Ala Ala Ala Arg Ala Ala Gly Leu Thr Leu Asn Thr Val Val Gln Gly Leu Trp Ala Leu Thr Leu Ala Arg Thr Thr Gly Ser Gln Asp Val Val Tyr Gly Val Val Val Ser Gly Arg Pro Pro Glu Leu Asp Gly Val Glu Ser Met Ile Gly Leu Phe Ala Asn Thr Val Pro Leu Arg Ala Arg Met Pro Val Asp Glu Pro Leu Thr Asp Phe Leu Arg Arg Leu Gln Arg Glu Gln Ser Ala Leu Leu Asp His Gln His Leu Arg Leu Ala Asp Ile Gln Arg Leu Ala Gly Gln Gly Glu Leu Phe Asp Ser Val Met Ala Phe Glu Asn Tyr Pro Asp Gly Pro Ala Asp Glu Pro Ser Gly Ala Ser Ala Asp Thr Pro Gly His Val Arg Val Val Ala Ser Arg Met Arg Asp Ala Met His Tyr Pro Leu Gly Leu Leu Ala Ser Pro Gly Thr Arg Met Arg Phe Arg Leu Gly His Arg Pro Ser Ala Val Thr Pro Arg Leu Ala Ala Ala Leu Arg Asp Arg Leu Leu Arg Leu Val Asp Ala Phe Leu Ala Thr Pro His Leu Pro Leu Gly Arg Phe Asp Val Leu Asp Asp Ala Glu Arg Ala Leu Val Leu Asp Thr Phe Asn Asp Thr Ala His Glu Val Glu Asp Thr Thr Ala Val Glu Leu Phe Leu Arg Gln Ala Ala Arg Thr Pro Ala Arg Ile Ala Val Glu Thr Ala Asp Arg Ser Val Asp
Tyr Ala Arg Leu Ala Asp Arg Ser Gly Arg Leu Ala Arg Leu Leu Ala Glu His Gly Ala Arg Ala Glu Arg Phe Val Ala Leu Val Leu Pro Arg Ser Pro Glu Leu Val Glu Thr Ala Leu Ala Val Trp Gln Thr Gly Ala Ala Tyr Val-Pro Val Asp Pro Ala His Pro Ala Asp Arg Met Ala Arg Leu Leu Arg Glu Ala Asp Pro Val Leu Thr Val Thr Thr Ala Asp Leu Ala Asp Arg Leu Pro Ala Gly Leu Pro Leu Leu Val Leu Asp Gly Pro Ser Thr Ala Ala Ala Leu Gln Ala Leu Pro Gly Gly Pro Leu Thr Ala Ser Glu Leu Pro Ala Pro Val Asp Pro Arg Asn Ala Ala Tyr Ala Leu Tyr Thr Ser Gly Ser Thr Gly Arg Pro Lys Gly Val Val Ala Thr His Arg Ser Leu Val Gly Tyr Leu Leu Arg Gly Ser Ala Gln Tyr Pro Ser Asp Gly Arg Ser Leu Val His Ser Pro Val Ser Phe Asp Leu Thr Val Gly Ala Leu Tyr Val Pro Leu Ile Ser Gly Gly Thr Val Arg Leu Ala Ser Leu Asp Asp Glu Pro Val Leu Arg Pro Gly Glu Thr Pro Pro Asp Phe Val Lys Val Thr Pro Ser His Leu Pro Val Leu Glu Gly Leu Pro Gly Glu Val Ser Pro Th:r Gly Ala Ile Thr Phe Gly Gly Glu Gln Leu Thr Gly Arg His Leu Arg Arg Trp Arg 725 730 ~ 735 Ala Asp His Pro Asp Val Thr Val Tyr Asn Val Tyr Gly Pro Thr Glu Thr Thr Val Asn Cys Ser Glu His Arg Ile Ala Pro Arg Asp Pro Val Gly Asp Gly Pro Val Pro Ile Gly Arg Pro Leu Trp Asn Thr Arg Leu Phe Val Leu Gly Pro Gly Leu Ala Pro Val Pro Val Gly Val Pro Gly Glu Leu Tyr Val Ala Gly Ala Gly Leu Thr Arg Gly Tyr Leu Arg Asp Pro Gly Arg Thr Ala Glu Arg Phe Val Ala Cys Pro Tyr Ala Ala Gly Gln Arg Met Tyr Arg Thr Gly Asp Leu Val Arg Trp Asn Glu Asp Gly Leu Leu Glu Tyr Leu Gly Arg Val Asp Asp Gln Ile Ser Leu Arg Gly Phe Arg Val Glu Pro Gly Glu Val Glu Ala Ala Leu Ala Ala His Pro Ala Val Arg Arg Ala Ala Val Val Leu Arg Glu Asp Thr Pro Gly Asp Ala Arg Leu Val Ala Tyr Ala Val Pro Ala Glu Pro Glu Gly Ala Arg Ser Thr Pro Pro Ser Pro Leu Pro Thr Glu Gln Ile Leu Glu His Leu Arg Arg Thr Leu Pro Pro Tyr Met Val Pro Ala His Leu Val Glu Leu Pro Ala Leu Pro Val Thr Pro His Gly Lys Ile Asp Arg Ala Ala Leu Pro Glu Pro Ser Val Ala Gly Ala Pro Ala Gly Gly Ala Pro Arg Ser Pro Arg Glu Glu Ile Leu Cys 
Gly Ile Phe Ala Glu Va.l Leu Arg Arg Pro Arg Val Ser Ile Asp Asp Asp Phe Phe Ala Leu G:ly Gly His Ser Leu Leu Ala Thr Arg Leu Ala Ser Arg Val Arg Ala Ala Leu Asp Thr Glu Leu Pro Val Arg Arg Leu Phe Glu His Pro Thr Val Arg Ser Leu Ser Ala Leu Leu Asp Pro Asp Ala Gly Arg Arg Pro Ala Val Thr Pro Ala Arg Arg Pro Glu His Val Pro Leu Ser Phe Ser Gln Gln Arg Leu Trp Ile Met His Arg Leu Thr Gly Pro Asp Ala Thr Tyr Asn Ile His Arg Ala Leu Arg Leu Asp Gly Asp Leu Asp Val Pro Ala Leu Glu Ala A1a Leu His Asp Val Thr Glu Arg His Glu Thr Leu Arg Thr Val Phe Pro Glu Gly Pro Glu Gly Pro Tyr Gln Lys Val Leu Pro Ala Arg Arg Glu Asp Gly Thr Leu Thr Val Leu Pro Val Ala Asp Arg Glu Val Asp Arg Thr Leu Ala Glu Leu Ala Ala His Arg Phe Asp Leu Glu Ser Glu Pro Pro Lys Arg Ala Trp Leu Leu Glu Ser Gly Pro Arg Ser Arg Val Leu Val Leu Val Leu His His Ile Ala Ser Asp Gly Trp Ser Gly Arg Arg Leu Leu Arg Asp Leu Phe Thr Ala Tyr Thr Ala Arg Arg Ala Gly Arg Ala Pro Gln Trp Arg Pro Leu Pro Val Gln Tyr Ala Asp Tyr Ala Leu Trp Gln Arg Arg His Leu Gly Asp Pro Ala Asp Pro Ala Ser Pro Ala Ala Val Gln Gly Glu Tyr Trp Glu Lys Gln Leu Ala Gly Leu Pro Glu Glu Leu Arg Leu Pro Ala Asp Arg Pro Arg Pro Ala Arg Pro Thr Arg Thr Gly Gly Gln Val Trp Leu Thr Leu Pro Ala Thr Ala His Ala Ala Val Ala Glu Leu Ala Arg Thr Ser Arg Ala Ser Va1 Phe Met Val Val Gln Ala Ala Val Ala Ala Phe Leu Thr Arg Met Gly Ala Gly Glu Asp Ile Pro Ile Gly Ala Pro Val Ala Gly Arg Thr Asp Glu Ala Val Glu Glu Leu Val Gly Phe Phe Val Ser Thr Leu Val Leu Arg Thr Asp Thr Ser Gly Asp Pro Ser Phe Thr Glu Leu Val Gly Arg Val Arg Glu Thr Ala Leu Ala Ala Tyr Ala His Gln Asp Leu Pro Phe Glu Tyr Val Val Glu Arg Leu Ser Pro Thr Arg Ser Leu Gly Arg His Pro Leu Phe Gln Val Ala Leu Ser Cys Asn Asn Thr Glu Glu Gln Leu Gly Arg Gln Gly Ala Pro Pro Pro Gly Leu Ser Val Thr Pro His Gln Val Asp Ala Ala Arg Ser Lys Phe Asp Leu Met Phe Thr Phe Leu Glu Asn His Gly Glu Asp Gly Gln Pro Thr Gly Ile Glu Thr Ala Leu Glu Tyr Ser Ala Asp Leu Phe Asp Arg Glu Thr Ala Gln Ser 
Leu Leu Asp Arg Phe Ala Arg Met Leu Ala Ile Trp Ala Ala Glu Pro Ala Ala Ala Ile Gly Ala Arg Glu Leu Leu Ala Ala Asp Glu Arg His Thr Val Val Thr Ala Trp Asn Ala Thr Arg Arg Ala Asp Leu Val Ala Thr Leu Pro Arg Met Phe Glu Glu Gln Val Ala Arg Thr Pro His Ala Thr Ala Leu Glu His Ala Gly His His Leu Thr Tyr Ala Glu Leu Asn Ala Arg Ala Asn Arg Leu Ala Arg Val Leu Val Arg Arg Gly Ile Arg Pro Glu His Arg Val Ala Ile Leu Met Pro Arg Ser Val Glu Gln Ile Thr Ala Leu Leu Ala Ile Thr Lys Ala Gly Gly Ala Ala Val Pro Val Asp Pro Gly His Pro Gly Gln Arg Ile Ala Phe Met Leu Arg Asp Ser Ala Cys Ala Leu Ile Leu Ala Asp His Pro His Ala Ala Gly Arg Glu Glu Ile Ala Gly Val Pro Val Leu Val Pro Ala Asp Glu Pro Ala Pro Glu Arg Ala Thr Asp Leu Ala Asp Gly Asp Arg Asn Ala Pro Leu Thr Ala Gly His Ala Ala Tyr Val Val Tyr I~S

Thr Ser Gly Ser Thr Gly Arg Pro Lys Gly Val Val Thr Glu His Arg Gly Leu Leu Ser Leu Ala Thr Ala Gln Arg Glu Arg Tyr Pro Val Gly Pro Gly Ser Arg Val Leu Gln Leu Ala Ser Pro Ser Phe Asp Gly Ala Val Leu Glu Leu Leu Met Ala Leu Thr Thr Gly Gly Thr Leu Val Leu Pro Asp Gly Pro Leu Leu Ala Gly Gln Pro Leu Ala Asp Met Leu Ala Glu His Arg Ile Ser His Ala Phe Ile Pro Pro Ala Val Leu Ser Gly Leu Pro Ser Glu Gly Leu Glu Gly Leu Arg Cys Leu Val Val Gly Gly Glu Ala Val Thr Ala Pro Leu Thr Asp Arg Trp Ala Pro Gly Arg Arg Met Leu Asn Ile Tyr Gly Pro Thr Glu Thr Thr Ala Val Thr Leu Thr Ser Glu Ala Leu Thr Pro Gly Gly Pro Pro Pro Ala Ile Gly Thr Pro Val Pro Asn Thr Arg Ala His Val Leu Asp Asp Arg Leu Arg Pro Val Pro Pro Gly Val Thr Gly Glu Leu Tyr Leu Ala Gly Ala Ser Leu Ala Arg Gly Tyr Gly Arg Arg Pro Ala Leu Thr Ala Ser Arg Tyr Val Gly Cys Pro Phe Gly Ala Pro Gly Glu Arg Met Tyr Arg Thr Gly Asp Leu Ala Arg Leu Asp Arg Glu Gly Arg Val His His Met Gly Arg Thr Asp Glu Gln Ile Lys Leu Arg Gly Phe Arg Val Glu Pro Gly Glu Ile Arg Ala Arg Leu Thr Glu His Pro Ala Va1 Arg Glu Ala Ala Val Val Leu Arg Asp Asp Gly Pro Gly Gly Arg Ala Leu Val Ala Tyr Ala Val Pro Ala Asp Gly Pro Pro Arg Pro Thr Ala Ala Gln Leu Arg Ala His Leu Asn Ala Leu Leu Pro Pro Tyr Met Val Pro Ala Ala Phe Leu Val Leu Asp Ala Leu Pro Thr Thr Pro Asn Gly Lys Leu Asp Arg Glu Ala Leu Pro Ala Pro Gln Pro His Ala Glu Glu Thr Gly Arg Pro Pro Arg Asp Glu Arg Glu Ala Ala Leu Cys Glu Val Phe Ala Glu Val Leu Glu Arg Thr Ser Leu Gly Ala Asp Asp Gly Phe Phe Glu Asn Gly Gly His Ser Leu Leu Ala Val Arg Leu Val Ala Arg Val Arg Glu Arg Leu Gly Val Pro Leu Ala Ala Arg Asp Leu Phe Glu Ala Pro Thr Pro Ala Ala Leu Ala Glu Arg Leu Ala Arg Gly Ala Glu Arg Arg Ala Pro Ala Pro Leu Leu Thr Leu Arg Gly Arg Gly Asp Arg Pro Pro Leu Phe Cys Val His Pro Ala Val Gly Leu Gly Trp Ala Tyr Ala Ser Leu Leu Pro Trp Leu Pro Ala Asp Val Pro Leu His Ala Leu Gln Ala Arg Thr Pro Ala Asp Gly Ala Gly Leu Pro Gly Ser Val Glu Glu Met Ala Glu Asp Tyr Val Arg Leu Ile Arg 
Arg Val Arg Pro His Gly Pro Tyr Arg Leu Leu Gly Trp Ser Leu Gly Ala His Val Ala His Thr Ala Ala Ala Leu Leu Glu Arg Asp Gly Gln Arg Val Asp Leu Leu Ala Met Leu Asp Ala Tyr Pro Pro His Arg Thr Gly Asp Pro Gly Gly Arg Ala Glu Glu Ser Glu Ala Glu Ile Val Ala Ala Asn Leu Arg Glu Ser Gly Phe Ala Trp Asp Glu Asp Glu Gln Arg Ala Gly Arg Phe Pro Leu Glu Arg Phe Arg Ala His Leu Arg Arg Val Asp Ser Ser Leu Gly His Leu Asp Asp Ala Glu Leu Thr Ala Ala Lys Asp Val Tyr Val Asn Asn Val Arg Leu Met Arg Ser Phe Thr Pro Gly Arg Val Arg Cys Gly Ile Val Leu Met Thr Ala Glu Arg Thr Arg Ser Leu Asp Pro Ala Ala Trp Glu Pro His Thr Glu Gly Gly Val Glu Val His Arg Leu Asp Ala Ser His Met Ser Met Leu Thr Glu Pro Ala Ser Val Ala Ala Ala Gly Arg Leu Leu Thr His Arg Leu Glu Ser Leu Arg Gly Ala Thr Thr Lys Lys Arg Glu Val Information for SEQ ID NO: 14 Length: 7155 Type: DNA
Organism:Streptomyces fradiae Sequence:14 atgcgcggatccctacaggacgtcctgcccctctccccgctgcaggagggcctcctCttC60 cacagcgaatacgtcggggacgaggccgtcgacgtctacaccgtccagaccgaggtggag120 ctgcggggaccgctggacgtcccggcgctgcgcgcggcggccgaggccgtgctccgccgc180 cacgacaacctgcgggccggcttcgcgacccgcgccctgaaggacccggtgcagttcgtc240 ccccgcgaggtcgaactcccctgggaggaggccgacctgcgcgcggccgacgacccggac300 gcggaggcggcccgccgactggaggaacaccgctggcgccgtttcaggccctccaagccg360 cccctggtgcggttcctgctgctgcggacggcccacgaccgccaccgtttcgctctcacc420 aaccaccacatcctgctggacggttggtcgatgccgatgctgctgcgcgagctcatgctg480 ctctaccgcaccgggggcgacgcctccgccctgccgccggtgcgccgctaccgggactac540 ctggcctggctggggggccgcgaccgcgggaccgcacgggaggcctggcgcgccgctctg600 gcgggtctggaggcgcccaccctcatcgccccgcgggccgaccgggccgcggaggcgccg660 acgtggctcgacttcacgctgtcggagaccgcctccgccggtctctccgcggccgcgcga720 gccgccggcctcacgctcaacacggtcgtgcaagggctgtgggcccl=gaccctggcacgc780 accaccggcagccaggacgtggtgtacggggtggtcgtctccggacgtccgccggagctg840 gacggcgtcgagtccatgatcggcctgttcgcgaacacggtcccgci=gcgggcccggatg900 cccgtcgacgaacccctgacggatttcctgcggcggctgcagcgcgagcagagcgcgctc960 ctggaccatcagcacctacggctggccgacatccagcggctggccggccagggcgagttg1020 I~g ttcgattccg tgatggcgtt cgagaactac ccggacggcc ccgccgacga gccctccggc 1080 gcttccgccg acacgccggg acacgtccgc gtggtggcct cccggatgcg cgacgccatg 1140 cactacccgc tcggcctcct ggcgtccccc ggaactcgga tgcgcttccg cctcggccac 1200 cggcccagcg cggtcacgcc gcgcctggcc gccgccctgc gcgaccgcct gctgcggctc 1260 gtcgacgcct tCCtCgCCdC CCCgCaCCtg ccgctgggca ggttcgacgt cctcgacgac 1320 gccgaacgcg ccctggtact ggacacgttc aacgacaccg cgcacgaggt cgaggacacc 1380 accgccgtcg agctgttcct ccggcaggcc gcccgcaccc ccgcccggat cgccgtggag 1440 acggccgacc gctccgtcga ctacgcccgg ctcgccgacc gctccggccg cctggcccgc 1500 ctgctggcgg agcacggggc gcgggccgag cggttcgtcg ccctcgtgct gccgcgctcg 1560 cccgaactgg tcgaaaccgc gctcgccgtg tggcagaccg gagccgccta cgtcccggtg 1620 gaccccgccc acccggccga ccggatggcc cggctgctgc gggaggccga ccccgtcctc 1680 accgtcacca ccgccgacct ggccgaccgg ctgccggccg ggctccctct gctggtcctg 1740 gacggcccga 
gcaccgccgc cgccctccag gccctgcccg gcggcccgct gaccgcgagt 1800 gagctccccg cgcccgtgga cccccggaac gccgcctacg cgctctacac ctccgggtcc 1860 accggccgcc ccaagggcgt ggtcgccacc caccgctccc tcgtcggcta cctgctgcgc 1920 ggctcggccc agtacccgtc cgacggacgc tccctggtgc actcgccggt ctccttcgac 1980 ctcaccgtcg gcgccctgta cgtcccgctg atcagcggcg gcaccgtgcg cctcgcctcc 2040 ctggacgacg aaccggtcct gcgccccggc gagacgcccc ccgacttcgt gaaggtgacc 2100 cccagccacc tgcccgtcct cgaagggctg ccgggcgagg tcagcccgac cggggcgatc 2160 accttcggcg gcgaacagct caccggccgc cacctgcggc gctggcgcgc cgaccacccg 2220 gacgtcaccg tctacaacgt ctacgggccc accgagacga ccgtcaactg ctccgagcac 2280 cgcatcgccc cccgtgaccc ggtcggcgac gggccggtcc ccatcggacg gccgctgtgg 2340 aacacccgcc tgttcgtcct gggccccggc ctcgccccgg tgccggtcgg cgtgccgggc 2400 gagctgtacg tcgccggcgc cggcctgacc cgcggctacc tccgcgatcc gggcaggacc 2460 gccgagcgct tcgtcgcctg cccctacgcc gccgggcaac ggatgtaccg aaccggcgac 2520 ctcgtccgct ggaacgagga cgggctgctg gagtacctgg gcagggtgga cgaccagatc 2580 agcctgcgcg gcttccgggt ggagcccggc gaggtggagg cggcgctggc ggcccacccc 2640 gcggtccgcc gcgccgcggt ggtgctgcgg gaggacacgc ccggcgacgc ccggctggtc 2700 gcctacgccg tccccgccga gccggaagga gcgcggagca cgccgccgtc cccgctcccc 2760 accgagcaga tcctggaaca cctgcgccgg accctgccgc cctacai~ggt ccccgcgcac 2820 ctcgtggaac tgcccgcgct gcccgtcacg ccccacggca agatcgaccg ggccgcgctg 2880 ccggaaccctccgtcgccggcgccccggecggaggagcgccccgct.ccccccgggaggag2940 atcctgtgcggcatcttcgccgaggtgctgcggcgcccgcgggtctccatcgacgacgac3000 ttcttcgccctgggcgggcactccctgctggccacccggctggcca.gcagggtgcgggcg3060 gccctggacacggagctgccggtgcgccgcctcttcgaacacccca.cggtgcgctccctg3120 tccgcactgctggaccccgacgccggcaggcgccctgcggtgacgcccgcacggcgacct3180 gagcacgtcccgctctccttctcccagcagcggctgtggatcatgcaccggctcaccggc3240 cccgacgccacgtataacatccaccgggccctgcggctcgacggcgacctcgacgtcccg3300 gcgctggaggccgcgctgcacgacgtgaccgaacggcacgagacgctgcgcaccgtcttc3360 cccgagggccccgagggcccgtaccagaaggtcctcccggcccgacgggaggacgggacc3420 ctcaccgtcctcccggtcgccgaccgggaggtcgaccgcaccctcgccgagctggcggcc3480 
caccgcttcgacctggagtccgaaccgccgaagcgcgcctggctcctggagagcggtccg3540 cgcagccgggtcctcgtcctggtgctccaccacatcgccagcgacggctggtcgggcagg3600 cggctcctgcgcgacctgttcaccgcctacaccgcgcgccgcgcgggccgggcgccccaa3660 tggcgaccgctgccggtgcagtacgcggactacgccctgtggcagcggcgccacctcggc3720 gaccccgcggaccccgccagtcccgccgccgtccaaggggagtactgggagaagcagttg3780 gccggactccccgaggaactgcggctgcccgccgaccggccgcgcccggcgcgcccgacc3840 cgcaccggcggccaggtgtggctgacgctcccggcgacggcccacgccgccgtggccgag3900 ctggccagaaccagccgggccagcgtgttcatggtcgtccaagccgccgtggccgccttc3960 ctcacccgcatgggcgccggggaggacatccccatcggcgccccggtcgccgggcgcacc4020 gacgaagcggtggaggaactggtcggattcttcgtcagcaccctggtcctgcggaccgac4080 acctccggtgacccctcgttcaccgaactcgtcggccgggtccgggaaaccgcgctggcc4140 gcctacgcccaccaggacctgcccttcgagtacgtggtggagcggctcagcccgacccgg4200 tccctcggccggcaccccctcttccaggtcgccctgtcctgcaacaa cgaggagcag4260 cac ctgggccgcc agggcgcccc gccccccggg ctctccgtca caccgcacca ggtggacgcc 4320 gcccgctcga agttcgacct gatgttcacc ttcctggaga accacggcga ggacggccag 4380 cccacgggcatcgagaccgccctcgaatacagcgccgacctgttcgaccgggagaccgcg4440 cagagcctcctcgaccgcttcgcccggatgctggcgatctgggcggcggaaccggccgcc4500 gccatcggcgctcgcgaactcctggcggccgacgagcggcacacggi=ggtcaccgcgtgg4560 aacgccacccggcgcgcggacctggtcgcgacactcccgcggatgttcgaggagcaggtc4620 gcccgcaccccgcacgccacagccctcgaacacgccggccaccacctgacgtacgccgaa4680 ctcaacgccc gagccaaccg gttggccaga gtgctggtgc gccgcggcat ccgccccgaa 4740 caccgcgtcg ccatcctgat gccgcgctcc gtcgagcaga tcaccgccct gctggccatc 4800 accaaggccg gcggcgccgc cgtaccggtc gatcccggcc accccggaca acgcatcgcc 4860 ttcatgctgc gcgacagcgc ctgcgccctg atcctggcgg accacccgca cgcggcggga 4920 cgtgaggaga tcgccggcgt cccggtcctc gtccccgccg acgaaccggc cccggaacgg 4980 gccaccgacc tcgccgacgg cgaccgcaac gCCCCCCtCa CCgCCg'gCCa CgCCgCCtaC 5040 gtcgtctaca cctccggttc cacgggccgc cccaagggcg tggtgaccga acaccgcggc 5100 ctgctgtcac tggccacggc acagcgtgag cgatacccgg tggggcccgg cagccgggtg 5160 ctgcaactcg cctcaccgtc cttcgacggc gccgtactgg aactgctcat ggccctcacc 5220 accggaggaa ccctcgtcct gcccgacggg cccctcctcg 
ccgggcaacc gctcgccgac 5280 atgctggccg agcaccgcat cagccacgcc ttcatccccc cggcggtgct gagcggcctt 5340 ccctccgaag ggctggaggg cctgcgctgc ctcgtcgtcg gcggcgaggc ggtcaccgcg 5400 cccctcacgg accgctgggc gcccggccgt cgcatgctca acatctacgg ccccaccgag 5460 accaccgccg tcaccctgac cagcgaagcc ctgacccccg gcggcccacc gcccgccatc 5520 ggcacccccg tacccaacac cagggcccac gtgctcgacg accggctgcg ccccgtcccg 5580 cccggcgtga cgggcgagct gtacctggcc ggcgcgtcac tggcgcgcgg ctacggccgc 5640 cgcccggcgc tcaccgccag ccgctacgtc ggctgcccgt tcggagcgcc gggggagcgg 5700 atgtaccgca ccggcgacct ggcgcgcctg gaccgggagg gccgcgtcca ccacatgggc 5760 cgcaccgacg agcagatcaa gctgcgcggc ttccgcgtcg agcccggtga gatccgggcc 5820 cggctcaccg agcatcccgc cgtgcgggag gcggcggtcg tcctgcgcga cgacgggccg 5880 ggcggacgcg cgctggtggc ctacgcggta ccggccgacg gcccgccccg ccccaccgcg 5940 gcccagctcc gcgcacacct gaacgccctc ctcccgccct acatggtgcc cgccgccttc 6000 ctggtgctgg acgcgctgcc gaccaccccc aacggcaagc tcgaccggga ggccctgccc 6060 gccccgcaac cgcacgccga ggagaccggc cgtccgccgc gcgacgaacg cgaggccgcc 6120 ctgtgcgagg tgttcgccga ggtcctggag cgcacgtcgc tcggcgr_cga cgacggcttc 6180 ttcgagaacg ggggacactc gctgctcgcc gtccggctgg tcgccagggt ccgcgagcgc 6240 ctgggcgtgc ccctggccgc acgggacctg ttcgaggctc ccaccccggc cgccctggcg 6300 gagcgcctgg cccgcggcgc cgaacgccgc gccccggcgc ccctgci=cac cctgcggggc 6360 cgcggcgacc ggcccccctt gttctgcgtc cacccggccg tcggcctggg atgggcgtac 6420 gcgagcctcc tgccgtggct cccggccgac gtccccctcc acgcgct~gca ggcccgcacg 6480 cccgcggacg gcgccggtct gccggggagc gtcgaggaga tggccgagga ctacgtccgg 6540 ctgatccgccgcgtccgcccccacggcccctaccggctgctcggctggtcgctgggagcc6600 cacgtggcccacaccgcggcggccctgctggagcgcgacggacagCgggtggacctgctc6660 gccatgctggacgcctatcctccccaccgcaccggggaccccggcggacgggccgaggag6720 tccgaggccgagatcgtggcggccaacctgcgggagtcggggttcgcgtgggacgaggac6780 gagcagcgcgcgggacgcttcccgctggagcgcttccgcgcccacctgcgccgggtggac6840 agctcgctcggccacctcgacgacgccgaactgacggcggccaaggacgtctacgtcaac6900 aacgtccggctcatgcgctccttcacccccggccgtgtccggtgcgggatcgtcctgatg6960 
accgcggaacgcacccgcagcctcgatccggcggcgtgggagccgcacaccgaaggaggc7020 gtcgaggtgcaccggctggacgcctcccacatgtccatgctgaccgaaccggcgtcggtc7080 gcggcagccggccgtctcctgacccaccgactggagtccctgcggggagccaccacgaag7140 aaacgagaggtatga 7155 Information for SEQ ID NO: 15 Length: 78 Type: PRT
Organism: Streptomyces fradiae Sequence: 15 Met Thr Asn Pro Phe Asp Asp Thr Glu Gly Val Phe His Val Leu Val Asn Asp Glu Asn Gln His Ser Leu Trp Pro His Phe Val Glu Ile Pro Asp Gly Trp Arg Ala Val Val Arg Glu Arg Pro Arg Gln Glu Cys Leu Asp Tyr Ile Glu Ala Asn Trp Thr Asp Met Arg Pro Gln Ser Leu Ile Asp Ala Met Glu Ala His Glu Lys Ser Glu Gly Ala Ile Arg Information for SEQ ID NO: 16 Length: 237 Type: DNA
Organism: Streptomyces fradiae Sequence: 16 atgaccaacc ccttcgacga caccgagggc gtcttccacg tcctggtcaa cgacgagaac 60 cagcactcgc tgtggcccca cttcgtcgag atccccgacg gctggcgggc cgtggtgcgc 120 gagcgtccgc gccaggagtg cctggactac atcgaggcga actggaccga catgcgcccg 180 cagagcctca tcgacgccat ggaggcacac gagaagtccg agggcgcgat ccggtga 237 Information for SEQ
ID NO: 17 Length: 8321 Type: DNA
Organism:Streptomyces fradiae Sequence:17 ccctccggccccacccccggtcctggagaccgccgacgaactgcgaggaagtccggatgc60 tggcacggatcaatgggatcgacctggatcacgaacgcagaggcagtggttcccccgtcc120 tcctgatcatggggagcggcgccgccggaacgggctggcatttgcaccaggtgcccgcgc180 tggtcgccgccggtttcgaggccgtcaccttcaccaaccggggcatcacccccagcggcg240 ggggccccggtttcacccttcaggacatggccgccgacaccatcggcctgatcgaacacc300 tcgggctgggcccgtgcgcggtcgtggggacgtccctgggggccagggtcgcgtgcgagg.

tcgcccgtacccgccccgacctggtctcccggtgcgtcctcatggcgccgcgagcacgcg420 cggaccggacgagggccgccgcgaccgaggcggagatcgccctggccgacagcggcgtca480 ccgtcccgccccgctaccgggcggtggtgcgggccatgcagaacctctcaccccggacgc540 tcgcggacgacgaacggatcgccgactggctcgacctcttcgaactggcggcggccgccg600 gccccggtgcccgcacccagctggagatcagcgccgtctaccaccgcgaggaggacctgg660 cccggatcaccgccccctgccgggtgatcgccttcgccgacgacatcgtggcgccggcgc720 atctggccaaggagatcgccgacgccctgcccgaggccgactaccacgtggtgcccgact780 gcggccactacggctacctcgaacaacccgaccgggtcaaccggctcatcacccaattcc840 tcgccgcatgaaagagccccgcatggaaccgaccaccgcctggcggcccgccgtgatcag900 tccggacagccacgcgctgcccgccaccgccgacgccctggcgggcctgctccaggactc960 cgcccgcaccgacgaactcctggccgcccacaaagtgctgttcctcagcggcttcggagt1020 gggcccgctggagctggagaagatcatgccgctcctgctccccgaccgcctgccctacgt1080 cttcggcaactccccgcgcaccaaggtcggacacaacgtgtacacctcgacggagtaccc1140 ggcggaattcaccatctcgatgcacagcgagatgtcgtacgccgcgcgatggccggcccg1200 gctgctcttctactgcgagcgggcggccgacaccggcggcgcgaccccggtggtggacaa1260 cgccgcctggtaccgggcactggacaaggacgtccgcgacgcctacgcgggcggcctgcg1320 ctacacccagaacctccacggaggacgcgggctcggcaagagctggcaggacaccttcga1380 gacggaggaccgctccgaggtcgaggagtacctctcccgcaccggcgccacctggcagtg1440 gaacgcccgcaacggactgcgcgtcagccacgtacgacccgcgacgatcgaacaccccgc1500 caccggcgagcggctgtggttcaaccagagcgaccagtggcaccccgccacgctcggcgg1560 cgaggccgccgcgctgatggagctgctgcccccggaggaactgccccagtcggtcgcctt1620 cgccgacggctccccgattccggccgagtacgcgcgccaggtccgcgaccgcggactgga1680 acacgccgtggacaacgactggcgccccggcgacctcatgctcgtcgacaacgtccaggc1740 ggcccacggccgcaggcccttcaccggcgaccgccgcatcctggtcgccatgtcggacca1800 cggacgcccccaccgccccggacagccgccgcgtcccccgcaggaaggccaccgatgacc1860 atcgccctcgccgacgtggaagggctcaaccagcacgagaccgagttcctctacgacgag1920 atcttcacccgccgcgcctacctgcccgaggccctgcacctgcccgaggcccccgtggtc1980 ttcgacgtcggggccaacatcggcatgttcaccctcttcgtacgctcggaacgacccggt2040 gccacggtccactccttcgaaccggtccccccggtgcgcgacatcctgtgccgcaaccgg2100 gagcgccacgcggtggcggggctcgtccatccctacggcctcgccgaggcggaacaggaa2160 
gtcgagttcacccactatccgggctactcgaccatgtccacgcgcagcaccctggcggac2220 accgaggcggaacgggccttcgtccgggggcaggtgcggaccgccgacctgcccgaggcc2280 gagcggatgctggacgaactcctcgccttccggttcagggaggagaaggtgacctgccgg2340 ctccgccccctctccgccgtcctcgacgagcatcccgtcgaccggatcgacctgctgaag2400 atcgacgtccagcgcggtgagcgggaggtgctgcgaggactggaggaccggcactggcca2460 ctggtccgccagatcgccatggaggtgcacgacagccccggcggcagcaccgccggccgg2520 ctgcgagcggtggccgacgagctggagcggcgcgggttcgacgtcctgaccgagcaggag2580 gaccggtacgcgggcaccgaccgccacagcgtgttcgccgtcgcggaaccgcgtcgcggc2640 tgagccacgtccccgcccggccgcccaccgcagaaccgaatcgaacggtcccccacctcc2700 cggagaaacagcatggaacccgagaacaccttcaccctgtccgaagccgaacgcgacgac2760 gtggccgccttggcccaggagttgacgcgcgcgcgccccggcctggtggacgagagggaa2820 tggctcgaccggtgccgcaccctctcctgccacctgcccgcccgcctccaggaccggctc2880 cgcgccttccgccacgaccccggccccaccggcaggctgctcctgcgcaaccttcccgcc2940 gccgacagcgtgccggccaccccgcgggagcccgactccgtggagcgcagggcgacgctg3000 agcgcctccgtgctgtgcgccctgtccatggaactcggtgacgtcatcgcctaccgcaac3060 gagaagcagggcgcgctcgtgcagaacgtggtgccggtgcccggccgggagggccagcag3120 tccaacgccggctccgtcccgctggagatgcacaccgagaacgccttccaCCCCCaCCgC3180 cccgactacgtgggcctgttctgcgtccgcagcgaccacgaccgggccgcgggactgcgg3240 gtcgcctccgtccgcgccgtgatggaccacctggacgcggggacacgcgagatgctgcgg3300 cagcccctgttcaccaccgagccgcccccttccttcgggcggccggacagcgggaccaag3360 ccgcacgccgtgctcaccggcgacgccgaggacccggacatccgggl=ggacttccacgcc3420 acccacaccagcgatccctggggcaggcaggccatggaggccctggcggaggccgtccgc3480 accgtctccgaggaactggtcctggaaccggccgacctggtgtacgtggacaaccgcgtc3540 gctctgcacggacgcacggccttcgtcccccgctacgacggacaggaccgctggctccag30'00 cgcgccttcgtccacctggaccaccggcggtcccgcgccgcacgcgcgcaccattcgcgt3660 gtactgagctgatggcggtggaccaggtgattcacgccaccggcctgcacaagcgcttcg3720 ccgcggtccaggccctcgccggcgtggacctgacggtggcgcgcggggagatcatgggcc3780 ttctgggccacaacggggcggggaagaccaccctcgtcaacgtcctgtccaccctcacgc3840 CCCCCaCCtCgggcaccgcgtcggtcgccggtttcgacgtggtcggccgccccgacgagg3900 tgcgtcggcgcatcggcgtcaccgggcagttcgccgcgctcgacgaggagctgtccggtt3960 
acgacaacctcgtcctggtggcccgcttgtgcggagcctccaaggcacaggcggtcggcc4020 gggcggacgagttgctggagatgttcggcctccgtgccttcgcacggcgcagggcggtgt4080 cgtactcgggcgggatgcgccgccggctggatctggcgctcgggctggccggccgtcccg4140 acgtcctgttcctcgacgagccgtccgtggggttggacctgcccagccgtctcggcctgt4200 gggagatggtgcaggggctcgcgcgggacggcacggcggtcctgctgacgacgcagtacc4260 tggaggaggccgatcggctcgccgaccggatcaccgtcctgggggcgggccgggttctgg4320 tgtcgggtaccgcggtggagctgaaggcacgggccggtcgcgggtccatctcgctgcggg4380 tcggaccccagggcgacaggaccgtcgccgccgaggccctgcaccgggccggcttcccgt4440 cgggcgtggacgacgggcgcggtgagctgaccgtgccggcgggggattcggcggacctgg4500 ccgtggtgatccgggtgctggacgccgtgggacagaacgtcacggagatccgctacgcgg4560 agccctccctcgacgacgtctacctcgccttcaccgccacggcccccgacgcccccgctc4620 cggcgacggg agccctcagc caccctcccg cccccgctcc ggcgcgtatc gcccaccgca 4680 ccaccctcccggcggccgaacgacgcaccggcgacacctcggagaattcgtgaacgcacc4740 tggaacgatgacgccctcggccgagcccgcgacgcgggagcgggcaccgtgctggagacc4800 ggcgggcagggccgcacaactgcgggtgctcaccgcacgccagatccgtctggtctacgc4860 cgaccgcagggtggtgctgttcagcgtggcgcagccggtggtgatgctgctgctgatcag4920 ccaggtcttcggcagcctcgcggaccgctcgatcctgccgcgcggggtgacctacatcga4980 gttcctgctgcccgctctcctcgtcaccaccgggatcggcacctcgcagtccgcgggggt5040 gggactggtgcgggacatggagggcggcatggtgcgccgcttccgcgtcctgccgttgtc5100 actgccgctggtgctggtcgcgcgttcgatcgcggatctgacgcgctcgggaatgcaact5160 gctcgtcctggtcgtcgcgggccacctgctgttcggctaccgggccgggggaggagcggt5220 gggactggtggcggcgttgtcgctgtccacggtggtgatctggtcgctgatctggatctt5280 catcgcgctcgccacgtggctgcgaaaggtggaggtgctctccagcatcggattcttcgt5340 lls caacttcccgctgatgttcgcttccagcgcgttcgtaccggtcgacgtcctgcccggttg5400 gctggcggccgtcgccaccgtcaatccggtgagtcacgcggtggaggcgtcgcggagcct5460 tgccctgggagagcaggccggaagcgaggtgaccgcggcgctgtgcggggccctgggcct5520 gctggtgacgatgacggtcctcgccgcgcgcgctgtccgccgaccccccgacgagtgacg5580 tgacgacacatcgacggatcctgtcgtcccgggcgagcgacgtcggacgacgcgacgccc5640 ggtcacgagctgcggcggcggtgcctgcgccgccaccacacggccgcgagtgtgaaggcg5700 cccgcgacgcccagcagggaccacaagatggtggagagcacgctctcggcctcgcgcagg5760 
gaggccgagaccagcgttcccgcggacacgtacagcgcggaccacatcgcggctccggcg5820 agggaggcgggcaggaagcggcggtagcgcacggagccgaccccggcggtcgcgggggtg5880 agggtgcgcaccacgggcagcagccgggtcaggaagacggCgCgCgCCCCgtgccggtgg5940 cagagatcctgcgcgcggtcccaatggtgctgccccatccgccgcaccagccgcgtctcc6000 cgcatccgctgcccgtagcggatgccgaggaagtagccgatgtggtcgccggccgagctg6060 ctgagcgtgacgacgaggaaaagggccagcagcgggcgcgtcccctccgttccggcgctc6120 agggccagcaccgcgacctcgccggggacgaccatgccggcaccgaggccggactccgcg6180 aacgcgaacgcggaggccagggcgaatctggcgaccgggctcatgtccgacaccgctgtc6240 agcacatcgttcatccacgacatggcagccccgccccgcctctcctcgctcgtggagccc6300 tcccggcggcgcccctgggattcccacgcccctcttccgagaacaca aagggaacag6360 ccg ggaaacgacttcccggtgtcaccggacgcatacccggccggccggtgcggtcgcccggcc6420 gcctgaaaaggacgaagggagaccaacgtaccagggaaccgccggacgacttcttccttc6480 cggccacgaccacccccgcgacggccccgccgatcgtcttcggcgac ctccgacgac6540 cgt gggggagaggagattcgtcgaccggatccgaccggatccgaccggaaggggaatgtgagg6600 gaacctccgtccgaagcagaggtgaaaaacaaaaaccgccaaccatcacgacgcgaagcg6660 accaccccgcagccggtccgacgcgcgaccgaccccgaccctccgacggcattcctccga6720 tatgggtgaattcggcccgcaggaatccctgcccgccgtcggacgggaattgctgaggcc6780 gaatcccggcggaagcaccgcgcctttacccggagttcgcgccgccc~gcagggaaccggc6840 acggcgcgccccggggcaggacgctccctcgtgagcactcttgacgcccgatcggcgccc6900 tgaaagggtgcgacggtgaaactgcggaccggactggccgagcgccggtgctctccccga6960 actccgctcggcgcactccggtgcggaccgatggaataacgggccgcccgtggcttgcga7020 aaggtgaaaaccctgtatacggaacgaccagtcgtccatgacgacatatcgactgacc7080 ca ggattggtggaaatgcaggcggatgcaccggcgggaacaaaaacaggaacagaaacctac7140 ttaccgtccctgagtctcgaagactatctgcgcgacacggttccggcccacccggtcctg7200 aaatcgtccgtggatttcggccgcccaggctctgatgaagcgctcagagcactggccgcg7260 acgaccacggaattcgattccgacgagaccggacgcggcgacacctatcgcagggcccag7320 caggacccctccgtccgctggaggggaatgcgacaactgctcgaactggccgcgccctcc7380 cgcgccccctccgacacggccgccccccgcaccgtcctcgacgttctgggcggggacgga7440 accatcgcccgcgccgtccaCgaCCaCgCCCgCgagCtgtgggaCCgCCCgcacatcctc7500 accggtgacctctcgggggacatggtcgaacgcgccctcgcccagggcctggccgccatc7560 
cgacaggccgcggaccatctcttcctggccgacggcaccatggacg~~ggcactgctcgcc7620 tacggcacccaccacatcgcgccgcaggagaggctcgccgccgtcaccgaggccctgcgc7680 gtcgtcaaggacggcggccacgtggtcctgcacgacttcgacgacgccagccccatggca7740 cgcttcttcaccgacatcgtccacccccacaccacagccggccacgactaccgccacttc7800 tcccgcgaccttctggccgaactcttcgccgaggcgggcacaccgg~~ccgcgtcgtcgac7860 ctgtacgacccactcgtagtccgcggcaccaccgaggaggaggcgcgccgccgcatgtgc7920 gcgtacgtggccgacatgtacggagtcggcgcgttcttcgccaccctgggcggaaccgac7980 gcgtgctggcgactcctggaggagtacttccagcacgacacctacctgtcgaccctgccc8040 gagcagaccgacttcaccaccgcgccgatcgtctaccgctccgagggcgccttcatcgcc8100 gagataccccgcgcggccatcgtcgccgtctcccacaagccacccacctgacccccgctt8160 ccggaccccggggaccggcgcccccaccctcctgggactgcccgcagaggagtgggggcc8220 ggtgatcgcgagccctgtccggcaggagagccgggcctggcagtccggagtccgaggacg8280 gcgatgttccggcggagcggcgtcggcagtgacgcgcggtc 8321 Information for SEQ ID NO: 18 Length: 264 Type: PRT
Organism: Streptomyces fradiae Sequence: 18 Met Leu Ala Arg Ile Asn Gly Ile Asp Leu Asp His Glu Arg Arg Gly Ser Gly Ser Pro Val Leu Leu Ile Met Gly Ser Gly Ala Ala Gly Thr Gly Trp His Leu His Gln Val Pro Ala Leu Val Ala Ala Gly Phe Glu Ala Val Thr Phe Thr Asn Arg Gly Ile Thr Pro Ser Gly Gly Gly Pro Gly Phe Thr Leu Gln Asp Met Ala Ala Asp Thr Ile Gly Leu Ile Glu His Leu Gly Leu Gly Pro Cys Ala Val Val Gly Thr Ser Leu Gly Ala Arg Val Ala Cys Glu Val Ala Arg Thr Arg Pro Asp Leu Val Ser Arg Cys Val Leu Met Ala Pro Arg Ala Arg Ala Asp Arg Thr Arg Ala Ala Ala Thr Glu Ala Glu Ile Ala Leu Ala Asp Ser Gly Val Thr Val Pro Pro Arg Tyr Arg Ala Val Val Arg Ala Met Gln Asn Leu Ser Pro Arg Thr Leu Ala Asp Asp Glu Arg Ile Ala Asp Trp Leu Asp Leu Phe Glu Leu Ala Ala Ala Ala Gly Pro Gly Ala Arg Thr Gln Leu Glu Ile Ser Ala Val Tyr His Arg Glu Glu Asp Leu Ala Arg Ile Thr Ala Pro Cys Arg Val Ile Ala Phe Ala Asp Asp Ile Val Ala Pro Ala His Leu Ala Lys Glu Ile Ala Asp Ala Leu Pro Glu Ala Asp Tyr His Val Val Pro Asp Cys Gly His Tyr Gly Tyr Leu Glu Gln Pro Asp Arg Val Asn Arg Leu Ile Thr Gln Phe Leu Ala Ala Information for SEQ
ID NO: 19 Length: 795 Type: DNA
Organism: Streptomyces fradiae Sequence: 19 atgctggcac ggatcaatgg gatcgacctg gatcacgaac gcagaggcag tggttccccc 60 gtcctcctga tcatggggag cggcgccgcc ggaacgggct ggcatttgca ccaggtgccc 120 gcgctggtcg ccgccggttt cgaggccgtc accttcacca accggggcat cacccccagc 180 ggcgggggcc ccggtttcac ccttcaggac atggccgccg acaccatcgg cctgatcgaa 240 cacctcgggc tgggcccgtg cgcggtcgtg gggacgtccc tgggggccag ggtcgcgtgc 300 gaggtcgccc gtacccgccc cgacctggtc tcccggtgcg tcctcatggc gccgcgagca 360 cgcgcggacc ggacgagggc cgccgcgacc gaggcggaga tcgccctggc cgacagcggc 420 gtcaccgtcc cgccccgcta ccgggcggtg gtgcgggcca tgcagaacct ctcaccccgg 480 acgctcgcgg acgacgaacg gatcgccgac tggctcgacc tcttcgaact ggcggcggcc 540 gccggccccg gtgcccgcac ccagctggag atcagcgccg tctaccaccg cgaggaggac 600 ctggcccgga tcaccgcccc ctgccgggtg atcgccttcg ccgacgacat cgtggcgccg 660 gcgcatctgg ccaaggagat cgccgacgcc ctgcccgagg ccgactacca cgtggtgccc 720 gactgcggcc actacggcta cctcgaacaa cccgaccggg tcaaccggct catcacccaa 780 ttcctcgccg catga 795 Information for SEQ ID NO: 20 Length: 331 Type: PRT
Organism: Streptomyces fradiae Sequence: 20 Met Glu Pro Thr Thr Ala Trp Arg Pro Ala Val Ile Ser Pro Asp Ser His Ala Leu Pro Ala Thr Ala Asp Ala Leu Ala Gly Leu Leu Gln Asp Ser Ala Arg Thr Asp Glu Leu Leu Ala Ala His Lys Val Leu Phe Leu Ser Gly Phe Gly Val Gly Pro Leu Glu Leu Glu Lys Ile Met Pro Leu Leu Leu Pro Asp Arg Leu Pro Tyr Val Phe Gly Asn Ser Pro Arg Thr Lys Val Gly His Asn Val Tyr Thr Ser Thr Glu Tyr Pro Ala Glu Phe Thr Ile Ser Met His Ser Glu Met Ser Tyr Ala Ala Arg Trp Pro Ala Arg Leu Leu Phe Tyr Cys Glu Arg Ala Ala Asp Thr Gly Gly Ala Thr Pro Val Val Asp Asn Ala Ala Trp Tyr Arg Ala Leu Asp Lys Asp Val Arg Asp Ala Tyr Ala Gly Gly Leu Arg Tyr Thr Gln Asn Leu His Gly Gly Arg Gly Leu Gly Lys Ser Trp Gln Asp Thr Phe Glu Thr Glu Asp Arg Ser Glu Val Glu Glu Tyr Leu Ser Arg Thr Gly Ala Thr Trp Gln Trp Asn Ala Arg Asn Gly Leu Arg Val Ser His Val Arg Pro Ala Thr Ile Glu His Pro Ala Thr Gly Glu Arg Leu Trp Phe Asn Gln Ser Asp Gln Trp His Pro Ala Thr Leu Gly Gly Glu Ala Ala Ala Leu Met Glu Leu Leu Pro Pro Glu Glu Leu Pro Gln Ser Val Ala Phe Ala Asp Gly Ser Pro Ile Pro Ala Glu Tyr Ala Arg Gln Val Arg Asp Arg Gly Leu Glu His Ala Val Asp Asn Asp Trp Arg Pro Gly Asp Leu Met Leu Val Asp Asn Val Gln Ala Ala His Gly Arg Arg Pro Phe Thr Gly Asp Arg Arg Ile Leu Val Ala Met Ser Asp His Gly Arg Pro His Arg Pro Gly Gln Pro Pro Arg Pro Pro Gln Glu Gly His Arg Information for SEQ
ID NO: 21 Length: 996 Type: DNA
Organism: Streptomyces fradiae Sequence: 21 atggaaccga ccaccgcctg gcggcccgcc gtgatcagtc cggacagcca cgcgctgccc 60 gccaccgccg acgccctggc gggcctgctc caggactccg cccgcaccga cgaactcctg 120 gccgcccaca aagtgctgtt cctcagcggc ttcggagtgg gcccgctgga gctggagaag 180 atcatgccgc tcctgctccc cgaccgcctg ccctacgtct tcggcaactc cccgcgcacc 240 aaggtcggac acaacgtgta cacctcgacg gagtacccgg cggaattcac catctcgatg 300 cacagcgaga tgtcgtacgc cgcgcgatgg ccggcccggc tgctcttcta ctgcgagcgg 360 gcggccgaca ccggcggcgc gaccccggtg gtggacaacg ccgcctggta ccgggcactg 420 gacaaggacg tccgcgacgc ctacgcgggc ggcctgcgct acacccagaa cctccacgga 480 ggacgcgggc tcggcaagag ctggcaggac accttcgaga cggaggaccg ctccgaggtc 540 gaggagtacc tctcccgcac cggcgccacc tggcagtgga acgcccgcaa cggactgcgc 600 gtcagccacg tacgacccgc gacgatcgaa caccccgcca ccggcgagcg gctgtggttc 660 aaccagagcg accagtggca ccccgccacg ctcggcggcg aggccgccgc gctgatggag 720 ctgctgcccc cggaggaact gccccagtcg gtcgccttcg ccgacggctc cccgattccg 780 gccgagtacg cgcgccaggt ccgcgaccgc ggactggaac acgccgtgga caacgactgg 840 cgccccggcg acctcatgct cgtcgacaac gtccaggcgg cccacggccg caggcccttc 900 accggcgacc gccgcatcct ggtcgccatg tcggaccacg gacgccccca ccgccccgga 960 cagccgccgc gtcccccgca ggaaggccac cgatga 996 Information for SEQ ID NO: 22 Length: 262 Type: PRT
Organism: Streptomyces fradiae Sequence: 22 Met Thr Ile Ala Leu Ala Asp Val Glu Gly Leu Asn Gln His Glu Thr Glu Phe Leu Tyr Asp Glu Ile Phe Thr Arg Arg Ala Tyr Leu Pro Glu Ala Leu His Leu Pro Glu Ala Pro Val Val Phe Asp Val Gly Ala Asn Ile Gly Met Phe Thr Leu Phe Val Arg Ser Glu Arg Pro Gly Ala Thr Val His Ser Phe Glu Pro Val Pro Pro Val Arg Asp Ile Leu Cys Arg Asn Arg Glu Arg His Ala Val Ala Gly Leu Val His Pro Tyr Gly Leu Ala Glu Ala Glu Gln Glu Val Glu Phe Thr His Tyr Pro Gly Tyr Ser Thr Met Ser Thr Arg Ser Thr Leu Ala Asp Thr Glu Ala Glu Arg Ala Phe Val Arg Gly Gln Val Arg Thr Ala Asp Leu Pro Glu Ala Glu Arg Met Leu Asp Glu Leu Leu Ala Phe Arg Phe Arg Glu Glu Lys Val Thr Cys Arg Leu Arg Pro Leu Ser Ala Val Leu Asp Glu His Pro Val Asp Arg Ile Asp Leu Leu Lys Ile Asp Val Gln Arg Gly Glu Arg Glu Val Leu Arg Gly Leu Glu Asp Arg His Trp Pro Leu Val Arg Gln Ile Ala Met Glu Val His Asp Ser Pro Gly Gly Ser Thr Ala Gly Arg Leu Arg Ala Val Ala Asp Glu Leu Glu Arg Arg Gly Phe Asp Val Leu Thr Glu Gln Glu Asp Arg Tyr Ala Gly Thr Asp Arg His Ser Val Phe Ala Val Ala Glu Pro Arg Arg Gly Information for SEQ ID NO: 23 Length: 789 Type: DNA
Organism:Streptomyces fradiae Sequence:23 atgaccatcgccctcgccgacgtggaagggctcaaccagcacgagaccgagttcctctac60 gacgagatcttcacccgccgcgcctacctgcccgaggccctgcacctgcccgaggccccc120 gtggtcttcgacgtcggggccaacatcggcatgttcaccctcttcgtacgctcggaacga180 cccggtgccacggtccactccttcgaaccggtccccccggtgcgcgacatcctgtgccgc240 aaccgggagcgccacgcggtggcggggctcgtccatccctacggcctcgccgaggcggaa300 caggaagtcgagttcacccactatccgggctactcgaccatgtccacgcgcagcaccctg360 gcggacaccgaggcggaacgggccttcgtccgggggcaggtgcggaccgccgacctgccc420 gaggccgagcggatgctggacgaactcctcgccttccggttcagggaggagaaggtgacc480 tgccggctccgccccctctccgccgtcctcgacgagcatcccgtcgaccggatcgacctg540 ctgaagatcgacgtccagcgcggtgagcgggaggtgctgcgaggactggaggaccggcac600 tggccactggtccgccagatcgccatggaggtgcacgacagccccggcggcagcaccgcc660 ggccggctgcgagcggtggccgacgagctggagcggcgcgggttcgacgtcctgaccgag720 caggaggaccggtacgcgggcaccgaccgccacagcgtgttcgccgtcgcggaaccgcgt780 cgcggctga 789 Information for SEQ ID NO: 24 Length: 319 Type: PRT
Organism: Streptomyces fradiae Sequence: 24 Met Glu Pro Glu Asn Thr Phe Thr Leu Ser Glu Ala Glu Arg Asp Asp Val Ala Ala Leu Ala Gln Glu Leu Thr Arg Ala Arg Pro Gly Leu Val Asp Glu Arg Glu Trp Leu Asp Arg Cys Arg Thr Leu Ser Cys His Leu Pro Ala Arg Leu Gln Asp Arg Leu Arg Ala Phe Arg His Asp Pro Gly Pro Thr Gly Arg Leu Leu Leu Arg Asn Leu Pro Ala Ala Asp Ser Val Pro Ala Thr Pro Arg Glu Pro Asp Ser Val Glu Arg Arg Ala Thr Leu Ser Ala Ser Val Leu Cys Ala Leu Ser Met Glu Leu Gly Asp Val Ile Ala Tyr Asn Glu Lys Gln Ala Leu Gln Asn Val Pro Arg Gly Val Val Val Pro Arg Glu Gly Gln Ser Asn Gly Ser Pro Leu Gly Gln Ala Val Glu Met Thr Glu Asn Ala His Pro Arg Pro Tyr Val His Phe His Asp Gly Leu Cys Val Arg Ser His Asp Ala Ala Leu Arg Phe Asp Arg Gly Val Ala Val Arg Ala Val Asp His Asp Ala Thr Arg Ser Met Leu Gly Glu Met Arg Gln Pro Leu Thr Thr Pro Pro Ser Phe Leu Phe Glu Pro Gly Arg Asp Ser Gly Thr Pro His Val Leu Gly Asp Pro Lys Ala Thr Ala Glu Pro Asp Ile Arg Asp Phe Ala Thr Thr Ser Asp Val His His Asp Pro Gly Arg Gln Ala Glu Ala Ala Glu Val Arg Trp Met Leu Ala Thr Val Glu Glu Leu Val Glu Pro Asp Leu Tyr Val Ser Leu Ala Val Asp Asn Val Ala Leu His Arg Thr Phe Val Arg Tyr Arg Gly Ala Pro Asp Gly Asp Arg Trp Leu Arg Ala Val His Asp His Gln Gln Phe Leu Arg Arg Arg Ala Ala Arg His His Arg Val Ser Ser Ala Ser Leu Information for SEQ ID NO:

25 Length: 960 Type: DNA

Organism:Streptomyces fradiae Sequence:25 atggaacccgagaacacctt caccctgtccgaagccgaacgcgacgacgtggccgccttg60 gcccaggagttgacgcgcgc gcgccccggcctggtggacgagagggaatggctcgaccgg120 tgccgcaccctctcctgcca cctgcccgcccgcctccaggaccggcr_ccgcgccttccgc180 cacgaccccggccccaccgg caggctgctcctgcgcaaccttcccgccgccgacagcgtg240 ccggccaccccgcgggagcc cgactccgtggagcgcagggcgacgctgagcgcctccgtg300 ctgtgcgccctgtccatgga actcggtgacgtcatcgcctaccgcaacgagaagcagggc360 gcgctcgtgcagaacgtggt gccggtgcccggccgggagggccagcagtccaacgccggc420 tccgtcccgctggagatgca caccgagaacgccttccacccccaccgccccgactacgtg480 ggcctgttct gcgtccgcag cgaccacgac cgggccgcgg gactg<:gggt cgcctccgtc 540 cgcgccgtga tggaccacct ggacgcgggg acacgcgaga tgctgcggca gcccctgttc 600 accaccgagc cgcccccttc cttcgggcgg ccggacagcg ggaccaagcc gcacgecgtg 660 ctcaccggcg acgccgagga cccggacatc cgggtggact tccacgccac ccacaccagc 720 gatccctggg gcaggcaggc catggaggcc ctggcggagg ccgtccgcac cgtctccgag 780 gaactggtcc tggaaccggc cgacctggtg tacgtggaca accgcgtcgc tctgcacgga 840 cgcacggcct tcgtcccccg ctacgacgga caggaccgct ggctccagcg cgccttcgtc 900 cacctggacc accggcggtc ccgcgccgca cgcgcgcacc attcgcgtgt actgagctga 960 Information for SEQ ID N0: 26 Length: 353 Type: PRT
Organism: Streptomyces fradiae Sequence: 26 Met Ala Val Asp Gln Val Ile His Ala Thr Gly Leu His Lys Arg Phe Ala Ala Val Gln Ala Leu Ala Gly Val Asp Leu Thr Val Ala Arg Gly Glu Ile Met Gly Leu Leu Gly His Asn Gly Ala Gly Lys Thr Thr Leu Val Asn Val Leu Ser Thr Leu Thr Pro Pro Thr Ser Gly Thr Ala Ser Val Ala Gly Phe Asp Val Val Gly Arg Pro Asp Glu Val Arg Arg Arg Ile Gly Val Thr Gly Gln Phe Ala Ala Leu Asp Glu Glu Leu Ser Gly Tyr Asp Asn Leu Val Leu Val Ala Arg Leu Cys Gly Ala Ser Lys Ala Gln Ala Val Gly Arg Ala Asp Glu Leu Leu Glu Met Phe Gly Leu Arg Ala Phe Ala Arg Arg Arg Ala Val Ser Tyr Ser Gly Gly Met Arg Arg Arg Leu Asp Leu Ala Leu Gly Leu Ala Gly Arg Pro Asp Val Leu Phe Leu Asp Glu Pro Ser Val Gly Leu Asp Leu Pro Ser Arg Leu Gly Leu Trp Glu Met Val Gln Gly Leu Ala Arg Asp Gly Thr Ala Val Leu Leu Thr Thr Gln Tyr Leu Glu Glu Ala Asp Arg Leu Ala Asp Arg Ile Thr Val Leu Gly Ala Gly Arg Val Leu Val Ser Gly Thr Ala Val Glu Leu Lys Ala Arg Ala Gly Arg Gly Ser Ile Ser Leu Arg Val Gly Pro Gln Gly Asp Arg Thr Val Ala Ala Glu Ala Leu His Arg Ala Gly Phe Pro Ser Gly Val Asp Asp Gly Arg Gly Glu Leu Thr Val Pro Ala Gly Asp Ser Ala Asp Leu Ala Val Val Ile Arg Val Leu Asp Ala Val Gly Gln Asn Val Thr Glu Ile Arg Tyr Ala Glu Pro Ser Leu Asp Asp Val Tyr Leu Ala Phe Thr Ala Thr Ala Pro Asp Ala Pro Ala Pro Ala Thr Gly Ala Leu Ser His Pro Pro Ala Pro Ala Pro Ala Arg Ile Ala His Arg Thr Thr Leu Pro Ala Ala Glu Arg Arg Thr Gly Asp Thr Ser Glu Asn Ser Information for SEQ
ID NO: 27 Length: 1062 Type: DNA

Organism:Streptomyces fradiae Sequence:27 atggcggtggaccaggtgattcacgccaccggcctgcacaagcgcttcgccgcggtccag60 gccctcgccggcgtggacctgacggtggcgcgcggggagatcatgggccttctgggccac120 aacggggcggggaagaccaccctcgtcaacgtcctgtccaccctcacgccccccacctcg180 ggcaccgcgtcggtcgccggtttcgacgtggtcggccgccccgacgaggtgcgtcggcgc240 atcggcgtcaccgggcagttcgccgcgctcgacgaggagctgtccggttacgacaacctc300 gtcctggtggcccgcttgtgcggagcctccaaggcacaggcggtcggccgggcggacgag360 ttgctggagatgttcggcctccgtgccttcgcacggcgcagggcggtgtcgtactcgggc420 gggatgcgccgccggctggatctggcgctcgggctggccggccgtcccgacgtcctgttc480 ctcgacgagccgtccgtggggttggacctgcccagccgtctcggcctgtgggagatggtg540 caggggctcgcgcgggacggcacggcggtcctgctgacgacgcagt<~cctggaggaggcc600 gatcggctcgccgaccggatcaccgtcctgggggcgggccgggttctggtgtcgggtacc660 gcggtggagctgaaggcacgggccggtcgcgggtccatctcgctgcgggtcggaccccag720 ggcgacaggaccgtcgccgccgaggccctgcaccgggccggcttcccgtcgggcgtggac780 gacgggcgcggtgagctgaccgtgccggcgggggattcggcggacctggccgtggtgatc840 cgggtgctggacgccgtgggacagaacgtcacggagatccgctacqcggagccctccctc900 gacgacgtctacctcgccttcaccgccacggcccccgacgcccccgctccggcgacggga960 gccctcagccaccctcccgcccccgctccggcgcgtatcgcccacc:gcaccaccctcccg1020 gcggccgaacgacgcaccggcgacacctcggagaattcgtga 1062 Information for SEQ ID NO: 28 Length: 282 Type: PRT
Organism: Streptomyces fradiae Sequence: 28 Val Asn Ala Pro Gly Thr Met Thr Pro Ser Ala Glu Pro Ala Thr Arg Glu Arg Ala Pro Cys Trp Arg Pro Ala Gly Arg Ala Ala Gln Leu Arg Val Leu Thr Ala Arg Gln Ile Arg Leu Val Tyr Ala Asp Arg Arg Val Val Leu Phe Ser Val Ala Gln Pro Val Val Met Leu Leu Leu Ile Ser Gln Val Phe Gly Ser Leu Ala Asp Arg Ser Ile Leu Pro Arg Gly Val Thr Tyr Ile Glu Phe Leu Leu Pro Ala Leu Leu Val Thr Thr Gly Ile Gly Thr Ser Gln Ser Ala Gly Val Gly Leu Val Arg Asp Met Glu Gly Gly Met Val Arg Arg Phe Arg Val Leu Pro Leu Ser Leu Pro Leu Val Leu Val Ala Arg Ser Ile Ala Asp Leu Thr Arg Ser Gly Met Gln Leu Leu Val Leu Val Val Ala Gly His Leu Leu Phe Gly Tyr Arg Ala Gly Gly Gly Ala Val Gly Leu Val Ala Ala Leu Ser Leu Ser Thr Val Val Ile Trp Ser Leu Ile Trp Ile Phe Ile Ala Leu Ala Thr Trp Leu Arg Lys Val Glu Val Leu Ser Ser Ile Gly Phe Phe Val Asn Phe Pro Leu Met Phe Ala Ser Ser Ala Phe Val Pro Val Asp Val Leu Pro Gly Trp Leu Ala Val Ala Thr Val Pro Val His Ala Ala Asn Ser Val Glu Ala Ser Arg Leu Ala Leu Gly Gln Ala Ser Glu Ser Glu Gly Val Thr Ala Ala Leu Gly Ala Leu Gly Leu Val Met Thr Cys Leu Thr Val Leu Ala Ala Arg Val Arg Arg Pro Ala Pro Asp Glu Information for SEQ ID NO:

29 Length: 849 Type: DNA

Organism:Streptomyces fradiae Sequence:29 gtgaacgcacctggaacgat gacgccctcggccgagcccgcgacgcgggagcgggcaccg60 tgctggagaccggcgggcag ggccgcacaactgcgggtgctcaccgcacgccagatccgt120 ctggtctacgccgaccgcag ggtggtgctgttcagcgtggcgcagccggtggtgatgctg180 ctgctgatcagccaggtctt cggcagcctcgcggaccgctcgatcctgccgcgcggggtg240 acctacatcgagttcctgct gcccgctctcctcgtcaccaccgggatcggcacctcgcag300 tccgcgggggtgggactggt gcgggacatggagggcggcatggtgcgccgcttccgcgtc360 ctgccgttgtcactgccgct ggtgctggtcgcgcgttcgatcgcggatctgacgcgctcg420 ggaatgcaactgctcgtcct ggtcgtcgcgggccacctgctgttcggctaccgggccggg480 ggaggagcggtgggactggt ggcggcgttgtcgctgtccacggtggtgatctggtcgctg540 atctggatcttcatcgcgct cgccacgtggctgcgaaaggtggaggtgctctccagcatc600 ggattcttcgtcaacttccc gctgatgttcgcttccagcgcgttcgtaccggtcgacgtc660 ctgcccggttggctggcggc cgtcgccaccgtcaatccggtgagtcacgcggtggaggcg720 tcgcggagccttgccctggg agagcaggccggaagcgaggtgaccgcggcgctgtgcggg780 gccctgggcctgctggtgac gatgacggtcctcgccgcgcgcgctgtccgccgacccccc840 gacgagtga 849 Informationfor SEQ ID NO:

30 Length: 206 Type: PRT

Organism: Streptomyces fradiae Sequence: 30 Met Ser Trp Met Asn Asp Val Leu Thr Ala Val Ser Asp Met Ser Pro Val Ala Arg Phe Ala Leu Ala Phe Phe Ala Glu Ser Ala Ser Ala Gly Leu Gly Ala Gly Met Val Gly Glu Ala Val Leu Ala Val Pro Val Leu Ser Ala Gly Thr Glu Gly Pro Leu Ala Leu Phe Leu Thr Arg Leu Val Val Thr Leu Ser Ser Ser Asp His Gly Tyr Phe Leu Ala Gly Ile Gly Ile Arg Tyr Gly Gln Arg Glu Thr Leu Val Arg Arg Met Arg Arg Met Gly Gln His His Trp Asp Gln Asp Cys His Arg His Arg Ala Leu Gly Ala Arg Ala Val Phe Leu Leu Leu Val Val Arg Thr Thr Arg Pro Leu Thr Pro Ala Thr Ala Gly Ser Val Tyr Arg Arg Phe Val Gly Arg Leu Pro Ala Ser Leu Ala Gly Met Trp Ala Leu Tyr Val Ala Ala Ser Ser Ala Gly Thr Leu Val Ser Leu Arg Ala Glu Ser Val Ala Ser Glu Leu Ser Thr Ile Leu Trp Ser Gly Val Gly Ala Phe Thr Leu Leu Ala Leu Ala Ala Val Trp Trp Arg His Arg Arg Ser Ser Arg Arg Arg Information for SEQ ID
NO: 31 Length: 621 Type: DNA

Organism: Streptomyces fradiae Sequence: 31 atgtcgtgga tgaacgatgt gctgacagcggtgtcggacatgagcccggt cgccagattc60 gccctggcct ccgcgttcgc gttcgcggagtccggcctcggtgccggcat ggtcgtcccc120 ggcgaggtcg cggtgctggc cctgagcgccggaacggaggggacgcgccc gctgctggcc180 cttttcctcg tcgtcacgct cagcagctcggccggcgaccacatcggcta cttcctcggc240 atccgctacg ggcagcggat gcgggagacgcggctggtgcggcggatggg gcagcaccat300 tgggaccgcg cgcaggatct ctgccaccggcacggggcgcgcgccgtctt cctgacccgg360 ctgctgcccg tggtgcgcac cctcacccccgcgaccgccggggtcggctc cgtgcgctac420 cgccgcttcc tgcccgcctc cctcgccggagccgcgatgtggtccgcgct gtacgtgtoc480 gcgggaacgc tggtctcggc ctccctgcgcgaggocgagagcgtgctctc caccatcttg540 tggtccctgc tgggcgtcgc gggcgccttcacactcgcggccgtgtggtg gcggcgcagg600 caccgccgcc gcagctcgtg a 621 Information for SEQ ID NO: 32 Length: 352 Type: PRT
Organism: Streptomyces fradiae Sequence: 32 Met Gln Ala Asp Ala Pro Ala Gly Thr Lys Thr Gly Thr Glu Thr Tyr Leu Pro Ser Leu Ser Leu Glu Asp Tyr Leu Arg Asp Thr Val Pro Ala His Pro Val Leu Lys Ser Ser Val Asp Phe Gly Arg Pro Gly Ser Asp Glu Ala Leu Arg Ala Leu Ala Ala Thr Thr Thr Glu Phe Asp Ser Asp Glu Thr Gly Arg Gly Asp Thr Tyr Arg Arg Ala Gln Gln Asp Pro Ser Val Arg Trp Arg Gly Met Arg Gln Leu Leu Glu Leu Ala Ala Pro Ser Arg Ala Pro Ser Asp Thr Ala Ala Pro Arg Thr Val Leu Asp Val Leu Gly Gly Asp Gly Thr Ile Ala Arg Ala Val His Asp His Ala Arg Glu Leu Trp Asp Arg Pro His Ile Leu Thr Gly Asp Leu Ser Gly Asp Met Val Glu Arg Ala Leu Ala Gln Gly Leu Ala Ala Ile Arg Gln Ala Ala Asp His Leu Phe Leu Ala Asp Gly Thr Met Asp Ala Ala Leu Leu Ala Tyr Gly Thr His His Ile Ala Pro Gln Glu Arg Leu Ala Ala Val Thr Glu Ala Leu Arg Val Val Lys Asp Gly Gly His Val Val Leu His Asp Phe Asp Asp Ala Ser Pro Met Ala Arg Phe Phe Thr Asp Ile Val His Pro His Thr Thr Ala Gly His Asp Tyr Arg His Phe Ser Arg Asp Leu Leu Ala Glu Leu Phe Ala Glu Ala Gly Thr Pro Ala Arg Val Val Asp Leu Tyr Asp Pro Leu Val Val Arg Gly Thr Thr Glu Glu Glu Ala Arg Arg Arg Met Cys Ala Tyr Val Ala Asp Met Tyr Gly Val Gly Ala Phe Phe Ala Thr Leu Gly Gly Thr Asp Ala Cys Trp Arg Leu Leu Glu Glu Tyr Phe Gln His Asp Thr Tyr Leu Ser Thr Leu Pro Glu Gln Thr Asp Phe Thr Thr Ala Pro Ile Val Tyr Arg Ser Glu Gly Ala Phe Ile Ala Glu Ile Pro Arg Ala Ala Ile Val Ala Val Ser His Lys Pro Pro Thr Information for SEQ
ID NO: 33 Length: 1059 Type: DNA

Organism:Streptomyces fradiae Sequence:33 atgcaggcggatgcaccggcgggaacaaaaacaggaacagaaacctacttaccgtccctg60 agtctcgaagactatctgcgcgacacggttccggcccacccggtcctgaaatcgtccgtg120 gatttcggccgcccaggctctgatgaagcgctcagagcactggccgcgacgaccacggaa180 ttcgattccgacgagaccggacgcggcgacacctatcgcagggcccagcaggacccctcc240 gtccgctggaggggaatgcgacaactgctcgaactggccgcgccctcccgcgccccctcc300 gacacggccgccccccgcaccgtcctcgacgttctgggcggggacggaaccatcgcccgc360 gccgtccacgaccacgcccgcgagctgtgggaccgcccgcacatcctcaccggtgacctc420 tcgggggacatggtcgaacgcgccctcgcccagggcctggccgccatccgacaggccgcg480 gaccatctcttcctggccgacggcaccatggacgcggcactgctcgcctacggcacccac540 cacatcgcgccgcaggagaggctcgccgccgtcaccgaggccctgcgcgtcgtcaaggac600 ggcggccacgtggtcctgcacgacttcgacgacgccagccccatggcacgcttcttcacc660 gacatcgtccacccccacaccacagccggccacgactaccgccacttctcccgcgacctt720 ctggccgaactcttcgccgaggcgggcacaccggcccgcgtcgtcgacctgtacgaccca780 ctcgtagtccgcggcaccaccgaggaggaggcgcgccgccgcatgtgcgcgtacgtggcc840 gacatgtacggagtcggcgcgttcttcgccaccctgggcggaaccgacgcgtgctggcga900 ctcctggaggagtacttccagcacgacacctacctgtcgaccctgcccgagcagaccgac960 ttcaccaccgcgccgatcgtctaccgctccgagggcgccttcatcgccgagataccccgc1020 gcggccatcgtcgccgtctcccacaagccacccacctga 1059 Information for SEQ ID NO: 34 Length: 61944 Type:
DNA

Organism:Streptomyces refuineus Sequence:34 atggccgacccgctgctgttcaacccccgcacctacgaccccgggcacttcgaccccgag60 acccgcaggctgctgcgcgccaccgtcgactggttcgagcagcgcggcaagcgccgcctg120 atcgaggactaccgcacccgcgcctggccggcggacttcctcgccttcgccgcggaggag180 gagctgttcgccaccttcctcacccccgcccgcgagagcgacggccggcgggacaggcgc240 tgggacaccgcgcggatcgccgccctcagcgagatcctcggcttctacgggctcgactac300 tggtacgtctggcaggtcaccgtcctcggactcggaccggtctggcagagcggcaacgcc360 gcggcccgcgcccgcgccgccgaactgctctcccggggcgaggtgttcgcgttcggcctg420 tcggagaaggcccacggcgccgacatctactccaccgacatgctgctggagcccgacggc480 gacggcggcttccgggccggcggctccaagtactacatcggcaacgggaacgccgcgggg540 ctcgtctccgtcttcggccgccgcaccgacgtcgaggggcccgacggctacgtcttcttc600 gccgcggacagccgccacccggcgtaccacgtcgtgaggaacgtcgtcgactcctccaag660 tacgtcagcgagttccggctcgaggactacccggtcggcccggaggacgtcctgcacacc720 gggcgcgccgccttcgacgccgcgctcaacaccgtcaacatcggcaagttcaacctctgc780 accgcctcgatcggcatctgcgagcacgcgatgtacgaggcggtgacccacgcccgcaac840 cggatcctctacggccgccccgtcaccgccttcccgcacgtgcgccgcgagctgaccgac900 gcctacgtccgcctggtcgggatgaagctgttcagcgaccgagccgtcgactacttccgc960 tccgcgggccccgacgaccgccgctacctgctcttcaacccgatgacgaagatgaaggtg1020 accacggagggcgagaaggtcgtcgacctgctgtgggacgtcatcgccgccaagggcttc1080 gagaaggacacctacttcgcccaggcggccgtcgagatccggagcctgccgaagctggag1140 ggcacggtccacgtcaacctcgcgctgatcctcaagttcatgcgcaaccacctgctggac1200 ccggtcgagtacgcgcccgtgcccacccgtctggacccggccgacgacgccttcctcttc1260 cggcagggccccgcccgcggcctgggatcggtccgcttccacgactggcggcccgccttc1320 gacgcccacgcccacctgcccaacgtcggccgcttccgggaacaggcggacgccctgtgc1380 gagttcgtcgccaccgcggcccccgacgaggagcagagccgcgacctcgatctgctcctc1440 gccgtcggccggttgttcgcgctggtcgtgcacggccagctgatcctggagcaggcgggg1500 ccggccggtgtggacggggacgtgctcgacgaactgttcgccgtcctcgtgcgcgacttc1560 tccgcgcacgccgtggaactgcacggcaaggactccgcgacggcgccgcagcagcgctgg1620 gccctggacgcggtccggcgccccgtcgtcgacgacgcccggtcggcgcgcgtgtgggag1680 cgcgtcgaggccctgtccggggcgtacgagatgacaccgtgaaccacgtggcgccgcaag1740 gggaggcgtt cgcgcctccg gcggccgggc ccgttccggg cggaccggaa cgggcccggc 1800 ccgcggcgcg 
ccgccggtcc tCCggCgCCC gccggcgggc gcgaga.atat ccggcgagtg 1860 attttccgtc tctggtttta cgttgaaccg aagatttcac gccggtgaag taattcggaa 1920 ccgcccgcgc ggcggaatgt ggcggccgcg gcgccggcgt cggtcgcgtg atggcgccga 1980 acgggaaaag gccgtcttcg ccgcccgtcg gcggggcgtg ctccctccgg acctcgaaag 2040 tttcggtacc cggccgaaac cggcatccgg cccgcctgcg agagaagtgc ggacctcctg 2100 tcaagagcgt ctcgtcggac accctcttgc ggtgaggccg aagatctgca tgtggccccc 2160 gggtgcgccg gcccccggag cggccgccgg agcgggaaac ggacaaaagg cgggttatgg 2220 taatttacgg tgcacccgct ggtgaattgc gagtatttgg cgtgaagttt tgcggggtgg 2280 attggatctt caatttattt cgctgtgacc cctgatcaaa acgagctcag gcctgtatgg 2340 tgactgtcga gcggcccgat ttttcgcacg gatgcaccgg gccggatatt cccgcatcag 2400 ggtcagaacg tacgcacaac ctgtggaagc cgctttacgg gggaggccgg cagagggtgt 2460 acgaccagga agatccacgg cgcagagccg tccggccggt gtgccggaaa ccgtcggtgg 2520 agtcgtcgca gacccggggg gtccgcgacg gcgcccccgc ggcgccgaag gcccggggga 2580 acgaggaggc gcggtgcgag ccggccgcgg gtgcgaggcc gcggcccggc ggagagcgtg 2640 cccgccgagg ccgtgagggg ggagggtggc tcacgtgagc ggacccccag cagacccgcc 2700 ggccggctcc cacctggtgg ccgcgatccg cgcgacggcc gaggccgacc ccgagcgcaa 2760 ggccgtcggc ttcgtccggg atccggaacg cgaaggtgag gaggcgctgc ggagctactc 2820 ctggctcgac gacagggccc gccgcatcgc cgtcctcctc cgcggggcgc ggctcggcgc 2880 gggctcgcgc gtcctgctgc tcttcccgca gtccgcggag ttcgcggcgg cctacgccgg 2940 atgcctctacggggggatggtcgccgtccccgcgcccctgcccacgggaacctccctgga3000 gaccgcacgcgtcgccggcatcgcccgggacgccggggcgggcgccgtcctcaccgtctc3060 cgacaccgaggcggaggtccggcggtgggcggccgagaccggtctgggcgacctgcccct3120 gttctccgtcgacgaactgcccgacgacaccgacccgggggagtggcgggagccggagat3180 ccgggccggcaccgtggcggtgctgcagtacacctccggctccaccggcagccccaaggg3240 ggtcgtcgtcacccacggcgcgctcgccgacaacgtccgcagcctcctgtccgggttcga3300 cctgggaaccggcgcccggctgggcggctggctgccgatgtaccacgacatggggctgtt3360 cgggctgctgagcccggcgctgttcagcggcggcgccgccgtgctgatgagcggcagcgc3420 cttcctgcgcaggccgcacctgtggccgacgctgatcgaccgcttcggcgtggtcttctc3480 cgcggcgcccgacttcgcctacgactactgcgtacggcgggtggagcccgagcaggtgga3540 
ccggctcgacctctcgcgctggcgctgggcggccaacggctcggagcccatccgggccga3600 gacgctccgcgccttcaccaaggagttcgcccccgcggggctgccccacgacgcgatgac3660 cccctgctacggactggccgaggcgaccctgctggtctccctgtcggcgggcgagctgcg3720 cacccggcgggtggacgccgcggcactggagaaccaccgcttcgtcgaggcggccgcggg3780 ccgcccgtcccgcgaggtcgtctcgtgcggccggcccccggccctggaggtccgcgtggc3840 cgaccccgcgaccggagagcccgtcacgggcgatgcggtgggcgagatccaggtgcgggg3900 cgcgagcgtggccggcggctactggcggaaaccggaggcgaccgccgagacgttcgtcac3960 ggccgcggacggctccgggccctggctgcgcaccggcgacctcggcgccctgtacgaggg4020 cgagctgtacgtcaccggccgcatcaaggaactcctcatcgtgcacggccgcaacatcta4080 cccgcacgacgtcgagcgcgaactgcgcgcccaccacgacgagctcggcgcgatcggcgc4140 cgtcttctccgtccccacggaggagggcgaggccgtcgtggtcacgcacgaggtggtccc4200 gtccgtccgggacgaccggggccccgcgctggtgacggcggtacgggcgacgctcgcccg4260 ggagttcggcctggcaccggccggggtggtgctggtgcgccgcggccgcaccccgcgcac4320 cagcagcggcaaggtgcagcgccgcctggccgcccggctcttccgcaccggggaactcgc4380 ccaggtccacgccgaccccggtgcccaccggctcgtggcggcgctccgcgaggcggacgg4440 cctgcgcgacgcccccgcgtccacgacatgacctccccatcgtcctgatccgctcccagc4500 gtcgggcggctccccgaatcccgggctccgagcgtctcggagcaccggccggcccctcag4560 cgggagccgtccggccgggatgccctcccgacggccgaccggttgcccgcacaccgaaga4620 cagaggtcctacccgcatgtccctgtccccgccttcttcgtccccgccttcttccccgcc4680 cccttctccgccgcacgaccccgacgccctgcggcagtggctgcgcgagcagtgcgccga4740 ctgcctcggcgtccccccggcatccctcgccaccgacgtccccctcaccgactacggcat4800 gacctccgtcaccgggaccgccctgtgcggcatggtggaggaccacctggacgtcgagtg4860 cgacctgagcctgctctggcaggagcagacgatcgacggcatcacci=cccggctggcctc4920 gcgcaccgcgcgctgacggccgtccggccctgtccccacaccacgcacacgtacacgcac4980 atgccgaggcactcgtgcgtgcggacgccgctgcgcgatcgggcgtgccgacaacctttc5040 agtcctagggcgggagaagcatgttggagtccccggcagaccgcgtggccgccacctcgg5100 cccagtccgggatctggacggcacagcggctgcgctcggatgaccggctctacacctgcg5160 gcctctacctcgaactcgaccacgtggtggaggaggtgctgggcgaggcgatcggccgtg5220 cggtcgccgacaccgaggcgctgcgcaccgccttcggggaggacggggacggcgcgctgg5280 aacagcgcgtgctcgcgcggccgccggacacgcagacacggctgttccggctggacctgg5340 
gcggagacgaccggccccgcgccgaggccctggactggatggaccggcagcaggcggaac5400 cgtgggacct cgccgccggc gacacctgcc ggcacaccct gatccgcctc ggcggccacc 5460 gcaccgtcct gcacctgcgc taccaccacc tcgccctgga cgggttcggt gccgcgctct 5520 acctggacag gatcgcggcg gtgtaccggg cgctgcgcac cggccaggag acgcccccct 5580 gcaccttcgc gccgctggcc cgcctcgtgg aggaggaccg cgcctaccgg cggtccgccc 5640 gccaccgcag ggacgccgac cactggcgga cgcgcttcgc ggacctcccc cgccccacca 5700 gcctcgccgg cgccgccgcg cccgccgcgc ccgccgcgct gcgccacacg gtccgcgtgt 5760 ccgcggccga caccgccgca ctgggcctgc gggcggaccg gagcggcagc acctggccgg 5820 tgttcgccac ggccgcggtg gccgccttcc tgagccgcct cgcgccgggg gaggaggtcg 5880 tcgtcggctt cccggtcacc gccagggtca cgcccgccgc ggtgcgcacg ccggggatgc 5940 tggcgaacgt cgtgccgctc cggatccggg tgcggcaggg gatgtcgttc gccgcgctgc 6000 tggaccggac cgcggccgag atcggcgcca cgctgcggca ccagcgccac cgcaccgagg 6060 acatcggccg ggcgctcggc ctccccccgc acggcgccca gccggccccg accctggtca 6120 acgtcatggc cttcgccccg gtgctcgact tcggcgactg cctctcgccg gtgcaccagc 6180 tgtcggccgg cccggtcgag gacctggcgg tcaacctgct cggcaccccc ggggacggcc 6240 gggagctgga gatcaccgtc gccgccaacc ccctgctcca ctcggaggac gcggtggcgt 6300 cgctggccgc gcggctggcg gagttcctgg cgcgcgcggg cgagcacgcc gacgccccga 6360 tcggccggac acgcctgctc ggcgcggcgg aggaggccga ggcgctggcc gccgggcgga 6420 gcccgcgacg ggacctcccc gcccgcaccc tgcccgagct cttcgcccgg caggccgccc 6480 gcacccccga cgccccggcg gtcgcctcgg accgcacgac ctggacgtac gcccggctcg 6540 acgcgcacgc cggcagggtg gcccggcggc tggccgcccg gggcgtgggg ccggagagca 6600 tcgtcgccct cgcggtgccg cgcggggtgg agctggcggc gctggtcatc ggagtgcagc 6660 gggccggggg cgcctacctc cccatcgacc cggagtaccc ggccgagcgc atcgggttcc 6720 tgctgcgcga cgcccgcccc gccctggtgg tctgcgagcc cgggacggac cttccggaca 6780 ccgggtgccc gcaggtgccg gccggcgacc tcctcgacgc cggggtgcgg tgcgcggagg 6840 cggaggaacc ggcgcccggg gacctcccgg cggacctgcc cgcctacgtc gtctacacct 6900 ccggctcgac cgggcggccc aagggggtcg tggtcaccca cgccggcatc gccgccctgg 6960 cggcggagca gatcgaccgc taccggctgg gccccggctc cagggtggcg cagctcgcgg 7020 ccctcgggtt 
cgacgtcgcg gtcgccgaac tcgcgatggc gctgacca cg ggaagctgcc 7080 tcgtcctccc gccgcacgga ctcgccggcg aggaactggc cgagttcctg cgcagccggc 7140 gcatcacgac ggccctcacc acggcctccg tgctggccac ggtgcc<:ccc ggcgacttcc 7200 ccgacctgtccgacctggccaccggcggcgagcagcccccgcccccgctgatcgcccgct7260 gggcgcccggccggcggatgttcaacgtctacgggccgaccgaggcgaccgtccaggcca7320 cctccggacgctgcgcggcgggcggggagcggatgccggacatcgggaacaccgaggcgg7380 gcgtggacgcctacgtcctggacggggcgctcagacccgtgcccgacggggcgaccggag7440 agctctacctgcgcggcagggggctggcccgcggctacctgcgccgccccggcctcaccg7500 ccgcacgcttcgtcgccgacccccacaccgggacgggcgagcggatgtaccggaccgggg7560 acctggtgcgccgggtgcccggggagggccgcaccgtgctggagttcgtcggccgggcgg7620 acgaccaggtgaagatccggggtttccgggtggagccgggcgaggtggaggcggccctcg7680 ccgaactcgacggggtggcgcaggcgctggtgaccgtgcgcgaggaacggccgggcgacc7740 gcaggctcgtcggctacctggtgcccgaccccgcgggccgggacggctccgcgcggggcc7800 cggacgtcgagcggtggcggaggctgatcgccgcccggctgcccgcccacctggtcccct7860 cggcgctggtggagctggcggagatcccgcgcaccgccaacggcaaggtggaccgctcgg7920 cgctgccggcccccggcggcacgcccccgcccgcgggacgggcaccgcggaacgcccgcg7980 aggaggccctgtgcgcgctcttcgccgaggtgctgggcgtcgaggaggtcggcgccgacc8040 acgacttcttcgccctgggcggcgactccctgctggccgcccggctggcgagccgcatcc8100 ggaaccggctggggaaggcggtcacggtacgggaggtcttccgggccccgaccgccgcag8160 gcctcgcggaggcgctcggcggcgaggcgcgggcggacggccgcgtccgtccggtccggc8220 ctcgcccggagcgggtgccgctgtcggccgcgcagcgccggctgtggttcatcgacgaac8280 tccagggggcctcggccgcctacaacatcccgaccaccctgcgcttcgacggaccgctgg8340 acgtccccgcgctgcacgccgcgctgggcgacgtggtggaccggcacgaggccctgcgga8400 ccaccgtccggcccgcggcggaggacgccaccggggcggccgccgcacccgagcagcaca8460 tcgcccccccgggcggccaccgcctcccgctgccggtgcgcgacatcgcccccgaggagc8520 tcgccggggagctgcgcgcggccgcgggccacgtcttcgacctcacccgggacctgccgg8580 tacgcgcccggctctaccgcaccgccgagcgggagcacgtcctgctc~ctgctcgtccacc8640 acatcgccgccgacggcgcgtcgatggggcccctgatcggggacctggctacggcctaca8700 cggcccggctcgcgggccgggcccccgacctccccgcgccggaggtg~acgtacgccgact8760 tcgcgctgtgggagcaccggggcggggagcacgccgcggcgcaggccgaggggctcgact8820 
actggcgccgggccctggccgggctgccggaccggatccggctccccgccgaccggcccc8880 ggtcgcaggagccggtccgccggggcggagcggcacggttcgaggtgccgcccgcgctgt8940 acgccaggctggcggagctggccggaagcgtgcgcgccaccccgttcatggtgctgcaga9000 ccgcggtcgccgtcctgctgagccgcatgggcgccggcccggacgtccccctgggcacgc9060 ccgtggccgg ccgcccggac gaggcgctcg acgaggtcgt cggctgcttc gtcaacaccg 9120 tggtcctgcg caccgacgtc tcgggcgacc cgaccgtggc cgagctgctg gcgcggacgc 9180 gggacggcga cctcgcggcc ctcgcccacc aggacgtgcc gttcgaccgg gtcgtggacg 9240 cggtcaaccc cgtgcgctcc atcgcgcggc accccctctt ccaggtcatg ctcgtcctca 9300 acggcgcgga gcagcgccgg gggcgggccc gcttccccgg cctggacagc cggatcgggg 9360 cggtggactc cggcgagacg aagttcgacc tctcctggca cttcacgcac cgggacgggc 9420 ccgagcgggc gctggaggga acgctcgtct acgccgccga catgttcggc gccgccaccg 9480 cccgccggct caccgagcgg ctgctcggcg tgctgaccgc gatggccgac gaccccggcc 9540 ggccggtcgg gtccatcgac gtcctcagcg ccgccgagca ccgcgcggtg cgggcgtggg 9600 gcaccggcgc ggcccaggac cgcacccgcc gccccgagcc ggtggccggg aggatcgccg 9660 cccaggcggc ccgcaccccc ggcgcgcccg cggtgaccga acccggccgg gtgtggacgt 9720 acgccgaact cgacgcccgc gccaaccggc tggcgcgcgc cctggccgcc cggggcgtgg 9780 gcgccgagga cctcgtcgcc gtgct:cctgc cccgcggggc ggacctggtc gccaccctgc 9840 tgggagtgct gcgggccggc gcctcctacc tcccgctcga caccgggcac ccgtcggacc 9900 gcaaccggtg ggccgtctcg gacgc;cgccc cggcgctggt ggtgaccgac ggcgcgcacc 9960 gcggcacgct ccccggggag accgqgtgcg ccgtgctggt cctggg~cggg gaggacgccg 10020 aggccgaact ggcgggccgc gccc<;caccc cgccggacga gaccgacctc gcccggccgg 10080 tggccggggc caacgccgcc taca<;catcc acacctcggg ttcgacgggc cgccccaagg 10140 ccgtcgtcgt cacacgcgac gcgct:ggatg cgttcgtcga gcgcaccgtc gacacctacg 10200 gggacgcgct gcggggcacc tccctgctcc actccccggt cgccttcgac ctcacggtcg 10260 ccaccctcta cggaccgccg gccg<:cggcg ggcggatcca cgtggaggac ctcgacgagg 10320 ccgggatcgc gcggtgggag cgggagtgcc cggccttcct caaggccacg ccctcccacc 10380 tggcgctgct ggaggagttc ggcgqctccg cggcccccgg aacggtcgtc ctggcgggcg 10440 agcagctcct gggcgcgcgg ctgga~ccgct ggcgggcccg ccaccccggc accgccgtct 10500 tcaacagcta 
cgggccgacc gagaccaccg tcaactgcct ggagtacagg atcgccccgg 10560 gcgcggagac ggccccgggg cccgtgccgg tgggccgccc ggtggcgggc gtccgggtgc 10620 acctgctcga cgcccgcctc cgcccggtcg ccccgggtgt gacgggcgaa ctgtacgtct 10680 gcgggcccgg ggtcgcccgc gggtaccgcg ggcggccggc ggccaccgcg gagcggttcg 10740 tcgcctgccc gttcggggag ccggc~ggagc ggatgtaccg caccggggac ctgatgcggt 10800 ggaccccgga cggcgcactg ctctacgagg gccgggccga cgcccagctg aaggtgcgcg 10860 gcttccgggt ggagcccggc gaggtggagg ccgcgctgct ggacctcccc ggcgtgcggg 10920 aggcggccgt gaccctcgtc ggcgggcccg gccgggggtc cggccaggcg ggcggctccg 10980 ccgcccccgc ccgcctggtc ggctacgtcg tcggcggggc ctttgacccg gccgccctcc 11040 tggagcggct gcgcgtccgg ctgcccgacc acatggtgcc cgccgcgctc gtggagctgg 11100 acgccctgcc gctcaccccc aacggcaagc tcgaccgccg ggccctgccg gcgcccgact 11160 tcggccgcca cgcgggccgc cgcgctccgc gcggaccgcg ggaggagctg ctgtgcacgc 11220 tcttcgccga ggtgctgggg ctgcecgagg ccggcgccga ggacagcttc ttcgcgctcg 11280 gcggcgacag catcgtcagc atcca.gctcg tcggacgcgc ccgccgggcc ggactgcact 11340 tcaccgtgcg cgacgtcttc gagca.cccca cggccgccgg gctggccgcc gtggcccggg 11400 ccgccgaccc ggccggggac ccgggcaccc ggcccgcgcc gggacti~ccc ccgagcgggc 11460 cgctgccgta cgtcccggcc gccgc:gcggc tcgtcgccag gaccgggtcg atccgcgccc 11520 ggggcgccga ccggttccac cagtcggtgg tcctcaccgc ccccgcggac gccggcccgg 11580 acgacgtccg gcgcgtgctg cagac:ggtga tcgaccacca cggggcgctg cgcctgcggg 11640 ccgccgcgga ccgcgacgga gcgccggacg gcctggtgat cggcgaaccg gggtcggtcg 11700 cggccgcgga cctgctgcgc tgccgcgacg ccgcggggct gccggaggcg gcgctgcggg 11760 aagcggtgga gcaggaggcc cggcuggccc gggacggcct cgacccgagc acgggatccg 11820 tgctgcgcgc ggcctggctg gaccgcggcc cggaccgggg cggcctgctg gtgctggtgg 11880 cccaccacct gagcgtggac ggcgtctcct ggcggatcct gctggacgac atccgccacg 11940 cctggagcac gcccgccggc cccgccgggg ggacccccct gccgccggag ggcacctccc 12000 tgcgggagtg ggccacccgg accgccggag ccgccgccgg ggccgccgtg accggccggc 12060 tcccccactg gcggcagacc ctggcgggcc tggaggaccc ggacggcgag gtggtcgccc 12120 tggaggtgcg gctcgacccc gccgc:cgaca cccacggctc ggcgcgcgag 
accgcgcaca 12180 ggctgccgcc cgacctgacc gacgc:gctcg tccgcaccgc gccggcggcg ctccgcgcgg 12240 agcccggtga gctgctgctg gccgggtacg cgctggccgc ctcccgggcc ctgggcggcc 12300 ggcccgtgtt cgtggtggag accgagggcc acggccggca ggacgcgctc ctgccgggga 12360 tcgacctgtc ccgcaccgtc ggctggttca cctccgtcca tccggtgcgg ctcaggcccg 12420 gcgccggggc cgcgcggctg ctgaaggaga ccagggagcg gctgcgcacc gtgccggacg 12480 ccgggctggg ccacgacctg ctccqccacg gcggcgcggc gccgtcctcc ggggagggcg 12540 gccgcgggct gccccgcccg cagtt:cggct tcaactacct cgggcgggtg gccgtggccg 12600 aggcccccgc cggcacggac ccgggcggga cgtgggcgtt cgcgggccac agcgtcgccc 12660 cgcagccgcc cgagctgccg ctggcgcacg aggtggagct gaccgtcgtc ctggaggacg 12720 gccccggagg accggtcctg gcggcgcgct ggaacgcctc cgcccgctgc ctgtcccggg 12780 cgcggcagga cgcgctggcg cgggagtggg agaaggccct gcgcgaa ctc gtcgccctgg 12840 ccggcaccgc cgggggcggc ggcctgatcc cctcggagac cggcgccggc ggcctggacc 12900 aggacgagat cgaggagtgc gaggcggcgg cggacttcga ggtcgccgac ctgctcgcac 12960 tcgcgcccgc ccaggagggg ctgctcttcc acagcacctt cgacgacgag gcggaggacg 13020 tctacgtcgg ccagctggcg ctgga.gctcc acggggagct gtcgggcgcc cgcctgcggg 13080 aagccgccca aggcgtcctc gaccggcacg acgcgctgcg cgcggcr_ttc ctccagcgcc 13140 gctccggcga gtggatccag gcgatcgccg cacgggcgcc ggtcggctgg gaggagcacg 13200 acctgtccgg accgggcgga cagga.gcggc agcggaggct ggaggagctg ctggccggac 13260 agcgcacccg ccggttcgac ctctccaggc cgccgctggt gcgcttcctg ctcgtgcgga 13320 ccgcggcgga ccggcacgtc ctcgc:cctga ccaaccacca cctggtgctc gacggctggt 13380 cgctgccgct ggtggtgcgc gacctcatgg cgctgtacgg ggcggacggc ggcgcggccc 13440 tgcccgccgc gcggccgtac cgcgactacc tcgcctggct cggcgggcag gaccgcgacg 13500 cggcccggga ggcgtggaca cgggcgctcg ccgggctcca gccctcgctg atcgccccga 13560 acgcccgccg cgacggcgcg gcgccgctcc cgcactaccg caccatggac ccgcaggtcg 13620 tctcccgcct caccgcttgg gcccggcggc acggcgtcac cctcaactcg gcggtcgagg 13680 cggcgtgggc gctcctcctg ggccc~gctca ccggccggga cgacgtgagc ttcggcatcg 13740 cggcctccgg ccggcccacc gacct:gcccg gcgccgcgga gatcgtcggc ctgctgatga 13800 acaccgtgcc ggtgcgcgtc 
gtcct:ggacc ccggcgagcc gctggaggcc ctcgtccggc 13860 gcgtgcagcg ggagcaggcc ggcctgctgg accaccagtt cgtccccctg gcgcaggtgc 13920 agcgctgggt gggaggaggc gacct:cttcg acaccacgct cgtcttcgag aactacccgc 13980 tggaccccgc cgccggcctc accgccggcg aggccggcgg cgacgggccc cggctgcacg 14040 acgcgcgcgg ccacgacagc aaccactacc cgctcagcgt caccgtcggc cccgcccccg 14100 acctccagct ccgcttcacc taccgccccg acctgttcgc accggagtgg gtggaggagc 14160 tggcggcgcg gttcgagcag gtgct:cgaca ccatggcggc gtccggcacc accccggcgg 14220 gccggctggg cgtcctgctc ccgcacgaac gcgccacgct gctgggcgac tgggcgcgcg 14280 gcgaggcggc gagcggacgg gagtc~ccccg tcgccctctt cgaggagcag gccgcccgca 14340 ccccggacgc cctcgcgctg gtcgagggcg gcgacggcgg actccgtctg acctacgccg 14400 agttcgacgc gcgcgccaac cgcatggcgc gcttcctcac cgcccgcggg atcggggccg 14460 aggacctggt cggcctggtc ttcc<:gcgcg gcgccgacct gctcaccggc ctgtgggggg 14520 cgctcaaggc cggtgcggcg tacctgccgg tggacgtgga ctacccggcc gagcggatcg 14580 ggctgctcct ctccgacggc gcccccgccc tcgtcctgac cacctccgcc cacgcccacc 14640 tggtgcccga ggcgccgggg cggcagatcc tctgcgtcga cctgcccggg cccgcggacg 24700 aactggcccg cgccccggag ggaccggtga ccgaccggga gcgtccgcgc ccggtcgggg 14760 ccgacaccct cgcctacgtc ctctacacct cgggctccac cggccgcccc aagggcgtgg 14820 ccatcagccg cggttcgctg gccgcgcacg ccgtccggtc ccgcgaccgc tacccggacg 14880 ccgccggggt gtcgctgctg cactccccgg tggcgttcga cctcac~gtg acggccctgt 14940 tcaccacgct cgtctccggc gggaccctgc tgctggcgga gctcgacgaa cacgcccagg 15000 gcgccggcgt caccttcgtg aagggcacgc cctcccacgt cgcgctcctg gacgagctgc 15060 ccggcgtcct cgacgccacc ggggaacgcc ccggcacgct cgtgctcggc ggcgagccgc 15120 tcaccggcga gatgctggag cgctggcgcg cacgccaccc gcaggccagg gtcttcaacg 15180 actacgggcc ctcggagacc agcgtgaact gctccgacct ggtcttcgag cccggtgacg 25240 aggtgccggc cggcctgctg ccgat;cggcc ggccgctgcc cggcaaccac atgttcgtgc 15300 tcgaccacct gctccagccc gtgcc;ggtcg gcgtcgtcgg cgagat~~tac gtctccggtg 15360 tcggcgtggc ccgcggctac cacgqcaggc cgggcctgac cgccgagcgc ttcctgccct 15420 gcccgttcga cgccccgggc gcccggatgt accgcaccgg ggacctgggc cgctggcggc 
15480 ccgacgggat catggagtgc ctgggccgca ccgacgacca ggtcaaggtg cgcggcttcc 15540 gggtggagct gggagaggtg gaggc:cgccc tcgccgcccg ccccgacgtc gcccgcgcca 15600 ccgtcgtcgt gcgcgaggac gagccggggg acaggaggct gacgggctac gtggtgcccg 15660 agggagggcc ggaggcgggc ctcgacaccg ccgccgtgct gcgcgacctc gccgcacagc 15720 tgccggagta catggtcccg gccgcggtcg tggtcctggc ggagctgccg cgcaccgaga 15780 acggcaagct cgaccgccgg gcgct:gccca cacccgagta cggcaccagg tccgccgggc 15840 gggcgccgcg CaCCgCCgCC gagac;cgccc tgtgcgccct gttcgcggag gtcctgggag 15900 tgcccggggc caccgtggac gacgacttct tcgccctggg cggccactcg ctgctggccg 15960 tccggctcgc gggccgcatc cggg<;cgagc tcggcctgcg gctggacatc cgcacgatct 16020 tcgaccgccg caccgtcgcg gacct:cctgg ccgatccgca gatcgccgca cagctgacgg 16080 acggcgaggt ccccggggcg ccggcgcggc cggcggagcc cgcgccgggc gggcagcccc 16140 cggccgggga gcgaggcggc gccggggccg ggcccgagcg cctgcccctg tctcccgccc 16200 agcgcagact gtggttcctc aaccgctacg acagggaggc cggcggctac cacatcagcg 16260 tcgcgctgcg cctgaccggc gacct:cgacg tcggcgccct gcacgcggcg ctgggcgacc 16320 tggccgcccg gcacgagagc ctgcc~cacgg tcttccgcga ggacgaggag gggccctacc 16380 aggtcgtcct gcccgcggcg ccgtccccgg cgccggccgc cgtccccgcc tccgcgcggg 16440 agctcgacgc gctggtgcgc gaagccgtcc gccggccctt cgacctcgcc gaggacaccc 16500 cgctgcggca caccctgttc gcgctcccgg accgcgagca cgtcctgctc ctggtgatcc 16560 accacatcgc cgccgacggc tggtcgatgg ggccgctggc ccgggacctg gccgccgcct 16620 accgcgcccg cgcggccggc gacgcgcccc ggtggccggc gccggcgccg agccacgccg 16680 atcacgtgct ccggcggcac cgcgcgccgg agcacggtgg ggacgccggc gacccggcgg 16740 accaccggct cgcccactgg gccgaggagc tgcgcggact gcccgacgag ctcccgctcc 16800 cctacgaccg gccgcgcccc acgacacccc ccgggtacgc cgagcggatc ggcttccgga 16860 tcgacgccgg actgtaccgg gacgtgctgg ccctggcggc ccgccaccgg gccaccccgt 16920 tcatggtgct ccacgccgcc ctggccgccc tgctgacccg gctcggcgcc ggcaccgaca 16980 tccccgtggg caccccctcg gcgggccgcg accggcccga gaccgccgac ctcgtcggct 17040 tcctggtcaa caccctggtc ctgcc~cacgg acacctcggg cgacccgacg ttcggggaac 17100 tgctggaccg ggtgcgcgag accgacctga 
gggcgtacac ccaccaggac gtgcccttcg 17160 aacggctggt ggaggcggtc aaccc:cgccc gctcgcccag caggcacccg ctcgtgcaga 17220 ccatgctcac cctcgacaac gcggcccagg gagcgctgga gcacctcctg gacctgccgg 17280 gggtgcgagc ggagctgctg ccgaccgccg agggcaccgc ccacaccgac ctcgacctga 17340 ccttcgccga gaccggctcc gcccc;gcccg cgggcgggac gggactcgac gggaccctgc 17400 agtaccgccc cgacctgttc gaccgcgcga ccgcgcaggc cctggtggag cggttcgtgg 17460 cgctgctgcg cacggtgacg cgcgagcccg gcctgcggct gggccggctc gacgtcacca 17520 ccggcgagga gcgccggcgg ctggtcgagg aggacgccgc cgcccggcgg gcgcgggccg 17580 agaccaccgt cacggacctg cccgcgctgt tcgccgcctg ggccgagcgg accccttccg 17640 cccccgccct caccgacggc gggacgaccg tggactacgc cgaactggac gcccgctcca 17700 accgcctggc ccgcgcactg ctggaactcg gcgtggggcc ggaggacttc gtcgccctgg 17760 ccgtgccccg ctcggcggac ctggt:ggtgg ccgtgctcgc cgtgctgaag tcgggcgccg 17820 cctacctcgc ggtggacccc gactacccgg ccgagcgcac ctcctacatc ctcggcgact 17880 gccggccggc cgcggtgctc tccacgaccg cggtccgggc ggccctgcac ggcacggtgg 17940 gcgaggcggc cggcgaggtg ccgtctgctgc tgctcgactc gccccggacc cgcgccgcgg 18000 cggccgggct gtcggcggcg ccggtcaccg acgccgaccg ccggtcgccc ctgctccccg 18060 accaccccgc ctacaccatc taca<;ctcgg gatcgaccgg gcggcccaag ggcgtggtcg 18120 tcagccacgc caacgtctcg cggctgctgg acgtctgccg ctcggccgtg gacttcgggc 18180 gggacgacgt gtggacgctc ttccactcca gcgccttcga cttctcggtg tgggagatgt 18240 ggggagccct ggcgcacggc ggccgactgg tggtcgttcc gcacga<:gtg gccagatccc 18300 ccagggacct cctggagctg ctgggccgcg agcgcgtcac ggtgctcagc cagacgccct 18360 cggccttcct ccagctcctg cgggccgaga ccgagcgggg cgtccccgcg gaggccaccg 18420 ccgcgctgag gtacgtcgtc ttcggcggcg aggcgctgga caccgc<:cag ctcgccccct 18480 ggcggggccg cccggtccgc ctggtcaaca tgtacgggat caccgagacc accgtgcacg 18540 tcacccacct ggagctggac gacgccgccg tggagcgcgg cggcagcctc atcggctccc 18600 ccctggacga cctgcgcgcc cacgtgctcg acgaacggct gcgccccgtg ccgtcgggcg 18660 tcgtcggcga gctgtacgtc gccggccccg gcctggcccg cgggtaccgg cagcgccccg 18720 gcctgacggc cgcccgcttc gtcgccgacc cgtccgacgc cggcgggcgg atgtaccgga 18780 
ccggcgatct ggtcaggcgc gccccggacg gcggcctcca ctacgtcggc cggtccgacg 18840 cccaggtcaa actgcgcggc taccgcatcg agccggggga ggtcgaggcc gccgcacggc 18900 gccacccgga catcggccag gcggc:cgcgg tcgtgcacgg ggacggaccg gacgaccggt 18960 acctggtctg ctacgcggtg ccggacggag acgccgaccc cgacccgcac gaggtgcgcg 19020 cccacctggc cggcgccctg cccggctaca tggtccccgc cgccgtggtg ctgctgcccg 19080 ccctgccgct gacccccaac ggcaagctgg accgcagggc gctgcccgcc ccggaccggg 19140 cggcactggc caccggcggc gctccggccg gaccgcgcga ggaggcgctc tgcgcggcct 19200 tcgccgacgt cctccgcgtc gaggaggtca gccgggacgc cgacttcttc gccctgggcg 19260 gccactccct gtcggccgtg cggct:catca gccggatccg gtcggcgctg ggggtggaga 19320 tcggcatccg cacgctcttc gaggcgccca cgcccgccgc gctggcccgg cggctggaca 19380 ccgccggagc cggacggccc cgcctggtgc cgcagcggcg gccgcaccgc gtcccgctct 19440 cctccgccca gcggcggctg tggtt:cctcg gggagctgga ggggcccggc gcgacgtaca 19500 acattccgct cgccctgcgg ctgcgcggtc ccctggacgt cggcgccctg cgcgccgcgc 19560 tggcggacgt ggtggcccgg cacgaggcgc tgcgcacggt cttcccggcc gagaacggag 19620 tcccccacca gcacgtggtc gcgcccgagg aggccgcgcc cggaccggcc gtcgtggacg 19680 tcgccgagga ggagctgccc gcggc:cctcg ccgaggcctg cgcatacgcg ttcacgctga 19740 ccgaggacct cccgctgcgg gcggt:gctgc tgcgcaccgg ccccaccgac cacgtgctct 19800 ccctggtcct gcaccacatc gccggcgacg gctggtcgct cgccccgctc gcccgcgacc 19860 tgagcaccgc ctacgccgca cgcctggagg gccgcgcccc gcggtggcgg ccgctgccgg 19920 tgcagtacgc cgacttcacc ctgtggaagg agcggctgct cggcgaggcg gacgaccccg 19980 acagcctctt cgcacgccag ctcgccttct ggcgtgacac cctggcgggg gcgccggagc 20040 agatcgagct gcccaccgac cggccgcgcc ccgcgatgga gagccac:cgc ggcgcgatcc 20100 accgcttcac cctgccggca cggctgcggg accggctgcg tgcgttc~gcg cactcccggc 20160 aggcgacctt gttcatggcc ctgcaggcgg gcctggcggc gctgtt<:gcc aagttggggg 20220 ccggccggga catcgtcctg ggcaccccgg tcgccggccg cggcga<:gag gcggtcgacg 20280 acctcgtcgg cttcttcgtc aacaccctgg cgctccgcac cgacct<:ggc ggcgacccca 20340 cgttcgagga gctgctggac cgggtgaggg aggcggacct gtccgccttc gcccaccagg 20400 acataccgtt cgagcagctg gtgga.ggcgc tcaaccccac 
ccgctccctc tcccggcacc 20460 ccgtcttcca ggtgctgctg gccctccaga acaacgaact cggcgaggcc gtcatgccgg 20520 gtctggaggt caccgtggaa cgccccgccc aggtggcggc caagtacgac ctcttcgtga 20580 acctggtgga gtcccgggac ccggccggcg gcgcgaccgc catcgagggc gccgtcgagt 20640 acgccaccga cctcttcgac gccggaaccg tcgcacggct ggccgagcgc taccaggacc 20700 tgctcctggc ggtcaccgag gagcccgcga cgcggctcag ccggatcccg gtgctgagcg 20760 gggccgaacg cggcatgctg gcggccgagt gggacggcac cgccgcgggc ccggccgagg 20820 acgtggccga cctcttccgc gccccJcgcca ccgcgacgcc ggaggcggtg gcggtccgct 20880 gcgccgggga gagcctcacg tacgc:cgagc tcggcgagcg ggccgaccgg gtggcggcgg 20940 cgctggccgg gcggggcgcc ggccccgaac ggcgggtcgc ggtgtg~~ctg ccgcgcaccg 21000 ccgacctggt ggcctgcctg ctcgc~agtcc tgcgggccgg cgccgcctac gtgccgctgg 21060 acccggagta cccggacgag cgcat:cgccg cgatcctgtc cgacacccgc ccggtggcgc 21120 tgctcaccac ggcggactgc cgccccgcga tcaccgctgc cgcggccgcc tgcggtgccg 21180 ccaccctcct ggcggccgac gccgc:acagg gcgccgggcc cctgcccgag gtgcccgcgc 21240 cgctgccgga ccaggccgcg tacgt:gctgc acacctcggg ctccacggga cggcccaagg 21300 gcgtggtcgt cagccggggc aacctcgcca acctgctggc cgacatgcgg gagcggctgc 21360 gcctcaccgc cggggaccgg ctggtggccg tcaccacggt cagcttcgac atcgccgccc 21420 tggagctgtt cctgcccctg gtcggcggcg ccgaactggt cctggccgac cgcggcaccg 21480 cacgggaccc ggaggcactg gcggcactgc tcaccgggag cggcgccacc atcctccagg 21540 ccaccccgac cacctggcag ctgctggccg agaccgcgcc cgacgccctg cgcgggctgc 21600 gcaaactggt gggcggcgaa gcgctccccg cgtccctggc ctcccgcctg cacggcctgg 21660 gcggcgaact ggtcaacgtc tacgggccca ccgagaccac catctggtcc accgccgccc 21720 acctcgaccg ggccaccggg agcgcaccgc ccatcggccg ggcgctgcgc aacacccgcg 21780 cctacgtgct ggacgagtgg ctcgacccgg tccccgccgg cgtccccggc gagctctacc 21840 tggccggcgc cggcgtggcc cgcggctacc tgggccgcgg cgccctgacc gccgagcgct 21900 tcaccgccga ccccttcggc gcgcccggca gccgcatgta ccgcacgggc gacctggtcc 21960 gccggcgcgc ggacggggag ctggagttcc tcggacgcac cgaccaccag gtcaaggtcc 22020 ggggcttccg catcgagctg ggcgagatcg agacggccct cggtgcgcac ccggacgtct 22080 ccggggcggt 
cgtggtcgcc cgcggcgcgt ccggcccggc ccccgccficcg gacgacggcg 22140 gcaccgccgg cccgccccgg cagctggtgg cctacgtggt cgccgagccc gaccgggccg 22200 gccacgacgg gagccgggag cgggcccggc tcgacgagtg gcgggagacc tacgacaccc 22260 tctacgacaa ctccgaaccg acccccctgg gccgggactt cgggatctgg cggagcagct 22320 acgacggacg gcccatcccg ctggaggaga tgctccagtg gcgggcggcc acggtggacc 22380 gcatccgggc gctgcggccc gcgcggctgc tggagatcgg ggtgggcacc ggactgctgc 22440 tgtcggaact ggcaccggac tgcaccgcct accacggaac cgacctgtcc gcacgggtga 22500 tcgagaccct gcacgagcag gtcgc;ggccg agcccgcgct gaaggagaag gtggagctgc 22560 acgtccgccc ggcgcacgac ttcaccggtc tgcgcagggg tttctacgac accatcgtgc 22620 tcaactccgt cgtccagtac ttccc:cggcg ccgactacct ctcccgggtg ctgcgcggcg 22680 cactcgacct gctggagccc ggcgc~acggc tcttcgtcgg cgacgtgcgc agcctggcgc 22740 tgctgcgggc cttccgcgcc tcggt,ggaga tcggcgacgc cgccgcgggc gacgccccgg 22800 gcccggtgct ggcggccgcc gaccgcagga cggccacgga gaaggaactc gtcgtggacc 22860 cgggctactt cgcgcggctg cgccgggaga ccggcgaacc cctcgtcctg gacgtgcggg 22920 tccggcgggg gaggccgctc aacgagctga cgcgctaccg ctacgacgtc ctgctggcca 22980 agccggaggc cgggaccgcc gctccggccc cggccgccga gatgcgctgg gcggaggagg 23040 tcggagaccg cgcgcggctg gccgaggtct gcgcggcaca ccgcgg~~gcg ctgcgcgtca 23100 ccgcgatccc caacgcccgg gtgcqgcgcg agacggccgc cctcgccgcg ctggaggacg 23160 ggcggccgct cgccgcggcg cggcc~gctgc tggacggccc cgccggcgga gtggacccgg 23220 aggacctgta cgacgtggcg gcggc:ggccg gccgcaccgc gtgggtgtgc tggtcggccg 23280 agggaccgcc ggacaccgtg gacctggtgc tggccccggc ggacgggggc ggegccgcgg 23340 aggtggcacc gccggccgag ctgtggccgt acgagccgga cgcggaccgg ccgcagacca 23400 acgacccgtc cgccgcgctg cgcaaccggg aactggccgc cgggctgcgc gcgtacctgg 23460 ccggacggct gccggactac atggtgccct cggccgtcgt cgtcctcggc gccctcccgc 23520 tgaccgccaa cgggaaggtg gaccgggccg cgctgcccga ccccgacccg gcgggcgcgg 23580 ccggcggccg gccaccgcgc acgcc:ccggg aggagctgct gtgccggctc ttcgccgacc 23640 tgctgggcct gagccgggtg ggcac:cgagg acagcttctt cagcctgggc ggcgacagca 23700 tcctgtccgt ccgcctcgtc agccgcgcac gggaacaggg gctgccgttg 
accacccgcg 23760 acgtcttcga gcaccacacc gtggccgcgc tggcggcggc cctggacggc agggagccgg 23820 agcagaccgc ggccgacggc cggccggacc ccgccgccgg accgcggccc atcagcgccg 23880 aggaactcgc cgagctggag gaagagctcg gcgcggactg ggaggagatg cagtgagcgg 23940
ctcgcagcgc atggtcgaag aggtccttcc ggtcaccccg ctccaggagg ggctgctctt 24000 ccacgcggtc ttcgacgagg acgtccccga cgcctacgtc agccggr_tgg tcctcgccct 24060 ccgcggcgac ctggacgccg gccggctgag acgggccgcc caggcgctgg tgggacggca 24120 cccggcgttg cgctcggcct tccgacagcg gcgctcgggg gagtggttcc agctggtcgc 24180 gtcccgtccc gcggtgccct gggaagagct ggacctgcgg tccgccgggg gcccggccga 24240 ggcggacaag cacctggagg cgctcctgga cgagcaccac cggaccgggt tcgacctcgg 24300 ccggccgccc ctgctgcgct tcctgctcgc caggaccggg gaggaccgcc accggctggc 24360 ggtgacctac caccacatca tcctcgacgg ctggtcgatg cccatcctga tgagggaact 24420 ggtcgcgctg tacggcagcg ggggcgaccc ctccgcgctc cggccggtcc gcccgcaccg 24480 cgaccacctc gactggctgg cccgacgccc gtccgaacgg agcgcccacg cctggcggca 24540 ggcgctggcg ggactgtccg gaccc:acgct ggtggcgccg ggcgcggacc gcaacggccc 24600 gctgccgcag caggtgtgga cgcggctgac cgaacgggac acccgggcgc tcaccgcgtg 24660 ggcgcgcgcc cgcggcgtga cggtqaactc ggcggtgcag gccgcctggg cgacggtgct 24720 cggccgcctc accggccgcg acgac:gtcgt cttcgggacg accgtgt cgg ggcggccgcc 24780 ggagctgccc ggtgccgagg acatggtcgg gttcttcatc aacacggtgc cgctgcgggt 24840 gcggatgcgg ccggacgagc cgatc:ggcga cctcgtcgcc cggatccagc gcgagcagac 24900 cgccctcatg gagcaccagc acgtcagact gtccgacatc cagcgctggt cgggacaggc 24960 cgaactcttc gacacctcca cggcc~ttcga gaactacccc gccgacgacc tcgccgccgt 25020 cggctcctcc gaccacgccg gcctc~cgcgt ggaggggggc tccggggtca ccacgaacca 25080 cttccccctc tccctctacg cactgcccgg gccggcgctc cgcctgcggc tggaccaccg 25140 gcccgacgcg gtggaccacg acacc:gcacg acgcgcggcg gacctgctgg gacgggccct 25200 ggccgccgtc ctgggcgccc ccgccacccc gaccgccgcg gtcgccgccc cgcggcaggc 25260 accgcgaccg cgggagcggt gcgcc:ggcgc cgccccgcgg ggcccccgga ccacgatcgt 25320 ggcggcgttc gaggcgcagg tgcgc:gcggc gcccggcgcc cccgccgtcc tggccggcgg 25380 ccgggagctc acctatgccg agctc:gacgc ccgcgcgaac cggctggcgc ggctgctgat 25440 ctcccgcggg gtcggccccg agagc:cgggt cgccctcacc gtctcccgga acgcgtggct 25500 gcccgtcgcc gtcctcggcg tcctcaaggc gggcggctgc tacgtccccg tgagcgcctc 25560 gctgccgagg gagcgcgccg ccttcctcct ccgcgggacc 
gcgccggtct gcctgctcac 25620 cgaccccgat gcggaggccg cccggcacgc cgccgccgga gcgcaggacc cgccggggac 25680 cgcccggagc ggcgtcgagc gcatcgtcct gaccgaggag ctgctcgacc ggtacgaccc 25740 gagcgcgccg accgacgccg agcggaccgc gccgctgctg ccgggccacc tcgcctacct 25800 cctgcacacc tccggctcca gcggccggcc caagggggtg gcggtcgaac acgcccaggt 25860 ggccgcactg ctgtcctggg ccgtcaccgg cctcggcgcc gaccggctgc gccgcaecgt 25920 ggccgccacc tcggagagct tcgacgtgtc ggtcttcgac accctcgtcc cgctgctcgt 25980 gggcggccgc atcgagatcg tggag~aacac gctggccgtc gccgaccggg ccggcggcga 26040 gccctccctg ctgaacgccg tcccctcggc cctgcaggcg ctgctggacc gcggggcgcc 26100 gctcgccgtc gacaccttcc tctgcgccgg cgagcccttc cccgcctcgc tggccgcggc 26160 cctgcgcgcc gcctgcccgc gggcgcgcgt cggcaacctc tacggcccga ccgagacgac 26220 cgtcttcgtg accgcccggt tcctggacgg caccgaggac ggggcaccgc ccatcggccg 26280 gccgctgccc ggcgtgcgga tccac:gtgct cgacccctgg ctccgc~~cgg tgcccgaggg 26340 ggtggtgggc gagctctaca tcgcc;gggga ccacgtcacc cgcgggtact ggagaagccc 26400 ggcgacgacg gccgagcgct atgtcgccga ccccttcggc gcgccgggcg agcggatgta 26460 ccgcagcggc gacctcggac gccggctccc cggcggggag atcgacatgg tcggccgggc 26520 cgacggccag gtgaaggtgc gcggccaccg ggtcgagctg ggggaggtgg agtccgcgct 26580 cgtctcccac ccggacgtcc ggcaggcggc ggccaccgtg cacgacggcg gtcccgccgg 26640 accgcgcctg gtgggctacg tcgtgcccgg cgacacggtg cccgacaccg gcgcggtcct 26700 cgaccacctg cggctgaggc tgcccgcgta catggtgccc tcggccctgg tggtgctgga 26760 cgagctgccg ctgacgggga acggcaagcg ggaccgcgcc gcgctgccgc ccccgccgga 26820 ccggagcgcc gcgacgcggg cgcg<;gctcc ccgcggcccg cacgaaacga tcctgctcgc 26880 cctgttcgcc gaggtgctcg gggtc~cgccc ggccggcatc gacgac~gact tcttcgcgct 26940 gggcggccac tcgctgctcg ccacccgcct ggtcagcagg gtgcgcacca ccctcggcgc 27000 cgaactgggg gtgcgggacc tcttc;gagca ccccacggtg gcggccctgg gcgccaggat 27060 cgcccgggcc gggacggccc ggccgccggt ctcgcacgtc cgggagcggc ccggccgcat 27120 cccgctgtcg ttcgcccagc ggcggctgtg gttcctccac cgcctgcagg gcggcagcgc 27180 cgcctaccac gtcccgctcg cactccgcct caccggccgc ctcgacacgg ccgcgctgcg 27240 cggcgcgatc 
gccgacgtgg tggccgggca cgggagcctg cgcacgtcgt tccacgagga 27300 cgccgagggt ccccaccagg tcgtc~cggga cgccgcggag gccgccgaac tgatcaccct 27360 ggtcccggag ccggtcgacg acccgctccg ggcggcggac gaggcggtgg ccgaaccctt 27420 cgacctgacg gccggcctcc ccctgcggtg ccggctgttc acccggaccg gaacgccccg 27480 ggcaccgcgg gacccggcgg aaccgcccgc cggcggggag ccggagqagc atctgctgct 27540 cctggtggtg caccacatcg cggccgacgg atggtcgctg cggatcatcg cgcgcgacgt 27600 ggccgccgcg tacgccgccc gcgtggacgg ccggcggccc gcgcccgccc cgccccccgt 27660 cgactacgtc gaccacaccc tgtggcagca ccgggtgctc ggcggccccg acgaggaggg 27720 cggcccgctc ggcgagcagc tcgcctactg gcggcggcag ctggccgcgc tgccgccgga 27780 actcgcgctc cccgccgacc ggccgcgccc ggccgtctcc tcccaccgcg gcgaggacct 27840 ggacttcgcc gttcccgccg cggcggccgc acggatgcgg gagctggcgg ggaccaccgg 27900 caccacgccc ttcatggtcc tccaggccgc gctggcggtc ctgctgcacc ggatgggcgc 27960 gggcacggac atccccgtgg gcaccccggt cgccgggcgc acggacgggg cggtggaggg 28020 agtcgtcgga ctcttcgtca acacc:ctcgt gctgcgcacg gacctgagcg ggtcccccac 28080 cttcagacag ctcctggacc gcgtc;cgcag caccgccctg gacgcctacg cccaccagga 28140 cgtgccgttc gagcggctgg tggaagtgct cgcccccgag cgctccctgg cccgccaccc 28200 gctcttccag gtctccctcg ccctccagaa cctcgacgac gcggcggcac cggcgggcga 28260 actgcccggg ctgcgcgccg aagccrgtccg cacgcggcgg gacggcgcca agttcgacct 28320 gtccttcgtg ctcgccccgg gcggcccgga gggcggcgac atgcccggag tgctcaccta 28380 cagcaccgac ctcttcgacc gcgcgaccgc gcaggggctg gtggaccggc tgctgcgggt 28440 gctggacgag gtgctcgccg cccccgccac cccggtgggg aaggtggacg tgctcctccc 28500 cggcgaggcg cagcgggcgc tggagcacgg ccgcggaccc cggagcgggc acgtggcgga 28560 cgaaccgctg gcccgcttcg aggggtgggc ggcggccacc cccgacgccc ccgccctgcg 28620 gtggaacggc ggccggctga cctacgccga gctggaccgg cgggcgggcg cggtggcccg 28680 ggtgctcctc gggcggggca tcgga.cccga ggacgtggtc gcggtggccg ccccgcgccg 28740 cccggaggtg gtggccgcgc tgctgggagt gctcaagtcc ggcgccgcct acctgccggt 28800 ggacgaggcg tggccggccg agcggcggcg gcaggtgacg gccgacgccg gggcgcgcct 28860 gctgctggcc ccggggagca ccgacgccgc ccgcgccgcc ctcgccccgc 
ccaccgggcc 28920 gggcacggag gtgctcggcc tggcggaccc ggtcttcacc gccgccggcg gctccgccct 28980 cccggcggtg caaacccacc cgcgcgcgct cgcctacgtc atctacacct cgggctccac 29040 cggccggccc aagggcgtcg gcgtggagcg cggggcgctg gccgactacg tggacggggc 29100 cgtccgccgc tacccggacg cggcgggcac ggccctgctg cactcaccgc tgaccttcga 29160 cctcagcggc accgccctgt tcaccccgct ggccgcggga ggctgcgtcg tgctggggga 29220 ggtggaccgg gaggcggagg agagccgggc gacgttcgtc aaggcgacac cgtcccacct 29280 gccgctgctg gagcggcacc cggggctgct ggaggaggcc ggcaccctgg tgctgggcgg 29340 tgaggcgctc gacgggcgcg ccctgcgcga ctggcgcgcc gcccacccgc gcgccacggt 29400 CgtCaaCgCC tacggcccca cggaactcac cgtcaactgc gCCgagCaCC gCatCCCCCC 29460 cggcggcccc gtgcccgagg ggccggtgcc gatcggccgc ccgttcgccg gcgtccgcgc 29520 gatggtgctg gacgcggggc tggccccggt gcccccgggc gtggtcgggg agctgtacgt 29580 cgcggggccg ggagtggccc gcggctacct cggccgcccg ggcctgaccg cggagcggtt 29640 cctgccgtgc ccgttcgggg agccg~gggga gcggatgtac cgcaccggtg atctggcccg 29700 ccggctgccg ggcggtgaac tggag~tacgc cggccgaacg gacgagcagg tgaagctgcg 29760 cggcttccgg atcgaactcg ccgacgtggc gcaggccctg gccgcggcgg agtcggtcgc 29820 ccgggcggtc gccgtcatac gggaggaccg ccccggggac cggcggctga ccggctacgt 29880 ggtcccggcg gcgggagccc gcccccagga ggacgaactg cggagcacgg tggcccgcac 29940 gctgcccgag tacatggtgc cctcctccgt ggtggtcctc gacgagctgc ccaccacgcc 30000 gcacggaaag ctggaccggc gcgccctgcc cgcccccgcg caccgctccc gcggcggccg 30060 gccgccgcgc gacgagcgcg aacgggccct gtgccggatc tacgccgaag tgctcggcgt 30120 acccgaggtc ggcgccgagg acgacttctt cgcgctcggc ggccactcgc tgctcgccac 30180 ccggctggtc aaccggatca ggtcggaatt cgccgacgag ctggacgtac gggccgtgtt 30240 cgaggcccgt acggtcgccg cgctggcagc ccggctgcgg accagccgcc ctaacgcccg 30300 ccccgcgttg cgacggatgt cgcggtcgga gaactcgtga tgcttcccct ttccctcgcc 30360 cagcagagac tgtggttcct ccaccagatg gacggcccca gcgccacgta caacatcccc 30420 acggcgctgc ggatgaccgg cgcgctggac gtcgccgcgc tgcgcgaggc gctgcgcgac 30480 gtcgtgcggc gccacgagac gctccgcacg gtcttccccg acgccggcga cggcgcccgg 30540 cagcacgtcc tgcccgcggc cgaggccgcc 
gtggagctga ccgtcaccgg gaccaccgag 30600 gccgaactgc cggccgtcct ggcccaggag gccggccacg ccttcgacct ggcccgcgaa 30660 gtgcccctga gggcgcggct cctcc;cgctc ggcgaacggg accacgtgct gtgcctggtg 30720 atccaccaca tcgccagcga cggctggtcg cgcgccccgc tcgcacgcga cctcaccacc 30780 gcctacgccg cccgcagcga gggccgcgcg ccgcagtggg aggaactccc cgtccagtac 30840 gccgactaca ccctctggca gcgggagctg ctcggctccg aggaggaccc cgagagcctg 30900 ctcagccgcc agacggcgta ctggaagcag gcgctcgccg gtctgccgga cgccatcgag 30960 ctgcccttcg accgcccccg cccgccgatc gccggccacc gcggcgacac cgtgccgatc 31020 accctgccgc cgcggaccca cgagcggatc gccgcgctgg ccggccgcca cggcgcgagc 31080 acgttcatgg tggtgcaggc ggcgctggcc ggcctgctgt cccggctcgg cgcgggcacg 31140 gacatccccc tgggcacatc ggtggccgga cgcaccgacg aggcgctgga ggggctcatc 31200 ggcttcttcg tcaacaccct ggtgctgcgc accgacacct ccgggaaccc caccttcgac 31260 gaactggtcg caagggcccg cgagaccgcc ctggacgcct acgcccacca ggacgtgccg 31320 ttcgagcggc tggtggaggc actcgccccc gagcgctccc tggcccgcca cccgctcttc 31380 caggtgagcc tgagcctcca gcacgccacc gagcagacgg cggtcctgga cgggctggag 31440 atcgccccgc tggacacggg ctggcgggcg gccaagttcg acctgtcctt cgacctcctg 31500 gagaagcacg gccccggcgg ccgtccggac ggcatcaccg gcaccgtcga gtactccacc 31560 gacgtcttcg acgccgccac cgtccgcggg atcggcgagc gcctcgtccg cttcctcgag 31620 gccgcggtgg acgcccccgg ggcacgcctg ctctccgtcg acctgctctc cgccggtgaa 31680 cggcggcgcc tgctggcgga gttcggcgca tcgcggtccg ggaccgacgg gacggccggg 31740 gccgaggagg agccggaacc ggtctgcgac accttcgcgc ggcaggccgc cgccaccccg 31800 gaggccgtgg ccgtcgtcgg cggcgacacc gcgctcacgt tcgccgaggc cgacgcccgg 31860 gtctcccggc tggcccggct gctgatctcc cgcggggccg gccccgagac ccgggtggcc 31920 gtctgcctgg gccggaacgc cctgtggccg acggccgtcc tggcggtgct gcggagcggc 31980 gccgcctacg tgccgctgga cccgcgctcc ccggccgagc gcctggccgc cgtcgaacgc 32040 gacgccaccc ccctgatcgt gctcc~ccgag cgcggcaccg aggccgcggt cgccggcctc 32100 acggccccgc tgctggtcct ggacc~acccg cggaccgagg ccgggatcga ggcgcaggac 32260 ccggccccgg tcaccgacgc cgaccgcacc gcgcccctcc tgcccggcca cgcggcgtac 32220 gtcatccaca 
cctcgggctc caccggccgg cccaaggggg tcgtggtgga ccaccgcggt 32280 ctggcgcggc tgctccaggc ccaccgccgg gtcaccttct cccgcattcg cccccacgga 32340 gcgggacccg cccgggccgc ccacgtctcg tccttctcct tcgacgcctc gtgggacccg 32400 ctgctggcga tggtcgccgg acac<;aactg cacatgatcg acgaggatct gcggctcgac 32460 ccgccgggag tggtggccta cttcc:gcgac cgccgcatcg actacgtcga cctcaccccc 32520 acctacttcc gcagtctgct cgacgccgga ctgctggagg agaccgcccc ctgcccgtcg 32580 ctgatcgccc tgggcggcga ggcgatggac ggcgagctgt gggagcggct gcgcgcggcc 32640 gccccccggg tgacggcgat gaacacctac ggccctaccg agaccgccgt cgacgcggtg 32700 gtgaccgaac tgggggacct gccg<:acggc acgatcgggc ggcccgtgcc gcggtggcgg 32760 gcctacgtgc tcgacgccgg gctgc:agccg gtcccgcccg gcgtgctggg cgagctgtac 32820 ctggccgggc ccggggtcgc ccgcggctac ctggggcagc acggcct:gac cgcggagcgg 32880 ttcgtggcct gcccgttcgg ggagccgggg gagcggatgt accgcac:cgg cgacctggcg 32940 cggtggctcg acgacggcaa cctggtctgc gccggacgcg gcgacgagca ggtcaagatc 33000 cgcggcttcc gcatcgagcc cggcgaggtg gaggcggccc tgcgggagct ggagggcgtc 33060 gcgcaggccg ccgtggccgt ccgcgaggac actcccggaa cccgcaggct ggtcggttac 33120 gtggtggggg ccgacggcgc cgaccccggc ctgctccggc ccgccgaggt gctggcgcgc 33180 ctgcgcgacc ggctgcccga ccacctggtg ccgtccgcgc tcgtacggat cggcgaactg 33240 cccgtcaacg ccagcggcaa gctggaccgg gccgcgctgc cggcgcccga ccccgcggcc 33300 ttccccgccg gccggcaacc gcgca.ccgac ctggagcggg acctgtgcgc gctgttcgcg 33360 gacgttctgg gcaccgggag tgtcggcatc gacgacgact tcttcgtccg gggcggcgac 33420 agcatcctct ccatccagct ggtcggcagc gcccgccggg ccggcctgga gttcaaggtc 33480 cggcaggtct tcgagctgcg gacccccgcg ggcctggcca ccgtggcccg ccggaccggc 33540 gcgggacggc aggaggaccc cggcgccgcc gtcgggccgc tgccgccgct gcccgtggtc 33600 gccgagaccc tggcggccgg cgggccggtc ggcgagtaca accagtcggt cgtcctcgcc 33660 tccccgccgg gcgccgggcc cgacgacgtg cgcgacgcgc tccaggcgct gctggaccgg 33720 cacgacgcgc tgcggatcca cgccgccccg gcggcggaac ccggccgcct gtgggatttg 33780 agggtggagc gggccggcac ggtcacggcg gagcggtgcc tgcgcaagat cgacgcggcc 33840 gggatgtccg aggaggagct ggcggaggcg gtggccgccg aggccgtcgc 
ggcccgggag 33900 gccctcgacc ctgtcgccgg agccctcgtc ggggcgatct ggttcgaccg gggcggggag 33960 ccgggccggc tcgtgctggt gatcc:accac ctcgccgtcg acggcgtctc ctggcgcatc 34420 ctgctcggcg acctccgcga ggcgtggcgg gcgctgcggg acggccgccg cccggagctc 34080 ccccgcacgg gcacctcgct gcgcacctgg.gccacccggc tcgccgaccg ggccgccgac 34140 ccggccgtca ccgcccagct ggaccactgg acggccacgc tcgccgacgc cggccccgag 34200 gtgggcagcc gcccgctgga ccggacccgg gacaccgcgg ccacctccgc cgtcctgagc 34260 ggcgagctgc cggagcccgt caccgccgca ctgctgggcc cggccccggc ggccttccgc 34320 gccggggtga acgacctgct gctgaccgcc ttcgcgctgg ccgtcaacca ctggcggggc 34380 gaggagggcg agccggtcct ggtggacctg gagggccacg gccgggcgga ggacctggtg 34440 ccgggggccg acctgtcccg tacggtcggc tggttcacca gcgtgtaccc ggtgcggctg 34500 gccgccggag cggtcaccgc cgccgacctc gccgggcgcg ccccggccgt cggcgacgcg 34560 atcaaacggg tcaaggaaca gctgcgggcg gtccccgacg aggggctggg gtacggcctg 34620 ctgcgccacc tcaaccccga gacgtcccgg cgcctcgcgc acggtgcccg ggcgcgcttc 34680 ggcttcaact acctcggccg gttcgccgcc gagcagggcg ggggcgagga cggctggcag 34740 ctgctcggca gcggcecggc gggccggcac ccggacaccc cgctcga cca cgagatcgag 34800 gtgaacgccg tcacggcgga gggcccggac ggaccgcggc tgatcacccg gtggacctac 34860 gccaccggcc tgctgaccga ggaggaggtg cgccgcctcg cgcgctr_ctg gtcgctggcg 34920 ctgcacgcgg tcgtcggcca cgccaccggc gccggcgccg gcgggctcac cccctccgac 34980 gtggccgtcc ccgacctcgg ccaggccgag atcgaggagc tggagcggcg ctgcggcacc 35040 gccctggagg acgtgctgcc ggtggccccc ctccaggagg gcctgct:cta ccacagcgtg 35100 tacgaccggc gcgccctgga cgtctacgtc ggccagctcg ccttcc<~cct ggacggcgag 35160 atcgacgagg acgccctgcg ggcggcggcc ggggtgctgg tcgcccgcca caccagcctg 35220 cggacgggct tccagcagcg ggagtcgggc cagtgggtgc agaccgtggc cgccgcggcg 35280 gagctgccgt ggcgctcctg cgacctgcgc gccctggagg atgcgcccgg ggacgccggg 35340 gccgcgcagc ggcggctcga cgaactggcc gcggccgaac gcaccgagag gttcgacctc 35400 acccgcccgc cgctcgtccg cttccacctg gcccgcaccg cccccgagca gtaccggttc 35460 gtgatcacca cccaccacac gatcytggac ggctggtcca tccccat=cct gctgcgcgag 35520 ctgctcgcgc tctacggcgg cgccccgctg 
ccggacgccc ccggcca ccg cgcctacgcc 35580 gactggctcg ccggccgcga cctccgggcg gcccgggagg cgtggacgcg ggcactggag 35640 ggcgtgggcg ggccgaccct ggtcgccccc ggcgccccgc gcgtcggaga gatccccgag 35700 tcggtgeggc tgaacctccc cgaggacgtc tcggcgcgac tgcggacgcg ggcccgcgag 35760 gccggagtca ccctcaactc cgtca~tgcag gccgcctggg cgctcgtcct cgcccaggag 35820 accggccgcg acgacgtcac cttcggcatc accgtctccg gccgccccgc ggaactcccc 35880 ggcgccgagg acatggtcgg catgctggtc aacaagatcc cgctgcc~cgt ccggctccgc 35940 ccggccgaac cgctgctgga actggtccgc cggctggaga aggagcagct cgaactgctg 36000 gagcaccagc acgtcccgct gaccg-ccctg caccgctgga gcgggctgcc cgaactcttc 36060 gacaccacca tggtgttcga gaactacccg gcggagatca ccgcgcggga ggcgcccttc 36120 cgcgcgtcgg gcacggccgg ctaca.gccgc aaccactacc cggtcaccct ggtcggggcg 36180 atgcgcggga gcgagctgac cgtccgcatc gactaccgcc ccgacctctt cggcgaggac 36240 tgggcccgct ccctgggccg gagggtcgtc gccgcgctga ccgaggccgc cgaccgcccc 36300 gccgccccgt ccggcacgct ggacctgctc gacggcgagg agcgcgcccg gctgctggag 36360 gactggggcg ccggcggcgc cccgcaggac gcctcgcgcg gctacgt:cga gctgttcgag 36420 gagcaggtcg cccgcacacc ggacgccccc gcggtcacgt cgcccggcgg cacgctgacc 36480 tacgccgagc tggaccggca ggcgaacggc gtcgcccggt ggctggccga ccgcaccgcg 36540 ggcaccggcg gcgccgaggt ctacgtgggc gtgctggccc cgcgccgggc ggaggcgctc 36600 gccgtcctgc tcggcgtcct gaagtcgggc gccgcctacg tgccgctgga cgagcagtgg 36660 ccggccgaac gcacccgcag ggtgctggag gactgccgcc ccgcactcgt ggtggccccg 36720 gccggcagcc ggcccgacgg cgtgcgggag gccggggcgg aggtgctcgc cgtggacccg 36780 gccgccctcg cctcccgcgg ggcgcacgcc ccggccggcg acgaacgggt gcgccccgcg 36840 gcgccgggcg gcgccgcgta cgccatctac acctccggtt ccaccggccg ccccaagggc 36900 gtggtgatcg accacagcgc cctgggcgcg tacgtcggcg gcgcacgcgg ccgctacccc 36960 gacgcggccg ggacctcgct ggcccacacc tcgctcgcct tcgacctcac cgtcaccacc 37020 ctgctcaccc cgctcgccgc cgggggcacc gtgcggctgg gcgagctgga cgagtccgcc 37080 cagaccgccg gggccaccct ggtcaaggcg acgccctcgc acctgcccat gctgcgcgag 37140 ctgcccggag tcctgccgga cgggggcacc ctgatcctcg gcggcgaggc actgaccggc 37200 aagcagctgc 
gcccgtggct cgaactgcac cccgccgcgc aggtcgtcaa cgcctacggg 37260 ccgacggaac tcacggtcaa ctgcaccgag ttccggctgc cgcgggggga accggtcggc 37320 gacggaccgg tgcccatcgg ccgcccgttc cccggcgtgc gggcctacgt gctcggcccc 37380 ggcctgcgcc cggtccccac cgggaccgtc ggcgagctgt acgtgtcggg cacgggggtg 37440 gcccgcggct acctcggccg gccggggatg accgcggagc ggttcgtggc ctgtccgttc 37500 ggggggccgg gggagcggat gtaccgcacc ggcgacctgg cccgctggcg gcccgacggg 37560 aacctggagt acgccggccg cggcgacgac caggtcaaac tgcgcggttt ccgcatcgag 37620 acggcggagg tcgcccgcgc cctggagggc caccccgcgg tcgccagggc agcggtggtg 37680 ctgcgcgagg accagccggg cgaccagcgc ctggtggggt atctggtgcc ggtcgcgggg 37740 gagggggtgc cggatcggga ggcggtgtcg gccgcggtcg cggcggtgct gcccgagtac 37800 atggtgccgt cggcgctggt ggtgctggag gacgggttgc cgttgacggc caacggcaag 37860 ctggaccggg ccgcgttgcc ggtgccggag ttcgcgccgg tgcgcggggt ggggcgggcg 37920 ccgcgcggtc cgcgcgagga gatcctgtgc gggttgttcg ccgaggtgct gggggtgccc 37980 ggggtcgggg tggacgacga cttcttcgcc ctgggcggcc actccctgct ggcgatcgtc 38040 gtgatcagcc ggatcagggc cctgctcgac gtggacctgg ccatcgacgc cctcttcgag 38100 gcgcccacgg tggccgggct ggccgcgcac ctcgacggcc cgcagcgccg ccccggcgcg 38160 gtgcgggcgg tggtgccgcg gcccgggcgg ctgccgctct cctacgc cca gcagcgcctg 38220 tggttcctcc accagatcga ggggccgagc gccacctaca ccgtcccgct ggcgctgcgg 38280 ctgaccgggc ccctggacgt ggccgccctg cgcgccgcgc tggcgga cgt ggtcgcccgg 38340 cacgagagcc tgcgcaccgt cttcgccgag gacgagcacg gcccgcacca gatcgtcctg 38400 ggaccgcggc agggcgcgcc cggcctgcag gtggtcccca ccaccgaggc ccgcctgcgg 38460 gccgacctgg aagccgaggc cgcccgcccc ttcgacctcg cacaggcacc gccggtgcac 38520 gcccggctct tcgccctcga cgagcgcacc cacgtactgc tgctggcggt ccaccacatc 38580 gccatggacg gctggtcggt ccgccccctg gtgcgcgacc tggcggccgc ctacgccgcc 38640 cggcgccggg gcgcgccccc ggccctgccc gaactgcccg tgcagtacgc cgactacacc 38700 ctgtggcagc acgaggagct cggcaccgag gacgacccgg acagcgcgat cgccgcgcaa 38760 ctgcggtact ggcgggacgc cctgcgcgga ctgccggagg aactggcgct ccccgccgac 38820 cgccctcgcc ccgccacccc ctcccaccgc gggggccggg tcggcttcac 
cgtcccgccg 38880 gcggtgcacg ggcgggtggc cgagctggcc cgggagcacc gggcgacgcc cttcatggtg 38940 gtgcacgcgg cgctggcggc gctgctgacg cggttggggg cggggacgga cgtgccgatc 39000 ggctcgccgg tggccgggcg caccgacgac gcgctggagg acctggtggg gttcttcgtg 39060 aacacgctgg tgctgcgcac cgacacctcg ggcgacccga gcttcgcgga gctgctggag 39120 cgggtgcgcg ccaccgacct ggcggcctac gcccaccagg acctcccctt cgaacggctg 39180 gtggaggtgc tcaacccggt gcgctcgctc gcccgccacc cgctcttcca ggtgctgctg 39240 gccttcaaca acggcgcggt gcccgccgac ggacccgccg accgggcctc ggacgtcctg 39300 gtccggcccg tgacggtgga gaccgcggcg gccaagttcg acctgtcgct gtccttcaac 39360 gaggaccggg cggccgacgg ctcggcggcg gggatccggg gcgtactgga gtacagcacc 39420 gacctgttcg acgagagcac ggcccacagg acggtccggt acttcctccg gctgctgggc 39480 gcggcggtcg agcagccccg cacaccgctg agcggccttc ccgtcctgag cgagccggag 39540 cggcacgagc tgctcgtccg gcgcaacgac accgcccgcg acctgccctg gacctcgccg 39600 ctgcggcgct tcgaggccca ggccgcccgg accccccggg ccacggccct ggtcgccggc 39660 gaggagcgga tctgctacgc cgacctcgac gcacgggccg accggctcgc cgggctgctg 39720 tcggacggcg ccgcgggacg gagcggaccg gtcgcggtcg cgctgcggcg cggcgccctg 39780 ctgccggtga cgctgctcgc cgtctggaag gcgggactgc actacctgcc cctggacccc 39840 ggccacccga gggagcggct ggcggacgtc ctcgccgact gcgcgcccgc atgcgtggtc 39900 accaccgcgg acctcgccgg cgacctccct cccggcccgg ccccgci:gct cgtcctggac 39960 gacccggcca ccgccgaacg cctggccgcc gcgcccggca ccgcaccggc cggggccgcg 40020 cacgcctggg gccacccgga cgacctggcg tacaccatct acacctccgg ctccaccggc 40080 cgccccaagg gcgtcatggt gacccgggcg ggcgtggcga acttcct:cgc cgacctgacc 40140 gagcggctgg agctggggcc cgacgaccgg ctgctggcgg tcacga.cggt ctccttcgac 40200 atcgccgtcc tggagctctt cgcccccctg ctcaccggcg gcgcggtcgt cctggccgac 40260 gccaccgccc agcgcgaccc cgcggccgtg cggtccctgt gcgcccgcga gggcgtgacg 40320 gtcgtccagg ccacccccgg ctggtggcac gccatggccg tggacggcgg cctggacctc 40380 accggcctgc gcgtgctggt gggcggcgag gcgctgccgg cgaccctggc ccgcgccctc 40440 ctggagcccg gccgcgcgcc gtccggcgac cgcctgctca acctgtacgg gccgacggag 40500 accaccgtct ggtccaccgc cgcgcacatc 
accgccggga ccccggaggc gcgcggcggc 40560 tcggtgccca cggggacgcc gatcgccaac accgccgcct acgtgctgga cgccgcgctc 40620 cggcccgtgc cggacggagt gccgggcgaa ctctacctgg ccgggaccgg gctggcccgc 40680 ggetacctcg gccggccggg gatgaccgcg gagcggttcg tggcctgtcc gttcgggggg 40740 ccgggggagc ggatgtaccg caccggcgac ctggcccgct ggcggcccga cgggaacctg 40800 gagcacctcg gccggaccga cgaccaggtc aaggtccgcg ggttcaggat cgagctgggc 40860 gaggtcgaga aggccctggc ggaggccccc ggcgtcggcc gggccgccgc ggccgtgcgc 40920 ccggatcccg ccggctccgc ccgcctggtg gggtatctgg tgccggtcgc gggggagggg 40980 gtgccggatc gggaggcggt gtcggccgcg gtcgcggcgg tgctgcccga gtacatggtg 41040 ccgtcggcac tggtggtgct ggaggacggg ttgccgttga cggccaacgg caagctggac 41100 cgggccgcgt tgccggtgcc ggagttcgcg ccggtgcgcg gggcggggcg ggcgccgcgc 41160 ggtccgcgcg aggagatcct gtgcgggttg ttcgccgagg tgctgggggt gcccggggtc 41220 ggggtggacg acgacttctt cgccctgggc ggccactccc tgctggccac ccggctggtc 41280 gcgcggatcc gcagcacgct cggcgtcgag ctgggggtcc gggaggtctt cgagacgccg 41340 acggtggccg ggctggccgc cgcactgtcc cgggcggggg aggccgggcc ccggctgcgc 41400 cccgccgacc cgcgacccgg gcggctgccg ctctcctacg cccagcagcg cctgtggttc 41460 gtgcagcaac tggagggacc gggcgccacc tacaacatcc cgctggcgct gcggctgacc 41520 gggcccctgg acgtggccgc cctgcgcgcc gcactggcgg acgtggtcgc ccggcacgcg 41580 agcctgcgca ccgtcttcgc cgaggacgag cacggcccgc accagatcgt cctggccgcc 41640 gCCgaCggCC CCgCCCCgct cgccggcccg gtccgcaccg acgaggagga actcccccgc 41700 ctcctgcggg aggcggccga ccacgagttc cggctggacg ccgaaccgcc gctgcgcaca 41760 cacctgttcg ccacggcacc cgacgagcac gtgctgctgc tggtcat~gca ccacatcgcc 41820 accgacgcct ggtcgcagcg gccgctgatc gccgacctgg cggccgccta cgccgcccgc 41880 cgcgcgggcc gggccccggc ctggccgccg ctgccggtcg agtaccccga ctacgccctg 41940 tggcagcggg cccgcctggg ggacgagcgg gaggccggca gcgagctggc cgcccagctg 42000 gcctactggc gggacgccct ggcgggctcc cccgaggagc tggcgctccc cgccgaccgg 42060 ccccgtcccg ccatcccctc ccaccgcggg gacagcgtgc cgatccaggt cccgccggcg 42120 gtgcacgggc gggtggccga gctggcccgg gagcaccggg cgacgccctt catggtggtg 42180 cacgcggcgc 
tggcggcgct gctgacgcgg ctgggggcgg ggacggacgt gccgatcggc 42240 tcgccggtgg ccgggcgcac cgacgacgcg ctggaggacc tggtga~ggtt cttcgtgaac 42300 acgctggtgc tgcgcaccga cacctcgggc gacecgagct tcgcggagct gctggagcgg 42360 gtgcgcgcca ccgacctggc ggcctacgcc caccaggacc tccccttcga acggctggtg 42420 gaactccgcg accccgagcg ctcgctcgcc cgccacccgc tcttccaggt ggcgctgaac 42480 ttcgacacgg cegagacggc cggcgcgcgc gacaccgcac ccgaactgga cgggctgacc 42540 gtgcgcaggg aacggctcgg cgtcacgacg tcgaagttcg acctcacctt cgcgctcacc 42600 gagacccgca cccgcgacgg cggcgccggc ggactgcgcg gcgtgctgga gtacagcacc 42660 gacctgttcg accgcagcac cgcccggcac ctggtggagc ggctcggccg ggtgctggag 42720 gccgtcgtgg aggcgcccgg cacegctctc ggcgagatcg acgtcctgct gccgggcgag 42780 cgcgaactcc tggcgggcgc gtggagcgaa cccgaccccg ggccggtcac caccgccggg 42840 gccgccgcgg acggcatccg cttcccggac ctgttcgagg cgcaggccgc ccgcaccccg 42900 cacgcgccgg cggtccgcga cggcggccgg gaggtcgcct acgccgagct gaacagccgg 42960 gccaaccggc tggcccggct gctcgccggg aggggagccg gccccgagga caccgtcgcg 43020 gtcctgctgc cgcgcggcgc cgggctgatc accgcactgg tggcggtcca gaaggccgga 43080 gccgcctacg tccccctgga cgccgagctg cccaccggtc ggatcgccca catgctggac 43140 gacgccaagc cggtgctcac cgtgaccctc accgggatgc gggacgcgct cccggccggg 43200 gcgggccccg tggtctgcct ggacgacccg gccaccgagg ccgcgctcgc cgggctcgac 43260 ggcgccgact gcaccgacgc cgaccgccgc gcgccggccg gggaccgcga tccggcctac 43320 gtcgtctaca cctcggggtc caccggcaca ccgaagggcg tcgtcgtcga gcagcggtcc 43380 ctcgccgcct tcctggtgcg ctcggccgcc cggtaccgcg gcgccgcggg aaccgtgctg 43440 ctgcacggct ccccggcctt cgacctcacg gtgaccacgc tgttcacccc gctggtcgcc 43500 ggcggctgca tcgtggtggc ggacctggac gcggcggagg gcgacgcccc gaaccggcct 43560 gacctgctga aggtcacgcc gtcccacctc gccttcctgg acgggatcgc ctcctgggcg 43620 gcccctgccg ccgacctggt cgtcgggggc gagcaactga ccggagcccg gctggcccgg 43680 ctgcgcgcgg cgcaccccgg gatgcgcgtc tacaacgact atgggcccac cgaggcgacc 43740 gtcagctgcg cggacttcgt actggagccg ggcgacgaac tgcccgcgga cgccgtgccg 43800 atcgggcgcc ccctggcggg gcaccggctg ttcgtcctgg acgagcgcct gcgcccggtg 
43860 ccggccggcg tccccggcga gctgtacatc gccggcgtgg gggtgc~cccg cggctacctc 43920 ggccggccgg ggatgaccgc ggagcggttc gtgggctgcc cgttcggggg gccgggggag 43980 cggatgtacc gcaccggcga cctggcccgc tggcggcccg acgggaacct ggagtacctc 44040 ggccgaggcg acggccagct gaaggtccgc ggcttccgca tcgaaccggg ggagatcgag 44100 gcggcgctgc tcgaccgccc ggagatcggc caggccgccg tcgtcctgcg cggggaacgc 44160 ctggtcgcct acgtcgcggc cccggaggcg gagttcgacc cggccgcgct ccgggaggga 44220 ctcgccgccc ggctgccgcg gtacatggtc ccggccgcga tcgtccggct ggacgccctg 44280 ccgctggccc ccggcgggaa gctcgaccac agggcgctgc cggagcctcc ggcgcccgcg 44340 gacgccccgc acgaccgcag gccgccgcgg gacgcgtggg agcgcgtgct gtgcgaggcg 44400 ttccgggagg tgctcggggt cgcggaggtc ggggccgacg acgacttctt cgcgctcggc 44460 ggcgacagca tcggctccat ccagctcgtc ggccgggtcc gcagggcggg tggccggatg 44520 accgtccgcg acgtcttcga acggcggacg cccgccgcgc tcgcggcccg ctcccggcag 44580 agcggggcgg ccttcgaggt gctcggcggc cgggccaccg gcccggtgcc gcccacgccg 44640 atcagctcct ggctggccga actcggcggc gcggccgagg gctacaacca gtccgtgctg 44700 ctgcgcgtcc cggcccaggc ggacgaggcc gtcctcgtcg gcgccctcca ggcgctgctg 44760 gaccaccacg acgcgctgcg gatgcgggcc gagccggcgg ccggccactg gcggatggag 44820 atcggcgagg cgggcggcgt ggacgcggcc gcggtgctgg agcgggtgcc ggcggcggac 44880 gtcccgcagg cggagctgga ccggctggtc cgcgcgcact gcgccgcggc ccgcgaacgg 44940 ctcgccccgc aggagggcgc catgctgcgc gccgtctggt tcgaccgggg gccgcgggag 45000 ccgggccacc tcgcgctcgt cgcccaccac ctggtcgtgg acggggtgtc ctggcgcatc 45060 ctcaccgccg acctcggccg ggcgtggcag gcggtcgccg acggccggga ggtccggctc 45120 gacccggtgg gcaccccgct gcgggtctgg gcgcagcggc tggcggagct cgccgccgac 45180 ccgcgccgcg ccgaccggtg cgcctactgg gaggagcagg cggcacggcc ctgggaggcc 45240 ggtcgcctcg acccggccgc ggacgacagg agcaccgagg aggccctgtc cctgaccctc 45300 ccggccggca ccacccgggc cgtgctgggc tcggtgcccg ccgcactggg ggtgggggtg 45360 accgaggtcc tgctgggaac gttcgccgcc gccgtgcggc ggtggcgccc ggcggaggcc 45420 gcggacggcg tcacggtgga cctggagggc cacggccgcg aggaggaggt ggtccccggg 45480 gccgatctct cccgcacggt cggctggttc accgccgccc 
acccggi=ccg gatcccggcc 45540 gccgggccgg acgacgaccg ggccggcgcg ctgcgggcgc tggccgggac gctggaccgg 45600 gtgccggacg ccggcctggg ctacggcatg ctgcggtacc tcaacccgcg gacccgggaa 45660 cggctcgcct ccctccccgc gccgcgcttc ggcttcaact acctgggccg gttcggcgac 45720 ccgggcgcgg accgggacgg ggccgcggag gcccccgcct ggtcgccggt gggcagcggg 45780 gtcgcgggcc agcccgcggg gctcccgctc gcccacgaga tcgaggtcaa cgcggtcgcc 45840 gccgacggcc ccgacggccc ccgcctgatc gccacctggt cctgggccgg ccggctccac 45900 cgggagcagg acgtccggga gctggccggg ctgtggttcc gggaactgga cgagctcgcc 45960 tccgccgaac ggtccccggc cgccggcccc ccgcccccgg cggacccagc ccccctggtc 46020 gagctctccg acgccgaact cgaccagctc gaagcagagt ggaaggccga ctgatgcgcc 46080 gatccctaca ggacgtcctg cccctttccc cgctgcagga gggcctgctc ttccacagcg 46140 agtacgccgg cgacgaggcc gtcgacgtct acaccgtcca gaccgaggtg gaactgcacg 46200 ggccgctgga cgtgccggcg ctgcgcgcgg ccgccgaggc gctgctgcgg cgccacgaca 46260 acctgcgggc gggttttgcg acccgcgccc tgaaggaccc cgtgcagttc gtccccaggg 46320 aggtcgagct cccctgggag gaggccgacc tgcgcgcggc cggcgatccg gaggcggagg 46380 cggcacggcg gctggacgag caccgctggc gccgcttccg gcccgccaag ccgccgctgg 46440 tgcggttcct gctgctgcgc acggcgcagg accgccatcg gttcgccctc accaaccacc 46500 acatcctgct cgacggctgg tcgatgccgg tgctgctgcg cgagctcatg ctgctctacc 46560 gcaccggcgg cgacgcctcc gccctgccgc cggtgcgccg ctaccgcgac tacctggcct 46620 ggctggaccg ccgcgacgag cgggcggcgc aggacgcctg gcggcgcgcg ctggagggtc 46680 tggaggcccc catcctcgtc gccccgcggg ccgaccgggc ggcggaggcg ccgcagtggc 46740 tggacttcga actgcccgcg gcggcctcgg ccggactgac ccgggccgcc cgcggcgccg 46800 gcctcacgct caacaccgtc gtgcaggggc tgtgggccct gaccctcgcc cgcaccaccg 46860 gcagccagga cgtggtgtac ggcgtggtcg tctccggccg gccaccggaa ctggacggcg 46920 tcgagtccat ggtcggcctg ttcgccaaca ccgtcccgct gcgggcccgg atgcccgcgg 46980 ccgaaccgct gacggacttc ctccggcggt tgcagcgcga gcagagcgcg ctcctggacc 47040 accagcacgt gcggctggcc gacatccagc gcctggtcgg ccaggg~~gag ctgttcgact 47100 cggtgatggc gttcgagaac tacccggccg ggcccgcgga ggagcc~~ccc ggcgattccc 47160 ccgccgcgcc ggggcgggtg 
cgcgcggtgg cgtcgaggat gcgcgacgcc atgcactacc 47220 cgctcggcct gctcgcctcc cccggcccgc cggtgcggtt ccgcctgggc caccggccca 47280 gcgcggtgac gccgcgtctg gcggctgccc tgcgcgaccg cctgctgcgg ctcgtcgacg 47340 ccttcctggc tgccccggac ctgcccctgg ggcggctcga cgtcctcgac gacgccgaac 47400 gggccctggt gctggagaag ttcaacgaca ccgcgcgcga ggtcgaggac accaccgcca 47460 ccgagctgtt cctccggcag gccgcccgca cccccgggcg gaccgccgtg gagacggccg 47520 accgcagcat cggctacggc cggctcgccg accgctccgg ccggctggcc cgcctgctgg 47580 tggagcgcgg ggcgcgggcc gagcggttcg tcgccctggc gctgccgcgc tcgccggaac 47640 tggtcgaggc cgcgctcgcg gtgtggcaga ccggcgccgc ctacgtaccg gtcgaccccg 47700 gccacccggc cgaccgggtg gcccggctgc tgcgggaggc cgaaccgctc ctcaccgtca 47760 ccaccgccga cctggccggc cggctgccgg cggacctccc gctgctggtc ctggacgctc 47820 cgcggaccgt cgccgcgctg gaggaactgc ccggcggccc gctgggcgac ggcgagcgcc 47880 cctcgccgcc ggacccgggg aacgccgcct acgccatcta cacctccggc tccaccggac 47940 ggcccaaggg cgtggtggcc acccaccggt ccctcgtcgg ctacctgctg cgcggctcgc 48000 aggagtaccc gtccgacgga cgctccctgg tgcactcgcc ggtctccttc gacctcacgg 48060 tcggcgccct ctacgtcccg ctggtcagcg ggggcacggt ccgcctcgcc tccctggacg 48120 acgagccggt cctgcgcccc ggcgaggcac ccccggactt cgcgaaggtg acccccagcc 48180 acctgccggt cctcgaaggg ctgccgcggg aggtcagccc gaccgcggcg atcaccttcg 48240 gcggcgaaca gctcaccggc cggcacctgc ggcggtggcg cgccgaccac ccggacgtca 48300 ccgtctacaa cgtctacggg cccaccgaga cgaccgtgaa ctgctccgag caccggatcg 48360 ccccgcgcgc cccggtcgcc gacggcccgg tgcccatcgg gcggccgctg tggaacaccc 48420 gcctgttcgt cctcggcccc ggcctggtcc cggtgccggt cggcgtcccc ggcgagctgt 48480 acgtcgccgg gtccggcctg gcccgcggct acctccgcga cccgggcagg accgccgagc 48540 gcttcgtggc gtgcccctac gccgccgggg agcggatgta ccgcaccggc gacctcgtcc 48600 ggtggaacga ggacggcctg ctggagtacc tcggcagggc ggacgaccag atcagcctgc 48660 ggggcttccg ggtggagccc ggcgaggtgg aggcggcgct ggcggcccac cccgccgtgc 48720 gccgggccgc ggtggtgatg cgggaggacg cggcggggga cgcccggctg gtcggctacg 48780 tcgttcccgc cggggaggac gccggggacg gcgcgccccc ctccgc~~ccg gccggatccg 48840 
acaccgggct gcccaccgcg cagatcaccg agcacctgcg ccggatgctg ccgccctaca 48900 tggttccctc gcacctggtc gaactgcccg cgctgcccgt cacgcccaac ggcaagatcg 48960 accgcgccgc gctgccggag ccccccgccg cgggcgactc cgccggggga gcgcccagat 49020 ccccccgcga ggagatcctg tgcgggctct tcgccgacgt gctccggcgc ccgcaggtct 49080 ccatcgacga cgacttcttc gccctgggcg gccactccct gctggccacc cgcctggcca 49140 gcagggtgcg ggcggccctg gacgtggagc tgccggtgcg ccggctcttc gagcacccca 49200 cggtgaggtc cctgtccgcg ctgctggact cccgcggcgg cgaacgc ccg ccggtgaggc 49260 cggcggagcg cccggagcgc gtcccgctct cgtacgccca gcagcggctg tggatcctgc 49320 accggctcac cggccccgac gccacctaca acatccaccg ggccctgcgg ctcgacggcg 49380 acctcgacgt ccgggcgctg gaggccgcgc tgcacgacgt ggccga.acgg cacgagacgc 49440 tgcgcaccgt catcgccgag ggcgccgagg gcccgttcca gaaggtgctg ccggcccggc 49500 ggacggacga gcgcctcacc gtcctgccgg cggccgagga ggaggtggac cgcaccgtcg 49560 gcgagctggc ggcccaccgc ttcgacctgg aagccgaacc ccccatgcgg gcctggctgc 49620 tggagaccgg cccgcacagc cgggtgctcg tgctggtgct gcacca.catc gccagcgacg 49680 gctggtcggg caggaggctc ctgcgcgacc tgttcaccgc ctacaccgca cgccgcgcgg 49740 gccgggcccc gaactggcgg ccgctgccgg tgcagtacgt ggactacgcc ctgtggcagc 49800 ggcggttcct cggcgatccc gcggaccccg gcagcaccgc agccgcccag ctggagtact 49860 gggagcggca gctggccggt ctgccggagg agctgaggct gccggccgac cggccgcgtc 49920 cggccgtccc gtcccgcacc ggcggccagg tctggctgac gctgcccgca tccgtccaca 49980 ccgccgtggt cgacctggcc cggacgtgcc gggcgagcgt gttcatggtc gtccaggccg 50040 ccgtcgcagc cttcctcacc cggatgggcg ccggggagga catccccgtc ggcaccccgg 50100 tggccgggcg caccgacgag gcggtggagg acctggtcgg attcttcgtc aacaccctgg 50160 tcctgcggac cgacacctcc ggcgaccccg cgttcgccga gctggtcgga cgggtccgcg 50220 agaccgcgct ggccgcctac gcgcaccagg acctgccctt cgagcagctg gtggagcgcc 50280 tgagcccggc ccgctcgctc ggccggcacc cgctcttcca ggtcgccctc tcctgcaaca 50340 acaccgagga gcagctgggg cgccagggct ccccgccccc cggactcgcc gtcagccccc 50400 accaggtgga gaccgcgcgg tcgaagttcg acctgatgtt caccttcctg gagggccacg 50460 gggaggacgg gcggccggcc gggatcgaga ccgccctgga 
gtacagcgcc gacctcttcg 50520 acagggagac cgcgcaggac ctgctggagg cgttcggccg gatgctggcg ctctgggcgg 50580 cggacccggg cggccccatc ggagcccggg agctgctcgc ggccgacgag cggcacacgg 50640 tcgtggccga gtggaacgcc acccggcgcg cgggcctggt cgcgacgctg ccggagatgt 50700 tccaggagca ggtcgcccgg actcccgacg cccccgccgt ggagcacgcc ggccgcgggc 50760 tgacgtacgc cgaactcaac gcccgggcca accggctcgc cagggtgctg gtccggcacg 50820 gcgtcggccc cgagcgccgg gtggccctgc tgatgccccg ctccctcggg caggtcaccg 50880 cgctgctggc ggtgctcaag gccggcggcg cctacgtgcc ggtggacccc ggccacccgg 50940 aggagcgcat cgccttcatg ctgcgcgaca gcgcccccgc gctggtcctg gcggccgagt 51000 cgtgcgcggc gggacgcggg gagatcgccg gggtcccggt cctggtgccc gacgacgggc 51060 cggccggggc ggagccggac gggccgtccg ccgccgacct caccgacggg gaccggaacg 51120 cccccctgac cgccggcaac gccgcgtacg tcgtctacac ctccgc~ctcc acgggccgcc 51180 ccaagggcgt ggtgaccgag caccgcggtc tgctgtcgct ggccgt.ggcg cagcgggagc 51240 ggtacccggt gcgggccggc agccgtgtgc tgcagctcgc gtcgccgtcc ttcgacggcg 51300 ccgtgctgga gctgctgatg gcgttcgcca ccggagggac gctggtcctg gccgaccggc 51360 cgctcctggc cggggagctg ctcggcgaga ccatcgccgc gcggcggatc agccacgcct 51420 tcattccccc ggcggcgctg accggtctca cgcccgaggg actggactgc ctgcgctgcc 51480 tcgtcgtcgg cggcgaggcg gtcacggcct cggtcgtgga ccgctgggcg cccggccggc 51540 gcatgctcaa cgtctacggc ccgaccgagg ccaccgcggt caccctgacc agcggagccc 51600 tctccccggg cggaccggcg cccgccatcg gcacgcccgt gcccaacacc cgggcctacg 51660 tgctcgacga ccggctgcgg ccggtgcccc ccggggtgac gggcgagctc tacctggccg 51720 gcgcgtcgct ggcgcgcggc tacggcgacc gccccgggct caccgcgacc cggtacgtcg 51780 gctgcccgtt cggggagccg ggggagcgga tgtaccgcac cggcgacctg gcgcgctggg 51840 accgggaggg gcgagtccac tacgtgggcc gcgcggacga gcagatcaag ctgcgcggtt 51900 tccgggtgga gcccggcgag gtccaggccc ggctcaccga gcacgccgcg gtgcgggagg 51960 cggccgtcgt cctgcgggag gacgagccgg gggagcgcag actggtggcc tacgcggtgc 52020 cggccgacgg cctgccccgg cccaccgccg cggaactgcg ggcccatctg gccgccctcc 52080 tgccgcccta catggtgccc tcggcctatc tggtgctgga cgccctcccg gccaacgcca 52140 acggcaagct cgaccgggac 
gccctgcccg aaccggaacc gctcgccgag gagggcggcc 52200 ggccgccgag cgacgaacgg gaggccgccc tgtgcgaggt gttcgccgag gtgctggggc 52260 gcgagtggat cggtgccgac gacgggttct tcgagaacgg ggggcactcg ctgctggcca 52320 cccggctggt cacgcgggtc cgcgagcgcc tcggggtgcc cgtcgccgcg cgggacctgt 52380 tcgaggcgcc gacggcggcc ggcctggccg agcgcatcgg gcggggcgcc gagcgccgcg 52440 ccccggcgcc cctgctgacg ctgcgggggc gcggcgacca gccgccgctg ttctgcgtcc 52500 acccggccgt cggcctgggg tgggcgtacg cgggcctcct ccagcggctc cccgcggacg 52560 tcccgctcta cgcgctgcag gcccgcacac ccgccgccgg cggcggactg ccgcgcagca 52620 tcgaggagat ggccggcgac tacgtccggc tggtccgcgc cgtccggccg cacgggccgt 52680 accggctgct cggctggtcg ctgggggccc acgtggccca caccatggcc ggcctgctgg 52740 agcgcgacgg cgagcgggtg gacctgctcg ccgtgctgga cgccta~~cct ccccaccgca 52800 cggggacgac cggacgggag gggacggagg ccgagatcgt cgcggc~~aac ctgcgggagt 52860 cgggattcgc ctgggaggag gcggagctgc gcgacggacg cttcccgctg gagcggttcc 52920 gtgcccacct gcgccggatg gacagctcgc tcggccacct cgacgacggc gagttgacgg 52980 cggccaagga cgtctacgtc aacaacgtac ggctcatgcg gtcccacacc cccggacgcg 53040 tccgctgcgg gatcgtgctg atgaccgcgg agcgctcccg cagcctcgac cccggggcgt 53100 gggacgcgca caccgaggag ggcgtcgagg tgcaccgcgt cgacgccgcc cacatgtcca 53160 tgctcaccga accgacgtcg gtcgccgaag tcggccgcgt cctgac:ccgc cgactggact 53220 ccctgcgggg agccgacacg aagaaacgag aggtgtgaac gatgac:caac cccttcgacg 53280 acgccgaggg caccttccac gtcctggtca acgacgaggg ccagcactcg ctgtggccga 53340 acttcgtgga ggtcccggcg ggctggcggg cggtggtgga ggaccgcccc cgccaggagt 53400 gcctggacta catcgaggag aactggaccg acatgcgccc caagagcctc atcgaggcca 53460 tggaggccca cgagaaggcc gcgaccgcgg ccgagtgacc gggccccggg cgggcggacc 53520 cgacgggacc ccgcccgccc gcggaccccg gcgcggggcc cggcgaccgt gccgggcccc 53580 cgcgcggcgc acccgcaccc ggcggccgct tccccggccc gcgcaccgca caccgaccga 53640 ccatccggtc ctcagggccg ccgacgaccc tccgagggag ccttcgatgc cgaccacacg 53700 gatcaacggg atcgccctgg accacgaccg caccggcagc ggcccgcccg tcctcctgat 53760 catggggagc ggcgccgcca agtcggcctg gcacctgcac caggtgcccg cgctggtcgc 53820 
cgagggcttc gaggccgtca cgttcaccaa ccgcggcgtc cctcccagcg gaggcggccc 53880 cggcttcacc ctcggcgaca tggcggccga caccgtcggc ctgatcgagc acctcggcat 53940 cggcccctgc gcggtcgtgg ggatgtccct gggggccagg gtcgcgcgcg aggtcgcccg 54000 gacccgcccc gacctggtct cccgatgcgt cctcgtggcg ccgcgggccc gctcggaccg 54060 gatgagggcc gcctgcaccg ccgccgagat cgccctcgcc gacagcggcg tcaccctgcc 54120 gccgcgctac cgcgcggtgg tgcgggcgat gcagaacctc tcgccgcgga cgctcgcgga 54180 cgaccggcag atcgccgact ggctggacgt cctcgaactg gcggcggccg acgggcccgg 54240 cctgcgcacc cagctggagc tcagcgccgc cgacgaccgc ggggaggacc tggccggcat 54300 caccgccccc tgccgggtga tcgccttcgc cgacgacatc gtggcgccgc cgcacctggc 54360 gaaggagatc gcggacgccc tgcccgaggc cgactaccac gtcgtccccg actgcggcca 54420 ctacggctac ctggagcggc ccgaccgggt caaccggctc atcaccgaat tcctccgcgc 54480 accccagacc acacagggat gaaagagcac accatggaac cgatcactcc ctggcggccc 54540 gccgagatca gcccgggcag ccactccctc cccgccaccg ccgacgccct ggccgacttc 54600 ctgcgggact ccgagcggat cgccgggctc ctggccgccc acaaggtgct ggtcctgcgc 54660 ggtttcggcg tgggcccgca ggagctggag aagatcatgc cgctcct=get gccggaccgc 54720 ctggcgtacg tcttcggcaa ctccccgcgc accaaggtgg ggcgcaacgt gtacacctcc 54780 acggagtacc cgcaggagtt caccatctcg atgcacagcg agatgtccta cgccgcgcag 54840 tggcccgccc ggctgctgtt ctactgcgag cgggcggccg ggagcggcgg cgccacaccg 54900 gtggtggaca acgccgcctg gtaccgggcg ctggaccggg aggtccgcga ggccttcgcg 54960 ggcggcctgc gctacaccca gaacctgcac gggggacggg gcctgggcaa gagctggcag 55020 gacaccttcg agaccgagga ccggtccgag gtcgaggact acctctcccg gagcggcgcc 55080 acctggcagt ggaacgcgcg caacgggctg cgggtcagcc acgtccggcc cgcgacgatc 55140 gagcacccgg agacggggga accgctgtgg ttcaaccagt ccgaccagtg gcacccggcc 55200 acgctcggcg acgaggccgc cgccgcgctg atggagatgc tgcccccgga ggagctgccc 55260 cagtcggtga ccttcgccga cggcaccccg ataccggccg actacgtgcg gcaggtgcgc 55320 gaccgeggac tggagcacgc ggtggacaac gactggcggg ccggtgacct gatgctcgtc 55380 gacaacgtcc aggcggcgca cggccgcagg ccgttcaccg gtgaccgccg ggtcctggtc 55440 gccatgtcgg actagccggc gcgccgcgga ccggcaccgc gggcggccgc 
cgcccgccgc 55500 cgcccacccc caacccccag ccccaggaag gccgcacgga tgagcatcgc cctcgccgac 55560 gtggaaggcg tgaacaggca cgagaccgag ttcctctacg acgagatctt cacccgccgc 55620 ggttacctgc cggaggtgct gcacctgccg gaggaccccg tcgtcttcga cgtgggggcc 55680 aacatcggca tgtacaccct cttcgtgaag tccgagagac ccggtgccac ggtccactcc 55740 ttcgaaccgg tcccctcggt gtacgaggtc ctgtgccgca accgggagcg ccacggcgtg 55800 gcggggctcg cctgccccta cggcctcgcc gagagcgagc aggaggtcga gttcacccac 55860 tacccgggct actcgaccat gtccacgcgc agcaccctgg cggacaccga ggcggagaag 55920 gcgttcgtcc gggaacaggt gcggaccgac cacctgcccg aggccgagcg gatgctggac 55980 gaactcctgg ccttcaggtt ccgggagcag acggtgcgct gccggctccg tccgctctcc 56040 gcggtcctcg acgagcaccc cgtggaccgg atcgacgtgc tgaagatcga cgtgcagcgc 56100 ggcgagcagc aggtgctgcg gggcatcgag gaacggcact ggccgctggt gcgccagatc 56160 gcgatggagg tgcacgacag ccccggcggc gtcaccgccg gccggctcgc agcggtgacc 56220 gacgggctgg agcggcgcgg gttccgcgtc caggccgtgc aggaggaccg gtacgcgggc 56280 agcgaccgct actccgtgtt cgccgtctcg cggcaccacg gcggccggga cgcggcgggg 56340 ccctcctgag gcacccgccg gccggggccc cggcgcgacc gccgcgcggg acgcccgccg 56400 gaccgcagac ccggaccacc cttccccaac cccctcggag ggacagcatg aaccccgaaa 56460 gccgcttcga actgtccgac gccgagcgcg ccgacgtcgc gctcctggcg gaggagctga 56520 cgcgcacgcc ccccggcctg gtggacgagc gggaatggct cgaccggtgc cgcagcctct 56580 cctgccacct gcccgcccgc ctccaggacc ggctccgcgc cttccgccac gaccccggcc 56640 gcgcgggcat gctgctcata cgcaacctgc ccgcggccgg gtccgtgccg gacaccccgc 56700 gggagggcga ctcggtggaa cgcagggcga cactgagcgc ctcggtgctc tgcgccgtct 56760 ccatggaact gggcgaggtc gtcgcctacc gcaacgagaa gcagggggcg ctggtgcaga 56820 acgtggtgcc cgtgcccggc cgggaggacc agcagtccaa cgccggctcc gtcccgctgg 56880 agatgcacac cgagaacgcc ttccaccccc accgccccga ctacgtcggg ctgctgtgcg 56940 tccgcagcga ccacgaccgg accgccggac tccgggtggc ctgcgtccgc gccgcgatgg 57000 agcacctgga cgccggaacc cgcgagaccc tgcggcgccc cctgttcacc accgaaccgc 57060 cgccctcctt cgagcgtccg gacagcggga ccaagccgca cgccgtgctc accggcgacg 57120 tggaggaccc ggacatccgc atcgacttcc 
acgccaccca cgcgacggac ccgtggggca 57180 ggcaggccat ggacgccctg gccgacgccg tccgcgcggt ctccgaggaa ctcgtcctgg 57240 acccggccga tctggtgtac gtggacaacc gcgtcgcgct gcacgggcgt acggccttca 57300 ccccgcgcta cgacggcgag gaccgctggc tccagcgcgc cttcgtccac ctggaccacc 57360 ggcgctcccg ggccgtgcgc tcgctgcacg ggcgcgtgct gagctgatgg ccgcggacgg 57420 agcgatccat gccgccggtc tgcacaagcg cttcggagcc gtccacgcgc tccaaggcgt 57480 ggacctgacg gtggcccagg gcgagatcat gggcctcctg ggccacaatg gcgcggggaa 57540 gaccaccctc gtcaacatcc tgtccaccct gacgcccccc acgtcgggca ccgcctcggt 57600 cgccggcttc gacgtggccg ggcgccccga ggaggtgcgc cggcgcatcg gcgtcaccgg 57660 gcagttcgcc tcgctcgacg agcagctctc cggctacgac aacctcgtcc tggtggcccg 57720 gctgtgcggg gcctcccggg cgcaggccga ggaccgggcg ggcgagctgc tggaggcctt 57780 cgggctccgt ggcgccgggc agcggaaggc ggtgacctac tcgggaggca tgcgccgccg 57840 gctggacctg gccctgggtc tggtcgggcg ccccgacgtg ctgttcctgg acgagccgac 57900 cgtcgggctg gacctgccca gccgcatcgg tctgtgggag atggtcgagg atctcacgcg 57960 cggcggtacc accgtcctgc tcaccacgca gtacctggag gaggcggacc ggctcgccga 58020 ccgcatcacc gtcctgggag cgggccgggt cctggtctcc ggcacggcgg ccgagctgaa 58080 gtcgcgggcc ggcagcggtt ccatctcgct caggctcgaa ccgcacgggg atccggccgc 58140 ggccgccggc gccctgcgcc gggccgggtt cccgcccagc gtggcggccg cccggcgcga 58200 gctgacggtg ccggcgggcg gctcggcgga cctggccacg gtgatccggg tgctggacac 58260 cgtcgggcag aacgtgaccg agatccgcca ctcggagccc tccctggacg acgtctacct 58320 cgccctcacc ggagcggagc cccacggccc cggcccggtg gaggcggcgc gccggacggc 58380 tcccccgaag gcgcaccggc ccaccggcga cacgcaggag accacgtgaa cgcaccgcgg 58440 acgatggcac ccccggcccg ccccgccgcg gggccgcgcg gcccgcgctg gaggccgacg 58500 ggcagagccg cgcagctgtg gatcctcacg gcgcgtcaga tccgcctggt gtacgccgac 58560 cgcagggtcg tgctgttcag cgtggcccag ccggtggtga tgctgctgct gatcagtcag 58620 gtcttcggca gcctcgcgga ccgctccgtc ctcccgcggg gcgtgtccta cgtcgagttc 58680 ctgctgccgg ccctgctggt caccaccggg atcggcacct cgcagtccgc gggagtgggt 58740 ctcgtgcggg acatggaagg cgggatggtg cgccgcttcc gcgtcctccc gctgtcgctg 58800 ccgctggtcc 
tggtggcgcg ctcgatcgcc gacctgaccc gctcggggat gcagctgctg 58860 gtcctggtgg ccggcggcca cctgctgttc ggataccggg ccgggggagg gctggggggc 58920 ctggtggcgg cgctgctgct gtccaccctg gtgatctggg ccctgatctg gatcttcatc 58980 gccctcgccg cgtggctgcg caaggtggag gtgctctcca gcatcggctt cttcgtcaac 59040 ttcccgctga tgttcgcctc cagcgcgttc gtgccgatcg aggtgctgcc cggctggctg 59100 gcggccgtcg ccaccgtcaa tcccgtgagc tacgcggtgg aggcgtcccg cggcctcgcg 59160 ctgggagggc aggtcgggag cgaggtgccc gcggcgctgt gcgcgggcct cggcctgatg 59220 ctggcgatga tgctgctcgc CgCCCgCgCC ttccgccggc cgcccgacga gtgaccgcgc 59280 cggcccgcag gggcggccgc gccggctcag agcgcgcggc ggcgcctccg gcgccaccac 59340 acgacgccga ccgccaggac ggccgccacg cccagcagcg cccacaggat cctggtgacc 59400 acggtctcgg cctcccgcac ggaggccgag accagcgacc ccgccgacac gtagagcgcg 59460 gaccacatcg cggccccggc gagcgaggcc ggcaggaagc ggaagtagcg caccgagccc 59520 actccggccg tcgccggggt gagcgtgcgc acgatgggca gcagcctggt caggaagacg 59580 gcggtcgccc cgtaccggtg gaagagcgcc tgggcgcggt cccagtgctg ctggccgatc 59640 cgccgcacca ggcgcgtgtc ccgcatccgc tccccgtagc ggattccgag gaagtagccg 59700 atgtggtcgc cggccgaact gctgaccgcc acgacgagga agagcgccag cagggaacgg 59760 gtcccctccg ttccggcgct cagcaccagg accgcgacct caccggggac gaacatgccc 59820 gctccgaggc ccgattccgc gaaagcgaac gccgaggcca gtgcgaatct ggaggccgga 59880 tccatctccg acaccgctgt cagcacatcg tggatccatg ccatggcagt cctgaaccgc 59940 ctctcctcgc tcgtggtgcc cggtacggat tctgctgtgc ggggcgcgga caccggctcc 60000 ggcggcgccc gttccccgga actcccgcat ccttctccgg gaggaagtcg aaatgtcttt 60060 cgggtggacg ccgggtgacg gggaccggtg cggcacgggg tgctccggct ttcgtgtgga 60120 tccgcatccc gcgctcacgg tggacagtag ggcggcttcc cggtttcgcc ggaaacaccc 60180 ccggccgatg ggcgcgcgcc cgggagaagg gcgaatgcgc gccaatgtac caggcggccc 60240 gccggcggac ccgttcgtcc ggcccggccg gaaccggccg cggcgcccgt gccccggccg 60300 gttccggcgg gaccggcccc gcccgcgggg cgcgggggcc caccggccgg agccggcggg 60360 ggcggaaatc cccggggcat gtcggcctgc gccggaggtg attgatggaa tgtcggattc 60420 cgggtacggg attccggggg aacgcccgcg gcggggttca acggcgggcc cgtccgattg 
60480 cccgcggccg gcactccggc gatatggggg attcgatccc cccgaatccc atgtccggcg 60540 ccctgtgccc gttgggcacg gatgctcatg aatcgcggtg cccggaaatt tccaacccag 60600 gtgaatcgcc tcttccgtgg agttcgcggc gccgacggca agggcgcggt tgcgccgcgc 60660 ccagcggagg acgaccgatc acaatgccgc gtccaaagcg cccttgacgc ggggatcgaa 60720 ctcctgacag ggtgcaacgg tgtcactgcg gaccggactg gccgggcacc ggtgccccga 60780 aattccgtca atcgcgcaca ccggtgccgg cggatgggat tgcgggtcgt ttgcgagcag 60840 cggaggatat ccgccccgca cgcggtacgg gtggtcgacg atgaaaccca tcgaatgacc 60900 ggattggtgg aaatgcaggc ggatgcactg gcggagacgg atttctactt gccgtccatg 60960 accctcgaag actatctgcg ggacgcggtg ccggcccacc cggtcctgaa ggcggccgtg 61020 gacttcggcc gcccgggcgc cgacaaggcg ttgagggcgc tggccgcgac gaccacggaa 61080 ttcgactccg acgacaccgg acgcggcgac tcctatcgcc gggcgcagcg gaactcgtcg 61140 gtccgctggc ggggcatacg gcagctcctg gaactggccg ccccctccga cgcggcggcc 61200 ccccacatcg tcctggacgt cctcggaggg gacgggacca tcgcgcgcgc cgtccacgaa 61260 cacgcccccg acctgcggga ccgcgtgagc atcctgacgg gcgacctctc cggcgacatg 61320 gtcgaacggg ccctcgccca gggcctgccc gccgtccggc aggcggcgga ccacctcttc 61380 ctgggggacc gcaccgtgga cgcggcgctg ctcgcctacg gaacccacca catcgcgccg 61440 caggacaggc tcaacgcggt caccgaggcc ctgcgcgtcg tcagggacgg gggccgggtc 61500 gtcctccacg acttcgacga cgccagcccc atggcgcgct tcttcaccga cgtcgtccac 61560 ccccacacca cggcgggcca cgactaccgg cacttctcgc gcggatcgct ggtcgagctc 61620 ttcgaggagg ccgggacacc ggctcgcgtc gtcgacctgt acgacccgct ggtcgtccgc 61680 ggcgacacgg aggaggacgc ccgccggcgc atgtgcgagt acgtggccga catgtacgga 61740 gtcggggagt acttcgccgc gcagggcggg accgacgccc gctggcggat cctggagcgg 61800 tacttcgggc acgagggcta cctggcgggc ctgcccgcgg aggtcgactt cacaccgcgc 61860 cccgtcgtct accgctcccg cggcgcctac gtcgccgagg tgccccgtgc ggcgatggtc 61920 gccgtcgcgc ggaagacggc atga 61944 Information for SEQ ID NO: 35 Length: 573 Type: PRT

Organism: Streptomyces refuineus Sequence: 35 Met Ala Asp Pro Leu Leu Phe Asn Pro Arg Thr Tyr Asp Pro Gly His Phe Asp Pro Glu Thr Arg Arg Leu Leu Arg Ala Thr Val Asp Trp Phe Glu Gln Arg Gly Lys Arg Arg Leu Ile Glu Asp Tyr Arg Thr Arg Ala Trp Pro Ala Asp Phe Leu Ala Phe Ala Ala Glu Glu Glu Leu Phe Ala Thr Phe Leu Thr Pro Ala Arg Glu Ser Asp Gly Arg Arg Asp Arg Arg Trp Asp Thr Ala Arg Ile Ala Ala Leu Ser Glu Ile Leu Gly Phe Tyr Gly Leu Asp Tyr Trp Tyr Val Trp Gln Val Thr Val Leu Gly Leu Gly Pro Val Trp Gln Ser Gly Asn Ala Ala Ala Arg Ala Arg Ala Ala Glu Leu Leu Ser Arg Gly Glu Val Phe Ala Phe Gly Leu Ser Glu Lys Ala His Gly Ala Asp Ile Tyr Ser Thr Asp Met Leu Leu Glu Pro Asp Gly Asp Gly Gly Phe Arg Ala Gly Gly Ser Lys Tyr Tyr Ile Gly Asn Gly Asn Ala Ala Gly Leu Val Ser Val Phe Gly Arg Arg Thr Asp Val Glu Gly Pro Asp Gly Tyr Val Phe Phe Ala Ala Asp Ser Arg His Pro Ala Tyr His Val Val Arg Asn Val Val Asp Ser Ser Lys Tyr Val Ser Glu Phe Arg Leu Glu Asp Tyr Pro Val Gly Pro Glu Asp Val Leu His Thr Gly Arg Ala Ala Phe Asp Ala Ala Leu Asn Thr Val Asn Ile Gly Lys Phe Asn Leu Cys Thr Ala Ser Ile Gly Ile Cys Glu His Ala Met Tyr Glu Ala Val Thr His Ala Arg Asn Arg Ile Leu Tyr Gly Arg Pro Val Thr Ala Phe Pro His Val Arg Arg Glu Leu Thr Asp Ala Tyr Val Arg Leu Val Gly Met Lys Leu Phe Ser Asp Arg Ala Val Asp Tyr Phe Arg Ser Ala Gly Pro Asp Asp Arg Arg Tyr Leu Leu Phe Asn Pro Met Thr Lys Met Lys Val Thr Thr Glu Gly Glu Lys Val Val Asp Leu Leu Trp Asp Val Ile Ala Ala Lys Gly Phe Glu Lys Asp Thr Tyr Phe Ala Gln Ala Ala Val Glu Ile Arg Ser Leu Pro Lys Leu Glu Gly Thr Val His Val Asn Leu Ala Leu Ile Leu Lys Phe Met Arg Asn His Leu Leu Asp Pro Val Glu Tyr Ala Pro Val Pro Thr Arg Leu Asp Pro Ala Asp Asp Ala Phe Leu Phe Arg Gln Gly Pro Ala Arg Gly Leu Gly Ser Val Arg Phe His Asp Trp Arg Pro Ala Phe Asp Ala His Ala His Leu Pro Asn Val Gly Arg Phe Arg Glu Gln Ala Asp Ala Leu Cys Glu Phe Val Ala Thr Ala Ala Pro Asp Glu Glu Gln Ser Arg Asp Leu Asp Leu Leu Leu Ala Val Gly Arg Leu Phe Ala Leu
Val Val His Gly Gln Leu Ile Leu Glu Gln Ala Gly Pro Ala Gly Val Asp Gly Asp Val Leu Asp Glu Leu Phe Ala Val Leu Val Arg Asp Phe Ser Ala His Ala Val Glu Leu His Gly Lys Asp Ser Ala Thr Ala Pro Gln Gln Arg Trp Ala Leu Asp Ala Val Arg Arg Pro Val Val Asp Asp Ala Arg Ser Ala Arg Val Trp Glu Arg Val Glu Ala Leu Ser Gly Ala Tyr Glu Met Thr Pro Information for SEQ ID NO: 36 Length: 1722 Type: DNA
Organism: Streptomyces refuineus Sequence: 36 atggccgacc cgctgctgtt caacccccgc acctacgacc ccgggcactt cgaccccgag 60 acccgcaggc tgctgcgcgc caccgtcgac tggttcgagc agcgcggcaa gcgccgcctg 120 atcgaggactaccgcacccgcgcctggccggcggacttcctcgccttcgccgcggaggag180 gagctgttcgccaccttcctcacccccgcccgcgagagcgacggccggcgggacaggcgc240 tgggacaccgcgcggatcgccgccctcagcgagatcctcggcttctacgggctcgactac300 tggtacgtctggcaggtcaccgtcctcggactcggaccggtctggcagagcggcaacgcc360 gcggcccgcgcccgcgccgccgaactgctctcccggggcgaggtgttcgcgttcggcctg420 tcggagaaggcccacggcgccgacatctactccaccgacatgctgctggagcccgacggc480 gacggcggcttccgggccggcggctccaagtactacatcggcaacgggaacgccgcgggg540 ctcgtctccgtcttcggccgccgcaccgacgtcgaggggcccgacggctacgtcttcttc600 gccgcggacagccgccacccggcgtaccacgtcgtgaggaacgtcgtcgactcctccaag660 tacgtcagcgagttccggctcgaggactacccggtcggcccggaggacgtcctgcacacc720 gggcgcgccgccttcgacgccgcgctcaacaccgtcaacatcggcaagttcaacctctgc780 accgcctcgatcggcatctgcgagcacgcgatgtacgaggcggtgacccacgcccgcaac840 cggatcctctacggccgccccgtcaccgccttcccgcacgtgcgccgcgagctgaccgac900 gcctacgtccgcctggtcgggatgaagctgttcagcgaccgagccgtcgactacttccgc960 tccgcgggccccgacgaccgccgctacctgctcttcaacccgatgacgaagatgaaggtg1020 accacggagggcgagaaggtcgtcgacctgctgtgggacgtcatcgccgccaagggcttc1080 gagaaggacacctacttcgcccaggcggccgtcgagatccggagcctgccgaagctggag1140 ggcacggtccacgtcaacctcgcgctgatcctcaagttcatgcgcaaccacctgctggac1200 ccggtcgagtacgcgcccgtgcccacccgtctggacccggccgacgacgccttcctcttc1260 cggcagggccccgcccgcggcctgggatcggtccgcttccacgactggcggcccgccttc1320 gacgcccacgcccacctgcccaacgtcggccgcttccgggaacaggcggacgccctgtgc1380 gagttcgtcgccaccgcggcccccgacgaggagcagagccgcgacctcgatctgctcctc1440 gccgtcggccggttgttcgcgctggtcgtgcacggccagctgatcctggagcaggcgggg1500 ccggccggtgtggacggggacgtgctcgacgaactgttcgccgtcctcgtgcgcgacttc1560 tccgcgcacgccgtggaactgcacggcaaggactccgcgacggcgccgcagcagcgctgg1620 gccctggacgcggtccggcgccccgtcgtcgacgacgcccggtcggcgcgcgtgtgggag1680 cgcgtcgaggccctgtccggggcgtacgagatgacaccgtga 1722 Information for SEQ ID NO: 37 Length: 601 Type: PRT
Organism: Streptomyces refuineus Sequence: 37 Val Ala His Val Ser Gly Pro Pro Ala Asp Pro Pro Ala Gly Ser His Leu Val Ala Ala Ile Arg Ala Thr Ala Glu Ala Asp Pro Glu Arg Lys Ala Val Gly Phe Val Arg Asp Pro Glu Arg Glu Gly Glu Glu Ala Leu Arg Ser Tyr Ser Trp Leu Asp Asp Arg Ala Arg Arg Ile Ala Val Leu Leu Arg Gly Ala Arg Leu Gly Ala Gly Ser Arg Val Leu Leu Leu Phe Pro Gln Ser Ala Glu Phe Ala Ala Ala Tyr Ala Gly Cys Leu Tyr Gly Gly Met Val Ala Val Pro Ala Pro Leu Pro Thr Gly Thr Ser Leu Glu Thr Ala Arg Val Ala Gly Ile Ala Arg Asp Ala Gly Ala Gly Ala Val Leu Thr Val Ser Asp Thr Glu Ala Glu Val Arg Arg Trp Ala Ala Glu Thr Gly Leu Gly Asp Leu Pro Leu Phe Ser Val Asp Glu Leu Pro Asp Asp Thr Asp Pro Gly Glu Trp Arg Glu Pro Glu Ile Arg Ala Gly Thr Val Ala Val Leu Gln Tyr Thr Ser Gly Ser Thr Gly Ser Pro Lys Gly Val Val Val Thr His Gly Ala Leu Ala Asp Asn Val Arg Ser Leu Leu Ser Gly Phe Asp Leu Gly Thr Gly Ala Arg Leu Gly Gly Trp Leu Pro Met Tyr His Asp Met Gly Leu Phe Gly Leu Leu Ser Pro Ala Leu Phe Ser Gly Gly Ala Ala Val Leu Met Ser Gly Ser Ala Phe Leu Arg Arg Pro His Leu Trp Pro Thr Leu Ile Asp Arg Phe Gly Val Val Phe Ser Ala Ala Pro Asp Phe Ala Tyr Asp Tyr Cys Val Arg Arg Val Glu Pro Glu Gln Val Asp Arg Leu Asp Leu Ser Arg Trp Arg Trp Ala Ala Asn Gly Ser Glu Pro Ile Arg Ala Glu Thr Leu Arg Ala Phe Thr Lys Glu Phe Ala Pro Ala Gly Leu Pro His Asp Ala Met Thr Pro Cys Tyr Gly Leu Ala Glu Ala Thr Leu Leu Val Ser Leu Ser Ala Gly Glu Leu Arg Thr Arg Arg Val Asp Ala Ala Ala Leu Glu Asn His Arg Phe Val Glu Ala Ala Ala Gly Arg Pro Ser Arg Glu Val Val Ser Cys Gly Arg Pro Pro Ala Leu Glu Val Arg Val Ala Asp Pro Ala Thr Gly Glu Pro Val Thr Gly Asp Ala Val Gly Glu Ile Gln Val Arg Gly Ala Ser Val Ala Gly Gly Tyr Trp Arg Lys Pro Glu Ala Thr Ala Glu Thr Phe Val Thr Ala Ala Asp Gly Ser Gly Pro Trp Leu Arg Thr Gly Asp Leu Gly Ala Leu Tyr Glu Gly Glu Leu Tyr Val Thr Gly Arg Ile Lys Glu Leu Leu Ile Val His Gly Arg Asn Ile Tyr Pro His Asp Val Glu Arg Glu Leu Arg Ala His His Asp Glu Leu
Gly Ala Ile Gly Ala Val Phe Ser Val Pro Thr Glu Glu Gly Glu Ala Val Val Val Thr His Glu Val Val Pro Ser Val Arg Asp Asp Arg Gly Pro Ala Leu Val Thr Ala Val Arg Ala Thr Leu Ala Arg Glu Phe Gly Leu Ala Pro Ala Gly Val Val Leu Val Arg Arg Gly Arg Thr Pro Arg Thr Ser Ser Gly Lys Val Gln Arg Arg Leu Ala Ala Arg Leu Phe Arg Thr Gly Glu Leu Ala Gln Val His Ala Asp Pro Gly Ala His Arg Leu Val Ala Ala Leu Arg Glu Ala Asp Gly Leu Arg Asp Ala Pro Ala Ser Thr Thr Information for SEQ ID NO: 38 Length: 1806 Type: DNA
Organism: Streptomyces refuineus Sequence: 38 gtggctcacg tgagcggacc cccagcagac ccgccggccg gctcccacct ggtggccgcg 60 atccgcgcgacggccgaggccgaccccgagcgcaaggccgtcggcttcgtccgggatccg 120 gaacgcgaaggtgaggaggcgctgcggagctactcctggctcgacgacagggcccgccgc 180 atcgccgtcctcctccgcggggcgcggctcggcgcgggctcgcgcgtcctgctgctcttc 240 ccgcagtccgcggagttcgcggcggcctacgccggatgcctctacggggggatggtcgcc 300 gtccccgcgcccctgcccacgggaacctccctggagaccgcacgcgtcgccggcatcgcc 360 cgggacgccggggcgggcgccgtcctcaccgtctccgacaccgaggcggaggtccggcgg 420 tgggcggccgagaccggtctgggcgacctgcccctgttctccgtcgacgaactgcccgac 480 gacaccgacccgggggagtggcgggagccggagatccgggccggcaccgtggcggtgctg 540 cagtacacctccggctccaccggcagccccaagggggtcgtcgtcacccacggcgcgctc 600 gccgacaacgtccgcagcctcctgtccgggttcgacctgggaaccggcgcccggctgggc 660 ggctggctgccgatgtaccacgacatggggctgttcgggctgctgagcccggcgctgttc 720 agcggcggcgccgccgtgctgatgagcggcagcgccttcctgcgcaggccgcacctgtgg 780 ccgacgctgatcgaccgcttcggcgtggtcttctccgcggcgcccgacttcgcctacgac 840 tactgcgtacggcgggtggagcccgagcaggtggaccggctcgacctctcgcgctggcgc 900 tgggcggccaacggctcggagcccatccgggccgagacgctccgcgccttcaccaaggag 960 ttcgcccccgcggggctgccccacgacgcgatgaccccctgctacggactggccgaggcg 1020 accctgctggtctccctgtcggcgggcgagctgcgcacccggcgggtggacgccgcggca 1080 ctggagaaccaccgcttcgtcgaggcggccgcgggccgcccgtcccgcgaggtcgtctcg 1140 tgcggccggcccccggccctggaggtccgcgtggccgaccccgcgaccggagagcccgtc 1200 acgggcgatgcggtgggcgagatccaggtgcggggcgcgagcgtggccggcggctactgg 1260 cggaaaccggaggcgaccgccgagacgttcgtcacggccgcggacggctccgggccctgg 1320 ctgcgcaccggcgacctcggcgccctgtacgagggcgagctgtacgtcaccggccgcatc 1380 aaggaactcctcatcgtgcacggccgcaacatctacccgcacgacgtcgagcgcgaactg 1440 cgcgcccaccacgacgagctcggcgcgatcggcgccgtcttctccgtccccacggaggag 1500 ggcgaggccgtcgtggtcacgcacgaggtggtcccgtccgtccgggacgaccggggcccc 1560 gcgctggtgacggcggtacgggcgacgctcgcccgggagttcggcctggcaccggccggg 1620 gtggtgctggtgcgccgcggccgcaccccgcgcaccagcagcggcaaggtgcagcgccgc 1680 ctggccgcccggctcttccgcaccggggaactcgcccaggtccacgccgaccccggtgcc 1740 caccggctcgtggcggcgctccgcgaggcggacggcctgcgcgacgcccccgcgtccacg 1800 acatga 1806

Information for SEQ ID NO: 39 Length: 99 Type: PRT
Organism: Streptomyces refuineus Sequence: 39 Met Ser Leu Ser Pro Pro Ser Ser Ser Pro Pro Ser Ser Pro Pro Pro Ser Pro Pro His Asp Pro Asp Ala Leu Arg Gln Trp Leu Arg Glu Gln Cys Ala Asp Cys Leu Gly Val Pro Pro Ala Ser Leu Ala Thr Asp Val Pro Leu Thr Asp Tyr Gly Met Thr Ser Val Thr Gly Thr Ala Leu Cys Gly Met Val Glu Asp His Leu Asp Val Glu Cys Asp Leu Ser Leu Leu Trp Gln Glu Gln Thr Ile Asp Gly Ile Thr Ser Arg Leu Ala Ser Arg Thr Ala Arg Information for SEQ ID NO: 40 Length: 300 Type: DNA
Organism: Streptomyces refuineus Sequence: 40 atgtccctgt ccccgccttc ttcgtccccg ccttcttccc cgcccccttc tccgccgcac 60 gaccccgacg ccctgcggca gtggctgcgc gagcagtgcg ccgactgcct cggcgtcccc 120 ccggcatccc tcgccaccga cgtccccctc accgactacg gcatgacctc cgtcaccggg 180 accgccctgt gcggcatggt ggaggaccac ctggacgtcg agtgcgacct gagcctgctc 240 tggcaggagc agacgatcga cggcatcacc tcccggctgg cctcgcgcac cgcgcgctga 300 Information for SEQ ID NO: 41 Length: 6291 Type: PRT
Organism: Streptomyces refuineus Sequence: 41 Met Leu Glu Ser Pro Ala Asp Arg Val Ala Ala Thr Ser Ala Gln Ser Gly Ile Trp Thr Ala Gln Arg Leu Arg Ser Asp Asp Arg Leu Tyr Thr Cys Gly Leu Tyr Leu Glu Leu Asp His Val Val Glu Glu Val Leu Gly Glu Ala Ile Gly Arg Ala Val Ala Asp Thr Glu Ala Leu Arg Thr Ala Phe Gly Glu Asp Gly Asp Gly Ala Leu Glu Gln Arg Val Leu Ala Arg Pro Pro Asp Thr Gln Thr Arg Leu Phe Arg Leu Asp Leu Gly Gly Asp Asp Arg Pro Arg Ala Glu Ala Leu Asp Trp Met Asp Arg Gln Gln Ala Glu Pro Trp Asp Leu Ala Ala Gly Asp Thr Cys Arg His Thr Leu Ile Arg Leu Gly Gly His Arg Thr Val Leu His Leu Arg Tyr His His Leu Ala Leu Asp Gly Phe Gly Ala Ala Leu Tyr Leu Asp Arg Ile Ala Ala Val Tyr Arg Ala Leu Arg Thr Gly Arg Glu Thr Pro Pro Cys Thr Phe Ala Pro Leu Ala Arg Leu Val Glu Glu Asp Arg Ala Tyr Arg Arg Ser Ala Arg His Arg Arg Asp Ala Asp His Trp Arg Thr Arg Phe Ala Asp Leu Pro Arg Pro Thr Ser Leu Ala Gly Ala Ala Ala Pro Ala Ala Pro Ala Ala Leu Arg His Thr Val Arg Val Ser Ala Ala Asp Thr Ala Ala Leu Gly Leu Arg Ala Asp Arg Ser Gly Ser Thr Trp Pro Val Phe Ala Thr Ala Ala Val Ala Ala Phe Leu Ser Arg Leu Ala Pro Gly Glu Glu Val Val Val Gly Phe Pro Val Thr Ala Arg Val Thr Pro Ala Ala Val Arg Thr Pro Gly Met Leu Ala Asn Val Val Pro Leu Arg Ile Arg Val Arg Gln Gly Met Ser Phe Ala Ala Leu Leu Asp Arg Thr Ala Ala Glu Ile Gly Ala Thr Leu Arg His Gln Arg His Arg Thr Glu Asp Ile Gly Arg Ala Leu Gly Leu Pro Pro His Gly Ala Gln Pro Ala Pro Thr Leu Val Asn Val Met Ala Phe Ala Pro Val Leu Asp Phe Gly Asp Cys Leu Ser Pro Val His Gln Leu Ser Ala Gly Pro Val Glu Asp Leu Ala Val Asn Leu Leu Gly Thr Pro Gly Asp Gly Arg Glu Leu Glu Ile Thr Val Ala Ala Asn Pro Leu Leu His Ser Glu Asp Ala Val Ala Ser Leu Ala Ala Arg Leu Ala Glu Phe Leu Ala Arg Ala Gly Glu His Ala Asp Ala Pro Ile Gly Arg Thr Arg Leu Leu Gly Ala Ala Glu Glu Ala Glu Ala Leu Ala Ala Gly Arg Ser Pro Arg Arg Asp Leu Pro Ala Arg Thr Leu Pro Glu Leu Phe Ala Arg Gln Ala Ala Arg Thr Pro Asp Ala Pro Ala Val Ala Ser Asp Arg Thr Thr Trp
Thr Tyr Ala Arg Leu Asp Ala His Ala Gly Arg Val Ala Arg Arg Leu Ala Ala Arg Gly Val Gly Pro Glu Ser Ile Val Ala Leu Ala Val Pro Arg Gly Val Glu Leu Ala Ala Leu Val Ile Gly Val Gln Arg Ala Gly Gly Ala Tyr Leu Pro Ile Asp Pro Glu Tyr Pro Ala Glu Arg Ile Gly Phe Leu Leu Arg Asp Ala Arg Pro Ala Leu Val Val Cys Glu Pro Gly Thr Asp Leu Pro Asp Thr Gly Cys Pro Gln Val Pro Ala Gly Asp Leu Leu Asp Ala Gly Val Arg Cys Ala Glu Ala Glu Glu Pro Ala Pro Gly Asp Leu Pro Ala Asp Leu Pro Ala Tyr Val Val Tyr Thr Ser Gly Ser Thr Gly Arg Pro Lys Gly Val Val Val Thr His Ala Gly Ile Ala Ala Leu Ala Ala Glu Gln Ile Asp Arg Tyr Arg Leu Gly Pro Gly Ser Arg Val Ala Gln Leu Ala Ala Leu Gly Phe Asp Val Ala Val Ala Glu Leu Ala Met Ala Leu Thr Ser Gly Ser Cys Leu Val Leu Pro Pro His Gly Leu Ala Gly Glu Glu Leu Ala Glu Phe Leu Arg Ser Arg Arg Ile Thr Thr Ala Leu Thr Thr Ala Ser Val Leu Ala Thr Val Pro Pro Gly Asp Phe Pro Asp Leu Ser Asp Leu Ala Thr Gly Gly Glu Gln Pro Pro Pro Pro Leu Ile Ala Arg Trp Ala Pro Gly Arg Arg Met Phe Asn Val Tyr Gly Pro Thr Glu Ala Thr Val Gln Ala Thr Ser Gly Arg Cys Ala Ala Gly Gly Glu Arg Met Pro Asp Ile Gly Asn Thr Glu Ala Gly Val Asp Ala Tyr Val Leu Asp Gly Ala Leu Arg Pro Val Pro Asp Gly Ala Thr Gly Glu Leu Tyr Leu Arg Gly Arg Gly Leu Ala Arg Gly Tyr Leu Arg Arg Pro Gly Leu Thr Ala Ala Arg Phe Val Ala Asp Pro His Thr Gly Thr Gly Glu Arg Met Tyr Arg Thr Gly Asp Leu Val Arg Arg Val Pro Gly Glu Gly Arg Thr Val Leu Glu Phe Val Gly Arg Ala Asp Asp Gln Val Lys Ile Arg Gly Phe Arg Val Glu Pro Gly Glu Val Glu Ala Ala Leu Ala Glu Leu Asp Gly Val Ala Gln Ala Leu Val Thr Val Arg Glu Glu Arg Pro Gly Asp Arg Arg Leu Val Gly Tyr Leu Val Pro Asp Pro Ala Gly Arg Asp Gly Ser Ala Arg Gly Pro Asp Val Glu Arg Trp Arg Arg Leu Ile Ala Ala Arg Leu Pro Ala His Leu Val Pro Ser Ala Leu Val Glu Leu Ala Glu Ile Pro Arg Thr Ala Asn Gly Lys Val Asp Arg Ser Ala Leu Pro Ala Pro Gly Gly Thr Pro Pro Pro Ala Gly Arg Ala Pro Arg Asn Ala Arg Glu Glu Ala Leu Cys Ala Leu Phe Ala Glu Val Leu Gly
Val Glu Glu Val Gly Ala Asp His Asp Phe Phe Ala Leu Gly Gly Asp Ser Leu Leu Ala Ala Arg Leu Ala Ser Arg Ile Arg Asn Arg Leu Gly Lys Ala Val Thr Val Arg Glu Val Phe Arg Ala Pro Thr Ala Ala Gly Leu Ala Glu Ala Leu Gly Gly Glu Ala Arg Ala Asp Gly Arg Val Arg Pro Val Arg Pro Arg Pro Glu Arg Val Pro Leu Ser Ala Ala Gln Arg Arg Leu Trp Phe Ile Asp Glu Leu Gln Gly Ala Ser Ala Ala Tyr Asn Ile Pro Thr Thr Leu Arg Phe Asp Gly Pro Leu Asp Val Pro Ala Leu His Ala Ala Leu Gly Asp Val Val Asp Arg His Glu Ala Leu Arg Thr Thr Val Arg Pro Ala Ala Glu Asp Ala Thr Gly Ala Ala Ala Ala Pro Glu Gln His Ile Ala Pro Pro Gly Gly His Arg Leu Pro Leu Pro Val Arg Asp Ile Ala Pro Glu Glu Leu Ala Gly Glu Leu Arg Ala Ala Ala Gly His Val Phe Asp Leu Thr Arg Asp Leu Pro Val Arg Ala Arg Leu Tyr Arg Thr Ala Glu Arg Glu His Val Leu Leu Leu Leu Val His His Ile Ala Ala Asp Gly Ala Ser Met Gly Pro Leu Ile Gly Asp Leu Ala Thr Ala Tyr Thr Ala Arg Leu Ala Gly Arg Ala Pro Asp Leu Pro Ala Pro Glu Val Thr Tyr Ala Asp Phe Ala Leu Trp Glu His Arg Gly Gly Glu His Ala Ala Ala Gln Ala Glu Gly Leu Asp Tyr Trp Arg Arg Ala Leu Ala Gly Leu Pro Asp Arg Ile Arg Leu Pro Ala Asp Arg Pro Arg Ser Gln Glu Pro Val Arg Arg Gly Gly Ala Ala Arg Phe Glu Val Pro Pro Ala Leu Tyr Ala Arg Leu Ala Glu Leu Ala Gly Ser Val Arg Ala Thr Pro Phe Met Val Leu Gln Thr Ala Val Ala Val Leu Leu Ser Arg Met

Gly Ala Gly Pro Asp Val Pro Leu Gly Thr Pro Val Ala Gly Arg Pro Asp Glu Ala Leu Asp Glu Val Val Gly Cys Phe Val Asn Thr Val Val Leu Arg Thr Asp Val Ser Gly Asp Pro Thr Val Ala Glu Leu Leu Ala Arg Thr Arg Asp Gly Asp Leu Ala Ala Leu Ala His Gln Asp Val Pro Phe Asp Arg Val Val Asp Ala Val Asn Pro Val Arg Ser Ile Ala Arg His Pro Leu Phe Gln Val Met Leu Val Leu Asn Gly Ala Glu Gln Arg Arg Gly Arg Ala Arg Phe Pro Gly Leu Asp Ser Arg Ile Gly Ala Val Asp Ser Gly Glu Thr Lys Phe Asp Leu Ser Trp His Phe Thr His Arg Asp Gly Pro Glu Arg Ala Leu Glu Gly Thr Leu Val Tyr Ala Ala Asp Met Phe Gly Ala Ala Thr Ala Arg Arg Leu Thr Glu Arg Leu Leu Gly Val Leu Thr Ala Met Ala Asp Asp Pro Gly Arg Pro Val Gly Ser Ile Asp Val Leu Ser Ala Ala Glu His Arg Ala Val Arg Ala Trp Gly Thr Gly Ala Ala Gln Asp Arg Thr Arg Arg Pro Glu Pro Val Ala Gly Arg Ile Ala Ala Gln Ala Ala Arg Thr Pro Gly Ala Pro Ala Val Thr Glu Pro Gly Arg Val Trp Thr Tyr Ala Glu Leu Asp Ala Arg Ala Asn Arg Leu Ala Arg Ala Leu Ala Ala Arg Gly Val Gly Ala Glu Asp Leu Val Ala Val Leu Leu Pro Arg Gly Ala Asp Leu Val Ala Thr Leu Leu Gly Val Leu Arg Ala Gly Ala Ser Tyr Leu Pro Leu Asp Thr Gly His Pro Ser Asp Arg Asn Arg Trp Ala Val Ser Asp Ala Ala Pro Ala Leu Val Val Thr Asp Gly Ala His Arg Gly Thr Leu Pro Gly Glu Thr Gly Cys Ala Val Leu Val Leu Gly Gly Glu Asp Ala Glu Ala Glu Leu Ala Gly Arg Ala Pro Thr Pro Pro Asp Glu Thr Asp Leu Ala Arg Pro Val Ala Gly Ala Asn Ala Ala Tyr Thr Ile His Thr Ser Gly Ser Thr Gly Arg Pro Lys Ala Val Val Val Thr Arg Asp Ala Leu Asp Ala Phe Val Glu Arg Thr Val Asp Thr Tyr Gly Asp Ala Leu Arg Gly Thr Ser Leu Leu His Ser Pro Val Ala Phe Asp Leu Thr Val Ala Thr Leu Tyr Gly Pro Pro Ala Ala Gly Gly Arg Ile His Val Glu Asp Leu Asp Glu Ala Gly Ile Ala Arg Trp Glu Arg Glu Cys Pro Ala Phe Leu Lys Ala Thr Pro Ser His Leu Ala Leu Leu Glu Glu Phe Gly Gly Ser Ala Ala Pro Gly Thr Val Val Leu Ala Gly Glu Gln Leu Leu Gly Ala Arg Leu Asp Arg Trp Arg Ala Arg His Pro Gly Thr Ala Val Phe Asn Ser Tyr Gly Pro Thr Glu Thr Thr
Val Asn Cys Leu Glu Tyr Arg Ile Ala Pro Gly Ala Glu Thr Ala Pro Gly Pro Val Pro Val Gly Arg Pro Val Ala Gly Val Arg Val His Leu Leu Asp Ala Arg Leu Arg Pro Val Ala Pro Gly Val Thr Gly Glu Leu Tyr Val Cys Gly Pro Gly Val Ala Arg Gly Tyr Arg Gly Arg Pro Ala Ala Thr Ala Glu Arg Phe Val Ala Cys Pro Phe Gly Glu Pro Gly Glu Arg Met Tyr Arg Thr Gly Asp Leu Met Arg Trp Thr Pro Asp Gly Ala Leu Leu Tyr Glu Gly Arg Ala Asp Ala Gln Leu Lys Val Arg Gly Phe Arg Val Glu Pro Gly Glu Val Glu Ala Ala Leu Leu Asp Leu Pro Gly Val Arg Glu Ala Ala Val Thr Leu Val Gly Gly Pro Gly Arg Gly Ser Gly Gln Ala Gly Gly Ser Ala Ala Pro Ala Arg Leu Val Gly Tyr Val Val Gly Gly Ala Phe Asp Pro Ala Ala Leu Leu Glu Arg Leu Arg Val Arg Leu Pro Asp His Met Val Pro Ala Ala Leu Val Glu Leu Asp Ala Leu Pro Leu Thr Pro Asn Gly Lys Leu Asp Arg Arg Ala Leu Pro Ala Pro Asp Phe Gly Arg His Ala Gly Arg Arg Ala Pro Arg Gly Pro Arg Glu Glu Leu Leu Cys Thr Leu Phe Ala Glu Val Leu Gly Leu Pro Glu Ala Gly Ala Glu Asp Ser Phe Phe Ala Leu Gly Gly Asp Ser Ile Val Ser Ile Gln Leu Val Gly Arg Ala Arg Arg Ala Gly Leu His Phe Thr Val Arg Asp Val Phe Glu His Pro Thr Ala Ala Gly Leu Ala Ala Val Ala Arg Ala Ala Asp Pro Ala Gly Asp Pro Gly Thr Arg Pro Ala Pro Gly Leu Pro Pro Ser Gly Pro Leu Pro Tyr Val Pro Ala Ala Ala Arg Leu Val Ala Arg Thr Gly Ser Ile Arg Ala Arg Gly Ala Asp Arg Phe His Gln Ser Val Val Leu Thr Ala Pro Ala Asp Ala Gly Pro Asp Asp Val Arg Arg Val Leu Gln Thr Val Ile Asp His His Gly Ala Leu Arg Leu Arg Ala Ala Ala Asp Arg Asp Gly Ala Pro Asp Gly Leu Val Ile Gly Glu Pro Gly Ser Val Ala Ala Ala Asp Leu Leu Arg Cys Arg Asp Ala Ala Gly Leu Pro Glu Ala Ala Leu Arg Glu Ala Val Glu Gln
JUMBO APPLICATIONS / PATENTS
THIS SECTION OF THE APPLICATION / PATENT CONTAINS MORE THAN ONE VOLUME.
THIS IS VOLUME _ OF _
NOTE: For additional volumes please contact the Canadian Patent Office.

Claims (22)

1. An isolated, purified or enriched nucleic acid comprising a nucleic acid sequence selected from the group consisting of:
a. SEQ ID NOS: 1, 6, 17, 34, and coding regions thereof;
b. a nucleic acid having at least 75% identity to a nucleic acid of (a); and
c. a nucleic acid complementary to a nucleic acid of (a) or (b).
2. A nucleic acid of claim 1 selected from the group consisting of:
a. a nucleic acid of SEQ ID NO: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33;
b. a nucleic acid encoding a polypeptide of SEQ ID NO: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32;
c. a nucleic acid having at least 75% homology to a nucleic acid of (a) or (b); and
d. a nucleic acid complementary to a nucleic acid of (a), (b) or (c).
3. An isolated, purified or enriched nucleic acid capable of hybridizing to a nucleic acid of claim 2 under conditions of high stringency.
4. An isolated, purified or enriched nucleic acid comprising the sequence of at least two nucleic acids of claim 2.
5. An isolated, purified or enriched nucleic acid comprising the sequence of at least three nucleic acids of claim 2.
6. An isolated, purified or enriched nucleic acid that hybridizes under stringent conditions to any one of A541 ORFs 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15 (SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33) and can substitute for the ORF to which it specifically hybridizes to direct the synthesis of an A54145 compound or analogue.
7. An isolated gene cluster comprising ORFs encoding polypeptides sufficient to direct the synthesis of an A54145 compound or analogue.
8. The isolated gene cluster of claim 7 wherein the gene cluster is present in a bacterium.
9. The isolated gene cluster of claim 8 wherein the gene cluster contains a nucleic acid of any one of A541 ORFs 1 to 15 (SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33) present in the E. coli strains DH10B having accession nos. IDAC 260202-1, 260202-2 and 260202-3.
10. An isolated polypeptide comprising a polypeptide sequence selected from any one of:
(a) a polypeptide of any one of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32; and
(b) a polypeptide which is at least 75% identical in amino acid sequence to a polypeptide of any one of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32.
11. A polypeptide comprising at least two polypeptides of claim 10.
12. A polypeptide comprising at least three polypeptides of claim 10.
13. A polypeptide comprising at least five polypeptides of claim 10.
14. An expression vector comprising a nucleic acid of claim 2.
15. A host cell transformed with an expression vector of claim 14.
16. The host cell of claim 15, wherein the cell is transformed with an exogenous nucleic acid comprising a gene cluster encoding polypeptides sufficient to direct the assembly of an A54145 compound or analogue.
17. A method of chemically modifying a biological molecule that is a substrate for a polypeptide encoded by an A54145 biosynthesis gene cluster, said method comprising contacting the biological molecule with a polypeptide of claim 10, wherein said polypeptide chemically modifies said biological molecule.
18. The method of chemically modifying a biological molecule that is a substrate for a polypeptide encoded by an A54145 biosynthesis gene cluster, said method comprising contacting the biological molecule with at least two different polypeptides of claim 10.
19. An isolated or purified antibody capable of specifically binding to a polypeptide having a sequence selected from the group consisting of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, 32.
20. A method of making a polypeptide having a sequence selected from the group consisting of SEQ ID NOS: 2, 4, 7, 9, 11, 13, 15, 18, 20, 22, 24, 26, 28, 30, comprising introducing a nucleic acid encoding said polypeptide, said nucleic acid being operably linked to a promoter, into a host cell.
21. A method of making an A54145 compound or analogue comprising the step of providing a bacterium containing a gene cluster with sufficient genes to produce an A54145 compound or analogue and culturing the bacterium under conditions allowing for expression of the sufficient genes to produce an A54145 compound, wherein the gene cluster contains at least one nucleic acid of claim 2.
22. A method of making an A54145 compound or analogue comprising culturing a Streptomyces fradiae bacterium under conditions allowing for expression of A541 ORFs 1 to 15 (SEQ ID NOS: 3, 5, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33) present in the E. coli strains DH10B having accession nos. IDAC 260202-1, 260202-2 and 260202-3.
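
Several claims above recite a threshold of "at least 75% identity" (or "at least 75% identical") to a reference sequence. The following is a minimal sketch of one way such a check could be computed. It assumes the two sequences have already been aligned to equal length with "-" as the gap character, and it assumes identity is counted as matching positions divided by compared positions. The claims themselves do not fix the alignment method or the denominator, so both choices, and the function names `percent_identity` and `meets_identity_threshold`, are illustrative assumptions only.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: columns where both sequences carry a gap are skipped;
    columns with a gap in only one sequence count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue  # gap-only column carries no information
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable positions in the alignment")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 75.0) -> bool:
    """True when the pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy four-residue alignment: three of four positions match.
    print(percent_identity("ACGT", "ACGA"))
    print(meets_identity_threshold("ACGT", "ACGA"))
```

In practice the alignment step would come from a dedicated alignment tool, and a different denominator (for example, the length of the shorter sequence) would shift borderline results, so any real comparison should follow the definition of identity given in the description.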
CA002412627A 2001-12-26 2002-12-24 Genes and proteins involved in the biosynthesis of lipopeptides Abandoned CA2412627A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CA002450691A CA2450691C (en) 2001-12-26 2002-12-24 Genes and proteins involved in the biosynthesis of lipopeptides
CA002412627A CA2412627A1 (en) 2001-12-26 2002-12-24 Genes and proteins involved in the biosynthesis of lipopeptides

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
USUSSN60/342,133 2001-12-26
USUSSN60/372,789 2002-04-17
CA002412627A CA2412627A1 (en) 2001-12-26 2002-12-24 Genes and proteins involved in the biosynthesis of lipopeptides

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CA002450691A Division CA2450691C (en) 2001-12-26 2002-12-24 Genes and proteins involved in the biosynthesis of lipopeptides

Publications (1)

Publication Number Publication Date
CA2412627A1 (en) 2003-06-26

Family

ID=4171238

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002412627A Abandoned CA2412627A1 (en) 2001-12-26 2002-12-24 Genes and proteins involved in the biosynthesis of lipopeptides

Country Status (1)

Country Link
CA (1) CA2412627A1 (en)

Similar Documents

Publication Publication Date Title
DK2271666T3 (en) NRPS-PKS GROUP AND ITS MANIPULATION AND APPLICABILITY
JPH09224687A (en) Polyketide-synthase gene
KR20070033979A (en) DNA coding for polypeptides involved in biosynthesis of pladienolides
KR20100039443A (en) Compositions and methods relating to the daptomycin biosynthetic gene cluster
CN107868789B (en) Colimycin biosynthesis gene cluster
KR20100049580A (en) Thiopeptide precursor protein, gene encoding it and uses thereof
CN101275141A (en) Biological synthesis gene cluster for Azintamide
KR20080012845A (en) Genetically modified microorganism and process for production of macrolide compound using the microorganism
US20030124689A1 (en) Mitomycin biosynthetic gene cluster
CN101691575B (en) Biosynthetic gene cluster of sanglifehrin
US20020164747A1 (en) Gene cluster for ramoplanin biosynthesis
CN107540682B (en) Streptovaricin derivative and its preparation method and application
WO2002059322A9 (en) Compositions and methods relating to the daptomycin biosynthetic gene cluster
CN101063140B (en) Vancocin biological synthesis gene cluster
KR101189475B1 (en) Genes and proteins for biosynthesis of tricyclocompounds
US20030175888A1 (en) Discrete acyltransferases associated with type I polyketide synthases and methods of use
CN106676115B (en) 2'-chloro pentostatin and 2'-amino-2'-deoxyadenosine biosynthesis gene cluster and its application
KR102159415B1 (en) Uk-2 biosynthetic genes and method for improving uk-2 productivity using the same
US20030171562A1 (en) Genes and proteins for the biosynthesis of polyketides
US20030077767A1 (en) Genes and proteins for the biosynthesis of anthramycin
US20030064491A1 (en) Genes and proteins involved in the biosynthesis of enediyne ring structures
US20030113874A1 (en) Genes and proteins for the biosynthesis of rosaramicin
CA2450691C (en) Genes and proteins involved in the biosynthesis of lipopeptides
CA2412627A1 (en) Genes and proteins involved in the biosynthesis of lipopeptides
CN107164394B (en) Biosynthetic gene cluster of atypical keratinocyte compound nenestatin A and application thereof

Legal Events

Date Code Title Description
EEER Examination request
FZDE Dead