Detailed Description
The practice of the present invention will employ, unless otherwise indicated, conventional techniques of molecular biology (including recombinant techniques), microbiology, cell biology, biochemistry, synthetic biology, and the like, which are within the skill of the art. Such techniques are explained fully in the literature, for example: "Molecular Cloning: A Laboratory Manual," second edition (Sambrook et al., 1989); "Oligonucleotide Synthesis" (M.J. Gait, ed., 1984); "Animal Cell Culture" (R.I. Freshney, ed., 1987); "Methods in Enzymology" (Academic Press, Inc.); "Current Protocols in Molecular Biology" (F.M. Ausubel et al., eds., 1987, and updated regularly); and "PCR: The Polymerase Chain Reaction" (Mullis et al., eds., 1994). Singleton et al., Dictionary of Microbiology and Molecular Biology, second edition, J. Wiley & Sons (New York, N.Y., 1994), and March, Advanced Organic Chemistry: Reactions, Mechanisms and Structure, fourth edition, John Wiley & Sons (New York, N.Y., 1992), provide one of skill in the art with a general guide to many of the terms used in this application.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. For the purposes of the present invention, the following terms are defined below.
The articles "a," "an," and "the" are used herein to refer to one or to more than one (i.e., to at least one) of the grammatical object of the article. For example, "an element" means one element or more than one element.
The use of an alternative (e.g., "or") should be understood to mean either, both, or any combination thereof.
The term "and/or" should be understood to mean either or both of the alternatives.
As used herein, the term "about" or "approximately" refers to a quantity, level, value, number, frequency, percentage, dimension, size, amount, weight, or length that varies by up to 15%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, or 1% as compared to a reference quantity, level, value, number, frequency, percentage, dimension, size, amount, weight, or length. In one embodiment, the term "about" or "approximately" refers to a range of quantities, levels, values, numbers, frequencies, percentages, dimensions, sizes, amounts, weights, or lengths that surrounds a reference quantity, level, value, number, frequency, percentage, dimension, size, amount, weight, or length by ± 15%, ± 10%, ± 9%, ± 8%, ± 7%, ± 6%, ± 5%, ± 4%, ± 3%, ± 2%, or ± 1%.
As used herein, the term "substantially" refers to a quantity, level, value, number, frequency, percentage, dimension, size, amount, weight, or length that is about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% or more as compared to a reference quantity, level, value, number, frequency, percentage, dimension, size, amount, weight, or length. In one embodiment, the term "substantially the same" refers to a range of quantities, levels, values, numbers, frequencies, percentages, dimensions, sizes, amounts, weights, or lengths that are about the same as a reference quantity, level, value, number, frequency, percentage, dimension, size, amount, weight, or length.
As used herein, the term "substantially free," when used to describe a composition, e.g., a population of cells or a culture medium, refers to a composition that is free of a specified substance, e.g., 95% free, 96% free, 97% free, 98% free, 99% free of the specified substance, or is undetectable as measured by conventional means. Similar meanings may apply to the term "absent" when referring to the absence of a particular substance or component of a composition.
Throughout this specification, unless the context requires otherwise, the terms "comprise", "comprising" and "have" are to be construed as implying the inclusion of the recited step or element or group of steps or elements, but not the exclusion of any other step or element or group of steps or elements. In certain embodiments, the terms "comprising," "including," "containing," and "having" are used synonymously.
"Consisting of" is intended to mean including, and limited to, whatever follows the phrase "consisting of". Thus, the phrase "consisting of" indicates that the listed elements are required or mandatory, and that no other elements may be present.
"Consisting essentially of" is intended to mean including any elements listed after the phrase "consisting essentially of", and limited to other elements that do not interfere with or contribute to the activity or action specified in the disclosure for the listed elements. Thus, the phrase "consisting essentially of" indicates that the listed elements are required or mandatory, but that other elements are optional and may or may not be present depending on whether they affect the activity or action of the listed elements.
Reference throughout this specification to "one embodiment," "some embodiments," "a particular embodiment," or similar language means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, the appearances of the foregoing phrases in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Where features relating to a particular aspect of the invention, for example the product of the invention, are disclosed, such disclosure is also to be considered as applicable to any other aspect of the invention, for example the method and use of the invention, mutatis mutandis.
The present invention is based, at least in part, on the discovery that a host microorganism (e.g., Escherichia coli) expressing an olivetol synthase (OLS) and/or an olivetolic acid cyclase (OAC) can produce high concentrations of olivetol and/or olivetolic acid when the genomic fabH coding sequence of the host microorganism is deleted. Without wishing to be bound by theory, it is believed that deletion of the fabH gene increases the intracellular levels of malonyl-CoA and hexanoyl-CoA, which the OLS and/or OAC in the engineered microorganism use as substrates, thereby promoting synthesis of olivetol and/or olivetolic acid. In this way, the engineered microorganism can be used to produce olivetol or olivetolic acid on demand, an improvement over the cumbersome and costly prior art that still relies on complex synthetic chemistry.
Accordingly, in one aspect, the present invention provides an engineered microorganism, wherein the engineered microorganism is modified to express an olivetol synthase (OLS), and wherein the engineered microorganism has a deletion of the genomic fabH gene.
In some embodiments, the microorganism engineered to biosynthesize olivetol and olivetolic acid is Escherichia coli (E. coli). In a specific embodiment, the microorganism engineered to biosynthesize olivetol and olivetolic acid is Escherichia coli BW25113 (ATCC No.; available from the American Type Culture Collection). The BW25113 strain is derived from E. coli K-12 W1485 and is similar to MG1655; it is an engineered E. coli strain that is only slightly modified and is close to wild type. It is improved in the production of malonyl-CoA and hexanoyl-CoA compared to wild type (WT).
In some embodiments, the olivetol synthase used herein may be derived from cannabis (Cannabis sativa). The olivetol synthase can be a variant optimized for expression in E. coli. In some embodiments, the olivetol synthase may comprise, consist essentially of, or consist of: an amino acid sequence encoded by the nucleotide sequence set forth in SEQ ID NO: 12, or an amino acid sequence encoded by a nucleotide sequence having at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 12, or an identity within a range defined by any two of the foregoing values. In some embodiments, the olivetol synthase may comprise, consist essentially of, or consist of: the amino acid sequence set forth in SEQ ID NO: 13, or an amino acid sequence having at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 13, or an identity within a range defined by any two of the foregoing values.
Exemplary olivetol synthases for use herein may include a full-length olivetol synthase, a fragment of an olivetol synthase, a variant of an olivetol synthase, a truncated olivetol synthase, or a fusion enzyme having at least one activity of an olivetol synthase. In some embodiments, the olivetol synthase used in the present invention has the activity of catalyzing the synthesis of olivetol from malonyl-CoA and hexanoyl-CoA.
In some embodiments, the olivetolic acid cyclase used herein may be derived from cannabis (Cannabis sativa). The olivetolic acid cyclase may be a variant optimized for expression in E. coli. In some embodiments, the olivetolic acid cyclase may comprise, consist essentially of, or consist of: an amino acid sequence encoded by the nucleotide sequence set forth in SEQ ID NO: 14, or an amino acid sequence encoded by a nucleotide sequence having at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 14, or an identity within a range defined by any two of the foregoing values. In some embodiments, the olivetolic acid cyclase may comprise, consist essentially of, or consist of: the amino acid sequence set forth in SEQ ID NO: 15, or an amino acid sequence having at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 15, or an identity within a range defined by any two of the foregoing values.
Exemplary olivetolic acid cyclases for use herein may include a full-length olivetolic acid cyclase, a fragment of an olivetolic acid cyclase, a variant of an olivetolic acid cyclase, a truncated olivetolic acid cyclase, or a fusion enzyme having at least one activity of an olivetolic acid cyclase. In some embodiments, the olivetolic acid cyclase used in the present invention has the activity of carboxylating olivetol to olivetolic acid.
In some embodiments of the invention, the long-chain acyl-CoA synthetase (fadD), acyl-CoA dehydrogenase (fadE), and β-ketoacyl-acyl carrier protein synthase (fabH) are native components of E. coli.
In some embodiments, fadD as used herein may be derived from E. coli. fadD can be a variant optimized for expression in E. coli. In some embodiments, the fadD may comprise, consist essentially of, or consist of: an amino acid sequence encoded by the nucleotide sequence set forth in SEQ ID NO: 8, or an amino acid sequence encoded by a nucleotide sequence having at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 8, or an identity within a range defined by any two of the foregoing values. In some embodiments, the fadD may comprise, consist essentially of, or consist of: the amino acid sequence set forth in SEQ ID NO: 9, or an amino acid sequence having at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 9, or an identity within a range defined by any two of the foregoing values.
Exemplary fadD as used herein may include full-length fadD, fragments of fadD, variants of fadD, truncated fadD, or a fusion enzyme having at least one activity of fadD. In some embodiments, the fadD used in the present invention has the activity of catalyzing the production of hexanoyl-CoA from hexanoic acid by acylation.
In some embodiments, the one or more enzymes used in the present invention may be a mutant or variant of an enzyme described herein. As used herein, "mutant" and "variant" refer to molecules that retain the same or substantially the same biological activity as the original sequence. The mutants or variants may be from the same or different species, or may be synthetic sequences based on natural or existing molecules. In some embodiments, the terms "mutant" and "variant" refer to polypeptides having an amino acid sequence that differs from a corresponding wild-type polypeptide by at least one amino acid. For example, mutants and variants may comprise conservative amino acid substitutions, i.e., replacement of the original amino acid with an amino acid having similar properties. Conservative substitutions may be polar-for-polar amino acid substitutions (glycine (G, Gly), serine (S, Ser), threonine (T, Thr), tyrosine (Y, Tyr), cysteine (C, Cys), asparagine (N, Asn), and glutamine (Q, Gln)); nonpolar-for-nonpolar amino acid substitutions (alanine (A, Ala), valine (V, Val), tryptophan (W, Trp), leucine (L, Leu), proline (P, Pro), methionine (M, Met), phenylalanine (F, Phe)); acidic-for-acidic amino acid substitutions (aspartic acid (D, Asp), glutamic acid (E, Glu)); basic-for-basic amino acid substitutions (arginine (R, Arg), histidine (H, His), lysine (K, Lys)); charged-for-charged amino acid substitutions (aspartic acid (D, Asp), glutamic acid (E, Glu), histidine (H, His), lysine (K, Lys) and arginine (R, Arg)); and hydrophobic-for-hydrophobic amino acid substitutions (alanine (A, Ala), leucine (L, Leu), isoleucine (I, Ile), valine (V, Val), proline (P, Pro), phenylalanine (F, Phe), tryptophan (W, Trp), and methionine (M, Met)). In some other embodiments, the mutant or variant may also comprise non-conservative substitutions.
In some embodiments, a mutant or variant polypeptide can have substitutions, additions, insertions, or deletions of about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100 or more amino acids, or of a number of amino acids within a range defined by any two of the foregoing values. A mutant or variant may have an activity of at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100%, or an activity within a range defined by any two of the foregoing values, as compared to the unaltered enzyme. Enzyme activity can be determined by conventional techniques known in the art, such as colorimetric enzymatic assays.
As is well known to those skilled in the art, expression of a heterologous nucleic acid in a host can be improved by replacing one or more codons in a nucleotide sequence encoding a polypeptide, such as an enzyme, with synonymous codons that are better expressed in the host (i.e., by codon optimization). One reason for this effect is that different organisms show a preference for different codons. In some embodiments, a nucleotide sequence encoding a polypeptide, such as an enzyme, disclosed herein is modified or optimized such that the resulting nucleotide sequence reflects the codon bias of a particular host. For example, in some embodiments, the nucleotide sequence encoding a polypeptide, such as an enzyme, is modified or optimized for E. coli. See, e.g., Gouy M, Gautier C. Codon usage in bacteria: correlation with gene expressivity [J]. Nucleic Acids Research, 1982, 10(22): 7055-7074; Eyre-Walker A. Synonymous codon bias is related to gene length in Escherichia coli: selection for translational accuracy? [J]. Molecular Biology and Evolution, 1996, 13(6): 864-872; Nakamura Y, Gojobori T, Ikemura T. Codon usage tabulated from international DNA sequence databases: status for the year 2000 [J]. Nucleic Acids Research, 2000, 28(1): 292-.
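As a minimal sketch of the codon replacement idea described above, the following Python snippet back-translates a protein sequence using one preferred codon per amino acid. The function name and the small codon table are illustrative assumptions, not part of this disclosure; the table covers only a few residues, and a real design would use a complete, host-specific codon usage table and typically further criteria.

```python
# Illustrative preferred-codon table (assumption): one codon per amino acid,
# covering only a handful of residues for demonstration purposes.
PREFERRED_CODON = {
    "M": "ATG",
    "K": "AAA",
    "L": "CTG",
    "E": "GAA",
    "*": "TAA",  # stop
}


def back_translate(protein, preferred=PREFERRED_CODON):
    """Return a nucleotide sequence that encodes `protein`, choosing the
    single preferred codon listed for each amino acid (hypothetical helper)."""
    missing = sorted(set(protein) - set(preferred))
    if missing:
        raise KeyError(f"no preferred codon listed for: {missing}")
    return "".join(preferred[aa] for aa in protein)


optimized = back_translate("MKLE*")
print(optimized)
```

Using a single codon per amino acid is the simplest possible strategy; it is shown here only because it makes the codon-for-codon replacement easy to follow.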
A polynucleotide or polypeptide has a certain "sequence identity" or percentage of "identity" to another polynucleotide or polypeptide, meaning that when two sequences are aligned, the percentage of bases or amino acids are the same and in the same relative position. Determining the percent identity of two amino acid sequences or two nucleotide sequences can include aligning and comparing the amino acid residues or nucleotides at corresponding positions in the two sequences. A sequence is considered 100% identical if all positions in both sequences are occupied by the same amino acid residue or nucleotide. Sequence identity can be determined in a number of different ways, for example, sequences can be aligned using various methods and computer programs (e.g., BLAST, T-COFFEE, MUSCLE, MAFFT, etc.).
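The position-by-position comparison described above can be sketched as follows. This is a simplified illustration that assumes the two sequences have already been aligned (gaps written as "-") by one of the programs mentioned; the function name is hypothetical, and the choice of denominator (all aligned columns except those where both sequences carry a gap) is an assumption, since alignment tools differ in how they count gapped positions.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are skipped; a column with a
    gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


print(percent_identity("MKTAYIAK", "MKTAYIAK"))  # identical sequences
print(percent_identity("MKTA-IAK", "MKTAYIAR"))  # one gap, one substitution
```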
Some embodiments of the invention relate to expression constructs, e.g., vectors, such as plasmids, preferably expression constructs comprising one or more nucleotide sequences encoding OLS and/or OAC. The nucleotide sequence encoding OLS or OAC is as described above. Preferably, the expression construct is a plasmid. Preferably, the expression construct may be used to express OLS and/or OAC, more preferably to co-express both OLS and OAC, in Escherichia coli, preferably in Escherichia coli BW25113.
Some embodiments of the invention relate to expression constructs, e.g., vectors, such as plasmids, comprising a nucleotide sequence encoding fadD. The nucleotide sequence encoding fadD is as described above. Preferably, the expression construct is a plasmid. Preferably, the expression construct can be used for the expression of fadD in E. coli, preferably E. coli BW25113. In some embodiments, the host microorganism expresses fadD itself, and thus an expression construct encoding fadD is introduced into the host microorganism to overexpress fadD.
Engineering a microorganism may comprise expressing an enzyme of interest in the microorganism. In some embodiments, the expression constructs described herein are introduced into a host microorganism by transformation to express an enzyme of interest. The transformation can be carried out by methods well known in the art. For example, plasmids described herein comprising a nucleotide sequence encoding fadD can be introduced into E. coli by transformation to overexpress fadD. Transformation can be, but is not limited to, Agrobacterium-mediated transformation, electroporation with plasmid DNA, DNA uptake, biolistic transformation, virus-mediated transformation, or protoplast transformation. Transformation may be any other transformation method suitable for the particular host.
Expression of the enzyme of interest in the host microorganism may be achieved, as described above, by transforming the expression construct encoding the enzyme into the host microorganism. It may also be achieved by integrating the expression construct encoding the enzyme into the genomic sequence of the host microorganism in any of a variety of ways, or by enhancing transcription and/or expression of an enzyme-encoding gene native to the host microorganism, such as the fadD-encoding gene, in any of a variety of ways, for example by using stronger regulatory elements such as promoters. Such means are generally well known to those skilled in the art. In some embodiments, the plasmid used herein is set forth in SEQ ID NO: 7. In some embodiments, the plasmid used herein is set forth in SEQ ID NO: 10. In some embodiments, the plasmid used herein is set forth in SEQ ID NO: 11.
Engineering a microorganism may comprise interfering with the function of a protein of interest in said microorganism, e.g., reducing or eliminating the expression of said protein, which may be achieved, for example, by deleting the genomic sequence of interest in said microorganism. In some embodiments, the genomic fabH gene in a microorganism described herein is deleted such that 3-oxoacyl-[acyl carrier protein] synthase 3 is not expressed in the microorganism. In some embodiments, the amino acid sequence of 3-oxoacyl-[acyl carrier protein] synthase 3 is shown as NCBI ACCESSION NO: NP_415609 (https://www.ncbi.nlm.nih.gov/protein/16129054). In some embodiments, the genomic fadE gene in a microorganism described herein is deleted such that acyl-CoA dehydrogenase is not expressed in the microorganism. In some embodiments, the amino acid sequence of the acyl-CoA dehydrogenase is shown as NCBI ACCESSION NO: NP_414756 (https://www.ncbi.nlm.nih.gov/protein/90111100). In some embodiments, both the genomic fabH and fadE genes in a microorganism described herein are deleted such that neither 3-oxoacyl-[acyl carrier protein] synthase 3 nor acyl-CoA dehydrogenase is expressed in the microorganism. In some embodiments, the deleted fabH gene sequence is shown in SEQ ID NO: 3, or is a variant sequence of SEQ ID NO: 3 whose deletion results in a protein having the same or similar function as 3-oxoacyl-[acyl carrier protein] synthase 3 not being expressed. In some embodiments, the deleted fadE gene sequence is shown in SEQ ID NO: 6, or is a variant sequence of SEQ ID NO: 6 whose deletion results in a protein having the same or similar function as the acyl-CoA dehydrogenase not being expressed.
Deletion of genomic sequences of interest in a microorganism can be performed by methods known in the art. For example, a genomic sequence may be deleted by designing an artificial sequence for lambda-Red homologous recombination that targets the genomic sequence, and integrating the artificial sequence into the genome at the target location by lambda-Red homologous recombination. Specific experimental protocols can be found in the examples described herein. In some embodiments, the artificial sequence for deletion of the fabH gene is shown in SEQ ID NO: 1. In some embodiments, in the genome of the E. coli in which the fabH gene is deleted, e.g., BW25113, the sequence spanning the regions upstream and downstream of the original fabH gene site is shown in SEQ ID NO: 2. In some embodiments, the artificial sequence for deletion of the fadE gene is shown in SEQ ID NO: 4. In some embodiments, in the genome of the E. coli in which the fadE gene is deleted, e.g., BW25113, the sequence spanning the regions upstream and downstream of the original fadE gene site is shown in SEQ ID NO: 5.
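The expected outcome of such a replacement can be illustrated in silico. The sketch below models the two homology arms (called h1 and h2 here) as exact substrings of the genome and returns the locus sequence after the region between them has been replaced by a cassette. The function name and the toy sequences are hypothetical and are not the sequences of SEQ ID NO: 1 to 6; the code only predicts the edited sequence and does not model the recombination biology.

```python
def replace_between_arms(genome, h1, h2, cassette):
    """Return `genome` with the region between homology arms h1 (upstream)
    and h2 (downstream) replaced by `cassette`; the arms themselves are kept."""
    start = genome.find(h1)
    if start == -1:
        raise ValueError("upstream homology arm not found")
    left_end = start + len(h1)
    right_start = genome.find(h2, left_end)
    if right_start == -1:
        raise ValueError("downstream homology arm not found after upstream arm")
    return genome[:left_end] + cassette + genome[right_start:]


# Toy example: "TTTTTT" stands in for the target gene (hypothetical sequence).
toy_genome = "AAAA" + "GGCC" + "TTTTTT" + "CCGG" + "AAAA"
edited = replace_between_arms(toy_genome, "GGCC", "CCGG", "ACGTACGT")
print(edited)
```

A prediction of this kind can be useful for designing verification primers or for checking sequencing results against the expected edited locus.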
In addition to deleting the genomic sequence of interest in the microorganism, the function of the protein of interest can be interfered with by other methods known in the art, including, but not limited to, interfering with transcription of the genomic sequence encoding the protein of interest, interfering with expression of the mRNA encoding the protein of interest, and interfering with delivery of the protein of interest, e.g., to the outside of the cell. More specifically, such methods include, but are not limited to: deleting all or part of the genomic sequence encoding the protein of interest or of its regulatory elements, such as a promoter; inserting one or more nucleotides that affect transcription, e.g., a stop codon, into the genomic sequence encoding the protein of interest or into its regulatory elements, or mutating one or more nucleotides thereof, to such an extent that the genomic sequence cannot be normally transcribed; introducing an agent that interferes with or silences the mRNA encoding the protein of interest, e.g., an siRNA or dsRNAi agent; and inhibiting or disabling a system that delivers the protein of interest, e.g., to the outside of the cell (e.g., chaperones, signal sequences, transporters).
Suitable media for culturing the host may include standard media (e.g., Luria-Bertani broth, optionally supplemented with one or more other agents, such as an inducer; standard yeast media; and the like). In some embodiments, the medium can be supplemented with fermentable sugars (e.g., hexoses such as glucose, pentoses such as xylose, and the like). In some embodiments, a suitable medium comprises an inducer. In certain such embodiments, the inducer comprises rhamnose.
The carbon source in a suitable medium for host culture may vary from simple sugars such as glucose to more complex hydrolysates of other biomass such as yeast extract. The addition of salts typically provides the necessary elements, such as magnesium, nitrogen, phosphorus, and sulfur, to allow the cell to synthesize polypeptides and nucleic acids. Suitable media may also be supplemented with selective agents, such as antibiotics, to select for maintenance of certain plasmids and the like. For example, if the microorganism is resistant to an antibiotic, such as ampicillin, tetracycline or kanamycin, the antibiotic can be added to the culture medium to prevent growth of cells that lack resistance. Suitable media may be supplemented with other compounds as necessary to select for desired physiological or biochemical properties, such as particular amino acids, and the like.
Materials and methods suitable for the maintenance and growth of microorganisms of the present invention are described herein, for example, in the examples section. Other materials and methods suitable for the maintenance and growth of microorganisms (e.g., E. coli) are well known in the art. Exemplary techniques may be found in WO 2009/076676; US12/335,071 (US 2009/0203102); WO 2010/003007; US 2010/0048964; WO 2009/132220; US 2010/0003716; Gerhardt P, Murray R G E, Costilow R N, et al., Manual of Methods for General Bacteriology, 1981; and Crueger W, Crueger A, Brock T D, Biotechnology: A Textbook of Industrial Microbiology, 1990, which describe standard culture conditions and fermentation modes that can be used, such as batch, fed-batch, or continuous fermentation, and the entire contents of which are incorporated herein by reference.
For small scale production, the engineered microorganism may be grown, fermented, and induced to express a desired nucleotide sequence, such as a nucleotide sequence encoding an OLS, OAC, and/or fadD, and/or to synthesize a desired fermentation product, such as olivetol and/or olivetolic acid, for example, on a scale of about 100 mL, 500 mL, 1 L, 5 L, or 10 L. For large scale production, the engineered microorganism may be grown, fermented, and induced to express a desired nucleotide sequence, such as a nucleotide sequence encoding an OLS, OAC, and/or fadD, and/or to synthesize a desired fermentation product, such as olivetol and/or olivetolic acid, in bulk on a scale of about 10 L, 100 L, 1000 L, 10,000 L, 100,000 L, or greater.
Analysis of the fermentation product may be performed by separating the fermentation product of interest by chromatography, preferably HPLC, to determine the concentration at one or more times during the cultivation. The microbial culture and fermentation products can also be detected photometrically (absorbance, fluorescence).
The engineered microorganisms described herein achieve improved production of olivetol and/or olivetolic acid. In some embodiments, the engineered microorganisms described herein achieve a yield of olivetol and/or olivetolic acid that is higher than that of a suitable un-engineered or partially engineered control microorganism by at least about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, or 1000-fold or more, or by a factor within a range defined by any two of the foregoing values. In some embodiments, the engineered microorganism described herein achieves a production of olivetol and/or olivetolic acid of at least about 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 2.6, 2.8, 3.0, 3.2, 3.4, 3.6, 3.8, 4.0, 4.2, 4.4, 4.6, 4.8, 5.0, 5.2, 5.4, 5.6, 5.8, 6.0, 6.2, 6.4, 6.6, 6.8, 7.0, 7.2, 7.4, 7.6, 7.8, 8.0, 8.2, ..., 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 mg/L or higher, or a production within a range defined by any two of the foregoing values.
The invention also provides some preferred embodiments as follows:
Item 1. An engineered Escherichia coli, wherein said engineered Escherichia coli is modified to express an olivetol synthase (OLS) variant, wherein said OLS variant comprises, compared to its wild type, one or more mutations selected from the group consisting of: I303T, I52L, S56A, H262M and K263R.
Item 2. The engineered Escherichia coli according to any one of the preceding items, wherein the engineered Escherichia coli is modified to express an olivetolic acid cyclase (OAC).
Item 3. The engineered Escherichia coli according to any one of the preceding items, wherein the genomic fabH gene of the engineered Escherichia coli is deleted.
Item 4. The engineered Escherichia coli according to any one of the preceding items, wherein the genomic fadE gene of the engineered Escherichia coli is deleted.
Item 5. The engineered Escherichia coli according to any one of the preceding items, wherein both the genomic fabH and fadE genes of the engineered Escherichia coli are deleted.
Item 6. The engineered Escherichia coli according to any one of the preceding items, wherein the engineered Escherichia coli is modified to overexpress a long-chain acyl-CoA synthetase (fadD).
Item 7. The engineered Escherichia coli according to any one of the preceding items, wherein the engineered Escherichia coli is modified from wild-type Escherichia coli, BW25113, or BL21.
Item 8. The engineered Escherichia coli according to any one of the preceding items, wherein the modification is performed by introducing a plasmid.
Item 9. The engineered Escherichia coli according to any one of the preceding items, wherein the OLS and OAC are expressed from the same plasmid.
Item 10. The engineered Escherichia coli according to any one of the preceding items, wherein the fadD is overexpressed from a plasmid.
Item 11. A method for preparing an engineered Escherichia coli, comprising modifying Escherichia coli to express an olivetol synthase (OLS) variant, wherein said OLS variant comprises, compared to its wild type, one or more mutations selected from the group consisting of: I303T, I52L, S56A, H262M and K263R.
Item 12. The method of any one of the preceding items, comprising modifying the Escherichia coli to express an olivetolic acid cyclase (OAC).
Item 13. The method of any one of the preceding items, comprising deleting the genomic fabH gene of the Escherichia coli.
Item 14. The method of any one of the preceding items, comprising deleting the genomic fadE gene of the Escherichia coli.
Item 15. The method of any one of the preceding items, comprising deleting both the genomic fabH and fadE genes of the Escherichia coli.
Item 16. The method of any one of the preceding items, comprising modifying the Escherichia coli to overexpress a long-chain acyl-CoA synthetase (fadD).
Item 17. The method of any one of the preceding items, wherein the Escherichia coli is wild-type Escherichia coli, BW25113, or BL21.
Item 18. The method of any one of the preceding items, wherein the modification is performed by introducing a plasmid.
Item 19. The method of any one of the preceding items, wherein the OLS and OAC are expressed from the same plasmid.
Item 20. The method of any one of the preceding items, wherein the fadD is overexpressed from a plasmid.
Examples
Hereinafter, the present invention will be described in detail by examples. However, the examples provided herein are for illustrative purposes only and are not intended to limit the present invention.
The experimental procedures used in the following examples are all conventional unless otherwise specified.
Materials, reagents and the like used in the following examples are commercially available unless otherwise specified.
The enzyme reagents used were purchased from ThermoFisher and New England Biolabs (NEB); the small-molecule standards were purchased from Sigma; the kits used for plasmid extraction were purchased from Tiangen Biotechnology, Inc. (Beijing); and the kits for DNA fragment recovery were purchased from Omega, USA. The corresponding procedures were performed strictly according to the product instructions. All media were prepared with deionized water unless otherwise specified. Yeast extract and peptone were purchased from Oxoid, UK, and other reagents were purchased from Chemicals, national institutes. Gene synthesis services were provided by BGI (Huada Gene).
E. coli culture media that can be used herein:
LB medium: 5 g/L yeast extract, 10 g/L peptone, 10 g/L NaCl. Adjust the pH to 7.0-7.2 and autoclave for 30 minutes.
SOB medium: 5 g/L yeast extract, 20 g/L peptone, 0.5 g/L NaCl, 2.5 mL of 1 M KCl. Adjust the pH to 7.0-7.2 and autoclave.
ZY medium: 10 g/L peptone and 5 g/L yeast extract. Dissolve in distilled water, adjust the pH to 7.0, and autoclave for 30 minutes.
50× M: 1.25 mol/L Na2HPO4, 1.25 mol/L KH2PO4, 2.5 mol/L NH4Cl, 0.25 mol/L Na2SO4. Autoclave for 30 minutes.
50× 5052: 25% glycerol, 2.5% glucose. Autoclave for 30 minutes.
1 M MgSO4: weigh out 24.6 g of MgSO4·7H2O, dissolve in H2O, bring the volume to 100 mL, and autoclave for 30 minutes.
1000× trace elements: 50 mmol/L FeCl3, 20 mmol/L CaCl2, 10 mmol/L MnCl2, 10 mmol/L ZnSO4, and 2 mmol/L each of CoCl2, NiCl2, Na2MoO4, Na2SeO3, and H3BO3.
ZYM medium: to the ZY medium, add 2 mL of 50× 5052, 2 mL of 50× M, 200 µL of 1 M MgSO4, and 100 µL of 1000× trace elements.
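The stock volumes in the ZYM recipe can be sanity-checked with simple dilution arithmetic. The sketch below is illustrative only: the 100 mL batch size is an inference from the stated stock volumes and is not given explicitly in the recipe, the molar mass of MgSO4·7H2O is taken as 246.47 g/mol, the small volume contributed by the stocks themselves is ignored, and all function names are hypothetical.

```python
# Illustrative dilution arithmetic for the media recipes (not part of the protocol).
MGSO4_7H2O_MW = 246.47  # g/mol, molar mass of MgSO4·7H2O


def molarity(mass_g, mw_g_per_mol, volume_ml):
    """Molar concentration (mol/L) of a solute dissolved to volume_ml."""
    return mass_g / mw_g_per_mol / (volume_ml / 1000.0)


def implied_final_volume_ml(stock_volume_ml, fold):
    """Final volume at which a `fold`x stock is diluted to 1x."""
    return stock_volume_ml * fold


def final_conc_mM(stock_molar, stock_volume_ml, final_volume_ml):
    """Final concentration (mmol/L) after diluting a molar stock."""
    return stock_molar * 1000.0 * stock_volume_ml / final_volume_ml


# (volume added in mL, fold concentration of the stock), as listed for ZYM
additions = {
    "50x 5052": (2.0, 50),
    "50x M": (2.0, 50),
    "1000x trace elements": (0.1, 1000),
}

if __name__ == "__main__":
    for name, (vol_ml, fold) in additions.items():
        print(name, "->", implied_final_volume_ml(vol_ml, fold), "mL at 1x")
    print("MgSO4 stock (mol/L):", round(molarity(24.6, MGSO4_7H2O_MW, 100.0), 3))
    # Assumes the 100 mL batch implied by the stock volumes above.
    print("MgSO4 in ZYM (mmol/L):", final_conc_mM(1.0, 0.2, 100.0))
```

All three concentrated stocks point to the same batch volume, which is why 100 mL is used as the working assumption for the MgSO4 calculation.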
Example 1: Deletion of the fabH gene
The fabH gene in the genome of Escherichia coli BW25113 was deleted to reduce the flux of intracellular malonyl-CoA into branch metabolism, thereby increasing the accumulation of intracellular malonyl-CoA and, in turn, the synthesis of the target product OA.
H1-kana-H2, shown in SEQ ID NO: 1, was synthesized, and SEQ ID NO: 1 was integrated at the fabH locus of the BW25113 genome by lambda-Red homologous recombination according to the method provided in the literature (Datsenko K A, Wanner B L. One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products [J]. Proceedings of the National Academy of Sciences, 2000, 97(12): 6640-6645) to delete the fabH gene, as follows:
1. Prepare competent BW25113 cells;
2. Introduce the plasmid pKD46;
3. Inoculate BW25113 (pKD46) into 3 mL of LB containing ampicillin at a concentration of 100 µg/L and culture overnight in a shaker at 30 °C;
4. Add 100 µL of the overnight BW25113 (pKD46) culture to 10 mL of SOB medium, followed by 100 µL of 1 M arabinose and 10 µL of ampicillin at a concentration of 100 mg/L. Culture with shaking at 30 °C to OD600 = 0.4-0.6;
5. Collect the cells by centrifugation at 4 °C, resuspend them in 10 mL of precooled ultrapure water, wash them twice in the same way, and finally resuspend them in 50 µL of 10% glycerol solution;
6. Add 50 ng of the nucleotide fragment of SEQ ID NO: 1, mix well, and transfer the mixture to an electroporation cuvette;
7. Place the cuvette in the electroporator and apply a single pulse under the conditions 200 Ω, 25 µF, and 2.5 kV;
8. Add 1 mL of precooled SOB medium, transfer the mixture to a sterile EP tube, and culture with shaking at 30 °C for 1 hour;
9. Spread the culture evenly on an LB plate containing ampicillin and kanamycin and incubate statically at 30 °C for 16-20 hours;
10. Verify the transformants that grow on the plate.
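As a rough check on the electroporation settings in step 7, the nominal RC time constant of the pulse can be computed from the stated resistance and capacitance. This is a minimal sketch under stated assumptions: it ignores the sample's own resistance, the cuvette gap (0.2 cm) is a hypothetical value that is not given in the procedure, and the function names are illustrative.

```python
# Nominal pulse parameters for an exponential-decay electroporator (illustrative).

def time_constant_ms(resistance_ohm, capacitance_uF):
    """Nominal RC time constant in milliseconds (tau = R * C)."""
    return resistance_ohm * (capacitance_uF * 1e-6) * 1e3


def field_strength_kv_per_cm(voltage_kv, gap_cm):
    """Field strength across the cuvette gap in kV/cm."""
    return voltage_kv / gap_cm


if __name__ == "__main__":
    # 200 ohm and 25 uF are the settings stated in step 7.
    print("tau (ms):", time_constant_ms(200, 25))
    # The 0.2 cm gap is an assumption; the procedure does not state the cuvette size.
    print("field (kV/cm):", field_strength_kv_per_cm(2.5, 0.2))
```

The nominal value is of the same order as the 5.5 ms pulse duration quoted later for the pCP20 transformation, although the two procedures use different voltage settings.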
After integration of SEQ ID NO: 1, the KanR resistance gene was deleted according to the method provided in the literature (Datsenko K A, Wanner B L., supra) as follows:
1. Prepare competent cells of the strain carrying the integrated SEQ ID NO: 1;
2. Transform pCP20 into the competent cells and culture at 30 °C;
3. Pick a single clone into 3 mL of SOB medium and culture overnight at 30 °C;
4. Transfer 100 µL of the culture to 10 mL of SOB and culture with shaking at 30 °C for 3-4 hours, until OD600 = 0.4;
5. Collect the cells by centrifugation at 4 °C and wash them twice with ice-cold sterile water;
6. Resuspend the cells in 50-100 µL of sterile water for electroporation;
7. Add 100 ng of pCP20, transform by electroporation (electroporator settings: 1.8 kV, 5.5 ms), and recover at 30 °C for 1 hour;
8. Spread evenly on an LB plate containing ampicillin and culture at 30 °C for 2-3 days;
9. Pick a single clone, streak it on an antibiotic-free LB plate, and culture overnight at 42 °C;
10. Pick 20-30 clones and spot each onto both a kanamycin plate and an antibiotic-free plate; clones that fail to grow on the kanamycin plate but grow on the antibiotic-free plate are the target clones and are verified by PCR.
In the genome of the engineered BW25113 lacking the fabH gene, the partial sequence spanning the regions upstream and downstream of the original fabH locus is SEQ ID NO: 2.
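The relationship between the integrated cassette and the marker-free locus can be illustrated in silico. The sketch below models FLP-mediated excision as removal of everything between two directly repeated FRT sites, leaving a single FRT scar. It is a simplified model rather than the authors' analysis: the 34-nt FRT core sequence is the commonly cited one, the homology arms and marker are short toy strings, and the function names are hypothetical.

```python
# Toy model of FLP/FRT excision of a resistance marker (illustrative only).
FRT = "gaagttcctattctctagaaagtataggaacttc"  # commonly cited 34-nt FRT core


def find_all(seq, motif):
    """Return the 0-based start positions of every occurrence of motif in seq."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def flp_excise(seq, frt=FRT):
    """Remove the segment between the first and last FRT site, keeping one FRT."""
    sites = find_all(seq, frt)
    if len(sites) < 2:
        return seq  # nothing to excise
    first, last = sites[0], sites[-1]
    return seq[:first] + seq[last:]


if __name__ == "__main__":
    # Toy stand-ins for H1, the kanamycin marker, and H2 (not the real sequences).
    h1, marker, h2 = "aaacccgggttt", "atgatgatgatg", "tttgggcccaaa"
    cassette = h1 + FRT + marker + FRT + h2
    scar = flp_excise(cassette)
    print("FRT sites before:", len(find_all(cassette, FRT)))
    print("FRT sites after:", len(find_all(scar, FRT)))
    print("nucleotides removed:", len(cassette) - len(scar))
```

Applied to the listed sequences, the same logic would be consistent with the lengths given in the sequence listing: SEQ ID NO: 1 is 2353 nt and SEQ ID NO: 2 is 1261 nt, a difference of 1092 nt.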
Example 2: Deletion of the fadE gene
The fadE gene in the genome of the engineered BW25113 obtained in Example 1 was further deleted to reduce the flux of intracellular hexanoyl-CoA into bypass metabolism, thereby increasing the accumulation of intracellular hexanoyl-CoA and, in turn, the synthesis of the target product OA.
H3-kana-H4, shown in SEQ ID NO: 4, was designed and synthesized. As described in Example 1, SEQ ID NO: 4 was integrated at the fadE locus of the genome of BW25113 lacking the fabH gene to delete the fadE gene, and the KanR resistance gene was deleted after integration of SEQ ID NO: 4.
Example 3: Introduction of the fadD expression plasmid
Long-chain acyl-CoA synthetase (fadD) was overexpressed in the engineered BW25113 obtained in Example 2 to convert hexanoic acid to hexanoyl-CoA, thereby increasing the accumulation of intracellular hexanoyl-CoA and, in turn, the synthesis of the target product OA.
The fadD (long-chain acyl-coenzyme A synthetase) expression plasmid pL-Prha-fadD shown in SEQ ID NO. 7 was designed and synthesized, and transformed into BW25113 lacking both of the fabH and fadE genes to obtain an engineered E.coli strain in which fadD is overexpressed and both of the fabH and fadE genes are deleted.
Example 4: Introduction of the OLS expression plasmid or the OLS & OAC expression plasmid
The OLS expression plasmid p15A-Prha-OLS shown in SEQ ID NO: 10 and the OLS & OAC expression plasmid p15A-Prha-OLS-OAC shown in SEQ ID NO: 11 were designed and synthesized, and each was transformed into the engineered BW25113 obtained in Example 3 to obtain engineered BW25113 strains that synthesize OL or OA, referred to as CZ-OL and CZ-OA, respectively.
Example 5: Introduction of OLS mutations
OLS variant & OAC expression plasmids p15A-Prha-OLS variant-OAC carrying different OLS mutations were designed and synthesized, and were transformed into the engineered BW25113 obtained in Example 3 as described in Example 4 to obtain a series of OA-synthesizing engineered BW25113 strains containing OLS variants.
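The single-residue substitutions tested in the following examples use the standard notation of wild-type residue, 1-based position, and replacement residue (for example, I52L). A minimal sketch of applying such a substitution to a protein sequence is shown below. The toy sequence is a hypothetical placeholder and is not the OLS sequence; the function name is illustrative.

```python
import re


def apply_substitution(protein, mutation):
    """Apply a point substitution such as 'I52L' (1-based position) to a protein."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation format: {mutation!r}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(protein):
        raise ValueError(f"position {position} is outside the sequence")
    if protein[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {protein[position - 1]}"
        )
    return protein[: position - 1] + replacement + protein[position:]


if __name__ == "__main__":
    # Hypothetical placeholder: I at position 52 and S at position 56, A elsewhere.
    toy = "A" * 51 + "I" + "A" * 3 + "S" + "A" * 10
    for mutation in ("I52L", "S56A"):
        variant = apply_substitution(toy, mutation)
        print(mutation, variant[45:60])
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors, which are a common source of mistakes when variants are designed from a numbered sequence listing.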
Example 6: performance testing of engineered strains containing OLS variants
The engineered strains obtained in the examples were tested in triplicate according to the procedure shown below.
Fermentation and sample preparation:
1. Inoculate the recombinant strain into 3 mL of LB liquid medium and culture overnight (about 14 hours) at 37 °C and 220 rpm, to a final OD600 of 2-3;
2. Add ZYM medium to a 24-deep-well plate, 2 mL per well;
3. Transfer the culture from step 1 into the ZYM medium from step 2 so that the OD600 after transfer is 0.01;
4. When the OD600 of the culture reaches 0.2, add the inducer (rhamnose, 0.2%) and the precursor hexanoic acid (1 mM); the total time from inoculation to the end of fermentation is 24 hours;
5. Add 3 mL of ethyl acetate to 1 mL of fermentation broth, mix by shaking for 10 min, and centrifuge to collect the upper organic phase;
6. Repeat step 5, combine the organic phases obtained from the two centrifugations, and transfer the approximately 6 mL to a 10 mL tube;
7. Evaporate the organic phase in the tube to dryness using a vacuum concentrator and add 1 mL of methanol to redissolve all of the sample in the tube;
8. Filter the sample from step 7 through a 0.22 µm filter and transfer it to an HPLC sample vial.
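Two small calculations are implicit in the steps above: the seed volume needed to reach OD600 = 0.01 in step 3, and the relationship between the concentration measured in the methanol sample and the concentration in the original broth. The sketch below is illustrative only; it assumes OD600 scales linearly on dilution, neglects the volume of the inoculum, and assumes complete recovery during extraction. The function names are hypothetical.

```python
# Illustrative bookkeeping for inoculation and sample preparation.

def inoculum_volume_ul(culture_volume_ml, target_od, seed_od):
    """Seed culture volume (uL) giving target_od in culture_volume_ml (C1*V1 = C2*V2)."""
    return culture_volume_ml * 1000.0 * target_od / seed_od


def broth_concentration(sample_conc_mg_per_l, resuspension_ml, broth_ml):
    """Concentration in the broth, assuming complete recovery during extraction."""
    return sample_conc_mg_per_l * resuspension_ml / broth_ml


if __name__ == "__main__":
    # Overnight cultures reach OD600 2-3; each well holds 2 mL of ZYM.
    for seed_od in (2.0, 3.0):
        print(f"seed OD600 {seed_od}: {inoculum_volume_ul(2.0, 0.01, seed_od):.1f} uL per well")
    # 1 mL of broth ends up in 1 mL of methanol, so the factor is 1.
    # The 100.0 mg/L input is an arbitrary example value, not a measured result.
    print("broth mg/L for a 100.0 mg/L sample:", broth_concentration(100.0, 1.0, 1.0))
```

Because the dried extract of 1 mL of broth is redissolved in 1 mL of methanol, the measured concentration maps one-to-one onto the broth concentration under these assumptions.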
TABLE 1 cell concentration in fermentation broths at fermentation stop
The fermentation broth extract was analyzed for the amount of OL produced by the HPLC assay method described above.
The OA yield in the fermentation broth extract was analyzed by the HPLC assay method described above.
The OA yield in the fermentation broth extract was calculated from the peak area. The results showed that, in the above engineered strains, the OA yield was 260.49 mg/L for OLS (I52L), 320.31 mg/L for OLS (S56A), 311.60 mg/L for OLS (H262M), and 300.89 mg/L for OLS (K263R), each much higher than the 80 mg/L achieved in the prior art (Tan Z, Clomburg J M, Gonzalez R. Synthetic pathway for the production of olivetolic acid in Escherichia coli [J]. ACS Synthetic Biology, 2018, 7(8): 1886-1896). In addition, CZ-OL gave an OL yield of 116.02 mg/L, which is higher than the olivetol yield of 40.7 mg/L reported in prior-art patent publication WO2020176547A1 for an olivetol synthase derived from a Cymbidium hybrid cultivar.
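The comparison with the prior art can be expressed as fold changes. The short sketch below uses only the yields quoted in the preceding paragraph; the variable and function names are illustrative.

```python
# Fold change of the reported yields over the cited prior-art values (mg/L).
OA_YIELDS = {"I52L": 260.49, "S56A": 320.31, "H262M": 311.60, "K263R": 300.89}
PRIOR_ART_OA = 80.0   # Tan et al., 2018
CZ_OL_YIELD = 116.02
PRIOR_ART_OL = 40.7   # WO2020176547A1


def fold_change(value, reference):
    """Ratio of a measured yield to a reference yield."""
    return value / reference


if __name__ == "__main__":
    for variant, oa_yield in OA_YIELDS.items():
        print(f"OLS ({variant}): {fold_change(oa_yield, PRIOR_ART_OA):.2f}x")
    print(f"CZ-OL: {fold_change(CZ_OL_YIELD, PRIOR_ART_OL):.2f}x")
```

On these figures, the OA yields are roughly 3.3 to 4.0 times the cited 80 mg/L, and the OL yield is roughly 2.9 times the cited 40.7 mg/L.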
TABLE 2 OA yields corresponding to the various OLS variants
Example 7: performance testing of OLS variant-containing strains
To further test whether the effect of the OLS variants on OA production was limited to a particular engineered strain, the OLS variants were tested in two non-engineered Escherichia coli strains, BW25113 and BL21.
The p15A-Prha-OLS (WT)-OAC plasmid and the four p15A-Prha-OLS (mutant)-OAC plasmids were each transformed into these two E. coli strains to obtain strains BW (WT), BW (I52L), BW (S56A), BW (H262M), BW (K263R), BL21 (WT), BL21 (I52L), BL21 (S56A), BL21 (H262M), and BL21 (K263R). The strains were tested for OA production as described in Example 6.
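The ten test strains form a simple host-by-variant matrix. As a bookkeeping aid, the sketch below enumerates that matrix with `itertools.product`; the plasmid name pattern is an illustrative rendering of the names above, not a verified construct identifier.

```python
from itertools import product

HOSTS = ["BW", "BL21"]  # BW denotes BW25113
OLS_VERSIONS = ["WT", "I52L", "S56A", "H262M", "K263R"]


def build_strain_matrix(hosts, versions):
    """Return (strain name, plasmid name) pairs for every host/version combination."""
    return [
        (f"{host} ({version})", f"p15A-Prha-OLS ({version})-OAC")
        for host, version in product(hosts, versions)
    ]


if __name__ == "__main__":
    for strain, plasmid in build_strain_matrix(HOSTS, OLS_VERSIONS):
        print(strain, "<-", plasmid)
```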
TABLE 3 cell concentration in fermentation broths at fermentation stop
TABLE 4 OA yields corresponding to the various OLS variants
Note: "-" indicates that OA was not detected
It can be seen that the above-described OLS variants achieve improved OA production in both wild-type and engineered strains. It was concluded that the OA production-improving effect of the OLS variant is not limited to a specific strain.
In addition, versions of the OA-producing strains described herein that lack OAC expression (i.e., that produce OL instead of OA) also achieved improved OL yields relative to the corresponding OAC-free versions of the OA-producing control strains described herein.
While preferred embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention.
Sequence listing
<110> Beijing blue-crystal Microbiol technologies Ltd
Shenzhen Lanjing Biotech Co Ltd
<120> Olivetol synthase variant and engineered microorganism expressing the same
<130> CID210073
<160> 15
<170> SIPOSequenceListing 1.0
<210> 1
<211> 2353
<212> DNA
<213> Artificial Sequence
<400> 1
ccttctatca attatatcgg ctatcttgaa gccaatgagt tgttaactgg caagacagat 60
gtgctggttt gtgacggctt tacaggaaat gtcacattaa agacgatgga aggtgttgtc 120
aggatgttcc tttctctgct gaaatctcag ggtgaaggga aaaaacggtc gtggtggcta 180
ctgttattaa agcgttggct acaaaagagc ctgacgaggc gattcagtca cctcaacccc 240
gaccagtata acggcgcctg tctgttagga ttgcgcggca cggtgataaa aagtcatggt 300
gcagccaatc agcgagcttt tgcggtcgcg attgaacagg cagtgcaggc ggtgcagcga 360
caagttcctc agcgaattgc cgctcgcctg gaatctgtat acccagctgg ttttgagctg 420
ctggacggtg gcaaaagcgg aactctgcgg tagcaggacg ctgccagcga actcgcagtt 480
tgcaagtgac ggtatataac cgaaaagtga ctgagcgtca tgattccggg gatccgtcga 540
cctgcagttc gaagttccta ttctctagaa agtataggaa cttcggatga atgtcagcta 600
ctgggctatc tggacaaggg aaaacgcaag cgcaaagaga aagcaggtag cttgcagtgg 660
gcttacatgg cgatagctag actgggcggt tttatggaca gcaagcgaac cggaattgcc 720
agctggggcg ccctctggta aggttgggaa gccctgcaaa gtaaactgga tggctttctt 780
gccgccaagg atctgatggc gcaggggatc aagatctgat caagagacag gatgaggatc 840
gtttcgcatg attgaacaag atggattgca cgcaggttct ccggccgctt gggtggagag 900
gctattcggc tatgactggg cacaacagac aatcggctgc tctgatgccg ccgtgttccg 960
gctgtcagcg caggggcgcc cggttctttt tgtcaagacc gacctgtccg gtgccctgaa 1020
tgaactgcag gacgaggcag cgcggctatc gtggctggcc acgacgggcg ttccttgcgc 1080
agctgtgctc gacgttgtca ctgaagcggg aagggactgg ctgctattgg gcgaagtgcc 1140
ggggcaggat ctcctgtcat ctcaccttgc tcctgccgag aaagtatcca tcatggctga 1200
tgcaatgcgg cggctgcata cgcttgatcc ggctacctgc ccattcgacc accaagcgaa 1260
acatcgcatc gagcgagcac gtactcggat ggaagccggt cttgtcgatc aggatgatct 1320
ggacgaagag catcaggggc tcgcgccagc cgaactgttc gccaggctca aggcgcgcat 1380
gcccgacggc gaggatctcg tcgtgaccca tggcgatgcc tgcttgccga atatcatggt 1440
ggaaaatggc cgcttttctg gattcatcga ctgtggccgg ctgggtgtgg cggaccgcta 1500
tcaggacata gcgttggcta cccgtgatat tgctgaagag cttggcggcg aatgggctga 1560
ccgcttcctc gtgctttacg gtatcgccgc tcccgattcg cagcgcatcg ccttctatcg 1620
ccttcttgac gagttcttct gagaagttcc tattctctag aaagtatagg aacttcgaag 1680
cagctccagc ctacatccgc gctggttcgt ttctaggata aggattaaaa catgacgcaa 1740
tttgcatttg tgttccctgg acagggttct caaaccgttg gaatgctggc tgatatggcg 1800
gcgagctatc caattgtcga agaaacgttt gctgaagctt ctgcggcgct gggctacgac 1860
ctgtgggcgc tgacccagca ggggccagct gaagaactga ataaaacctg gcaaactcag 1920
cctgcgctgt tgactgcatc tgttgcgctg tatcgcgtat ggcagcagca gggcggtaaa 1980
gcaccggcaa tgatggccgg tcacagcctg ggggaatact ccgcgctggt ttgcgctggt 2040
gtgattgatt tcgctgatgc ggtgcgtctg gttgagatgc gcggcaagtt catgcaagaa 2100
gccgtaccgg aaggcacggg cgctatggcg gcaatcatcg gtctggatga tgcgtctatt 2160
gcgaaagcgt gtgaagaagc tgcagaaggt caggtcgttt ctccggtaaa ctttaactct 2220
ccgggacagg tggttattgc cggtcataaa gaagcggttg agcgtgctgg cgctgcctgt 2280
aaagcggcgg gcgcaaaacg cgcgctgccg ttaccagtga gcgtaccgtc tcactgtgcg 2340
ctgatgaaac cag 2353
<210> 2
<211> 1261
<212> DNA
<213> Artificial Sequence
<400> 2
ccttctatca attatatcgg ctatcttgaa gccaatgagt tgttaactgg caagacagat 60
gtgctggttt gtgacggctt tacaggaaat gtcacattaa agacgatgga aggtgttgtc 120
aggatgttcc tttctctgct gaaatctcag ggtgaaggga aaaaacggtc gtggtggcta 180
ctgttattaa agcgttggct acaaaagagc ctgacgaggc gattcagtca cctcaacccc 240
gaccagtata acggcgcctg tctgttagga ttgcgcggca cggtgataaa aagtcatggt 300
gcagccaatc agcgagcttt tgcggtcgcg attgaacagg cagtgcaggc ggtgcagcga 360
caagttcctc agcgaattgc cgctcgcctg gaatctgtat acccagctgg ttttgagctg 420
ctggacggtg gcaaaagcgg aactctgcgg tagcaggacg ctgccagcga actcgcagtt 480
tgcaagtgac ggtatataac cgaaaagtga ctgagcgtca tgattccggg gatccgtcga 540
cctgcagttc gaagttccta ttctctagaa agtataggaa cttcgaagca gctccagcct 600
acatccgcgc tggttcgttt ctaggataag gattaaaaca tgacgcaatt tgcatttgtg 660
ttccctggac agggttctca aaccgttgga atgctggctg atatggcggc gagctatcca 720
attgtcgaag aaacgtttgc tgaagcttct gcggcgctgg gctacgacct gtgggcgctg 780
acccagcagg ggccagctga agaactgaat aaaacctggc aaactcagcc tgcgctgttg 840
actgcatctg ttgcgctgta tcgcgtatgg cagcagcagg gcggtaaagc accggcaatg 900
atggccggtc acagcctggg ggaatactcc gcgctggttt gcgctggtgt gattgatttc 960
gctgatgcgg tgcgtctggt tgagatgcgc ggcaagttca tgcaagaagc cgtaccggaa 1020
ggcacgggcg ctatggcggc aatcatcggt ctggatgatg cgtctattgc gaaagcgtgt 1080
gaagaagctg cagaaggtca ggtcgtttct ccggtaaact ttaactctcc gggacaggtg 1140
gttattgccg gtcataaaga agcggttgag cgtgctggcg ctgcctgtaa agcggcgggc 1200
gcaaaacgcg cgctgccgtt accagtgagc gtaccgtctc actgtgcgct gatgaaacca 1260
g 1261
<210> 3
<211> 954
<212> DNA
<213> Escherichia coli
<400> 3
atgtatacga agattattgg tactggcagc tatctgcccg aacaagtgcg gacaaacgcc 60
gatttggaaa aaatggtgga cacctctgac gagtggattg tcactcgtac cggtatccgc 120
gaacgccaca ttgccgcgcc aaacgaaacc gtttcaacca tgggctttga agcggcgaca 180
cgcgcaattg agatggcggg cattgagaaa gaccagattg gcctgatcgt tgtggcaacg 240
acttctgcta cgcacgcttt cccgagcgca gcttgtcaga ttcaaagcat gttgggcatt 300
aaaggttgcc cggcatttga cgttgcagca gcctgcgcag gtttcaccta tgcattaagc 360
gtagccgatc aatacgtgaa atctggggcg gtgaagtatg ctctggtcgt cggttccgat 420
gtactggcgc gcacctgcga tccaaccgat cgtgggacta ttattatttt tggcgatggc 480
gcgggcgctg cggtgctggc tgcctctgaa gagccgggaa tcatttccac ccatctgcat 540
gccgacggta gttatggtga attgctgacg ctgccaaacg ccgaccgcgt gaatccagag 600
aattcaattc atctgacgat ggcgggcaac gaagtcttca aggttgcggt aacggaactg 660
gcgcacatcg ttgatgagac gctggcggcg aataatcttg accgttctca actggactgg 720
ctggttccgc atcaggctaa cctgcgtatt atcagtgcaa cggcgaaaaa actcggtatg 780
tctatggata atgtcgtggt gacgctggat cgccacggta atacctctgc ggcctctgtc 840
ccgtgcgcgc tggatgaagc tgtacgcgac gggcgcatta agccggggca gttggttctg 900
cttgaagcct ttggcggtgg attcacctgg ggctccgcgc tggttcgttt ctag 954
<210> 4
<211> 2161
<212> DNA
<213> Artificial Sequence
<400> 4
tcattaccga cgcaggaaat atgactaacg tcagaaatag caatcgccgg gtagcccgga 60
cggttttcac ggtagcgacc ggtcaactct tcggcaaagt gcatagcgtc gcaatgggaa 120
ccgccgttgc cgcaggaaag cactttgcca ccggctttaa agctgtctgc taacaggacc 180
gccgcgcgct gaatggcgtg aatattggcg tcatctttta aaaagttagc cagcgtttcc 240
gccgcttcgt tcagttcgtt acgaataaga tcctggtaca tgaggatatc cttcagcata 300
aatgtaatag acaaaatgca gtgtaccgga taccgccaaa agcgagaagt acgggcaggt 360
gctatgacca ggactttttg acctgaagtg cggataaaaa cagcaacaat gtgagctttg 420
ttgtaattat attgtaaaca tattgctaaa tgtttttaca tccactacaa ccatatcatc 480
acaagtggtc agacctccta caagtaaggg gcttttcgtt gaagttccta ttctctagaa 540
agtataggaa cttcggatga atgtcagcta ctgggctatc tggacaaggg aaaacgcaag 600
cgcaaagaga aagcaggtag cttgcagtgg gcttacatgg cgatagctag actgggcggt 660
tttatggaca gcaagcgaac cggaattgcc agctggggcg ccctctggta aggttgggaa 720
gccctgcaaa gtaaactgga tggctttctt gccgccaagg atctgatggc gcaggggatc 780
aagatctgat caagagacag gatgaggatc gtttcgcatg attgaacaag atggattgca 840
cgcaggttct ccggccgctt gggtggagag gctattcggc tatgactggg cacaacagac 900
aatcggctgc tctgatgccg ccgtgttccg gctgtcagcg caggggcgcc cggttctttt 960
tgtcaagacc gacctgtccg gtgccctgaa tgaactgcag gacgaggcag cgcggctatc 1020
gtggctggcc acgacgggcg ttccttgcgc agctgtgctc gacgttgtca ctgaagcggg 1080
aagggactgg ctgctattgg gcgaagtgcc ggggcaggat ctcctgtcat ctcaccttgc 1140
tcctgccgag aaagtatcca tcatggctga tgcaatgcgg cggctgcata cgcttgatcc 1200
ggctacctgc ccattcgacc accaagcgaa acatcgcatc gagcgagcac gtactcggat 1260
ggaagccggt cttgtcgatc aggatgatct ggacgaagag catcaggggc tcgcgccagc 1320
cgaactgttc gccaggctca aggcgcgcat gcccgacggc gaggatctcg tcgtgaccca 1380
tggcgatgcc tgcttgccga atatcatggt ggaaaatggc cgcttttctg gattcatcga 1440
ctgtggccgg ctgggtgtgg cggaccgcta tcaggacata gcgttggcta cccgtgatat 1500
tgctgaagag cttggcggcg aatgggctga ccgcttcctc gtgctttacg gtatcgccgc 1560
tcccgattcg cagcgcatcg ccttctatcg ccttcttgac gagttcttct gagaagttcc 1620
tattctctag aaagtatagg aacttcatga ataacggagc cgaaaggctc cgtttcttta 1680
tccgctaatt atttaaaatt aaagccatcc ggatggtttt ccaggctgcc ggtcaacgcc 1740
gcgaacaaca ccgttttacc atcaatcgaa agcgcatcgt tcacattcag ccaggtgagt 1800
ttctcttgcg acgttttctc atcaatagtc gagaacagcc ccgtcatctg attagatttc 1860
tcggaccaca tcacagcgat acgttgcgag ccacagtcat gcggtttgca cgcgctcatc 1920
acctgatacg tctcatctcc caacgttacg gtttgtgcgg gagtataagt accgcctttc 1980
atcacccagg caggcagctt atgcccttgt accatctgat taaatgcagc tttggtggtt 2040
tcgccctttg caaggctgct aatggttaaa tcatcctgcg ccattgcact ggtggcgatg 2100
accagagcgg cgactgtcgt tattgcctta aacatcattc ctcccgagct tatcctgccc 2160
a 2161
<210> 5
<211> 1015
<212> DNA
<213> Artificial Sequence
<400> 5
caaagtgcat agcgtcgcaa tgggaaccgc cgttgccgca ggaaagcact ttgccaccgg 60
ctttaaagct gtctgctaac aggaccgccg cgcgctgaat ggcgtgaata ttggcgtcat 120
cttttaaaaa gttagccagc gtttccgccg cttcgttcag ttcgttacga ataagatcct 180
ggtacatgag gatatccttc agcataaatg taatagacaa aatgcagtgt accggatacc 240
gccaaaagcg agaagtacgg gcaggtgcta tgaccaggac tttttgacct gaagtgcgga 300
taaaaacagc aacaatgtga gctttgttgt aattatattg taaacatatt gctaaatgtt 360
tttacatcca ctacaaccat atcatcacaa gtggtcagac ctcctacaag taaggggctt 420
ttcgttaccg tttaaataat gccaattatt taaagttagc ggccgcgaag ttcctattct 480
ctagaaagta taggaacttc atgaataacg gagccgaaag gctccgtttc tttatccgct 540
aattatttaa aattaaagcc atccggatgg ttttccaggc tgccggtcaa cgccgcgaac 600
aacaccgttt taccatcaat cgaaagcgca tcgttcacat tcagccaggt gagtttctct 660
tgcgacgttt tctcatcaat agtcgagaac agccccgtca tctgattaga tttctcggac 720
cacatcacag cgatacgttg cgagccacag tcatgcggtt tgcacgcgct catcacctga 780
tacgtctcat ctcccaacgt tacggtttgt gcgggagtat aagtaccgcc tttcatcacc 840
caggcaggca gcttatgccc ttgtaccatc tgattaaatg cagctttggt ggtttcgccc 900
tttgcaaggc tgctaatggt taaatcatcc tgcgccattg cactggtggc gatgaccaga 960
gcggcgactg tcgttattgc cttaaacatc attcctcccg agcttatcct gccca 1015
<210> 6
<211> 2445
<212> DNA
<213> Escherichia coli
<400> 6
atgatgattt tgagtattct cgctacggtt gtcctgctcg gcgcgttgtt ctatcaccgc 60
gtgagcttat ttatcagcag tctgattttg ctcgcctgga cagccgccct cggcgttgct 120
ggtctgtggt cggcgtgggt actggtgcct ctggccatta tcctcgtgcc atttaacttt 180
gcgcctatgc gtaagtcgat gatttccgcg ccggtatttc gcggtttccg taaggtgatg 240
ccgccgatgt cgcgcactga gaaagaagcg attgatgcgg gcaccacctg gtgggagggc 300
gacttgttcc agggcaagcc ggactggaaa aagctgcata actatccgca gccgcgcctg 360
accgccgaag agcaagcgtt tctcgacggc ccggtagaag aagcctgccg gatggcgaat 420
gatttccaga tcacccatga gctggcggat ctgccgccgg agttgtgggc gtaccttaaa 480
gagcatcgtt tcttcgcgat gatcatcaaa aaagagtacg gcgggctgga gttctcggct 540
tatgcccagt ctcgcgtgct gcaaaaactc tccggcgtga gcgggatcct ggcgattacc 600
gtcggcgtgc caaactcatt aggcccgggc gaactgttgc aacattacgg cactgacgag 660
cagaaagatc actatctgcc gcgtctggcg cgtggtcagg agatcccctg ctttgcactg 720
accagcccgg aagcgggttc cgatgcgggc gcgattccgg acaccgggat tgtctgcatg 780
ggcgaatggc agggccagca ggtgctgggg atgcgtctga cctggaacaa acgctacatt 840
acgctggcac cgattgcgac cgtgcttggg ctggcgttta aactctccga cccggaaaaa 900
ttactcggcg gtgcagaaga tttaggcatt acctgtgcgc tgatcccaac caccacgccg 960
ggcgtggaaa ttggtcgtcg ccacttcccg ctgaacgtac cgttccagaa cggaccgacg 1020
cgcggtaaag atgtcttcgt gccgatcgat tacatcatcg gcgggccgaa aatggccggg 1080
caaggctggc ggatgctggt ggagtgcctc tcggtaggcc gcggcatcac cctgccttcc 1140
aactcaaccg gcggcgtgaa atcggtagcg ctggcaaccg gcgcgtatgc tcacattcgc 1200
cgtcagttca aaatctctat tggtaagatg gaagggattg aagagccgct ggcgcgtatt 1260
gccggtaatg cctacgtgat ggatgctgcg gcatcgctga ttacctacgg cattatgctc 1320
ggcgaaaaac ctgccgtgct gtcggctatc gttaagtatc actgtaccca ccgcgggcag 1380
cagtcgatta ttgatgcgat ggatattacc ggcggtaaag gcattatgct cgggcaaagc 1440
aacttcctgg cgcgtgctta ccagggcgca ccgattgcca tcaccgttga aggggctaac 1500
attctgaccc gcagcatgat gatcttcgga caaggagcga ttcgttgcca tccgtacgtg 1560
ctggaagaga tggaagcggc gaagaacaat gacgtcaacg cgttcgataa actgttgttc 1620
aaacatatcg gtcacgtcgg tagcaacaaa gttcgcagct tctggctggg cctgacgcgc 1680
ggtttaacca gcagcacgcc aaccggcgat gccactaaac gctactatca gcacctgaac 1740
cgcctgagcg ccaacctcgc cctgctttct gatgtctcga tggcagtgct gggcggcagc 1800
ctgaaacgtc gcgagcgcat ctcggcccgt ctgggggata ttttaagcca gctctacctc 1860
gcctctgccg tgctgaagcg ttatgacgac gaaggccgta atgaagccga cctgccgctg 1920
gtgcactggg gcgtacaaga tgcgctgtat caggctgaac aggcgatgga tgatttactg 1980
caaaacttcc cgaaccgcgt ggttgccggg ctgctgaatg tggtgatctt cccgaccgga 2040
cgtcattatc tggcaccttc tgacaagctg gatcataaag tggcgaagat tttacaagtg 2100
ccgaacgcca cccgttcccg cattggtcgc ggtcagtacc tgacgccgag cgagcataat 2160
ccggttggct tgctggaaga ggcgctggtg gatgtgattg ccgccgaccc aattcatcag 2220
cggatctgta aagagctggg taaaaacctg ccgtttaccc gtctggatga actggcgcac 2280
aacgcgctgg tgaaggggct gattgataaa gatgaagccg ctattctggt gaaagctgaa 2340
gaaagccgtc tgcgcagtat taacgttgat gactttgatc cggaagagct ggcgacgaag 2400
ccggtaaagt tgccggagaa agtgcggaaa gttgaagccg cgtaa 2445
<210> 7
<211> 6563
<212> DNA
<213> Artificial Sequence
<400> 7
tcgtcgacct aattcccatg tcagccgtta agtgttcctg tgtcactcaa aattgctttg 60
agaggctcta agggcttctc agtgcgttac atccctggct tgttgtccac aaccgttaaa 120
ccttaaaagc tttaaaagcc ttatatattc ttttttttct tataaaactt aaaaccttag 180
aggctattta agttgctgat ttatattaat tttattgttc aaacatgaga gcttagtacg 240
tgaaacatga gagcttagta cgttagccat gagagcttag tacgttagcc atgagggttt 300
agttcgttaa acatgagagc ttagtacgtt aaacatgaga gcttagtacg tgaaacatga 360
gagcttagta cgtactatca acaggttgaa ctgcggatct tgatgagtgg atagtacgtt 420
gctaaaacat gagataaaaa ttgactctca tgttattggc gttaagatat acagaatgat 480
gaggtttttt tatgagactc aaggtcatga tggacgtgaa caaaaaaacg aaaattcgcc 540
accgaaacga gctaaatcac accctggctc aacttccttt gcccgcaaag cgagtgatgt 600
atatggcgct tgctcccatt gatagcaagg aacctcttga acgagggcga gttttcaaaa 660
ttagggctga agaccttgca gcgctcgcca aaatcacccc atcgcttgct tatcgacaat 720
taaaagaggg tggtaagtta cttggtgcca gcaaaatttc gctaagaggg gatgatatca 780
ttgcttcagc taaagagctt aacctgctct ttactgctaa agactcccct gaagagttag 840
atcttaacat tattgagtgg atagcttatt caaatgatga aggatacttg tctttaaaat 900
tcaccagaac catagaacca tatatctcta gccttattgg gaaaaaaaat aaattcacaa 960
cgcaattgtt aacggcaagc ttacgcttaa gtagccagta ttcatcttct ctttatcaac 1020
ttatcaggaa gcattactct aattttaaga agaaaaatta ttttattatt tccgttgatg 1080
agttaaagga agagttaata gcttatactt ttgataaaga tggaagtatt gagtacaaat 1140
accctgactt tcctattttt aaaagggatg tattaaataa agccattgct gaaattaaaa 1200
agaaaacaga aatatcgttt gttggcttta ctgttcatga aaaagaagga agaaaaatta 1260
gtaagctgaa gttcgaattt gtcgttgatg aagatgaatt ttctggcgat aaagatgatg 1320
aagctttttt tatgaattta tctgaagcta atgcagcttt tctcaaggta tttgatgaaa 1380
ccgtacctcc caaaaaagct aaggggtgat atatggctaa aatttacgat ttccctcaag 1440
gagccgaacg ccgcaggatg caccgcaaaa tccagtggaa caacgctgta aaattatcta 1500
aaaatggctg gagtaagcca gaggttaaac gctggtcttt tttagcattc atctcaactg 1560
gctgcggccg cctatttgtt tatttttcta aatacattca aatatgtatc cgctcatgag 1620
acaataaccc tgataaatgc ttcaataata ttgaaaaagg aagagtatga gggaagcggt 1680
gatcgccgaa gtatcgactc aactatcaga ggtagttggc gtcatcgagc gccatctcga 1740
accgacgttg ctggccgtac atttgtacgg ctccgcagtg gatggcggcc tgaagccaca 1800
cagtgatatt gatttgctgg ttacggtgac cgtaaggctt gatgaaacaa cgcggcgagc 1860
tttgatcaac gaccttttgg aaacttcggc ttcccctgga gagagcgaga ttctccgcgc 1920
tgtagaagtc accattgttg tgcacgacga catcattccg tggcgttatc cagctaagcg 1980
cgaactgcaa tttggagaat ggcagcgcaa tgacattctt gcaggtatct tcgagccagc 2040
cacgatcgac attgatctgg ctatcttgct gacaaaagca agagaacata gcgttgcctt 2100
ggtaggtcca gcggcggagg aactctttga tccggttcct gaacaggatc tatttgaggc 2160
gctaaatgaa accttaacgc tatggaactc gccgcccgac tgggctggcg atgagcgaaa 2220
tgtagtgctt acgttgtccc gcatttggta cagcgcagta accggcaaaa tcgcgccgaa 2280
ggatgtcgct gccgactggg caatggagcg cctgccggcc cagtatcagc ccgtcatact 2340
tgaagctaga caggcttatc ttggacaaga agaagatcgc ttggcctcgc gcgcagatca 2400
gttggaagaa tttgtccact acgtgaaagg cgagatcacc aaggtagtcg gcaaataatg 2460
tctaacaatt cgttcaagcc gaggggccgc aagatccggc cacgatgacc cggtcgtcgg 2520
ttcagggcag ggtcgttaaa tagccgctta tgtctattgc tggtttaccg gtttattgac 2580
taccggaagc agagcggata acaatttcac acaggagagc tcaaagagtg gaacaatgca 2640
ggacgccgtc cggtaacggt gttgagcctc ggttgtgtgg tgattattgt cgccgctaac 2700
atcgtcggca tcggcatggc gaattaatct ttctgcgaat tgagatgacg ccactggctg 2760
ggcgtcatcc cggtttcccg ggtaaacacc accgaaaaat agttactatc ttcaaagcca 2820
cattcggtcg aaatatcact gattaacagg cggctatgct ggagaagata ttgcgcatga 2880
cacactctga cctgtcgcag atattgattg atggtcattc cagtctgctg gcgaaattgc 2940
tgacgcaaaa cgcgctcact gcacgatgcc tcatcacaaa atttatccag cgcaaaggga 3000
cttttcaggc tagccgccag ccgggtaatc agcttatcca gcaacgtttc gctggatgtt 3060
ggcggcaacg aatcactggt gtaacgatgg cgattcagca acatcaccaa ctgcccgaac 3120
agcaactcag ccatttcgtt agcaaacggc acatgctgac tactttcatg ctcaagctga 3180
ccgataacct gccgcgcctg cgccatcccc atgctaccta agcgccagtg tggttgccct 3240
gcgctggcgt taaatcccgg aatcgccccc tgccagtcaa gattcagctt cagacgctcc 3300
gggcaataaa taatattctg caaaaccaga tcgttaacgg aagcgtagga gtgtttatcg 3360
tcagcatgaa tgtaaaagag atcgccacgg gtaatgcgat aagggcgatc gttgagtaca 3420
tgcaggccat taccgcgcca gacaatcacc agctcacaaa aatcatgtgt atgttcagca 3480
aagacatctt gcggataacg gtcagccaca gcgactgcct gctggtcgct ggcaaaaaaa 3540
tcatctttga gaagttttaa ctgatgcgcc accgtggcta cctcggccag agaacgaagt 3600
tgattattcg caatatggcg tacaaatacg ttgagaagat tcgcgttatt gcagaaagcc 3660
atcccgtccc tggcgaatat cacgcggtga ccagttaaac tctcggcgaa aaagcgtcga 3720
aaagtggtta ctgtcgctga atccacagcg ataggcgatg tcagtaacgc tggcctcgct 3780
gtggcgtagc agatgtcggg ctttcatcag tcgcaggcgg ttcaggtatc gctgaggcgt 3840
cagtcccgtt tgctgcttaa gctgccgatg tagcgtacgc agtgaaagag aaaattgatc 3900
cgccacggca tcccaattca cctcatcggc aaaatggtcc tccagccagg ccagaagcaa 3960
gttgagacgt gatgcgctgt tttccaggtt ctcctgcaaa ctgcttttac gcagcaagag 4020
cagtaattgc ataaacaaga tctcgcgact ggcggtcgag ggtaaatcat tttccccttc 4080
ctgctgttcc atctgtgcaa ccagctgtcg cacctgctgc aatacgctgt ggttaacgcg 4140
ccagtgagac ggatactgcc catccagctc ttgtggcagc aactgattca gcccggcgag 4200
aaactgaaat cgatccggcg agcgatacag cacattggtc agacacagat tatcggtatg 4260
ttcatacaga tgccgatcat gatcgcgtac gaaacagacc gtgccaccgg tgatggtata 4320
gggctgccca ttaaacacat gaatacccgt gccatgttcg acaatcacaa tttcatgaaa 4380
atcatgatga tgttcaggaa aatccgcctg cgggagccgg ggttctatcg ccacggacgc 4440
gttaccagac ggaaaaaaat ccacactatg taatacggtc atactggcct cctgatgtcg 4500
tcaacacggc gaaatagtaa tcacgaggtc aggttcttac cttaaatttt cgacggaaaa 4560
ccacgtaaaa aacgtcgatt tttcaagata cagcgtgaat tttcaggaaa tgcggtgagc 4620
atcacatcac cacaattcag caaattgtga acatcatcac gttcatcttt ccctggttgc 4680
caatggccca ttttcctgtc agtaacgaga aggtcgcgaa ttcaggcgct ttttagactg 4740
gtcgtaatga aattcaacta gtgctctgca ggagctgtca ccggatgtgc tttccggtct 4800
gatgagtccg tgaggacgaa acagcctcta caaataattt tgtttaagag ttactagaga 4860
aagaggagaa atactagttg aagaaggttt ggcttaaccg ttatcccgcg gacgttccga 4920
cggagatcaa ccctgaccgt tatcaatctc tggtagatat gtttgagcag tcggtcgcgc 4980
gctacgccga tcaacctgcg tttgtgaata tgggggaggt aatgaccttc cgcaagctgg 5040
aagaacgcag tcgcgcgttt gccgcttatt tgcaacaagg gttggggctg aagaaaggcg 5100
atcgcgttgc gttgatgatg cctaatttat tgcaatatcc ggtggcgctg tttggcattt 5160
tgcgtgccgg gatgatcgtc gtaaacgtta acccgttgta taccccgcgt gagcttgagc 5220
atcagcttaa cgatagcggc gcatcggcga ttgttatcgt gtctaacttt gctcacacac 5280
tggaaaaagt ggttgataaa accgccgttc agcacgtaat tctgacccgt atgggcgatc 5340
agctatctac ggcaaaaggc acggtagtca atttcgttgt taaatacatc aagcgtttgg 5400
tgccgaaata ccatctgcca gatgccattt catttcgtag cgcactgcat aacggctacc 5460
ggatgcagta cgtcaaaccc gaactggtgc cggaagattt agcttttctg caatacaccg 5520
gcggcaccac tggtgtggcg aaaggcgcga tgctgactca ccgcaatatg ctggcgaacc 5580
tggaacaggt taacgcgacc tatggtccgc tgttgcatcc gggcaaagag ctggtggtga 5640
cggcgctgcc gctgtatcac atttttgccc tgaccattaa ctgcctgctg tttatcgaac 5700
tgggtgggca gaacctgctt atcactaacc cgcgcgatat tccagggttg gtaaaagagt 5760
tagcgaaata tccgtttacc gctatcacgg gcgttaacac cttgttcaat gcgttgctga 5820
acaataaaga gttccagcag ctggatttct ccagtctgca tctttccgca ggcggtggga 5880
tgccagtgca gcaagtggtg gcagagcgtt gggtgaaact gaccggacag tatctgctgg 5940
aaggctatgg ccttaccgag tgtgcgccgc tggtcagcgt taacccatat gatattgatt 6000
atcatagtgg tagcatcggt ttgccggtgc cgtcgacgga agccaaactg gtggatgatg 6060
atgataatga agtaccacca ggtcaaccgg gtgagctttg tgtcaaagga ccgcaggtga 6120
tgctgggtta ctggcagcgt cccgatgcta ccgatgaaat catcaaaaat ggctggttac 6180
acaccggcga catcgcggta atggatgaag aaggattcct gcgcattgtc gatcgtaaaa 6240
aagacatgat tctggtttcc ggttttaacg tctatcccaa cgagattgaa gatgtcgtca 6300
tgcagcatcc tggcgtacag gaagtcgcgg ctgttggcgt accttccggc tccagtggtg 6360
aagcggtgaa aatcttcgta gtgaaaaaag atccatcgct taccgaagag tcactggtga 6420
ctttttgccg ccgtcagctc acgggataca aagtaccgaa gctggtggag tttcgtgatg 6480
agttaccgaa atctaacgtc ggaaaaattt tgcgacgaga attacgtgac gaagcgcgcg 6540
gcaaagtgga caataaagcc tga 6563
<210> 8
<211> 1686
<212> DNA
<213> Escherichia coli
<400> 8
ttgaagaagg tttggcttaa ccgttatccc gcggacgttc cgacggagat caaccctgac 60
cgttatcaat ctctggtaga tatgtttgag cagtcggtcg cgcgctacgc cgatcaacct 120
gcgtttgtga atatggggga ggtaatgacc ttccgcaagc tggaagaacg cagtcgcgcg 180
tttgccgctt atttgcaaca agggttgggg ctgaagaaag gcgatcgcgt tgcgttgatg 240
atgcctaatt tattgcaata tccggtggcg ctgtttggca ttttgcgtgc cgggatgatc 300
gtcgtaaacg ttaacccgtt gtataccccg cgtgagcttg agcatcagct taacgatagc 360
ggcgcatcgg cgattgttat cgtgtctaac tttgctcaca cactggaaaa agtggttgat 420
aaaaccgccg ttcagcacgt aattctgacc cgtatgggcg atcagctatc tacggcaaaa 480
ggcacggtag tcaatttcgt tgttaaatac atcaagcgtt tggtgccgaa ataccatctg 540
ccagatgcca tttcatttcg tagcgcactg cataacggct accggatgca gtacgtcaaa 600
cccgaactgg tgccggaaga tttagctttt ctgcaataca ccggcggcac cactggtgtg 660
gcgaaaggcg cgatgctgac tcaccgcaat atgctggcga acctggaaca ggttaacgcg 720
acctatggtc cgctgttgca tccgggcaaa gagctggtgg tgacggcgct gccgctgtat 780
cacatttttg ccctgaccat taactgcctg ctgtttatcg aactgggtgg gcagaacctg 840
cttatcacta acccgcgcga tattccaggg ttggtaaaag agttagcgaa atatccgttt 900
accgctatca cgggcgttaa caccttgttc aatgcgttgc tgaacaataa agagttccag 960
cagctggatt tctccagtct gcatctttcc gcaggcggtg ggatgccagt gcagcaagtg 1020
gtggcagagc gttgggtgaa actgaccgga cagtatctgc tggaaggcta tggccttacc 1080
gagtgtgcgc cgctggtcag cgttaaccca tatgatattg attatcatag tggtagcatc 1140
ggtttgccgg tgccgtcgac ggaagccaaa ctggtggatg atgatgataa tgaagtacca 1200
ccaggtcaac cgggtgagct ttgtgtcaaa ggaccgcagg tgatgctggg ttactggcag 1260
cgtcccgatg ctaccgatga aatcatcaaa aatggctggt tacacaccgg cgacatcgcg 1320
gtaatggatg aagaaggatt cctgcgcatt gtcgatcgta aaaaagacat gattctggtt 1380
tccggtttta acgtctatcc caacgagatt gaagatgtcg tcatgcagca tcctggcgta 1440
caggaagtcg cggctgttgg cgtaccttcc ggctccagtg gtgaagcggt gaaaatcttc 1500
gtagtgaaaa aagatccatc gcttaccgaa gagtcactgg tgactttttg ccgccgtcag 1560
ctcacgggat acaaagtacc gaagctggtg gagtttcgtg atgagttacc gaaatctaac 1620
gtcggaaaaa ttttgcgacg agaattacgt gacgaagcgc gcggcaaagt ggacaataaa 1680
gcctga 1686
<210> 9
<211> 561
<212> PRT
<213> Escherichia coli
<400> 9
Leu Lys Lys Val Trp Leu Asn Arg Tyr Pro Ala Asp Val Pro Thr Glu
1 5 10 15
Ile Asn Pro Asp Arg Tyr Gln Ser Leu Val Asp Met Phe Glu Gln Ser
20 25 30
Val Ala Arg Tyr Ala Asp Gln Pro Ala Phe Val Asn Met Gly Glu Val
35 40 45
Met Thr Phe Arg Lys Leu Glu Glu Arg Ser Arg Ala Phe Ala Ala Tyr
50 55 60
Leu Gln Gln Gly Leu Gly Leu Lys Lys Gly Asp Arg Val Ala Leu Met
65 70 75 80
Met Pro Asn Leu Leu Gln Tyr Pro Val Ala Leu Phe Gly Ile Leu Arg
85 90 95
Ala Gly Met Ile Val Val Asn Val Asn Pro Leu Tyr Thr Pro Arg Glu
100 105 110
Leu Glu His Gln Leu Asn Asp Ser Gly Ala Ser Ala Ile Val Ile Val
115 120 125
Ser Asn Phe Ala His Thr Leu Glu Lys Val Val Asp Lys Thr Ala Val
130 135 140
Gln His Val Ile Leu Thr Arg Met Gly Asp Gln Leu Ser Thr Ala Lys
145 150 155 160
Gly Thr Val Val Asn Phe Val Val Lys Tyr Ile Lys Arg Leu Val Pro
165 170 175
Lys Tyr His Leu Pro Asp Ala Ile Ser Phe Arg Ser Ala Leu His Asn
180 185 190
Gly Tyr Arg Met Gln Tyr Val Lys Pro Glu Leu Val Pro Glu Asp Leu
195 200 205
Ala Phe Leu Gln Tyr Thr Gly Gly Thr Thr Gly Val Ala Lys Gly Ala
210 215 220
Met Leu Thr His Arg Asn Met Leu Ala Asn Leu Glu Gln Val Asn Ala
225 230 235 240
Thr Tyr Gly Pro Leu Leu His Pro Gly Lys Glu Leu Val Val Thr Ala
245 250 255
Leu Pro Leu Tyr His Ile Phe Ala Leu Thr Ile Asn Cys Leu Leu Phe
260 265 270
Ile Glu Leu Gly Gly Gln Asn Leu Leu Ile Thr Asn Pro Arg Asp Ile
275 280 285
Pro Gly Leu Val Lys Glu Leu Ala Lys Tyr Pro Phe Thr Ala Ile Thr
290 295 300
Gly Val Asn Thr Leu Phe Asn Ala Leu Leu Asn Asn Lys Glu Phe Gln
305 310 315 320
Gln Leu Asp Phe Ser Ser Leu His Leu Ser Ala Gly Gly Gly Met Pro
325 330 335
Val Gln Gln Val Val Ala Glu Arg Trp Val Lys Leu Thr Gly Gln Tyr
340 345 350
Leu Leu Glu Gly Tyr Gly Leu Thr Glu Cys Ala Pro Leu Val Ser Val
355 360 365
Asn Pro Tyr Asp Ile Asp Tyr His Ser Gly Ser Ile Gly Leu Pro Val
370 375 380
Pro Ser Thr Glu Ala Lys Leu Val Asp Asp Asp Asp Asn Glu Val Pro
385 390 395 400
Pro Gly Gln Pro Gly Glu Leu Cys Val Lys Gly Pro Gln Val Met Leu
405 410 415
Gly Tyr Trp Gln Arg Pro Asp Ala Thr Asp Glu Ile Ile Lys Asn Gly
420 425 430
Trp Leu His Thr Gly Asp Ile Ala Val Met Asp Glu Glu Gly Phe Leu
435 440 445
Arg Ile Val Asp Arg Lys Lys Asp Met Ile Leu Val Ser Gly Phe Asn
450 455 460
Val Tyr Pro Asn Glu Ile Glu Asp Val Val Met Gln His Pro Gly Val
465 470 475 480
Gln Glu Val Ala Ala Val Gly Val Pro Ser Gly Ser Ser Gly Glu Ala
485 490 495
Val Lys Ile Phe Val Val Lys Lys Asp Pro Ser Leu Thr Glu Glu Ser
500 505 510
Leu Val Thr Phe Cys Arg Arg Gln Leu Thr Gly Tyr Lys Val Pro Lys
515 520 525
Leu Val Glu Phe Arg Asp Glu Leu Pro Lys Ser Asn Val Gly Lys Ile
530 535 540
Leu Arg Arg Glu Leu Arg Asp Glu Ala Arg Gly Lys Val Asp Asn Lys
545 550 555 560
Ala
<210> 10
<211> 6049
<212> DNA
<213> Artificial Sequence
<400> 10
agcggataac aatttcacac aggagagctc aaagagtgga acaatgcagg acgccgtccg 60
gtaacggtgt tgagcctcgg ttgtgtggtg attattgtcg ccgctaacat cgtcggcatc 120
ggcatggcga attaatcttt ctgcgaattg agatgacgcc actggctggg cgtcatcccg 180
gtttcccggg taaacaccac cgaaaaatag ttactatctt caaagccaca ttcggtcgaa 240
atatcactga ttaacaggcg gctatgctgg agaagatatt gcgcatgaca cactctgacc 300
tgtcgcagat attgattgat ggtcattcca gtctgctggc gaaattgctg acgcaaaacg 360
cgctcactgc acgatgcctc atcacaaaat ttatccagcg caaagggact tttcaggcta 420
gccgccagcc gggtaatcag cttatccagc aacgtttcgc tggatgttgg cggcaacgaa 480
tcactggtgt aacgatggcg attcagcaac atcaccaact gcccgaacag caactcagcc 540
atttcgttag caaacggcac atgctgacta ctttcatgct caagctgacc gataacctgc 600
cgcgcctgcg ccatccccat gctacctaag cgccagtgtg gttgccctgc gctggcgtta 660
aatcccggaa tcgccccctg ccagtcaaga ttcagcttca gacgctccgg gcaataaata 720
atattctgca aaaccagatc gttaacggaa gcgtaggagt gtttatcgtc agcatgaatg 780
taaaagagat cgccacgggt aatgcgataa gggcgatcgt tgagtacatg caggccatta 840
ccgcgccaga caatcaccag ctcacaaaaa tcatgtgtat gttcagcaaa gacatcttgc 900
ggataacggt cagccacagc gactgcctgc tggtcgctgg caaaaaaatc atctttgaga 960
agttttaact gatgcgccac cgtggctacc tcggccagag aacgaagttg attattcgca 1020
atatggcgta caaatacgtt gagaagattc gcgttattgc agaaagccat cccgtccctg 1080
gcgaatatca cgcggtgacc agttaaactc tcggcgaaaa agcgtcgaaa agtggttact 1140
gtcgctgaat ccacagcgat aggcgatgtc agtaacgctg gcctcgctgt ggcgtagcag 1200
atgtcgggct ttcatcagtc gcaggcggtt caggtatcgc tgaggcgtca gtcccgtttg 1260
ctgcttaagc tgccgatgta gcgtacgcag tgaaagagaa aattgatccg ccacggcatc 1320
ccaattcacc tcatcggcaa aatggtcctc cagccaggcc agaagcaagt tgagacgtga 1380
tgcgctgttt tccaggttct cctgcaaact gcttttacgc agcaagagca gtaattgcat 1440
aaacaagatc tcgcgactgg cggtcgaggg taaatcattt tccccttcct gctgttccat 1500
ctgtgcaacc agctgtcgca cctgctgcaa tacgctgtgg ttaacgcgcc agtgagacgg 1560
atactgccca tccagctctt gtggcagcaa ctgattcagc ccggcgagaa actgaaatcg 1620
atccggcgag cgatacagca cattggtcag acacagatta tcggtatgtt catacagatg 1680
ccgatcatga tcgcgtacga aacagaccgt gccaccggtg atggtatagg gctgcccatt 1740
aaacacatga atacccgtgc catgttcgac aatcacaatt tcatgaaaat catgatgatg 1800
ttcaggaaaa tccgcctgcg ggagccgggg ttctatcgcc acggacgcgt taccagacgg 1860
aaaaaaatcc acactatgta atacggtcat actggcctcc tgatgtcgtc aacacggcga 1920
aatagtaatc acgaggtcag gttcttacct taaattttcg acggaaaacc acgtaaaaaa 1980
cgtcgatttt tcaagataca gcgtgaattt tcaggaaatg cggtgagcat cacatcacca 2040
caattcagca aattgtgaac atcatcacgt tcatctttcc ctggttgcca atggcccatt 2100
ttcctgtcag taacgagaag gtcgcgaatt caggcgcttt ttagactggt cgtaatgaaa 2160
ttcaactagt gctctgcagg agctgtcacc ggatgtgctt tccggtctga tgagtccgtg 2220
aggacgaaac agcctctaca aataattttg tttaagagtt actagagagg aggaattaac 2280
catgaaccat ctgcgtgcgg aaggccctgc gagcgtttta gcgattggca ccgcgaatcc 2340
ggaaaacatt ctgctgcagg atgaatttcc ggattattat tttcgcgtga ccaaaagcga 2400
acatatgacc cagctgaaag aaaaatttcg caaaatttgc gacaagagca tgattcgcaa 2460
acgcaactgc tttctgaacg aagaacatct gaaacagaac ccgcgcctgg tggaacatga 2520
aatgcagacc ctggatgcgc gccaggatat gctggtggtg gaagtgccga aactgggcaa 2580
agatgcgtgc gcgaaagcga ttaaagaatg gggccagccg aaaagcaaaa ttacccatct 2640
gatttttacc agcgcgagca ccaccgatat gccgggcgca gattatcatt gcgcgaaact 2700
gctgggcctg agcccgagcg ttaaacgcgt gatgatgtat cagctgggct gctatggcgg 2760
cggcaccgtt ttacgtattg cgaaagatat tgcggaaaac aacaaaggcg cgcgcgtgct 2820
ggcggtgtgt tgtgatatta tggcgtgcct gtttcgcggc ccgagcgaaa gcgatctgga 2880
actgttagtg ggccaggcga tttttggcga tggcgcggcg gcggtgattg tgggtgcaga 2940
acctgatgaa agcgtgggcg aacgccctat ttttgaactg gtgagcaccg gccagaccat 3000
tctgccgaat agcgaaggca ccattggcgg ccatattcgc gaagcgggcc tgatttttga 3060
tctgcataaa gatgtgccga tgctgattag caacaacatt gaaaaatgcc tgattgaggc 3120
gtttaccccg attggcatta gcgattggaa cagcatcttt tggattaccc atccgggcgg 3180
caaagcgatt ctggataaag tggaagaaaa actgcatctg aaaagcgata aattcgtgga 3240
tagccgccat gtgctgagcg aacatggcaa catgagcagc agcaccgtgc tgtttgtgat 3300
ggatgaactg cgcaaacgca gcctggaaga aggcaaaagc accaccggcg atggctttga 3360
atggggcgtg ctgtttggct ttggcccggg cttaaccgtg gaacgcgttg tggttcgtag 3420
cgtgcctatt aaatattaac tcgtcgtgac tgggaaaacc ctggcgacta gtcttggact 3480
cctgttgata gatccagtaa tgacctcaga actccatctg gatttgttca gaacgctcgg 3540
ttgccgccgg gcgtttttta ttggtgagaa tccaggggtc cccaataatt acgatttaaa 3600
ttggcgaaaa tgagacgtgg gtctgacgct cagtggaacg aaaactcacg ttaagggatt 3660
ttggtcatga gattatcaaa aaggatcttc acctagatcc ttttggttca tgtgcagctc 3720
catcagcaaa aggggatgat aagtttatca ccaccgacta tttgcaacag cgccggtgat 3780
cgtgctatga tcgactgatg tcatcagcgg tggagtgcaa tgtcgtgcaa tacgaatggc 3840
gaaaagccga gctcatcggt cagcttctca accttggggt tacccccggc ggtgtgctgc 3900
tggtccacag ctccttccgt agcgtccggc ccctcgaaga tgggccactt ggactgatcg 3960
aggccctgcg tgctgcgctg ggtccgggag ggacgctcgt catgccctcg tggtcaggtc 4020
tggacgacga gccgttcgat cctgccacgt cgcccgttac accggacctt ggagttgtct 4080
ctgacacatt ctggcgcctg ccaaatgtaa agcgcagcgc ccatccattt gcctttgcgg 4140
cagcggggcc acaggcagag cagatcatct ctgatccatt gcccctgcca cctcactcgc 4200
ctgcaagccc ggtcgcccgt gtccatgaac tcgatgggca ggtacttctc ctcggcgtgg 4260
gacacgatgc caacacgacg ctgcatcttg ccgagttgat ggcaaaggtt ccctatgggg 4320
tgccgagaca ctgcaccatt cttcaggatg gcaagttggt acgcgtcgat tatctcgaga 4380
atgaccactg ctgtgagcgc tttgccttgg cggacaggtg gctcaaggag aagagccttc 4440
agaaggaagg tccagtcggt catgcctttg ctcggttgat ccgctcccgc gacattgtgg 4500
cgacagccct gggtcaactg ggccgagatc cgttgatctt cctgcatccg ccagaggcgg 4560
gatgcgaaga atgcgatgcc gctcgccagt cgattggctg agctcatgag cggagaacga 4620
gatgacgttg gaggggcaag gtcgcgctga ttgctggggc aacacgtgga gcggatcggt 4680
ttgacttttg tccttttccg ctgcataacc ctgcttcggg gtcattatag cgattttttc 4740
ggtatatcca tcctttttcg cacgatatac aggattttgc caaagggttc gtgtagactt 4800
tccttggtgt atccaacggc gtcagccggg caggataggt gaagtaggcc cacccgcgag 4860
cgggtgttcc ttcttcactg tcccttattc gcacctggcg gtgctcaacg ggaatcctgc 4920
tctgcgaggc tggccgtagg ccggccgcga tgcaggtggc tgctgaaccc ccagccggaa 4980
ctgaccccac aaggccctag cggagtgtat actggcttac tatgttggca ctgatgaggg 5040
tgtcagtgaa gtgcttcatg tggcaggaga aaaaaggctg caccggtgcg tcagcagaat 5100
atgtgataca ggatatattc cgcttcctcg ctcactgact cgctacgctc ggtcgttcga 5160
ctgcggcgag cggaaatggc ttacgaacgg ggcggagatt tcctggaaga tgccaggaag 5220
atacttaaca gggaagtgag agggccgcgg caaagccgtt tttccatagg ctccgccccc 5280
ctgacaagca tcacgaaatc tgacgctcaa atcagtggtg gcgaaacccg acaggactat 5340
aaagatacca ggcgtttccc ctggcggctc cctcgtgcgc tctcctgttc ctgcctttcg 5400
gtttaccggt gtcattccgc tgttatggcc gcgtttgtct cattccacgc ctgacactca 5460
gttccgggta ggcagttcgc tccaagctgg actgtatgca cgaacccccc gttcagtccg 5520
accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggaaaga catgcaaaag 5580
caccactggc agcagccact ggtaattgat ttagaggagt tagtcttgaa gtcatgcgcc 5640
ggttaaggct aaactgaaag gacaagtttt ggtgactgcg ctcctccaag ccagttacct 5700
cggttcaaag agttggtagc tcagagaacc ttcgaaaaac cgccctgcaa ggcggttttt 5760
tcgttttcag agcaagagat tacgcgcaga ccaaaacgat ctcaagaaga tcatcttatt 5820
aactacatgg ctctgctgta gtgagtgggt tgcgctccgg cagcggtcct gatcccccgc 5880
agaaaaaaag gatctcaaga agatcctttg atcttttcta cggcgcgccc agctgtctag 5940
ggcggcggat ttgtcctact caggagagcg ttcaccgaca aacaacagat aaaacgaaag 6000
gcccagtctt tcgactgagc ctttcgtttt atttgatgcc tttaattaa 6049
<210> 11
<211> 6381
<212> DNA
<213> Artificial Sequence
<400> 11
agcggataac aatttcacac aggagagctc aaagagtgga acaatgcagg acgccgtccg 60
gtaacggtgt tgagcctcgg ttgtgtggtg attattgtcg ccgctaacat cgtcggcatc 120
ggcatggcga attaatcttt ctgcgaattg agatgacgcc actggctggg cgtcatcccg 180
gtttcccggg taaacaccac cgaaaaatag ttactatctt caaagccaca ttcggtcgaa 240
atatcactga ttaacaggcg gctatgctgg agaagatatt gcgcatgaca cactctgacc 300
tgtcgcagat attgattgat ggtcattcca gtctgctggc gaaattgctg acgcaaaacg 360
cgctcactgc acgatgcctc atcacaaaat ttatccagcg caaagggact tttcaggcta 420
gccgccagcc gggtaatcag cttatccagc aacgtttcgc tggatgttgg cggcaacgaa 480
tcactggtgt aacgatggcg attcagcaac atcaccaact gcccgaacag caactcagcc 540
atttcgttag caaacggcac atgctgacta ctttcatgct caagctgacc gataacctgc 600
cgcgcctgcg ccatccccat gctacctaag cgccagtgtg gttgccctgc gctggcgtta 660
aatcccggaa tcgccccctg ccagtcaaga ttcagcttca gacgctccgg gcaataaata 720
atattctgca aaaccagatc gttaacggaa gcgtaggagt gtttatcgtc agcatgaatg 780
taaaagagat cgccacgggt aatgcgataa gggcgatcgt tgagtacatg caggccatta 840
ccgcgccaga caatcaccag ctcacaaaaa tcatgtgtat gttcagcaaa gacatcttgc 900
ggataacggt cagccacagc gactgcctgc tggtcgctgg caaaaaaatc atctttgaga 960
agttttaact gatgcgccac cgtggctacc tcggccagag aacgaagttg attattcgca 1020
atatggcgta caaatacgtt gagaagattc gcgttattgc agaaagccat cccgtccctg 1080
gcgaatatca cgcggtgacc agttaaactc tcggcgaaaa agcgtcgaaa agtggttact 1140
gtcgctgaat ccacagcgat aggcgatgtc agtaacgctg gcctcgctgt ggcgtagcag 1200
atgtcgggct ttcatcagtc gcaggcggtt caggtatcgc tgaggcgtca gtcccgtttg 1260
ctgcttaagc tgccgatgta gcgtacgcag tgaaagagaa aattgatccg ccacggcatc 1320
ccaattcacc tcatcggcaa aatggtcctc cagccaggcc agaagcaagt tgagacgtga 1380
tgcgctgttt tccaggttct cctgcaaact gcttttacgc agcaagagca gtaattgcat 1440
aaacaagatc tcgcgactgg cggtcgaggg taaatcattt tccccttcct gctgttccat 1500
ctgtgcaacc agctgtcgca cctgctgcaa tacgctgtgg ttaacgcgcc agtgagacgg 1560
atactgccca tccagctctt gtggcagcaa ctgattcagc ccggcgagaa actgaaatcg 1620
atccggcgag cgatacagca cattggtcag acacagatta tcggtatgtt catacagatg 1680
ccgatcatga tcgcgtacga aacagaccgt gccaccggtg atggtatagg gctgcccatt 1740
aaacacatga atacccgtgc catgttcgac aatcacaatt tcatgaaaat catgatgatg 1800
ttcaggaaaa tccgcctgcg ggagccgggg ttctatcgcc acggacgcgt taccagacgg 1860
aaaaaaatcc acactatgta atacggtcat actggcctcc tgatgtcgtc aacacggcga 1920
aatagtaatc acgaggtcag gttcttacct taaattttcg acggaaaacc acgtaaaaaa 1980
cgtcgatttt tcaagataca gcgtgaattt tcaggaaatg cggtgagcat cacatcacca 2040
caattcagca aattgtgaac atcatcacgt tcatctttcc ctggttgcca atggcccatt 2100
ttcctgtcag taacgagaag gtcgcgaatt caggcgcttt ttagactggt cgtaatgaaa 2160
ttcaactagt gctctgcagg agctgtcacc ggatgtgctt tccggtctga tgagtccgtg 2220
aggacgaaac agcctctaca aataattttg tttaagagtt actagagagg aggaattaac 2280
catgaaccat ctgcgtgcgg aaggccctgc gagcgtttta gcgattggca ccgcgaatcc 2340
ggaaaacatt ctgctgcagg atgaatttcc ggattattat tttcgcgtga ccaaaagcga 2400
acatatgacc cagctgaaag aaaaatttcg caaaatttgc gacaagagca tgattcgcaa 2460
acgcaactgc tttctgaacg aagaacatct gaaacagaac ccgcgcctgg tggaacatga 2520
aatgcagacc ctggatgcgc gccaggatat gctggtggtg gaagtgccga aactgggcaa 2580
agatgcgtgc gcgaaagcga ttaaagaatg gggccagccg aaaagcaaaa ttacccatct 2640
gatttttacc agcgcgagca ccaccgatat gccgggcgca gattatcatt gcgcgaaact 2700
gctgggcctg agcccgagcg ttaaacgcgt gatgatgtat cagctgggct gctatggcgg 2760
cggcaccgtt ttacgtattg cgaaagatat tgcggaaaac aacaaaggcg cgcgcgtgct 2820
ggcggtgtgt tgtgatatta tggcgtgcct gtttcgcggc ccgagcgaaa gcgatctgga 2880
actgttagtg ggccaggcga tttttggcga tggcgcggcg gcggtgattg tgggtgcaga 2940
acctgatgaa agcgtgggcg aacgccctat ttttgaactg gtgagcaccg gccagaccat 3000
tctgccgaat agcgaaggca ccattggcgg ccatattcgc gaagcgggcc tgatttttga 3060
tctgcataaa gatgtgccga tgctgattag caacaacatt gaaaaatgcc tgattgaggc 3120
gtttaccccg attggcatta gcgattggaa cagcatcttt tggattaccc atccgggcgg 3180
caaagcgatt ctggataaag tggaagaaaa actgcatctg aaaagcgata aattcgtgga 3240
tagccgccat gtgctgagcg aacatggcaa catgagcagc agcaccgtgc tgtttgtgat 3300
ggatgaactg cgcaaacgca gcctggaaga aggcaaaagc accaccggcg atggctttga 3360
atggggcgtg ctgtttggct ttggcccggg cttaaccgtg gaacgcgttg tggttcgtag 3420
cgtgcctatt aaatattaat actagagaaa gaggagaaat actagatggc ggtgaaacat 3480
ctgattgtgc tgaaatttaa agacgagatc accgaggcgc agaaagagga atttttcaaa 3540
acctatgtga acctggtgaa catcatcccg gcgatgaaag atgtgtattg gggcaaagat 3600
gtgacccaga aaaacaaaga agaaggctat acccatattg tggaagtgac ctttgaaagc 3660
gtggaaacca ttcaggatta tattattcac ccggcgcatg tgggctttgg cgatgtgtat 3720
cgcagctttt gggaaaaact gctgattttt gattacaccc cgcgcaaata actcgtcgtg 3780
actgggaaaa ccctggcgac tagtcttgga ctcctgttga tagatccagt aatgacctca 3840
gaactccatc tggatttgtt cagaacgctc ggttgccgcc gggcgttttt tattggtgag 3900
aatccagggg tccccaataa ttacgattta aattggcgaa aatgagacgt gggtctgacg 3960
ctcagtggaa cgaaaactca cgttaaggga ttttggtcat gagattatca aaaaggatct 4020
tcacctagat ccttttggtt catgtgcagc tccatcagca aaaggggatg ataagtttat 4080
caccaccgac tatttgcaac agcgccggtg atcgtgctat gatcgactga tgtcatcagc 4140
ggtggagtgc aatgtcgtgc aatacgaatg gcgaaaagcc gagctcatcg gtcagcttct 4200
caaccttggg gttacccccg gcggtgtgct gctggtccac agctccttcc gtagcgtccg 4260
gcccctcgaa gatgggccac ttggactgat cgaggccctg cgtgctgcgc tgggtccggg 4320
agggacgctc gtcatgccct cgtggtcagg tctggacgac gagccgttcg atcctgccac 4380
gtcgcccgtt acaccggacc ttggagttgt ctctgacaca ttctggcgcc tgccaaatgt 4440
aaagcgcagc gcccatccat ttgcctttgc ggcagcgggg ccacaggcag agcagatcat 4500
ctctgatcca ttgcccctgc cacctcactc gcctgcaagc ccggtcgccc gtgtccatga 4560
actcgatggg caggtacttc tcctcggcgt gggacacgat gccaacacga cgctgcatct 4620
tgccgagttg atggcaaagg ttccctatgg ggtgccgaga cactgcacca ttcttcagga 4680
tggcaagttg gtacgcgtcg attatctcga gaatgaccac tgctgtgagc gctttgcctt 4740
ggcggacagg tggctcaagg agaagagcct tcagaaggaa ggtccagtcg gtcatgcctt 4800
tgctcggttg atccgctccc gcgacattgt ggcgacagcc ctgggtcaac tgggccgaga 4860
tccgttgatc ttcctgcatc cgccagaggc gggatgcgaa gaatgcgatg ccgctcgcca 4920
gtcgattggc tgagctcatg agcggagaac gagatgacgt tggaggggca aggtcgcgct 4980
gattgctggg gcaacacgtg gagcggatcg gtttgacttt tgtccttttc cgctgcataa 5040
ccctgcttcg gggtcattat agcgattttt tcggtatatc catccttttt cgcacgatat 5100
acaggatttt gccaaagggt tcgtgtagac tttccttggt gtatccaacg gcgtcagccg 5160
ggcaggatag gtgaagtagg cccacccgcg agcgggtgtt ccttcttcac tgtcccttat 5220
tcgcacctgg cggtgctcaa cgggaatcct gctctgcgag gctggccgta ggccggccgc 5280
gatgcaggtg gctgctgaac ccccagccgg aactgacccc acaaggccct agcggagtgt 5340
atactggctt actatgttgg cactgatgag ggtgtcagtg aagtgcttca tgtggcagga 5400
gaaaaaaggc tgcaccggtg cgtcagcaga atatgtgata caggatatat tccgcttcct 5460
cgctcactga ctcgctacgc tcggtcgttc gactgcggcg agcggaaatg gcttacgaac 5520
ggggcggaga tttcctggaa gatgccagga agatacttaa cagggaagtg agagggccgc 5580
ggcaaagccg tttttccata ggctccgccc ccctgacaag catcacgaaa tctgacgctc 5640
aaatcagtgg tggcgaaacc cgacaggact ataaagatac caggcgtttc ccctggcggc 5700
tccctcgtgc gctctcctgt tcctgccttt cggtttaccg gtgtcattcc gctgttatgg 5760
ccgcgtttgt ctcattccac gcctgacact cagttccggg taggcagttc gctccaagct 5820
ggactgtatg cacgaacccc ccgttcagtc cgaccgctgc gccttatccg gtaactatcg 5880
tcttgagtcc aacccggaaa gacatgcaaa agcaccactg gcagcagcca ctggtaattg 5940
atttagagga gttagtcttg aagtcatgcg ccggttaagg ctaaactgaa aggacaagtt 6000
ttggtgactg cgctcctcca agccagttac ctcggttcaa agagttggta gctcagagaa 6060
ccttcgaaaa accgccctgc aaggcggttt tttcgttttc agagcaagag attacgcgca 6120
gaccaaaacg atctcaagaa gatcatctta ttaactacat ggctctgctg tagtgagtgg 6180
gttgcgctcc ggcagcggtc ctgatccccc gcagaaaaaa aggatctcaa gaagatcctt 6240
tgatcttttc tacggcgcgc ccagctgtct agggcggcgg atttgtccta ctcaggagag 6300
cgttcaccga caaacaacag ataaaacgaa aggcccagtc tttcgactga gcctttcgtt 6360
ttatttgatg cctttaatta a 6381
<210> 12
<211> 1158
<212> DNA
<213> Cannabis sativa
<400> 12
atgaaccatc tgcgtgcgga aggccctgcg agcgttttag cgattggcac cgcgaatccg 60
gaaaacattc tgctgcagga tgaatttccg gattattatt ttcgcgtgac caaaagcgaa 120
catatgaccc agctgaaaga aaaatttcgc aaaatttgcg acaagagcat gattcgcaaa 180
cgcaactgct ttctgaacga agaacatctg aaacagaacc cgcgcctggt ggaacatgaa 240
atgcagaccc tggatgcgcg ccaggatatg ctggtggtgg aagtgccgaa actgggcaaa 300
gatgcgtgcg cgaaagcgat taaagaatgg ggccagccga aaagcaaaat tacccatctg 360
atttttacca gcgcgagcac caccgatatg ccgggcgcag attatcattg cgcgaaactg 420
ctgggcctga gcccgagcgt taaacgcgtg atgatgtatc agctgggctg ctatggcggc 480
ggcaccgttt tacgtattgc gaaagatatt gcggaaaaca acaaaggcgc gcgcgtgctg 540
gcggtgtgtt gtgatattat ggcgtgcctg tttcgcggcc cgagcgaaag cgatctggaa 600
ctgttagtgg gccaggcgat ttttggcgat ggcgcggcgg cggtgattgt gggtgcagaa 660
cctgatgaaa gcgtgggcga acgccctatt tttgaactgg tgagcaccgg ccagaccatt 720
ctgccgaata gcgaaggcac cattggcggc catattcgcg aagcgggcct gatttttgat 780
ctgcataaag atgtgccgat gctgattagc aacaacattg aaaaatgcct gattgaggcg 840
tttaccccga ttggcattag cgattggaac agcatctttt ggattaccca tccgggcggc 900
aaagcgattc tggataaagt ggaagaaaaa ctgcatctga aaagcgataa attcgtggat 960
agccgccatg tgctgagcga acatggcaac atgagcagca gcaccgtgct gtttgtgatg 1020
gatgaactgc gcaaacgcag cctggaagaa ggcaaaagca ccaccggcga tggctttgaa 1080
tggggcgtgc tgtttggctt tggcccgggc ttaaccgtgg aacgcgttgt ggttcgtagc 1140
gtgcctatta aatattaa 1158
<210> 13
<211> 385
<212> PRT
<213> Cannabis sativa
<400> 13
Met Asn His Leu Arg Ala Glu Gly Pro Ala Ser Val Leu Ala Ile Gly
1 5 10 15
Thr Ala Asn Pro Glu Asn Ile Leu Leu Gln Asp Glu Phe Pro Asp Tyr
20 25 30
Tyr Phe Arg Val Thr Lys Ser Glu His Met Thr Gln Leu Lys Glu Lys
35 40 45
Phe Arg Lys Ile Cys Asp Lys Ser Met Ile Arg Lys Arg Asn Cys Phe
50 55 60
Leu Asn Glu Glu His Leu Lys Gln Asn Pro Arg Leu Val Glu His Glu
65 70 75 80
Met Gln Thr Leu Asp Ala Arg Gln Asp Met Leu Val Val Glu Val Pro
85 90 95
Lys Leu Gly Lys Asp Ala Cys Ala Lys Ala Ile Lys Glu Trp Gly Gln
100 105 110
Pro Lys Ser Lys Ile Thr His Leu Ile Phe Thr Ser Ala Ser Thr Thr
115 120 125
Asp Met Pro Gly Ala Asp Tyr His Cys Ala Lys Leu Leu Gly Leu Ser
130 135 140
Pro Ser Val Lys Arg Val Met Met Tyr Gln Leu Gly Cys Tyr Gly Gly
145 150 155 160
Gly Thr Val Leu Arg Ile Ala Lys Asp Ile Ala Glu Asn Asn Lys Gly
165 170 175
Ala Arg Val Leu Ala Val Cys Cys Asp Ile Met Ala Cys Leu Phe Arg
180 185 190
Gly Pro Ser Glu Ser Asp Leu Glu Leu Leu Val Gly Gln Ala Ile Phe
195 200 205
Gly Asp Gly Ala Ala Ala Val Ile Val Gly Ala Glu Pro Asp Glu Ser
210 215 220
Val Gly Glu Arg Pro Ile Phe Glu Leu Val Ser Thr Gly Gln Thr Ile
225 230 235 240
Leu Pro Asn Ser Glu Gly Thr Ile Gly Gly His Ile Arg Glu Ala Gly
245 250 255
Leu Ile Phe Asp Leu His Lys Asp Val Pro Met Leu Ile Ser Asn Asn
260 265 270
Ile Glu Lys Cys Leu Ile Glu Ala Phe Thr Pro Ile Gly Ile Ser Asp
275 280 285
Trp Asn Ser Ile Phe Trp Ile Thr His Pro Gly Gly Lys Ala Ile Leu
290 295 300
Asp Lys Val Glu Glu Lys Leu His Leu Lys Ser Asp Lys Phe Val Asp
305 310 315 320
Ser Arg His Val Leu Ser Glu His Gly Asn Met Ser Ser Ser Thr Val
325 330 335
Leu Phe Val Met Asp Glu Leu Arg Lys Arg Ser Leu Glu Glu Gly Lys
340 345 350
Ser Thr Thr Gly Asp Gly Phe Glu Trp Gly Val Leu Phe Gly Phe Gly
355 360 365
Pro Gly Leu Thr Val Glu Arg Val Val Val Arg Ser Val Pro Ile Lys
370 375 380
Tyr
385
<210> 14
<211> 306
<212> DNA
<213> Cannabis sativa
<400> 14
atggcggtga aacatctgat tgtgctgaaa tttaaagacg agatcaccga ggcgcagaaa 60
gaggaatttt tcaaaaccta tgtgaacctg gtgaacatca tcccggcgat gaaagatgtg 120
tattggggca aagatgtgac ccagaaaaac aaagaagaag gctataccca tattgtggaa 180
gtgacctttg aaagcgtgga aaccattcag gattatatta ttcacccggc gcatgtgggc 240
tttggcgatg tgtatcgcag cttttgggaa aaactgctga tttttgatta caccccgcgc 300
aaataa 306
<210> 15
<211> 101
<212> PRT
<213> Cannabis sativa
<400> 15
Met Ala Val Lys His Leu Ile Val Leu Lys Phe Lys Asp Glu Ile Thr
1 5 10 15
Glu Ala Gln Lys Glu Glu Phe Phe Lys Thr Tyr Val Asn Leu Val Asn
20 25 30
Ile Ile Pro Ala Met Lys Asp Val Tyr Trp Gly Lys Asp Val Thr Gln
35 40 45
Lys Asn Lys Glu Glu Gly Tyr Thr His Ile Val Glu Val Thr Phe Glu
50 55 60
Ser Val Glu Thr Ile Gln Asp Tyr Ile Ile His Pro Ala His Val Gly
65 70 75 80
Phe Gly Asp Val Tyr Arg Ser Phe Trp Glu Lys Leu Leu Ile Phe Asp
85 90 95
Tyr Thr Pro Arg Lys
100