WO2000018906A2 - Shuffling of codon altered genes - Google Patents

Shuffling of codon altered genes

Info

Publication number
WO2000018906A2
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
codon
nucleic acids
codon altered
library
Prior art date
Application number
PCT/US1999/022588
Other languages
English (en)
Other versions
WO2000018906A9 (fr
WO2000018906A3 (fr
Inventor
Phillip A. Patten
Lu Liu
Willem P. C. Stemmer
Original Assignee
Maxygen, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Maxygen, Inc.
Priority to AU11990/00A (AU1199000A)
Priority to JP2000572353A (JP2002537758A)
Priority to KR1020017003873A (KR20010085850A)
Priority to EP99969739A (EP1117777A2)
Priority to CA002331335A (CA2331335A1)
Priority to IL14044199A (IL140441A0)
Publication of WO2000018906A2
Publication of WO2000018906A9
Publication of WO2000018906A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/102 Mutagenizing nucleic acids
    • C12N15/1027 Mutagenizing nucleic acids by DNA shuffling, e.g. RSR, STEP, RPR
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/005 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/475 Growth factors; Growth regulators
    • C07K14/505 Erythropoietin [EPO]
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/52 Cytokines; Lymphokines; Interferons
    • C07K14/53 Colony-stimulating factor [CSF]
    • C07K14/535 Granulocyte CSF; Granulocyte-macrophage CSF
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00 Medicinal preparations containing antigens or antibodies
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B01 PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01J CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00 Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274 Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/0068 Means for controlling the apparatus of the process
    • B01J2219/00686 Automatic
    • B01J2219/00689 Automatic using computers
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2740/00 Reverse transcribing RNA viruses
    • C12N2740/00011 Details
    • C12N2740/10011 Retroviridae
    • C12N2740/16011 Human Immunodeficiency Virus, HIV
    • C12N2740/16111 Human Immunodeficiency Virus, HIV concerning HIV env
    • C12N2740/16122 New viral proteins or individual genes, new structural or functional aspects of known viral proteins or genes

Definitions

  • Patten and Stemmer filed 09/29/98, and 60/117,729, "SHUFFLING OF CODON ALTERED GENES," Attorney Docket No. 02-028510, by Patten and Stemmer, filed January 29, 1999.
  • the application is also related to USSN 60/118,813 "OLIGONUCLEOTIDE MEDIATED NUCLEIC ACID RECOMBINATION," by Crameri et al., Attorney Docket Number 02-296, filed February 5, 1999; and USSN 60/141,049 "OLIGONUCLEOTIDE MEDIATED
  • the genetic code is highly degenerate. Every DNA/RNA triplet (codon) encoding an amino acid can typically be altered, with the exception of ATG/AUG (coding for methionine) and TGG/UGG (coding for tryptophan), without altering the sequence of the protein encoded by the corresponding nucleic acid sequence. Roughly, on average (the distribution of amino acids varies from protein to protein), each amino acid can be encoded about 3 different ways, since there are 61 codons encoding 20 amino acids (3 additional triplets serve as stop codons, for a total of 64 codons).
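The arithmetic behind the degeneracy statement can be checked with a short script. This is a minimal sketch that assumes the standard genetic code (written with DNA bases); the variable names are illustrative and do not come from the patent text.

```python
from collections import Counter
from itertools import product

# Standard genetic code, bases ordered T, C, A, G; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Sense codons are the triplets that encode an amino acid.
sense = {codon: aa for codon, aa in CODON_TABLE.items() if aa != "*"}
stop_codons = sorted(codon for codon, aa in CODON_TABLE.items() if aa == "*")

# Number of codons available for each amino acid.
degeneracy = Counter(sense.values())
mean_codons_per_aa = len(sense) / len(degeneracy)

# Amino acids with a single codon cannot be silently altered.
single_codon = sorted(aa for aa, n in degeneracy.items() if n == 1)

print(len(CODON_TABLE), len(sense), stop_codons)
print(single_codon, round(mean_codons_per_aa, 2))
```

Only methionine (M) and tryptophan (W) come out with a single codon, which matches the two exceptions named above.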
  • hypermutable viruses such as HIVs and other retroviruses typically stay one step ahead of the host immune system by accumulating non-random mutations based, in part, upon the particular codons used to encode recognition molecules, e.g., in the envelope portion of the virus.
  • the mutations are non-random because viruses are selected for the ability to mutate to forms which are not quickly recognized by the host immune system.
  • viruses are selected to have a non-random set of codons encoding, e.g., envelope proteins, allowing the viruses to shift forms rapidly by making, e.g., specific point mutations to generate specific alterations in protein structure.
  • Codon use is also non-random within species. By preferentially making a subset of all possible tRNAs, cells may conserve energy, and can optimize, or even regulate, the efficiency of cellular translation systems. This fact has long been recognized empirically, often allowing investigators initially to determine the reading frame of a given nucleic acid sequence simply by consideration of the codons resulting from different potential reading frames.
  • One consequence of this "species codon bias" is that proteins within a species have a limited set of possible mutations that can arise as a consequence of, e.g., point mutation. This limits the possible evolution rate of proteins.
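To illustrate how codon choice limits the mutations reachable by point mutation, the sketch below lists the amino acids that a single nucleotide substitution can produce from two different serine codons. It assumes the standard genetic code; the function name `point_mutation_spectrum` is hypothetical.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def point_mutation_spectrum(codon):
    """Return residues ('*' = stop) reachable by one substitution, excluding silent changes."""
    original = CODON_TABLE[codon]
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] != original:
                reachable.add(CODON_TABLE[mutant])
    return reachable


tct = point_mutation_spectrum("TCT")  # serine
agt = point_mutation_spectrum("AGT")  # also serine
print(sorted(tct), sorted(agt), sorted(tct & agt))
```

Both codons encode serine, yet only cysteine and threonine are reachable from both, so re-encoding a residue changes which single-step amino acid substitutions are accessible.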
  • the present invention provides methods of accessing a completely different mutational spectrum for a selected protein than is available in the naturally occurring nucleic acid encoding the protein. This increases the type and rate of forced evolution for the selected protein, allowing for rapid improvement of any detectable characteristic of the protein.
  • nucleic acids are synthesized with altered codon usage, and/or which encode one or several amino acid residue changes as compared to the selected protein, where the amino acid and codon usage changes can be conservative or non-conservative.
  • the resulting codon- and/or amino acid-modified nucleic acid(s) are recombined using DNA shuffling techniques with either the native nucleic acid, or with each other (or both), typically using recursive shuffling methods.
  • the nucleic acids or the encoded protein are then screened for a desirable property.
  • the invention provides methods of making codon altered nucleic acids.
  • a first nucleic acid sequence encoding a first polypeptide sequence is selected.
  • a plurality of codon altered nucleic acid sequences, each of which encodes the first polypeptide, or a modified form thereof, are then selected (e.g., a library of codon altered nucleic acids can be selected in a biological assay which recognizes library components or activities), and the plurality of codon altered nucleic acid sequences is recombined to produce a target codon altered nucleic acid encoding a second protein.
  • the target codon altered nucleic acid is then screened for a detectable functional or structural property, optionally including comparison to the properties of the first polypeptide.
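The select, alter, and recombine steps above can be sketched in silico. This is a toy model only: the input sequence is a hypothetical open reading frame fragment, codons are chosen uniformly at random, and a single crossover at a codon boundary stands in for DNA shuffling, which is a laboratory procedure. Screening for a functional property is not modeled.

```python
import random
from itertools import product

# Standard genetic code, bases ordered T, C, A, G; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Map each amino acid to its list of synonymous codons.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codon_altered(dna, rng):
    """Re-encode the same polypeptide with randomly chosen synonymous codons."""
    return "".join(rng.choice(SYNONYMS[aa]) for aa in translate(dna))


def recombine(parent_a, parent_b, rng):
    """Toy single crossover at a codon boundary between two equal-length parents."""
    cut = 3 * rng.randrange(1, len(parent_a) // 3)
    return parent_a[:cut] + parent_b[cut:]


rng = random.Random(0)                  # fixed seed for repeatability
first = "ATGGCTCCTCGTCTGATCTGTGATTCT"   # hypothetical fragment encoding MAPRLICDS
library = [codon_altered(first, rng) for _ in range(8)]
child = recombine(library[0], library[1], rng)
print(translate(first), translate(child))
```

Because every library member encodes the same polypeptide, a crossover at a codon boundary preserves the encoded protein while mixing codon choices from both parents.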
  • a nucleic acid encoding such a polypeptide can be used in essentially any procedure desired, including introducing the target codon altered nucleic acid into a cell, vector, virus, attenuated virus (e.g., as a component of a vaccine or immunogenic composition), transgenic organism, or the like.
  • Kits and compositions for practicing the methods are also provided, including one or more of: cell recombination mixtures and substrates (e.g., nucleic acids with altered codon usage), containers, instructional material for practicing the methods, or the like.
  • Figure 1 is a nucleic acid/amino acid sequence of a part of the monkey EPO gene, which is similar to the human EPO gene.
  • Figure 2 shows an example of a codon altered EPO nucleic acid sequence.
  • Figure 3 shows an alignment of naturally occurring EPOs.
  • Figure 4 is a schematic of the human EPO wobble sequence space.
  • Figure 5 is a schematic of Mammalian EPO Family-Wobble Sequence Space.
  • Figure 6 is a sequence alignment of G-CSF homologs, with species information.
  • Figure 7 is a sequence alignment of G-CSF homologs, with differences broken out.
  • Figure 8 is a sequence alignment showing the hydrophobic core residues of human G-CSF (blacked out).
  • Figure 9 is a schematic showing the shuffling strategy for G-CSF.
  • Figure 10 is a list of oligos used to make a codon altered alkaline phosphatase.
  • Figure 11 is a map of oligos used to make a codon altered alkaline phosphatase.
  • Figure 12 is a schematic of vaccination with evolution defective viruses.
  • Figure 13 is a schematic of different mutations that result from different codon types for ser, arg, and leu.
  • Figure 14 is a schematic of vaccination with evolution defective viruses.
  • Figure 15 is a schematic of vaccination with evolution defective viruses showing sophisticated versus non-sophisticated "mutant clouds."
  • Figure 16 panels A-C show results of single mutations of different codons for ser, arg, and leu.
  • Figure 17 is a schematic of protein evolution with expanded mutation spectra.
  • Figure 19 is a list of oligos in one application for synthesis of HIV Env.
  • a "recombinant" nucleic acid is a nucleic acid produced by recombination between two or more nucleic acids, or any nucleic acid made by an in vitro or artificial process.
  • the term "recombinant" when used with reference to a cell indicates that the cell comprises (and optionally replicates) a heterologous nucleic acid, or expresses a peptide or protein encoded by a heterologous nucleic acid.
  • Recombinant cells can contain genes that are not found within the native (non-recombinant) wild-type form of the cell. Recombinant cells can also contain genes found in the native form of the cell where the genes are modified and re-introduced into the cell by artificial means.
  • a "codon altered" nucleic acid is a first nucleic acid that encodes a first polypeptide similar or identical to a naturally occurring polypeptide encoded by a naturally occurring nucleic acid, where the first nucleic acid utilizes a plurality of codons to encode the first polypeptide, which differ from the codons of the naturally occurring nucleic acid that encode the naturally occurring polypeptide.
  • The term "nucleic acid sequence" refers to either a nucleic acid (e.g., RNA, DNA or modified form thereof, in isolated, recombinant or native form) or to a representation of the nucleic acid such as a sequence of letters indicating the primary structure (sequence) of the nucleic acid.
  • The term "polypeptide sequence" refers to either a polypeptide (or modified form thereof, in isolated, recombinant or native form) or to a representation of the polypeptide such as a sequence of letters or other character string information indicating the primary structure (amino acid sequence) of the polypeptide.
  • a "modified form" of a reference polypeptide is a target polypeptide which has a similar, but not identical, sequence to the reference polypeptide.
  • the sequence of the target polypeptide can differ from the reference polypeptide by conservative or non-conservative substitutions of the reference polypeptide sequence.
  • different nucleic acids encoding different target polypeptides having different non-conservative substitutions relative to the reference polypeptide can be recombined to produce a recombined nucleic acid encoding a target polypeptide more similar to the reference polypeptide.
  • a "plurality of forms" of a selected nucleic acid refers to a plurality of homologs of the nucleic acid.
  • the homologs can be from naturally occurring homologs (e.g., two or more homologous genes, or derivatives thereof) or by artificial synthesis of one or more nucleic acids having related sequences, or by modification of one or more nucleic acids to produce related nucleic acids.
  • Nucleic acids are homologous when they are derived, naturally or artificially, from a common ancestor sequence. During natural evolution, this occurs when two or more descendent sequences diverge from a parent sequence over time, i.e., due to mutation and natural selection. Under artificial conditions, divergence occurs, e.g., in one of two ways.
  • a given sequence can be artificially recombined with another sequence, as occurs, e.g., during typical cloning, to produce a descendent nucleic acid.
  • a nucleic acid can be synthesized de novo, by synthesizing a nucleic acid which varies in sequence from a given parental nucleic acid sequence.
  • homology is typically inferred by sequence comparison between two sequences. Where two nucleic acid sequences show sequence similarity it is inferred that the two nucleic acids share a common ancestor. The precise level of sequence similarity required to establish homology varies in the art depending on a variety of factors. For purposes of this disclosure, two sequences are considered homologous where they share sufficient sequence identity to allow recombination to occur between two nucleic acid molecules, or when codon changes can be made which would result in two or more nucleic acids having the ability to recombine.
  • nucleic acids typically require regions of close similarity spaced roughly the same distance apart to permit recombination to occur.
  • The terms "identical" or percent "identity," in the context of two or more nucleic acid or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence, as measured using one of the sequence comparison algorithms described below (or other algorithms available to persons of skill) or by visual inspection.
  • The phrase "substantially identical," in the context of two nucleic acids or polypeptides, refers to two or more sequences or subsequences that have at least about 40%, 50%, 60%, or preferably about 70% or 80% or more, or most preferably 90-95% nucleotide or amino acid residue identity, when compared and aligned for maximum correspondence, as measured using one of the following sequence comparison algorithms or by visual inspection. Such "substantially identical" sequences are typically considered to be homologous.
  • Preferably, the "substantial identity" exists over a region of the sequences that is at least about 50 residues in length, more preferably over a region of at least about 100 residues, and most preferably the sequences are substantially identical over at least about 150 residues, or over the full length of the two sequences to be compared.
  • For sequence comparison and homology determination, typically one sequence acts as a reference sequence to which test sequences are compared.
  • When using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated.
  • The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.
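As a rough illustration of the final step, percent identity can be computed from two sequences that have already been aligned. The sketch below assumes gaps are written as "-" and that identity is reported over all aligned columns; alignment programs differ in how they choose the denominator, so that choice is an assumption rather than anything specified in the text.

```python
def percent_identity(ref_aligned, test_aligned, gap="-"):
    """Percent of aligned columns in which both sequences carry the same residue."""
    if len(ref_aligned) != len(test_aligned):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences have a gap.
    columns = [(r, t) for r, t in zip(ref_aligned, test_aligned)
               if not (r == gap and t == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for r, t in columns if r == t and r != gap)
    return 100.0 * matches / len(columns)


print(percent_identity("ACGT-A", "ACCTTA"))
print(percent_identity("ACGT", "ACGT"))
```

Under this convention a gap opposite a residue counts as a non-identical column, so insertions and deletions lower the reported identity.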
  • Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc.
  • One example of an algorithm that is suitable for determining percent sequence identity and sequence similarity is the BLAST algorithm, which is described in Altschul et al., J. Mol. Biol. 215:403-410 (1990). Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information
  • HSPs: high scoring sequence pairs
  • Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always > 0) and N (penalty score for mismatching residues; always < 0).
  • For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached.
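The three halting rules for extending a word hit can be sketched for nucleotide sequences as follows. This is a simplified, ungapped, one-direction illustration and not the BLAST implementation; the function name and the default values of M, N, and X are illustrative assumptions.

```python
def extend_right(query, subject, q_start, s_start, seed_len, M=1, N=-3, X=5):
    """Extend an exact seed match to the right without gaps.

    Returns (best cumulative score, length of the best-scoring segment).
    """
    score = seed_len * M                      # the seed is assumed to match exactly
    best_score, best_len = score, seed_len
    i, j = q_start + seed_len, s_start + seed_len
    while i < len(query) and j < len(subject):    # rule 3: end of either sequence
        score += M if query[i] == subject[j] else N
        if score > best_score:
            best_score, best_len = score, i - q_start + 1
        if best_score - score >= X:           # rule 1: fell off by X from the maximum
            break
        if score <= 0:                        # rule 2: score went to zero or below
            break
        i += 1
        j += 1
    return best_score, best_len


print(extend_right("ACGTACGTTTTT", "ACGTACGTAAAA", 0, 0, 4))
```

In the example the extension keeps the eight matching positions and stops after two mismatches, once the running score has dropped 6 below its maximum of 8.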
  • the BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment.
  • W: wordlength
  • E: expectation
  • The BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff (1992) Proc. Natl. Acad. Sci. USA 89: 10915).
  • the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul (1993) Proc. Nat'l. Acad.
  • For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001.
  • Another indication that two nucleic acid sequences are substantially identical/homologous is that the two molecules hybridize to each other under stringent conditions.
  • The phrase "hybridizing specifically to" refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions, including when that sequence is present in a complex mixture (e.g., total cellular) DNA or RNA.
  • "Bind(s) substantially" refers to complementary hybridization between a probe nucleic acid and a target nucleic acid and embraces minor mismatches that can be accommodated by reducing the stringency of the hybridization media to achieve the desired detection of the target polynucleotide sequence.
  • "Stringent hybridization conditions" and "stringent hybridization wash conditions" in the context of nucleic acid hybridization experiments such as Southern and northern hybridizations are sequence dependent, and are different under different environmental parameters. Longer sequences and sequences with higher G:C content remain hybridized at higher temperatures (or at lower salt).
  • An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology—Hybridization with Nucleic Acid Probes part I chapter 2 "Overview of principles of hybridization and the strategy of nucleic acid probe assays," Elsevier, New York.
  • Generally, highly stringent hybridization and wash conditions are selected to be about 5 °C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH.
  • Tm: thermal melting point
  • Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe.
  • Very stringent conditions are selected to be equal to the Tm for a particular probe.
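The text gives no formula for Tm, so the sketch below borrows a widely used empirical approximation for long DNA duplexes (Tm = 81.5 + 16.6 log10[Na+] + 0.41 %GC - 0.61 %formamide - 500/L) purely as an assumption, to show how length, G:C content, salt, and formamide move the melting point and how the "about 5 °C lower" rule is applied. The function names and default values are hypothetical, and the estimate is not accurate for short oligonucleotides.

```python
import math


def estimate_tm(seq, na_molar=0.3, formamide_pct=0.0):
    """Approximate Tm in degrees C for a long DNA duplex (empirical formula, assumed)."""
    seq = seq.upper()
    gc_pct = 100.0 * (seq.count("G") + seq.count("C")) / len(seq)
    return (81.5
            + 16.6 * math.log10(na_molar)   # salt term: less Na+ lowers Tm
            + 0.41 * gc_pct                 # G:C content raises Tm
            - 0.61 * formamide_pct          # formamide destabilizes the duplex
            - 500.0 / len(seq))             # longer duplexes melt higher


def highly_stringent_temperature(seq, **conditions):
    """Temperature about 5 degrees C below the estimated Tm."""
    return estimate_tm(seq, **conditions) - 5.0


probe = "ACGT" * 50   # hypothetical 200-nucleotide sequence, 50% G:C
print(round(estimate_tm(probe), 1), round(highly_stringent_temperature(probe), 1))
```

The trends match the statement that longer and more G:C-rich sequences remain hybridized at higher temperatures, and that formamide can be used to reach stringency at a lower temperature.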
  • An example of stringent hybridization conditions for hybridization of complementary nucleic acids which have more than 100 complementary residues on a filter in a Southern or northern blot is 50% formamide with 1 mg of heparin at 42 °C, with the hybridization being carried out overnight.
  • An example of highly stringent wash conditions is 0.15 M NaCl at 72 °C for about 15 minutes.
  • An example of stringent wash conditions is a 0.2x SSC wash at 65 °C for 15 minutes (see, Sambrook, infra., for a description of SSC buffer). Often, a high stringency wash is preceded by a low stringency wash to remove background probe signal.
  • An example medium stringency wash for a duplex of, e.g., more than 100 nucleotides, is 1x SSC at 45 °C for 15 minutes.
  • An example low stringency wash for a duplex of, e.g., more than 100 nucleotides, is 4-6x SSC at 40 °C for 15 minutes.
  • stringent conditions typically involve salt concentrations of less than about 1.0 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3, and the temperature is typically at least about 40 °C. Stringent conditions can also be achieved with the addition of destabilizing agents such as formamide.
  • In general, a signal to noise ratio at least 2x that observed for an unrelated probe in the particular hybridization assay indicates detection of a specific hybridization. If the signal to noise ratio is less than 2x that observed for binding of an unrelated probe (e.g., a nucleic acid encoding a non-homologous protein), the nucleic acids at issue do not hybridize under stringent conditions. Similarly, if the signal to noise ratio is less than 25% as high as that observed for a perfectly matched probe under stringent conditions, the nucleic acids do not "hybridize under stringent conditions" as that term is used herein. This does not apply to highly stringent conditions, as the stringency can theoretically be increased until only a perfectly matched probe will hybridize.
  • a target nucleic acid to be probed is blotted onto a filter by any conventional method.
  • An unrelated nucleic acid such as a plasmid vector (assuming that the unrelated nucleic acid has no homology with the target nucleic acid) is also blotted, in approximately equal amounts, onto the filter.
  • the filter is probed with a labeled probe complementary to the target nucleic acid.
  • the experiment is repeated at gradually increasing stringency of hybridization and wash conditions until signal from the hybridization of the labeled probe to the complementary target is 10-100x as high as to the unrelated plasmid vector nucleic acid. Once these conditions are determined as described above, a test nucleic acid is probed under the same conditions as the target.
  • If signal from the labeled probe is 25% as high or higher than the signal from binding of the probe to the target, the test nucleic acid "hybridizes under stringent conditions" to the probe. If the signal is less than 25% as high, the test nucleic acid does not hybridize under stringent conditions to the probe.
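The signal thresholds in this procedure reduce to a few comparisons. The sketch below assumes the three signals have been measured in the same arbitrary units on the same filter; the function names are hypothetical, and combining the 2x and 25% criteria in one test is one reading of the text.

```python
def conditions_calibrated(matched_signal, unrelated_signal, min_fold=10.0):
    """True once the matched-target signal is at least min_fold times the unrelated control."""
    return matched_signal >= min_fold * unrelated_signal


def hybridizes_under_stringent_conditions(test_signal, matched_signal, unrelated_signal):
    """Apply the 2x-over-unrelated and 25%-of-matched criteria."""
    above_background = test_signal >= 2.0 * unrelated_signal
    near_matched = test_signal >= 0.25 * matched_signal
    return above_background and near_matched


# Hypothetical readings: matched target 100, unrelated vector 5.
print(conditions_calibrated(100.0, 5.0))
print(hybridizes_under_stringent_conditions(30.0, 100.0, 5.0))
print(hybridizes_under_stringent_conditions(20.0, 100.0, 5.0))
```

With these readings the conditions are calibrated (20-fold over the unrelated control), a test signal of 30 passes, and a test signal of 20 fails the 25% criterion.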
  • nucleic acids which do not hybridize to each other under stringent conditions are still recognizable as variant forms of a nucleic acid when the polypeptides they encode are substantially identical. This occurs, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code. Such nucleic acids are not functionally equivalent, as described in detail herein, due to differences in mRNA folding, alterations of regulatory sequences and the like.
  • A further indication that two nucleic acid sequences or polypeptides are variant forms is that the polypeptide encoded by the first nucleic acid is immunologically cross reactive with the polypeptide encoded by the second nucleic acid, as tested by polyclonal antisera generated to the first polypeptide.
  • a polypeptide is typically substantially identical to a second polypeptide, for example, where the two peptides differ only by conservative substitutions.
  • "Conservatively modified variations" of a particular polynucleotide sequence are those polynucleotide variations that encode identical or essentially identical amino acid sequences, or, where the polynucleotide does not encode an amino acid sequence, essentially identical sequences.
  • a large number of functionally identical nucleic acids encode any given polypeptide. For instance, the codons CGU, CGC, CGA, CGG, AGA, and AGG all encode the amino acid arginine. Thus, at every position where an arginine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide.
  • Such nucleic acid variations are "silent variations," which are one species of "conservatively modified variations." Every polynucleotide sequence described herein which encodes a polypeptide also optionally describes every possible silent variation, except where otherwise noted.
  • each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and UGG, which is ordinarily the only codon for tryptophan) can be modified to yield a nucleic acid that encodes a structurally identical peptide.
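The number of nucleic acids that encode one polypeptide, and therefore the size of the set of "every possible silent variation," is the product of the per-residue codon counts. A minimal sketch assuming the standard genetic code (the function name is hypothetical):

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, bases ordered T, C, A, G; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of codons available for each amino acid (stop codons excluded).
DEGENERACY = Counter(aa for aa in CODON_TABLE.values() if aa != "*")


def count_encodings(protein):
    """Number of distinct coding sequences for a polypeptide, original included."""
    return prod(DEGENERACY[aa] for aa in protein)


print(count_encodings("MW"), count_encodings("MRW"), count_encodings("RLS"))
```

A peptide made only of methionine and tryptophan has a single encoding, while each arginine, leucine, or serine multiplies the count by six, so the space grows very quickly with protein length.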
  • The term "nucleic acid" refers to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides which have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides.
  • Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions) and complementary sequences, as well as the sequence explicitly indicated.
  • degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al. (1991) Nucleic Acid Res. 19: 5081; Ohtsuka et al. (1985) J. Biol. Chem. 260: 2605-2608; Cassol et al. (1992); Rossolini et al. (1994) Mol. Cell.
  • The term "nucleic acid" is generic to the terms "gene," "DNA," "cDNA," "oligonucleotide," "RNA," "mRNA," and the like.
  • "Nucleic acid derived from a gene" refers to a nucleic acid for whose synthesis the gene, or a subsequence thereof, has ultimately served as a template.
  • an mRNA, a cDNA reverse transcribed from an mRNA, an RNA transcribed from that cDNA, a DNA amplified from the cDNA, an RNA transcribed from the amplified DNA, etc. are all derived from the gene and detection of such derived products is indicative of the presence and/or abundance of the original gene and/or gene transcript in a sample.
  • a nucleic acid is "operably linked” when it is placed into a functional relationship with another nucleic acid sequence.
  • a promoter or enhancer is operably linked to a coding sequence if it increases the transcription of the coding sequence.
  • a "recombinant expression cassette” or simply an “expression cassette” is a nucleic acid construct, generated recombinantly or synthetically, with nucleic acid elements that are capable of effecting expression of a structural gene in hosts compatible with such sequences. Expression cassettes include at least promoters and optionally, transcription termination signals.
  • the recombinant expression cassette includes a nucleic acid to be transcribed (e.g., a nucleic acid encoding a desired polypeptide), and a promoter.
  • an expression cassette can also include nucleotide sequences that encode a signal sequence that directs secretion of an expressed protein from the host cell. Transcription termination signals, enhancers, and other nucleic acid sequences that influence gene expression, can also be included in an expression cassette.
  • the sequence diversity of substrates for DNA shuffling procedures is increased by using codon-altered nucleic acids as templates and/or by using templates that encode proteins with conservative or non-conservative amino acid modifications as compared to a selected wild-type protein.
  • codon altered nucleic acids can be chemically synthesized (e.g., using standard artificial synthetic protocols, e.g., those typically used by commercial sources from which nucleic acids can be ordered), or can be made using any of a variety of methods herein or available to one of skill.
  • oligonucleotide fragments can be made which correspond to a codon altered nucleic acid which is desired using standard synthetic methods, followed by polymerase and/or ligase mediated oligonucleotide ligation/recombination protocols to generate full-length nucleic acids.
  • codon usage modifications and coding modifications can be extensive enough to reduce or, under stringent conditions, even eliminate the hybridization of the codon-altered nucleic acids to a nucleic acid which naturally encodes the selected protein. This dramatically alters the mutations which result from possible single nucleotide mutations, providing access to greater diversity for DNA shuffling protocols.
  • the recombination and selection of such nucleic acids during DNA shuffling procedures can result not only in access to a different set of possible mutations, but can also result in modified forms of transcriptional or translational regulation, alterations in nucleic acid localization, mRNA stability and the like.
  • the modified hybridization properties of codon altered nucleic acids lead to alterations in the ability of the nucleic acids to hybridize with potential recombination partners, altering, and ultimately increasing, the available recombination diversity during shuffling.
  • "family shuffling" using codon-altered substrates even further increases the possible sequence diversity of the starting materials for recombination.
  • family shuffling methods involve shuffling nucleic acids encoding sequence variants of a given protein (e.g., species or allele homologs).
  • this procedure is modified by generating codon-altered versions of the sequence variants to access additional molecular diversity during recombination. Additional diversity is achieved by conservatively and non-conservatively modifying the starting nucleic acids to encode non-naturally occurring sequence variants.
  • Family shuffling can be performed even using homologs of relatively low identity. In such cases, codons may be changed in one or more of the family members to increase the level of identity between the members, thereby increasing their ability to recombine using the methods of this invention.
  • Gene shuffling and family shuffling provide two of the most powerful methods available for improving and "migrating" (gradually changing the type of reaction, substrate or activity of a selected protein such as an enzyme, or regulation or structure of an expressed component) the functions of proteins.
  • in family shuffling, homologous sequences, e.g., from different species, chromosomal positions, or due to synthetic alteration, are recombined.
  • in gene shuffling, a single sequence is mutated or otherwise altered and then recombined.
  • the generation and screening of high quality shuffled libraries provides for DNA shuffling (or "directed evolution").
  • the availability of appropriate high-throughput analytical chemistry to screen the libraries permits integrated high-throughput shuffling and screening of the libraries to achieve a desired activity.
  • oligonucleotides for constructing codon-modified nucleic acids are designed in a computer ("in silico").
  • Predicted codon-modified recombinant nucleic acids can also be determined in silico, i.e., essentially as taught in Selifonov and Stemmer "METHODS FOR MAKING CHARACTER STRINGS,
  • families of nucleic acids can be recombined simply by appropriate selection of the relevant oligonucleotides which are used in gene reconstruction methods to produce recombinant nucleic acids, i.e., by using codon-modified nucleic acid oligonucleotides as discussed herein in conjunction with family oligonucleotide-mediated shuffling methods, e.g., as taught in Crameri et al. "OLIGONUCLEOTIDE MEDIATED NUCLEIC ACID RECOMBINATION" filed February 5, 1999, USSN 60/118,813 and Crameri et al. "OLIGONUCLEOTIDE MEDIATED NUCLEIC ACID RECOMBINATION" filed June 24, 1999, USSN 60/141,049.
  • the technique can be used to recombine homologous or even non-homologous nucleic acid sequences; in the context of the present invention, oligonucleotides corresponding to families of codon-modified nucleic acids are shuffled.
  • the present invention provides significant advantages over previously used methods for optimization of genes. For example, DNA shuffling of codon modified nucleic acids can result in optimization of a desirable property even in the absence of a detailed understanding of the mechanism by which the particular property is mediated.
  • shuffled DNAs can encode polypeptides or RNAs with properties entirely absent in the parental DNAs which are shuffled.
  • molecular diversity is accessed and sequences can be shuffled to obtain desired, including entirely new, properties.
  • sequence recombination can be achieved in many different formats and permutations of formats, as described in further detail below.
  • the targets for modification vary in different applications, as does the property sought to be acquired or improved.
  • candidate targets for acquisition of a property or improvement in a property include genes that encode proteins which have enzymatic or therapeutic or other commercially useful activities.
  • a more extensive listing is found supra; however, even this list is not intended to be limiting, as essentially any nucleic acid can be codon modified and shuffled, using one or more of the processes herein.
  • Shuffling methods use at least two variant forms of a starting target (the variant forms can be nucleic acids, or representations thereof, e.g., as character strings in a computer program).
  • the variant forms of candidate codon-altered substrates can show substantial sequence or secondary structural similarity with each other, but they should also differ in at least one and preferably at least two positions.
  • the initial diversity between forms can be the result of natural variation, e.g., the different variant forms (homologs) are obtained from different individuals or strains of an organism, or constitute related sequences from the same organism (e.g., allelic variations), or constitute homologs from different organisms (interspecific variants), or constitute artificial homologs, e.g., codon-altered nucleic acids encoding the same or a similar protein. Any or all of these sequences can represent or include codon altered nucleic acids.
  • Initial diversity can also be induced, e.g., the variant forms can be generated by error-prone transcription, such as an error-prone PCR or use of a polymerase which lacks proof-reading activity (see, Liao (1990) Gene 88:107-111), of the first variant form, or, by replication of the first form in a mutator strain (mutator host cells are discussed in further detail below, and are generally well known).
  • the initial diversity between substrates is greatly augmented in subsequent steps of recombination for library generation.
  • a mutator strain can include any mutants in any organism impaired in the functions of mismatch repair.
  • Impairment can be of the genes noted, or of homologous genes in any organism. The properties or characteristics that can be acquired or improved vary widely, and, of course depend on the choice of substrate.
  • At least two variant forms of a nucleic acid are recombined to produce a library of recombinant nucleic acids.
  • the library is then screened to identify at least one recombinant nucleic acid that is optimized for the particular property or properties of interest.
  • Recursive sequence recombination can be employed to achieve still further improvements in a desired property, or to bring about new (or "distinct") properties.
  • Recursive sequence recombination entails successive cycles of recombination to generate molecular diversity. That is, one creates a family of nucleic acid molecules showing some sequence identity to each other but differing due to the presence of mutations. In any given cycle, recombination can occur in vivo or in vitro, intracellularly or extracellularly.
  • mutagenesis e.g., error-prone PCR or cassette mutagenesis
  • a single cycle of DNA shuffling of codon-altered nucleic acids provides for generation of surprisingly effective nucleic acids.
  • single cycle recombination is also preferred.
  • 2, 3, 4, 5, or even 10 or more cycles of recombination can be performed, each cycle optionally comprising one or more selection steps.
  • a recombination cycle is usually followed by at least one cycle of screening or selection for molecules having a desired property or characteristic.
  • when a recombination cycle is performed in vitro, the products of recombination, i.e., recombinant segments, are sometimes introduced into cells before the screening step.
  • Recombinant segments can also be linked to an appropriate vector or other regulatory sequences before screening.
  • products of recombination generated in vitro are sometimes packaged in viruses (e.g., bacteriophage) before screening.
  • recombination products can sometimes be screened in the cells in which recombination occurred.
  • recombinant segments are extracted from the cells, and optionally packaged as viruses, before screening.
  • a gene can have many component sequences, each having a different intended role (e.g., coding sequences, regulatory sequences, targeting sequences, stability-conferring sequences, subunit sequences and sequences affecting integration). Each of these component sequences can be varied and recombined simultaneously. Screening/selection can then be performed, for example, for recombinant segments that have increased ability to confer activity upon a cell without the need to attribute such improvement to any of the individual component sequences of the vector.
  • initial round(s) of screening can sometimes be performed using bacterial cells due to high transfection efficiencies and ease of culture.
  • bacterial expression is often not practical or desired, and yeast, fungal or other eukaryotic systems are also used for library expression and screening.
  • other types of screening which are not amenable to screening in bacterial or simple eukaryotic library cells, are performed in cells selected for use in an environment close to that of their intended use. Final rounds of screening can be performed in the precise cell type of intended use.
  • At least one and usually a collection of recombinant segments surviving a first round of screening/selection are subject to a further round of recombination.
  • These recombinant segments can be recombined with each other or with exogenous segments representing the original substrates or further variants thereof. Again, recombination can proceed in vitro or in vivo. If the previous screening step identifies desired recombinant segments as components of cells, the components can be subjected to further recombination in vivo, or can be subjected to further recombination in vitro, or can be isolated before performing a round of in vitro recombination.
  • if the previous screening step identifies desired recombinant segments in naked form or as components of viruses, these segments can be introduced into cells to perform a round of in vivo recombination.
  • the second round of recombination, irrespective of how it is performed, generates further recombinant segments which encompass additional diversity beyond that present in recombinant segments resulting from a previous round (or from multiple previous rounds, e.g., where the process is iteratively repeated).
  • the second round of recombination can be followed by a further round of screening/selection according to the principles discussed above for the first round.
  • the stringency of screening/selection can be increased between rounds.
  • the nature of the screen and the property being screened for can vary between rounds if improvement in more than one property is desired or if acquiring more than one new property is desired. Additional rounds of recombination and screening can then be performed until the recombinant segments have sufficiently evolved to acquire the desired new or improved property or function.
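The recombine, screen and repeat cycle described above can be sketched in silico as a toy loop. Everything here is an assumption made for illustration: sequences are plain strings, recombination is modeled as a single crossover between two parents, the "screen" is a hypothetical scoring function, and parents stay in the library so the best score never drops between rounds. It is not the laboratory protocol.

```python
import random


def recombine(parent_a, parent_b, rng):
    """Toy recombination: one crossover between two equal-length strings."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


def evolve(pool, fitness, rounds, keep, children_per_round, rng):
    """Recursive recombination with a selection step after every round.

    `pool` needs at least two members and `keep` must be >= 2.
    """
    for _ in range(rounds):
        children = []
        for _ in range(children_per_round):
            a, b = rng.sample(pool, 2)
            children.append(recombine(a, b, rng))
        # The library holds parents plus recombinants; screening keeps the best.
        library = set(pool) | set(children)
        pool = sorted(library, key=lambda s: (-fitness(s), s))[:keep]
    return pool


# Hypothetical screen: number of positions matching an arbitrary target string.
TARGET = "AAAAAAAA"


def fitness(seq):
    return sum(x == y for x, y in zip(seq, TARGET))


start = ["AAAATTTT", "TTTTAAAA"]
final = evolve(start, fitness, rounds=5, keep=4,
               children_per_round=20, rng=random.Random(0))
print(final[0], fitness(final[0]))
```

Raising the stringency between rounds, as the text suggests, would correspond to lowering `keep` or changing `fitness` from one call to the next.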
  • the practice of this invention involves the construction of recombinant nucleic acids and the expression of genes in transfected host cells.
  • Molecular cloning techniques to achieve these ends are known in the art.
  • a wide variety of cloning and in vitro amplification methods suitable for the construction of recombinant nucleic acids such as expression vectors are well-known to persons of skill.
  • General texts which describe molecular biological techniques useful herein, including mutagenesis include Berger and Kimmel, Guide to Molecular Cloning Techniques. Methods in Enzymology volume 152 Academic Press, Inc., San Diego, CA (Berger); Sambrook et al., Molecular Cloning - A Laboratory Manual (2nd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, 1989
  • in vitro amplification methods include the polymerase chain reaction (PCR), the ligase chain reaction (LCR), Qβ-replicase amplification, and other RNA polymerase mediated techniques (e.g., NASBA).
  • Oligonucleotides, e.g., for use in in vitro amplification gene reconstruction methods, for use as gene probes, or as shuffling targets (e.g., synthetic genes or gene segments), are typically synthesized chemically according to the solid phase phosphoramidite triester method, e.g., as described by Beaucage and Caruthers (1981), Tetrahedron Letts., 22(20): 1859-1862, e.g., using an automated synthesizer, e.g., as described in Needham-
  • Oligonucleotides can also be custom made and ordered from a variety of commercial sources known to persons of skill. Purification of oligonucleotides (e.g., using gel-purification methods) to improve the quality of synthesized oligonucleotides can be particularly desirable in the processes herein to improve the quality of nucleic acid synthesis protocols.
  • nucleic acid can be custom ordered from any of a variety of commercial sources, such as The Midland Certified Reagent Company (mcrc@oligos.com), The Great American Gene Company (http://www.genco.com), ExpressGen Inc. (www.expressgen.com), Operon Technologies Inc. (Alameda, CA) and many others.
  • peptides and antibodies can be custom ordered from any of a variety of sources, such as PeptidoGenic (pkim@ccnet.com), HTI Bio-products, Inc. (http://www.htibio.com), BMA Biomedicals Ltd (U.K.), Bio-Synthesis, Inc., and many others.
  • libraries of codon altered nucleic acids can be made and recombined.
  • the codon altered nucleic acids can also include differences in encoded amino acid sequences, which can be either conservative or non-conservative in nature.
  • the codon altered nucleic acids can be derived from a single parental amino acid sequence, or can be derived from a family of original sequences, e.g., natural or synthetic homologous variants of a given sequence.
  • Libraries can exist, e.g., in pools or aliquots of cells, viral plaques, enzymatically synthesized pools or aliquots of nucleic acids, or chemically synthesized pools of nucleic acids.
  • a library as used in the invention comprises at least 2 nucleic acid sequences.
  • the libraries of this invention comprise at least 2, 5, 10, 100, 1000, or more nucleic acid sequences.
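A virtual library of codon altered nucleic acids derived from a single parental amino acid sequence can be sketched as follows. The sketch assumes the standard genetic code and uniform random choice among synonymous codons; the function name `make_library` and the example peptide are hypothetical.

```python
import random
from itertools import product
from math import prod

# Standard genetic code, built in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Map each amino acid to its synonymous codons.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def make_library(protein, size, rng):
    """Return `size` distinct nucleic acids that all encode `protein`."""
    capacity = prod(len(SYNONYMS[aa]) for aa in protein)
    if size > capacity:
        raise ValueError(f"only {capacity} synonymous sequences exist")
    library = set()
    while len(library) < size:
        library.add("".join(rng.choice(SYNONYMS[aa]) for aa in protein))
    return sorted(library)


library = make_library("MALWSR", 10, random.Random(42))
for member in library:
    print(member, translate(member))
```

Uniform sampling ignores host codon preference; a weighted choice per codon would be one way to bias the library toward a particular expression host.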
  • libraries are typically constructed with a high percentage of codons altered relative to an initial (e.g., wild type) nucleic acid. Codon usage divergence for each of the codon altered nucleic acids can be 50%, 75%, or even 90% or more as compared to the first nucleic acid. This eliminates hybridization to the parental nucleic acid (and thereby inhibits recombination with the parental nucleic acid, a desirable feature in certain embodiments discussed below).
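One way to put a number on codon usage divergence is the fraction of codon positions at which a variant uses a different codon from the reference. That definition is an assumption made for this sketch; the source does not state how the percentage is computed, and `codon_divergence` is a hypothetical name.

```python
def codon_divergence(reference, variant):
    """Fraction of codon positions whose codon differs between two sequences.

    Both sequences must be in frame, non-empty and of equal length.
    """
    if not reference or len(reference) != len(variant) or len(reference) % 3:
        raise ValueError("sequences must be non-empty, equal-length and in frame")
    starts = range(0, len(reference), 3)
    changed = sum(reference[i:i + 3] != variant[i:i + 3] for i in starts)
    return changed / len(starts)


wild_type = "ATGGCTAAA"   # Met-Ala-Lys
altered = "ATGGCCAAG"     # same peptide, two of three codons changed
print(f"{codon_divergence(wild_type, altered):.0%}")
```

Methionine and tryptophan have a single codon each, so a sequence rich in those residues cannot reach 100% divergence without changing the encoded protein.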
  • codons are modified in members of a gene family so as to increase the degree of identity between the members.
  • the genes are homologous genes from different species.
  • the degree of nucleic acid identity may be lower than the degree of amino acid identity, at least in part, because of differences in codon usage between the species.
  • the homologous genes represent different members of a gene family within a single species. Such genes may encode functionally distinct members of a gene family that nevertheless share significant structural or functional similarity.
  • homologous genes are reverse translated into nucleic acid sequences, and the nucleic acid sequences are modified so as to increase the level of identity between them. Nucleic acids with the modified sequences can then be synthesized in vitro.
  • the modified nucleic acid sequences are at least as identical to each other as the original amino acid sequences.
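The reverse translation step can be sketched for two pre-aligned, gap-free protein sequences. At each position the code picks the pair of codons sharing the most nucleotides, so identical residues always receive identical codons. The standard genetic code is assumed, host codon preference is ignored, and the names `harmonize` and `identity` are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def matches(a, b):
    """Number of positions at which two equal-length strings agree."""
    return sum(x == y for x, y in zip(a, b))


def harmonize(prot_a, prot_b):
    """Reverse translate two aligned proteins to maximize nucleotide identity."""
    if len(prot_a) != len(prot_b):
        raise ValueError("proteins must be aligned to the same length")
    dna_a, dna_b = [], []
    for aa_a, aa_b in zip(prot_a, prot_b):
        pairs = product(SYNONYMS[aa_a], SYNONYMS[aa_b])
        best = max(pairs, key=lambda pair: matches(*pair))
        dna_a.append(best[0])
        dna_b.append(best[1])
    return "".join(dna_a), "".join(dna_b)


def identity(a, b):
    return matches(a, b) / len(a)


gene_a, gene_b = harmonize("MKL", "MRL")
print(gene_a, gene_b, round(identity(gene_a, gene_b), 3))
```

In this small example the proteins share two of three residues, while the harmonized genes share eight of nine nucleotides, because even mismatched residues (here Lys and Arg) can be given codons that differ at a single base. Codon positions are independent, so choosing the best pair position by position maximizes total identity for a fixed alignment.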
  • Additional sequence diversity is provided by generating nucleic acids with non-overlapping non-conservative substitutions in each of the codon altered nucleic acids as compared to the first nucleic acid. This provides for reversion to wild-type upon recombination, while optionally allowing for the incorporation of non-conservative changes to the sequence in the event that they produce a detectable improvement during screening.
  • Modification of the codons of one or more of the codon altered nucleic acids to provide one or more different hydrophobic core residues for an encoded polypeptide as compared to the first polypeptide is also provided.
  • This modification of core amino acids provides minor differences in encoded proteins, while changing the mutational spectrum of the resulting nucleic acid, thereby increasing sequence diversity.
  • codon usage may need to be altered when expressed sequences are shuttled between different organisms (e.g., animal cells, plant cells, bacterial cells, etc.) for optimal expression. This produces a nucleic acid which encodes the same protein, but which, after typical forms of point mutation, will access a different mutational diversity than the original form of the protein.
  • phage libraries are made and recombined in mutator strains such as cells with mutant or impaired gene products of mutS, mutT, mutH, mutL, uvrD, dcm, vsr, umuC, umuD, sbcB, recJ, etc.
  • the impairment is achieved by genetic mutation, allelic replacement, selective inhibition by an added reagent such as a small compound or an expressed antisense RNA, or other techniques.
  • High multiplicity of infection (MOI) libraries are used to infect the cells to increase recombination frequency. Additional strategies for making phage libraries and/or for recombining DNA from donor and recipient cells are set forth in U.S. Pat. No. 5,521,077. Additional recombination strategies for recombining plasmids in yeast are set forth in WO 97/07205.
  • the library to be made can be an in vitro set of molecules, or present in cells, phage or the like.
  • Virtual libraries of nucleic acids generated in silico are also a feature of the invention (see also, Selifonov and Stemmer, supra).
  • the library is screened to identify at least one recombinant nucleic acid that exhibits distinct or improved activity compared to the parental nucleic acid or nucleic acids which are recombined. Additional details on making appropriate libraries are found below, e.g., in the section entitled "Formats for Sequence Recombination.”
  • essentially any nucleic acid can be codon altered and shuffled. No attempt is made herein to identify the hundreds of thousands of known nucleic acids.
  • Common sequence repositories for known proteins include GenBank, EMBL, DDBJ and the NCBI. Other repositories can easily be identified by searching the internet.
  • One class of preferred targets for activation includes nucleic acids encoding therapeutic proteins such as erythropoietin (EPO), insulin, peptide hormones such as human growth hormone; growth factors and cytokines such as epithelial Neutrophil Activating Peptide-78, GROα/MGSA, GROβ, GROγ, MIP-1α, MIP-1β, MCP-1, epidermal growth factor, fibroblast growth factor, hepatocyte growth factor, insulin-like growth factor, the interferons, the interleukins, keratinocyte growth factor, leukemia inhibitory factor, oncostatin M, PD-ECSF, PDGF, pleiotropin, SCF, c-kit ligand, VEGF, G-CSF etc.
  • transcriptional and expression activators include genes and proteins that modulate cell growth, differentiation, regulation, or the like. Expression and transcriptional activators are found in prokaryotes, viruses, and eukaryotes, including fungi, plants, and animals, including mammals, providing a wide range of therapeutic targets. It will be appreciated that expression and transcriptional activators regulate transcription by many mechanisms, e.g., by binding to receptors, stimulating a signal transduction cascade, regulating expression of transcription factors, binding to promoters and enhancers, binding to proteins that bind to promoters and enhancers, unwinding DNA, splicing pre-mRNA, polyadenylating RNA, and degrading RNA.
  • Expression activators include cytokines, inflammatory molecules, growth factors, their receptors, and oncogene products, e.g., interleukins (e.g., IL-1, IL-2, IL-8, etc.), interferons, FGF, IGF-I, IGF-II, PDGF, TNF, TGF-α, TGF-β, EGF, KGF, SCF/c-Kit, CD40L/CD40, VLA-4/VCAM-1, ICAM-1/LFA-1, and hyalurin/CD44; signal transduction molecules and corresponding oncogene products, e.g., Mos, Ras, Raf, and Met; and transcriptional activators and suppressors, e.g., p53, Tat, Fos, Myc, Jun, Myb, Rel, and steroid hormone receptors such as those for estrogen, progesterone, testosterone, aldosterone, the LDL receptor ligand and corticosterone.
  • proteins from infectious organisms for possible vaccine applications including infectious fungi, e.g., Aspergillus, Candida species; bacteria, particularly E. coli, which serves as a model for pathogenic bacteria, as well as medically important bacteria such as Staphylococci (e.g., aureus), Streptococci (e.g., pneumoniae), Clostridia (e.g., perfringens), Neisseria (e.g., gonorrhoea), Enterobacteriaceae (e.g., coli), Helicobacter (e.g., pylori), Vibrio (e.g., cholerae), Campylobacter (e.g.
  • Pseudomonas e.g., aeruginosa
  • Haemophilus e.g., influenzae
  • Bordetella e.g., pertussis
  • Mycoplasma e.g., pneumoniae
  • Ureaplasma e.g., urealyticum
  • Legionella e.g., pneumophila
  • Spirochetes e.g., Treponema, Leptospira, and Borrelia
  • Mycobacteria e.g., tuberculosis, smegmatis
  • Actinomyces e.g., israelii
  • Nocardia e.g., asteroides
  • Chlamydia e.g., trachomatis
  • Rickettsia e.g., Coxiella, Ehrlichia, Rochalimaea, Brucella, Yersinia, Francisella, and Pasteur
  • RNA viruses (examples include Rhabdoviruses, e.g., VSV; Paramyxoviruses, e.g., RSV; Orthomyxoviruses, e.g., influenza; Bunyaviruses; and Arenaviruses), dsRNA viruses (Reoviruses, for example), RNA to DNA viruses, i.e., Retroviruses, e.g., especially HIV and HTLV, and certain DNA to RNA viruses such as Hepatitis B virus.
  • subtilisin can be evolved by shuffling codon altered forms of the gene for subtilisin (von der Osten et al., J. Biotechnol. 28:55-68 (1993) provide a subtilisin coding nucleic acid). Proteins which aid in folding such as the chaperonins are also preferred.
  • Preferred known genes suitable for codon alteration and shuffling also include the following: Alpha-1 antitrypsin, Angiostatin, Antihemolytic factor, Apolipoprotein, Apoprotein, Atrial natriuretic factor, Atrial natriuretic polypeptide, Atrial peptides, C-X-C chemokines (e.g., T39765, NAP-2, ENA-78, Gro-a, Gro-b, Gro-c, IP-10, GCP-2, NAP-4, SDF-1, PF4, MIG), Calcitonin, CC chemokines (e.g., Monocyte chemoattractant protein-1, Monocyte chemoattractant protein-2, Monocyte chemoattractant protein-3, Monocyte inflammatory protein-1 alpha, Monocyte inflammatory protein-1 beta, RANTES, I309, R83915, R91733, HCC1, T58847, D31065, T64262), CD40 ligand, Collagen, Colony stimulating factor (CSF), Complement factor 5a, Complement inhibitor, Complement receptor 1, Factor IX, Factor VII, Factor VIII, Factor X, Fi
  • arthritides mitogen, Superoxide dismutase, Thymosin alpha 1, Tissue plasminogen activator, Tumor necrosis factor beta (TNF beta), Tumor necrosis factor receptor (TNFR), Tumor necrosis factor-alpha (TNF alpha) and Urokinase.
  • Many other known coding nucleic acids, such as those in GenBank™, can be codon-altered and shuffled.
  • homologous genes from different organisms can have significantly lower homology at the nucleic acid level than at the amino acid level.
  • genetic information for some bacterial species is high in GC content (up to 70%), while others have AT rich (>60%) codon usage.
  • genes from different organisms may have, for example, 40-60% amino acid identity but only 25-35% nucleic acid identity. It is often desirable to increase such levels of nucleic acid identity so as to enhance the ability of the homologous sequences to recombine, thereby increasing the efficiency of family shuffling using the methods of this invention.
  • protein sequences of gene family members are reverse translated back into DNA sequences, for example by using one of the preferable codon usage charts in any conventional DNA manipulation program (e.g. the Wisconsin Package ,
  • the genes are chemically synthesized, e.g., using a high throughput oligonucleotide synthesizer in, e.g., a 96-well format, optionally in conjunction with polymerase and/or ligase gene synthesis methods.
  • the DNA sequence similarity after such treatment will be at least as high as the amino acid similarity, but can be at least about 10% to 15% higher than the amino acid identity (in contrast to the situation for naturally occurring genes, which are ordinarily less well conserved than encoded polypeptides), based on the random frequency of sequence identity for any given codon.
  • the minimal requirement for amino acid identity can be as low as about 35% while still retaining adequate nucleic acid homology for standard recombination methods (as discussed, supra, oligonucleotide-mediated recombination methods do not require high levels of similarity to achieve recombination). In some cases, however, the minimal amino acid identity can be even lower, e.g. if the conserved regions are clustered within the genes.
  • the protein erythropoietin alpha, also known as EPO, Epogen, and Procrit, is a hematopoietic hormone, providing a variety of benefits to patients suffering from anemia (a common symptom of, e.g., AIDS).
  • EPO is produced as a pharmaceutical, with sales of nearly 1 billion dollars world-wide. Accordingly, proteins with EPO-like activity (and preferably superior activity) are of substantial commercial interest.
  • Figure 1 shows the sequence of a part of the monkey EPO gene, which is similar to the human EPO gene.
  • Figure 2 shows an example of a codon altered EPO nucleic acid (or "wobble" EPO gene). In general, transversions rather than transition mutations are made where possible. The purpose of this strategy is to maximally disrupt hybridization of the resulting gene with naturally occurring EPOs.
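The transversion-first rule can be sketched as a per-codon choice. For each wild-type codon, the code below picks the synonymous codon with the most transversions relative to the original, breaking ties by total number of changed bases and then by alphabetical order. The standard genetic code is assumed, the scoring and tie-breaking are assumptions made for this sketch rather than the procedure used to design the gene in Figure 2, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Transitions are purine<->purine or pyrimidine<->pyrimidine changes.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def is_transversion(x, y):
    return x != y and frozenset((x, y)) not in TRANSITIONS


def wobble_codon(codon):
    """Synonymous codon with the most transversions, then the most changes."""
    aa = CODON_TABLE[codon]
    candidates = sorted(c for c, a in CODON_TABLE.items() if a == aa)

    def score(candidate):
        transversions = sum(is_transversion(x, y) for x, y in zip(codon, candidate))
        mismatches = sum(x != y for x, y in zip(codon, candidate))
        return (transversions, mismatches)

    # max() keeps the first best candidate, so ties resolve alphabetically.
    return max(candidates, key=score)


def wobble_gene(dna):
    """Apply wobble_codon to every codon of an in-frame sequence."""
    return "".join(wobble_codon(dna[i:i + 3]) for i in range(0, len(dna), 3))


print(wobble_gene("ATGTCTAAA"))  # Met-Ser-Lys
```

Lysine has only one alternative codon, reached by a transition, so that is what gets used; serine can move to the AGC box, changing all three bases. Base-pairing refinements such as avoiding G-C pairing are not modeled here.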
  • Figure 3 shows an alignment of naturally occurring EPOs.
  • This strategy is further fine-tuned by applying standard rules of base pairing (e.g., elimination of G-C pairing and GC stacking) to maximize sequence disruption; in addition, conservative or non-conservative amino acid modifications can also be made (in some cases, where multiple codon-altered nucleic acids are shuffled, it is desirable to make codon altered nucleic acids with non-overlapping non-conservative substitutions to permit reversion to the wild-type amino acid during shuffling).
  • the size of the sequence space for nucleic acids encoding EPO is large, at about 2.8 x 10 different sequences (there are about
  • It is of interest to further evolve codon-altered nucleic acids. Shuffling with other homologous genes from nature, designed genes (incorporating libraries of designed sequence variation), and genes containing mutations of interest are strategies for evolving any gene of interest. However, the codon altered nucleic acid may not be easily shuffled with these genes because of the sequence differences; or they may be undesirable for other reasons (e.g., the naturally occurring sequences may be proprietary, or include proprietary elements).
  • codon-altered homologous nucleic acids which encode desired amino acid variations (e.g., those found in homologous genes), but which have a codon-set close to the nucleic acid(s) to be recombined (thereby permitting, e.g., hybridization during recombination).
  • codon-altered nucleic acids encoding the same proteins are synthesized with a similar codon selection. Standard family shuffling is then practiced with the codon altered nucleic acids. This is shown schematically for EPO in Figure 5.
  • EPO wobble variants are screened for expression and then receptor binding assays are conducted in an ELISA format, using human EPOr-Fc fusions. Following selection of binding variants, activity is measured as thymidine incorporation in UT7-EPO (a human bone marrow cell line) cell proliferation assays. Cells are treated for 2-3 days with various concentrations of EPO variants, after which time they are incubated in the presence of ³H-thymidine for 4 hours and incorporation of thymidine is measured. See also, Erickson-Miller et al. (1997) Blood 90:2421 (for the receptor binding assay), and Wen et al. (1994) J. Biol. Chem.
  • Assays for selecting EPO can also be based, e.g., on the ability of EPO proteins to stimulate the growth of blood cells, e.g., in vitro or in vivo.
  • Family shuffling can be used to breed diversity from genes into the libraries to be screened. Additionally, design heuristics such as randomization of hydrophobic core residues can be used to take advantage of the redundancy between primary structure and tertiary structure of proteins (i.e. many different primary structures encode proteins with very similar three dimensional structures).
  • Design heuristics are employed to create a sequence space of mutants that are predicted to be highly biased (relative to random mutagenesis) to encode proteins which preserve the original activity. Methods such as high throughput (HTP) screening and phage panning are used to identify members of the designed libraries that have the desired activity. DNA shuffling is used to breed this population of active clones in order to fine tune the mutants, thus allowing one to evolve variants with equivalent or superior function relative to the naturally occurring proteins.
  • Figures 6 and 7 show several mammalian homologues of G-CSF.
  • Figure 8 shows the hydrophobic core residues of human G-CSF (blacked out).
  • Figure 9 shows a strategy for evolving variants of human G-CSF that are highly divergent in sequence.
  • three genes are synthesized (Genes 1, 2 and 3, Figure 8) which contain all of the mammalian homologue diversity of G-CSF. These genes are shuffled, phage panned against the G-CSF receptor, and HTP screened for biological function (receptor activation). Active clones are iteratively shuffled and screened if necessary to give evolved variants that rival or surpass the human gene in activity (on human cells).
  • a related approach is to search the protein databases for a protein that has a similar activity to a protein that one wishes to evolve.
  • Denesyuk et al. (1996) J. Theor. Biol shows the results of such a search for G-CSF.
  • LIF is a very similarly folded protein.
  • Another approach is to use computational methods to create families of variants that are predicted to be functional. Dahiyat and Mayo (Science) recently described computer methods that are used to design proteins. Proteins are simulated on the computer, often with the aid of genetic algorithms, and a subset that are deemed "fit" are actually synthesized and analyzed. These computational methods are becoming increasingly powerful. They would be useful to, for example, predict a family of mutations on the surface of a protein that would not destroy function. DNA shuffling can be used to optimize active clones obtained by design.
  • G-CSF G-CSF
  • structure function data, for example alanine scan data for G-CSF reported recently by Reidhaar-Olsen in Biochemistry
  • This library is synthesized, put through biological screens and/or selections (i.e. panning against the G-CSF receptor), and active variants are obtained. DNA shuffling is then used to evolve these active variants to have the desired level of function.
  • G-CSF proteins are displayed on phage and screened for binding to human G-CSF receptor in an ELISA format. Variants that bind receptor are selected in a high throughput screen for receptor activation.
  • This cell based assay measures receptor activation via a reporter gene (such as luciferase) activated by a G-CSF responsive construct containing STAT binding elements.
  • Cells (such as HepG2) are transformed with a G-CSF responsive reporter plasmid and treated with the codon shuffled G-CSF variant for 2.5 hours. Cells are then lysed and luciferase activity measured. See also, Tian et al (1998) Science 281:257-259.
  • Alkaline phosphatase is a widely used reporter enzyme for ELISA assays, protein fusion assays, and in a secreted form as a reporter gene for mammalian cells. A more active form of the enzyme is desirable.
  • a codon altered form of alkaline phosphatase was generated by PCR assembly using the oligos set forth in Figure 10.
  • a map of the oligos is set forth in Figure 11. The procedure used was essentially identical to that taught in Stemmer et al. (1994) Gene 164:49-57.
  • the oligos were mixed 1:1 at a variety of dilutions and PCR assembled by performing, e.g., 25-60 cycles of PCR at, e.g., 94 °C (60 sec), 94 °C (30 sec), 50 °C (30 sec), 72 °C (30 sec). Assembly of the BIAP gene was conducted in a circular format and gene fragments were purified.
  • Cells can be stably transduced with a number of viral vectors including those derived from retroviruses, pox viruses, adenoviruses (Ads), herpes viruses and parvoviruses.
  • viral vectors include those derived from murine leukemia viruses (MuLV), gibbon ape leukemia viruses (GaLV), human immunodeficiency viruses (HIV), adenoviruses, adeno-associated viruses (AAVs), Epstein Barr viruses, canarypox viruses, cowpox viruses, and vaccinia viruses.
  • Viral vectors based upon retroviruses, adeno-associated viruses, herpes viruses and adenoviruses are all used as gene therapy vectors for the introduction of therapeutic nucleic acids into the cells of an organism by ex vivo and in vivo methods.
  • packaging cells are commonly used to prepare virions used to transduce target cells.
  • trans-active genes are rendered inactive and "rescued" by trans-complementation to provide a packaged vector.
  • This form of trans complementation is provided by co-infection of a packaging cell with a virus or vector which supplies, in trans, functions missing from a particular gene therapy vector, or by using a cell line (e.g., 293 cells) which has viral components integrated into the genome of the packaging cell.
  • cells transduced with HIV or murine retroviral proviral sequences which lack the nucleic acid packaging site produce retroviral trans active components, but do not specifically incorporate the retroviral nucleic acids into the capsids produced, and therefore produce little or no live virus.
  • transduced "packaging" cells are subsequently transduced with a vector nucleic acid which lacks coding sequences for retroviral trans active functions, but includes a packaging signal
  • the vector nucleic acid is packaged into an infective virion.
  • a number of packaging cell lines useful for MoMLV-based vectors are known in the art, such as PA317 (ATCC CRL 9078), which expresses MoMLV core and envelope proteins; see, Miller et al. J. Virol. 65:2220-2224 (1991).
  • Carrol et al. (1994) Journal of virology 68(9):6047-6051 describe the construction of packaging cell lines for HIV viruses. Reciprocal complementation of defective HIV molecular clones is described, e.g., in Lori et al. (1992) Journal of Virology 66(9) 5553-5560.
  • HIV functions of viral replication not supplied by trans-complementation which are necessary for replication of the vector are present in the vector.
  • this typically includes, e.g., the TAR sequence, the sequences necessary for HIV packaging, the RRE sequence if the instability elements of the p17 gene of gag are included, and sequences encoding the polypurine tract.
  • HIV sequences that contain these functions include a portion of the 5' long terminal repeat (LTR) and sequences downstream of the 5' LTR responsible for efficient packaging, i.e., through the major splice donor site (“MSD”), and the polypurine tract upstream of the 3' LTR through the U3R section of the 3' LTR.
  • the packaging site (psi site or ψ site) is partially located adjacent to the 5' LTR, primarily between the MSD site and the gag initiator codon (AUG) in the leader sequence. See, Garzino-Demo et al. (1995) Hum. Gene Ther. 6(2): 177-184. For a general description of the structural elements of the HIV genome, see, Holmes et al. PCT/EP92/02787.
  • Another common vector is based upon adenovirus.
  • vectors which include the adenovirus ITRs are packaged in, e.g., 293 cells, which provide many of the components necessary for vector packaging.
  • Adeno-associated viruses utilize helper viruses such as adenovirus or herpes virus to achieve productive infection.
  • In the absence of helper virus functions, AAV integrates (site-specifically) into a host cell's genome, but the integrated AAV genome has no pathogenic effect.
  • the integration step allows the AAV genome to remain genetically intact until the host is exposed to the appropriate environmental conditions (e.g., a lytic helper virus), whereupon it re-enters the lytic life-cycle.
  • Samulski (1993) Current Opinion in Genetic and Development 3:74-80 and the references cited therein provides an overview of the AAV life cycle.
  • the genome of AAV is described in Laughlin et al. (1983) Gene, 23:65-73.
  • Expression of AAV is described in Beaton et al. (1989) J. Virol, 63:4450-4454.
  • the packaging sites for all parvoviruses, including B19 and AAV, are located in the viral ITRs.
  • rAAV vectors deliver foreign nucleic acids to a wide range of mammalian cells (Hermonat & Muzycka (1984) Proc Natl Acad Sci USA 81:6466-6470; Tratschin et al. (1985) Mol Cell Biol 5:3251-3260), integrate into the host chromosome (McLaughlin et al. (1988) J Virol 62:1963-1973), and show stable expression of the transgene in cell and animal models (Flotte et al. (1993) Proc Natl Acad Sci USA 90:10613-10617). rAAV vectors are able to infect non-dividing cells (Podsakoff et al.
  • Advantages of rAAV vectors include the lack of an intrinsic strong promoter, thus avoiding possible activation of downstream cellular sequences, and the vector's naked icosahedral capsid structure, which renders the vectors stable and easy to concentrate by common laboratory techniques.
  • vectors to be packaged can recombine with nucleic acids providing packaging functions in trans, producing a replication-competent virus. This can be a problem both when vectors are produced for therapeutic applications (e.g., in gene therapy) and during production of encoded components in vitro.
  • the present invention provides a way of reducing or eliminating recombination between nucleic acids encoding trans-active components and vector nucleic acids encoding packaging sites.
  • nucleic acid subsequences of a vector which are adjacent to modified or deleted elements provided in trans are codon modified to eliminate hybridization to wild-type sequences. Because these sequences do not hybridize, they cannot recombine with nucleic acids producing trans-active components.
  • One additional advantage of this approach is that the vectors also cannot recombine with live viruses, e.g., in a human body which is infected with a virus that packages vector elements.
  • two types of gene therapy vectors are those based upon retroviruses (which can be packaged by, e.g., HIV- 1) and adenoviruses (which can be packaged by adenovirus).
  • the nucleic acids encoding trans-active components can be codon modified so that they do not hybridize to wild-type sequences. This also prevents recombination with vectors having wild-type sequences, preventing recombination and formation of replication competent viruses. After codon modification, vectors or trans active nucleic acids can be shuffled as described supra, and screened for the ability to package nucleic acids, or to be packaged, as appropriate.
  • codon modification of viral sequences has an additional use as well. Codon alteration of viral sequences can result in attenuation of the virus, e.g., due to modification of regulatory sequences, alterations in mRNA secondary structure, inefficient translation due to rare codon use, and the like.
  • Such "codon attenuated" viruses have a significant advantage over existing attenuated viruses (which are typically generated by serial passage in cells other than the normal host type for the virus).
  • codon attenuated viruses can encode a wild-type set of proteins, making them ideal as immunogenic compositions to generate antibodies, or to use as vaccines.
  • Viral proteins can also be used in various diagnostic assays. For example, the standard diagnostic test for HIV infection in current use tests for the presence of anti-HIV antibodies in blood by probing with viral proteins.
  • Adenovirus is a common vector used, e.g., for gene therapy.
  • the virus is typically modified to make it replication deficient. This can be achieved, e.g., by deleting the E1 and E4 genes.
  • the functions of E1 and E4 can be supplied by trans complementation when E1 and E4 deleted vectors are grown in the ubiquitous human embryonic kidney cell line 293, which has uncharacterized adenovirus fragments incorporated into its genome that supply the missing functions in trans.
  • the replication defective adenoviral vectors recombine at a low, but clinically significant frequency, resulting in replication competent adenovirus contamination of vector preparations. Because adenovirus has detrimental effects on health, this is a significant problem for application of adenovirus-based gene therapy vectors.
  • a codon usage library encompassing several hundred bases to several kilobases of sequence flanking the adenovirus E1 and E4 genes is made.
  • the library is designed to enforce a high degree of divergence from the natural adenoviral consensus sequence, while at the same time incorporating a large degree of degeneracy in the codons to allow for a large space of sequence diversity to be searched.
  • the design principle is to obtain mutants that encode the same or similar protein sequence, but with many mismatches to the wild-type E1 and E4 sequences found in the 293 genome. These mismatches strongly reduce the frequency of unwanted recombination with the trans complementary genes.
  • engineered adenoviral vectors, or adenovirus helper vectors which package adenoviral sequences which include packaging sequences (adenoviral or adeno-associated viral ITRs) in trans have reduced levels of recombination. This provides for a lower rate of competent adenovirus production, making culture and production of such vectors safer.
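  • The design principle above (same protein, many mismatches to the wild-type sequence) can be illustrated with a minimal Python sketch. It is an illustration only, not the procedure of the specification: the function names, the greedy "pick the most divergent synonymous codon" rule, and the short example sequence are all assumptions, and the input is assumed to be an in-frame coding sequence under the standard genetic code. A real library would use degenerate codons rather than a single recoded sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate an in-frame DNA string (length a multiple of 3)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def recode_max_divergence(dna):
    """Hypothetical helper: replace each codon by the synonymous codon
    that differs from it at the most nucleotide positions."""
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        synonyms = [c for c, aa in CODON_TABLE.items() if aa == CODON_TABLE[codon]]
        out.append(max(synonyms, key=lambda c: mismatches(c, codon)))
    return "".join(out)


wild_type = "ATGTCAAGTCTGCGT"           # illustrative sequence: Met-Ser-Ser-Leu-Arg
recoded = recode_max_divergence(wild_type)
print(translate(wild_type) == translate(recoded), mismatches(wild_type, recoded))
```

  • Single-codon amino acids (Met, Trp) cannot be changed this way, which is one reason the six-codon amino acids contribute most of the achievable divergence.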
  • HIV-1 and HIV-2 are genetically related, antigenically cross reactive, and share a common cellular receptor (CD4). See, Rosenburg and Fauci (1993) in Fundamental Immunology, Third Edition Paul (ed) Raven Press, Ltd., New York (Rosenburg and Fauci i) and the references therein for an overview of HIV infection. HIV-1 infection is epidemic world wide, causing a variety of immune system-failure related phenomena commonly termed acquired immune deficiency syndrome (AIDS). HIV type 2 (HIV-2) has been isolated from both healthy individuals and patients with AIDS-like illnesses (Andreasson, et al. (1993) Aids 7, 989-93; Clavel, et al. (1986) Nature, 324, 691-695; Gao, et al.
  • Although HIV-2 AIDS cases have been identified principally from West Africa, sporadic HIV-2 related AIDS cases have also been reported in the United States (O'Brien, et al. (1991) Aids 5, 85-8) and elsewhere. HIV-2 will likely become endemic in other regions over time, following routes of transmission similar to HIV-1 (Harrison, et al. (1991) Journal of Acquired Immune Deficiency Syndromes 4, 1155-60; Kanki, et al. (1992) American Journal of Epidemiology 136, 895-907; Romieu, et al. (1990) Journal of Acquired Immune Deficiency Syndromes 3, 220-30).
  • HIV-2 produces human disease with lesser penetrance than HIV-1, and exhibits a considerably longer period of clinical latency (at least 25 years, and possibly longer, as opposed to less than a decade for HIV-1; see, Kanki, et al. (1991) Aids Clinical Review 1991, 17-38; Romieu, et al. (1990) Journal of Acquired Immune Deficiency Syndromes 3, 220-30, and Travers et al. (1995) Science 268: 1612-1615).
  • The ability of HIV virus populations to rapidly point mutate to avoid the immune response poses a special challenge for vaccine design. While the immune system has responded to viruses in a gradual and co-evolutionary manner, the present invention provides a general approach that allows massively faster evolution to produce new vaccines that stimulate more effective immune responses.
  • virus mutations are selected which reduce recognition and neutralization by the immune system's B and T-cell responses. See, Lukashov et al. (1995) J. Virol. 69:6911-6916. During the long incubation time, these mutations accumulate and eventually overwhelm the immune system's defenses. See, Ho et al. (1995) Nature.
  • Attenuated vaccines, typically produced by prolonged growth of human viruses in animal cells, have proven useful as vaccines for several diseases, including mumps, rubella and measles. Attenuation involves the slow accumulation of many mutations throughout the viral genome during the course of adapting to growth in the animal cells. When used to vaccinate humans, the attenuated virus grows only weakly and elicits a complex immune response which the virus is unable to avoid. The mutations in the attenuated virus could, in principle, revert in the same stepwise fashion in which they accumulated during growth in culture.
  • immunogenic compositions such as vaccines are created which contain a large number of silent substitutions.
  • such viruses have native protein sequences and elicit essentially the same immune responses as the corresponding wild-type virus (typically one or a few additional disabling mutations can also be incorporated). Codon alteration results in two effects that both increase the potential of the vaccine.
  • codon alteration results in impairment of virus evolution.
  • modification of the codons alters the mutational escape spectrum of the virus, upsetting the evolutionary selection for specific codons.
  • the six codon amino acids are the best targets for codon alteration.
  • Serine, arginine and leucine each have one group of four codons, plus two codons in an unrelated group. See, Figure 12. Switching all of the serine codons from AGY to TCX and vice versa, yields proteins with unaltered amino acid sequences. See also, Figure 13. However, these codon groups differ significantly in the spectrum of the amino acids that they yield upon point mutation. Of all possible point mutations of one codon for serine (TCA) 78% result in a different amino acid compared to point mutations obtained for the AGT codon for serine.
  • a virus with hundreds of codon alterations is in, statistically, a very different mutational space, able to access a totally different mutation spectrum, or "cloud,” compared to the wild-type virus.
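  • The difference in mutational spectrum between the two serine codon groups can be checked directly. The following Python sketch enumerates the nine single-nucleotide mutants of a codon and the amino acids they encode under the standard genetic code. The function names are assumptions made for illustration, and the sketch does not attempt to reproduce the 78% figure quoted above, since the counting convention behind that figure is not stated.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def point_mutants(codon):
    """All nine codons reachable from `codon` by one nucleotide change."""
    return [codon[:i] + b + codon[i + 1:]
            for i in range(3) for b in BASES if b != codon[i]]


def mutation_spectrum(codon):
    """Amino acids ('*' = stop) encoded by the nine point mutants, sorted."""
    return sorted(CODON_TABLE[m] for m in point_mutants(codon))


tca = mutation_spectrum("TCA")   # serine codon from the four-codon TCX group
agt = mutation_spectrum("AGT")   # serine codon from the two-codon AGY group
print("TCA ->", tca)
print("AGT ->", agt)
print("shared outcomes:", sorted(set(tca) & set(agt)))
```

  • Under these assumptions the only outcomes the two codons have in common are serine itself and threonine; every other single-step change leads to a different amino acid (or a stop codon) depending on which serine codon is used.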
  • the overall strategy for producing an evolution-defective virus is additionally set forth in Figures 14, 15 and 17.
  • Figure 16, panels A-C show results of single mutations of different codons for ser, arg, and leu.
  • Point mutation is critical for viruses such as HIV-1 to stay ahead of the host immune system.
  • the amino acid mutations that are required for virus escape are likely not random. Wild type codon usage has evolved to allow optimal immune system evasion. The wild type codon usage is likely to favor mutations that represent alterations that avoid the host immune system, without detrimentally affecting the protein(s) encoded. While complex, this natural pattern of amino acid sequence change of the natural virus in response to the host system is non-random and weakly predictable. See also, Seiller-Moiseiwitsch et al. (1994) Annu. Rev. Genet. 28:559-596.
  • a preferred balance of attenuation and evolution impairment is obtained by DNA shuffling (e.g., Stemmer et al. (1995) Gene 164:49-53), e.g., of the wild- type and codon altered sequences, followed by selection of the resulting library of viruses that retain moderate growth despite many codon alterations. While attenuation that can be obtained by this approach may be sufficient for obtaining a vaccine for most viruses, for HIV-1, the evolution impairment is more important, due to the high mutation rate of the virus. Live vaccines are used only if they elicit an immune response which is complex and strong enough to prevent infection of the wild-type virus.
  • Live virus vaccines are typically more protective than single protein vaccines because it is harder to out-mutate T and B-cell responses to a larger number of epitopes.
  • the weak growth of the live virus vaccine results in a larger antigenic dose and point mutation increases the complexity of the immune response.
  • vaccine potential is evaluated in macaques (M. nemestrina) or chimpanzees using SIV variants that are known to cause AIDS. Sequence for an example SIV, SIVsmm, is found at GenBank Accession No. X14307. This virus is closely related to HIV-2. See, Hirsch (1989) Nature 339: 389-392.
  • three molecular clones of HIV-2 (HIV-2ROD, HIV-2SBL-ISY, and HIV-2uc ⁇) have also been reported to infect macaques (M. mulatta and M. nemestrina) or baboons (Franchini, et al. (1989) Proc. Natl. Acad. Sci. U.S.A. 86, 2433-2437; Barnett, et al. (1993) Journal of Virology 67, 1006-14; Boeri, et al. (1992) Journal of Virology 66, 4546-50; Castro, et al. (1991) Virology 184, 219-26; Franchini, et al (1990) Journal of Virology 64, 4462-7; Putkonen, et al.
  • HIV-2 molecular clones provide attractive models for studies of AIDS pathogenesis, and for drug and vaccine development against HIV-1 and HIV-2.
  • HIV-2 was suggested as a possible vaccine candidate against the more virulent HIV-1 due to its long asymptomatic latency period, and its ability to protect against infection by HIV-1 (see, Travers et al. (1995) Science 268: 1612-1615 and related commentary by Cohen et al (1995) Science 268: 1566).
  • codon altered HIV-2 viruses can also be used as a live vaccine, against both HIV-2 and HIV-1.
  • Because the natural pathogenicity of HIV-2 is less than that of HIV-1, it is, in addition to HIV-1, a preferred virus for modification.
  • the methods of the invention entail performing recombination ("shuffling") and screening or selection to "evolve" individual genes, whole plasmids or viruses, multigene clusters, or even whole genomes (Stemmer (1995) Bio/Technology 13:549-553). Reiterative cycles of recombination and screening/selection can be performed to further evolve the nucleic acids of interest. Such techniques do not require the extensive analysis and computation required by conventional methods for polypeptide engineering. Shuffling allows the recombination of large numbers of mutations in a minimum number of selection cycles, in contrast to natural pair-wise recombination events (e.g., as occur during sexual replication).
  • sequence recombination techniques described herein provide particular advantages in that they provide recombination between mutations in any or all of these, thereby providing a very fast way of exploring the manner in which different combinations of mutations can affect a desired result. In some instances, however, structural and/or functional information is available which, although not required for sequence recombination, provides opportunities for modification of the technique.
  • the recombination procedure starts with at least two substrates that generally show substantial sequence identity to each other (e.g., at least about 30%, 50%, 70%, 80% or 90% or more sequence identity), but differ from each other at certain positions.
  • at least one codon altered nucleic acid is recombined with one or more additional nucleic acids (the additional nucleic acid can also be a codon altered nucleic acid).
  • the difference between nucleic acids to be recombined can be any type of mutation, for example, substitutions, insertions and deletions. Often, different segments differ from each other in about 5-20 positions.
  • the starting materials must differ from each other in at least two nucleotide positions.
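  • The substrate criteria above (substantial identity, but at least two differing nucleotide positions) can be checked with a few lines of Python. This is a minimal sketch that assumes the two sequences are already aligned and of equal length; the function names and the default 30% threshold are illustrative choices taken from the lowest figure mentioned above, not requirements of the method.

```python
def differing_positions(a, b):
    """0-based positions at which two pre-aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def percent_identity(a, b):
    """Percentage of aligned positions that are identical."""
    return 100.0 * (len(a) - len(differing_positions(a, b))) / len(a)


def suitable_substrates(a, b, min_identity=30.0):
    """Hypothetical check: enough identity to recombine, and at least
    two differing positions so that recombination can create new combinations."""
    return percent_identity(a, b) >= min_identity and len(differing_positions(a, b)) >= 2


wild_type = "ATGTCAAGTCTGCGT"   # illustrative sequences encoding the same peptide
recoded = "ATGAGTTCCTTAAGA"
print(percent_identity(wild_type, recoded), suitable_substrates(wild_type, recoded))
```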
  • the starting DNA segments can be natural variants of each other, for example, allelic or species variants. More typically, they will be codon altered nucleic acids derived from one or more homologous nucleic acid sequence. The segments can also be from nonallelic genes showing some degree of structural and usually functional relatedness (e.g., codon altered nucleic acids derived from different, but homologous, genes within a superfamily). The starting DNA segments can also be induced variants of each other.
  • one DNA segment can be produced by error-prone PCR replication of the other, or by substitution of a mutagenic cassette. Induced mutants can also be prepared by propagating one (or both) of the segments in a mutagenic strain.
  • the second DNA segment is not a single segment but a large family of related segments.
  • the different segments forming the starting materials are often the same length or substantially the same length. However, this need not be the case; for example; one segment can be a subsequence of another.
  • the segments can be present as part of larger molecules, such as vectors, or can be in isolated form.
  • the starting DNA segments are recombined by any of the sequence recombination formats provided herein to generate a diverse library of recombinant DNA segments.
  • a library can vary widely in size from having fewer than 10 to more than 10^5, 10^15, 10^20 or even more members.
  • the starting segments and the recombinant libraries generated will include essentially full-length coding sequences and any essential regulatory sequences, such as a promoter and polyadenylation sequence, required for expression.
  • the recombinant DNA segments in the library can be inserted into a common vector providing sequences necessary for expression before performing screening/selection.
  • restriction enzyme sites in nucleic acids can be used to direct the recombination of mutations in a nucleic acid sequence of interest. These techniques are particularly preferred in the evolution of fragments that cannot readily be shuffled by other existing methods due to the presence of repeated DNA or other problematic primary sequence motifs. These situations also include recombination formats in which it is preferred to retain certain sequences unmutated.
  • the use of restriction enzyme sites is also preferred for shuffling large fragments (typically greater than 10 kb), such as gene clusters that cannot be readily shuffled and "PCR-amplified” because of their size. Although fragments up to 50 kb have been reported to be amplified by PCR (Barnes, Proc. Natl. Acad. Sci.
  • the restriction endonucleases used are of the Class II type (Sambrook, Ausubel and Berger, supra) and of these, preferably those which generate nonpalindromic sticky end overhangs such as AlwN I, Sfi I or BstX I. These enzymes generate nonpalindromic ends that allow for efficient ordered reassembly with DNA ligase.
  • restriction enzyme (or endonuclease) sites are identified by conventional restriction enzyme mapping techniques (Sambrook, Ausubel, and Berger, supra.), by analysis of sequence information for that gene, or by introduction of desired restriction sites into a nucleic acid sequence by synthesis (i.e. by incorporation of silent mutations).
  • one or more codon-altered nucleic acid can be recombined at restriction sites, e.g., with one or more nucleic acid of interest (including, e.g. a gene or gene cluster to be modified by recombination with the codon-altered nucleic acid).
  • the DNA substrate molecules to be digested can either be from in vivo replicated DNA, such as a plasmid preparation, or from synthetic or e.g., PCR amplified nucleic acid fragments harboring the restriction enzyme recognition sites of interest, preferably near the ends of the fragment.
  • at least two variants of a gene of interest, each having one or more mutations, and at least one of which incorporates codon modifications, are digested with at least one restriction enzyme determined to cut within the nucleic acid sequence of interest.
  • the restriction fragments are then joined with DNA ligase to generate full length genes having shuffled regions. The number of regions shuffled will depend on the number of cuts within the nucleic acid sequence of interest.
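  • How the number of cuts determines the number of shuffled regions can be sketched in Python. The example below is a toy model under stated assumptions: the variants are the same length, the enzyme cuts every variant at the same positions, and the nonpalindromic overhangs force fragments to religate in their original order. The function names and the artificial sequences are hypothetical.

```python
from itertools import product


def digest(seq, cut_positions):
    """Split `seq` into ordered fragments at the given 0-based cut positions."""
    bounds = [0, *sorted(cut_positions), len(seq)]
    return [seq[start:end] for start, end in zip(bounds, bounds[1:])]


def ligation_products(variants, cut_positions):
    """All full-length molecules obtainable by ordered religation of the
    fragments of every variant (one fragment chosen per region)."""
    fragment_sets = [digest(v, cut_positions) for v in variants]
    n_regions = len(fragment_sets[0])
    per_region = [sorted({frags[i] for frags in fragment_sets})
                  for i in range(n_regions)]
    return {"".join(choice) for choice in product(*per_region)}


# Two toy variants, two cuts -> three regions -> up to 2**3 = 8 products.
variant_1 = "AAAAAAAAA"
variant_2 = "CCCCCCCCC"
products = ligation_products([variant_1, variant_2], cut_positions=[3, 6])
print(len(products), sorted(products))
```

  • With k cuts and v variants that differ in every region, ordered reassembly yields at most v^(k+1) distinct full-length products, the two parents included.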
  • the shuffled molecules can be introduced into cells as described above and screened or selected for a desired property as described herein. Nucleic acid can then be isolated from pools (libraries), or clones having desired properties and subjected to the same procedure until a desired degree of improvement is obtained. In some embodiments, at least one DNA substrate molecule or fragment thereof is isolated and subjected to mutagenesis. In some embodiments, the pool or library of religated restriction fragments is subjected to mutagenesis or additional recombination protocols before the digestion-ligation process is repeated.
  • “Mutagenesis” as used herein comprises such techniques known in the art as PCR mutagenesis, oligonucleotide-directed mutagenesis, site-directed mutagenesis, etc., and recursive sequence recombination by any of the techniques described herein.
  • a further technique for recombining mutations in a nucleic acid sequence utilizes "reassembly PCR.” This method can be used to assemble multiple segments that have been separately evolved into a full length nucleic acid template such as a gene. This technique is performed when a pool of advantageous mutants is known from previous work or has been identified by screening mutants that may have been created by any mutagenesis technique known in the art, such as PCR mutagenesis, cassette mutagenesis, doped oligo mutagenesis, chemical mutagenesis, or propagation of the DNA template in vivo in mutator strains.
  • Boundaries defining segments of a nucleic acid sequence of interest preferably lie in intergenic regions, introns, or areas of a gene not likely to have mutations of interest.
  • oligonucleotide primers are synthesized for PCR amplification of segments of the nucleic acid sequence of interest, such that the sequences of the oligonucleotides overlap the junctions of two segments.
  • the overlap region is typically about 10 to 100 nucleotides in length.
  • PCR products are then "reassembled” according to assembly protocols such as those discussed herein to assemble randomly fragmented genes.
  • the PCR products are first purified away from the primers, by, for example, gel electrophoresis or size exclusion chromatography. Purified products are mixed together and subjected to about 1-10 cycles of denaturing, reannealing, and extension in the presence of polymerase and deoxynucleoside triphosphates (dNTPs) and appropriate buffer salts in the absence of additional primers ("self-priming").
  • oligos such as PCR primers can include codon modifications as compared to a starting sequence.
  • oligonucleotides can form the basis for PCR concatemerization reactions in which overlapping hybridized oligonucleotides are extended in one or more PCR amplification cycles.
  • a template nucleic acid is not required (although a template or fragments thereof can be added to the amplification mixture, which can aid in the eventual reassembly of a full-length gene).
  • oligonucleotide gene reassembly methods are found, e.g., in Crameri et al. "OLIGONUCLEOTIDE MEDIATED NUCLEIC ACID RECOMBINATION” filed February 5, 1999, USSN 60/118,813 and Crameri et al. "OLIGONUCLEOTIDE MEDIATED NUCLEIC ACID RECOMBINATION” filed June 24, 1999, USSN 60/141,049.
  • the PCR primers for amplification of segments of a nucleic acid sequence of interest are used to introduce variation into the gene of interest as follows. Mutations at sites of interest in a nucleic acid sequence are identified by screening or selection, by sequencing homologues of the nucleic acid sequence, and so on. Oligonucleotide PCR primers are synthesized which encode wild type or mutant information at sites of interest. These primers are then used in PCR mutagenesis to generate libraries of full length genes encoding permutations of wild type and mutant information at the designated positions. This technique is typically advantageous in cases where the screening or selection process is expensive, cumbersome, or impractical relative to the cost of sequencing the genes of mutants of interest and synthesizing mutagenic oligonucleotides.
  • sequence information from one or more substrate sequences is added to a given "parental" sequence of interest, with subsequent recombination between rounds of screening or selection.
  • this is done with site-directed mutagenesis performed by techniques well known in the art (e.g., Berger, Ausubel and Sambrook, supra.) with one substrate as a template and oligonucleotides encoding single or multiple mutations from other substrate sequences, e.g. homologous genes.
  • the selected recombinant(s) can be further evolved using recursive techniques.
  • site-directed mutagenesis can be done again with another collection of oligonucleotides encoding homologue mutations, and the above process repeated until the desired properties are obtained.
  • degenerate oligonucleotides can be used that encode the sequences in both homologues.
  • One oligonucleotide can include many such degenerate codons and still allow one to exhaustively search all permutations over that block of sequence.
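  • The way a single degenerate oligonucleotide covers every permutation over a block of sequence can be made concrete by expanding the IUPAC ambiguity codes. The Python sketch below is illustrative: the `expand` function and the example oligonucleotides are assumptions, while the ambiguity code table is the standard IUPAC nucleotide alphabet.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand(oligo):
    """List every concrete sequence encoded by a degenerate oligonucleotide."""
    return ["".join(p) for p in product(*(IUPAC[base] for base in oligo))]


# 'RTT' covers ATT (Ile) and GTT (Val); 'SAA' covers CAA (Gln) and GAA (Glu).
# One oligo carrying both degenerate codons encodes all four combinations.
print(expand("RTT"))
print(expand("RTTSAA"))
```

  • The number of encoded sequences is the product of the degeneracies at each position, so an oligonucleotide with n two-fold degenerate codons represents 2^n permutations.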
  • the first referred to as "in silico" shuffling utilizes computer algorithms to perform “virtual” shuffling using genetic operators in a computer.
  • codon altered gene sequence strings are recombined in a computer system and desirable products are made, e.g., by reassembly PCR or ligation of synthetic oligonucleotides.
  • the predicted recombinational outcomes are used to produce corresponding molecules, e.g., by oligonucleotide synthesis and reassembly PCR.
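  • A genetic operator of the kind used for "virtual" shuffling can be sketched as a multi-point crossover between two sequence strings. The Python example below is one possible operator, not the algorithm of any particular program: it assumes two in-frame, equal-length parents, places crossover points only at codon boundaries, and uses a seeded random number generator so that runs are repeatable. All names are hypothetical.

```python
import random


def crossover(parent_a, parent_b, n_crossovers, rng):
    """Recombine two equal-length, in-frame sequences by switching parents
    at `n_crossovers` randomly chosen codon boundaries."""
    if len(parent_a) != len(parent_b) or len(parent_a) % 3:
        raise ValueError("parents must be in-frame and of equal length")
    n_codons = len(parent_a) // 3
    # Interior codon boundaries only, so the child always starts with parent_a.
    points = sorted(rng.sample(range(1, n_codons), n_crossovers))
    parents = (parent_a, parent_b)
    child, start, current = [], 0, 0
    for point in points + [n_codons]:
        child.append(parents[current][3 * start:3 * point])
        start, current = point, 1 - current
    return "".join(child)


wild_type = "ATGTCAAGTCTGCGT"   # illustrative parents encoding the same peptide
recoded = "ATGAGTTCCTTAAGA"
rng = random.Random(0)          # fixed seed for a repeatable in silico library
library = {crossover(wild_type, recoded, 2, rng) for _ in range(20)}
print(sorted(library))
```

  • Because both parents here encode the same peptide and crossovers fall on codon boundaries, every child encodes that peptide as well; selected children would then be synthesized, e.g., by oligonucleotide synthesis and reassembly PCR.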
  • the second useful format is referred to as "oligonucleotide mediated shuffling," in which oligonucleotides corresponding to a family of related homologous nucleic acids (e.g., as applied to the present invention, codon modified synthetic homologous variants of a nucleic acid) are recombined to produce selectable nucleic acids.
  • This format is described in detail in Crameri et al. "OLIGONUCLEOTIDE MEDIATED NUCLEIC ACID RECOMBINATION" filed February 5, 1999, USSN 60/118,813 and Crameri et al. "OLIGONUCLEOTIDE MEDIATED NUCLEIC ACID RECOMBINATION" filed June 24, 1999, USSN 60/141,049.
  • selected oligonucleotides are synthesized, ligated and elongated, typically either in a polymerase or ligase-mediated elongation reaction. The technique can be used to recombine homologous or even non-homologous codon-altered nucleic acid sequences.
  • oligonucleotide-mediated recombination is the ability to recombine homologous nucleic acids with low sequence similarity, or even non-homologous nucleic acids.
  • one or more sets of fragmented nucleic acids (e.g., cleaved codon-modified oligonucleotides, or synthesized codon-modified oligonucleotides) are recombined, e.g., with a set of crossover family diversity oligonucleotides.
  • Each of these crossover oligonucleotides has a plurality of sequence diversity domains corresponding to a plurality of sequence diversity domains from homologous or non-homologous nucleic acids with low sequence similarity.
  • the fragmented oligonucleotides, which are derived by comparison to one or more homologous or non-homologous nucleic acids, can hybridize to one or more regions of the crossover oligos, facilitating recombination.
  • sets of overlapping family gene shuffling oligonucleotides (which are derived by comparison of homologous nucleic acids that include one or more codon-modified nucleic acid, followed by synthesis of corresponding oligonucleotides) are hybridized and elongated (e.g., by reassembly PCR or ligation), providing a population of recombined nucleic acids, which can be selected for a desired trait or property.
  • the set of overlapping family shuffling gene oligonucleotides includes a plurality of oligonucleotide member types which have consensus region subsequences derived from a plurality of homologous target nucleic acids.
  • family gene shuffling oligonucleotides which include one or more codon-altered nucleic acid(s) are provided by aligning homologous nucleic acid sequences to select conserved regions of sequence identity and regions of sequence diversity.
  • a plurality of family gene shuffling oligonucleotides are synthesized (serially or in parallel) which correspond to at least one region of sequence diversity.
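The selection of conserved and diverse regions from an alignment can be sketched as a column-by-column scan. The example below assumes the homologous sequences have already been aligned to equal length (the alignment step itself is not shown), and the function name `classify_regions` is hypothetical.

```python
def classify_regions(aligned):
    """Split an alignment into runs of conserved and diverse columns.

    `aligned` is a list of equal-length strings (an assumption of this
    sketch). Returns (kind, start, end) tuples with end exclusive.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")

    regions = []
    for col in range(length):
        column = {seq[col] for seq in aligned}
        kind = "conserved" if len(column) == 1 else "diverse"
        if regions and regions[-1][0] == kind:
            # Extend the current run instead of starting a new one.
            regions[-1] = (kind, regions[-1][1], col + 1)
        else:
            regions.append((kind, col, col + 1))
    return regions


# Three codon altered homologs encoding the same short peptide.
alignment = ["ATGGCTAAA", "ATGGCCAAG", "ATGGCAAAA"]
regions = classify_regions(alignment)
```

Oligonucleotides for synthesis could then be designed to span at least one "diverse" run, with flanking "conserved" runs supplying the overlap needed for hybridization.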
  • Sets of fragments, or subsets of fragments used in oligonucleotide shuffling approaches can be provided by cleaving one or more homologous nucleic acids (e.g., with a DNase), or, more commonly, by synthesizing a set of oligonucleotides corresponding to a plurality of regions of at least one nucleic acid (typically oligonucleotides corresponding to a full-length nucleic acid are provided as members of a set of nucleic acid fragments).
  • these cleavage fragments can be used in conjunction with family gene shuffling oligonucleotides, e.g., in one or more recombination reaction to produce recombinant codon-altered nucleic acid(s).
  • oligonucleotide shuffling formats are found in co-filed application by Crameri et al., "OLIGONUCLEOTIDE MEDIATED NUCLEIC ACID RECOMBINATION" (Attorney Docket Number 02-296-2US) and in co-filed application by Welch et al., "USE OF CODON VARIED OLIGONUCLEOTIDE SYNTHESIS FOR SYNTHETIC SHUFFLING" (Attorney docket number 02-1007).
  • these applications provide for tri-nucleotide-based synthesis of degenerate oligonucleotides, thereby providing for codon substitution during oligonucleotide shuffling.
  • this procedure utilizes tri-nucleotide phosphoramidite chemistry to synthesize oligos, rather than standard mono-nucleotide synthesis. Because codons are altered as a unit, the synthetic scheme of degenerate oligonucleotides is simplified.

Additional In Vitro DNA Shuffling Formats

  • the initial substrates for recombination are a pool of related sequences, e.g., different variant forms, such as homologs from different individuals, strains, or species of an organism, or related sequences from the same organism, such as allelic variations.
  • the sequences can be DNA or RNA and can be of various lengths depending on the size of the gene or DNA fragment to be recombined or reassembled.
  • the sequences are from 50 base pairs (bp) to 50 kilobases (kb).
  • the pool of related substrates is converted into overlapping fragments, e.g., from about 5 bp to 5 kb or more.
  • the size of the fragments is from about 10 bp to 1000 bp, and sometimes the size of the DNA fragments is from about 100 bp to 500 bp.
  • the conversion can be effected by a number of different methods, such as DNase I or RNase digestion, random shearing or partial restriction enzyme digestion, or by oligonucleotide synthesis as in the family oligonucleotide-mediated shuffling methods of Crameri et al., discussed supra. For discussions of protocols for the isolation, manipulation, enzymatic digestion, and the like, of nucleic acids, see, for example, Sambrook et al. and Ausubel, both supra.
  • the concentration of nucleic acid fragments of a particular length and sequence is often less than 0.1% or 1% by weight of the total nucleic acid.
  • the number of different specific nucleic acid fragments in the mixture is usually at least about 2, 10, 100, 500 or 1,000 or more.
  • the mixed population of nucleic acid fragments is converted to at least partially single-stranded form using any of a variety of techniques, including, for example, heating, chemical denaturation, use of DNA binding proteins, and the like (in oligonucleotide mediated methods, this step can be omitted). Conversion can be effected by heating to about 80 °C to 100 °C, more preferably from 90 °C to 96 °C, to form single-stranded nucleic acid fragments and then reannealing. Conversion can also be effected by treatment with a single-stranded DNA binding protein (see Wold (1997) Annu. Rev. Biochem.
  • Single-stranded nucleic acid fragments having regions of sequence identity with other single-stranded nucleic acid fragments can then be reannealed by cooling to 20 °C to 75 °C, and preferably from 40 °C to 65 °C. Renaturation can be accelerated by the addition of polyethylene glycol (PEG), other volume-excluding reagents or salt.
  • the salt concentration is preferably from 0 mM to 200 mM, more preferably the salt concentration is from 10 mM to 100 mM.
  • the salt may be KCl or NaCl.
  • the concentration of PEG is preferably from 0% to 20%, more preferably from 5% to 10%.
  • the fragments that reanneal can be from different substrates.
  • the annealed nucleic acid fragments are incubated in the presence of a nucleic acid polymerase, such as Taq or Klenow, and dNTP's (i.e. dATP, dCTP, dGTP and dTTP).
  • Taq polymerase can be used with an annealing temperature of between 45 °C and 65 °C.
  • Klenow polymerase can be used with an annealing temperature of between 20 °C and 30 °C.
  • the polymerase can be added to the random nucleic acid fragments prior to annealing, simultaneously with annealing or after annealing.
  • the process of denaturation, renaturation and incubation in the presence of polymerase or ligase of overlapping fragments to generate a collection of polynucleotides containing different permutations of fragments is sometimes referred to as shuffling of the nucleic acid in vitro.
  • This cycle is repeated for a desired number of times. Preferably the cycle is repeated from 2 to 100 times, more preferably the sequence is repeated from 10 to 40 times.
  • the resulting nucleic acids are a family of double-stranded polynucleotides of from about 50 bp to about 100 kb, preferably from 500 bp to 50 kb.
  • the population represents variants of the starting substrates showing substantial sequence identity thereto but also diverging at several positions.
  • the population has many more members than the starting substrates.
  • the population of fragments resulting from shuffling is used to transform host cells, optionally after cloning into a vector.
  • subsequences of recombination substrates can be generated by amplifying the full-length sequences under conditions which produce a substantial fraction, typically at least 20 percent or more, of incompletely extended amplification products.
  • Another embodiment uses random primers to prime an entire template DNA to generate less than full length amplification products.
  • the amplification products, including the incompletely extended amplification products are denatured and subjected to at least one additional cycle of reannealing and amplification.
  • This variation, in which at least one cycle of reannealing and amplification provides a substantial fraction of incompletely extended products, is termed "stuttering."
  • the partially extended (less than full length) products reanneal to and prime extension on different sequence-related template species.
  • the conversion of substrates to fragments can be effected by partial PCR amplification of substrates.
  • a mixture of fragments is spiked with one or more oligonucleotides.
  • the oligonucleotides can be designed to include precharacterized mutations of a wildtype sequence (e.g., codon modification), or sites of natural variations between individuals or species.
  • the oligonucleotides also typically include sufficient sequence or structural homology flanking such mutations or variations to allow annealing with the wildtype fragments. Annealing temperatures can be adjusted depending on the length of homology.
  • recombination occurs in at least one cycle by template switching, such as when a DNA fragment derived from one template primes on the homologous position of a related but different template.
  • Template switching can be induced by addition of recA (see, Kiianitsa (1997) supra), rad51 (see, Namsaraev (1997) Mol. Cell. Biol. 17:5359-5368), rad55 (see, Clever (1997) EMBO J. 16:2535-2544), rad57 (see, Sung (1997) Genes Dev. 11:1111-1121) or other polymerases (e.g., viral polymerases, reverse transcriptase) to the amplification mixture.
  • Template switching can also be increased by increasing the DNA template concentration.
  • Another embodiment utilizes at least one cycle of amplification, which can be conducted using a collection of overlapping single-stranded DNA fragments of related sequence, and different lengths. Fragments can be prepared using a single stranded DNA phage, such as M13 (see, Wang (1997) Biochemistry 36:9486-9492). Each fragment can hybridize to and prime polynucleotide chain extension of a second fragment from the collection, thus forming sequence-recombined polynucleotides.
  • ssDNA fragments of variable length can be generated from a single primer by Pfu, Taq, Vent, Deep Vent, UlTma DNA polymerase or other DNA polymerases on a first DNA template (see, Cline (1996) Nucleic Acids Res. 24:3546-3551).
  • the single stranded DNA fragments are used as primers for a second, Kunkel-type template, consisting of a uracil-containing circular ssDNA. This results in multiple substitutions of the first template into the second. See, Levichkin (1995) Mol. Biology 29:572-577; Jung (1992) Gene 121: 17-24.
  • shuffled nucleic acids obtained by use of the recursive recombination methods of the invention are put into a cell and/or organism for screening.
  • Shuffled genes can be introduced into, for example, bacterial cells, yeast cells, fungal cells, vertebrate cells, invertebrate cells or plant cells for initial screening.
  • E. coli and Bacillus species such as B. subtilis are two examples of suitable bacterial cells into which one can insert and express shuffled genes, and which provide for convenient shuttling to other cell types (a variety of vectors for shuttling material between these bacterial cells and eukaryotic cells are available; see, Sambrook, Ausubel and Berger, all supra).
  • the shuffled genes can be introduced into bacterial, fungal or yeast cells either by integration into the chromosomal DNA or as plasmids.
  • shuffled genes can be introduced into plant or animal cells for production purposes (it will be appreciated that transgenic plants are, increasingly, an important source of industrial enzymes), or can be introduced into a plant or animal cell for therapeutic purposes.
  • a transgene of interest can be modified using the recursive sequence recombination methods of the invention in vitro and reinserted into the cell for in vivo/in situ selection for the new or improved property, in bacteria, eukaryotic cells, or whole eukaryotic organisms.
  • DNA substrate molecules (e.g., those comprising codon modifications relative to a wild-type sequence) are introduced into cells, where the cellular machinery directs their recombination.
  • a library of mutants is constructed and screened or selected for mutants with improved phenotypes by any of the techniques described herein.
  • the DNA substrate molecules encoding the best candidates are recovered by any of the techniques described herein, then fragmented and used to transfect a plant host and screened or selected for improved function. If further improvement is desired, the DNA substrate molecules are recovered from the host cell, such as by PCR, and the process is repeated until a desired level of improvement is obtained.
  • the fragments are denatured and reannealed prior to transfection, coated with recombination stimulating proteins such as recA, or co-transfected with a selectable marker such as Neo to allow the positive selection for cells receiving recombined versions of the gene of interest.
  • Methods for in vivo shuffling are described in, for example, PCT application WO 98/13487 and WO 97/20078. The efficiency of in vivo shuffling can be enhanced by increasing the copy number of a gene of interest in the host cells.
  • the selection methods herein are utilized in a "whole genome shuffling" format.
  • An extensive guide to the many forms of whole genome shuffling is found in the pioneering application to the inventors and their co-workers entitled "Evolution of Whole Cells and Organisms by Recursive Sequence Recombination," PCT/US99/15972, by del Cardayre et al. Any codon-altered set of nucleic acids can be used to transform cells, which can then be shuffled in a whole genome format.
  • whole genome shuffling makes no presuppositions at all regarding what nucleic acids may confer a desired property. Instead, entire genomes (e.g., from a genomic library, or isolated from an organism) are shuffled in cells and selection protocols applied to the cells. These genomes can be spiked with any desired set of nucleic acids, including codon-modified nucleic acids.

Assays

  • The relevant assay for selection of a desired property of a codon-modified nucleic acid will depend on the application. Many assays which detect activity for proteins, receptors, ligands, cells and the like are known. Formats include binding to immobilized components, cell or organismal viability, production of reporter compositions, and the like.
  • each well of a microtiter plate can be used to run a separate assay, or, if concentration or incubation time effects are to be observed, every 5-10 wells can test a single variant.
  • a single standard microtiter plate can assay about 100 (e.g., 96) reactions. If 1536 well plates are used, then a single plate can easily assay from about 100- about 1500 different reactions.
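The plate arithmetic above can be made concrete with a short calculation. This is an illustrative sketch only: the function name `plates_needed` is hypothetical, the library size of 10,000 is used purely as an example, and the calculation assumes every well on a plate is available for samples (no control wells).

```python
import math


def plates_needed(n_variants, wells_per_plate=96, wells_per_variant=1):
    """Number of plates required to assay n_variants.

    Assumes a variant's replicate wells are never split across plates
    and that no wells are reserved for controls.
    """
    variants_per_plate = wells_per_plate // wells_per_variant
    if variants_per_plate == 0:
        raise ValueError("a plate cannot hold even one variant")
    return math.ceil(n_variants / variants_per_plate)


# One well per variant on 96-well plates.
single = plates_needed(10_000)
# Five wells per variant, e.g. to observe concentration or time effects.
replicated = plates_needed(10_000, wells_per_variant=5)
# One well per variant on 1536-well plates.
dense = plates_needed(10_000, wells_per_plate=1536)
```

With these assumptions, 10,000 variants need 105 plates at one well each, 527 plates at five wells each, and 7 plates in 1536-well format, which shows how quickly replicate wells multiply plate counts.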
  • library members e.g., cells, viral plaques, spores or the like, are separated on solid media to produce individual colonies (or plaques).
  • colonies or plaques are identified, picked, and up to 10,000 different mutants inoculated into 96 well microtitre dishes containing two 3 mm glass balls/well.
  • the Q-bot does not pick an entire colony but rather inserts a pin through the center of the colony and exits with a small sampling of cells (or mycelia) and spores (or viruses in plaque applications). The time the pin is in the colony, the number of dips to inoculate the culture medium, and the time the pin is in that medium each affect inoculum size, and each can be controlled and optimized.
  • the uniform process of the Q-bot decreases human handling error and increases the rate of establishing cultures (roughly 10,000/4 hours). These cultures are then shaken in a temperature and humidity controlled incubator.
  • the glass balls in the microtiter plates act to promote uniform aeration of cells and the dispersal of mycelial fragments similar to the blades of a fermenter.
  • Clones from cultures of interest can be cloned by limiting dilution.
  • plaques or cells constituting libraries can also be screened directly for production of proteins, either by detecting hybridization, protein activity, protein binding to antibodies, or the like. The ability to detect a subtle increase in the performance of a shuffled library member over that of a parent strain relies on the sensitivity of the assay.
  • the chance of finding the organisms having an improvement is increased by the number of individual mutants that can be screened by the assay.
  • a prescreen that increases the number of mutants processed by, e.g., 10-fold can be used.
  • the goal of the primary screen is to quickly identify mutants having equal or better product titres than the parent strain(s) and to move only these mutants forward to liquid cell culture for subsequent analysis.
  • High throughput screening systems are commercially available (see, e.g., Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman Instruments, Inc., Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These systems typically automate entire procedures including all sample and reagent pipetting, liquid dispensing, timed incubations, and final readings of the microplate in detector(s) appropriate for the assay. These configurable systems provide high throughput and rapid start up as well as a high degree of flexibility and customization.
  • Zymark Corp. provides technical bulletins describing screening systems for detecting the modulation of gene transcription, ligand binding, and the like.
  • Microfluidic approaches to reagent manipulation have also been developed, e.g., by Caliper Technologies (Mountain View, CA).
  • Optical images viewed (and, optionally, recorded) by a camera or other recording device are optionally further processed in any of the embodiments herein, e.g., by digitizing the image and/or storing and analyzing the image on a computer.
  • a variety of commercially available peripheral equipment and software is available for digitizing, storing and analyzing a digitized video or digitized optical image, e.g., using PC (Intel x86 or Pentium chip-compatible DOS™, OS2™, WINDOWS™, WINDOWS NT™ or WINDOWS95™ based machines), MACINTOSH™, or UNIX based (e.g., SUN™ work station) computers.
  • One conventional system carries light from the assay device to a cooled charge-coupled device (CCD) camera, in common use in the art.
  • a CCD camera includes an array of picture elements (pixels). The light from the specimen is imaged on the CCD. Particular pixels corresponding to regions of the specimen (e.g., individual hybridization sites on an array of biological polymers) are sampled to obtain light intensity readings for each position. Multiple pixels are processed in parallel to increase speed.
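The pixel sampling step can be sketched as averaging the pixels that fall inside each region of interest. The example below is a simplified illustration under stated assumptions: the image is a plain list of rows of intensity values, regions are rectangles given as (top, bottom, left, right) with exclusive upper bounds, and the names `region_intensity` and `read_sites` are hypothetical. It processes regions one after another rather than in parallel.

```python
def region_intensity(image, region):
    """Mean intensity of the pixels inside a rectangular region."""
    top, bottom, left, right = region
    pixels = [image[r][c]
              for r in range(top, bottom)
              for c in range(left, right)]
    return sum(pixels) / len(pixels)


def read_sites(image, sites):
    """Map each named site (e.g. a hybridization spot) to its mean intensity."""
    return {name: region_intensity(image, region)
            for name, region in sites.items()}


# A toy 4 x 4 "CCD image" with four 2 x 2 sites.
image = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [5, 5, 20, 20],
    [5, 5, 20, 20],
]
sites = {
    "A1": (0, 2, 0, 2),
    "A2": (0, 2, 2, 4),
    "B1": (2, 4, 0, 2),
    "B2": (2, 4, 2, 4),
}
readings = read_sites(image, sites)
```

Each reading is a single light intensity value per position, which is the form in which assay signals would typically be compared across library members.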
  • the apparatus and methods of the invention are easily used for viewing any sample, e.g., by fluorescent or dark field microscopic techniques.
  • Integrated systems comprising these and other useful features, e.g., a digital computer with additional features such as high-throughput liquid control software, image analysis software, data interpretation software, a robotic liquid control armature for transferring solutions from a source to a destination operably linked to the digital computer, an input device (e.g., a computer keyboard) for entering data to the digital computer to control high throughput liquid transfer by the robotic liquid control armature, an image scanner for digitizing label signals from labeled assay components, or the like are a feature of the invention.
  • the invention provides an integrated system comprising a computer or computer readable medium comprising a database having at least two artificial homologous codon-altered nucleic acid sequence strings, and a user interface allowing a user to selectively view one or more sequence strings in the database.
  • sequence database programs for aligning and manipulating sequences.
  • standard text manipulation software such as word processing software (e.g., Microsoft Word™ or Corel WordPerfect™) and database software (e.g., spreadsheet software such as Microsoft Excel™, Corel Quattro Pro™, or database programs such as Microsoft Access™ or Paradox™) can be used in conjunction with a user interface (e.g., a GUI in a standard operating system such as a Windows, Macintosh or LINUX system) to manipulate strings of characters.
  • Specialized alignment programs such as BLAST can also be incorporated into the systems of the invention for alignment of codon-altered nucleic acids (or corresponding character strings).
  • the integrated system can also include an automated oligonucleotide synthesizer operably linked to the computer or computer readable medium.
  • the synthesizer is programmed to synthesize one or more oligonucleotide comprising one or more subsequence of one or more of the at least two artificial homologous codon-altered nucleic acids.
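The database, selective viewing, and subsequence selection for synthesis can be pictured with a small sketch. It makes several assumptions not found in the source: the database is an in-memory dictionary keyed by hypothetical variant names, "viewing" is modeled as selecting entries by name, and oligonucleotides are chosen by simple overlapping tiling. The names `select_sequences` and `tile_oligos` are illustrative only, and no real synthesizer interface is modeled.

```python
def select_sequences(database, names):
    """Return only the requested sequence strings (a stand-in for a viewer)."""
    return {name: database[name] for name in names if name in database}


def tile_oligos(sequence, length, overlap):
    """Split a sequence into overlapping subsequences for synthesis.

    Consecutive oligos share `overlap` bases; the final oligo may be
    shorter than `length` when the sequence does not tile evenly.
    """
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and less than length")
    step = length - overlap
    stop = max(len(sequence) - overlap, 1)
    return [sequence[start:start + length]
            for start in range(0, stop, step)]


# Two artificial homologous codon altered sequence strings (made-up data).
database = {
    "variant_1": "ATGGCTAAAGGT",
    "variant_2": "ATGGCCAAGGGA",
}
viewed = select_sequences(database, ["variant_2"])
oligos = tile_oligos(database["variant_1"], length=6, overlap=3)
```

The resulting list of subsequences is the kind of input an automated synthesizer would be given, one entry per oligonucleotide.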
  • Kits will optionally additionally comprise instructions for performing methods or assays, packaging materials, one or more containers which contain assay, device or system components, or the like.
  • the present invention provides kits embodying the methods and apparatus herein.
  • Kits of the invention optionally comprise one or more of the following: (1) a shuffled codon-modified component as described herein; (2) instructions for practicing the methods described herein, and/or for operating the selection procedure herein; (3) one or more assay component; (4) a container for holding nucleic acids or enzymes, other nucleic acids, transgenic plants, animals, cells, or the like; (5) packaging materials; and (6) software fixed in a computer readable medium comprising sequences corresponding to one or more codon-altered nucleic acid character string.
  • the present invention provides for the use of any component or kit herein, for the practice of any method or assay herein, and/or for the use of any apparatus or kit to practice any assay or method herein.

Abstract

The present invention relates to methods of recombining codon-altered libraries of nucleic acids. In addition to codon alterations, these nucleic acids can include conservative or non-conservative modifications of coding sequences as compared to wild-type sequences. In addition to new proteins, the invention provides vectors characterized by reduced rates of reversion to wild-type and attenuated viruses.
PCT/US1999/022588 1998-09-29 1999-09-28 Shuffling of codon altered genes WO2000018906A2 (fr)

Priority Applications (6)

Application Number Priority Date Filing Date Title
AU11990/00A AU1199000A (en) 1998-09-29 1999-09-28 Shuffling of codon altered genes
JP2000572353A JP2002537758A (ja) 1998-09-29 1999-09-28 コドン変更された遺伝子のシャッフリング
KR1020017003873A KR20010085850A (ko) 1998-09-29 1999-09-28 코돈 변형 유전자의 재편성
EP99969739A EP1117777A2 (fr) 1998-09-29 1999-09-28 Shuffling of codon altered genes
CA002331335A CA2331335A1 (fr) 1998-09-29 1999-09-28 Shuffling of codon altered genes
IL14044199A IL140441A0 (en) 1998-09-29 1999-09-28 Shuffling of codon altered genes

Applications Claiming Priority (8)

Application Number Priority Date Filing Date Title
US10236298P 1998-09-29 1998-09-29
US11772999P 1999-01-29 1999-01-29
US11881399P 1999-02-05 1999-02-05
US60/118,813 1999-02-05
US14104999P 1999-06-24 1999-06-24
US60/141,049 1999-06-24
US60/117,729 1999-06-24
US60/102,362 1999-06-24

Publications (3)

Publication Number Publication Date
WO2000018906A2 true WO2000018906A2 (fr) 2000-04-06
WO2000018906A9 WO2000018906A9 (fr) 2000-08-31
WO2000018906A3 WO2000018906A3 (fr) 2000-10-26

Family

ID=27493256

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1999/022588 WO2000018906A2 (fr) 1998-09-29 1999-09-28 Shuffling of codon altered genes

Country Status (7)

Country Link
EP (1) EP1117777A2 (fr)
JP (1) JP2002537758A (fr)
KR (1) KR20010085850A (fr)
AU (1) AU1199000A (fr)
CA (1) CA2331335A1 (fr)
IL (1) IL140441A0 (fr)
WO (1) WO2000018906A2 (fr)

Cited By (73)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1033373A1 (fr) * 1997-10-23 2000-09-06 Nippon Institute for Biological Science Feline granulocyte colony-stimulating factor
WO2001051663A2 (fr) * 2000-01-11 2001-07-19 Maxygen, Inc. Integrated systems and associated methods for diversity generation and screening
WO2001061028A2 (fr) * 2000-02-16 2001-08-23 Sequel Genetics, Inc. Methods and products for peptide-based DNA sequence identification and analysis
WO2001068835A2 (fr) * 2000-03-13 2001-09-20 Aptagen Method for modifying a nucleic acid
WO2001090346A2 (fr) * 2000-05-23 2001-11-29 California Institute Of Technology Gene recombination and hybrid protein development
WO2001090197A1 (fr) * 2000-05-26 2001-11-29 The Australian National University Synthetic peptides and uses thereof
WO2001096551A2 (fr) * 2000-06-14 2001-12-20 Diversa Corporation Whole cell engineering by mutagenizing a substantial portion of a starting genome, combining mutations, and optionally repeating
US6337186B1 (en) 1998-06-17 2002-01-08 Maxygen, Inc. Method for producing polynucleotides with desired properties
US6399383B1 (en) 1997-10-28 2002-06-04 Maxygen, Inc. Human papilloma virus vectors
US6420175B1 (en) 1994-02-17 2002-07-16 Maxygen, Inc. Methods for generating polynucleotides having desired characteristics by iterative selection and recombination
WO2002057495A2 (fr) * 2000-11-10 2002-07-25 The Penn State Research Foundation Modeling framework for predicting the number, type, and distribution of crossovers in directed evolution experiments
EP1304376A1 (fr) * 2000-07-25 2003-04-23 Takeda Chemical Industries, Ltd. Method for producing a recombinant protein
US6596539B1 (en) 1997-10-31 2003-07-22 Maxygen, Inc. Modification of virus tropism and host range by viral genome shuffling
WO2003075129A2 (fr) 2002-03-01 2003-09-12 Maxygen, Inc. Methods, systems, and software for identifying functional biomolecules
WO2003078583A2 (fr) 2002-03-09 2003-09-25 Maxygen, Inc. Optimization of crossover points for directed evolution
EP1421203A2 (fr) * 2001-05-17 2004-05-26 Diversa Corporation Novel antigen-binding molecules for therapeutic, diagnostic, prophylactic, enzymatic, industrial, and agricultural applications, and methods for generating and screening such molecules
US6759226B1 (en) 2000-05-24 2004-07-06 Third Wave Technologies, Inc. Enzymes for the detection of specific nucleic acid sequences
US6902918B1 (en) 1998-05-21 2005-06-07 California Institute Of Technology Oxygenase enzymes and screening method
US7033781B1 (en) 1999-09-29 2006-04-25 Diversa Corporation Whole cell engineering by mutagenizing a substantial portion of a starting genome, combining mutations, and optionally repeating
US7045289B2 (en) 1991-09-09 2006-05-16 Third Wave Technologies, Inc. Detection of RNA Sequences
US7087415B2 (en) 2000-07-31 2006-08-08 Athena Biotechnologies, Inc. Methods and compositions for directed gene assembly
US7098010B1 (en) 2000-05-16 2006-08-29 California Institute Of Technology Directed evolution of oxidase enzymes
US7150982B2 (en) 1991-09-09 2006-12-19 Third Wave Technologies, Inc. RNA detection assays
WO2008028889A1 (fr) * 2006-09-04 2008-03-13 Glaxo Group Limited Synthetic gene
WO2008098198A2 (fr) 2007-02-08 2008-08-14 The California Institute Of Technology Alkane oxidation by modified hydroxylases
US7435570B2 (en) 2003-08-11 2008-10-14 California Institute Of Technology Thermostable peroxide-driven cytochrome P450 oxygenase variants and methods of use
US7465567B2 (en) 2001-04-16 2008-12-16 California Institute Of Technology Peroxide-driven cytochrome P450 oxygenase variants
US7524664B2 (en) 2003-06-17 2009-04-28 California Institute Of Technology Regio- and enantioselective alkane hydroxylation with modified cytochrome P450
US7691616B2 (en) 2001-07-20 2010-04-06 California Institute Of Technology Cytochrome P450 oxygenases
US7711490B2 (en) 2001-01-10 2010-05-04 The Penn State Research Foundation Method and system for modeling cellular metabolism
US7826975B2 (en) 2002-07-10 2010-11-02 The Penn State Research Foundation Method for redesign of microbial production systems
EP2275536A1 (fr) 2002-08-06 2011-01-19 Verdia, Inc. AP1 amine oxidase variants
EP2322629A2 (fr) 2003-04-29 2011-05-18 Pioneer Hi-Bred International Inc. Novel glyphosate-N-acetyltransferase (GAT) genes
US8027821B2 (en) 2002-07-10 2011-09-27 The Penn State Research Foundation Method for determining gene knockouts
US8026085B2 (en) 2006-08-04 2011-09-27 California Institute Of Technology Methods and systems for selective fluorination of organic molecules
WO2012021785A1 (fr) 2010-08-13 2012-02-16 Pioneer Hi-Bred International, Inc. Methods and compositions comprising sequences having hydroxyphenylpyruvate dioxygenase (HPPD) activity
US8252559B2 (en) 2006-08-04 2012-08-28 The California Institute Of Technology Methods and systems for selective fluorination of organic molecules
EP2559702A1 (fr) 2007-02-08 2013-02-20 Domantis Limited Isolated antibody variable domains against serum albumin
US8383346B2 (en) 2008-06-13 2013-02-26 Codexis, Inc. Combined automated parallel synthesis of polynucleotide variants
WO2013166113A1 (fr) 2012-05-04 2013-11-07 E. I. Du Pont De Nemours And Company Compositions and methods comprising sequences having meganuclease activity
WO2014153234A1 (fr) 2013-03-14 2014-09-25 Pioneer Hi-Bred International, Inc. Compositions having dicamba decarboxylase activity and methods of use
WO2014150914A2 (fr) 2013-03-15 2014-09-25 Pioneer Hi-Bred International, Inc. PHI-4 polypeptides and methods for their use
WO2014153242A1 (fr) 2013-03-14 2014-09-25 Pioneer Hi-Bred International, Inc. Compositions having dicamba decarboxylase activity and methods for their use
WO2015023846A2 (fr) 2013-08-16 2015-02-19 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
WO2015038734A2 (fr) 2013-09-13 2015-03-19 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
WO2015120270A1 (fr) 2014-02-07 2015-08-13 Pioneer Hi Bred International, Inc. Insecticidal proteins and methods for their use
WO2015120276A1 (fr) 2014-02-07 2015-08-13 Pioneer Hi Bred International Inc Insecticidal proteins and methods for their use
CN104955961A (zh) * 2012-12-11 2015-09-30 塞勒密斯株式会社 Method for synthesizing gene libraries using codon randomization and mutagenesis
WO2016061206A1 (fr) 2014-10-16 2016-04-21 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
US9322007B2 (en) 2011-07-22 2016-04-26 The California Institute Of Technology Stable fungal Cel6 enzyme variants
WO2016144688A1 (en) 2015-03-11 2016-09-15 Pioneer Hi Bred International Inc Insecticidal combinations of PIP-72 and methods of use
WO2016186986A1 (en) 2015-05-19 2016-11-24 Pioneer Hi Bred International Inc Insecticidal proteins and methods for their use
EP3115459A2 (en) 2008-05-23 2017-01-11 E. I. du Pont de Nemours and Company Novel DGAT genes for increased seed storage lipid production and altered fatty acid profiles in oilseed plants
WO2017023486A1 (en) 2015-08-06 2017-02-09 Pioneer Hi-Bred International, Inc. Plant-derived insecticidal proteins and methods for their use
WO2017105987A1 (en) 2015-12-18 2017-06-22 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
WO2017192560A1 (en) 2016-05-04 2017-11-09 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
WO2018005411A1 (en) 2016-07-01 2018-01-04 Pioneer Hi-Bred International, Inc. Insecticidal proteins from plants and methods for their use
WO2018048869A1 (en) * 2016-09-06 2018-03-15 Bioventures, Llc Compositions and methods for generating reversion free attenuated and/or replication incompetent vaccine vectors
WO2018084936A1 (en) 2016-11-01 2018-05-11 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
WO2018111551A1 (en) 2016-12-14 2018-06-21 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
WO2018118811A1 (en) 2016-12-22 2018-06-28 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
WO2018148001A1 (en) 2017-02-08 2018-08-16 Pioneer Hi-Bred International Inc Insecticidal combinations of plant-derived insecticidal proteins and methods for their use
EP3363900A1 (en) * 2017-02-21 2018-08-22 ETH Zurich Evolution-guided multiplexed DNA assembly of DNA parts, pathways and genomes
WO2018208882A1 (en) 2017-05-11 2018-11-15 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
WO2019040335A1 (en) 2017-08-19 2019-02-28 University Of Rochester Micheliolide derivatives, methods for their preparation and their use as anticancer and anti-inflammatory agents
EP3450549A1 (en) 2012-10-29 2019-03-06 University Of Rochester Artemisinin derivatives, methods for their preparation and their use as antimalarial agents
EP3460061A1 (en) 2008-09-26 2019-03-27 Tocagen Inc. Recombinant replication-competent retroviral vector based on murine leukemia virus for expression of thermostable cytosine deaminase
US11162080B2 (en) 2007-03-30 2021-11-02 The Research Foundation For The State University Of New York Attenuated viruses useful for vaccines
US11214817B2 (en) 2005-03-28 2022-01-04 California Institute Of Technology Alkane oxidation by modified hydroxylases
WO2022015619A2 (en) 2020-07-14 2022-01-20 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
CN114040979A (en) * 2019-06-21 2022-02-11 国立大学法人大阪大学 Method for producing an artificial recombinant RNA virus that stably retains a foreign gene
EP4122945A1 (en) 2013-12-23 2023-01-25 University of Rochester Methods and compositions for ribosomal synthesis of macrocyclic peptides
WO2023173084A1 (en) 2022-03-11 2023-09-14 University Of Rochester Cyclopeptibodies and uses thereof

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101148191B1 (en) * 2011-09-27 2012-05-23 김후정 Erythropoietin-derived peptide and use thereof

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1992006176A1 (en) * 1990-09-28 1992-04-16 Ixsys, Inc. Surface expression libraries of randomized peptides
WO1997035966A1 (en) * 1996-03-25 1997-10-02 Maxygen, Inc. Methods and compositions for cellular and metabolic engineering
WO1998013485A1 (en) * 1996-09-27 1998-04-02 Maxygen, Inc. Methods for optimization of gene therapy by recursive sequence shuffling and selection

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
AKASHI H: "Codon bias evolution in Drosophila. Population genetics of mutation-selection drift" GENE: AN INTERNATIONAL JOURNAL ON GENES AND GENOMES,GB,ELSEVIER SCIENCE PUBLISHERS, BARKING, vol. 205, no. 1-2, 31 December 1997 (1997-12-31), pages 269-278, XP004102913 ISSN: 0378-1119 *
KIM C H ET AL: "Codon optimization for high-level expression of human erythropoietin (EPO) in mammalian cells" GENE: AN INTERNATIONAL JOURNAL ON GENES AND GENOMES,GB,ELSEVIER SCIENCE PUBLISHERS, BARKING, vol. 199, no. 1-2, 15 October 1997 (1997-10-15), pages 293-301, XP004126394 ISSN: 0378-1119 *
PATTEN P A ET AL: "APPLICATIONS OF DNA SHUFFLING TO PHARMACEUTICALS AND VACCINES" CURRENT OPINION IN BIOTECHNOLOGY,GB,LONDON, vol. 8, 1997, pages 724-733, XP002916609 ISSN: 0958-1669 *
S RAYNER ET AL: "MerMade: An oligodeoxyribonucleotide synthesizer for high throughput oligonucleotide production in dual 96-well plates" PCR METHODS AND APPLICATIONS,US,COLD SPRING HARBOR, NY, vol. 8, no. 7, 1 July 1998 (1998-07-01), pages 741-747, XP002089330 ISSN: 1054-9803 *

Cited By (120)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7045289B2 (en) 1991-09-09 2006-05-16 Third Wave Technologies, Inc. Detection of RNA Sequences
US7150982B2 (en) 1991-09-09 2006-12-19 Third Wave Technologies, Inc. RNA detection assays
US6420175B1 (en) 1994-02-17 2002-07-16 Maxygen, Inc. Methods for generating polynucleotides having desired characteristics by iterative selection and recombination
US7935800B2 (en) 1996-01-24 2011-05-03 Third Wave Technologies, Inc. RNA detection assays
US7807806B2 (en) 1996-01-24 2010-10-05 Third Wave Technologies, Inc. RNA detection assays
US8063184B2 (en) 1996-11-29 2011-11-22 Third Wave Technologies, Inc. RNA detection assays
EP1033373A1 (en) * 1997-10-23 2000-09-06 Nippon Institute for Biological Science Feline granulocyte colony-stimulating factor
EP1033373A4 (en) * 1997-10-23 2002-09-25 Nippon Inst For Biolog Science Feline granulocyte colony-stimulating factor
US6399383B1 (en) 1997-10-28 2002-06-04 Maxygen, Inc. Human papilloma virus vectors
US6596539B1 (en) 1997-10-31 2003-07-22 Maxygen, Inc. Modification of virus tropism and host range by viral genome shuffling
US6902918B1 (en) 1998-05-21 2005-06-07 California Institute Of Technology Oxygenase enzymes and screening method
US6337186B1 (en) 1998-06-17 2002-01-08 Maxygen, Inc. Method for producing polynucleotides with desired properties
US7033781B1 (en) 1999-09-29 2006-04-25 Diversa Corporation Whole cell engineering by mutagenizing a substantial portion of a starting genome, combining mutations, and optionally repeating
WO2001051663A2 (en) * 2000-01-11 2001-07-19 Maxygen, Inc. Integrated systems and methods for diversity generation and screening
WO2001051663A3 (en) * 2000-01-11 2002-06-13 Maxygen Inc Integrated systems and methods for diversity generation and screening
WO2001061028A3 (en) * 2000-02-16 2009-06-11 Sequel Genetics Inc Methods and products for peptide-based DNA sequence identification and analysis
WO2001061028A2 (en) * 2000-02-16 2001-08-23 Sequel Genetics, Inc. Methods and products for peptide-based DNA sequence identification and analysis
WO2001068835A3 (en) * 2000-03-13 2003-01-30 Aptagen Method for modifying a nucleic acid
WO2001068835A2 (en) * 2000-03-13 2001-09-20 Aptagen Method for modifying a nucleic acid
US7098010B1 (en) 2000-05-16 2006-08-29 California Institute Of Technology Directed evolution of oxidase enzymes
US7115403B1 (en) 2000-05-16 2006-10-03 The California Institute Of Technology Directed evolution of galactose oxidase enzymes
WO2001090346A3 (en) * 2000-05-23 2002-10-10 California Inst Of Techn Gene recombination and hybrid protein development
WO2001090346A2 (en) * 2000-05-23 2001-11-29 California Institute Of Technology Gene recombination and hybrid protein development
US6759226B1 (en) 2000-05-24 2004-07-06 Third Wave Technologies, Inc. Enzymes for the detection of specific nucleic acid sequences
US7820786B2 (en) 2000-05-26 2010-10-26 Savine Therapeutics Pty Ltd Synthetic peptides and uses therefore
WO2001090197A1 (en) * 2000-05-26 2001-11-29 The Australian National University Synthetic peptides and uses thereof
WO2001096551A3 (en) * 2000-06-14 2002-05-23 Diversa Corp Whole cell engineering by mutagenizing a substantial portion of a starting genome, combining mutations, and optionally repeating
WO2001096551A2 (en) * 2000-06-14 2001-12-20 Diversa Corporation Whole cell engineering by mutagenizing a substantial portion of a starting genome, combining mutations, and optionally repeating
EP1304376A4 (en) * 2000-07-25 2004-12-29 Takeda Chemical Industries Ltd Process for producing a recombinant protein
EP1304376A1 (en) * 2000-07-25 2003-04-23 Takeda Chemical Industries, Ltd. Process for producing a recombinant protein
US7087415B2 (en) 2000-07-31 2006-08-08 Athena Biotechnologies, Inc. Methods and compositions for directed gene assembly
WO2002057495A3 (en) * 2000-11-10 2003-10-16 Penn State Res Found Modeling framework for predicting the number, type, and distribution of crossovers in directed evolution experiments
WO2002057495A2 (en) * 2000-11-10 2002-07-25 The Penn State Research Foundation Modeling framework for predicting the number, type, and distribution of crossovers in directed evolution experiments
US8086414B2 (en) 2001-01-10 2011-12-27 The Penn State Research Foundation Method and system for modeling cellular metabolism
US7711490B2 (en) 2001-01-10 2010-05-04 The Penn State Research Foundation Method and system for modeling cellular metabolism
US7704715B2 (en) 2001-04-16 2010-04-27 California Institute Of Technology Peroxide-driven cytochrome P450 oxygenase variants
US7465567B2 (en) 2001-04-16 2008-12-16 California Institute Of Technology Peroxide-driven cytochrome P450 oxygenase variants
EP1421203A4 (en) * 2001-05-17 2005-06-01 Diversa Corp Novel antigen binding molecules for therapeutic, diagnostic, prophylactic, enzymatic, industrial, and agricultural applications, and methods for generating and screening such molecules
EP1421203A2 (en) * 2001-05-17 2004-05-26 Diversa Corporation Novel antigen binding molecules for therapeutic, diagnostic, prophylactic, enzymatic, industrial, and agricultural applications, and methods for generating and screening such molecules
US9322001B2 (en) 2001-07-20 2016-04-26 California Institute Of Technology Cytochrome P450 oxygenases
US7691616B2 (en) 2001-07-20 2010-04-06 California Institute Of Technology Cytochrome P450 oxygenases
US8367386B2 (en) 2001-07-20 2013-02-05 California Institute Of Technology Cytochrome P450 oxygenases
US8722371B2 (en) 2001-07-20 2014-05-13 California Institute Of Technology Cytochrome P450 oxygenases
US8076114B2 (en) 2001-07-20 2011-12-13 California Institute Of Technology Cytochrome P450 oxygenases
EP2410049A1 (en) 2002-02-26 2012-01-25 Third Wave Technologies, Inc. RNA detection enzymes
EP2315145A1 (en) 2002-03-01 2011-04-27 Codexis Mayflower Holdings, LLC Methods, systems, and software for identifying functional biomolecules
EP2278509A1 (en) 2002-03-01 2011-01-26 Maxygen Inc. Methods, systems, and software for identifying functional biomolecules
EP2390803A1 (en) 2002-03-01 2011-11-30 Codexis Mayflower Holdings, LLC Methods, systems, and software for identifying functional biomolecules
WO2003075129A2 (en) 2002-03-01 2003-09-12 Maxygen, Inc. Methods, systems, and software for identifying functional biomolecules
WO2003078583A2 (en) 2002-03-09 2003-09-25 Maxygen, Inc. Optimization of crossover points for directed evolution
US7826975B2 (en) 2002-07-10 2010-11-02 The Penn State Research Foundation Method for redesign of microbial production systems
US8027821B2 (en) 2002-07-10 2011-09-27 The Penn State Research Foundation Method for determining gene knockouts
US8457941B2 (en) 2002-07-10 2013-06-04 The Penn State Research Foundation Method for determining gene knockouts
US8108152B2 (en) 2002-07-10 2012-01-31 The Penn State Research Foundation Method for redesign of microbial production systems
EP2275536A1 (en) 2002-08-06 2011-01-19 Verdia, Inc. AP1 amine oxidase variants
EP2322629A2 (en) 2003-04-29 2011-05-18 Pioneer Hi-Bred International Inc. Novel glyphosate-N-acetyltransferase (GAT) genes
EP2535414A1 (en) 2003-04-29 2012-12-19 Pioneer Hi-Bred International Inc. Novel glyphosate-N-acetyltransferase (GAT) genes
US8343744B2 (en) 2003-06-17 2013-01-01 The California Institute Of Technology Regio- and enantioselective alkane hydroxylation with modified cytochrome P450
US9145549B2 (en) 2003-06-17 2015-09-29 The California Institute Of Technology Regio- and enantioselective alkane hydroxylation with modified cytochrome P450
US8741616B2 (en) 2003-06-17 2014-06-03 California Institute Of Technology Regio- and enantioselective alkane hydroxylation with modified cytochrome P450
US7524664B2 (en) 2003-06-17 2009-04-28 California Institute Of Technology Regio- and enantioselective alkane hydroxylation with modified cytochrome P450
US7863030B2 (en) 2003-06-17 2011-01-04 The California Institute Of Technology Regio- and enantioselective alkane hydroxylation with modified cytochrome P450
US7435570B2 (en) 2003-08-11 2008-10-14 California Institute Of Technology Thermostable peroxide-driven cytochrome P450 oxygenase variants and methods of use
US9963720B2 (en) 2005-03-28 2018-05-08 California Institute Of Technology Alkane oxidation by modified hydroxylases
US10648006B2 (en) 2005-03-28 2020-05-12 California Institute Of Technology Alkane oxidation by modified hydroxylases
US11214817B2 (en) 2005-03-28 2022-01-04 California Institute Of Technology Alkane oxidation by modified hydroxylases
US8715988B2 (en) 2005-03-28 2014-05-06 California Institute Of Technology Alkane oxidation by modified hydroxylases
US9404096B2 (en) 2005-03-28 2016-08-02 California Institute Of Technology Alkane oxidation by modified hydroxylases
US8026085B2 (en) 2006-08-04 2011-09-27 California Institute Of Technology Methods and systems for selective fluorination of organic molecules
US8252559B2 (en) 2006-08-04 2012-08-28 The California Institute Of Technology Methods and systems for selective fluorination of organic molecules
WO2008028889A1 (en) * 2006-09-04 2008-03-13 Glaxo Group Limited Synthetic gene
WO2008098198A2 (en) 2007-02-08 2008-08-14 The California Institute Of Technology Alkane oxidation by modified hydroxylases
EP2559703A1 (en) 2007-02-08 2013-02-20 Domantis Limited Isolated antibody variable domains against serum albumin
EP2559704A1 (en) 2007-02-08 2013-02-20 Domantis Limited Isolated antibody variable domains against serum albumin
EP2559702A1 (en) 2007-02-08 2013-02-20 Domantis Limited Isolated antibody variable domains against serum albumin
US11162080B2 (en) 2007-03-30 2021-11-02 The Research Foundation For The State University Of New York Attenuated viruses useful for vaccines
EP3115459A2 (en) 2008-05-23 2017-01-11 E. I. du Pont de Nemours and Company Novel DGAT genes for increased seed storage lipid production and altered fatty acid profiles in oilseed plants
US8383346B2 (en) 2008-06-13 2013-02-26 Codexis, Inc. Combined automated parallel synthesis of polynucleotide variants
EP3460061A1 (en) 2008-09-26 2019-03-27 Tocagen Inc. Recombinant replication-competent retroviral vector based on murine leukemia virus for expression of thermostable cytosine deaminase
WO2012021785A1 (en) 2010-08-13 2012-02-16 Pioneer Hi-Bred International, Inc. Methods and compositions comprising sequences having hydroxyphenylpyruvate dioxygenase (HPPD) activity
US9322007B2 (en) 2011-07-22 2016-04-26 The California Institute Of Technology Stable fungal Cel6 enzyme variants
WO2013166113A1 (en) 2012-05-04 2013-11-07 E. I. Du Pont De Nemours And Company Compositions and methods comprising sequences having meganuclease activity
EP3450549A1 (en) 2012-10-29 2019-03-06 University Of Rochester Artemisinin derivatives, methods for their preparation and their use as antimalarial agents
CN104955961A (en) * 2012-12-11 2015-09-30 塞勒密斯株式会社 Method for synthesizing gene libraries using codon randomization and mutagenesis
WO2014153242A1 (en) 2013-03-14 2014-09-25 Pioneer Hi-Bred International, Inc. Compositions having dicamba decarboxylase activity and methods for their use
WO2014153234A1 (en) 2013-03-14 2014-09-25 Pioneer Hi-Bred International, Inc. Compositions having dicamba decarboxylase activity and methods of use
WO2014150914A2 (en) 2013-03-15 2014-09-25 Pioneer Hi-Bred International, Inc. PHI-4 polypeptides and methods for their use
WO2015023846A2 (en) 2013-08-16 2015-02-19 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
WO2015038734A2 (en) 2013-09-13 2015-03-19 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
EP4159028A1 (en) 2013-09-13 2023-04-05 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
EP3692786A1 (en) 2013-09-13 2020-08-12 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
EP4122945A1 (en) 2013-12-23 2023-01-25 University of Rochester Methods and compositions for ribosomal synthesis of macrocyclic peptides
WO2015120276A1 (en) 2014-02-07 2015-08-13 Pioneer Hi Bred International Inc Insecticidal proteins and methods for their use
EP3705489A1 (en) 2014-02-07 2020-09-09 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
WO2015120270A1 (en) 2014-02-07 2015-08-13 Pioneer Hi Bred International, Inc. Insecticidal proteins and methods for their use
WO2016061206A1 (en) 2014-10-16 2016-04-21 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
WO2016144688A1 (en) 2015-03-11 2016-09-15 Pioneer Hi Bred International Inc Insecticidal combinations of PIP-72 and methods of use
WO2016186986A1 (en) 2015-05-19 2016-11-24 Pioneer Hi Bred International Inc Insecticidal proteins and methods for their use
EP3943602A1 (en) 2015-08-06 2022-01-26 Pioneer Hi-Bred International, Inc. Plant-derived insecticidal proteins and methods for their use
WO2017023486A1 (en) 2015-08-06 2017-02-09 Pioneer Hi-Bred International, Inc. Plant-derived insecticidal proteins and methods for their use
WO2017105987A1 (en) 2015-12-18 2017-06-22 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
EP3960863A1 (en) 2016-05-04 2022-03-02 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
WO2017192560A1 (en) 2016-05-04 2017-11-09 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
WO2018005411A1 (en) 2016-07-01 2018-01-04 Pioneer Hi-Bred International, Inc. Insecticidal proteins from plants and methods for their use
EP3954202A1 (en) 2016-07-01 2022-02-16 Pioneer Hi-Bred International, Inc. Insecticidal proteins from plants and methods for their use
WO2018048869A1 (en) * 2016-09-06 2018-03-15 Bioventures, Llc Compositions and methods for generating reversion free attenuated and/or replication incompetent vaccine vectors
US20190211313A1 (en) * 2016-09-06 2019-07-11 Bioventures, Llc Compositions and methods for generating reversion free attenuated and/or replication incompetent vaccine vectors
US11149255B2 (en) 2016-09-06 2021-10-19 Bioventures, Llc Compositions and methods for generating reversion free attenuated and/or replication incompetent vaccine vectors
EP4050021A1 (en) 2016-11-01 2022-08-31 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
WO2018084936A1 (en) 2016-11-01 2018-05-11 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
WO2018111551A1 (en) 2016-12-14 2018-06-21 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
WO2018118811A1 (en) 2016-12-22 2018-06-28 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
WO2018148001A1 (en) 2017-02-08 2018-08-16 Pioneer Hi-Bred International Inc Insecticidal combinations of plant-derived insecticidal proteins and methods for their use
WO2018153835A1 (en) * 2017-02-21 2018-08-30 Eth Zurich Evolution-guided multiplexed DNA assembly of DNA parts, pathways and genomes
EP3363900A1 (en) * 2017-02-21 2018-08-22 ETH Zurich Evolution-guided multiplexed DNA assembly of DNA parts, pathways and genomes
WO2018208882A1 (en) 2017-05-11 2018-11-15 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
WO2019040335A1 (en) 2017-08-19 2019-02-28 University Of Rochester Micheliolide derivatives, methods for their preparation and their use as anticancer and anti-inflammatory agents
CN114040979A (en) * 2019-06-21 2022-02-11 国立大学法人大阪大学 Method for producing an artificial recombinant RNA virus that stably retains a foreign gene
WO2022015619A2 (en) 2020-07-14 2022-01-20 Pioneer Hi-Bred International, Inc. Insecticidal proteins and methods for their use
WO2023173084A1 (en) 2022-03-11 2023-09-14 University Of Rochester Cyclopeptibodies and uses thereof

Also Published As

Publication number Publication date
WO2000018906A9 (fr) 2000-08-31
WO2000018906A3 (fr) 2000-10-26
IL140441A0 (en) 2002-02-10
AU1199000A (en) 2000-04-17
KR20010085850A (ko) 2001-09-07
CA2331335A1 (fr) 2000-04-06
EP1117777A2 (fr) 2001-07-25
JP2002537758A (ja) 2002-11-12

Similar Documents

Publication Publication Date Title
EP1117777A2 (en) Shuffling of codon altered genes
US6423542B1 (en) Oligonucleotide mediated nucleic acid recombination
CA2320697C (en) Oligonucleotide mediated nucleic acid recombination
US6368861B1 (en) Oligonucleotide mediated nucleic acid recombination
US8029988B2 (en) Oligonucleotide mediated nucleic acid recombination
US6436675B1 (en) Use of codon-varied oligonucleotide synthesis for synthetic shuffling
US20060051795A1 (en) Oligonucleotide mediated nucleic acid recombination
US6387702B1 (en) Enhancing cell competence by recursive sequence recombination
CA2362737A1 (en) Recombination of insertion modified nucleic acids
EA020657B1 (en) Tailored multi-site combinatorial assembly
US20030054390A1 (en) Oligonucleotide mediated nucleic acid recombination
MXPA01003212A (en) Shuffling of codon altered genes
DK2253704T3 (en) Oligonucleotide-mediated recombination nucleic acid
KR20010042040A (en) Oligonucleotide mediated nucleic acid recombination
MXPA00009027A (en) Oligonucleotide mediated nucleic acid recombination

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref country code: AU

Ref document number: 2000 11990

Kind code of ref document: A

Format of ref document f/p: F

AK Designated states

Kind code of ref document: A2

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
AK Designated states

Kind code of ref document: C2

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: C2

Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

COP Corrected version of pamphlet

Free format text: PAGES 1/28-28/28, DRAWINGS, REPLACED BY NEW PAGES 1/29-29/29; DUE TO LATE TRANSMITTAL BY THE RECEIVING OFFICE

AK Designated states

Kind code of ref document: A3

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

WWE Wipo information: entry into national phase

Ref document number: 140441

Country of ref document: IL

ENP Entry into the national phase

Ref document number: 2331335

Country of ref document: CA

WWE Wipo information: entry into national phase

Ref document number: 1999969739

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 11990/00

Country of ref document: AU

WWE Wipo information: entry into national phase

Ref document number: 1020017003873

Country of ref document: KR

WWE Wipo information: entry into national phase

Ref document number: PA/a/2001/003212

Country of ref document: MX

WWP Wipo information: published in national office

Ref document number: 1999969739

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWP Wipo information: published in national office

Ref document number: 1020017003873

Country of ref document: KR

WWW Wipo information: withdrawn in national office

Ref document number: 1999969739

Country of ref document: EP

WWW Wipo information: withdrawn in national office

Ref document number: 1020017003873

Country of ref document: KR