EP1276858A1 - Non-PCR-based recombination of nucleic acids - Google Patents

Non-PCR-based recombination of nucleic acids

Info

Publication number
EP1276858A1
EP1276858A1 EP01926467A EP01926467A EP1276858A1 EP 1276858 A1 EP1276858 A1 EP 1276858A1 EP 01926467 A EP01926467 A EP 01926467A EP 01926467 A EP01926467 A EP 01926467A EP 1276858 A1 EP1276858 A1 EP 1276858A1
Authority
EP
European Patent Office
Prior art keywords
stranded
double
nucleic acid
dna
gene
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP01926467A
Other languages
English (en)
French (fr)
Inventor
Alexander Volkov
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Danisco US Inc
Original Assignee
Genencor International Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Genencor International Inc
Publication of EP1276858A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/102 Mutagenizing nucleic acids
    • C12N15/1027 Mutagenizing nucleic acids by DNA shuffling, e.g. RSR, STEP, RPR
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/102 Mutagenizing nucleic acids
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/66 General methods for inserting a gene into a vector to form a recombinant vector using cleavage and ligation; Use of non-functional linkers or adaptors, e.g. linkers containing the sequence for a restriction endonuclease

Definitions

  • the invention relates to the in vitro recombination of nucleic acids.
  • the methods provided herein are generally based on random fragmentation of double-stranded DNA molecules, generation of single-stranded ends on both ends of the fragments and assembly of the fragments having single-stranded ends.
  • In oligonucleotide-directed or site-directed mutagenesis, a short sequence within a given nucleic acid is replaced with an oligonucleotide comprising one or more desired mutations, which are thereby introduced into the nucleic acid, altering the sequence of the nucleic acid and the potential protein encoding information.
  • Site-directed mutagenesis has been widely used in the study of protein structure and properties [for review, e.g. see
  • In cassette mutagenesis, generally a double-stranded oligonucleotide harboring a block of random or partially random nucleotides is inserted into a chosen position of a nucleic acid.
  • This can either be a pure insertion, without deleting or substituting any other nucleotide, or a complete or partial replacement of existing nucleotides [Oliphant et al., Gene 44(2-3):177-83 (1986); Arkin et al., Proc. Natl. Acad. Sci. U.S.A. 89(16):7811-5 (1992); for review see Kegler-Ebo et al., Methods Mol. Biol. 57:297-310 (1996)].
  • Error-prone PCR uses low-fidelity polymerization conditions to introduce a low level of point mutations randomly over a long sequence. Many protein encoding genes were subjected to random mutagenesis by error-prone PCR resulting in mutagenized genes encoding variant proteins with an altered and/or improved property [Song et al., Appl. Environ. Microbiol. 66(3):890-894 (2000); Henke et al., Biol Chem. 380(7-8):1029-33 (1999); Buchholz et al., Nat. Biotechnol. 16(7):657-62 (1998)].
  • In nature, the evolution of most proteins and/or protein encoding genes occurs by mutation, homologous recombination and natural selection. Homologous recombination is a ubiquitous process that plays an important role in species adaptation and survival. During meiosis, homologous recombination ensures mixing and combining of the genes. The importance of homologous recombination is illustrated by its duality of functions: increasing genetic diversity as well as preserving genetic integrity [Stahl, Sci. Am. 256:91-101 (1987)]. However, natural in vivo recombination mechanisms usually operate at low efficiencies.
  • a gene or a plasmid can be assembled from a large number of oligodeoxynucleotides [Stemmer et al., Gene 164(1):49-53 (1995)]. Other methods directed to recombining nucleic acids are published in WO 97/07205;
  • Recombining nucleic acids, when combined with an activity screen or selection protocol, can accelerate finding desired traits.
  • Such methods have reportedly been used to enhance enzyme stability (Zhao et al., supra), to enhance enzyme activity [Stemmer, Nature 370:389-391 (1994); Buchholz et al., Nat. Biotechnol. 16(7):657-62 (1998); Christians et al., Nat. Biotechnol. 17(3):259-264 (1999); Merz et al., Biochemistry 39(5):880-9 (2000)], to change substrate specificity [Zhang et al., Proc. Natl. Acad. Sci.
  • the methods described above are either labor intensive or require PCR.
  • many of the methods discussed above require a PCR reassembly step and, in addition, a PCR procedure to amplify the desired full-length nucleic acid.
  • the rate of down-mutations grows with the information content of the sequence (e.g., the length of a sequence, numbers of PCR cycles, library size).
  • the balance of down-mutations to up-mutations will statistically prevent the selection of further improvements.
  • PCR-based methods are known to be most powerful for the manipulation of nucleic acids in the range of 1 to 5 kb and usually become very inefficient for amplification of DNA sequences above 20 kb, failing to produce a sufficient amount of full-length product for cloning and subsequent analysis.
  • Although amplification of nucleic acids greater than 50 kb has been reported, as pointed out above, amplification of such long sequences may lead to the accumulation of an unacceptably high number of errors and inactivation of biological functions encoded by these sequences.
  • methods employing PCR have many deficiencies in in vitro homologous recombination of larger molecules, such as viral genomes and/or chromosomal fragments.
  • the present invention provides methods directed to the in vitro recombination of nucleic acids.
  • the methods are based on random fragmentation of double-stranded nucleic acid molecules resulting in smaller double-stranded fragments, generation of single-stranded ends (or cohesive ends) on those fragments and assembly.
  • the methods provided herein do not require the use of PCR amplification or PCR-like thermocycling for the assembly of recombined nucleic acids and thus offer advantages over existing in vitro recombination methods.
  • PCR may be used for amplification once the recombined nucleic acid is formed.
  • a method for forming a recombined nucleic acid comprises randomly fragmenting one or more double-stranded nucleic acid molecules, preferably DNA, to form double-stranded fragments having ends; generating single-stranded cohesive ends on each end of said fragments; and assembling together at least two of said fragments having said cohesive ends to form a recombined nucleic acid.
  • the nucleic acid molecules and said fragments remain double-stranded throughout said fragmenting, generating single-stranded cohesive ends and assembling steps.
  • the recombined nucleic acid formed from the present invention can be further subjected to at least one more repetition of said fragmenting, generating single-stranded cohesive ends and assembling steps.
  • the method is repeated at least 10 or more times.
  • one or more double-stranded nucleic acid molecules are added to the formed recombined nucleic acid and said fragmenting, generating single-stranded cohesive ends and assembling steps are repeated.
  • the double-stranded nucleic acid molecule or molecules which are randomly fragmented can be any variety of molecules.
  • the nucleic acid molecule is not subjected to a size limitation.
  • the nucleic acid molecule is a gene, an operon or metabolic pathway containing more than one gene or a chromosome.
  • at least two nucleic acid molecules having sequences which differ from each other are fragmented.
  • the nucleic acid molecules can be homologs of one another, variants of a nucleic acid, or a mix of naturally occurring nucleic acid molecules and variants of one another.
  • the method comprises randomly fragmenting a double-stranded nucleic acid molecule comprising a gene flanked by known sequences to form double-stranded fragments having ends; generating single-stranded cohesive ends on each end of said fragments; and assembling said fragments with a double-stranded insert having single-stranded ends complementary to said known sequences to form a recombined nucleic acid having said gene deleted or inverted.
  • where a gene is deleted, the insert does not comprise the gene.
  • the double-stranded insert having single-stranded ends is formed by a method comprising amplifying a gene with primers that have additional nucleotides that are not natural extensions of said gene, wherein said additional nucleotides are complementary to opposing ends of said known sequences to form an amplification product.
  • the amplification product is treated to form said insert having single-stranded ends complementary to said known sequences.
  • a method for forming a recombined nucleic acid having a gene inserted therein comprises randomly fragmenting a double-stranded nucleic acid molecule; generating single-stranded cohesive ends on each end of said fragments; and assembling said fragments with a double-stranded insert comprising a gene and having single-stranded ends to form a recombined nucleic acid having a gene inserted therein.
  • the double-stranded insert is formed by a method comprising amplifying said insert with primers that have additional nucleotides to form an amplification product; and treating said amplification product to form said insert having single-stranded ends.
  • the additional nucleotides can be random or natural extensions from a selected site for insertion.
  • the recombined nucleic acids formed herein can be used in conventional amplification after formation, the formation of proteins, screening assays, the formation of genetic pools, cloning or expression vectors, and inserting heterologous genes into existing operons for the purpose of metabolic engineering, and have a variety of other applications.
  • the compositions formed from the methods of the present invention are also provided herein.
  • the present invention provides for forming recombined nucleic acids which do not require the use of PCR for purposes of assembly or recombination. As such, the recombined nucleic acids are not subjected to size limitations or amplification of undesired errors. Moreover, the present invention allows recombination of sequences with low homology which would not occur if subjected to the high temperatures of PCR. Additionally, methods of forming proteins, screening assays, genetic pools, and recombined nucleic acids and proteins are provided herein.
  • a method for forming a recombined nucleic acid comprises randomly fragmenting one or more double-stranded nucleic acid molecules to form double-stranded fragments having ends; generating single-stranded cohesive ends on each end of said fragments; and assembling together at least two of said fragments having said cohesive ends to form a recombined nucleic acid.
  • Nucleic acid as used herein can be any nucleic acid, DNA (synthetic, genomic or cDNA), RNA, or a mix of DNA and RNA.
  • nucleic acid is DNA.
  • DNA is used herein for illustration.
  • the one or more double-stranded DNA molecules can be any double-stranded DNA molecule.
  • the DNA molecule can be recombined in a variety of ways.
  • DNA molecules which differ in sequence from each other are used.
  • a heterogenous population of DNA molecules may be used.
  • the terms "population" or "library" or grammatical equivalents thereof, as used herein, generally mean a collection of components such as nucleic acids, nucleic acid fragments, proteins, vectors, constructs, cells, etc.
  • a population of the invention comprises from at least two components to 10^9 components.
  • Preferred are populations comprising from at least 10 components to 10^8 components, more preferred are populations comprising from at least 50 components to 10^7 components and most preferred are populations comprising from at least 100 components to 10^6 components.
  • the family members are related, but differ in at least one aspect, e.g., in their sequence, i.e., they are not identical.
  • the recombined nucleic acid is formed of one initial DNA molecule, wherein the sequence of the initial DNA molecule has been rearranged such that the recombined nucleic acid has a sequence which differs from the initial DNA molecule.
  • more than one double-stranded DNA molecule can be used wherein the DNA molecules are homogenous. In this sense, homogenous refers to DNA molecules having the same sequence.
  • the DNA molecules are at least about 30 bp and can be any desired length.
  • the DNA molecules are genes, operons or metabolic pathways, or chromosomes.
  • the double-stranded DNA molecules to be fragmented can be any molecules including naturally occurring molecules and variants thereof.
  • By "naturally occurring", "wild type" or grammatical equivalents thereof is meant a nucleic acid sequence or an amino acid sequence that is found in nature and, in one embodiment, includes naturally occurring allelic variations.
  • the DNA molecules are non-naturally occurring sequences.
  • By "non-naturally occurring" or grammatical equivalents thereof is meant a nucleic acid sequence or an amino acid sequence that is not found in nature.
  • the DNA molecules are a mixture of naturally occurring and non-naturally occurring sequences.
  • the DNA molecule is a variant of a naturally occurring nucleic acid.
  • a "variant" or grammatical equivalents thereof refers to a component that is altered at one or more sites with respect to a corresponding naturally occurring component.
  • a nucleic acid variant (or variant nucleic acid) comprises a nucleotide sequence that is altered by one or more nucleotides when compared to a nucleotide sequence of a naturally occurring nucleic acid or to a nucleotide sequence of a non-naturally-occurring sequence.
  • a protein variant (or variant protein) comprises an amino acid sequence that is altered by one or more amino acid residues when compared to an amino acid sequence of a naturally occurring protein or to an amino acid sequence of a non-naturally-occurring protein.
  • a variant has one or more deletions, substitutions, insertions, truncations or combinations thereof.
  • a population of double-stranded DNA molecules comprises a naturally occurring nucleic acid, homologs, naturally occurring allelic variations thereof as well as random and site-directed variants. Where all the DNA molecules are based on the same nucleic acid, being variants or homologs thereof, etc., the DNA molecules are said to be related or a family.
  • homolog refers to a gene or protein which is identified as functionally equivalent but produced in a different species.
  • a population of double-stranded DNA molecules is generated by mutagenesis.
  • the mutagenesis methods employed may be site-directed or random and are generally known in the art.
  • error-prone PCR can be used to generate the double-stranded DNA molecules.
  • Other methods for obtaining DNA molecules can be used, such as using mutator strains, chemical mutagenesis or irradiation with X-rays or ultraviolet light using methods as known in the art.
  • the DNA molecules can be represented at about the same ratio.
  • the population may comprise five different variants, 'a', 'b', 'c', 'd', and 'e' of a naturally occurring nucleic acid.
  • variants may be combined in a 1:1:1:1:1 ratio.
  • one variant, e.g., 'a', that may comprise a desired mutation
  • the other variants, e.g., 5:1:1:1:1
  • Each variant may be present in a different molar ratio in the population.
  • the method comprises fragmenting the double-stranded DNA molecules.
  • the DNAs are randomly fragmented.
  • each double-stranded DNA is fragmented into at least two fragments.
  • the fragments may be of different sizes and are preferably at least about 15 base pairs and may be at least 1 kb, 5 kb, 10 kb, or preferably larger in some embodiments.
  • Random fragmentation can be done by using enzymes including, but not limited to, DNase I [Liao, J. Biol. Chem. 249:2354 (1974); Matsuda and Ogoshi, J. Biochem. 59:230 (1966); Hong, Methods Enzymol.
  • Random fragmentation may be by shearing of DNA in one embodiment, and includes, but is not limited to sonication of DNA and passage of the DNA through a tube having a small orifice, such as a needle.
  • first and second double-stranded DNAs are fragmented to generate at least 4 fragments.
  • the number of different specific nucleic acid fragments will be at least about 100, preferably at least about 500, more preferably at least about 1000 and most preferably at least about 10^4.
  • the DNA fragments generated by fragmentation of a population of double-stranded DNA or the double-stranded DNA population may comprise short single-stranded 5'- or 3'- protruding ends.
  • the short 5'- or 3'- protruding ends of the double-stranded DNA population or the short 5'- or 3'- protruding ends of the DNA fragments are removed. Enzymatic removal of 5'- or 3'- protruding ends includes, but is not limited to, using one or more of the following enzymes: Bal 31, S1 nuclease, mung bean nuclease, P1 nuclease, DNase I, exonuclease I, exonuclease VII, N.
  • DNA polymerase e.g., DNA polymerase I (Kornberg polymerase), DNA polymerase I (Klenow fragment), T4 DNA polymerase, T7 DNA polymerase, Taq DNA polymerase, micrococcal DNA polymerase, etc.
  • single-stranded cohesive ends are generated.
  • By "single-stranded end", "cohesive end" or grammatical equivalents thereof herein is meant a nucleic acid strand that is protruding from the end(s) of an otherwise double-stranded nucleic acid.
  • the fragments remain double-stranded throughout two or more subsequent steps of the method of the invention. In this embodiment, it is understood that the fragments remain double-stranded but for the cohesive ends.
  • the cohesive ends are generated by digestion of the DNA fragments and are part of or integral to the DNA fragment prior to the generation of the cohesive ends.
  • the methods do not comprise attachment of a single stranded oligo or PCR synthesis to form the single stranded end.
  • Preferred embodiments utilize a type IIs restriction endonuclease or an exonuclease.
  • cohesive ends may be formed by the addition of oligonucleotides either in a polymerase reaction or in a ligation reaction using standard methods known in the art. Additionally, cohesive ends can be formed by removing nucleotides, or by the addition and removal of nucleotides. For example, in one embodiment, the cohesive ends can be added by using primers to form a template, synthesizing a complementary end thereon, and then removing said template. Preferred embodiments exclude the use of ribonucleotides and/or ribonucleases in the generation of the cohesive ends. Alternative embodiments include nucleic acid fragments and/or the generation of cohesive ends which include the use of ribonucleotides. The nucleic acids may comprise a mix of DNA and RNA. In the generation of cohesive ends, wherein nucleotides are removed which include RNA, ribonucleases or chemical agents which degrade RNA may be used.
  • a type IIs endonuclease is an enzyme which binds to a short recognition sequence, rarely palindromic, and cuts one strand (and usually both) downstream of the recognition sequence instead of within it.
  • type IIs endonucleases allow generation of cohesive ends which can have random sequences, since cleavage is based on the recognition site, whereas the cleavage site can be any sequence at all falling at the appropriate distance from the recognition site.
  • the type IIs endonucleases do not require a palindrome for site recognition and typically cut DNA a measured number of bases to one side of the recognition site [e.g., the MboII site is 5'...GAAGA...3', and the cut site is 8 bases 3' of the recognition site on the upper DNA strand and 7 bases 5' of the recognition site on the bottom strand; abbreviated as GAAGA(8/7)].
  • Some type IIs restriction enzymes, such as BaeI, BcgI, BsaXI, Bsp24I, CjeI and CjePI, cut on both sides of their recognition sequence, and thus have 4 cleavage sites instead of two.
  • REBASE restriction enzyme data base
  • type IIs restriction enzymes recognize more than one nucleic acid sequence (e.g., BaeI, BcgI, BsaXI, BseMII, Bsp24I, CjeI, CjePI, MmeI, TaqII and Tth111II).
  • REBASE restriction enzyme data base
  • type IIs restriction enzymes [Roberts and Macelis, Nucleic Acids Res. 26(1):338-350 (1998); incorporated by reference in its entirety].
  • the site is 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 bases away from the recognition site.
  • one or more of the following type IIs endonucleases are used (names of enzymes in parentheses are isoschizomers; enzymes with asterisks indicate that the enzyme is not commercially available as of the publication of Roberts and Macelis, supra, but can be isolated by methods known in the art): AceIII*, BaeI*, BbvI, BbvII* (BbsI, BpiI, BpuAI), Bce83I*, BcefI*, BcgI, BciVI, BfiI, BinI* (AclWI, AlwI), BsaXI*, BseMII*, BseRI, BsgI, BsmAI (Alw26I), Bsp24I*, BspMI, BsrDI (Bse3DI), BtsI, CjeI*, CjePI*, EciI*.
  • Type IIs restriction enzymes leaving cohesive ends comprising only one nucleotide such as MboII
  • type IIs restriction enzymes leaving cohesive ends comprising two nucleotides such as BsgI
  • type IIs restriction enzymes leaving cohesive ends comprising three nucleotides such as Ksp632I
  • type IIs restriction enzymes leaving cohesive ends comprising four nucleotides such as BbvI
  • type IIs restriction enzymes leaving cohesive ends comprising five nucleotides such as HgaI
  • the double-stranded DNAs or the double-stranded DNA fragments generated by fragmentation are cloned into a vector and then released by digestion with a type IIs restriction endonuclease to generate cohesive ends.
  • the vector comprises a recognition site for a type II restriction endonuclease (such as EcoRV), flanked on either side by a recognition sequence for a type IIs restriction endonuclease.
  • the recognition sequences for type II and type IIs restriction endonucleases may overlap.
  • the double-stranded DNAs or the double-stranded DNA fragments are cloned into the type II restriction site, thereby generating constructs comprising the double-stranded DNAs or the double-stranded DNA fragments, flanked by two type IIs restriction sites.
  • the double-stranded DNAs or the double-stranded DNA fragments are released, each comprising two cohesive ends.
  • the size of the vector is larger than the average size of the DNA fragments to be cloned into the vector. This allows the double-stranded DNAs or the double-stranded DNA fragments not ligated into the vector to be removed by centrifugation using, e.g., a sizing filtration, such as Microcon 100 filters. The ligated DNA will remain on the filter, while non-ligated DNA passes through the filter. The ligated DNA may then be digested with a type IIs restriction endonuclease and the released double-stranded DNAs or the double-stranded DNA fragments, each comprising cohesive end(s), can be separated from the vector by passing them through a Microcon 100 filter, as described above.
  • the vector will remain on the filter.
  • the vector thus does not only act as a provider for the type IIs recognition sites, but also helps to discriminate between un-ligated, ligated and released double-stranded DNAs and/or released double-stranded DNA fragments.
  • gel electrophoresis can be used instead of centrifugation for separation of the vector, un-ligated, ligated and released double-stranded DNAs and/or released double-stranded DNA fragments.
  • the released double-stranded DNA fragments comprising cohesive ends are then used for assembling or recombining, as is described below.
  • the ligation products are transformed into a host cell and propagated prior to further manipulation with type IIs restriction endonuclease. Appropriate host cells are described further below.
  • the double-stranded DNA fragments are digested with one or more type IIs restriction enzymes without first being subcloned into a vector.
  • type IIs restriction endonuclease(s) cut at recognition site(s) located within the double-stranded DNA fragments and generate cohesive ends.
  • the double-stranded DNA population is digested with one or more type IIs restriction enzymes without first being fragmented as described above.
  • type IIs restriction endonuclease(s) cut at recognition site(s) located within the double-stranded DNA population and generate cohesive ends.
  • the use of more than one type IIs restriction enzyme serves simultaneously to fragment the double-stranded DNA population and to generate cohesive ends.
  • an appropriate type IIs restriction endonuclease may be chosen for digestion to generate double-stranded DNA fragments of a desired size and comprising cohesive end(s).
  • an adaptor comprising a recognition sequence for a type IIs restriction endonuclease is ligated to both ends of the first double-stranded DNA fragments.
  • two complementary oligonucleotides comprising a recognition sequence for a type IIs restriction endonuclease are synthesized and hybridized to each other, thereby forming a double-stranded "type IIs adaptor", which is then ligated to the DNA fragment(s).
  • Methods for the chemical synthesis of oligonucleotides are known in the art and as such are not presented herein.
  • the recognition sequences for type IIs restriction endonucleases can be found, e.g., in Roberts and Macelis, supra.
  • the DNA sequences of these type IIs adaptors are designed in such a way that the cleavage reaction is directed toward sequences located within the double-stranded DNA fragments.
  • the type IIs restriction endonuclease whose recognition sequence is provided by the type IIs adaptor is added and the DNA is digested, thereby generating first double-stranded DNA fragments and second double-stranded DNA fragments comprising cohesive ends.
  • the cohesive ends generated on a first double-stranded DNA fragment have complementation to the cohesive ends generated on a second double-stranded DNA fragment.
  • an adaptor comprising a recognition sequence for a type IIs restriction endonuclease is ligated to both ends of the double-stranded DNAs comprised within the double-stranded DNA population, i.e. the double-stranded DNAs are not fragmented prior to the ligation of the type IIs adaptor.
  • the type IIs adaptors are added to both ends of the double-stranded DNA fragment or double-stranded DNA simultaneously.
  • both ends of the double-stranded DNA are accessible for ligation of the type IIs adaptors, e.g., by being blunt-ended.
  • the type IIs adaptors ligated to the ends of the respective DNAs can be identical.
  • the type IIs adaptors are added to both ends of the double-stranded DNA fragment or double-stranded DNA sequentially.
  • initially usually only one end of the double-stranded DNA is accessible for ligation of the type IIs adaptor (the 'first accessible end'), e.g., by being blunt-ended.
  • the other end either comprises a cohesive end or is protected by other DNA, e.g., vector DNA.
  • the other end is prepared for the second ligation, i.e. is made blunt-ended.
  • a second type IIs adaptor is ligated to the second accessible end.
  • the type IIs adaptors ligated to the ends of the respective DNAs can be identical or different.
  • the type IIs adaptor of the invention may be labeled.
  • By "labeled" herein is meant that a compound (such as a type IIs adaptor) has at least one element, isotope or chemical compound attached to enable the detection of the compound.
  • the labels may be incorporated into the compound at any position.
  • the compound is either directly or indirectly labeled with a label, which provides a detectable signal, e.g. radioisotope, fluorescers, colored dyes, enzyme, antibodies, particles such as magnetic particles, chemiluminescers, or specific binding molecules, etc.
  • Specific binding molecules include pairs, such as biotin and streptavidin, digoxin and antidigoxin etc.
  • the complementary member would normally be labeled with a molecule which provides for detection, in accordance with known procedures, as outlined above.
  • the label can directly or indirectly provide a detectable signal.
  • a label attached to a type IIs adaptor can be used to purify the type IIs adaptor away from the digested DNA fragments.
  • the single-stranded ends are generated using an exonuclease.
  • Exonucleases are commercially available and include, but are not limited to, λ exonuclease, bacteriophage T7 gene 6 exonuclease, Bal 31 nuclease, and exonuclease III [see Sambrook et al., Molecular Cloning: A Laboratory Manual (2nd ed.) Vol. 1-3, Cold Spring Harbor Laboratory Press, New York (1989); Brown, Molecular Biology LabFax, BIOS Scientific Publishers Limited; Information Press Ltd, Oxford, UK (1991)].
  • the exonuclease is added to the double-stranded DNA fragments and is incubated, according to the recommendations of the supplier, under conditions sufficient for the successive removal of nucleotides from the double-stranded DNA fragments, thereby generating single-stranded or cohesive ends.
  • the single-stranded ends are generated using a 5'-3' exonuclease, thereby generating protruding 3'- cohesive ends. Protruding 3'- ends ensure that the DNA fragments are not modified by DNA polymerase or ligase until they hybridize with a complementary sequence.
  • the single-stranded ends are generated using λ exonuclease. λ exonuclease is 10-100 times more active with double-stranded DNA (blunt-ended or with a 3'-overhang) than with single-stranded DNA, but is inefficient with double-stranded DNA with a 5'-overhang.
  • λ exonuclease catalyzes the processive, stepwise release of 5'-mononucleotides from the 5'-ends of double-stranded DNA.
  • the preferred substrate is double-stranded DNA with a terminal 5'- phosphate.
  • Appropriate reaction conditions following published protocols and/or protocols provided by the supplier of this commercially available enzyme are used to generate 3'- cohesive ends on the double-stranded DNAs and double-stranded DNA fragments [Little, in Gene Amplification and Analysis: structural Analysis of Nucleic Acids (J.G. Chirikjian and T.S.
  • the single-stranded ends are generated using the bacteriophage T7 gene 6 exonuclease.
  • T7 gene 6 exonuclease is a double-strand-specific 5'-3' exonuclease that removes mononucleotides from both 5' termini of the two strands of linear DNA.
  • Appropriate reaction conditions following published protocols and/or protocols provided by the supplier of this commercially available enzyme are used to generate 3'- cohesive ends on the double-stranded DNAs and double-stranded DNA fragments [Roberts et al., Biochemistry 21(23):6000-5 (1982); Brantley and Beer, Gene Anal. Tech. 6(4):75-8 (1989) ].
  • the reaction conditions for the exonuclease treatment of double-stranded DNA and/or double-stranded DNA fragments can be varied to allow the generation of single-stranded ends of a defined length.
  • Single-stranded ends of various lengths can be generated by changing one or more of the following reaction conditions: (i) concentration of exonuclease, (ii) ratio of exonuclease vs. substrate, (iii) incubation time, (iv) incubation temperature, (v) salt concentration.
  • a single-stranded end comprises 2-300 nucleotides, that is, 2-300 nucleotides of the opposite strand are successively removed by the exonuclease. More preferably, a single-stranded end comprises 2-100 nucleotides, more preferably, 2-60, 2-40 or 2-20 nucleotides, and more preferably, a single-stranded end or cohesive end comprises 2-10 nucleotides, most preferably, about 10.
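  • The 5'-3' exonuclease treatment described above can be sketched as a simple string model. This is only an illustration under stated assumptions: strands are plain strings written 5'->3', the starting duplex is blunt-ended, both 5' ends are shortened by the same number of nucleotides, and the helper names (`reverse_complement`, `resect_5_to_3`) are hypothetical, not part of the described method.

```python
# Illustrative model only: strands are plain strings written 5'->3'.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def resect_5_to_3(top: str, n: int) -> dict:
    """Remove n nucleotides from the 5' end of each strand of a blunt duplex.

    Hypothetical helper: returns the shortened strands and the two
    single-stranded 3' ends that are left protruding.
    """
    if n < 0 or 2 * n >= len(top):
        raise ValueError("n must leave a double-stranded core")
    bottom = reverse_complement(top)
    return {
        "top": top[n:],                                # top strand minus its 5' end
        "bottom": bottom[n:],                          # bottom strand minus its 5' end
        "top_3prime_end": top[len(top) - n:],          # unpaired 3' end of the top strand
        "bottom_3prime_end": bottom[len(bottom) - n:],  # unpaired 3' end of the bottom strand
    }


ends = resect_5_to_3("ATGCCGTAAGCTTGCA", 4)
print(ends["top_3prime_end"], ends["bottom_3prime_end"])
```

  • In this model the parameter `n` stands in for the combined effect of enzyme concentration, incubation time, temperature and salt, which in practice determine how many nucleotides are removed.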
  • Cohesive ends comprising e.g., 10 nucleotides statistically occur only once in a million, thereby making the method of the invention suitable for the recombination of entire chromosomes.
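  • The "once in a million" figure follows from counting sequences: with four bases there are 4^10 = 1,048,576 distinct 10-nucleotide ends. A minimal sketch of that arithmetic, assuming equal and independent base frequencies (an idealization; the function names are hypothetical):

```python
def distinct_end_sequences(length: int) -> int:
    """Number of distinct sequences of the given length over A, C, G, T."""
    return 4 ** length


def chance_of_random_match(length: int) -> float:
    """Chance that a random end of this length equals one given sequence,
    assuming equal, independent base frequencies."""
    return 1.0 / distinct_end_sequences(length)


for n in (2, 4, 8, 10, 20):
    print(n, distinct_end_sequences(n), chance_of_random_match(n))
```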
  • Assembly and recombination of DNA fragments according to the method of the invention is preferably based upon complementarity of overhanging single-stranded ends that are generated on said double-stranded DNA fragments. In one embodiment, complementation does not need to be exact for hybridization.
  • the terms "complementary" or "complementarity", or grammatical equivalents thereof, as used herein, refer to the natural binding of polynucleotides under permissive salt and temperature conditions by base-pairing. Complementarity between two single-stranded nucleic acids may be "partial", in which only some of the nucleic acid bases bind, or it may be complete when total complementarity exists between the single-stranded nucleic acids.
  • the invention provides a method for the generation of recombined double-stranded
  • the recombined nucleic acid has a sequence which is different from an initial DNA molecule prior to recombination.
  • the sequence may differ by having at least one section of sequence replaced by a fragment of a variant or homolog in accordance with the methods provided herein.
  • the sequence may differ by having sections within one DNA molecule rearranged in a different order.
  • the product encoded by the recombined nucleic acid retains the function of the wild type protein, such as catalytic activity, but has an altered property such as further discussed below.
  • a chimeric nucleic acid or protein as used herein refers to any sequence which has been manipulated to contain at least a portion, ranging from at least one residue to as many as all but one residue, of another molecule. Generally, at least one random fragment has been incorporated.
  • the term "assembly", or grammatical equivalents thereof, herein means combining one or more nucleic acid molecules to form one contiguous nucleic acid molecule.
  • "Recombination", or forming a "recombined" nucleic acid, is generally the reassortment of sections of nucleic acid sequences between one, or preferably at least two, nucleic acid molecules having sequences which differ from each other.
  • Assembly is based on the annealing of cohesive ends between nucleic acid molecules.
  • the cohesive ends anneal based on substantial complementation.
  • substantial complementation means that the ends have at least partial to complete complementation such that they can anneal under the selected reaction conditions.
  • the reaction conditions may include further polymerase or ligation reactions. Generally, such reactions can be adjusted to favor hybridization and do not require temperature conditions such as those required in PCR.
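  • The notion of partial versus complete complementation between cohesive ends can be illustrated with a small check. This is a sketch under assumptions: both ends are written 5'->3' and have equal length, only Watson-Crick pairs are counted, and the `min_fraction` threshold is a hypothetical stand-in for "can anneal under the selected reaction conditions", which the text does not quantify.

```python
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def paired_positions(end_a: str, end_b: str) -> int:
    """Count Watson-Crick pairs when two equal-length single-stranded ends
    (both written 5'->3') are aligned antiparallel."""
    if len(end_a) != len(end_b):
        raise ValueError("this sketch only compares ends of equal length")
    return sum(1 for x, y in zip(end_a, reversed(end_b)) if PAIRS.get(x) == y)


def can_anneal(end_a: str, end_b: str, min_fraction: float = 1.0) -> bool:
    """Hypothetical rule: anneal if at least min_fraction of positions pair."""
    return paired_positions(end_a, end_b) >= min_fraction * len(end_a)


print(can_anneal("AAGC", "GCTT"))        # complete complementation
print(can_anneal("AAAA", "TTTA", 0.75))  # partial complementation, relaxed threshold
```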
  • novel polynucleotides may encode useful proteins, such as novel receptors, ligands, antibodies and enzymes.
  • These novel polynucleotides may also comprise hybrid nucleic acids, wherein, for example, 5' untranslated regions of genes, 3' untranslated regions of genes, introns, exons, promoter regions, enhancer regions and other regulatory sequences for gene expression, such as dominant control regions, are recombined.
  • the double-stranded DNA fragments comprising single-stranded ends are assembled based on the complementarity of their respective cohesive ends.
  • the lowest level of identity required for recombination of homologous sequences is determined by the size of the cohesive ends. Identity can range from 1 to 8 nucleotides (1, 2, 3, 4, 5, 6, 7 or 8) and assists in the assembly and recombination of sequences. Cohesive ends of 1-8 nucleotides are provided on DNA fragments treated with type IIs restriction endonucleases. For assembly of genes, large gene fragments, viral genomes or entire operons, DNA fragments comprising longer cohesive ends are preferred. DNA fragments with longer cohesive ends are preferably generated using exonuclease treatment of DNA.
  • the methods of the invention provide for the recombination of DNA fragments ranging from 50-100 bp to several Mbp.
  • the methods of the invention are particularly useful for the recombination of very large DNA sequences, when conventional cloning protocols fail. Assembly of large DNA sequences using, e.g., the method wherein the cohesive ends are generated by exonuclease can substitute for restriction endonuclease-based DNA manipulations, for example when these large DNA sequences lack convenient or any restriction sites and PCR assembly becomes very inefficient.
  • Assembly occurs by contacting a single-stranded end on a first double-stranded DNA fragment with a single-stranded end on a second double-stranded DNA fragment.
  • the assembly can also be done using double-stranded DNA with cohesive ends that have not been fragmented. Only those cohesive ends having regions of at least some homology with other cohesive ends will assemble into a recombined nucleic acid.
  • the assembly step is performed in a reaction mixture which only includes randomly fragmented DNA.
  • the assembly step in making the recombined nucleic acid does not contain extraneous DNA such as vector DNA, particularly DNA that has been cut at unique restriction sites.
  • the methods include the identification of linear recombined nucleic acids comprising assembled randomly fragmented pieces of double-stranded DNA, and which excludes other types of DNA.
  • once the recombined nucleic acid is formed, it can be cloned into a vector or amplified, etc.
  • non-randomly fragmented nucleic acids can be included in the reaction mixture with the randomly fragmented nucleic acids.
  • two or more fragments are assembled, or preferably at least one gene is assembled or recombined.
  • the method comprises identifying the recombined nucleic acids which have multiple random fragments assembled together or contiguously.
  • the recombined nucleic acids having more than one random fragment assembled to another random fragment are separated from other assembly products and further utilized.
  • Such utilizations include the production of recombined proteins and their use in screening assays.
  • such embodiments include an increase in the ratio of randomly fragmented DNA to non-randomly fragmented DNA or vector, so as to increase the likelihood that two or more inserts will assemble with each other or contiguously.
  • the assembly reaction does not require any prior denaturation of the population of DNA fragments, as used by current methods known to the skilled artisan.
  • the assembly reaction can proceed at temperatures wherein the DNA fragments remain double-stranded, i.e., are not separated into two strands.
  • the assembly is performed at about 37°C. In one aspect of this embodiment assembly is performed at temperatures lower than 37°C.
  • the assembly reaction is accelerated by the addition of a volume excluder, such as polyethylene glycol (PEG) or other volume excluders known in the art.
  • the concentration of PEG is preferably from 0% to about 30%, more preferably from about 5% to about 20% and most preferably from about 10% to about 15%.
  • the assembly reaction is accelerated by the addition of a salt, including, but not limited to, sodium chloride, potassium chloride or ammonium sulphate.
  • the salt concentration is from 0 mM to about 2500 mM, preferably from about 0 mM to 250 mM, more preferably from about 10 mM to about 200 mM and most preferably from about 10 mM to about 100 mM.
  • the recombined nucleic acid comprising a gap is incubated with DNA polymerase and dNTPs (i.e., dATP, dCTP, dGTP, dTTP) under conditions sufficient for closing the gap, that is the missing nucleotides are synthesized by the DNA polymerase.
  • the DNA polymerase which can be employed herein may be any enzyme known in the art that can catalyze a DNA chain extending reaction using as a template the sequence of an existing strand.
  • DNA polymerases include, but are not limited to, DNA polymerase I (Kornberg polymerase), DNA polymerase I (Klenow fragment), T4 DNA polymerase, T7 DNA polymerase, Taq DNA polymerase, micrococcal DNA polymerase, etc. After DNA synthesis, the remaining nick may be treated with DNA ligase using standard methods. DNA polymerase and DNA ligase may be added simultaneously.
  • the steps of the methods provided herein may constitute a cycle which favors direction toward desirable mutations leading to desirable traits or phenotypes.
  • the recombined nucleic acid may be cloned into a vector, propagated and screened for a species or first subpopulation with a desired property. This results in the identification and isolation of, or enrichment for, a recombined nucleic acid encoding a polypeptide that has acquired a desired property.
  • nucleic acid sequences are recombined at the same time.
  • any number of different nucleic acids may be assembled or recombined at the same time. This is advantageous because a large number of different variants can be made rapidly without iterative procedures.
  • the steps of the method are repeated at least one time using the newly generated first subpopulation of chimeric DNAs as the starting material.
  • the cycle is repeated at least 2 times, more preferably up to 5 times, more preferably up to 10 times, and most preferably up to 100 times or more.
  • the chimeric DNA or the full-length gene generated according to one of the methods described above is amplified.
  • the terms "amplification" or "amplify" or grammatical equivalents thereof, as used herein, refer to the production of additional copies of a nucleic acid sequence, which is generally carried out using the polymerase chain reaction (PCR).
  • PCR technologies are well known in the art (e.g., see Dieffenbach and Dveksler in PCR Primer, A Laboratory Manual, Cold Spring Harbor Press, Princeton, N.Y.).
  • the first subpopulation of chimeric DNA is subjected to reiterated assembly without prior cloning into a vector, propagation or screening to identify a species with a desired property.
  • the first subpopulation of chimeric DNA is cloned into a vector, propagated and screened to identify a species or first subpopulation with a desired property prior to subjecting the first subpopulation to reiterated assembly or recombination. After the second round of assembly, a second subpopulation is obtained that may be screened for the same property or for a different property.
  • the invention provides chimeric DNAs and chimeric DNAs encoding variant polypeptides.
  • the chimeric DNA and the variant polypeptide preferably have at least one property which differs from the same property of the corresponding naturally occurring polynucleotide or corresponding naturally occurring polypeptide.
  • the property of the chimeric DNA or of the variant polypeptide is the result of assembly according to the present invention or assembly and prior mutagenesis.
  • a property of a polynucleotide refers to any characteristic or attribute of a polynucleotide that can be selected or detected. These properties include, but are not limited to, a property affecting binding to a polypeptide, a property conferred on a cell comprising a particular polynucleotide, a property affecting gene transcription (e.g., promoter strength, promoter recognition, promoter regulation, enhancer function), a property affecting RNA processing (e.g., RNA splicing, RNA stability, RNA conformation, and post-transcriptional modification), a property affecting translation (e.g., level, regulation, binding of mRNA to ribosomal proteins, post-translational modification).
  • a property of a polypeptide refers to any characteristic or attribute of a polypeptide that can be selected or detected. These properties include, but are not limited to, oxidative stability, substrate specificity, catalytic activity, thermal stability, alkaline stability, pH activity profile, resistance to proteolytic degradation, Km, kcat, kcat/Km ratio, protein folding, inducing an immune response, ability to bind to a ligand, ability to bind to a receptor, ability to be secreted, ability to be displayed on the surface of a cell, ability to oligomerize, ability to signal, ability to stimulate cell proliferation, ability to inhibit cell proliferation, ability to induce apoptosis, ability to be modified by phosphorylation or glycosylation, ability to treat disease.
  • the term "screening" has its usual meaning in the art and is, in general a multi-step process.
  • a recombined nucleic acid or variant polypeptide is provided in the first step.
  • a property of the nucleic acid or variant polypeptide is determined in the second step.
  • the determined property is compared to a property of the corresponding naturally occurring polynucleotide, to the property of the corresponding naturally occurring polypeptide or to the property of the starting material for the generation of the recombined nucleic acid.
  • the latter may be a synthetic DNA.
  • a change in any of the above-listed properties, when comparing the property of a recombined nucleic acid or protein to the property of a naturally occurring nucleic acid or naturally occurring protein, is preferably at least a 20%, more preferably at least a 50%, and more preferably at least a 2-fold increase or decrease. Generally, any change which can be detected is considered as a change in property.
  • a change in substrate specificity is defined as a difference between the kcat/Km ratio of the naturally occurring protein and that of the variant thereof.
  • the kcat/Km ratio is generally a measure of catalytic efficiency.
  • the objective will be to generate variants of naturally occurring proteins with a greater (numerically larger) kcat/Km ratio for a given substrate when compared to that of the naturally occurring protein, thereby enabling the use of the protein to more efficiently act on a target substrate.
  • An increase in kcat/Km ratio for one substrate may be accompanied by a reduction in kcat/Km ratio for another substrate.
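  • The comparison of kcat/Km ratios can be sketched as follows. The numerical values are invented for illustration only and do not come from the source; the function names are hypothetical, and units are assumed to be consistent between the variant and the naturally occurring protein.

```python
def catalytic_efficiency(kcat: float, km: float) -> float:
    """kcat/Km ratio; kcat and Km must use consistent units."""
    if km <= 0:
        raise ValueError("Km must be positive")
    return kcat / km


def fold_change(variant: float, reference: float) -> float:
    """Ratio of the variant's value to the reference value."""
    return variant / reference


# Illustrative numbers only (not measured data).
wild_type = catalytic_efficiency(kcat=50.0, km=0.5)
variant = catalytic_efficiency(kcat=80.0, km=0.4)
print(wild_type, variant, fold_change(variant, wild_type))
```

  • Running the same comparison for a second substrate would show whether a gain on one substrate is accompanied by a loss on another.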
  • a change in oxidative stability can be evidenced, for example, by at least about 20%, more preferably at least 50% increase of enzyme activity when exposed to various oxidizing conditions.
  • oxidizing conditions include, but are not limited to exposure of the protein to the organic oxidant diperdodecanoic acid (DPDA). Oxidative stability is measured by known procedures.
  • alkaline stability is evidenced by at least about a 5% or greater increase or decrease (preferably increase) in the half life of the enzymatic activity of a variant of a naturally occurring protein when compared to that of the naturally occurring protein.
  • alkaline stability can be measured as a function of autoproteolytic degradation of subtilisin at alkaline pH, e.g., 0.1 M sodium phosphate, pH 12 at 25°C or 30°C.
  • alkaline stability is measured by known procedures.
  • thermal stability is evidenced by at least about a 5% or greater increase or decrease (preferably increase) in the half life of the catalytic activity of a variant of naturally occurring protein when exposed to a relatively high temperature and neutral pH as compared to that of the naturally occurring protein.
  • thermal stability can be measured as a function of autoproteolytic degradation of subtilisin at elevated temperatures and neutral pH, e.g., 2mM calcium chloride, 50 mM MOPS, pH 7.0 at 59°C.
  • thermal stability is measured by known procedures.
  • Receptor variants, for example, are experimentally tested and validated in in vivo and in vitro assays. Suitable assays include, but are not limited to, e.g., examining their binding affinity to natural ligands and to high affinity agonists and/or antagonists. In addition to cell-free biochemical affinity tests, quantitative comparisons are made comparing kinetic and equilibrium binding constants for the natural ligand to the naturally occurring receptor and to the receptor variants. The kinetic association rate (Kon) and dissociation rate (Koff), and the equilibrium binding constants (Kd) can be determined using surface plasmon resonance on a BIAcore instrument following the standard procedure in the literature [Pearce et al., Biochemistry 38:81-89 (1999)].
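  • For a simple 1:1 binding model, the equilibrium dissociation constant is the ratio of the dissociation rate to the association rate, Kd = Koff/Kon. The sketch below assumes that model; the rate values are invented for illustration and the function names are hypothetical.

```python
def equilibrium_dissociation_constant(k_on: float, k_off: float) -> float:
    """Kd = k_off / k_on for a simple 1:1 binding model.

    k_on in 1/(M*s), k_off in 1/s, result in M.
    """
    if k_on <= 0:
        raise ValueError("k_on must be positive")
    return k_off / k_on


def affinity_fold_change(kd_reference: float, kd_variant: float) -> float:
    """Values above 1 mean the variant binds more tightly (lower Kd)."""
    return kd_reference / kd_variant


# Illustrative numbers only (not measured data).
kd_natural = equilibrium_dissociation_constant(k_on=1e5, k_off=1e-3)
kd_variant = equilibrium_dissociation_constant(k_on=2e5, k_off=5e-4)
print(kd_natural, kd_variant, affinity_fold_change(kd_natural, kd_variant))
```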
  • the binding constant between a natural ligand and its corresponding naturally occurring receptor is well documented in the literature. Comparisons with the corresponding naturally occurring receptors are made in order to evaluate the sensitivity and specificity of the receptor variants.
  • binding affinity to natural ligands and agonists is expected to increase relative to the naturally occurring receptor, while antagonist affinity should decrease.
  • Receptor variants with higher affinity to antagonists relative to the non naturally occurring receptors may also be generated by the methods of the invention.
  • ligand variants, for example, are experimentally tested and validated in in vivo and in vitro assays. Suitable assays include, but are not limited to, e.g., examining their binding affinity to natural receptors and to high affinity agonists and/or antagonists. In addition to cell-free biochemical affinity tests, quantitative comparisons are made comparing kinetic and equilibrium binding constants for the natural receptor to the naturally occurring ligand and to the ligand variants.
  • the kinetic association rate (Kon) and dissociation rate (Koff), and the equilibrium binding constants (Kd) can be determined using surface plasmon resonance on a BIAcore instrument following the standard procedure in the literature [Pearce et al., Biochemistry 38:81-89 (1999)].
  • the binding constant between a natural receptor and its corresponding naturally occurring ligand is well documented in the literature. Comparisons with the corresponding naturally occurring ligands are made in order to evaluate the sensitivity and specificity of the ligand variants.
  • binding affinity to natural receptors and agonists is expected to increase relative to the naturally occurring ligand, while antagonist affinity should decrease.
  • Ligand variants with higher affinity to antagonists relative to the non naturally occurring ligands may also be generated by the methods of the invention.
  • the methods of the invention are also useful for the specific deletion of a gene or nucleic acid.
  • a DNA comprising a gene of interest that will be deleted is provided.
  • the gene may be flanked by known nucleotide sequences.
  • the DNA comprising the gene of interest is randomly fragmented.
  • the gene encodes a full length protein, or a desired segment.
  • single-stranded overhanging ends can be generated.
  • at least three fragments are generated, comprising the gene and nucleic acids on opposing flanking ends of said gene.
  • the gene is randomly fragmented into at least two fragments, preferably more.
  • Some of the DNA fragments will have cohesive ends corresponding to the sequence that flank the gene of interest.
  • Two oligonucleotides with overlapping, complementary sequences are designed and synthesized. When annealed to each other, these oligonucleotides comprise cohesive ends corresponding to the sequences that flank the gene of interest. The same sequences will be represented in the population of DNA fragments generated from the starting DNA.
  • the sequences of the oligonucleotide that hybridize to each other and form a double-stranded region may be completely random or may comprise a specific sequence which is not the gene or nucleic acid to be deleted.
  • the oligonucleotides, separately or annealed, are added to the DNA fragments and the mix is treated with DNA ligase and optionally with DNA polymerase. Some assembled chimeric DNAs will have flanking sequences joined by the sequences incorporated into the oligonucleotide. The gene of interest is deleted.
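  • The intended outcome of the deletion procedure can be pictured at the sequence level. The sketch below only does the bookkeeping of the desired product (flanking sequences joined through the double-stranded region of the annealed oligonucleotides); it does not model fragmentation, annealing or ligation, and the function name and sequences are hypothetical.

```python
def delete_gene(dna: str, gene: str, linker: str = "") -> str:
    """Return the desired product: left flank + linker + right flank.

    `linker` stands for the double-stranded region contributed by the
    annealed oligonucleotides (it may be empty in this sketch).
    """
    start = dna.find(gene)
    if start < 0:
        raise ValueError("gene not found in the starting DNA")
    left_flank = dna[:start]
    right_flank = dna[start + len(gene):]
    return left_flank + linker + right_flank


# Toy sequences for illustration only.
print(delete_gene("AAACCCGGGTTT", "CCCGGG", linker="TA"))
```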
  • the methods of the invention are also useful for the inversion of a gene or nucleic acid relative to surrounding genes or sequences.
  • a DNA comprising a gene of interest that will be inverted, is provided.
  • the gene is flanked by known nucleotide sequences.
  • the gene of interest is amplified by PCR using primers that comprise additional nucleotides, for example 5', and correspond to the gene flanking sequences.
  • additional nucleotides refer to the nucleotides that do not anneal to the template, but rather overhang. Generally, the additional nucleotides are as long as preferred for a cohesive end as discussed above. Primers are known in the art, and generally are at least 7 nucleotides long.
  • the nucleotides are not natural extensions such that the nucleic acids would be assembled back into the initial sequence.
  • Natural extensions refer to extending the sequence, such as a gene, to have its flanking sequence in the order and orientation that would be found prior to manipulation. Rather, the additional nucleic acids, for example, 5', will be complementary to, for example, the 3' end of the flanking sequence.
  • the original DNA comprising the gene of interest is fragmented, preferably randomly, and treated so as to generate single-stranded overhanging ends.
  • the amplified DNA fragment comprising the gene of interest is also treated to provide single-stranded ends.
  • Some of the single-stranded ends, generated on the fragments derived from original DNA and those generated on the amplified gene fragment are complementary to each other.
  • the fragments are mixed, annealed and ligated.
  • Some assembled chimeric DNAs will have the gene of interest in inverted orientation.
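  • At the sequence level, an inverted gene appears in the top strand as its reverse complement between unchanged flanking sequences. The sketch below only illustrates that desired product; it does not model the PCR, exonuclease or ligation steps, and the names and toy sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def invert_gene(dna: str, gene: str) -> str:
    """Return the desired product: the gene replaced in place by its
    reverse complement, with both flanks unchanged."""
    start = dna.find(gene)
    if start < 0:
        raise ValueError("gene not found in the starting DNA")
    left_flank = dna[:start]
    right_flank = dna[start + len(gene):]
    return left_flank + reverse_complement(gene) + right_flank


# Toy sequences for illustration only.
print(invert_gene("AAACCGTTT", "CCG"))
```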
  • the methods of the invention are also useful for the insertion of a gene or a nucleic acid into a known or unknown sequence.
  • a DNA comprising a known sequence into which a gene of interest will be inserted, is provided.
  • the gene of interest is amplified by PCR using primers comprising additional nucleotides, for example, on their 5' ends that are complementary to the insertion sequence.
  • the original DNA comprising the gene of interest is fragmented and treated with exonuclease to provide single-stranded overhanging ends (e.g., 3'), thereby providing at least two DNA fragments, 'I' and 'II'.
  • the amplified DNA fragment comprising the gene of interest is also treated with exonuclease to provide single-stranded ends. Some of the single-stranded ends, generated on the fragments derived from original DNA and those generated on the amplified gene fragment are complementary to each other. The fragments are mixed, annealed and ligated. Some assembled chimeric DNAs will have the gene of interest inserted into the specified sequence. In one embodiment, the additional nucleotides added to the PCR primers are random and as such the amplified gene can be inserted into any site.
  • At least one double-stranded DNA molecule encodes a protein.
  • By "protein" herein is meant at least two covalently attached amino acids, which includes proteins, polypeptides, oligopeptides and peptides.
  • the protein may be a naturally occurring protein, a variant of a naturally occurring protein or a synthetic protein.
  • the protein may be made up of naturally occurring amino acids and peptide bonds, or synthetic peptidomimetic structures, generally depending on the method of synthesis.
  • "amino acid" or "peptide residue", as used herein, means both naturally occurring and synthetic amino acids. For example, homo-phenylalanine, citrulline and norleucine are considered amino acids for the purposes of the invention.
  • Amino acid also includes imino acid residues such as proline and hydroxyproline.
  • the side chains may be in either the (R) or the (S) configuration.
  • the amino acids are in the (S) or L- configuration.
  • Stereoisomers of the twenty conventional amino acids, unnatural amino acids such as α,α-disubstituted amino acids, N-alkyl amino acids, lactic acid, and other unconventional amino acids may also be suitable components for proteins of the present invention.
  • Examples of unconventional amino acids include, but are not limited to: 4-hydroxyproline, γ-carboxyglutamate, ε-N,N,N-trimethyllysine, ε-N-acetyllysine, O-phosphoserine, N-acetylserine, N-formylmethionine, 3-methylhistidine, 5-hydroxylysine, ω-N-methylarginine, and other similar amino acids and imino acids. If non-naturally occurring side chains are used, non-amino acid substituents may be used, for example to prevent or retard in vivo degradations.
  • Proteins including non-naturally occurring amino acids may be synthesized or, in some cases, made recombinantly; see van Hest et al., FEBS Lett. 428(1-2):68-70 (1998); and Tang et al., Abstr. Pap. Am. Chem. S218:U138-U138 Part 2 (1999), both of which are expressly incorporated by reference herein.
  • a recombined or variant protein is distinguished from a naturally occurring protein by at least one or more characteristics.
  • the recombined or variant protein may be isolated or purified away from some or all of the proteins and compounds with which it is normally associated in its wild type host, and thus may be substantially pure.
  • an isolated recombined or variant protein is unaccompanied by at least some of the material with which it is normally associated in its natural state, preferably constituting at least about 0.5%, more preferably at least about 5% by weight of the total protein in a given sample.
  • a substantially pure protein comprises at least about 75% by weight of the total protein, with at least about 80% being preferred, and at least about 90% being particularly preferred.
  • substantially pure means an object species (such as a protein or nucleic acid) is the predominant species present (i.e., on a molar basis it is more abundant than any other individual species in a composition), and preferably a substantially purified fraction is a composition, wherein the object species comprises at least about 50% (on a molar basis) of all macromolecular species present. Generally, a substantially pure composition will comprise more than about 80 to 90 percent of all macromolecular species present in the composition. Isolated nucleic acids and proteins are those taken from their native environment. Most preferably, the object species is purified to essential homogeneity (macromolecular contaminant species cannot be detected in the composition by conventional detection methods), wherein the composition consists essentially of a single macromolecular species.
  • proteins whose amino acid sequence is altered by one or more amino acids when compared to the sequence of a naturally occurring protein are also included within this definition.
  • the definition also includes the production of a protein from one organism in a different organism or host cell.
  • the recombined or variant protein may be made at a significantly higher concentration than is normally seen, through the use of an inducible promoter or high expression promoter, such that the recombined or variant protein is made at increased concentration levels.
  • all of the recombined or variant proteins outlined herein are in a form not normally found in nature, as they may contain amino acid substitutions, insertions and deletions, with substitutions being preferred.
  • the nucleic acids may be from any number of eukaryotic or prokaryotic organisms or from archaebacteria. Particularly preferred are nucleic acids from mammals. Suitable mammals include, but are not limited to, rodents (rats, mice, hamsters, guinea pigs, etc.), primates, farm animals (including sheep, goats, pigs, cows, horses, etc) and in the most preferred embodiment, from humans.
  • eukaryotic organisms include plant cells, such as maize, rice, wheat, cotton, soybean, sugarcane, tobacco, and arabidopsis; fish, algae, yeast, such as Saccharomyces cerevisiae; Aspergillus and other filamentous fungi; and tissue culture cells from avian or mammalian origins.
  • nucleic acids from prokaryotic organisms include gram negative organisms and gram positive organisms. Specifically included are Enterobacteriaceae bacteria, pseudomonas, micrococcus, corynebacteria, bacillus, lactobacilli, streptomyces, and agrobacterium.
  • Polynucleotides encoding proteins and enzymes isolated from extremophilic organisms, including, but not limited to, hyperthermophiles, psychrophiles, psychrotrophs, halophiles, barophiles and acidophiles, are particularly preferred.
  • Such enzymes may function at temperatures above 100°C in terrestrial hot springs and deep sea thermal vents, at temperatures below 0°C in arctic waters, in the saturated salt environment of the Dead Sea, at pH values at around 0 in coal deposits and geothermal sulfur-rich springs, or at pH values greater than 11 in sewage sludge.
  • the proteins can be intracellular proteins, extracellular proteins, secreted proteins, enzymes, ligands, receptors, antibodies or portions thereof.
  • the first double-stranded DNA encodes all or a portion of an enzyme.
  • By "enzyme" herein is meant any of a group of proteins that catalyzes a chemical reaction.
  • Enzymes include, but are not limited to (i) oxidoreductases; (ii) transferases, comprising transferases transferring one-carbon groups (e.g., methyltransferases, hydroxymethyl-, formyl-, and related transferases, carboxyl- and carbamoyltransferases, amidinotransferases), transferases transferring aldehydic or ketonic residues, acyltransferases (e.g., acyltransferases, aminoacyltransferases), glycosyltransferases (e.g., hexosyltransferases, pentosyltransferases), transferases transferring alkyl or related groups, transferases transferring nitrogenous groups (e.g., aminotransferases, oximinotransferases), transferases transferring phosphorus-containing groups (e.g., phosphotransferases, pyrophosphotransferases, nucleotid
  • Peptide hydrolases include ⁇ -aminoacylpeptide hydrolase, peptidylamino-acid hydrolase, acylamino hydrolase, serine carboxypeptidase, metallocarboxy-peptidase, thiol proteinase, carboxylproteinase and metalloproteinase. Serine, metallo, thiol and acid proteases are included, as well as endo and exo-proteases.
  • the first and/or second double-stranded DNA encode a variant of an enzyme.
  • the first double-stranded DNA encodes all or a portion of a receptor.
  • By "receptor" or grammatical equivalents herein is meant a proteinaceous molecule that has an affinity for a ligand.
  • receptors include, but are not limited to, antibodies, cell membrane receptors, complex carbohydrates and glycoproteins, enzymes, and hormone receptors.
  • Cell-surface receptors appear to fall into two general classes: type 1 and type 2 receptors.
  • Type 1 receptors have generally two identical subunits associated together, either covalently or otherwise. They are essentially preformed dimers, even in the absence of ligand.
  • the type 1 receptors include the insulin receptor and the IGF (insulin like growth factor) receptor.
  • the type-2 receptors generally are in a monomeric form, and rely on binding of one ligand to each of two or more monomers, resulting in receptor oligomerization and receptor activation.
  • Type-2 receptors include the growth hormone receptor, the leptin receptor, the LDL (low density lipoprotein) receptor, the GCSF (granulocyte colony stimulating factor) receptor, the interleukin receptors including IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-11, IL-12, IL-13, IL-15, IL-17, etc., receptors, EGF (epidermal growth factor) receptor, EPO (erythropoietin) receptor, TPO (thrombopoietin) receptor, VEGF (vascular endothelial growth factor) receptor, PDGF (platelet derived growth factor; A chain and B chain) receptor, FGF (basic fibroblast growth factor) receptor, T-cell receptor, transferrin receptor, prolactin receptor, CNF (ciliary neurotrophic factor) receptor, TNF (tumor necrosis factor) receptor, Fas receptor, NGF (nerve growth
  • T cell receptors, MHC (major histocompatibility antigen) class I and class II receptors and receptors to the naturally occurring ligands, listed below.
  • the first and/or second double-stranded DNA encode a variant of a receptor.
  • the first double-stranded DNA encodes all or a portion of a ligand.
  • By "ligand" or grammatical equivalents herein is meant a proteinaceous molecule capable of binding to a receptor.
  • Ligands include, but are not limited to, cytokines IL-1ra, IL-1, IL-1a, IL-1b, IL-2, IL-3, IL-4, IL-5, IL-6, IL-8, IL-10, IFN- ⁇ , INF- ⁇ , IFN- ⁇ -2a; IFN-Q-2B, TNF- ⁇ ; CD40 ligand (chk), human obesity protein leptin, GCSF, BMP-7, CNF, GM-CSF, MCP-1, macrophage migration inhibitory factor, human glycosylation-inhibiting factor, human RANTES, human macrophage inflammatory protein 1 ⁇ , hGH, LIF, human melanoma growth stimulatory activity, neutrophil activating peptide-2, CC-chemokine MCP-3, platelet factor M2, neutrophil activating peptide 2, eotaxin, stromal cell-derived factor-1, insulin, IGF-I, IGF-II, TGF- ⁇ 1
  • the first and/or second double-stranded DNA encode a variant of a ligand.
  • the first double-stranded DNA encodes all or a portion of an antibody.
  • "antibody" or grammatical equivalents, as used herein, refers to antibodies and antibody fragments that retain the ability to bind to the epitope that the intact antibody binds, and includes polyclonal antibodies, monoclonal antibodies, chimeric antibodies, and anti-idiotype (anti-ID) antibodies.
  • the antibodies are monoclonal antibodies.
  • Antibody fragments include, but are not limited to, the complementarity-determining regions (CDRs), single-chain variable fragments (scFv), heavy chain variable region (VH), and light chain variable region (VL).
  • the first and/or second double-stranded DNA encode a variant of an antibody.
  • Information with respect to nucleic acid sequences and amino acid sequences for enzymes, receptors, ligands, and antibodies is readily available from numerous publications and several databases, such as the one from the National Center for Biotechnology Information (NCBI).
  • the expression vectors may be either self-replicating extrachromosomal vectors or vectors which integrate into a host genome. Generally, these expression vectors include transcriptional and translational regulatory nucleic acid operably linked to the nucleic acid encoding the variant protein.
  • "control sequence" or grammatical equivalents thereof, as used herein, refers to DNA sequences necessary for the expression of an operably linked coding sequence in a particular host organism.
  • the control sequences that are suitable for prokaryotes, for example, include a promoter, optionally an operator sequence, and a ribosome binding site.
  • Eukaryotic cells are known to utilize promoters, polyadenylation signals, and enhancers. It is understood that when screening for a particular property, or an alteration in properties, that the control can be a "ground zero" control. Alternatively, two proteins may be compared against one another, rather than a control.
  • control sequences are generated by using the methods described herein.
  • Nucleic acid is "operably linked" when it is placed into a functional relationship with another nucleic acid sequence.
  • DNA for a presequence or secretory leader is operably linked to DNA for a polypeptide if it is expressed as a preprotein that participates in the secretion of the polypeptide;
  • a promoter or enhancer is operably linked to a coding sequence if it affects the transcription of the sequence; or
  • a ribosome binding site is operably linked to a coding sequence if it is positioned so as to facilitate translation.
  • "operably linked" means that the nucleic acid sequences being linked are contiguous, and, in the case of a secretory leader, contiguous and in reading phase.
  • transcriptional and translational regulatory nucleic acid will generally be appropriate to the host cell used to express the fusion protein; for example, transcriptional and translational regulatory nucleic acid sequences from Bacillus are preferably used to express the fusion protein in Bacillus. Numerous types of appropriate expression vectors, and suitable regulatory sequences, are known in the art for a variety of host cells.
  • control sequences are operably linked to another nucleic acid by using the methods described herein.
  • a naturally occurring secretory sequence leads to a low level of secretion of a variant protein
  • a replacement of the naturally occurring secretory leader sequence is desired.
  • an unrelated secretory leader sequence is operably linked to a variant protein encoding nucleic acid leading to increased protein secretion.
  • any secretory leader sequence resulting in enhanced secretion of the variant protein, when compared to the secretion of the naturally occurring protein and its secretory sequence, is desired.
  • Suitable secretory leader sequences that lead to the secretion of a protein are known in the art.
  • a secretory leader sequence of a naturally occurring protein or a variant protein is removed by techniques known in the art and subsequent expression results in intracellular accumulation of the recombined protein.
  • the transcriptional and translational regulatory sequences may include, but are not limited to, promoter sequences, ribosomal binding sites, transcriptional start and stop sequences, translational start and stop sequences, and enhancer or activator sequences.
  • the regulatory sequences include a promoter and transcriptional start and stop sequences.
  • Promoter sequences encode either constitutive or inducible promoters.
  • the promoters may be either naturally occurring promoters or hybrid promoters. Hybrid promoters, which combine elements of more than one promoter, are also known in the art, and are useful in the present invention.
  • the promoters are strong promoters, allowing high expression in cells, particularly mammalian cells, such as the STAT or CMV promoter, particularly in combination with a Tet regulatory element.
  • the expression vector may comprise additional elements.
  • the expression vector may have two replication systems, thus allowing it to be maintained in two organisms, for example in mammalian or insect cells for expression and in a prokaryotic host for cloning and amplification.
  • the expression vector contains at least one sequence homologous to the host cell genome, and preferably two homologous sequences which flank the expression construct.
  • the integrating vector may be directed to a specific locus in the host cell by selecting the appropriate homologous sequence for inclusion in the vector. Constructs for integrating vectors are well known in the art.
  • the expression vector contains a selectable marker gene to allow the selection of transformed host cells.
  • Selection genes are well known in the art and will vary with the host cell used.
  • the nucleic acids are introduced into the cells, either alone or in combination with an expression vector.
  • By "introduction into" or grammatical equivalents herein is meant that the nucleic acids enter the cells in a manner suitable for subsequent expression of the nucleic acid.
  • the method of introduction is largely dictated by the targeted cell type, discussed below. Exemplary methods include CaPO4 precipitation, liposome fusion, Lipofectin®, electroporation, viral infection, etc.
  • the nucleic acids may stably integrate into the genome of the host cell, or may exist either transiently or stably in the cytoplasm (i.e. through the use of traditional plasmids, utilizing standard regulatory sequences, selection markers, etc.).
  • the proteins of the present invention are produced by culturing a host cell transformed either with an expression vector containing nucleic acid encoding the protein or with the nucleic acid encoding the protein alone, under the appropriate conditions to induce or cause expression of the protein.
  • the conditions appropriate for protein expression will vary with the choice of the expression vector and the host cell, and will be easily ascertained by one skilled in the art through routine experimentation.
  • the use of constitutive promoters in the expression vector will require optimizing the growth and proliferation of the host cell, while the use of an inducible promoter requires the appropriate growth conditions for induction.
  • the timing of the harvest is important.
  • the baculovirus used in insect cell expression systems is a lytic virus, and thus harvest time selection can be crucial for product yield.
  • Appropriate host cells include yeast, bacteria, archaebacteria, fungi, and insect and animal cells, including mammalian cells. Of particular interest are Drosophila melanogaster cells, Saccharomyces cerevisiae and other yeasts, E. coli, Bacillus subtilis, SF9 cells, C129 cells, 293 cells, Neurospora, BHK, CHO, COS, Pichia pastoris, etc.
  • the proteins are expressed in mammalian cells.
  • Mammalian expression systems are also known in the art, and include retroviral systems.
  • a mammalian promoter is any DNA sequence capable of binding mammalian RNA polymerase and initiating the downstream (3') transcription of a coding sequence for the fusion protein into mRNA.
  • a promoter will have a transcription initiating region, which is usually placed proximal to the 5' end of the coding sequence, and a TATA box, usually located 25-30 base pairs upstream of the transcription initiation site. The TATA box is thought to direct RNA polymerase II to begin RNA synthesis at the correct site.
  • A mammalian promoter will also contain an upstream promoter element (enhancer element), typically located within 100 to 200 base pairs upstream of the TATA box.
  • An upstream promoter element determines the rate at which transcription is initiated and can act in either orientation.
  • mammalian promoters are the promoters from mammalian viral genes, since the viral genes are often highly expressed and have a broad host range. Examples include the SV40 early promoter, mouse mammary tumor virus LTR promoter, adenovirus major late promoter, herpes simplex virus promoter, and the CMV promoter.
  • transcription termination and polyadenylation sequences recognized by mammalian cells are regulatory regions located 3' to the translation stop codon and thus, together with the promoter elements, flank the coding sequence.
  • the 3' terminus of the mature mRNA is formed by site-specific post-transcriptional cleavage and polyadenylation.
  • transcription terminator and polyadenylation signals include those derived from SV40.
  • Techniques include dextran-mediated transfection, calcium phosphate precipitation, polybrene mediated transfection, protoplast fusion, electroporation, viral infection, encapsulation of the polynucleotide(s) in liposomes, and direct microinjection of the DNA into nuclei.
  • the type of mammalian cells used in the present invention can vary widely.
  • any mammalian cells may be used, with mouse, rat, primate and human cells being particularly preferred, although, as will be appreciated by those in the art, modifications of the system by pseudotyping allow all eukaryotic cells to be used, preferably higher eukaryotes.
  • a screen can be set up such that the cells exhibit a selectable phenotype in the presence of a bioactive peptide.
  • cell types implicated in a wide variety of disease conditions are particularly useful, so long as a suitable screen may be designed to allow the selection of cells that exhibit an altered phenotype as a consequence of the presence of a peptide within the cell.
  • suitable mammalian cell types include, but are not limited to, tumor cells of all types (particularly melanoma, myeloid leukemia, carcinomas of the lung, breast, ovaries, colon, kidney, prostate, pancreas and testes), cardiomyocytes, endothelial cells, epithelial cells, lymphocytes (T-cell and B cell), mast cells, eosinophils, vascular intimal cells, hepatocytes, leukocytes including mononuclear leukocytes, stem cells such as haemopoietic, neural, skin, lung, kidney, liver and myocyte stem cells (for use in screening for differentiation and de-differentiation factors), osteoclasts, chondrocytes and other connective tissue cells, keratinocytes, melanocytes, liver cells, kidney cells, and adipocytes.
  • Suitable cells also include known research cells, including, but not limited to, Jurkat T cells, NIH3T3 cells, CHO, COS,
  • the cells may be additionally genetically engineered, that is, they contain exogenous nucleic acid other than the chimeric nucleic acid of the invention.
  • the proteins are expressed in bacterial systems. Bacterial expression systems are well known in the art.
  • a suitable bacterial promoter is any nucleic acid sequence capable of binding bacterial RNA polymerase and initiating the downstream (3') transcription of the coding sequence of the protein into mRNA.
  • a bacterial promoter has a transcription initiation region which is usually placed proximal to the 5' end of the coding sequence. This transcription initiation region typically includes an RNA polymerase binding site and a transcription initiation site. Sequences encoding metabolic pathway enzymes provide particularly useful promoter sequences. Examples include promoter sequences derived from sugar metabolizing enzymes, such as galactose, lactose and maltose, and sequences derived from biosynthetic enzymes such as tryptophan. Promoters from bacteriophage may also be used and are known in the art.
  • a bacterial promoter can include naturally occurring promoters of non-bacterial origin that have the ability to bind bacterial RNA polymerase and initiate transcription.
  • the ribosome binding site is called the Shine-Dalgarno (SD) sequence and includes an initiation codon and a sequence 3-9 nucleotides in length located 3-11 nucleotides upstream of the initiation codon.
  • the expression vector may also include a signal peptide sequence that provides for secretion of the expressed protein in bacteria.
  • the signal sequence typically encodes a signal peptide comprised of hydrophobic amino acids, which direct the secretion of the protein from the cell, as is well known in the art.
  • the protein is either secreted into the growth media (gram-positive bacteria) or into the periplasmic space, located between the inner and outer membrane of the cell (gram-negative bacteria).
  • bacterial secretory leader sequences operably linked to the chimeric nucleic acid, are preferred.
  • the proteins of the invention are expressed in bacteria and/or are displayed on the bacterial surface.
  • the bacterial expression vector may also include a selectable marker gene to allow for the selection of bacterial strains that have been transformed. Suitable selection genes include genes which render the bacteria resistant to drugs such as ampicillin, chloramphenicol, erythromycin, kanamycin, neomycin and tetracycline. Selectable markers also include biosynthetic genes, such as those in the histidine, tryptophan and leucine biosynthetic pathways.
  • Expression vectors for bacteria are well known in the art, and include vectors for Bacillus subtilis, E. coli, Streptococcus cremoris, and Streptomyces lividans, among others.
  • the bacterial expression vectors are transformed into bacterial host cells using techniques well known in the art, such as calcium chloride treatment, electroporation, and others.
  • proteins are produced in insect cells.
  • Expression vectors for the transformation of insect cells and in particular, baculovirus-based expression vectors, are well known in the art.
  • proteins are produced in yeast cells.
  • Yeast expression systems are well known in the art, and include expression vectors for Saccharomyces cerevisiae, Candida albicans and C. maltosa, Hansenula polymorpha, Kluyveromyces fragilis and K. lactis, Pichia guilliermondii and P. pastoris, Schizosaccharomyces pombe, and Yarrowia lipolytica.
  • Preferred promoter sequences for expression in yeast include the inducible GAL1,10 promoter, the promoters from alcohol dehydrogenase, enolase, glucokinase, glucose-6-phosphate isomerase, glyceraldehyde-3-phosphate dehydrogenase, hexokinase, phosphofructokinase, 3-phosphoglycerate mutase, pyruvate kinase, and the acid phosphatase gene.
  • Yeast selectable markers include ADE2, HIS4, LEU2, TRP1, and ALG7, which confers resistance to tunicamycin; the neomycin phosphotransferase gene, which confers resistance to G418; and the CUP1 gene, which allows yeast to grow in the presence of copper ions.
  • the proteins of the invention are expressed in yeast and/or are displayed on the yeast surface.
  • Suitable yeast expression and display systems are known in the art (Boder and Wittrup, Nat. Biotechnol. 15:553-7 (1997); Cho et al., J. Immunol. Methods 220:179-88 (1998); all of which are expressly incorporated by reference).
  • Surface display in the ciliate Tetrahymena thermophila is described by Gaertig et al. Nat. Biotechnol. 17:462-465 (1999), expressly incorporated by reference.
  • proteins are produced in viruses and/or are displayed on the surface of the viruses.
  • Expression vectors for protein expression in viruses and for display are well known in the art and commercially available (see review by Felici et al., Biotechnol. Annu. Rev. 1:149-83 (1995)). Examples include, but are not limited to, M13 (Lowman et al., Biochemistry 30:10832-10838 (1991); Matthews and Wells, Science 260:1113-1117 (1993); Stratagene); fd (Krebber et al., FEBS Lett. 377:227-231 (1995)); T7 (Novagen, Inc.); T4 (Jiang et al., Infect. Immun.
  • proteins of the invention may be further fused to other proteins, if desired, for example to increase expression or increase stability.
  • the proteins may be covalently modified.
  • One type of covalent modification includes reacting targeted amino acid residues of a protein with an organic derivatizing agent that is capable of reacting with selected side chains or the N- or C-terminal residues of a protein.
  • Derivatization with bifunctional agents is useful, for instance, for crosslinking a protein to a water-insoluble support matrix or surface for use in the method for purifying anti-protein antibodies or screening assays, as is more fully described below.
  • crosslinking agents include, e.g., 1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, N-hydroxysuccinimide esters, for example, esters with 4-azidosalicylic acid, homobifunctional imidoesters, including disuccinimidyl esters such as 3,3'-dithiobis(succinimidylpropionate), bifunctional maleimides such as bis-N-maleimido-1,8-octane and agents such as methyl-3-[(p-azidophenyl)dithio]propioimidate.
  • Other modifications include deamidation of glutaminyl and asparaginyl residues to the corresponding glutamyl and aspartyl residues, respectively, hydroxylation of proline and lysine, phosphorylation of hydroxyl groups of seryl or threonyl residues, methylation of the α-amino groups of lysine, arginine, and histidine side chains [T.E. Creighton, Proteins: Structure and Molecular Properties, W.H. Freeman & Co., San Francisco, pp. 79-86 (1983)], acetylation of the N-terminal amine, and amidation of any C-terminal carboxyl group.
  • Another type of covalent modification of the protein included within the scope of this invention comprises altering the native glycosylation pattern of the variant protein or of the corresponding naturally occurring protein. "Altering the native glycosylation pattern" is intended for purposes herein to mean deleting one or more carbohydrate moieties found in a protein, and/or adding one or more glycosylation sites that are not present in the respective protein.
  • Addition of glycosylation sites to a protein may be accomplished by altering the amino acid sequence thereof.
  • the alteration may be made, for example, by the addition of, or substitution by, one or more serine or threonine residues to the protein (for O-linked glycosylation sites).
  • the amino acid sequence may optionally be altered through changes at the DNA level, particularly by mutating the DNA encoding the protein at preselected bases such that codons are generated that will translate into the desired amino acids.
  • Removal of carbohydrate moieties present on the protein may be accomplished chemically or enzymatically or by mutational substitution of codons encoding for amino acid residues that serve as targets for glycosylation.
  • Chemical deglycosylation techniques are known in the art and described, for instance, by Hakimuddin et al., Arch. Biochem. Biophys., 259:52 (1987) and by Edge et al., Anal. Biochem., 118:131 (1981).
  • Enzymatic cleavage of carbohydrate moieties on polypeptides can be achieved by the use of a variety of endo- and exo-glycosidases as described by Thotakura et al., Meth.
  • Another type of covalent modification of a protein comprises linking the protein to one of a variety of non-proteinaceous polymers, e.g., polyethylene glycol, polypropylene glycol, or polyoxyalkylenes, in the manner set forth in U.S. Patent Nos. 4,640,835; 4,496,689; 4,301,144; 4,670,417; 4,791,192 or 4,179,337.
  • the proteins of the present invention may also be modified in a way to form chimeric molecules comprising a protein fused to another, heterologous polypeptide or amino acid sequence.
  • a chimeric molecule comprises a fusion of a protein with a tag polypeptide which provides an epitope to which an anti-tag antibody can selectively bind.
  • the epitope tag is generally placed at the amino- or carboxyl-terminus of the protein. The presence of such epitope-tagged forms of a protein can be detected using an antibody against the tag polypeptide. Also, provision of the epitope tag enables the protein to be readily purified by affinity purification using an anti-tag antibody or another type of affinity matrix that binds to the epitope tag.
  • the chimeric molecule may comprise a fusion of a protein with an immunoglobulin or a particular region of an immunoglobulin.
  • a fusion could be to the Fc region of an IgG molecule.
  • tag polypeptides and their respective antibodies are well known in the art. Examples include poly-histidine (poly-his) or poly-histidine-glycine (poly-his-gly) tags; the flu HA tag polypeptide and its antibody 12CA5 [Field et al., Mol. Cell. Biol., 8:2159-2165
  • tag polypeptides include the Flag-peptide [Hopp et al., BioTechnology, 6:1204-1210 (1988)]; the KT3 epitope peptide [Martin et al., Science, 255:192-194 (1992)]; tubulin epitope peptide [Skinner et al., J. Biol. Chem., 266:15163-15166 (1991)]; and the T7 gene 10 protein peptide tag [Lutz-Freyermuth et al., Proc. Natl. Acad. Sci. USA, 87:6393-6397 (1990)].
  • the protein is purified or isolated after expression.
  • the proteins may be isolated or purified in a variety of ways known to those skilled in the art depending on what other components are present in the sample. Standard purification methods include electrophoretic, molecular, immunological and chromatographic techniques, including ion exchange, hydrophobic, affinity, and reverse-phase HPLC chromatography, and chromatofocusing.
  • the protein may be purified using a standard anti-library antibody column. Ultrafiltration and diafiltration techniques, in conjunction with protein concentration, are also useful. For general guidance in suitable purification techniques, see Scopes, R., Protein Purification, Springer-Verlag, NY (1982). The degree of purification necessary will vary depending on the use of the protein. In some instances no purification may be necessary.
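
The positional rule given above for the bacterial ribosome binding site (a Shine-Dalgarno sequence 3-9 nucleotides in length, located 3-11 nucleotides upstream of the initiation codon) can be illustrated with a minimal Python sketch. The following are assumptions made for illustration and are not stated in the source: the consensus core "AGGAGG", the use of ATG as the only initiation codon, the reading of "upstream" as the number of nucleotides between the 3' end of the motif and the initiation codon, and the function names themselves.

```python
import re

# Assumption for illustration: a commonly cited consensus core.
# The source text specifies only the motif length and its spacing.
SD_CONSENSUS = "AGGAGG"
START_CODON = "ATG"  # assumption: only ATG is treated as an initiation codon


def best_sd_before(dna, atg_pos, min_len=3, min_gap=3, max_gap=11):
    """Return (sd_start, sd_motif, gap, atg_pos) for the longest consensus
    fragment found min_gap..max_gap nucleotides upstream of atg_pos,
    or None if no fragment of at least min_len nucleotides is present."""
    # Try the longest fragments of the consensus first.
    for length in range(len(SD_CONSENSUS), min_len - 1, -1):
        for offset in range(len(SD_CONSENSUS) - length + 1):
            motif = SD_CONSENSUS[offset:offset + length]
            for gap in range(min_gap, max_gap + 1):
                start = atg_pos - gap - length
                if start >= 0 and dna[start:start + length] == motif:
                    return (start, motif, gap, atg_pos)
    return None


def find_sd_sites(dna, min_len=3, min_gap=3, max_gap=11):
    """Scan every ATG in the sequence and report those preceded by an
    SD-like motif at the spacing described in the text."""
    dna = dna.upper()
    hits = []
    # The lookahead lets overlapping ATG positions be reported.
    for match in re.finditer(f"(?={START_CODON})", dna):
        hit = best_sd_before(dna, match.start(), min_len, min_gap, max_gap)
        if hit is not None:
            hits.append(hit)
    return hits


if __name__ == "__main__":
    example = "TTTAGGAGGTTTTTTATGAAA"
    for sd_start, motif, gap, atg in find_sd_sites(example):
        print(f"SD-like motif {motif} at {sd_start}, "
              f"{gap} nt upstream of ATG at {atg}")
```

A scan like this only flags candidate sites by position and sequence; it says nothing about actual translation efficiency, which would have to be determined experimentally.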

Landscapes

  • Genetics & Genomics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Wood Science & Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Organic Chemistry (AREA)
  • Biotechnology (AREA)
  • Zoology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Plant Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
EP01926467A 2000-04-21 2001-03-28 Non-PCR based recombination of nucleic acids Withdrawn EP1276858A1 (de)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US55720800A 2000-04-21 2000-04-21
US557208 2000-04-21
PCT/US2001/009971 WO2001081568A1 (en) 2000-04-21 2001-03-28 Non-pcr based recombination of nucleic acids

Publications (1)

Publication Number Publication Date
EP1276858A1 (de) 2003-01-22

Family

ID=24224471

Family Applications (1)

Application Number Title Priority Date Filing Date
EP01926467A Withdrawn EP1276858A1 (de) 2000-04-21 2001-03-28 Non-PCR based recombination of nucleic acids

Country Status (4)

Country Link
EP (1) EP1276858A1 (de)
AU (1) AU2001253001A1 (de)
CA (1) CA2406466A1 (de)
WO (1) WO2001081568A1 (de)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6951719B1 (en) 1999-08-11 2005-10-04 Proteus S.A. Process for obtaining recombined nucleotide sequences in vitro, libraries of sequences and sequences thus obtained
US8053191B2 (en) 2006-08-31 2011-11-08 Westend Asset Clearinghouse Company, Llc Iterative nucleic acid assembly using activation of vector-encoded traits
WO2012064975A1 (en) 2010-11-12 2012-05-18 Gen9, Inc. Protein arrays and methods of using and making the same
CA3132011A1 (en) 2010-11-12 2012-06-14 Gen9, Inc. Methods and devices for nucleic acids synthesis
LT3594340T (lt) * 2011-08-26 2021-10-25 Gen9, Inc. Compositions and methods for high fidelity assembly of nucleic acids
US9150853B2 (en) 2012-03-21 2015-10-06 Gen9, Inc. Methods for screening proteins using DNA encoded chemical libraries as templates for enzyme catalysis
CA2871505C (en) 2012-04-24 2021-10-12 Gen9, Inc. Methods for sorting nucleic acids and multiplexed preparative in vitro cloning
JP6509727B2 (ja) 2012-06-25 2019-05-15 Ginkgo Bioworks, Inc. Methods for nucleic acid assembly and high-throughput sequencing
GB2566986A (en) 2017-09-29 2019-04-03 Evonetix Ltd Error detection during hybridisation of target double-stranded nucleic acid

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE3614310A1 (de) * 1986-04-28 1987-10-29 Hoechst Ag Method for isolating mutated genes and the corresponding wild-type genes
US5783431A (en) * 1996-04-24 1998-07-21 Chromaxome Corporation Methods for generating and screening novel metabolic pathways
JPH1066576A (ja) * 1996-08-07 1998-03-10 Novo Nordisk As Double-stranded DNA having protruding ends and DNA shuffling method using the same
EP1228200A1 (de) * 1999-10-27 2002-08-07 California Institute Of Technology Production of functional hybrid genes and hybrid proteins

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO0181568A1 *

Also Published As

Publication number Publication date
WO2001081568A1 (en) 2001-11-01
CA2406466A1 (en) 2001-11-01
AU2001253001A1 (en) 2001-11-07

Similar Documents

Publication Publication Date Title
EP1328627B1 (de) Method for generating oligonucleotide libraries having a controlled mutation distribution
US5939250A (en) Production of enzymes having desired activities by mutagenesis
CN105247066B (zh) Using RNA-guided FokI nucleases (RFNs) to increase the specificity of RNA-guided genome editing
US7833759B2 (en) Method of increasing complementarity in a heteroduplex
EP1409667B2 (de) Method for producing polynucleotide variants
US20020155439A1 (en) Method for generating a library of mutant oligonucleotides using the linear cyclic amplification reaction
AU720334B2 (en) Method of screening for enzyme activity
AU2002314712A1 (en) A method of increasing complementarity in a heteroduplex polynucleotide
WO2001081568A1 (en) Non-pcr based recombination of nucleic acids
KR20210060541A (ko) Improved high-throughput combinatorial genetic modification system and optimized Cas9 enzyme variants
US6790605B1 (en) Methods for obtaining a desired bioactivity or biomolecule using DNA libraries from an environmental source
EP1280894B1 (de) Methods for producing recombined nucleic acids
JP2004528801A (ja) Incrementally truncated nucleic acids and methods for producing same
EP1179596A1 (de) Nuclease
Zhang et al. CRISPR/Cas9-assisted ssDNA recombineering for site-directed mutagenesis and saturation mutagenesis
WO2002016642A1 (en) Methods and compositions for directed molecular evolution using dna-end modification
AU756201B2 (en) Method of screening for enzyme activity
CA2486900A1 (en) A method for obtaining circular mutated and/or chimaeric polynucleotides
AU2003200812A2 (en) Method of screening for enzyme activity
ZA200306203B (en) A method of increasing complementarity in a heteroduplex polynucleotide.

Legal Events

Date Code Title Description
PUAI Public reference made under Article 153(3) EPC to a published international application that has entered the European phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20021112

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR

AX Request for extension of the european patent

Free format text: AL;LT;LV;MK;RO;SI

17Q First examination report despatched

Effective date: 20030523

STAA Information on the status of an EP patent application or granted EP patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20040616