CA2408652A1 - A method for designing and screening random libraries of compounds - Google Patents

A method for designing and screening random libraries of compounds

Publication number
CA2408652A1
Authority
CA
Canada
Prior art keywords
promoter
molecule
multidimensional
peptide
library
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002408652A
Other languages
French (fr)
Inventor
Mikhail Popkov
Rosemonde Mandeville
Oleg Romar
Valery Alakhov
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Supratek Pharma Inc
Original Assignee
Supratek Pharma Inc.
Mikhail Popkov
Rosemonde Mandeville
Oleg Romar
Valery Alakhov
Biophage Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Supratek Pharma Inc., Mikhail Popkov, Rosemonde Mandeville, Oleg Romar, Valery Alakhov, Biophage Inc.

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/10: Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034: Isolating an individual clone by screening libraries
    • C12N15/1086: Preparation or screening of expression libraries, e.g. reporter assays
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/10: Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034: Isolating an individual clone by screening libraries
    • C12N15/1037: Screening libraries presented on the surface of microorganisms, e.g. phage display, E. coli display
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00: Libraries per se, e.g. arrays, mixtures
    • C40B40/02: Libraries contained in or displayed by microorganisms, e.g. bacteria or animal cells; Libraries contained in or displayed by vectors, e.g. plasmids; Libraries containing only microorganisms or vectors
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00: Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48: Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50: Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53: Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/531: Production of immunochemical test materials
    • G01N33/532: Production of labelled immunochemicals

Abstract

Provided herein are methods for producing novel and useful multidimensional libraries (MDL) comprising molecules of varying lengths, wherein the molecules comprise a functional unit that potentially interacts with a target molecule, and a structural unit. Also provided are novel oligonucleotides, vectors, transformed and transfected unicellular hosts, as well as kits for screening molecules of a multidimensional library for their potential interaction with a target molecule.

Description

A METHOD FOR DESIGNING AND SCREENING RANDOM LIBRARIES OF
COMPOUNDS
FIELD OF THE INVENTION
The present invention relates generally to methods for generating and screening Multidimensional Libraries (MDL) for proteins, polypeptides, and/or peptides, designated multidimensional peptides (MDPs), that are members of the MDL, for binding specificity and desired affinity for a target molecule.
BACKGROUND OF THE INVENTION
In numerous fields, such as medicine and agriculture to name only a few, there is an increasing need to find new molecules that can effectively modulate a wide range of biological processes. Traditional methods utilize "irrational drug design", the process of selecting the right molecules from large ensembles or repertoires.
Generally, such methods include screening collections of natural materials, such as fermentation broths or plant extracts, or libraries of synthetic molecules, with assays that range dramatically in complexity from simple binding reactions to elaborate physiological preparations. Often, these assays provide only lead compounds, which require much improvement and refinement by empirical methods or by chemical design before any efficacious compound is identified. The process is time-consuming and costly, but it is unlikely to be totally replaced by rational methods even when they are based on detailed knowledge of the chemical structure of the target molecules.
Moreover, irrational drug design methods require continuous improvement in both the generation of repertoires and in the methods of their selection. Kay et al., Gene 128:59-65 (1993).
Recently, several developments have been made in using peptides or nucleotides to provide libraries of compounds for lead discovery. Generally, there are two different approaches to the construction of random peptide libraries. In one approach, peptides are chemically synthesized in vitro in several formats.
For example, the standard serial process of stepwise search of synthetic peptides now encompasses a variety of highly sophisticated methods in which large arrays of peptides are synthesized in parallel and screened with acceptor molecules labeled with fluorescent or other reporter groups. The sequence of any effective peptide can be decoded from its address in the array (Geysen et al., Proc. Natl. Acad. Sci. USA 81:3998- (1984); Maeji et al., J. Immunol. Methods 146:83-90 (1992); and Fodor et al., Science 251:767-775 (1991)).
In another approach, combinatorial libraries of peptides are synthesized on resin beads. Each resin bead contains about 20 pmoles of the same peptide. The beads are then screened with labeled acceptor molecules. Those with bound acceptors are submitted to visual inspection and then physically removed. The peptide is identified by direct sequence analysis. Lam et al., Nature 354:82-84 (1991). Although this method may, in principle, be used with other chemical entities, it would require sensitive methods for sequence determination, thus making its use with other chemical entities very inefficient.
A different method of solving the problem of identification using a combinatorial peptide library involves the use of hexapeptides. Houghten et al., Nature 354:84-86 (1991). In particular, with hexapeptides of the 20 natural amino acids, 400 separate libraries are synthesized, each of which has its first two amino acid residues fixed, i.e., invariant. The remaining four positions of the hexapeptides are occupied by all possible combinations. An assay based on competition for binding or other activity is then used to find the library with an active peptide. Once the library with the active peptide is located, 20 new libraries are synthesized and assayed in order to determine the effective amino acid residue in the third position. This process is then repeated until all six positions in the peptide are identified. However, this method is inherently time-consuming and inefficient. Moreover, the size of the peptides that can be assayed is limited to six amino acid residues. Thus, its ability to assay the effect of a protein's, polypeptide's or peptide's secondary or tertiary structure on binding with a target molecule is extremely limited.
Recently, another approach using hexapeptides was suggested. In particular, starting with 20 amino acids, a total of 120 (20 x 6) peptide mixtures are synthesized. In 20 of the mixtures, position 6 contains a unique amino acid, and positions 1-5 contain a mixture of all natural amino acids. In another 20 mixtures, position 5 contains a unique amino acid and all other positions contain a mixture of all twenty amino acids, etc. Once synthesized, the 120 peptide mixtures are tested simultaneously and the most active of each of the 20 mixtures representing each position is identified. Houghten, Abstract, European Peptide Society 1992 symposium, Interlaken, Switzerland. Although this method increases the speed with which an active peptide can be found, it also possesses the inherent limitation of assaying hexapeptides. Thus, like the method explained above, the tertiary structure of compounds having activity cannot be adequately explored.
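The library counts implied by the two hexapeptide strategies described above can be checked with a short calculation. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that the iterative approach needs exactly 20 new libraries for each of positions 3 through 6, as the description suggests.

```python
AMINO_ACIDS = 20  # natural amino acids
LENGTH = 6        # hexapeptides


def iterative_deconvolution(fixed_first=2):
    """Counts for the iterative approach in which the first two residues are fixed.

    Returns (starting libraries, peptides per starting library,
    total libraries synthesized over all rounds).
    Assumption: each later round resolves one position with 20 new libraries.
    """
    starting = AMINO_ACIDS ** fixed_first                      # 20 x 20 = 400
    peptides_per_library = AMINO_ACIDS ** (LENGTH - fixed_first)
    later_rounds = AMINO_ACIDS * (LENGTH - fixed_first)        # positions 3..6
    return starting, peptides_per_library, starting + later_rounds


def positional_scanning():
    """Counts for the approach in which one position is defined per mixture.

    Returns (number of mixtures, peptides per mixture).
    """
    mixtures = AMINO_ACIDS * LENGTH                 # 20 x 6 = 120
    peptides_per_mixture = AMINO_ACIDS ** (LENGTH - 1)
    return mixtures, peptides_per_mixture


if __name__ == "__main__":
    print(iterative_deconvolution())
    print(positional_scanning())
```

Under these assumptions the iterative approach requires several rounds of synthesis and assay, whereas the positional approach tests all 120 mixtures in a single round; both remain restricted to six residues.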
A second approach, using recombinant DNA techniques, involves expressing peptides in vivo as either soluble fusion proteins or viral capsid fusion proteins. In particular, a number of peptide libraries use the M13 phage. M13 is a filamentous bacteriophage that has been used extensively in molecular biology laboratories for the past 20 years. The viral particle comprises six different capsid proteins and one copy of the viral genome, as a single-stranded circular DNA molecule. Once the M13 DNA penetrates into a host cell such as E. coli, it is converted into double-stranded, circular DNA. The viral DNA carries a second origin of replication that is used to generate the single-stranded DNA found in the viral particles. During viral morphogenesis, there is an ordered assembly of the single-stranded DNA and the viral proteins, and the viral particles are extruded from cells in a process much like secretion. The M13 virus is neither lysogenic nor lytic like other bacteriophages (e.g., bacteriophage λ). Once infected, the cells chronically release the virus. This feature leads to high titers of virus in infected cultures, i.e., 10^12 pfu/ml.
The genome of the M13 phage is about 8000 nucleotides in length and has been completely sequenced. The viral capsid protein, protein III (pIII), is responsible for the infection of bacteria. In E. coli, the F factor encodes the pilin protein, which interacts with the pIII protein and is responsible for phage uptake. Hence, all E. coli hosts for the M13 virus are considered males because they carry the F factor.
Using mutational analysis, investigators have determined that the 406 amino acid long pIII capsid protein has two domains. The C-terminus anchors the protein to the viral coat, while portions of the N-terminus of the pIII are essential for interactions with the E. coli pilin protein, Crissman and Smith, Virology 132:445-455 (1984).
Although the N-terminus of the pIII protein has been shown to be necessary for viral infection, the extreme N-terminus of the mature protein does tolerate alterations. In 1985, Smith published experiments reporting the use of the pIII protein of bacteriophage M13 as an experimental system for expressing a heterologous protein on the viral coat surface, Science 228:1315-1317 (1985). It was later recognized, independently by two groups, that the M13 phage pIII gene display system could be a useful tool for mapping antibody epitopes. De la Cruz et al. [J. Biol. Chem. 263:4318-4322 (1988)] cloned and expressed segments of the cDNA encoding the Plasmodium falciparum surface coat protein into the gene III, and recombinant phage were tested for immunoreactivity with a polyclonal antibody. Parmley and Smith, Gene 73:305-318 (1988), cloned and expressed segments of the E. coli β-galactosidase gene in the gene III and identified recombinants carrying the epitope of an anti-β-galactosidase monoclonal antibody.
These authors also described a process termed "biopanning", in which mixtures of recombinant phage were incubated with biotinylated monoclonal antibodies, and phage-antibody complexes were specifically recovered with streptavidin-coated plastic plates.
In 1989, Parmley and Smith (Adv. Exp. Med. Biol. 251:215-218 (1989)) suggested that short, synthetic DNA segments cloned into the pIII gene might represent a library of epitopes. These authors reasoned that since linear epitopes were often about 6 amino acids in length, it should be possible to use a random recombinant DNA
library to express all possible hexapeptides to isolate epitopes that could bind to antibodies.
Scott and Smith (Science 249:386-390 (1990)) described construction and expression of an "epitope library" of hexapeptides on the surface of the M13 phage.
The library was made by inserting a 33 base pair Bgl I digested oligonucleotide sequence into the SfiI digested phage fd-tet, i.e., fUSE 5 RF. The 33 base pair fragment contained a random or "degenerate" coding sequence (NNK)6, where N represents G, A, T and C, and K represents G and T. The authors stated that the library consisted of 2x10^8 recombinants expressing 4x10^7 different hexapeptides. Theoretically, this library expressed 69% of the 6.4x10^7 possible peptides (20^6). Cwirla et al., Proc. Natl. Acad. Sci. U.S.A. 87:6378-6382 (1990) also described a somewhat similar library of hexapeptides expressed as gene pIII fusion of M13 fd phage. WO 91/19818, published on Dec. 26, 1991 by Dower and Cwirla, describes a similar library of pentameric to octameric random amino acid sequences.
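The properties of a degenerate coding scheme such as NNK can be enumerated directly from the standard genetic code. The following Python sketch is offered as an illustration under stated assumptions: the helper names (expand, summarize) are hypothetical, the standard genetic code is assumed, and only the ambiguity symbols used in this description (N, K, S, B) are included.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons ordered TTT, TTC, TTA, TTG, TCT, ...
# "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Ambiguity symbols used in the text; plain bases map to themselves.
AMBIGUITY = {"N": "ACGT", "K": "GT", "S": "CG", "B": "CGT"}


def expand(scheme):
    """List every concrete codon encoded by a degenerate triplet such as 'NNK'."""
    choices = (AMBIGUITY.get(symbol, symbol) for symbol in scheme)
    return ["".join(codon) for codon in product(*choices)]


def summarize(scheme):
    """Return (number of codons, number of amino acids encoded, sorted stop codons)."""
    codons = expand(scheme)
    residues = {CODON_TABLE[c] for c in codons} - {"*"}
    stops = sorted(c for c in codons if CODON_TABLE[c] == "*")
    return len(codons), len(residues), stops


if __name__ == "__main__":
    for scheme in ("NNN", "NNK", "NNS"):
        print(scheme, summarize(scheme))
    print("possible hexapeptides:", 20 ** 6)
```

The enumeration shows why NNK and NNS are popular: each uses 32 codons, still encodes all 20 amino acids, and retains only the TAG stop codon, compared with three stop codons for a fully random NNN triplet.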
Furthermore, Devlin et al., Science 249:404-406 (1990), described a peptide library of about 15 residues generated using an (NNS) coding scheme for oligonucleotide synthesis, in which S is G or C.
Likewise, Christian et al. (J. Mol. Biol. 227:711-718 (1992)) have described a phage display library expressing decapeptides. The starting DNA was generated by means of an oligonucleotide comprising the degenerate codons (NN(G/T))10 with a self-complementary 3' terminus. In forming a hairpin, this sequence creates a self-priming replication site that could be used by T4 DNA polymerase to generate the complementary DNA strand. The double-stranded DNA was then cleaved at the SfiI sites at the 5' terminus and hairpin for cloning into the fUSE 5 vector described by Scott and Smith, Science 249:386-390 (1990).
These libraries may encompass a very large repertoire of different peptides that represent potential targets to a variety of macromolecules like receptors, polypeptides, enzymes, carbohydrates and antibodies. Therefore, phage display technology appears to be a very powerful tool for the selection of peptide sequences that bind to a target molecule. These peptides may find numerous applications, for example as antigens in vaccine compositions, as enzyme inhibitors, or as antagonists or agonists to receptors.
However, due to the limited length of the peptides in the libraries discussed above, the peptides are not able to mimic native proteins, and adopt their conformation.
For example, the monoclonal antibody 1B7, first described by Sato et al., Infect. Immun. 46:422-428 (1984), was initially raised against the B. pertussis toxin (PTX).
This antibody is able to neutralize the toxin in vitro and to protect mice from intracerebral challenge with virulent B. pertussis. The epitope recognized by 1B7 was shown to be discontinuous and largely dependent on conformation. Hoping to obtain peptide sequences that would mimic such a discontinuous epitope, Felici et al., Gene 128:21-27 (1993), constructed two phage display libraries consisting of nine random amino acids inserted in the major coat protein (pVIII), in which the nonapeptides are either linear or flanked by two cysteine residues (circular). The two libraries were screened with the antibody 1B7. The positive clones were sequenced and a consensus sequence was obtained only for linear peptides. In the absence of a three-dimensional structure of the PTX, however, it was very difficult to determine how the consensus peptide sequence corresponded to amino acid residues of the original protein that are important in the constitution of the discontinuous epitope recognized by the antibody 1B7.
Despite this lack of information, the authors were expecting that the selected nonapeptides would sufficiently mimic the binding site of the original protein to serve as antigens in the production of vaccines against PTX. However, contrary to their expectations, the peptides were able to compete with PTX for the binding site of 1B7, but were not capable of sufficiently mimicking the discontinuous epitope of PTX to elicit the production of antibodies specific to the original antigen, PTX. Moreover, it was determined that the phage recombinant peptides adopted a conformation that may be governed by the surrounding phage sequences, which conformation is recognizable by the antibody 1B7. Thus, when peptides alone were synthesized without the presence of surrounding phage sequences, they lost their ability to bind the antibody 1B7.
The same group (Luzzago et al., Gene 128:51-57 (1993)) used the same libraries to select oligopeptides that would bind to antibody H107, which recognizes the native conformation of the recombinant human ferritin H-subunit (H-Fer). This time, though, the three-dimensional structure of H-Fer was known. The consensus peptide sequence, obtained only for the linear selected peptides, was used to assign to amino acids specifically located in the original protein a putative role in the conformation of the H-Fer epitope. Whether the peptides were synthesized with surrounding sequences located in the original protein or with surrounding phage sequences, the peptides screened with H107 antibody were capable of mimicking the original protein assembly and efficiently binding the antibody H107.
These results indicate the unpredictability of the libraries described above. In particular, different results were obtained with peptides selected with two different antibodies. Thus, these libraries were not successful in the selection of epitopes of all existing antigens.
Yet another shorter peptide library is described by O'Neil et al., Proteins: Structure, Function and Genetics 14:509-515 (1992). In this library, a random circular hexapeptide sequence was constructed and inserted in the pIII phage protein. The library was then used to select targets to the receptor glycoprotein IIb/IIIa (αIIbβ3), a member of the integrin family of cell adhesion molecules that mediate platelet aggregation through the binding of fibrinogen and von Willebrand factor. The purpose of this work was to find targets that could be used as antagonists or as anti-thrombotic agents. The glycoprotein IIb/IIIa binds to a very short sequence commonly known as the RGD sequence. Using a circular library, the same authors were successful in identifying a consensus sequence. The authors could indeed identify targets that were better antagonists than the target SK106760 (a cyclic peptide developed after extensive synthesis of an array of peptides) used to elute the phages of interest. They also found that a variant RGD sequence, wherein the arginine was replaced by a lysine, was surprisingly one of the best anti-aggregatory selected peptides.
Kay et al. have also constructed the TSAR-9 library, which expresses 36 random amino acids at the N-terminus of the mature pIII molecule (Gene 128:59-65 (1993)). This library contains 10^8 individual recombinants. While this value is minuscule when compared to its potential coding diversity (20^30), it is biologically very diverse. For example, it has been estimated that only 105 40-amino acid exons exist in the human genome, Dorit et al., Science 250:1377-1382 (1990). The library was panned with streptavidin and a polyclonal goat anti-mouse IgG Fc antibody preparation coupled to paramagnetic beads. Streptavidin selected the class of phage expressing the amino acid motif HPQ/M, similar to the motif identified by Devlin et al., Science 249:404-406 (1990), using a 15-amino acid random peptide library displayed on phages. The polyclonal goat anti-mouse IgG Fc antibody preparation selected phage displaying sequences similar to a region of the mouse IgG Fc. Thus, a single immunodominant epitope on the mouse IgG was identified.
Other investigators have used other viral capsid proteins for the expression of non-viral DNA on the surface of phage particles. The protein pVIII is a major viral capsid protein and interacts with the single-stranded DNA of M13 viral particles at its C-terminus. It is 50 amino acids long and exists in approximately 2,700 copies/particle. The N-terminus of the protein is exposed and will tolerate insertions, although large inserts have been reported to disrupt the assembly of fusion pVIII proteins into viral particles (Cesareni G., FEBS Lett. 307:66-70 (1992)). To minimize the negative effect of pVIII-fusion proteins, a phagemid system has been used.
Bacterial cells carrying the phagemid are infected with helper phage and secrete viral particles that have a mixture of both wild-type and fusion pVIII capsid molecules.
Gene VIII has also served as a site for expressing peptides on the surface of M13 viral particles: 4 and 6 amino acid sequences corresponding to different segments of the P. falciparum major surface antigen have been expressed in the corresponding gene of the filamentous bacteriophage fd (Greenwood et al., J. Mol. Biol. 220:821-827 (1991)).
Lenstra et al., J. Immunol. Methods 152:149-157 (1992), described the construction of a library comprising annealing oligonucleotides of about 17 or degenerate bases with an 8 nucleotide long palindromic sequence at their 3'-ends to express random hexa- or octapeptides as fusion proteins with the β-galactosidase protein in a bacterial expression vector. The DNA was then converted into a double-stranded form with a Klenow DNA polymerase, blunt-end ligated into a vector, and then cloned into an expression vector at the C-terminus of a truncated β-galactosidase to generate 10^7 recombinants. Colonies were then lysed, blotted on nitrocellulose filters and screened for immunoreactivity with several different monoclonal antibodies.
A number of clones were isolated by repeated rounds of screening and then sequenced.
However, as one of ordinary skill in the art can readily realize, it is extremely laborious, and thus expensive, to construct such a library. As a result, its applications are limited.
Pasqualini and Ruoslahti, Nature 380:364-366 (1996), reported an approach to study organ-selective targeting based on in vivo screening of random peptide libraries.
Peptides capable of mediating selective localization of phage to brain and kidney blood vessels were identified, and showed up to 13-fold selectivity for these organs.
A particular "genetic" method of producing a library has been described wherein the libraries are synthetic oligonucleotides themselves. Active oligonucleotide molecules are selected by binding to an acceptor site and then amplified by the polymerase chain reaction (PCR). PCR allows serial enrichment, and the structure of the active molecules is then decoded by DNA sequencing of clones generated from the PCR products. However, the repertoire is limited to nucleotides and the natural pyrimidine and purine bases, or those modifications that preserve specific Watson-Crick pairing and can be copied by polymerase (Singer et al., Nucleic Acids Res. 25:781-786 (1997)). Later this approach was further developed by the introduction of various chemical derivatives of nucleotides into already identified motifs (Gold et al., Proc. Natl. Acad. Sci. U.S.A. 94:59-64 (1997)).
Another "genetic" method has been described in which the libraries of synthetic oligonucleotides or DNA fragments encoding antibodies are displayed on ribosomes via an in vitro translation system (Hanes et al., Proc. Natl. Acad. Sci. U.S.A. 94:4937-4942 (1997); He et al., Nucleic Acids Res. 25:5132-5134). This approach allows effective expression of very large libraries (up to 10^15-10^18 members), while phage display systems are usually limited to 10^10-10^12 sequences. The main advantage of these genetic methods resides in the capacity for cloning and amplification of the DNA
sequences, which allows enrichment by serial selection and provides a simple and easy method for decoding the structure of active molecules. Such results are not available when using libraries comprising peptides from a specific molecule, because, as explained above, results obtained from such libraries are unpredictable since the length and the conformation of the exposed peptide may be sufficient to retrieve a peptide binding to a specific molecule, and yet not be suited to retrieve a peptide that efficiently mimics a more sophisticated binding region on another molecule.
However, a limitation of these genetic methods of producing libraries involves the frequency of TAG (stop) codons in the oligonucleotides expressed by the peptide library. Efforts have been made to ameliorate this problem using hosts carrying suppression. However, this strategy may not be 100% efficient to avoid stop codon expression in an oligonucleotide coding for a random peptide. Moreover, the problem becomes very serious when expressing oligonucleotides of longer length encoding random peptides.
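The way the stop codon problem grows with insert length can be illustrated with a simple probability estimate. The Python sketch below is a simplified model rather than a description of any particular library: it assumes every codon is drawn independently and uniformly from the degenerate scheme, that no suppressor host is used, and the function name stop_free_fraction is hypothetical.

```python
def stop_free_fraction(n_codons, stop_codons=1, total_codons=32):
    """Expected fraction of random inserts that contain no stop codon.

    Defaults model an NNK or NNS scheme (32 codons, one of which is TAG).
    Assumes independent, uniformly distributed codons.
    """
    if n_codons < 0:
        raise ValueError("n_codons must be non-negative")
    return (1 - stop_codons / total_codons) ** n_codons


if __name__ == "__main__":
    for length in (6, 15, 36, 100):
        nnk = stop_free_fraction(length)
        nnn = stop_free_fraction(length, stop_codons=3, total_codons=64)
        print(f"{length:>3} codons: NNK {nnk:.3f}  NNN {nnn:.3f}")
```

Under this model the stop-free fraction decays geometrically with the number of randomized codons, which is consistent with the observation that longer random inserts are affected far more severely than hexapeptides.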
Accordingly, what is needed are methods of generating libraries of peptides of random and unspecified length, and comprising functional peptide units which interact with a target molecule, and structural peptide units, which help position the functional peptide units in order to maximize binding affinity for the target molecule.
What is also needed are methods of generating libraries of peptides of random and unspecified length that utilize oligonucleotides of various and random length.
What is also needed are methods of generating libraries of peptides wherein the peptides have random lengths, and are not limited to a certain length. As a result, an opportunity is made available to develop the secondary and/or tertiary structures of the potential binding peptides and in sequences flanking the actual binding portion(s) of the functional unit of the MDP. Such complex structural developments are not feasible when only oligonucleotides with fixed lengths are used.
What is further needed are libraries that can be advantageously screened to identify multidimensional peptides (MDPs) having binding specificity for a variety of targets.
Moreover, what is also needed are methods of producing libraries comprising oligonucleotides which are capable of expressing peptides of varying length, and effectively and efficiently minimize the negative impact of random stop codons in oligonucleotides of the library.
The citation of any reference herein should not be construed as an admission that such reference is available as "Prior Art" to the instant application.
SUMMARY OF THE INVENTION
In accordance with the present invention, there is provided a novel and useful multidimensional library (MDL) for the selection of specific molecules that bind or interact in any way with molecules or molecular complexes (targets) of interest. An MDL may be represented by various natural or artificial polymeric compounds including, but not limited to, isolated oligonucleotides, proteins, polypeptides, peptides, polycarbohydrates, polyamines, heterocycles, or their combinations, to name only a few.
Broadly, the present invention extends to a multidimensional library for screening molecules that potentially interact with a target molecule, wherein the library comprises at least one molecule comprising a general formula of (XYn)m, wherein:
(XYn) is a repeating unit of the at least one molecule in which:
X is a functional unit that interacts with the target molecule,
Y is a structural unit,
n is the number of the structural units in the repeating unit, and
m is a number of repeating units in the at least one molecule.
In addition, the present invention extends to a multidimensional library as described above, wherein the at least one molecule of the library is detectably labeled.
Numerous detectable labels have applications herein, and can be readily utilized by one of ordinary skill in the art. Examples of detectable labels having applications herein include, but certainly are not limited to a radioactive element, a chemical which fluoresces, a chromophore, an enzyme, or an amplifiable nucleotide sequence, to name only a few. Particular examples of detectable labels having applications herein are described infra.
As explained above, the at least one molecule of a multidimensional library of the invention can be comprised of numerous different types of molecules, e.g., oligonucleotides, proteins, polypeptides, peptides, polycarbohydrates, polyamines, heterocycles, or their combinations, to name only a few. In a particular example wherein the at least one molecule of the multidimensional library comprises a protein, a polypeptide, or a peptide, X is a functional peptide unit that potentially participates in an interaction between the at least one molecule and the target, Y is a structural peptide unit, n is an integer such that 0 ≤ n ≤ 10, and m is an integer such that 2 ≤ m ≤ 20.
Moreover, as explained above, the at least one molecule of a multidimensional library of the present invention can be an isolated oligonucleotide. In such an embodiment, the functional unit of the at least one molecule comprises a nucleotide regulatory sequence and the structural unit comprises a nucleotide sequence comprising from 6 to at least 60 contiguous nucleotides. Numerous nucleotide regulatory sequences have applications herein. Particular examples include, but certainly are not limited to, a promoter, an enhancer, a cis-acting locus, a trans-acting locus, an attenuator, an upstream activator, or a regulatory non-translatable region sequence.
Likewise, numerous promoters can serve as a functional unit in the at least one isolated oligonucleotide of a multidimensional library of the present invention.
Particular examples of such promoters include, but certainly are not limited to, an SV40 early promoter, a promoter contained in the 3' long terminal repeat of Rous sarcoma virus, a herpes thymidine kinase promoter, the regulatory sequences of the metallothionein gene, a β-lactamase promoter, a tac promoter, an alcohol dehydrogenase promoter, a phosphoglycerol kinase promoter, an alkaline phosphatase promoter, an elastase I gene control region active in pancreatic acinar cells, an insulin gene control region active in pancreatic beta cells, an immunoglobulin gene control region active in lymphoid cells, a mouse mammary tumor virus control region active in testicular, breast, lymphoid and mast cells, an albumin gene control region active in liver, an alpha-fetoprotein gene control region active in liver, an alpha 1-antitrypsin gene control region active in the liver, a beta-globin gene control region active in myeloid cells, a myelin basic protein gene control region active in oligodendrocytic cells in the brain, a myosin light chain-2 gene control region active in skeletal muscle, a gonadotropic releasing hormone gene control region active in the hypothalamus, a dihydrofolate reductase (DHFR) promoter, a constitutive RSV-LTR promoter, a metallothionein IIa gene promoter, a RSV-LTR promoter, an immediate early promoter of hCMV, an early promoter of SV40, an early promoter of adenovirus, an early promoter of vaccinia, an early promoter of polyoma, a late promoter of SV40, a late promoter of adenovirus, a late promoter of vaccinia, a late promoter of polyoma, the lac system, the trp system, the TAC system, the TRC system, the major operator and promoter regions of phage lambda, a control region of fd coat protein, 3-phosphoglycerate kinase promoter, acid phosphatase promoter, or a promoter of yeast α mating factor.
Furthermore, the present invention extends to a multidimensional library comprising at least one multidimensional peptide comprising a general formula of (XYn)m, wherein:
X is a functional peptide unit that participates in an interaction between the at least one multidimensional molecule and the target;
Y is a structural peptide unit;
n is an integer, such that 0 ≤ n ≤ 10; and
m is an integer, such that 2 ≤ m ≤ 20,
wherein the at least one multidimensional peptide is encoded by at least one isolated oligonucleotide having a general formula of [(NNB)Fn]m, wherein:
N is A or C or G or T/U;
B is C or G or T/U, but not A;
F is a codon encoding a predetermined amino acid residue;
n is an integer, such that 0 ≤ n ≤ 10; and
m is an integer, such that 2 ≤ m ≤ 20.
Naturally, the present invention extends to an isolated oligonucleotide which encodes at least one molecule of a multidimensional library, wherein the isolated oligonucleotide comprises a general formula of [(NNB)Fn]m, wherein:
N is A or C or G or T/U;
B is C or G or T/U, but not A;
F is a codon encoding a predetermined amino acid residue;
n is an integer, such that 0 ≤ n ≤ 10; and
m is an integer, such that 2 ≤ m ≤ 20.
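The general formula can be illustrated with a minimal Python sketch that assembles one sequence of the form [(NNB)Fn]m. The function name `random_oligo`, the choice of GGT (glycine) as the F codon, and the decision to draw n independently for each repeating unit are assumptions made purely for illustration; they are not taken from the specification.

```python
import random

N_BASES = "ACGT"  # N: any of the four nucleotides
B_BASES = "CGT"   # B: C, G or T, but not A


def random_oligo(f_codon="GGT", n_max=10, m=5, rng=None):
    """Return one sequence of the form [(NNB)Fn]m.

    f_codon is a hypothetical predetermined codon (GGT, glycine).
    n is drawn independently for every repeat from 0..n_max (an assumption).
    """
    if not 0 <= n_max <= 10 or not 2 <= m <= 20:
        raise ValueError("expected 0 <= n <= 10 and 2 <= m <= 20")
    rng = rng or random.Random()
    units = []
    for _ in range(m):
        # one variable (functional) codon: N, N, then B
        nnb = rng.choice(N_BASES) + rng.choice(N_BASES) + rng.choice(B_BASES)
        n = rng.randint(0, n_max)  # number of structural F codons in this repeat
        units.append(nnb + f_codon * n)
    return "".join(units)


if __name__ == "__main__":
    print(random_oligo(rng=random.Random(0)))
```

Each call yields a sequence whose length is a multiple of three, with m variable codons separated by runs of zero to n_max copies of the fixed codon.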
Optionally, an isolated oligonucleotide of the present invention can further comprise a signal sequence, which is described infra, which signals the unicellular host to express the multidimensional peptide encoded by the isolated oligonucleotide on the surface of the unicellular host. Such signal sequences are well known to those of ordinary skill in the art, and can be spliced onto an isolated oligonucleotide of the present invention at the appropriate position using routine laboratory techniques.
The present invention further encompasses numerous methods of synthesizing such isolated oligonucleotides. One such method comprises synthesizing oligonucleotides as described above, wherein N represents an equimolar mixture of A, C, G, and T, and B represents an equimolar mixture of G, C, and T. Thus, the NNB motif can encode any of the natural amino acids and contains only one stop codon (TAG); F represents a single pre-synthesized codon, a combination of several single codons, or their random pre-synthesized sequences that result in one or a combination of pre-selected amino acids; n is the number of codons resulting in structural blocks of amino acids, which is a random value and could be, for example, 0-10; m is the number of functional codons, which could be, for example, 2-20.
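The statement about the NNB motif can be checked by enumeration. The following sketch, which assumes the standard genetic code and uses illustrative variable names, lists all NNB codons and reports how many there are, which of them are stop codons, and how many distinct amino acids they encode.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# N = A, C, G or T at positions 1 and 2; B = C, G or T (not A) at position 3.
nnb_codons = [a + b + c for a, b in product("ACGT", repeat=2) for c in "CGT"]
stop_codons = [codon for codon in nnb_codons if CODON_TABLE[codon] == "*"]
amino_acids = {CODON_TABLE[codon] for codon in nnb_codons} - {"*"}

if __name__ == "__main__":
    print(len(nnb_codons), "NNB codons")
    print("stop codons:", stop_codons)
    print(len(amino_acids), "amino acids encoded")
```

Excluding A from the third position removes the TAA and TGA stop codons while leaving at least one codon for each of the 20 natural amino acids.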
Another method encompassed by the present invention for synthesizing such isolated oligonucleotides comprises using activated three-nucleotides corresponding to all natural amino acids as NNB codons and activated polynucleotides encoding structural blocks of pre-selected amino acids as F codons.
Still another method encompassed by the present invention for synthesizing isolated oligonucleotides having the general formula as described above comprises successive splitting and uniting steps, i.e., a "split-pull" synthesis. A method for performing a "split-pull" synthesis comprises the steps of:
(a) synthesizing three nucleotides having the NNB structure on a resin support;
(b) dividing the resin support into n+1 fractions;
(c) continuing synthesis on each fraction of the resin support according to the following scheme:
fraction 1: Resin-NNB
fraction 2: Resin-NNB-(N1N2N3)
fraction 3: Resin-NNB-(N1N2N3)2
fraction 4: Resin-NNB-(N1N2N3)3
fraction n+1: Resin-NNB-(N1N2N3)n
where N1, N2, and N3 are nucleotides resulting in the F codon described above;
(d) mixing the resin support fractions together and continuing the synthesis as set forth in step (a) to produce a compound having a general structure of Resin-NNB-(N1N2N3)0-n-NNB;
(e) repeating steps (a) through (d) m times until a stochastic collection of oligonucleotides having a general formula of [NNB-(N1N2N3)0-n]m is obtained; and
(f) detaching the oligonucleotides from the resin support fractions.
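Steps (a) through (f) can be sketched as a small simulation in which each resin bead is represented by the string synthesized on it. The function name `split_pull`, the bead count, the GGT codon standing in for N1N2N3, and the use of simple string appending (which ignores the chemical direction of synthesis) are all assumptions made for illustration.

```python
import random
from collections import Counter


def split_pull(num_beads=1000, n=3, m=4, f_codon="GGT", seed=0):
    """Simulate m rounds of split-and-mix synthesis on num_beads resin beads."""
    rng = random.Random(seed)
    beads = [""] * num_beads
    for _ in range(m):
        # step (a): couple one NNB codon to every bead
        beads = [
            s + rng.choice("ACGT") + rng.choice("ACGT") + rng.choice("CGT")
            for s in beads
        ]
        # step (b): divide the resin into n + 1 fractions
        rng.shuffle(beads)
        fractions = [beads[k::n + 1] for k in range(n + 1)]
        # step (c): the fraction with index k receives k copies of the F codon (k = 0..n)
        fractions = [[s + f_codon * k for s in frac] for k, frac in enumerate(fractions)]
        # step (d): mix all fractions back together
        beads = [s for frac in fractions for s in frac]
    # step (f): "detach" the products by returning the sequences
    return beads


if __name__ == "__main__":
    library = split_pull()
    lengths = Counter(len(s) // 3 for s in library)
    for codons in sorted(lengths):
        print(codons, "codons:", lengths[codons], "beads")
```

Because every bead is reassigned to a fraction at random in each round, the pooled product contains sequences of many different lengths, which is the stochastic collection referred to in step (e).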
The use of the "split-pull" synthesis avoids the use of synthesized oligonucleotides rich in GC nucleotides that are often found in libraries using an NNS, NNK, and NNB formula for variant codons. Such oligonucleotides are difficult to assemble and sequence properly.
Naturally, the present invention further extends to a cloning vector comprising an isolated oligonucleotide which encodes a multidimensional peptide of a multidimensional library of the present invention, and an origin of replication.
Numerous cloning vectors have applications herein, including E. coli, a bacteriophage, a plasmid, and a pUC plasmid derivative, to name only a few. A particular example of a bacteriophage that has applications as a cloning vector comprises a lambda derivative. Moreover, a plasmid cloning vector further comprises a pBR322 derivative, and a pUC plasmid derivative further comprises a pGEX vector, a pmal-c vector, or a pFLAG vector.
The present invention extends to an expression vector for expressing an isolated oligonucleotide which encodes a multidimensional peptide of a multidimensional library of the present invention. An expression vector of the present invention comprises an isolated oligonucleotide comprising a general formula of [(NNB)Fn]m, wherein:
N is A or C or G or T/U;
B is C or G or T/U, but not A;
F is a codon encoding a predetermined amino acid residue;
n is an integer, such that 0 ≤ n ≤ 10; and
m is an integer, such that 2 ≤ m ≤ 20,
operatively associated with a promoter.
Numerous expression vectors have applications in the present invention.
Particular examples are described infra. Moreover, numerous promoters can be used in an expression vector of the present invention. Examples of such promoters include, but certainly are not limited to, an immediate early promoter of hCMV, an early promoter of SV40, an early promoter of adenovirus, an early promoter of vaccinia, an early promoter of polyoma, a late promoter of SV40, a late promoter of adenovirus, a late promoter of vaccinia, a late promoter of polyoma, the lac system, the trp system, the TAC system, the TRC system, the major operator and promoter regions of phage lambda, a control region of fd coat protein, 3-phosphoglycerate kinase promoter, acid phosphatase promoter, and a promoter of yeast α mating factor.

In addition, the present invention extends to a unicellular host transformed or transfected with an expression vector comprising an isolated oligonucleotide which encodes a multidimensional peptide of a multidimensional library of the present invention, wherein the isolated oligonucleotide comprises a general formula of [(NNB)Fn]m, in which:
N is A or C or G or T/U;
B is C or G or T/U, but not A;
F is a codon encoding a predetermined amino acid residue;
n is an integer, such that 0 ≤ n ≤ 10; and
m is an integer, such that 2 ≤ m ≤ 20,
operatively associated with a promoter.
A large variety of unicellular hosts have applications in the present invention, e.g., E. coli, Pseudomonas, Bacillus, Streptomyces, yeast, CHO, R1.1, B-W, L-M, COS1, COS7, BSC1, BSC40, BMT10 and Sf9 cells, to name only a few.
In another embodiment, the present invention extends to a method for generating a multidimensional library (MDL) comprising at least one multidimensional peptide having affinity for a target molecule, wherein the at least one multidimensional peptide has a general formula of (XYn)m, wherein:
X is a functional peptide unit that participates in an interaction between the at least one multidimensional peptide and the target;
Y is a structural peptide unit;
n is an integer, such that 0 ≤ n ≤ 10;
m is an integer, such that 2 ≤ m ≤ 20.
Such a method comprises the steps of:
(a) providing at least one oligonucleotide having the general formula of [(NNB)Fn]m, wherein:
N is A or C or G or T/U;
B is C or G or T/U, but not A;
F is a codon encoding a predetermined amino acid residue;
n is an integer, such that 0 ≤ n ≤ 10; and
m is an integer, such that 2 ≤ m ≤ 20;
(b) inserting the at least one oligonucleotide into an expression vector, such that the at least one oligonucleotide is operatively associated with a promoter;
(c) transforming or transfecting a unicellular host with the expression vector; and
(d) culturing the unicellular host under conditions that provide for expression of the at least one oligonucleotide to produce the at least one multidimensional peptide having affinity for the target molecule.
Furthermore, the present invention extends to a method for identifying a molecule of a multidimensional library that interacts with a target molecule, comprising the steps of:
(a) generating a multidimensional library (MDL) comprising at least one molecule comprising a general formula of:
(XYn)m
wherein (XYn) is a repeating unit of the at least one molecule in which:
X is a functional unit of the at least one molecule,
Y is a structural unit of the at least one molecule,
n is the number of the structural units in the repeating unit, such that 0 ≤ n ≤ 10, and
m is the number of repeating units in the at least one molecule, such that 2 ≤ m ≤ 20;
(b) contacting the multidimensional library with the target molecule; and
(c) detecting binding of the target molecule with the at least one molecule.
Optionally, the at least one molecule of the multidimensional library is detectably labeled. Particular examples of detectable labels having applications herein are described infra.
As explained above, the at least one molecule of a multidimensional library of the invention can be comprised of numerous different types of molecules, e.g., oligonucleotides, proteins, polypeptides, peptides, polycarbohydrates, polyamines, heterocycles, or their combinations, to name only a few. In a particular example wherein the at least one molecule of the multidimensional library comprises a protein, a polypeptide, or a peptide, X is a functional peptide unit that potentially participates in an interaction between the at least one molecule and the target, Y is a structural peptide unit, n is an integer, such that 0 ≤ n ≤ 10, and m is an integer, such that 2 ≤ m ≤ 20. In a method for identifying a molecule of a multidimensional library that interacts with a target molecule, wherein the multidimensional library comprises at least one multidimensional peptide, a skilled artisan can readily perform the step of generating the library using routine solid phase protein synthesis methods.
Alternatively, the step of generating such a library in a screening method of the present invention also comprises the steps of:
(a) providing at least one isolated oligonucleotide having the general formula of:
[(NNB)Fn]m, wherein:
N is A or C or G or T/U;
B is C or G or T/U, but not A;
F is a codon encoding a predetermined amino acid residue;
n is an integer, such that 0 ≤ n ≤ 10; and
m is an integer, such that 2 ≤ m ≤ 20;
(b) inserting the at least one isolated oligonucleotide into an expression vector, such that the at least one isolated oligonucleotide is operatively associated with a promoter;
(c) transforming a unicellular host with the expression vector; and
(d) culturing the unicellular host under conditions that provide for expression of the at least one oligonucleotide to produce at least one multidimensional peptide having affinity for the target molecule.
Methods of producing such isolated oligonucleotides are encompassed by the present invention, and described above and infra.
Numerous promoters have applications in such a method of the invention. Particular examples include, but certainly are not limited to, an immediate early promoter of hCMV, an early promoter of SV40, an early promoter of adenovirus, an early promoter of vaccinia, an early promoter of polyoma, a late promoter of SV40, a late promoter of adenovirus, a late promoter of vaccinia, a late promoter of polyoma, the lac system, the trp system, the TAC system, the TRC system, the major operator and promoter regions of phage lambda, a control region of fd coat protein, 3-phosphoglycerate kinase promoter, acid phosphatase promoter, or a promoter of yeast α mating factor, to name only a few. Furthermore, numerous expression vectors have applications in a method for generating a multidimensional library (MDL) comprising at least one multidimensional peptide having affinity for a target molecule. Particular examples of such expression vectors are set forth infra.
Likewise, numerous unicellular hosts have applications in a method for generating a multidimensional library (MDL) of the present invention, e.g., E. coli, Pseudomonas, Bacillus, Streptomyces, yeast, CHO, R1.1, B-W, L-M, COS1, COS7, BSC1, BSC40, BMT10 and Sf9 cells, to name only a few.
In another embodiment, the present invention extends to a kit for screening molecules that potentially interact with a target molecule. Such a kit comprises:
(a) a predetermined amount of a multidimensional library (MDL) comprising at least one molecule that potentially has affinity for the target molecule, wherein the at least one molecule has a general formula of (XYn)m, wherein:
X is a functional unit that interacts with the target molecule;
Y is a structural unit;
n is an integer, such that 0 ≤ n ≤ 10;
m is an integer, such that 2 ≤ m ≤ 20,
(b) other reagents; and
(c) directions for use of the kit.
Optionally, the at least one molecule of a multidimensional library of a kit of the invention is detectably labeled. Particular examples of such labels are described infra.
Furthermore, the present invention extends to a kit for screening molecules as described above, wherein the at least one molecule comprises an isolated oligonucleotide, a protein, a polypeptide, a peptide, a carbohydrate, a polyamine, a heterocyclic molecule, or a combination thereof. In a particular embodiment, wherein the at least one molecule comprises a protein, a polypeptide, or a peptide, X is a functional peptide unit that participates in an interaction between the at least one multidimensional molecule and the target, Y is a structural peptide unit, n is an integer such that 0 ≤ n ≤ 10, and m is an integer such that 2 ≤ m ≤ 20. Reagents having applications in this embodiment of a kit of the present invention are generally those that maintain a peptide's native conformation. Examples of such reagents include, but certainly are not limited to, protease inhibitors, such as PMSF, phosphate buffered saline, TRIS glycine buffer, TRIS HCl buffer, etc., wherein the reagents are at physiological pH. Other such reagents well known to those of ordinary skill in the art are encompassed herein.
In another embodiment, the present invention extends to a kit for screening molecules of a multidimensional library that potentially interact with a target molecule, wherein the kit comprises:
(a) a unicellular host transformed or transfected with an expression vector comprising at least one oligonucleotide operatively associated with a promoter, wherein the at least one oligonucleotide has the general formula of [(NNB)Fn]m, wherein:
N is A or C or G or T/U;
B is C or G or T/U, but not A;
F is a codon encoding a predetermined amino acid residue;
n is an integer, such that 0 ≤ n ≤ 10; and
m is an integer, such that 2 ≤ m ≤ 20;
(b) reagents for expressing the at least one oligonucleotide;
(c) other reagents; and
(d) directions for use of the kit.
With such a kit, one of ordinary skill in the art can readily express the at least one isolated oligonucleotide inserted into the unicellular host to produce the at least one multidimensional peptide of a multidimensional library when needed. Then, this library can readily be used to screen molecules of the library for interaction with a target molecule. Optionally, a signal sequence could be placed on the at least one multidimensional peptide using routine molecular biology techniques well known to those skilled in the art.
Reagents having applications herein include those that help a protein maintain its native conformation, such as those described above, as well as those used to express the at least one oligonucleotide inserted into the unicellular host.
Particular examples of such reagents include, but certainly are not limited to PCR reagents, such as oligonucleotides, oligonucleotide primers, enzymes, gel matrixes, buffers, etc.
Furthermore, the present invention extends to a kit for screening molecules that potentially interact with a target molecule, comprising:
(a) a predetermined amount of a multidimensional library (MDL) comprising at least one isolated oligonucleotide that potentially has affinity for the target molecule, wherein the at least one molecule has a general formula of (XYn)m, wherein:
X is a functional unit comprising a nucleotide regulatory sequence that potentially interacts with the target molecule;
Y is a structural unit comprising a nucleotide sequence comprising from 5 to at least 50 contiguous nucleotides;
n is an integer, such that 0 ≤ n ≤ 10;
m is an integer, such that 2 ≤ m ≤ 20,
(b) other reagents; and
(c) directions for use of the kit.
Numerous nucleotide regulatory sequences can serve as a functional unit in the at least one molecule in a multidimensional library used to screen molecules that potentially interact with a target molecule. Particular nucleotide regulatory sequences having applications herein comprise a promoter, an enhancer, a cis-acting locus, a trans-acting locus, an attenuator, an upstream activator, or a regulatory non-translatable region sequence.
Moreover, the present invention extends to a multidimensional library (MDL) comprising at least one multidimensional peptide having affinity for a target molecule, wherein the at least one multidimensional peptide has a general formula of (XYn)m, wherein:
X is a functional peptide unit that participates in an interaction between the at least one multidimensional molecule and the target;
Y is a structural peptide unit;
n is an integer, such that 0 ≤ n ≤ 10; and
m is an integer, such that 2 ≤ m ≤ 20,
wherein such a library is made with a process comprising the steps of:
(a) providing at least one isolated oligonucleotide having the general formula of:
[(NNB)Fn]m, wherein:
N is A or C or G or T/U;
B is C or G or T/U, but not A;
F is a codon encoding a predetermined amino acid residue;
n is an integer, such that 0 ≤ n ≤ 10; and
m is an integer, such that 2 ≤ m ≤ 20;
(b) inserting the at least one isolated oligonucleotide into an expression vector, such that the at least one isolated oligonucleotide is operatively associated with a promoter;
(c) transforming a unicellular host with the expression vector; and (d) culturing the unicellular host under conditions that provide for expression of the at least one oligonucleotide to produce at least one multidimensional peptide having affinity for the target molecule.
Accordingly, it is an object of the present invention to provide a multidimensional library wherein the overall size of the construct as well as the number of the functional units and the structural units is limited only by the vehicle that is used to display the library.
It is another object of the present invention to provide a multidimensional library (MDL) that enables a skilled artisan to select molecules with affinity and selectivity of interaction with a pre-selected target molecule.
It is another object of the present invention to provide a method for identifying a molecule interacting with a target of interest that is reproducible, quick, simple, efficient and relatively inexpensive.
It is another object of the present invention to provide methods of producing multidimensional libraries using isolated oligonucleotides of random size with a minimal amount of internal stop codons. The limitation of stop codons becomes especially important when the size of the inserted oligonucleotide is large, e.g., greater than about 20 codons. For example, using a heretofore known "genetic" method for producing peptide libraries, in which one of 32 possible variant codons is a stop codon, the probability that an isolated oligonucleotide of 100 codons has no stop codon, i.e., has an open reading frame, would be (31/32)^100, or about 4%. However, using the NNB codons of the present invention, in which only one of 48 possible codons is a stop codon, the probability of having no stop codon in the reading frame would be (47/48)^100, or about 12%.
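The two figures quoted above can be reproduced with a short calculation. This sketch assumes that every one of the 100 codons is drawn independently and uniformly from its codon set. The ratio 47/48 corresponds to the NNB set (48 codons, one stop codon) and 31/32 to a 32-codon set with one stop codon. The function name `orf_probability` is illustrative only.

```python
def orf_probability(non_stop_codons, total_codons, length):
    """Probability that `length` independent random codons contain no stop codon."""
    return (non_stop_codons / total_codons) ** length


p_48 = orf_probability(47, 48, 100)  # 48-codon set with one stop codon (NNB)
p_32 = orf_probability(31, 32, 100)  # 32-codon set with one stop codon

if __name__ == "__main__":
    print(f"(47/48)^100 = {p_48:.3f}")
    print(f"(31/32)^100 = {p_32:.3f}")
```

Under these assumptions the chance of an uninterrupted reading frame over 100 random codons is roughly three times higher for the 48-codon set than for the 32-codon set. Fixed F codons, which never encode a stop, would raise the figure further.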
It is another object of the present invention to provide a method for generating and screening a large library of diverse proteins, polypeptides and/or peptide molecules.
It is yet another object of the present invention to provide a rapid and easy way of producing a large multidimensional library comprising a plurality of longer proteins, polypeptides and/or peptides that can be efficiently screened to identify those having novel and/or improved specificities, affinities and stabilities for a particular target of choice.
It is yet still another object of the present invention to provide a method for producing a multidimensional library as described above which avoids the need for purifying or isolating genes, and any need for detailed knowledge of the function of portions of the binding sequence or of the amino acids that are involved in target binding, in order to produce MDPs. Moreover, since MDPs are screened in vitro, the solvent requirements involved in MDP/target interactions are not limited to aqueous solvents; thus, non-physiological interactions and binding conditions different from those found in vivo can be exploited.
It is still yet another object of the present invention to provide a method for designing variant oligonucleotides that permits greater variability in the sequence of the oligonucleotides than is presently permitted using schemes described above.
Moreover, non-natural amino acids could also be used in an MDL described herein if an appropriate expression system is used that is supplied with tRNA modified to express the respective non-natural amino acids [Satoh et al., Nucleic Acids Symp. Ser. 37:117-118 (1997)].
These and other aspects of the present invention will be better appreciated by reference to the following drawings and Detailed Description.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a schematic view of the construction of a linear oligonucleotide library. (A) The vector fUSE 5 contains two non-complementary SfiI sites separated by a 14-base-pair "stuffer fragment". Removal of the SfiI fragment allows oriented ligation of oligonucleotides with the appropriate cohesive ends. (B) The oligonucleotide ON-69 was annealed to two half-site fragments to form cohesive termini complementary to SfiI sites 1 and 2 in the vector. The gapped structure, in which the single-stranded region comprises the variable 16-mer codon sequence, was ligated to the vector and electro-transformed into E. coli.
FIG. 2 is a schematic view of the strategy for the "split-pull" synthesis. N1 - equimolar mixture of the 4 nucleotides; N2 - mixture of G (19%), A (31%), T (31%), and C (19%); N3 - mixture of G (39%), T (22%) and C (39%).
FIG. 3 shows the amino acid sequences (deduced from the DNA sequences) of the N-terminal peptides of pIII of infectious phages randomly selected from the library. Individual isolates were sequenced with the oligo primer fUSE32P, which is 32 nucleotides downstream of the gene III cloning site of fUSE 5. Structural blocks are underlined. Single letter code for amino acids is A (Ala), C (Cys), D (Asp), E (Glu), F (Phe), G (Gly), H (His), I (Ile), K (Lys), L (Leu), M (Met), N (Asn), P (Pro), Q (Gln), R (Arg), S (Ser), T (Thr), V (Val), W (Trp), Y (Tyr).
FIG. 4 depicts the amino acid frequencies in functional domains (A) and in structural blocks (B) analyzed in randomly chosen isolates; 2X represents a 100% deviation from the optimal frequency, which is equal to 5% for the functional domain (A) and 20% for the structural block (B). Figure 4C shows the block length distribution.
FIG. 5 shows selection of phages that bind to streptavidin. Phages from the MDL were bound to streptavidin coated microtiter wells and then eluted with glycine/HCl buffer pH 2.2. Enrichment was calculated as the total number of phages recovered after the elution (ni, measured in transducing units) divided by the number of transducing units recovered after the first round of selection (n1). The data represent mean values from plating in triplicate (SEM < 10%).
FIG. 6 shows the amino acid sequences (deduced from DNA sequence) of the N-terminal peptides of pIII of 29 clones recovered after 2, 3, or 4 rounds of panning on streptavidin coated microtiter wells.
DETAILED DESCRIPTION OF THE INVENTION
The present invention is based upon the discovery that surprisingly and unexpectedly, libraries comprising molecules of random length that are not limited to a maximum length can be constructed easily and efficiently. Moreover, such libraries can be advantageously screened to identify molecules of a multidimensional library of the present invention that possess binding specificity for a variety of targets. Thus, libraries of the present invention are novel, useful and unobvious with respect to heretofore known libraries in which the length of molecules of the library is limited, such as, for example, less than 15 and preferably about 10-12 amino acids.
Broadly, the present invention extends to a multidimensional library for screening molecules that potentially interact with a target molecule, wherein said library comprises at least one molecule comprising a general formula of (XYn)m, wherein:
(XYn) is a repeating unit of the at least one molecule in which:
X is a functional unit that interacts with the target molecule, Y is a structural unit, n is the number of structural units in the repeating unit, and m is a number of repeating units in the at least one molecule.
The presence of structural and functional units in a molecule of a multidimensional library of the present invention provide the opportunity for the development of secondary and/or tertiary structure in molecules, which potentially increase their affinity for the target, and more accurately mimic molecules and compounds found in vivo. Such complex structural developments are not feasible in libraries utilizing peptides of a limited length.
As explained above, an MDL of the present invention may be represented by various natural or artificial polymeric compounds including, but not limited to, isolated oligonucleotides, proteins, polypeptides, peptides, polycarbohydrates, polyamines, heterocycles, or their combinations, to name only a few.
For example, the present invention extends to a multidimensional library for screening molecules that potentially interact with a target molecule, wherein the library comprises at least one molecule comprising a general formula of (XYn)m, and the at least one molecule of the library comprises a protein, a polypeptide or a peptide, in which (XYn) is a repeating unit of the at least one molecule wherein:
X is a functional peptide unit that participates in an interaction between the at least one molecule and the target;
Y is a structural peptide unit;
n is an integer such that 0 ≤ n ≤ 10; and
m is an integer, such that 2 ≤ m ≤ 20.

In another example, the present invention extends to a multidimensional library for screening molecules that potentially interact with a target molecule, wherein said library comprises at least one isolated oligonucleotide comprising a general formula of (XYn)m, wherein:
(XYn) is a repeating unit of the at least one oligonucleotide;
X is a functional unit comprising a nucleotide regulatory sequence;
Y is a structural unit comprising a nucleotide sequence comprising from 6 to at least 60 contiguous nucleotides.
Moreover, the present invention extends to a multidimensional library comprising at least one multidimensional peptide having affinity for a target molecule, wherein the at least one multidimensional peptide has a general formula of (XYn)m, wherein:
X is a functional peptide unit that participates in an interaction between the at least one multidimensional peptide and the target;
Y is a structural peptide unit;
n is an integer, such that 0 ≤ n ≤ 10;
m is an integer, such that 2 ≤ m ≤ 20,
wherein such a library is made by a process comprising the steps of:
(a) providing at least one oligonucleotide having the general formula of [(NNB)Fn]m, wherein:
N is A or C or G or T/U;
B is C or G or T/U, but not A;
F is a codon encoding a predetermined amino acid residue;
n is an integer, such that 0 ≤ n ≤ 10; and
m is an integer, such that 2 ≤ m ≤ 20;
(b) inserting the at least one oligonucleotide into an expression vector, such that the at least one oligonucleotide is operatively associated with a promoter;
(c) transforming a unicellular host with the expression vector; and (d) culturing the unicellular host under conditions that provide for expression of the at least one oligonucleotide to produce the at least one multidimensional peptide having affinity for the target molecule.
Furthermore, as explained above, the present invention extends to cloning vectors and expression vectors comprising, inter alia, at least one oligonucleotide having the general formula of [(NNB)Fn]m, wherein:
N is A or C or G or T/U;
B is C or G or T/U, but not A;
F is a codon encoding a predetermined amino acid residue;
n is an integer, such that 0 ≤ n ≤ 10; and
m is an integer, such that 2 ≤ m ≤ 20.
In addition, the present invention extends to unicellular hosts transformed or transfected with vectors of the present invention.
Numerous terms and phrases used throughout the instant Specification and appended Claims are defined below:
As used herein, the phrase "multidimensional library" or "MDL" refers to a library of various natural or artificial polymeric compounds, including, but not limited to, polynucleotides, polypeptides, peptides, polycarbohydrates, polyamines, heterocycles, or a combination thereof.
As used herein, the phrase "multidimensional peptide" or "MDP" refers to a polypeptide or peptide having a general formula of (XYn)m, wherein X is a functional peptide unit that participates in the interaction between the multidimensional peptide and a target molecule, and Y is a structural peptide unit(s) involved in positioning the functional peptide unit(s) so as to maximize its/their interaction with the target molecule. The presence of structural and functional peptide units provides the opportunity for the development of secondary and/or tertiary structure in the potential binding protein/peptides, and in sequences flanking the actual binding portion(s) of the binding domain of MDP. Such complex structural developments are not feasible in libraries utilizing peptides of a limited length.
MDPs or MDP composition comprising a part thereof may be used in any in vivo or in vitro application that might make use of a peptide or a polypeptide that specifically interacts with a target. Thus, MDP or the MDP composition can be used in place of, or to bind to, a cell surface receptor, a viral receptor, an enzyme, a lectin, an integrin, an adhesin, a Cap binding protein, a metal binding protein, DNA or RNA binding proteins, immunoglobulins, vitamin cofactors, peptides that recognize any bio-organic or inorganic compound, etc.
Selected MDPs that possess catalytic activities may be used as artificial enzymes to chemically modify targets.
By virtue of the affinity for a target, MDPs or compositions comprising MDPs or a portion thereof used in vitro and in vivo can deliver a chemically or biologically active moiety, such as a metal ion, a radioisotope, peptide, toxin or fragment thereof, or an enzyme or fragment thereof, or pharmaceutical formulation thereof, to the specific target in or on the cell. The MDPs can also have an in vitro utility similar to monoclonal antibodies or other specific binding molecules for the detection, quantification, separation or purification of other molecules. In one embodiment, a number of MDPs or the binding domains thereof can be assembled as multimeric units to provide multiple binding domains that have the same specificity and can be fused to another molecule that has a biological or chemical activity.
The MDPs that are produced with a method of the present invention can replace the function of macromolecules such as monoclonal or polyclonal antibodies and thereby circumvent the need for the complex methods of hybridoma formation or an in vivo antibody production. Moreover, MDPs differ from other natural binding molecules in that MDPs have an easily characterized and designed activity that can allow their direct and rapid detection in a screening process. Furthermore, it is expected that some MDP molecules may possess catalytic activities and can therefore be used as artificial enzymes.
As used in the present invention, MDPs are intended to encompass a concatenated protein, polypeptide and/or peptide that includes structural and functional elements. The affinity of the functional element of the MDP molecule for a target is characterized by: 1) its strength of binding under specified conditions; 2) the stability of its binding or other interactions under specified conditions; and, 3) its selective specificity for the chosen target. The structural element of the MDP molecule is a domain that separates the functional elements and locates them at the most appropriate coordinates relative to each other.
As used herein, the terms "target" or "target molecule" can be used interchangeably, and refer to a substance, including a molecule or portion thereof, or complex of several molecules for which a receptor naturally exists or can be prepared according to the method of the invention. In particular, a target is a substance that specifically interacts with the functional unit(s) of a molecule of an MDL of the present invention, and includes, but is not limited to, a chemical group, an ion, a metal, a protein, a glycoprotein or any portion thereof, a peptide or any portion of a peptide, a nucleic acid or any portion of a nucleic acid, a sugar, a carbohydrate or a carbohydrate polymer, a lipid, a fatty acid, a viral particle or portion thereof, a membrane vesicle or portion thereof, a cell wall component, a synthetic organic compound, a bio-organic compound and an inorganic compound.
An MDP that can bind to a target can function as a receptor, i.e., a lock into which the target fits and binds; or an MDP can function as a key which fits into and binds to a target when the target is a larger protein molecule; or an MDP can function as a catalyst to accelerate or to slow down the chemical or biochemical conversion of the target. In this invention, a target is a substance that specifically interacts with, or binds to, an MDP and includes, but is not limited to, an organic chemical group, an ion, a metal or non-metal inorganic ion, a glycoprotein, a protein, a polypeptide, a peptide, a nucleic acid, a carbohydrate or a carbohydrate polymer, a lipid, a fatty acid, a viral particle, a membrane vesicle, a cell wall component, a synthetic organic compound, a molecular complex or any portion of any of the above.
As used herein, the singular forms "a," "an" and "the" include plural referents unless the context clearly dictates otherwise.
Moreover, in accordance with the present invention there may be employed conventional molecular biology, microbiology, and recombinant DNA techniques within the skill of the art. Such techniques are explained fully in the literature. See, e.g., Sambrook, Fritsch & Maniatis, Molecular Cloning: A Laboratory Manual, Second Edition (1989) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York (herein "Sambrook et al., 1989"); DNA Cloning: A Practical Approach, Volumes I and II (D.N. Glover ed. 1985); Oligonucleotide Synthesis (M.J. Gait ed. 1984); Nucleic Acid Hybridization, B.D. Hames & S.J. Higgins eds. (1985); Transcription And Translation, B.D. Hames & S.J. Higgins, eds. (1984); Animal Cell Culture, R.I. Freshney, ed. (1986); Immobilized Cells And Enzymes, IRL Press, (1986); B. Perbal, A Practical Guide To Molecular Cloning (1984); F.M. Ausubel et al. (eds.), Current Protocols in Molecular Biology, John Wiley & Sons, Inc. (1994). Therefore, if appearing herein, the following terms shall have the definitions set out below.
A "vector" is a replicon, such as plasmid, phage or cosmid, to which another DNA segment may be attached so as to bring about the replication of the attached segment. A "replicon" is any genetic element (e.g., plasmid, chromosome, virus) that functions as an autonomous unit of DNA replication in vivo, i.e., capable of replication under its own control.
A "cassette" refers to a segment of DNA that can be inserted into a vector at specific restriction sites. The segment of DNA encodes a polypeptide of interest, and the cassette and restriction sites are designed to ensure insertion of the cassette in the proper reading frame for transcription and translation.
A cell has been "transfected" by exogenous or heterologous DNA when such DNA has been introduced inside the cell. A cell has been "transformed" by exogenous or heterologous DNA when the transfected DNA effects a phenotypic change.
Preferably, the transforming DNA should be integrated (covalently linked) into chromosomal DNA making up the genome of the cell.
"Heterologous" DNA refers to DNA not naturally located in the cell, or in a chromosomal site of the cell. Thus, oligonucleotides that encode multidimensional peptides of a multidimensional library, as well as oligonucleotides which make up a multidimensional library of the invention, are heterologous DNA when inserted into a vector that is used to transform/transfect a cell.
A "nucleic acid molecule" refers to the phosphate ester polymeric form of ribonucleosides (adenosine, guanosine, uridine or cytidine; "RNA molecules") or deoxyribonucleosides (deoxyadenosine, deoxyguanosine, deoxythymidine, or deoxycytidine; "DNA molecules"), or any phosphoester analogs thereof, such as phosphorothioates and thioesters, in either single stranded form, or a double-stranded helix. Double stranded DNA-DNA, DNA-RNA and RNA-RNA helices are possible.
The term oligonucleotide, and in particular DNA or RNA, or isolated nucleic acid molecule, refers only to the primary and secondary structure of the molecule, and does not limit it to any particular tertiary forms. Thus, this term includes double-stranded DNA found, inter alia, in linear or circular DNA molecules (e.g., restriction fragments), plasmids, and chromosomes. In discussing the structure of particular double-stranded DNA molecules, sequences may be described herein according to the normal convention of giving only the sequence in the 5' to 3' direction along the non-transcribed strand of DNA (i.e., the strand having a sequence homologous to the mRNA). A "recombinant oligonucleotide" is an oligonucleotide that has undergone a molecular biological manipulation.
A nucleic acid molecule is "hybridizable" to another nucleic acid molecule, such as a cDNA, genomic DNA, or RNA, when a single stranded form of the nucleic acid molecule can anneal to the other nucleic acid molecule under the appropriate conditions of temperature and solution ionic strength (see Sambrook et al., supra). The conditions of temperature and ionic strength determine the "stringency" of the hybridization. For preliminary screening for homologous nucleic acids, low stringency hybridization conditions, corresponding to a Tm of 55° C, can be used (e.g., 5x SSC, 0.1% SDS, 0.25% milk, and no formamide; or 30% formamide, 5x SSC, 0.5% SDS). Moderate stringency hybridization conditions correspond to a higher Tm, e.g., 40% formamide, with 5x or 6x SSC. High stringency hybridization conditions correspond to the highest Tm, e.g., 50% formamide, 5x or 6x SSC. Hybridization requires that the two nucleic acids contain complementary sequences, although depending on the stringency of the hybridization, mismatches between bases are possible. The appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementation, variables well known in the art. The greater the degree of similarity or homology between two nucleotide sequences, the greater the value of Tm for hybrids of nucleic acids having those sequences. The relative stability (corresponding to higher Tm) of nucleic acid hybridization decreases in the following order: RNA:RNA, DNA:RNA, DNA:DNA. For hybrids of greater than 100 nucleotides in length, equations for calculating Tm have been derived (see Sambrook et al., supra, 9.50-9.51). For hybridization with shorter nucleic acids, i.e., oligonucleotides, the position of mismatches becomes more important, and the length of the oligonucleotide determines its specificity (see Sambrook et al., supra, 11.7-11.8).
Preferably a minimum length for a hybridizable nucleic acid is at least about nucleotides; preferably at least about 30 nucleotides; more preferably the length is at least about 40 nucleotides; and even more preferably at least about 50 nucleotides.
In a specific embodiment, the term "standard hybridization conditions" refers to a Tm of 55° C, and utilizes conditions as set forth above. In a preferred embodiment, the Tm is 60° C; in a more preferred embodiment, the Tm is 65° C.
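The dependence of Tm on hybrid length, base composition, ionic strength and formamide described above can be sketched numerically. The following is a minimal illustration only, assuming one widely cited empirical equation for DNA hybrids longer than 100 nucleotides; the function name, parameter names and the specific coefficients are assumptions made for illustration and are not taken from the present specification.

```python
import math


def estimate_tm(length_nt, gc_percent, na_molar, formamide_percent=0.0):
    """Rough Tm estimate (degrees C) for a long DNA:DNA hybrid.

    Assumed empirical form (not from this specification):
        Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC)
             - 0.63*(%formamide) - 600/length
    """
    if length_nt <= 100:
        # The text notes such equations apply to hybrids > 100 nt.
        raise ValueError("equation assumed valid only for hybrids > 100 nt")
    if na_molar <= 0:
        raise ValueError("sodium concentration must be positive")
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        - 0.63 * formamide_percent
        - 600.0 / length_nt
    )


if __name__ == "__main__":
    # Same hybrid, increasing formamide: the estimated Tm falls, which is
    # why higher formamide corresponds to more stringent conditions.
    for formamide in (0, 30, 40, 50):
        print(formamide, round(estimate_tm(600, 50, 1.0, formamide), 1))
```

Under this assumed equation, raising the formamide percentage at a fixed hybridization temperature brings the conditions closer to the hybrid's Tm, so only better-matched sequences remain annealed.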
"Homologous recombination" refers to the insertion of a foreign DNA sequence of a vector in a chromosome. Preferably, the vector targets a specific chromosomal site for homologous recombination. For specific homologous recombination, the vector will contain sufficiently long regions of homology to sequences of the chromosome to allow complementary binding and incorporation of the vector into the chromosome. Longer regions of homology, and greater degrees of sequence similarity, may increase the efficiency of homologous recombination.
A DNA "coding sequence" is a DNA sequence which is transcribed and translated into a polypeptide in a cell in vitro or in vivo when placed under the control of appropriate regulatory sequences. The boundaries of the coding sequence are determined by a start codon at the 5' (amino) terminus and a translation stop codon at the 3' (carboxyl) terminus. A coding sequence can include, but is not limited to, prokaryotic sequences, cDNA from eukaryotic mRNA, genomic DNA sequences from eukaryotic (e.g., mammalian) DNA, and even synthetic DNA sequences. If the coding sequence is intended for expression in a eukaryotic cell, a polyadenylation signal and transcription termination sequence will usually be located 3' to the coding sequence.
Transcriptional and translational control sequences are nucleotide regulatory sequences, such as promoters, enhancers, terminators, and the like, that provide for the expression of a coding sequence in a host cell. In eukaryotic cells, polyadenylation signals are control sequences.
A "promoter" is a nucleotide regulatory region capable of binding RNA polymerase in a cell and initiating transcription of a downstream (3' direction) coding sequence. For purposes of defining the present invention, the promoter sequence is bounded at its 3' terminus by the transcription initiation site and extends upstream (5' direction) to include the minimum number of bases or elements necessary to initiate transcription at levels detectable above background. Within the promoter sequence will be found a transcription initiation site (conveniently defined, for example, by mapping with nuclease S1), as well as protein binding domains (consensus sequences) responsible for the binding of RNA polymerase.
A coding sequence is "under the control" of transcriptional and translational control sequences in a cell when RNA polymerase transcribes the coding sequence into mRNA, which is then trans-RNA spliced and translated into the protein encoded by the coding sequence.
A "signal sequence" can be included at the beginning of the coding sequence of a protein, polypeptide or peptide that is to be expressed on the surface of a cell. This sequence encodes a signal peptide, N-terminal to the mature polypeptide, that directs the host cell to translocate the polypeptide. The term "translocation signal sequence" is used herein to refer to this sort of signal sequence. Translocation signal sequences can be found associated with a variety of proteins native to eukaryotes and prokaryotes, and are often functional in both types of organisms.
As used herein, the phrase "nucleotide regulatory sequence" refers to nucleotide sequences that are involved in the regulation of expression of a particular gene.
As used herein, the term "promoter" refers to a nucleotide regulatory sequence capable of binding RNA polymerase in a cell and initiating transcription of a downstream (3' direction) coding sequence. For purposes of defining the present invention, the promoter sequence is bounded at its 3' terminus by the transcription initiation site and extends upstream (5' direction) to include the minimum number of bases or elements necessary to initiate transcription at levels detectable above background. Within the promoter sequence will be found a transcription initiation site.
As used herein, the term "enhancer" refers to a nucleotide regulatory sequence that increases the transcriptional activity of nearby structural genes. Enhancers can act over a distance of thousands of base pairs and can be located 5' or 3' to the gene they affect.
As used herein, the phrase "cis-acting locus" refers to a nucleotide regulatory sequence that affects the activity only of DNA sequences on its own molecule of DNA; this property usually implies that the locus does not encode a protein.
As used herein, the phrase "trans-acting locus" refers to a nucleotide regulatory sequence that affects the activity of a DNA 5' or 3' from its own molecule of DNA.
As used herein, the term "attenuator" refers to a regulatory nucleotide sequence between a promoter and the structural gene of some operons that can act to regulate the transit of RNA polymerase and thus control transcription of the structural gene.
As used herein, the phrase "regulatory non-translatable region sequence" refers to a regulatory nucleotide sequence that does not code for a protein or fragment thereof.
Detectable Labels
As explained above, detectable labels have applications in a variety of embodiments of the present invention. In particular, numerous labels well known to those of ordinary skill in the art can have applications with oligonucleotides described herein, as well as at least one molecule of a multidimensional library of the present invention. Even a target molecule of a multidimensional library can be detectably labeled using routine techniques. Suitable labels include enzymes, fluorophores (e.g., fluorescein isothiocyanate (FITC), phycoerythrin (PE), Texas red (TR), rhodamine, free or chelated lanthanide series salts, especially Eu3+, to name a few fluorophores), chromophores, radioisotopes, chelating agents, dyes, colloidal gold, latex particles, ligands (e.g., biotin), chemiluminescent agents, enzymes, and amplifiable nucleotide sequences. When a control marker is employed, the same or different labels may be used for the receptor and control marker.
In the instance where a radioactive label, such as the isotopes 3H, 14C, 32P, 35S, 36Cl, 51Cr, 57Co, 58Co, 59Fe, 90Y, 125I, 131I and 186Re, is used, known currently available counting procedures may be utilized. In the instance where the label is an enzyme, detection may be accomplished by any of the presently utilized colorimetric, spectrophotometric, fluorospectrophotometric, amperometric or gasometric techniques known in the art.
Direct labels are one example of a label which can be used according to the present invention. A direct label has been defined as an entity, which in its natural state, is readily visible, either to the naked eye, or with the aid of an optical filter and/or applied stimulation, e.g., U.V. light to promote fluorescence. Examples of colored labels that can be used according to the present invention include metallic sol particles, for example, gold sol particles such as those described by Leuvering (U.S. Patent 4,313,734); dye sol particles such as described by Gribnau et al. (U.S. Patent 4,373,932) and May et al. (WO 88/08534); dyed latex such as described by May, supra, Snyder (EP-A 0 280 559 and 0 281 327); or dyes encapsulated in liposomes as described by Campbell et al. (U.S. Patent 4,703,017). Other direct labels include a radionucleotide, a fluorescent moiety or a luminescent moiety. In addition to these direct labeling devices, indirect labels comprising enzymes can also be used according to the present invention. Various types of enzyme linked immunoassays are well known in the art, for example, those using alkaline phosphatase, horseradish peroxidase, lysozyme, glucose-6-phosphate dehydrogenase, lactate dehydrogenase or urease; these and others have been discussed in detail by Eva Engvall in Enzyme Immunoassay ELISA and EMIT in Methods in Enzymology, 70:419-439 (1980) and in U.S. Patent 4,857,453.
Other labels for use in the invention include magnetic beads or magnetic resonance imaging labels.
Moreover, molecules of a multidimensional library of the present invention can also be labeled by metabolic labeling. Metabolic labeling, for example, occurs during in vitro incubation of the cells that express the multidimensional peptide(s) in the presence of culture medium supplemented with a metabolic label, such as [35S]-methionine or [32P]-orthophosphate. Likewise, isolated oligonucleotides of a multidimensional library of the present invention can be metabolically labeled by replicating in the presence of metabolic labels.
In addition to metabolic (or biosynthetic) labeling with [35S]-methionine, the invention further contemplates labeling with [14C]-amino acids and [3H]-amino acids (with the tritium substituted at non-labile positions).
Due to the degenerate nature of codons in the genetic code, the multidimensional peptide(s) of a multidimensional library of the present invention can be encoded by numerous oligonucleotides. "Degenerate nature" refers to the use of different three-letter codons to specify a particular amino acid pursuant to the genetic code.
It is well known in the art that a total of 64 codons are known in nature and can be used interchangeably to code for the twenty naturally occurring amino acid residues. A list of the naturally occurring codons is set forth below:
Phenylalanine (Phe or F) UUU or UUC
Leucine (Leu or L) UUA or UUG or CUU or CUC or CUA or CUG
Isoleucine (Ile or I) AUU or AUC or AUA
Methionine (Met or M) AUG
Valine (Val or V) GUU or GUC or GUA or GUG
Serine (Ser or S) UCU or UCC or UCA or UCG or AGU or AGC
Proline (Pro or P) CCU or CCC or CCA or CCG
Threonine (Thr or T) ACU or ACC or ACA or ACG
Alanine (Ala or A) GCU or GCC or GCA or GCG
Tyrosine (Tyr or Y) UAU or UAC
Histidine (His or H) CAU or CAC
Glutamine (Gln or Q) CAA or CAG
Asparagine (Asn or N) AAU or AAC
Lysine (Lys or K) AAA or AAG
Aspartic Acid (Asp or D) GAU or GAC
Glutamic Acid (Glu or E) GAA or GAG
Cysteine (Cys or C) UGU or UGC
Arginine (Arg or R) CGU or CGC or CGA or CGG or AGA or AGG
Glycine (Gly or G) GGU or GGC or GGA or GGG
Tryptophan (Trp or W) UGG
Termination codon UAA (ochre) or UAG (amber) or UGA (opal)
Methods for synthesizing isolated oligonucleotides that encode for multidimensional peptides of a multidimensional library of the present invention employ 48 of these codons. However, those 48 codons encode for all 20 naturally occurring amino acid residues. As explained infra, these 48 codons provide much greater variability to the multidimensional peptides of a library of the present invention than do conventional methods of producing peptides for a library. It should be understood that the codons specified above are for RNA sequences. The corresponding codons for DNA have a T substituted for U.
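The statement that 48 codons suffice for all 20 amino acid residues can be checked by enumeration. The sketch below assumes that the 48 codons are those matching the NNB pattern given later in this section for expression vectors (N is any base; B is C, G or T/U, but not A), which yields 4 x 4 x 3 = 48 codons; the variable and function names are hypothetical and chosen only for illustration.

```python
from itertools import product

# Standard genetic code in RNA letters, ordered U, C, A, G at each position.
BASES = "UCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"  # first base U
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def nnb_codons():
    """Return all codons whose third base is not A (the assumed NNB set)."""
    return [codon for codon in CODON_TABLE if codon[2] != "A"]


def summarize_nnb():
    """Return (number of codons, number of amino acids, stop codons) for NNB."""
    codons = nnb_codons()
    amino_acids = {CODON_TABLE[c] for c in codons} - {"*"}
    stops = sorted(c for c in codons if CODON_TABLE[c] == "*")
    return len(codons), len(amino_acids), stops


if __name__ == "__main__":
    print(summarize_nnb())
```

Under this reading, excluding A from the third position removes the UAA (ochre) and UGA (opal) termination codons, leaving UAG (amber) as the only termination codon in the set, while every amino acid keeps at least one codon.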
Cloning Vectors
Furthermore, the present invention also relates to cloning vectors comprising oligonucleotides encoding multidimensional peptides of a multidimensional library of the invention and an origin of replication, wherein the MDP comprises a general formula of (XYn)m, and X is a functional peptide unit, Y is a structural peptide unit, n is an integer, such that 0≤n≤10, and m is an integer wherein 2≤m≤20. In another embodiment, a cloning vector of the present invention comprises at least one isolated oligonucleotide of a multidimensional library of the invention, wherein the isolated oligonucleotide comprises a general formula of (XYn)m, wherein (XYn) is a repeating unit of the isolated oligonucleotide, X is a functional unit comprising a nucleotide regulatory sequence, Y is a structural unit comprising from 6 to at least 60 contiguous nucleotides, n is the number of the structural units in the repeating unit, such that 0≤n≤10, and an origin of replication.
A large number of vector-host systems known in the art may be used. Possible vectors include, but are not limited to, plasmids or modified viruses, but the vector system must be compatible with the host cell used. Examples of vectors include, but are not limited to, E. coli, bacteriophages such as lambda derivatives, or plasmids such as pBR322 derivatives or pUC plasmid derivatives, e.g., pGEX vectors, pmal-c, pFLAG, etc. The insertion into a cloning vector can, for example, be accomplished by ligating an isolated oligonucleotide into a cloning vector which has complementary cohesive termini. However, if the complementary restriction sites used to fragment the DNA are not present in the cloning vector, the ends of the DNA molecules may be enzymatically modified. Alternatively, any site desired may be produced by ligating nucleotide sequences (linkers) onto the DNA termini; these ligated linkers may comprise specific chemically synthesized oligonucleotides encoding restriction endonuclease recognition sequences. Isolated oligonucleotides that encode at least one multidimensional peptide of a multidimensional library of the invention, or that are molecules of a multidimensional library, can be introduced into host cells via transfection, electroporation, microinjection, transduction, cell fusion, DEAE dextran, calcium phosphate precipitation, lipofection (lysosome fusion), use of a gene gun, or a DNA vector transporter (see, e.g., Wu et al., 1992, J. Biol. Chem. 267:963-967; Wu and Wu, 1988, J. Biol. Chem. 263:14621-14624; Hartmut et al., Canadian Patent Application No. 2,012,311, filed March 15, 1990), etc., so that many copies of the oligonucleotides are generated. The same is true for isolated oligonucleotides that form a multidimensional library of the present invention, wherein the isolated oligonucleotide comprises a general formula of (XYn)m, wherein:
(XYn) is a repeating unit of the oligonucleotide in which:
X is a functional unit comprising a nucleotide regulatory sequence,
Y is a structural unit comprising a nucleotide sequence comprising from 6 to at least 60 nucleotides,
n is the number of said structural units in said repeating unit, and
m is a number of repeating units in said at least one isolated oligonucleotide.
Expression Vectors
As explained above, oligonucleotides encoding multidimensional peptides of libraries of the present invention can be inserted into an appropriate expression vector, i.e., a vector which contains the necessary elements for the transcription and translation of the inserted protein-coding sequence. Such elements are termed herein a "promoter." Thus, an isolated oligonucleotide is operatively associated with a promoter in an expression vector of the invention. An expression vector also preferably includes an origin of replication.

The necessary transcriptional and translational signals can be provided on a recombinant expression vector, or they may be supplied by a recombinant oligonucleotide.
Potential host-vector systems include but are not limited to mammalian cell systems infected with virus (e.g., vaccinia virus, adenovirus, etc.); insect cell systems infected with virus (e.g., baculovirus); microorganisms such as yeast containing yeast vectors; or bacteria transformed with bacteriophage, DNA, plasmid DNA, or cosmid DNA. The expression elements of vectors vary in their strengths and specificities. Depending on the host-vector system utilized, any one of a number of suitable transcription and translation elements may be used.
A multidimensional peptide of a multidimensional library of the invention may be expressed chromosomally, after integration of the coding sequence by recombination. In this regard, any of a number of amplification systems may be used to achieve high levels of stable oligonucleotide expression (See Sambrook et al., 1989, supra).
The cell(s) containing the recombinant vector(s) comprising the oligonucleotide(s) encoding the multidimensional peptide(s) are cultured in an appropriate cell culture medium under conditions that provide for expression of oligonucleotide(s) by the cell.
Any of the methods previously described for the insertion of DNA fragments into a cloning vector may be used to construct expression vectors containing an oligonucleotide(s) encoding a multidimensional peptide(s) comprising appropriate transcriptional/translational control signals and the protein coding sequences. These methods may include in vitro recombinant DNA and synthetic techniques and in vivo recombination (genetic recombination).
Expression of an oligonucleotide(s) to produce a multidimensional peptide(s) of a multidimensional library of the present invention may be controlled by any promoter/enhancer element known in the art, but these regulatory elements must be functional in the host selected for expression. Promoters which may be used include, but are not limited to, the SV40 early promoter region (Benoist and Chambon, 1981, Nature 290:304-310), the promoter contained in the 3' long terminal repeat of Rous sarcoma virus (Yamamoto, et al., 1980, Cell 22:787-797), the herpes thymidine kinase promoter (Wagner et al., 1981, Proc. Natl. Acad. Sci. U.S.A. 78:1441-1445), the regulatory sequences of the metallothionein gene (Brinster et al., 1982, Nature 296:39-42); prokaryotic expression vectors such as the β-lactamase promoter (Villa-Kamaroff, et al., 1978, Proc. Natl. Acad. Sci. U.S.A. 75:3727-3731), or the tac promoter (DeBoer, et al., 1983, Proc. Natl. Acad. Sci. U.S.A. 80:21-25); see also "Useful proteins from recombinant bacteria" in Scientific American, 1980, 242:74-94; promoter elements from yeast or other fungi such as the Gal 4 promoter, the ADH (alcohol dehydrogenase) promoter, PGK (phosphoglycerol kinase) promoter, alkaline phosphatase promoter; and the animal transcriptional control regions, which exhibit tissue specificity and have been utilized in transgenic animals: elastase I gene control region which is active in pancreatic acinar cells (Swift et al., 1984, Cell 38:639-646; Ornitz et al., 1986, Cold Spring Harbor Symp. Quant. Biol. 50:399-409; MacDonald, 1987, Hepatology 7:425-515); insulin gene control region which is active in pancreatic beta cells (Hanahan, 1985, Nature 315:115-122), immunoglobulin gene control region which is active in lymphoid cells (Grosschedl et al., 1984, Cell 38:647-658; Adames et al., 1985, Nature 318:533-538; Alexander et al., 1987, Mol. Cell. Biol. 7:1436-1444), mouse mammary tumor virus control region which is active in testicular, breast, lymphoid and mast cells (Leder et al., 1986, Cell 45:485-495), albumin gene control region which is active in liver (Pinkert et al., 1987, Genes and Devel. 1:268-276), alpha-fetoprotein gene control region which is active in liver (Krumlauf et al., 1985, Mol. Cell. Biol. 5:1639-1648; Hammer et al., 1987, Science 235:53-58), alpha 1-antitrypsin gene control region which is active in the liver (Kelsey et al., 1987, Genes and Devel. 1:161-171), beta-globin gene control region which is active in myeloid cells (Mogram et al., 1985, Nature 315:338-340; Kollias et al., 1986, Cell 46:89-94), myelin basic protein gene control region which is active in oligodendrocyte cells in the brain (Readhead et al., 1987, Cell 48:703-712), myosin light chain-2 gene control region which is active in skeletal muscle (Sani, 1985, Nature 314:283-286), and gonadotropic releasing hormone gene control region which is active in the hypothalamus (Mason et al., 1986, Science 234:1372-1378).
Expression vectors comprising at least one isolated oligonucleotide having a general formula of [(NNB)Fn]m, wherein:
N is A or C or G or T/U;
B is C or G or T/U, but not A;
F is a codon encoding a predetermined amino acid residue;
n is an integer, such that 0≤n≤10; and
m is an integer, such that 2≤m≤20,
operatively associated with a promoter, can be identified by four general approaches: (a) PCR amplification of the desired plasmid DNA or specific mRNA, (b) nucleic acid hybridization, (c) presence or absence of selection marker gene functions, and (d) expression of inserted sequences. In the first approach, the isolated oligonucleotides can be amplified by PCR to provide for detection of the amplified product. In the second approach, the presence of a foreign oligonucleotide in an expression vector can be detected by nucleic acid hybridization using probes comprising sequences that are homologous to an inserted marker gene. In the third approach, the recombinant vector/host system can be identified and selected based upon the presence or absence of certain "selection marker" gene functions (e.g., β-galactosidase activity, thymidine kinase activity, resistance to antibiotics, transformation phenotype, occlusion body formation in baculovirus, etc.) caused by the insertion of foreign genes in the vector.
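The oligonucleotide formula above can be illustrated by generating one random library member. This is a sketch only: it reads the formula as [(NNB)Fn]m, i.e., m repeating units each made of one randomized NNB codon followed by n copies of the fixed codon F, which is an interpretation of the formula rather than a statement of the inventors' synthesis procedure. The function name random_mdp_oligo, its parameters, and the example choice of GGT (a glycine codon) for F are hypothetical.

```python
import random


def random_mdp_oligo(n, m, fixed_codon="GGT", rng=None):
    """Return one random DNA sequence following the pattern [(NNB)Fn]m.

    n           -- copies of the fixed codon F per repeating unit (0 to 10)
    m           -- number of repeating units (2 to 20)
    fixed_codon -- codon F for the predetermined residue (assumed example)
    rng         -- optional random.Random instance, for reproducibility
    """
    if not 0 <= n <= 10:
        raise ValueError("n must satisfy 0 <= n <= 10")
    if not 2 <= m <= 20:
        raise ValueError("m must satisfy 2 <= m <= 20")
    if len(fixed_codon) != 3 or set(fixed_codon) - set("ACGT"):
        raise ValueError("fixed_codon must be three DNA bases")
    rng = rng or random.Random()
    units = []
    for _ in range(m):
        # N: any of A, C, G, T; B: C, G or T, but not A.
        nnb = rng.choice("ACGT") + rng.choice("ACGT") + rng.choice("CGT")
        units.append(nnb + fixed_codon * n)
    return "".join(units)


if __name__ == "__main__":
    # Three repeating units, each with one random codon and two GGT codons.
    print(random_mdp_oligo(n=2, m=3, rng=random.Random(0)))
```

Each call returns a sequence of 3 x m x (n + 1) bases, so the randomized positions stay in frame with the fixed structural codons.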
Unicellular Hosts Transformed or Transfected with a Vector of the Present Invention
A wide variety of unicellular host/expression vector combinations may be employed in replicating and/or expressing isolated oligonucleotides which form a multidimensional library of the present invention, or isolated oligonucleotides which encode for multidimensional peptides of multidimensional libraries of the present invention. For example, useful expression vectors may consist of segments of chromosomal, non-chromosomal and synthetic DNA sequences. Suitable vectors include derivatives of SV40 and known bacterial plasmids, e.g., E. coli plasmids col E1, pCR1, pBR322, pMal-C2, pET, pGEX (Smith et al., 1988, Gene 67:31-40), pMB9 and their derivatives, plasmids such as RP4; phage DNAs, e.g., the numerous derivatives of phage λ, e.g., NM989, and other phage DNA, e.g., M13 and filamentous single stranded phage DNA; yeast plasmids such as the 2μ plasmid or derivatives thereof; vectors useful in eukaryotic cells, such as vectors useful in insect or mammalian cells; vectors derived from combinations of plasmids and phage DNAs, such as plasmids that have been modified to employ phage DNA or other expression control sequences; and the like.
For example, in a baculovirus expression system, both non-fusion transfer vectors, such as but not limited to pVL941 (BamH1 cloning site; Summers), pVL1393 (BamH1, SmaI, XbaI, EcoR1, NotI, XmaIII, BglII, and PstI cloning site; Invitrogen), pVL1392 (BglII, PstI, NotI, XmaIII, EcoRI, XbaI, SmaI, and BamH1 cloning site; Summers and Invitrogen), and pBlueBacIII (BamH1, BglII, PstI, NcoI, and HindIII cloning site, with blue/white recombinant screening possible; Invitrogen), and fusion transfer vectors, such as but not limited to pAc700 (BamH1 and KpnI cloning site, in which the BamH1 recognition site begins with the initiation codon; Summers), pAc701 and pAc702 (same as pAc700, with different reading frames), pAc360 (BamH1 cloning site 36 base pairs downstream of a polyhedrin initiation codon; Invitrogen (195)), and pBlueBacHisA, B, C (three different reading frames, with BamH1, BglII, PstI, NcoI, and HindIII cloning site, an N-terminal peptide for ProBond purification, and blue/white recombinant screening of plaques; Invitrogen (220)) can be used.
Mammalian expression vectors contemplated for use in the invention include vectors with inducible promoters, such as the dihydrofolate reductase (DHFR) promoter, e.g., any expression vector with a DHFR expression vector, or a DHFR/methotrexate co-amplification vector, such as pED (PstI, SalI, SbaI, SmaI, and EcoRI cloning site, with the vector expressing both the cloned gene and DHFR; see Kaufman, Current Protocols in Molecular Biology, 16.12 (1991)). Alternatively, a glutamine synthetase/methionine sulfoximine co-amplification vector can be used, such as pEE14 (HindIII, XbaI, SmaI, SbaI, EcoRI, and BclI cloning site, in which the vector expresses glutamine synthase and the cloned gene; Celltech). In another embodiment, a vector that directs episomal expression under control of Epstein Barr Virus (EBV) can be used, such as pREP4 (BamH1, SfiI, XhoI, NotI, NheI, HindIII, NheI, PvuII, and KpnI cloning site, constitutive RSV-LTR promoter, hygromycin selectable marker; Invitrogen), pCEP4 (BamH1, SfiI, XhoI, NotI, NheI, HindIII, NheI, PvuII, and KpnI cloning site, constitutive hCMV immediate early gene, hygromycin selectable marker; Invitrogen), pMEP4 (KpnI, PvuI, NheI, HindIII, NotI, XhoI, SfiI, BamH1 cloning site, inducible metallothionein IIa gene promoter, hygromycin selectable marker; Invitrogen), pREP8 (BamH1, XhoI, NotI, HindIII, NheI, and KpnI cloning site, RSV-LTR promoter, histidinol selectable marker; Invitrogen), pREP9 (KpnI, NheI, HindIII, NotI, XhoI, SfiI, and BamHI cloning site, RSV-LTR promoter, G418 selectable marker; Invitrogen), and pEBVHis (RSV-LTR promoter, hygromycin selectable marker, N-terminal peptide purifiable via ProBond resin and cleaved by enterokinase; Invitrogen).
Selectable mammalian expression vectors for use in the invention include pRc/CMV (HindIII, BstXI, NotI, SbaI, and ApaI cloning site, G418 selection; Invitrogen), pRc/RSV (HindIII, SpeI, BstXI, NotI, XbaI cloning site, G418 selection; Invitrogen), and others. Vaccinia virus mammalian expression vectors (see Kaufman, 1991, supra) for use according to the invention include but are not limited to pSC11 (SmaI cloning site, TK- and β-gal selection), pMJ601 (SalI, SmaI, AflI, NarI, BspMII, BamHI, ApaI, NheI, SacII, KpnI, and HindIII cloning site; TK- and β-gal selection), and pTKgptF1S (EcoRI, PstI, SalI, AccI, HindII, SbaI, BamHI, and Hpa cloning site, TK or XPRT selection).
Yeast expression systems can also be used in expression vectors of the present invention as well as to express an oligonucleotide(s) encoding a multidimensional peptide(s) of a library of the present invention. For example, the non-fusion pYES2 vector (XbaI, SphI, ShoI, NotI, GstXI, EcoRI, BstXI, BamHI, SacI, KpnI, and HindIII cloning site; Invitrogen) or the fusion pYESHisA, B, C (XbaI, SphI, ShoI, NotI, BstXI, EcoRI, BamHI, SacI, KpnI, and HindIII cloning site, N-terminal peptide purified with ProBond resin and cleaved with enterokinase; Invitrogen), to mention just two, can be employed.
Particular examples of vectors having applications herein include bacteriophage vectors such as φX174, λ, M13 and its derivatives, f1, fd, Pf1, etc., phagemid vectors, plasmid vectors, insect viruses, such as baculovirus vectors, mammalian cell vectors, such as parvovirus vectors, adenovirus vectors, vaccinia virus vectors, retrovirus vectors, yeast vectors such as Ty1, killer particles, etc.
Once a suitable host system and growth conditions are established, recombinant expression vectors can be propagated and prepared in large quantity. As previously explained, the expression vectors which can be used include, but are not limited to, the following vectors or their derivatives: human or animal viruses such as vaccinia virus or adenovirus; insect viruses such as baculovirus; yeast vectors;
bacteriophage vectors (e.g., lambda), and plasmid and cosmid DNA vectors, to name but a few.
In addition, a unicellular host cell strain may be chosen which modulates the expression of the inserted sequences, or modifies and processes the gene product in the specific fashion desired. Different host cells have characteristic and specific mechanisms for the translational and post-translational processing and modification (e.g., glycosylation, cleavage [e.g., of signal sequence]) of proteins.
Appropriate cell lines or host systems can be chosen to ensure the desired modification and processing of the foreign protein expressed. Thus, for example, one can readily modify the oligonucleotide(s) that encode multidimensional peptides of a multidimensional library to have a signal sequence instructing the unicellular host to translocate the multidimensional peptide to the surface of the host. Moreover, other modifications can be made to multidimensional peptides, such as glycosylation.
As explained above, vectors are introduced into the desired host cells by methods known in the art, e.g., transfection, electroporation, microinjection, transduction, cell fusion, DEAE dextran, calcium phosphate precipitation, lipofection (lysosome fusion), use of a gene gun, or a DNA vector transporter (see, e.g., Wu et al., 1992, J. Biol. Chem. 267:963-967; Wu and Wu, 1988, J. Biol. Chem. 263:14621-14624; Hartmut et al., Canadian Patent Application No. 2,012,311, filed March 15, 1990).
Commercial Kits
In a further embodiment, the present invention extends to commercial test kits suitable for use by skilled artisans to produce a multidimensional library described herein, and to use such a multidimensional library to screen molecules of a multidimensional library to determine if any has affinity with a particular target molecule. A particular kit of the present invention for screening molecules that potentially interact with a target molecule comprises:
(a) a predetermined amount of a multidimensional library (MDL) comprising at least one molecule that potentially has affinity for the target molecule, wherein the at least one molecule has a general formula of (XYn)m, wherein:
X is a functional unit that interacts with the target molecule;
Y is a structural unit;
n is an integer, such that 0 ≤ n ≤ 10;
m is an integer, such that 2 ≤ m ≤ 20;
(b) other reagents; and
(c) directions for use of the kit.
Yet another kit for screening molecules that potentially interact with a target molecule comprises:
(a) a predetermined amount of a multidimensional library (MDL) comprising at least one multidimensional peptide that potentially has affinity for the target molecule, wherein the at least one multidimensional molecule has a general formula of (XYn)m, wherein:
X is a functional peptide unit that participates in an interaction between the at least one multidimensional peptide and the target;
Y is a structural peptide unit;
n is an integer, such that 0 ≤ n ≤ 10;
m is an integer, such that 2 ≤ m ≤ 20;
(b) other reagents; and
(c) directions for use of the kit.
Reagents having applications in such kits are generally those that maintain a peptide's native conformation. Examples of such reagents include, but certainly are not limited to, protease inhibitors, such as PMSF, phosphate buffered saline, TRIS glycine buffer, TRIS HCl buffer, etc., wherein the reagents are at physiological pH.
Another class of such a kit employs oligonucleotides. For example, in a particular embodiment, a kit of the present invention for screening molecules that potentially interact with a target molecule comprises:
(a) a predetermined amount of a multidimensional library (MDL) comprising at least one isolated oligonucleotide that potentially has affinity for the target molecule, wherein the at least one isolated oligonucleotide has a general formula of (XYn)m, wherein:
X is a functional unit comprising a nucleotide regulatory sequence that participates in an interaction between the at least one isolated oligonucleotide and the target;
Y is a structural unit comprising a nucleotide sequence comprising from 5 to at least 50 contiguous nucleotides;
n is an integer, such that 0 ≤ n ≤ 10; and
m is an integer, such that 2 ≤ m ≤ 20;
(b) other reagents; and
(c) directions for use of the kit.
Particular examples of nucleotide regulatory sequences having applications herein are discussed above.

Yet another embodiment of a kit for screening molecules that potentially interact with a target molecule comprises:
(a) a unicellular host transformed or transfected with an expression vector comprising at least one isolated oligonucleotide operatively associated with a promoter, wherein the at least one isolated oligonucleotide has the general formula of [(NNB)Fn]m, wherein:
N is A or C or G or T/U;
B is C or G or T/U, but not A;
F is a codon encoding a predetermined amino acid residue;
n is an integer, such that 0 ≤ n ≤ 10; and
m is an integer, such that 2 ≤ m ≤ 20;
(b) reagents for expressing the at least one isolated oligonucleotide;
(c) other reagents; and
(d) directions for use of the kit.
Examples of reagents having applications in such a kit include those that promote the maintenance of a protein's native conformation as discussed above, as well as reagents that are used to amplify and express oligonucleotides, e.g., PCR reagents such as oligonucleotides, oligonucleotide primers, enzymes, gel matrixes, buffers, etc.
With such kits, skilled artisans can produce a multidimensional library described herein, and use it to identify a member of the library that interacts with a particular target molecule.
Methods to Identify MDPs and Construction of MDL Libraries
In an embodiment, the process of the present method for rapidly and efficiently identifying novel compounds termed MDPs consists of two steps: (a) constructing a library of vectors expressing inserted synthetic oligonucleotide sequences encoding a plurality of proteins, polypeptides and/or peptides as fusion proteins, for example, attached to an accessible surface structural protein of a vector; and (b) screening the expressed library or plurality of recombinant vectors to isolate those members producing proteins, polypeptides and/or peptides that bind to a target of interest. The nucleic acid sequence of the inserted synthetic oligonucleotides of the isolated vector is determined and the amino acid sequence encoded is deduced to identify an MDP binding domain that binds the target of choice.

It is, of course, understood that once a library is constructed according to the present invention, the library can be screened any number of times with a number of different targets of choice to identify MDPs binding the given target. Such screening methods are also encompassed within the present invention.
A. Synthesis And Assembly Of Oligonucleotides
In order to prepare a library of vectors expressing a plurality of proteins, polypeptides and/or peptides (MDPs) according to the present invention, single stranded sets of oligonucleotides are synthesized and assembled in vitro according to the following scheme.
The synthetic oligonucleotide sequences are designed to encode functional and structural peptide units. The functional units are encoded by variant or unpredicted oligonucleotides. The structural units are encoded with invariant nucleotide sequences of unpredicted length, or they may be encoded with unpredicted nucleotide sequences of variant or invariant length comprising structural microlibraries composed of a limited number of pre-selected amino acids. The size of the structural peptide units can also be randomized, and thus affect the overall length of the multidimensional peptides of the library.
Pairs of variant nucleotides in which one individual member is represented by 5'(NNB)n3' and the other member is represented by 3'(NNV)m5', where N is A, C, G or T; B is G, T or C; V is G, A or C; n is an integer, such that 10 < n < 100, and m is an integer, such that 10 < m < 100, are synthesized for assembly into synthetic oligonucleotides. As assembled, according to the present invention, there are at least n+m variant codons in each inserted synthesized double stranded oligonucleotide sequence.
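The relationship between the two degenerate symbols can be checked with a short sketch. It assumes, as an interpretation rather than something the text states, that the 3'(NNV)5' member is written as the strand complementary to 5'(NNB)3'; the function names are illustrative only.

```python
# Base sets for the degenerate symbols defined in the text.
IUPAC = {"N": set("ACGT"), "B": set("CGT"), "V": set("ACG")}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complement_symbol(symbol):
    """Return the degenerate symbol whose base set pairs with `symbol`."""
    target = {COMPLEMENT[base] for base in IUPAC[symbol]}
    for name, bases in IUPAC.items():
        if bases == target:
            return name
    raise ValueError(f"no symbol defined for base set {sorted(target)}")


def complement_pattern(pattern):
    """Complement a degenerate pattern position by position."""
    return "".join(complement_symbol(symbol) for symbol in pattern)


print(complement_pattern("NNB"))
```

Under that assumption, each position of an NNB codon pairs with the corresponding position of an NNV triplet on the opposite strand.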
As it would be understood by those skilled in the art, the variant nucleotide positions have the potential to encode all 20 naturally occurring amino acids, whose naturally occurring codons (64 in total) are described above (non-natural amino acids may also be used if an appropriate expression system is used) and, when assembled as taught by the present method, encode only one stop codon, i.e., TAG. The sequence of amino acids encoded by the variant nucleotides of the present invention is unpredictable and substantially random in sequence.

Although the variant oligonucleotides according to the present scheme employ only 48 different codons to encode all 20 naturally occurring amino acids, the present scheme for designing the variant nucleotides advantageously provides greater variability than that available in conventional schemes, e.g., those which use nucleotides of the formula NNK, in which K is G or T (Cwirla et al., Proc. Natl. Acad. Sci. U.S.A. 87: 6378-6382, 1990) or of the formula NNS, in which S is G or C (Devlin et al., Science 249: 404-406, 1990; Scott and Smith, Science 249: 386-390, 1990), in which only 32 codons are employed.
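The codon counts quoted for the NNB, NNK and NNS schemes can be reproduced by enumeration. The following is a minimal sketch that assumes the standard genetic code; the table layout and the helper name `summarize` are illustrative and do not come from the specification.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Bases allowed at the third codon position in each scheme.
SCHEMES = {"NNB": "CGT", "NNK": "GT", "NNS": "CG"}


def summarize(third_bases):
    """Return (total codons, stop codons, number of amino acids encoded)."""
    codons = ["".join(p) for p in product("ACGT", "ACGT", third_bases)]
    stops = sorted(c for c in codons if CODON_TABLE[c] == "*")
    amino_acids = {CODON_TABLE[c] for c in codons} - {"*"}
    return len(codons), stops, len(amino_acids)


for name, third in SCHEMES.items():
    print(name, summarize(third))
```

Under the standard code, the total of 48 NNB codons includes the TAG stop codon, so 47 of them encode amino acids; the totals of 32 for NNK and NNS likewise include TAG.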
Moreover, when the synthesized oligonucleotides are inserted into an expression vector(s), the single stop codon TAG can be suppressed by expressing the library of vectors in a mutant host, such as E. coli supE. Other hosts having applications herein are described above, and in Sambrook, pp. 2.55, 2.57-2.59, 4.13-4.15.
Moreover, as would be understood by those skilled in the art, use of variant codons of the formula NNK or NNS would, like the presently employed NNB
formula, encode only one type of stop codon, i.e., TAG. If the use of suppressors, such as SupE, were 100% efficient to suppress the single stop codon, there would be no difference or advantage in using the present NNB scheme over those schemes used by conventional methods.
The NNB scheme set forth herein offers additional flexibility when the MDPs are expressed in hosts that lack suppressor tRNA genes. That is, the NNB scheme would not be restricted only to host organisms that have been subject to intense molecular genetic manipulation and thus offers greater flexibility in host selection.
One could avoid stop codons altogether by using codon triplets, but then one would need to know codon preference ideally for each host.
The invariant nucleotides are positioned at particular sites in the nucleotide sequences to locate variant nucleotides at particular distances from each other.
The 3' termini invariant nucleotide positions are complementary pairs of 6, 9 or 12 nucleotides to aid in annealing the two synthesized single stranded sets of nucleotides together and in the conversion to double-stranded DNA, designated herein as synthesized double stranded oligonucleotides.
FIG 1B schematically shows the general assembly process according to a method of the present invention. The oligonucleotides are assembled by a process comprising: synthesizing three single stranded nucleotides having a formula represented:
1) 5'- Complementary site 1 - [(NNB)F0-n]m - Complementary site 2 - 3' (ON-69)
2) 3'- Complementary site 1 - 5' (ON-11)
3) 3'- Complementary site 2 - 5' (ON-10)
wherein NNB represents a codon that results in any of the 20 natural amino acids; F represents a single pre-synthesized codon, or combination of several single codons, or their random pre-synthesized sequences that result in one or a combination of pre-selected amino acids; n is the number of codons resulting in structural blocks of amino acids, which is a random value and could be, for example, 0-10; m is the number of functional codons, which could be, for example, 2-20.
Such random oligonucleotides can be obtained by synthesis in which N represents an equimolar mixture of A, C, G, and T; B represents an equimolar mixture of G, C, and T.
Any method for synthesis of the single stranded sets of nucleotides is suitable, including the use of an automatic nucleotide synthesizer. The synthesizer can be programmed so that the nucleotides are incorporated in either equimolar or non-equimolar amounts at the variant positions, i.e., N or B.
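What such a synthesis yields can be illustrated by sampling one member of the [(NNB)Fn]m core in software. This is only a sketch under stated assumptions: bases are drawn equimolarly, F is a single fixed codon (GGT is a hypothetical choice), and the number of structural codons after each functional codon is drawn uniformly from 0 to `n_max`. None of these particulars is prescribed by the text.

```python
import random


def random_insert(m, n_max, structural_codon="GGT", rng=None):
    """Sample one [(NNB)Fn]m core as a list of (unit kind, codon) pairs.

    "X" marks a functional NNB codon and "Y" a structural codon F.
    """
    rng = rng or random.Random()
    units = []
    for _ in range(m):
        # N is drawn from A, C, G, T; B excludes A at the third position.
        codon = rng.choice("ACGT") + rng.choice("ACGT") + rng.choice("CGT")
        units.append(("X", codon))
        # Random number of structural codons following this functional codon.
        for _ in range(rng.randint(0, n_max)):
            units.append(("Y", structural_codon))
    return units


def to_sequence(units):
    """Concatenate the codons into a single nucleotide string."""
    return "".join(codon for _, codon in units)


example = random_insert(m=5, n_max=3, rng=random.Random(0))
print(to_sequence(example))
```

Non-equimolar incorporation could be modelled by replacing `rng.choice` with a weighted draw.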
In a particular example, a purified single stranded nucleotide sequence, designated ON-69, is ligated into the SfiI sites of fUSE5 after annealing to two "half site" oligonucleotides, ON-10 and ON-11, which are complementary to the 3' and 5' portions of ON-69, respectively. "Half site" oligonucleotides anneal to the 5' and 3' ends of oligonucleotide ON-69 to form appropriate SfiI cohesive ends. This will leave the appropriate SfiI site exposed without the need to digest with SfiI, thus avoiding the cutting of any SfiI site that might have appeared in the variable region.
The scheme for synthesis and assembly of the unpredictable oligonucleotides used to construct the libraries of the present invention incorporates m variant, unpredicted nucleotide sequences of the formula (NNB)m, where B is G, T or C and m is an integer such that 2 ≤ m ≤ 20, into the synthesized single stranded oligonucleotides.
Such a scheme provides a number of important advantages not available with conventional libraries. As assembled, the present synthesized oligonucleotides encode all 20 naturally occurring amino acids by the use of 48 different amino acid encoding codons. Thus, the present scheme advantageously provides greater variability than other conventional schemes. For example, conventional schemes in which the variant nucleotides have the formula NNK, where K is G or T, or NNS, where S is C or G, use only 32 different amino acid encoding codons. The use of a larger number of amino acid encoding codons may make the present libraries less susceptible to codon preferences of the host when the libraries are expressed. Although both the present scheme and conventional schemes retain only one stop codon, the use of NNB, as presently taught, advantageously provides synthesized oligonucleotides in which the probability of a stop codon is decreased compared to conventional NNS or NNK
schemes.
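The reduced stop codon probability can be put in numbers. The sketch below assumes equimolar incorporation at every variant position, independence between codons, and no suppression of TAG in the host; the function name is illustrative.

```python
def stop_free_probability(codons_in_scheme, stop_codons_in_scheme, m):
    """Probability that m independent variant codons contain no stop codon."""
    p_stop = stop_codons_in_scheme / codons_in_scheme
    return (1.0 - p_stop) ** m


# One stop codon (TAG) among 48 NNB codons versus among 32 NNK or NNS codons.
for m in (2, 10, 20):
    nnb = stop_free_probability(48, 1, m)
    nnk = stop_free_probability(32, 1, m)
    print(m, round(nnb, 3), round(nnk, 3))
```

Per codon, the chance of a stop is 1/48 for NNB against 1/32 for NNK or NNS. For m = 20 this corresponds to roughly 66% stop-free inserts for NNB against roughly 53% for NNK or NNS under the stated assumptions.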
Additionally, the present scheme avoids the use of synthesized oligonucleotides rich in GC nucleotides such as often found in libraries using an NNS
formula for variant codons. As is well known to those skilled in the art, nucleotide sequences rich in GC residues are difficult to assemble properly and to sequence.
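The GC argument can be made concrete by computing the expected GC fraction of a variant codon in each scheme. The sketch assumes equimolar mixtures at every position; the function name is illustrative.

```python
from fractions import Fraction


def expected_gc(third_bases):
    """Expected GC fraction of a codon N-N-X, with X drawn from third_bases."""
    gc_first_two = Fraction(1, 2)  # N is an equimolar mix of A, C, G, T
    gc_third = Fraction(sum(base in "GC" for base in third_bases), len(third_bases))
    return (2 * gc_first_two + gc_third) / 3


for name, third in {"NNB": "CGT", "NNK": "GT", "NNS": "CG"}.items():
    print(name, expected_gc(third))
```

Under these assumptions the expected GC fraction is 2/3 for NNS, 5/9 for NNB and 1/2 for NNK.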
Perhaps most significantly, the present scheme for synthesis and assembly of the oligonucleotides provides sequences of oligonucleotides encoding unpredicted amino acid sequences of random length, which are different from any prior conventional libraries. When constructed according to the present invention, the present synthesized single stranded oligonucleotides comprise at least about nucleotides in length encoding the complementary site and about 2-20 unpredicted amino acids (functional units) in the MDP binding domain separated by about 0-structural peptide units of about 0-10 amino acid residues in length.
According to a particular embodiment, n is 0 ≤ n ≤ 10 and m is 2 ≤ m ≤ 20. Thus, the synthesized single stranded oligonucleotides comprise at least 27-687 nucleotides and encode about 2-20 unpredicted amino acid residues in a functional unit of an MDP. In the specifically exemplified examples, the synthesized oligonucleotides encode respectively, about 4-16 amino acid residues in the MDP binding domain.
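The length of the variable core alone follows directly from the formula [(NNB)Fn]m. The sketch below assumes that F is a single codon and that every functional codon is followed by the same number n of structural codons; it deliberately excludes the flanking complementary sites, which the quoted overall range includes.

```python
def core_length_nt(m, n):
    """Nucleotides in [(NNB)Fn]m when F is one codon and n is the same in every block."""
    codons = m * (1 + n)  # each block holds one NNB codon plus n structural codons
    return 3 * codons


shortest = core_length_nt(m=2, n=0)
longest = core_length_nt(m=20, n=10)
print(shortest, longest)
```

With these assumptions the core spans 6 to 660 nucleotides, encoding 2 to 220 residues.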
The conventional teaching in the art is that the length of inserted oligonucleotides should be kept small, encoding preferably less than 15 and most preferably about 6-8 amino acids, and of fixed length. Contrarily, the present inventors have found, surprisingly and unexpectedly, and in stark contrast to conventional teaching, that not only can multidimensional libraries encoding products of variant length be constructed, but that such libraries can be advantageously screened to identify MDPs or proteins, polypeptides and/or peptides having binding specificity for a variety of targets.
Among those interested in using computer modeling to identify binding molecules for drug development, the conventional wisdom has been that the peptides used as leads for developing non-peptide mimetics should be kept to a maximum of about 6-8 amino acids. Computer modeling of larger peptides has been deemed impractical or non-informative. Hence, the conventional wisdom has been that screening libraries of short peptide sequences is more productive. In complete contrast, the present invention provides methods to efficiently generate and screen libraries of peptides that have variant lengths to identify MDPs comprising functional peptide units separated at optimal distances by structural peptide elements that also have the most appropriate flexibility. Such MDPs can be used later for drug development using such computer modeling techniques. Additionally, MDPs identified by the methods of the present invention afford a whole new vista of drug candidates.
As demonstrated in the Examples infra, the variable length and the presence of structural elements of variable length in oligonucleotides inserted into expression vectors affords the ability to identify MDPs wherein a short sequence of amino acids split with optimized structural linkers permits the MDPs to possess optimal specificity and selectivity for a target with either a simple or complex binding site.
In a particular application, i.e., identification of an MDP having binding specificity for a large target molecule, multidimensional libraries described herein provide the opportunity to identify or map MDPs that encompass not only a few contiguous amino acid residues, but also those that encompass discontinuous amino acids.
Additionally, the possibility to optimize positions of functional peptide units by separating them with structural peptide units in the inserted synthesized oligonucleotides of multidimensional libraries set forth herein provides an opportunity for the development of secondary and/or tertiary structure in the potential MDP, and in sequences flanking the actual functional portion(s) of the peptide.
Such complex structural developments are not feasible when only oligonucleotides of fixed length are used.
B. Insertion Of The Synthetic Oligonucleotides Into An Appropriate Expression Vector
At least one isolated oligonucleotide of appropriate size prepared as described above, and particularly, a plurality of such oligonucleotides, is inserted into an appropriate expression vector. When inserted into a suitable host, this vector expresses the plurality of proteins, polypeptides and/or peptides as heterofunctional fusion proteins with an expressed component of the vector. These proteins, polypeptides and/or peptides are screened to identify MDPs having affinity for a target of choice.
Any of a variety of vectors can be used according to the methods of the invention, examples of which are described above. Moreover, an appropriate vector comprises a gene encoding an effector domain of an MDP to aid expression and/or detection of the MDP. At least two different restriction enzyme sites within such gene, comprising a linker, are preferred. It is particularly useful to include a "stuffer fragment" within the linker region of the vector when the vector (e.g., phage or plasmid) is intended to express the MDP as a fusion protein that is expressed on the surface of the vector. As used in the present application, a "stuffer fragment" is intended to encompass a relatively short (i.e., about 14 nucleotides) known DNA sequence flanked by at least two restriction enzyme sites, useful for cloning the DNA sequences coding for a binding site recognized by a known target, such as an epitope of a known monoclonal antibody. The restriction enzyme sites at the termini of the stuffer fragment are useful for the insertion of the synthesized double stranded oligonucleotides, resulting in deletion of the stuffer fragment (Scott and Smith, Science 249: 386-390, 1990). Because of the physical linkage between the expressed heterologous fusion protein and the phage or plasmid vector containing the stuffer fragment, and because the stuffer fragment comprises a known DNA sequence encoding a protein that is easily detected and immunologically active (i.e., an immunological marker), the presence or absence of the stuffer fragment can be easily detected either at the nucleotide level by DNA sequencing, PCR or hybridization, or at the amino acid level, i.e., using an immunological assay. Such determination allows rapid discrimination between recombinant (MDP expressing) vectors generated by insertion of the synthesized double stranded oligonucleotides and non-recombinant vectors.
In one advantageous aspect, the use of a stuffer fragment avoids a problem often encountered with the use of a conventional polylinker in the vector, i.e., the restriction sites of the polylinker are too close so that adjacent sites cannot be cleaved independently and used at the same time.
In a particular embodiment, the vector is, or is derived from, a filamentous bacteriophage, including but not limited to M13, f1, fd, Pf1, etc., vector encoding a phage structural protein, preferably a phage coat protein, such as pIII, pVIII, etc.
Moreover, the filamentous phage is an fd derived phage vector such as fUSE5 described in Scott and Smith (Science 249: 386-390, 1990) which encodes the structural coat protein pIII. Other vectors having applications in the present invention are described above.
The phage vector is chosen to contain, or is constructed to contain, an origin of replication located in the 5' region of a gene encoding a bacteriophage structural protein so that the plurality of the synthesized oligonucleotides inserted are expressed as fusion proteins on the surface of the bacteriophage. This advantageously provides not only a plurality of accessible expressed proteins/peptides but also provides a physical link between the proteins/peptides and the inserted oligonucleotides to provide for easy screening and sequencing of the identified MDPs.
In addition, according to a particular embodiment, the structural bacteriophage protein is pIII. The fUSE5 vector described by Smith et al., and illustrated in FIG 1A, containing the pIII gene having a 14-bp "stuffer fragment" introduced at the N-terminal end, flanked by two SfiI restriction sites, was used in the examples set out in Section 6. The library is constructed by cloning the plurality of synthesized oligonucleotides into a cloning site near the N-terminus of the mature coat protein of the appropriate vector, preferably the pIII protein, so that the oligonucleotides are expressed as coat protein-fusion proteins.
C. Expression Of Vectors In Appropriate Hosts
As explained above, once the appropriate expression vectors are prepared, they are inserted into an appropriate host or used in a transcription and translation system in vitro. Methods of transfecting and transforming unicellular hosts are described above.
The oligonucleotides are expressed by culturing the transformed or transfected unicellular hosts under appropriate culture conditions for colony or phage production.
Preferably, the host cells are protease deficient and may or may not carry suppressor tRNA genes.
For example, a small aliquot of the electroporated cells is plated and the number of colonies or plaques is counted in order to determine the number of recombinants. The library of recombinant vectors in host cells is plated at high density for a single amplification of the recombinant vectors.
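The count-based estimate of the number of recombinants is a simple scaling calculation. The sketch below is illustrative only: the parameter names and the numerical values in the usage line are hypothetical and do not come from the specification.

```python
def estimate_recombinants(colonies_counted, plated_volume_ul,
                          total_volume_ul, dilution_factor=1.0):
    """Scale a colony or plaque count on one plate up to the whole transformation.

    dilution_factor is the fold dilution applied before plating
    (for example, 100 for a 1:100 dilution).
    """
    per_microliter = colonies_counted * dilution_factor / plated_volume_ul
    return per_microliter * total_volume_ul


# Hypothetical numbers: 150 colonies from 10 µl of a 1:100 dilution,
# taken from a 1000 µl transformation.
print(estimate_recombinants(150, 10, 1000, dilution_factor=100))
```

The result approximates the number of independent clones, which bounds the diversity the library can actually contain.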
Moreover, in a particular embodiment of the invention, recombinant fd vector fUSE5, engineered to contain the synthesized double stranded oligonucleotides according to the invention, is transfected into MC1061 E. coli cells by electroporation. MDPs expressed on the outer surface of the viral capsid extruded from the host E. coli cells are accessible for screening. The parent fUSE5 vector contains the 14-bp stuffer fragment. When the double stranded synthesized oligonucleotides are inserted between the two SfiI sites, the stuffer fragment is removed.
Optionally, several different strains of E. coli may be electroporated to establish different versions of the same library. Of course, the same E. coli strain would need to be used for the entire set of screening experiments. This strategy is based on the consideration that there is likely an in vivo biological selection, both positive and negative, on the viral assembly, secretion, and infectivity rate of individual fd recombinants due to the sequence nature of the peptide-pIII fusion proteins.
Therefore, E. coli with different genotypes (i.e., chaperone over-expressing, or secretion enhanced) will serve as bacterial hosts, because they will yield libraries that differ in subtle, unpredictable ways.
D. Methods To Identify MDPs: Screening Of MDL Libraries
Once a multidimensional library of the present invention is available, it can be screened to identify molecule(s) of the library that interact with a target of choice. As stated above, in the present invention, a target is intended to encompass a substance, including a molecular complex, a molecule or portion thereof, for which a protein receptor naturally exists or can be prepared according to the method of the invention.
Thus in the present invention, a target is a substance that specifically interacts with the functional elements of an MDP and includes, but is not limited to, a chemical group, an ion, a metal, a protein, a glycoprotein or any portion thereof, a peptide or any portion of a peptide, a nucleic acid or any portion of a nucleic acid, a sugar, a carbohydrate or a carbohydrate polymer, a lipid, a fatty acid, a viral particle or portion thereof, a membrane vesicle or portion thereof, a cell wall component, a synthetic organic compound, a bio-organic compound and an inorganic compound.
Screening the MDL libraries of the present invention can be accomplished by any of a variety of methods known to those of skill in the art.
If the MDPs are expressed as fusion proteins with a cell surface molecule, then screening is advantageously achieved by incubating the vectors with an immobilized target and harvesting those vectors that bind to the target. Such useful screening methods, designated "panning" techniques, are described in Parmley et al. (Gene 73: 305-318, 1988). In panning methods useful to screen the present libraries, the target can be immobilized on plates, beads, such as magnetic beads, sepharose beads used in columns, etc. In particular embodiments, the immobilized target can be "tagged", i.e., using a tag such as biotin, a fluorochrome, etc., for FACS sorting.
In a particular embodiment, screening a library of phage expressing MDPs, i.e., phage and phagemid vectors, was achieved as follows: using microtiter plates, the target was first diluted, i.e., in 100 mM NaHCO3, pH 8.5, and a small aliquot of target solution was adsorbed onto wells of microtiter plates (by incubation overnight at 4° C). An aliquot of BSA solution (1 mg/ml, in 100 mM NaHCO3, pH 8.5) was added and the plate incubated at room temperature for 1 hr. The contents of the microtiter plate were flicked out and the wells washed carefully with PBS-0.5% Tween 20. The plates were washed free of unbound targets repeatedly. A small aliquot of phage solution was introduced into each well and the wells were incubated at room temperature for 1-2 hrs.
The contents of the microtiter plates were flicked out and washed repeatedly.
The plates were incubated with wash solution in each well for 10 min at room temperature to allow bound phages with rapid dissociation constants to be released. The wells were then washed five more times to remove all unbound phages.
In order to recover the phage bound to the wells, a pH change was used. An aliquot of 50 mM glycine-HCl (pH 2.2), 100 mg/ml BSA solution was then added to the washed wells to denature the proteins and release the bound phages. After 10 min, the contents were transferred into clean tubes and a small aliquot of 1 M Tris-HCl (pH 7.5) or 1 M NaH2PO4 (pH 7.0) was added to neutralize the pH of the phage sample.
The phages were then diluted, e.g., 10-3-10-~ and aliquots plated with E. coli K91Kan cells to determine the number of the plaque forming units of the sample. The titer of the input samples was also determined for comparison (dilutions are generally 3)'
An important aspect of screening the libraries is the elution. For clarity of explanation, the following is discussed in terms of MDP expression by phages; however, it is readily understood that such discussion is applicable to any system where the MDP is expressed on a surface fusion molecule. It is conceivable that from a plurality of proteins expressed on phages, the conditions that disrupt the peptide-target interactions during recovery of the phages are specific for every given peptide sequence. For example, certain interactions may be disrupted by acid pHs but not by basic pHs, and vice versa. Thus, it is important to test a variety of elution conditions (including, but not limited to, pH 2-3, pH 12-13, excess target in competition, detergents, mild protein denaturants, urea, varying temperature, light, presence or absence of metal ions, chelators, etc.) and compare the primary structures of the MDP proteins expressed on the phages recovered for each set of conditions in order to determine the appropriate elution conditions for each target/MDP combination.
Some of these elution conditions may be incompatible with phage infection because they are bactericidal and will need to be removed by dialysis (i.e., dialysis bag, Centricon/Amicon microconcentrators).
The ability of the different multidimensional peptides to be eluted under different conditions may not only be due to the denaturation of the specific peptide region involved in binding to the target but may also be due to conformational changes in the flanking regions. These flanking sequences may also be denatured in combination with the actual binding sequence; these flanking regions may also change their secondary or tertiary structure in response to exposure to the elution conditions (i.e., pH 2-3, pH 12-13, excess target in competition, detergents, mild protein denaturants, urea, heat, cold, light, metal ions, chelators, etc.), which in turn leads to the conformational deformation of the peptide responsible for binding to the target.
E. Applications And Uses Of MDPs And MDP Compositions
The MDP products can be used in any industrial or pharmaceutical application that uses a peptide binding moiety specific for any given target. The MDPs can also be intermediates in the production of unifunctional binding peptides that are produced and selected by the method of the invention to have a binding affinity, specificity and avidity for a given target. Thus, according to the present invention, MDPs and MDP compositions are used in a wide variety of applications including, but not limited to, uses in the field of biomedicine; biologic control and pest regulation; agriculture; cosmetics; environmental control and waste management; chemistry; catalysis; nutrition and food industries; military uses; climate control; pharmaceuticals; etc.
The MDPs and MDP compositions are also useful in a wide variety of in vivo applications in the fields of biomedicine, bioregulation, and control. In certain applications, the MDPs are employed as mimetic replacements for compositions such as enzymes, hormone receptors, immunoglobulins, metal binding proteins, calcium binding proteins, nucleotide binding proteins, adhesive proteins such as integrins, adhesins, lectins, etc. In other applications, the MDPs are employed as mimetic replacements of proteins/peptides, sugars or other molecules that bind to receptor molecules, such as, for example, mimetics for molecules that bind to streptavidin, immunoglobulins, cellular receptors, etc.
Other in vivo uses include administration of MDPs and MDP compositions as immunogens for vaccines, useful for active immunization procedures. MDPs can also be used to develop immunogens for vaccines by generating a first series of MDPs specific for a given cellular or viral macromolecule target and then developing a second series of MDPs that bind to the first MDPs, i.e. the first MDP is used as a target to identify the second series of MDPs. The second series of MDPs will mimic the initial cellular or viral macromolecular target site but will contain only relevant peptide binding sequences, eliminating irrelevant peptide sequences. Either the entire MDP
developed in the second series, or the binding domain, or a portion thereof, can be used as an immunogen for an active vaccination program.
In in vivo applications, MDPs and MDP compositions can be administered to animals and/or humans by a number of routes including injection (i.e., intravenous, intraperitoneal, intramuscular, subcutaneous, intraarticular, intramammary, intraurethral, etc.), topical application, or by absorption through epithelial or mucocutaneous linings. Delivery to plants, insects and protists for bio-regulation and/or control can be achieved by direct application to the organism, dispersion in the habitat, addition to the surrounding environment or surrounding water, etc.
Moreover, in the chemical industry, MDPs can be employed for use in separations, purifications, preparative methods, catalysis, etc.
In addition, MDPs can also be used in the field of diagnostics to detect targets occurring in lymph, blood, feces, saliva, sweat, tears, mucus, or any other physiological liquid or solid. In the area of histology and pathology, MDPs can be used to detect targets in tissue sections, organ sections, smears, or in other specimens examined macroscopically or microscopically. MDPs can also be used in other diagnostics as replacements for antibodies, as for example in hormone detection kits, or in pathogen detection kits, etc., where a pathogen can be any pathogen including bacteria, viruses, mycoplasma, fungi, protozoa, etc. MDPs may also be used to define the epitopes that monoclonal antibodies bind to by using monoclonal antibodies as targets for MDP
bindings, thereby providing a method to define the epitope of the original immunogen used to develop the monoclonal antibody. MDPs or the binding domain or a portion thereof can thus serve as epitope mimetics and/or mimotopes.
Other applications will be readily apparent to those of skill in the art and are intended to be encompassed by the present invention.
The present invention may be better understood by reference to the following non-limiting Examples that are provided as exemplary of the invention. The following Examples are presented in order to more fully illustrate the preferred embodiments of the invention. They should in no way be construed, however, as limiting the broad scope of the invention.
EXAMPLES
The description of the methods for the construction of the MDL can be further subdivided into: (1) synthesis and assembly of synthetic oligonucleotides; (2) insertion of the synthetic oligonucleotides into an appropriate expression vector; and (3) expression of the MDL library in vectors.
Reagents And Strains Used In The Examples
SfiI and BglI restriction endonucleases, T4 DNA ligase, T4 kinase, and Klenow polymerase were obtained from Boehringer Mannheim. Sequenase T7 was obtained from Pharmacia. Oligonucleotides were synthesized with an Applied Biosystems PCR-Mate Synthesizer and purified on ODC columns (ABI). The fUSE 5 vector and E. coli MC1061, K802, and K91Kan were kindly provided by Professor George Smith, University of Missouri, Columbia, MO and are described in Smith et al. (Science 228: 1315-1317, 1985) and Parmley and Smith (Gene 73: 305-318, 1988).
Example 1:
Synthesis And Assembly Of Oligonucleotides
FIG. 1B shows the formula of the oligonucleotides and the assembly scheme used in the construction of the MDL. The oligonucleotides were synthesized with an Applied Biosystems PCR-Mate synthesizer. The 5'- and 3'- ends have a fixed sequence, chosen to reconstruct the amino acid sequence in the vicinity of the signal peptidase site. The central portion contained the variable regions that comprise the oligonucleotide library members, and may also code for spacer residues on either or both sides of the variable sequence.
This sequence, designated ON-69, was ligated into the SfiI sites of fUSE 5 after annealing to the two "half-site" oligonucleotides, ON-10 (5'-AAGCGCCACC-3') (SEQ. ID. NO.:1) and ON-11 (5'-ACCGGCCCCGT-3') (SEQ. ID. NO.:2), which are complementary to the 3'- and 5'- portions of ON-69, respectively. The "half-site" oligonucleotides anneal to the 5'- and 3'- ends of oligonucleotide ON-69 to form appropriate SfiI cohesive ends. This left the appropriate SfiI site exposed without the need for digestion with SfiI, thus avoiding the cutting of any SfiI sites that might have appeared in the variable region. Oligonucleotides were phosphorylated with T4 kinase and annealed in 20 mM Tris-HCl, pH 7.5, 2 mM MgCl2, 50 mM NaCl by mixing 4 µg of ON-10 and 4 µg of ON-11 with 2.75 µg of ON-69, heating to 65°C for 5 min and allowing to cool slowly to room temperature. This represented an approximate molar ratio of 1:10:10 (ON-69: ON-10: ON-11).
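The stated molar ratio can be checked from the masses given above. The following is only a rough sketch: it assumes that ON-69 is 69 nucleotides long (inferred from its name, not stated in the text) and uses a rule-of-thumb average mass of 330 g/mol per nucleotide; the lengths of ON-10 and ON-11 are counted from the sequences given.

```python
# Rough check of the annealing stoichiometry from the masses given in the text.
AVG_NT_MASS = 330.0  # g/mol per nucleotide; rule-of-thumb value (assumption)

# name -> (mass in micrograms, length in nucleotides)
# ON-10 and ON-11 lengths are counted from the sequences in the text;
# the 69-nt length of ON-69 is inferred from its name (assumption).
oligos = {
    "ON-69": (2.75, 69),
    "ON-10": (4.0, 10),
    "ON-11": (4.0, 11),
}

def picomoles(mass_ug, length_nt):
    """Convert a mass of single-stranded oligonucleotide to picomoles."""
    grams = mass_ug * 1e-6
    moles = grams / (length_nt * AVG_NT_MASS)
    return moles * 1e12

pmol = {name: picomoles(m, n) for name, (m, n) in oligos.items()}
ratio = {name: pmol[name] / pmol["ON-69"] for name in oligos}

for name in oligos:
    print(f"{name}: {pmol[name]:.0f} pmol, molar ratio to ON-69 = {ratio[name]:.1f}")
```

Under these assumptions the ratio comes out close to 1:10:9, which is consistent with the approximate 1:10:10 figure quoted in the text.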
Example 2.
Strategy Of The "Split-Pull" Synthesis
Another way to produce the random polynucleotide is to use successive splitting and uniting steps ("split-pull" synthesis) during the synthesis, as schematically shown in FIG. 2. The oligonucleotide was synthesized with the starting linker sequence 5'-GGGCCGGT-N1N2N3- (SEQ. ID. NO.:3) on a resin support, where N1 is A, C, G and T (nominally equimolar); N2 is A (31%), C (19%), G (19%), and T (31%); N3 is C (39%), G (39%), and T (22%). The resin support was then divided into four fractions and synthesis continued in each fraction separately according to the following scheme:
Part 1 (30%): Resin-GGGCCGGT-N1N2N3- (SEQ. ID. NO.:3)
Part 2 (17%): Resin-GGGCCGGT-N1N2N3-GGT- (SEQ. ID. NO.:4)
Part 3 (23%): Resin-GGGCCGGT-N1N2N3-(GGT)2- (SEQ. ID. NO.:5)
Part 4 (30%): Resin-GGGCCGGT-N1N2N3-(GGT)3- (SEQ. ID. NO.:6)
All resin particles were thoroughly mixed together and synthesis continued by adding the random -N1N2N3- sequence, resulting in Resin-GGGCCGGT-N1N2N3-(GGT)3-N1N2N3 (SEQ. ID. NO.:7). This protocol was then repeated four times, after which the closing linker sequence -GGTGGCGCTTCTG-3' (SEQ. ID. NO.:8) was added. The final mixture was detached from the resin. The stochastic collection of polynucleotides of the general formula 5'-GGGCCGGT{N1N2N3(GGT)0-3}N1N2N3GGTGGCGCTTCTG-3' was thus obtained. This sequence can be ligated into the SfiI sites of fUSE 5 after annealing to two "half-site" oligonucleotides, ON-10 (5'-AAGCGCCACC-3') (SEQ. ID. NO.:1) and ON-11 (5'-ACCGGCCCCGT-3') (SEQ. ID. NO.:2), which are complementary to the 3'- and 5'- portions of the sequence, respectively.
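The consequences of the nucleotide mixture used for the random codon can be worked out numerically. The sketch below is an illustration only, not part of the described method: it assumes the standard genetic code and independent incorporation of each base at the stated percentages, and it computes the expected amino acid and stop codon frequencies of the N1N2N3 codon together with the mean number of GGT spacers per split step. All variable and function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for (i, a), (j, b), (k, c) in product(enumerate(BASES), repeat=3)
}

# Base frequencies at each codon position, as given in the text.
N1 = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
N2 = {"A": 0.31, "C": 0.19, "G": 0.19, "T": 0.31}
N3 = {"C": 0.39, "G": 0.39, "T": 0.22}  # no A in the third position

def amino_acid_frequencies(p1, p2, p3):
    """Expected residue frequencies for a codon with independent base mixes."""
    freqs = {}
    for (b1, f1), (b2, f2), (b3, f3) in product(p1.items(), p2.items(), p3.items()):
        residue = CODON_TABLE[b1 + b2 + b3]
        freqs[residue] = freqs.get(residue, 0.0) + f1 * f2 * f3
    return freqs

freqs = amino_acid_frequencies(N1, N2, N3)
stop_fraction = freqs.get("*", 0.0)  # only TAG is reachable without A at position 3

# Resin fractions receiving 0, 1, 2 or 3 GGT codons in each split step.
SPLIT = {0: 0.30, 1: 0.17, 2: 0.23, 3: 0.30}
mean_spacers = sum(n * f for n, f in SPLIT.items())

for residue, f in sorted(freqs.items(), key=lambda kv: -kv[1]):
    print(f"{residue}: {f:.3f}")
print(f"stop codons per random codon: {stop_fraction:.4f} (3/64 = {3/64:.4f} for NNN)")
print(f"mean GGT spacers per split step: {mean_spacers:.2f}")
```

Excluding A from the third position leaves TAG as the only reachable stop codon, so the stop frequency per random codon falls to roughly 3%, compared with about 4.7% for a fully random NNN codon.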
Example 3.
Construction Of The MDL Library
The vector fUSE 5 (100 µg) was digested to completion with SfiI and ethanol precipitated twice in the presence of 2 M ammonium acetate. This DNA could not be self-ligated, indicating complete removal of the 14-bp "stuffer" that lies between the SfiI sites (Figure 1A). Twenty µg of SfiI digest of fUSE 5 vector was then ligated with 200 ng of annealed oligonucleotide insert (molar ratio 1:5) by an overnight incubation at 15° C in 1 ml of T4 ligase buffer (20 mM Tris-HCl, pH 7.5, 5 mM MgCl2, 2 mM DTT, 1 mM ATP) and 4000 units of T4 DNA ligase. The ligated DNA was ethanol precipitated in the presence of 0.3 M sodium acetate, resuspended in 40 µl of water, and transformed by electroporation into E. coli MC1061. Ten electro-transformations, each containing 80 µl of cell suspension (final concentration 5 x 10^10 cells/ml) and 2 µg of DNA (500 µg/ml), were performed by pulsing at 12.5 kV/cm for 5 msec as described in Dower et al. (Nucleic Acids Res. 16:6127-6145, 1988). After electroporation, E. coli cells were allowed to undergo non-selective outgrowth at 37° C for 1 hr in 2 ml of SOC medium (consisting of 2% Bacto tryptone, 0.5% Bacto yeast extract, 10 mM NaCl, 2.5 mM KCl, 10 mM MgCl2, 10 mM MgSO4, 20 mM glucose, as described by Hanahan et al. (J. Mol. Biol. 166:557-580, 1983)) containing 0.2 mg/ml tetracycline. Aliquots (20 µl) of cells from each of the transformants were then removed and various dilutions plated on LB plates (Luria-Bertani medium) containing mg/ml tetracycline to assess the transformation efficiency. The remainder of the cell suspension was used to inoculate 1 L of L-broth containing tetracycline (20 mg/ml) and was grown through approximately 10 doublings at 37° C to amplify the library.
Phages from liquid cultures were obtained by clearing the supernatant twice by centrifugation (8000 RPM for 10 min at 4° C), precipitation of phage particles with polyethylene glycol (final concentration 3.3% polyethylene glycol-8000, 0.4 M NaCl), and centrifugation as described above. Phage pellets were re-suspended in TBS (50 mM Tris-HCl, pH 7.5, 150 mM NaCl) and stored at 4° C. A portion of the library was used to infect K91Kan cells that were plated at low density on LB tetracycline plates (40 mg/ml).
Example 4.
Characterization Of The MDL
Constructing a library of peptides displayed on the N-terminus of processed pIII necessarily alters the amino acids in the vicinity of the signal peptidase cleavage site. Certain changes in the corresponding region of the major coat protein, pVIII, have been shown to reduce processing efficiency, slowing or preventing the incorporation of pVIII into virions (Felici et al., J. Mol. Biol. 222: 301-310, 1991). If all the pIII were similarly affected, the diversity of peptides contained in the library would be reduced (Parmley and Smith, Gene 73: 305-318, 1988). The finding that most amino acids appear at each position of the variable peptides of randomly chosen phage indicates that processing defects do not impose important constraints on the diversity of the library. Furthermore, it indicates that the inserted sequence in the fusion protein does not deleteriously alter the biological properties of the bacteriophage protein.
In order to determine whether any coding bias existed in the variant non-predicted peptides expressed by these libraries, perhaps due to biases imposed during in vitro synthesis of the oligonucleotides, or in vivo during the expression by the reproducing phages, inserted synthetic oligonucleotide fragments of 20 randomly chosen isolates were examined from the MDP library. Individual clones producing infectious phages were picked, and the DNA of their variable region was sequenced using the Sequenase T7 kit and an oligonucleotide sequencing primer fUSE32P (5'-TGAATTTTCTGTATGAGG-3') (SEQ. ID. NO.:9), which is complementary to the sequence located 32 nucleotides to the 3' side of the second SfiI site in the fUSE 5 vector.

In FIG. 3, the amino acid frequencies are deduced for the peptides encoded by the oligonucleotide inserts of a sample of randomly chosen infectious phage.
It is observed that very few (<10%) of the inserted oligonucleotide sequences characterized so far in the library have exhibited complete deletions. The percentage of the various deletions is equal to 65%. This is likely a reflection of the heavy G content in the structural peptide unit(s) coding part in assembling the oligonucleotides. FIG. 4 shows the distribution of amino acids in the library. The Microsoft EXCEL program was used to evaluate amino acid frequencies. Such analyses showed that the codons for most amino acids, and hence the amino acids themselves, occurred at the expected frequency in the MDP library of expressed proteins. The notable exceptions were leucine and serine, which were over-represented (FIG. 4A). Thus, except for the structural block composition limited to 5 amino acids, any position in the variable domain could have any amino acid. Therefore, the sequences are unpredicted or random. In the structural peptide units, all five amino acids are distributed within a two-fold margin of the theoretical distribution, which is equal to 20% (FIG. 4B). The structural peptide units have a length between 1 and 3 amino acids, and are distributed between 20% and 50% (FIG. 4C).
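The frequency tabulation described above was done in a spreadsheet; the same kind of count can be sketched in a few lines of code. The peptide strings below are made-up placeholders chosen only to make the example runnable; they are not sequences from the library or from FIG. 3, FIG. 4 or FIG. 6, and the function name is hypothetical.

```python
from collections import Counter

def residue_frequencies(peptides):
    """Return the fraction of each one-letter residue across all peptides."""
    counts = Counter("".join(peptides))
    total = sum(counts.values())
    return {residue: n / total for residue, n in counts.items()}

# Placeholder sequences for illustration only (not data from the library).
toy_peptides = ["ALGSG", "GGSLA"]

observed = residue_frequencies(toy_peptides)
for residue, fraction in sorted(observed.items()):
    print(f"{residue}: {fraction:.2f}")
```

Comparing such observed fractions with the frequencies expected from the codon design is one simple way to spot over-represented residues of the kind reported for leucine and serine.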
Example 5.
Identification Of Target Binding MDPs
Streptavidin was used as the target to select clones from the MDL by successive rounds of biopanning on 96-well plates (Nunc MaxiSorp microtiter plate). Streptavidin was diluted to 200 µg/ml in 0.1 M NaHCO3, and 50 µl of the solution was added to each well.
Streptavidin was then bound to the plate overnight at 4° C. The wells were then washed with PBS and blocked with 1% BSA in 0.1 M NaHCO3 for 1 hr at room temperature. After blocking, the wells were washed six times with 0.1% Tween20/TBS (T-TBS). Then 2 x 10^11 phage particles/well of the primary MDL were added in 100 µl of 0.1% BSA/T-TBS and the plates were incubated for 2 hrs at room temperature. The plates were then washed 12 times with T-TBS to remove non-specific phages (phages which express peptides without the desired specificity) and the remaining bound phages were eluted by a 10-min treatment with 100 µl of 0.1 M HCl (pH 2.2 adjusted with glycine).
Neutralization of the eluate, titration, and amplifications on agar medium were carried out essentially as described in Parmley and Smith (Gene 73:305-318, 1988).
The binding and elution reactions were repeated five times. Recoveries of phages from this process are shown in FIG. 5, where the repeated selection of phages resulted in an enrichment of phages capable of binding to streptavidin. These results indicate that phages of higher affinity were preferentially enriched in each panning step.
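Enrichment across panning rounds is usually judged from the fraction of input phage recovered in each eluate. The sketch below shows that arithmetic under stated assumptions: the input of 2 x 10^11 particles per well is taken from the text, whereas the output titers are invented placeholders and do not reproduce the values of FIG. 5. The function and variable names are hypothetical.

```python
INPUT_PHAGE = 2e11  # phage particles added per well (from the text)

# Hypothetical eluate titers (plaque forming units) for successive rounds;
# placeholders for illustration, not the data plotted in FIG. 5.
output_titers = [1e5, 1e6, 1e8]

def recovery(output_pfu, input_pfu=INPUT_PHAGE):
    """Fraction of input phage recovered after washing and elution."""
    return output_pfu / input_pfu

yields = [recovery(t) for t in output_titers]
fold_enrichment = [y / yields[0] for y in yields]

for round_no, (y, fold) in enumerate(zip(yields, fold_enrichment), start=1):
    print(f"round {round_no}: recovery {y:.1e}, {fold:.0f}-fold over round 1")
```

A recovery that rises from round to round, as in this toy series, is the pattern one would read as preferential enrichment of binders.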
After five rounds of biopanning and phage amplification, the individual phages derived from second, third, and fourth rounds of panning were grown and their peptide encoding regions sequenced. The amino acid sequences of these 29 phages that bound to streptavidin are summarized in FIG. 6.
CONCLUSION
The results of these Examples readily demonstrate that methods of the present invention set forth herein provide a novel and useful multidimensional library comprising multidimensional peptides that vary in size, and are not limited to a particular size. Thus, libraries described herein permit exploration of the effect of secondary and tertiary structure of polypeptides on the ability of proteins and polypeptides to interact with, and particularly bind with a target molecule.
In addition, since multidimensional peptides of libraries described herein comprise both functional and structural peptide units, the potential affinity of a multidimensional peptide can be maximized, thus ensuring an accurate model of a protein that interacts with the target.
Furthermore, a novel and useful method of producing oligonucleotides that encode the multidimensional peptides of a multidimensional library, as set forth herein results in multidimensional peptides that have a limited amount of stop codons, have random amino acid sequences, and do not have a maximum length. As a result, the number of multidimensional peptides, and thus the number of members of the library available to interact with the target is maximized.
The present invention is not to be limited in scope by the specific embodiments described herein. Indeed, various modifications of the invention in addition to those described herein will become apparent to those skilled in the art from the foregoing description and the accompanying figures. Such modifications are intended to fall within the scope of the appended claims.
It is further to be understood that all base sizes or amino acid sizes, and all molecular weight or molecular mass values, given for nucleic acids or polypeptides are approximate, and are provided for description.
Various publications are cited herein, the disclosures of which are incorporated by reference in their entireties.

Claims (38)

WHAT IS CLAIMED IS:
1. A multidimensional library for screening molecules that potentially interact with a target molecule, wherein said library comprises at least one molecule comprising a general formula of (XY n)m, wherein:
(XY n) is a repeating unit of said at least one molecule in which:
X is a functional unit that interacts with said target molecule, Y is a structural unit, n is the number of said structural units in said repeating unit, and m is a number of repeating units in said at least one molecule.
2. The multidimensional library of Claim 1, wherein said at least one molecule is detectably labeled.
3. The multidimensional library of Claim 2, wherein said detectable label comprises a radioactive element, a chemical which fluoresces, or an enzyme.
4. The multidimensional library of Claim 1, wherein said at least one molecule comprises an isolated oligonucleotide, a protein, a polypeptide, a peptide, a carbohydrate, a polyamine, a heterocyclic molecule, or a combination thereof.
5. The multidimensional library of Claim 1 wherein said at least one molecule comprises a protein, a polypeptide or a peptide.
6. The multidimensional library of Claim 5, wherein:
X is a functional peptide unit that participates in an interaction between the at least one molecule and the target;
Y is a structural peptide unit;
n is an integer, such that 0<=n<=10; and m is an integer, such that 2<=m<=20.
7. The multidimensional library of Claim 1, wherein said at least one molecule comprises an isolated oligonucleotide.
8. The multidimensional library of Claim 7, wherein said functional unit comprises a nucleotide regulatory sequence and said structural unit comprises a nucleotide sequence comprising from 6 to at least 60 contiguous nucleotides.
9. The multidimensional library of Claim 8, wherein said nucleotide regulatory sequence comprises a promoter, an enhancer, a cis-acting locus, a trans-acting locus, an attenuator, an upstream activator, or a regulatory non-translatable region sequence.
10. The multidimensional library of Claim 9, wherein said promoter comprises: an SV40 early promoter, a promoter contained in the 3' long terminal repeat of Rous sarcoma virus, a herpes thymidine kinase promoter, the regulatory sequences of the metallothionein gene, a β-lactamase promoter, a tac promoter, an alcohol dehydrogenase promoter, a phosphoglycerol kinase promoter, an alkaline phosphatase promoter, an elastase I gene control region active in pancreatic acinar cells, an insulin gene control region active in pancreatic beta cells, an immunoglobulin gene control region active in lymphoid cells, a mouse mammary tumor virus control region active in testicular, breast, lymphoid and mast cells, an albumin gene control region active in liver, an alpha-fetoprotein gene control region active in liver, an alpha 1-antitrypsin gene control region active in the liver, a beta-globin gene control region active in myeloid cells, a myelin basic protein gene control region active in oligodendrocyte cells in the brain, a myosin light chain-2 gene control region active in skeletal muscle, a gonadotropic releasing hormone gene control region active in the hypothalamus, a dihydrofolate reductase (DHFR) promoter, a constitutive RSV-LTR promoter, a metallothionein IIa gene promoter, a RSV-LTR promoter, an immediate early promoter of hCMV, an early promoter of SV40, an early promoter of adenovirus, an early promoter of vaccinia, an early promoter of polyoma, a late promoter of SV40, a late promoter of adenovirus, a late promoter of vaccinia, a late promoter of polyoma, the lac system, the trp system, the TAC system, the TRC system, the major operator and promoter regions of phage lambda, a control region of fd coat protein, 3-phosphoglycerate kinase promoter, acid phosphatase promoter, or a promoter of yeast α mating factor.
11. A multidimensional library (MDL) comprising at least one multidimensional peptide having affinity for a target molecule, wherein said at least one multidimensional peptide has a general formula of (XY n)m, wherein:
X is a functional peptide unit that participates in an interaction between the at least one multidimensional molecule and the target;
Y is a structural peptide unit;
n is an integer, such that 0<=n<=10; and m is an integer, such that 2<=m<=20.
12. An isolated oligonucleotide encoding at least one multidimensional peptide comprising a general formula of (XY n)m, wherein:
X is a functional peptide unit that participates in an interaction between the at least one multidimensional molecule and the target;
Y is a structural peptide unit;
n is an integer, such that 0<=n<=10; and m is an integer, such that 2<=m<=20, said isolated oligonucleotide having a general formula of [(NNB)F n]m, wherein:
N is A or C or G or T/U;
B is C or G or T/U, but not A;
F is a codon encoding a predetermined amino acid residue;
n is an integer, such that 0<=n<=10; and m is an integer, such that 2<=m<=20.
13. A cloning vector comprising an origin of replication and an isolated oligonucleotide of Claim 12.
14. The cloning vector of Claim 13, wherein said cloning vector is selected from the group consisting of E. coli, a bacteriophage, a plasmid, and a pUC
plasmid derivative.
15. The cloning vector of Claim 14, wherein said bacteriophage further comprises a lambda derivative, said plasmid further comprises a pBR322 derivative, and said pUC plasmid derivative further comprises a pGEX vector, a pmal-c vector, or a pFLAG vector.
16. An expression vector comprising an isolated oligonucleotide of Claim 12 operatively associated with a promoter.
17. The expression vector of Claim 16, wherein said promoter is selected from the group consisting of an immediate early promoter of hCMV, an early promoter of SV40, an early promoter of adenovirus, an early promoter of vaccinia, an early promoter of polyoma, a late promoter of SV40, a late promoter of adenovirus, a late promoter of vaccinia, a late promoter of polyoma, the lac system, the trp system, the TAC system, the TRC system, the major operator and promoter regions of phage lambda, a control region of fd coat protein, 3-phosphoglycerate kinase promoter, acid phosphatase promoter, and a promoter of yeast mating factor.
18. A unicellular host transformed or transfected with an expression vector of Claim 16.
19. The unicellular host of Claim 18, wherein said unicellular host is selected from the group consisting of E. coli, Pseudomonas, Bacillus, Streptomyces, yeast, CHO, R1.1, B-W, L-M, COS1, COS7, BSC1, BSC40, BMT10 and Sf9 cells.
20. A method for generating a multidimensional library (MDL) comprising at least one multidimensional peptide having affinity for a target molecule, wherein the at least one multidimensional peptide has a general formula of (XY n)m, wherein:
X is a functional peptide unit that participates in an interaction between the at least one multidimensional peptide and the target;
Y is a structural peptide unit;
n is an integer, such that 0<=n<=10;
m is an integer, such that 2<=m<=20, the method comprising the steps of:
(a) providing at least one oligonucleotide having the general formula of [(NNB)F n]m, wherein:
N is A or C or G or T/U;

B is C or G or T/U, but not A;
F is a codon encoding a predetermined amino acid residue;
n is an integer, such that 0<=n<=10; and m is an integer, such that 2<=m<=20;
(b) inserting said at least one oligonucleotide into an expression vector, such that said at least one oligonucleotide is operatively associated with a promoter;
(c) transforming a unicellular host with the expression vector; and (d) culturing said unicellular host under conditions that provide for expression of said at least one oligonucleotide to produce said at least one multidimensional peptide having affinity for said target molecule.
21. The method of Claim 20, wherein the at least one multidimensional peptide is produced on the surface of the unicellular host.
22. A method for identifying a molecule that interacts with a target molecule, comprising the steps of:
(a) generating a multidimensional library (MDL) comprising at least one molecule comprising a general formula of:
(XY n)m wherein (XY n) is a repeating unit of said at least one molecule in which:
X is a functional unit of said at least one molecule, Y is a structural unit of said at least one molecule, n is the number of said structural units in said repeating unit, such that 0<=n<=10, and m is the number of repeating units in said at least one molecule, such that 2<=m<=20;
(b) contacting the multidimensional library with said target molecule; and (c) detecting binding of said target molecule with said at least one molecule.
23. The method of Claim 22, wherein said at least one molecule is detectably labeled.
24. The method of Claim 23, wherein said detectable label comprises a radioactive element, a chemical which fluoresces, or an enzyme.
25. The method of claim 22, wherein said at least one molecule comprises a protein, a polypeptide or a peptide.
26. The method of Claim 25, wherein the step of generating the multidimensional library comprises the steps of:
(a) providing at least one isolated oligonucleotide having the general formula of:
[(NNB)F n]m, wherein:
N is A or C or G or T/U;
B is C or G or T/U, but not A;
F is a codon encoding a predetermined amino acid residue;
n is an integer, such that 0<=n<=10; and m is an integer, such that 2<=m<=20;
(b) inserting said at least one isolated oligonucleotide into an expression vector, such that said at least one isolated oligonucleotide is operatively associated with a promoter;
(c) transforming a unicellular host with the expression vector; and (d) culturing said unicellular host under conditions that provide for expression of said at least one oligonucleotide to produce at least one multidimensional peptide having affinity for said target molecule.
27. The method of Claim 26, wherein said promoter of step (b) comprises an immediate early promoter of hCMV, an early promoter of SV40, an early promoter of adenovirus, an early promoter of vaccinia, an early promoter of polyoma, a late promoter of SV40, a late promoter of adenovirus, a late promoter of vaccinia, a late promoter of polyoma, the lac system, the trp system, the TAC system, the TRC
system, the major operator and promoter regions of phage lambda, a control region of fd coat protein, 3-phosphoglycerate kinase promoter, acid phosphatase promoter, or a promoter of yeast α mating factor.
28. The method of Claim 26, wherein said unicellular host is selected from the group consisting of E. coli, Pseudomonas, Bacillus, Streptomyces, yeast, CHO, R1.1, B-W, L-M, COS1, COS7, BSC1, BSC40, BMT10 and Sf9 cells.
29. The method of Claim 26, wherein said at least one multidimensional peptide is produced on the surface of said unicellular host.
30. A kit for screening molecules that potentially interact with a target molecule, comprising:
(a) a predetermined amount of a multidimensional library (MDL) comprising at least one molecule that potentially has affinity for the target molecule, wherein the at least one molecule has a general formula of (XY n)m, wherein:
X is a functional unit that interacts with the target molecule;
Y is a structural unit;
n is an integer, such that 0<=n<=10;
m is an integer, such that 2<=m<=20, (b) other reagents; and (c) directions for use of the kit.
31. The kit of Claim 30, wherein said at least one molecule is detectably labeled.
32. The kit of Claim 30, wherein said at least one molecule comprises an isolated oligonucleotide, a protein, a polypeptide, a peptide, a carbohydrate, a polyamine, a heterocyclic molecule, or a combination thereof.
33. The kit of Claim 30, wherein said at least one molecule comprises a protein, a polypeptide, or a peptide.
34. The kit of Claim 33, wherein:

X is a functional peptide unit that participates in an interaction between the at least one multidimensional molecule and the target;
Y is a structural peptide unit;
n is an integer, such that 0<=n<=10;
m is an integer, such that 2<=m<=20.
35. The kit of Claim 30, wherein said at least one molecule comprises an isolated oligonucleotide.
36. The kit of Claim 35, wherein said functional unit comprises a nucleotide regulatory sequence and said structural unit comprises a nucleotide sequence comprising from 6 to at least 60 contiguous nucleotides.
37. The kit of Claim 36, wherein said nucleotide regulatory sequence comprises a promoter, an enhancer, a cis-acting locus, a trans-acting locus, an attenuator, an upstream activator, or a regulatory non-translatable region sequence.
38. A kit for screening molecules of a multidimensional library that potentially interact with a target molecule, the kit comprising:
(a) a unicellular host transformed or transfected with an expression vector comprising at least one oligonucleotide operatively associated with a promoter, wherein the at least one oligonucleotide has the general formula of [(NNB)F_n]_m, wherein:
N is A or C or G or T/U;
B is C or G or T/U, but not A;
F is a codon encoding a predetermined amino acid residue;
n is an integer, such that 0<=n<=10; and
m is an integer, such that 2<=m<=20;
(b) reagents for expressing the at least one oligonucleotide;
(c) other reagents; and
(d) directions for use of the kit.
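The oligonucleotide formula [(NNB)F_n]_m recited in Claim 38 can be illustrated with a minimal sketch. It is not part of the claims and is only one possible reading of the formula: the function names (`oligo_template`, `nnb_codons`) and the choice of GGC (a glycine codon) for F are hypothetical, chosen purely for illustration. The sketch expands the formula into a degenerate template string, enumerates the codons that NNB can produce, and checks which stop codons survive the "B is not A" restriction.

```python
from itertools import product

BASES = "ACGT"
# Degenerate positions as defined in the claim:
# N = A, C, G or T; B = C, G or T (not A).
DEGENERATE = {"N": "ACGT", "B": "CGT"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def oligo_template(fixed_codon, n, m):
    """Return [(NNB)F_n]_m as a degenerate DNA template string.

    fixed_codon is the codon F; which codon to use is an assumption
    made by the caller, not something fixed by the claim.
    """
    if len(fixed_codon) != 3 or set(fixed_codon) - set(BASES):
        raise ValueError("F must be a single DNA codon, e.g. 'GGC'")
    if not 0 <= n <= 10:
        raise ValueError("n must satisfy 0 <= n <= 10")
    if not 2 <= m <= 20:
        raise ValueError("m must satisfy 2 <= m <= 20")
    return ("NNB" + fixed_codon * n) * m


def nnb_codons():
    """Enumerate every concrete codon matched by the pattern NNB."""
    positions = (DEGENERATE["N"], DEGENERATE["N"], DEGENERATE["B"])
    return ["".join(c) for c in product(*positions)]


template = oligo_template("GGC", n=1, m=3)  # GGC is an illustrative choice
print(template)                                   # NNBGGCNNBGGCNNBGGC
print(len(nnb_codons()))                          # 48
print(sorted(STOP_CODONS & set(nnb_codons())))    # ['TAG']
```

Under these assumptions, each NNB position draws from 4 x 4 x 3 = 48 codons, so a template with m random positions has 48^m possible nucleotide sequences. Excluding A from the third position removes the TAA and TGA stop codons, leaving TAG as the only stop codon the random positions can encode.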
CA002408652A 2000-05-12 2001-05-11 A method for designing and screening random libraries of compounds Abandoned CA2408652A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US57047700A 2000-05-12 2000-05-12
US09/570,477 2000-05-12
PCT/IB2001/000810 WO2001086293A2 (en) 2000-05-12 2001-05-11 A method for designing and screening random libraries of compounds

Publications (1)

Publication Number Publication Date
CA2408652A1 (en) 2001-11-15

Family

ID=24279792

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002408652A Abandoned CA2408652A1 (en) 2000-05-12 2001-05-11 A method for designing and screening random libraries of compounds

Country Status (5)

Country Link
EP (1) EP1309869A1 (en)
JP (1) JP2003532431A (en)
AU (1) AU2001258667A1 (en)
CA (1) CA2408652A1 (en)
WO (1) WO2001086293A2 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030224480A1 (en) * 2001-12-27 2003-12-04 Fujitsu Limited Method of designing multifunctional base sequence
JP4953046B2 (en) * 2005-07-29 2012-06-13 独立行政法人産業技術総合研究所 A novel functional peptide creation system that combines a random peptide library or a peptide library that mimics an antibody hypervariable region and an in vitro peptide selection method using RNA-binding proteins

Also Published As

Publication number Publication date
WO2001086293A8 (en) 2003-03-06
WO2001086293A2 (en) 2001-11-15
AU2001258667A1 (en) 2001-11-20
JP2003532431A (en) 2003-11-05
EP1309869A1 (en) 2003-05-14


Legal Events

Date Code Title Description
FZDE Discontinued