MX2007015543A - Methods for synthesis of encoded libraries - Google Patents

Methods for synthesis of encoded libraries

Info

Publication number
MX2007015543A
Authority
MX
Mexico
Prior art keywords
group
seq
oligonucleotide
compounds
library
Application number
MXMX/A/2007/015543A
Other languages
Spanish (es)
Inventor
Morgan Barry
Hale Stephen
C Aricomuendel Christopher
Wagner Richard
Benjamin Dennis
J Franklin George
A Centrella Paolo
A Acharya Raksha
J Kavarana Malcolm
Clark Matthew
Phillip Creaser Steffen
I Israel David
L Gefter Malcolm
Jakob Vest Hansen Nils
Original Assignee
A Acharya Raksha
C Aricomuendel Christopher
Benjamin Dennis
A Centrella Paolo
Clark Matthew
Phillip Creaser Steffen
Franklin George J
L Gefter Malcolm
Hale Stephen
Jakob Vest Hansen Nils
Israel David I
J Kavarana Malcolm
Morgan Barry
Praecis Pharmaceuticals Inc
Wagner Richard

Abstract

The present invention provides angiogenesis inhibitor compounds comprising a MetAP-2 inhibitory core coupled to a peptide, as well as pharmaceutical compositions comprising the angiogenesis inhibitor compounds and a pharmaceutically acceptable carrier. The present invention also provides methods of treating an angiogenic disease, e.g., cancer, in a subject by administering to the subject a therapeutically effective amount of one or more of the angiogenesis inhibitor compounds of the invention.

Description

METHODS FOR SYNTHESIS OF ENCODED LIBRARIES
RELATED APPLICATIONS
This application is a continuation of US Patent Application No. 11/015,458, filed December 17, 2004, pending, which is related to US Provisional Patent Application Serial No. 60/530,854, filed December 17, 2003; US Provisional Patent Application Serial No. 60/540,681, filed January 30, 2004; US Provisional Patent Application Serial No. 60/553,715, filed March 15, 2004; and US Provisional Patent Application Serial No. 60/588,672, filed July 16, 2004. This application also claims priority to US Provisional Patent Application Serial No. 60/689,466, filed June 9, 2005, and US Provisional Patent Application Serial No. 60/731,041, filed October 28, 2005. The complete contents of each of the applications cited above are incorporated herein by reference.
BACKGROUND OF THE INVENTION
The search for more efficient methods of identifying compounds with useful biological activities has led to the development of methods for screening vast numbers of distinct compounds, present in collections referred to as combinatorial libraries. Such libraries may include 10^5 or more distinct compounds. A variety of methods exist for producing combinatorial libraries, and combinatorial syntheses of peptides, peptidomimetics and small organic molecules have been reported.
The two main challenges in the use of combinatorial approaches in drug discovery are the synthesis of libraries of sufficient complexity and the identification of molecules which are active in the screens used. It is generally recognized that the greater the complexity of a library, that is, the number of distinct structures present in the library, the more likely the library is to contain molecules with the activity of interest. Therefore, the chemistry used in library synthesis must be capable of producing vast numbers of compounds within a reasonable timeframe. However, for a given formal or overall concentration, increasing the number of distinct members within the library lowers the concentration of any particular library member. This complicates the identification of active molecules from high-complexity libraries.
One approach to overcoming these obstacles has been the development of encoded libraries, and particularly libraries in which each compound includes an amplifiable tag. Such libraries include DNA-encoded libraries, in which a DNA tag identifying a library member can be amplified using molecular biology techniques, such as the polymerase chain reaction. However, the use of such methods to produce very large libraries has yet to be demonstrated, and it is clear that improved methods for producing such libraries are required to realize the potential of this approach to drug discovery.
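The dilution effect described in the background can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes every library member is equally represented, and the total concentration of 1 mM is a hypothetical figure chosen for the example, not a value taken from this document.

```python
def per_member_concentration(total_molar: float, n_members: int) -> float:
    """Concentration of each member, assuming all members are equally represented."""
    if n_members < 1:
        raise ValueError("library must contain at least one member")
    return total_molar / n_members


# Hypothetical total concentration of 1 mM, used only for illustration.
TOTAL_MOLAR = 1e-3

for size in (10**3, 10**5, 10**8):
    conc = per_member_concentration(TOTAL_MOLAR, size)
    print(f"{size:>11,d} members -> {conc:.1e} M per member")
```

At a fixed total concentration, each hundredfold increase in library size lowers the concentration of an individual member a hundredfold, which is why an amplifiable tag becomes useful for large libraries.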
SUMMARY OF THE INVENTION
The present invention provides a method for synthesizing libraries of molecules which include an encoding oligonucleotide tag. The method uses a "split and pool" strategy in which a solution comprising an initiator, comprising a first building block attached to an encoding oligonucleotide, is divided ("split") into multiple fractions. In each fraction, the initiator is reacted with a second, unique building block and a second, unique oligonucleotide which identifies the second building block. These reactions can be simultaneous or sequential and, if sequential, either reaction can precede the other. The dimeric molecules produced in each of the fractions are combined ("pooled") and then divided again into multiple fractions. Each of these fractions is then reacted with a third unique (fraction-specific) building block and a third unique oligonucleotide which encodes that building block. The number of unique molecules present in the product library is a function of (1) the number of different building blocks used at each stage of the synthesis, and (2) the number of times the pooling and splitting process is repeated.
In one embodiment, the invention provides a method for synthesizing a molecule comprising or consisting of a functional portion which is operatively linked to an encoding oligonucleotide. The method includes the steps of:
(1) providing an initiator compound consisting of a functional portion comprising n building blocks, where n is an integer of 1 or greater, wherein the functional portion comprises at least one reactive group and wherein the functional portion is operatively linked to an initial oligonucleotide;
(2) reacting the initiator compound with a building block comprising at least one complementary reactive group, wherein the at least one complementary reactive group is complementary to the reactive group of step (1), under conditions suitable for the reaction of the reactive group and the complementary reactive group to form a covalent bond;
(3) reacting the initial oligonucleotide with an incoming oligonucleotide which identifies the building block of step (2) in the presence of an enzyme that catalyzes the ligation of the initial oligonucleotide and the incoming oligonucleotide, under conditions suitable for ligation of the incoming oligonucleotide and the initial oligonucleotide, thereby producing a molecule comprising or consisting of a functional portion comprising n + 1 building blocks which is operatively linked to an encoding oligonucleotide.
If the functional portion of step (3) comprises a reactive group, steps (1) to (3) can be repeated one or more times, thereby forming cycles 1 to i, where i is an integer of 2 or greater, with the product of step (3) of a cycle s, where s is an integer of i-1 or less, becoming the initiator compound of cycle s + 1.
In one embodiment, the invention provides a method for synthesizing a library of compounds, wherein the compounds comprise a functional portion comprising two or more building blocks which is operatively linked to an oligonucleotide which identifies the structure of the functional portion. The method includes the steps of:
(1) providing a solution comprising m initiator compounds, where m is an integer of 1 or greater, wherein the initiator compounds consist of a functional portion comprising n building blocks, where n is an integer of 1 or greater, which is operatively linked to an initial oligonucleotide which identifies the n building blocks;
(2) dividing the solution of step (1) into r fractions, where r is an integer of 2 or greater;
(3) reacting the initiator compounds in each fraction with one of r building blocks, thereby producing r fractions comprising compounds consisting of a functional portion comprising n + 1 building blocks operatively linked to the initial oligonucleotide;
(4) reacting the initial oligonucleotide in each fraction with one of a set of r different incoming oligonucleotides in the presence of an enzyme that catalyzes the ligation of the incoming oligonucleotide and the initial oligonucleotide, under conditions suitable for the enzymatic ligation of the incoming oligonucleotide and the initial oligonucleotide, thereby producing r aliquots comprising molecules consisting of a functional portion comprising n + 1 building blocks operatively linked to an elongated oligonucleotide which encodes the n + 1 building blocks.
Optionally, the method may further include the step of (5) recombining the r fractions produced in step (4), thereby producing a solution comprising compounds consisting of a functional portion comprising n + 1 building blocks, which is operatively linked to an elongated oligonucleotide. Steps (1) to (5) may be conducted one or more times to yield cycles 1 to i, where i is an integer of 2 or greater. In cycle s + 1, where s is an integer of i-1 or less, the solution comprising m initiator compounds of step (1) is the solution of step (5) of cycle s. Likewise, the initiator compounds of step (1) of cycle s + 1 are the compounds of step (5) of cycle s.
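The bookkeeping behind the split-and-pool cycle can be sketched in a few lines of Python. This is an illustrative model only, not the chemistry: the building-block names, the tag sequences and the function name `split_and_pool` are all hypothetical, and the pool is assumed to start from a bare initiator carrying no building block and an empty tag, which simplifies the initiator described above.

```python
def split_and_pool(blocks_per_cycle, tags_per_cycle):
    """Enumerate the (building blocks, encoding tag) pairs made by split-and-pool.

    blocks_per_cycle[c] lists the building blocks used in cycle c, and
    tags_per_cycle[c][k] is the oligonucleotide identifying blocks_per_cycle[c][k].
    """
    pool = [((), "")]  # simplified initiator: no building blocks, empty tag
    for blocks, tags in zip(blocks_per_cycle, tags_per_cycle):
        if len(blocks) != len(tags) or len(set(tags)) != len(tags):
            raise ValueError("each building block needs exactly one unique tag")
        fractions = []
        for block, tag in zip(blocks, tags):      # split: one fraction per block
            for members, code in pool:            # each fraction holds the whole pool
                # couple the building block, then ligate its identifying tag
                fractions.append((members + (block,), code + tag))
        pool = fractions                          # pool: recombine all fractions
    return pool


# Hypothetical two-cycle example with 2 and 3 building blocks.
blocks = [["Gly", "Ala"], ["Phe", "Leu", "Val"]]
tags = [["AACG", "AGTC"], ["CCAT", "CGTA", "CTGA"]]

library = split_and_pool(blocks, tags)
print(len(library), "members")
for members, code in library:
    print("-".join(members), code)
```

In this model the library size is the product of the number of building blocks used in each cycle (here 2 x 3 = 6), and every member carries a distinct tag because the tags used within a cycle are distinct.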
In a preferred embodiment, the building blocks are coupled in each step using conventional chemical reactions. The building blocks can be coupled to produce linear or branched polymers or oligomers, such as peptides, peptidomimetics, and peptoids, or non-oligomeric molecules, such as molecules comprising a scaffold structure to which one or more additional chemical moieties are attached. For example, if the building blocks are amino acid residues, the building blocks can be coupled using standard peptide synthesis strategies, such as solution-phase or solid-phase synthesis using suitable protection/deprotection strategies as are known in the field. Preferably, the building blocks are coupled using solution-phase chemistry.
The encoding oligonucleotides are single-stranded or double-stranded oligonucleotides, preferably double-stranded oligonucleotides. The encoding oligonucleotides are preferably oligonucleotides of 4 to 12 bases or base pairs per building block. The encoding oligonucleotides can be coupled using standard solution-phase or solid-phase oligonucleotide synthesis methodology, but are preferably added using a solution-phase enzymatic process. For example, the oligonucleotides can be coupled using a topoisomerase, a ligase, or a DNA polymerase, if the sequence of the encoding oligonucleotides includes an initiation sequence for ligation by one of these enzymes. Enzymatic coupling of the encoding oligonucleotides offers the advantages of (1) greater accuracy of addition compared to standard synthetic (non-enzymatic) coupling; and (2) the use of a simpler protection/deprotection strategy.
In another aspect, the invention provides compounds of Formula I, where X is a functional portion comprising one or more building blocks; Z is an oligonucleotide attached at its 3' end to B; Y is an oligonucleotide attached at its 5' end to C; A is a functional group that forms a covalent bond with X; B is a functional group that forms a bond with the 3' end of Z; C is a functional group that forms a bond with the 5' end of Y; D, F and E are each, independently, a bifunctional linking group; and S is an atom or a molecular scaffold. Such compounds include those that are synthesized using the methods of the invention.
The invention further relates to a library of compounds comprising compounds that comprise a functional portion comprising two or more building blocks which is operatively linked to an oligonucleotide which encodes the structure of the functional portion.
Such libraries may comprise from about 10^2 to about 10^12 or more different members, for example, 10^2, 10^3, 10^4, 10^5, 10^6, 10^7, 10^8, 10^9, 10^10, 10^11, 10^12 or more different members, i.e., different molecular structures. In one embodiment, the compound library comprises compounds which are each independently of Formula I, where X is a functional portion comprising one or more building blocks; Z is an oligonucleotide attached at its 3' end to B; Y is an oligonucleotide attached at its 5' end to C; A is a functional group that forms a covalent bond with X; B is a functional group that forms a bond with the 3' end of Z; C is a functional group that forms a bond with the 5' end of Y; D, F and E are each, independently, a bifunctional linking group; and S is an atom or a molecular scaffold. Such libraries include those that are synthesized using the methods of the invention.
In another aspect, the invention provides a method for identifying a compound that binds to a biological target. The method comprises the steps of:
(1) contacting the biological target with a library of compounds of the invention, wherein the compound library includes compounds which comprise a functional portion comprising two or more building blocks which is operatively linked to an oligonucleotide which encodes the structure of the functional portion, this step being conducted under conditions suitable for at least one member of the compound library to bind to the target;
(2) removing the library members that do not bind to the target;
(3) amplifying the encoding oligonucleotides of the at least one member of the compound library that binds to the target;
(4) sequencing the encoding oligonucleotides of step (3); and
(5) using the sequences determined in step (4) to determine the structure of the functional portions of the members of the compound library which bind to the biological target.
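The final step of the identification method, going from a sequenced tag back to the building blocks it encodes, can be sketched as a table lookup. The example below is a minimal illustration under stated assumptions: each cycle contributes a fixed-length codon (4 bases here, the low end of the 4 to 12 base range mentioned above), the codebooks and reads are hypothetical, and primer regions, overhangs and sequencing errors are ignored.

```python
from collections import Counter

CODON_LEN = 4  # assumed fixed number of bases contributed per cycle

# Hypothetical codebooks: one mapping of codon -> building block per cycle.
CODEBOOKS = [
    {"AACG": "Gly", "AGTC": "Ala"},
    {"CCAT": "Phe", "CGTA": "Leu", "CTGA": "Val"},
]


def decode_tag(sequence, codebooks=CODEBOOKS, codon_len=CODON_LEN):
    """Return the building blocks, in order of addition, encoded by a tag."""
    expected = codon_len * len(codebooks)
    if len(sequence) != expected:
        raise ValueError(f"expected {expected} bases, got {len(sequence)}")
    blocks = []
    for cycle, book in enumerate(codebooks):
        codon = sequence[cycle * codon_len:(cycle + 1) * codon_len]
        if codon not in book:
            raise ValueError(f"unknown codon {codon!r} in cycle {cycle + 1}")
        blocks.append(book[codon])
    return tuple(blocks)


# Hypothetical sequencing reads recovered after selection and amplification.
reads = ["AACGCGTA", "AACGCGTA", "AGTCCCAT", "AACGCGTA"]
counts = Counter(decode_tag(read) for read in reads)
for blocks, n in counts.most_common():
    print("-".join(blocks), n)
```

Counting decoded reads gives a simple ranking of which encoded structures were recovered most often; any real analysis would need to account for amplification bias and read errors, which this sketch does not attempt.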
The present invention provides several advantages in the identification of molecules having a desired property. For example, the methods of the invention allow the use of a variety of chemical reactions to construct the molecules in the presence of the oligonucleotide tag. The methods of the invention also provide a high-fidelity means of incorporating oligonucleotide tags into the chemical structures so produced. In addition, they enable the synthesis of libraries having a large number of copies of each member, thereby allowing multiple rounds of selection against a biological target while leaving a sufficient number of molecules after the final round for amplification and sequencing of the oligonucleotide tags.
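The point about copy number can be illustrated with a short calculation. This is a sketch under assumptions that do not come from this document: members are taken to be equally represented, and the 1 nmol total quantity and 10^8-member library size are hypothetical example figures.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI definition)


def copies_per_member(total_moles: float, n_members: int) -> float:
    """Average number of molecules of each member, assuming equal representation."""
    if n_members < 1:
        raise ValueError("library must contain at least one member")
    return total_moles * AVOGADRO / n_members


# Hypothetical example: 1 nmol of a library with 10^8 distinct members.
copies = copies_per_member(1e-9, 10**8)
print(f"{copies:.2e} copies of each member")
```

With millions of copies of each member at the outset, a meaningful number of tagged molecules can still remain for amplification after several selection rounds, even if each round recovers only a small fraction of the input.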
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a schematic representation of the ligation of double-stranded oligonucleotides, in which the initial oligonucleotide has an overhang which is complementary to the overhang of the incoming oligonucleotide. The initial strand is represented as either free, conjugated to an aminohexyl linker, or conjugated to a phenylalanine residue via an aminohexyl linker.
Figure 2 is a schematic representation of the ligation of oligonucleotides using a splint strand. In this embodiment, the splint is a 12-mer oligonucleotide with sequences complementary to the single-stranded initial oligonucleotide and the single-stranded incoming oligonucleotide.
Figure 3 is a schematic representation of the ligation of an initial oligonucleotide and an incoming oligonucleotide, when the initial oligonucleotide is double-stranded with covalently linked strands, and the incoming oligonucleotide is double-stranded.
Figure 4 is a schematic representation of oligonucleotide elongation using a polymerase. The initial strand is represented as either free, conjugated to an aminohexyl linker, or conjugated to a phenylalanine residue via an aminohexyl linker.
Figure 5 is a schematic representation of the synthesis cycle of one embodiment of the invention.
Figure 6 is a schematic representation of a multiple-round selection process using the libraries of the invention.
Figure 7 is a gel resulting from electrophoresis of the products of each of cycles 1 to 5 described in Example 1 and after ligation of the closing primer. Molecular weight standards are shown in lane 1, and the indicated quantities of a hyperladder, for DNA quantification, are shown in lanes 9 to 12.
Figure 8 is a schematic illustration of the coupling of building blocks using azide-alkyne cycloaddition.
Figures 9 and 10 illustrate the coupling of building blocks by nucleophilic aromatic substitution on a chlorinated triazine.
Figure 11 shows representative chlorinated heteroaromatic structures suitable for use in the synthesis of functional portions.
Figure 12 illustrates the cyclization of a linear peptide using the azide/alkyne cycloaddition reaction.
Figure 13a is a chromatogram of the library produced as described in Example 2 following Cycle 4.
Figure 13b is a mass spectrum of the library produced as described in Example 2 following Cycle 4.
DETAILED DESCRIPTION OF THE INVENTION
The present invention relates to methods for producing compounds and combinatorial libraries of compounds, to the compounds and libraries produced by the methods of the invention, and to methods for using the libraries to identify compounds having a desired property, such as a desired biological activity. The invention also relates to the compounds identified using these methods.
A variety of approaches have been taken to produce and screen combinatorial chemical libraries. Examples include methods in which the individual members of the library are physically separated from one another, such as when a single compound is synthesized in each of a large number of reaction vessels. However, these libraries are typically screened one compound at a time or, at most, several compounds at a time, and therefore do not give rise to the most efficient screening process. In other methods, the compounds are synthesized on solid supports. Such solid supports include chips or membranes in which specific compounds occupy specific regions of the chip or membrane ("position addressable"). In other methods, the compounds are synthesized on beads, with each bead containing a different chemical structure.
Two difficulties that arise in the screening of large libraries are (1) the number of distinct compounds that can be screened; and (2) the identification of compounds which are active in the screen. In one method, the compounds which are active in the screen are identified by dividing the original library into increasingly smaller fractions and subfractions, in each case selecting the fraction or subfraction which retains activity and subdividing further until an active subfraction is obtained which contains a set of compounds sufficiently small that all members of the subset can be individually synthesized and assessed for the desired activity. This is a tedious and time-consuming process.
Another method for deconvoluting the results of a screen of combinatorial libraries is to use libraries in which the members of the library are tagged with an identifying label, that is, each label present in the library is associated with a discrete compound structure present in the library, such that identification of the label reveals the structure of the tagged molecule. One approach to tagged libraries utilizes oligonucleotide tags, as described, for example, in U.S. Patent Nos. 5,573,905; 5,708,153; 5,723,598; and 6,060,596; published PCT applications WO 93/06121; WO 93/20242; WO 94/13623; WO 00/23458; WO 02/074929 and WO 02/103008; and by Brenner and Lerner (Proc. Natl. Acad. Sci. USA 89, 5381-5383 (1992)); Nielsen and Janda (Methods: A Companion to Methods in Enzymology 6, 361-371 (1994)); and Nielsen, Brenner and Janda (J. Am. Chem. Soc. 115, 9812-9813 (1993)), each of which is incorporated herein by reference in its entirety. Such tags can be amplified, using for example the polymerase chain reaction, to produce many copies of the tag, and the tag can then be identified by sequencing. The tag sequence then identifies the structure of the binding molecule, which can be synthesized in pure form.
The present invention provides an improvement in the methods for producing DNA-encoded libraries, as well as the first examples of large libraries (10^5 members or greater) of DNA-encoded molecules in which the functional portion is synthesized using solution-phase synthetic methods. The present invention provides methods that enable facile synthesis of oligonucleotide-encoded combinatorial libraries, and provide an efficient, high-fidelity means of adding such an oligonucleotide tag to each member of a vast collection of molecules.
The methods of the invention include methods for synthesizing bifunctional molecules which comprise a first portion ("functional portion") which is made up of building blocks, and a second portion, operatively linked to the first portion, comprising an oligonucleotide tag which identifies the structure of the first portion, that is, the oligonucleotide tag indicates which building blocks were used in the construction of the first portion, as well as the order in which the building blocks were linked. Generally, the information provided by the oligonucleotide tag is sufficient to determine the building blocks used to construct the functional portion. In certain embodiments, the sequence of the oligonucleotide tag is sufficient to determine the arrangement of the building blocks in the functional portion, for example, for peptide portions, the amino acid sequence.
The term "functional portion", as used herein, refers to a chemical portion comprising one or more building blocks. Preferably, the building blocks in the functional portion are not nucleic acids. The functional portion can be a linear, branched or cyclic polymer or oligomer, or a small organic molecule.
The term "building block", as used herein, is a chemical structural unit which is linked to other chemical structural units or can be linked to other such units. When the functional portion is polymeric or oligomeric, the building blocks are the monomeric units of the polymer or oligomer. Building blocks can also include a scaffold structure ("scaffold building block") to which one or more additional structures ("peripheral building blocks") are, or can be, attached.
It is to be understood that the term "building block" is used herein to refer to a chemical structural unit as it exists in a functional portion and also in the reactive form used for the synthesis of the functional portion. Within the functional portion, a building block will exist without any portion of the building block which is lost as a consequence of incorporating the building block into the functional portion. For example, in cases in which the bond-forming reaction releases a small molecule (see below), the building block as it exists in the functional portion is a "building block residue", that is, the remainder of the building block used in the synthesis after the loss of the atoms that it contributes to the released molecule.
The building blocks can be any chemical compounds which are complementary, that is, the building blocks must be able to react together to form a structure comprising two or more building blocks. Typically, all of the building blocks used will have at least two reactive groups, although it is possible that some of the building blocks used (for example, the last building block in an oligomeric functional portion) will have only one reactive group each. The reactive groups on two different building blocks should be complementary, that is, capable of reacting together to form a covalent bond, optionally with the concomitant loss of a small molecule, such as water, HCl, HF, and so forth.
For the present purposes, two reactive groups are complementary if they are capable of reacting together to form a covalent bond. In a preferred embodiment, the bond-forming reactions occur rapidly under ambient conditions without substantial formation of side products. Preferably, a given reactive group will react with a given complementary reactive group exactly once. In one embodiment, complementary reactive groups of two building blocks react, for example, by nucleophilic substitution, to form a covalent bond. In one embodiment, one member of a pair of complementary reactive groups is an electrophilic group and the other member of the pair is a nucleophilic group.
Complementary electrophilic and nucleophilic groups include any two groups which react by nucleophilic substitution under suitable conditions to form a covalent bond. A variety of suitable bond-forming reactions are known in the art. See, for example, March, Advanced Organic Chemistry, fourth edition, New York: John Wiley and Sons (1992), Chapters 10 to 16; Carey and Sundberg, Advanced Organic Chemistry, Part B, Plenum (1990), Chapters 1-11; and Collman et al., Principles and Applications of Organotransition Metal Chemistry, University Science Books, Mill Valley, Calif. (1987), Chapters 13 to 20; each of which is incorporated herein by reference in its entirety. Examples of suitable electrophilic groups include reactive carbonyl groups, such as acyl chloride groups, ester groups, including carbonyl pentafluorophenyl esters and succinimide esters, ketone groups and aldehyde groups; reactive sulfonyl groups, such as sulfonyl chloride groups; and reactive phosphonyl groups. Other electrophilic groups include terminal epoxide groups, isocyanate groups, and alkyl halide groups. Suitable nucleophilic groups include primary and secondary amino groups, hydroxyl groups and carboxyl groups.
Suitable complementary reactive groups are set forth below.
One skilled in the art can readily determine other pairs of reactive groups that can be used in the present method, and the examples provided herein are not intended to be limiting. In a first embodiment, the complementary reactive groups include activated carboxyl groups, reactive sulfonyl groups or reactive phosphonyl groups, or a combination thereof, and primary or secondary amino groups. In this embodiment, the complementary reactive groups react under suitable conditions to form an amide, sulfonamide or phosphonamidate bond.
In a second embodiment, the complementary reactive groups include epoxide groups and primary or secondary amino groups. A building block containing epoxide reacts with a building block containing amine under suitable conditions to form a carbon-nitrogen bond, resulting in a β-aminoalcohol. In another embodiment, complementary reactive groups include aziridine groups and primary or secondary amino groups. Under suitable conditions, a building block containing aziridine reacts with a building block containing amine to form a carbon-nitrogen bond, resulting in a 1,2-diamine. In a third embodiment, the complementary reactive groups include isocyanate groups and primary or secondary amine groups. A building block containing isocyanate will react with a building block containing amino under suitable conditions to form a carbon-nitrogen bond, resulting in a urea group. In a fourth embodiment, the complementary reactive groups include isocyanate groups and hydroxyl groups. A building block containing isocyanate will react with a hydroxyl-containing building block under suitable conditions to form a carbon-oxygen bond, resulting in a carbamate group.
In a fifth embodiment, the complementary reactive groups include amino groups and carbonyl-containing groups, such as aldehyde or ketone groups. Amines react with such groups by reductive amination to form a new carbon-nitrogen bond. In a sixth embodiment, the complementary reactive groups include phosphorus ylide groups and aldehyde or ketone groups. A building block containing a phosphorus ylide will react with a building block containing an aldehyde or ketone under suitable conditions to form a carbon-carbon double bond, resulting in an alkene. In a seventh embodiment, the complementary reactive groups react by cycloaddition to form a cyclic structure. An example of such complementary reactive groups is alkynes and organic azides, which react under suitable conditions to form a triazole ring structure. An example of the use of this reaction to link two building blocks is illustrated in Figure 8. Suitable conditions for such reactions are known in the art and include those described in WO 03/101972, the entire contents of which are incorporated herein by reference. In an eighth embodiment, the complementary reactive groups are an alkyl halide and a nucleophile, such as an amino group, a hydroxyl group or a carboxyl group. Such groups react under suitable conditions to form a carbon-nitrogen bond (alkyl halide plus amine) or a carbon-oxygen bond (alkyl halide plus hydroxyl or carboxyl group). In a ninth embodiment, the complementary functional groups are a halogenated heteroaromatic group and a nucleophile, and the building blocks are linked under suitable conditions by nucleophilic aromatic substitution. Suitable halogenated heteroaromatic groups include chlorinated pyrimidines, triazines and purines, which react with nucleophiles, such as amines, under mild conditions in aqueous solution. Representative examples of the reaction of an oligonucleotide-tagged trichlorotriazine with amines are shown in Figures 9 and 10.
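The pairings of complementary reactive groups listed in the embodiments above lend themselves to a simple lookup table. The sketch below is only a bookkeeping aid built from those pairings: the group labels, the `LINKAGES` table and the `linkage` function are names invented for this illustration, and the table makes no claim about reaction conditions or yields.

```python
# Unordered pair of reactive groups -> linkage formed, as listed in the
# embodiments above. The string labels are illustrative, not standard names.
LINKAGES = {
    frozenset({"activated carboxyl", "amine"}): "amide",
    frozenset({"reactive sulfonyl", "amine"}): "sulfonamide",
    frozenset({"reactive phosphonyl", "amine"}): "phosphonamidate",
    frozenset({"epoxide", "amine"}): "beta-amino alcohol",
    frozenset({"aziridine", "amine"}): "1,2-diamine",
    frozenset({"isocyanate", "amine"}): "urea",
    frozenset({"isocyanate", "hydroxyl"}): "carbamate",
    frozenset({"aldehyde or ketone", "amine"}): "amine (C-N bond by reductive amination)",
    frozenset({"aldehyde or ketone", "phosphorus ylide"}): "alkene",
    frozenset({"alkyne", "azide"}): "triazole",
}


def linkage(group_a, group_b):
    """Return the linkage formed by two reactive groups, or None if not listed."""
    return LINKAGES.get(frozenset({group_a, group_b}))


def are_complementary(group_a, group_b):
    """True if the pair appears in the table of complementary reactive groups."""
    return linkage(group_a, group_b) is not None


print(linkage("amine", "isocyanate"))
print(are_complementary("hydroxyl", "hydroxyl"))
```

Keys are `frozenset` objects so that the order in which the two groups are named does not matter; a pair of identical groups collapses to a one-element set and is therefore never found in the table.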
Examples of suitable chlorinated heteroaromatic groups are shown in Figure 11.
Additional bond-forming reactions that can be used to link building blocks in the synthesis of the molecules and libraries of the invention include those listed below. The reactions listed below emphasize the reactive functional groups. Various substituents can be present in the reactants, including those labeled R1, R2, R3 and R4. Possible positions which can be substituted include, but are not limited to, those indicated by R1, R2, R3 and R4. These substituents may include any suitable chemical moieties, but are preferably limited to those which will not interfere with or significantly inhibit the indicated reaction and, unless otherwise specified, may include hydrogen, alkyl, substituted alkyl, aryl, substituted aryl, heteroaryl, substituted heteroaryl, alkoxy, aryloxy, arylalkyl, substituted arylalkyl, amino, substituted amino and others as are known in the art. Suitable substituents on these groups include alkyl, aryl, heteroaryl, cyano, halogen, hydroxyl, nitro, amino, mercapto, carboxyl, and carboxamide. Where specified, suitable electron-withdrawing groups include nitro, carboxyl, haloalkyl, such as trifluoromethyl, and others as are known in the art. Examples of suitable electron-donating groups include alkyl, alkoxy, hydroxyl, amino, halogen, acetamido and others as are known in the art.
Addition of a primary amine to an alkene.
Nucleophilic substitution.
Reductive alkylation of an amine.
Palladium-catalyzed carbon-carbon bond-forming reactions.
Ugi condensation reactions.
Electrophilic aromatic substitution reactions: X is an electron-donating group.
Imine/iminium/enamine formation reactions.
Cycloaddition reactions: Diels-Alder cycloaddition; 1,3-dipolar cycloaddition, X-Y-Z = C-N-O, C-N-S, N3.
Nucleophilic aromatic substitution reactions: W is an electron-withdrawing group. Examples of suitable substituents X and Y include substituted or unsubstituted amino, substituted or unsubstituted alkoxy, substituted or unsubstituted thioalkoxy, substituted or unsubstituted aryloxy, and substituted or unsubstituted thioaryloxy.
Heck reaction:

Acetal formation: Examples of suitable substituents X and Y include substituted and unsubstituted amino, hydroxyl and sulfhydryl; the linker that connects X and Y is one suitable for forming the ring structure found in the reaction product.

Aldol reactions: Examples of suitable substituents X include O, S and NR3.

Scaffold building blocks which can be used to form the molecules and libraries of the invention include those having two or more functional groups which can participate in bond-forming reactions with peripheral building block precursors, for example, using one or more of the bond-forming reactions discussed above. The scaffold portions can also be synthesized during the construction of the libraries and molecules of the invention, for example, using building block precursors which can react in specific ways to form molecules comprising a central molecular portion to which peripheral functional groups are attached. In one embodiment, a library of the invention comprises molecules comprising a constant scaffold portion, but different peripheral portions or different arrangements of peripheral portions. In certain libraries, all members of the library comprise a constant scaffold portion; other libraries may comprise molecules that have two or more different scaffold portions. Examples of scaffold-forming reactions that can be used in the construction of the molecules and libraries of the invention are set forth in Table 8. The references cited in the table are incorporated herein by reference in their entirety. The groups R1, R2, R3 and R4 are limited only in that they should not interfere with, or significantly inhibit, the indicated reaction, and may include hydrogen, alkyl, substituted alkyl, heteroalkyl, substituted heteroalkyl, cycloalkyl, heterocycloalkyl, substituted cycloalkyl, substituted heterocycloalkyl, aryl, substituted aryl, arylalkyl, heteroarylalkyl, substituted arylalkyl, substituted heteroarylalkyl, heteroaryl, substituted heteroaryl, halogen, alkoxy, aryloxy, amino, substituted amino and others as are known in the art. Suitable substituents include, but are not limited to, alkyl, alkoxy, thioalkoxy, nitro, hydroxyl, sulfhydryl, aryloxy, aryl-S-, halogen, carboxy, amino, alkylamino, dialkylamino, arylamino, cyano, cyanate, nitrile, isocyanate, thiocyanate, carbamyl, and substituted carbamyl.

It is to be understood that the synthesis of a functional portion can proceed by one particular type of coupling reaction, such as, but not limited to, one of the reactions discussed above, or by a combination of two or more coupling reactions, such as two or more of the coupling reactions discussed above. For example, in one embodiment, the building blocks are joined by a combination of amide bond formation (complementary amino and carboxylic acid groups) and reductive amination (complementary amino and aldehyde or ketone groups). Any coupling chemistry can be used, provided that it is compatible with the presence of an oligonucleotide. Double-stranded (duplex) oligonucleotide tags, as used in certain embodiments of the present invention, are chemically more robust than single-stranded tags and, therefore, tolerate a wider range of reaction conditions and enable the use of bond-forming reactions that might not be possible with single-stranded tags.

A building block may include one or more functional groups in addition to the reactive group or groups used to form the functional portion. One or more of these additional functional groups can be protected to prevent undesired reactions of these functional groups.
Suitable protecting groups are known in the art for a variety of functional groups (Greene and Wuts, Protective Groups in Organic Synthesis, second edition, New York: John Wiley and Sons (1991), incorporated herein by reference). Particularly useful protecting groups include t-butyl esters and ethers, acetals, trityl ethers and amines, acetyl esters, trimethylsilyl ethers, trichloroethyl ethers and esters, and carbamates.

In one embodiment, each building block comprises two reactive groups, which may be the same or different. For example, each building block added in cycle s can comprise two reactive groups which are the same, but which are complementary to the reactive groups of the building blocks added in steps s-1 and s+1. In another embodiment, each building block comprises two reactive groups which are themselves complementary. For example, a library comprising polyamide molecules can be produced by reactions between building blocks comprising two primary amino groups and building blocks comprising two activated carboxyl groups. In the resulting compounds there is no N- or C-terminus, since the alternating amide groups have opposite directionality. Alternatively, a polyamide library can be produced using building blocks each comprising an amino group and an activated carboxyl group. In this embodiment, the building blocks added in cycle step n will have a free reactive group which is complementary to the reactive group available on the (n-1)th building block, while, preferably, the other reactive group on the nth building block is protected. For example, if members of the library are synthesized in the C to N direction, the building blocks added will comprise an activated carboxyl group and a protected amino group.
The functional portions can be polymeric or oligomeric portions, such as peptides, peptidomimetics, peptide nucleic acids or peptoids, or they can be small non-polymeric molecules, for example, molecules having a structure comprising a central scaffold and structures arranged about the periphery of the scaffold. Linear polymeric or oligomeric libraries will result from the use of building blocks having two reactive groups, while branched polymeric or oligomeric libraries will result from the use of building blocks having three or more reactive groups, optionally in combination with building blocks that have only two reactive groups. Such molecules can be represented by the general formula X1X2...Xn, where each X is a monomer unit of a polymer comprising n monomer units, where n is an integer greater than 1. In the case of oligomeric or polymeric compounds, the terminal building blocks need not comprise two functional groups. For example, in the case of a polyamide library, the C-terminal building block may comprise an amino group, but the presence of a carboxyl group is optional. Similarly, the building block at the N-terminus may comprise a carboxyl group, but need not contain an amino group.

Branched oligomeric or polymeric compounds can also be synthesized provided that at least one building block comprises three functional groups which are reactive with other building blocks. A library of the invention may comprise linear molecules, branched molecules or a combination thereof.

Libraries can also be constructed using, for example, a scaffold building block having two or more reactive groups, in combination with other building blocks having only one available reactive group, for example, where any additional reactive groups are either protected or not reactive with the other reactive groups present in the scaffold building block.
In one embodiment, for example, the synthesized molecules can be represented by the general formula X(Y)n, where X is a scaffold building block; each Y is a building block attached to X; and n is an integer of at least two, and preferably an integer from 2 to about 6. In a preferred embodiment, the initial building block of cycle 1 is a scaffold building block. In the molecules of the formula X(Y)n, each Y can be the same or different, but in most of the members of a typical library, each Y will be different.

In one embodiment, libraries of the invention comprise polyamide compounds. The polyamide compounds can be composed of building blocks derived from any amino acids, including the twenty naturally occurring α-amino acids, such as alanine (Ala; A), glycine (Gly; G), asparagine (Asn; N), aspartic acid (Asp; D), glutamic acid (Glu; E), histidine (His; H), leucine (Leu; L), lysine (Lys; K), phenylalanine (Phe; F), tyrosine (Tyr; Y), threonine (Thr; T), serine (Ser; S), arginine (Arg; R), valine (Val; V), glutamine (Gln; Q), isoleucine (Ile; I), cysteine (Cys; C), methionine (Met; M), proline (Pro; P) and tryptophan (Trp; W), where the three-letter and one-letter codes for each amino acid are given in parentheses. In its naturally occurring form, each of the aforementioned amino acids exists in the L configuration, which should be assumed herein unless otherwise noted. In the present method, however, the D-configuration forms of these amino acids can also be used. These D-amino acids are indicated herein by lowercase three-letter or one-letter codes, i.e., ala (a), gly (g), leu (l), gln (q), thr (t), ser (s), and so on.
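The upper/lowercase convention just described can be illustrated with a minimal Python sketch. The function name `expand` and the tuple output format are assumptions made for illustration only; the code table is limited to the twenty amino acids listed above.

```python
# One-letter and three-letter codes for the twenty amino acids listed above.
THREE_LETTER = {
    "A": "Ala", "G": "Gly", "N": "Asn", "D": "Asp", "E": "Glu",
    "H": "His", "L": "Leu", "K": "Lys", "F": "Phe", "Y": "Tyr",
    "T": "Thr", "S": "Ser", "R": "Arg", "V": "Val", "Q": "Gln",
    "I": "Ile", "C": "Cys", "M": "Met", "P": "Pro", "W": "Trp",
}


def expand(sequence):
    """Expand one-letter codes: uppercase means L form, lowercase means D form.

    Hypothetical helper; returns a list of (configuration, three-letter code).
    """
    residues = []
    for code in sequence:
        name = THREE_LETTER.get(code.upper())
        if name is None:
            raise ValueError(f"unknown amino acid code: {code!r}")
        if code.isupper():
            residues.append(("L", name))
        else:
            # D forms are written with the lowercase three-letter code.
            residues.append(("D", name.lower()))
    return residues


if __name__ == "__main__":
    print(expand("AlqT"))
```

Under this convention a mixed string such as `AlqT` reads as L-alanine, D-leucine, D-glutamine, L-threonine.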
The building blocks can also be derived from other α-amino acids, including, but not limited to, 3-arylalanines, such as naphthylalanine, phenyl-substituted phenylalanines, including 4-fluoro-, 4-chloro-, 4-bromo- and 4-methylphenylalanine; 3-heteroarylalanines, such as 3-pyridylalanine, 3-thienylalanine, 3-quinolylalanine, and 3-imidazolylalanine; ornithine; citrulline; homocitrulline; sarcosine; homoproline; homocysteine; substituted proline, such as hydroxyproline and fluoroproline; dehydroproline; norleucine; O-methyltyrosine; O-methylserine; O-methylthreonine and 3-cyclohexylalanine. Each of the preceding amino acids can be used in either the D or L configuration.

The building blocks can also be amino acids which are not α-amino acids, such as α-azaamino acids; β-, γ-, δ- and ε-amino acids; and N-substituted amino acids, such as N-substituted glycine, where the N-substituent can be, for example, a substituted or unsubstituted alkyl, aryl, heteroaryl, arylalkyl or heteroarylalkyl group. In one embodiment, the N-substituent is a side chain of a naturally occurring or non-naturally occurring α-amino acid.

The building block can also be a peptidomimetic structure, such as a dipeptide, tripeptide, tetrapeptide or pentapeptide mimetic. Such peptidomimetic building blocks are preferably derived from aminoacyl compounds, so that the chemistry of adding these building blocks to the growing poly(aminoacyl) group is the same as, or similar to, the chemistry used for the other building blocks. The building blocks can also be molecules which are capable of forming bonds which are isosteric with a peptide bond, to form peptidomimetic functional portions comprising a modification of the peptide backbone, such as ψ[CH2S], ψ[CH2NH], ψ[CSNH2], ψ[NHCO], ψ[COCH2], and ψ[(E) or (Z) CH=CH]. In the nomenclature used above, ψ indicates the absence of an amide bond. The structure that replaces the amide group is specified within the brackets.
In one embodiment, the invention provides a method for synthesizing a compound comprising or consisting of a functional portion which is operatively linked to a coding oligonucleotide. The method includes the steps of:

(1) providing an initiator compound consisting of an initial functional portion comprising n building blocks, where n is an integer of 1 or greater, wherein the initial functional portion comprises at least one reactive group, and wherein the initial functional portion is operatively linked to an initial oligonucleotide which encodes the n building blocks;

(2) reacting the initiator compound with a building block comprising at least one complementary reactive group, wherein the at least one complementary reactive group is complementary to the reactive group of step (1), under conditions suitable for reaction of the reactive group and the complementary reactive group to form a covalent bond;

(3) reacting the initial oligonucleotide with an incoming oligonucleotide in the presence of an enzyme which catalyzes ligation of the initial oligonucleotide and the incoming oligonucleotide, under conditions suitable for ligation of the incoming oligonucleotide and the initial oligonucleotide, thereby producing a molecule which comprises or consists of a functional portion comprising n + 1 building blocks which is operatively linked to a coding oligonucleotide.

If the functional portion of step (3) comprises a reactive group, steps (1) to (3) can be repeated one or more times, thereby forming cycles 1 to i, where i is an integer of 2 or greater, with the product of step (3) of cycle s-1, where s is an integer of i or less, becoming the initiator compound of step (1) of cycle s. In each cycle, one building block is added to the growing functional portion and one oligonucleotide sequence, which encodes the new building block, is added to the growing coding oligonucleotide.
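As a rough illustration of the bookkeeping in steps (1) to (3), the following Python sketch models a compound as a tuple of building blocks paired with a growing tag string. It is only a data model, not a description of the chemistry: the 4-base tags in `TAGS`, the head sequence `HEAD`, and the fixed tag length are all hypothetical choices made for the example.

```python
from dataclasses import dataclass

# Hypothetical codebook: one fixed-length tag per building block.
TAGS = {"Gly": "ACGT", "Ala": "TGCA", "Phe": "GATC", "Ser": "CTAG"}
HEAD = "GGCC"      # hypothetical constant start of the initial oligonucleotide
TAG_LENGTH = 4     # assumed tag length, matching TAGS above


@dataclass(frozen=True)
class Compound:
    blocks: tuple  # building blocks in order of addition
    oligo: str     # coding oligonucleotide, written 5' to 3'


def initiator(block):
    """Step (1): an initiator whose initial oligonucleotide encodes its block."""
    return Compound((block,), HEAD + TAGS[block])


def run_cycle(compound, block):
    """Steps (2) and (3): add one block, then append the tag that encodes it."""
    return Compound(compound.blocks + (block,), compound.oligo + TAGS[block])


def decode(oligo):
    """Read the building blocks, in order of addition, back from the tag."""
    body = oligo[len(HEAD):]
    lookup = {tag: block for block, tag in TAGS.items()}
    return tuple(lookup[body[i:i + TAG_LENGTH]]
                 for i in range(0, len(body), TAG_LENGTH))


if __name__ == "__main__":
    product = run_cycle(run_cycle(initiator("Gly"), "Phe"), "Ser")
    print(product.oligo, decode(product.oligo))
```

Because each cycle appends exactly one tag, reading the oligonucleotide in order recovers both the identity and the order of addition of the building blocks.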
In one embodiment, the initial initiator compound(s) is generated by reacting a first building block with an oligonucleotide (e.g., an oligonucleotide which includes PCR primer sequences or an initial oligonucleotide) or with a linker to which such an oligonucleotide is attached. In the embodiment set forth in Figure 5, the linker comprises a reactive group for attachment of a first building block and is attached to an initial oligonucleotide. In this embodiment, reaction of a building block, or in each of multiple aliquots, one of a collection of building blocks, with the reactive group of the linker and addition of an oligonucleotide encoding the building block to the initial oligonucleotide produces the one or more initial initiator compounds of the process set forth above.

In a preferred embodiment, each individual building block is associated with a different oligonucleotide, so that the sequence of nucleotides in the oligonucleotide added in a given cycle identifies the building block added in the same cycle.

The coupling of building blocks and the ligation of oligonucleotides will generally occur at similar concentrations of starting materials and reagents. For example, reactant concentrations on the order of micromolar to millimolar, for example from about 10 μM to about 10 mM, are preferred in order to have efficient coupling of the building blocks.

In certain embodiments, the method further comprises, after step (2), the step of scavenging any unreacted initial functional portion. Scavenging any unreacted initial functional portion in a particular cycle prevents the initial functional portion of that cycle from reacting with a building block added in a later cycle. Such reactions could lead to the generation of functional portions that are missing one or more building blocks, potentially leading to a range of functional portion structures which correspond to a particular oligonucleotide sequence.
Such scavenging can be achieved by reacting any remaining initial functional portion with a compound that reacts with the reactive group of step (2).
Preferably, the scavenger compound reacts rapidly with the reactive group of step (2) and does not include additional reactive groups that can react with the building blocks added in subsequent cycles. For example, in the synthesis of a compound where the reactive group of step (2) is an amino group, a suitable scavenger compound is an N-hydroxysuccinimide ester, such as N-hydroxysuccinimide ester of acetic acid. In another embodiment, the invention provides a method for producing a library of compounds, wherein each compound comprises a functional portion comprising two or more residues of building blocks which are operatively linked to an oligonucleotide. In a preferred embodiment, the oligonucleotide present in each molecule provides sufficient information to identify the building blocks within the molecule and, optionally, the order of addition of the building blocks. In this embodiment, the method of the invention comprises a method for synthesizing a library of compounds, wherein the compounds comprise a functional portion comprising two or more building blocks which are operatively linked to an oligonucleotide which identifies the structure of the functional portion. 
The method comprises the steps of:

(1) providing a solution comprising m initiator compounds, wherein m is an integer of 1 or greater, where the initiator compounds consist of a functional portion comprising n building blocks, where n is an integer of 1 or greater, which is operatively linked to an initial oligonucleotide which identifies the n building blocks;

(2) dividing the solution of step (1) into at least r fractions, where r is an integer of 2 or greater;

(3) reacting each fraction with one of r building blocks, thereby producing r fractions comprising compounds consisting of a functional portion comprising n + 1 building blocks operatively linked to the initial oligonucleotide;

(4) reacting each of the r fractions of step (3) with one of a set of r different incoming oligonucleotides under conditions suitable for enzymatic ligation of the incoming oligonucleotide to the initial oligonucleotide, thereby producing r fractions comprising molecules that consist of a functional portion comprising n + 1 building blocks operatively linked to an elongated oligonucleotide which encodes the n + 1 building blocks.

Optionally, the method may further include the step of (5) recombining the r fractions produced in step (4), thereby producing a solution comprising molecules consisting of a functional portion comprising n + 1 building blocks, which is operatively linked to an elongated oligonucleotide which encodes the n + 1 building blocks.

Steps (1) to (5) may be conducted one or more times to yield cycles 1 to i, where i is an integer of 2 or greater. In cycle s + 1, where s is an integer of i-1 or less, the solution comprising m initiator compounds of step (1) is the solution of step (5) of cycle s. Likewise, the initiator compounds of step (1) of cycle s + 1 are the products of step (4) in cycle s. Preferably the solution of step (2) is divided into r fractions in each cycle of library synthesis.
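Steps (1) to (5) can be sketched as a small Python simulation. This is an idealized model that assumes every coupling and every ligation goes to completion; the building block names, the 3-base tags and the `GCGC` head sequence are hypothetical and serve only to make the bookkeeping visible.

```python
def split_and_pool(pool, tagged_blocks):
    """One idealized library cycle.

    pool          -- list of (blocks, oligo) pairs: the m initiator compounds
    tagged_blocks -- list of (building_block, tag) pairs: one per fraction, r in all
    Returns the recombined pool of step (5).
    """
    recombined = []
    for block, tag in tagged_blocks:           # step (2): one fraction per block
        for blocks, oligo in pool:             # each fraction holds every initiator
            grown = blocks + (block,)          # step (3): couple the building block
            elongated = oligo + tag            # step (4): ligate the encoding tag
            recombined.append((grown, elongated))
    return recombined                          # step (5): recombine the fractions


# Hypothetical inputs: one initiator (m = 1) and three blocks per cycle (r = 3).
INITIATORS = [(("X",), "GCGC")]
CYCLE_1 = [("A", "AAC"), ("B", "ACA"), ("C", "CAA")]
CYCLE_2 = [("D", "GGT"), ("E", "GTG"), ("F", "TGG")]

if __name__ == "__main__":
    library = split_and_pool(INITIATORS, CYCLE_1)
    library = split_and_pool(library, CYCLE_2)
    for blocks, oligo in library:
        print("-".join(blocks), oligo)
```

In this idealized model each cycle multiplies the number of distinct compounds by r, so two cycles with r = 3 starting from a single initiator give nine compounds, each carrying a distinct tag sequence.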
In this embodiment, each fraction is reacted with a single building block.

In the methods of the invention, the order of addition of the building block and the incoming oligonucleotide is not critical, and steps (2) and (3) of the synthesis of a molecule, and steps (3) and (4) of the library synthesis, can be reversed, i.e., the incoming oligonucleotide can be ligated to the initial oligonucleotide before the new building block is added. In certain embodiments, it may be possible to conduct these two steps simultaneously.

In certain embodiments, the method further comprises, after step (2), the step of scavenging any unreacted initial functional portion. Scavenging any unreacted initial functional portion in a particular cycle prevents the initial functional portion of that cycle from reacting with a building block added in a later cycle. Such reactions could lead to the generation of functional portions that are missing one or more building blocks, potentially leading to a range of functional portion structures which correspond to a particular oligonucleotide sequence. Such scavenging can be achieved by reacting any remaining initial functional portion with a compound that reacts with the reactive group of step (2). Preferably, the scavenger compound reacts rapidly with the reactive group of step (2) and does not include additional reactive groups that can react with the building blocks added in subsequent cycles. For example, in the synthesis of a compound where the reactive group of step (2) is an amino group, a suitable scavenger compound is an N-hydroxysuccinimide ester, such as the N-hydroxysuccinimide ester of acetic acid.

In one embodiment, the building blocks used in library synthesis are selected from a set of candidate building blocks by evaluating the ability of the candidate building blocks to react with appropriate complementary functional groups under the conditions used for the synthesis of the library.
Building blocks which are shown to be suitably reactive under such conditions can then be selected for incorporation into the library.

The products of a given cycle can, optionally, be purified. When the cycle is an intermediate cycle, that is, any cycle prior to the final cycle, these products are intermediates and can be purified prior to the initiation of the next cycle. If the cycle is the final cycle, the products of the cycle are the final products, and can be purified prior to any use of the compounds. This purification step can, for example, remove unreacted or excess reactants and the enzyme used for the ligation of the oligonucleotides. Any methods that are suitable for separating the products from other species present in the solution can be used, including liquid chromatography, such as high performance liquid chromatography (HPLC), and precipitation with a suitable co-solvent, such as ethanol. The methods suitable for purification will depend on the nature of the products and the solvent system used for the synthesis.

The reactions are preferably conducted in aqueous solution, such as a buffered aqueous solution, but can also be conducted in mixed aqueous/organic media consistent with the solubility properties of the building blocks, the oligonucleotides, the intermediates and final products, and the enzyme used to catalyze the ligation of oligonucleotides.

It should be understood that the theoretical number of compounds produced by a given cycle in the method described above is the product of the number of different initiator compounds, m, used in the cycle and the number of different building blocks added in the cycle, r. The actual number of different compounds produced in the cycle can be as high as the product of r and m (r x m), but it can be smaller, given the differences in reactivity of certain building blocks with certain other building blocks.
For example, the kinetics of adding a particular building block to a particular initiator compound may be such that, on the time scale of the synthetic cycle, little or none of the product of that reaction is produced.

In certain embodiments, a common building block is added prior to cycle 1, after the last cycle or between any two cycles. For example, when the functional portion is a polyamide, a common N-terminal capping building block can be added after the final cycle. A common building block can also be introduced between any two cycles, for example, to add a functional group, such as an alkyne or azide group, which can be used to modify the functional portions, for example by cyclization, after library synthesis.

The term "operatively linked", as used herein, means that two chemical structures are joined together in such a way that they remain joined through the various manipulations that they are expected to undergo. Typically the functional portion and the coding oligonucleotide are covalently linked by an appropriate linking group. The linking group is a bivalent moiety with one attachment site for the oligonucleotide and one attachment site for the functional portion. For example, when the functional portion is a polyamide compound, the polyamide compound can be attached to the linking group at its N-terminus, at its C-terminus or through a functional group on one of the side chains. The linking group is sufficient to separate the polyamide compound and the oligonucleotide by at least one atom, and preferably by more than one atom, such as at least two, at least three, at least four, at least five or at least six atoms. Preferably, the linking group is flexible enough to allow the polyamide compound to bind target molecules in a manner which is independent of the oligonucleotide.

In one embodiment, the linking group is attached to the N-terminus of the polyamide compound and to the 5' phosphate group of the oligonucleotide.
For example, the linking group can be derived from a linking group precursor comprising an activated carboxyl group at one end and an activated ester at the other end. Reaction of the linking group precursor with the N-terminal nitrogen atom will form an amide bond connecting the linking group to the polyamide compound or N-terminal building block, while reaction of the linking group precursor with the 5'-hydroxy group of the oligonucleotide will result in attachment of the oligonucleotide to the linking group via an ester linkage. The linking group may comprise, for example, a polymethylene chain, such as a -(CH2)n- chain, or a poly(ethylene glycol) chain, such as a -(CH2CH2O)n- chain, where in both cases n is an integer from 1 to about 20. Preferably, n is from 2 to about 12, more preferably from about 4 to about 10. In one embodiment, the linking group comprises a hexamethylene group (-(CH2)6-).

When the building blocks are amino acid residues, the resulting functional portion is a polyamide. The amino acids can be coupled using any suitable chemistry for the formation of amide bonds. Preferably, the coupling of the amino acid building blocks is conducted under conditions which are compatible with enzymatic ligation of oligonucleotides, for example, at neutral or near-neutral pH and in aqueous solution. In one embodiment, the polyamide compound is synthesized in the C-terminal to N-terminal direction. In this embodiment, the first, or C-terminal, building block is coupled at its carboxyl group to an oligonucleotide via a suitable linking group. The first building block is reacted with the second building block, which preferably has an activated carboxyl group and a protected amino group. Any activating/protecting group strategy which is suitable for solution-phase amide bond formation can be used. For example, suitable activated carboxyl species include acyl fluorides (U.S. Patent No. 5,360,928, incorporated herein by reference in its entirety), symmetrical anhydrides and N-hydroxysuccinimide esters. The acyl groups can also be activated in situ, as is known in the art, by reaction with a suitable activating compound. Suitable activating compounds include dicyclohexylcarbodiimide (DCC), diisopropylcarbodiimide (DIC), 1-ethoxycarbonyl-2-ethoxy-1,2-dihydroquinoline (EEDQ), 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride (EDC), n-propane-phosphonic anhydride (PPA), N,N-bis(2-oxo-3-oxazolidinyl)imido-phosphoryl chloride (BOP-Cl), bromo-tris-pyrrolidinophosphonium hexafluorophosphate (PyBrop), diphenylphosphoryl azide (DPPA), Castro's reagent (BOP, PyBop), O-benzotriazolyl-N,N,N',N'-tetramethyluronium salts (HBTU), diethylphosphoryl cyanide (DEPCN), 2,5-diphenyl-2,3-dihydro-3-oxo-4-hydroxy-thiophene dioxide (Steglich's reagent, HOTDO), 1,1'-carbonyl-diimidazole (CDI), and 4-(4,6-dimethoxy-1,3,5-triazin-2-yl)-4-methylmorpholinium chloride (DMT-MM). The coupling reagents can be used alone or in combination with additives such as N,N-dimethyl-4-aminopyridine (DMAP), N-hydroxy-benzotriazole (HOBt), N-hydroxybenzotriazine (HOOBt), N-hydroxysuccinimide (HOSu), N-hydroxyazabenzotriazole (HOAt), azabenzotriazolyl-tetramethyluronium salts (HATU, HAPyU) or 2-hydroxypyridine. In certain embodiments, synthesis of a library requires the use of two or more activation strategies, to enable the use of a structurally diverse set of building blocks. For each building block, one skilled in the art can determine the appropriate activation strategy.

The N-terminal protecting group can be any protecting group which is compatible with the conditions of the process, for example, protecting groups which are suitable for solution-phase synthesis conditions. A preferred protecting group is the fluorenylmethoxycarbonyl ("Fmoc") group.
Any potentially reactive functional groups on the side chain of the aminoacyl building block may also need to be suitably protected. Preferably the side chain protecting group is orthogonal to the N-terminal protecting group, that is, the side chain protecting group is removed under conditions which are different from those required for removal of the N-terminal protecting group. Suitable side chain protecting groups include the nitroveratryl group, which can be used to protect both side chain carboxyl groups and side chain amino groups. Another suitable side chain amine protecting group is the N-pent-4-enoyl group.

The building blocks can be modified following incorporation into the functional portion, for example, by a suitable reaction involving a functional group on one or more of the building blocks. Building block modification can take place following addition of the final building block or at any intermediate point in the synthesis of the functional portion, for example, after any cycle of the synthetic process. When a library of bifunctional molecules of the invention is synthesized, building block modification can be carried out on the entire library or on a portion of the library, thereby increasing the degree of complexity of the library. Suitable building block modifying reactions include those reactions that can be performed under conditions compatible with the functional portion and the coding oligonucleotide. Examples of such reactions include acylation and sulfonation of amino groups or hydroxyl groups, alkylation of amino groups, esterification or thioesterification of carboxyl groups, amidation of carboxyl groups, epoxidation of alkenes, and other reactions as are known in the art. When the functional portion includes a building block having an alkyne or an azide functional group, the azide/alkyne cycloaddition reaction can be used to derivatize the building block.
For example, a building block that includes an alkyne can be reacted with an organic azide, or a building block that includes an azide can be reacted with an alkyne, in each case forming a triazole. Building block modification reactions can take place after addition of the final building block or at an intermediate point in the synthetic process, and can be used to attach a variety of chemical structures to the functional portion, including carbohydrates, metal-binding portions and structures for targeting certain biomolecules or tissue types.

In another embodiment, the functional portion comprises a linear series of building blocks and this linear series is cyclized using a suitable reaction. For example, if at least two building blocks in the linear array include sulfhydryl groups, the sulfhydryl groups can be oxidized to form a disulfide bond, thereby cyclizing the linear array. For example, the functional portions can be oligopeptides which include two or more L- or D-cysteine and/or L- or D-homocysteine portions. The building blocks may also include other functional groups capable of reacting together to cyclize the linear array, such as carboxyl groups and amino or hydroxyl groups.

In a preferred embodiment, one of the building blocks in the linear array comprises an alkyne group and another building block in the linear array comprises an azide group. The azide and alkyne groups can be induced to react via cycloaddition, resulting in the formation of a macrocyclic structure. In the example illustrated in Figure 9, the functional portion is a polypeptide comprising a propargylglycine building block at its C-terminus and an azidoacetyl group at its N-terminus. Reaction of the alkyne and the azide group under suitable conditions results in the formation of a cyclic compound, which includes a triazole structure within the macrocycle.
In the case of a library, in one embodiment, each member of the library comprises alkyne- and azide-containing building blocks and can be cyclized in this way. In a second embodiment, all members of the library comprise alkyne- and azide-containing building blocks, but only a portion of the library is cyclized. In a third embodiment, only certain functional portions include alkyne- and azide-containing building blocks, and only these molecules are cyclized. In the second and third embodiments mentioned above, the library, following the cycloaddition reaction, will include both cyclic and linear functional portions.

In some embodiments of the invention in which the same functional portion, e.g., triazine, is added to each and every fraction of the library during a particular synthesis step, it may not be necessary to add an oligonucleotide tag that encodes that functional portion.

Oligonucleotides can be ligated by chemical or enzymatic methods. In one embodiment, the oligonucleotides are ligated by chemical means. Chemical ligation of DNA and RNA can be performed using reagents such as water-soluble carbodiimide and cyanogen bromide as taught, for example, by Shabarova, et al. (1991) Nucleic Acids Research, 19, 4247-4251, Federova, et al. (1996) Nucleosides and Nucleotides, 15, 1137-1147, and Carriero and Damha (2003) Journal of Organic Chemistry, 68, 8328-8338. In one embodiment, chemical ligation is performed using cyanogen bromide, 5 M in acetonitrile, in a 1:10 v/v ratio with 5' phosphorylated oligonucleotide in a pH 7.6 buffer (1 M MES + 20 mM MgCl2) at 0 degrees for 1-5 minutes. In another embodiment, the oligonucleotides are ligated using enzymatic methods. In either embodiment, the oligonucleotides can be double-stranded, preferably with an overhang of about 5 to about 14 bases.
The oligonucleotides can also be single-stranded, in which case a splint with an overlap of about 6 bases with each of the oligonucleotides to be ligated is used to place the reactive 5' and 3' ends in proximity to each other.
In one embodiment, the initial building block is operatively linked to an initial oligonucleotide. Before or after the coupling of a second building block to the initial building block, a second oligonucleotide sequence which identifies the second building block is ligated to the initial oligonucleotide. Methods for ligating the initial oligonucleotide sequence and the incoming oligonucleotide sequence are set forth in Figures 1 and 2. In Figure 1, the initial oligonucleotide is double-stranded, and one strand includes an overhang sequence which is complementary to one end of the second oligonucleotide and brings the second oligonucleotide into contact with the initial oligonucleotide. Preferably, the overhang sequence of the initial oligonucleotide and the complementary sequence of the second oligonucleotide are each at least about 4 bases long; more preferably, both sequences are of the same length. The initial oligonucleotide and the second oligonucleotide can be ligated using a suitable enzyme. If the initial oligonucleotide is linked to the first building block at the 5' end of one of the strands (the "main strand"), then the strand which is complementary to the main strand (the "secondary strand") will include the overhang sequence at its 5' end, and the second oligonucleotide will include a complementary sequence at its 5' end. After ligation of the second oligonucleotide, a strand can be added which is complementary to the sequence of the second oligonucleotide that lies 3' to the overhang-complementary sequence, and which includes an additional overhang sequence.
In one embodiment, the oligonucleotide is elongated as set forth in Figure 2.
The oligonucleotide bound to the growing functional portion and the incoming oligonucleotide are positioned for ligation by the use of a "splint" sequence, which includes a region which is complementary to the 3' end of the initial oligonucleotide and a region which is complementary to the 5' end of the incoming oligonucleotide. The splint places the 3' end of the initial oligonucleotide in proximity with the 5' end of the incoming oligonucleotide, and ligation is accomplished enzymatically. In the example illustrated in Figure 2, the initial oligonucleotide consists of 16 nucleobases and the splint is complementary to the 6 bases at its 3' end. The incoming oligonucleotide consists of 12 nucleobases, and the splint is complementary to the 6 bases at its 5' end. The length of the splint and the lengths of the complementary regions are not critical. However, the complementary regions must be long enough to allow stable dimer formation under the ligation conditions, but not so long as to yield an excessively large encoding oligonucleotide in the final molecules. It is preferred that the complementary regions be from about 4 bases to about 12 bases, more preferably from about 5 bases to about 10 bases, and most preferably from about 5 bases to about 8 bases in length.
The split-and-pool methods used for the library synthesis methods set forth herein ensure that each unique functional portion is operably linked to at least one unique oligonucleotide sequence which identifies the functional portion. If 2 or more different oligonucleotide tags are used for at least one building block in at least one of the synthetic cycles, each distinct functional portion comprising that building block will be encoded by multiple oligonucleotides. For example, if 2 oligonucleotide tags are used for each building block during the synthesis of a 4-cycle library, there will be 16 DNA sequences (2^4) encoding each unique functional portion.
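The count of 16 encoding sequences for a 4-cycle library can be checked with a short enumeration. The following Python sketch is illustrative only: the 5-base tag sequences and the function name `encoding_sequences` are hypothetical and do not come from the specification; the sketch simply concatenates one tag per cycle in every possible combination.

```python
from itertools import product


def encoding_sequences(tags_per_cycle):
    """Return every concatenated tag sequence that encodes one functional portion.

    tags_per_cycle holds one entry per synthesis cycle; each entry lists the
    alternative tags assigned to the building block used in that cycle.
    """
    return ["".join(combo) for combo in product(*tags_per_cycle)]


# Hypothetical tags: two alternatives per building block, four cycles.
example_tags = [
    ["ACGTA", "TGCAT"],
    ["CCATG", "GGTAC"],
    ["ATTGC", "TAACG"],
    ["GACCT", "CTGGA"],
]

sequences = encoding_sequences(example_tags)
print(len(sequences))  # 2 ** 4 = 16
```

With k alternative tags per building block and c cycles, the same enumeration yields k^c encoding sequences per functional portion.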
There are several potential advantages to encoding each unique functional portion with multiple sequences. First, selection of different combinations of tag sequences encoding the same functional portion ensures that those molecules were independently selected. Second, selection of different combinations of tag sequences encoding the same functional portion eliminates the possibility that the selection was based on the sequence of the oligonucleotide. Third, technical artifacts can be recognized if sequence analysis suggests that a particular functional portion is highly enriched, but only one combination of sequences out of many possibilities appears. Multiple tagging can be achieved by running independent split reactions with the same building block but a different oligonucleotide tag. Alternatively, multiple tagging can be achieved by mixing an appropriate ratio of each tag in a single tagging reaction with an individual building block.
In one embodiment, the initial oligonucleotide is double-stranded and the two strands are covalently joined. One means of covalently joining the two strands is shown in Figure 3, in which a linking moiety, e.g., a linker, is used to join the two strands and the functional portion. The linking moiety may be any chemical structure comprising a first functional group which is adapted to react with a building block, a second functional group which is adapted to react with the 3' end of an oligonucleotide, and a third functional group which is adapted to react with the 5' end of an oligonucleotide. Preferably, the second and third functional groups are oriented to place the two oligonucleotide strands in a relative orientation that permits hybridization of the two strands.
For example, the linking moiety, e.g., the linker, may have the general structure (I), where A is a functional group that can form a covalent bond with a building block, B is a functional group that can form a bond with the 5' end of an oligonucleotide, and C is a functional group that can form a bond with the 3' end of an oligonucleotide. D, F and E are chemical groups that link functional groups A, C and B to S, which is a core atom or scaffold. Preferably, D, E and F are each independently a chain of atoms, such as an alkylene chain or an oligo(ethylene glycol) chain; D, E and F can be the same or different, and are preferably effective to allow hybridization of the two oligonucleotides and synthesis of the functional portion. In one embodiment, the trivalent linking moiety is a linker having a structure in which the NH group is available for attachment to a building block, while the terminal phosphate groups are available for attachment to an oligonucleotide.
In embodiments in which the initial oligonucleotide is double-stranded, the incoming oligonucleotides are also double-stranded. As shown in Figure 3, the initial oligonucleotide can have one strand which is longer than the other, providing an overhang sequence. In this embodiment, the incoming oligonucleotide includes an overhang sequence which is complementary to the overhang sequence of the initial oligonucleotide. Hybridization of the two complementary overhang sequences places the incoming oligonucleotide in position for ligation to the initial oligonucleotide. This ligation can be performed enzymatically using a DNA or RNA ligase. The overhang sequences of the incoming oligonucleotide and the initial oligonucleotide are preferably of the same length and consist of two or more nucleotides, preferably from 2 to about 10 nucleotides, more preferably from 2 to about 6 nucleotides.
In a preferred embodiment, the incoming oligonucleotide is a double-stranded oligonucleotide having an overhang sequence at each end. The overhang sequence at one end is complementary to the overhang sequence of the initial oligonucleotide, while, after ligation of the incoming oligonucleotide and the initial oligonucleotide, the overhang sequence at the other end becomes the overhang sequence of the initial oligonucleotide of the next cycle. In one embodiment, the three overhang sequences are all 2 to 6 nucleotides in length, and the coding sequence of the incoming oligonucleotide is 3 to 10 nucleotides in length, preferably 3 to 6 nucleotides in length. In a particular embodiment, the overhang sequences are all 2 nucleotides in length and the coding sequence is 5 nucleotides in length.
In the embodiment illustrated in Figure 4, the incoming strand has a region at its 3' end which is complementary to the 3' end of the initial oligonucleotide, leaving overhangs at the 5' ends of both strands. The 5' overhangs can be filled in using, for example, a DNA polymerase, such as Vent polymerase, which results in an elongated double-stranded oligonucleotide. The secondary strand of this oligonucleotide can be removed, and an additional sequence added to the 3' end of the main strand using the same method.
The encoding oligonucleotide tag is formed as a result of the successive addition of oligonucleotides that identify each successive building block. In one embodiment of the methods of the invention, successive oligonucleotide tags can be coupled by enzymatic ligation to produce an encoding oligonucleotide.
Enzyme-catalyzed ligation of oligonucleotides can be performed using any enzyme that has the ability to ligate nucleic acid fragments. Exemplary enzymes include ligases, polymerases, and topoisomerases.
In specific embodiments of the invention, DNA ligase (EC 6.5.1.1), DNA polymerase (EC 2.7.7.7), RNA polymerase (EC 2.7.7.6) or topoisomerase (EC 5.99.1.2) is used to ligate the oligonucleotides. The enzymes contained in each EC class can be found, for example, as described in Bairoch (2000) Nucleic Acids Research 28:304-5.
In a preferred embodiment, the oligonucleotides used in the methods of the invention are oligodeoxynucleotides and the enzyme used to catalyze the ligation of the oligonucleotides is DNA ligase. For ligation to occur in the presence of the ligase, i.e., for a phosphodiester bond to be formed between two oligonucleotides, one oligonucleotide must have a free 5' phosphate group and the other oligonucleotide must have a free 3' hydroxyl group. Exemplary DNA ligases that can be used in the methods of the invention include T4 DNA ligase, Taq DNA ligase, T4 RNA ligase, and DNA ligase (E. coli) (all available from, e.g., New England Biolabs, MA).
One skilled in the art will understand that each enzyme used for ligation has optimal activity under specific conditions, for example, temperature, buffer concentration, pH and time. Each of these conditions can be adjusted, for example, according to the manufacturer's instructions, to obtain optimal ligation of the oligonucleotide tags.
The incoming oligonucleotide can be of any desirable length, but is preferably at least three nucleobases in length. More preferably, the incoming oligonucleotide is 4 or more nucleobases in length. In one embodiment, the incoming oligonucleotide is from 3 to about 12 nucleobases in length. It is preferred that the oligonucleotides of the molecules in the libraries of the invention have a common terminal sequence which can serve as a primer for PCR, as is known in the art.
Such a common terminal sequence can be incorporated as the terminal end of the incoming oligonucleotide added in the final cycle of library synthesis, or it can be added after library synthesis, for example, using the enzymatic ligation methods described herein.
A preferred embodiment of the method of the invention is set forth in Figure 5. The process begins with a synthesized DNA sequence which is attached at its 5' end to a linker which terminates in an amino group. In step 1, this starting DNA sequence is ligated to an incoming DNA sequence in the presence of a splint DNA strand, DNA ligase and dithiothreitol in Tris buffer. This yields a tagged DNA sequence which can then be used directly in the next step or purified, for example, using HPLC or ethanol precipitation, before proceeding to the next step. In step 2, the tagged DNA is reacted with a protected activated amino acid, in this example an Fmoc-protected amino acid fluoride, which yields a protected amino acid-DNA conjugate. In step 3, the protected amino acid-DNA conjugate is deprotected, for example, in the presence of piperidine, and the resulting deprotected conjugate is, optionally, purified, for example, by HPLC or ethanol precipitation. The deprotected conjugate is the product of the first synthesis cycle and becomes the starting material for the second cycle, which adds a second amino acid residue to the free amino group of the deprotected conjugate.
In embodiments in which PCR is to be used to amplify and/or sequence the encoding oligonucleotides of selected molecules, the encoding oligonucleotides may include, for example, PCR primer sequences and/or sequencing primers (e.g., primers such as 3'-GACTACCGCGCTCCCTCCG-5' and 3'-GACTCGCCCGACCGTTCCG-5').
A PCR primer sequence can be included, for example, in the initial oligonucleotide prior to the first synthesis cycle, and/or can be included with the first incoming oligonucleotide, and/or can be ligated to the encoding oligonucleotide after the final cycle of library synthesis, and/or can be included in the incoming oligonucleotide of the final cycle. PCR primer sequences added after the final cycle of library synthesis and/or in the incoming oligonucleotide of the final cycle are referred to herein as "cover sequences".
In one embodiment, the PCR primer sequence is designed into the encoding oligonucleotide tag. For example, a PCR primer sequence can be incorporated into the initial oligonucleotide tag and/or into the final oligonucleotide tag. In one embodiment, the same PCR primer sequence is incorporated into the initial and final oligonucleotide tags. In another embodiment, a first PCR primer sequence is incorporated into the initial oligonucleotide tag and a second PCR primer sequence is incorporated into the final oligonucleotide tag. Alternatively, the second PCR primer sequence can be incorporated into the cover sequence as described herein. In preferred embodiments, the PCR primer sequence is at least about 5, 7, 10, 13, 15, 17, 20, 22, or 25 nucleotides in length.
PCR primer sequences suitable for use in the libraries of the invention are known in the art; suitable primers and methods are set forth, for example, in Innis, et al., eds., PCR Protocols: A Guide to Methods and Applications, San Diego: Academic Press (1990), the contents of which are incorporated herein by reference in their entirety. Other primers suitable for use in the construction of the libraries described herein are those described in PCT Publications WO 2004/069849 and WO 2005/003375, the entire contents of which are expressly incorporated herein by reference.
The term "polynucleotide", as used herein in reference to primers, probes and nucleic acid fragments or segments to be synthesized by primer extension, is defined as a molecule comprised of two or more deoxyribonucleotides, preferably more than three.
The term "primer" as used herein refers to a polynucleotide, whether purified from a nucleic acid restriction digest or produced synthetically, which is capable of acting as a point of initiation of nucleic acid synthesis when placed under conditions in which the synthesis of a primer extension product complementary to a nucleic acid strand is induced, i.e., in the presence of nucleotides and a polymerization agent such as DNA polymerase, reverse transcriptase and the like, and at a suitable temperature and pH. The primer is preferably single-stranded for maximum efficiency, but may alternatively be in double-stranded form. If it is double-stranded, the primer is first treated to separate it from its complementary strand before being used to prepare extension products. Preferably, the primer is a polydeoxyribonucleotide. The primer must be long enough to prime the synthesis of extension products in the presence of the polymerization agents. The exact lengths of the primers will depend on many factors, including temperature and the source of the primer.
The primers used herein are selected to be "substantially" complementary to the different strands of each specific sequence to be amplified. This means that the primer must be sufficiently complementary to hybridize non-randomly with its respective template strand. Therefore, the primer sequence may or may not reflect the exact sequence of the template.
The polynucleotide primers can be prepared using any suitable method, such as, for example, the phosphotriester or phosphodiester methods described in Narang et al. (1979) Meth. Enzymol., 68:90; U.S. Patent No. 4,356,270; U.S. Patent No. 4,458,066; U.S. Patent No. 4,416,988; U.S. Patent No. 4,293,652; and Brown et al. (1979) Meth. Enzymol., 68:109. The contents of all the documents cited above are incorporated herein by reference.
In cases where PCR primer sequences are included in an incoming oligonucleotide, these incoming oligonucleotides will preferably be significantly longer than the incoming oligonucleotides added in the other cycles, since they will include both a coding sequence and a PCR primer sequence.
In one embodiment, the cover sequence is added after the addition of the final building block and final incoming oligonucleotide, and the synthesis of a library as set forth herein includes the step of ligating the cover sequence to the encoding oligonucleotide, such that the oligonucleotide portion of substantially all of the members of the library terminates in a sequence that includes a PCR primer sequence. Preferably, the cover sequence is added by ligation to the pooled fractions which are the products of the final synthetic cycle. The cover sequence can be added using the enzymatic process used in the construction of the library.
In one embodiment, the same cover sequence is ligated to every member of the library. In another embodiment, a plurality of cover sequences is used.
In this embodiment, for example, cover oligonucleotide sequences containing variable bases are ligated onto members of the library after the final synthetic cycle. In one embodiment, after the final synthetic cycle, the fractions are pooled and then divided into fractions again, with a different cover sequence added to each fraction. Alternatively, multiple cover sequences may be added to the pooled library after the final cycle of the synthesis. In both embodiments, the final members of the library will include molecules comprising specific functional portions linked to identifying oligonucleotides that include two or more different cover sequences.
In one embodiment, the cover primer comprises an oligonucleotide sequence containing variable, i.e., degenerate, nucleotides. Such degenerate bases within the cover primers allow the identification of library molecules of interest by determining whether a combination of building blocks is the consequence of PCR duplication (identical sequence) or of independent occurrences of the molecule (different sequence). For example, such degenerate bases can reduce the potential number of false positives identified during biological screening of the encoded library.
In one embodiment, a degenerate cover primer comprises or has the following sequence:
5'-CAGCGTTCGA-3'
3'-AA GTCGCAAGCT NNNNN GTCTGTTCGAAGTGGACG-5'
(the annotated regions of the lower strand are, from its 3' end, an overhang for ligation, a constant region for primer extension, the degenerate region, and a constant sequence for amplification of the library), where N can be any of the 4 bases, allowing 1024 different sequences (4^5).
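How a degenerate region separates PCR duplicates from independent occurrences can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis used by the inventors: the read records, the building-block codes such as "BB1-BB7", and the function names are all hypothetical. Reads that share both the building-block code and the degenerate tag are treated as copies of one original molecule, while distinct degenerate tags under the same code are counted as independent occurrences.

```python
from collections import defaultdict

BASES = "ACGT"


def count_degenerate(n_positions, alphabet=BASES):
    """Number of distinct sequences a degenerate region of n positions can take."""
    return len(alphabet) ** n_positions


def independent_observations(reads):
    """Count distinct degenerate tags seen for each building-block code.

    reads is an iterable of (building_block_code, degenerate_tag) pairs.
    Identical pairs are collapsed, since they are consistent with PCR duplication.
    """
    tags_by_code = defaultdict(set)
    for code, tag in reads:
        tags_by_code[code].add(tag)
    return {code: len(tags) for code, tags in tags_by_code.items()}


# Hypothetical sequencing reads: (decoded building blocks, 5-base degenerate tag).
reads = [
    ("BB1-BB7", "ACGTT"),
    ("BB1-BB7", "ACGTT"),  # same tag again: consistent with a PCR duplicate
    ("BB1-BB7", "GGATC"),  # different tag: an independent occurrence
    ("BB2-BB3", "TTTAA"),
    ("BB2-BB3", "TTTAA"),
]

print(count_degenerate(5))              # 4 ** 5 = 1024
print(independent_observations(reads))  # {'BB1-BB7': 2, 'BB2-BB3': 1}
```

A compound supported by several distinct degenerate tags is less likely to be a false positive than one whose reads all carry the same tag.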
After its ligation onto the library and primer extension, the degenerate cover primer has the following sequence:
5'-CAGCGTTCGA N'N'N'N'N' CAGACAAGCTTCACCTGC-3'
3'-AA GTCGCAAGCT NNNNN GTCTGTTCGAAGTGGACG-5'
In another embodiment, the cover primer comprises or has the following sequence:
3'-AA GTCGCAAGCTACG ABBBABBBABBBA GACTACCGCGCTCCCTCCG-5'
(the annotated regions are, from the 3' end, an overhang for ligation, a constant region for primer extension, the degenerate region, and a constant sequence for the library), where B can be any of C, G or T, allowing 19,683 different sequences (3^9). The design of the degenerate region in this primer improves DNA sequence analysis, since the A bases that flank and punctuate the degenerate B bases prevent homopolymer stretches of more than 3 bases and facilitate sequence alignment.
In one embodiment, the degenerate cover oligonucleotide is ligated to the members of the library using a suitable enzyme, and the upper strand of the degenerate cover oligonucleotide is subsequently polymerized using a suitable enzyme, such as a DNA polymerase.
In another embodiment, the PCR primer sequence is a "universal adapter" or "universal primer". As used herein, a "universal adapter" or "universal primer" is an oligonucleotide that contains a unique PCR priming region that is, for example, about 5, 7, 10, 13, 15, 17, 20, 22, or 25 nucleotides in length, located adjacent to a unique sequencing priming region that is, for example, about 5, 7, 10, 13, 15, 17, 20, 22, or 25 nucleotides in length, and optionally followed by a unique key discrimination sequence (or sample identifier sequence) consisting of at least one of each of the four deoxyribonucleotides (i.e., A, C, G, T). As used herein, the term "key discrimination sequence" or "sample identifier sequence" refers to a sequence that can be used to uniquely tag a population of molecules in a sample.
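Returning to the ABBBABBBABBBA degenerate region described above, both stated properties (19,683 variants, and no homopolymer stretch longer than 3 bases within the region) can be confirmed by brute-force enumeration. The Python sketch below is illustrative; the helper names `expand` and `longest_run` are hypothetical, and only the degenerate region itself is examined, not the flanking constant sequences.

```python
from itertools import groupby, product

PATTERN = "ABBBABBBABBBA"  # degenerate region of the second cover primer
B_CHOICES = "CGT"          # B stands for any of C, G or T


def expand(pattern, choices=B_CHOICES):
    """Yield every concrete sequence encoded by the pattern."""
    slots = [choices if ch == "B" else ch for ch in pattern]
    return ("".join(bases) for bases in product(*slots))


def longest_run(seq):
    """Length of the longest stretch of identical consecutive bases."""
    return max(len(list(group)) for _, group in groupby(seq))


variants = list(expand(PATTERN))
print(len(variants))                          # 3 ** 9 = 19683
print(max(longest_run(v) for v in variants))  # 3
```

Because B never includes A, each fixed A interrupts any run of identical bases, which is why the longest possible homopolymer stretch inside the region is the 3-base BBB block.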
Samples labeled with unique sample identifier sequences can be mixed, sequenced and re-sorted after DNA sequencing for the analysis of individual samples. The same key discrimination sequence can be used for a complete library or, alternatively, different key discrimination sequences can be used to track different libraries. In one embodiment, the key discrimination sequence is in the 5' PCR primer, in the 3' PCR primer, or in both primers. If both PCR primers contain a sample identifier sequence, the number of different samples that can be combined with unique sample identifier sequences is the product of the number of sample identifier sequences in each primer. Thus, 10 different 5' sample identifier sequence primers can be combined with 10 different 3' sample identifier sequence primers to yield 100 different sample identifier sequence combinations. Non-limiting examples of unique 5' and 3' PCR primers containing key discrimination sequences include the following.
5' SID primers (variable positions shown in brackets):
5' A: GCCTTGCCAGCCCGCTCAG[A]TGACTCCCAAATCGATGTG
5' C: GCCTTGCCAGCCCGCTCAG[C]TGACTCCCAAATCGATGTG
5' G: GCCTTGCCAGCCCGCTCAG[G]TGACTCCCAAATCGATGTG
5' T: GCCTTGCCAGCCCGCTCAG[T]TGACTCCCAAATCGATGTG
5' AA: GCCTTGCCAGCCCGCTCAG[AA]TGACTCCCAAATCGATGTG
5' AC: GCCTTGCCAGCCCGCTCAG[AC]TGACTCCCAAATCGATGTG
5' AG: GCCTTGCCAGCCCGCTCAG[AG]TGACTCCCAAATCGATGTG
5' AT: GCCTTGCCAGCCCGCTCAG[AT]TGACTCCCAAATCGATGTG
5' CA: GCCTTGCCAGCCCGCTCAG[CA]TGACTCCCAAATCGATGTG
3' SID primers (variable positions shown in brackets):
3' A: GCCTCCCTCGCGCCATCAG[A]GCAGGTGAAGCTTGTCTG
3' C: GCCTCCCTCGCGCCATCAG[C]GCAGGTGAAGCTTGTCTG
3' G: GCCTCCCTCGCGCCATCAG[G]GCAGGTGAAGCTTGTCTG
3' T: GCCTCCCTCGCGCCATCAG[T]GCAGGTGAAGCTTGTCTG
3' AA: GCCTCCCTCGCGCCATCAG[AA]GCAGGTGAAGCTTGTCTG
3' AC: GCCTCCCTCGCGCCATCAG[AC]GCAGGTGAAGCTTGTCTG
3' AG: GCCTCCCTCGCGCCATCAG[AG]GCAGGTGAAGCTTGTCTG
3' AT: GCCTCCCTCGCGCCATCAG[AT]GCAGGTGAAGCTTGTCTG
3' CA: GCCTCCCTCGCGCCATCAG[CA]GCAGGTGAAGCTTGTCTG
In one embodiment, the key discrimination sequence is approximately 4, 5, 6, 7, 8, 9, or 10 nucleotides in length. In another embodiment, the key discrimination sequence is a combination of approximately 1-4 nucleotides. In yet another embodiment, each universal adapter is approximately forty-four nucleotides in length. In one embodiment, the universal adapters are ligated, using T4 DNA ligase, onto the end of the encoding oligonucleotide. Different universal adapters can be designed specifically for each library preparation and will, therefore, provide a unique identifier for each library. The size and sequence of the universal adapters can be modified as deemed necessary by one of skill in the art.
As indicated above, the nucleotide sequence of the oligonucleotide tag, as part of the methods of this invention, can be determined by the use of the polymerase chain reaction (PCR).
The oligonucleotide tag is comprised of polynucleotides that identify the building blocks that make up the functional portion as described herein. The nucleic acid sequence of the oligonucleotide tag is determined by subjecting the oligonucleotide tag to a PCR reaction as follows. The appropriate sample is contacted with a pair of PCR primers, each member of the pair having a preselected nucleotide sequence. The pair of PCR primers is capable of initiating primer extension reactions by hybridizing to a PCR primer binding site on the encoding oligonucleotide tag. The PCR primer binding site is preferably designed into the encoding oligonucleotide tag. For example, a PCR primer binding site can be incorporated into the initial oligonucleotide tag and the second PCR primer binding site can be in the final oligonucleotide tag.
Alternatively, the second PCR primer binding site can be incorporated into the cover sequence as described herein. In preferred embodiments, the PCR primer binding site is at least about 5, 7, 10, 13, 15, 17, 20, 22, or 25 nucleotides in length.
The PCR reaction is performed by mixing the pair of PCR primers, preferably a predetermined amount thereof, with the nucleic acids of the encoding oligonucleotide tag, preferably a predetermined amount thereof, in a PCR buffer to form a PCR reaction mixture. The mixture is subjected to thermal cycling for a number of cycles, typically predetermined, sufficient for the formation of a PCR reaction product. A sufficient amount of product is one that can be isolated in an amount sufficient to allow determination of the DNA sequence.
PCR is typically carried out by thermal cycling, i.e., by repeatedly increasing and decreasing the temperature of a PCR reaction mixture within a temperature range whose lower limit is about 30 °C to about 55 °C and whose upper limit is about 90 °C to about 100 °C. The increase and decrease may be continuous, but is preferably phasic, with periods of relative temperature stability at each of the temperatures favoring polynucleotide synthesis, denaturation and hybridization.
The PCR reaction is carried out using any suitable method. It generally occurs in a buffered aqueous solution, i.e., a PCR buffer, preferably at a pH of 7-9. Preferably, a molar excess of the primer is present. A large molar excess is preferred to improve the efficiency of the process.
The PCR buffer also contains the deoxyribonucleotide triphosphates (substrates for polynucleotide synthesis) dATP, dCTP, dGTP, and dTTP and a polymerase, typically thermostable, all in amounts suitable for the primer extension (polynucleotide synthesis) reaction. The resulting solution (PCR mixture) is heated to about 90 °C-100 °C for about 1 to 10 minutes, preferably 1 to 4 minutes.
After this heating period the solution is allowed to cool to 54 °C, which is preferred for primer hybridization. The synthesis reaction can occur at a temperature ranging from room temperature to a temperature above which the polymerase (inducing agent) no longer functions efficiently. Thus, for example, if DNA polymerase is used, the temperature is generally no greater than about 40 °C. The thermal cycles are repeated until the desired amount of PCR product is produced. An exemplary PCR buffer comprises the following reagents: 50 mM KCl; 10 mM Tris-HCl at pH 8.3; 1.5 mM MgCl2; 0.001% (wt/vol) gelatin; 200 μM dATP; 200 μM dTTP; 200 μM dCTP; 200 μM dGTP; and 2.5 units of Thermus aquaticus (Taq) DNA polymerase I per 100 microliters of buffer.
Suitable enzymes for elongating the primer sequences include, for example, E. coli DNA polymerase I, Taq DNA polymerase, the Klenow fragment of E. coli DNA polymerase I, T4 DNA polymerase, other available DNA polymerases, reverse transcriptase, and other enzymes, including thermostable enzymes, which will facilitate the combination of the nucleotides in the proper manner to form primer extension products which are complementary to each nucleic acid strand. Generally, the synthesis will start at the 3' end of each primer and proceed in the 5' direction along the template strand until the synthesis ends, producing molecules of different lengths.
The newly synthesized DNA strand and its complementary strand form a double-stranded molecule which can be used in the succeeding steps of the analysis process.
PCR amplification methods are described in detail in U.S. Patent Nos. 4,683,192, 4,683,202, 4,800,159, and 4,965,188, and at least in PCR Technology: Principles and Applications for DNA Amplification, H. Erlich, ed., Stockton Press, New York (1989); and PCR Protocols: A Guide to Methods and Applications, Innis et al., eds., Academic Press, San Diego, Calif. (1990).
The contents of all the documents cited above are incorporated herein by reference.
Once the encoding oligonucleotide tag has been amplified, the sequence of the tag, and ultimately the composition of the selected molecule, can be determined using nucleic acid sequence analysis, a well-known procedure for determining the sequence of nucleotide sequences. Nucleic acid sequence analysis is approached by a combination of (a) physicochemical techniques, based on the hybridization or denaturation of a probe strand plus its complementary target, and (b) enzymatic reactions with polymerases.
The invention further relates to the compounds which can be produced using the methods of the invention, and to collections of such compounds, either as isolated species or pooled to form a library of chemical structures. The compounds of the invention include compounds of the formula in which X is a functional portion comprising one or more building blocks, Z is an oligonucleotide attached at its 3' end to B, and Y is an oligonucleotide which is attached to C at its 5' end. A is a functional group that forms a covalent bond with X, B is a functional group that forms a bond with the 3' end of Z, and C is a functional group that forms a bond with the 5' end of Y. D, F and E are chemical groups that link functional groups A, C and B to S, which is a core atom or scaffold. Preferably, D, E and F are each independently a chain of atoms, such as an alkylene chain or an oligo(ethylene glycol) chain; D, E and F can be the same or different, and are preferably effective to allow hybridization of the two oligonucleotides and synthesis of the functional portion.
Preferably, Y and Z are substantially complementary and are oriented in the compound so as to enable Watson-Crick base pairing and duplex formation under suitable conditions. Y and Z are of the same length or of different lengths.
Preferably, Y and Z are of the same length, or one of Y and Z is 1 to 10 bases longer than the other. In a preferred embodiment, Y and Z are each 10 or more bases in length and have complementary regions of ten or more base pairs. More preferably, Y and Z are substantially complementary over their entire length, that is, they have no more than one mismatch for every ten base pairs. Most preferably, Y and Z are complementary over their entire length, i.e., except for any overhang region on Y or Z, the strands hybridize by Watson-Crick base pairing without mismatches over their entire length.
S can be a single atom or a molecular scaffold. For example, S can be a carbon atom, a boron atom, a nitrogen atom or a phosphorus atom, or a polyatomic scaffold, such as a phosphate group or a cyclic group, such as a cycloalkyl, cycloalkenyl, heterocycloalkyl, heterocycloalkenyl, aryl or heteroaryl group.
In one embodiment, the linker is a group of a structure in which each of n, m and p is, independently, an integer from 1 to about 20, preferably from 2 to eight, and more preferably from 3 to 6. In a particular embodiment, the linker has a specific structure of this type [structural formula not reproduced].
In one embodiment, libraries of the invention include molecules that consist of a functional moiety composed of building blocks, wherein each functional moiety is operably linked to an encoding oligonucleotide. The nucleotide sequence of the encoding oligonucleotide is indicative of the building blocks present in the functional moiety and, in some embodiments, of the connectivity or arrangement of the building blocks. The invention provides the advantage that the methodology used to construct the functional moiety and that used to construct the oligonucleotide tag can be performed in the same reaction medium, preferably an aqueous medium, thereby simplifying the preparation of the library compared with prior art methods. In certain embodiments in which the oligonucleotide ligation steps and the building block addition steps can both be conducted in aqueous media, each reaction will have a different optimum pH. In these embodiments, the building block addition reaction can be conducted at a suitable pH and temperature in a suitable aqueous buffer. The buffer can then be exchanged for an aqueous buffer which provides a suitable pH for ligation of the oligonucleotides. In another embodiment, the invention provides compounds, and libraries comprising such compounds, of Formula II, where X is a molecular scaffold, each Y is independently a peripheral moiety, and n is an integer from 1 to 6. Each A is independently a building block and t is an integer from 0 to about 5. L is a linking moiety and Z is a single-stranded or double-stranded oligonucleotide which identifies the structure -At-X(Y)n. The structure X(Y)n may be, for example, one of the scaffold structures set forth in Table 8 (see below). In one embodiment, the invention provides compounds, and libraries that comprise such compounds, of Formula III, where t is an integer from 0 to about 5, preferably from 0 to 3, and each A is, independently, a building block.
L is a linking moiety and Z is a single-stranded or double-stranded oligonucleotide which identifies each A and R1, R2, R3 and R4. R1, R2, R3 and R4 are each independently a substituent selected from hydrogen, alkyl, substituted alkyl, heteroalkyl, substituted heteroalkyl, cycloalkyl, heterocycloalkyl, substituted cycloalkyl, substituted heterocycloalkyl, aryl, substituted aryl, arylalkyl, heteroarylalkyl, substituted arylalkyl, substituted heteroarylalkyl, heteroaryl, substituted heteroaryl, alkoxy, aryloxy, amino, and substituted amino. In one embodiment, each A is an amino acid residue. Libraries that include compounds of Formula II or Formula III may comprise at least about 100; 1000; 10,000; 100,000; 1,000,000 or 10,000,000 compounds of Formula II or Formula III. In one embodiment, the library is prepared by a method designed to produce a library comprising at least about 100; 1000; 10,000; 100,000; 1,000,000 or 10,000,000 compounds of Formula II or Formula III.
Table 8. [Scaffold structures with literature references, including citations to Organic Letters and Synlett; structures not reproduced.]

An advantage of the methods of the invention is that they can be used to prepare libraries comprising vast numbers of compounds. The ability to amplify encoding oligonucleotide sequences using known methods such as the polymerase chain reaction ("PCR") means that selected molecules can be identified even if relatively few copies are recovered. This allows the practical use of very large libraries which, as a consequence of their high degree of complexity, either comprise relatively few copies of any given library member or require the use of very large volumes. For example, a library consisting of 10^8 unique structures in which each structure has 1 × 10^12 copies (approximately 1 picomole) requires approximately 100 L of solution at 1 μM effective concentration. For the same library, if each member is represented by 1,000,000 copies, the volume required is 100 μL at 1 μM effective concentration. In a preferred embodiment, the library comprises from about 10^3 to about 10^15 copies of each member of the library. Given the differences in synthesis efficiency among members of the library, it is possible that different members will have different numbers of copies in any given library. Therefore, although the number of copies of each member theoretically present in the library may be the same, the actual number of copies of any given member of the library is independent of the number of copies of any other member. More preferably, libraries of compounds of the invention include at least about 10^5, 10^6 or 10^7 copies of each member of the library, or of substantially all members of the library.
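The volume figures quoted above follow from simple arithmetic on copy numbers. The sketch below is a minimal illustration, not part of the disclosed method: it converts a library size and a per-member copy number into the solution volume needed at a given concentration. The function name `library_volume_litres` and its parameters are assumptions introduced for this example, and "effective concentration" is assumed here to mean the total concentration of all library molecules. Exact arithmetic gives values somewhat above the rounded figures in the text, but of the same order of magnitude.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def library_volume_litres(members, copies_per_member, concentration_molar):
    """Volume (in litres) needed to hold the whole library at a total concentration.

    members             -- number of distinct structures in the library
    copies_per_member   -- copies of each structure (assumed equal for simplicity)
    concentration_molar -- total concentration of library molecules, in mol/L
    """
    total_moles = members * copies_per_member / AVOGADRO
    return total_moles / concentration_molar


if __name__ == "__main__":
    # The two cases discussed in the text: 10^8 members at 1 micromolar.
    for copies in (1e12, 1e6):
        volume = library_volume_litres(1e8, copies, 1e-6)
        print(f"{copies:.0e} copies per member -> {volume:.3g} L")
```

Reducing the copy number per member by a factor of 10^6 reduces the required volume by the same factor, which is the trade-off between library complexity and handling volume that the paragraph describes.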
By "substantially all" members of the library is meant at least about 85% of the members of the library, preferably at least about 90%, and more preferably at least about 95% of the members of the library. Preferably, the library includes a sufficient number of copies of each member that multiple rounds (i.e., two or more) of selection against a biological target can be performed, with sufficient quantities of binding molecules remaining after the final round of selection to enable amplification of the oligonucleotide tags of the remaining molecules and, therefore, identification of the functional moieties of the binding molecules. A schematic representation of such a selection process is illustrated in Figure 6, in which 1 and 2 represent members of the library, B is a target molecule and X is a moiety operably linked to B that enables the removal of B from the selection medium. In this example, compound 1 binds to B, while compound 2 does not bind to B. The selection process, as depicted in Round 1, comprises (I) contacting a library comprising compounds 1 and 2 with B-X under conditions suitable for the binding of compound 1 to B; (II) removing unbound compound 2; and (III) dissociating compound 1 from B and removing B-X from the reaction medium. The result of Round 1 is a collection of molecules that is enriched in compound 1 relative to compound 2. Subsequent rounds employing steps I-III result in further enrichment of compound 1 relative to compound 2. Although three rounds of selection are shown in Figure 6, in practice any number of rounds can be used, for example from one round to ten rounds, to achieve the desired enrichment of binding molecules relative to non-binding molecules. In the embodiment shown in Figure 6, there is no amplification (synthesis of more copies) of the compounds remaining after any of the selection rounds.
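The enrichment obtained over successive rounds without amplification can be illustrated with a toy calculation. The sketch below is only an illustration under stated assumptions: the per-round recovery fractions for the binder (compound 1) and the non-binder (compound 2) are hypothetical numbers chosen for the example, and the names `run_selection` and `enrichment` are not taken from the source.

```python
def run_selection(copies, recovery, rounds):
    """Return copy numbers after each round, with no amplification between rounds.

    copies   -- dict mapping compound name to starting copy number
    recovery -- dict mapping compound name to the fraction recovered per round
                (hypothetical values; a binder is recovered far more efficiently)
    rounds   -- number of selection rounds (steps I-III repeated)
    """
    history = [dict(copies)]
    for _ in range(rounds):
        copies = {name: count * recovery[name] for name, count in copies.items()}
        history.append(dict(copies))
    return history


def enrichment(history, binder, non_binder):
    """Ratio of binder copies to non-binder copies after each round."""
    return [state[binder] / state[non_binder] for state in history]


if __name__ == "__main__":
    start = {"compound_1": 1e6, "compound_2": 1e6}
    recovery = {"compound_1": 0.5, "compound_2": 0.01}  # assumed, for illustration only
    history = run_selection(start, recovery, rounds=3)
    ratios = enrichment(history, "compound_1", "compound_2")
    for round_number, ratio in enumerate(ratios):
        remaining = history[round_number]["compound_1"]
        print(f"round {round_number}: ratio {ratio:.3g}, compound 1 copies {remaining:.3g}")
```

Because nothing is resynthesized between rounds, the absolute number of binder copies falls with every round even as the ratio improves, which is why the starting library needs enough copies of each member to survive several rounds.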
Such amplification can lead to a mixture of compounds whose relative amounts are not consistent with the relative amounts of the compounds remaining after selection. This inconsistency arises because certain compounds may be synthesized more readily than others, and thus may be amplified in a manner which is not proportional to their presence after selection. For example, if compound 2 is more readily synthesized than compound 1, amplification of the molecules remaining after Round 2 may result in a disproportionate amplification of compound 2 relative to compound 1, and a resulting mixture of compounds with a much lower enrichment (if any) of compound 1 relative to compound 2. In one embodiment, the target is immobilized on a solid support by any known immobilization technique. The solid support can be, for example, a water-insoluble matrix contained within a chromatography column, or a membrane. The encoded library can be applied to the water-insoluble matrix contained within a chromatography column. The column is then washed to remove non-specific binders. The compounds bound to the target can then be dissociated by changing the pH, salt concentration or organic solvent concentration, or by other methods, such as competition with a known ligand for the target. In another embodiment, the target is free in solution and is incubated with the encoded library. Compounds which bind to the target (also referred to herein as "ligands") are selectively isolated by a size separation step such as gel filtration or ultrafiltration. In one embodiment, the mixture of encoded compounds and the target biomolecule is passed through a size exclusion chromatography column (gel filtration), which separates any ligand-target complexes from the unbound compounds. The ligand-target complexes are transferred to a reverse-phase chromatography column, which dissociates the ligands from the target.
The dissociated ligands are then analyzed by PCR amplification and sequence analysis of the encoding oligonucleotides. This approach is particularly advantageous in situations where immobilization of the target may result in a loss of activity. In some embodiments of the invention, the selection method may comprise amplifying the encoding oligonucleotide of at least one member of the compound library that binds to a target prior to sequencing. In one embodiment, the library of compounds comprising encoding oligonucleotides is amplified prior to sequence analysis in a manner that minimizes any potential bias in the distribution of the population of DNA molecules present in the selected library mix. For example, only a small amount of library is recovered after a selection step, and it is typically amplified using PCR prior to sequence analysis. PCR has the potential to introduce a bias in the distribution of the population of DNA molecules present in the selected library mix. This is especially problematic when the number of input molecules is small and the input molecules are poor PCR templates. The PCR products produced in the early cycles are more efficient templates than the covalent duplex library, and therefore the frequency of these molecules in the final amplified population may be much higher than in the original input template. Accordingly, in order to minimize this potential PCR bias, in one embodiment of the invention a population of single-stranded oligonucleotides corresponding to the individual members of the library is first produced, for example by using a single primer in a primer extension reaction, followed by PCR amplification using two primers.
In doing so, there is a linear accumulation of single-stranded primer extension product prior to exponential amplification using PCR, and the diversity and distribution of molecules in the accumulated primer extension product more accurately reflect the diversity and distribution of the molecules present in the original input template, since the exponential phase of amplification occurs only after much of the original molecular diversity is represented in the population of molecules produced during the primer extension reaction. Once single ligands are identified by the process described above, various levels of analysis can be applied to yield structure-activity relationship information and to guide further optimization of the affinity, specificity and bioactivity of the ligand. For ligands derived from the same scaffold, three-dimensional molecular modeling can be used to identify significant structural features common to the ligands, thereby generating families of small molecule ligands that presumably bind at a common site on the target biomolecule. A variety of screening approaches can be used to obtain ligands that possess high affinity for one target but significantly weaker affinity for another closely related target. One screening strategy is to identify ligands for both biomolecules in parallel experiments and subsequently to eliminate common ligands by a cross-referencing comparison. In this method, the ligands for each biomolecule can be identified separately as described above. This method is compatible with both immobilized target biomolecules and target biomolecules free in solution.
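The cross-referencing comparison described above amounts to a set difference between the ligands found in the two parallel selections. A minimal sketch follows; the function name `selective_ligands` and the decoded hit lists are hypothetical and serve only to illustrate the comparison.

```python
def selective_ligands(target_hits, related_hits):
    """Ligands identified for the target but not for the closely related biomolecule."""
    return set(target_hits) - set(related_hits)


if __name__ == "__main__":
    # Hypothetical decoded building-block combinations from two parallel selections.
    target_hits = {
        ("BB1", "BB7", "BB3"),
        ("BB2", "BB2", "BB9"),
        ("BB5", "BB1", "BB4"),
    }
    related_hits = {
        ("BB2", "BB2", "BB9"),  # binds both biomolecules, so it is eliminated
        ("BB8", "BB6", "BB6"),
    }
    for ligand in sorted(selective_ligands(target_hits, related_hits)):
        print(ligand)
```

In practice the comparison would be made on decoded tag sequences and would probably take relative abundance into account; plain set membership is used here only to keep the idea visible.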
For immobilized target biomolecules, another strategy is to add a pre-selection step that eliminates from the library all ligands that bind to the non-target biomolecule. For example, a first biomolecule can be contacted with an encoded library as described above. The compounds that do not bind to the first biomolecule are then separated from any first biomolecule-ligand complexes which form. The second biomolecule is then contacted with the compounds which did not bind to the first biomolecule. The compounds that bind to the second biomolecule can be identified as described above and have significantly greater affinity for the second biomolecule than for the first biomolecule. A ligand for a biomolecule of unknown function which is identified by the method described above can also be used to determine the biological function of the biomolecule. This is advantageous because, although new gene sequences continue to be identified, the functions of the proteins encoded by these sequences and the validity of these proteins as targets for the discovery and development of new drugs are difficult to determine and represent perhaps the most significant obstacle to applying genomic information to the treatment of disease. Target-specific ligands obtained through the process described in this invention can be effectively employed in whole cell biological assays or in appropriate animal models to understand both the function of the target protein and the validity of the target protein for therapeutic intervention. This approach can also confirm that the target is specifically amenable to small molecule drug discovery. In one embodiment, one or more compounds within a library of the invention are identified as ligands for a particular biomolecule. These compounds can then be assessed in an in vitro assay for the ability to bind to the biomolecule.
Preferably, the functional moieties of the binding compounds are synthesized without the oligonucleotide tag or linker moiety, and these functional moieties are assessed for the ability to bind to the biomolecule. The effect of the binding of the functional moieties to the biomolecule on the function of the biomolecule can also be assessed using in vitro cell-free or cell-based assays. For a biomolecule having a known function, the assay can include a comparison of the activity of the biomolecule in the presence and absence of the ligand, for example, by direct measurement of the activity, such as enzymatic activity, or by an indirect measurement, such as a cellular function that is influenced by the biomolecule. If the biomolecule is of unknown function, a cell which expresses the biomolecule can be contacted with the ligand and the effect of the ligand on the viability, function, phenotype, and/or gene expression of the cell is assessed. The in vitro assay can be, for example, a cell death assay, a cell proliferation assay or a viral replication assay. For example, if the biomolecule is a protein expressed by a virus, a cell infected with the virus can be contacted with a ligand for the protein. The effect of the binding of the ligand to the protein on viral viability can then be assessed. A ligand identified by the method of the invention can also be assessed in an in vivo model or in a human. For example, the ligand can be evaluated in an animal or organism which produces the biomolecule. Any resulting change in the health status (e.g., disease progression) of the animal or organism can be determined. For a biomolecule of unknown function, such as a protein or a nucleic acid molecule, the effect of a ligand which binds to the biomolecule on a cell or organism which produces the biomolecule can provide information regarding the biological function of the biomolecule.
For example, the observation that a particular cellular process is inhibited in the presence of the ligand indicates that the process depends, at least in part, on the function of the biomolecule. Ligands identified using the methods of the invention can also be used as affinity reagents for the biomolecule to which they bind. In one embodiment, such ligands are used to effect affinity purification of the biomolecule, for example, by chromatography of a solution comprising the biomolecule using a solid phase to which one or more such ligands are attached. This invention is further illustrated by the following examples, which should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application, as well as the Figures and the Sequence Listing, are hereby incorporated by reference.
Examples

Example 1: Synthesis and Characterization of a Library on the Order of 10^5 Members

The synthesis of a library comprising on the order of 10^5 different members was accomplished using the following reagents:

Compound 1: [structure not reproduced]

One-letter codes for deoxyribonucleotides: A = adenosine; C = cytidine; G = guanosine; T = thymidine

Building block precursors: BB1, BB2, BB3, BB4, BB5, BB6, BB7, BB8, BB9, BB10, BB11, BB12 [structures not reproduced]

Oligonucleotide tags:

Tag number   Upper strand sequence                 Lower strand sequence
1.1    5'-PO3-GCAACGAAG (SEQ ID NO: 1)     ACCGTTGCT-PO3-5' (SEQ ID NO: 2)
1.2    5'-PO3-GCGTACAAG (SEQ ID NO: 3)     ACCGCATGT-PO3-5' (SEQ ID NO: 4)
1.3    5'-PO3-GCTCTGTAG (SEQ ID NO: 5)     ACCGAGACA-PO3-5' (SEQ ID NO: 6)
1.4    5'-PO3-GTGCCATAG (SEQ ID NO: 7)     ACCACGGTA-PO3-5' (SEQ ID NO: 8)
1.5    5'-PO3-GTTGACCAG (SEQ ID NO: 9)     ACCAACTGG-PO3-5' (SEQ ID NO: 10)
1.6    5'-PO3-CGACTTGAC (SEQ ID NO: 11)    CAAGTCGCA-PO3-5' (SEQ ID NO: 12)
1.7    5'-PO3-CGTAGTCAG (SEQ ID NO: 13)    ACGCATCAG-PO3-5' (SEQ ID NO: 14)
1.8    5'-PO3-CCAGCATAG (SEQ ID NO: 15)    ACGGTCGTA-PO3-5' (SEQ ID NO: 16)
1.9    5'-PO3-CCTACAGAG (SEQ ID NO: 17)    ACGGATGTC-PO3-5' (SEQ ID NO: 18)
1.10   5'-PO3-CTGAACGAG (SEQ ID NO: 19)    CGTTCAGCA-PO3-5' (SEQ ID NO: 20)
1.11   5'-PO3-CTCCAGTAG (SEQ ID NO: 21)    ACGAGGTCA-PO3-5' (SEQ ID NO: 22)
1.12   5'-PO3-TAGGTCCAG (SEQ ID NO: 23)    ACATCCAGG-PO3-5' (SEQ ID NO: 24)
2.1    5'-PO3-GCGTGTTGT (SEQ ID NO: 25)    TCCGCACAA-PO3-5' (SEQ ID NO: 26)
2.2    5'-PO3-GCTTGGAGT (SEQ ID NO: 27)    TCCGAACCT-PO3-5' (SEQ ID NO: 28)
2.3    5'-PO3-GTCAAGCGT (SEQ ID NO: 29)    TCCAGTTCG-PO3-5' (SEQ ID NO: 30)
2.4    5'-PO3-CAAGAGCGT (SEQ ID NO: 31)    TCGTTCTCG-PO3-5' (SEQ ID NO: 32)
2.5    5'-PO3-CAGTTCGGT (SEQ ID NO: 33)    TCGTCAAGC-PO3-5' (SEQ ID NO: 34)
2.6    5'-PO3-CGAAGGAGT (SEQ ID NO: 35)    TCGCTTCCT-PO3-5' (SEQ ID NO: 36)
2.7    5'-PO3-CGGTGTTGT (SEQ ID NO: 37)    TCGCCACAA-PO3-5' (SEQ ID NO: 38)
2.8    5'-PO3-CGTTGCTGT (SEQ ID NO: 39)    TCGCAACGA-PO3-5' (SEQ ID NO: 40)
2.9    5'-PO3-CCGATCTGT (SEQ ID NO: 41)    TCGGCTAGA-PO3-5' (SEQ ID NO: 42)
2.10   5'-PO3-CCTTCTCGT (SEQ ID NO: 43)
TCGGAAGAG-PO3-5' (SEQ ID NO: 44)
2.11   5'-PO3-TGAGTCCGT (SEQ ID NO: 45)    TCACTCAGG-PO3-5' (SEQ ID NO: 46)
2.12   5'-PO3-TGCTACGGT (SEQ ID NO: 47)    TC?GATTGC-PO3-5' (SEQ ID NO: 48)
3.1    5'-PO3-GTGCGTTGA (SEQ ID NO: 49)    CACACGCAA-PO3-5' (SEQ ID NO: 50)
3.2    5'-PO3-GTTGGCAGA (SEQ ID NO: 51)    CACAACCGT-PO3-5' (SEQ ID NO: 52)
3.3    5'-PO3-CCTGTAGGA (SEQ ID NO: 53)    CAGGACATC-PO3-5' (SEQ ID NO: 54)
3.4    5'-PO3-CTGCGTAGA (SEQ ID NO: 55)    CAGACGCAT-PO3-5' (SEQ ID NO: 56)
3.5    5'-PO3-CTTACGCGA (SEQ ID NO: 57)    CAGAATGCG-PO3-5' (SEQ ID NO: 58)
3.6    5'-PO3-TGGTCACGA (SEQ ID NO: 59)    CAACCAGTG-PO3-5' (SEQ ID NO: 60)
3.7    5'-PO3-TCAGAGCGA (SEQ ID NO: 61)    CAAGTCTCG-PO3-5' (SEQ ID NO: 62)
3.8    5'-PO3-TTGCTCGGA (SEQ ID NO: 63)    CAAACGAGC-PO3-5' (SEQ ID NO: 64)
3.9    5'-PO3-GCAGTTGGA (SEQ ID NO: 65)    CACGTCAAC-PO3-5' (SEQ ID NO: 66)
3.10   5'-PO3-GCCTGAAGA (SEQ ID NO: 67)    CACGGACTT-PO3-5' (SEQ ID NO: 68)
3.11   5'-PO3-GTAGCCAGA (SEQ ID NO: 69)    CACATCGGT-PO3-5' (SEQ ID NO: 70)
3.12   5'-PO3-GTCGCTTGA (SEQ ID NO: 71)    CACAGCGAA-PO3-5' (SEQ ID NO: 72)
4.1    5'-PO3-GCCTAAGTT (SEQ ID NO: 73)    CTCGGATTC-PO3-5' (SEQ ID NO: 74)
4.2    5'-PO3-GTAGTGCTT (SEQ ID NO: 75)    CTCATCACG-PO3-5' (SEQ ID NO: 76)
4.3    5'-PO3-GTCGAAGTT (SEQ ID NO: 77)    CTCAGCTTC-PO3-5' (SEQ ID NO: 78)
4.4    5'-PO3-GTTTCGGTT (SEQ ID NO: 79)    CTCAAAGCC-PO3-5' (SEQ ID NO: 80)
4.5    5'-PO3-CAGCGTTTT (SEQ ID NO: 81)    CTGTCGCAA-PO3-5' (SEQ ID NO: 82)
4.6    5'-PO3-CATACGCTT (SEQ ID NO: 83)    CTGTATGCG-PO3-5' (SEQ ID NO: 84)
4.7    5'-PO3-CGATCTGTT (SEQ ID NO: 85)    CTGCTAGAC-PO3-5' (SEQ ID NO: 86)
4.8    5'-PO3-CGCTTTGTT (SEQ ID NO: 87)    CTGCGAAAC-PO3-5' (SEQ ID NO: 88)
4.9    5'-PO3-CCACAGTTT (SEQ ID NO: 89)    CTGGTGTCA-PO3-5' (SEQ ID NO: 90)
4.10   5'-PO3-CCTGAAGTT (SEQ ID NO: 91)    CTGGACTTC-PO3-5' (SEQ ID NO: 92)
4.11   5'-PO3-CTGACGATT (SEQ ID NO: 93)    CTGACTGCT-PO3-5' (SEQ ID NO: 94)
4.12   5'-PO3-CTCCACTTT (SEQ ID NO: 95)    CTGAGGTGA-PO3-5' (SEQ ID NO: 96)
5.1    5'-PO3-ACCAGAGCC (SEQ ID NO: 97)    AATGGTCTC-PO3-5' (SEQ ID NO: 98)
5.2    5'-PO3-ATCCGCACC (SEQ ID NO: 99)    AATAGGCGT-PO3-5' (SEQ ID NO: 100)
5.3    5'-PO3-GACGACACC (SEQ ID NO: 101)   AACTGCTGT-PO3-5' (SEQ ID NO: 102)
5.4    5'-PO3-GGATGGACC (SEQ ID NO: 103)   AACCTACCT-PO3-5' (SEQ ID NO: 104)
5.5    5'-PO3-GCAGAAGCC (SEQ ID NO: 105)   AACGTCTTC-PO3-5' (SEQ ID NO: 106)
5.6    5'-PO3-GCCATGTCC (SEQ ID NO: 107)   AACGGTACA-PO3-5' (SEQ ID NO: 108)
5.7    5'-PO3-GTCTGCTCC (SEQ ID NO: 109)   AACAGACGA-PO3-5' (SEQ ID NO: 110)
5.8    5'-PO3-CGACAGACC (SEQ ID NO: 111)   AAGCTGTCT-PO3-5' (SEQ ID NO: 112)
5.9    5'-PO3-CGCTACTCC (SEQ ID NO: 113)   AAGCGATGA-PO3-5' (SEQ ID NO: 114)
5.10   5'-PO3-CCACAGACC (SEQ ID NO: 115)   AAGGTGTCT-PO3-5' (SEQ ID NO: 116)
5.11   5'-PO3-CCTCTCTCC (SEQ ID NO: 117)   AAGGAGAGA-PO3-5' (SEQ ID NO: 118)
5.12   5'-PO3-CTCGTAGCC (SEQ ID NO: 119)   AAGAGCATC-PO3-5' (SEQ ID NO: 120)

1X ligase buffer: 50 mM Tris, pH 7.5; 10 mM dithiothreitol; 10 mM MgCl2; 2.5 mM ATP; 50 mM NaCl.
10X ligase buffer: 500 mM Tris, pH 7.5; 100 mM dithiothreitol; 100 mM MgCl2; 25 mM ATP; 500 mM NaCl.

Cycle 1

To each of twelve PCR tubes was added 50 μL of a 1 mM solution of Compound 1 in water; 75 μL of a 0.80 mM solution of one of Tags 1.1-1.12; 15 μL of 10X ligase buffer; and 10 μL of deionized water. The tubes were heated at 95 °C for 1 minute and then cooled to 16 °C for 10 minutes. To each tube were added 5,000 units of T4 DNA ligase (2.5 μL of a 2,000,000 unit/mL solution (New England Biolabs, Cat. No. M0202)) in 50 μL of 1X ligase buffer, and the resulting solutions were incubated at 16 °C for 16 hours.
After ligation, the samples were transferred to 1.5 mL Eppendorf tubes, treated with 20 μL of 5 M aqueous NaCl and 500 μL of cold (-20 °C) ethanol, and kept at -20 °C for 1 hour. After centrifugation, the supernatant was removed and the pellet was washed with 70% aqueous ethanol at -20 °C. Each of the pellets was then dissolved in 150 μL of 150 mM sodium borate buffer, pH 9.4. Stock solutions comprising one each of the building block precursors BB1 to BB12, N,N-diisopropylethanolamine and O-(7-azabenzotriazol-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate, each at a concentration of 0.25 M, were prepared in DMF and stirred at room temperature for 20 minutes. The building block precursor solutions were added to each of the pellet solutions described above to provide a 10-fold excess of building block precursor relative to the linker. The resulting solutions were stirred. An additional 10 equivalents of building block precursor were added to the reaction mixture after 20 minutes, and another 10 equivalents after 40 minutes. The final concentration of DMF in the reaction mixture was 22%. The reaction solutions were then stirred overnight at 4 °C. The progress of the reaction was monitored by RP-HPLC using 50 mM aqueous tetraethylammonium acetate (pH 7.5) and acetonitrile, with a gradient of 2-46% acetonitrile over 14 minutes. The reaction was stopped when approximately 95% of the starting material (linker) was acylated. After acylation, the reaction mixtures were pooled and lyophilized to dryness. The lyophilized material was then purified by HPLC, and the fractions corresponding to the library (acylated product) were pooled and lyophilized. The library was dissolved in 2.5 mL of 0.01 M sodium phosphate buffer (pH 8.2), and 0.1 mL of piperidine (4% v/v) was added. The addition of piperidine results in turbidity which does not dissolve on mixing.
The reaction mixtures were stirred at room temperature for 50 minutes, and then the turbid solution was centrifuged (14,000 rpm), the supernatant was removed using a 200 μL pipette, and the pellet was resuspended in 0.1 mL of water. The aqueous wash was combined with the supernatant and the pellet was discarded. The deprotected library was precipitated from solution by the addition of excess ice-cold ethanol to bring the final concentration of ethanol in the reaction to 70% v/v. Centrifugation of the aqueous ethanol mixture gave a white pellet comprising the library. The pellet was washed once with cold 70% aqueous ethanol. After removal of the solvent, the pellet was dried in air (approximately 5 minutes) to remove traces of ethanol and was then used in Cycle 2. The tags and corresponding building block precursors used in Cycle 1 are set forth in Table 1 below.
Table 1

Building block precursor   Tag
BB1     1.11
BB2     1.6
BB3     1.2
BB4     1.8
BB5     1.1
BB6     1.10
BB7     1.12
BB8     1.5
BB9     1.4
BB10    1.3
BB11    1.7
BB12    1.9

Cycles 2-5

For each of these cycles, the combined solution resulting from the previous cycle was divided into 12 equal aliquots of 50 μL each and placed in PCR tubes. To each tube was added a solution comprising a different tag, and ligation, purification and acylation were performed as described for Cycle 1, except that for Cycles 3-5 the HPLC purification step described for Cycle 1 was omitted. The correspondence between tags and building block precursors for Cycles 2-5 is presented in Table 2. The products of Cycle 5 were ligated with the closing primer shown below, using the method described above for ligation of tags.

5'-PO3-GGCACATTGATTTGGGAGTCA
GTGTAACTAAACCCTCAGT-PO3-5'

Table 2

Building block precursor   Cycle 2 tag   Cycle 3 tag   Cycle 4 tag   Cycle 5 tag
BB1     2.7     3.7     4.7     5.7
BB2     2.8     3.8     4.8     5.8
BB3     2.2     3.2     4.2     5.2
BB4     2.10    3.10    4.10    5.10
BB5     2.1     3.1     4.1     5.1
BB6     2.12    3.12    4.12    5.12
BB7     2.5     3.5     4.5     5.5
BB8     2.6     3.6     4.6     5.6
BB9     2.4     3.4     4.4     5.4
BB10    2.3     3.3     4.3     5.3
BB11    2.9     3.9     4.9     5.9
BB12    2.11    3.11    4.11    5.11

Results: The synthetic procedure described above has the capability of producing a library comprising 12^5 (approximately 249,000) different structures. The synthesis of the library was monitored by gel electrophoresis of the product of each cycle. The results of each of the five cycles and of the final library following ligation of the closing primer are illustrated in Figure 7. The compound labeled "head piece" is Compound 1. The figure shows that each cycle results in the expected increase in molecular weight and that the products of each cycle are substantially homogeneous with respect to molecular weight.
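The correspondence in Tables 1 and 2 is what allows a sequenced tag to be read back as a series of building blocks. The following sketch is only an illustration of that lookup, assuming the tag numbers have already been recovered from the sequence; the names `CYCLE_1`, `CYCLES_2_TO_5`, `decode` and `library_size` are introduced here and are not taken from the source. It also reproduces the 12^5 library size stated in the results.

```python
# Tag index -> building block precursor, transcribed from Table 1 (Cycle 1).
CYCLE_1 = {11: "BB1", 6: "BB2", 2: "BB3", 8: "BB4", 1: "BB5", 10: "BB6",
           12: "BB7", 5: "BB8", 4: "BB9", 3: "BB10", 7: "BB11", 9: "BB12"}

# Table 2 uses the same index pattern in each of Cycles 2-5.
CYCLES_2_TO_5 = {7: "BB1", 8: "BB2", 2: "BB3", 10: "BB4", 1: "BB5", 12: "BB6",
                 5: "BB7", 6: "BB8", 4: "BB9", 3: "BB10", 9: "BB11", 11: "BB12"}


def decode(tag_numbers):
    """Map tag numbers such as ["1.11", "2.8", ...] to building block precursors.

    Tags are expected in cycle order (Cycle 1 first); a tag from the wrong
    cycle at a given position raises ValueError.
    """
    blocks = []
    for position, tag in enumerate(tag_numbers, start=1):
        cycle_text, index_text = tag.split(".")
        cycle, index = int(cycle_text), int(index_text)
        if cycle != position:
            raise ValueError(f"tag {tag} is out of order at position {position}")
        table = CYCLE_1 if cycle == 1 else CYCLES_2_TO_5
        blocks.append(table[index])
    return blocks


def library_size(blocks_per_cycle, cycles):
    """Number of distinct structures from a split-and-pool synthesis."""
    return blocks_per_cycle ** cycles


if __name__ == "__main__":
    print(decode(["1.11", "2.8", "3.2", "4.10", "5.1"]))
    print(library_size(12, 5))
```

Keeping one lookup table per cycle mirrors the tables in the text: the same building block precursor is encoded by a different tag in each cycle, so the position of a tag in the read identifies the step at which the block was added.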
Example 2: Synthesis and Characterization of a Library on the Order of 10^8 Members

The synthesis of a library comprising on the order of 10^8 different members was accomplished using the following reagents:

Compound 2: [structure not reproduced]

One-letter codes for deoxyribonucleotides: A = adenosine; C = cytidine; G = guanosine; T = thymidine

Building block precursors (labels up to BB96): [structures not reproduced]

Table 3: Oligonucleotide Tags Used in Cycle 1

Tag number   Upper strand sequence                           Lower strand sequence
1.1    5'-PO3-AAATCGATGTGGTCACTCAG (SEQ ID NO: 121)   5'-PO3-GAGTGACCACATCGATTTGG (SEQ ID NO: 122)
1.2    5'-PO3-AAATCGATGTGGACTAGGAG (SEQ ID NO: 123)   5'-PO3-CCTAGTCCACATCGATTTGG (SEQ ID NO: 124)
1.3    5'-PO3-AAATCGATGTGCCGTATGAG (SEQ ID NO: 125)   5'-PO3-CATACGGCACATCGATTTGG (SEQ ID NO: 126)
1.4    5'-PO3-AAATCGATGTGCTGAAGGAG (SEQ ID NO: 127)   5'-PO3-CCTTCAGCACATCGATTTGG (SEQ ID NO: 128)
1.5    5'-PO3-AAATCGATGTGGACTAGCAG (SEQ ID NO: 129)   5'-PO3-GCTAGTCCACATCGATTTGG (SEQ ID NO: 130)
1.6    5'-PO3-AAATCGATGTGCGCTAAGAG (SEQ ID NO: 131)   5'-PO3-CTTAGCGCACATCGATTTGG (SEQ ID NO: 132)
1.7    5'-PO3-AAATCGATGTGAGCCGAGAG (SEQ ID NO: 133)   5'-PO3-CTCGGCTCACATCGATTTGG (SEQ ID NO: 134)
5'-PO3-AAATCGATGTGCCGTATCAG 5'-PO3-GATACGGCACATCGATTTGG 1.
8 (SEQ ID NO: 135) (SEQ ID NO: 136) 5'-P03- 5'-P03- AAATCGATGTGCTGAAGCAG GCTTCAGCACATCGATTTGG 1.9 (SEQ ID NO: 137) (SEQ ID NO: 138) 5'-P03- 5'- P03- AAATCGATGTGTGCGAGTAG ACTCGCACACATCGATTTGG 1.10 (SEQ ID NO: 139) (SEQ ID NO: 140) 5'-P03- 5'-P03- AAATCGATGTGTTTGGCGAG CGCCAAACACATCGATTTGG 1.11 (SEQ ID NO: 141) (SEQ ID NO: 142) 5'-P03 - 5'-P03- AAATCGATGTGCGCTAACAG GTTAGCGCACATCGATTTGG 1.12 (SEQ ID NO: 143) (SEQ ID NO: 144) 5'-P03- 5'-P03- AAATCGATGTGAGCCGACAG GTCGGCTCACATCGATTTGG 1.13 (SEQ ID NO: 145) (SEQ ID NO: 146) 5'-P03- 5 * -P03- 10 AAATCGATGTGAGCCGAAAG TTCGGCTCACATCGATITGG 1. 14 (SEQ ID NO: 147) (SEQ ID NO: 148) 5'-P03- 5'-P03- AAATCGATGTGTCGGTAGAG CTACCGACACATCGATTTGG 1.15 (SEQ ID NO: 149) (SEQ ID NO: 150) 5'-P03- 5'- P03- AAATCGATGTGGTTGCCGAG CGGCAACCACATCGATTTGG 1.16 (SEQ ID NO: 151) (SEQ ID NO: 152) 5'-P03- 5'-P03- AAATCGATGTGAGTGCGTAG ACGCACTCACATCGATTTGG 1.17 (SEQ ID NO: 153) (SEQ ID NO: 154) 5'-P03- 5'-P03-AAATCGATGTGGTTGCCAAG TGGCAACCACATCGATTTGG 1.18 (SEQ ID NO: 155) (SEQ ID NO: 156) 5'-P03- 5'-P03- AAATCGATGTGTGCGAGGAG CCTCGCACACATCGATTTGG 1.19 (SEQ ID NO: 157) (SEQ ID NO: 158) 5'-P03- 5'-P03- AAATCGATGTGGAACACGAG CGTGTTCCACATCGA'ITTGG 1. 20 (SEQ ID NO: 159) (SEQ ID NO: 160) 5'-P03- 5'-P03- AAATCGATGTGCTTGTCGAG CGACAAGCACATCGATTTGG 1.21 (SEQ ID NO: 161) (SEQ ID NO: 162) 5'-P03- 5'-P03- AAATCGATGTGTTCCGGTAG AOCCGGAACACATCGATTTGG 1.22 (SEQ ID NO: 163) (SEQ ID NO: 164) 5'-P03- 5 ' -P03- AAATCGATGTGTGCGAGCAG GCTCGCACACATCGATTTGG 1.23 (SEQ ID NO: 165) (SEQ ID NO: 166) 5 * -P03- 5'-P03- AAATCGATGTGGTCAGGTAG ACCTGACCACATCGATTTGG 1.24 (SEQ ID NO: 167) (SEQ ID NO: 168) 5'- P03- 5'-P03- 1.25 AAATCGATGTGGCCTGTTAG AACAGGCCACATCGATTTGG (SEQ ID NO: 169) (SEQ ID NO: 170) 5 * -P03- 5'-P03- AAATCGATGTGGAACACCAG GGTGTTCCACATCGATTTGG 1.26 (SEQ ID NO: 171) (SEQ ID NO: 172) 5'-P03- 5'-P03 -AAATCGATGTGCTTGTCCAG GGACAAGCACATCGATTTGG 1. 
27 (SEQ ID NO: 173) (SEQ ID NO: 174) 5'-P03- 5'-P03- AAATCGATGTGTGCGAGAAG TCTCGCACACATCGATTTGG 1.28 (SEQ ID NO: 175) (SEQ ID NO: 176) 5'-P03- 5'- P03- AAATCGATGTGAGTGCGGAG CCGCACTCACATCGATTTGG 1.29 (SEQ ID NO: 177) (SEQ ID NO: 178) 5'-P03- 5'-P03- AAATCGATGTGTTGTCCGAG CGGACAACACATCGATTTGG 1.30 (SEQ ID NO: 179) (SEQ ID NO: 180) 5'-P03 - 5'-P03- 10 AAATCGATGTGTGGAACGAG CGTTCCACACATCGATTTGG 1. 31 (SEQ ID NO: 181) (SEQ ID NO: 182) 5'-P03- 5'-P03- AAATCGATGTGAGTGCGAAG TCGCACTCACATCGATTTGG 1.32 (SEQ ID NO: 183) (SEQ ID NO: 184) 5'-P03- 5'- P < 33- AAATCGATGTGTGGAACCAG GGTTCCACACATCGATTTGG 1.33 (SEQ ID NO: 185) (SEQ ID NO: 186) 5'-P03- 5'-P03- AAATCGATGTGTTAGGCGAG CGCCTAACACATCGATTTGG 1.34 (SEQ ID NO: 187) (SEQ ID NO: 188) 5'-P03 - 5'-P03- 15 AAATCGATGTGGCCTGTGAG CACAGGCCACATCGATTTGG 1. 35 (SEQ ID NO: 189) (SEQ ID NO: 190) 5'-P03- 5'-P03-AAATCGATGTGCTCCTGTAG ACAGGAGCACATCGATTTGG 1. 36 (SEQ ID NO: 191) (SEQ ID NO: 192) 5'-P03- 5'-P03- AAATCGATGTGGTCAGGCAG GCCTGACCACATCGATTTGG 1.37 (SEQ ID NO: 193) (SEQ ID NO: 194) 5'-P03- 5'- P03- AAATCGATGTGGTCAGGAAG TCCTGACCACATCGATTTGG 1.38 (SEQ ID NO: 195) (SEQ ID NO: 196) 5'-P03- 5'-P03- AAATCGATGTGGTAGCCGAG CGGCTACCACATCGATTTGG 1.39 (SEQ ID NO: 197) (SEQ ID NO: 198) 5'-P03- 5'-P03- AAATCGATGTGGCCTGTAAG TACAGGCCACATCGATTTGG 1.40 (SEQ ID NO: 199) (SEQ ID NO: 200) 5'-P03- 5'-P03-AAATCGATGTGCTTTCGGAG CGGAAAGCACATCGATITGG 1.41 (SEQ ID NO: 201) (SEQ ID NO: 202) 5'-P03- 5'-P03- AAATCGATGTGCGTAAGGAG CCTTACGCACATCGATTTGG 1.42 (SEQ ID NO: 203) (SEQ ID NO: 204) '-P03- 5'-P03- AAATCGATGTGAGAGCGTAG ACGCTCTCACATCGATTTGG 1.43 (SEQ ID NO: 205) (SEQ ID NO: 206) 5'-P03- 5'-P03- AAATCGATGTGGACGGCAAG TGCCGTCCACATCGATTTGG 1.44 (SEQ ID NO: 207) (SEQ ID NO: 208) 5'-P03- 5 '-P03-AAATCGATGTGCTTTCGCAG GCGAAAGCACATCGATTTGG 1. 
45 (SEQ ID NO: 209) (SEQ ID NO: 210) 5'-P03- 5'-P03- AAATCGATGTGCGTAAGCAG GCTTACGCACATCGATTTGG 1.46 (SEQ ID NO: 211) (SEQ ID NO: 212) 5'-P03- 5'- P03- AAATCGATGTGGCTATGGAG CCATAGCCACATCGATTTGG 1.47 (SEQ ID NO: 213) (SEQ ID NO: 214) 5'-P03- 5'-P < 33- AAATCGATGTGACTCTGGAG CCAGAGTCACATCGATTTGG 1.48 (SEQ ID NO: 215) (SEQ ID NO: 216) 5 '-P03- 10 5'-P03-AAATCGATGTGCTGGAAAG TTCCAGCACATCGATTTGG 1. 49 (SEQ ID NO: 217) (SEQ ID NO: 218) 5'-P03- 5'-P03-AAATCGATGTGCCGAAGTAG ACTTCGGCACATCGATTTGG 1.50 (SEQ ID NO: 219) (SEQ ID NO: 220) 5'-P03- 5 '- P03- AAATCGATGTGCTCCTGAAG TCAGGAGCACATCGATTTGG 1.51 (SEQ ID NO: 221) (SEQ ID NO: 222) 5'-P03- 5 '-P03- AAATCGATGTGTCCAGTCAG GACTGGACACATCGATTTGG 1.52 (SEQ ID NO: 223) (SEQ ID NO: 224) 5'-P03- 5 '-P03- AAATCGATGTGAGAGCGGAG CCGCTCTCACATCGATTGGG 1.53 (SEQ ID NO: 225) (SEQ ID NO: 226) 5'-P03- 5'-P03- AAATCGATGTGAGAGCGAAG TCGCTCTCACATCGATTTGG 1.54 (SEQ ID NO.-227) ( SEQ ID NO: 228) 5'-P03- 5'-P03-AAATCGATGTGCCGAAGGAG CCTTCGGCACATCGATTTGG 1.55 (SEQ ID NO: 229) (SEQ ID NO: 230) 5'-P03- 5'-P03- AAATCGATGTGCCGAAGCAG GCTTCGGCACATCGATTTGG 1.56 (SEQ ID NO: 231) (SEQ ID NO: 232) 5'-P03- 5'-P03- AAATCGATGTGTGTTCCGAG CGGAACACACATCGATTTGG 1.57 (SEQ ID NO: 233) (SEQ ID NO: 234) 5'-P03- 5 ' -P03- AAATCGATGTGTCTGGCGAG CGCCAGACACATCGATTTGG 1.58 (SEQ ID NO: 235) (SEQ ID NO: 236) 5'-P03- 5'-P03- AAATCGATGTGCTATCGGAG CCGATAGCAC? 
TCGATTTGG 1.59 (SEQ ID NO: 237) (SEQ ID NO: 238) 5 '-P03- 5'-P03- 1.60 AAATCGATGTGCGAAAGGAG CCTTTCGCACATCGATTTGG (SEQ ID NO: 239) (SEQ ID NO: 240) 5'-P03- 5'-P03- AAATCGATGTGCCGAAGAAG TCTTCGGCACATCGATTTGG 1.61 (SEQ ID NO: 241) (SEQ ID NO: 242) 5'-P03- 5'-P03 - AAATCGATGTGGTTGCAGAG CTGCAACCACATCGATTTGG 1.62 (SEQ ID NO: 243) (SEQ ID NO: 244) 5'-P03- 5'-P03- AAATCGATGTGGATGGTGAG CACCATCCACATCGATTTGG 1.63 (SEQ ID NO: 245) (SEQ ID NO: 246) 5'-P03- 5'-P03- AAATCGATGTGCTATCGCAG GCGATAGCACATCGATTGGG 1.64 (SEQ ID NO: 247) (SEQ ID NO: 248) 5 * -P03- 5'-P03- AAATCGATGTGCGAAAGCAG GCTTTCGCACATCGATTTGG 1.65 (SEQ ID NO: 249) (SEQ ID NO: 250) 5 '-P03- 5'-P03- AAATCGATGTGACACTGGAG CCAGTGTCACATCGATTTGG 1.66 (SEQ ID NO: 251) (SEQ ID NO: 252) 5'-P03- 5'-P03- AAATCGATGTGTCTGGCAAG TGCCAGACACATCGAT TGG 1.67 (SEQ ID NO: 253) (SEQ ID NO: 254) 5'-P03- 5'-P03- AAATCGATGTGGATGGTCAG GACCATCCACATCGATTTGG 1.68 (SEQ ID NO: 255) ( SEQ ID NO: 256) 5'-P03- 5'-P03-AAATCGATGTGGTTGCACAG GTGCAACCACATCGATTTGG 1.69 (SEQ ID NO: 257) (SEQ ID NO: 258) 5'-P03- 5'-P03-CGATGCCCCATCCGA AAATCGATGTGGGCATCGAG TTTGG 15 1.70 (SEQ ID NO: 259) (SEQ ID NO: 260) 5'-P03- 5'-P03-AAATCGATGTGTGCCTCCAG GGAGGCACACATCGATTTGG 1.71 (SEQ ID NO: 261) (SEQ ID NO: 262) 5'-P03- 5'-P03- AAATCGATGTGTGCCTCAAG TGAGGCACACATCGATTTGG 1.72 (SEQ ID NO: 263) (SEQ ID NO: 264) 5'-P03- 5'-P03-AAATCGATGTGGGCATCCAG GGATGCCCACATCGATTTGG 1.73 (SEQ ID NO: 265) (SEQ ID NO: 26β) 5'-P03- 5 ' -P03-TGATGCCCA CAT CGA AAATCGATGTGGGCATCAAG TTT GG 1.74 (SEQ ID NO: 267) (SEQ ID NO: 268) 5'-P03- 5'-P03-CGA CAG GCA CAT AAATCGATGTGCCTGTCGAG CGA TTT GG 1.75 (SEQ ID NO: 269) (SEQ ID NO; 270) 5'-P03- 5'-P03-ATC CGT CCA CAT AAATCGATGTGGACGGATAG CGA TTT GG 1.76 (SEQ ID NO: 271) (SEQ ID NO: 272) 5'-P03- 5'-P03-GGA CAG GCA CAT AAATCGATGTGCCTGTCCAG CGA TTTGG 1.77 (SEQ ID NO: 273) (SEQ ID NO: 274) 25 '-P03- 5'-P03-CGTGCT TCA CAT AAATCGATGTGAAGCACGAG CGA TTT GG 1.78 (SEQ ID NO: 275) (SEQ ID NO: 2 76) 5'-P03- 5'-P03-TGACAG GCA CAT AAATCGATGTGCCTGTCAAG CGA TTT GG 1.79 (SEQ ID 
NO: 277) (SEQ ID NO: 2 78) 5'-P03- 5'-P03-GGTGCT TCA CAT AAATCGATGTGAAGCACCAG CGA TTT GG 1.80 (SEQ ID NO: 279) (SEQ ID NO: 2 80) 5 * -P03-ACG AAG GCA CAT '-P03-AAATCGATGTGCCTTCGTAG CGA TTT GG 1.81 (SEQ ID NO: 281) (SEQ ID NO: 2 82) 5'-P03- 5'-P03-CGGACG AA CAT AAATCGATGTGTCGTCCGAG CGA TTT GG 182 (SEQ ID NO: 283) (SEQ ID NO: 2 84) 5'-P03- 5'-P03-CAG ACT CCA CAT AAATCGATGTGGAGTCTGAG CGA TTT GG 1.83 (SEQ ID NO: 285) (SEQ ID NO: 2 86) 5'-P03- 5'- P03-CGG ATC AA CAT AAATCGATGTGTGATCCGAG CGA TTT GG 184 (SEQ ID NO: 287) (SEQ ID NO: 2 88) 5'-P03- 5'-P03-CGC CTG AA CAT AAATCGATGTGTCAGGCGAG CGA TTTGG 1.85 (SEQ ID NO: 289 ) (SEQ ID NO: 290) 5'-P03- 5'-P03-TGG ACG AA CAT AAATCGATGTGTCGTCCAAG CGA TTT GG 1.86 (SEQ ID NO: 291) (SEQ ID NO: 292) 5'-P03- 5'-P03 -CTC CGT CCA CAT AAATCGATGTGGACGGAGAG CGA TTT GG 1.87 (SEQ ID NO: 293) (SEQ ID NO: 294) 5'-P03- 5'-P03-CTG CTA CCA CAT AAATCGATGTGGTAGCAGAG CGA TTT GG 1.88 (SEQ ID NO: 295) (SEQ ID NO: 296) 5'-P03- 5'-P03- AAATCGATGTGGCTGTGTAG ACACAGCCACATCGATTTGG 1. 89 (SEQ ID NO: 297) (SEQ ID NO: 298) 5'-P03- 5'-P03-GTC CGT CCA CAT AAATCGATGTGGACGGACAG CGA TTT GG 1.90 (SEQ ID NO: 299) (SEQ ID NO: 300) 5 ' -P03- 5'-P03-TGC CTG ACÁ CAT AAATCGATGTGTCAGGCAAG CGA TTT GG 1.91 (SEQ ID NO: 301) (SEQ ID NO: 302) 5'-P03- 5'-P03- AAATCGATGTGGCTCGAAAG TTCGAGCCACATCGATTTGG 1 2 (SEQ ID NO: 303) (SEQ ID NO: 304) 5'-P03- 5'-P03-CCG AAG GCA CAT AAATCGATGTGCCTTCGGAG CGA TTT GG 193 (SEQ ID NO: 305) (SEQ ID NO: 306) 5 '-P03- 5'-P03-GTG CTA CCA CAT AAATCGATGTGGTAGCACAG CGA TTTGG 1.94 (SEQ ID NO: 307) (SEQ ID NO: 308) 5'-P03- 5 * -P03-GAC CTT CCA CAT 1. 
95 AAATCGATGTGGAAGGTCAG CGA TTTGG (SEQ ID NO: 309) (SEQ ID NO: 310) 5'-P03- 5'-P03-ACAGCACCACAT AAATCGATGTGGTGCTGTAG CGA TTT GG 196 (SEQ ID N0: 311) (SEQ ID NO: 312) Table 4: Oligonucleotide labels used in cycle 2: Tag Number Upper Strand Sequence Lower Strand Sequence 5'-P03-GTT GCC TGT 5'-P03-AGG CAÁ CCT 21 (SEQ ID NO-313) (SEQ ID NO 314) 5'-P03-CAG GAC GGT 5'-P03-CGT CCT GCT 22 (SEQ ID NO: 315) (SEQ ID NO 316) 5'-P03-AGA CGT GGT 5'-P03-CAC GTC TCT 23 (SEQ ID NO: 317) (SEQ ID NO: 318) 5'-P03-CAG GAC CGT 5'-P03-GGT CCT GCT 24 (SEQ ID NO: 319) (SEQ ID NO 320) 5'-P03- CAGGAC AGT 5'-P03-TGT CCT GCT 25 (SEQ ID NO: 321) (SEQ ID NO: 322) 5'-P03-CAC TCT GGT 5'-P03-CAG AGT GCT 26 (SEQ ID NO: 323) ( SEQ ID NO: 324) 5'-P03-GAC GGC TGT 5'-P03-AGC CGTCCT 27 (SEQ ID NO: 325) (SEQ ID NO.326) 5'-P03-CAC TCT CGT 5'-P03-GAG AGT GCT 28 (SEQ ID NO.327) (SEQ ID NO.328) 5'-P03-GTA GCC TGT 5'-P03-AGG CTA CCT 29 (SEQ ID NO: 329) (SEQ ID NO.330) 5 ' -P03-GCC ACT TGT 5'-P03-AAG TGG CCT 210 (SEQ ID NO 331) (SEQ ID NO-332) 5'-P03-CAT CGC TGT 5'-P03-AGC GAT GCT 211 (SEQ ID NO: 333) (SEQ ID NO.334) 5'-P03-CAC TGG TGT 5'- P03-ACC AGT GCT 212 (SEQ ID NO-335) (SEQ ID NO-336) 5'-P03-GCC ACT GGT 5'-P03-CAG TGG CCT 213 (SEQ ID NO 337) (SEQ ID NO 338) 5 '-P03-TCT GGC TGT 5'-P03-AGC CAG ACT 214 (SEQ ID NO: 339) (SEQ ID NO.340) 5'-P03-GCC ACT CGT 5'-P03-GAG TGG CCT 215 (SEQ ID NO-341) (SEQ ID NO-342) 5 '-P03-TGC CTC TGT 5' -P03-AGA GGC ACT 216 (SEQ ID NO-343) (SEQ ID NO 344) 5'-P03-CAT CGC AGT 5 '-P03-TGC GAT GCT 217 (SEQ ID NO.345) (SEQ ID NO: 346) 5'-P03-CAG GAA GGT 5'-P03-CTT CCT GCT 218 (SEQ ID NO 347) (SEQ ID NO 348) ) 5'-P03-GGC ATC TGT 5'-P03-AGA TGC CCT 219 (SEQ ID NO 349) (SEQ ID NO 350) 5'-P03-CGG TGC TGT 5'-P03-AGC ACC GCT 220 (SEQ ID NO: 351) (SEQ ID NO: 352) 221 5'-P03-CACTGGCGT 5'-P03-GCC AGTGCT (SEQ ID NO: 353) (SEQ ID NO: 354) '-P03-TCTCCTCGT 5'-P03-GAGGAGACT 2. 
22 (SEQ ID NO: 355) (SEQ ID NO: 356) '-P03-CCTGTCTGT 5'-P03-AGACAGGCT 223 (SEQ ID NO: 357) (SEQ ID NO: 358) '-P03-CAACGCTGT 5'-P03-AGCGTTGCT 2. 24 (SEQ ID NO: 359) (SEQ ID NO: 360) '-P03-TGCCTCGGT 5'-P03-CGA GGC ACT 225 (SEQ ID NO: 361) (SEQ ID NO: 362) '-P03-ACACTGCGT 5'-P03-GCA GTG TCT 226 (SEQ ID NO: 363) (SEQ ID NO: 364) '-P03-TCGTCCTGT 5'-P03-AGG ACGACT 227 (SEQ ID NO: 365) (SEQ ID NO: 366) '-P03-GCTGCC AGT 5'-P03-TGGCAGCCT 228 (SEQ ID NO: 367) (SEQ ID NO: 368) '-P03-TCAGGCTGT 5'-P03-AGCCTG ACT 229 (SEQ ID NO: 369) (SEQ ID NO: 370) '-P03-GCC AGG TGT 5'-P03-ACCTGGCCT 2. 30 (SEQ ID NO: 371) (SEQ ID NO: 372) '-P03-CGG ACCTGT 5'-P03-AGGTCCGCT 231 (SEQ ID NO: 373) (SEQ ID NO: 374) '-P03-CAA CGC AGT 5'-P03-TGCGTTGCT 232 (SEQ ID NO: 375) (SEQ ID NO: 376) '-P03-CACACG AGT 5'-P03-TCGTGTGCT 2. 33 (SEQ ID NO: 377) (SEQ ID NO: 378) '-P03-ATGGCCTGT 5'-P03-AGGCCATCT 234 (SEQ ID NO: 379) (SEQ ID NO: 380) '-P03-CCAGTCTGT 5'-P03-AGACTGGCT 235 (SEQ ID NO: 381) (SEQ ID NO: 382) '-P03-GCC AGG AGT 5'-P03-TCC TGG CCT 2. 36 (SEQ ID NO: 383) (SEQ ID NO: 384) '-P03-CGG ACC AGT 5'-P03-TGGTCCGCT 237 (SEQ ID NO: 385) (SEQ ID NO.386) '-P03-CCTTCGCGT 5'-P03-GCG AAG GCT 238 (SEQ ID NO: 387) (SEQ ID NO: 388) '-P03-GCAGCC AGT 5'-P03-TGG CTG CCT 2. 39 (SEQ ID NO: 389) (SEQ ID NO: 390) '-P03-CCA GTC GGT 5'-P03-CGACTGGCT 240 (SEQ ID NO: 391) (SEQ ID NO: 392) '-P03-ACTGAGCGT 5'-P03-GCTCAGTCT 241 (SEQ ID NO: 393) (SEQ ID NO: 394) "-P03-CCAGTCCGT 5'-P03-GGACTGGCT 2. 42 (SEQ ID NO: 395) (SEQ ID NO: 396) e '-P03-CCAGTC AGT 5'-P03-TGACTGGCT 2. 43 (SEQ ID NO: 397) (SEQ ID NO: 398) '-P03-CATCGAGGT 5'-P03-CTCGATGCT 244 (SEQ ID NO: 399) (SEQ ID NO: 400) '-P03-CCATCGTGT 5'-P03-ACG ATG GCT 245 (SEQ ID NO: 401) (SEQ ID NO: 402) '-P03-GTGCTGCGT 5'-P03-GCAGCACCT 246 (SEQ ID NO: 403) (SEQ ID NO: 404) '-P03-GACTACGGT 5'-P03-CGTAGTCCT 2. 47 (SEQ ID NO: 405) (SEQ ID NO: 406) '-P03-GTG CTG AGT 5'-P03-TCAGCACCT 2. 48 (SEQ ID NO: 07) (SEQ ID NO: 408) '-P03-GCTGCATGT 5'-P03-ATGCAGCCT 2. 
49 (SEQ ID NO: 409) (SEQ ID NO: 410) '-P03-GAGTGGTGT 5'-P03-ACCACTCCT 2. 50 (SEQ ID NO: 411) (SEQ ID NO: 412) '-P03-GACTACCGT 5'-P03-GGTAGTCCT 2. 51 (SEQ ID NO: 413) (SEQ ID NO: 414) '-P03-CGGTGATGT 5'-P03-ATCACCGCT 2. 52 (SEQ ID NO: 415) (SEQ ID NO: 416) '-P03-TGCGACTGT 5'-P03-AGTCGCACT 2. 53 (SEQ ID NO: 417) (SEQ ID NO: 418) '-P03-TCTGGAGGT 5'-P03-CTCCAGACT 2. 54 (SEQ ID NO: 419) (SEQ ID NO: 420) '-P03-AGCACTGGT 5'-P03-CAGTGCTCT 2. 55 (SEQ ID NO: 421) (SEQ ID NO: 422) , -P03-TCGCTTGGT 5'-P03-CAAGCGACT 2. 56 (SEQ ID NO: 423) (SEQ ID NO: 424) '-P03-AGCACTCGT 5'-P03-GAGTGCTCT 2. 57 (SEQ ID NO: 425) (SEQ ID NO: 426) '-P03-GCGATTGGT 5'-P03-CAATCGCCT 2. # 2. 79 (SEQ ID NO: 469) (SEQ ID NO: 470) 5'-P03-AGTGCCAGT 5'-P03-TGGCACTCT 2. 80 (SEQ ID NO: 471) (SEQ ID NO: 472) 5'-P03-TAG? GGCGT 5'-P03-GCCTCTACT 2. 81 (SEQ ID NO: 473) (SEQ ID NO: 474) 5'-P03-GTCAGCGGT 5'-P03-CGCTGACCT 2. 82 (SEQ ID NO: 475) (SEQ ID NO: 476) 5'-P03-TCAGGAGGT 5'-P03-CTCCTGACT 2. 83 (SEQ ID NO: 477) (SEQ ID NO: 478) 5'-P03-AGCAGGTGT 5'-P03-ACCTGCTCT 2. 84 (SEQ ID NO: 479 (SEQ ID NO: 80) 5'-P03-TTCCGCAGT 5'-P03-TGCGGAACT 2. 85 (SEQ ID NO: 481) (SEQ ID NO: 482) 5'-P03-GTCAGCCGT 5'-P03-GGCTGACCT 2. 86 (SEQ ID NO: 483) (SEQ ID NO: 484) 5'-P03-GGTCTGCGT 5'-P03-GCAGACCCT 2. 87 (SEQ ID NO: 485) (SEQ ID NO: 486) 5'-P03-TAGCCGAGT 5'-P03-TCGGCTACT 2. 88 (SEQ ID NO: 487) (SEQ ID NO: 488) 5'-P03-GTCAGCAGT 5'-P03-TGCTGACCT 2. 89 (SEQ ID NO: 489) (SEQ ID NO: 490) 5'-P03-GGTCTGAGT 5'-P03-TCAGACCCT 2. 90 (SEQ ID NO: 491) (SEQ ID NO: 492) 5'-P03-CGGACAGGT 5'-P03-CTGTCCGCT 2. 91 (SEQ ID NO: 493) (SEQ ID NO: 494) 5'-P03-TTAGCCGGT5'- 5 '-P03-CGGCTA ACT5 * -P03- P03-3' 3 '2.92 (SEQ ID NO: 495) (SEQ ID NO: 496) 5'-P03-GAGACGAGT 5'-P03-TCGTCTCCT 2. 93 (SEQ ID NO: 497) (SEQ ID NO: 498) 5'-P03-CGTAACCGT 5'-P03-GGTTACGCT 2. 94 (SEQ ID NO: 499) (SEQ ID NO: 500) 5'-P03-TTGGCGTGT5'- 5'-P03-ACGCCAACT5'-P03-P03-3 '3' 2.95 (SEQ ID NO: 501) (SEQ ID NO: 502) 5'-P03-ATGGCAGGT 5'-P03-CTGCCATCT 2. 
96 (SEQ ID NO: 503) (SEQ ID NO: 504) Table 5: Oligonucleotide labels used in cycle 3: Sequence Sequence Number of Label Upper Strand Lower Strand 5'-P03-CAGCTACGA 5'-P03-GTAGCTGAC 3. 1 (SEQ ID NO: 505) (SEQ ID NO: 506) 5'-P03-CTCCTGCGA 5'-P03-GCAGGAGAC 3. 2 (SEQ ID NO: 507) (SEQ ID NO: 508) 5'-P03-GCTGCCTGA 5'-P03-AGGCAGCAC 3. 3 (SEQ ID NO: 509) (SEQ ID NO: 510) 5'-P03-CAGGAACGA 5'-P03-GTTCCTGAC 3. 4 (SEQ ID NO: 511) (SEQ ID NO: 512) 5'-P03-CAC ACGCGA 5'-P03-GCGTGTGAC 3. 5 (SEQ ID NO: 513) (SEQ ID NO: 514) 5'-P03-GCAGCCTGA 5'-P03-AGG CTG CAC 3. 6 (SEQ ID NO: 515) (SEQ ID NO: 516) 5'-P03-CTG AACGGA 5'-P03-CGTTCAGAC 3. 7 (SEQ ID NO: 517) (SEQ ID NO: 518) 5'-P03-CTG AACCGA 5'-P03-GGTTCAGAC 3. 8 (SEQ ID NO: 519) (SEQ ID NO: 520) 5'-P03-TCTGGACGA 5'-P03-GTC CAG AAC 3. 9 (SEQ ID NO: 521) (SEQ ID NO: 522) 5'-P03-TGCCTACGA 5'-P03-GTA GGC AAC 3. 10 (SEQ ID NO: 523) (SEQ ID NO: 524) 5'-P03-GGC ATA CGA 5'-P03-GTA TGC CAC 3. 11 (SEQ ID NO: 525) (SEQ ID NO: 526) 5'-P03-CGGTGACGA 5'-P03-GTCACCGAC 3. 12 (SEQ ID NO: 527) (SEQ ID NO: 528) 5'-P03 -CAA CGA CGA 5'-P03-GTCGTTGAC 3. 13 (SEQ ID NO: 529) (SEQ ID NO: 530) 5'-P03 • CTC CTC TGA 5'-P03-AGA GGA GAC 3.14 (SEQ ID NO: 531) (SEQ ID NO: 532) 5'-P03 TCA GGA CGA 5'-P03-GTCCTG AAC 3.15 (SEQ ID NO: 533) (SEQ ID NO: 534) 5'-P03 • AAA GGC GGA 5'-P03-CGCCTTTAC 3.16 (SEQ ID NO: 535) (SEQ ID NO: 536) 5'-P03 CTC CTC GGA 5'-P03-CGAGGAGAC 3.17 (SEQ ID NO: 537) (SEQ ID NO: 538) 5'-P03- CAG ATG CGA 5'-P03-GCATCTGAC 3.18 (SEQ ID NO: 539) (SEQ ID NO: 5 '-P03-GTG GAG AGA 5'-P03-TCT CCA CAC 3. 34 (SEQ ID NO: 571) (SEQ ID NO: 572) 5'-P03-GGA CTG CGA 5'-P03-GCA GTC CAC 3. 35 (SEQ ID NO: 573) (SEQ ID NO: 574) 5'-P03-CCG AACCGA 5'-P03-GGTTCG GAC 3. 36 (SEQ ID NO: 575) (SEQ ID NO: 576) 5'-P03-CAC TGC CGA 5 * -P03-GGC AGTGAC 3. 37 (SEQ ID NO: 577) (SEQ ID NO: 578) 5'-P03-CGA AAC GGA 5'-P03-CGTTTC GAC 3. 38 (SEQ ID NO: 579) (SEQ ID NO: 580) 5'-P03-GGA CTG AGA 5'-P03-TCA GTC CAC 3. 
39 (SEQ ID NO: 581) (SEQ ID NO: 582) 5'-P03-CCG AAC AGA 5'-P03-TGT TCG GAC 3. 40 (SEQ ID NO: 583) (SEQ ID NO: 584) 5'-P03-CGA AAC CGA 5 * -P03-GGTTTC GAC 3. 41 (SEQ ID NO: 585) (SEQ ID NO: 586) 5'-P03-CTG GCT TGA 5'-P03-AAG CCA GAC 3. 42 (SEQ ID NO: 587) (SEQ ID NO: 588) 5'-P03-ACC ACC TGA 5'-P03-AGG TGT GAC 3. 43 (SEQ ID NO: 589) (SEQ ID NO: 590) 5'-P03-AAC GAC CGA 5'-P03-GGTCGTTAC 3.44 (SEQ ID NO: 591) (SEQ ID NO: 592) '-P03-ATC CAG CGA 5'-P03-GCT GGA TAC 3. 45 (SEQ ID NO: 593) (SEQ ID NO: 594) 5 - P03-TGC GAA GGA 5'-P03-CTT CGC AAC 3. 46 (SEQ ID NO: 595) (SEQ ID NO: 596) 5'-P03-TGC GAA CGA 5'-P03-GTT CGC AAC 3. 47 (SEQ ID NO: 597) (SEQ ID NO: 598) 5'-P03-CTG GCT GGA 5'-P03-CAG CCA GAC 3. 48 (SEQ ID NO: 599) (SEQ ID NO: 600) 5'-P03-ACC ACC GGA 5'-P03-CGG TGT GAC 3. 49 (SEQ ID NO: 601) (SEQ ID NO: 602) - . 5 -P03-AGTGCA GGA 5'-P03-CTG CAC TAC 350 (SEQ ID NO: 603) (SEQ ID NO: 604) - . 5 - . 5 - . 5 - . 5 - . 5 - . 5 - . 5 - . 5 - . 5 - . 5 - . 5 - . 5 - . 5 -P03-GACCGTTGA 5'-P03-AAC GGTCAC 351 (SEQ ID NO: 605) (SEQ ID NO: 606) -PÜ3-GGTGAGTGA 5 -P03-ACTCAC CAC 352 (SEQ ID NO: 607) (SEQ ID NO: 608) -P03-CCT TCC TG? 5 -P03-AGG AAG GAC 353 (SEQ ID NO: 609) (SEQ ID NO: 610) 5 -P03-CTGGCTAGA 5 -P03-TAGCCA GAC 354 (SEQ ID NO: 11) (SEQ ID NO: 612) -PÜ3-CAC ACC AGA 5 -P03-TGG TGT GAC 355 (SEQ ID NO: 613) (SEQ ID NO.-614) -P03-AGC GGT? GA 5'-P03-TAC CGC TAC 356 (SEQ ID NO: 615) (SEQ ID NO: 616) -P03-GTC AGAGGA 5'-P03-CTC TGA CAC 357 (SEQ ID NO: 617) (SEQ ID NO: 618) -P03-TTC CGA CGA 5 -PÜ3-GTC GGA AAC 358 (SEQ ID NO: 619) (SEQ ID NO: 620) -P03-? GG CGT AGA 5'-P03-TAC GCC TAC 359 (SEQ ID NO: 621) (SEQ ID NO-622) -P03-CTC G? C TGA 5 -P03-AGT CGA GAC 360 (SEQ ID NO: 623) (SEQ ID NO: 624) 5 -P03-TAC GCT GGA 5 -P03-CAG CGT AAC 361 (SEQ ID NO.-625) (SEQ ID NO: 626) 5 -P03-GTTCGGTGA 5 -P03-? CC GAA CAC 362 (SEQ ID NO: 627) (SEQ ID NO: 628) 5-P3-GCC AGC AGA 5 -P03-TGC TGG C? C 363 (SEQ ID NO: 629) (SEQ ID NO: 630) 5 -P03-G? 
CCGTAGA 5, -P03-lAC GGTCAC 364 (SEQ ID NO: 631) (SEQ ID NO 632) 5 -P03-G7GCTCTG? 5 -P03-AGA GC? CAC 365 (SEQ ID NO: 633) (SEQ ID NO: 634) 5 -P03-GG T GAG CG? 5 -P03-GCT CAC CAC 366 (SEQ ID NO: 635) (SEQ ID NO.636) 5 -P03-GGI GAG AGA 5-P03-TCT CAC CAC 367 (SEQ ID NO.637) (SEQ ID No. 638) 5 -P03-CCTTCC AGA 5'-P03-TGG AAG GAC 368 (SEQ ID NO: 639) (SEQ ID NO.640) 5 -P03-C1C CTA CGA 5 -P03-GTA GGA G? C 369 (SEQ ID NO'641) (SEQ ID NO-642) 5 -P03-CTC G? C GGA 5'-P03-CGTCGA GAC 370 (SEQ ID NO.643) (SEQ ID NO: 644) -P03-GCC GTT TGA 5 -P03-AAA CGG C? C 371 (SEQ ID NO-645) (SEQ ID NO: 646) "-P03-GCG GAG TGA 5 -P03-ACTCCG CAC 372 (SEQ ID NO 647) (SEQ ID NO: 648) '-P03-CGT GCT TGA 5'-P03-AAG C? C GAC 373 (SEQ ID NO-649) (SEQ ID NO: 650) -P03-CTC GAC CGA 5 -P03-GG! CGA GAC 374 (SEQ ID NO: 651) (SEQ ID NO: 652) -P03-AG? GCA GGA 5 -P03-C1G CTC TAC 375 (SEQ ID NO.653) (SEQ ID NO.654) '-Pü GGGCTCGG? 5 -P03-CG? GCA C? C 76 (SEQ ID NO: 655) (SEQ ID NO: 656) -PQ3- CIC G? C? G? 5'-P0 • IGTCG? G? C 377 (SEQ ID NO: 657) (SEQ ID NO: 658) v-po GG? G? Ci IG? 5-PÜ3 C C r CTC C ?C 378 (S ?Q ID NO: 659) (SEQ ID NO: 660) '-l > () 3-? G C'I IG? 5 -PC •? C- \ GCC 1? C 379 (S? Q ID NO: 661) (SEQ ID NO: 662) 5'.PO -? G? GC? CG? . -POV • Gl '(i CK "1? C" i 80 (SEQ ID NO- 663) (SEQ ID NO: 664) 5 -P03- CC? L'CC IG? 5'-Pü3 •? G <;? IGG.? C 3 1 (SEQ ID NO: 665) (SEQ ID NO666) 5 -IO3- Gp CGG? G? 5"-P0 -TCCG ?? C? C 582 (SEQ ID NO: 667) (SEQ ID NO: 668.}. V. | > () VIGI? GCG? 5 - P03-GC 1 \?. \ C 8 (SEQ ID NO: 66) (S? Q ID NO: 670) 5'- P03- G1GCTCCG? 5 -PU -GG.? GC? C? C 38! SEQ ID NO- 671) (S? Q: D NO: 672) .POV t CTC? G? 5 'POMG? GC.? C? C 385 (SEQ ID NO: 673) (SEQ ID: 674) '.P () 3-GCC (i 1T G? 51-P () 3-C? CGG C? C 386 (SEQ ID NO-675) (SEQ ID NO: 676) - . 5 - . 5 - . 5 - . 5 -P03-G? G TGC tG? 5'-J'03 -.? GC? CI V? V 387 (SEQ ID NO: 677) (SEQ ID NO: 678) '-PO'-GCI CCI TG? 5"-P () 3 - ?? G (¡? (¡C? C 388 (SEQ ID NO: 679) (SEQ ID NO: 680) -1'03-CCG ??? GG.? 5 -P03-CIT ICG G? C 389 (SFQ ID NO: 681) (S? 
Q ID NO: 682) -P03-C? C I'G? GG? .V.PO.VO'C? G T G? C 390 (SEQ ID NO: 683) (SEQ ID NO: 684) 5-P () 3-CC T CiCl GG? 5 '-P03-C? G C? C G? C 391 (SEQ ID NO: 685) (SEQ ID NO: 686) . -POI-CCG ??? CG? .V-P03-GIT ICGG? C n (SEQ ID O-.687) (SEQ ID NO: 688) S -IO3-GCG? G? G? 5 -PÜ3- K T CCG C? C ')? (SEQ ID NO: 689) (SEQ ID NO.690) -POUiCC p? G \ 5'.P () 3-I ?? (GGC? C 394 (SEQ ID NO: 691) (SEQ ID NO: 692) V-P03-ICI CGT GG? 5 -PÜ3-C? C G? G? ? C 395 (S? Q ID NO- 693 > (S? Q ID NO -694) .V-P0.3-CGI GCT? G? 5'-Pü3- 1? G C? C G? C 39 (SEQ ID NO: 695! (SE'Q ID NO: 696) Table 6: Oligonucleotide labels used in cycle 4 Sequence Sequence Number of the Label Upper Strand Lower Strand 5'-P? 3-GCCJ ICp 5'.Pü3-G? C? GGCIC 41 (SEQ ID NO: 697) (SEQ ID NO: 698) < -P03-CICCTGGri •• P03-CC GG? CJIC 42 (SEQ ID NO: 699) (SEQ ID NO: 700) '-P03- ACTCTGCTT 5' -P03-GC A GAG TTC 4. 3 (SEQ ID N0: 701) (SEQ ID NO: 702) 5'-P03-CATCGCCTT 5-P03-GGC GAT GTC 4. 4 (SEQ ID NO: 703) (SEQ ID NO: 704) 5'-P03-GCCACTATT 5'-P03-TAGTGGCTC 4. 5 (SEQ ID NO: 705) (SEQ ID NO: 706) 5'-P03-CACACGGTT 5'-P03-CCG TGTGTC 4. 6 (SEQ ID NO: 707) (SEQ ID NO: 708) 5'-P03-CAACGCCTT 5-P03-GGC GTTGTC 4. 7 (SEQ ID NO: 709) (SEQ ID NO: 710) S'-POS-ACTGAGGTT 5'-P03-CCT CAG TTC 4. 8 (SEQ ID NO: 711) (SEQ ID NO-.712) 5'-P03-GTGCTGGTT 5'-P03-CCA GCA CTC 4. 9 (SEQ ID NO: 713) (SEQ ID NO: 714) 5'-P03-CATCGACTT 5-P03-GTC GAT GTC 4. 10 (SEQ ID NO: 715) (SEQ ID NO: 716) 5'-P03-CCATCGGTT 5'-P03-CCG ATG GTC 4. M (SEQ ID NO: 717) (SEQ ID NO: 718) '-P03-GCTGCACTT 5 * -P03-GTG CAG CTC 4.12 (SEQ ID NO: 719) (SEQ ID NO: 720) '-P03-ACAGAGGTT 5'-P03-CCT CTG TTC 4. 13 (SEQ ID NO: 721) (SEQ ID NO: 722) 5 * -P03-AGTGCCGTT 5'-P03-CGG CAC TTC 4. 14 (SEQ ID NO: 723) (SEQ ID NO: 724) 5'-P03-CGGACATTT 5'-P03-ATG TCC GTC 4. 15 (SEQ ID NO: 725) (SEQ ID NO: 726) 5'-P03-GGtCTGGTT 5'-P03-CCAGACCTC 4. 16 (SEQ ID NO: 727) (SEQ ID NO.-728) 5 * -P03-GAGACGGTT 5'-P03-CCG TCTCTC 4. 17 (SEQ ID NO: 729) (SEQ ID NO: 730) 5-P03-CTTTCCGTT 5'-P03-CGG AAA GTC 4. 
18 (SEQ ID NO: 731) (SEQ ID NO: 732) 5'-P03-CAGATGGTT 5 -P03-CC A TCT GTC 4. 19 (SEQ ID NO: 733) (SEQ ID NO: 734) 5'-P03-CGGACACTT 5'-P03-GTGTCCGTC 4. 20 (SEQ ID NO: 735) (SEQ ID NO: 736) 5"-P03-ACTCTCGTT 5'-P03-CGA GAG TTC 4. 21 (SEQ ID NO: 737) (SEQ ID NO: 738) 5'-P03-GCAGCACTT 5'-P03-GTG CTG CTC 4. 22 (SEQ ID NO: 739) (SEQ ID NO: 740) 5'-P03-ACTCTCCTT 5'-P03-GGAGAGTTC 4.23 (SEQ ID NO: 741) (SEQ ID NO: 742) '-P03-ACCTTGGTT 5'-P03-CCA AGG TTC 4. 440 (SEQ ID NO: 775) (SEQ ID NO: 776) '-P03-TACGCTCTT S'-POS-GAGCGTATC 441 (SEQ ID NO: 777) (SEQ ID NO: 778) '-P03-ACGGCAGTT 5'-P03-CTGCCGTTC 442 (SEQ ID NO: 779) (SEQ ID NO: 780) '-P03-ACTGACGTT 5'-P03-CGTCAGTTC 443 (SEQ ID NO: 781) (SEQ ID NO: 782) '-P03-ACGGCACTT 5'-P03-GTG CCG TTC 444 (SEQ ID NO: 783) (SEQ ID NO: 784) '-P03-ACTGACCTT 5, -P03-GGTCAGTTC 445 (SEQ ID NO: 785) (SEQ ID NO: 786) '-P03-TTTGCGGTT 5'-P03-CCG CAÁ ATC 446 (SEQ ID NO: 787) (SEQ ID NO: 788) '-P03-TGGTAGGTT 5'-P03-CCTACCATC 447 (SEQ ID NO: 789) (SEQ ID NO: 790) '-P03-GTTCGGCTT 5'-P03-GCCGAACTC 448 (SEQ ID NO: 791) (SEQ ID NO: 792) '-P03-GCCGTTCTT 5'-P03-GAACGGCTC 449 (SEQ ID NO: 793) (SEQ ID NO: 794) '-P03-GGAGAGGTT 5'-P03-CCTCTCCTC 450 (SEQ ID NO: 795) (SEQ ID NO: 796) '-P03-CACTGACTT 5'-P03-GTC AGTGTC 451 (SEQ ID NO: 797) (SEQ ID NO: 798) '-P03-CGTGCTCTT 5'-P03-GAGCACGTC 452 (SEQ ID NO: 799) (SEQ ID NO: 800) '-P03-AATCCGCTT 5'-P03-GCGGATTTC 453 (SEQ ID NO: 801) (SEQ ID NO: 802) '-P03-AGGCTGGTT 5'-P03-CCAGCCTTC 454 (SEQ ID NO.803) (SEQ ID NO.804) -P03 < iCtAGTGTT S -P03-CACTAGCTC 455 (SEQ ID NO: 805) (SEQ ID NO 806) 'POJ-G ÍAGAGCTT S' P03- < 3CTCTCCTC 456 (SEQ 10 NO: 807) (SEQ ID NO: 808) - P03-GGAGAGATT 5'.f »03-TCTCtCCTC 4. 57 (SEQ ID NO: 809) (SEQ ID NO: 810) 5-P03 AGOCTGCTT S'-KN-OC GCC TC 4. 58 (SEQ SO NO: 811) (SEQ ID NO: ßl2) 5-P03-GAGTGCGTT 5 -P03-CGC CTCTC 459 (SEQ ID NO: 813) (SEQ ID NO: 814) J-P03-CCATCC TT J. P03-TGG TGGTC 4.60 (SEQ ID NO: 81S) (SEQ ID NO: 816) -P? 3-ocrAGTctt 5 -P03-GACTAGCTC 4. 61 (SEQ ID NO: 817) (SEQ ID NO: 818) - . 
5 -P03-AGGCTGATT 5 - P03-TCA GCC TTC 462 (SEQ ID NO: 819) (SEQ ID NO: 820) 'P03-ACAGACGTT S -POS-CGTCTGTTC 4. 63 (SEQ ID NO: 821) (SEQ ID NO.-T22) 5, P03-0AGTGCCTT 5 -P03-CGCACTCTC 4. 64 (SEQ ID NO: 823) (SEQ ID NO: 824) S'- H CAGACCTT 5 'P03 < K3TCTGTTC 4. 65 (SEQ ID NO: 825) (SEQ ID NO: 826) 5 -P03-CGAGCTTTT 5 -P03.AAGCTCGTC 4. 66 (SEQ ID NO: 827) (SEQ ID NO: 82β) 5 -P03-TTAGCGGTT 5 -P03-CCGCT TC 4. 67 (SEQ ID NO: 829) (SEQ ID NO-.830) 5 -P03-CCTCTTGTT 5.P03-CA GAGGTC 468 (SEQ ID NO: 831) (SEQ ID N ° T32) '.P03 ^} GTCTCTTT 5 ') 3-AG GACCTC 4. 69 (SEQ ID NO: 833) (SEQ ID NO: 834) 5, P03-GCCAGATTT 5'-P03.ATC TGG CTC 4. 70 (SEQ ID NO: 835) (SEQ ID NO: 836) S'.POS-GAGACCTTT 5 -P03-AGGTCTCTC 471 (SEQ ID NO: 837) (SEQ ID NO: 838) . P03-CACACAGTT 5 'P03-CTGTGTGTC 4. 72 (SEQ ID NO: 839) (SEQ ID NO: 840) -P03-CCTCTTCTT 5 -P03-GAAGAGGTC 4. 73 (SEQ ID NO: 841) (SEQ ID NO: 842) 5-.P03-TAGAGCGTT S-P03-CGCTCT TC 4. 74 (SEQ ID NO: 8 3) (SEQ ID NO: 844) 5 -P03-GCACCTTTT 5 -PO3 AA0GTGCTC 4. 75 (SEQ ID NO: 84S) (SEQ ID NO: 846) 5 -P03-GGCTTGTTT S'-P03.ACAAGCCtC 4. 76 (SEQ ID NO: 847) (SBQ ID NO: 848) 5 -P03-GACGCGATT 5 *. | »03-tCGCGTCTC 4. 77 (SEQ ID NO: 849) (SBQ ID NO: 850) 5 -P03-CGAGCTGTT 5 -P03-CAGCTCGTC 4. 7S (SEQ ID NO: 851) (SEQ ID NO: 852) -P03TAGAGCCTT 5 -P03-GGCTCT TC 479 (SEQ ID NO: 853) (SEQ ID NO: 854) S'.POS-CATCCGTTT 5 -P03-ACGGATGTC 4. 80 (SEQ ID NO: 855) (S 4. 93 (SEQ ID NO: 881) (SEQ ID NO: 882) 5'-P03-CATAGGCTT 5'-P03-GCC TATGTC 4. 94 (SEQ ID NO: 883) (SEQ ID NO: 884) 5'-P03-CCTTCACTT 5'-P03-GTG AAG GTC 4. 95 (SEQ ID NO: 885) (SEQ ID NO: 886) 5'-P03-GCACCTCTT 5'-P03-GAGGTG CTC 4.96 (SEQ ID NO: 887) (SEQ ID NO: 888) Table 7: Correspondence between the building blocks and the oligonucleotide labels used in cycles 1-4. 
Building Block   Cycle 1   Cycle 2   Cycle 3   Cycle 4
BB1    1.1    2.1    3.1    4.1
BB2    1.2    2.2    3.2    4.2
BB3    1.3    2.3    3.3    4.3
BB4    1.4    2.4    3.4    4.4
BB5    1.5    2.5    3.5    4.5
BB6    1.6    2.6    3.6    4.6
BB7    1.7    2.7    3.7    4.7
BB8    1.8    2.8    3.8    4.8
BB9    1.9    2.9    3.9    4.9
BB10   1.10   2.10   3.10   4.10
BB11   1.11   2.11   3.11   4.11
BB12   1.12   2.12   3.12   4.12
BB13   1.13   2.13   3.13   4.13
BB14   1.14   2.14   3.14   4.14
BB15   1.15   2.15   3.15   4.15
BB16   1.16   2.16   3.16   4.16
BB17   1.17   2.17   3.17   4.17
BB18   1.18   2.18   3.18   4.18
BB19   1.19   2.19   3.19   4.19
BB20   1.20   2.20   3.20   4.20
BB21   1.21   2.21   3.21   4.21
BB22   1.22   2.22   3.22   4.22
BB23   1.23   2.23   3.23   4.23
BB24   1.24   2.24   3.24   4.24
BB25   1.25   2.25   3.25   4.25
BB26   1.26   2.26   3.26   4.26
BB27   1.27   2.27   3.27   4.27
BB28   1.28   2.28   3.28   4.28
BB29   1.29   2.29   3.29   4.29
BB30   1.30   2.30   3.30   4.30
BB31   1.31   2.31   3.31   4.31
BB32   1.32   2.32   3.32   4.32
BB33   1.33   2.33   3.33   4.33
BB34   1.34   2.34   3.34   4.34
BB35   1.35   2.35   3.35   4.35
BB36   1.36   2.36   3.36   4.36
BB37   1.37   2.37   3.37   4.37
BB38   1.38   2.38   3.38   4.38
BB39   1.39   2.39   3.39   4.39
BB40   1.44   2.44   3.44   4.44
BB41   1.41   2.41   3.41   4.41
BB42   1.42   2.42   3.42   4.42
BB43   1.43   2.43   3.43   4.43
BB44   1.40   2.40   3.40   4.40
BB45   1.45   2.45   3.45   4.45
BB46   1.46   2.46   3.46   4.46
BB47   1.47   2.47   3.47   4.47
BB48   1.48   2.48   3.48   4.48
BB49   1.49   2.49   3.49   4.49
BB50   1.50   2.50   3.50   4.50
BB51   1.51   2.51   3.51   4.51
BB52   1.52   2.52   3.52   4.52
BB53   1.53   2.53   3.53   4.53
BB54   1.54   2.54   3.54   4.54
BB55   1.55   2.55   3.55   4.55
BB56   1.56   2.56   3.56   4.56
BB57   1.57   2.57   3.57   4.57
BB58   1.58   2.58   3.58   4.58
BB59   1.59   2.59   3.59   4.59
BB60   1.60   2.60   3.60   4.60
BB61   1.61   2.61   3.61   4.61
BB62   1.62   2.62   3.62   4.62
BB63   1.63   2.63   3.63   4.63
BB64   1.64   2.64   3.64   4.64
BB65   1.65   2.65   3.65   4.65
BB66   1.66   2.66   3.66   4.66
BB67   1.67   2.67   3.67   4.67
BB68   1.68   2.68   3.68   4.68
BB69   1.69   2.69   3.69   4.69
BB70   1.70   2.70   3.70   4.70
BB71   1.71   2.71   3.71   4.71
BB72   1.72   2.72   3.72   4.72
BB73   1.73   2.73   3.73   4.73
BB74   1.74   2.74   3.74   4.74
BB75   1.75   2.75   3.75   4.75
BB76   1.76   2.76   3.76   4.76
BB77   1.77   2.77   3.77   4.77
BB78   1.78   2.78   3.78   4.78
BB79   1.79   2.79   3.79   4.79
BB80   1.80   2.80   3.80   4.80
BB81   1.81   2.81   3.81   4.81
BB82   1.82   2.82   3.82   4.82
BB83   1.96   2.96   3.96   4.96
BB84   1.83   2.83   3.83   4.83
BB85   1.84   2.84   3.84   4.84
BB86   1.85   2.85   3.85   4.85
BB87   1.86   2.86   3.86   4.86
BB88   1.87   2.87   3.87   4.87
BB89   1.88   2.88   3.88   4.88
BB90   1.89   2.89   3.89   4.89
BB91   1.90   2.90   3.90   4.90
BB92   1.91   2.91   3.91   4.91
BB93   1.92   2.92   3.92   4.92
BB94   1.93   2.93   3.93   4.93
BB95   1.94   2.94   3.94   4.94
BB96   1.95   2.95   3.95   4.95

1X ligase buffer: 50 mM Tris, pH 7.5; 10 mM dithiothreitol; 10 mM MgCl2; 2 mM ATP; 50 mM NaCl.
10X ligase buffer: 500 mM Tris, pH 7.5; 100 mM dithiothreitol; 100 mM MgCl2; 20 mM ATP; 500 mM NaCl.

Water Soluble Spacer Linkage to Compound 2

To a solution of Compound 2 (60 mL, 1 mM) in sodium borate buffer (150 mM, pH 9.4) that had been cooled to 4 °C were added 40 equivalents of N-Fmoc-15-amino-4,7,10,13-tetraoxaoctadecanoic acid (S-Ado) in N,N-dimethylformamide (DMF) (16 mL, 0.15 M), followed by 40 equivalents of 4-(4,6-dimethoxy[1.3.5]triazin-2-yl)-4-methylmorpholinium chloride (DMTMM) in water (9.6 mL, 0.25 M). The mixture was gently shaken for 2 hours at 4 °C before an additional 40 equivalents of S-Ado and DMTMM were added, and shaking was continued for a further 16 hours at 4 °C. After acylation, a 0.1X volume of 5 M aqueous NaCl and a 2.5X volume of cold (-20 °C) ethanol were added and the mixture was allowed to stand at -20 °C for at least one hour. The mixture was then centrifuged for 15 minutes at 14,000 rpm in a 4 °C centrifuge to give a white pellet, which was washed with cold EtOH and then dried in a lyophilizer at room temperature for 30 minutes.
The solid was dissolved in 40 mL of water and purified by reverse-phase HPLC on a Waters Xterra RP18 column. A binary mobile-phase gradient was used to elute the product, using 50 mM aqueous triethylammonium acetate buffer at pH 7.5 and a 99% acetonitrile/1% water solution. The purified material was concentrated by lyophilization and the resulting residue was dissolved in 5 mL of water. A 0.1X volume of piperidine was added to the solution and the mixture was gently shaken for 45 minutes at room temperature. The product was then purified by ethanol precipitation as described above and isolated by centrifugation. The resulting pellet was washed twice with cold EtOH and dried by lyophilization to give purified Compound 3.

Cycle 1

To each well of a 96-well plate were added 12.5 μL of a 4 mM solution of Compound 3 in water and 100 μL of a 1 mM solution of one of the oligonucleotide tags 1.1 to 1.96, as shown in Table 3 (the molar ratio of Compound 3 to tag was 1:2). The plates were heated at 95 °C for 1 minute and then cooled to 16 °C over 10 minutes. To each well were added 10 μL of 10X ligase buffer, 30 units of T4 DNA ligase (1 μL of a 30 units/μL solution (Fermentas Life Science, Cat. No. EL0013)) and 76.5 μL of water, and the resulting solutions were incubated at 16 °C for 16 hours. After the ligation reaction, 20 μL of 5 M aqueous NaCl was added directly to each well, followed by 500 μL of cold (-20 °C) ethanol, and the plates were kept at -20 °C for 1 hour.
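The per-well amounts in the Cycle 1 ligation can be checked with a few lines of arithmetic. The following is only an illustrative sketch: it takes the Compound 3 stock to be 4 mM, which is the value consistent with the stated 1:2 molar ratio, and the helper name `nmol` and the dictionary keys are hypothetical.

```python
# Per-well ligation amounts for Cycle 1 (volumes in µL, concentrations in mM).
# ASSUMPTION: Compound 3 stock is 4 mM (consistent with the stated 1:2 ratio).
# All names here are illustrative, not taken from the source.

def nmol(volume_ul: float, conc_mm: float) -> float:
    """Amount of substance in nmol, since µL x mmol/L = nmol."""
    return volume_ul * conc_mm

# Volumes added to each well, as listed in the procedure.
components_ul = {
    "compound_3": 12.5,
    "tag": 100.0,
    "ligase_buffer_10x": 10.0,
    "t4_dna_ligase": 1.0,
    "water": 76.5,
}

compound_3_nmol = nmol(components_ul["compound_3"], 4.0)
tag_nmol = nmol(components_ul["tag"], 1.0)
total_ul = sum(components_ul.values())

# Ethanol precipitation: 0.1 volume of 5 M NaCl and 2.5 volumes of ethanol.
nacl_ul = 0.1 * total_ul
ethanol_ul = 2.5 * total_ul

print(f"Compound 3: {compound_3_nmol:.0f} nmol, tag: {tag_nmol:.0f} nmol")
print(f"Reaction volume: {total_ul:.1f} µL")
print(f"NaCl: {nacl_ul:.0f} µL, ethanol: {ethanol_ul:.0f} µL")
```

Under these assumptions the reaction volume is 200 μL, so the 20 μL of NaCl and 500 μL of ethanol quoted in the text correspond to the usual 0.1 and 2.5 volumes.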
The plates were centrifuged for 1 hour at 3200g in a Beckman Coulter Allegra 6R centrifuge using Beckman Microplus Carriers. The supernatant was carefully removed by inverting the plate, and the pellet was washed with cold (-20 °C) 70% aqueous ethanol. Each pellet was then dissolved in sodium borate buffer (50 μL, 150 mM, pH 9.4) to a concentration of 1 mM and cooled to 4 °C. To each solution were added 40 equivalents of one of the 96 building block precursors in DMF (13 μL, 0.15 M), followed by 40 equivalents of DMTMM in water (8 μL, 0.25 M), and the solutions were gently shaken at 4 °C. After 2 hours, an additional 40 equivalents of the same building block precursor and of DMTMM were added to each well, and the solutions were gently shaken for 16 hours at 4 °C. After acylation, 10 equivalents of acetic acid N-hydroxysuccinimide ester in DMF (2 μL, 0.25 M) were added to each solution, and the solutions were gently shaken for 10 minutes.

The 96 reaction mixtures were then combined, 0.1 volume of 5 M aqueous NaCl and 2.5 volumes of cold absolute ethanol were added, and the solution was allowed to stand at -20 °C for at least one hour. The mixture was then centrifuged. After centrifugation, as much supernatant as possible was removed with a micropipette, and the pellet was washed with cold ethanol and centrifuged again. The supernatant was removed with a 200 μL pipette. Cold 70% ethanol was added to the tube, and the resulting mixture was centrifuged for 5 minutes at 4 °C. The supernatant was removed and the remaining ethanol was removed by lyophilization at room temperature for 10 minutes.

The pellet was then dissolved in 2 mL of water and purified by reverse-phase HPLC on a Waters Xterra RP18 column. A binary mobile-phase gradient was used to elute the library, using 50 mM aqueous triethylammonium acetate buffer at pH 7.5 and a 99% acetonitrile/1% water solution. Fractions containing the library were collected, pooled, and lyophilized.
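The reagent equivalents quoted for the per-well acylation above can be reproduced from the stated volumes and concentrations. This is a minimal sketch, assuming each well holds 50 μL of a 1 mM solution (that is, the roughly 50 nmol of Compound 3 carried through from the ligation); the function name `equivalents` is hypothetical.

```python
# Reagent equivalents for the per-well acylation step.
# ASSUMPTION: each well contains 50 µL of a 1 mM DNA conjugate (50 nmol).
# Function and variable names are illustrative only.

def equivalents(reagent_ul: float, reagent_molar: float, substrate_nmol: float) -> float:
    """Molar equivalents of reagent relative to substrate.

    µL x mol/L gives µmol; multiplying by 1000 converts to nmol.
    """
    reagent_nmol = reagent_ul * reagent_molar * 1000.0
    return reagent_nmol / substrate_nmol

substrate_nmol = 50.0 * 1.0  # 50 µL x 1 mM

building_block_eq = equivalents(13.0, 0.15, substrate_nmol)
dmtmm_eq = equivalents(8.0, 0.25, substrate_nmol)
capping_eq = equivalents(2.0, 0.25, substrate_nmol)

print(f"Building block precursor: {building_block_eq:.0f} equiv.")
print(f"DMTMM: {dmtmm_eq:.0f} equiv.")
print(f"Acetic acid NHS ester: {capping_eq:.0f} equiv.")
```

With these inputs the building block comes out at about 39 equivalents, which the text rounds to 40, while DMTMM and the capping reagent come out at 40 and 10 equivalents.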
The resulting residue was dissolved in 2.5 mL of water and 250 μL of piperidine was added. The solution was gently shaken for 45 minutes and then precipitated with ethanol as described above. The resulting pellet was dried by lyophilization and then dissolved in sodium borate buffer (4.8 mL, 150 mM, pH 9.4) to a concentration of 1 mM. The solution was cooled to 4 °C, and 40 equivalents each of N-Fmoc-propargylglycine in DMF (1.2 mL, 0.15 M) and DMTMM in water (7.7 mL, 0.25 M) were added. The mixture was gently shaken for 2 hours at 4 °C before an additional 40 equivalents of N-Fmoc-propargylglycine and DMTMM were added, and the solution was shaken for an additional 16 hours. The mixture was then purified by EtOH precipitation and reverse-phase HPLC as described above, and the N-Fmoc group was removed by piperidine treatment as described above. After the final purification by EtOH precipitation, the resulting pellet was dried by lyophilization and carried into the next cycle of synthesis.

Cycles 2-4

For each of these cycles, the dried pellet from the previous cycle was dissolved in water, and the library concentration was determined by spectrophotometry based on the extinction coefficient of the DNA component of the library, where the initial extinction coefficient of Compound 2 is 131,500 L/(mol·cm). The concentration of the library was adjusted with water so that the final concentration in the subsequent ligation reactions was 0.25 mM. The library was then divided into 96 equal aliquots in a 96-well plate. A solution comprising a different tag was added to each well (the molar ratio of library to tag was 1:2), and the ligations were performed as described for Cycle 1. The oligonucleotide tags used in Cycles 2, 3 and 4 are set forth in Tables 4, 5 and 6, respectively. The correspondence between the tags and the building block precursors for each of Cycles 1 to 4 is given in Table 7.
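The spectrophotometric concentration measurement described above follows the Beer-Lambert law, c = A / (ε·l). The sketch below is illustrative only. The absorbance value, the 1 cm path length and the dilution factor are assumed inputs, and the extinction coefficient is left as a parameter because the value for the library at a given cycle would also include the tags already ligated; the Compound 2 value from the text is used purely as a default.

```python
# Beer-Lambert estimate of library concentration, plus a simple dilution helper.
# ASSUMPTIONS: absorbance is read on the DNA component (typically near 260 nm),
# path length is 1 cm, and epsilon is supplied by the user for the cycle in hand.

EPSILON_COMPOUND_2 = 131_500.0  # L/(mol*cm), initial value quoted in the text

def concentration_mm(absorbance: float,
                     epsilon: float = EPSILON_COMPOUND_2,
                     path_cm: float = 1.0,
                     dilution_factor: float = 1.0) -> float:
    """Concentration in mM of the undiluted sample: c = A * dilution / (epsilon * l)."""
    molar = absorbance * dilution_factor / (epsilon * path_cm)
    return molar * 1000.0

def water_to_add_ul(volume_ul: float, conc_mm: float, target_mm: float) -> float:
    """Volume of water (µL) needed to dilute a solution down to target_mm."""
    if target_mm > conc_mm:
        raise ValueError("target concentration exceeds current concentration")
    return volume_ul * (conc_mm / target_mm - 1.0)

# Hypothetical reading: A = 0.263 measured on a 100-fold dilution.
stock_mm = concentration_mm(0.263, dilution_factor=100.0)
print(f"Estimated concentration: {stock_mm:.3f} mM")
print(f"Water to add to 100 µL of 1.0 mM to reach 0.5 mM: "
      f"{water_to_add_ul(100.0, 1.0, 0.5):.0f} µL")
```

Keeping ε as an argument makes it straightforward to substitute a cycle-specific value once the contribution of each added tag is known.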
The library was precipitated by the addition of ethanol as described above for Cycle 1, and dissolved in sodium borate buffer (150 mM, pH 9.4) to a concentration of 1 mM. Subsequent acylations and purifications were performed as described for Cycle 1, except that the HPLC purification was omitted during Cycle 3. The products of Cycle 4 were ligated with the closing primer shown below, using the method described above for tag ligation.

5'-PO3-CAG AAG ACA GAC AAG CTT CAC CTG C (SEQ ID NO: 889)
5'-PO3-GCA GGT GAA GCT TGT CTG TCT TCT GAA (SEQ ID NO: 890)

Results:

The synthetic procedure described above is capable of producing a library comprising 96^4 (approximately 10^8) different structures. The synthesis of the library was monitored by gel electrophoresis and LC/MS of the product of each cycle. After completion, the library was analyzed using various techniques. Figure 13a is a chromatogram of the library after Cycle 4 but before ligation of the closing primer; Figure 13b is a mass spectrum of the library at the same synthetic stage. The average molecular weight was determined by negative-ion LC/MS analysis. The ion signal was deconvoluted using ProMass software. This result is consistent with the predicted average mass of the library. The DNA component of the library was analyzed by agarose gel electrophoresis, which showed that most of the library material corresponds to ligated product of the correct size. DNA sequence analysis of molecular clones of the PCR product derived from a sampling of the library shows that the DNA ligation occurred with high fidelity and to near completion.
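The strand pairing of the closing primer and of the cycle tags can be checked by comparing each upper strand with the reverse complement of its lower strand. The sketch below is illustrative: the function names are hypothetical, the two-base 3' overhang lengths are inferred from the listed sequences rather than stated in the text, and the closing-primer upper strand is read as ...AAG ACA GAC..., treating the accented character in the listing as a plain A.

```python
# Check duplex pairing of tag and closing-primer strands (all written 5'->3').
# ASSUMPTION: strands pair once a short 3' overhang is trimmed; overhang
# lengths below are inferred from the sequences, not stated in the source.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def clean(seq: str) -> str:
    """Drop the spaces used to group bases and normalize case."""
    return seq.replace(" ", "").upper()

def reverse_complement(seq: str) -> str:
    return clean(seq).translate(COMPLEMENT)[::-1]

def core_pairs(upper: str, lower: str, upper_overhang: int, lower_overhang: int) -> bool:
    """True if the strands are complementary after trimming 3' overhangs."""
    upper, lower = clean(upper), clean(lower)
    upper_core = upper[:len(upper) - upper_overhang]
    lower_core = lower[:len(lower) - lower_overhang]
    return upper_core == reverse_complement(lower_core)

closing_upper = "CAG AAG ACA GAC AAG CTT CAC CTG C"       # SEQ ID NO: 889
closing_lower = "GCA GGT GAA GCT TGT CTG TCT TCT GAA"     # SEQ ID NO: 890
tag_4_4_upper, tag_4_4_lower = "CATCGCCTT", "GGC GAT GTC"  # Table 6, tag 4.4

print(core_pairs(closing_upper, closing_lower, 0, 2))
print(core_pairs(tag_4_4_upper, tag_4_4_lower, 2, 2))

# The closing primer's lower-strand 3' overhang should pair with the
# upper-strand 3' overhang left by a cycle 4 tag.
closing_overhang = clean(closing_lower)[-2:]
tag_overhang = clean(tag_4_4_upper)[-2:]
print(reverse_complement(closing_overhang) == tag_overhang)
```

Read this way, the closing primer's lower strand carries a two-base 3' overhang (AA) that matches the TT overhang at the 3' end of the cycle 4 upper strands, which is what allows it to be joined by the same ligation method as the tags.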
Cyclization of the library

At the end of Cycle 4, a portion of the library was capped at the N-terminus with azidoacetic acid under the usual acylation conditions. The product, after purification by EtOH precipitation, was dissolved in sodium phosphate buffer (150 mM, pH 8) to a concentration of 1 mM, and 4 equivalents each of CuSO4 in water (200 mM) and ascorbic acid in water (200 mM), together with a catalytic amount of the compound shown below as a solution in DMF (200 mM), were added. The reaction mixture was then gently shaken for 2 hours at room temperature.

To assay the extent of cyclization, 5 μL aliquots of the library cyclization reaction were removed and treated with a fluorescently labeled azide or alkyne (1 μL of 100 mM DMF stocks) prepared as described in Example 4. After 16 hours, HPLC analysis at 500 nm showed that neither the alkyne nor the azide label had been incorporated into the library. This result indicated that the library no longer contained azide or alkyne groups capable of cycloaddition and that the library must therefore have reacted with itself, either through cyclization or through intermolecular reactions. The cyclized library was purified by reverse-phase HPLC as described above. Control experiments using the uncyclized library showed complete incorporation of the fluorescent labels mentioned above.
Example 4: Preparation of Fluorescent Labels for Cyclization Assay: In separate tubes, propargylglycine or 2-amino-3-phenylpropylazide (8 μmol each) was combined with FAM-OSu (Molecular Probes Inc.) (1.2 equiv.) in borate buffer pH 9.4 (250 μL). The reactions were allowed to proceed for 3 h at room temperature, and were then lyophilized overnight. Purification by HPLC gave the desired fluorescent alkyne and azide in quantitative yield.
Example 5: Cyclization of individual compounds using the azide/alkyne cycloaddition reaction
Preparation of Azidoacetyl-Gly-Pro-Phe-Pra-NH2: Using 0.3 mmol of Rink-amide resin, the indicated sequence was synthesized using standard solid-phase synthesis techniques with Fmoc-protected amino acids and HATU as the activating agent (Pra = C-propargylglycine). Azidoacetic acid was used to cap the tetrapeptide. The peptide was cleaved from the resin with 20% TFA/DCM for 4 h. Purification by RP HPLC afforded the product as a white solid (75 mg, 51%). 1H NMR (DMSO-d6, 400 MHz): 8.4 - 7.8 (m, 3H), 7.4 - 7.1 (m, 7H), 4.6 - 4.4 (m, 1H), 4.4 - 4.2 (m, 2H), 4.0 - 3.9 (m, 2H), 3.74 (dd, 1H, J = 6 Hz, 17 Hz), 3.5 - 3.3 (m, 2H), 3.07 (dt, 1H, J = 5 Hz, 14 Hz), 2.92 (dd, 1H, J = 5 Hz, 16 Hz), 2.86 (t, 1H, J = 2 Hz), 2.85 - 2.75 (m, 1H), 2.6 - 2.4 (m, 2H), 2.2 - 1.6 (m, 4H). IR (mull) 2900, 2100, 1450, 1300 cm^-1. ESIMS 497.4 ([M+H], 100%), 993.4 ([2M+H], 50%). ESIMS with ion-source fragmentation: 519.3 ([M+Na], 100%), 491.3 (100%), 480.1 ([M-NH2], 90%), 452.2 ([M-NH2-CO], 20%), 424.2 (20%), 385.1 ([M-Pra], 50%), 357.1 ([M-Pra-CO], 40%), 238.0 ([M-Pra-Phe], 100%).
Cyclization of Azidoacetyl-Gly-Pro-Phe-Pra-NH2: The azidoacetyl peptide (31 mg, 0.062 mmol) was dissolved in MeCN (30 mL). Diisopropylethylamine (DIEA, 1 mL) and Cu(MeCN)4PF6 (1 mg) were added. After stirring for 1.5 h, the solution was evaporated and the resulting residue was taken up in 20% MeCN/H2O. After centrifugation to remove insoluble salts, the solution was subjected to preparative reverse-phase HPLC. The desired cyclic peptide was isolated as a white solid (10 mg, 32%). 1H NMR (DMSO-d6, 400 MHz): 8.28 (t, 1H, J = 5 Hz), 7.77 (s, 1H), 7.2 - 6.9 (m, 9H), 4.98 (m, 2H), 4.48 (m, 1H), 4.28 (m, 1H), 4.1 - 3.9 (m, 2H), 3.63 (dd, 1H, J = 5 Hz, 16 Hz), 3.33 (m, 2H), 3.0 (m, 3H), 2.48 (dd, 1H, J = 11 Hz, 14 Hz), 1.75 (m, 1H), 1.55 (m, 1H), 1.32 (m, 1H), 1.05 (m, 1H). IR (mull) 2900, 1475, 1400 cm^-1. ESIMS 497.2 ([M+H], 100%), 993.2 ([2M+H], 30%), 1015.2 ([2M+Na], 15%). ESIMS with ion-source fragmentation: 535.2 (70%), 519.3 ([M+Na], 100%), 497.2 ([M+H], 80%), 480.1 ([M-NH2], 30%), 452.2 ([M-NH2-CO], 40%), 208.1 (60%).
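Because the intramolecular azide/alkyne cycloaddition is an isomerization, the linear and cyclic tetrapeptides are expected to show the same [M+H], [2M+H] and sodium-adduct peaks, as the ESIMS data above do. The sketch below computes those adduct m/z values in Python. The molecular formula C23H28N8O5 is our own tally of the named residues and is not stated in the source; the atomic masses are standard monoisotopic values, and the function names are illustrative.

```python
import re

# Standard monoisotopic atomic masses (Da).
MONOISOTOPIC = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915}
PROTON = 1.007276       # mass of H+ (hydrogen atom minus one electron)
SODIUM_ION = 22.989221  # mass of Na+ (sodium atom minus one electron)


def monoisotopic_mass(formula):
    """Sum monoisotopic masses for a simple formula string such as 'C23H28N8O5'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def adduct_mz(mass, n_molecules=1, adduct=PROTON):
    """m/z of a singly charged positive ion [nM + adduct]+."""
    return n_molecules * mass + adduct


# Assumed formula for Azidoacetyl-Gly-Pro-Phe-Pra-NH2 (and its cyclic isomer).
M = monoisotopic_mass("C23H28N8O5")
for label, mz in [
    ("[M+H]", adduct_mz(M)),
    ("[2M+H]", adduct_mz(M, 2)),
    ("[M+Na]", adduct_mz(M, 1, SODIUM_ION)),
    ("[2M+Na]", adduct_mz(M, 2, SODIUM_ION)),
]:
    print(f"{label}: {mz:.2f}")
```

The computed values can be compared against the reported peaks near 497, 993, 519 and 1015; small differences are expected from instrument calibration and rounding.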
Preparation of Azidoacetyl-Gly-Pro-Phe-Pra-Gly-OH: Using 0.3 mmol of Glycine-Wang resin, the indicated sequence was synthesized using Fmoc-protected amino acids and HATU as the activating agent. Azidoacetic acid was used in the last coupling step to cap the pentapeptide. Cleavage of the peptide was achieved using 50% TFA/DCM for 2 h. Purification by RP HPLC gave the peptide as a white solid (83 mg, 50%). 1H NMR (DMSO-d6, 400 MHz): 8.4 - 7.9 (m, 4H), 7.2 (m, 5H), 4.7 - 4.2 (m, 3H), 4.0 - 3.7 (m, 4H), 3.5 - 3.3 (m, 2H), 3.1 (m, 1H), 2.91 (dd, 1H, J = 4 Hz, 16 Hz), 2.84 (t, 1H, J = 2.5 Hz), 2.78 (m, 1H), 2.6 - 2.4 (m, 2H), 2.2 - 1.6 (m, 4H). IR (mull) 2900, 2100, 1450, 1350 cm^-1. ESIMS 555.3 ([M+H], 100%). ESIMS with ion-source fragmentation: 577.1 ([M+Na], 90%), 555.3 ([M+H], 80%), 480.1 ([M-Gly], 100%), 385.1 ([M-Gly-Pra], 70%), 357.1 ([M-Gly-Pra-CO], 40%), 238.0 ([M-Gly-Pra-Phe], 80%).
Cyclization of Azidoacetyl-Gly-Pro-Phe-Pra-Gly-OH: The peptide (32 mg, 0.058 mmol) was dissolved in MeCN (60 mL). Diisopropylethylamine (1 mL) and Cu(MeCN)4PF6 (1 mg) were added and the solution was stirred for 2 h. The solvent was evaporated and the crude product was subjected to RP HPLC to remove dimers and trimers. The cyclic monomer was isolated as a colorless glass (6 mg, 20%). ESIMS 555.6 ([M+H], 100%), 1109.3 ([2M+H], 20%), 1131.2 ([2M+Na], 15%). ESIMS with ion-source fragmentation: 555.3 ([M+H], 100%), 480.4 ([M-Gly], 30%), 452.2 ([M-Gly-CO], 25%), 424.5 ([M-Gly-2CO], 10%, only possible in a cyclic structure).
Conjugation of Linear Peptide to DNA: Compound 2 (45 nmol) was dissolved in 45 μL of sodium borate buffer (pH 9.4, 150 mM). At 4 °C, the linear peptide (18 μL of a 100 mM stock in DMF, 1800 nmol, 40 equiv.) was added, followed by DMT-MM (3.6 μL of a 500 mM stock in water, 1800 nmol, 40 equiv.). After stirring for 2 h, LCMS showed complete reaction, and the product was isolated by ethanol precipitation. ESIMS 1823.0 ([M-3H]/3, 20%), 1367.2 ([M-4H]/4, 20%), 1093.7 ([M-5H]/5, 40%), 911.4 ([M-6H]/6, 100%).
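The ESIMS peaks listed for the conjugate form a negative-ion charge-state series, so each peak gives an independent estimate of the neutral mass M. The following Python sketch back-calculates M from each peak, assuming the labels mean m/z = (M - z x 1.00728)/z for charge z; the function name is illustrative and no predicted mass is taken from the source.

```python
PROTON = 1.00728  # mass of a proton in Da


def neutral_mass(mz, charge):
    """Neutral mass M from a negative-ion peak [M - zH]/z observed at m/z."""
    return charge * (mz + PROTON)


# Charge state -> observed m/z, as listed for the peptide-DNA conjugate.
peaks = {3: 1823.0, 4: 1367.2, 5: 1093.7, 6: 911.4}

estimates = {z: neutral_mass(mz, z) for z, mz in peaks.items()}
mean_mass = sum(estimates.values()) / len(estimates)

for z, mass in estimates.items():
    print(f"z = {z}: M = {mass:.1f} Da")
print(f"mean M = {mean_mass:.1f} Da")
```

Agreement of the four estimates to within a few daltons is what indicates that the peaks belong to a single species rather than to a mixture.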
Conjugation of Cyclic Peptide to DNA: Compound 2 (20 nmol) was dissolved in 20 μL of sodium borate buffer (pH 9.4, 150 mM). At 4 °C, the cyclic peptide (8 μL of a 100 mM stock in DMF, 800 nmol, 40 equiv.) was added, followed by DMT-MM (1.6 μL of a 500 mM stock in water, 800 nmol, 40 equiv.). After stirring for 2 h, LCMS showed complete reaction, and the product was isolated by ethanol precipitation. ESIMS 1823.0 ([M-3H]/3, 20%), 1367.2 ([M-4H]/4, 20%), 1093.7 ([M-5H]/5, 40%), 911.4 ([M-6H]/6, 100%).
Cyclization of DNA-Linked Peptide: The linear peptide-DNA conjugate (10 nmol) was dissolved in sodium phosphate buffer pH 8 (10 μL, 150 mM). At room temperature, 4 equivalents each of CuSO4, ascorbic acid, and the Sharpless ligand were added (0.2 μL each of 200 mM stocks). The reaction was allowed to proceed overnight. RP HPLC showed that no linear peptide-DNA remained, and that the product co-eluted with authentic cyclic peptide-DNA. No traces of dimers or other oligomers were observed. (Scheme labels: elutes @ 4.48 min; elutes @ 4.27 min.)
LC conditions: Targa C18, 2.1 x 40 mm, 10-40% MeCN in 40 mM aq. buffer over 8 min.
Example 6: Application of Nucleophilic Aromatic Substitution Reactions to Functional Portion Synthesis
General Procedure for Arylation of Compound 2 with Cyanuric Chloride: Compound 2 is dissolved in sodium borate buffer pH 9.4 at a concentration of 1 mM. The solution is cooled to 4 °C and 20 equivalents of cyanuric chloride are then added as a 500 mM solution in MeCN. After 2 h, complete reaction is confirmed by LCMS and the resulting dichlorotriazine-DNA conjugate is isolated by ethanol precipitation.
Procedure for Amine Substitution of Dichlorotriazine-DNA: The dichlorotriazine-DNA conjugate is dissolved in borate buffer pH 9.5 at a concentration of 1 mM. At room temperature, 40 equivalents of an aliphatic amine are added as a solution in DMF. The reaction is monitored by LCMS and is usually complete after 2 h. The resulting alkylamino-monochlorotriazine-DNA conjugate is isolated by ethanol precipitation.
Procedure for Amine Substitution of Monochlorotriazine-DNA: The alkylamino-monochlorotriazine-DNA conjugate is dissolved in borate buffer pH 9.5 at a concentration of 1 mM. At 42 °C, 40 equivalents of a second aliphatic amine are added as a solution in DMF. The reaction is monitored by LCMS and is usually complete after 2 h. The resulting diaminotriazine-DNA conjugate is isolated by ethanol precipitation.
Example 7: Application of Reductive Amination Reactions to Functional Portion Synthesis
General Procedure for Reductive Amination of DNA-Linker Containing a Secondary Amine with an Aldehyde Building Block: Compound 2 was coupled to an N-terminal proline residue. The resulting compound was dissolved in sodium phosphate buffer (50 μL, 150 mM, pH 5.5) at a concentration of 1 mM. To this solution were added 40 equivalents each of an aldehyde building block in DMF (8 μL, 0.25 M) and sodium cyanoborohydride in DMF (8 μL, 0.25 M), and the solution was heated at 80 °C for 2 hours. After alkylation, the solution was purified by ethanol precipitation.
General Procedure for Reductive Amination of DNA-Linker Containing an Aldehyde with Amine Building Blocks: Compound 2 coupled to a building block comprising an aldehyde group was dissolved in sodium phosphate buffer (50 μL, 250 mM, pH 5.5) at a concentration of 1 mM. To this solution were added 40 equivalents each of an amine building block in DMF (8 μL, 0.25 M) and sodium cyanoborohydride in DMF (8 μL, 0.25 M), and the solution was heated at 80 °C for 2 hours. After alkylation, the solution was purified by ethanol precipitation.
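The reagent volumes in these general procedures follow from the amount of DNA conjugate, the number of equivalents, and the stock concentration. The Python sketch below shows the arithmetic, assuming that the 50 μL stated for the buffer is also the volume of the 1 mM conjugate solution; the function and parameter names are illustrative.

```python
def reagent_volume_ul(conjugate_volume_ul, conjugate_mM, equivalents, stock_M):
    """Volume of reagent stock (in microliters) that delivers the requested equivalents."""
    conjugate_nmol = conjugate_volume_ul * conjugate_mM  # uL x mM = nmol
    reagent_nmol = equivalents * conjugate_nmol
    stock_mM = stock_M * 1000.0
    return reagent_nmol / stock_mM                       # nmol / mM = uL


# 50 uL of 1 mM conjugate, 40 equivalents, 0.25 M stock in DMF.
volume = reagent_volume_ul(50, 1.0, 40, 0.25)
print(f"{volume:.1f} uL")
```

With these inputs the result reproduces the 8 μL quoted for both the building block and the sodium cyanoborohydride stock.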
Example 8: Application of Peptoid Building Reactions to Functional Portion Synthesis
General Procedure for Peptoid Synthesis on DNA-Linker: Compound 2 was dissolved in sodium borate buffer (50 μL, 150 mM, pH 9.4) at a concentration of 1 mM and cooled to 4 °C. To this solution were added 40 equivalents of N-hydroxysuccinimidyl bromoacetate in DMF (13 μL, 0.15 M), and the solution was gently shaken at 4 °C for 2 hours. After acylation, the DNA-linker was purified by ethanol precipitation, redissolved in sodium borate buffer (50 μL, 150 mM, pH 9.4) at a concentration of 1 mM, and cooled to 4 °C. To this solution were added 40 equivalents of an amine building block in DMF (13 μL, 0.15 M), and the solution was gently shaken at 4 °C for 16 hours. After alkylation, the DNA-linker was purified by ethanol precipitation, redissolved in sodium borate buffer (50 μL, 150 mM, pH 9.4) at a concentration of 1 mM, and cooled to 4 °C. Peptoid synthesis was continued by the stepwise addition of N-hydroxysuccinimidyl bromoacetate followed by the addition of an amine building block.
Example 9: Application of the Azide-Alkyne Cycloaddition Reaction to Functional Portion Synthesis
General Procedure: An alkyne-containing DNA conjugate is dissolved in phosphate buffer pH 8.0 at a concentration of approximately 1 mM. To this mixture are added 10 equivalents of an organic azide and 5 equivalents each of copper(II) sulfate, ascorbic acid, and the ligand tris-((1-benzyltriazol-4-yl)methyl)amine, all at room temperature. The reaction is followed by LCMS, and is usually complete after 1-2 h. The resulting triazole-DNA conjugate can be isolated by ethanol precipitation.
Example 10: Identification of a ligand for Abl kinase from within an encoded library
The ability to enrich molecules of interest in a DNA-encoded library over undesired library members is essential for identifying single compounds with defined properties against therapeutic targets of interest. To demonstrate this enrichment ability, a known binding molecule (described by Shah et al., Science 305, 399-401 (2004), incorporated herein by reference) for rhAbl kinase (GenBank U07563) was synthesized. This compound was attached to a double-stranded DNA oligonucleotide via the linker described in the preceding examples, using standard chemical methods, to produce a molecule (a functional portion linked to an oligonucleotide) similar to those produced by the methods described in Examples 1 and 2. A library generally produced as described in Example 2 and the DNA-linked Abl kinase binder were designed with unique DNA sequences that allowed qPCR analysis of both species. The DNA-linked Abl kinase binder was mixed with the library at a ratio of 1:1000. This mixture was equilibrated with rhAbl kinase, and the enzyme was captured on a solid phase, washed to remove unbound library members, and the binding molecules were eluted. The ratio of library molecules to the DNA-linked Abl kinase inhibitor in the eluate was 1:1, indicating a greater than 500-fold enrichment of the DNA-linked Abl kinase binder in a 1000-fold excess of library molecules.
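The enrichment figure can be expressed as the ratio of the binder-to-library ratio after selection to the same ratio before selection. The Python sketch below applies that definition to the nominal ratios quoted above; the function name is illustrative, and the definition itself is our assumption about how the fold enrichment was derived from the qPCR data.

```python
from fractions import Fraction


def enrichment_factor(ratio_before, ratio_after):
    """Fold enrichment: binder-to-library ratio after selection over the ratio before."""
    return ratio_after / ratio_before


# Binder : library was 1:1000 in the input mixture and 1:1 in the eluate.
before = Fraction(1, 1000)
after = Fraction(1, 1)

fold = enrichment_factor(before, after)
print(f"{float(fold):.0f}-fold enrichment")
```

The nominal ratios give a 1000-fold enrichment, which is consistent with the more conservative "greater than 500-fold" stated in the text.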
Equivalents
Those skilled in the art will recognize, or will be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. Such equivalents are intended to be encompassed by the following claims.

Claims (129)

  1. A method for synthesizing a molecule comprising a functional portion which is operably linked to an encoding oligonucleotide, the method comprising the steps of: (a) providing an initiator compound consisting of an initial functional portion comprising n building blocks, where n is an integer of 1 or greater, wherein the initial functional portion comprises at least one reactive group and is operably linked to an initial oligonucleotide; (b) reacting the initiator compound with a building block comprising at least one complementary reactive group, wherein the at least one complementary reactive group is complementary to the reactive group of step (a), under conditions suitable for reaction of the reactive group and the complementary reactive group to form a covalent bond; (c) reacting the initial oligonucleotide with an incoming oligonucleotide which identifies the building block of step (b) in the presence of an enzyme that catalyzes the ligation of the initial oligonucleotide and the incoming oligonucleotide, under conditions suitable for ligation of the incoming oligonucleotide and the initial oligonucleotide to form an encoding oligonucleotide; thereby producing a molecule comprising a functional portion comprising n + 1 building blocks which is operably linked to an encoding oligonucleotide.
  2. The method of Claim 1 wherein the functional portion of step (c) comprises a reactive group, and steps (a) to (c) are repeated one or more times, thereby forming cycles 1 to i, where i is an integer of 2 or greater, wherein the product of step (c) of a cycle s, where s is an integer of i-1 or less, is the initiator compound of cycle s + 1.
  3. The method of Claim 1 wherein step (c) precedes step (b) or step (b) precedes step (c).
  4. The method of Claim 1 wherein at least one of the building blocks is an amino acid or an activated amino acid.
  5. The method of Claim 1 wherein the reactive group and the complementary reactive group are selected from the group consisting of an amino group; a carboxyl group; a sulfonyl group; a phosphonyl group; an epoxide group; an aziridine group; and an isocyanate group.
  6. The method of Claim 1 wherein the reactive group and the complementary reactive group are selected from the group consisting of a hydroxyl group; a carboxyl group; a sulfonyl group; a phosphonyl group; an epoxide group; an aziridine group; and an isocyanate group.
  7. The method of Claim 1 wherein the reactive group and the complementary reactive group are selected from the group consisting of an amino group and an aldehyde or ketone group.
  8. The method of Claim 7 wherein the reaction between the reactive group and the complementary reactive group is conducted under reducing conditions.
  9. The method of Claim 1 wherein the reactive group and the complementary reactive group are selected from the group consisting of a phosphorus ylide group and an aldehyde or ketone group.
  10. The method of Claim 1 wherein the reactive group and the complementary reactive group react by cycloaddition to form a cyclic structure.
  11. The method of Claim 10 wherein the reactive group and the complementary reactive group are selected from the group consisting of an alkyne and an azide.
  12. The method of Claim 10 wherein the reactive group and the complementary reactive group are selected from the group consisting of a halogenated heteroaromatic group and a nucleophile.
  13. The method of Claim 12 wherein the halogenated heteroaromatic group is selected from the group consisting of chlorinated pyrimidines, chlorinated triazines and chlorinated purines.
  14. The method of Claim 12 wherein the nucleophile is an amino group.
  15. The method of Claim 1, wherein the enzyme is selected from the group consisting of a DNA ligase, an RNA ligase, a DNA polymerase, an RNA polymerase and a topoisomerase.
  16. The method of Claim 1 wherein the initial oligonucleotide is double-stranded or single-stranded.
  17. The method of Claim 16 wherein the initial oligonucleotide comprises a PCR primer sequence.
  18. The method of Claim 16 wherein the initial oligonucleotide is single-stranded and the incoming oligonucleotide is single-stranded; or the initial oligonucleotide is double-stranded and the incoming oligonucleotide is double-stranded.
  19. The method of Claim 18 wherein the initial functional portion and the initial oligonucleotide are joined by a linking portion.
  20. The method of Claim 19 wherein the initial oligonucleotide is double-stranded and the linking portion is covalently coupled to the initial functional portion and to both strands of the initial oligonucleotide.
  21. The method of Claim 1 wherein the incoming oligonucleotide is from 3 to 10 nucleotides in length.
  22. The method of Claim 2, wherein the incoming oligonucleotide of cycle i comprises a PCR closing primer.
  23. The method of Claim 2, further comprising, after cycle i, the step of (d) ligating an oligonucleotide comprising a PCR closing primer sequence to the encoding oligonucleotide.
  24. The method of Claim 23 wherein the oligonucleotide comprising a PCR closing primer sequence is ligated to the encoding oligonucleotide in the presence of an enzyme that catalyzes ligation.
  25. The method of Claim 2, further comprising, after cycle i, the step of (e) cyclizing the functional portion.
  26. The method of Claim 25 wherein the functional portion comprises an alkynyl group and an azido group, and the compound is subjected to conditions suitable for the cycloaddition of the alkynyl group and the azido group to form a triazole group, thereby cyclizing the functional portion.
  27. A method for synthesizing a library of compounds, wherein the compounds comprise a functional portion comprising two or more building blocks which is operatively linked to an initial oligonucleotide which identifies the structure of the functional portion, the method comprising the steps of (a) providing a solution comprising m initiator compounds, wherein m is an integer of 1 or greater, wherein the initiator compounds consist of a functional portion comprising n building blocks, where n is an integer of 1 or greater, which is operatively linked to an initial oligonucleotide which identifies the n building blocks; (b) dividing the solution of step (a) into r reaction vessels, wherein r is an integer of 2 or greater, thereby producing r aliquots of the solution; (c) reacting the initiator compounds in each reaction vessel with one of r building blocks, thereby producing aliquots comprising compounds consisting of a functional portion comprising n + 1 building blocks operatively linked to the initial oligonucleotide; and (d) reacting the initial oligonucleotide in each aliquot with one of a set of r distinct incoming oligonucleotides in the presence of an enzyme that catalyzes the ligation of the incoming oligonucleotide and the initial oligonucleotide, under conditions suitable for the enzymatic ligation of the incoming oligonucleotide and the initial oligonucleotide; thereby producing aliquots comprising molecules consisting of a functional portion comprising n + 1 building blocks operatively linked to an elongated oligonucleotide which encodes the n + 1 building blocks.
  28. The method of Claim 27, further comprising the step of (e) combining two or more of the aliquots, thereby producing a solution comprising molecules consisting of a functional portion comprising n + 1 building blocks, which is operatively linked to an elongated oligonucleotide which encodes the n + 1 building blocks.
  29. The method of Claim 28 wherein r aliquots are combined.
  30. The method of Claim 28 wherein steps (a) to (e) are conducted one or more times to yield cycles 1 to i, where i is an integer of 2 or greater, wherein in cycle s + 1, where s is an integer of i-1 or less, the solution comprising m initiator compounds of step (a) is the solution of step (e) of cycle s.
  31. The method of Claim 7 or Claim 8 wherein, in at least one of cycles 1 to i, step (d) precedes step (c).
  32. The method of Claim 28 wherein at least one of the building blocks is an amino acid.
  33. The method of Claim 7, wherein the enzyme is DNA ligase, RNA ligase, DNA polymerase, RNA polymerase or topoisomerase.
  34. The method of Claim 28 wherein the initial oligonucleotide is a double-stranded oligonucleotide.
  35. The method of Claim 34 wherein the incoming oligonucleotide is a double-stranded oligonucleotide.
  36. The method of Claim 28 wherein the initiator compounds comprise a linker portion comprising a first functional group adapted to bind to a building block, a second functional group adapted to bind to the 5' end of an oligonucleotide, and a third functional group adapted to bind to the 3' end of an oligonucleotide.
  37. The method of Claim 36 wherein the linking portion is of the structure wherein A is a functional group adapted to bind to a building block; B is a functional group adapted to bind to the 5' end of an oligonucleotide; C is a functional group adapted to bind to the 3' end of an oligonucleotide; S is an atom or a scaffold; D is a chemical structure that connects A to S; E is a chemical structure that connects B to S; and F is a chemical structure that connects C to S.
  38. The method of Claim 37 wherein: A is an amino group; B is a phosphate group; and C is a phosphate group.
  39. The method of Claim 37 wherein D, E and F are each, independently, an alkylene group or an oligo(ethylene glycol) group.
  40. The method of Claim 37 wherein S is a carbon atom, a nitrogen atom, a phosphorus atom, a boron atom, a phosphate group, a cyclic group or a polycyclic group.
  41. The method of Claim 40 wherein the linking portion is of the structure wherein each of n, m and p is, independently, an integer from 1 to about 20.
  42. The method of Claim 41 wherein each of n, m and p is independently an integer from 2 to 8.
  43. The method of Claim 42 wherein each of n, m and p is independently an integer from 3 to 6.
  44. The method of Claim 41 wherein the linking portion has the structure
  45. The method of Claim 27, wherein each of the initiator compounds comprises a reactive group and wherein each of the r building blocks comprises a complementary reactive group which is complementary to the reactive group.
  46. The method of Claim 45 wherein the reactive group and the complementary reactive group are selected from the group consisting of an amino group; a carboxyl group; a sulfonyl group; a phosphonyl group; an epoxide group; an aziridine group; and an isocyanate group.
  47. The method of Claim 45 wherein the reactive group and the complementary reactive group are selected from the group consisting of a hydroxyl group; a carboxyl group; a sulfonyl group; a phosphonyl group; an epoxide group; an aziridine group; and an isocyanate group.
  48. The method of Claim 45 wherein the reactive group and the complementary reactive group are selected from the group consisting of an amino group and an aldehyde or ketone group.
  49. The method of Claim 45 wherein the reaction between the reactive group and the complementary reactive group is conducted under reducing conditions.
  50. The method of Claim 45 wherein the reactive group and the complementary reactive group are selected from the group consisting of a phosphorus ylide group and an aldehyde or ketone group.
  51. The method of Claim 45 wherein the reactive group and the complementary reactive group react by cycloaddition to form a cyclic structure.
  52. The method of Claim 51 wherein the reactive group and the complementary reactive group are selected from the group consisting of an alkyne and an azide.
  53. The method of Claim 45 wherein the reactive group and the complementary reactive group are selected from the group consisting of a halogenated heteroaromatic group and a nucleophile.
  54. The method of Claim 53 wherein the halogenated heteroaromatic group is selected from the group consisting of chlorinated pyrimidines, chlorinated triazines and chlorinated purines.
  55. The method of Claim 53 wherein the nucleophile is an amino group.
  56. The method of Claim 28, further comprising, after cycle i, the step of: (f) cyclizing one or more of the functional portions.
  57. The method of Claim 56 wherein a functional portion of step (f) comprises an azido group and an alkynyl group.
  58. The method of Claim 57 wherein the functional portion is maintained under conditions suitable for the cycloaddition of the azido group and the alkynyl group to form a triazole group, thereby forming a cyclic functional portion.
  59. The method of Claim 58 wherein the cycloaddition reaction is conducted in the presence of a copper catalyst.
  60. The method of Claim 59 wherein at least one of the one or more functional portions of step (f) comprises at least two sulfhydryl groups, and the functional portion is maintained under conditions suitable for the reaction of the two sulfhydryl groups to form a disulfide group, thereby cyclizing the functional portion.
  61. The method of Claim 27 wherein the initial oligonucleotide comprises a PCR primer sequence.
  62. The method of Claim 28, wherein the incoming oligonucleotide of cycle i comprises a PCR closing primer.
  63. The method of Claim 28, further comprising, after cycle i, the step of (d) ligating an oligonucleotide comprising a PCR closing primer sequence to the encoding oligonucleotide.
  64. The method of Claim 63 wherein the oligonucleotide comprising a PCR closing primer sequence is ligated to the encoding oligonucleotide in the presence of an enzyme that catalyzes ligation.
  65. A compound of the formula wherein: X is a functional portion comprising one or more building blocks; Z is an oligonucleotide attached at its 3' terminus to B; Y is an oligonucleotide which is attached at its 5' terminus to C; A is a functional group that forms a covalent bond with X; B is a functional group that forms a bond with the 3' end of Z; C is a functional group that forms a bond with the 5' end of Y; D, F and E are each, independently, a bifunctional linking group; and S is an atom or a molecular scaffold.
  66. The compound of Claim 65 wherein D, E and F are each independently an alkylene chain or an oligo(ethylene glycol) chain.
  67. The compound of Claim 65, wherein Y and Z are substantially complementary and are oriented in the compound so as to enable Watson-Crick base pairing and duplex formation under suitable conditions.
  68. The compound of Claim 65 wherein Y and Z are of the same length or different lengths.
  69. The compound of Claim 68 wherein Y and Z are of the same length.
  70. The compound of Claim 65, wherein Y and Z are each 10 or more bases in length and have complementary regions of ten or more base pairs.
  71. The compound of Claim 65, wherein S is a carbon atom, a boron atom, a nitrogen atom, a phosphorus atom, or a polyatomic scaffold.
  72. The compound of Claim 71 wherein S is a phosphate group or a cyclic group.
  73. The compound of Claim 72 wherein S is a cycloalkyl, cycloalkenyl, heterocycloalkyl, heterocycloalkenyl, aryl or heteroaryl group.
  74. The compound of Claim 65 wherein the linking portion is of the structure wherein each of n, m and p is, independently, an integer from 1 to about 20.
  75. The compound of Claim 74 wherein each of n, m and p is independently an integer from 2 to 8.
  76. The compound of Claim 75 wherein each of n, m and p is independently an integer from 3 to 6.
  77. The compound of Claim 65 wherein the linking portion has the structure
  78. The compound of Claim 65 wherein X and Y comprise a PCR primer sequence.
  79. A library of compounds comprising at least about 10^2 distinct compounds, the compounds comprising a functional portion comprising two or more building blocks which is operatively linked to an oligonucleotide which identifies the structure of the functional portion.
  80. The library of compounds of Claim 79, wherein the library comprises at least about 10^5 copies of each of the distinct compounds.
  81. The library of compounds of Claim 79, wherein the library comprises at least about 10^6 copies of each of the distinct compounds.
  82. The library of compounds of Claim 79 comprising at least about 10^4 distinct compounds.
  83. The library of compounds of Claim 79 comprising at least about 10^6 distinct compounds.
  84. The library of compounds of Claim 79 comprising at least about 10^8 distinct compounds.
  85. The library of compounds of Claim 79 comprising at least about 10^10 distinct compounds.
  86. The library of compounds of Claim 79 comprising at least about 10^12 distinct compounds.
  87. The library of compounds of Claim 79 wherein the library comprises a multiplicity of compounds which are independently of Formula I: wherein: X is a functional portion comprising one or more building blocks; Z is an oligonucleotide attached at its 3' terminus to B; Y is an oligonucleotide which is attached at its 5' terminus to C; A is a functional group that forms a covalent bond with X; B is a functional group that forms a bond with the 3' end of Z; C is a functional group that forms a bond with the 5' end of Y; D, F and E are each, independently, a bifunctional linking group; and S is an atom or a molecular scaffold.
  88. The library of compounds of Claim 87 wherein A, B, C, D, E, F and S each have the same identity for each compound of Formula I.
  89. The library of compounds of Claim 87, wherein the library consists essentially of a multiplicity of compounds of Formula I.
  90. The library of compounds of Claim 87 wherein D, E and F are each independently an alkylene chain or an oligo(ethylene glycol) chain.
  91. The library of compounds of Claim 87, wherein Y and Z are substantially complementary and are oriented in the compound so as to enable Watson-Crick base pairing and duplex formation under suitable conditions.
  92. The library of compounds of Claim 87 wherein Y and Z are of the same length or different lengths.
  93. The library of compounds of Claim 87 wherein Y and Z are of the same length.
  94. The library of compounds of Claim 87, wherein Y and Z are each 10 or more bases in length and have complementary regions of ten or more base pairs.
  95. The library of compounds of Claim 87, wherein S is a carbon atom, a boron atom, a nitrogen atom, a phosphorus atom, or a polyatomic scaffold.
  96. The library of compounds of Claim 87 wherein S is a phosphate group or a cyclic group.
  97. The library of compounds of Claim 96 wherein S is a cycloalkyl, cycloalkenyl, heterocycloalkyl, heterocycloalkenyl, aryl or heteroaryl group.
  98. The library of compounds of Claim 7 wherein the linking portion is of the structure wherein each of n, m and p is, independently, an integer from 1 to about 20.
  99. The library of compounds of Claim 98 wherein each of n, m and p is independently an integer from 2 to 8.
  100. The compound of Claim 99 wherein each of n, m and p is independently an integer from 3 to 6.
  101. The compound of Claim 87 wherein the linking portion has the structure
  102. The library of Claim 87 wherein X and Z comprise a PCR primer sequence.
  103. A compound prepared by the method of Claim 1.
  104. A library of compounds prepared by the method of Claim 27.
  105. A method for identifying one or more compounds that bind to a biological target, the method comprising the steps of: (a) contacting the biological target with a library of compounds prepared by the method of Claim 27 under conditions suitable for at least one member of the library of compounds to bind to the target; (b) removing library members that do not bind to the target; (c) amplifying the encoding oligonucleotides of the at least one member of the library of compounds that binds to the target; (d) sequencing the encoding oligonucleotides of step (c); and (e) using the sequences determined in step (d) to determine the structure of the functional portions of the members of the library of compounds which bind to the biological target; thereby identifying one or more compounds that bind to the biological target.
  106. A method for identifying a compound that binds to a biological target, the method comprising the steps of (a) contacting the biological target with a library of compounds comprising at least about 10^2 distinct compounds, the compounds comprising a functional portion comprising two or more building blocks which is operatively linked to an oligonucleotide which identifies the structure of the functional portion, under conditions suitable for at least one member of the library of compounds to bind to the target; (b) removing library members that do not bind to the target; (c) amplifying the encoding oligonucleotides of the at least one member of the library of compounds that binds to the target; (d) sequencing the encoding oligonucleotides of step (c); and (e) using the sequences determined in step (d) to determine the structure of the functional portions of the members of the library of compounds which bind to the biological target; thereby identifying one or more compounds that bind to the biological target.
  107. 107. The method of Claim 106 wherein the library comprises at least about 10^5 copies of each of the different compounds.
  108. 108. The method of Claim 106 wherein the library comprises at least about 10^6 copies of each of the different compounds.
  109. 109. The method of Claim 106 wherein the library comprises at least about 10^4 different compounds.
  110. 110. The method of Claim 106 wherein the library comprises at least about 10^6 different compounds.
  111. 111. The method of Claim 106 wherein the library comprises at least about 10^8 different compounds.
  112. 112. The method of Claim 106 wherein the library comprises at least about 10^10 different compounds.
  113. 113. The method of Claim 106 wherein the library of compounds comprises at least about 10^12 different compounds.
  114. 114. The method of Claim 106 wherein the library of compounds comprises a multiplicity of compounds which are independently of Formula I: wherein: X is a functional portion comprising one or more building blocks; Z is an oligonucleotide attached at its 3' terminus to B; Y is an oligonucleotide attached at its 5' terminus to C; A is a functional group that forms a covalent bond with X; B is a functional group that forms a bond with the 3' end of Z; C is a functional group that forms a bond with the 5' end of Y; D, F and E are each, independently, a bifunctional linking group; and S is an atom or a molecular scaffold.
  115. 115. The method of Claim 114 wherein A, B, C, D, E, F and S each have the same identity for each compound of Formula I.
  116. 116. The method of Claim 114 wherein the library of compounds consists essentially of a multiplicity of compounds of Formula I.
  117. 117. The method of Claim 114 wherein D, E and F are each independently an alkylene chain or an oligo(ethylene glycol) chain.
  118. 118. The method of Claim 114, wherein Y and Z are substantially complementary and oriented in the compound to enable Watson-Crick base pairing and duplex formation under suitable conditions.
  119. 119. The method of claim 114 wherein Y and Z are of the same length or different lengths.
  120. 120. The method of Claim 119 wherein Y and Z are of the same length.
  121. 121. The method of claim 114, wherein Y and Z are each 10 or more bases in length and have complementary regions of ten or more base pairs.
  122. 122. The method of Claim 114, wherein S is a carbon atom, a boron atom, a nitrogen atom, a phosphorus atom, or a polyatomic scaffold.
  123. 123. The method of Claim 114 wherein S is a phosphate group or a cyclic group.
  124. 124. The method of Claim 123 wherein S is a cycloalkyl, cycloalkenyl, heterocycloalkyl, heterocycloalkenyl, aryl or heteroaryl group.
  125. 125. The method of Claim 114 wherein the linking portion is of the structure wherein each of n, m and p is, independently, an integer from 1 to about 20.
  126. 126. The method of Claim 125 wherein each of n, m and p is independently an integer from 2 to 8.
  127. 127. The method of Claim 126 wherein each of n, m and p is independently an integer from 3 to 6.
  128. 128. The method of Claim 127 wherein the linking portion has the structure
  129. 129. The method of claim 114 wherein X and Z comprise a primer sequence for PCR.
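
Step (e) of Claims 105 and 106 uses the sequenced oligonucleotides to recover the structure of each functional portion. The sketch below illustrates one way such a lookup could work. It is not taken from the claims: it assumes fixed-length tags with one tag per synthesis cycle, and the codebook, tag sequences, building-block names and function names are all hypothetical.

```python
from collections import Counter

# Hypothetical codebook: one table per synthesis cycle, mapping a tag
# sequence to a building-block name. All tags and names are invented.
CODEBOOK = [
    {"ACGT": "BB1-a", "TGCA": "BB1-b"},  # cycle 1
    {"GGAA": "BB2-a", "CCTT": "BB2-b"},  # cycle 2
]
TAG_LEN = 4  # assumed fixed tag length, in bases


def decode(read):
    """Return the building blocks encoded by a read, or None if it cannot be decoded."""
    if len(read) != TAG_LEN * len(CODEBOOK):
        return None
    blocks = []
    for cycle, table in enumerate(CODEBOOK):
        tag = read[cycle * TAG_LEN:(cycle + 1) * TAG_LEN]
        if tag not in table:
            return None  # unknown tag, e.g. a sequencing error
        blocks.append(table[tag])
    return tuple(blocks)


def tally(reads):
    """Count how often each decoded building-block combination appears."""
    counts = Counter()
    for read in reads:
        blocks = decode(read)
        if blocks is not None:
            counts[blocks] += 1
    return counts


if __name__ == "__main__":
    reads = ["ACGTCCTT", "ACGTCCTT", "TGCAGGAA", "NNNNNNNN"]
    for blocks, n in tally(reads).most_common():
        print(n, blocks)
```

In this sketch, reads that cannot be decoded are discarded rather than corrected, and the combinations that occur most often after selection would be the candidates to examine further.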
Application MXMX/A/2007/015543A, priority date 2005-06-09, filing date 2007-12-07: Methods for synthesis of encoded libraries, MX2007015543A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US60/689,466 2005-06-09
US60/731,041 2005-10-28

Publications (1)

Publication Number Publication Date
MX2007015543A (en) 2008-09-02

Similar Documents

Publication Publication Date Title
AU2004299145B2 (en) Methods for synthesis of encoded libraries
DK179064B1 (en) Methods for synthesizing encoded libraries
US7972994B2 (en) Methods for synthesis of encoded libraries
CA2626325A1 (en) Methods for identifying compounds of interest using encoded libraries
MX2007015543A (en) Methods for synthesis of encoded libraries
AU2011205057B2 (en) Methods for synthesis of encoded libraries
AU2011205058A1 (en) Methods for synthesis of encoded libraries