WO2009067657A2 - Methods of identifying molecular function - Google Patents
Methods of identifying molecular function
- Publication number
- WO2009067657A2 (application PCT/US2008/084329)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- compounds
- chemical building
- chemical
- display
- building block
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/1034—Isolating an individual clone by screening libraries
- C12N15/1058—Directional evolution of libraries, e.g. evolution of libraries is achieved by mutagenesis and screening or selection of mixed population of organisms
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B40/00—Libraries per se, e.g. arrays, mixtures
- C40B40/04—Libraries containing only organic compounds
- C40B40/06—Libraries containing nucleotides or polynucleotides, or derivatives thereof
- C40B40/08—Libraries containing RNA or DNA which encodes proteins, e.g. gene libraries
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B50/00—Methods of creating libraries, e.g. combinatorial synthesis
- C40B50/04—Methods of creating libraries, e.g. combinatorial synthesis using dynamic combinatorial chemistry techniques
Definitions
- the present invention relates generally to the field of chemistry. More specifically, the invention pertains to methods for identifying molecular functions and interactions between compounds and molecular diversity in a sample using continuously-varied, patterned displays.
- a method of identifying molecular function comprises contacting a multi-dimensional, continuously-varying display with one or more target molecules, the display having a continuously varying pattern of compounds attached to the surface.
- the method further entails detecting an interaction between the target molecules and the compounds and identifying those compounds that interact with the target molecules.
- the interaction with the target molecules is indicative of one or more characteristics of the compounds that affect a particular function of the compounds.
- the method includes the step of identifying one or more modifications to one or more characteristics of the compounds so as to identify additional compounds that have altered interactions with the target molecules.
- the method also entails preparing a display having a predetermined, continuously varying pattern of compounds having a modified structure based on the identifications of the previous step.
- the steps of the method can be repeated.
- the method identifies a range of compounds and each compound optionally having one or more modifications altering one or more characteristics that affect a particular function.
- the steps of the method are repeated one or more times.
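The repeated cycle described above (prepare a display, contact it with the target, detect and identify interacting compounds, then modify them for the next display) can be sketched as a simple loop. This is only an illustrative sketch under stated assumptions: `mutate`, `evolve`, and `toy_signal` are hypothetical names, the "measurement" is a stand-in scoring function rather than a real assay, and the hidden optimum sequence is invented for the demonstration.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def mutate(seq, rng):
    """Return a copy of seq with one randomly chosen position substituted."""
    pos = rng.randrange(len(seq))
    choices = [a for a in AMINO_ACIDS if a != seq[pos]]
    return seq[:pos] + rng.choice(choices) + seq[pos + 1:]

def evolve(start, measure, rounds=5, display_size=50, keep=5, seed=0):
    """Iterate: prepare display, contact and detect, identify, modify."""
    rng = random.Random(seed)
    selected = [start]
    for _ in range(rounds):
        # Prepare a display from the identified compounds plus modified variants.
        display = set(selected)
        while len(display) < display_size:
            display.add(mutate(rng.choice(selected), rng))
        # "Contact" the display with the target and rank by detected signal
        # (ties are broken alphabetically so the run is reproducible).
        ranked = sorted(display, key=lambda s: (-measure(s), s))
        # Identify the compounds with the strongest interaction.
        selected = ranked[:keep]
    return selected

def toy_signal(seq, optimum="ARRNHSTILGK"):
    """Hypothetical stand-in for a measured signal: matches to a hidden optimum."""
    return sum(a == b for a, b in zip(seq, optimum))

best = evolve("ARRNHSTVLGK", toy_signal)
print(best[0], toy_signal(best[0]))
```

Because each round keeps the previously identified compounds on the next display, the best score in this sketch can never decrease from one iteration to the next.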
- the characteristic affecting a function of the compound is structure, charge, hydrophobicity, aromaticity, sequence, or a combination thereof.
- the compound is a peptide or protein having an amino acid sequence.
- a starting peptide or protein is modified by the addition, substitution or deletion of one or more amino acids.
- the peptide or protein is modified to alter the charge of at least one amino acid.
- the peptide or protein is modified by a functionally conservative substitution of one or more amino acids in the peptide or protein sequence.
- the peptide or protein is modified by at least one non-conservative amino acid substitution.
- the interaction between the target molecules and the compounds is selected from the group consisting of receptor-ligand binding, enzyme-cofactor binding, enzyme-substrate interactions, enzymatic reactions, catalytic reactions, van der Waals forces, London forces, ionic bonds, hydrogen bonds, and covalent bonds.
- the target molecule is a ligand, peptidomimetic compound, a nucleic acid, a carbohydrate, a protein, a glycoprotein, a peptide, an antibody, an aptamer, a therapeutic agent, a prodrug, or an inhibitor.
- the compound is a nucleic acid.
- the compound is a carbohydrate, an antibody, an aptamer, a therapeutic agent, or a prodrug.
- the compounds are generated in situ on the surface of the display.
- predetermined mixtures of compounds each optionally varied at one or more positions within the sequence, are attached to the surface.
- the compounds are peptides or proteins, and the one or more sequence variations are conservative amino acid substitutions.
- the compounds are peptides or proteins, and the mixtures are attached at loci on the surface.
- each locus has a predetermined mixture of peptides or proteins having sequence variations that are conservative at certain loci and non-conservative at other loci.
- each compound is modified at one or more positions. In other embodiments, one or more modifications at each position do not greatly affect the function of the compound.
- the compound is a protein or peptide. In more particular embodiments, the compound catalyzes a reaction. In even more particular embodiments, the compound is a peptide that reacts with a target molecule. In other embodiments, compounds are identified that alter the properties of a surface.
- a multi-dimensional, continuously-varied display is provided. The display comprises a substrate to which a plurality of compound species are attached at loci to provide a continuously varying pattern.
- the display has two or more dimensions, including spatial dimensions, that are orthogonal and defined by variations in one or more of the compound densities on the surface, the distribution of compound species on the surface, or the mixture of compound species on the surface.
- each compound is comprised of a sequence of chemical building blocks and the plurality of compounds on the surface comprises one or more distinct sequences.
- the multi-dimensional, continuously-varied display further comprises a chemical dimension.
- the compounds define the chemical dimension on the display, and each compound comprises a plurality of chemical building blocks interacting to form a sequence. Each chemical building block in the sequence represents a point in the chemical dimension on the display. In addition, each chemical building block in the sequence is optionally varied such that at least one chemical building block in each compound differs from the other compounds at a particular locus.
- the one or more target molecules are contacted to the display as a mixture of target molecules. In other embodiments, the mixture of target molecules is differentially operably linked to a detectable label such that each target molecule is individually detected.
- each locus encompasses an area of less than or equal to 0.1 mm².
- the loci are contiguous with one another on the display.
- the compounds are attached to the substrate by non-covalent interactions or covalent bonds.
- the compounds are attached to the substrate via linker groups.
- the chemical building blocks are amino acids.
- each chemical building block comprises a plurality of amino acids.
- the composition further comprises a display in which the concentration of compounds on the substrate is varied. In particular embodiments, at least one of the compounds is varied by at least 10% over an area of 0.01 cm² of the substrate.
- the density of compounds on the substrate is about 10,000 unique compounds/mm² to 10,000,000 compounds/mm².
- the compounds are distributed on the display such that the distribution of compounds continuously varies between loci on the display to form the continuously varying pattern.
- a multi-dimensional, continuously varying display is provided, the display comprising a substrate to which a plurality of compounds are attached.
- each compound comprises a plurality of chemical building blocks interacting to form a sequence for each compound, and are attached at loci on the display to provide a continuously varying pattern.
- the loci on the area define the pattern, and the pattern has a complexity of greater than or equal to 10¹² as determined by mathematical complexity analysis.
- the display further comprises a chemical dimension in which the compounds define the chemical dimension on the display, and each compound comprises a plurality of chemical building blocks interacting to form a sequence.
- each chemical building block in the sequence represents a point in the chemical dimension on the display in which each chemical building block in the sequence is optionally varied such that at least one chemical building block in each compound differs from the other compounds at a particular locus.
- the complexity is between 10⁵ and 10¹².
- the composition of the display continuously varies by at least 10% over an area of at least 0.1 mm².
- each locus encompasses an area of less than or equal to 0.1 mm².
- the compounds are attached to the substrate by covalent bonds or non-covalent interactions.
- the chemical building blocks are amino acids.
- the concentration of compounds on the substrate is about 10,000 compounds/mm² to 10,000,000 compounds/mm².
- the composition of the pattern is measured at a resolution of between 0.003 mm and 1 mm.
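As a quick arithmetic check on the figures above, the stated surface concentrations can be combined with the maximum locus area to estimate how many compounds a single locus might hold. The helper name `compounds_per_locus` is hypothetical, and the calculation assumes the concentration is uniform across the locus.

```python
def compounds_per_locus(density_per_mm2, locus_area_mm2):
    """Number of compounds in a locus, assuming uniform surface density."""
    return density_per_mm2 * locus_area_mm2

LOCUS_AREA_MM2 = 0.1  # upper bound on locus area stated in the text

low = compounds_per_locus(10_000, LOCUS_AREA_MM2)
high = compounds_per_locus(10_000_000, LOCUS_AREA_MM2)
print(f"{low:,.0f} to {high:,.0f} compounds per locus of {LOCUS_AREA_MM2} mm^2")
```

Under these assumptions a locus at the maximum area would carry roughly 1,000 to 1,000,000 compounds.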
- a method of making a multi-dimensional, continuously-varied display comprises attaching a plurality of compounds that comprise a plurality of chemical building blocks in situ on a substrate at a plurality of loci to produce a continuously varying pattern.
- the compounds define a chemical dimension and each chemical building block defines a point within the chemical dimension that is optionally varied so that at least one chemical building block in each compound differs from the other compounds at a particular locus.
- each chemical building block is an amino acid, a peptide, or a nucleic acid.
- each compound is attached to a linker.
- each locus encompasses an area of less than or equal to 0.1 mm².
- the method further comprises attaching a first chemical building block to a proportion of attachment sites coated on the substrate at a locus.
- the first chemical building block has a first protecting group bound to a first reactive moiety on the first chemical building block.
- the method further entails attaching a second chemical building block to a proportion of attachment sites coated on the substrate that were not attached to the first chemical building block. The second chemical building block has a second protecting group bound to a second reactive moiety on the second building block, and the second protecting group is either the same as or different from the first protecting group.
- These embodiments also include removing a proportion of the first and/or second protecting groups from a proportion of the first and/or second reactive moieties and reacting the deprotected reactive moieties of the first and/or second chemical building blocks with a third chemical building block having a third protecting group bound to a third reactive moiety.
- the third protecting group is optionally the same as either the first or second protecting group.
- These embodiments also entail removing a proportion of the first and/or second protecting groups from a proportion of the first and/or second reactive moieties and reacting the deprotected reactive moieties with a fourth chemical building block having a fourth protecting group bound to a fourth reactive moiety, in which the fourth protecting group is optionally the same as the first, second, or third protecting group.
- the methods of this embodiment also include repeating any one or more of the above steps with additional chemical building blocks, any of which are optionally the same as the first, second, third, or fourth chemical building block, to generate sequentially a chemical dimension of the multi-dimensional, continuously-varied display.
- the reactive moieties are deprotected by photolysis, hydrolysis, electrochemical removal, or organic solvation.
- a method of making a multi-dimensional, continuously-varied display includes attaching a plurality of compounds comprising a plurality of chemical building blocks in situ on a substrate at a plurality of loci to produce a continuously varying pattern.
- the compounds define a chemical dimension and each chemical building block defines a point within the chemical dimension such that optionally varying the chemical building blocks for each point in the chemical dimension produces a complexity of at least 10¹².
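The text does not define how complexity is computed. As a rough sketch, assuming complexity simply counts the distinct sequences obtainable when every position can hold any of the available building blocks, the sequence length needed to reach a given complexity can be estimated as follows. The function names and the choice of 20 building blocks (the natural amino acids) are assumptions for illustration.

```python
def sequence_space(num_building_blocks, length):
    """Count distinct sequences when each position varies independently."""
    return num_building_blocks ** length

def min_length_for(target, num_building_blocks=20):
    """Smallest sequence length whose sequence space reaches the target."""
    length = 1
    while sequence_space(num_building_blocks, length) < target:
        length += 1
    return length

for target in (10**5, 10**12):
    n = min_length_for(target)
    print(f"complexity >= {target:.0e} needs length {n} "
          f"({sequence_space(20, n):,} sequences)")
```

With 20 building blocks, a length of 9 gives 5.12 × 10¹¹ sequences, which falls short of 10¹², while a length of 10 gives about 1.02 × 10¹³.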
- each chemical building block is an amino acid or a peptide.
- the method of this aspect further comprises attaching a first chemical building block to a proportion of attachment sites coated on the substrate at a locus.
- the first chemical building block has a first protecting group bound to a first reactive moiety on the first chemical building block.
- the method of this embodiment also entails attaching a second chemical building block to a proportion of attachment sites coated on the substrate that were not attached to the first chemical building block.
- the second chemical building block has a second protecting group bound to a second reactive moiety on the second building block, and the second protecting group is either the same as or different from the first protecting group.
- the method entails removing a proportion of the first and/or second protecting groups from a proportion of the first and/or second reactive moieties and reacting the deprotected reactive moieties of the first and/or second chemical building blocks with a third chemical building block having a third protecting group bound to a third reactive moiety.
- the third protecting group is optionally the same as either the first or second protecting group.
- these embodiments include removing a proportion of the first and/or second protecting groups from a proportion of the first and/or second reactive moieties and reacting the deprotected reactive moieties with a fourth chemical building block having a fourth protecting group bound to a fourth reactive moiety in which the fourth protecting group is optionally the same as either the first, second, or third protecting groups.
- any one of the previously described elements of the method is repeated with additional chemical building blocks, any of which are optionally the same as the first, second, third, or fourth chemical building block, to generate sequentially the chemical dimension of the multi-dimensional, continuously-varied display.
- the reactive moieties are partially deprotected such that the number of deprotected moieties is less than the total number of reactive moieties on the surface.
- the partially deprotected moieties are reacted with chemical building blocks, whereby the chemical building blocks react with the deprotected reactive moieties and not the reactive moieties that remain protected.
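The effect of partial deprotection on the mixture at a single locus can be illustrated with a small bookkeeping sketch. It assumes, as a simplification not stated in the text, that each deprotection step removes protecting groups from the same fraction of every chain type regardless of its sequence, and that coupling to deprotected sites is complete. The function `couple`, the single-letter block names, and the fractions are hypothetical.

```python
from collections import defaultdict

def couple(population, block, deprotected_fraction):
    """Extend a deprotected fraction of every chain type by one building block.

    population maps a sequence string to its fraction of attachment sites.
    Chains that remain protected are carried over unchanged.
    """
    if not 0.0 <= deprotected_fraction <= 1.0:
        raise ValueError("deprotected_fraction must be between 0 and 1")
    result = defaultdict(float)
    for seq, frac in population.items():
        result[seq + block] += frac * deprotected_fraction
        result[seq] += frac * (1.0 - deprotected_fraction)
    return {seq: frac for seq, frac in result.items() if frac > 0}

# First and second building blocks occupy complementary proportions of sites.
locus = {"A": 0.6, "D": 0.4}
# Deprotect half of the reactive moieties and couple a third building block.
locus = couple(locus, "B", 0.5)
# Deprotect half again and couple a fourth building block.
locus = couple(locus, "C", 0.5)

for seq, frac in sorted(locus.items()):
    print(f"{seq:<4}{frac:.3f}")
```

Two partial steps already turn two starting chains into eight distinct species at the locus, which is how repeated partial deprotection builds up a mixture rather than a single sequence.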
- the methods described above further comprise contacting the substrate with a mixture of chemical building blocks, whereby the chemical building blocks react with a proportion of attachment sites coated on the substrate at a locus such that the locus has one or more different chemical building blocks attached to the different attachment sites.
- these embodiments provide that the chemical building blocks have protecting groups bound to reactive moieties.
- a proportion of the protecting groups are removed from a proportion of the reactive moieties to deprotect a proportion of reactive moieties.
- the removal of protecting groups and contacting of the substrate with a mixture of chemical building blocks can be repeated with additional mixtures of chemical building blocks, any of which are optionally the same.
- a sequentially generated chemical dimension of the multi-dimensional, continuously-varied display is produced.
- the mixture of chemical building blocks comprises a mixture of amino acids.
- the mixture of amino acids comprises a mixture of one or more different natural or non-natural amino acids.
- a method of producing a multi-dimensional, continuously-varying display is provided.
- the method comprises attaching a first type of chemical building block to a first portion of a substrate, and attaching a second type of chemical building block to a second portion of the substrate.
- the method further entails that the first type of chemical building block is not attached to the second portion of the substrate, and the first and second types of chemical building blocks form a first layer of chemical building blocks on the surface.
- the method also involves attaching a third type of chemical building block to at least a first portion of the first layer of chemical building blocks and attaching a fourth type of chemical building block to at least a second portion of the first layer of chemical building blocks in which the third type of chemical building block is not attached to the second portion of the first layer of chemical building blocks.
- the method includes optionally repeating the attachment of the third and fourth types of chemical building blocks using additional types of chemical building blocks.
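The layer-by-layer attachment described above can be sketched on a one-dimensional strip of positions, where each layer assigns one building block type to each portion of the substrate. The function `build_layers`, the block letters, and the way the portions are chosen are illustrative assumptions; a real display would use two spatial dimensions and far more positions.

```python
def build_layers(num_positions, layers):
    """Return the sequence formed at each position by stacking the layers.

    layers is a list of functions mapping a position to a building block type.
    """
    return ["".join(layer(x) for layer in layers) for x in range(num_positions)]

def first_layer(x):
    # First type (A) on the first portion, second type (B) on the second portion.
    return "A" if x < 2 else "B"

def second_layer(x):
    # Third type (C) and fourth type (D) on alternating portions of the first layer.
    return "C" if x % 2 == 0 else "D"

display = build_layers(4, [first_layer, second_layer])
print(display)  # ['AC', 'AD', 'BC', 'BD']
```

Because the portions used in the second layer cut across those of the first, two layers of two block types each yield all four possible dimers.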
- the display comprises between about 1,000,000 compounds and about 1 billion compounds. In certain other embodiments, the display has a complexity density of between about 10⁵ cm⁻² and about 10²⁰ cm⁻². In still more embodiments, the display has a total complexity of between about 10⁵ and about 10²⁰.
- In yet another aspect, a method of making a continuously varying surface is provided. The method comprises providing a substrate and attaching a first chemical building block to a first portion of a substrate. The chemical building block is protected by a protecting group.
- the method involves removing the protecting group from the first chemical building block by contacting a first region of the substrate with a deprotecting agent and reacting a second chemical building block to the first chemical building block in which the first chemical building block and second chemical building block are joined by a bond.
- the method also entails contacting a second region of the substrate with a deprotecting agent such that the deprotecting agent partially contacts the first region.
- Any of the above is optionally repeated using additional chemical building blocks that are or are not the same as the first and second chemical building blocks.
- any of the above is optionally repeated in additional regions of the substrate that are or are not the same as the first and second regions of the substrate.
- the deprotecting agent is light. In other embodiments, the deprotection is by a reactive chemical released by the deprotecting agent, the reactive chemical reacting with the protecting group to remove the protecting group.
- the reactive chemical is acid, base, oxidizing agent, reducing agent, or catalyst. In particular embodiments, the reactive chemical is acid. In still more embodiments, the chemical building block is an amino acid.
- the first and second regions are adjacent. In very particular embodiments, the first region and second region overlap. In certain embodiments, each region that is illuminated on the substrate is adjacent to at least one other region on the substrate. In still more embodiments, a boundary is formed between adjacent regions on the substrate, the boundary comprising a mixture of chemical building blocks attached to the surface, the mixture comprising chemical building blocks from each adjacent region.
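The mixed boundary between adjacent, overlapping regions can be illustrated with a simplified one-dimensional model. It assumes two exposures with linearly fading edges, each followed by coupling of a different building block (A, then B), and it assumes that only sites still protected after the first coupling can take up B. The profile shape, the positions, and the names `ramp_down` and `composition` are hypothetical and are not taken from the text.

```python
def ramp_down(x, start, end):
    """Fraction of sites deprotected by an exposure whose soft edge falls
    linearly from 1 at start to 0 at end."""
    if x <= start:
        return 1.0
    if x >= end:
        return 0.0
    return (end - x) / (end - start)

def composition(x):
    """Return (fraction A, fraction B) at position x after two exposures."""
    # First exposure covers the left region; its edge fades between x=4 and x=6.
    frac_a = ramp_down(x, 4.0, 6.0)
    # Second exposure covers the right region; its edge rises between x=3 and
    # x=5, so it partially contacts the first region.
    second_exposure = 1.0 - ramp_down(x, 3.0, 5.0)
    # Only sites still protected after the first coupling can take up B.
    frac_b = (1.0 - frac_a) * second_exposure
    return frac_a, frac_b

for x in (2.0, 4.5, 5.0, 8.0):
    a, b = composition(x)
    print(f"x={x}: A={a:.3f}, B={b:.3f}")
```

Deep inside each region the surface carries a single building block, while positions in the overlap carry a mixture whose proportions change gradually with position.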
- Figure 1(a) is a graphic representation showing the deprotection of a portion of the surface of a display.
- Figure 1(b) is a graphic representation showing the region of the display deprotected and having a first chemical building block (either A or D) attached to the surface.
- Figure 1(c) is a graphic representation showing the deprotected region of the display having compounds composed of two chemical building blocks linked together to form a sequence.
- Figure 1(d) is a graphic representation showing the deprotected region of the display having compounds composed of three chemical building blocks linked together to form a sequence.
- Figure 2(a) is a graphic representation showing a display surface having a poly-L-lysine coating.
- Figure 2(b) is a graphic representation showing the display surface of Figure 2(a) in which the reactive amine of the poly-L-lysine is protected by a photo-labile protecting group (NPPOC).
- Figure 2(c) is a graphic representation showing the display surface of Figure 2(b) in which one of the photo-labile protecting groups is removed.
- Figure 2(d) is a graphic representation showing the display surface of Figure 2(c) in which the deprotected amine is bound to an amino acid having a protecting group (FMOC) protecting its reactive amine.
- Figure 3(a) is a graphic representation showing a display surface having a generic coating or linker bound to a protecting group.
- Figure 3(b) is a graphic representation showing the display surface of Figure 3(b) in which some of the linkers or coating is deprotected by light.
- Figure 3(d) is a graphic representation showing the display surface of Figure 3(c) in which the deprotected linkers or coating is bound to a chemical building block (A) having a protecting group.
- Figure 3(e) is a graphic representation showing the display surface of Figure 3(d) in which the remaining linkers or coating is deprotected.
- Figure 3(f) is a graphic representation showing the display surface of Figure 3(e) in which the remaining deprotected linkers or coating is bound to a chemical building block
- Figure 3(g) is a graphic representation showing the display surface of Figure 3(f) in which the surface is coated with the chemical building blocks having a particular protecting group.
- Figure 3(h) is a graphic representation showing the display surface of Figure 3(g) in which the protecting group is removed and replaced by a second protecting group, which is labile in different conditions from the conditions of the first protecting group.
- Figure 4 is a graphic representation showing four layers of the A/G tetramer of an A/G tetramer display in which dark regions represent A in a particular layer while light regions represent G in a particular layer.
- Figure 5 is a graphic representation showing the light scanning direction during production of a display and the resulting pattern ("Resulting Pattern").
- Figure 6 is a photographic representation showing the creation of a display of peptides in which each individual amino acid has been changed to an alanine.
- Figure 7 is a photographic representation of the results of an experiment using a display in which amino acids in the original peptide are shown interacting with the labeled target, and in which a set of neighboring amino acids was systematically mutated and bound to labeled target.
- Figure 8 is a graphic representation showing the identification of a range of peptide sequences, each having a particular affinity for a ligand.
- Functions include, but are not limited to, ligand binding, enzymatic activity or other catalytic function, protein-protein interactions, sensory activity, optical switching or signaling, electrochemical switching or signaling, acting as a logic element (AND gate, OR gate, etc.) or switch in optical, electronic or mechanical molecular scale devices, serving as connectors to bring specific components together during self assembly, cell signaling, enzyme inhibition, cell signal inhibition, cell toxicity, cell signal agonism, chemical-chemical interactions, surface characteristics, protein labeling, nucleic acid binding, and cell surface binding.
- a high-diversity surface composition is provided in which the actual fraction (number of different kinds of molecules divided by the total number of molecules) is very small.
- high diversity surfaces can have diverse chemical entities attached to the surface in such a way as to generate a pattern.
- the diverse chemical entities can be distributed in a non-homogeneous way on the surface, generating a pattern.
- the diverse chemical entities are distributed such that the proportions of the entities present at each locus are predetermined.
- the surface composition comprises polypeptides that are synthesized in situ on a substrate.
- the surface composition comprises nucleic acids that are synthesized in situ on a substrate.
- the display has disposed on its surface compounds including, but not limited to, natural ligands, synthetic small molecules, heteropolymers, chemicals, nucleic acids, peptides, or proteins.
- the surface variability is continuous down to the single compound level within a locus.
- each compound is different from each other compound within the locus.
- locus means a specific and/or discrete point on a surface.
- a locus can encompass an area between 0.1 µm² and 1 cm² and all ranges in between.
- a locus can also be defined by the compounds disposed at that particular locus.
- each locus is positioned on a surface such that the surface is defined by a series of non-overlapping loci, each defined by a particular distribution of different compounds.
- the loci are positioned on the surface so that each locus overlaps or touches upon other loci to form a continuous pattern on the surface.
- the pattern is conceptually a series of loci connected together to create the appearance of the pattern. The pattern represents the totality of loci on the surface, all of which are drawn together to form the pattern.
- the surface is defined by multiple dimensions in chemical space with any one or more of the dimensions being continuously varied.
- the term "continuously varied" or "continuously varying" means characterized by uninterrupted change or diversity.
- the compounds are disposed on a display to form a pattern, whereby the proportion of compounds and/or distribution of compounds having particular sequences changes over a region of the display.
- the compounds are disposed such that each compound at a particular locus differs from the other compounds disposed at that particular locus with respect to its sequence.
- the compounds are distributed on the display such that the distribution of compounds continuously varies between loci on the display to form a continuously-varying pattern.
- the term "display" as used herein refers to a device that includes a solid support with compounds affixed to the surface of the solid support.
- the support consists of silicon, glass, nylon, plastic, carbohydrate (such as cellulose) or metal alloy.
- Solid supports used for display production can be obtained commercially from, for example, Genetix Inc. (Boston, MA).
- the support can be derivatized with a compound to improve compound association.
- Exemplary compounds that can be used to derivatize the support include aldehydes, thiols, alcohols, carboxylic acids, poly-lysine, epoxy, silane containing compounds and amines. Derivatized slides can be obtained commercially from, for example, Telechem International (Sunnyvale, CA).
- displays have attached to their surfaces compounds capable of binding to or capturing a target molecule in solution.
- these compounds may be referred to as capture probes.
- capture probe is intended to mean any agent capable of binding a target molecule in a mixture.
- characteristic means any unique quality or attribute of a compound that can be measured or modified. For compounds that are polypeptides, characteristics can include, e.g., primary structure, size, molecular weight, secondary structure, tertiary structure, dynamics, conformational changes, self assembly characteristics, charge, expression level, hydrophobicity, and hydrophilicity.
- characteristics can include, e.g., nucleic acid sequence, size, molecular weight, secondary structures such as hairpin loops, and other folding characteristics, self assembly characteristics, hybridization stringency, melting temperature (for part or all of the structure), and conformational dynamics.
- characteristics can include, e.g., charge, structure, size, molecular weight, interactions with other compounds, solubility in various solvents, ability to self assemble into larger units, conformational dynamics, hydrophobicity, and hydrophilicity.
- amino acid substitutions are identified within a polypeptide that produce little, if any, effect on the function or structure of the compound. Such degenerate modifications are possible due to the number of compounds that can be placed on the surface — greater than 1,000,000 compounds, greater than 10,000,000 compounds, or greater than 100,000,000 compounds, or as many as 1 billion compounds or more.
- a particular set of compounds having a range of amino acid sequences is placed on a solid surface using techniques described below.
- a polypeptide of the sequence ARRNHSTVLGK is modified such that the valine (V) is replaced by another hydrophobic amino acid such as alanine, leucine, isoleucine, or tryptophan, and the alanine (A) is replaced by another hydrophobic amino acid as well.
- a mixture of polypeptides which include the conservative modifications, is placed onto the surface. The mixture contains a predetermined ratio of the original polypeptide and modified polypeptides.
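The conservative substitutions and the predetermined mixture described above can be sketched for the example sequence ARRNHSTVLGK. The hydrophobic set is limited to the residues named in the text; the function names and the 50% share given to the original polypeptide are assumptions chosen only to make the example concrete.

```python
# Hydrophobic residues named in the text: alanine, valine, leucine,
# isoleucine, and tryptophan.
HYDROPHOBIC = "AVLIW"

def conservative_variants(seq, position):
    """Replace the hydrophobic residue at position with each other member
    of the hydrophobic set."""
    original = seq[position]
    if original not in HYDROPHOBIC:
        raise ValueError(f"residue {original!r} is not in the hydrophobic set")
    return [seq[:position] + aa + seq[position + 1:]
            for aa in HYDROPHOBIC if aa != original]

def mixture(seq, position, original_fraction):
    """Predetermined mixture: the original at original_fraction, with the
    remainder shared equally among the conservative variants."""
    variants = conservative_variants(seq, position)
    share = (1.0 - original_fraction) / len(variants)
    mix = {seq: original_fraction}
    mix.update({variant: share for variant in variants})
    return mix

PEPTIDE = "ARRNHSTVLGK"
mix = mixture(PEPTIDE, PEPTIDE.index("V"), original_fraction=0.5)
for seq, frac in mix.items():
    print(seq, frac)
```

Neighboring loci could then be described by the same kind of dictionary with different fractions, which keeps the composition at every locus known in advance.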
- the hydrophobic changes yield no functional differences between different compounds.
- one of the modifications yields an enhanced activity that is measurable.
- the particular locus has a higher chemical diversity than functional diversity due to the functional degeneracy at both positions tested. This allows for a very high number of compounds to be deposited on the substrate and for the analysis of these compounds. Additional loci are fabricated with varying mixtures of the conservative or functionally degenerate amino acid substitutions at the V position. These loci are nearby neighbors of the first locus. Such a distribution can be used to identify regions of high functionality or interaction with a particular target molecule based on the known mixture of polypeptides present at each particular locus. As the pattern continues further away from the first locus, the surface contains polypeptides having more radical substitutions for valine synthesized on it at various loci.
- the peptide mixtures start with the original peptide and the pattern incorporates substitutions that are at first conservative and then radical modifications. On rare occasions, the substitutions will create an effect that leads to improved function, such as binding. Thus, when a display is probed with a target molecule, those rare improved functional compounds will be identified by the pattern yielded by the interaction of the target molecule with the compound at one or more particular loci on the substrate.
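Because the mixture at each locus is known, the signal detected at each locus can be traced back to candidate compounds. The sketch below uses a simple fraction-weighted average of locus signals as a scoring heuristic. This is an assumption made for illustration and not a full deconvolution, and the loci, fractions, and signal values are invented example data.

```python
def attribute_signal(loci):
    """Score each compound by the fraction-weighted average signal of the
    loci that contain it.

    loci is a list of (mixture, signal) pairs, where mixture maps a sequence
    to its known fraction at that locus.
    """
    weighted = {}
    weight = {}
    for mixture, signal in loci:
        for seq, fraction in mixture.items():
            weighted[seq] = weighted.get(seq, 0.0) + fraction * signal
            weight[seq] = weight.get(seq, 0.0) + fraction
    return {seq: weighted[seq] / weight[seq] for seq in weighted if weight[seq] > 0}

# Invented example: three loci with known mixtures and measured signals.
loci = [
    ({"ARRNHSTVLGK": 1.0}, 1.0),
    ({"ARRNHSTVLGK": 0.5, "ARRNHSTILGK": 0.5}, 2.0),
    ({"ARRNHSTVLGK": 0.5, "ARRNHSTWLGK": 0.5}, 0.6),
]
scores = attribute_signal(loci)
best = max(scores, key=scores.get)
print(best, scores[best])
```

A compound that is enriched at the brightest loci receives the highest score, which flags it as a candidate for the next display.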
- Each iteration can be obtained utilizing a continuously-varied, patterned display having a set of compounds attached to its surface.
- Each display used in the process of iterative evolution of the compounds identifies a particular subset of compounds.
- the compounds are identified by contacting the display with a target molecule or target molecules.
- Each compound is analyzed using non-limiting techniques such as fluorescence detection, electrochemical detection, MALDI-TOF spectrometry, mass spectrometry, and NMR.
- a method of identifying molecular function entails contacting a display having a continuously varying pattern of compounds with a target molecule.
- the method also entails detecting an interaction between the target molecule and the compounds and identifying those compounds that interact with the target molecule.
- the interactions are indicative of characteristics of the compounds, each characteristic affecting a particular function.
- the method further includes a step of identifying one or more modifications to one or more characteristics of the compounds. These characteristics can be used to identify additional compounds that have altered interactions with the target molecules. Any characteristics that have been identified are used to determine the next step in the process of identifying a particular molecular function of the compound.
- the next step typically entails preparing a display having a predetermined, continuously varying pattern of compounds having a modified structure based on the identifications.
- each compound may have one or more modifications altering one or more characteristics that affect a particular function.
- compounds can be long heteropolymers composed of a plurality of chemical building blocks.
- chemical building block means a substance or element that can covalently or noncovalently bond with one or more other substances (i.e., other chemical building blocks, domains, or moieties) or elements (i.e., one or more atoms that compose a particular chemical building block) to form a more complex structure or substance, such as a compound.
- the compounds are disposed on a display to form a pattern, whereby the proportion of compounds and/or distribution of compounds having particular sequences changes over a region of the display. In other embodiments, there is a mixture of compounds disposed at each locus. In particular embodiments, the compounds are distributed on the display such that the distribution of compounds continuously varies between loci on the display to form a continuously-varying pattern.
- the methods described herein involve using a display having a pattern.
- This pattern can be produced by the disposition of compounds on the surface of the display at loci.
- locus means a specific and/or discrete point on a surface.
- a locus can encompass a range of areas from about 0.1 µm² to about 1 cm² and all ranges in between.
- a locus can also be defined by the compounds disposed at that particular locus.
- each locus is positioned on a surface such that the surface is defined by a series of non-overlapping loci, each defined by a particular proportion of compounds.
- the loci are positioned on the surface so that each locus overlaps, is adjacent to, or touches upon one or more loci to form a continuous pattern on the surface.
- the pattern is conceptually a series of loci connected together to create the appearance of the pattern. The pattern represents the totality of loci on the surface, all of which are drawn together to form the pattern.
- Functionally conservative chemical building block changes can be defined as changes that often result in little or no alteration in the function of interest for the entire compound.
- the rules for selecting the chemical building blocks that can be interchanged can come from a variety of sources.
- the source of rules may include literature studies (e.g., Dayhoff rules or BLOSUM). Such studies have defined the average functional impact of particular amino acid changes in terms of their chemical properties, the similarity of codon sequence for the original and final amino acid, and by simply looking statistically at changes that have occurred naturally in families of proteins.
- the surface composition comprises polypeptides that are synthesized in situ on a substrate.
- the surface composition comprises nucleic acids that are synthesized in situ on a substrate.
- the display has disposed on its surface compounds including, but not limited to, natural ligands, synthetic small molecules, chemicals, nucleic acids, peptides, or proteins.
- the surface is defined by three dimensions in chemical space with any one or more of the dimensions being continuously varied.
- the surface variability is continuous down to the single compound level within a locus.
- the term "display," as used herein, refers to a device that includes a solid support with compounds affixed to the surface of the solid support.
- the support consists of silicon, glass, nylon or metal alloy.
- Solid supports used for microarray production can be obtained commercially from, for example, Genetix Inc. (Boston, MA).
- the support can be derivatized with a compound to improve compound attachment to the surface.
- Exemplary compounds that can be used to derivatize the support include aldehydes, poly-lysine, epoxy, silane-containing compounds, carboxylic acids, alcohols and amines. Derivatized slides can be obtained commercially from Telechem International (Sunnyvale, CA).
- displays have attached to their surfaces compounds capable of binding to or capturing a target molecule in solution.
- these compounds may be referred to as capture probes.
- capture probe is intended to mean any agent, such as a compound described herein, capable of binding a target molecule in a complex solution such as a cell sample, biological fluid, or environmental sample.
- a compound means a substance or molecule formed by chemical bonding of two or more chemical building blocks.
- a compound can be a polypeptide produced by peptide bonds between a plurality of amino acids, the amino acids thereby forming a sequence.
- a compound can be an oligonucleotide sequentially produced by the formation of phosphodiester bonds between a plurality of nucleic acids.
- Other compounds include, but are not limited to, natural ligands, synthetic small molecules, synthetic heteropolymers, chemicals, nucleic acids, or polypeptides.
- the compound is capable of interacting with a biological macromolecule such as a polypeptide, nucleic acid, simple carbohydrate, complex carbohydrate, fatty acid, lipoprotein, and/or triacylglyceride.
- a compound is a polypeptide.
- the polypeptide can be produced using in situ techniques on the surface of the display. Techniques for synthesis of proteins and peptides are well known in the art and include, but are not limited to, solid phase synthesis, light-directed synthesis, electrochemically-directed synthesis, and spot synthesis (see, e.g., Muir, T. W. and Kent, S. B. (1993) Curr. Opin. Biotechnol., 4(4): 420-427, Fodor et al., Science, 251, 767-773 (1991), and U.S. Patent Nos.
- proteins attached to a display surface bind biological macromolecules such as proteins, nucleic acids, simple carbohydrates, complex carbohydrates, fatty acids, lipoproteins, and/or triacylglycerides.
- the mechanisms of binding to a target molecule include, e.g., hydrogen bonding, van der Waals attractions, covalent bonding, ionic bonding, or hydrophobic interactions.
- Exemplary proteins include, but are not limited to, natural ligands of a receptor, hormones, antibodies, and portions thereof.
- a compound is a nucleic acid sequence, which can be a full length sequence, fragments of full length sequences or synthesized oligonucleotides, that bind under physiological conditions to nucleic acids, e.g., by Watson-Crick base pairing (interaction between oligonucleotides and single-stranded nucleic acid).
- Compounds can be composed of DNA, RNA, or both.
- Nucleic acid capture probes are complementary to cDNA or cRNA sequences obtained from pre-messenger RNA, messenger RNA, transfer RNA, heterogeneous nuclear RNA ("hnRNA"), ribosomal RNA, bacterial RNA, mitochondrial RNA or viral RNA.
- Oligonucleotides can also bind proteins (e.g., DNA binding proteins, DNA modifying enzymes, transcription factors, etc.) or they can form aptamers which can bind with high affinity to a large variety of large and small molecules.
- Modifications include synthetic linkages such as alkylphosphonates, phosphoramidites, carbamates, carbonates, phosphate esters, acetamide, and carboxymethyl esters (see, e.g., Agrawal et. al., (1987) Tetrahedron Lett. 28:3539-3542; Agrawal et. al, (1988) PNAS (USA) 85:7079-7083; Uhlmann et. al., (1990) Chem. Rev. 90:534-583; Agrawal et. al., (1992) Trends Biotechnol. 10:152-158).
- target molecule means a molecule or group of molecules that interact with a compound on a display through covalent or noncovalent bonding.
- the target molecule is a biological macromolecule such as a protein, nucleic acid, simple carbohydrate, complex carbohydrate, fatty acid, lipoprotein, and/or triacylglyceride.
- the target molecule is isolated from a biological sample.
- the target molecule is synthesized using techniques known in the art. When using the singular of the term "target molecule," such use is not meant to limit the meaning of the term to a single molecule.
- target molecule is meant to describe a class of molecules that has similar features. For instance, a group or class of target molecules would include peptides having the same amino acid sequence. The term would also encompass classes of molecules having the same nucleic acid sequence. Therefore, the term “target molecule” or “target molecules” is meant to encompass a group of target molecules having a unifying feature among the members of the group that differentiates those molecules from the other target molecules in the mixture.
- Target molecules can be utilized as markers to identify molecular function and iterations of compounds having enhanced or decreased functionality. Target molecules can be utilized to detect a particular interaction between compounds and the target molecules. Furthermore, target molecules can be used to determine how a modification to a particular compound affects its function. For instance, a target molecule can be an antigen in solution and the compound on the display can be an antibody. Each compound on the display can be an antibody that includes a modification to the Fc region of the antibody. In certain embodiments, the target molecules can be labeled to facilitate the identification of compounds that bind to the target. On a single display, compounds and their modifications can be screened to determine the alterations to the antibody that affect its binding to an antigen.
- Target molecules can be obtained by isolation from a cell sample by mechanisms available to one of ordinary skill in the art (see, e.g., Ausubel et al., Current Protocols in Molecular Biology, Wiley and Sons, New York, NY, 1999).
- the target molecules can be isolated from a tissue sample isolated from a human patient.
- Target molecules can be isolated in the form of RNA transcripts. Methods of RNA isolation are described in, for example, Ausubel et al., Current Protocols in Molecular Biology, Vol. 1, pp. 4.1.1-4.2.9 and 4.5.1-4.5.3, John Wiley & Sons, Inc., (1993).
- target molecules can be synthesized using solid phase protein synthesis and other techniques known in the art and described above.
- the target molecules are detectably labeled.
- detectably labeled means that a target molecule is operably linked to a moiety that is detectable.
- operably linked is meant that the moiety is attached to the target molecule by either a covalent or non-covalent (e.g., ionic) bond.
- Methods for creating covalent bonds are known (see general protocols in, e.g., Wong, S. S., Chemistry of Protein Conjugation and Cross-Linking, CRC Press 1991; Burkhart et al., The Chemistry and Application of Amino Cross linking Agents or Aminoplasts, John Wiley & Sons Inc., New York City, NY, 1999).
- a "detectable label" is a moiety that can be detected by any suitable means.
- labels can be, without limitation, fluorophores (e.g., fluorescein (FITC), phycoerythrin, rhodamine), chemical dyes, or compounds that are electrochemically active, radioactive, chemiluminescent, magnetic, paramagnetic, promagnetic, or enzymes that yield a product that may be colored, fluorescent, chemiluminescent, or magnetic.
- the signal can be detectable by any suitable means, including spectroscopic, photochemical, biochemical, immunochemical, electrochemical, electrical, optical or chemical means. In certain cases, the signal is detectable by two or more means.
- labels for target molecules include fluorescent dyes, radiolabels, electrochemical labels, and chemiluminescent labels, which are examples that are not intended to limit the scope of the invention (see, e.g., Yu, et al., (1994) Nucleic Acids Res. 22(16): 3226-3232; Zhu, et al, (1994) Nucleic Acids Res. 22(16): 3418-3422).
- the target molecules are conjugated to Cy5/Cy3 fluorescent dyes, which are well known in the art (see, e.g., Yang et al., (2005) Clin Cancer Res. 11(2 Pt 1):612-20).
- the fluorescent labels can be selected from a variety of structural classes, including non-limiting examples such as 1- and 2-aminonaphthalene, p,p'-diaminostilbenes, pyrenes, quaternary phenanthridine salts, 9-aminoacridines, p,p'-diaminobenzophenone imines, anthracenes, oxacarbocyanine, merocyanine, 3-aminoequilenin, perylene, bisbenzoxazole, bis-p-oxazolyl benzene, 1,2-benzophenazin, retinol, bis-3-aminopyridinium salts, hellebrigenin, tetracycline, sterophenol, benzimidazolyl phenylamine, 2-oxo-3-chromen, indole, xanthen, 7-hydroxycoumarin, phenoxazine, salicylate, strophanthidin, porphyrins,
- Target molecules can be detectable using "direct labels" or "indirect labels."
- Direct labels are detectable labels that are directly attached to, or incorporated into, the target molecule.
- indirect labels are joined to the target molecule after interaction with a compound on the display.
- the indirect label can be attached to a binding moiety that has been subsequently attached to the target molecule after interaction with the display surface.
- Target molecules can be polypeptides that can be subsequently detectably labeled by being operably linked to a moiety that is detectable.
- Polypeptides can be detectably labeled by methods known in the art (see, e.g., MacBeath, (2002) Nature Genet. 32 (Suppl.): 526-532).
- Exemplary detectable labels for polypeptides include, but are not limited to, fluorescent dyes, radiolabels (see, e.g., Jona et al., (2003) Curr. Opin. Mol. Therap. 5(3): 271-277) and chemiluminescent labels (see, e.g., Bacarese-Hamilton et al., (2003) Curr. Opin.
- a polypeptide target molecule can be unlabeled and detected using an indirect label.
- the indirect label can be, e.g., an antibody unattached to the solid support, but capable of recognizing the target molecule.
- the unattached antibody can be conjugated to a label such as a radiolabel, chemiluminescent label, electrochemical label or fluorescent dyes.
- the target molecule is operatively coupled to a biotin molecule, which is detected after the target molecule binds a compound on the display using fluorescently or enzymatically-coupled streptavidin.
- Continuously-varied, patterned displays described herein can be produced, e.g., by depositing compounds onto the surface of the display.
- Compounds, being composed of chemical building blocks can be produced in a sequential fashion.
- compounds can be produced in which each compound at a particular locus has a particular, and unique, chemical building block at each position within its sequence as compared to the chemical building blocks of the other compounds at the locus.
- the fraction of a particular chemical building block incorporated into a growing compound varies in a continuous rather than discontinuous way over the surface (discontinuous variation generally involves having either no deprotection or complete deprotection at any point).
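As a minimal sketch of the continuous variation described above, the following Python snippet gives each locus along one axis a two-component mixture whose proportions change smoothly with position. The function name `gradient_fractions`, the restriction to two building blocks, and the linear ramp are illustrative assumptions and are not taken from the disclosure.

```python
def gradient_fractions(num_loci, block_a="V", block_b="L"):
    """Return, for each locus, the fraction of two building blocks.

    Assumption (illustrative only): the fraction of block_b rises linearly
    from 0 at the first locus to 1 at the last locus.
    """
    if num_loci < 2:
        raise ValueError("need at least two loci to form a gradient")
    fractions = []
    for i in range(num_loci):
        frac_b = i / (num_loci - 1)
        fractions.append({block_a: 1.0 - frac_b, block_b: frac_b})
    return fractions


if __name__ == "__main__":
    for locus, mix in enumerate(gradient_fractions(5)):
        print(locus, mix)
```

A discontinuous pattern would correspond to fractions of only 0 or 1 at every locus; the intermediate values are what make the pattern continuous.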
- Specific regions of the surface can be deprotected and different mixtures of chemical building blocks can be used to build the growing compounds. This provides mixtures at well defined positions of the compounds, thereby making the compounds distinct at these positions.
- dynamic approaches in which the coupling and deprotection are simultaneous can be utilized for certain non-limiting techniques such as light-directed synthesis. For example, in the case of light-directed approaches, amino acids are flowed over the surface, continuously moving from one locus or compound to the next, moving the excitation pattern around on the surface. Alternatively, a movable stage is used to generate overlapping patterns, generating a continuous gradient.
- there are two or three distinct blocking groups initially on the surface of the display (generated by, for example, blocking all the amines with a mixture of two or three orthogonal blocking groups).
- a new building block is then added.
- the second group is removed chemically and replaced with a photolabile group. This is now patterned.
- This process can be repeated for as many orthogonal blocking groups as one can find. It can be used at each layer in the directed synthesis. It could also be used in electrochemical synthesis.
- the patterns using each of the orthogonal sets are like layers in an image-editing program such as Photoshop: each can be independently manipulated, but all exist in the same space.
- photolabile groups can be utilized to create a pattern on the substrate.
- the substrate can be blocked with photolabile groups requiring different wavelengths of light to remove different groups or requiring varying lengths of light exposure. Therefore, a pattern or gradient can be formed depending on the pattern of photolabile groups on the substrate.
- reactive moiety means a part or section of a compound that can polymerize or react under certain conditions to another molecule or part of a molecule.
- moieties include, but are not limited to, amines, sulfones, sulfhydryls, carbonyls, carboxyls, amides, nitriles, phosphates, and phosphites.
- reactive moieties can bind to cofactors such as metals. The metals coordinate the binding of the reactive moiety to another molecule or a particular position on the substrate.
- the display can also be produced by controlled translation of the substrate during light exposure, by controlled rastering of the light source, by controlled translation of a mask, by varying the size of the regions upon which light is focused at each deprotection step, and/or by applying mixtures of amino acids at any step.
- Compounds can be disposed on a derivatized surface utilizing methods known to those of ordinary skill in the art through a process called "printing" (see, e.g., Schena et al., (1995) Science, 270(5235): 467-470).
- the term "printing", as used herein, refers to the placement of compounds onto the surface of the display in such close proximity as to allow a maximum number of compounds to be disposed.
- the printing process can be carried out by, e.g., a robotic printer.
- linkers can be disposed on the surface and linked to a first chemical building block of a growing compound so as to link the compound to the surface.
- Linker groups are selected for compatibility with the ligation chemistry and for compatibility with the application of the resulting surface carrying immobilized molecules.
- Linkers include, e.g., those comprising ether groups, polyethers, alkyl, aryl (e.g., groups containing one or more phenyl rings) or alkenyl groups, and ethylene glycol groups.
- Linkers can include, but are not limited to, poly-T chains, short amino acid polymers such as PGP repeats, poly-proline, and poly-lysine. Linkers can also include molecular scaffolds in which there are multiple attachment sites for groups including crown ethers, porphyrins, poly-lysine, DNA chains, and amine modified acrylate polymers.
- Linkers can range in length from about 2 to about 1,000 atoms in length.
- a linker can be based on an alkyl chain in which one or more of the CH₂ groups of the chain is replaced with an O, S, NH, CO, or CONH group.
- a linker can be a substituted alkyl chain in which one or more carbons of the chain carry a non-hydrogen substituent, such as an OH, NH₂, or SH group or halide.
- a linker may be a polymer, such as poly(ethyleneglycol).
- a linker can include one or more functional groups or residue moieties that function for ligation to the linker or that remain after the linker has been ligated to the surface or the molecule to be immobilized. Linkers and methods of attaching linkers to compounds are known in the art.
- Additional nonlimiting examples of techniques that can be used to dispose compounds or chemical building blocks onto a surface include photolithography, electron beam lithography, silicon-based fabrication, capillary printing on a glass slide, electrochemical patterning, and ink-jet technology (see, e.g., U.S. Patent Nos. 7,266,473; 7,264,929; 7,221,654; 5,424,186; 5,143,854; 6,479,301; 6,280,595; 6,093,302; and 5,744,305).
- Protection and deprotection techniques can be utilized to develop highly complex patterns on the surface of a display. Such techniques utilize protecting groups, which are well known in the art (see, e.g., Greene and Wuts, Protective Groups in Organic Synthesis, 2d edition, John Wiley and Sons, 1991). Deprotection can be accomplished using techniques described in, e.g., U.S. Patent Nos. 5,688,489; 4,833,268; 5,670,480; 5,104,882; 5,679,642; 6,280,595; 6,093,302; and 5,225,528.
- Suitable amino protecting groups include, but are not limited to, t-butyloxycarbonyl (Boc), fluorenylmethyloxycarbonyl (Fmoc), 2-trimethylsilylethoxycarbonyl (Teoc), NPPOC, methyl-NPOC, NVOC, 1-methyl-1-(4-biphenylyl)ethoxycarbonyl (Bpoc), allyloxycarbonyl (Alloc), and benzyloxycarbonyl (Cbz).
- Carboxyl groups can be protected as fluorenylmethyl groups, and hydroxyl groups can be protected with trityl, monomethoxytrityl, dimethoxytrityl, and trimethoxytrityl groups.
- complexity is a metric that provides a pseudo-quantifiable measure of the number of different states within a system. For example, each locus within a pattern that differs from another locus in the pattern measurably increases the complexity of the surface. At even higher resolutions, increasing the number of different compounds on a surface increases the measurable complexity of the surface. Complexity provides a methodology of determining the differences between different surfaces. In certain embodiments, algorithms are used to measure complexity to provide a pseudo-quantifiable determination of the differences between displays.
- compositions consisting of a series of chemical building blocks that are attached to one another.
- these building blocks are patterned onto a surface and subsequent reactions are performed at known positions on the nascent molecular complex being synthesized.
- This directed molecular assembly can occur using a patterning device that has a suitable resolution for a particular surface.
- the surface can be defined as the total amount of information present in terms of a line, area or volume divided by the effective resolution (in one, two or three dimensions, as appropriate), which is the total number of resolvable elements (E).
- the relative location of the elements does not matter. For example, when the resolution is large relative to the size of the target molecules, the interactions between target molecules and compounds on the surface of the display outside of any particular resolved area may not be important. In other embodiments (e.g., where the target is relatively large, like a cell), the relative location of the elements may be important.
- the complexity of a surface can be determined in the following way.
- E can be considered to be the total resolved compounds on the surface.
- the average composition can be defined for each such element in the following way. If there are N possible building blocks and M of them are added together to make the complex, the system can be described with a three dimensional matrix in which the E compounds are in one dimension, the M building block positions (which can be building blocks linked linearly or in another configuration) in a second dimension and N coefficients in a third dimension. Each of the N coefficients represents the fraction of that position that is occupied by the corresponding type of building block.
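The three-dimensional matrix described above can be sketched in plain Python as a nested list indexed by element, building block position, and building block type. The helper names `make_composition_matrix` and `is_valid`, the uniform starting mixture, and the example sizes are hypothetical choices made only for illustration.

```python
def make_composition_matrix(num_elements, num_positions, num_blocks):
    """Build an E x M x N nested list with a uniform mixture at each position."""
    uniform = 1.0 / num_blocks
    return [
        [[uniform] * num_blocks for _ in range(num_positions)]
        for _ in range(num_elements)
    ]


def is_valid(matrix, tol=1e-9):
    """Check that the N coefficients at every position sum to 1 (they are fractions)."""
    return all(
        abs(sum(coeffs) - 1.0) <= tol
        for element in matrix
        for coeffs in element
    )


# Hypothetical sizes: E = 3 resolved elements, M = 4 positions, N = 2 building blocks.
matrix = make_composition_matrix(3, 4, 2)
# Make position 0 of element 0 pure in building block 0.
matrix[0][0] = [1.0, 0.0]
```

Here `matrix[e][m][n]` is the fraction of position `m` in element `e` that is occupied by building block type `n`.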
- Resolution is the distance over which the composition is changed by at least 10% (or 20% or 50%) in a well defined manner (the fraction of each species known to within +/- 10%).
- the methods described herein can be used to change the composition as a function of position with high resolution.
- fractional composition is a measure of the diversity of compounds on a display.
- the fractional composition is varied with a resolution of at least 1 mm in all dimensions (two or three dimensions).
- the fractional composition is varied by 0.5 mm.
- the fractional composition is varied by 0.4 mm.
- the fractional composition is varied by 0.3 mm.
- the fractional composition is varied by 0.2 mm.
- the fractional composition is varied by 0.1 mm.
- the fractional composition is varied by 0.015 mm.
- complexity can be defined as the ratio of the minimum amount of information (in terms of 0/1 bits, for example) that is necessary to represent the matrix described above divided by the amount of information stored in that matrix without compression.
- This method of complexity determination is referred to herein as "multidimensional matrix analysis."
- the amount of information stored in the matrix without compression is E × M × N coefficients (real numbers expressed at some level of accuracy appropriate for the task).
- the information in the matrix is determined using fewer numbers. In other instances, all the values of all the coefficients in all positions in the matrix are the same, and this can be represented without specifying each value separately.
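The compression-based definition above can be approximated with a general-purpose compressor. The sketch below uses Python's standard `zlib` module as a stand-in; this is an assumption, since the text does not name a compression algorithm, and `zlib` only gives an upper bound on the minimum information required. The function name `compression_complexity` and the two example surfaces are hypothetical.

```python
import struct
import zlib


def compression_complexity(matrix):
    """Estimate complexity as compressed size / uncompressed size.

    `matrix` is an E x M x N nested list of fractional coefficients.
    Assumption: zlib's output is only an upper bound on the minimum
    information needed; it stands in for an ideal compressor.
    """
    flat = [c for element in matrix for position in element for c in position]
    raw = struct.pack(f"{len(flat)}d", *flat)  # 8 bytes per coefficient
    return len(zlib.compress(raw, 9)) / len(raw)


# A uniform surface: every coefficient is identical, so it compresses well.
uniform = [[[0.25] * 4 for _ in range(10)] for _ in range(200)]

# A varied surface: coefficients change from element to element.
varied = [
    [[((e * 31 + m * 17 + n * 7) % 101) / 101 for n in range(4)] for m in range(10)]
    for e in range(200)
]

if __name__ == "__main__":
    print(compression_complexity(uniform), compression_complexity(varied))
```

The uniform surface yields a ratio close to zero, matching the case in which all coefficients are the same and need not be specified separately.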
- Complexity can alternatively be represented by the following mathematical definition (in which case the absolute value of complexity differs from that obtained using compression algorithms). This approach is simple, quantitative and relatively intuitive, but does not take into account the complexity of the organization of the building blocks within compounds. This method only considers the number of different compounds and the form in which they are patterned. This approach is referred to herein as "mathematical complexity analysis."
- a library comprising different compounds can be attached directly to a surface or volume of a display or can be attached (either covalently or noncovalently) to a surface of a display by in situ synthesis. This results in a surface with some pattern of compounds on it.
- Two quantitative terms can be used to describe properties of the pattern of compounds on the surface that relate to how complex that pattern is and how useful it is for screening for molecular function of the compounds. The first quantitative term is the "complexity density":
- a_i is the total area on the surface occupied by the compound i in the library. Because each compound can have a distribution on the surface that is not digital (the density of the compound can change gradually over space), a_i can be defined as the smallest area defined on the surface that contains 95% of the compound species i.
- the index i itself just represents a particular compound in the library.
- N is the total number of structurally distinct compounds in the library and q is a term that reflects the degree to which the density of compounds on the surface changes. This is related to the function n(x,y), which represents how many structurally distinct compounds are present at the position x,y on the surface. In particular, q is large when this number varies dramatically from one part of the surface to another.
- the compounds are arranged in contiguous square elements (regions of the substrate), such that the area, A, is divided into N different elements, each element having only one kind of compound attached to it.
- This situation is similar to known microarrays on DNA chips, except that in this situation there is no space between the elements to which the compounds are attached. It follows that:
- n(x,y) = 1 (because there is only one kind of compound in any element of the grid).
- the complexity density in this instance is the total number of compounds divided, not by the total area, but by the area taken up by a single compound from the library on the surface (one of the N different squares for this example).
- the complexity density is much larger (a factor of N) than the complexity density seen when the compounds are all evenly distributed about the surface. This is expected given the ability in this system to interrogate each of the compounds in the library separately.
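The factor-of-N comparison above can be checked numerically. This sketch takes the complexity density for the grid case to be the number of compounds divided by the area occupied by one compound, as stated in the text, and leaves out the q term. The function name and the values of N and A are hypothetical.

```python
def complexity_density(num_compounds, area_per_compound):
    """Number of distinct compounds divided by the area one compound occupies."""
    return num_compounds / area_per_compound


total_area_cm2 = 1.0        # hypothetical surface area A
num_compounds = 1_000_000   # hypothetical library size N

# Contiguous grid: each compound occupies its own square of area A / N.
grid_density = complexity_density(num_compounds, total_area_cm2 / num_compounds)

# Evenly mixed: every compound is spread over the whole area A.
mixed_density = complexity_density(num_compounds, total_area_cm2)

if __name__ == "__main__":
    print(grid_density, mixed_density, grid_density / mixed_density)
```

Under these assumptions the ratio of the two densities equals N, because the area per compound shrinks from A to A/N.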
- the complexity density is 10¹² cm⁻² and the total complexity is 10¹². In other embodiments, the complexity density is at least 10¹¹ cm⁻². In yet other embodiments, the complexity density is between 10⁵ and 10¹⁰ cm⁻². In certain embodiments, the complexity density is greater than 10¹² cm⁻². In particular embodiments, the complexity density is greater than 10²⁰ cm⁻².
- a display resembles a DNA chip array known in the art, in which spaces are present between the display elements (such that the total area is increased but the element size and number of different compounds remain the same).
- the complexity density changes by the fraction of the surface that is left blank, but the total complexity does not change.
- the surface has a complexity density of 10¹⁴ cm⁻² (instead of 10¹²) and a total complexity of 10¹⁴.
- the region occupied by each type of compound overlaps with the area occupied by 100 other distinct compounds, creating a continuous surface but with a well defined and rapidly changing composition at each point.
- Both the complexity density and total complexity values are increased by two orders of magnitude reflecting the fact that 100 times as many compounds are present at each position on the surface.
- a quantitative hallmark of a continuous surface with overlapping regions of different compound composition is that both the complexity density and the total complexity, as defined above, are greater than what can be achieved by placing compound species separately on a surface element by element, because the total number of compounds that can be placed on the surface is greater.
- a continuous surface or alternatively a
- $N > A/\bar{a}$ (the total area $A$ divided by the average area $\bar{a}$ taken up by each of the individual compound species), while in a standard microarray format (one compound species per element), N is less than $A/a$
- N is a factor of 1.1, 1.2, 1.3, 1.4, 1.5, 2.0, 5.0, 10, 20, 30 , a
- half of the compounds from a library are placed on the surface in such a way that they cover an area, a, and the other half are placed so that they cover an area 10a.
- the patches of compounds of these sizes are uniformly spaced around the surface covering the entire area such that the total number of different compound species at each point is constant and defined by
- $n(x,y) = 0.5\,\dfrac{N a}{A} + 0.5\,\dfrac{N \cdot 10a}{A}$
- the complexity is decreased when half of the compounds are spread over a much greater region than the other half.
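One way to read the uniform-coverage case above: if patches are spaced so that every point on the surface is covered by the same number of species, that number equals the summed patch areas divided by the total area. The sketch below applies this to a library in which half of the compounds cover area a and half cover 10a. The function name and all numeric values are hypothetical, and the relation used is an interpretation of the text.

```python
def species_per_point(areas, total_area):
    """Average number of distinct species at a point: sum of a_i divided by A."""
    return sum(areas) / total_area


N = 1000     # hypothetical number of compounds in the library
a = 0.001    # hypothetical patch area (cm^2) for the first half
A = 1.0      # hypothetical total surface area (cm^2)

# Half of the compounds cover area a, the other half cover 10a.
areas = [a] * (N // 2) + [10 * a] * (N // 2)
n = species_per_point(areas, A)

# Closed form for the same split: 0.5*N*a/A + 0.5*N*10a/A
closed_form = 0.5 * N * a / A + 0.5 * N * 10 * a / A

if __name__ == "__main__":
    print(n, closed_form)
```

With these values the larger patches contribute ten times as many overlapping species at each point as the smaller ones do.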
- each compound is described as a point in n-dimensional space, where n is the number of descriptors used to describe a compound (these descriptors could be chemical descriptors for each building block or substituent in a particular compound, synthesis steps used in making a particular compound, etc.).
- each compound is assigned a point in n-dimensional descriptor space.
- the average distance between the points, variance in the distance between points and range of distance between points can all be used to quantify library complexity (i.e., more diverse libraries will have a larger spread of points in n-dimensional descriptor space).
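The descriptor-space measures above can be sketched with the Python standard library. Euclidean distance is an assumption here, since the text does not specify a metric, and the two example libraries with two descriptors each are hypothetical.

```python
import itertools
import math
import statistics


def distance_stats(points):
    """Return mean, variance, and range of pairwise Euclidean distances."""
    dists = [math.dist(p, q) for p, q in itertools.combinations(points, 2)]
    return {
        "mean": statistics.fmean(dists),
        "variance": statistics.pvariance(dists),
        "range": max(dists) - min(dists),
    }


# Hypothetical libraries, each compound described by two descriptors.
tight_library = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]
spread_library = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]

if __name__ == "__main__":
    print(distance_stats(tight_library))
    print(distance_stats(spread_library))
```

The more widely spread library gives the larger mean pairwise distance, which is the sense in which it would be called more diverse.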
- Displays can also be made using in situ techniques.
- in situ synthesis is used to produce a display having, at each locus on the surface defined by a circle of specified radius around an arbitrary point, a distinct distribution of sequences.
- specific chemical building blocks are present in a predetermined proportion at each chemical building block position along the compound chain. The number of distinct compounds present at each point is limited only by the combinatorics of the chemical building block choices allowed at each position.
- Figure 1 illustrates schematically one way in which complexity can be increased exponentially as each additional chemical building block position is added at a single locus 102 having a plurality of attachment sites 106 each suitable for attachment of a single chemical building block (e.g., 107).
- a light-directed synthesis method (such as the method described below) is used, in which a first chemical building block species (A in Figure 1(b)) is randomly applied to 50 percent of the attachment sites, and a second chemical building block species (D in Figure 1(b)) is applied to the remainder of the attachment sites.
- a third chemical building block species (E in Figure 1(c)) is randomly added to 50 percent of the compound chains, and a fourth chemical building block species (F in Figure 1(c)) is added to the remainder of the compound chains.
- each chain shown represents a species of compound of which many molecules will be present at a locus.
- A is the number of chemical building block choices at each variable position (i.e., each position at which multiple chemical building block choices are allowed) and N is the number of variable positions.
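- The following Python sketch illustrates how the number of species at a locus grows with A and N. It assumes the count is A raised to the power N, which is the reading suggested by the exponential growth described for Figure 1; the function names are hypothetical, and the building block letters follow the Figure 1 example (A or D at the first position, E or F at the second).

```python
from itertools import product


def species_count(choices_per_position, variable_positions):
    """Assumed complexity at a locus: A ** N."""
    return choices_per_position ** variable_positions


def enumerate_species(choices_by_position):
    """List every chain obtainable from the allowed choices at each position."""
    return ["".join(chain) for chain in product(*choices_by_position)]


# Figure 1 example: two variable positions with two choices each.
figure_1_choices = [("A", "D"), ("E", "F")]
species = enumerate_species(figure_1_choices)  # AE, AF, DE, DF
```

- Adding a third position with two choices would double the list again, to eight species at the same locus.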
- a glass surface 110 is first prepared by applying a thin coating of poly-L-lysine 121 as shown in Figure 2(a).
- the poly-L-lysine is positively charged and the surface of the glass is negatively charged, so that the poly-L-lysine adheres to the glass due to electrostatic attraction.
- the side chains of the lysine residues of the poly-L-lysine expose amine groups which, under appropriate conditions, will form an amide bond with an exposed carboxyl group of an amino acid or other chemical entity.
- a photo-labile protecting group, NPPOC 122, is applied and binds to the exposed amine groups of the poly-L-lysine, as shown in Figure 2(b).
- the surface can now be manipulated to expose one or more amines protected by the photo-labile protecting group.
- light 111 is applied to a predetermined locus 112 on the surface of the display at a frequency, intensity, and duration calculated to remove a predetermined proportion (here, one half) of the photo-labile protecting groups 113 exposed at the selected locus 112, leaving 50 percent of the amines 114 deprotected and exposed.
- a first selected amino acid species 115 (here, alanine) having a non-photo-labile protecting group 116 bound to its amine terminus is then applied; the carbonyl carbons of the applied amino acids react and bind with the exposed amines 114 (Figures 3(c) and 3(d)).
- As Figures 3(c) and 3(d) show, removal of the photo-labile protecting group NPPOC 122 results in an exposed amine 123, with which the arriving amino acid 125 forms an amide bond; the arriving amino acid carries the non-photo-labile protecting group FMOC 124 bound to its primary amine.
- the selected locus is then again exposed to light at a frequency, intensity, and duration calculated to remove the remaining photo-labile protecting groups 117 exposed there (Figure 3(e)).
- a second selected amino acid species 119 (here, aspartic acid) having a non-photo-labile protecting group 116 bound to its amino terminus is then applied; the carbonyl carbons of the applied amino acid react and bind with the exposed amines 118 (Figures 3(e) and 3(f)).
- the deprotected region is the same in the first and second partial deprotection steps.
- the deprotection region in the second step can be different in size and/or partially or fully translated relative to the first region, enabling truly continuous distributions where desired.
- the non-photo-labile protecting groups 120 are then removed; in this illustrative example, the non-photo-labile protecting groups are FMOC, which is base-labile and can be removed by raising pH.
- each additional layer can comprise, e.g., one, two, or more distinct residue types in preselected proportions.
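- The partial-deprotection cycle described above can be sketched as a small simulation. This is an illustration under stated assumptions, not the disclosed apparatus: the function `synthesize_locus`, the site count, and the seed are hypothetical, and each deprotection step is idealized as releasing an exact preselected fraction of randomly chosen sites.

```python
import random
from collections import Counter


def synthesize_locus(n_sites, layers, seed=0):
    """Grow chains at one locus, layer by layer.

    `layers` is a list of layers; each layer is a list of
    (residue, fraction) pairs giving the preselected proportions.
    """
    rng = random.Random(seed)
    chains = [""] * n_sites
    for layer in layers:
        order = list(range(n_sites))
        rng.shuffle(order)  # light deprotects a random subset of sites
        start = 0
        for i, (residue, fraction) in enumerate(layer):
            # The last residue of a layer couples to all remaining sites.
            last = i == len(layer) - 1
            end = n_sites if last else start + round(fraction * n_sites)
            for site in order[start:end]:
                chains[site] += residue
            start = end
    return Counter(chains)


# Two layers, each split 50/50, as in the alanine / aspartic acid example.
counts = synthesize_locus(
    1000,
    [[("A", 0.5), ("D", 0.5)], [("E", 0.5), ("F", 0.5)]],
)
```

- In this sketch every site ends with one residue per layer, and the population at the locus is a mixture of up to four chain species whose proportions follow from the per-layer fractions.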
- a continuous surface of tetrapeptides is made using two amino acids, A and G.
- a micromirror array device is used to create the continuous surface.
- a region defined by four of the micromirrors forming a square with four quadrants is considered (Figure 4).
- a surface that is covered with a photolabile protecting group, such as NPPOC, is used.
- Light is scanned on quadrants 1 and 3.
- during the exposure, a scanning stage is translated in such a way that at the end of the exposure the light is on quadrants 2 and 4 (Figure 5).
- the edge between quadrants 1 and 3 and 2 and 4 has been fully exposed.
- the rest of the surface is partially exposed as shown in Figure 4.
- one of the amino acids, FMOC-A is coupled to the surface in this pattern.
- the converse pattern is formed.
- the quadrants to the left of 1 and 3 are illuminated as well as quadrants 2 and 4.
- the left edges of quadrants 1 and 3 are completely illuminated as are the right edges of quadrants 2 and 4 - exactly the opposite of the first scan.
- the remainder of the NPPOC is released and FMOC-G is coupled in that pattern.
- the next layer is then performed in the same way, except that it is rotated by 90 degrees.
- the light is shone on the quadrant to the left of quadrant 1 as well as on quadrants 2 and 3 (see lower part of Figure 5).
- the light is then scanned to the right.
- the left edge of quadrant 1, the right edge of quadrant 2, and the adjoining edges of quadrants 3 and 4 are fully exposed in this round and the rest is partially exposed.
- This is now coupled to FMOC-A.
- the converse is now performed (starting with the quadrant to the left of quadrant 3 exposed as well as quadrants 1 and 4) and then the scan is performed. This releases the remaining NPPOC and the resulting free amines can be coupled to FMOC-G.
- the final layer is performed as the third layer but rotated 90 degrees.
- the number of steps involved in this example is the same as the number of steps in instances in which only 4 different peptides are created by normal light directed array methods (one peptide in each quadrant). However, 16 are created and created in a way to give a distinctive pattern for each of the 16 peptides. This saves time and increases the number of possible peptides that can be placed on a surface.
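- The count of 16 peptides follows from four layers with two residue choices each. The short Python sketch below enumerates them and tallies the coupling steps, assuming one coupling per residue per layer as in the A and G example; the variable names are illustrative.

```python
from itertools import product

RESIDUES = ("A", "G")  # the two amino acids used in the example
LAYERS = 4             # tetrapeptides: one residue added per layer

# Every tetrapeptide that can arise when both residues are coupled in each layer.
tetrapeptides = ["".join(p) for p in product(RESIDUES, repeat=LAYERS)]

# One coupling per residue per layer.
coupling_steps = LAYERS * len(RESIDUES)
```

- With these assumptions, eight coupling steps yield all 16 sequences, whereas the same eight steps in a one-peptide-per-quadrant layout yield only four.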
- the surface can be addressed in a continuous fashion with resolution approaching 100 nm or smaller.
- This approach can be used as a way of patterning a continuous surface where there are no hard boundaries between regions to continuously vary the proportions of a particular monomer unit from one place to another.
- one pixel of a surface (100 nm in diameter) can be exposed to light. The light produces an acid, which removes the protective group on the end of a compound.
- the compound can be a chain of groups that is being built from the surface.
- a chemical reaction then couples a monomer to the compound.
- the compound could be a growing amino acid chain and the monomer could be an amino acid (amino acid A). The light is then moved 100 nm in any particular direction.
- the light is moved in the X direction.
- the monomer to be coupled is a second amino acid (amino acid B) that is different from amino acid A.
- the coupling occurs such that there is no space between the 100 nm pixels.
- the boundary region occurs because the light intensity during exposure of a pixel does not go to zero abruptly at 100 nm but tails off beyond that for tens of nanometers. This process can be continued to create additional boundaries with additional monomers. Continuing with the above example, the process continues by moving 100 nanometers in the Y direction and coupling C in the next pixel. This leads to a B/C mixed boundary. Such an example would lead to a continuous mixture of amino acids from A to C.
- the monomer units are blocked initially by an agent that will not be removed by light (thus once A is added, it cannot be removed until you start the next layer of monomer units). At the beginning of the next layer, all of the blocking groups that cannot be removed by light are removed by another method and replaced with a group that can be removed by light, and the process is repeated.
- the very high resolution of this method allows for the creation of extremely high densities of unique polymers on a continuous surface.
- the density can be further increased by mixing types of monomer units at each position, as described herein. Without mixing, continuous surfaces with up to 10^10 different heteropolymer molecules per square centimeter could be produced. As described above, if the mixing is done using monomer units that have degenerate function under most circumstances (such as amino acids like glutamic acid and aspartic acid), many combinations can be assayed in one location, the vast majority of which will not differ much in their function. With this approach, 10^11 or 10^12 different molecules per square centimeter may be possible.
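- The density figures can be checked with simple arithmetic. The Python sketch below assumes abutting 100 nm pixels on a square grid with one distinct polymer per pixel; the mixing factors of 10 and 100 species per pixel are hypothetical values chosen to show how the higher figures could be reached.

```python
NM_PER_CM = 10_000_000  # 1 cm = 10^7 nm


def loci_per_cm2(pixel_nm):
    """Number of abutting square pixels of side `pixel_nm` in one cm^2."""
    per_side = NM_PER_CM // pixel_nm
    return per_side ** 2


def molecules_per_cm2(pixel_nm, species_per_locus):
    """Distinct species per cm^2 when each pixel holds a mixture."""
    return loci_per_cm2(pixel_nm) * species_per_locus


unmixed = loci_per_cm2(100)                # 10**10
mixed_10 = molecules_per_cm2(100, 10)      # 10**11
mixed_100 = molecules_per_cm2(100, 100)    # 10**12
```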
- compounds include, but are not limited to, natural ligands, synthetic small molecules, synthetic heteropolymers, chemicals, nucleic acids, peptides or protein.
- Monomers include, but are not limited to, nucleic acids, amino acids, fatty acids, monomeric blocks of polymers such as vinyl monomers, or peptides.
- the displays described herein can be used in methods for identifying molecular function of compounds.
- the methods can be used to determine those portions of a compound that affect its function.
- the portion of the compound can be a chemical building block.
- the methods can identify an amino acid alteration that enhances or decreases the function of a protein.
- the chemical building block can be a nucleotide that enhances or decreases the function of the nucleic acid.
- the chemical building block can be a single element in a compound that increases or decreases its functionality.
- the methods are used to identify sequences having both high affinity and high specificity for an arbitrarily chosen target molecule from among the dauntingly large number of possible sequences.
- a polypeptide of known amino acid sequence is obtained.
- when the polypeptide is composed of 20 amino acids, there are 20^20, or approximately 10^26, possible amino acid sequences for the polypeptide.
- a continuously-varying, patterned display can be utilized in which the display includes a surface having compounds with alterations to the polypeptide sequence.
- the 20 naturally occurring amino acids can be grouped into subsets based on functional similarity. For example, serine and threonine are chemically very similar, and substituting one for the other in a polypeptide sequence often results in little or no change in the polypeptide's affinity for various targets.
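- A brief Python sketch of the size of this search space, and of how grouping functionally similar amino acids shrinks it. The function name and the choice of eight functional classes are hypothetical; the text does not state how many subsets are used.

```python
from math import log10


def sequence_space(alphabet_size, length):
    """Number of distinct sequences of `length` over an alphabet."""
    return alphabet_size ** length


# All 20 natural amino acids at each of 20 positions.
full = sequence_space(20, 20)
full_exponent = log10(full)  # slightly above 26

# Hypothetical reduction: treat the 20 amino acids as 8 functional classes.
reduced = sequence_space(8, 20)
```

- Under this assumed grouping the functional search space is roughly eight orders of magnitude smaller than the chemical one.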
- a first amino acid is mutated conservatively (e.g., alanine to valine).
- the first amino acid is mutated conservatively in a different manner (e.g., alanine to another non-polar amino acid). This continues from locus to locus until a pattern is developed from the most conservative mutation to mutations having more radical alterations to the characteristic of the first amino acid.
- any particular locus has a particular chemical diversity that is generally greater than the functional diversity at the locus.
- the display can have 100 compounds (in this case, polypeptides) at a locus that have any of ten amino acids at two positions. At one of the positions, all ten amino acids have the same function but at the other position, 2 of the amino acids show a measurable increase in function.
- the functional diversity of the system is very low because 80 of the 100 molecules will have roughly the original function and 20 will have the enhanced function. Thus the functional diversity is 2.
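- The 100-compound example can be reproduced with a short Python sketch. The ten allowed letters and the two enhancing residues are arbitrary placeholders, since the text does not name them.

```python
from collections import Counter
from itertools import product

ALLOWED = "ACDEFGHIKL"   # ten amino acids allowed at each position (placeholder set)
ENHANCING = {"K", "L"}   # hypothetical: the 2 residues that enhance function


def function_of(pair):
    """The first position is functionally silent; the second decides the class."""
    _first, second = pair
    return "enhanced" if second in ENHANCING else "original"


compounds = list(product(ALLOWED, repeat=2))
by_function = Counter(function_of(c) for c in compounds)

chemical_diversity = len(compounds)      # distinct sequences at the locus
functional_diversity = len(by_function)  # distinct functional classes
```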
- the functional diversity of a display can be higher or lower than the example just described.
- amino acids can also be mutated in a similar way on the same display as the first amino acid.
- entire segments or domains of the polypeptide can be mutated, deleted, or added to the amino acid sequence.
- single amino acids can be deleted or added.
- the target molecule(s) can be labeled.
- a mixture of target molecules is labeled with different labels, allowing the identification of a particular target molecule that interacts with the compound(s) on the substrate.
- the mixture of target molecules is labeled with a single label, e.g., a fluorescent label.
- the target molecule(s) is allowed to contact the display, thereby interacting with the compounds disposed on the surface of the display. Those compounds that interact with the target molecules are detected using techniques known in the art and described above.
- the fluorescent signal is strongest for those loci that have compounds that have the highest affinity for one or more of the target molecules. Loci comprising compounds having the lowest affinity for the target molecule(s) generally have the weakest fluorescent signal. Accordingly, the display provides detailed information regarding the relative affinity of a compound for a target molecule.
- molecular function can be determined for a particular compound.
- the molecular function can be determined for a polypeptide of unknown function by fabricating a continuously-varied, patterned display as described above, and reacting the polypeptides on the display with a target molecule(s).
- the molecular function assayed is catalysis, e.g., enzymatic or nonenzymatic catalysis.
- Catalytic function can be assayed, e.g., by reacting a particular target molecule with a putative catalytic compound attached to a display. The reaction can be monitored by identifying the catalytic product generated on the substrate.
- one or more characteristics of the polypeptides can be determined.
- compounds include, but are not limited to, natural ligands, synthetic small molecules, synthetic heteropolymers, chemicals, nucleic acids, peptides or protein.
- the compound is a biological macromolecule such as a protein, nucleic acid, simple carbohydrate, complex carbohydrate, fatty acid, lipoprotein, and/or triacylglyceride.
- iterative molecular evolution can be used to identify a certain range of compounds.
- a display can be constructed using known compound sequences. Data can be obtained by contacting the display with a target molecule. The interaction between the target molecule and the compounds is detected, and the compounds having a particular interaction with a target molecule are identified. Those compounds having a desired characteristic are utilized to produce a subsequent display having iteratively modified compounds. The subsequent display can then be used to identify those modified compounds exhibiting a more desired characteristic. This method can be carried out iteratively to generate compounds having a most desired characteristic.
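- The iterative loop described here can be sketched in Python as follows. This is a toy model under loud assumptions: `toy_measure` stands in for a measured binding signal and simply counts matches to a made-up target sequence, each new display contains every single-residue substitution of the retained compounds, and the names `evolve`, `single_substitutions`, and `keep` are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitutions(seq):
    """All sequences that differ from `seq` at exactly one position."""
    variants = set()
    for i, original in enumerate(seq):
        for aa in AMINO_ACIDS:
            if aa != original:
                variants.add(seq[:i] + aa + seq[i + 1:])
    return variants


def evolve(start, measure, rounds, keep=3):
    """Build a display, measure it, keep the best, and repeat."""
    parents = [start]
    for _ in range(rounds):
        display = set(parents)
        for parent in parents:
            display |= single_substitutions(parent)
        # Rank by measured signal (highest first); ties broken alphabetically.
        ranked = sorted(display, key=lambda s: (-measure(s), s))
        parents = ranked[:keep]
    return parents[0]


def toy_measure(seq, target="WKYG"):
    """Stand-in for an assay readout: number of positions matching `target`."""
    return sum(a == b for a, b in zip(seq, target))


best = evolve("AAAA", toy_measure, rounds=4)
```

- In a real experiment the measurement would come from the display itself, and the rule for generating the next display could favor conservative substitutions first.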
- this method can be used to identify a particular polypeptide segment from a receptor that has the highest affinity for the ligand (Fig. 8).
- the first peptide sequence is a weak binder of the ligand.
- the method can also identify a range of compounds having a range of affinities for the ligand.
- the methods described herein can be used to identify compounds that interact with a particular target molecule, e.g., in a multiplex fashion, so that a very large number of compounds can be assayed in parallel.
- a complex mixture of compounds can be synthesized, e.g., having high sequence diversity but low functional diversity.
- a specific example of searching in situ synthesized chemical space involved optimizing the sequence of a peptide known to be a weak binder to the yeast protein Gal80.
- the 15-amino-acid peptide EGEWTEGKLSLRGSC was identified as having weak (micromolar dissociation constant) binding to the protein Gal80.
- High throughput, in situ synthesis on a porous polymer surface was used to explore the chemical space around this peptide. This was performed in two steps. First, a set of peptides was synthesized on a chip in which each residue of the 12 in the original peptide (not including the C-terminal GSC sequence) was replaced by alanine. Hundreds of replicates of each peptide having a particular alteration were synthesized at various positions on the porous polymer surface. These peptides were then tested for binding to the fluorescently labeled Gal80 target molecule.
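- The first step, the alanine scan, can be written out as a short Python sketch. The peptide and the fixed C-terminal GSC come from the text; the function name `alanine_scan` is illustrative, and replicate placement on the surface is not modeled.

```python
PEPTIDE = "EGEWTEGKLSLRGSC"
FIXED_TAIL = "GSC"  # C-terminal residues left unchanged


def alanine_scan(peptide, fixed_tail):
    """Replace each residue outside the fixed tail with alanine, one at a time."""
    if not peptide.endswith(fixed_tail):
        raise ValueError("peptide does not end with the fixed tail")
    scanned_length = len(peptide) - len(fixed_tail)
    variants = []
    for i in range(scanned_length):
        if peptide[i] != "A":  # skip positions that are already alanine
            variants.append(peptide[:i] + "A" + peptide[i + 1:])
    return variants


variants = alanine_scan(PEPTIDE, FIXED_TAIL)
```

- Because none of the first 12 residues is alanine, the scan yields 12 distinct variants, one per scanned position.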
Landscapes
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Genetics & Genomics (AREA)
- Biochemistry (AREA)
- Engineering & Computer Science (AREA)
- General Chemical & Material Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Medicinal Chemistry (AREA)
- Biomedical Technology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biotechnology (AREA)
- Chemical Kinetics & Catalysis (AREA)
- Bioinformatics & Computational Biology (AREA)
- Plant Pathology (AREA)
- Biophysics (AREA)
- Microbiology (AREA)
- Physics & Mathematics (AREA)
- Crystallography & Structural Chemistry (AREA)
- Ecology (AREA)
- Investigating Or Analysing Biological Materials (AREA)
- Steroid Compounds (AREA)
Abstract
The invention relates to methods for identifying the molecular function of compounds, methods for the iterative evolution of compounds, and a surface having high chemical diversity for carrying out the disclosed methods.
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US402307P | 2007-11-21 | 2007-11-21 | |
US404207P | 2007-11-21 | 2007-11-21 | |
US61/004,042 | 2007-11-21 | | |
US61/004,023 | 2007-11-21 | | |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2009067657A2 (fr) | 2009-05-28 |
WO2009067657A3 (fr) | 2009-12-30 |
Family
ID=40668095
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2008/084329 | WO2009067657A2 (fr): Methods for identifying molecular function | 2007-11-21 | 2008-11-21 |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2009067657A2 (fr) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030082830A1 (en) * | 1996-10-16 | 2003-05-01 | Schreiber Stuart L. | Synthesis of combinatorial libraries of compounds reminiscent of natural products |
US20040048311A1 (en) * | 2002-01-24 | 2004-03-11 | Dana Ault-Riche | Use of collections of binding sites for sample profiling and other applications |
US20040163137A1 (en) * | 2001-02-20 | 2004-08-19 | Caroline Barry | PG-3 and biallelic markers thereof |
US20060210452A1 (en) * | 1989-06-07 | 2006-09-21 | Affymax Technologies, N.V., A Netherlands Antilles Corporation | Very large scale immobilized polymer synthesis |
US20070254289A1 (en) * | 2005-10-26 | 2007-11-01 | Applera Corporation | Genetic polymorphisms associated with Alzheimer's Disease, methods of detection and uses thereof |
- 2008-11-21: WO application PCT/US2008/084329 filed as WO2009067657A2 (fr); status: active, Application Filing
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11978534B1 (en) | 2017-07-07 | 2024-05-07 | Arizona Board Of Regents On Behalf Of Arizona State University | Prediction of binding from binding data in peptide and other arrays |
US11205139B2 (en) | 2018-08-06 | 2021-12-21 | Arizona Board Of Regents On Behalf Of Arizona State University | Computational analysis to predict molecular recognition space of monoclonal antibodies through random-sequence peptide arrays |
US11934929B2 (en) | 2018-08-06 | 2024-03-19 | Arizona Board Of Regents On Behalf Of Arizona State University | Computational analysis to predict molecular recognition space of monoclonal antibodies through random-sequence peptide arrays |
Also Published As
Publication number | Publication date |
---|---|
WO2009067657A3 (fr) | 2009-12-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 08852574; Country of ref document: EP; Kind code of ref document: A2 |
 | NENP | Non-entry into the national phase in: | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 08852574; Country of ref document: EP; Kind code of ref document: A2 |