WO2007098130A2 - Novel pooling and deconvolution strategy for large-scale screening - Google Patents

Novel pooling and deconvolution strategy for large-scale screening

Info

Publication number
WO2007098130A2
Authority
WO
WIPO (PCT)
Prior art keywords
bait
items
prey
library
experiment
Prior art date
Application number
PCT/US2007/004313
Other languages
English (en)
Other versions
WO2007098130A9 (fr)
WO2007098130A3 (fr)
Inventor
Jing Huang
Fulai Jin
Tony R. Hazbun
Original Assignee
The Regents Of The University Of California
Purdue Research Foundation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by The Regents Of The University Of California, Purdue Research Foundation
Publication of WO2007098130A2
Publication of WO2007098130A9
Publication of WO2007098130A3

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/60 In silico combinatorial chemistry
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B35/00 ICT specially adapted for in silico combinatorial libraries of nucleic acids, proteins or peptides
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B35/00 ICT specially adapted for in silico combinatorial libraries of nucleic acids, proteins or peptides
    • G16B35/20 Screening of libraries
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/60 In silico combinatorial chemistry
    • G16C20/64 Screening of libraries

Definitions

  • each proteome-wide probing normally employs one bait protein and identifies on average five prey proteins (network neighbors) [21]. 6,144 (size of the yeast proteome) experiments are required to cover the whole interactome in one pass.
  • the proteome-wide platforms, e.g., the arrayed yeast two-hybrid library [2] and protein microarrays [40,11-12], have the physical capacity to detect thousands to tens of thousands of proteins, far more than five preys per experiment.
  • the efficiency of interactome mapping can be increased by screening multiple baits together, if the relationship between mixed baits and their interacting preys can be deconvoluted.
  • One possibility is to label baits with distinct fluorescent dyes (or potentially quantum dots). This "one color-one bait" approach, however, would be quite limiting for both technical and economic reasons.
  • a unique N-bit binary code is assigned to each of M items in a bait library.
  • Each of the N bits is associated with one pair of the 2N experiments.
  • Each experiment of a pair is associated with a binary state of the bit associated with that pair.
  • Each experiment involves the bait items whose binary codes have the binary state of the bit associated with that experiment and may involve the same prey items.
  • a result of each experiment is analyzed to determine if a prey item interacts with one or more of the bait items in each experiment. Based on the binary states of the bits associated with the experiments in which the first prey item interacts with a bait, a subset of the M bait items that potentially interact with the first prey item is determined.
  • Figure 1 illustrates a method for detecting interactions according to an embodiment of the present invention.
  • Figure 14. (a) Cross-validation of the ambiguous hits on SPA_4. These ambiguous hits were first cross-validated with the results from other SPAs to narrow them down further (9th column), and all putative interactions were then retested experimentally in quadruplicate (10th column, number reproduced out of 4 repetitions). These putative hits were also compared to one dataset from a previous duplicate screening using the original array with single AD strains (11th column), one dataset from a previous bait-pooling screening (12th column), and one dataset from the literature (13th column). (b) Cross-validation of the ambiguous hits on SPA_5. (c) Cross-validation of the ambiguous hits on SPA_6.
  • One embodiment of the present invention provides for a novel pooling-deconvolution strategy that can dramatically decrease the effort required to generate large-scale data sets.
  • This "PI-Deconvolution" strategy employs imaginary tagging and allows the screening of 2^N probe proteins (baits) in 2*N pools, with N replicates for each bait. Deconvolution of baits with their binding partners (preys) can be achieved by reading the prey's profile from the 2*N experiments.
  • Embodiments of the invention have aspects of binary coding (imaginary tagging) of baits, combinatorial mix-bait screening, and built-in prey-bait tracking and cross-validation. The number of bits is not limited and can be any number, but is preferably 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10.
  • An experiment occurs when one or more prey and one or more baits are put into an environment where an interaction between a bait and prey may be detected.
  • An experiment may utilize one or more plates.
  • An experiment may also refer to each well of a plate.
  • a "label" or a "detectable moiety or marker" is a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical means.
  • useful labels include ³²P, fluorescent dyes and proteins (e.g., used in FRET), electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins which can be made detectable, e.g., by incorporating a radiolabel into the peptide or used to detect antibodies specifically reactive with the peptide.
  • a marker can also be phenotypic change in a cell.
  • a detector is used to detect the label. Typical detectors include spectrophotometers, phototubes and photodiodes, microscopes, scintillation counters, cameras, film and the like, as well as combinations thereof.
  • a "small organic molecule" refers to an organic molecule, either naturally occurring or synthetic, that has a molecular weight of more than about 50 Daltons and less than about 2500 Daltons, preferably less than about 2000 Daltons, preferably between about 100 to about 1000 Daltons, more preferably between about 200 to about 500 Daltons.
  • An "siRNA" or "RNAi" refers to a nucleic acid that forms a double-stranded RNA, which has the ability to reduce or inhibit expression of a gene or target gene when the siRNA is expressed in the same cell as the gene or target gene. "siRNA" or "RNAi" thus refers to the double-stranded RNA formed by the complementary strands.
  • Aptamers are DNA or RNA molecules that have been selected from random pools based on their ability to bind other molecules with high affinity and specificity (see, e.g., Cox and Ellington, Bioorg. Med. Chem. 9:2525-2531 (2001); Lee et al., Nuc. Acids Res. 32:D95-D100 (2004)). Aptamers have been selected which bind nucleic acid, proteins, small organic compounds, vitamins, inorganic compounds, cells, and even entire organisms.
  • Antibody refers to a polypeptide substantially encoded by an immunoglobulin gene or immunoglobulin genes, or fragments thereof which specifically bind and recognize an analyte (antigen).
  • amino acid refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids.
  • Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine.
  • Amino acid analogs refer to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid.
  • Amino acid mimetics refers to chemical compounds that have a structure that is different from the general chemical structure of an amino acid, but that functions in a manner similar to a naturally occurring amino acid.
  • A "biomolecule" is a molecule found in nature. Examples of biomolecules include, but are not limited to, polypeptides, polynucleotides, and carbohydrates.
  • a "customer" is any individual, institution, corporation, university, or organization seeking to obtain products and/or services.
  • FIG. 1 illustrates a Pooling and deconvolution method 100, PI-Deconvolution (PID), for determining an interaction between baits and preys according to an embodiment of the invention.
  • the baits and preys are organized into a bait library and a prey library. These libraries may be identical as described above.
  • the interactions between the baits and the preys occur during one or more experiments.
  • PI-Deconvolution, a novel pooling and deconvolution strategy
  • PID improves coverage and accuracy simultaneously, without the necessity of secondary screens.
  • most hits are at least partially deconvoluted (92% of the hits are narrowed down to at most 4 baits); further deconvolution can be achieved by pair-wise confirmation.
  • PID is generally applicable to both two-hybrid array and proteome microarray platforms.
  • the bait and prey items described herein are often members of a library, which can be organized in the form of an array.
  • the bait and prey can be members of the same library, overlapping libraries, or different libraries.
  • Libraries can be used, e.g., to assay for protein-protein interactions, to identify enzymatic substrates, for example protein kinase substrates, to identify nucleic acids, including SNPs and allelic variants, to identify proteins and antibodies, to assay for pharmacogenetic effects, and to identify drugs that affect the function of genes or proteins, e.g., by investigating the effect of the drug on a molecule or cell.
  • high throughput screening methods involve providing a combinatorial library, e.g., a chemical or peptide library, containing a large number of potential therapeutic compounds.
  • Such "combinatorial chemical libraries" or "ligand libraries" are then screened.
  • the compounds thus identified can serve as conventional "lead compounds" or can themselves be used as potential or actual therapeutics.
  • a combinatorial chemical library is a collection of diverse chemical compounds generated by either chemical synthesis or biological synthesis, by combining a number of chemical "building blocks" such as reagents.
  • WO 91/19735 encoded peptides (e.g., PCT Publication WO 93/20242), random bio-oligomers (e.g., PCT Publication No. WO 92/00091), benzodiazepines (e.g., U.S. Pat. No. 5,288,514), diversomers such as hydantoins, benzodiazepines and dipeptides (Hobbs et al., Proc. Nat. Acad. Sci. USA 90:6909-6913 (1993)), vinylogous polypeptides (Hagihara et al., J. Amer. Chem. Soc.
  • receptor-ligand interactions are also appropriate as tag and tag-binder pairs, such as agonists and antagonists of cell membrane receptors (e.g., cell receptor-ligand interactions such as transferrin, c-kit, viral receptor ligands, cytokine receptors, chemokine receptors, interleukin receptors, immunoglobulin receptors and antibodies, the cadherin family, the integrin family, the selectin family, and the like; see, e.g., Pigott & Power, The Adhesion Molecule Facts Book I (1993)).
  • Synthetic polymers such as polyurethanes, polyesters, polycarbonates, polyureas, polyamides, polyethyleneimines, polyarylene sulfides, polysiloxanes, polyimides, and polyacetates can also form an appropriate tag or tag binder. Many other tag/tag binder pairs are also useful in assay systems described herein, as would be apparent to one of skill upon review of this disclosure.
  • Tag binders are fixed to solid substrates using any of a variety of methods currently available.
  • Solid substrates are commonly derivatized or functionalized by exposing all or a portion of the substrate to a chemical reagent which fixes a chemical group to the surface which is reactive with a portion of the tag binder.
  • groups which are suitable for attachment to a longer chain portion would include amines, hydroxyl, thiol, and carboxyl groups.
  • Aminoalkylsilanes and hydroxyalkylsilanes can be used to functionalize a variety of surfaces, such as glass surfaces. The construction of such solid phase biopolymer arrays is well described in the literature (see, e.g., Merrifield, J. Am. Chem. Soc.
  • the information storage medium can be other than an internal hard drive on a server. In other aspects, the information storage medium is physically present in the kit.
  • the information storage medium can be a portable storage medium that is inserted into a drive or external port of a computer.
  • a method for offering a protein interaction identification kit to a customer includes the following: a. presenting the customer with an identity of each of a population of biomolecules; b. accepting from the customer, an identification of target bait molecules from the population of biomolecules; and c.
  • the provider can then either provide the pools of the bait library to the customer along with the prey library, or can perform a pooling deconvolution method using the prey library and the pools of the bait library.
  • the prey library can be less than the entire library presented to the customer, but in preferred examples, includes the entire library.
  • the kit provided to a customer in this aspect of the invention can include any of the features of the kits described above.
  • yeast proteome microarrays [11,12], which contain 4,088 purified Saccharomyces cerevisiae proteins (as glutathione S-transferase fusions) immobilized on nitrocellulose-coated glass slides.
  • a small network of protein interactions (Fig. 4a, bottom) was derived. This is a "gold standard" network, because all the interactions in the network have been reciprocally confirmed (bi-directional red arrows in Fig. 4a).
  • the 16 bait strains were mixed into 8 pools and screened against the two-hybrid array. In this procedure, two 8-bait pools in the same pair cover all of the 16 baits. Therefore, 4 pairs of PI-Deconvolution screens represent 4 independent screens of all the 16 baits. This protocol is a significant advantage over the individual bait procedure because it reduces the number of screens from 32 to 8, yet each bait is screened in quadruplicate. [0124] In the 13 single bait screens, 484 preys were observed and defined as two-hybrid positive colonies [2,14].
  • n value, i.e., screening a larger pool
  • acceptable pool size is also determined by the sensitivity and background of the detection method (as is true for any pooling strategy).
  • Because pooled screening generally relies on the gain of a signal, drug hypersensitivity cannot be scored in a pooling screen using fitness as a readout.
  • Example 5 illustrates a bait pool size of 3.
  • a method for detecting a molecule that affects the phosphorylation of a polypeptide or protein by a kinase wherein the polypeptide or protein identified is a substrate for the kinase.
  • the polypeptide or protein is contacted with the kinase in the presence of a test molecule, under conditions permissive for phosphorylation of the substrate by the kinase. Phosphorylation of the substrate by the kinase is then detected. A difference in phosphorylation in the presence versus absence of the test molecule indicates that the test molecule affects phosphorylation of the substrate by the kinase.
  • a YEAST TWO-HYBRID SMART POOL ARRAY SYSTEM FOR PROTEIN INTERACTION MAPPING [0134] A novel two-hybrid smart pool array (SPA) system was prepared in which, instead of individual AD strains, well-designed AD pools were screened in an array format that enables built-in replication and prey-bait deconvolution. Using this method, a Saccharomyces cerevisiae genome SPA increases Y2H screening efficiency by an order of magnitude. [0135] Bait pooling does not provide as large a benefit to most investigator-initiated research programs, which often focus on screening only one or a few select baits. However, instead of pooling baits, the same pooling-deconvolution principle can be applied to pool prey (AD) strains, enabling efficient screening of individual baits with high accuracy and coverage.
  • AD pool prey
  • the identity of the positive strain in a 64-strain set can be uniquely deconvoluted (to "+" or "-" profiles only) from the pattern of the corresponding 12 spots (e.g., Example 1 of Figure 7).
  • If a 64-strain set contains more than one positive AD strain, there will be deconvolution ambiguity ("?" profiles). False positive or false negative spots can also cause "?" or "n" in the profile, but the profile can still be partially deconvoluted (e.g., Examples 2 and 3 of Figure 7) as described below.
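
The binary coding scheme described in the definitions above (an N-bit code per item, one pair of pools per bit, and "+", "-", "?", "n" profiles) can be illustrated with a minimal Python sketch. It is not the patent's implementation. The function names (`design_pools`, `simulate_hits`, `read_profile`, `candidates`) are hypothetical, the mapping of "+" to bit state 1 and "-" to bit state 0 is an assumption, and the simulated assay is assumed to be error-free unless a hit is removed by hand.

```python
from itertools import product


def design_pools(n_bits):
    """Build 2*n_bits pools for 2**n_bits items.

    pools[(bit, state)] holds every item whose binary code has `state` at `bit`,
    so each item lands in exactly one pool of every pair.
    """
    pools = {(bit, state): set() for bit in range(n_bits) for state in (0, 1)}
    for item in range(2 ** n_bits):
        for bit in range(n_bits):
            pools[(bit, (item >> bit) & 1)].add(item)
    return pools


def simulate_hits(pools, positives):
    """Idealized assay: a pool gives a signal if it contains any positive item."""
    positives = set(positives)
    return {key for key, members in pools.items() if members & positives}


def read_profile(hits, n_bits):
    """Turn the pair-wise pool results into one symbol per bit.

    Assumed convention: '+' only the state-1 pool is positive, '-' only the
    state-0 pool is positive, '?' both are positive, 'n' neither is positive.
    """
    symbols = {
        (False, True): "+",
        (True, False): "-",
        (True, True): "?",
        (False, False): "n",
    }
    return [symbols[((bit, 0) in hits, (bit, 1) in hits)] for bit in range(n_bits)]


def candidates(profile):
    """List every item consistent with the profile; '?' and 'n' leave a bit open."""
    options = [{"+": (1,), "-": (0,)}.get(symbol, (0, 1)) for symbol in profile]
    return sorted(
        sum(value << bit for bit, value in enumerate(bits))
        for bits in product(*options)
    )


if __name__ == "__main__":
    pools = design_pools(6)  # 64 strains screened as 12 pooled spots
    print(read_profile(simulate_hits(pools, {37}), 6))
    print(candidates(read_profile(simulate_hits(pools, {37}), 6)))
    print(candidates(read_profile(simulate_hits(pools, {5, 37}), 6)))
```

With six bits, a single positive strain gives a profile of only "+" and "-" symbols and one candidate. Two positive strains whose codes differ in one bit give one "?" and two candidates, which is the partial deconvolution the text describes.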

Landscapes

  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Library & Information Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Biotechnology (AREA)
  • Biochemistry (AREA)
  • Molecular Biology (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Biophysics (AREA)
  • Medicinal Chemistry (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Computing Systems (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

Embodiments of the present invention relate to methods, systems, kits, and data analysis apparatus for detecting interactions between atoms, molecules, and/or cells. One embodiment of the present invention provides a novel pooling-deconvolution strategy that can dramatically decrease the effort required to generate large-scale data sets. This "PI-Deconvolution" strategy uses imaginary base-X coding with N digits for X^N probe proteins (baits or preys) and allows the baits to be screened in X*N pools, with N replicates for each bait. Deconvolution of proteins with their binding partners can be achieved by reading the prey's profile from the X*N experiments. The method can be used to screen X^N bait or prey proteins.
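
The base-X generalization in the abstract (X^N items, X*N pools, N replicates per item) can be checked with a short Python sketch. This is an illustration under stated assumptions, not the patent's procedure: the names `to_digits`, `design_base_x_pools`, and `decode_single_hit` are hypothetical, items are numbered 0 to X^N - 1, and decoding is only attempted when exactly one pool per digit position is positive.

```python
def to_digits(item, base, n_digits):
    """Write `item` as n_digits base-`base` digits, least significant first."""
    digits = []
    for _ in range(n_digits):
        item, digit = divmod(item, base)
        digits.append(digit)
    return digits


def design_base_x_pools(base, n_digits):
    """Build base*n_digits pools for base**n_digits items.

    pools[(position, value)] lists the items whose code has `value` at
    `position`, so every item appears in exactly n_digits pools.
    """
    pools = {(pos, val): [] for pos in range(n_digits) for val in range(base)}
    for item in range(base ** n_digits):
        for pos, val in enumerate(to_digits(item, base, n_digits)):
            pools[(pos, val)].append(item)
    return pools


def decode_single_hit(positive_pools, base, n_digits):
    """Recover the item when exactly one pool per digit position is positive.

    Returns None when a position has zero or several positive pools, because
    the profile is then ambiguous or incomplete.
    """
    digits = [None] * n_digits
    for pos, val in positive_pools:
        if digits[pos] is not None:
            return None
        digits[pos] = val
    if None in digits:
        return None
    return sum(digit * base ** pos for pos, digit in enumerate(digits))


if __name__ == "__main__":
    for base, n_digits in [(2, 4), (3, 3), (2, 6)]:
        pools = design_base_x_pools(base, n_digits)
        print(base, n_digits, base ** n_digits, "items in", len(pools), "pools")
```

For X = 2 and N = 4 this gives 16 items in 8 pools of 8 items each, with every item screened 4 times.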
PCT/US2007/004313 2006-02-16 2007-02-16 Novel pooling and deconvolution strategy for large-scale screening WO2007098130A2 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US77431706P 2006-02-16 2006-02-16
US60/774,317 2006-02-16

Publications (3)

Publication Number Publication Date
WO2007098130A2 WO2007098130A2 (fr) 2007-08-30
WO2007098130A9 WO2007098130A9 (fr) 2007-10-18
WO2007098130A3 WO2007098130A3 (fr) 2008-11-06

Family

ID=38437936

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2007/004313 WO2007098130A2 (fr) 2006-02-16 2007-02-16 Novel pooling and deconvolution strategy for large-scale screening

Country Status (1)

Country Link
WO (1) WO2007098130A2 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017205251A1 (fr) * 2016-05-25 2017-11-30 Bioinventors & Entrepreneurs Network, Llc Attribute screening and profiling with sample enrichment by optimized pooling

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030032066A1 (en) * 2001-03-19 2003-02-13 Pierre Legrain Protein-protein interaction map inference using interacting domain profile pairs
US20030165873A1 (en) * 2001-03-02 2003-09-04 Come Jon H. Three hybrid assay system

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030165873A1 (en) * 2001-03-02 2003-09-04 Come Jon H. Three hybrid assay system
US20030032066A1 (en) * 2001-03-19 2003-02-13 Pierre Legrain Protein-protein interaction map inference using interacting domain profile pairs

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ITO ET AL.: 'Toward a protein-protein interaction map of the budding yeast: A comprehensive system to examine two-hybrid interactions in all possible combinations between the yeast proteins' PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA vol. 97, no. 3, 01 February 2000, pages 1143 - 1147 *
TAVERNIER ET AL.: 'MAPPIT: a cytokine receptor-based two-hybrid method in mammalian cells' CLINICAL & EXPERIMENTAL ALLERGY vol. 32, 2002, pages 1397 - 1404 *

Also Published As

Publication number Publication date
WO2007098130A9 (fr) 2007-10-18
WO2007098130A3 (fr) 2008-11-06

Similar Documents

Publication Publication Date Title
Chanda et al. Fulfilling the promise: drug discovery in the post-genomic era
Bader et al. Functional genomics and proteomics: charting a multidimensional map of the yeast cell
Buck et al. ChIP-chip: considerations for the design, analysis, and application of genome-wide chromatin immunoprecipitation experiments
Wilson et al. Recent developments in protein microarray technology
Espadaler et al. Prediction of protein–protein interactions using distant conservation of sequence patterns and structure relationships
Lay Jr et al. Problems with the “omics”
Braun Interactome mapping for analysis of complex phenotypes: insights from benchmarking binary interaction assays
Scholtens et al. Local modeling of global interactome networks
Wang et al. In vitro DNA-binding profile of transcription factors: methods and new insights
Jin et al. A pooling-deconvolution strategy for biological network elucidation
Naidu et al. Current knowledge on microarray technology-an overview
WO2007098130A2 (fr) Novel pooling and deconvolution strategy for large-scale screening
Alexandari et al. De novo distillation of thermodynamic affinity from deep learning regulatory sequence models of in vivo protein-DNA binding
US20040067539A1 (en) Method of making and using microarrays of biological materials
Johnston The yeast genome: on the road to the Golden Age
Braberg et al. Genetic interaction analysis of point mutations enables interrogation of gene function at a residue‐level resolution: Exploring the applications of high‐resolution genetic interaction mapping of point mutations
US20040096840A1 (en) Validated design for microarrays
Yao et al. Exploiting antigen receptor information to quantify index switching in single-cell transcriptome sequencing experiments
US6994965B2 (en) Method for displaying results of hybridization experiment
Uttamchandani et al. The expanding world of small molecule microarrays
Frueh et al. Large-scale molecular profiling approaches facilitating translational medicine: Genomics, transcriptomics, proteomics, and metabolomics
US20040073527A1 (en) Method, system and computer software for predicting protein interactions
Kuijpers et al. Split Pool Ligation-based Single-cell Transcriptome sequencing (SPLiT-seq) data processing pipeline comparison
Barh et al. In Silico and Ultrahigh‐Throughput Screenings (uHTS) in Drug Discovery: An Overview
Koehler Microarrays in chemical biology

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 07751098

Country of ref document: EP

Kind code of ref document: A2