WO2004092419A2 - Diagnosis of hyperinsulinemia and type II diabetes and corresponding protection (I)


Info

Publication number
WO2004092419A2
Authority
WO
WIPO (PCT)
Prior art keywords
protein
human
clone
diabetic
master table
Prior art date
Application number
PCT/US2004/009629
Other languages
English (en)
Other versions
WO2004092419A3 (fr)
Inventor
Bruce Kelder
John J. Kopchick
Original Assignee
Ohio University
Priority date
Filing date
Publication date
Application filed by Ohio University
Publication of WO2004092419A2
Publication of WO2004092419A3

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/158: Expression markers

Definitions

  • the invention relates to various nucleic acid molecules and proteins, and their use in (1) diagnosing hyperinsulinemia and type II diabetes, or conditions associated with their development, and (2) protecting mammals (including humans) against them.
  • Type II diabetes is the predominant form found in the Western world; fewer than 8% of diabetic Americans have the type I disease.
  • Type I diabetics are often characterized by their low or absent levels of circulating endogenous insulin, i.e., hypoinsulinemia (Unger and Foster, 1998).
  • Islet cell antibodies causing damage to the pancreas are frequently present at diagnosis. Injection of exogenous insulin is required to prevent ketosis and sustain life.
  • Early Type II diabetics are often characterized by hyperinsulinemia and high resistance to insulin. Late Type II diabetics may be normoinsulinemic or hypoinsulinemic. Type II diabetics are usually not insulin dependent or prone to ketosis under normal circumstances.
  • Type II diabetes (formerly known as non-insulin dependent diabetes, NIDDM) is the most common form of elevated blood glucose (hyperglycemia).
  • Type II diabetes is a metabolic disorder that affects approximately 17 million Americans. It is estimated that another 10 million individuals are "prone" to becoming diabetic. These vulnerable individuals can become resistant to insulin, a pancreatic hormone that signals glucose (blood sugar) uptake by fat and muscle.
  • The pancreas produces more insulin, resulting in a condition called hyperinsulinemia.
  • Type II diabetes results.
  • Complications of diabetes include retinopathy, neuropathy, and nephropathy (traditionally designated as microvascular complications) as well as atherosclerosis (a macrovascular complication).
  • Type II diabetes is a metabolic disorder that is characterized by insulin resistance and impaired glucose-stimulated insulin secretion.
  • Type II diabetes and atherosclerotic disease are viewed as consequences of having the insulin resistance syndrome (IRS) for many years.
  • The current theory of the pathogenesis of Type II diabetes is often referred to as the "insulin resistance/islet cell exhaustion" theory. According to this theory, a condition causing insulin resistance compels the pancreatic islet cells to hypersecrete insulin in order to maintain glucose homeostasis.
  • peripheral hyperinsulinemia will be an antecedent of Type II diabetes.
  • Peripheral hyperinsulinemia can be viewed as the difference between what is produced by the β cell and that which is taken up by the liver. Therefore, peripheral hyperinsulinemia can be caused by increased β cell production, decreased hepatic uptake or some combination of both. It is also important to note that it is not possible to determine the origin of insulin resistance once it is established, since the onset of peripheral hyperinsulinemia leads to a condition of global insulin resistance. Multiple environmental and genetic factors are involved in the development of insulin resistance, hyperinsulinemia and type II diabetes. An important risk factor for the development of insulin resistance, hyperinsulinemia and type II diabetes is obesity, particularly visceral obesity. The disease exists world-wide, but in developed societies, the prevalence has risen as the average age of the population increases and the average individual becomes more obese.
  • Obesity is a serious and growing problem in the United States. Obesity-related health risks include high blood pressure, hardening of the arteries, cardiovascular disease, and Type II diabetes (also known as non-insulin-dependent diabetes mellitus, NIDDM). Recent studies show that 85% of the individuals with Type II diabetes are obese.
  • Growth hormone has many roles, ranging from regulation of protein, fat and carbohydrate metabolism to growth promotion.
  • GH is produced in the somatotrophic cells of the anterior pituitary and exerts its effects either through the GH-induced action of IGF-I, in the case of growth promotion, or by direct interaction with the GHR on target cells including liver, muscle, adipose, and kidney cells.
  • Hyposecretion of GH during development leads to dwarfism, and hypersecretion before puberty leads to gigantism.
  • hypersecretion of GH results in acromegaly, a clinical condition characterized by enlarged facial bones, hands, feet, fatigue and an increase in weight. Of those individuals with acromegaly, 25% develop type II diabetes. This may be due to insulin resistance caused by the high circulating levels of GH leading to high circulating levels of insulin (Kopchick et al., Annual Rev. Nutrition 1999. 19:437-61).
  • a further mode of GH action may be through the transcriptional regulation of a number of genes contributing to the physiological effects of GH.
  • mice exhibited an enhanced growth phenotype. They also developed kidney lesions similar to those seen in diabetic glomerulosclerosis; see Yang, et al., Lab. Invest., 68:62-70 (1993). Ogueta, et al., J. Endocrinol., 165:321-8 (2000) reported that transgenic mice expressing bovine GH develop arthritic disorder and self-antibodies.
  • Two of the proteins which mediate growth hormone activity are the growth hormone receptor and the growth hormone binding protein, encoded by the same gene in mice (GHR/BP). It is possible to genetically engineer mice so that the gene encoding these proteins is disrupted ("knocked out"; inactivated); see Zhou, et al., Proc. Nat. Acad. Sci. (USA), 94:13215-20 (1997). Zhou, et al. inactivated the GHR/BP gene by replacing the 3' portion of exon 4 (which encodes a portion of the GH binding domains) and the 5' region of intron 4 with a neomycin gene cassette. The modified gene was introduced into the target mice by homologous recombination.
  • Like mice expressing a GH antagonist, homozygous GHR/BP-KO mice exhibit a dwarf phenotype. GHR/BP-KO mice, made diabetic by streptozotocin treatment, are protected from the development of diabetes-associated nephropathy. Bellush, et al., Endocrinol., 141:163-8 (2000).
  • High-fat diets have been shown to induce both obesity and Type II diabetes in laboratory animals (Surwit et al., 1988).
  • Surwit and colleagues demonstrated that male C57BL/6J mice are extremely sensitive to the diabetogenic effects of a high-fat diet when initiated at weaning.
  • high-fat fed animals had significantly elevated fasting blood-glucose and insulin levels and also demonstrated a decrease in insulin sensitivity (Surwit et al., 1995).
  • Ahren and colleagues (Ahren et al., 1997) reported evidence of insulin resistance as well as diminished glucose-stimulated insulin release, after feeding with a high-fat diet for 12 weeks. These mice also showed elevated levels of total cholesterol, triglycerides, and free fatty acids, another hallmark of Type II diabetes.
  • mammalian subjects are defined as being more favored or less favored, with normal subjects being more favored than hyperinsulinemic subjects, and hyperinsulinemic subjects being more favored than type II diabetic subjects.
  • the subjects' state may then be correlated with their gene expression activity.
  • "favorable" human genes/proteins are defined as those corresponding to mouse cDNAs which were less strongly expressed in mouse hyperinsulinemic liver than in control liver, or less strongly expressed in mouse type II diabetic liver than in control or hyperinsulinemic liver.
  • the control liver is the liver of a mouse which is normal vis-à-vis fasting insulin and fasting glucose levels.
  • normal, as used herein, means normal relative to those parameters, and does not necessitate that the mouse be normal in every respect.
  • corresponding does not mean identical, but rather implies the existence of a statistically significant sequence similarity, such as one sufficient to qualify the human protein or gene as a homologous protein or DNA as defined below. The greater the degree of relationship as thus defined (i.e., by the statistical significance of each alignment used to connect the mouse cDNA to the human protein or gene, measured by an
  • the connection may be direct (mouse cDNA to human protein) or indirect (e.g., mouse cDNA to mouse gene, mouse gene to human protein).
  • the human genes which most closely correspond, directly or indirectly, to the mouse cDNA are preferred, such as the one(s) with the highest, top two highest, top three highest, top four highest, top five highest, and top ten highest E values for the final alignment in the connection process.
  • the human genes/proteins deemed to correspond to our mouse cDNA clones are identified in the Master Tables.
  • a human gene/protein corresponding to a mouse cDNA which was more strongly expressed in hyperinsulinemic liver than in either normal or type II diabetic liver will be deemed both "unfavorable", by virtue of the control:hyperinsulinemic comparison, and "favorable", by virtue of the hyperinsulinemic:diabetic comparison. This is one of several possible "mixed" expression patterns.
  • Agents which bind the "favorable" and "unfavorable" nucleic acids may be used to evaluate whether a human subject is at increased or decreased risk for progression toward type II diabetes.
  • a subject with one or more elevated "unfavorable" and/or one or more depressed "favorable" genes/proteins is at increased risk, and one with one or more elevated "favorable" and/or one or more depressed "unfavorable" genes/proteins is at decreased risk.
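The favorable / unfavorable / mixed labelling described above can be sketched in a few lines. This is only an illustration: the function name, the numeric expression values, and the use of plain greater-than / less-than comparisons (rather than whatever significance criterion the authors applied) are assumptions, not part of the source.

```python
def classify_pattern(control, hyperinsulinemic, diabetic):
    """Label a cDNA from its liver expression level in the three states.

    Hypothetical helper: a higher number means stronger expression, and any
    difference counts (no statistical threshold is modelled here).
    """
    labels = set()
    # Each pair is (more favored state, less favored state).
    pairs = (
        (control, hyperinsulinemic),
        (control, diabetic),
        (hyperinsulinemic, diabetic),
    )
    for more_favored, less_favored in pairs:
        if less_favored < more_favored:
            labels.add("favorable")      # weaker expression in the worse state
        elif less_favored > more_favored:
            labels.add("unfavorable")    # stronger expression in the worse state
    if labels == {"favorable", "unfavorable"}:
        return "mixed"
    return labels.pop() if labels else "unchanged"


# The pattern from the text: strongest in hyperinsulinemic liver.
print(classify_pattern(control=1.0, hyperinsulinemic=3.0, diabetic=1.0))
```

A clone that peaks in hyperinsulinemic liver triggers both branches, which is why it comes out as "mixed".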
  • the assay may be used as a preliminary screening assay to select subjects for further analysis, or as a formal diagnostic assay.
  • the identification of the related genes and proteins may also be useful in protecting humans against these disorders.
  • the related human DNAs may be identified by comparing the mouse sequence (or its AA translation product) to known human DNAs (and their AA translation products). If this is unsuccessful, human cDNA or genomic DNA libraries may be screened using the mouse DNA as a probe.
  • a mouse is considered to be a diabetic subject if, regardless of its fasting plasma insulin level, it has a fasting plasma glucose level of at least 190 mg/dL.
  • a mouse is considered to be a hyperinsulinemic subject if its fasting plasma insulin level is at least 0.67 ng/mL and it does not qualify as a diabetic subject.
  • a mouse is considered to be "normal" if it is neither diabetic nor hyperinsulinemic. Thus, normality is defined in a very limited manner.
  • a mouse is considered "obese" if its weight is at least 15% in excess of the mean weight for mice of its age and sex.
  • a mouse which does not satisfy this standard may be characterized as "non-obese", the term "normal" being reserved for use in reference to glucose and insulin levels as previously described.
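The mouse criteria above amount to a small decision rule, sketched here under the thresholds stated in the text (fasting glucose of at least 190 mg/dL, fasting insulin of at least 0.67 ng/mL, weight at least 15% above the age- and sex-matched mean). The function and parameter names are hypothetical.

```python
def classify_mouse(fasting_glucose_mg_dl, fasting_insulin_ng_ml):
    """Return 'diabetic', 'hyperinsulinemic' or 'normal' for a mouse."""
    # Glucose is checked first: a diabetic mouse is diabetic
    # regardless of its insulin level.
    if fasting_glucose_mg_dl >= 190:
        return "diabetic"
    if fasting_insulin_ng_ml >= 0.67:
        return "hyperinsulinemic"
    return "normal"


def mouse_is_obese(weight_g, mean_weight_same_age_sex_g):
    """Obese means at least 15% above the mean for the same age and sex."""
    return weight_g >= 1.15 * mean_weight_same_age_sex_g


print(classify_mouse(150, 1.2))     # elevated insulin, glucose below 190
print(mouse_is_obese(40.0, 30.0))   # well above the 15% margin
```

Note that "normal" here says nothing about weight; obesity is tracked separately, as in the text.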
  • a human is considered a diabetic subject if, regardless of his or her fasting plasma insulin level, the fasting plasma glucose level is at least 126 mg/dL.
  • a human is considered a hyperinsulinemic subject if the fasting plasma insulin level is more than 26 micro International Units/mL (it is believed that this is equivalent to 1.08 ng/mL), and he or she does not qualify as a diabetic subject.
  • a human is considered to be "normal" if he or she is neither diabetic nor hyperinsulinemic. Thus, normality is defined in a very limited manner.
  • a human is considered "obese" if the body mass index (BMI) (weight divided by height squared) is at least 30 kg/m².
  • a human who does not satisfy this standard may be characterized as "non-obese", the term "normal" being reserved for use in reference to glucose and insulin levels as previously described.
  • a human is considered overweight if the BMI is at least 25 kg/m².
  • we define overweight to include obese individuals, consistent with the recommendations of the National Institute of Diabetes and Digestive and Kidney Disease (NIDDK).
  • a human who does not satisfy this standard may be characterized as "non-overweight."
  • the diagnostic and protective methods of the present invention are applied to human subjects exhibiting one or more of the aforementioned risk factors. Likewise, in a preferred embodiment, they are applied to human subjects who, while not diabetic, exhibit impaired glucose homeostasis (110 to <126 mg/dL).
  • the age of the subjects is at least 45, at least 50, at least 55, at least 60, at least 65, at least 70, or at least 75.
  • the BMI of the human subjects is at least 23, at least 24, at least 25 (i.e., overweight by our criterion), at least 26, at least 27, at least 28, at least 29, at least 30 (i.e., obese), at least 31, at least 32, at least 33, at least 34, at least 35, at least 36, at least 37, at least 38, at least 39, at least 40, or over 40.
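The human criteria can be written out the same way. The sketch below uses only the cut-offs given in the text (fasting glucose of at least 126 mg/dL, fasting insulin above 26 µIU/mL, BMI of at least 25 and 30 kg/m²); the function names and the dictionary return shape are illustrative assumptions.

```python
def classify_human(fasting_glucose_mg_dl, fasting_insulin_uiu_ml):
    """Return 'diabetic', 'hyperinsulinemic' or 'normal' for a human subject."""
    if fasting_glucose_mg_dl >= 126:      # regardless of insulin level
        return "diabetic"
    if fasting_insulin_uiu_ml > 26:       # strictly more than 26 µIU/mL
        return "hyperinsulinemic"
    return "normal"


def weight_status(weight_kg, height_m):
    """Compute BMI and the overweight/obese flags.

    'overweight' includes obese individuals, as the text specifies.
    """
    bmi = weight_kg / height_m ** 2
    return {"bmi": bmi, "overweight": bmi >= 25, "obese": bmi >= 30}


print(classify_human(100, 30))
print(weight_status(80, 1.75))
```

Keeping the glucose/insulin status and the weight status as separate results mirrors the text, which reserves "normal" for the former only.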
  • Mixed genes/proteins are those exhibiting a combination of favorable and unfavorable behavior. They are considered to be both favorable and unfavorable for the purpose of the claims.
  • a mixed gene/protein can be used as would a favorable gene/protein if its favorable behavior outweighs the unfavorable. It can be used as would an unfavorable gene/protein if its unfavorable behavior outweighs the favorable. Preferably, they are used in conjunction with other agents that affect their balance of favorable and unfavorable behavior.
  • Use of mixed genes/proteins is, in general, less desirable than use of purely favorable or purely unfavorable genes/proteins.
  • "Favorable" genes/proteins are those corresponding to cDNAs less strongly expressed in type II diabetic or hyperinsulinemic liver than in normal liver, or less strongly expressed in type II diabetic liver than in hyperinsulinemic liver.
  • "Unfavorable" genes/proteins are those corresponding to cDNAs more strongly expressed in hyperinsulinemic or type II diabetic liver as compared to normal liver, or in type II diabetic liver as compared to hyperinsulinemic liver.
  • For each of the differentially expressed cDNAs, corresponding mouse and human proteins have been identified, as set forth in Master Table 1. More than one human protein may be identified as corresponding to a particular mouse clone. In addition, we have considered whether these cDNAs may correspond to particular classes or subclasses of human proteins, as set forth in Master Table 2.
  • the cDNAs of the disclosed clones may be used directly. For diagnostic or screening purposes, they (or specific binding fragments thereof) may be labeled and used as hybridization probes. For therapeutic purposes, they (or specific binding fragments thereof) may be used as antisense reagents to inhibit the expression of the corresponding gene, or of a sufficiently homologous gene of another species.
  • if the cDNA appears to be a full-length cDNA, that is, if it encodes an entire, functional protein, then it may be used in the expression of that protein.
  • Such expression may be in cell culture, with the protein subsequently isolated and administered exogenously to subjects who would benefit therefrom, or in vivo, i.e., administration by gene therapy.
  • any DNA encoding the same protein, or a fragment or a mutant protein which retains the desired activity may be used for the same purpose.
  • the encoded protein of course has utility therapeutically and, in labeled or immobilized form, diagnostically.
  • the cDNAs of the disclosed clones may also be used indirectly, that is, to identify other useful DNAs, proteins, or other molecules.
  • it is determined whether the cDNAs disclosed herein have significant similarity to any known DNA, and whether, in any of the six possible combinations of reading frame and strand, they encode a protein similar to a known protein. If so, then it follows that the known protein, and DNAs encoding that protein, may be used in a similar manner. In addition, if the known protein is known to have additional homologues, then those homologous proteins, and DNAs encoding them, may be used in a similar manner.
  • a DNA->DNA (BlastN) search for database DNAs closely related to the mouse cDNA clone identifies a particular mouse (or other nonhuman, e.g., rat) gene, and that nonhuman gene encodes a protein for which there is a known human protein homologue;
  • a DNA->DNA (BlastN) search of the database for human DNAs closely related to the mouse cDNA clone identifies a particular human DNA as a homologue of the mouse cDNA, and the corresponding human protein is known (e.g., by translation of the human DNA);
  • mouse cDNA encodes a mouse protein which appears similar to a human protein
  • that human protein may be used (especially in humans) for purposes analogous to the proposed use of the mouse protein in mice.
  • a specific binding fragment of an appropriate strand of the corresponding human gene or cDNA could be labeled and used as a hybridization probe (especially against samples of human mRNA or cDNA).
  • the disclosed cDNAs have significant similarities to known DNAs (and their translated AA sequences to known proteins)
  • Preferred parameters are set forth in Example 1. The results are also dependent on the content of the database. While the raw similarity score of a particular target (database) sequence will not vary with content (as long as it remains in the database), its informational value (in bits), expected value, and relative ranking can change. Generally speaking, the changes are small. It is possible to use the sequence of the entire cDNA insert to query the database. However, the error rate increases as a sequencing run progresses. Hence, it may be beneficial to search the database using a truncated (presumably more accurate) sequence, especially if the insert is quite long.
  • nucleic acid and protein databases keep growing. Hence a later search may identify high scoring target sequences which were not uncovered by an earlier search because the target sequences were not previously part of a database.
  • cognate DNAs and proteins include not only those set forth in the examples, but those which would have been highly ranked (top ten, more preferably top three, even more preferably top two, most preferably the top one) in a search run with the same parameters on the date of filing of this application.
  • if the cDNA appears to be a partial cDNA, it may be used as a hybridization probe to isolate the full-length cDNA. If the partial cDNA encodes a biologically functional fragment of the cognate protein, it may be used in a manner similar to the full length cDNA, i.e., to produce the functional fragment.
  • an antagonist of a protein or other molecule may be obtained by preparing a combinatorial library, as described below, of potential antagonists, and screening the library members for binding to the protein or other molecule in question. The binding members may then be further screened for the ability to antagonize the biological activity of the target.
  • the antagonists may be used therapeutically, or, in suitably labeled or immobilized form, diagnostically. If the cDNA is related to a known protein, then substances known to interact with that protein (e.g., agonists, antagonists, substrates, receptors, second messengers, regulators, and so forth), and binding molecules which bind them, are also of utility. Such binding molecules can likewise be identified by screening a combinatorial library.
  • if a cDNA of the present invention is a partial cDNA, and the cognate full length cDNA is not listed in a sequence database, the available cDNA may be used as a hybridization probe to isolate the full-length cDNA from a suitable cDNA library.
  • Stringent hybridization conditions are appropriate, that is, conditions in which the hybridization temperature is 5-10 °C below the Tm of the cDNA as a perfect duplex.
  • sequence databases available do not include the sequence of any homologous gene, or at least of the homologous gene for a species of interest. However, given the cDNAs set forth above, one may readily obtain the homologous gene.
  • this partial cDNA may first be used as a probe to isolate the corresponding full length cDNA for the same species, and that the latter may be used as the starting DNA in the search for homologous genes.
  • the starting DNA, or a fragment thereof, is used as a hybridization probe to screen a cDNA or genomic DNA library for clones containing inserts which encode either the entire homologous protein, or a recognizable fragment thereof.
  • the minimum length of the hybridization probe is dictated by the need for specificity.
  • the human cDNA library is about 10⁸ bases and the human genomic DNA library is about 10¹⁰ bases.
  • the library is preferably derived from an organism which is known, on biochemical evidence, to produce a homologous protein, and more preferably from the genomic DNA or mRNA of cells of that organism which are likely to be relatively high producers of that protein.
  • a cDNA library (which is derived from an mRNA library) is especially preferred. If the organism in question is known to have substantially different codon preferences from that of the organism whose relevant cDNA or genomic DNA is known, a synthetic hybridization probe may be used which encodes the same amino acid sequence but whose codon utilization is more similar to that of the DNA of the target organism.
  • the synthetic probe may employ inosine as a substitute for those bases which are most likely to be divergent, or the probe may be a mixed probe which mixes the codons for the source DNA with the preferred codons (encoding the same amino acid) for the target organism.
  • the Tm of a perfect duplex of starting DNA is determined. One may then select a hybridization temperature which is sufficiently lower than the perfect duplex Tm to allow hybridization of the starting DNA (or other probe) to a target DNA which is divergent from the starting DNA.
  • a 1% sequence divergence typically lowers the Tm of a duplex by 1-2 °C, and the DNAs encoding homologous proteins of different species typically have sequence identities of around 50-80%.
  • the library is screened under conditions where the temperature is at least 20°C, more preferably at least 50°C, below the perfect duplex Tm. Since salt raises the Tm by stabilizing the duplex, one ordinarily would carry out the search for DNAs encoding highly homologous proteins under relatively low salt hybridization conditions, e.g., <1 M NaCl. The higher the salt concentration, and/or the lower the temperature, the greater the sequence divergence which is tolerated.
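The arithmetic behind these temperature offsets can be made explicit. The sketch below simply multiplies the stated rule of thumb (1-2 °C of Tm lost per 1% sequence divergence) by the divergence implied by a given sequence identity. Extrapolating that rule linearly out to 50% identity is an assumption, and the function names are hypothetical.

```python
def tm_drop_range_c(identity_percent, drop_per_percent=(1.0, 2.0)):
    """Estimated (low, high) Tm reduction in °C for a divergent duplex.

    Assumes the 1-2 °C per 1% divergence rule of thumb scales linearly.
    """
    divergence = 100.0 - identity_percent
    low, high = drop_per_percent
    return divergence * low, divergence * high


def screening_temperature_c(perfect_duplex_tm_c, offset_c):
    """Hybridization temperature set a chosen offset below the perfect-duplex Tm."""
    return perfect_duplex_tm_c - offset_c


# 80% identity (the top of the 50-80% range quoted above):
print(tm_drop_range_c(80))                 # (20.0, 40.0)
# A hypothetical probe with a perfect-duplex Tm of 85 °C, screened 20 °C lower:
print(screening_temperature_c(85.0, 20.0))  # 65.0
```

Under these assumptions an 80%-identical homologue already loses about 20-40 °C of Tm, which is consistent with the minimum 20 °C offset given in the text.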
  • For the use of probes to identify homologous genes in other species, see, e.g., Schwinn, et al., J. Biol. Chem., 265:8183-89 (1990) (hamster 67-bp cDNA probe vs. human leukocyte genomic library; human 0.32 kb DNA probe vs. bovine brain cDNA library, both with hybridization at 42°C in 6xSSC); Jenkins et al., J. Biol. Chem., 265:19624-31 (1990) (chicken 770-bp cDNA probe vs. human genomic libraries; hybridization at 40°C in 50% formamide and 5xSSC); Murata et al., J.
  • a human protein can be said to be identifiable as homologous to a mouse cDNA clone if
  • it can be aligned to a mouse protein by BlastP, which in turn can be aligned to a mouse gene by BlastX, whose gDNA or cDNA (DNA complementary to messenger RNA) can in turn be aligned to the mouse cDNA clone by BlastN;
  • any alignment by BlastN, BlastP, or BlastX is in accordance with the default parameters set forth below, and the expected value (E) of each alignment (the probability that such an alignment would have occurred by chance alone) is less than e-10.
  • a human gene is homologous to a mouse cDNA clone if it encodes a homologous human protein as defined above, or if it can be aligned either directly to the mouse cDNA clone, or indirectly through a mouse gene which can be aligned to said clone, according to the conditions set forth above.
  • the E value is less than e-15, more preferably less than e-20, still more preferably less than e-40, further more preferably less than e-50, even more preferably less than e-60, considerably more preferably less than e-80, and most preferably less than e-100. More preferably, for those conditions in which the mouse cDNA clone is indirectly connected to the human protein by virtue of two or more successive alignments, the E value is so limited for all of said alignments in the connecting chain.
  • BlastN and BlastX report very low expected values as "0.0". This does not truly mean that the expected value is exactly zero (since any alignment could occur by chance), but merely that it is so infinitesimal that it is not reported.
  • the documentation does not state the cutoff value, but alignments with explicit E values as low as e-178 (624 bits) have been reported as nonzero values, while a score of 636 bits was reported as "0.0".
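The chained E-value test described above can be sketched as follows. The sketch assumes that the shorthand "e-10" denotes 1e-10, as in BLAST output, and that each alignment in the connecting chain is represented only by its reported E value; the function and constant names are hypothetical.

```python
# Cut-offs from the text, read as powers of ten (an assumption about "e-N").
BASE_CUTOFF = 1e-10
PREFERRED_CUTOFFS = [1e-15, 1e-20, 1e-40, 1e-50, 1e-60, 1e-80, 1e-100]


def chain_is_homologous(e_values, cutoff=BASE_CUTOFF):
    """True if every alignment in the connecting chain has E below the cutoff."""
    return len(e_values) > 0 and all(e < cutoff for e in e_values)


def strictest_cutoff_met(e_values):
    """Smallest listed cutoff satisfied by all alignments, or None."""
    if not e_values:
        return None
    met = [c for c in [BASE_CUTOFF] + PREFERRED_CUTOFFS
           if all(e < c for e in e_values)]
    return min(met) if met else None


# A reported "0.0" is just a very small E value, so it passes every cutoff.
print(chain_is_homologous([1e-30, 0.0, 3e-12]))
print(strictest_cutoff_met([1e-45, 1e-62]))
```

Because the test is applied to every link, the weakest alignment in the chain determines which preference tier the connection reaches.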
  • a human protein may be said to be functionally homologous to the mouse cDNA clone if (1) there is a mouse protein which is encoded by a mouse gene whose cDNA can be aligned to the mouse cDNA clone, using BlastX with the default parameters set forth below, and the E value of the alignment is less than e-50, and (2) the human protein has at least one biological activity in common with the mouse protein.
  • the human proteins of interest also include those that are substantially and/or conservatively identical (as defined below) to the homologous and/or functionally homologous human proteins defined above.
  • if a gene is down-regulated in more favored mammals, or up-regulated in less favored mammals (i.e., an "unfavorable gene"), then several utilities are apparent.
  • the complementary strand of the gene, or a portion thereof, may be used in labeled form as a hybridization probe to detect messenger RNA and thereby monitor the level of expression of the gene in a subject. Elevated levels are indicative of progression, or propensity to progression, to a less favored state, and clinicians may take appropriate preventative, curative or ameliorative action.
  • the messenger RNA product (or equivalent cDNA), the protein product, or a binding molecule specific for that product (e.g., an antibody which binds the product), or a downstream product which mediates the activity (e.g., a signaling intermediate) or a binding molecule (e.g., an antibody) therefor, may be used, preferably in labeled or immobilized form, as an assay reagent in an assay for said nucleic acid product, protein product, or downstream product (e.g., a signaling intermediate).
  • elevated levels are indicative of a present or future problem.
  • an agent which down-regulates expression of the gene may be used to reduce levels of the corresponding protein and thereby inhibit further damage to the kidney.
  • This agent could inhibit transcription of the gene in the subject, or translation of the corresponding messenger RNA.
  • Possible inhibitors of transcription and translation include antisense molecules and repressor molecules.
  • the agent could also inhibit a post-translational modification (e.g., glycosylation, phosphorylation, cleavage, GPI attachment) required for activity, or post-translationally modify the protein so as to inactivate it.
  • it could be an agent which down- or up-regulates a positive or negative regulatory gene, respectively.
  • an agent which is an antagonist of the messenger RNA product or protein product of the gene, or of a downstream product through which its activity is manifested may be used to inhibit its activity.
  • This antagonist could be an antibody.
  • an agent which degrades, or abets the degradation of, that messenger RNA, its protein product or a downstream product which mediates its activity may be used to curb the effective period of activity of the protein.
  • the complementary strand of the gene, or a portion thereof may be used in labeled form as a hybridization probe to detect messenger RNA and thereby monitor the level of expression of the gene in a subject.
  • Depressed levels are indicative of damage, or possibly of a propensity to damage, and clinicians may take appropriate preventative, curative or ameliorative action.
  • the messenger RNA product, the equivalent cDNA, protein product, or a binding molecule specific for those products, or a downstream product, or a signaling intermediate, or a binding molecule therefor, may be used, preferably in labeled or immobilized form, as an assay reagent in an assay for said protein product or downstream product.
  • depressed levels are indicative of a present or future problem.
  • an agent which up-regulates expression of the gene may be used to increase levels of the corresponding protein and thereby inhibit further progression to a less favored state.
  • it could be a vector which carries a copy of the gene, but which expresses the gene at higher levels than does the endogenous expression system.
  • it could be an agent which up- or down-regulates a positive or negative regulatory gene.
  • an agent which is an agonist of the protein product of the gene, or of a downstream product through which its activity (of inhibition of progression to a less favored state) is manifested, or of a signaling intermediate may be used to foster its activity.
  • an agent which inhibits the degradation of that protein product or of a downstream product or of a signaling intermediate may be used to increase the effective period of activity of the protein.
  • mutant proteins which are substantially identical (as defined below) to the parental protein (peptide).
  • the fewer the mutations the more likely the mutant protein is to retain the activity of the parental protein.
  • the effect of mutations is usually (but not always) additive. Certain individual mutations are more likely to be tolerated than others.
  • a protein is more likely to tolerate a mutation which (a) is a substitution rather than an insertion or deletion; (b) is an insertion or deletion at the terminus, rather than internally, or, if internal, is at a domain boundary, or a loop or turn, rather than in an alpha helix or beta strand; (c) affects a surface residue rather than an interior residue;
  • (f) is at a site which is subject to substantial variation among a family of homologous proteins to which the protein of interest belongs.
  • Surface residues may be identified experimentally by various labeling techniques, or by 3-D structure mapping techniques like X-ray diffraction and NMR. A 3-D model of a homologous protein can be helpful.
  • Residues forming the binding site may be identified by (1) comparing the effects of labeling the surface residues before and after complexing the protein to its target, (2) labeling the binding site directly with affinity ligands, (3) fragmenting the protein and testing the fragments for binding activity, and (4) systematic mutagenesis (e.g., alanine-scanning mutagenesis) to determine which mutants destroy binding. If the binding site of a homologous protein is known, the binding site may be postulated by analogy. Protein libraries may be constructed and screened so that a large family (e.g., 10^8) of related mutants may be evaluated simultaneously.
  • the mutations are preferably conservative modifications as defined below.
  • a mutant protein (peptide) is substantially identical to a reference protein (peptide) if (a) it has at least 10% of a specific binding activity or a non-nutritional biological activity of the reference protein, and (b) is at least 50% identical in amino acid sequence to the reference protein (peptide). It is "substantially structurally identical" if condition (b) applies, regardless of (a).
  • Percentage amino acid identity is determined by aligning the mutant and reference sequences according to a rigorous dynamic programming algorithm which globally aligns their sequences to maximize their similarity, the similarity being scored as the sum of scores for each aligned pair according to an unbiased PAM250 matrix, and a penalty for each internal gap of -12 for the first null of the gap and -4 for each additional null of the same gap.
  • the percentage identity is the number of matches expressed as a percentage of the adjusted (i.e., counting inserted nulls) length of the reference sequence.
  • a mutant DNA sequence is substantially identical to a reference DNA sequence if they are structural sequences encoding mutant and reference proteins which are substantially identical as described above. If instead they are regulatory sequences, they are substantially identical if the mutant sequence has at least 10% of the regulatory activity of the reference sequence, and is at least 50% identical in nucleotide sequence to the reference sequence. Percentage identity is determined as for proteins except that matches are scored +5, mismatches -4, the gap open penalty is -12, and the gap extension penalty (per additional null) is -4.
  • sequences which are substantially identical exceed the minimum identity of 50%, e.g., are 51%, 66%, 75%, 80%, 85%, 90%, 95% or 99% identical in sequence.
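  • The nucleotide scoring scheme just described (match +5, mismatch -4, -12 for the first null of a gap, -4 for each additional null) can be illustrated with a minimal sketch of a global alignment with affine gap penalties. The function names, the three-layer layout, and the tie-breaking are assumptions made for illustration; the sketch also penalizes terminal gaps in the same way as internal gaps, which the text does not specify. Percentage identity is computed as matches over the reference length counting inserted nulls, and both sequences are assumed to be non-empty.

```python
NEG = float("-inf")
MATCH, MISMATCH = 5, -4          # nucleotide scores quoted in the text
GAP_OPEN, GAP_EXTEND = -12, -4   # first null of a gap / each additional null
# Layer -> (step in mutant, step in reference). "M": aligned pair,
# "X": mutant base opposite a null, "Y": reference base opposite a null.
MOVES = {"M": (1, 1), "X": (1, 0), "Y": (0, 1)}


def global_align(mutant, reference):
    """Globally align two sequences with affine gap penalties."""
    n, m = len(mutant), len(reference)
    score = {k: [[NEG] * (m + 1) for _ in range(n + 1)] for k in MOVES}
    score["M"][0][0] = 0
    for i in range(1, n + 1):
        score["X"][i][0] = GAP_OPEN + (i - 1) * GAP_EXTEND
    for j in range(1, m + 1):
        score["Y"][0][j] = GAP_OPEN + (j - 1) * GAP_EXTEND

    def best_prev(layer, i, j):
        """Best (score, previous layer) for entering `layer` at cell (i, j)."""
        di, dj = MOVES[layer]

        def gap(prev):
            if layer == "M":
                return 0
            return GAP_EXTEND if prev == layer else GAP_OPEN

        return max((score[p][i - di][j - dj] + gap(p), p) for p in MOVES)

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = MATCH if mutant[i - 1] == reference[j - 1] else MISMATCH
            score["M"][i][j] = best_prev("M", i, j)[0] + pair
            score["X"][i][j] = best_prev("X", i, j)[0]
            score["Y"][i][j] = best_prev("Y", i, j)[0]

    # Trace back from the bottom-right cell to recover the aligned rows.
    i, j = n, m
    layer = max(MOVES, key=lambda k: score[k][n][m])
    mut_row, ref_row = [], []
    while i > 0 or j > 0:
        di, dj = MOVES[layer]
        mut_row.append(mutant[i - 1] if di else "-")
        ref_row.append(reference[j - 1] if dj else "-")
        prev = best_prev(layer, i, j)[1]
        i, j, layer = i - di, j - dj, prev
    return "".join(reversed(mut_row)), "".join(reversed(ref_row))


def percent_identity(mutant, reference):
    """Matches as a percentage of the reference length, counting inserted nulls."""
    mut_row, ref_row = global_align(mutant, reference)
    matches = sum(a == b for a, b in zip(mut_row, ref_row))
    return 100.0 * matches / len(ref_row)
```

  • For example, under these assumptions `percent_identity("ACGTTCGT", "ACGTACGT")` aligns the two sequences without gaps and counts seven matches over eight reference positions.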
  • DNA sequences may also be considered "substantially identical" if they hybridize to each other under stringent conditions, i.e., conditions at which the Tm of the heteroduplex of the one strand of the mutant DNA and the more complementary strand of the reference DNA is not in excess of 10 °C. less than the Tm of the reference DNA homoduplex. Typically this will correspond to a percentage identity of 85-90%.
  • “Semi-Conservative Modifications” are modifications which are not conservative, but which are (a) semi-conservative substitutions as hereafter defined; or (b) single or multiple insertions or deletions internally, but at interdomain boundaries, in loops or in other segments of relatively high mobility. Semi-conservative modifications are preferred to nonconservative modifications. Semi-conservative substitutions are preferred to other semi-conservative modifications.
  • Non-conservative substitutions are preferred to other non-conservative modifications.
  • no more than about five amino acids are inserted or deleted at a particular locus, and the modifications are outside regions known to contain binding sites important to activity.
  • insertions or deletions are limited to the termini.
  • a conservative substitution is a substitution of one amino acid for another of the same exchange group, the exchange groups being defined as follows
  • Cys belongs to both I and IV. Residues Pro, Gly and Cys have special conformational roles. Cys participates in formation of disulfide bonds. Gly imparts flexibility to the chain. Pro imparts rigidity to the chain and disrupts helices. These residues may be essential in certain regions of the polypeptide, but substitutable elsewhere.
  • Semi-conservative substitutions are defined herein as being substitutions within supergroup I/II/III or within supergroup IV/V, but not within a single one of groups I-V. They also include replacement of any other amino acid with alanine. If a substitution is not conservative, it preferably is semi-conservative.
  • Non-conservative substitutions are substitutions which are not “conservative” or “semi-conservative”.
  • “Highly conservative substitutions” are a subset of conservative substitutions, and are exchanges of amino acids within the groups Phe/Tyr/Trp, Met/Leu/Ile/Val, His/Arg/Lys, Asp/Glu and Ser/Thr/Ala. They are more likely to be tolerated than other conservative substitutions. Again, the smaller the number of substitutions, the more likely they are to be tolerated.
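  • The five highly conservative groups listed above lend themselves to a simple lookup. The following is a minimal sketch, assuming three-letter residue codes and two sequences of equal length that are already aligned position by position; the function names are hypothetical and chosen only for illustration.

```python
# Groups taken from the text: exchanges within a group are "highly conservative".
HIGHLY_CONSERVATIVE_GROUPS = (
    frozenset({"Phe", "Tyr", "Trp"}),
    frozenset({"Met", "Leu", "Ile", "Val"}),
    frozenset({"His", "Arg", "Lys"}),
    frozenset({"Asp", "Glu"}),
    frozenset({"Ser", "Thr", "Ala"}),
)


def is_highly_conservative(original, replacement):
    """True if replacing `original` with `replacement` stays within one group."""
    if original == replacement:
        return False  # identical residues are not a substitution at all
    return any(
        original in group and replacement in group
        for group in HIGHLY_CONSERVATIVE_GROUPS
    )


def count_substitutions(reference, mutant):
    """Return (highly conservative, other) substitution counts.

    Both arguments are equal-length lists of three-letter residue codes.
    """
    if len(reference) != len(mutant):
        raise ValueError("sequences must be pre-aligned and of equal length")
    high = other = 0
    for ref_res, mut_res in zip(reference, mutant):
        if ref_res == mut_res:
            continue
        if is_highly_conservative(ref_res, mut_res):
            high += 1
        else:
            other += 1
    return high, other
```

  • A mutant for which the second count is zero differs from the reference solely by highly conservative substitutions.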
  • a protein is conservatively identical to a reference protein (peptide) if it differs from the latter, if at all, solely by conservative modifications, the protein
  • a protein is at least semi-conservatively identical to a reference protein (peptide) if it differs from the latter, if at all, solely by semi-conservative or conservative modifications.
  • a protein is nearly conservatively identical to a reference protein (peptide) if it differs from the latter, if at all, solely by one or more conservative modifications and/or a single nonconservative substitution. It is highly conservatively identical if it differs, if at all, solely by highly conservative substitutions. Highly conservatively identical proteins are preferred to those merely conservatively identical. An absolutely identical protein is even more preferred.
  • the core sequence of a reference protein is the largest single fragment which retains at least 10% of a particular specific binding activity, if one is specified, or otherwise of at least one specific binding activity of the referent. If the referent has more than one specific binding activity, it may have more than one core sequence, and these may overlap or not.
  • a peptide of the present invention may have a particular similarity relationship (e.g., markedly identical) to a reference protein (peptide)
  • preferred peptides are those which comprise a sequence having that relationship to a core sequence of the reference protein (peptide), but with internal insertions or deletions in either sequence excluded. Even more preferred peptides are those whose entire sequence has that relationship, with the same exclusion, to a core sequence of that reference protein (peptide).
  • The term "library" generally refers to a collection of chemical or biological entities which are related in origin, structure, and/or function, and which can be screened simultaneously for a property of interest. Libraries may be classified by how they are constructed
  • In a "natural diversity" library, essentially all of the diversity arose without human intervention. This would be true, for example, of messenger RNA extracted from a non-engineered cell.
  • In a "synthetic diversity" library, essentially all of the diversity arose deliberately as a result of human intervention. This would be true, for example, of a combinatorial library; note that a small level of natural diversity could still arise as a result of spontaneous mutation. It would also be true of a noncombinatorial library of compounds collected from diverse sources, even if they were all natural products.
  • In a "non-natural diversity" library, at least some of the diversity arose deliberately through human intervention.
  • the source of the diversity is limited in some way.
  • a limitation might be to cells of a particular individual, to a particular species, or to a particular genus, or, more complexly, to individuals of a particular species who are of a particular age, sex, physical condition, geographical location, occupation and/or familial relationship.
  • it might be to cells of a particular tissue or organ.
  • it could be cells exposed to particular pharmacological, environmental, or pathogenic conditions.
  • the library could be of chemicals, or a particular class of chemicals, produced by such cells.
  • the library members are deliberately limited by the production conditions to particular chemical structures. For example, if they are oligomers, they may be limited in length and monomer composition, e.g. hexapeptides composed of the twenty genetically encoded amino acids.
  • the library members are nucleic acids, and are screened using a nucleic acid hybridization probe. Bound nucleic acids may then be amplified, cloned, and/or sequenced.
  • the screened library members are gene expression products, but one may also speak of an underlying library of genes encoding those products.
  • the library is made by subcloning DNA encoding the library members (or portions thereof) into expression vectors (or into cloning vectors which subsequently are used to construct expression vectors), each vector comprising an expressible gene encoding a particular library member, introducing the expression vectors into suitable cells, and expressing the genes so the expression products are produced.
  • the expression products are secreted, so the library can be screened using an affinity reagent, such as an antibody or receptor.
  • the bound expression products may be sequenced directly, or their sequences inferred by, e.g., sequencing at least the variable portion of the encoding DNA.
  • the cells are lysed, thereby exposing the expression products, and the latter are screened with the affinity reagent.
  • the cells express the library members in such a manner that they are displayed on the surface of the cells, or on the surface of viral particles produced by the cells. (See display libraries, below).
  • the screening is not for the ability of the expression product to bind to an affinity reagent, but rather for its ability to alter the phenotype of the host cell in a particular detectable manner.
  • the screened library members are transformed cells, but there is a first underlying library of expression products which mediate the behavior of the cells, and a second underlying library of genes which encode those products.
  • the library members are each conjugated to, and displayed upon, a support of some kind.
  • the support may be living (a cell or virus), or nonliving (e.g., a bead or plate).
  • the support is a cell or virus
  • display will normally be effectuated by expressing a fusion protein which comprises the library member, a carrier moiety allowing integration of the fusion protein into the surface of the cell or virus, and optionally a linking moiety.
  • the cell coexpresses a first fusion comprising the library member and a linking moiety L1, and a second fusion comprising a linking moiety L2 and the carrier moiety. L1 and L2 interact to associate the first fusion with the second fusion and hence, indirectly, the library member with the surface of the cell or virus.
  • In a soluble library, the library members are free in solution.
  • a soluble library may be produced directly, or one may first make a display library and then release the library members from their supports.
  • the library members are inside cells or liposomes.
  • encapsulated libraries are used to store the library members for future use; the members are extracted in some way for screening purposes. However, if they differentially affect the phenotype of the cells, they may be screened indirectly by screening the cells.
  • a cDNA library is usually prepared by extracting RNA from cells of particular origin, fractionating the RNA to isolate the messenger RNA (mRNA has a poly(A) tail, so this is usually done by oligo-dT affinity chromatography), synthesizing complementary DNA (cDNA) using reverse transcriptase, DNA polymerase, and other enzymes, subcloning the cDNA into vectors, and introducing the vectors into cells. Often, only mRNAs or cDNAs of particular sizes will be used, to make it more likely that the cDNA encodes a functional polypeptide.
  • a cDNA library explores the natural diversity of the transcribed DNAs of cells from a particular source. It is not a combinatorial library.
  • a cDNA library may be used to make a hybridization library, or it may be used as (or to make) an expression library.
  • A genomic DNA library is made by extracting DNA from a particular source, fragmenting the DNA, isolating fragments of a particular size range, subcloning the DNA fragments into vectors, and introducing the vectors into cells. Like a cDNA library, a genomic DNA library is a natural diversity library, and not a combinatorial library. A genomic DNA library may be used the same way as a cDNA library.
  • a synthetic DNA library may be screened directly (as a hybridization library), or used in the creation of an expression or display library of peptides/proteins.
  • The term "combinatorial library" refers to a library in which the individual members are either systematic or random combinations of a limited set of basic elements, the properties of each member being dependent on the choice and location of the elements incorporated into it.
  • the members of the library are at least capable of being screened simultaneously. Randomization may be complete or partial; some positions may be randomized and others predetermined, and at random positions, the choices may be limited in a predetermined manner.
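  • Partial randomization, in which some positions are predetermined and others draw from a limited set of choices, can be sketched by enumerating every combination of the per-position choices. The tetrapeptide template below (fixed cysteines flanking two restricted positions) is a made-up example, and `enumerate_library` is a hypothetical helper rather than anything defined in the source.

```python
from itertools import product


def enumerate_library(positions):
    """List every member of a library given the allowed choices per position.

    `positions` is a sequence of strings; each string holds the one-letter
    codes permitted at that position (a single letter means a constant position).
    """
    return ["".join(choice) for choice in product(*positions)]


# Hypothetical template: Cys fixed at both ends, Asp/Glu allowed at
# position 2, and Lys/Arg/His allowed at position 3.
template = ["C", "DE", "KRH", "C"]
library = enumerate_library(template)
```

  • The number of members is the product of the number of choices at each position, here 1 x 2 x 3 x 1 = 6.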
  • the members of a combinatorial library may be oligomers or polymers of some kind, in which the variation occurs through the choice of monomeric building block at one or more positions of the oligomer or polymer, and possibly in terms of the connecting linkage, or the length of the oligomer or polymer, too.
  • the members may be nonoligomeric molecules with a standard core structure, like the 1,4-benzodiazepine structure, with the variation being introduced by the choice of substituents at particular variable sites on the core structure.
  • the members may be nonoligomeric molecules assembled like a jigsaw puzzle, but wherein each piece has both one or more variable moieties (contributing to library diversity) and one or more constant moieties (providing the functionalities for coupling the piece in question to other pieces).
  • chemical building blocks are at least partially randomly combined into a large number (as high as 10^15) of different compounds, which are then simultaneously screened for binding (or other) activity against one or more targets.
  • In a “simple combinatorial library”, all of the members belong to the same class of compounds (e.g., peptides) and can be synthesized simultaneously.
  • a “composite combinatorial library” is a mixture of two or more simple libraries, e.g., DNAs and peptides, or peptides, peptoids, and PNAs, or benzodiazepines and carbamates.
  • the number of component simple libraries in a composite library will, of course, normally be smaller than the average number of members in each simple library, as otherwise the advantage of a library over individual synthesis is small.
  • nucleic acids have also been used in combinatorial libraries. Their great advantage is the ease with which a nucleic acid with appropriate binding activity can be amplified. As a result, combinatorial libraries composed of nucleic acids can be of low redundancy and hence, of high diversity. There has also been much interest in combinatorial libraries based on small molecules, which are more suited to pharmaceutical use, especially those which, like benzodiazepines, belong to a chemical class which has already yielded useful pharmacological agents. The techniques of combinatorial chemistry have been recognized as the most efficient means for finding small molecules that act on these targets. At present, small molecule combinatorial chemistry involves the synthesis of either pooled or discrete molecules that present varying arrays of functionality on a common scaffold.
  • the size of a library is the number of molecules in it.
  • the simple diversity of a library is the number of unique structures in it. There is no formal minimum or maximum diversity. If the library has a very low diversity, the library has little advantage over just synthesizing and screening the members individually. If the library is of very high diversity, it may be inconvenient to handle, at least without automatizing the process.
  • the simple diversity of a library is preferably at least 10, 10E2, 10E3, 10E4, 10E6, 10E7, 10E8 or 10E9, the higher the better under most circumstances.
  • the simple diversity is usually not more than 10E15, and more usually not more than 10E10.
  • the average sampling level is the size divided by the simple diversity.
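  • The three quantities just defined (size, simple diversity, and average sampling level) can be computed directly from a list of library members. This is a minimal sketch in which each member is represented by a string describing its structure; the function name and the toy library are assumptions for illustration.

```python
from collections import Counter


def library_stats(members):
    """Return (size, simple diversity, average sampling level) of a library."""
    if not members:
        raise ValueError("library must contain at least one molecule")
    size = len(members)                      # number of molecules
    simple_diversity = len(Counter(members)) # number of unique structures
    return size, simple_diversity, size / simple_diversity


# Toy library: six molecules representing three unique structures.
toy_library = ["AAA", "AAA", "AAC", "AAG", "AAG", "AAG"]
size, diversity, sampling = library_stats(toy_library)
```

  • Here each unique structure is represented by two molecules on average, although individual structures are present at one, two, and three copies.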
  • the expected average sampling level must be high enough to provide reasonable assurance that, if a given structure is expected to be present as a consequence of the library design, the actual average sampling level will be high enough that the structure, if it satisfies the screening criteria, will yield a positive result when the library is screened.
  • the preferred average sampling level is a function of the detection limit, which in turn is a function of the strength of the signal to be screened.
  • There are more complex measures of diversity than simple diversity. These attempt to take into account the degree of structural difference between the various unique sequences. These more complex measures are usually used in the context of small organic compound libraries, see below.
  • the library members may be presented as solutes in solution, or immobilized on some form of support.
  • the support may be living (cell, virus) or nonliving (bead, plate, etc.).
  • the supports may be separable (cells, virus particles, beads) so that binding and nonbinding members can be separated, or nonseparable (plate).
  • the members will normally be placed on addressable positions on the support.
  • the advantage of a soluble library is that there is no carrier moiety that could interfere with the binding of the members to the support.
  • the advantage of an immobilized library is that it is easier to identify the structure of the members which were positive.
  • When screening a soluble library, or one with a separable support, the target is usually immobilized. When screening a library on a nonseparable support, the target will usually be labeled.
  • An oligonucleotide library is a combinatorial library, at least some of whose members are single-stranded oligonucleotides having three or more nucleotides connected by phosphodiester or analogous bonds.
  • the oligonucleotides may be linear, cyclic or branched, and may include non-nucleic acid moieties.
  • the nucleotides are not limited to the nucleotides normally found in DNA or RNA. For examples of nucleotides modified to increase nuclease resistance and chemical stability of aptamers, see Chart 1 in Osborne and Ellington, Chem. Rev., 97: 349-70 (1997).
  • For screening of RNA, see Ellington and Szostak, Nature, 346: 818-22 (1990). There is no formal minimum or maximum size for these oligonucleotides. However, the number of conformations which an oligonucleotide can assume increases exponentially with its length in bases. Hence, a longer oligonucleotide is more likely to be able to fold to adapt itself to a protein surface. On the other hand, while very long molecules can be synthesized and screened, unless they provide a much superior affinity to that of shorter molecules, they are not likely to be found in the selected population, for the reasons explained by Osborne and Ellington (1997).
  • the libraries of the present invention are preferably composed of oligonucleotides having a length of 3 to 100 bases, more preferably 15 to 35 bases.
  • the oligonucleotides in a given library may be of the same or of different lengths.
  • Oligonucleotide libraries have the advantage that libraries of very high diversity (e.g., 10^15) are feasible, and binding molecules are readily amplified in vitro by polymerase chain reaction (PCR).
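  • To give a sense of scale for diversities of this order, the following sketch counts the distinct sequences available to a fully randomized oligonucleotide of a given length, assuming four possible bases at every position. The function names are hypothetical, and the calculation ignores modified nucleotides and any constant flanking regions.

```python
def full_diversity(length, alphabet_size=4):
    """Number of distinct sequences when every position is fully randomized."""
    return alphabet_size ** length


def shortest_length_for(target_diversity, alphabet_size=4):
    """Smallest randomized length whose sequence space reaches the target."""
    length = 0
    while alphabet_size ** length < target_diversity:
        length += 1
    return length
```

  • Under these assumptions, a randomized stretch of 25 bases is the shortest whose sequence space reaches 10^15, which falls inside the 15 to 35 base range preferred above.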
  • nucleic acid molecules can have very high specificity and affinity to targets.
  • this invention prepares and screens oligonucleotide libraries by the SELEX method, as described in King and Famulok, Molec. Biol. Repts., 20: 97-107 (1994); L. Gold, C. Tuerk, Methods of producing nucleic acid ligands, US#5595877; Oliphant et al. Gene 44:177
  • The term "aptamer" is conferred on those oligonucleotides which bind the target protein. Such aptamers may be used to characterize the target protein, both directly (through identification of the aptamer and the points of contact between the aptamer and the protein) and indirectly (by use of the aptamer as a ligand to modify the chemical reactivity of the protein).
  • each nucleotide (monomeric unit) is composed of a phosphate group, a sugar moiety, and either a purine or a pyrimidine base.
  • In DNA, the sugar is deoxyribose, and in RNA it is ribose.
  • the nucleotides are linked by 5'-3' phosphodiester bonds.
  • the deoxyribose phosphate backbone of DNA can be modified to increase resistance to nuclease and to increase penetration of cell membranes.
  • Derivatives such as mono- or dithiophosphates, methyl phosphonates, boranophosphates, formacetals, carbamates, siloxanes, and dimethylenethio-, -sulfoxido- and -sulfono-linked species are known in the art.
  • a peptide is composed of a plurality of amino acid residues joined together by peptidyl (-NHCO-) bonds.
  • a biogenic peptide is a peptide in which the residues are all genetically encoded amino acid residues; it is not necessary that the biogenic peptide actually be produced by gene expression.
  • Amino acids are the basic building blocks with which peptides and proteins are constructed. Amino acids possess both an amino group (-NH2) and a carboxylic acid group (-COOH). Many amino acids, but not all, have the alpha amino acid structure NH2-CHR-COOH, where R is hydrogen or any of a variety of functional groups.
  • Twenty amino acids are genetically encoded: Alanine, Arginine, Asparagine, Aspartic Acid, Cysteine, Glutamic Acid, Glutamine, Glycine, Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Proline, Serine, Threonine, Tryptophan, Tyrosine, and Valine.
  • All of the genetically encoded amino acids except Glycine are optically isomeric; however, only the L-form is found in humans. Nevertheless, the D-forms of these amino acids do have biological significance; D-Phe, for example, is a known analgesic.
  • Other amino acids are also known, including: 2-Aminoadipic acid; 3-Aminoadipic acid; beta-Aminopropionic acid; 2-Aminobutyric acid; 4-Aminobutyric acid (Piperidinic acid); 6-Aminocaproic acid; 2-Aminoheptanoic acid; 2-Aminoisobutyric acid; 3-Aminoisobutyric acid; 2-Aminopimelic acid; 2,4-Diaminobutyric acid; Desmosine; 2,2'-Diaminopimelic acid; 2,3-Diaminopropionic acid; N-Ethylglycine; N-Ethylasparagine; Hydroxylysine; allo-Hydroxylysine; 3-Hydroxyproline; 4-Hydroxyproline; Isodesmosine; allo-Isoleucine; N-Methylglycine (Sarcosine); N-Methylisoleucine; N-
  • Peptides are constructed by condensation of amino acids and/or smaller peptides.
  • the amino group of one amino acid (or peptide) reacts with the carboxylic acid group of a second amino acid (or peptide) to form a peptide (-NHCO-) bond, releasing one molecule of water. Therefore, when an amino acid is incorporated into a peptide, it should, technically speaking, be referred to as an amino acid residue.
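  • Because each peptide bond releases one molecule of water, the mass of a linear peptide is the sum of the masses of its free amino acids minus one water per bond. The sketch below works this arithmetic through for a few residues; the masses are approximate average molar masses in g/mol, rounded to two decimals, and the function name is an assumption for illustration.

```python
WATER = 18.02  # approximate average molar mass of H2O, g/mol

# Approximate average molar masses of the free amino acids, g/mol.
FREE_AMINO_ACID_MASS = {"Gly": 75.07, "Ala": 89.09, "Ser": 105.09}


def linear_peptide_mass(residues):
    """Approximate molar mass of a linear peptide built by condensation.

    A chain of n residues contains n - 1 peptide bonds, and each bond
    releases one molecule of water.
    """
    if not residues:
        raise ValueError("a peptide needs at least one residue")
    total = sum(FREE_AMINO_ACID_MASS[name] for name in residues)
    return total - (len(residues) - 1) * WATER
```

  • For the dipeptide Gly-Ala this gives 75.07 + 89.09 - 18.02 = 146.14 g/mol, which is why an incorporated amino acid is more precisely called a residue: it has lost the elements of water.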
  • the core of that residue is the moiety which excludes the -NH and -CO linking functionalities which connect it to other residues. This moiety consists of one or more main chain atoms (see below) and the attached side chains.
  • each amino acid consists of the -NH and -CO linking functionalities and a core main chain moiety. Usually the latter is a single carbon atom. However, the core main chain moiety may include additional carbon atoms, and may also include nitrogen, oxygen or sulfur atoms, which together form a single chain. In a preferred embodiment, the core main chain atoms consist solely of carbon atoms.
  • the side chains are attached to the core main chain atoms. For alpha amino acids, in which the side chain is attached to the alpha carbon, the C-1, C-2 and N-2 of each residue form the repeating unit of the main chain, and the term "side chain" refers to the C-3 and higher numbered carbon atoms and their substituents. It also includes H atoms attached to the main chain atoms.
  • Amino acids may be classified according to the number of carbon atoms which appear in the main chain between the carbonyl carbon and amino nitrogen atoms which participate in the peptide bonds.
  • alpha, beta, gamma and delta amino acids are known. These have 1-4 intermediary carbons.
  • Proline is a special case of an alpha amino acid; its side chain also binds to the peptide bond nitrogen.
  • Amino acids may also be classified by which main chain core carbon a side chain other than H is attached to.
  • the preferred attachment site is the C-2 (alpha) carbon, i.e., the one adjacent to the carboxyl carbon of the -CO linking functionality. It is also possible for more than one main chain atom to carry a side chain other than H. However, in a preferred embodiment, only one main chain core atom carries a side chain other than H.
  • a main chain carbon atom may carry either one or two side chains; one is more common.
  • a side chain may be attached to a main chain carbon atom by a single or a double bond; the former is more common.
  • a simple combinatorial peptide library is one whose members are peptides having three or more amino acids connected via peptide bonds.
  • the peptides may be linear, branched, or cyclic, and may covalently or noncovalently include nonpeptidyl moieties.
  • the amino acids are not limited to the naturally occurring or to the genetically encoded amino acids.
  • a biased peptide library is one in which one or more (but not all) residues of the peptides are constant residues.
  • Cyclization is a common mechanism for stabilizing peptide conformation, thereby achieving improved association of the peptide with its ligand and hence improved biological activity. Cyclization is usually achieved by intra-chain cystine formation, or by formation of a peptide bond between side chains or between the N- and C-termini. Cyclization has usually been carried out on peptides in solution, but several publications have appeared that describe cyclization of peptides on beads.
  • a peptide library may be an oligopeptide library or a protein library.
  • the oligopeptides are at least five, six, seven or eight amino acids in length. Preferably, they are composed of less than 50, more preferably less than 20 amino acids.
  • In the case of an oligopeptide library, all or just some of the residues may be variable.
  • the oligopeptide may be unconstrained, or constrained to a particular conformation by, e.g., the participation of constant cysteine residues in the formation of a constraining disulfide bond.
  • Proteins, like oligopeptides, are composed of a plurality of amino acids, but the term protein is usually reserved for longer peptides, which are able to fold into a stable conformation.
  • a protein may be composed of two or more polypeptide chains, held together by covalent or noncovalent crosslinks. These may occur in a homooligomeric or a heterooligomeric state.
  • a peptide is considered a protein if it (1) is at least 50 amino acids long, or (2) has at least two stabilizing covalent crosslinks (e.g., disulfide bonds).
  • conotoxins are considered proteins.
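  • The two-part criterion above reduces to a short predicate. This is a minimal sketch with hypothetical parameter names; the residue and crosslink counts used in the example are illustrative values, not data from the source.

```python
def is_protein(length, covalent_crosslinks=0):
    """Apply the stated criterion: at least 50 residues, or at least two
    stabilizing covalent crosslinks (e.g., disulfide bonds)."""
    return length >= 50 or covalent_crosslinks >= 2


# A short peptide with several disulfide bonds still qualifies, which is
# how a small, heavily crosslinked toxin can count as a protein.
short_crosslinked = is_protein(length=25, covalent_crosslinks=3)
short_linear = is_protein(length=12)
```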
  • the proteins of a protein library will be characterizable as having both constant residues (the same for all proteins in the library) and variable residues (which vary from member to member). This is simply because, for a given range of variation at each position, the sequence space (simple diversity) grows exponentially with the number of residue positions, so at some point it becomes inconvenient for all residues of a peptide to be variable positions. Since proteins are usually larger than oligopeptides, it is more common for protein libraries than oligopeptide libraries to feature constant positions.
  • In the case of a protein library, it is desirable to focus the mutations at those sites which are tolerant of mutation. These may be determined by alanine scanning mutagenesis or by comparison of the protein sequence to that of homologous proteins of similar activity. It is also more likely that mutation of surface residues will directly affect binding. Surface residues may be determined by inspecting a 3D structure of the protein, or by labeling the surface and then ascertaining which residues have received labels. They may also be inferred by identifying regions of high hydrophilicity within the protein. Because proteins are often altered at some sites but not others, protein libraries can be considered a special case of the biased peptide library.
  • the protein library comprises members which comprise a mutant of VH or VL chain, or a mutant of an antigen-specific binding fragment of such a chain.
  • VH and VL chains are usually each about 110 amino acid residues, and are held in proximity by a disulfide bond between the adjoining CL and CH1 regions to form a variable domain. Together, the VH, VL, CL and CH1 form an Fab fragment.
  • the hypervariable regions are at 31-35, 49-65, 98-111 and 84-88, but only the first three are involved in antigen binding. There is variation among VH and VL chains at residues outside the hypervariable regions, but to a much lesser degree.
  • a sequence is considered a mutant of a VH or VL chain if it is at least 80% identical to a naturally occurring VH or VL chain at all residues outside the hypervariable region.
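  • The 80% criterion can be sketched as an identity calculation restricted to positions outside the hypervariable ranges quoted above. The sketch assumes the candidate and the naturally occurring chain are already aligned and of equal length, uses 1-based residue numbering, and treats all four listed ranges as hypervariable; the function names are hypothetical.

```python
# Hypervariable ranges as listed in the text (1-based, inclusive).
HYPERVARIABLE_RANGES = ((31, 35), (49, 65), (84, 88), (98, 111))


def is_hypervariable(position):
    """True if a 1-based residue position falls in a hypervariable range."""
    return any(low <= position <= high for low, high in HYPERVARIABLE_RANGES)


def framework_identity(candidate, natural):
    """Percent identity over positions outside the hypervariable ranges."""
    if len(candidate) != len(natural):
        raise ValueError("sequences must be pre-aligned and of equal length")
    framework = [
        (a, b)
        for position, (a, b) in enumerate(zip(candidate, natural), start=1)
        if not is_hypervariable(position)
    ]
    if not framework:
        raise ValueError("no positions outside the hypervariable ranges")
    return 100.0 * sum(a == b for a, b in framework) / len(framework)


def is_chain_mutant(candidate, natural, minimum=80.0):
    """Apply the 80% threshold to the non-hypervariable positions only."""
    return framework_identity(candidate, natural) >= minimum
```

  • Changes confined to the hypervariable ranges leave the framework identity at 100%, so a candidate can differ freely there and still qualify as a mutant chain.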
  • such antibody library members comprise both at least one VH chain and at least one VL chain, at least one of which is a mutant chain, and which chains may be derived from the same or different antibodies.
  • the VH and VL chains may be covalently joined by a suitable linker moiety, as in a "single chain antibody", or they may be noncovalently joined, as in a naturally occurring variable domain.
  • If the joining is noncovalent, and the library is displayed on cells or virus, then either the VH or the VL chain may be fused to the carrier surface/coat protein.
  • the complementary chain may be co-expressed, or added exogenously to the library.
  • the members may further comprise some or all of an antibody constant heavy and/or constant light chain, or a mutant thereof.
  • a peptoid is an analogue of a peptide in which one or more of the peptide bonds (-NH-CO-) are replaced by pseudopeptide bonds, which may be the same or different. It is not necessary that all of the peptide bonds be replaced, i.e., a peptoid may include one or more conventional amino acid residues, e.g., proline.
  • a peptide bond has two small divalent linker elements, -NH- and -CO-.
  • a preferred class of pseudopeptide bonds is that consisting of two small divalent linker elements. Each may be chosen independently from the group consisting of amine (-NH-), substituted amine (-NR-), carbonyl (-CO-), thiocarbonyl (-CS-), methylene (-CH2-), monosubstituted methylene (-CHR-), disubstituted methylene (-CR1R2-), ether (-O-) and thioether (-S-).
  • the more preferred pseudopeptide bonds include: N-modified -NRCO-
  • a single peptoid molecule may include more than one kind of pseudopeptide bond.
  • (1) the side chains attached to the core main chain atoms of the monomers linked by the pseudopeptide bonds and/or (2) the side chains (e.g., the -R of an -NRCO-) of the pseudopeptide bonds.
  • the monomeric units which are not amino acid residues are of the structure -NR1-CR2-CO-, where at least one of R1 and R2 is not hydrogen. If there is variability in the pseudopeptide bond, this is most conveniently done by using an -NRCO- or other pseudopeptide bond with an R group, and varying the R group. In this event, the R group will usually be any of the side chains characterizing the amino acids of peptides, as previously discussed.
  • If the R group of the pseudopeptide bond is not variable, it will usually be small, e.g., not more than 10 atoms (e.g., hydroxyl, amino, carboxyl, methyl, ethyl, propyl). If the conjugation chemistries are compatible, a simple combinatorial library may include both peptides and peptoids.
  • a PNA oligomer is here defined as one comprising a plurality of units, at least one of which is a PNA monomer which comprises a side chain comprising a nucleobase.
  • the classic PNA oligomer is composed of (2-aminoethyl)glycine units, with nucleobases attached by methylene carbonyl linkers. That is, it has the structure
  • The outer parenthesized substructure is the PNA monomer.
  • nucleobase B is separated from the backbone N by three bonds, and the points of attachment of the side chains are separated by six bonds.
  • the nucleobase may be any of the bases included in the nucleotides discussed in connection with oligonucleotide libraries.
  • the bases of nucleotides A, G, T, C and U are preferred.
  • a PNA oligomer may further comprise one or more amino acid residues, especially glycine and proline.
  • PNA oligomer libraries have been made; see e.g. Cook, 6,204,326.
  • the small organic compound library (“compound library”, for short) is a combinatorial library whose members are suitable for use as drugs if, indeed, they have the ability to mediate a biological activity of the target protein.
  • Peptides have certain disadvantages as drugs. These include susceptibility to degradation by serum proteases, and difficulty in penetrating cell membranes. Preferably, all or most of the compounds of the compound library avoid, or at least do not suffer to the same degree, one or more of the pharmaceutical disadvantages of peptides.
  • disjunction in which a lead drug is simplified to identify its component pharmacophoric moieties
  • conjunction in which two or more known pharmacophoric moieties, which may be the same or different, are associated, covalently or noncovalently, to form a new drug
  • alteration in which one moiety is replaced by another which may be similar or different, but which is not in effect a disjunction or conjunction.
  • disjunction is intended only to connote the structural relationship of the end product to the original leads, and not how the new drugs are actually synthesized, although it is possible that the two are the same.
  • the process of disjunction is illustrated by the evolution of neostigmine (1931) and edrophonium (1952) from physostigmine (1925). Subsequent conjunction is illustrated by demecarium (1956) and ambenonium (1956).
  • Alterations may modify the size, polarity, or electron distribution of an original moiety. Alterations include ring closing or opening, formation of lower or higher homologues, introduction or saturation of double bonds, introduction of optically active centers, introduction, removal or replacement of bulky groups, isosteric or bioisosteric substitution, changes in the position or orientation of a group, introduction of alkylating groups, and introduction, removal or replacement of groups with a view toward inhibiting or promoting inductive (electrostatic) or conjugative (resonance) effects.
  • the substituents may include electron acceptors and/or electron donors.
  • Typical electron donors (+I) include -CH3, -CH2R, -CHR2, -CR3 and -COO⁻.
  • the substituents may also include those which increase or decrease electronic density in conjugated systems.
  • the former (+R) groups include -CH3, -CR3, -F, -Cl, -Br, -I, -OH, -OR, -OCOR, -SH, -SR, -NH2, -NR2, and -NHCOR.
  • the latter (-R) groups include -NO2, -CN, -CHO, -COR, -COOH, -COOR, -CONH2, -SO2R and -CF3.
  • a compound, or a family of compounds, having one or more pharmacological activities may be disjoined into two or more known or potential pharmacophoric moieties.
  • Analogues of each of these moieties may be identified, and mixtures of these analogues reacted so as to reassemble compounds which have some similarity to the original lead compound. It is not necessary that all members of the library possess moieties analogous to all of the moieties of the lead compound.
  • the design of a library may be illustrated by the example of the benzodiazepines.
  • Benzodiazepine drugs, including chlordiazepoxide, diazepam and oxazepam, have been used as anti-anxiety drugs.
  • Derivatives of benzodiazepines have widespread biological activities; derivatives have been reported to act not only as anxiolytics, but also as anticonvulsants; cholecystokinin (CCK) receptor subtype A or B, kappa opioid receptor, platelet activating factor, and HIV transactivator Tat antagonists; and GPIIbIIIa, reverse transcriptase and ras farnesyltransferase inhibitors.
  • the benzodiazepine structure has been disjoined into a 2-aminobenzophenone, an amino acid, and an alkylating agent. See Bunin, et al., Proc. Nat. Acad. Sci. USA, 91:4708
  • the acid chloride building block introduces variability at the R1 site.
  • the R2 site is introduced by the amino acid, and the R3 site by the alkylating agent.
  • the R4 site is inherent in the arylstannane.
  • Bunin, et al. generated a 1,4-benzodiazepine library of 11,200 different derivatives prepared from 20 acid chlorides, 35 amino acids, and 16 alkylating agents.
  • For aliphatic groups, both acyclic and cyclic (mono- or poly-) structures, substituted or not, were tested. (Although all of the acyclic groups were linear, it would have been feasible to introduce a branched aliphatic.)
  • the aromatic groups featured either single or multiple rings, fused or not, substituted or not, and with heteroatoms or not.
  • the secondary substituents included -NH2, -OH, -OMe, -CN, -Cl, -F, and -COOH. While not used, spacer moieties, such as -O-, -S-, -CO-, -CS-, -NH-, and -NR-, could have been incorporated.
  • Bunin et al. suggest that instead of using a 1,4-benzodiazepine as a core structure, one may instead use a 1,4-benzodiazepine-2,5-dione structure.
  • DeWitt, et al., Proc. Nat. Acad. Sci. (USA), 90:6909-13 (1993) describe the simultaneous, but separate, synthesis of 40 discrete hydantoins and 40 discrete benzodiazepines. They carry out their synthesis on a solid support (inside a gas dispersion tube), in an array format, as opposed to other conventional simultaneous synthesis techniques (e.g., in a well, or on a pin).
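The size of the library reported above follows directly from the number of building blocks at each variable site: 20 × 35 × 16 = 11,200. The sketch below enumerates the combinations to show this; the building-block labels are placeholders, not the reagents actually used by Bunin et al.

```python
from itertools import product

# Placeholder labels; the actual reagents are listed in Bunin et al.
acid_chlorides = [f"acid_chloride_{i}" for i in range(1, 21)]        # 20
amino_acids = [f"amino_acid_{i}" for i in range(1, 36)]              # 35
alkylating_agents = [f"alkylating_agent_{i}" for i in range(1, 17)]  # 16

# Every member is one choice of building block per variable site.
library = list(product(acid_chlorides, amino_acids, alkylating_agents))
print(len(library))  # 20 * 35 * 16 = 11200
```

Adding a single further building block at any one site multiplies, rather than adds to, the number of members, which is why modest reagent sets yield large libraries.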
  • the hydantoins were synthesized by first simultaneously deprotecting and then treating each of five amino acid resins with each of eight isocyanates.
  • the benzodiazepines were synthesized by treating each of five deprotected amino acid resins with each of eight 2-aminobenzophenone imines.
  • Heterocyclic combinatorial libraries are reviewed generally in Nefzi, et al., Chem. Rev., 97:449-472 (1997).
  • the library is preferably synthesized so that the individual members remain identifiable so that, if a member is shown to be active, it is not necessary to analyze it.
  • Several methods of identification have been proposed, including: (1) encoding, i.e., the attachment to each member of an identifier moiety which is more readily identified than the member proper. This has the disadvantage that the tag may itself influence the activity of the conjugate.
  • each member is synthesized only at a particular coordinate on or in a matrix, or in a particular chamber. This might be, for example, the location of a particular pin, or a particular well on a microtiter plate, or inside a “tea bag”.
  • the present invention is not limited to any particular form of identification.
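Identification by position, as described above, amounts to a lookup from a coordinate to the building blocks used at that coordinate. The following sketch uses the 5 × 8 layout of the hydantoin array mentioned earlier purely as an example; the labels, the zero-based row and column numbering, and the `identify` helper are all hypothetical.

```python
from itertools import product

# Hypothetical labels for a 5 x 8 array like the one DeWitt et al. describe.
resins = [f"amino_acid_resin_{r}" for r in "ABCDE"]   # 5 rows
reagents = [f"isocyanate_{c}" for c in range(1, 9)]   # 8 columns

# Map each (row, column) coordinate to the building blocks used there.
layout = {
    (row, col): (resin, reagent)
    for (row, resin), (col, reagent) in product(enumerate(resins), enumerate(reagents))
}


def identify(row: int, col: int) -> tuple:
    """Return the building blocks of the member made at a given coordinate."""
    return layout[(row, col)]
```

Because the coordinate alone identifies the member, an active position can be traced back to its structure without attaching a tag to the compound.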
  • Solid phase synthesis permits greater control over which derivatives are formed. However, the solid phase could interfere with activity. To overcome this problem, some or all of the molecules of each member could be liberated, after synthesis but before screening.
  • candidate simple libraries which might be evaluated include derivatives of the following:
  • Cyclic Compounds Containing One Hetero Atom
  • Heteronitrogen: pyrroles, pentasubstituted pyrroles, pyrrolidines, pyrrolines, prolines, indoles, beta-carbolines, pyridines, dihydropyridines
  • 1,2,3-triazoles, purines
  • Heteronitrogen and Heterooxygen: diketomorpholines, isoxazoles, isoxazolines
  • the preferred animal subject of the present invention is a mammal.
  • By “mammal” is meant an individual belonging to the class Mammalia.
  • the invention is particularly useful in the treatment of human subjects, although it is intended for veterinary and nutritional uses as well.
  • Preferred nonhuman subjects are of the orders Primata (e.g., apes and monkeys), Artiodactyla or Perissodactyla (e.g., cows, pigs, sheep, horses, goats), Carnivora (e.g., cats, dogs), Rodenta (e.g., rats, mice, guinea pigs, hamsters), Lagomorpha (e.g., rabbits) or other pet, farm or laboratory mammals.
  • The term “protection” is intended to include “prevention,” “suppression” and “treatment.”
  • “Prevention,” strictly speaking, involves administration of the pharmaceutical prior to the induction of the disease (or other adverse clinical condition).
  • suppression involves administration of the composition prior to the clinical appearance of the disease.
  • Treatment involves administration of the protective composition after the appearance of the disease.
  • prevention will be understood to refer to both prevention in the strict sense, and to suppression.
  • the preventative or prophylactic use of a pharmaceutical involves identifying subjects who are at higher risk than the general population of contracting the disease, and administering the pharmaceutical to them in advance of the clinical appearance of the disease.
  • the effectiveness of such use is measured by comparing the subsequent incidence or severity of the disease, or of particular symptoms of the disease, in the treated subjects against that in untreated subjects of the same high risk group.
  • a prophylaxis or treatment may be curative, that is, directed at the underlying cause of a disease, or ameliorative, that is, directed at the symptoms of the disease, especially those which reduce the quality of life. It should also be understood that to be useful, the protection provided need not be absolute, provided that it is sufficient to carry clinical value. An agent which provides protection to a lesser degree than do competitive agents may still be of value if the other agents are ineffective for a particular individual, if it can be used in combination with other agents to enhance the level of protection, or if it is safer than competitive agents.
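The comparison of incidence between treated and untreated subjects described above can be expressed as a relative risk reduction. This is one common summary measure, offered here as an assumption rather than as the measure the text prescribes; the function names and the example counts are hypothetical.

```python
def incidence(cases: int, subjects: int) -> float:
    """Proportion of subjects who developed the disease."""
    if subjects <= 0 or not 0 <= cases <= subjects:
        raise ValueError("need 0 <= cases <= subjects and subjects > 0")
    return cases / subjects


def relative_risk_reduction(treated_cases, treated_n, untreated_cases, untreated_n):
    """1 - (incidence in treated / incidence in untreated)."""
    untreated = incidence(untreated_cases, untreated_n)
    if untreated == 0:
        raise ValueError("untreated incidence is zero; ratio undefined")
    return 1 - incidence(treated_cases, treated_n) / untreated


# Made-up numbers for illustration only.
print(relative_risk_reduction(10, 100, 25, 100))
```

A value between 0 and 1 indicates partial protection, which is consistent with the point that protection need not be absolute to carry clinical value.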
  • At least one of the drugs of the present invention may be administered, by any means that achieve their intended purpose, to protect a subject against a disease or other adverse condition.
  • the form of administration may be systemic or topical.
  • administration of such a composition may be by various parenteral routes such as subcutaneous, intravenous, intradermal, intramuscular, intraperitoneal, intranasal, transdermal, or buccal routes.
  • Parenteral administration can be by bolus injection or by gradual perfusion over time.
  • a typical regimen comprises administration of an effective amount of the drug, administered over a period ranging from a single dose, to dosing over a period of hours, days, weeks, months, or years.
  • the suitable dosage of a drug of the present invention will be dependent upon the age, sex, health, and weight of the recipient, kind of concurrent treatment, if any, frequency of treatment, and the nature of the effect desired.
  • the most preferred dosage can be tailored to the individual subject, as is understood and determinable by one of skill in the art, without undue experimentation. This will typically involve adjustment of a standard dose, e.g., reduction of the dose if the patient has a low body weight.
  • Prior to use in humans, a drug will first be evaluated for safety and efficacy in laboratory animals. In human clinical studies, one would begin with a dose expected to be safe in humans, based on the preclinical data for the drug in question, and on customary doses for analogous drugs (if any). If this dose is effective, the dosage may be decreased, to determine the minimum effective dose, if desired. If this dose is ineffective, it will be cautiously increased, with the patients monitored for signs of side effects. See, e.g., Berkow et al., eds., The Merck Manual, 15th edition, Merck and Co., Rahway, N.J., 1987; Goodman et al.
  • the total dose required for each treatment may be administered by multiple doses or in a single dose.
  • the protein may be administered alone or in conjunction with other therapeutics directed to the disease or directed to other symptoms thereof.
  • the appropriate dosage form will depend on the disease, the pharmaceutical, and the mode of administration; possibilities include tablets, capsules, lozenges, dental pastes, suppositories, inhalants, solutions, ointments and parenteral depots. See, e.g., Berkow, supra, Goodman, supra, Avery, supra and Ebadi, supra, which are entirely incorporated herein by reference, including all references cited therein.
  • the drug may be administered in the form of an expression vector comprising a nucleic acid encoding the peptide; such a vector, after incorporation into the genetic complement of a cell of the patient, directs synthesis of the peptide.
  • Suitable vectors include genetically engineered poxviruses (vaccinia), adenoviruses, adeno-associated viruses, herpesviruses and lentiviruses which are or have been rendered nonpathogenic.
  • a pharmaceutical composition may contain suitable pharmaceutically acceptable carriers, such as excipients, carriers and/or auxiliaries which facilitate processing of the active compounds into preparations which can be used pharmaceutically. See, e.g., Berkow, supra, Goodman, supra, Avery, supra and Ebadi, supra, which are entirely incorporated herein by reference, including all references cited therein.
  • Target Organism
  • The invention contemplates that it may be appropriate to ascertain or to mediate the biological activity of a substance of this invention in a target organism.
  • the target organism may be a plant, animal, or microorganism.
  • If a plant, it may be an economic plant, in which case the drug may be intended to increase the disease, weather or pest resistance, alter the growth characteristics, or otherwise improve the useful characteristics or mute undesirable characteristics of the plant.
  • it may be a weed, in which case the drug may be intended to kill or otherwise inhibit the growth of the plant, or to alter its characteristics to convert it from a weed to an economic plant.
  • the plant may be a tree, shrub, crop, grass, etc.
  • the plant may be an alga (algae are in some cases also microorganisms), or a vascular plant, especially gymnosperms (particularly conifers) and angiosperms.
  • Angiosperms may be monocots or dicots.
  • the plants of greatest interest are rice, wheat, corn, alfalfa, soybeans, potatoes, peanuts, tomatoes, melons, apples, pears, plums, pineapples, fir, spruce, pine, cedar, and oak.
  • If the target organism is a microorganism, it may be algae, bacteria, fungi, or a virus (although the biological activity of a virus must be determined in a virus-infected cell).
  • the microorganism may be a human or other animal or plant pathogen, or it may be nonpathogenic. It may be a soil or water organism, or one which normally lives inside other living things.
  • If the target organism is an animal, it may be a vertebrate or a nonvertebrate animal.
  • Nonvertebrate animals are chiefly of interest when they act as pathogens or parasites, and the drugs are intended to act as biocidal or biostatic agents.
  • Nonvertebrate animals of interest include worms, mollusks, and arthropods.
  • the target organism may also be a vertebrate animal, i.e., a mammal, bird, reptile, fish or amphibian.
  • Among mammals, the target animal preferably belongs to the order Primata (humans, apes and monkeys), Artiodactyla (e.g., cows, pigs, sheep, goats, horses), Rodenta (e.g., mice, rats), Lagomorpha (e.g., rabbits, hares), or Carnivora (e.g., cats, dogs).
  • Among birds, the target animals are preferably of the orders Anseriformes (e.g., ducks, geese, swans) or Galliformes (e.g., quails, grouse, pheasants, turkeys and chickens).
  • Among fish, the target animal is preferably of the order Clupeiformes (e.g., sardines, shad, anchovies, whitefish, salmon).
  • Target Tissues
  • The term “target tissue” refers to any whole animal, physiological system, whole organ, part of organ, miscellaneous tissue, cell, or cell component (e.g., the cell membrane) of a target animal in which biological activity may be measured. Routinely in mammals one would choose to compare and contrast the biological impact on virtually any and all tissues which express the subject receptor protein.
  • the main tissues to use are: brain, heart, lung, kidney, liver, pancreas, skin, intestines, adipose, stomach, skeletal muscle, adrenal glands, breast, prostate, vasculature, retina, cornea, thyroid gland, parathyroid glands, thymus, bone marrow, bone, etc.
  • B cells, T cells, macrophages, neutrophils, eosinophils, mast cells, platelets, megakaryocytes, erythrocytes, bone marrow stromal cells, fibroblasts, neurons, astrocytes, neuroglia, microglia, epithelial cells (from any organ, e.g., skin, breast, prostate, lung, intestines, etc.), cardiac muscle cells, smooth muscle cells, striated muscle cells, osteoblasts, osteocytes, chondroblasts, chondrocytes, keratinocytes, melanocytes, etc.
  • Screening assays will typically be either in vitro (cell-free) assays, for binding to an immobilized receptor, or cell-based assays, for alterations in the phenotype of the cell.
  • The term in vivo is descriptive of an event, such as binding or enzymatic action, which occurs within a living organism.
  • the organism in question may, however, be genetically modified.
  • the term in vitro refers to an event which occurs outside a living organism. Parts of an organism (e.g., a membrane, or an isolated biochemical) are used, together with artificial substrates and/or conditions.
  • the term in vitro excludes events occurring inside or on an intact cell, whether of a unicellular or multicellular organism.
  • In vivo assays include both cell-based assays, and organismic assays.
  • the cell-based assays include both assays on unicellular organisms, and assays on isolated cells or cell cultures derived from multicellular organisms. The cell cultures may be mixed, provided that they are not organized into tissues or organs.
  • The term “organismic assay” refers to assays on whole multicellular organisms, and assays on isolated organs or tissues of such organisms.
  • the in vitro assays of the present invention may be applied to any suitable analyte-containing sample, and may be qualitative or quantitative in nature.
  • the sample will normally be a biological fluid, such as blood, urine, lymph, semen, milk, or cerebrospinal fluid, or a fraction or derivative thereof, or a biological tissue, in the form of, e.g., a tissue section or homogenate.
  • If a biological fluid or tissue, it may be taken from a human or other mammal, vertebrate or animal, or from a plant.
  • the preferred sample is blood, or a fraction or derivative thereof.
  • the assay may be a binding assay, in which one step involves the binding of a diagnostic reagent to the analyte, or a reaction assay, which involves the reaction of a reagent with the analyte.
  • the reagents used in a binding assay may be classified as to the nature of their interaction with analyte: (1) analyte analogues, or (2) analyte binding molecules (ABM). They may be labeled or insolubilized.
  • the assay may look for a direct reaction between the analyte and a reagent which is reactive with the analyte, or if the analyte is an enzyme or enzyme inhibitor, for a reaction catalyzed or inhibited by the analyte.
  • the reagent may be a reactant, a catalyst, or an inhibitor for the reaction.
  • An assay may involve a cascade of steps in which the product of one step acts as the target for the next step. These steps may be binding steps, reaction steps, or a combination thereof.
  • Signal Producing System (SPS)
  • In order to detect the presence, or measure the amount, of an analyte, the assay must provide for a signal producing system (SPS) in which there is a detectable difference in the signal produced, depending on whether the analyte is present or absent (or, in a quantitative assay, on the amount of the analyte).
  • the detectable signal may be one which is visually detectable, or one detectable only with instruments. Possible signals include production of colored or luminescent products, alteration of the characteristics (including amplitude or polarization) of absorption or emission of radiation by an assay component or product, and precipitation or agglutination of a component or product.
  • The term “signal” is intended to include the discontinuance of an existing signal, or a change in the rate of change of an observable parameter, rather than a change in its absolute value.
  • the signal may be monitored manually or automatically.
  • the signal is often a product of the reaction.
  • In a binding assay, it is normally provided by a label borne by a labeled reagent.
  • a label may be, e.g., a radioisotope, a fluorophore, an enzyme, a co-enzyme, an enzyme substrate, an electron-dense compound, or an agglutinable particle.
  • the radioactive isotope can be detected by such means as the use of a gamma counter or a scintillation counter or by autoradiography.
  • Isotopes which are particularly useful for the purpose of the present invention include ³H, ¹²⁵I, ¹³¹I, ³⁵S, ¹⁴C, ³²P and ³³P. ¹²⁵I is preferred for antibody labeling.
  • the label may also be a fluorophore.
  • When the fluorescently labeled reagent is exposed to light of the proper wavelength, its presence can then be detected due to fluorescence.
  • Fluorescent labelling compounds include fluorescein isothiocyanate, rhodamine, phycoerythrin, phycocyanin, allophycocyanin, o-phthalaldehyde and fluorescamine.
  • fluorescence-emitting metals such as europium (Eu), or others of the lanthanide series, may be incorporated into a diagnostic reagent using such metal chelating groups as diethylenetriaminepentaacetic acid (DTPA) or ethylenediamine-tetraacetic acid (EDTA).
  • the label may also be a chemiluminescent compound.
  • the presence of the chemiluminescently labeled reagent is then determined by detecting the presence of luminescence that arises during the course of a chemical reaction.
  • Examples of chemiluminescent labeling compounds are luminol, isoluminol, theromatic acridinium ester, imidazole, acridinium salt and oxalate ester.
  • Bioluminescence is a type of chemiluminescence found in biological systems in which a catalytic protein increases the efficiency of the chemiluminescent reaction. The presence of a bioluminescent protein is determined by detecting the presence of luminescence.
  • Important bioluminescent compounds for purposes of labeling are luciferin, luciferase and aequorin.
  • Enzyme labels, such as horseradish peroxidase and alkaline phosphatase, are preferred.
  • the signal producing system must also include a substrate for the enzyme. If the enzymatic reaction product is not itself detectable, the SPS will include one or more additional reactants so that a detectable product appears.
  • An enzyme analyte may act as its own label if an enzyme inhibitor is used as a diagnostic reagent.
  • Binding assays may be divided into two basic types, heterogeneous and homogeneous.
  • In heterogeneous assays, the interaction between the affinity molecule and the analyte does not affect the label; hence, to determine the amount or presence of analyte, bound label must be separated from free label.
  • In homogeneous assays, the interaction does affect the activity of the label, and therefore analyte levels can be deduced without the need for a separation step.
  • the ABM is insolubilized by coupling it to a macromolecular support, and analyte in the sample is allowed to compete with a known quantity of a labeled or specifically labelable analyte analogue.
  • the “analyte analogue” is a molecule capable of competing with analyte for binding to the ABM, and the term is intended to include analyte itself. It may be labeled already, or it may be labeled subsequently by specifically binding the label to a moiety differentiating the analyte analogue from analyte.
  • the solid and liquid phases are separated, and the labeled analyte analogue in one phase is quantified.
  • The amount of analyte analogue in the solid phase (i.e., sticking to the ABM) is inversely related to the level of analyte in the sample.
  • both an insolubilized ABM, and a labeled ABM are employed.
  • the analyte is captured by the insolubilized ABM and is tagged by the labeled ABM, forming a ternary complex.
  • the reagents may be added to the sample in either order, or simultaneously.
  • the ABMs may be the same or different.
  • the amount of labeled ABM in the ternary complex is directly proportional to the amount of analyte in the sample.
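The two binding formats described above relate signal to analyte level in opposite directions. The sketch below is a deliberately simplified model, not a description of any particular assay: it assumes that, in the competitive format, ABM sites are limiting and that analyte and labeled analogue bind with equal affinity, and that, in the two-ABM format (often called a sandwich assay), both ABMs are in excess. The function names and units are hypothetical.

```python
def competitive_bound_fraction(analyte: float, labeled_analogue: float) -> float:
    """Fraction of ABM sites occupied by labeled analogue, assuming the
    sites are limiting and analyte and analogue bind with equal affinity."""
    if analyte < 0 or labeled_analogue <= 0:
        raise ValueError("analyte must be >= 0 and labeled_analogue > 0")
    return labeled_analogue / (labeled_analogue + analyte)


def sandwich_signal(analyte: float, signal_per_unit: float = 1.0) -> float:
    """Signal from labeled ABM in the ternary complex, assuming both ABMs
    are in excess so every analyte molecule is captured and tagged."""
    if analyte < 0:
        raise ValueError("analyte must be >= 0")
    return signal_per_unit * analyte


# More analyte lowers the competitive signal and raises the sandwich signal.
for a in (0.0, 1.0, 10.0):
    print(a, competitive_bound_fraction(a, labeled_analogue=1.0), sandwich_signal(a))
```

In practice either relationship would be read off a standard curve prepared with known analyte concentrations rather than computed from an idealized formula.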
  • a label may be conjugated, directly or indirectly (e.g., through a labeled anti-ABM antibody), covalently (e.g., with SPDP) or noncovalently, to the ABM, to produce a diagnostic reagent.
  • the ABM may be conjugated to a solid phase support to form a solid phase (“capture”) diagnostic reagent.
  • Suitable supports include glass, polystyrene, polypropylene, polyethylene, dextran, nylon, amyloses, natural and modified celluloses, polyacrylamides, agaroses, and magnetite.
  • the nature of the carrier can be either soluble to some extent or insoluble for the purposes of the present invention.
  • the support material may have virtually any possible structural configuration so long as the coupled molecule is capable of binding to its target.
  • the support configuration may be spherical, as in a bead, or cylindrical, as in the inside surface of a test tube, or the external surface of a rod.
  • the surface may be flat such as a sheet, test strip, etc.
  • a biological assay measures or detects a biological response of a biological entity to a substance.
  • the biological entity may be a whole organism, an isolated organ or tissue, freshly isolated cells, an immortalized cell line, or a subcellular component (such as a membrane; this term should not be construed as including an isolated receptor).
  • the entity may be, or may be derived from, an organism which occurs in nature, or which is modified in some way. Modifications may be genetic (including radiation and chemical mutants, and genetic engineering) or somatic (e.g., surgical, chemical, etc.). In the case of a multicellular entity, the modifications may affect some or all cells.
  • the entity need not be the target organism, or a derivative thereof, if there is a reasonable correlation between bioassay activity in the assay entity and biological activity in the target organism.
  • a culture medium may, but need not, contain serum or serum substitutes, and it may, but need not, include a support matrix of some kind; it may be still or agitated. It may contain particular biological or chemical agents, or have particular physical parameters (e.g., temperature), that are intended to nourish or challenge the biological entity.
  • the direct signal produced by the biological marker may be transformed by a signal producing system into a different signal which is more observable, for example, a fluorescent or colorimetric signal.
  • the entity, environment, marker and signal producing system are chosen to achieve a clinically acceptable level of sensitivity, specificity and accuracy.
  • the goal will be to identify substances which mediate the biological activity of a natural biological entity, and the assay is carried out directly with that entity.
  • the biological entity is used simply as a model of some more complex (or otherwise inconvenient to work with) biological entity.
  • the model biological entity is used because activity in the model system is considered more predictive of activity in the ultimate natural biological entity than is simple binding activity in an in vitro system.
  • the model entity is used instead of the ultimate entity because the latter is more expensive or slower to work with, or because ethical considerations forbid working with the ultimate entity yet.
  • the model entity may be naturally occurring, if the model entity usefully models the ultimate entity under some conditions. Or it may be non-naturally occurring, with modifications that increase its resemblance to the ultimate entity.
  • Transgenic animals such as transgenic mice, rats, and rabbits, have been found useful as model systems.
  • the receptor may be functionally connected to a signal (biological marker) producing system, which may be endogenous or exogenous to the cell.
  • the binding of a peptide to the target protein results in a screenable or selectable phenotypic change, without resort to fusing the target protein (or a ligand binding moiety thereof) to an endogenous protein.
  • the target protein is endogenous to the host cell, or is substantially identical to an endogenous receptor so that it can take advantage of the latter's native signal transduction pathway.
  • sufficient elements of the signal transduction pathway normally associated with the target protein may be engineered into the cell so that the cell signals binding to the target protein.
  • a chimeric receptor, i.e., a hybrid of the target protein and an endogenous receptor
  • the chimeric receptor has the ligand binding characteristics of the target protein and the signal transduction characteristics of the endogenous receptor.
  • the normal signal transduction pathway of the endogenous receptor is subverted.
  • the endogenous receptor is inactivated, or the conditions of the assay avoid activation of the endogenous receptor, to improve the signal-to-noise ratio. See Fowlkes USP 5,789,184 for a yeast system.
  • Another type of "one-hybrid" system combines a peptide:DNA-binding domain fusion with an unfused target receptor that possesses an activation domain.
  • the cell-based assay is a two hybrid system.
  • This term implies that the ligand is incorporated into a first hybrid protein, and the receptor into a second hybrid protein.
  • the first hybrid also comprises component A of a signal generating system, and the second hybrid comprises component B of that system.
  • Components A and B, by themselves, are insufficient to generate a signal. However, if the ligand binds the receptor, components A and B are brought into sufficiently close proximity so that they can cooperate to generate a signal.
  • Components A and B may naturally occur, or be substantially identical to moieties which naturally occur, as components of a single naturally occurring biomolecule, or they may naturally occur, or be substantially identical to moieties which naturally occur, as separate naturally occurring biomolecules which interact in nature.
  • Two-Hybrid System: Transcription Factor Type
  • one member of a peptide ligand:receptor binding pair is expressed as a fusion to a DNA-binding domain (DBD) from a transcription factor (this fusion protein is called the "bait"), and the other is expressed as a fusion to a transactivation domain (TAD) (this fusion protein is called the "fish", the "prey", or the "catch").
  • the transactivation domain should be complementary to the DNA-binding domain, i.e., it should interact with the latter so as to activate transcription of a specially designed reporter gene that carries a binding site for the DNA-binding domain.
  • the two fusion proteins must likewise be complementary.
  • This complementarity may be achieved by use of the complementary and separable DNA-binding and transcriptional activator domains of a single transcriptional activator protein, or one may use complementary domains derived from different proteins.
  • the domains may be identical to the native domains, or mutants thereof.
  • the assay members may be fused directly to the DBD or TAD, or fused through an intermediate linker.
  • the target DNA operator may be the native operator sequence, or a mutant operator. Mutations in the operator may be coordinated with mutations in the DBD and the TAD.
  • An example of a suitable transcription activation system is one comprising the DNA-binding domain from the bacterial repressor LexA and the activation domain from the yeast transcription factor Gal4, with the reporter gene operably linked to the LexA operator. It is not necessary to employ the intact target receptor; just the ligand-binding moiety is sufficient.
  • the two fusion proteins may be expressed from the same or different vectors.
  • the activatable reporter gene may be expressed from the same vector as either fusion protein (or both proteins) , or from a third vector.
  • Potential DNA-binding domains include Gal4, LexA, and mutant domains substantially identical to the above.
  • Potential activation domains include E. coli B42, Gal4 activation domain II, and HSV VP16, and mutant domains substantially identical to the above.
  • Potential operators include the native operators for the desired activation domain, and mutant operators substantially identical to the native operator.
  • the fusion proteins may comprise nuclear localization signals.
  • the assay system will include a signal producing system, too.
  • the first element of this system is a reporter gene operably linked to an operator responsive to the DBD and TAD of choice.
  • the expression of this reporter gene will result, directly or indirectly, in a selectable or screenable phenotype (the signal) .
  • the signal producing system may include, besides the reporter gene, additional genetic or biochemical elements which cooperate in the production of the signal. Such an element could be, for example, a selective agent in the cell growth medium.
  • the sensitivity of the system may be adjusted by, e.g., use of competitive inhibitors of any step in the activation or signal production process, increasing or decreasing the number of operators, using a stronger or weaker DBD or TAD, etc.
  • the assay is said to be a selection.
  • when the signal merely results in a detectable phenotype by which the signaling cell may be differentiated from the same cell in a nonsignaling state (either way being a living cell), the assay is a screen.
  • the term "screening assay" may be used in a broader sense to include a selection. When the narrower sense is intended, we will use the term "nonselective screen".
  • Various screening and selection systems are discussed in Ladner, USP 5,198,346.
  • Screening and selection may be for or against the peptide:target protein or compound:target protein interaction.
  • Preferred assay cells are microbial (bacterial, yeast, algal, protozoal), invertebrate, vertebrate (esp. mammalian, particularly human).
  • the best developed two-hybrid assays are yeast and mammalian systems.
  • two-hybrid assays are used to determine whether a protein X and a protein Y interact, by virtue of their ability to reconstitute the interaction of the DBD and the TAD.
  • augmented two-hybrid assays have been used to detect interactions that depend on a third, non-protein ligand.
  • the components A and B reconstitute an enzyme which is not a transcription factor.
  • the effect of the reconstitution of the enzyme is a phenotypic change which may be a screenable change, a selectable change, or both.
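The two-hybrid logic described above reduces to a simple condition: a signal arises only when both hybrids are present and the ligand binds the receptor, bringing components A and B together. The sketch below models only that condition. The class and function names are hypothetical, and the "selection" branch assumes the conventional reading in which the signal determines cell survival, which this text does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoHybridCell:
    has_first_hybrid: bool       # ligand fused to component A (e.g., a DBD)
    has_second_hybrid: bool      # receptor fused to component B (e.g., a TAD)
    ligand_binds_receptor: bool  # whether the two fusion partners interact


def signal_generated(cell: TwoHybridCell) -> bool:
    """A and B cooperate only if both hybrids exist and binding brings them together."""
    return (cell.has_first_hybrid
            and cell.has_second_hybrid
            and cell.ligand_binds_receptor)


def readout(cell: TwoHybridCell, selective: bool) -> str:
    """Assumed convention: a selection couples the signal to survival;
    a nonselective screen leaves both cell states alive but distinguishable."""
    active = signal_generated(cell)
    if selective:
        return "survives" if active else "dies"
    return "signal" if active else "no signal"
```

For example, `readout(TwoHybridCell(True, True, False), selective=False)` gives `"no signal"`, since components A and B alone are insufficient.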
  • Radio-labeled ABM may be administered to the human or animal subject. Administration is typically by injection, e.g., intravenous or arterial or other means of administration in a quantity sufficient to permit subsequent dynamic and/or static imaging using suitable radio-detecting devices.
  • the dosage is the smallest amount capable of providing a diagnostically effective image, and may be determined by means conventional in the art, using known radio-imaging agents as a guide.
  • the imaging is carried out on the whole body of the subject, or on that portion of the body or organ relevant to the condition or disease under study.
  • the amount of radio-labeled ABM accumulated at a given point in time in relevant target organs can then be quantified.
  • a particularly suitable radio-detecting device is a scintillation camera, such as a gamma camera.
  • a scintillation camera is a stationary device that can be used to image distribution of radio-labeled ABM.
  • the detection device in the camera senses the radioactive decay, the distribution of which can be recorded.
  • Data produced by the imaging system can be digitized.
  • the digitized information can be analyzed over time discontinuously or continuously.
  • the digitized data can be processed to produce images, called frames, of the pattern of uptake of the radio-labeled ABM in the target organ at a discrete point in time. In most continuous (dynamic) studies, quantitative data is obtained by observing changes in distributions of radioactive decay in target organs over time.
  • a time-activity analysis of the data will illustrate uptake through clearance of the radio-labeled binding protein by the target organs with time.
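A time-activity analysis of digitized frames can be sketched as follows. This is an illustrative outline only: the frame values are hypothetical, each frame is reduced to a single count for the target organ, and no decay or background correction is applied.

```python
def time_activity_curve(frames):
    """Order (time_min, counts) frames by time and find the time of peak uptake.

    frames: iterable of (acquisition time in minutes, counts in the target organ).
    Returns the time-ordered curve and the time at which counts are highest.
    """
    curve = sorted(frames)
    peak_time, _peak_counts = max(curve, key=lambda frame: frame[1])
    return curve, peak_time


# Hypothetical frames: uptake rises, peaks, then clears.
frames = [(30, 800), (0, 120), (10, 640), (20, 910), (60, 450)]
curve, peak_time = time_activity_curve(frames)
```

The rising part of the curve before `peak_time` corresponds to uptake and the falling part to clearance.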
  • Various factors should be taken into consideration in selecting an appropriate radioisotope .
  • the radioisotope must be selected with a view to obtaining good quality resolution upon imaging, should be safe for diagnostic use in humans and animals, and should preferably have a short physical half-life so as to decrease the amount of radiation received by the body.
  • the radioisotope used should preferably be pharmacologically inert, and, in the quantities administered, should not have any substantial physiological effect.
  • the ABM may be radio-labeled with different isotopes of iodine, for example 123I, 125I, or 131I (see, for example, U.S. Patent 4,609,725).
  • the extent of radio-labeling must, however, be monitored, since it will affect the calculations made based on the imaging results (e.g., a diiodinated ABM will result in twice the radiation count of a similar monoiodinated ABM over the same time frame).
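The effect of labeling extent on the calculations can be illustrated with a small normalization step. This is a sketch under the assumption stated in the text, that counts scale linearly with the number of iodine atoms per ABM molecule; the function name and the count values are hypothetical.

```python
def counts_per_abm_equivalent(observed_counts: float, iodine_atoms_per_abm: float) -> float:
    """Normalize observed counts by the average number of radio-iodine atoms per ABM.

    Assumes counts are proportional to labeling extent, so a diiodinated ABM
    contributes twice the counts of a monoiodinated one over the same time frame.
    """
    if iodine_atoms_per_abm <= 0:
        raise ValueError("labeling extent must be positive")
    return observed_counts / iodine_atoms_per_abm


# Hypothetical counts: the same amount of ABM, labeled to different extents.
mono = counts_per_abm_equivalent(1000.0, 1)
di = counts_per_abm_equivalent(2000.0, 2)
```

After normalization the two preparations report the same value, which is why the labeling extent has to be known before accumulated amounts are compared.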
  • it is preferable to use radioisotopes other than 125I for labeling in order to decrease the total dosimetry exposure of the human body and to optimize the detectability of the labeled molecule (though this radioisotope can be used if circumstances require). Ready availability for clinical use is also a factor. Accordingly, for human applications, preferred radio-labels are, for example, 99mTc, 67Ga, 68Ga, 90Y, 111In, 113mIn, 13 I, 186Re, 188Re or 211At.
  • the radio-labeled ABM may be prepared by various methods. These include radio-halogenation by the chloramine-T method or the lactoperoxidase method and subsequent purification by HPLC (high pressure liquid chromatography), for example as described by J. Gutkowska et al. in "Endocrinology and Metabolism Clinics of America" (1987) 16(1):183. Other known methods of radio-labeling can be used, such as IODOBEADS™.
  • there are a number of different methods of delivering the radio-labeled ABM to the end-user. It may be administered by any means that enables the active agent to reach the agent's site of action in the body of a mammal. Because proteins are subject to being digested when administered orally, parenteral administration, i.e., intravenous, subcutaneous, intramuscular, would ordinarily be used to optimize absorption of an ABM, such as an antibody, which is a protein.
  • Three-week-old male C57BL/6 mice were placed on either a normal diet (PMI Nutrition International, Inc., Brentwood, MO, Prolab RMH3000) or a high-fat diet (BioServe, Frenchtown, NJ, #F1850) for 8 weeks. High-fat fed mice were chosen which qualified as hyperinsulinemic (but non-diabetic) or as diabetic, per criteria set forth previously. A mouse fed the normal diet and demonstrating normal weight gain, normal fasting plasma insulin levels and normal fasting blood glucose levels was chosen as the Control animal. Control, Hyperinsulinemic and Type II diabetic mice were sacrificed at 11 weeks of age (8 weeks on the feeding protocol) and total liver RNA was isolated. The mice chosen were:
  • Blood glucose levels were measured from a drop of blood taken from the tip of the tail of fasted (6 hr) mice using a Lifescan Genuine One Touch glucometer. All measurements occurred between 3:00 p.m. and 5:00 p.m.
  • Plasma insulin measurements. Blood was collected from the tail of fasted (6 hr) mice into a heparinized capillary tube and stored on ice. All collections occurred between 3:00 p.m. and 5:00 p.m. Plasma was separated from red blood cells by centrifugation for 10 minutes at 8000 x g and then stored at -20 °C. Insulin concentrations were determined using the Rat Insulin ELISA kit and rat insulin standards (ALPCO) essentially as instructed by the manufacturer. Values were adjusted by a factor of 1.23 as determined by the manufacturer to correct for the species difference in cross-reactivity with the antibody.
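The species correction described above amounts to a single scaling step. The sketch below assumes the factor is applied by multiplication, which the text does not state explicitly; the function name and the example reading are hypothetical.

```python
SPECIES_CORRECTION_FACTOR = 1.23  # from the text; direction of application is assumed


def adjust_insulin(rat_standard_reading: float) -> float:
    """Scale an insulin value read against rat standards to a mouse-corrected value.

    Assumption: "adjusted by a factor of 1.23" means multiplication.
    """
    return rat_standard_reading * SPECIES_CORRECTION_FACTOR


corrected = adjust_insulin(2.0)  # hypothetical reading, units as reported by the kit
```

If the manufacturer's factor were instead meant as a divisor, only the single arithmetic line would change.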
  • RNA isolation. Total RNA was isolated from livers using the RNA STAT-60 Total RNA/mRNA Isolation Reagent according to the manufacturer's instructions (Tel-Test, Friendswood, TX).
  • cDNA synthesis. cDNA was synthesized using 1 µg of total RNA from L-9, H-34 and H-43 mice using the SMART PCR cDNA Synthesis Kit according to the manufacturer's instructions (CLONTECH, Palo Alto, CA).
  • Library (A) included clones down-regulated in control mice compared to hyperinsulinemic (HI) mice; Library (Z) included clones up-regulated in control mice compared to hyperinsulinemic mice; Library (B) included clones down-regulated in Type-II diabetic mice compared to hyperinsulinemic mice; and Library (Y) included clones up-regulated in Type-II diabetic mice compared to hyperinsulinemic mice.
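The four library definitions can be held in a small lookup table. The sketch below assumes that the leading letter of a clone designation (for example Z41, A17, B8 or Y92, as used elsewhere in this text) names its library; that convention is inferred from the clone names and is not stated outright.

```python
# library letter -> (group compared against hyperinsulinemic mice, direction in that group)
LIBRARIES = {
    "A": ("control", "down"),
    "Z": ("control", "up"),
    "B": ("Type-II diabetic", "down"),
    "Y": ("Type-II diabetic", "up"),
}


def describe_clone(clone_id: str) -> str:
    """Describe a clone's expected behavior from its library letter (assumed convention)."""
    group, direction = LIBRARIES[clone_id[0].upper()]
    return (f"{clone_id}: {direction}-regulated in {group} mice "
            f"compared to hyperinsulinemic mice")
```

For instance, `describe_clone("Z41")` reports a clone up-regulated in control mice compared to hyperinsulinemic mice.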
  • Nucleotide sequence determination. Plasmid DNA from bacterial colonies carrying the differentially expressed cDNA inserts was isolated using the QIAprep Spin Miniprep Kit according to the manufacturer's instructions (Qiagen Inc., Santa Clarita, CA). Nucleotide sequences were determined by use of the ABI PRISM BigDye Terminator Cycle Sequencing Ready Reaction Kit with electrophoresis on the ABI PRISM 377 DNA Sequencer (PE Applied Biosystems, Foster City, CA). Nucleotide sequences and predicted amino acid sequences were compared to public domain databases using the Blast 2.0 program (National Center for Biotechnology Information, National Institutes of Health).
  • RNA isolated from Control, Hyperinsulinemic and Type-II Diabetic mice was resolved by agarose gel electrophoresis through a 1% agarose, 1% formaldehyde denaturing gel, transferred to positively charged nylon membrane, and hybridized to a probe labeled with [32P]dCTP that was generated from the cDNA insert using the Random Primed DNA Labeling Kit (Roche, Palo Alto, CA).
  • Nucleotide sequences and predicted amino acid sequences were compared to public domain databases using the Blast 2.0 program (National Center for Biotechnology Information, National Institutes of Health). Nucleotide sequences were displayed using ABI PRISM Edit View 1.0.1 (PE Applied Biosystems, Foster City, CA).
  • Nucleotide database searches were conducted with the then-current version of BLASTN 2.0.12; see Altschul et al., "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res., 25:3389-3402 (1997). Searches employed the default parameters, unless otherwise stated.
  • For blastN searches, the default was the blastN matrix (1,-3), with gap penalties of 5 for existence and 2 for extension.
  • Protein database searches were conducted with the then-current version of BLASTX; see Altschul et al. (1997), supra. Searches employed the default parameters, unless otherwise stated.
  • the scoring matrix was BLOSUM62, with gap costs of 11 for existence and 1 for extension.
  • the standard low complexity filter was used.
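To make the nucleotide scoring parameters concrete, the sketch below computes the raw score of an alignment from its matches, mismatches and gaps, using the blastN values given above (match 1, mismatch -3, gap existence 5, gap extension 2). It assumes the usual affine convention in which a gap of length k costs existence plus k times extension; the example alignment is hypothetical.

```python
def blastn_raw_score(matches: int, mismatches: int, gap_lengths,
                     match: int = 1, mismatch: int = -3,
                     gap_existence: int = 5, gap_extension: int = 2) -> int:
    """Raw alignment score under a match/mismatch matrix with affine gap costs.

    gap_lengths: one entry per gap, giving that gap's length in positions.
    Assumed convention: a gap of length k costs gap_existence + k * gap_extension.
    """
    gap_cost = sum(gap_existence + gap_extension * k for k in gap_lengths)
    return matches * match + mismatches * mismatch - gap_cost


# Hypothetical alignment: 100 matches, 2 mismatches, gaps of length 1 and 3.
score = blastn_raw_score(100, 2, [1, 3])
```

The same function covers other parameter sets by passing different keyword values, though protein searches score substitutions with a full matrix such as BLOSUM62 rather than a single match/mismatch pair.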
  • Ref indicates that NCBI's RefSeq is the source database.
  • the identifier that follows is a RefSeq accession number, not a GenBank accession number.
  • "RefSeq sequences are derived from GenBank and provide non-redundant curated data representing our current knowledge of known genes. Some records include additional sequence information that was never submitted to an archival database but is available in the literature. A small number of sequences are provided through collaboration; the underlying primary sequence data is available in GenBank, but may not be available in any one GenBank record. RefSeq sequences are not submitted primary sequences. RefSeq records are owned by NCBI and therefore can be updated as needed to maintain current annotation or to incorporate additional sequence information." See also http://www.ncbi.nlm.nih.gov/LocusLink/refseq.html
  • Clone Z41 is apparently a partial cDNA of a gene encoding the Mus musculus cytochrome P450 3a11 (Cyp3a11) protein.
  • the percentage identity was 97% (601/619) with 6 gaps.
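The percentage identities in this section are reported as whole numbers alongside the underlying counts, for example 97% (601/619). The sketch below reproduces them by truncating rather than rounding, an assumption inferred from values such as 98% (548/554) reported elsewhere in the text.

```python
def percent_identity(identical: int, aligned_length: int) -> int:
    """Whole-number percentage identity, truncated toward zero (assumed convention)."""
    if aligned_length <= 0:
        raise ValueError("aligned length must be positive")
    return (100 * identical) // aligned_length
```

With this convention, `percent_identity(601, 619)` gives 97 and `percent_identity(548, 554)` gives 98, matching the figures quoted for clones Z41 and Z74.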
  • the highest human matches in the main database were members of the cytochrome P450 subfamily 3. Note that these proteins are functionally related.
  • the cytochrome P450 proteins are monooxygenases that catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids.
  • NM_017460 cytochrome P450, subfamily IIIA (niphedipine oxidase), polypeptide 4 (CYP3A4), transcript variant 1, complete sequence.
  • This protein localizes to the endoplasmic reticulum and its expression is induced by glucocorticoids and some pharmacological agents. This enzyme is involved in the metabolism of approximately half the drugs which are used today, including acetaminophen, codeine, cyclosporin A, diazepam and erythromycin.
  • NM_000776 cytochrome P450, subfamily IIIA (niphedipine oxidase), polypeptide 3 (CYP3A3), complete sequence.
  • X12387 mRNA for cytochrome P-450 (cyp3 locus), complete sequence.
  • AF182273 cytochrome P450-3A4 (CYP3A4) mRNA, complete sequence.
  • M18907 P450 mRNA encoding nifedipine oxidase, complete sequence.
  • M13785 Liver glucocorticoid-inducible cytochrome P-450 (HLp) mRNA, complete sequence.
  • BC033862/NM_000777 cytochrome P450, subfamily IIIA (CYP3A5) (niphedipine oxidase), polypeptide 5, complete sequence.
  • J04814 cytochrome P450 PCN3 mRNA, complete sequence.
  • NP_000767 cytochrome P450, subfamily IIIA (niphedipine oxidase) , polypeptide 3
  • NP_000768 cytochrome P450, subfamily IIIA, polypeptide 5
  • AAA35747 cytochrome P450 nifedipine oxidase
  • NP_059488 cytochrome P450, subfamily IIIA, polypeptide 4 ; nifedipine oxidase; P450-III, steroid inducible; glucocorticoid-inducible P450; cytochrome P450, subfamily
  • NP_000756 cytochrome P450, subfamily IIIA, polypeptide 7
  • NP_073731 cytochrome P450, family 3, subfamily A polypeptide 43 isoform 1.
  • NP_476436 cytochrome P450, family 3, subfamily A, polypeptide 43 isoform 2.
  • NP_476437 cytochrome P450 polypeptide 43; cytochrome P450, subfamily IIIA, polypeptide 43
  • Nucleotide database search Blast N Clone Z74 is apparently a partial cDNA of a gene encoding the Mus musculus synovial sarcoma translocation, mRNA.
  • the percentage identity was 98% (548/554) with 2 gaps.
  • NP_033306 synovial sarcoma translocation, Chromosome 18; synovial sarcoma translocated to X chromosome.
  • This protein represents the mouse homolog of SYT, a gene implicated in the development of human synovial sarcomas as described in de Bruijn, D.R. et al., Oncogene 13(3), 643-648 (1996).
  • Blast X No significant similarity found. This is expected; the coding sequence of NM_009280 is from bases 180 to 1436. The sequence isolated as clone Z74 is therefore within the 3' untranslated region of NM_009280.
  • Blast P of NP_033306 indicated significant homology to: AAG31034: SYT/SSX4 fusion protein; AAM00188: SYT protein; AAK21314: SYT variant 1
  • NP_005628 Synovial sarcoma, translocated to X chromosome
  • Insert size 832 bp query sequence (SEQ ID NO: 8)
  • Nucleotide database search Blast N Clone Y92 is apparently a partial cDNA of a gene encoding the Mus musculus cytochrome P450, 4a14 (Cyp4a14), mRNA.
  • the percentage identity was
  • the highest human matches in the main database were members of the cytochrome P450 subfamily 4. Note that these proteins are functionally related.
  • the cytochrome P450 proteins are monooxygenases that catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids.
  • NM_000778 cytochrome P450, family 4, subfamily A, polypeptide 11 (CYP4A11) , mRNA, complete sequence.
  • L04751 cytochrome p-450 4A (CYP4A) mRNA, complete sequence .
  • D13705 mRNA for fatty acids omega-hydroxylase (cytochrome
  • Protein database search Blast X The best score in the main database was with Mus musculus cytochrome P450 4a14, NP_031848 (score 392 bits, e value e-109). Again, the highest human matches in the main database were members of the cytochrome P450 subfamily 4 as follows:
  • NP_000769 cytochrome P450, subfamily IVA, polypeptide 11; fatty acid omega-hydroxylase; P450HL-omega; alkane-1 monooxygenase; lauric acid omega-hydroxylase
  • BAA02864 fatty acid omega-hydroxylase
  • cytochrome P450 4B1
  • NP_000770 cytochrome P450, subfamily IVB, polypeptide 1; cytochrome P450, subfamily IVB, member 1; microsomal monooxygenase
  • Insert size 953 bp query sequence (SEQ ID NO: 11)
  • Nucleotide database search Blast N Clone Z19 is apparently a partial cDNA of a gene encoding the Mus musculus RIKEN cDNA 2810007J24, mRNA.
  • the percentage identity was 97% (335/345) with 1 gap. (Note that since the orientation of the cDNA inserts in the cloning vector was not known, "plus" was assigned arbitrarily for the purpose of the Blast alignment. So, since the match is to the minus strand of a known DNA, we assume that the strand labeled "plus" was actually the minus strand of XM_133188.)
  • The region of XM_133188 between bases 385 and 1086 has been identified as a potential sulfotransferase region. Therefore, XM_133188 may encode a sulfotransferase protein. Human Blast N: No significant similarity found.
  • Blast X No significant similarity found. This is expected; the coding sequence of XM_133188 is from bases 277 to 1125. The sequence isolated as clone Z19 is therefore within the 3' untranslated region of XM_133188. The mouse protein corresponding to XM_133188 is XP_133188.
  • Insert size 864 bp query sequence (SEQ ID NO: 3) Nucleotide database search
  • Blast N Clone A17 is apparently a partial cDNA of a gene encoding the Mus musculus kallistatin-related protein, mRNA.
  • the percentage identity was 92% (634/687) with 34 gaps.
  • Blast X No significant similarity found. This is expected; the coding sequence of AF453874 is from bases 2127 to 2513. The sequence isolated as clone A17 is therefore within the 5' untranslated region of AF453874.
  • the corresponding human gene encodes the serine (or cysteine) proteinase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 4 (SERPINA4), mRNA (NM_006215).
  • Insert size 800 bp query sequence (SEQ ID NO: 10)
  • Nucleotide database search Blast N Clone A53 is apparently a partial cDNA of a gene encoding the Mus musculus adult male liver tumor cDNA, RIKEN clone: C730029H04 product, mRNA.
  • the percentage identity was 99% (653/657) with 2 gaps.
  • Protein database search Blast X No significant similarity found. This is expected; AK050237 contains 5 open reading frames (ORFs) located between bases 546 and 1417, the longest being 246 bases (82 aa) . The sequences contained within clone A53 are located outside of this region (bases 2083-2738) . Therefore, clone A53 may represent a portion of either the 5' UTR or 3'UTR of AK050237.
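The reasoning above, that a clone lying outside the open reading frames would not be expected to produce a Blast X hit, rests on a simple interval comparison. The sketch below uses the coordinates quoted for clone A53 and AK050237; the function name is hypothetical, and positions are taken on the deposited strand without resolving whether the region is a 5' or 3' UTR, which the text also leaves open.

```python
def locate_relative_to_orfs(region, orf_span):
    """Compare an aligned region with the span holding the ORFs (1-based, inclusive).

    region, orf_span: (start, end) tuples in the coordinates of the database record.
    Returns "upstream", "downstream" or "overlaps".
    """
    region_start, region_end = region
    orf_start, orf_end = orf_span
    if region_end < orf_start:
        return "upstream"
    if region_start > orf_end:
        return "downstream"
    return "overlaps"


# Clone A53 aligns to bases 2083-2738; the ORFs lie between bases 546 and 1417.
a53_location = locate_relative_to_orfs((2083, 2738), (546, 1417))
```

Any result other than "overlaps" means the clone sequence carries no coding sequence from the record, consistent with the absence of a protein-level match.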
  • the corresponding human gene encodes the one cut domain, family member 1 (ONECUT1), mRNA (NM_004498).
  • Insert size 751 bp query sequence (SEQ ID NO: 12)
  • Nucleotide database search Blast N Clone A104 is apparently a partial cDNA of a gene encoding the Mus musculus H2A histone family, member Y (H2afy), mRNA.
  • the percentage identity was 98% (465/471).
  • the highest human matches in the main database were: NM_004893: Homo sapiens H2A histone family, member Y (H2AFY), transcript variant 2, complete sequence.
  • BC013331 Homo sapiens, clone MGC:13692 IMAGE:4077577, complete sequence; AF054174: Homo sapiens histone macroH2A1.2, complete sequence
  • XP_127380 H2A histone family, member Y
  • AF171080 histone macroH2A1.2 variant
  • BAB68541 MacroH2A1.2
  • NP_004884 H2A histone family, member Y isoform 2; histone macroH2A1.2; histone macroH2A1.1
  • AAC39908 histone macroH2A1.2
  • NP_613258 H2A histone family, member Y isoform 3; histone macroH2A1.2; histone macroH2A1.1
  • H2AY_HUMAN: Core histone macro-H2A.1 (Histone macroH2A1) (mH2A1) (H2A.y) (H2A/y)
  • NP_613075 H2A histone family, member Y isoform 1; histone macroH2A1.2; histone macroH2A1.1
  • AAH13331 Unknown (protein for MGC:13692)
  • Insert size 829 bp query sequence (SEQ ID NO: 4)
  • Clone B8 is apparently a partial cDNA of a gene encoding the Mus musculus liver-specific uridine phosphorylase, mRNA.
  • the percentage identity was 98% (352/358) with 2 gaps. (Note that since the orientation of the cDNA inserts in the cloning vector was not known, "plus" was assigned arbitrarily for the purpose of the Blast alignment. So, since the match is to the minus strand of a known DNA, we assume that the strand labeled "plus" was actually the minus strand of AY152393.)
  • the three possible corresponding human genes are:
  • NM_173355 Liver-specific uridine phosphorylase (LOC151531)
  • XM_087230 Similar to Uridine phosphorylase (UDRPase) (LOC151531)
  • XM_087230 Similar to Uridine phosphorylase (UDRPase) (LOC151531)
  • Blast X The corresponding Mus musculus proteins are : NP_083968: liver-specific uridine phosphorylase AAO05705: liver-specific uridine phosphorylase
  • liver-specific uridine phosphorylase or a protein similar to uridine phosphorylase as follows : NP_775491: Liver-specific uridine phosphorylase XP_087230: Similar to Uridine phosphorylase (UDRPase) AAH33529: Similar to uridine phosphorylase AAD12227: Similar to uridine phosphorylase; similar to Q16831 (PID:g2494059)
  • Insert size 851 bp query sequence (SEQ ID NO: 5)
  • Clone B39 is apparently a partial cDNA of a gene encoding the Mus musculus TRAM1, mRNA.
  • the percentage identity was 93% (352/358) with 3 gaps.
  • B39 may be a partial cDNA of the following three other Mus musculus genes:
  • AK088814 2 days neonate thymus thymic cells cDNA, RIKEN clone:E430026I15 product:TRAM1 (UNKNOWN) (PROTEIN FOR
  • TRAM translocating chain-associating membrane protein
  • the corresponding human proteins may also be the unknown protein for MGC:33851 (AAH37738); the hypothetical protein MGC26568 (NP_689615); the unnamed protein product (BAC11091) or the TRAM-like protein KIAA0057 (NP_036420)
  • Insert size 966 bp query sequence (SEQ ID NO: 6)
  • Nucleotide database search Blast N Clone Y68 is apparently a partial cDNA of a gene encoding either the Mus musculus integral membrane protein 2B and/or the E25B protein, mRNA.
  • the percentage identity was 96% (631/653) with 1 gap.
  • the percentage identity was 96% (631/653) with 2 gaps.
  • Human Blast N The highest human matches in the main database were to integral membrane protein 2B or to genes encoding proteins similar to integral membrane protein 2B.
  • NM_021999 integral membrane protein 2B (ITM2B); BC016148: Similar to integral membrane protein 2B, clone MGC:10219 IMAGE:3912066
  • BC000554 Similar to integral membrane protein 2B, clone MGC:1034 IMAGE:3163436
  • Protein database search Blast X The corresponding Mus musculus proteins are integral membrane protein 2B (AAH21786) and/or E25B protein (AAC63851).
  • Human Blast X The corresponding human proteins may be: NP_068839: Integral membrane protein 2B; AAH16148: Similar to integral membrane protein 2B; AAH00554: Similar to integral membrane protein 2B; AAD40370: putative transmembrane protein E3-16; AF152462: transmembrane protein BRI; BAA91210: unnamed protein product
  • Insert size 842 bp query sequence (SEQ ID NO: 9)
  • Nucleotide database search Blast N Clone Y89 is apparently a partial cDNA of a gene encoding the Mus musculus vitronectin, clone MGC:21423 IMAGE:4500844 mRNA.
  • the percentage identity was 99% (236/237).
  • Protein database search Blast X The corresponding Mus musculus protein is Vitronectin (AAH18521). The corresponding human protein is NP_000629: vitronectin; serum spreading factor; somatomedin B; complement S-protein.
  • Clone Y91
  • Insert size 8806bp query sequence (SEQ ID NO: 7)
  • Nucleotide database search Blast N Clone Y91 is apparently a partial cDNA of a gene encoding the Mus musculus, SPI-2 mRNA.
  • the percentage identity was 95% (683/714) with 12 gaps (Note that since the orientation of the cDNA inserts in the cloning vector was not known, "plus” was assigned arbitrarily for the purpose of the Blast alignment. So, since the match is to the minus strand of a known DNA, we assume that the strand labeled "plus" was actually the minus strand of X56786) .
  • Human Blast N There were no significant human matches. The highest scoring human match was:
  • Protein database search Blast X The corresponding Mus musculus protein is contraspin (CAA40106) .
  • references cited herein, including journal articles or abstracts, published, corresponding, prior or otherwise related U.S. or foreign patent applications, issued U.S. or foreign patents, or any other references, are entirely incorporated by reference herein, including all data, tables, figures, and text presented in the cited references. Additionally, the entire contents of the references cited within the references cited herein are also entirely incorporated by reference.
  • any description of a class or range as being useful or preferred in the practice of the invention shall be deemed a description of any subclass (e.g., a disclosed class with one or more disclosed members omitted) or subrange contained therein, as well as a separate description of each individual member or value in said class or range.
  • Subtractive hybridization cloning: an efficient technique to detect overexpressed mRNAs in diabetic nephropathy.
  • This clone contains inserted sequences from two different mouse mRNAs: 1) Partial sequence (345 bp) of Mus musculus RIKEN cDNA 2810007J24 gene (2810007J24Rik), mRNA; 2) Partial sequence (178 bp) of Mus musculus cytochrome P450, 3a11
  • Col. 1 The internal designation for the clone. The sequences for the clones appear in tables 1-12.
  • Col. 2 There are three pieces of information here: (1) The database accession number for the mouse gene "corresponding" to the clone as determined by database searching, (2) in parentheses, the E value for the alignment of the clone sequence to the mouse gene. It is the expected number of matches with the same or better alignment score that would have occurred through chance. The lower the E value, the more statistically significant the alignment. (3) the database accession number for the mouse protein corresponding to the mouse gene above.
  • Col. 4 A human protein deemed to correspond to the clone, identified by database accession number and by name. Note that more than one human protein may be so identified. The human proteins are listed in order of correspondence to the clone, from most to least closely corresponding.
  • Col. 5 The E value for the alignment of the query sequence set forth in col. 6 to the human protein set forth in col. 4. There is one entry for each human protein in col. 4.
  • Col. 6 The method used to align the human protein of col. 4 to the query sequence, and the identity of the query sequence. If the query sequence is a clone sequence, the method will be BLASTX (DNA vs. protein). If the query sequence is the mouse gene of col. 2, the method will again be BLASTX. If the query sequence is the mouse protein of col. 2, the method will be BLASTP (protein vs. protein).
  • Col. 7 The database accession number of the corresponding human gene. There is one entry for each human protein in col. 4.
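The E values in cols. 2 and 5 relate to the bit scores reported elsewhere in this text (for example, "score 392 bits, e value e-109"). As a rough sketch, the standard Karlin-Altschul relation gives the expected number of chance hits as the product of the query and database lengths times 2 raised to the negative bit score. The lengths below are hypothetical, and real BLAST output uses effective lengths that are not given in this document.

```python
def expected_chance_hits(bit_score: float, query_length: int, database_length: int) -> float:
    """Expected number of alignments scoring at least bit_score by chance.

    Uses E = m * n * 2**(-S'), where S' is the bit score and m and n are the
    (effective) query and database lengths; both lengths here are assumptions.
    """
    return query_length * database_length * 2.0 ** (-bit_score)


# Hypothetical lengths; a higher bit score always yields a lower E value.
weak = expected_chance_hits(40.0, 300, 10_000_000)
strong = expected_chance_hits(392.0, 300, 10_000_000)
```

This is why the column description treats a lower E value as the more statistically significant alignment.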
  • Col. 2 Corresponding mouse gene and protein.
  • Col. 3 U/F.
  • Col. 4 The classes and subclasses of human proteins deemed to correspond to the mouse clone. This is the result of extrapolation from the data of Master Table 1.
  • Master Table 1 is divided into three subtables on the basis of the "Behavior" in col. 3. If a gene has at least one favorable behavior, and no unfavorable ones, it is put into Subtable 1A. In the opposite case, it is put into Subtable 1B. If it shows both favorable and unfavorable behavior, it belongs to Subtable 1C. Master Table 2 is analogously divided into subtables 2A, 2B and 2C.
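The subtable rule just described can be written as a short function. This is a sketch only: the function name and the string labels "favorable" and "unfavorable" are illustrative stand-ins for the behavior entries of col. 3.

```python
def assign_subtable(behaviors, master_table: int = 1) -> str:
    """Assign a gene to subtable A, B or C of a master table from its behaviors.

    behaviors: iterable of "favorable" / "unfavorable" labels observed for the gene.
    A: favorable only; B: unfavorable only; C: both.
    """
    observed = set(behaviors)
    favorable = "favorable" in observed
    unfavorable = "unfavorable" in observed
    if favorable and unfavorable:
        suffix = "C"
    elif favorable:
        suffix = "A"
    elif unfavorable:
        suffix = "B"
    else:
        raise ValueError("gene has no recorded favorable or unfavorable behavior")
    return f"{master_table}{suffix}"
```

For example, `assign_subtable(["favorable", "unfavorable"])` returns `"1C"`, and passing `master_table=2` yields the analogous 2A, 2B and 2C labels.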
  • Master Table 2 Based on the related human proteins defined in Master Table 1, Master Table 2 generalizes, if possible, as to classes of human proteins which are expected to have similar behavior. For a given mouse gene, several human protein classes may be listed because of the diversity of the human proteins found to be related. In some cases, the stated human protein classes may be hierarchial, e.g., one may be a subset of another. In other cases, the stated classes may be non-overlapping but related. And in yet other cases, the stated classes may be non-overlapping and unrelated. Combinations of the above are also possible.
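
The subtable rule and the col. 6 alignment-method rule described above can be illustrated with a short sketch. This is not code from the application. The function names, the "F"/"U" encoding of favorable and unfavorable behavior, and the query-kind labels are assumptions made for illustration only.

```python
from enum import Enum


class Subtable(Enum):
    A = "A"  # at least one favorable behavior, no unfavorable ones
    B = "B"  # at least one unfavorable behavior, no favorable ones
    C = "C"  # both favorable and unfavorable behavior


def assign_subtable(behaviors):
    """Pick the subtable for a gene from its col. 3 behavior codes.

    `behaviors` is an iterable of "F" (favorable) or "U" (unfavorable);
    this encoding is a hypothetical one chosen for the sketch.
    """
    codes = {b.strip().upper() for b in behaviors}
    if not codes or codes - {"F", "U"}:
        raise ValueError(f"expected only 'F'/'U' codes, got {sorted(codes)}")
    if codes == {"F"}:
        return Subtable.A
    if codes == {"U"}:
        return Subtable.B
    return Subtable.C  # mixed favorable and unfavorable


def alignment_method(query_kind):
    """Col. 6 rule: DNA queries use BLASTX, protein queries use BLASTP.

    The keys below are hypothetical labels for the three query types.
    """
    methods = {
        "clone": "BLASTX",          # clone DNA vs. human protein
        "mouse_gene": "BLASTX",     # mouse gene DNA vs. human protein
        "mouse_protein": "BLASTP",  # mouse protein vs. human protein
    }
    return methods[query_kind]
```

For example, a gene recorded with behaviors `["F", "U"]` would fall into subtable C under this sketch, and a query labeled `"mouse_protein"` would be aligned with BLASTP.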

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Wood Science & Technology (AREA)
  • Analytical Chemistry (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Pathology (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Persons at risk of progression from a normoinsulinemic, non-diabetic state, or toward a type II diabetic state, can be identified by appropriate screening for one or more "favorable" or "unfavorable" human marker genes, or for the proteins they encode. In addition, the "favorable" genes and their corresponding proteins, and antagonists of the "unfavorable" ones, are useful in therapy.
PCT/US2004/009629 2003-03-31 2004-03-30 Diagnosis of hyperinsulinemia and type II diabetes and protection against same (I) WO2004092419A2 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US45839803P 2003-03-31 2003-03-31
US60/458,398 2003-03-31

Publications (2)

Publication Number Publication Date
WO2004092419A2 true WO2004092419A2 (fr) 2004-10-28
WO2004092419A3 WO2004092419A3 (fr) 2005-05-19

Family

ID=33299636

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2004/009629 WO2004092419A2 (fr) 2003-03-31 2004-03-30 Diagnosis of hyperinsulinemia and type II diabetes and protection against same (I)

Country Status (1)

Country Link
WO (1) WO2004092419A2 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2005082398A2 (fr) * 2004-02-26 2005-09-09 Ohio University Diagnosis of hyperinsulinemia and type II diabetes and protection against same based on genes differentially expressed in muscle cells

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030040009A1 (en) * 2001-08-14 2003-02-27 University Of Southern California Saliva-based methods for preventing and assessing the risk of diseases

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
CLONTECH: "PCR-Select differential screening kit. User Manual" 10 September 2001 (2001-09-10), pages 1-35, XP002307356 Retrieved from the Internet: URL:www.clontech.com> *
COROMINOLA H ET AL: "Identification of novel genes differentially expressed in omental fat of obese subjects and obese type 2 diabetic patients" DIABETES, NEW YORK, NY, US, vol. 50, no. 12, December 2001 (2001-12), pages 2822-2830, XP002293068 ISSN: 0012-1797 *
LINDSAY R S ET AL: "Adiponectin and development of type 2 diabetes in the Pima Indian population" LANCET, vol. 360, no. 9326, 6 July 2002 (2002-07-06), pages 57-58, XP004369659 ISSN: 0140-6736 *
PASS GEORGIA J ET AL: "Effect of hyperinsulinemia and type 2 diabetes-like hyperglycemia on expression of hepatic cytochrome p450 and glutathione s-transferase isoforms in a New Zealand obese-derived mouse backcross population." THE JOURNAL OF PHARMACOLOGY AND EXPERIMENTAL THERAPEUTICS. AUG 2002, vol. 302, no. 2, August 2002 (2002-08), pages 442-450, XP002307759 ISSN: 0022-3565 *
SCHUETZ J D ET AL: "SELECTIVE EXPRESSION OF CYTOCHROME P450 CYP3A MRNAS IN EMBRYONIC AND ADULT HUMAN LIVER" PHARMACOGENETICS, CHAPMAN & HALL, LONDON, GB, vol. 4, no. 1, 1994, pages 11-20, XP008015488 ISSN: 0960-314X *
SURWIT R S ET AL: "Differential effects of fat and sucrose on the development of obesity and diabetes in C57BL/6J and AJ mice" METABOLISM, CLINICAL AND EXPERIMENTAL, W.B. SAUNDERS CO., PHILADELPHIA, PA, US, vol. 44, no. 5, May 1995 (1995-05), pages 645-651, XP004540280 ISSN: 0026-0495 *
WESTLIND A ET AL: "INTERINDIVIDUAL DIFFERENCES IN HEPATIC EXPRESSION OF CYP3A4: RELATIONSHIP TO GENETIC POLYMORPHISM IN THE 5'-UPSTREAM REGULATORY REGION" BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS, ACADEMIC PRESS INC. ORLANDO, FL, US, vol. 259, 1999, pages 201-205, XP000907112 ISSN: 0006-291X *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2005082398A2 (fr) * 2004-02-26 2005-09-09 Ohio University Diagnosis of hyperinsulinemia and type II diabetes and protection against same based on genes differentially expressed in muscle cells
WO2005082398A3 (fr) * 2004-02-26 2006-01-26 Univ Ohio Diagnosis of hyperinsulinemia and type II diabetes and protection against same based on genes differentially expressed in muscle cells

Also Published As

Publication number Publication date
WO2004092419A3 (fr) 2005-05-19

Similar Documents

Publication Publication Date Title
US8022189B2 (en) Isolated antibodies against biologically active leptin-related peptides
JP2002525115A (ja) Genes and proteins for predicting and treating stroke, hypertension, diabetes and obesity
CA2375820C (fr) Myostatin gene promoter and method for inhibiting activation of said promoter
EP1732582A2 (fr) Diagnosis of hyperinsulinemia and type II diabetes and protection against same based on genes differentially expressed in muscle cells
WO2006023121A1 (fr) Diagnosis of hyperinsulinemia and type II diabetes and protection against same based on genes differentially expressed in white adipose tissue (13)
WO2005000335A2 (fr) Methods of diagnosis and treatment relating to aging, in particular aging of the liver
US20070142311A1 (en) Diagnosis of hyperinsulinemia and type II diabetes and protection against same
US8101564B2 (en) Methods for regulating osteoclast differentiation and bone resorption using LRRc17
WO2004092419A2 (fr) Diagnosis of hyperinsulinemia and type II diabetes and protection against same (I)
JP2002505844A (ja) Novel genes and uses thereof
US6627735B2 (en) Islet cell antigen 1851
WO2005046718A1 (fr) Diagnosis of hyperinsulinemia and type II diabetes and protection against same based on genes differentially expressed in pancreatic cells (12.1)
US20060240500A1 (en) Diagnosis of kidney damage and protection against same
WO2002101002A2 (fr) Novel polynucleotides and polypeptides of the hGH-V gene
EP1649063A2 (fr) Methods of diagnosis and treatment relating to aging (8a)
CN100417664C (zh) An interleukin-17 receptor, its encoding gene, and applications thereof
JP2006525031A (ja) Method for controlling Acheron expression
JP2003000268A (ja) Methods and compositions for diagnosing and treating diseases involving angiogenesis
Dutta Discovery of new medicines
WO2005067965A2 (fr) Use of protein products for preventing and treating pancreatic diseases and/or obesity and/or metabolic syndrome
JP2002369695A (ja) Novel tissue-specific secreted polypeptide and DNA thereof

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: The EPO has been informed by WIPO that EP was designated in this application
122 EP: PCT application non-entry in European phase