AU752699B2 - Identification and expression of insect steroid receptor DNA sequences - Google Patents

Publication number
AU752699B2
Authority
AU
Australia
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired
Application number
AU56563/00A
Other versions
AU5656300A (en
Inventor
David S Hogness
Michael R Koelle
William A Segraves
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Leland Stanford Junior University
Original Assignee
Leland Stanford Junior University
Application filed by Leland Stanford Junior University
Priority to AU56563/00A
Publication of AU5656300A
Application granted
Publication of AU752699B2
Anticipated expiration
Expired

Landscapes

  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Peptides Or Proteins (AREA)
  • Preparation Of Compounds By Using Micro-Organisms (AREA)

Description

Our Ref:7523980 P/00/011 Regulation 3:2
AUSTRALIA
Patents Act 1990
ORIGINAL
COMPLETE SPECIFICATION
STANDARD PATENT
Applicant(s): The Board of Trustees of the Leland Stanford Jr. University, Stanford University, Stanford, California 94305, United States of America
Address for Service: DAVIES COLLISON CAVE, Patent Trade Mark Attorneys, Level 10, 10 Barrack Street, SYDNEY NSW 2000
Invention Title: Identification and expression of insect steroid receptor DNA sequences
The following statement is a full description of this invention, including the best method of performing it known to me:-
Identification and Expression of Insect Steroid Receptor DNA Sequences
This invention was made in part with U.S. government support under Grant DCB 8405370 from the National Science Foundation. The U.S. government may have certain rights in this invention.
FIELD OF THE INVENTION
This invention relates generally to the use of recombinant DNA methods as applied to the nucleic acid sequences and polypeptides characteristic of insect steroid receptor superfamily members and, more particularly, to uses of such receptors and the DNA regulatory elements associated with genes whose expression they regulate for the production of proteins in cultured cells, and to uses of such hormone receptor proteins and genes in identifying new hormones that control insect development.
BACKGROUND OF THE INVENTION
The temporal sequence of gene expression determines the nature and sequence of steps in the development of the adult animal from the fertilized egg. The common fruit fly, Drosophila melanogaster, provides a favorable model system for studying this genetic control of development. Various aspects of Drosophila development are representative of general insect and, in many respects, vertebrate development.
The steroid hormone 20-OH ecdysone, also known as β-ecdysone, controls timing of development in many insects. See, generally, Koolman, Ecdysone: From Chemistry to Mode of Action, Thieme Medical Pub., N.Y. (1989), which is hereby incorporated herein by reference.
The generic term "ecdysone" is frequently used as an abbreviation for 20-OH ecdysone. Pulses, or rises and falls, of the ecdysone concentration over a short period of time in insect development are observed at various stages of Drosophila development.
These stages include embryogenesis, three larval stages and two pupal stages. The last pupal stage ends with the formation of the adult fly. An ecdysone pulse at the end of the third, or last, larval stage triggers the beginning of the metamorphosis of the larva to the adult fly. Certain tissues, called imaginal tissues, are induced to begin their formation of adult structures such as eyes, wings, and legs.
During the larval stages of development, giant polytene chromosomes develop in non-imaginal larval tissues. These cable-like chromosomes consist of aggregates comprising up to about 2,000 chromosomal copies. These chromosome aggregates are extremely useful because they provide a means whereby the position of a given gene within a chromosome can be determined to a very high degree of resolution, several orders of magnitude higher than is typically possible for normal chromosomes.
A "puff" in the polytene chromosomes is a localized expansion or swelling of these cable-like polytene chromosome aggregates that is associated with the transcription of a gene at the puff locus. A puff is, therefore, an indicator of the transcription of a gene located at a particular position in the chromosome.
A genetic regulatory model was proposed to explain the temporal sequence of polytene puffs induced by the ecdysone pulse which triggers the larval-to-adult metamorphosis. See, Ashburner et al., "On the Temporal Control of Puffing Activity in Polytene Chromosomes," Cold Spring Harbor Symp. Quant. Biol. 38:655-662 (1974).
This model proposed that ecdysone interacts reversibly with a receptor protein, the ecdysone receptor, to form an ecdysone-receptor complex. This complex would directly induce the transcription of a small set of "early" genes responsible for a half dozen immediately induced "early" puffs. These early genes are postulated to encode regulatory proteins that induce the transcription of a second set of "late" genes responsible for the formation of the "late" puffs. The model thus defines a genetic regulatory hierarchy of three ranks, the ecdysone-receptor gene in the first rank, the early genes in the second rank, and the late genes in the third. While this model was derived from the puffing pattern observed in a non-imaginal tissue, similar genetic regulatory hierarchies may also determine the metamorphic changes in development of imaginal tissues that are also targets of ecdysone, as well as the changes in tissue development induced by the pulses of ecdysone that occur at other developmental stages.
Various structural data have been derived from receptors for vertebrate steroids and other lipophilic receptor proteins. A "superfamily" of such receptors has been defined on the basis of their structural similarities. See, Evans, "The Steroid and Thyroid Hormone Receptor Superfamily," Science 240:889-895 (1988); Green and Chambon, "Nuclear Receptors Enhance Our Understanding of Transcription Regulation," Trends in Genetics 4:309-314 (1988), both of which are hereby incorporated herein by reference. Where their functions have been defined, these receptors, complexed with their respective hormones, regulate the transcription of their primary target genes, as proposed for the ecdysone receptor in the above model.
Cultivated agriculture has greatly increased efficiency of food production in the world. However, various insect pests exploit cultivated sources of food to their own advantage. These insect pests typically develop by a temporal sequence of events characteristic of their order. Many, including Drosophila, initially develop in a caterpillar or maggot-like larval form.
Thereafter, they undergo a metamorphosis from which emerges an adult having characteristic anatomical features. Anatomic similarity is a reflection of developmental, physiological and biochemical similarities shared by these insects. In particular, the principles governing the role of insect ecdysteroid-hormone receptors in Drosophila development, as described above, likely are shared by many different types of insects.
As one weapon against the destruction of cultivated crops by insects, organic molecules with pesticidal properties are used commonly in attempts to eliminate insect populations. However, the ecological side effects of these pesticides, due in part to their broad activity and lack of specificity, and also in part to the fact that some of these pesticides are not easily biodegradable, significantly affect populations of both insects and other species of animals. Some of these populations may be advantageous from an ecological or other perspective. Furthermore, as the insect populations evolve to minimize the effects of the applied pesticides, greater amounts of pesticides must be applied, causing significant direct and indirect effects on other animals, including humans. Thus, an important need exists for both highly specific and highly active pesticides which are biodegradable. Novel insect hormones which, like the ecdysteroids, act by complexing with insect members of the steroid receptor superfamily to control insect development, are likely candidates for pesticides with these desirable properties.
The use of insect hormones may also have other important applications. Many medically and commercially important proteins can be produced in a usable form by genetically engineered bacteria. However, many expressed proteins are processed incorrectly in bacteria and are preferably produced by genetically engineered eucaryotic cells. Typically, yeast cells or mammalian tissue-culture cells are used. Because it has been observed that protein processing of foreign proteins in yeast cells is also frequently inappropriate, mammalian cultured cells have become the central focus for the production of many proteins. It is commonly known that the production of large amounts of foreign proteins makes these cells unhealthy, which may affect adversely the yield of the desired protein. This problem may be circumvented, in part, by using an inducible expression system. In such a system, the cells are engineered so that they do not express the foreign protein until an inducing agent is added to the growth medium. In this way, large quantities of healthy cells can be produced and then induced to produce large amounts of the foreign protein. Unfortunately, in the presently available systems, the inducing agents themselves, such as metal ions or high temperature, adversely affect the cells, thus again lowering the yield of the desired foreign protein the cells produce. A need therefore exists for the development of benign inducing factors for efficient production of recombinant proteins. Such factors could also prove invaluable for the therapy of human patients suffering from inability to produce particular proteins, treatment with these factors controlling both the timing and the abundance of the protein produced in the affected individual.
The hormones that complex with mammalian or other vertebrate members of the steroid receptor superfamily are unlikely candidates as such benign factors because they would alter the expression of many target genes in cells bearing these receptors, thereby adversely affecting the host cells.
For these and other reasons, obtaining steroid receptors or nucleic acid information about them has been a goal of researchers for several years. Unfortunately, efforts have been unsuccessful despite a significant investment of resources. The absence of information on the structure and molecular biology of steroid receptors has significantly hindered the ability to produce such products.
Thus, there exists a need for detailed sequence information on insect members of the steroid receptor superfamily, and the genes that encode these receptors and for resulting reagents. Reagents are provided which are useful in finding new molecules which may act as agonists or antagonists of natural insect members of the steroid receptor superfamily, or as components of systems for highly specific regulation of recombinant proteins in mammalian cells.
SUMMARY OF THE INVENTION
In accordance with the present invention, isolated recombinant nucleic acids are provided which encode an insect steroid receptor or fragment of said receptor that binds ecdysone, said nucleic acid comprising a segment that hybridizes to the complement of nucleotides 2359 to 3021 of Drosophila EcR cDNA sequence of Table 2 under hybridization conditions comprising less than 500 mM salt and at least 37°C and washing in 2X SSPE at 63°C. Preferably, the insect steroid receptor is EcR.
In another embodiment, isolated recombinant nucleic acids are included that have a sequence exhibiting identity over 20 contiguous nucleotides with nucleotides 2359 to 3021 of Drosophila EcR cDNA set forth in Table 2, wherein said nucleic acid encodes a polypeptide that binds ecdysone.
Preferably, the nucleic acid is an isolated recombinant nucleic acid as hereinbefore described, wherein said nucleic acid encodes a polypeptide having a DNA binding domain which binds to an ecdysone-responsive DNA control element.
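The criterion of identity over 20 contiguous nucleotides can be illustrated with a short sketch. This is only one possible reading of the criterion, not a procedure described in the specification: it compares a single strand only (no reverse complement), requires an exact match across the whole window, and uses made-up toy sequences rather than the EcR cDNA of Table 2. The function name and the toy data are hypothetical.

```python
def shares_contiguous_identity(candidate: str, reference: str, window: int = 20) -> bool:
    """Return True if both sequences contain the same run of `window` nucleotides."""
    candidate = candidate.upper()
    reference = reference.upper()
    if len(candidate) < window or len(reference) < window:
        return False
    # Index every window-length substring of the reference for fast lookup.
    reference_windows = {
        reference[i:i + window] for i in range(len(reference) - window + 1)
    }
    # Slide the same window along the candidate and look for an exact hit.
    return any(
        candidate[i:i + window] in reference_windows
        for i in range(len(candidate) - window + 1)
    )


# Toy sequences invented for illustration; they are NOT taken from Table 2.
TOY_REFERENCE = "ATGGCGTACCTGAAGTTCGATCCGTAGGCT"
TOY_HIT = "TTTT" + TOY_REFERENCE[5:25] + "AAAA"  # carries a 20-nt identical run
TOY_MISS = TOY_REFERENCE[5:24] + "NNNN"          # only 19 nt in common

if __name__ == "__main__":
    print(shares_contiguous_identity(TOY_HIT, TOY_REFERENCE))
    print(shares_contiguous_identity(TOY_MISS, TOY_REFERENCE))
```

In practice the reference would be the stated nucleotide range of the EcR cDNA, and a real comparison would likely also check the complementary strand.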
Additional embodiments of the present invention include a polypeptide produced by expression of an isolated, recombinant nucleic acid having at least 60% sequence identity with a polynucleotide encoding amino acids 431 to 651 of Drosophila EcR of Table 2, comprising an insect steroid receptor or fragment thereof, wherein said polypeptide binds ecdysone and comprises a hormone-binding domain of between 200 and 250 amino acids containing at least one of an E1, E2 or E3 subregion, wherein: (1) the E1 subregion has an amino acid sequence AKX(L/I)PGFXXLT(L/I)(D/E)DQITLL, wherein X is any amino acid, or has an amino acid sequence having at least 10 matches at assigned amino acid positions; (2) the E2 subregion has an amino acid sequence E(F/Y) KA L (L/I) D GL, wherein is the optional absence of an amino acid, or has an amino acid sequence having at least 9 matches at assigned amino acid positions; and (3) the E3 subregion has an amino acid sequence LXKLLXXLPDLR, wherein X is any amino acid, or has an amino acid sequence having at least 5 matches at assigned positions. Preferably, the insect steroid receptor or fragment thereof also comprises a DNA binding domain and the polypeptide is capable of binding to an ecdysone analog or an ecdysone agonist. The polypeptide can comprise a zinc-finger domain and usually is capable of binding to a DNA controlling element responsive to ecdysone. As desired, the polypeptide will be fused to a second polypeptide, typically a heterologous polypeptide which comprises a second steroid receptor. Cells, often mammalian cells, comprising the protein are provided.
The polypeptides of the present invention have a variety of utilities. For example, a method for selecting DNA sequences capable of being specifically bound by an insect steroid receptor superfamily member can comprise the steps of screening DNA sequences for binding to such polypeptides and selecting DNA sequences exhibiting such binding. Alternatively, methods for selecting ligands, ecdysteroid analogues, specific for binding to a hormone binding domain of the above polypeptide can comprise the steps of screening compounds for binding to the polypeptide and selecting compounds exhibiting specific binding to the polypeptide.
Additionally provided are methods for selecting ligands specific for binding to a ligand binding domain of an insect steroid receptor superfamily member comprising combining: (i) a fusion polypeptide which comprises a ligand binding domain functionally linked to a DNA binding domain of a steroid receptor; and (ii) a nucleic acid sequence encoding a protein, wherein production of said protein by expression of the nucleic acid sequence is responsive to binding by said DNA binding domain; and screening compounds for an activity of inducing expression of said second nucleic acid sequence; and selecting those compounds having said activity.
This will often be performed in a cell, with cells transformed with DNA encoding a fusion protein.
Also provided are methods for producing a polypeptide comprising the steps of: transfecting the cell with a first and second expression vector wherein: the first expression vector comprises the isolated, recombinant nucleic acid as hereinbefore described; and the second expression vector comprises a DNA control element operably linked to a coding sequence for the polypeptide, the DNA control element being responsive to binding by an ecdysone receptor; wherein an ecdysone-binding receptor is produced in the cell by expression of the isolated, recombinant nucleic acid of the first expression vector; and exposing the host cell to ecdysone or an ecdysone analog that binds the ecdysone-binding receptor, whereupon the transfected cell transcribes the nucleotide sequence encoding the polypeptide and produces the polypeptide.
Usually the cell will be a mammalian cell, and will sometimes be introduced into a whole organism, a plant or animal.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1. pMTEcR, a Cu2+-inducible EcR expression plasmid.
The PMT, EcR ORF and Act 5c poly A elements are defined in Experimental Example III, part A. The HYGr ORF confers hygromycin resistance and is under control of the promoter in the LTR of the Drosophila transposable element, copia. The SV40 intron/poly A element provides an intron for a possible splicing requirement, as well as a polyadenylation/cleavage sequence for the HYGr ORF mRNA.
The pAT153 DNA derives from a bacterial plasmid.
Figure 2. The ecdysone-inducible pEcRE/Adh/Bgal reporter plasmid. See the text of Experimental Example III, part B, for the construction of this plasmid and the definitions of all symbols (except the SV40 splice and poly A) which are defined in the figure legend.
Figure 3. The constitutive EcR expression plasmid, pActEcR. The construction of this plasmid and the definition of the symbols are given in Experimental Example III, part B.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
The present invention provides novel isolated nucleic acid sequences encoding polypeptide products exhibiting the structure and/or activities of insect members of the steroid receptor superfamily. Having elucidated the structures of these insect steroid receptors from their genes, the separate ligand-binding domains and DNA-binding domains are used individually or in combination to screen for new ligands or DNA sequences which bind to these domains. Thus, for example, by binding to promoter sequences incorporating a DNA binding site, these receptors will usually control expression of reporter genes for which sensitive assays exist. Or, the hormone-binding domains serve as reagents for screening for agonists or antagonists of steroid receptor superfamily members. Either new classes of molecules, or selected modifications of known ligands will be screened for receptor binding. New ligands obtained in this way find use as highly specific and highly active naturally occurring pesticides. Alternatively, structural information about interactions between the ligand and binding domains directs methods for mutagenizing or substituting particular residues in the binding domains, thereby providing for altered binding specificity. Thus, inter alia, the present invention provides for screening for new ligand molecules, for the design of new ligand-binding domain interactions, for producing novel chimeric steroid receptor superfamily members and for generating new combinations of ligands and binding domains.
The present invention also provides for the isolation or identification of new steroid hormone-responsive elements and associated genes. By appropriate operable linkage of selected sequences to DNA controlling elements which are responsive to binding by the DNA-binding domains of steroid receptor superfamily members, new regulatory combinations result. The present invention further provides for the design of either a binding domain in a member of the insect steroid receptor superfamily that will recognize given DNA sequences, or conversely for the modification of DNA sequences which will bind to particular receptor DNA-binding domains.
Both the DNA-binding domain of a superfamily-member polypeptide and its DNA recognition sequence can be coordinately modified to produce wholly new receptor-DNA interactions.
In an alternative embodiment, a DNA-binding sequence recognized by a selected receptor will be operably linked to a desired genetic sequence for inducible expression.
Thus, upon administration of a ligand specific for that selected receptor, the genetic sequence is appropriately regulated. Expression systems are constructed that are responsive to administration of insect steroid receptor superfamily-specific ligands. By identifying and isolating new members of the insect steroid receptor superfamily, new, useful regulatory reagents become available, both hormones and controlling elements.
In another embodiment, highly regulatable expression of a gene is achieved by use of regulatory elements responsive to ligands specific to the superfamily members. If transformed cells are grown under conditions where expression is repressed or not induced, the cells will grow to higher densities and enjoy less stressful conditions. Upon reaching high density, the regulatory ligand molecule is added to cause high expression.
Selected cells otherwise insensitive to the inducing ligand will not be affected by exposure to the ligand used to regulate expression. This provides a means both for highly efficient regulatable expression of genes, and for introduction of these genes into intact organisms.
In accordance with specific embodiments of the present invention, nucleic acid sequences encoding portions of insect steroid hormone receptor superfamily members have been elucidated. DNAs encoding four different members of the Drosophila steroid receptor superfamily have been characterized: (1) the 20-OH ecdysone receptor, also called the ecdysone receptor (EcR), for which a full-length encoding sequence has been determined; (2) Drosophila hormone receptor 3 (DHR3), a protein with sequence homology to various steroid receptor superfamily members; (3 and 4) E75A and E75B, closely related proteins, encoded by segments of the same gene, and each possessing sequence homology to other steroid receptor superfamily members.
The DNA sequences encoding each of these members of the insect steroid receptor superfamily provide probes for screening for homologous nucleic acid sequences, both in Drosophila and other sources. This screening allows isolation of homologous genes from both vertebrates and invertebrates. Production of large amounts of the encoded proteins is effected by inserting those sequences into expression systems.
The EcR, DHR3, E75A, and E75B genes are each linked to similar DNA sequences which likely function as controlling, or regulatory elements which are responsive to insect steroids. The present invention provides for the isolation of these hormone-responsive control elements, and for their use in regulating gene expression. One embodiment of a DNA construct comprises: multiple copies of an insect steroid receptor superfamily controlling element linked to a minimal gene promoter, preferably not a heat shock gene promoter, which provides highly inducible expression of an operably linked gene. This construct provides a very sensitive assay for the presence of the controlling molecule of the receptor.
Another aspect of the present invention involves cells comprising: isolated recombinant gene segments encoding biologically active fragments of insect steroid receptor superfamily proteins; DNA sequences which bind insect steroid receptors, the elements involved in hormone-responsive control; or modified receptor proteins. Transformed cells are understood to include their progeny. In particular, the present invention provides for a system whereby expression of polypeptides is responsive to steroid induction. For instance, a system which expresses a desired protein in response to exposure to ecdysone analogues is constructed by operably linking a promoter having an ecdysone-responsive enhancer to a peptide encoding segment.
The present invention also provides insect steroid receptor proteins substantially free from naturally-associated insect cell components. Such receptors will typically be either full-length proteins, functional fragments, or fusion proteins comprising segments from an insect steroid receptor protein fused to a heterologous, or normally non-contiguous, protein domain.
The present invention further provides a number of methods for utilizing the subject receptor proteins. One aspect of the present invention is a method for selecting new hormone analogues. The isolated hormone-binding domains specifically bind hormone ligands, thereby providing a means to screen for new molecules possessing the property of binding with high affinity to the ligand-binding region. Thus, a binding domain of an insect steroid receptor superfamily member will be used as a reagent to develop a binding assay. On one level, the binding domains are useful as affinity reagents for a batch or a column selective process, to selectively retain ligands which bind. Alternatively, a functional assay is preferred for its greater sensitivity to ligand-binding, whether a direct binding assay or an indirect assay in which binding is linked to an easily assayed function. For example, by operable linkage of an easily assayable reporter gene to a controlling element responsive to binding by an insect steroid receptor superfamily member, in which ligand-binding induces protein synthesis, an extremely sensitive assay for the presence of a ligand or of a receptor results. Such a construct useful for assaying the presence of ecdysone is described below. This construct is useful for screening for agonists or antagonists of the ecdysone ligand.
In particular, this method allows selecting for ligands which bind to an "orphan" receptor, a receptor whose ligand is unknown. Binding domains for "unknown" ligands will often originate from either newly identified insect steroid receptor superfamily members, or from mutagenesis. A hybrid receptor will be created with a ligand-binding domain and DNA-binding domain from different sources. For example, a hybrid receptor between a putative binding domain and a known DNA-binding domain would allow screening for ligands. An "orphan receptor" binding domain will be functionally linked to a known DNA-binding domain which will control a known reporter gene construct whose expression will be easily detected. This system for ligand-receptor binding provides an extremely sensitive assay for ligand-receptor interactions.
Alternatively, the recognition of important features of tertiary structure and spatial interactions between a ligand-binding domain from an insect steroid receptor superfamily member and its ligand will allow selection of new combinations of ligand-binding domains with ligands.
Either method provides for selecting unusual ligands which specifically bind a modified polypeptide-binding domain of a receptor. This approach allows selection of novel steroid hormone analogues which exhibit modified specificity for binding to a subgroup of steroid receptors.
The present invention also provides for new and useful combinations of the various related components: the recombinant nucleic acid sequences encoding the polypeptides, the polypeptide sequences, and the DNA sites to which the receptors bind (i.e., the regulatory, or control, elements). For instance, fusing portions of nucleic acid sequences encoding peptides from different sources will provide polypeptides exhibiting hybrid properties, e.g., unusual control and expression characteristics. In particular, hybrid receptors comprising segments from other members of the superfamily, or from other sources, will be made.
Combining an insect steroid receptor-responsive enhancer segment with a different polypeptide encoding segment will produce a steroid-responsive expression system for that polypeptide.
The isolation of insect steroid receptors provides for isolation or screening of new ligands for receptor binding. Some of these will interfere with, or disrupt, normal insect development. These reagents will allow the user to accelerate or decelerate insect development, for instance, in preparing sterile adults for release.
Alternatively, a delay or change in the timing of development will often be lethal or will dramatically modify the ability of an insect to affect an agricultural crop. Thus, naturally occurring biodegradable and highly active molecules able to disrupt the timing of insect development will result.
Furthermore, these polypeptides provide the means by which antibodies possessing specificity for binding to particular steroid receptor classes have been raised.
Thus, reagents will be produced for determining, qualitatively or quantitatively, the presence of these or homologous polypeptides. Alternatively, these antibodies will be used to separate or purify receptor polypeptides.
Transcription sequences of insect steroid receptor superfamily members
The ecdysone receptor gene is a member of the steroid and thyroid hormone receptor gene superfamily, a group of ligand-responsive transcription factors. See, Evans (1988) Science 240:889-895; and Segraves, Molecular and Genetic Analysis of the E75 Ecdysone-Responsive Gene of Drosophila melanogaster (Ph.D. thesis, Stanford University 1988), both of which are hereby incorporated herein by reference for all purposes. These receptors show extensive sequence similarity, especially in their "zinc finger" DNA-binding domains, and also in a ligand (or hormone or steroid) binding domain. Modulation of gene expression apparently occurs in response to binding of a receptor to specific control, or regulatory, DNA elements. The cloning of receptor cDNAs provides the first opportunity to study the molecular bases of steroid action. The steroid receptor superfamily is a class of receptors which exhibit similar structural and functional features. While the term insect is used herein, it will be recognized that the same methods and molecules will be derived from other species of animals, in particular, those of the class Insecta, or, more broadly, members of the phylum Arthropoda which use ecdysteroids as hormones.
Members of the insect steroid receptor superfamily are characterized by functional ligand-binding and DNA-binding domains, both of which interact to effect a change in the regulatory state of a gene operably linked to the DNA-binding site of the receptor. Thus, the receptors of the insect steroid receptor superfamily seem to be ligand-responsive transcription factors. The receptors of the present invention exhibit at least a hormone-binding domain characterized by sequence homology to particular regions, designated E1, E2 and E3.
The members of the insect steroid receptor superfamily are typically characterized by structural homology of particular domains, as defined initially in the estrogen receptor. Specifically, a DNA-binding domain, C, and a ligand-binding domain, E, are separated and flanked by additional domains as identified by Krust et al. (1986) EMBO J. 5:891-897, which is hereby incorporated herein by reference.
The C domain, or zinc-finger DNA-binding domain, is usually hydrophilic, having high cysteine, lysine and arginine content, a sequence suitable for the required tight DNA binding. The E domain is usually hydrophobic and further characterized as regions E1, E2 and E3. The ligand-binding domains of the present invention are typically characterized by having significant homology in sequence and structure to these three regions. Amino proximal to the C domain is a region initially defined as separate A and B domains. Region D separates the more conserved domains C and E. Region D typically has a hydrophilic region whose predicted secondary structure is rich in turns and coils. The F region is carboxy proximal to the E region (see, Krust et al., supra).
The ligand-binding domain of the members of the insect steroid receptor superfamily is typically carboxyl-proximal, relative to a DNA-binding domain described below. See, Evans (1988) Science 240:889-895.
The entire hormone-binding domain is typically between about 200 and 250 amino acids but is potentially shorter.
This domain has the subregions of high homology, designated the E1, E2 and E3 regions. See, e.g., Table 4.
The E1 region is 19 amino acids long with a consensus sequence AKX(L/I)PGFXXLT(L/I)(D/E)DQITLL, where X represents any amino acid and the other letters are the standard single-letter code. Positions in parentheses are alternatives. Typically, members of the insect steroid receptor superfamily will have at least about five matches out of the sixteen assigned positions, preferably at least about nine matches, and in more preferred embodiments, at least about ten matches.
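By way of illustration only, the match count described above can be sketched as follows. This is a minimal sketch and not part of the disclosed method: it assumes the candidate segment has already been aligned to the consensus so that it has exactly one residue per consensus position, and the function names `parse_consensus` and `count_matches` are hypothetical.

```python
import re

# E1 consensus as given in the text; X = any amino acid, (L/I) = alternatives.
E1_CONSENSUS = "AKX(L/I)PGFXXLT(L/I)(D/E)DQITLL"


def parse_consensus(consensus):
    """Split a consensus string into one entry per position.

    An assigned position becomes a set of allowed residues;
    an unassigned position ('X') becomes None.
    """
    positions = []
    for alternatives, single in re.findall(r"\(([^)]*)\)|([A-Z])", consensus):
        if alternatives:
            positions.append(set(alternatives.split("/")))
        elif single == "X":
            positions.append(None)
        else:
            positions.append({single})
    return positions


def count_matches(candidate, consensus):
    """Return (matches, assigned_positions) for a pre-aligned candidate segment."""
    positions = parse_consensus(consensus)
    if len(candidate) != len(positions):
        raise ValueError("candidate must have one residue per consensus position")
    assigned = sum(1 for allowed in positions if allowed is not None)
    matches = sum(
        1
        for residue, allowed in zip(candidate.upper(), positions)
        if allowed is not None and residue in allowed
    )
    return matches, assigned


if __name__ == "__main__":
    matches, assigned = count_matches("AKALPGFAALTLDDQITLL", E1_CONSENSUS)
    print(matches, "of", assigned, "assigned positions match")
```

A segment would then be compared against the thresholds stated in the text (about five, nine, or ten matches out of sixteen). The same two functions accept any consensus string written in this notation.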
Alternatively, these insect steroid receptor superfamily members will have homologous sequences exhibiting at least about 25% homology, normally at least about 30% homology, more normally at least about 35% homology, generally at least about 40% homology, more generally at least about 45% homology, typically at least about 50% homology, more typically at least about 55% homology, usually at least about 60% homology, more usually at least about 70% homology, preferably at least about 80% homology, and more preferably at least about 90% homology at positions assigned preferred amino acids.
The E2 region is a 19 amino-acid segment with a consensus sequence: E(F/Y) (L/M)KA(I/L)(V/L)L(L/I) (P/D)GL where represents an optional absence of an amino acid.
Typically, an insect steroid receptor superfamily member will exhibit at least about six matches, preferably at least about eight matches and more preferably at least about nine matches. Alternatively, E2 sequences of insect steroid receptor superfamily members exhibit at least about 25% homology, normally at least about 30% homology, more normally at least about 35% homology, generally at least about 40% homology, more generally at least about 45% homology, typically at least about 50% homology, more typically at least about 55% homology, usually at least about 60% homology, more usually at least about 70% homology, preferably at least about 80% homology, and more preferably at least about 90% homology at positions assigned preferred amino acids.
The E3 region is a 12 amino-acid segment with a consensus sequence LXKLLXXLPDLR. The insect steroid receptor superfamily members will typically show at least about four matches out of the nine assigned preferences in the E3 region, preferably at least about five matches and more preferably at least about six matches. Alternatively, over the assigned positions, members of the insect steroid receptor superfamily will exhibit at least about 25% homology, normally at least about 30% homology, more normally at least about 35% homology, generally at least about 40% homology, more generally at least about 45% homology, typically at least about 50% homology, more typically at least about 55% homology, usually at least about 60% homology, more usually at least about 70% homology, preferably at least about 80% homology, and more preferably at least about 90% homology at positions assigned preferred amino acids.
In preferred embodiments, the insect steroid receptor superfamily members will exhibit matching of at least about five positions in an E1 region, at least about six positions in an E2 region and at least about four positions in an E3 region. The E1, E2, and E3 regions are defined in Table 4.
The DNA-binding domain of these insect steroid receptor superfamily members is characterized by a "zinc fingers" motif. See, Evans (1988) Science 240:889-895.
The domain is typically amino proximal to the ligand, or hormone, binding site. Typically, the DNA-binding domain of the insect steroid receptor superfamily members is characterized by clustering of basic residues, a cysteine-rich composition and sequence homology. See, Evans, R. M. (1988) Science 240:889-895; and Experimental section below. Significant polypeptide sequence homology among superfamily members exists. The insect steroid receptor superfamily members will exhibit at least about homology in the 67 ± 1 amino acid region of this domain, normally at least about 40% homology, usually at least about 45% homology, and preferably at least about homology.
Steroids are derivatives of the saturated tetracyclic hydrocarbon perhydrocyclopentanophenanthrene.
Among the molecules in the group "steroids" are: the bile acids, cholic acid, and deoxycholic acid; the adrenocortical steroids, corticosterone and aldosterone; the estrogens, estrone and β-estradiol; the androgens, testosterone and progesterone; and the ecdysteroids. The terms steroid or steroid hormones are used interchangeably herein and are intended to include all steroid analogues. Typically, steroid analogues are molecules which have minor modifications of various peripheral chemical groups.
See, Koolman (1989), cited above, for details on ecdysteroids.
Although ligands for the insect steroid receptor superfamily members have historically been characterized as steroids, the term "steroid" as in "insect steroid receptor superfamily" is not meant only literally. The use of "steroid" has resulted from a historical designation of members of a group recognized initially to include only molecules having specific defined molecular structures. However, this limitation is no longer applicable since functions are no longer only correlated with precise structures. Thus, there will be members of the insect steroid receptor superfamily, as defined herein, whose ligand-binding specificities are not directed to classically defined "steroids." Typically, the ligands for members of the insect steroid receptor superfamily are lipophilic molecules which are structural analogues of steroid molecules.
The term ligand is meant to refer to the molecules that bind the domain described here as the "hormone-binding domain." Also, a ligand for an insect steroid receptor superfamily member is a ligand which serves either as the natural ligand to which the member binds, or a functional analogue which serves as an agonist or antagonist. The classical definition of "hormone" has been defined functionally by physiologists, see, e.g., Guyton, Textbook of Medical Physiology, Saunders, Philadelphia. The functional term "hormone" is employed because of historic usage, but is meant to apply to other chemical messengers used to communicate between cell types. Recently, the distinction between hormones and neurotransmitters has been eroded as various peptide neurotransmitters have been shown to exhibit properties of classically defined hormones. These molecules are typically used in intercellular signal transduction, but are not limited to those molecules having slow or systemic effects, or which act at remote sites.
Substantial homology in the nucleic acid context means either that the segments, or their complementary strands, when compared, are identical when optimally aligned, with appropriate nucleotide insertions or deletions, in at least about 40% of the residues, generally at least about 45%, more generally at least about 50%, normally at least about 55%, more normally at least about 60%, typically at least about 65%, more typically at least about 70%, usually at least about 75%, more usually at least about 80%, preferably at least about 85%, and more preferably at least about 95% of the nucleotides. Alternatively, substantial homology exists when the segments will hybridize under selective hybridization conditions, to a strand or its complement, typically using a sequence derived from Table 1, 2 or 3.
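The percent-identity criterion above can be illustrated with a short sketch. It assumes the two segments have already been optimally aligned by some other means and are supplied as equal-length strings with "-" marking insertions or deletions. Counting every alignment column (including gapped columns) in the denominator is an assumption of this sketch, since the text does not specify how gaps are scored; the function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both strands carry the same nucleotide.

    Columns where both sequences have a gap are ignored; a gap opposite
    a nucleotide counts as a compared, non-identical column.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared


def is_substantially_homologous(aligned_a, aligned_b, threshold=40.0):
    """Apply the lowest threshold named in the text (about 40%) by default."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    print(round(percent_identity("ACGT-ACGT", "ACGTTACGA"), 1))
```

Higher thresholds from the series in the text (for example 85 or 95) can be passed as `threshold` for the preferred embodiments.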
Selectivity of hybridization exists when hybridization occurs which is more selective than total lack of specificity. Normally, selective hybridization will occur when there is at least about 55% homology over a stretch of at least about 14 to 25 nucleotides, generally at least about 65%, typically at least about 75%, usually at least about 85%, preferably at least about 90%, and more preferably at least about 95% or more. See, Kanehisa, M. (1984), Nucleic Acids Res. 12:203-213, which is incorporated herein by reference. Stringent hybridization conditions will include salt concentrations of less than about 2.5 M, generally less than about M, typically less than about 1 M, usually less than about 500 mM, and preferably less than about 200 mM.
Temperature conditions will normally be greater than 20°C, more normally greater than about 25°C, generally greater than about 30°C, more generally greater than about 35°C, typically greater than about 40°C, more typically greater than about 45°C, usually greater than about 50°C, more usually greater than about 55°C, and in particular embodiments will be greater than 60°C, even as high as 80°C or more. As other factors may significantly affect the stringency of hybridization, including, among others, base composition and size of the complementary strands, presence of organic solvents and extent of base mismatching, the combination of parameters is more important than the absolute measure of any one.
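To illustrate how these parameters combine, the following sketch uses a widely cited empirical approximation for the melting temperature of long DNA-DNA hybrids. The formula and its coefficients are not taken from this specification; they are an assumption of the sketch. The approximation is usually quoted only for moderate sodium concentrations, and the reduction of about 1°C per 1% mismatch is a rule of thumb. The function name is hypothetical.

```python
import math


def estimate_tm_celsius(sodium_molar, gc_percent, length_nt,
                        formamide_percent=0.0, mismatch_percent=0.0):
    """Rough melting temperature (deg C) of a DNA-DNA hybrid.

    Assumed empirical form (not from the specification):
        Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.61*(%formamide) - 500/length
    reduced by roughly 1 deg C for each 1% of mismatched bases.
    """
    if sodium_molar <= 0 or length_nt <= 0:
        raise ValueError("salt concentration and length must be positive")
    tm = (81.5
          + 16.6 * math.log10(sodium_molar)   # salt: lower salt lowers Tm
          + 0.41 * gc_percent                 # base composition
          - 0.61 * formamide_percent          # organic solvent
          - 500.0 / length_nt)                # size of the strands
    return tm - 1.0 * mismatch_percent        # extent of base mismatching


if __name__ == "__main__":
    for salt in (1.0, 0.5, 0.2):
        print(salt, "M Na+:", round(estimate_tm_celsius(salt, 50.0, 500), 1), "deg C")
```

Under this approximation, lowering the salt concentration or adding formamide lowers the estimated melting temperature, so a given wash temperature becomes more stringent. That is consistent with the statement that the combination of parameters matters more than any single value.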
A gene for an insect steroid receptor superfamily member includes its upstream (e.g., promoter) and downstream operably linked controlling elements, as well as the complementary strands. See, generally, Watson et al. (1987) The Molecular Biology of the Gene, Benjamin, Menlo Park, which is hereby incorporated herein by reference. A gene generally also comprises the segment encoding the transcription unit, including both introns and exons. Thus, an isolated gene allows for screening for new steroid receptor genes by probing for genetic sequences which hybridize to either controlling or transcribed segments of a receptor gene of the present invention. Three segments of particular interest are the controlling elements, both upstream and downstream, and segments encoding the DNA-binding segments and the hormone-binding segments. Methods applicable to such screening are analogous to those generally used in hybridization or affinity labeling.
Nucleic acid probes will often be labeled using radioactive or non-radioactive labels, many of which are listed in the section on polypeptide labeling. Standard procedures for nucleic acid labeling are described, e.g., in Sambrook et al. (1989); and Ausubel et al. (1987 and supplements).
Insect steroid receptor superfamily member polypeptides
A polypeptide sequence of the ecdysone receptor is represented in Table 2. Other insect steroid receptor superfamily member polypeptide sequences are set forth in Tables 1 and 3. Preferred nucleic acid sequences of the cDNAs encoding these insect steroid receptor superfamily member polypeptides are also provided in the corresponding tables. Other nucleic acids will be used to encode the proteins, making use of the degeneracy or non-universality of the genetic code.
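The degeneracy mentioned above can be made concrete with a short sketch that counts how many distinct coding sequences specify a given peptide. It assumes the standard genetic code only; organisms or organelles with variant codes (the non-universality referred to in the text) would need a different table. The names `CODON_TABLE`, `codons_for` and `count_encodings` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; '*' marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid):
    """All codons of the standard code that specify one amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def count_encodings(peptide):
    """Number of distinct DNA coding sequences for a peptide (stop codon excluded)."""
    total = 1
    for amino_acid in peptide.upper():
        n = len(codons_for(amino_acid))
        if n == 0:
            raise ValueError(f"unknown amino acid: {amino_acid}")
        total *= n
    return total


if __name__ == "__main__":
    print(codons_for("L"))
    print(count_encodings("AKL"))  # 4 codons for A x 2 for K x 6 for L
```

Even a three-residue peptide such as AKL has 48 possible coding sequences under the standard code, which is why many nucleic acids other than those in the tables can encode the same polypeptide.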
As used herein, the term "substantially pure" describes a protein or other material, e.g., nucleic acid, which has been separated from its native contaminants. Typically, a monomeric protein is substantially pure when at least about 60 to 75% of a sample exhibits a single polypeptide backbone. Minor variants or chemical modifications typically share the same polypeptide sequence. Usually a substantially pure protein will comprise over about 85 to 90% of a protein sample, and preferably will be over about 99% pure.
Normally, purity is measured on a polyacrylamide gel, with homogeneity determined by staining. Alternatively, for certain purposes high resolution will be necessary and HPLC or a similar means for purification will be used. For most purposes, a simple chromatography column or polyacrylamide gel will be used to determine purity.
The term "substantially free of naturally-associated insect cell components" describes a protein or other material, e.g., nucleic acid, which is separated from the native contaminants which accompany it in its natural insect cell state. Thus, a protein which is chemically synthesized or synthesized in a cellular system different from the insect cell from which it naturally originates will be free from its naturally-associated insect cell components. The term is used to describe insect steroid receptor superfamily members and nucleic acids which have been synthesized in mammalian cells or plant cells, E. coli and other procaryotes.
The present invention also provides for analogues of the insect steroid receptor superfamily members. Such analogues include both modifications to a polypeptide backbone (e.g., insertions and deletions), genetic variants, and mutants of the polypeptides. Modifications include chemical derivatizations of polypeptides, such as acetylations, carboxylations and the like. They also include glycosylation modifications and processing variants of a typical polypeptide. These processing steps specifically include enzymatic modifications, such as ubiquitinylation. See, Hershko and Ciechanover (1982), "Mechanisms of Intracellular Protein Breakdown," Ann. Rev. Bioch., 51:335-364.
Other analogues include genetic variants, both natural and induced. Induced mutants are derived from various techniques, e.g., random mutagenesis using reagents such as irradiation or exposure to EMS, or engineered changes using site-specific mutagenesis techniques or other techniques of modern molecular biology. See, Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed.), CSH Press; and Ausubel et al. (1987 and supplements) Current Protocols in Molecular Biology, Greene/Wiley, New York, each of which is hereby incorporated herein by reference.
As described above, the DNA-binding zinc fingers segment of a receptor shows high specificity of recognition of specific target DNA sequences. An understanding of the DNA-protein binding interactions provides for the modification in a rational manner of either DNA or protein characteristics, or both, to effect specificity of binding for modulation of enhancer activity. More importantly, isolation of genes for new members of the insect steroid receptor superfamily allows their use to produce the receptor polypeptides and to isolate new controlling elements. By using the DNA-binding domains, as described above, controlling elements which are responsive to the ligands bound by the corresponding superfamily members are identified and isolated. This procedure shall yield a variety of controlling elements responsive to ligands. By the methods described above, the ligands for any particular member of the insect steroid receptor superfamily will be identified.
The controlling elements typically are enhancers, but also include silencers or various other types of ligand-responsive elements. They usually operate over large distances, but will typically be within about kb, usually within about 35 kb, more usually within about 20 kb and preferably within about 7 kb of the genes that these elements regulate.
Polypeptide fragments and fusions
Besides substantially full-length polypeptides, the present invention provides for biologically active fragments of the polypeptides. Significant biological activities include ligand-binding, DNA binding, immunological activity and other biological activities characteristic of steroid receptor superfamily members.
Immunological activities include both immunogenic function in a target immune system, as well as sharing of immunological epitopes for binding, serving as either a competitor or substitute antigen for a steroid receptor epitope.
For example, ligand-binding or DNA-binding domains from different polypeptides will be exchanged to form different or new fusion polypeptides or fragments. Thus, new chimaeric polypeptides exhibiting new combinations of specificities result from the functional linkage of ligand-binding specificities to DNA-binding domains.
This is extremely useful in the design of inducible expression systems.
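The domain exchange described above can be pictured with a small data model. This is a sketch for illustration only: it treats each receptor as nothing more than a DNA-binding specificity paired with a ligand-binding specificity, and the receptor, element and ligand names used in the example are hypothetical placeholders rather than sequences or molecules from this specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Receptor:
    """Simplified receptor: one DNA-binding domain and one ligand-binding domain."""
    name: str
    dna_binding_target: str  # controlling element recognized by the DNA-binding domain
    ligand: str              # ligand recognized by the ligand-binding domain


def make_chimera(dna_domain_donor, ligand_domain_donor):
    """Combine the DNA-binding domain of one receptor with the ligand-binding domain of another."""
    return Receptor(
        name=f"{dna_domain_donor.name}(DBD)-{ligand_domain_donor.name}(LBD)",
        dna_binding_target=dna_domain_donor.dna_binding_target,
        ligand=ligand_domain_donor.ligand,
    )


if __name__ == "__main__":
    # Hypothetical placeholder receptors, for illustration only.
    receptor_a = Receptor("receptorA", "elementA", "ligandA")
    receptor_b = Receptor("receptorB", "elementB", "ligandB")
    chimera = make_chimera(receptor_a, receptor_b)
    print(chimera)
```

In this picture the chimera acts at the controlling element of the first receptor but responds to the ligand of the second, which is the new combination of specificities that makes such fusions useful for inducible expression.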
For immunological purposes, immunogens will sometimes be produced from tandemly repeated polypeptide segments, thereby producing highly antigenic proteins.
Alternatively, such polypeptides will serve as highly efficient competitors for specific binding. Production of antibodies to insect steroid receptor superfamily members is described below.
The present invention also provides for other polypeptides comprising fragments of steroid receptor superfamily members. Fusion polypeptides between the steroid receptor segments and other homologous or heterologous proteins are provided, e.g., polypeptides comprising contiguous peptide sequences from different proteins. Homologous polypeptides will often be fusions between different steroid receptor superfamily members, resulting in, for instance, a hybrid protein exhibiting ligand specificity of one member and DNA-binding specificity of another. Likewise, heterologous fusions, derived from different polypeptides, will be constructed which would exhibit a combination of properties or activities of the parental proteins. Typical examples are fusions of a reporter polypeptide, e.g., luciferase, with another domain of a receptor, e.g., a DNA-binding domain, so that the presence or location of a desired ligand is easily determined. See, Dull et al., U.S. Patent No. 4,859,609, which is hereby incorporated herein by reference. Other typical gene fusion partners include "zinc finger" segment swapping between DNA-binding proteins, bacterial β-galactosidase, trpE, Protein A, β-lactamase, alpha amylase, alcohol dehydrogenase, and yeast alpha mating factor. See, Godowski et al. (1988), Science 241:812-816; and Experimental section below.
Insect steroid receptor superfamily member expression
With the sequence of the receptor polypeptides and the recombinant DNA sequences encoding them, large quantities of members of the insect steroid receptor superfamily will be prepared. By the appropriate expression of vectors in cells, high efficiency protein production will be achieved. Thereafter, standard protein purification methods are available, such as ammonium sulfate precipitation, column chromatography, electrophoresis, centrifugation, crystallization, and others. See, Deutscher (1990) "Guide to Protein Purification" Methods in Enzymology, vol. 182 and others; and Ausubel et al. (1987 and supplements) Current Protocols in Molecular Biology, for techniques typically used for protein purification. Alternatively, in some embodiments high efficiency of production is unnecessary, but the presence of a known inducing protein within a carefully engineered expression system is quite valuable.
For instance, a combination of a ligand-responsive enhancer of this type operably linked to a desired gene sequence, together with the corresponding insect steroid receptor superfamily member, placed in an expression system provides a specifically inducible expression system. The desired gene sequence will encode a protein of interest, and the corresponding steroid receptor member will often be the ecdysone receptor.
Typically, the expression system will be a cell, but in vitro expression systems will also be constructed.
The desired genes will be inserted into any of a wide selection of expression vectors. The selection of an appropriate vector and cell line depends upon the constraints of the desired product. Typical expression vectors are described in Sambrook et al. (1989) and Ausubel et al. (1987 and supplements). Suitable cell lines are available from a depository, such as the ATCC.
See, ATCC Catalogue of Cell Lines and Hybridomas (6th ed.) (1988); ATCC Cell Lines, Viruses, and Antisera, each of which is hereby incorporated herein by reference. The vectors are introduced to the desired cells by standard transformation or transfection procedures as described, for instance, in Sambrook et al. (1989).
Fusion proteins will typically be made by either recombinant nucleic acid methods or by synthetic polypeptide methods. Techniques for nucleic acid manipulation are described generally, for example, in Sambrook et al. (1989), Molecular Cloning: A Laboratory Manual (2d ed.), Vols. 1-3, Cold Spring Harbor Laboratory, which is incorporated herein by reference.
Techniques for synthesis of polypeptides are described, for example, in Merrifield, J. Amer. Chem. Soc. 85:2149-2156 (1963).
The recombinant nucleic acid sequences used to produce fusion proteins of the present invention will typically be derived from natural or synthetic sequences.
Many natural gene sequences are obtainable from various cDNA or from genomic libraries using appropriate probes.
See, GenBank, National Institutes of Health. Typical probes for steroid receptors are selected from the sequences of Tables 1, 2 or 3 in accordance with standard procedures. The phosphoramidite method described by Beaucage and Carruthers, Tetra. Letts. 22:1859-1862 (1981) will produce suitable synthetic DNA fragments. A double stranded fragment is then obtainable either by synthesizing the complementary strand and annealing the strands together under appropriate conditions or by adding the complementary strand using DNA polymerase with an appropriate primer sequence.
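The sequence of the complementary strand to be synthesized for annealing follows directly from the first strand. The sketch below is illustrative only: it assumes unmodified DNA written 5' to 3' in the four-letter alphabet and perfect Watson-Crick pairing over the full length, and the function names are hypothetical.

```python
# Watson-Crick pairing for unmodified DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand):
    """Return the complementary strand, written 5' to 3'."""
    strand = strand.upper()
    unexpected = set(strand) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return strand.translate(_COMPLEMENT)[::-1]


def fully_complementary(strand_a, strand_b):
    """True if the two strands (both 5' to 3') pair perfectly over their whole length."""
    return reverse_complement(strand_a) == strand_b.upper()


if __name__ == "__main__":
    top = "ATGC"
    bottom = reverse_complement(top)
    print(top, bottom, fully_complementary(top, bottom))
```

The same function gives the primer-binding site for the alternative route mentioned in the text, in which DNA polymerase extends a primer annealed to the synthetic strand.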
With the isolated steroid receptor genes, segments of the transcribed segments are available as probes for isolating homologous sequences usually from different sources, e.g., different animals. By selection of the segment used as a probe, particular functionally associated segments will be isolated. Thus, for example, other nucleic acid segments encoding either ligand-binding or DNA-binding domains of new receptors will be isolated. Alternatively, by using steroid-responsive controlling elements as a probe, new steroid-responsive elements will be isolated, along with the associated segment of DNA whose expression is regulated. This method allows for the isolation of ligand-responsive genes, many of which are, themselves, also members of the insect steroid receptor superfamily.
The natural or synthetic DNA fragments coding for a desired steroid receptor fragment will be incorporated into DNA constructs capable of introduction to and expression in an in vitro cell culture. Usually the DNA constructs will be suitable for replication in a unicellular host, such as yeast or bacteria, but alternatively are intended for introduction to, with or without integration into the genome, cultured mammalian or plant or other eucaryotic cell lines. DNA constructs prepared for introduction into bacteria or yeast will typically include a replication system recognized by the host, the intended DNA fragment encoding the desired receptor polypeptide, transcription and translational initiation regulatory sequences operably linked to the polypeptide encoding segment and transcriptional and translational termination regulatory sequences operably linked to the polypeptide encoding segment. The transcriptional regulatory sequences will typically include a heterologous enhancer or promoter which is recognized by the host. The selection of an appropriate promoter will depend upon the host, but promoters such as the trp, lac and phage promoters, tRNA promoters and glycolytic enzyme promoters are known. See, Sambrook et al. (1989). Conveniently available expression vectors which include the replication system and transcriptional and translational regulatory sequences together with the insertion site for the steroid receptor DNA sequence will generally be employed. Examples of workable combinations of cell lines and expression vectors are described in Sambrook et al. (1989); see also, Metzger et al. (1988), Nature 334:31-36.
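The components listed above for a construct intended for bacteria or yeast can be treated as a simple checklist. The sketch below is illustrative only; the dictionary keys and the example values are hypothetical labels chosen for this sketch, not element names taken from the specification.

```python
# Parts the text lists for a bacterial or yeast construct (key names are hypothetical).
REQUIRED_PARTS = (
    "replication_system",
    "receptor_coding_fragment",
    "transcription_initiation",
    "translation_initiation",
    "transcription_termination",
    "translation_termination",
)


def missing_parts(construct):
    """Return the required parts that are absent or empty in a construct description."""
    return [part for part in REQUIRED_PARTS if not construct.get(part)]


if __name__ == "__main__":
    # Hypothetical, partially specified construct used only to exercise the check.
    draft = {
        "replication_system": "host-recognized origin",
        "receptor_coding_fragment": "receptor cDNA fragment",
        "transcription_initiation": "lac promoter",
    }
    print(missing_parts(draft))
```

A check of this kind only confirms that each class of element has been specified; whether the elements are operably linked and recognized by the chosen host still has to be established experimentally.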
Genetic constructs
The DNA segments encoding the members of the insect steroid receptor superfamily will typically be utilized in a plasmid vector. In one embodiment an expression control DNA sequence is operably linked to the insect steroid receptor superfamily member coding sequences for expression of the insect steroid receptor superfamily member alone. In a second embodiment an insect steroid receptor superfamily member provides the capability to express another protein in response to the presence of an insect steroid receptor ligand. This latter embodiment is separately described below. The expression control sequences will commonly include eukaryotic enhancer or promoter systems in vectors capable of transforming or transfecting eucaryotic host cells. Once the vector has been introduced into the appropriate host, the host, depending on the use, will be maintained under conditions suitable for high level expression of the nucleotide sequences.
Steroid-responsive expression of selected genes
For steroid-responsive expression of other genes, the steroid receptor gene will typically be cotransformed with a recombinant construct comprising a desired gene for expression operably linked to the steroid-responsive enhancer or promoter element. In this use, a single expression system will typically comprise a combination of a controlling element responsive to a ligand of an insect steroid receptor superfamily member, a desired gene for expression, operably linked to the controlling element, and an insect steroid receptor superfamily member which can bind to the controlling element.
Usually, this system will be employed within a cell, but an in vitro system is also possible. The insect steroid receptor superfamily member will typically be provided by expression of a nucleic acid encoding it, though it need not be expressed at high levels. Thus, in one preferred embodiment, the system will be achieved through cotransformation of a cell with both the regulatable construct and another segment encoding the insect steroid receptor superfamily member. Usually, the controlling element will be an enhancer element, but some elements work to repress expression. In this embodiment, the ligand for the insect steroid receptor superfamily member will be provided or withheld as appropriate for the desired expression properties.
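The logic of the three-part system above can be summarized in a deliberately simplified on/off model. This is a sketch under stated assumptions: expression is treated as a boolean, the receptor is assumed to act on the controlling element only when its ligand is present, and the class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpressionSystem:
    """Controlling element + desired gene + receptor, reduced to an on/off model."""
    receptor_present: bool
    element_type: str = "enhancer"  # "enhancer" (ligand induces) or "repressor" (ligand represses)

    def gene_expressed(self, ligand_present):
        if self.element_type not in ("enhancer", "repressor"):
            raise ValueError("element_type must be 'enhancer' or 'repressor'")
        # Assumption: the receptor acts on the element only when ligand is bound.
        receptor_active = self.receptor_present and ligand_present
        if self.element_type == "enhancer":
            return receptor_active
        return not receptor_active


if __name__ == "__main__":
    inducible = ExpressionSystem(receptor_present=True, element_type="enhancer")
    repressible = ExpressionSystem(receptor_present=True, element_type="repressor")
    for ligand in (False, True):
        print(ligand, inducible.gene_expressed(ligand), repressible.gene_expressed(ligand))
```

In the model, a cell that lacks the receptor does not respond to the ligand at all, which mirrors the reason for cotransforming the receptor-encoding segment along with the regulatable construct.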
A particularly useful genetic construct comprises an alcohol dehydrogenase promoter operably linked to an easily assayable reporter gene, e.g., β-galactosidase.
In a preferred embodiment of this construct, a multiplicity of copies of the insect steroid receptor superfamily member is used. For example, operable linkage of controlling elements responsive to insect steroid receptor superfamily members, e.g., EcR, DHR3, and E75B, to the alcohol dehydrogenase (ADH) promoter, or others as described above, and protein coding sequences for a particular reporter protein, as described above, leads to steroid-responsive expression of β-galactosidase. Such a system provides highly sensitive detection of expression in response to ligand binding, allowing for detection of a productive ligand-receptor interaction.
DNA sequences will normally be expressed in hosts after the sequences have been operably linked to (i.e., positioned to ensure the functioning of) an expression control sequence. These expression vectors are typically replicable in the host organisms either as episomes or as an integral part of the host chromosomal DNA. Commonly, expression vectors will contain selection markers, e.g., tetracycline or neomycin, to permit detection of those cells transformed with the desired DNA sequences (see, U.S. Patent 4,704,362, which is incorporated herein by reference).
E. coli is one procaryotic host useful for cloning the DNA sequences of the present invention. Other microbial hosts suitable for use include bacilli, such as Bacillus subtilis, and other enterobacteriaceae, such as Salmonella, Serratia, and various Pseudomonas species.
Other eucaryotic cells will often be used, including yeast cells, insect tissue culture cells, avian cells, or the like. Preferably, mammalian tissue cell culture will be used to produce the inducible polypeptides of the present invention (see, Winnacker, From Genes to Clones, VCH Publishers, N.Y. (1987), which is incorporated herein by reference). Mammalian cells are preferred cells in which to use the insect steroid receptor superfamily member ligand-responsive gene constructs, because they naturally lack the molecules which confer responses to the ligands for insect steroid receptor superfamily members.
Mammalian cells are preferred because they are insensitive to many ligands for insect steroid receptor superfamily members. Thus, exposure of these cells to the ligands of the insect steroid receptor superfamily members typically will have negligible physiological or other effects on the cells, or on a whole organism.
Therefore, cells can grow and express the desired product, substantially unaffected by the presence of the ligand itself. The ligand will function to cause response either in the positive or negative direction.
For example, it is often desirable to grow cells to high density before expression. In a positive induction system, the inducing ligand would be added upon reaching high cell density, but since the ligand itself is benign to the cells, the only physiological imbalances result from the expression, the product, itself.
Alternatively, in a negative repression system, the ligand is supplied until the cells reach a high density.
Upon reaching a high density, the ligand would be removed. Introduction of these cells into a whole organism, e.g., a plant or animal, will provide the products of expression to that organism. In this circumstance, the natural insensitivity of cells to the ligands will also be advantageous.
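The two culture strategies described above differ only in when the ligand is present. The following sketch turns a series of measured cell densities into a ligand schedule for either strategy. The density values, their units, the threshold and the function name are hypothetical and serve only to illustrate the timing logic.

```python
def ligand_schedule(densities, high_density, mode):
    """For each time point, report whether ligand should be present in the culture.

    mode="positive": ligand is added once high density has been reached (induction).
    mode="negative": ligand is supplied until high density is reached, then removed.
    """
    if mode not in ("positive", "negative"):
        raise ValueError("mode must be 'positive' or 'negative'")
    schedule = []
    reached = False
    for density in densities:
        reached = reached or density >= high_density
        schedule.append(reached if mode == "positive" else not reached)
    return schedule


if __name__ == "__main__":
    # Hypothetical density readings in arbitrary units.
    readings = [1, 2, 4, 8, 8]
    print(ligand_schedule(readings, high_density=4, mode="positive"))
    print(ligand_schedule(readings, high_density=4, mode="negative"))
```

In both modes the desired gene is silent during the growth phase and expressed only after high density is reached, so the physiological burden of expression is deferred until the culture is dense.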
Expression vectors for these cells can include expression control sequences, such as an origin of replication, a promoter, an enhancer and necessary processing information sites, such as ribosome-binding sites, RNA splice sites, polyadenylation sites, and transcriptional terminator sequences. Preferably, the enhancers or promoters will be those naturally associated with genes encoding the steroid receptors, although it will be understood that in many cases others will be equally or more appropriate. Other preferred expression control sequences are enhancers or promoters derived from viruses, such as SV40, Adenovirus, Bovine Papilloma Virus, and the like.
Similarly, preferred promoters are those found naturally in immunoglobulin-producing cells (see, U.S. Patent No. 4,663,281, which is incorporated herein by reference), but SV40, polyoma virus, cytomegalovirus (human or murine) and the LTR from various retroviruses (such as murine leukemia virus, murine or Rous sarcoma virus and HIV) are also available. See, Enhancers and Eukaryotic Gene Expression, Cold Spring Harbor Press, 1983, which is incorporated herein by reference.
The vectors containing the DNA segments of interest (e.g., the steroid receptor gene, the recombinant steroid-responsive gene, or both) can be transferred into the host cell by well-known methods, which vary depending on the type of cellular host. For example, calcium chloride transfection is commonly utilized for procaryotic cells, whereas calcium phosphate treatment is often used for other cellular hosts. See, generally, Sambrook et al. (1989), Molecular Cloning: A Laboratory Manual (2d ed.), Cold Spring Harbor Press; Ausubel et al. (1987 and supplements) Current Protocols in Molecular Biology, Greene/Wiley, New York; and Potrykus (1990) "Gene Transfer to Cereals: An Assessment," Bio/Technology 8:535-542, each of which is incorporated herein by reference. Other transformation techniques include electroporation, DEAE-dextran, microprojectile bombardment, lipofection, microinjection, and others.
The term "transformed cell" is meant to also include the progeny of a transformed cell.
As with the purified polypeptides, the nucleic acid segments associated with the ligand-binding segment and the DNA-binding segment are particularly useful. These gene segments will be used as probes for screening for new genes exhibiting similar biological activities, though the controlling elements of these genes are of equal importance, as described below.
Many types of proteins are preferentially produced in eucaryotic cell types because of abnormal processing or modification in other cell types. Thus, mammalian proteins are preferably expressed in mammalian cell cultures. Efficient expression of a desired protein is often achieved, as described above, by placing a desired protein encoding DNA sequence adjacent to controlling elements responsive to ligands for insect steroid receptor superfamily members and an appropriate promoter. Cyclic pulses of ligands in a cell culture may provide periods for cells to recover from effects of production of large amounts of exogenous protein. Upon recovery, the ligand will often be reinduced.
Additional steroid responsive gene elements have also been isolated, substantially purified, using the techniques of the present invention. Other genes adjacent to, and operably linked to, steroid responsive gene control elements are selectable by locating DNA segments to which steroid receptors specifically bind or by hybridization to homologous controlling elements. For example, other steroid responsive genes have been isolated. Many of the genes which are ligand-responsive will also be new members of the insect steroid receptor superfamily.
Having provided for the substantially pure polypeptides, biologically active fragments thereof and recombinant nucleic acids comprising genes for them, the present invention also provides cells comprising each of them. By appropriate introduction techniques well known in the field, cells comprising them will be produced.
See, Sambrook et al. (1989).
In particular, cells comprising the steroid responsive controlling elements are provided, and operable linkage of standard protein encoding segments to said controlling elements produces steroid responsive systems for gene expression. Cells so produced will often be part of, or be introduced into, intact organisms, for example, plants, insects (including caterpillars and larvae), and higher animals, e.g., mammals. This provides for regulatable expression of desired genes where the regulating ligand has no other effects on the cells because the cells otherwise lack the receptors and responsive genes. For example, plants will be induced to fruit at desired times by administration of the appropriate ligand, or animals will be ligand-responsive in production of particular products. And, in fact, biochemical deficiencies may be overcome by ligand-responsive expression of cells introduced into an intact organism which, itself, also otherwise lacks genes responsive to the presence of such a ligand. Multiple repeats of the control elements will lead, often, to at least additive or synergistic control. Cells containing these expression systems will be used in gene therapy procedures, including in humans.
Once a sufficient quantity of the desired steroid receptor polypeptide has been obtained, the protein is useful for many purposes. A typical use is the production of antibodies specific for binding to steroid receptors. These antibodies, either polyclonal or monoclonal, will be produced by available in vitro or in vivo techniques.
For production of polyclonal antibodies, an appropriate target immune system is selected, typically a mouse or rabbit. The substantially purified antigen is presented to the immune system in a fashion determined by methods appropriate for the animal and other parameters well known to immunologists. Typical sites for injection are in the footpads, intramuscularly, intraperitoneally, or intradermally. Of course, another species will often be substituted for a mouse or rabbit.
An immunological response is usually assayed with an immunoassay. Normally such immunoassays involve some purification of a source of antigen, for example, produced by the same cells and in the same fashion as the antigen was produced. The immunoassay will generally be a radioimmunoassay, an enzyme-linked assay (ELISA), a fluorescent assay, or any of many other choices, most of which are functionally equivalent but each will exhibit advantages under specific conditions.
Monoclonal antibodies with affinities of 10⁸ M⁻¹, preferably 10⁹ to 10¹⁰, or stronger will typically be made by standard procedures as described, e.g., in Harlow and Lane (1988), Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory; or Goding (1986), Monoclonal Antibodies: Principles and Practice (2d ed) Academic Press, New York, which are hereby incorporated herein by reference. Briefly, appropriate animals will be selected and the desired immunization protocol followed. After the appropriate period of time, the spleens of such animals are excised and individual spleen cells fused, typically, to immortalized myeloma cells under appropriate selection conditions. Thereafter the cells are clonally separated and the supernatants of each clone are tested for their production of an appropriate antibody specific for the desired region of the antigen.
Other suitable techniques involve in vitro exposure of lymphocytes to the antigenic polypeptides or alternatively to selection of libraries of antibodies in phage or similar vectors. See, Huse et al., (1989) "Generation of a Large Combinatorial Library of the Immunoglobulin Repertoire in Phage Lambda," Science 246:1275-1281, hereby incorporated herein by reference.
The polypeptides and antibodies of the present invention will be used with or without modification.
Frequently, the polypeptides and antibodies will be labeled by joining, either covalently or non-covalently, a substance which provides for a detectable signal. A wide variety of labels and conjugation techniques are known and are reported extensively in both the scientific and patent literature. Suitable labels include radionuclides, enzymes, substrates, cofactors, inhibitors, fluorescers, chemiluminescers, magnetic particles and the like. Patents teaching the use of such labels include U.S. Patent Nos. 3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149; and 4,366,241. Also, recombinant, chimeric, or humanized immunoglobulins will be produced, see, e.g., Cabilly, U.S. Patent No. 4,816,567; Jones et al., 1986, Nature 321, 522-526; and published UK patent application No. 8707252; each of which is hereby incorporated herein by reference.
Another use of purified receptor polypeptides is for determination of the structural and biosynthetic aspects of the polypeptides. Structural studies of interactions of the ligand-binding domains with selected ligands are performed by various methods. The preferred method for structural determination is X-ray crystallography, but other methods may include various forms of spectroscopy or chromatography. See, Connolly, J. Appl. Crystall., 16:548 (1983); Connolly, Science 221:709 (1983); and Blundell and Johnson (1976) Protein Crystallography, Academic Press, New York; each of which is hereby incorporated herein by reference. For example, the structure of the interaction between hormone ligand and hormone-binding segments is determined to high resolution. From this information, minor substitutions or modifications to either or both of the ligand and ligand-binding segment are made based upon the contact regions between the two. This information enables the generation of modified interactions between a ligand and its binding segment to either increase or decrease affinity of binding and perhaps increase or decrease response to binding. Likewise, the interaction between the zinc finger DNA-binding segments and the specific nucleic acid-binding sequence will be similarly modified.
As a separate and additional approach, isolated ligand-binding polypeptide domains will be utilized to screen for new ligands. Binding assays will be developed, analogous to immunoassays. This procedure permits screening for new agonists or antagonists of a particular steroid receptor. Isolated DNA-binding segments will be used to screen for new DNA sequences which will specifically bind to a particular receptor-binding segment. Typically, these receptor-specific binding sites will be controlling elements for steroid responsive genes. Thus, having isolated these DNA-binding sequences, genes which are responsive to the binding of a given receptor can be isolated. This provides a method for isolating genes which are responsive to induction or inhibition by a given hormone receptor.
In another aspect of the present invention, means for disrupting insect development are provided where new ligand agonists or antagonists are discovered. These compounds are prime candidates as agonists or antagonists to interfere with normal insect development. By application of new steroid analogues of ligands for insect steroid receptor superfamily members, it is possible to modify the normal temporal sequence of developmental events. For example, accelerating insect development will minimize generation time. This will be very important in circumstances where large numbers of insects are ultimately desired, for instance, in producing sterile males in Mediterranean fly infestations.
Alternatively, it is useful to slow development in some pest infestations, such that the insects reach destructive stages of development only after commercial crops have passed sensitive stages.
In another commercial application, ligands discovered by methods provided by the present invention will be used in the silk-production industry. Here, the silkworms are artificially maintained in a silk-producing larval stage, thereby being silk productive for extended time periods. The development of larvae will also be susceptible to acceleration to silk production in a shorter time period than occurs naturally.
Other analogues of ligands for insect steroid receptor superfamily members will be selected which, upon application, will completely disrupt normal development and, preferably, lead to a lethal result. Slightly modified natural substances will often have greater specificity of action and much higher activities, allowing for lower levels of application. For example, more lipophilic ligands are more readily absorbed directly into the insect surface or cuticle. Thus, extremely low concentrations of natural ligands should be effective in controlling pests. Furthermore, many of these ligands are likely to be relatively easily manufactured, taking advantage of enzymatic production methods. New ligands for insect steroid receptor superfamily members will sometimes be more species specific or will exhibit particularly useful characteristics, for example, being lethal to specific harmful insects. The greater specificity of the hormones will allow avoidance of the use of non-specific pesticides possessing undesired deleterious ecological side effects, e.g., pesticide residue accumulation in food, often having deleterious effects on humans.
Furthermore, compounds having structures closely analogous to natural compounds should be susceptible to natural mechanisms of biological degradation.
Another aspect of the present invention provides for the isolation or design of new gene segments which are responsive to ligands for insect steroid receptor superfamily members. For example, use of the nucleic acids to screen for homologous sequences by standard techniques will provide genes having similar structural features. Similarly arranged intron structures will typically be characteristic of larger superfamily categories. The preferred domains for screening will be the ligand-binding or DNA-binding segments, although the DNA segments which are recognized by the DNA-binding domains, the controlling elements, will also be of particular interest. Screening for new controlling elements will usually take advantage of known similarities, e.g., sequence homology to other known elements, or homology to the DNA zinc finger-binding domains of other receptors. Receptors and genes important in the general developmental sequence of expression will be discovered. Using this set of developmentally regulated genes will allow selection of particular molecules which are responsible for controlling expression of developmentally regulated genes.
Kits for the determination of expression levels of the nucleic acids and proteins provided herein are made available. Typically, the kit will have at least one compartment which contains a reagent which specifically binds to the desired target molecule, e.g., ligand analogues, receptors, or nucleic acids. These reagents will be used in techniques for assays, using methods typically used in screening protocols. See, Sambrook et al. (1989) and Ausubel et al. (1987 and supplements).
The following experimental section is offered by way of example and not by limitation.
EXPERIMENTAL
EXAMPLE I CLONING, STRUCTURE AND EXPRESSION OF THE DROSOPHILA E75 GENE THAT ENCODES TWO MEMBERS OF THE STEROID RECEPTOR SUPERFAMILY.
The following experiments demonstrate that the E75 gene encodes two members of the steroid receptor superfamily. The proteins it encodes share amino acid sequence homology with the conserved DNA-binding and ligand-binding domains of this superfamily. The E75 gene is ecdysone-inducible, and it occupies and causes the ecdysone-inducible early puff at the 75B locus in the Drosophila polytene chromosome.
A. Cloning of Genomic DNA Encompassing the Ecdysone-Inducible 75B Puff Locus
We have used the method of chromosomal walking (Bender, W., P. Spierer, and D. S. Hogness, 1983. Chromosomal walking and jumping to isolate DNA from the Ace and rosy loci and the Bithorax complex in Drosophila melanogaster. J. Mol. Biol. 168:17-33) to isolate the genomic DNA encompassing the 75B puff region. The starting point for the walk was a genomic clone, designated λ8253, which had been localized by in situ hybridization to the proximal end of 75B. Isolated restriction fragments of λ8253 were used to screen a library of genomic DNA from the Canton S (Cs) strain of D. melanogaster. See (Maniatis, T., R. C. Hardison, E. Lacy, J. Lauer, C. O'Connell, D. Quon, G. K. Sim, and A. Efstratiadis, 1978, "The isolation of structural genes from libraries of eucaryotic DNA." Cell 15:687-701).
Genomic clones λcDm3504 and λcDm3505 were isolated by homology to λ8253.
The walk was then extended in both directions until ~100 kb of genomic DNA had been isolated, and the orientation of the walk was determined by in situ hybridization of the terminal segments to polytene chromosomes. Thereafter, the walk was extended in the rightward direction on the molecular map, or distally relative to the centromere. The 350 kb of genomic DNA encompassed by the walk corresponds to the chromosomal region between bands 75A6-7 and 75B11-13, as determined by in situ hybridization. This region includes the 75B puff, which appears to initiate by simultaneous decondensation of chromosomal bands 75B3-5 and then spreads to surrounding bands.
Methods
Genomic DNA libraries
Canton S genomic DNAs were isolated from a library of sheared, EcoRI-linkered Canton S DNA cloned into the Charon 4 λ phage vector. See (Maniatis, T., R. C. Hardison, E. Lacy, J. Lauer, C. O'Connell, D. Quon, G. K. Sim, and A. Efstratiadis, 1978. The isolation of structural genes from libraries of eucaryotic DNA. Cell 15:687-701). Oregon R (Or) genomic DNAs were isolated from a library of sheared DNA, GC-tailed into the sep6 λ vector. See (Meyerowitz, E. M., and D. S. Hogness, 1982. "Molecular organization of a Drosophila puff site that responds to ecdysone." Cell 28:165-176). One step in the chromosomal walk was taken using a cosmid library of SauIIIA partially digested Or DNA cloned into the cosmid pl4B1 by the method of Ish-Horowicz and Burke (Ish-Horowicz, D., and J. F. Burke, 1982. Rapid and efficient cosmid cloning. Nucleic Acids Res. 9:2989-2998).
In situ hybridization
In situ hybridization to polytene chromosomes was carried out with DNA probes that were nick-translated in the presence of ³H-labeled TTP (NEN), as described by Bonner and Pardue (Bonner, J. and M. L. Pardue, 1976. Ecdysone-stimulated RNA synthesis in imaginal discs of Drosophila melanogaster. Assay by in situ hybridization. Chromosoma 58:87-99), with the following modifications: Heat and RNAase treatments of the slides were omitted, and hybridization and washing were at 63°C in 2X SSPE for 18 and 2 hours, respectively.
B. Identification of a 50 kb Region of Cloned Genomic DNA that Contains Sequences Homologous to Ecdysone-induced Transcripts
Restriction fragments of the above genomic clones were tested for their ability to hybridize with each of two cDNA probes, one derived from the RNA in ecdysone-induced cells, and the other from the RNA in noninduced cells. Two differential screens were carried out. In the first, genomic DNA covering the entire 350 kb walk was examined with cDNA probes synthesized with reverse transcriptase from an oligo(dT) primer annealed to poly(A)+ RNA. The poly(A)+ RNA was prepared from total inner tissues that were mass-isolated from late third instar larvae and incubated in the presence of ecdysone plus cycloheximide, or cycloheximide alone. (See Methods, below. Cycloheximide was included because higher levels of ecdysone-induced transcripts accumulate in its presence.) Each of the 32P-labeled cDNA probes made from these two poly(A)+ RNAs was applied to one of two duplicate Southern blots that contained, in addition to the genomic fragments from the walk, a control DNA consisting of sequences from the ribosomal protein 49 gene (O'Connell, P., and M. Rosbash, 1984. Sequence, structure and codon preference of the Drosophila ribosomal protein 49 gene. Nucleic Acids Res. 12:5495-5513), which was used to normalize the hybridization intensities of the duplicate blots. This screen revealed sequences specific to ecdysone-induced RNAs only within the λcDm3522 genomic clone that is centered at approximately +220 kb on the molecular map.
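The normalization of duplicate blots against the ribosomal protein 49 control can be illustrated with a minimal sketch. It assumes that each band has been reduced to a single intensity value; the function name, fragment names, and intensity values are hypothetical and are not data from this specification.

```python
def normalized_induction(induced, uninduced, control="rp49"):
    """Scale each blot to its control band, then return induced/uninduced ratios."""
    ratios = {}
    for fragment, signal in induced.items():
        if fragment == control:
            continue
        induced_norm = signal / induced[control]
        uninduced_norm = uninduced[fragment] / uninduced[control]
        ratios[fragment] = induced_norm / uninduced_norm
    return ratios


# Hypothetical band intensities in arbitrary units (illustrative only).
induced_blot = {"rp49": 200.0, "fragment_1": 400.0, "fragment_2": 50.0}
uninduced_blot = {"rp49": 100.0, "fragment_1": 20.0, "fragment_2": 25.0}

ratios = normalized_induction(induced_blot, uninduced_blot)
for fragment, ratio in sorted(ratios.items()):
    print(f"{fragment}: {ratio:.1f}-fold")
```

A fragment whose normalized ratio stays near 1 behaves like the control, whereas a ratio well above 1 marks a candidate ecdysone-induced sequence. A band with zero signal on the noninduced blot would need separate handling, which this sketch omits.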
Because the above probes will preferentially detect sequences near the 3' termini of the RNAs, particularly in the case of long transcripts, a second differential screen was carried out with cDNA probes primed with random hexamers (see Methods, below). This screen, which was restricted to the 135 kb of genomic DNA between +105 kb and +240 kb, revealed ecdysone-inducible sequences in fragments spread out over an ~50 kb region between +170 kb and +220 kb. This region represents the E75 gene.
Methods
Organ culture and RNA isolation
Late third instar Or larvae were harvested, washed in 0.7% NaCl, resuspended in Robb's phosphate-buffered saline (PBS) (Robb, J. 1968. Maintenance of imaginal discs of Drosophila melanogaster in chemically defined media. J. Cell. Biol. 41:876-885), preaerated with a blender, and passed through a set of rollers to extrude the organs. This "grindate" was filtered through a coarse Nitex screen to remove carcasses, and settled five times (3-5 minutes per settling) by gravity to remove floating and microscopic debris. Isolated tissues (primarily salivary glands, imaginal discs, gut, and Malpighian tubules) were cultured at 25°C in plastic petri dishes in aerated Robb's PBS. β-ecdysone (Sigma) (0.2 µl/ml of 10 mg/ml) in ethanol and/or cycloheximide (2 µl/ml of 35 mM) in water was added to the appropriate cultures. Incubations in the presence of cycloheximide were for ~8 hours. Isolated tissues were homogenized in 10 volumes of 6 M guanidine-HCl/0.6 M sodium acetate (pH [illegible]), centrifuged at 5000 g for 10 minutes to remove debris, and layered onto a 5.7 M CsCl shelf, as described previously (Chirgwin, J. M., A. E. Przybyla, R. J. MacDonald, and W. J. Rutter, 1979. Isolation of biologically active ribonucleic acid from sources enriched in ribonuclease. Biochemistry 18:5294-5299).
Poly(A)+ RNA was purified by oligo(dT) chromatography.
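The final concentrations implied by the stock additions in the organ culture method can be worked out with a short sketch. It assumes the stated volumes are microliters of stock per milliliter of culture and ignores the small volume added by the stock itself; the function name is hypothetical.

```python
def final_concentration(stock_concentration, microliters_per_ml):
    """Concentration after adding `microliters_per_ml` of stock to 1 ml of culture.

    The result is in the same units as the stock. The slight increase in
    total volume caused by the added stock is ignored.
    """
    return stock_concentration * microliters_per_ml / 1000.0


ecdysone_mg_per_ml = final_concentration(10.0, 0.2)  # 10 mg/ml stock in ethanol
cycloheximide_mM = final_concentration(35.0, 2.0)    # 35 mM stock in water

print(f"ecdysone: {ecdysone_mg_per_ml * 1000:.1f} ug/ml")
print(f"cycloheximide: {cycloheximide_mM * 1000:.0f} uM")
```

Under these assumptions the cultures receive about 2 µg/ml of ecdysone and about 70 µM cycloheximide.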
Southern blot analysis
Southern blots were performed on nitrocellulose, as described previously (Segraves, W. A., C. Louis, S. Tsubota, P. Schedl, J. M. Rawls, and B. P. Jarry, 1984. The rudimentary locus of Drosophila melanogaster. J. Mol. Biol. 175:1-17). cDNA probes were prepared by reverse transcription (AMV reverse transcriptase; Seikagaku) of 2 µg of poly(A)+ RNA with 700 ng of oligo(dT)12-18 (Collaborative Research) or 15 µg of random hexamers (Pharmacia) in a 20 µl reaction mixture containing 80 mM Tris Cl (pH 8.3 at 42°C), 10 mM MgCl2, 100 mM KCl, 0.4 mM DTT, 0.25 mM each of dATP, dGTP, and dTTP, and 100 µCi of [32P]dCTP (800 Ci/mole; Amersham).
After incubation at 37°C for 45 minutes, 80 µl of 10 mM EDTA and 2 µl of 5 N NaOH were added before incubation at [illegible] for 10 minutes to denature the products and hydrolyze the RNA. After the addition of 10 µl of 1 M Tris-Cl (pH 7.5) and 5 µl of 1 N HCl, unincorporated label was removed by chromatography on Biogel.
C. The E75 Gene Contains Two Overlapping Transcription Units: E75A and E75B
Northern blot analysis of ecdysone-induced and noninduced RNAs, prepared as described above and hybridized with strand-specific DNA probes derived from cloned restriction fragments in the 60 kb region (+166 to +226 kb) containing the E75 gene, demonstrated that this gene produces two classes of ecdysone-inducible mRNAs, both derived from rightward transcription. The E75A class of mRNAs hybridized with probes from both the 5' (left) and 3' (right) ends of the 50 kb E75 gene. The E75B class hybridized only with probes from the 3' proximal 20 kb of the gene. These results suggest that the A and B classes of ecdysone-inducible RNAs are initiated by different promoters, located about 30 kb apart, and that the two transcription units defined by these promoters overlap in the region downstream from the B promoter.
This suggestion was confirmed by analysis of the structure of cloned cDNAs from the E75A and E75B mRNAs.
Approximately 10⁶ clones from an early pupal cDNA library (Poole, S. J., L. M. Kauvar, B. Drees, and T. Kornberg, 1985. The engrailed locus of Drosophila: Structural analysis of an embryonic transcript. Cell 40:37-40) were screened at low resolution with genomic DNA probes from the E75 gene region. The 116 cDNA clones identified by this screen were analyzed by restriction digestion and hybridization to a panel of probes derived from the 60 kb (+166 to +226 kb) region. One of the clones, λDm4925, was thereby selected as a representative of the E75A class of mRNAs, and another, λDm4745, as a representative of the E75B mRNA class.
The genomic regions homologous to these two cDNA clones were further localized by Southern blot analysis, and the nucleotide sequence of these regions and of both cDNA clones was determined. These sequences are given in Table 1, along with those derived from 5' and 3' terminal sequence determinations for each transcription unit.
These data demonstrate that the 50 kb E75A transcription unit consists of six exons, labeled in 5' to 3' order: A0, A1, 2, 3, 4 and 5, of which exons A0 and A1 are specific to this unit, while the remaining four are shared with the 20 kb E75B transcription unit.
Similarly, the E75B unit contains a specific exon, labeled B1, at its 5' end, which is located just upstream of the shared exon 2. Thus, the E75 gene consists of two transcription units, of which the shorter E75B unit occupies the 3' proximal 20 kb of the longer E75A unit.
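The exon composition of the two overlapping transcription units can be represented as a minimal sketch. The exon labels follow the text above; the variable and function names are hypothetical.

```python
# Exons of each transcription unit in 5' to 3' order, as described in the text.
E75A_EXONS = ["A0", "A1", "2", "3", "4", "5"]
E75B_EXONS = ["B1", "2", "3", "4", "5"]


def split_exons(unit, other):
    """Return (exons specific to `unit`, exons shared with `other`), keeping order."""
    other_set = set(other)
    specific = [exon for exon in unit if exon not in other_set]
    shared = [exon for exon in unit if exon in other_set]
    return specific, shared


a_specific, shared = split_exons(E75A_EXONS, E75B_EXONS)
b_specific, _ = split_exons(E75B_EXONS, E75A_EXONS)

print("E75A-specific:", a_specific)
print("E75B-specific:", b_specific)
print("shared:", shared)
```

The two units differ only in their 5' exons, which is why a probe from the 3' proximal region detects both mRNA classes while a 5' probe detects only the E75A class.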
Table 1. Sequences of the E75 exons and flanking DNA. The sequence is that of the Cs genomic DNA, which was identical to that of the cDNAs, except for the T→G change indicated at position +2691. This change would convert a leucine to an arginine in the protein sequences. The Dm4925 cDNA extends from just 5' of the EcoRV site at +939 to +4267 in A. The Dm4745 cDNA extends from +804 in B to a point near the HindIII site at +4246 in A. (A) The E75 A exons and flanking DNA. The sequences of the A0, A1, and common exons 2-5 are interrupted by intron sequences (lowercase), which are limited to those near the splice sites and are in agreement with consensus sequences for donor and acceptor splice sites. Negative numbers at the right end of each line refer to the number of base pairs upstream of the E75 A initiation site; positive numbers refer to positions in the E75 A mRNAs, continuing into the 3' flanking DNA. Numbers at the left end of each line refer to amino acid residues in the E75 A protein.
The underlined 14 bp sequence at -159 to -172 exhibits a 13/14 bp match to a sequence (CGTAGCGGGTCTC) found 47 bp upstream of the ecdysone-inducible E74 A transcription unit responsible for the early puff at 74EF. This sequence represents the proximal part of a 19 bp sequence in the E74 A promoter that binds the protein encoded by the D. melanogaster zeste gene. Another underlined sequence in the E75 A promoter at -74 to -82 is also found in the E75 B promoter, where it is part of a tandemly repeated octanucleotide (GAGAGAGC) located at -106 to -121 in B. This repeat matches the consensus sequence for the binding sites of the GAGA transcription factor, which also binds to the E74 A promoter. Other underlined sequences represent, at -27 to -33, the best match to the TATA box consensus at an appropriate position, three AUG codons that are closely followed by in-frame stop codons in the 5'-leader sequence of the mRNAs, and alternative polyadenylation-cleavage signals at 4591 and 5365 that are used by both E75 A and B mRNAs. (B) The B1 exon and its 5'-flanking DNA. The numbering at the right and left ends of the lines follows the same convention as in A. Exons 2-5 shown in A are also used in E75 B, but the amino acid residues and base pair numbers shown in A must be increased by 157 and 375, respectively, to apply to the E75 B protein and mRNA.
The first ten nucleotides of the 136-nucleotide B-intron linking the B1 exon to Exon 2 are gtaggttag, whereas the last ten are shown upstream of nucleotide 1178 in A. The underlined sequences represent, in order, the region of homology to a sequence upstream of E75 A, noted above, the best match to the TATA box consensus at -21 to -27, and three AUG codons followed by in-frame stop codons in the 5' leader of the E75 B mRNA.
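The renumbering rule stated in the legend, under which positions in the shared exons 2-5 are shifted by 157 residues and 375 base pairs to pass from E75 A to E75 B numbering, can be applied as in the following sketch. The function names are hypothetical, and the helpers assume the caller supplies only positions that lie within the shared exons.

```python
RESIDUE_OFFSET = 157  # added to E75 A residue numbers (from the legend)
BASE_OFFSET = 375     # added to E75 A mRNA base numbers (from the legend)


def residue_in_b(residue_in_a):
    """Convert a residue number in the shared region from E75 A to E75 B numbering."""
    return residue_in_a + RESIDUE_OFFSET


def base_in_b(base_in_a):
    """Convert an mRNA position in exons 2-5 from E75 A to E75 B numbering."""
    return base_in_a + BASE_OFFSET


# The T-to-G difference noted at +2691 in A, expressed in E75 B mRNA numbering.
print(base_in_b(2691))
# The residue following the 266 A-specific residues, in E75 B protein numbering.
print(residue_in_b(267))
```

Positions in the A0, A1, or B1 exons have no counterpart in the other unit, so the offsets do not apply to them.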
Methods
cDNA libraries
The λDm4925 and λDm4745 cDNAs were isolated from an Or early pupal cDNA library in λgt10 (Poole, S. J., L. M. Kauvar, B. Drees, and T. Kornberg, 1985. The engrailed locus of Drosophila: structural analysis of an embryonic transcript. Cell 40:37-40). The two cDNAs (λDm4927 and λDm4928) that were used for 3'-end mapping were isolated from an ecdysone-induced salivary gland cDNA library in λ607 prepared by C. W. Jones. (Our strain collection names for the cDNA clones used in these studies are λfDm4925, λfDm4745, λeDm4927, and λeDm4928.)
Northern blot analysis
Probes to be used for Northern blots were cloned into the vector pφX (from R. Mulligan), containing the φX174 origin of replication cloned in between the HindIII and BamHI sites of pBR322. This allowed the synthesis of single-stranded probe DNA (Arai, K., N. Arai, J. Shlomai, and A. Kornberg, 1980. Replication of duplex DNA of phage φX174 reconstituted with purified enzymes. Proc. Natl. Acad. Sci. 77:3322-3326), which was performed by the incubation of supercoiled plasmid DNA with gene A protein, rep and ssb proteins, and DNA polymerase III holoenzyme in a reaction containing 20 mM Tris Cl (pH [illegible]), 80 µg/ml BSA, 4% glycerol, 20 mM DTT, 1 mM ATP, 16 mM concentrations of the three unlabeled deoxynucleotides and 1.6 mM concentrations of the labeled deoxynucleotide for 1 hour at 30°C. EDTA was then added to 20 mM, SDS to [illegible], and proteinase K to 50 µg/ml. The reactions were digested for 30 minutes at 37°C, and unincorporated label was removed by gel filtration.
S1 nuclease protection and primer extension analysis
Single-stranded probes, prepared as described above by the φX in vitro replication system, were purified by electrophoresis on low melting point agarose gels for use as S1 probes. All other probes were prepared by extension of the -20, 17-mer sequencing primer (New England Biolabs) on single-stranded M13mp (Messing, J., 1983. New M13 vectors for cloning. Methods Enzymol. 101:20-78) or pEMBL (Dente, L., G. Cesareni, and R. Cortese, 1983. pEMBL: A new family of single-stranded plasmids. Nucleic Acids Res. 11:1645-1654) recombinant templates using 32P-labeled nucleotides, followed by cleavage with the appropriate restriction enzyme and purification of the probe on denaturing polyacrylamide gels. Labeled probe (100,000-300,000 cpm) was incubated with 1 µg of poly(A)+ RNA in a 5 µl reaction mixture containing 5 µg of yeast tRNA, 0.4 M NaCl, 40 mM PIPES (pH [illegible]), and 1 mM EDTA at 60°C under oil. Reactions were cooled and diluted 1:10 into either S1 digestion or primer extension buffer. S1 nuclease digestions were performed in 50 mM acetate buffer, 400 mM NaCl, and 4 mM ZnSO4 at 20°C for 1 hour with ~150 Vogt units of S1 nuclease (Boehringer) per 50 µl reaction. Primer extensions were performed at 42°C in 50 mM Tris Cl (pH 8.3 at 42°C), 80 mM KCl, 2 mM DTT, 1 mM each of dATP, dCTP, dGTP, and dTTP, with 20 units of AMV reverse transcriptase (Seikagaku) per 50 µl reaction. Reactions were terminated by the addition of EDTA, tRNA carrier was added to the S1 nuclease digestions, and samples were ethanol-precipitated and either electrophoresed directly on 5% or 6% denaturing polyacrylamide gels or glyoxalated (McMaster, G. and G. C. Carmichael, 1977. Analysis of single and double-stranded nucleic acids on polyacrylamide and agarose gels by using glyoxal and acridine orange. Proc. Natl. Acad. Sci. 74:4835-4838) and electrophoresed on 1% agarose gels run in 10 mM sodium phosphate buffer (pH 6.8).
DNA sequence analysis
The cDNA clones λDm4927 and λDm4928 were sequenced by chemical degradation (Maxam, A. M., and W. Gilbert, 1980. Sequencing end-labeled DNA with base-specific chemical cleavage. Methods Enzymol. 65:499-560). All other sequencing was performed using the dideoxynucleotide chain termination method (Sanger, F., A. R. Coulson, B. F. Barrell, A. J. H. Smith, and B. A. Roe, 1980. Cloning in single-stranded bacteriophage as an aid to rapid DNA sequencing. J. Mol. Biol. 143:161-178). Fragments were cloned into M13mp vectors (Messing, J., 1983. New M13 vectors for cloning. Methods Enzymol. 101:20-78) or pEMBL (Dente, L., G. Cesareni, and R. Cortese, 1983. pEMBL: A new family of single-stranded plasmids. Nucleic Acids Res. 11:1645-1654) and sequenced directly or following the generation of a set of overlapping deletions using exonuclease III (Henikoff, S., 1984. Unidirectional digestion with exonuclease III creates targeted breakpoints for DNA sequencing. Gene 28:351-359). Sequencing was performed on both strands of the λDm4925 cDNA, the B-specific region of λDm4745 cDNA, the A- and B-specific 5' genomic regions not represented in the cDNAs, and the 3'-flanking region. The remaining exon boundaries of λDm4745 and genomic regions represented within the cDNA clones were sequenced on one strand.
D. The E75 Gene Encodes Two Members of the Steroid Receptor Superfamily
The coding and noncoding sequences of the E75 A and B mRNAs, their splice junctions, and the 5' and 3' flanking sequences are shown in Table 1. Certain sequences of potential interest within the 5' flanking DNA and in the 5' leader mRNA sequences are indicated in the legend to Table 1. We focus here on the large open reading frames of the E75 A and B mRNAs that begin at 380 bp and 284 bp downstream from their respective mRNA start sites, each continuing into the common final exon.
The termination codon in exon 5 lies upstream of both alternative polyadenylation sites; thus, the sequence of the encoded protein is not affected by which site is selected. Since the open reading frames in the E75 A and B mRNAs begin in the A0 and B1 exons and merge at the beginning of exon 2, the proteins encoded by the two transcription units differ in the amino-terminal region and are the same in the carboxy-terminal region. The specific amino-terminal regions contain 266 and 423 amino acid residues in the E75 A and B proteins, respectively, while their common carboxy-terminal region consists of 971 residues. The predicted molecular weights of the A and B proteins are thus 132,000 and 151,000. The open reading frames display characteristic D. melanogaster codon usage, and their extents have been confirmed by in vitro translation of mRNAs transcribed in vitro from cDNA constructs and by expression of fusion proteins in E. coli. The predicted protein sequence for each protein is punctuated by homopolymeric tracts of amino acids, which are noted in Table 1 and its legend.
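The residue counts given above can be cross-checked against the numbering offsets stated in the legend to Table 1 (157 residues and 375 base pairs). The following sketch does this arithmetic; it assumes that both reading frames enter the shared exon 2 at the same nucleotide, and the variable names are hypothetical.

```python
A_SPECIFIC, B_SPECIFIC, COMMON = 266, 423, 971  # residues, from the text
A_ORF_START, B_ORF_START = 380, 284             # bp downstream of each mRNA start

# Total lengths of the two predicted proteins.
a_length = A_SPECIFIC + COMMON
b_length = B_SPECIFIC + COMMON

# Shift in residue numbering between the two proteins in the common region.
residue_offset = B_SPECIFIC - A_SPECIFIC

# Shift in mRNA numbering: leader length plus three bases per specific residue.
base_offset = (B_ORF_START + 3 * B_SPECIFIC) - (A_ORF_START + 3 * A_SPECIFIC)

# Mean residue mass implied by the stated molecular weights.
a_mean_mass = 132_000 / a_length
b_mean_mass = 151_000 / b_length

print(a_length, b_length, residue_offset, base_offset)
print(round(a_mean_mass, 1), round(b_mean_mass, 1))
```

The derived offsets agree with the values of 157 and 375 given in the legend, and the implied mean residue masses fall in the range commonly seen for proteins, so the stated lengths and molecular weights are mutually consistent.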
Analysis of the sequences of the E75 proteins and comparison to the sequences of known proteins have revealed similarity between the E75 proteins and members of the steroid receptor superfamily (Evans, R. M., 1988. The steroid and thyroid hormone receptor superfamily. Science 240:889-895; Green, S., and P. Chambon, 1988. Nuclear receptors enhance our understanding of transcription regulation. Trends in Genetics 4:309-314). We have used the nomenclature of Krust et al. (Krust, A., S. Green, P. Argos, V. Kumar, P. Walter, J. M. Bornert, and P. Chambon, 1986. The chicken oestrogen receptor sequence: Homology with v-erbA and the human oestrogen and glucocorticoid receptors. EMBO J. 5:891-897) in dividing the proteins into six regions, A to F, in the amino- to carboxy-terminal direction.
Similarity between E75A and other members of this superfamily is strongest in the C region, a cysteine-lysine-arginine-rich region that is necessary and sufficient for the binding of these receptors to DNA (for review, see Evans, R. M., 1988. The steroid and thyroid hormone receptor superfamily. Science 240:889-895; Green, S., and P. Chambon, 1988. Nuclear receptors enhance our understanding of transcription regulation. Trends in Genetics 4:309-314). The C region consists of 66-68 amino acids, of which 20 residues are invariant within this family. Among these are nine invariant cysteine residues, eight of which are believed to coordinate zinc in the formation of two zinc finger-like structures (Miller, J., A. D. McLachlan, and A. Klug, 1985. Repetitive zinc-binding domains in the protein transcription factor IIIA from Xenopus oocytes. EMBO J. 4:1609-1614; Freedman, L. P., B. F. Luisi, Z. R. Korszun, R. Basavappa, P. B. Sigler, and K. R. Yamamoto, 1988. The function and structure of the metal coordination sites within the glucocorticoid receptor DNA binding domain. Nature 334:543-546; Severne, Y., S. Wieland, W. Schaffner, and S. Rusconi, 1988. Metal binding finger structure of the glucocorticoid receptor defined by site-directed mutagenesis. EMBO J. 7:2503-2508). Within the C region, E75A contains all of the highly conserved residues and is approximately as closely related to other members of the steroid receptor superfamily as they are to one another. The closest relative of E75 appears to be the human ear-1 gene, which has nearly 80% amino acid identity to E75 A in the DNA-binding domain.
The other region conserved among members of the steroid receptor superfamily is the E region, which is required for steroid binding and for the linkage of steroid-binding and trans-activation functions (for review, see Evans, R. M., 1988. The steroid and thyroid hormone receptor superfamily. Science 240:889-895; Green, S., and P. Chambon, 1988. Nuclear receptors enhance our understanding of transcription regulation. Trends in Genetics 4:309-314). Although overall E-region similarity is clearly significant for the comparison of E75A to the thyroid hormone, vitamin D, and retinoic acid receptors, and ear-1, similarity to the glucocorticoid and estrogen receptors is considerably lower. However, the plots of local similarities show a clear similarity to each of these proteins within three subregions of the E region, denoted E1, E2 and E3. The E1 subregion is the most highly conserved and corresponds to a region shown by in vitro mutagenesis to be essential for steroid binding and steroid-dependent trans-activation (Giguere, V., S. M. Hollenberg, M. G. Rosenfeld, and R. M. Evans, 1986. Functional domains of the human glucocorticoid receptor. Cell 46:645-652; Danielsen, M., J. P. Northrop, J. Jonklaas, and G. M. Ringold, 1987. Domains of the glucocorticoid receptor involved in specific and nonspecific deoxyribonucleic acid binding, hormone activation and transcriptional enhancement. Mol. Endocrinol. 1:816-822). Region E2 is less highly conserved in primary amino acid sequence, but can, in part, be seen as a conserved hydrophobic region in the hydropathy plots of several of these proteins. A deletion of 14 amino acids within this region abolished steroid binding (Rusconi, S., and K. R. Yamamoto, 1987. Functional dissection of the hormone and DNA binding activities of the glucocorticoid receptor. EMBO J. 6:1309-1315). E3 falls close to the end of the region that is absolutely required for steroid binding.
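A hydropathy plot of the kind referred to above can be sketched as a sliding-window average of per-residue hydropathy values. The sketch below assumes the Kyte-Doolittle scale and a window of our choosing; the text does not state which scale or window was used, and the toy peptide is hypothetical, not an E75 sequence.

```python
# Kyte-Doolittle hydropathy index per amino acid (assumed scale; the text does
# not say which scale was used for its hydropathy plots).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 9) -> list[float]:
    """Mean hydropathy of each full window sliding along the sequence."""
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[aa] for aa in seq.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


# Hypothetical toy peptide: a hydrophobic stretch flanked by charged residues.
toy = "DEKRDLLIVVALLIVKRDEK"
profile = hydropathy_profile(toy, window=5)
peak = max(range(len(profile)), key=profile.__getitem__)
print(peak, round(profile[peak], 2))
```

A conserved hydrophobic region such as E2 would appear as a peak at corresponding positions in the profiles of several aligned receptors, even where the residues themselves differ.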
While the characteristic structural features of the steroid receptor superfamily are well conserved in E75, two novel variations are seen. The first of these concerns the structure of the E75 B protein, which contains a major alteration within its putative DNA-binding domain. The steroid receptor superfamily DNA-binding domain consists of two DNA-binding zinc fingers separated by a less conserved linker region. In E75, as in nearly all other genes of this family, an intron is found between the two fingers. In E75, this splice marks the beginning of the region held in common between the A and B proteins. This results in the E75 A protein having two fingers, while the E75 B protein has unrelated B-specific sequences in place of the first finger. Other sequences within the B-specific amino-proximal region may contribute to the DNA-binding domain of the E75 B protein. Alternatively, the B protein might bind DNA with only one finger, as the GAL4 transcription factor of yeast appears to do. It is possible that these structural differences imply a functional difference in the DNA-binding properties of the E75 A and B proteins that might allow them to differentially regulate the transcription of the late genes that characterize the secondary response to ecdysone in different target tissues.
In this respect, it should be emphasized that the putative hormone- or ligand-binding domain is represented by the E region that is common to the E75 A and B proteins. Thus, these proteins appear to be receptors for the same hormone, which may regulate the transcription of different sets of genes. These proteins represent "orphan" receptors, in that their hormone, or binding ligand, has not yet been identified. Because ecdysteroids are the only known steroid hormones in Drosophila, the most obvious candidate for an E75 ligand would be ecdysone itself. However, it is unlikely that this is the case, since the putative hormone-binding domain of the E75 proteins does not exhibit the high sequence homology to that of the known Drosophila ecdysone receptor encoded by the EcR gene (see Experimental Example III and Table 2) that would be expected if the E75 proteins were also ecdysone receptors. It therefore seems likely that the proteins would bind either a terpenoid juvenile hormone or a novel Drosophila hormone.
The second unusual feature of the E75 proteins is the presence of a large F region, encompassing nearly one half of the proteins. Many of the other receptors have very small F regions, and no function has yet been ascribed to this region.
Methods
Protein sequence analysis
Sequence data were compiled using the Bionet system. Protein sequence comparison was performed using FASTP (Lipman, D. J., and W. R. Pearson, 1985. Rapid and sensitive protein similarity searches. Science 227:1435-1441) and Bionet IFIND programs.
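Rapid similarity searches of this kind begin by locating short identical words (k-tuples) shared by two sequences and finding the alignment diagonal on which they cluster. The sketch below illustrates only that first step, under our own simplifications; it is not the FASTP program, the function name is hypothetical, and the example peptides are invented.

```python
from collections import Counter, defaultdict


def best_diagonal(query: str, target: str, ktup: int = 2) -> tuple[int, int]:
    """Return (diagonal offset, number of shared k-tuples) for the densest diagonal.

    The offset is target_position - query_position; an ungapped alignment along
    that diagonal contains the most identical k-residue words.
    """
    positions = defaultdict(list)            # word -> start positions in query
    for i in range(len(query) - ktup + 1):
        positions[query[i:i + ktup]].append(i)

    diagonals = Counter()
    for j in range(len(target) - ktup + 1):
        for i in positions.get(target[j:j + ktup], ()):
            diagonals[j - i] += 1
    if not diagonals:
        return (0, 0)                        # no shared words at all
    return diagonals.most_common(1)[0]


# Hypothetical peptides: the query is embedded in the target at offset 3.
print(best_diagonal("CEGCKGFF", "AAACEGCKGFFRR"))
```

A full search program would go on to score and rank the regions found along the best diagonals; that scoring is omitted here.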
E. Expression Vectors for E75 Proteins
To express the E75 proteins, portions of cDNAs and genomic clones were fused to generate cassettes containing the entire E75 A and E75 B protein coding regions. First, BamHI sites were introduced into genomic clones upstream of the initial AUGs of the large open reading frames. Then, E75 A0 exon sequences were fused to sequences of a nearly full-length E75 A cDNA, and E75 B1 exon sequences were fused to sequences of a nearly full-length E75 B cDNA. These cassettes were cloned into pGEM3 (Promega), and transcripts of the open reading frames were prepared using T7 polymerase. These were then translated in the presence of 35S-methionine, and shown to give rise to proteins of appropriate size.
These cassettes have been placed into a variety of expression vectors, including pUCHsneo/Act for expression in Drosophila cells, pSV2 for expression in mammalian cells, and pOTS for expression in bacterial cells.
Methods
BamHI sites were introduced directly upstream of the initial ATGs of the E75A and E75B coding sequences: at the SspI site upstream of the E75A initial ATG, and at the SacII site upstream of the E75B initial ATG. cDNA and genomic sequences were joined at the EcoRV site in the A0 exon to construct an E75A cassette, and at the MluI site in exon 3 to construct an E75B cassette.
EXAMPLE II CLONING, STRUCTURE AND EXPRESSION OF THE EcR AND DHR3 GENES THAT ENCODE ADDITIONAL MEMBERS OF THE STEROID RECEPTOR SUPERFAMILY.
The following experiments were carried out after the primary structure of the E75 gene, and of the two members of the steroid receptor superfamily that it encodes, was determined (Experimental Example I). The purpose of these experiments was to clone and determine the primary structure of other steroid receptor superfamily genes from Drosophila, and of the proteins they encode. The aim was to identify the gene that encodes a Drosophila ecdysone receptor, given that the characteristics of the E75 gene indicated that it did not encode an ecdysone receptor. The first stage of the experimental plan was to use the conserved sequences in the E75A transcription unit that encode the putative DNA-binding domain of the receptor protein as a probe to screen a Drosophila genomic library to identify sequences encoding the putative DNA-binding domains of other Drosophila members of the steroid receptor superfamily. The second stage was to isolate cDNA clones corresponding to the identified genes, as well as additional genomic DNA clones, to obtain the nucleotide sequence of the complete coding region (i.e., the open reading frame encoding the respective receptors) and the exon-intron organization of these genes.
The experiments described below resulted in the cloning and structural characterization of two genes that satisfy the criteria for bona fide members of the steroid receptor superfamily: encoding proteins that exhibit amino acid sequence homology to both the DNA-binding and the hormone-binding domains that are conserved among members of this superfamily. The two genes are called EcR and DHR3. The EcR gene was originally called DHR23, but was renamed EcR after it was shown to encode an ecdysone receptor (see Experimental Example III). The DHR3 designation stands for Drosophila Hormone Receptor 3.
A. Identification and Chromosomal Mapping of EcR and DHR3 Genomic Clones
Initially, Southern blots of total Drosophila genomic DNA, digested with one or more restriction endonucleases, were probed with a 530 bp fragment of the E75A cDNA containing the sequences encoding the putative DNA-binding domain of the E75A receptor protein (see Experimental Example I) at low and high stringency hybridization conditions.
To isolate the sequences responsible for these low stringency bands, this E75A probe was used to screen a Drosophila genomic library under the same low stringency conditions, counterscreening duplicate filters with intron probes to eliminate phage containing inserts from the E75 gene. Five genome equivalents were screened and 39 non-E75 containing phage were isolated. The 25 most strongly hybridizing clones were divided into six classes on the basis of restriction patterns and cross hybridization, each class containing between one and six independent overlapping genomic inserts.
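The significance of screening five genome equivalents can be illustrated with a simple estimate. The sketch below assumes that library clones sample the genome uniformly and independently, so that the number of clones covering any given sequence is approximately Poisson distributed; this sampling model and the function names are our assumptions and are not stated in the text.

```python
import math


def coverage_probability(genome_equivalents: float) -> float:
    """Probability that a given sequence is present at least once in the clones
    screened, assuming uniform, independent (Poisson) sampling of the genome."""
    if genome_equivalents < 0:
        raise ValueError("genome_equivalents must be non-negative")
    return 1.0 - math.exp(-genome_equivalents)


def equivalents_needed(probability: float) -> float:
    """Inverse: genome equivalents required to reach a target probability."""
    if not 0.0 <= probability < 1.0:
        raise ValueError("probability must be in [0, 1)")
    return -math.log(1.0 - probability)


print(round(coverage_probability(5), 4))
```

Under this model, a screen of five genome equivalents would be expected to include any particular single-copy sequence with a probability above 99%.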
For each class, a restriction fragment containing the region of hybridization to the E75A probe was localized by Southern blotting. Hybridization of probes derived from these fragments to genomic Southern blots showed that each of the low stringency bands detectable by the E75A probe could be accounted for by one of the six isolated fragments.
The nucleotide sequences of the six restriction fragments were determined to test whether they represent candidate receptor genes. In all cases, DNA sequence similarities with the E75A probe were observed that are sufficient to account for the hybridization of these fragments with the probe. When the DNA sequences were conceptually translated in all six reading frames, four of the fragments yielded no significant sequence similarity with E75A at the protein level. The remaining two clones, however, showed predicted amino acid sequences with strong similarity to the DNA binding domains of the E75A protein and other steroid superfamily receptors.
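Conceptual translation in all six reading frames, as described above, can be sketched as follows. This is a minimal illustration assuming the standard genetic code; the function names are ours, and the short test sequence in the usage line is hypothetical rather than taken from any of the six fragments.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame_translation(dna: str) -> dict[str, str]:
    """Translations for frames +1, +2, +3 and -1, -2, -3 ('*' marks a stop)."""
    frames = {}
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


for frame, peptide in six_frame_translation("ATGTGCGAAGGATGCAAATAA").items():
    print(frame, peptide)
```

Each of the six conceptual translations can then be compared with the E75A protein sequence; a fragment encoding a receptor DNA-binding domain should show similarity in one frame only.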
These two clones represent the EcR and DHR3 genes, as will become apparent. Probes from these clones were used to map the position of these genes in the polytene chromosomes by in situ hybridization. The EcR and DHR3 chromosomal loci were mapped to positions 42A and 46F, respectively, in the right arm of the second chromosome.
B. Structure of the EcR and DHR3 Genes and Their cDNAs
The DHR3 and EcR genomic clones described above were used to screen a cDNA library prepared from third instar tissues treated with ecdysone and cycloheximide. This procedure allowed the isolation of a large number of cDNA clones, since both genes have a peak period of transcription in late third instar after the rise in ecdysone titer. For each gene, 20 cloned cDNAs were purified and their lengths determined. Restriction maps for the 10 longest cDNAs from each gene were determined and found to be colinear.
For EcR, a 5534 bp cDNA sequence was obtained from two overlapping cDNA clones. It contains an 878 codon open reading frame (ORF) which yields a predicted amino acid sequence expected for a member of the steroid receptor superfamily (Table 2), as described in more detail below. The length of the largest DHR3 cDNA that was isolated (clone DHR3-9) is 4.2 kb. The nucleotide sequence of this cDNA was determined and found to contain a 487 codon AUG-initiated open reading frame (Table 3). As described below, the amino acid sequence of the DHR3 protein predicted from this sequence demonstrates that this protein is also a bona fide member of the steroid receptor superfamily.
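Locating an AUG-initiated open reading frame in a cDNA sequence, as described above, can be sketched as a scan of the three forward reading frames. The following is a minimal illustration; the function name and coordinate convention are our own, and the usage sequence is a hypothetical toy, not the EcR or DHR3 cDNA.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(dna: str) -> tuple[int, int]:
    """Return (start, end) of the longest ATG-initiated ORF on the given strand.

    Coordinates are 0-based and end-exclusive, and the interval includes the
    stop codon. Returns (-1, -1) when no complete ORF is found.
    """
    dna = dna.upper().replace("U", "T")      # accept mRNA or cDNA notation
    best = (-1, -1)
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                    # first ATG opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start > best[1] - best[0]:
                    best = (start, i + 3)
                start = None                 # look for the next ORF in this frame
    return best


start, end = longest_orf("GGATGTGCGAAGGATGCAAATAACC")
print(start, end, (end - start) // 3 - 1)    # last value: codons excluding the stop
```

Applied to a full-length cDNA, the number of codons in the longest ORF is the quantity reported in the text (878 for EcR and 487 for DHR3).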
WO 91/13167 PCT/US91/01189
Table 2. The cDNA sequence of the EcR gene.
Numerals at the left refer to the nucleotide sequence; those on the right to the amino acid sequence in the EcR protein. Nucleotides 1-5194 are the sequence of EcR-17 cDNA, while nucleotides 5195-5534 derive from the EcR-9 cDNA. The underlined sequences in the 5' and 3' untranslated regions refer, respectively, to the ATG codons and the AATAAA consensus polyadenylation signals. Positions of the introns and the donor and acceptor splice sequences are indicated above the cDNA sequence in small type. The amino acid sequences homologous to the conserved DNA-binding (C region) and hormone-binding (E region) domains of the steroid receptor superfamily are underlined.
Table 3. The cDNA sequence of the DHR3 gene.
The numbering and underlining of the nucleotide and amino acid sequences have the same meaning as in Table 2, and the intron positions and donor and acceptor splice sequences are similarly indicated. The sequence of the proximal 2338 nucleotides of the DHR3-9 cDNA is shown. The sequence of the remainder of this 4.2 kb cDNA was determined for only one strand and is not shown. Four silent, third-position differences between the cDNA and genomic DNA sequences are indicated above the cDNA sequence.
The genomic structure of the EcR and DHR3 genes was investigated by isolating additional genomic DNA clones that form overlapping sets that contain all of the sequences found in the respective cDNA clones. The exons contained in these cDNAs were mapped within the genomic DNA by comparison of cDNA and genomic clones via Southern blot analysis, mapping of restriction cleavage sites, and finally, by determination of the nucleotide sequence of the genomic DNA in regions that contain the exon/intron boundaries. Tables 2 and 3 show these boundaries and the sequence of the splice junctions for the EcR and DHR3 genes, respectively. All of these splice junctions conform to the splice donor and acceptor consensus sequences.
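A check of splice junctions against the donor and acceptor consensus can be sketched as follows. This illustration tests only the nearly invariant GT and AG dinucleotides at the two ends of each intron, which is a simplification of the full consensus; the function name, the coordinate convention, and the toy sequences are our own assumptions.

```python
def check_splice_sites(genomic: str, exons: list[tuple[int, int]]) -> list[bool]:
    """For each intron between consecutive exons, test the GT...AG rule.

    Exons are 0-based, end-exclusive (start, end) intervals in genomic order.
    Only the intron-terminal dinucleotides are checked, not the full consensus.
    """
    genomic = genomic.upper()
    results = []
    for (_, donor_exon_end), (acceptor_exon_start, _) in zip(exons, exons[1:]):
        intron = genomic[donor_exon_end:acceptor_exon_start]
        results.append(len(intron) >= 4
                       and intron.startswith("GT")
                       and intron.endswith("AG"))
    return results


# Hypothetical two-exon gene with a 13-nucleotide intron.
exon1, intron1, exon2 = "ATGGCC", "GTAAGTTTTTCAG", "GGATAA"
gene = exon1 + intron1 + exon2
exon_coords = [(0, len(exon1)), (len(exon1) + len(intron1), len(gene))]
print(check_splice_sites(gene, exon_coords))
```

The exon coordinates themselves would come from aligning the cDNA sequence to the genomic sequence, as described in the Methods below.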
For EcR, the cDNA sequence shown in Table 2 is split into six exons spread over 36 kb of genomic DNA, with the ORF beginning in the second exon and ending in the sixth. For DHR3, the cDNA sequence derives from nine exons spread over 18 kb, with the ORF beginning in the first exon and ending in the ninth. Because the 5' and 3' ends of the respective mRNAs were not mapped, it should be emphasized that these genes may have additional noncoding exons at their 5' or 3' ends.
The EcR and DHR3 gene structures differ significantly from those of all previously examined steroid receptor superfamily genes. Comparison with the genes for 11 other receptor homologues for which at least partial structural information is available reveals that the positions of certain exon boundaries have been conserved in evolution. This conservation is most striking in the portion of the genes encoding DNA-binding domains. In the nine other cases where the structure of this region has been examined, the two halves of the DNA-binding domain are always encoded by separate exons. If we exclude the Drosophila genes knirps, knirps-related, and egon (which are not bona fide receptor homologues since they lack the hormone-binding domain sequence similarity), these are always small exons, the second one invariably ending in the fourth codon beyond the conserved Met codon at the end of the C region. Thus, these exons each encode one of the two predicted zinc fingers of the DNA-binding domain. In contrast, both zinc fingers of the putative DNA-binding domain of the EcR and DHR3 receptors are encoded by a single exon. It is possible that our screen specifically selected for genes lacking the above intron. The screen selected genomic clones that hybridize to an E75A cDNA probe that, of course, lacks this intron. Genomic sequences containing a contiguous sequence encoding the DNA-binding domain would be expected to hybridize to this probe better than clones from genes containing the intron.
This would explain the successful isolation of the EcR and DHR3 genes, and the failure to isolate the genes of other Drosophila members of the steroid receptor superfamily.
Methods
Isolation of cDNA and additional genomic clones
Subclones of the originally isolated DHR3 and EcR genomic clones were used to screen a cDNA library prepared from third instar tissues treated with ecdysone and cycloheximide. This library was chosen because both genes are relatively highly expressed at the end of third instar, and because of the high quality of the library.
Of the 270,000 primary plaques screened, 20 positives for DHR3 and 220 for EcR were detected. Twenty cDNAs for each gene were purified, of which the ten largest for each were restriction mapped and found to be colinear.
cDNA DHR3-9, which extends further in both the 5' and 3' directions than our other DHR3 cDNAs, was chosen for sequencing. For EcR, the longest cDNA, EcR-17, extended the farthest 5' and was sequenced in its entirety. An additional cDNA clone, EcR-9, was found to extend 300 bp farther 3' than EcR-17, and this 3' extension was also sequenced. Additional genomic DNA clones covering the EcR and DHR3 genes were obtained by screening the Drosophila Canton S genomic library referred to in part A above, either with probes from the respective cDNA clones, or, for overlapping clones, by the chromosomal walk method described in Experimental Example I.
DNA sequence analysis
cDNAs were subcloned into BlueScript vectors (Stratagene), and clones for sequencing were generated by exonuclease III digestion (Henikoff, S., 1984. Unidirectional digestion with exonuclease III creates targeted breakpoints for DNA sequencing. Gene 28:351-359). Double-stranded plasmids were denatured (Gatermann, K. B., G. H. Rosenberg, and N. F. Kaufer, 1988. Double-stranded sequencing, using mini-prep plasmids, in 11 hours. BioTechniques 6:951-952) and sequenced by the dideoxy chain-terminating method (Sanger, F., S. Nicklen, and A. R. Coulson, 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74:5463-5467), using the enzyme Sequenase (U.S. Biochemical). cDNA EcR-17 was completely sequenced on both strands, as was the EcR-9 3' extension. cDNA DHR3-9 was sequenced on both strands for the 5'-most 2338 bp, which contains the entire ORF, and the remainder of the long 3' untranslated region was sequenced on one strand.
The exon/intron boundaries in genomic DNA clones were first mapped at low resolution by Southern blot analysis of their restriction fragments probed with labeled cDNAs. Genomic DNA surrounding each exon/intron boundary was subcloned and the nucleotide sequence of these subclones determined as above.
Shorter exons were completely sequenced from genomic clones. Longer exons had their boundaries sequenced from genomic clones, and were confirmed to be colinear with the cDNA clones by parallel restriction digestion and electrophoresis of the cDNA and genomic clones.
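The logic of confirming colinearity by parallel digestion can be illustrated computationally: if a long exon is colinear in the cDNA and genomic clones, the fragments lying between consecutive restriction sites within that exon have the same sizes in both. The sketch below is a hypothetical in silico analogue, not the procedure actually used; the function names are ours, the sequences are invented, and the EcoRI site (G^AATTC) is used only as a familiar example.

```python
def cut_positions(seq: str, site: str, cut_offset: int) -> list[int]:
    """Top-strand cut coordinates for every occurrence of a recognition site.

    `cut_offset` is the number of bases from the start of the site to the cut
    (for example 1 for G^AATTC).
    """
    seq, site = seq.upper(), site.upper()
    cuts, pos = [], seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    return cuts


def internal_fragments(seq: str, site: str, cut_offset: int) -> list[int]:
    """Sizes of fragments bounded on both sides by a cut, in sequence order."""
    cuts = cut_positions(seq, site, cut_offset)
    return [b - a for a, b in zip(cuts, cuts[1:])]


# Hypothetical exon carrying three EcoRI sites, placed in two different contexts.
exon = "AAA" + "GAATTC" + "CCCCC" + "GAATTC" + "TTTTTTTT" + "GAATTC" + "GG"
cdna_clone = "TT" + exon
genomic_clone = "GGGGGG" + exon + "ACACAC"

print(internal_fragments(cdna_clone, "GAATTC", 1))
print(internal_fragments(genomic_clone, "GAATTC", 1))
```

Only fragments internal to the exon are compared, since the end fragments depend on the flanking vector or intron sequences and are expected to differ.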
C. The Predicted Amino Acid Sequence of the EcR and DHR3 Proteins and their Implications
Comparison of the predicted EcR and DHR3 protein sequences to the sequence database and to individual members of the steroid receptor superfamily shows that these proteins share the two conserved domains characteristic of this superfamily (Evans, R. M., 1988. The steroid and thyroid hormone receptor superfamily. Science 240:889-895; Green, S., and P. Chambon, 1988. Nuclear receptors enhance our understanding of transcription regulation. Trends in Genetics 4:309-314). We refer to the domains as the C and E regions, for the more amino-terminal and more carboxy-terminal homologies, respectively, according to the nomenclature of Krust et al. (Krust, A., S. Green, P. Argos, V. Kumar, P. Walter, J. M. Bornert, and P. Chambon, 1986. The chicken oestrogen receptor sequence: homology with v-erbA and the human oestrogen and glucocorticoid receptors. EMBO J. 5:891-897). These domains are underlined in Tables 2 and 3, and Table 4A-C presents a comparison of these domains from EcR and DHR3 with those from representative members of the superfamily.
Table 4. Sequence comparison of the conserved C and E regions in DHR3, EcR, and some representative nuclear receptor homologues.
(A) C-region alignment. Numbers at the left indicate the amino acid positions within the individual receptors; dashes indicate gaps introduced to obtain maximal alignment. Dots indicate three positions important in determining the DNA binding specificity of this domain.
(B) E-region alignment. Bars indicate the three most highly conserved stretches within this domain.
(C) Computed percent identities among the C-region sequences (lower left) and among the E-region sequences (upper right). The kni sequence shows no significant E-region homology and is, therefore, not included in this comparison.
Sequences shown are from: E75A, Drosophila ecdysone-inducible gene at 75B; kni, Drosophila segmentation gene knirps; hRARα, human retinoic acid receptor alpha; hTRβ, human thyroid receptor beta; hVDR, human vitamin D receptor; cOUP-TF, chicken ovalbumin upstream promoter transcription factor; hERR1 and hERR2, human estrogen-related receptors 1 and 2; hER, human estrogen receptor; hGR, human glucocorticoid receptor; hMR, human mineralocorticoid receptor; hPR, human progesterone receptor.
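Percent identities of the kind tabulated in a pairwise comparison can be computed from aligned sequences as sketched below. The choice of denominator (alignment columns in which neither sequence has a gap) is our assumption, since the legend does not state how the identities were calculated, and the aligned peptides and their labels are hypothetical rather than taken from Table 4.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = [(x, y) for x, y in zip(a.upper(), b.upper())
                if x != gap and y != gap]
    if not compared:
        return 0.0
    return 100.0 * sum(x == y for x, y in compared) / len(compared)


def identity_matrix(aligned: dict[str, str]) -> dict[tuple[str, str], float]:
    """Percent identity for every unordered pair of aligned sequences."""
    names = list(aligned)
    return {(p, q): percent_identity(aligned[p], aligned[q])
            for i, p in enumerate(names) for q in names[i + 1:]}


# Hypothetical aligned fragments with invented labels.
toy_alignment = {
    "seqA": "CEGC-KGF",
    "seqB": "CDGCAKGF",
    "seqC": "CEGCAKGF",
}
for pair, value in identity_matrix(toy_alignment).items():
    print(pair, round(value, 1))
```

Filling one triangle of a matrix with C-region values and the other with E-region values gives the layout described for panel C.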
The C region is a 66-68 amino acid domain that has been shown to function as a zinc finger DNA binding domain in vertebrate receptors. This domain has also been implicated in receptor dimerization (Kumar, V., and P. Chambon, 1988. The estrogen receptor binds tightly to its responsive element as a ligand-induced homodimer. Cell 55:145-156). As shown in Table 4A, all 19 C-region residues that are absolutely conserved in the other receptor homologues are also conserved in DHR3 and EcR, including the nine invariant Cys residues, eight of which coordinate two zinc ions (Freedman, L. P., B. F. Luisi, Z. R. Korszun, R. Basavappa, P. B. Sigler, and K. R. Yamamoto, 1988. The function and structure of the metal coordination sites within the glucocorticoid receptor DNA binding domain. Nature 334:543-546). As seen in Table 4C, the Drosophila C-region sequences (including those of E75A) are not more closely related to each other than they are to those from the vertebrate receptor homologues. The C region of DHR3 is most similar to that of the human retinoic acid receptor α (hRARα), and the C region of EcR is most similar to that of the human thyroid receptor β (hTRβ). Studies on the human glucocorticoid receptor (hGR) and human estrogen receptor (hER) have identified three C-region residues (indicated by dots in Table 4A) that are critical for determining the differential DNA binding specificity of these receptors (Mader, S., V. Kumar, H. de Verneuil, and P. Chambon, 1989. Three amino acids of the estrogen receptor are essential to its ability to distinguish an estrogen from a glucocorticoid-responsive element. Nature 338:271-274; Umesono, K., and R. M. Evans, 1989. Determinants of target gene specificity for steroid/thyroid hormone receptors. Cell 57:1139-1146). The three Drosophila proteins DHR3, EcR, and E75A, as well as the vertebrate receptors hRARα, hTRβ, and the human vitamin D receptor (hVDR), all have identical amino acids at these three positions; thus, these proteins may all have similar DNA binding specificities, as has already been shown for hRARα and hTRβ (Umesono, K., V. Giguere, C. K. Glass, M. G. Rosenfeld, and R. M. Evans, 1988. Retinoic acid and thyroid hormone induce gene expression through a common responsive element. Nature 336:262-265).
The E region is an approximately 225 amino acid domain that functions as a hormone-binding domain in vertebrate receptors. This domain has also been implicated in hormone-dependent receptor dimerization (Kumar, V., and P. Chambon, 1988. The estrogen receptor binds tightly to its responsive element as a ligand-induced homodimer. Cell 55:145-156; Guiochon-Mantel, A., H. Loosfelt, P. Lescop, S. Sar, M. Atger, M. Perrot-Applanat, and E. Milgrom, 1989. Mechanisms of nuclear localization of the progesterone receptor: evidence for interaction between monomers. Cell 57:1147-1154), hormone-dependent nuclear localization of the glucocorticoid receptor (Picard, D., and K. R. Yamamoto, 1987. Two signals mediate hormone-dependent nuclear localization of the glucocorticoid receptor. EMBO J. 6:3333-3340), and binding of the glucocorticoid receptor to the 90 kDa heat shock protein (Pratt, W. B., D. J. Jolly, D. V. Pratt, S. M. Hollenberg, V. Giguere, F. M. Cadepond, G. Schweizer-Groyer, M. G. Catelli, R. M. Evans, and E. E. Baulieu, 1988. A region in the steroid binding domain determines formation of the non-DNA-binding, 9 S glucocorticoid receptor complex. J. Biol. Chem. 263:267-273). Table 4B shows an alignment of the E regions of the DHR3 and EcR proteins with those of other receptor homologues. The three relatively highly conserved stretches within this region noted in Experimental Example I are overlined; each contains a cluster of residues conserved in all or most of the receptor sequences. DHR3 and EcR show strong similarity to each other and to the other proteins in these stretches, and a lower similarity outside of them. The presence of this E-region homology establishes these proteins as bona fide members of the nuclear receptor family, in contrast to the Drosophila knirps (Nauber, U., M. J. Pankratz, A. Kienlin, E. Seifert, U. Klemm, and H. Jackle, 1988. Abdominal segmentation of the Drosophila embryo requires a hormone receptor-like protein encoded by the gap gene knirps. Nature 336:489-492), knirps-related (Oro, A. E., E. S. Ong, J. S. Margolis, J. W. Posakony, M. McKeown, and R. M. Evans, 1988. The Drosophila gene knirps-related is a member of the steroid-receptor gene superfamily. Nature 336:493-496), and egon (Rothe, M., U. Nauber, and H. Jackle, 1989. Three hormone receptor-like Drosophila genes encode an identical DNA-binding finger. EMBO J. 8:3087-3094) proteins, which show C-region homology but no E-region homology. The E region in DHR3 is most similar to that of E75A, and the E region of EcR is most similar to that of hTRβ, although the level of these similarities is lower than those found among E regions of many other receptors (Table 4C). Thus, DHR3 and EcR are not especially close homologues of any previously cloned receptors. Comparison of E-region sequences allows division of the nuclear receptors into subfamilies (Petkovich, M., N. J. Brand, A. Krust, and P. Chambon, 1987. A human retinoic acid receptor which belongs to the family of nuclear receptors. Nature 330:444-450), the members of any one subfamily being more related to each other than to those in other subfamilies. The DHR3 and EcR receptors fall into a subfamily with the E75B, hRARα, hTRβ, and hVDR receptors.
D. In Situ Labeling of the EcR and DHR3 Proteins with Antibodies Induced by Proteins Produced in E. coli
To determine the intracellular and tissue distribution of the EcR and DHR3 proteins in Drosophila, affinity-purified polyclonal antibodies directed against those proteins were produced in the following manner.
The region of about 120 amino acid residues located between the conserved DNA-binding and hormone-binding domains of these proteins was used as the immunogen to produce antibodies against each protein. Thus, the coding sequences for amino acids 335-447 of the EcR protein and for amino acids 164-289 of the DHR3 protein (see Tables 2 and 3, respectively) were cloned into the appropriate pATH (Dieckmann, C. L., and A. Tzagoloff, 1985. J. Biol. Chem. 260:1513-1520) or pUR expression vectors, so as to fuse these coding sequences to those encoding E. coli tryptophan E protein (trpE) or E. coli β-galactosidase (βgal), respectively.
The βgal fusion proteins were produced in E. coli by the addition of the IPTG inducer to exponential cultures, while the production of trpE fusion proteins was induced by dilution into tryptophan-free media and subsequent addition of indoleacetic acid. For EcR, the trpE fusion protein was used as an immunogen and the βgal fusion protein was used on immunoblots to test sera for immunoreactivity to the EcR portion of the fusions. For DHR3, the βgal fusion protein was injected, and sera were checked against the trpE fusion protein.
For immunization, the appropriate fusion protein was prepared by electrophoresis on SDS-PAGE gels and visualized by staining in ice-cold 0.25 M KCl, after which the fusion protein band was cut out. Approximately 100 µg of fusion protein in 0.25 ml of gel slice was crushed by passing through successively smaller hypodermic needles, and mixed with 0.25 ml of a sterile saline solution and 0.5 ml of Freund's complete adjuvant.
For each immunogen, two New Zealand White rabbits were injected at multiple intramuscular sites, and after one month, boosted at two-week intervals, omitting the Freund's adjuvant. While the Bgal fusion proteins were subject to the above gel electrophoresis without prior purification, the trpE fusion proteins were first purified by the following method which takes advantage of their insolubility in vivo.
E. coli from a two-liter culture of induced cells were washed, and the cell pellet was subjected to several freeze/thaw cycles. The cells were resuspended in 18 ml of 50 mM Tris-HCl, pH 7.5, 0.5 mM EDTA, and 1.8 ml of mg/ml lysozyme was added. After 15 minutes on ice, the cells were lysed by passing three times through a French pressure cell at 10,000 psi. The insoluble fraction was collected by centrifugation at 27,000 x g for 15 minutes, and washed by resuspension, using a Dounce homogenizer, in ice-cold 50 mM Tris-HCl, 0.5 mM EDTA, 0.3 M NaCl, followed by centrifugation as above. The washing step was repeated, and the final pellet dissolved in 10 ml of 4 M urea, 2% SDS, 50 mM Tris-HCl, pH 7.5, 1 mM EDTA, 5% 2-mercaptoethanol. Material remaining insoluble was centrifuged out and discarded.
The antisera was affinity purified in a two-step procedure by successive passage through "nonspecific" and "specific" affinity columns. In the case of antibodies raised against the trpE fusion proteins, the nonspecific column consisted of resin coupled to the insoluble protein derived from E. coli expressing unmodified trpE protein, and was used to remove antibodies directed against trpE epitopes, as-well as against insoluble E.
coli protein impurities. The specific column consisted J of resin coupled-to the EcR-trpE fusion protein (purified as described above) and was used to absorb the desired antibodies directed against the EcR epitopes, antibodies that were subsequently released from the column. In the case of antibodies raised against the Bgal fusion proteins, the same general procedure was used, except that the resin in the nonspecific column was coupled to 8-galactosidase, while that in the specific column was coupled to the DHR3-Bgal fusion protein. Western blot analysis of the appropriate E. coli extracts demonstrated that these affinity-purified antibodies exhibited the desired specificity.
The intracellular distribution of the EcR protein in late third instar salivary glands was examined by in situ labeling of this protein with the anti-EcR antibody. The EcR protein was thereby shown to be highly localized in the nuclei of these glands. Indeed, when the polytene chromosomes in these nuclei were examined by the antibody-labeling method of Zink and Paro (Zink, B. and R. Paro, 1989. Nature 337:468-471), specific loci within these chromosomes exhibited strong binding of the EcR protein. In particular, the EcR protein was bound to the early puff loci, including those occupied by the E75 and E74 genes. This is the result expected if the ecdysone receptor encoded by the EcR gene is that which induces the transcription of the early genes, as anticipated by the Ashburner model. Another prediction of the Ashburner model is that the ecdysone-receptor complex initially represses the genes responsible for the late puffs, so that the transcription of the late genes induced by the early gene proteins is delayed until these proteins accumulate sufficiently to overcome this initial repression. If the EcR receptor is involved in this postulated initial repression, then one would expect the EcR protein to bind to the late puff loci in the salivary glands. This expectation was met by the observation that EcR protein also binds to the late puff loci in the polytene chromosomes.
Additional in situ antibody labeling experiments demonstrated that the EcR protein is present in the nuclei of all ecdysone target tissues examined in late third instar larvae. It is also present in most, if not all, cells during embryogenesis and other stages of Drosophila development that have been examined. In this respect, the EcR protein was not detected by anti-EcR antibody labeling of embryos in which the EcR gene was eliminated by a chromosomal deletion, further demonstrating the specificity of this antibody.
In contrast to the widespread distribution of the EcR protein, anti-DHR3 antibody labeling of embryos demonstrated that the distribution of the DHR3 protein is highly restricted during this stage of development.
During the brief embryonic period of expression, the protein is restricted to the peripheral nervous system, and to cells surrounding the spiracles at the posterior end of the embryo.
Finally, it should be noted that affinity-purified antibodies against the E75A protein have also been prepared by the same technique described above for anti-EcR and anti-DHR3 antibodies. In situ antibody labeling of the E75A protein in larval salivary glands has also demonstrated that this protein is localized in the nucleus and is bound to specific loci in the polytene chromosomes.
EXAMPLE III
THE ECDYSTEROID-BINDING, DNA-BINDING AND GENETIC REGULATORY PROPERTIES OF THE EcR PROTEIN DEMONSTRATE THAT IT IS AN ECDYSONE RECEPTOR.
The following experiments demonstrate that the protein encoded by the EcR gene is an ecdysone receptor by the following three criteria. The EcR protein binds ecdysteroids and accounts for a large proportion, if not all, of the ecdysteroid-binding activity present in Drosophila embryos and in a variety of cultured Drosophila cells. The EcR protein binds with high specificity to a DNA sequence that functions as an ecdysone response element (EcRE), an enhancer that confers ecdysone inducibility to a promoter. Cells that do not respond to ecdysone because they lack functional ecdysone receptors are transformed to the ecdysone-responsive state by transfection with an EcR expression plasmid.
A. The EcR Protein Binds Ecdysteroids
The EcR expression plasmid, pMTEcR, shown in Figure 1, contains the open reading frame encoding the EcR protein (EcR ORF; see Experimental Example II) fused to the Drosophila metallothionein promoter (PMT) at its 5' end, and the polyadenylation-cleavage sequences of the Drosophila Actin 5C gene at its 3' end. Because transcription of the EcR ORF is under the control of this metallothionein promoter, that transcription is induced by Cu²⁺ ion to yield an mRNA that, in turn, produces the EcR protein.
A cell line, MtEcRHy, that overproduces this protein upon Cu²⁺ induction, as determined by Western blot analysis using the affinity-purified anti-EcR antibody (see Experimental Example II), was constructed by the stable integration of the pMTEcR plasmid DNA into the genome of the Drosophila Sch-2 cell line. A control cell line, MtHy, was similarly constructed by the integration of the expression vector DNA lacking the EcR ORF.
Whole cell extracts were prepared from both the MtEcRHy and MtHy cell lines after Cu²⁺ induction, and were assayed for ecdysteroid-binding activity using the high affinity ecdysone analogue [125I] iodoponasterone A. The MtEcRHy extract contained sevenfold more saturable ecdysteroid-binding activity than the MtHy control extract.
To see if the induced ecdysteroid-binding activity was due to the EcR polypeptide itself, the EcR protein was depleted from the MtEcRHy extract by immunoprecipitation using an affinity-purified anti-EcR polyclonal antibody, or, as a control, the extract was mock-depleted with preimmune serum. The treated extracts were then assayed for ecdysteroid-binding activity.
Comparison of the immuno-depleted extract with the mock-depleted extract showed that most of the binding activity was removed by the anti-EcR antibody treatment, indicating that the induced ecdysteroid-binding activity results from the EcR protein.
The endogenous ecdysteroid-binding activity in the control cell line, MtHy, was unchanged by Cu²⁺ exposure, and was approximately the same as that in the Sch-2 cell line from which it derives. The question arises as to whether the endogenous activity in these and other Drosophila cell lines, as well as in embryonic extracts, results from the expression of the EcR gene in their respective genomes. To answer this question, extracts from embryos and several cell lines were immuno-depleted and mock-depleted, as described above, and assayed for ecdysteroid-binding activity. Again, comparison of these treated extracts showed that the large majority of the endogenous binding activity was removed in each case by treatment with the anti-EcR antibody. Thus, it appears that most, if not all, of the endogenous binding activity in embryos and cell lines results from the resident EcR gene.
Methods
Extracts
Tissue culture cell extracts for hormone and DNA-binding experiments were prepared as follows. Cells were grown in spinner flasks to a density of 5-7x10⁶ cells/ml, and were washed once in EcR buffer (25 mM Hepes, pH 40 mM KCl, 10% glycerol, 1 mM EDTA, 1 mM dithiothreitol, and the following cocktail of protease inhibitors: 10 mM Na2S2O5, 500 µM PMSF, 1 µM leupeptin, 1 µM pepstatin). All further manipulations were at 4°C.
Cells were resuspended in EcR buffer at 2% of the original culture volume, divided into 3 ml aliquots, and sonicated using 30 half-second pulses with a probe sonicator (Branson Sonifier 450), resulting in disruption of approximately 95% of the cells. After centrifugation at 100,000 x g for 1 hour, 100 µl aliquots of supernatant were frozen in liquid nitrogen and stored. Protein concentration was determined using bovine serum albumin as the standard, and was typically 6-11 mg/ml.
Embryo extracts were prepared by a similar protocol: 3-6 hour Canton S embryos were dechorionated in commercial bleach for 2 minutes, washed extensively in 0.7% NaCl, and resuspended using 2 grams of embryos per ml of EcR buffer. Embryos were broken with 20 strokes in a Dounce homogenizer using a B pestle, and lysis was completed with the probe sonicator using the same settings as used for the tissue culture cells. The extract was adjusted to 400 mM KCl, centrifuged 1 hour at 100,000 x g, and aliquots of supernatant were frozen. This extract contained 13.4 mg/ml protein. Before use in hormone binding, it was diluted tenfold in EcR buffer lacking KCl to bring the final KCl concentration to 40 mM.
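By way of illustration only, the tenfold dilution described above can be checked with a short calculation. The function name `dilute` and its parameters are hypothetical and are not part of the specification; the sketch assumes simple volumetric mixing of one part extract with nine parts diluent.

```python
def dilute(stock_conc, dilution_factor, diluent_conc=0.0):
    """Concentration after mixing 1 part stock with (dilution_factor - 1)
    parts of a diluent that contains the solute at diluent_conc."""
    return (stock_conc + (dilution_factor - 1) * diluent_conc) / dilution_factor

# Embryo extract: 400 mM KCl, 13.4 mg/ml protein, diluted tenfold
# into EcR buffer lacking KCl (values taken from the text).
kcl_mM = dilute(400.0, 10)
protein_mg_per_ml = dilute(13.4, 10)

print(f"KCl after dilution: {kcl_mM:.1f} mM")
print(f"Protein after dilution: {protein_mg_per_ml:.2f} mg/ml")
```

The same tenfold dilution that brings the KCl to 40 mM also brings the protein to about 1.3 mg/ml, the concentration stated for the embryo extract in the hormone-binding assays.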
Hormone-binding assays
For hormone-binding experiments, extracts were first diluted to the following concentrations in EcR buffer: 0.9 mg/ml for MtHy and MtEcRHy extracts, 3 mg/ml for S2 and SRS 1.5 extracts, 4 mg/ml for the Kc cell extracts, and 1.3 mg/ml for the embryo extract. All manipulations were done on duplicate samples in order to quantify variability in the results. For immunoprecipitation experiments, extracts were immuno-depleted, mock-depleted, or left untreated. For depletions, 300 µl of diluted extract was incubated for 30 minutes at 25°C with ml affinity-purified anti-EcR antibody, or with µl preimmune serum for the mock-depletion control. Then 38 µl of 10% Staphylococcus aureus (Pansorbin, Calbiochem) in EcR buffer was added, and incubation was continued for 15 minutes at 25°C. After centrifugation for 3 minutes in a microcentrifuge, the supernatant (depleted extract) was recovered. The immunoprecipitation was repeated, except in the case of the embryo extract, which was subjected to only one round of precipitation. The "untreated" extract aliquots were left at 4°C for the duration of the depletion procedure, and were diluted with EcR buffer to match the final concentration of the depleted aliquots.
A modification of the hormone-binding assay of P. Cherbas was used (Cherbas, P. 1988. Proc. Nat'l Acad. Sci., U.S.A. 85:2096-2100). Assay tubes contained 140 µl extract, 14 µl [125I] iodoponasterone, and either 14 µl EcR buffer or 14 µl unlabelled 20-OH ecdysone in EcR buffer as a competitor. [125I] iodoponasterone was 2177 Ci/mM and was used at a final concentration of 5x10⁻¹⁰ M in the assay; 20-OH ecdysone was 2x10⁻⁵ M final concentration in the assay. After incubation for 1 hour at 25°C, each reaction was spotted on a dry Whatman GF/C filter (2.4 cm), and after 30 seconds the filter was washed by using a vacuum to draw 10 ml EcR buffer through the filter over a period of 1 minute. Filters were placed in 800 µl 4% SDS, and radioactivity was measured in a γ counter. The hormone-binding activities shown are saturable binding activities, calculated as the total binding activity, as measured in assays with no added competitor, minus the unsaturable binding activity, which was measured in the assays with excess unlabelled ecdysone added. In the most active extracts, the unsaturable activity (representing the large number of low affinity binding sites in the extract) was less than 10% of the total activity.
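The calculation of saturable binding described above (total binding minus binding in the presence of excess unlabelled competitor, on duplicate samples) may be sketched as follows. The function names and the counts shown are hypothetical and serve only to illustrate the arithmetic; they are not data from the specification.

```python
from statistics import mean

def saturable_binding(total_cpm, unsaturable_cpm):
    """Mean counts without competitor minus mean counts with excess
    unlabelled competitor, each measured on replicate samples."""
    return mean(total_cpm) - mean(unsaturable_cpm)

def unsaturable_fraction(total_cpm, unsaturable_cpm):
    """Share of total binding that is not competed by unlabelled hormone."""
    return mean(unsaturable_cpm) / mean(total_cpm)

# Hypothetical duplicate gamma-counter readings (cpm).
total = [10200.0, 9800.0]        # no competitor added
with_competitor = [850.0, 750.0]  # excess unlabelled 20-OH ecdysone

specific = saturable_binding(total, with_competitor)
fraction = unsaturable_fraction(total, with_competitor)
print(f"saturable binding: {specific:.0f} cpm")
print(f"unsaturable fraction: {fraction:.0%}")
```

With these illustrative numbers the unsaturable fraction is below the 10% figure given in the text for the most active extracts.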
B. Genetic Regulatory Activity of the EcR Protein in vivo
An ecdysone-inducible reporter plasmid, pEcRE/Adh/Bgal (Figure ), was constructed to test the regulatory functions of the EcR protein in vivo. The reporter gene in this plasmid consists of the sequence that encodes the E. coli B-galactosidase (Bgal ORF) linked through the 5' leader sequence of the Drosophila Ultrabithorax gene (UBX leader and AUG) to an ecdysone-inducible promoter. This promoter was created by fusing a truncated version of the proximal promoter for the Drosophila Adh gene (PDAdh-34+53, the numbers indicating that it consists of the sequence from base pair positions -34 to +53, which just includes the TATA box) to seven repeats of a 34 bp synthetic oligonucleotide (7 EcRE OLIGOS) which contains the ecdysone response element (EcRE) from the ecdysone-inducible heat shock gene hsp 27 (Riddihough and Pelham, 1987. EMBO J. 6:3729-3734). The seven EcREs should confer ecdysone-inducibility to the truncated promoter, provided that the cells transfected with this reporter plasmid contain the appropriate ecdysone receptor.
This ecdysone-inducible reporter plasmid was constructed by insertion of the 7 EcRE OLIGOS into plasmid pAdh/Bgal, which is identical to pEcRE/Adh/Bgal except that it lacks the array of ecdysone response elements. The pAdh/Bgal plasmid should therefore not be ecdysone inducible and can serve as a control. To test these expectations, Sch-2 cultured cells (which were shown above to contain endogenous ecdysone-binding activity) were transfected with each plasmid and examined for B-galactosidase activity in the presence and absence of ecdysone. The ecdysone-induced B-galactosidase activity in the pEcRE/Adh/Bgal transfected cells was 2000-fold greater than when such cells were not exposed to ecdysone, whereas ecdysone had little effect on the pAdh/Bgal transfected cells. These results indicate that the EcREs confer ecdysone-inducibility on the PDAdh-34+53 promoter, as expected, and that the Sch-2 cells contain functional ecdysone receptors.
To test the function of the EcR receptor in such a system, host cells lacking functional ecdysone receptors are required. "Ecdysone-resistant" cells lacking ecdysone-binding activity, and hence, presumably functional receptors, can be produced by continuously exposing ecdysone-responsive cells to ecdysone during a period of several weeks. This ecdysone-resistant state is then maintained in ecdysone-free media for several months. An ecdysone-resistant cell line, SRS 1.5, was therefore generated by growing Sch-2 cells in 3x10⁻⁶ M ecdysone. The SRS 1.5 cells lack significant ecdysone-binding activity.
When these cells were transfected with the pEcRE/Adh/Bgal plasmid and subsequently exposed to ecdysone, very little ecdysone-induced B-galactosidase activity was observed, indicating that the cells have only trace amounts, if any, of functional receptors. To test whether the expression of the EcR gene can "rescue" this deficiency, the SRS 1.5 cells were cotransfected with two plasmids: the ecdysone-inducible reporter plasmid, pEcRE/Adh/Bgal, and a constitutive expression plasmid for the EcR gene, pActEcR, in which transcription of the EcR ORF is controlled by the Drosophila Actin promoter, PA5C (Figure ). Cotransfection with these two plasmids, followed by exposure to ecdysone, resulted in a dramatic induction of B-galactosidase activity. Thus, introduction of this EcR expression plasmid into the SRS 1.5 cells regenerated the ecdysone-inducibility they had lost.
Methods
Construction of the pAdh/Bgal, pEcRE/Adh/Bgal and pActEcR plasmids
Plasmid pAdh/Bgal was constructed in two steps. The BglII-ScaI fragment of pDA5'-34, containing nucleotides -34 to +53 of the Drosophila Adh distal promoter, was cloned into pUC18 cut with ScaI and BamHI. The resulting plasmid was cut with EcoRI, and the EcoRI fragment of cPBbxd6.2 (containing the Ubx untranslated leader and AUG, the Bgal open reading frame, and the SV40 splice and poly A signals) was inserted.
To construct pEcRE/Adh/Bgal from pAdh/Bgal, two 34-residue oligonucleotides were synthesized: 5'TCGAGAGACAAGGGTTCAATGCACTTGTCCAATG3'. These will anneal to form 30 bp duplexes with SalI-compatible four-nucleotide overhangs at their 5' ends, as shown. Further annealing via the 5' overhangs allows formation of tandem arrays that can be inserted into pAdh/Bgal at its SalI site just upstream from the TATA box of the truncated Adh promoter. When these oligonucleotides were kinased, annealed, ligated into SalI-cut pAdh/Bgal and cloned, pEcRE/Adh/Bgal was obtained. Restriction mapping showed that it contained a tandem array of seven 34 bp repeats, each of which contains the 23 bp ecdysone response element (EcRE) present in the hsp 27 gene, the remaining 11 bp representing flanking hsp 27 sequences and the overhangs.
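The geometry of the annealed oligonucleotides described above can be illustrated with a short sketch. Only one 34-residue sequence appears in the text as reproduced here; the partner strand computed below is an assumption derived solely from the stated design (a 30 bp duplex with four-nucleotide 5' overhangs on each strand) and should not be read as the sequence actually synthesized. All variable and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.translate(COMPLEMENT)[::-1]

# The 34-residue oligonucleotide given in the text.
TOP = "TCGAGAGACAAGGGTTCAATGCACTTGTCCAATG"

overhang = TOP[:4]      # 5' four-nucleotide overhang
duplex_core = TOP[4:]   # 30 nt that pair with the partner strand

# Assumed partner strand: the same 5' overhang followed by the
# reverse complement of the 30 nt core.
bottom = overhang + reverse_complement(duplex_core)

# A self-complementary overhang lets duplexes join end to end.
overhang_is_palindromic = reverse_complement(overhang) == overhang

# Length of a tandem array of seven 34 bp repeats.
array_bp = 7 * len(TOP)

print(bottom, overhang_is_palindromic, array_bp)
```

Because the TCGA overhang is its own reverse complement, annealed duplexes can pair with one another as well as with a SalI-cut vector, which is consistent with the tandem arrays described in the text.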
The constitutive EcR expression plasmid, pActEcR, was formed by inserting the FspI-HpaI fragment of an EcR cDNA containing bp 851-4123 that contains the ORF encoding the EcR protein (Table ), into the EcoRV site of the ActSV40BS plasmid. This expression vector was constructed in two steps by inserting the XbaI-EcoRI fragment of cosPneoB-gal, containing the SV40 splice and poly A signals, into BlueScript+KS (Stratagene) cut with SacII and XbaI, blunting the EcoRI and SacII ends. The resulting plasmid was digested with BamHI and ApaI, and the BamHI-EcoRI fragment of pPAc was inserted, with the ApaI and EcoRI ends being blunted.
Transfection and generation of the cell line SRS
The cell line SRS 1.5 was obtained by growing Schneider line 2 (Sch-2) cells in the presence of 3x10⁻⁶ M ecdysone (Sigma). This treatment initially halts growth of Sch-2 cells, but after several weeks the adapted cells grow well. SRS 1.5 cells were washed in hormone-free medium and passed several times in hormone-free medium prior to their use in transfection experiments. Cells were transfected by the calcium phosphate technique. Cells were transfected with 10 µg of each plasmid used; when only a single plasmid was being transfected, 10 µg of pUC18 DNA was added as a carrier. In general, all transfections were carried out in duplicate. Twenty-four hours after transfection, cells that were to undergo hormone treatment were split into two dishes, one of which was treated with 2x10" M ecdysone.
B-galactosidase assays
Forty-eight hours after transfection, 2 ml of cells were washed once in PBS (137 mM NaCl, 27 mM KCl, 65 mM Na2HPO4, 15 mM KH2PO4, pH ) and were resuspended in 50 µl of 0.25 M sucrose, 10 mM Tris, pH 7.4, 10 mM EDTA, and repeatedly frozen in liquid nitrogen and thawed in a 37°C water bath for a total of 3 freeze/thaw cycles. Cell debris was removed by a 10-minute centrifugation in a microcentrifuge at 4°C. The concentration of protein in the supernatant (cell extract) was determined by the Bradford method, with bovine serum albumin as a standard, and was typically 1.5-2.5 mg/ml. Extracts were assayed immediately or frozen and assayed up to two weeks later with no loss in activity. To 10 µl of extract, or an appropriate dilution, 500 µl of assay buffer was added (0.6 mM 4-methylumbelliferyl-B-D-galactoside, 60 mM Na2HPO4, 40 mM NaH2PO4, 10 mM KCl, 1.0 mM MgSO4, pH ). After a 30-minute incubation at 37°C, reactions were stopped with 500 µl of 300 mM glycine, 15 mM EDTA, pH 11.2. The fluorescent reaction product was quantified on a Perkin-Elmer LS-5B luminescence spectrometer, with λex=365 nm and λem=450 nm. Bgal activities are given as fluorescence units per µg protein assayed.
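As an illustration of how activities expressed as fluorescence units per µg protein, and the fold inductions derived from them, might be computed, a minimal sketch follows. The function names and all numerical readings are hypothetical; the sketch assumes that a diluted extract is described by the factor by which it was diluted before 10 µl was assayed.

```python
def bgal_specific_activity(fluorescence_units, extract_ul,
                           protein_mg_per_ml, dilution=1.0):
    """Fluorescence units per microgram of protein assayed.

    mg/ml is numerically equal to ug/ul, so volume (ul) times
    concentration (mg/ml) gives micrograms; a prior dilution of
    the extract reduces the protein assayed by that factor.
    """
    protein_ug = extract_ul * protein_mg_per_ml / dilution
    return fluorescence_units / protein_ug

def fold_induction(with_hormone, without_hormone):
    """Ratio of specific activities with and without ecdysone."""
    return with_hormone / without_hormone

# Hypothetical readings for 10 ul of a 2.0 mg/ml extract.
uninduced = bgal_specific_activity(400.0, 10.0, 2.0)
# Hypothetical reading for the same extract diluted 100-fold.
induced = bgal_specific_activity(8000.0, 10.0, 2.0, dilution=100.0)

print(uninduced, induced, fold_induction(induced, uninduced))
```

Normalizing to protein assayed before taking the ratio allows extracts of different concentration, or assayed at different dilutions, to be compared directly.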
C. Specific Binding of the EcR Protein to Ecdysone Response Elements
The simplest explanation of the results described in the preceding section is that the EcR protein generated by the EcR expression plasmid binds to the EcRE of the reporter plasmid and, in combination with ecdysone, activates the minimal Adh promoter in that plasmid. The following experiment was designed to test whether the EcR protein exhibits specific binding to this EcRE in vitro.
Two plasmids were used: pUC18, which serves as the control, and pUC18-EcRE, which was generated by substituting the HindII-XbaI fragment from pEcRE/Adh/Bgal that contains the seven repeats of the 34 bp EcRE oligonucleotide, for the HindII-XbaI fragment of pUC18. Because the only difference between these two fragments is the seven oligonucleotide repeats, this is also the only difference between the two plasmids.
The two plasmids were digested with ApaLI and Hind III, end-labeled with 32P and mixed with an extract from MtEcRHy cells in which the EcR protein was overexpressed by Cu²⁺ induction (see section A, above). After a 15-minute incubation at 25°C to allow EcR-DNA binding to occur, affinity-purified anti-EcR antibody was added. The 25°C incubation was continued for an additional 40 minutes, at which time anti-rabbit Ig-coated magnetic beads (Dupont Magnasort-R) were added, and the incubation continued 15 minutes more. The beads were separated from the solution magnetically, similarly washed, and the DNA eluted from the beads in 1% SDS at 65°C. The eluted DNA was ethanol precipitated and fractionated by electrophoresis in an agarose gel, which was dried and autoradiographed.
Only the fragment containing the EcRE oligonucleotide was specifically and efficiently registered on the autoradiographs, and that registration was dependent upon the anti-EcR antibody. Quantitative analysis of the autoradiographs demonstrated a 10³-fold preference for binding to the EcRE oligonucleotide over the average vector sequences, under the conditions of this assay (see Methods, below).
According to the criteria stated at the beginning of this Experimental Example, the EcR protein clearly satisfies the definition of an ecdysone receptor.
Methods
Conditions for the DNA binding assay
A quantity of 0.2 fmole of digested, labelled plasmid DNA was mixed with 2 µg (dI/dC) in 10 µl of TE ( mM Tris-HCl, pH 8.0, 1 mM EDTA), and 90 µl of the MtEcRHy extract, diluted to 0.9 mg/ml in EcR buffer adjusted to 180 mM KCl, was added. After binding for minutes at 25°C, 2 ml of affinity-purified anti-EcR antibody, diluted 1.5x in EcR buffer, was added, and this incubation was continued at 25°C for 40 minutes, when 50 µl of anti-rabbit Ig-coated magnetic beads (Dupont Magnasort-R), exchanged into 180 mM KCl EcR buffer, was added and the incubation continued for 15 minutes.
The beads were washed twice in 400 µl 180 mM KCl EcR buffer, and DNA was eluted from the beads by soaking twice in 200 µl 1% SDS in TE at 65°C. The eluted DNA was ethanol precipitated and run on an agarose gel, which was dried and autoradiographed. As controls, one half of the input DNA (0.1 fmole) was run on the gel for comparison, and the binding assay was carried out, leaving out the antibody.
EXAMPLE IV RECEPTOR GENE MUTAGENESIS.
Mutations in the steroid receptor superfamily genes can alter their function in two ways. Most obviously, they can alter the sequences encoding the receptor proteins and thus alter the receptor function. Alternatively, they can alter the expression of these genes, an alteration that can be at any level of that expression, from transcription of the gene to the translation of its mRNA(s). Such mutations can change the timing of gene expression during development or change the tissue and cell distribution of that expression, thus profoundly changing the course of development. Furthermore, these mutations provide information about the regulation of receptor gene expression, just as mutations that alter the structure of the receptors encoded by these genes provide information about the genes whose expression these receptor proteins control. In particular, mutations that alter receptor gene expression can lead to the identification of the proteins and other regulatory molecules that control their expression. Clearly, mutagenesis of insect steroid receptor superfamily genes provides an important avenue leading to an ability to interfere in a highly specific manner with insect development, and thus to control insect infestations deleterious to human health and agriculture.
We have carried out mutagenesis experiments for two Drosophila members of the steroid receptor superfamily genes, E75 and EcR, that we have cloned and characterized with respect to their expression. In this experimental example, mutagenesis of the E75 gene is described.
A. Deletion Mutations
In Drosophila, genetic analysis for a given locus (in this case, the early puff locus at 75B that houses the E75 gene) generally depends upon the isolation of deletions of all or part of that locus. Such deletions greatly facilitate the subsequent isolation of point and other small mutations within the locus. By isolating mutations that are revertants to the neighboring dominant Wrinkled mutations, we have isolated and molecularly mapped the boundaries within our chromosomal walk (see Experimental Example I) of two deletions, WR4 and WR10, generated by gamma ray mutagenesis, the preferred way of generating such large alterations of genomic structure.
One of these, WR10, extends distally from Wrinkled to cover the entire E75 gene; and the other, WR4, extends to a point about 90 kb upstream of the 5' end of the 50 kb transcription unit and does not include the gene.
An F2 screen was then employed to screen for gamma ray-induced mutations mapping to the 200 kb distal region that is included in the WR10 deletion but not the WR4 deletion. This screen resulted in the isolation of five members of a single lethal complementation group that molecular mapping data demonstrate represents the gene. The most useful of these five mutations is the E75 48 mutation. Molecular mapping of this mutation demonstrated that it is a deletion of a 105 kb region that includes all of the E75 gene. This method provides an extremely efficient screen for other E75 mutations, by screening for mutations that cannot complement this deletion mutation.
B. E75 Mutations Generated by Ethyl Methane Sulfonate
The chemical mutagen ethyl methane sulfonate (EMS) was used for this screen, as it is the preferred method for generating point or small mutations. An F2 screen of 15,000 lines resulted in the isolation of 23 penetrant mutations within the 105 kb region of the E75 48 deletion, all of which turned out to be alleles of E75. It appears that this 105 kb region was saturated by this screen with respect to lethal complementation groups, and hence E75 appears to be the only lethal complementation group in this region. Adding the five E75 mutations described above, a total of 28 penetrant E75 alleles have thus been isolated, several of which are temperature-sensitive alleles.
Inter se complementation studies among these alleles and examination of their phenotypes reveal a complex complementation group, a complexity that probably results from the fact that the E75 gene contains two overlapping transcription units, a 50 kb E75A unit and a kb E75B unit that occupies the 3' end of the E75A unit (see Experimental Example I and Table ). These alleles can be roughly divided into two groups: those that cause lethality in early development, during the latter part of embryogenesis or during early larval development, and those that cause lethality late in development, during the prepupal or pupal stages.
This division correlates with the stages when the E75A and E75B units are expressed. Thus, E75A transcription is associated with each of the six pulses of ecdysone, including those that mark the embryonic and early larval stages. By contrast, E75B mRNAs are not observed until the end of the last larval stage, being particularly abundant during the pupal stage. This correlation invites the speculation that the early lethal mutations affect the expression of the E75A unit and its E75A protein, and that the late lethal mutations specifically affect the expression of the E75B unit and its E75B protein. This proposition can be tested by detailed molecular mapping of these mutations and further examination of their phenotypes at the molecular level to determine the causes of lethality.
The mutants described here provide a foundation for the further genetic analysis of the E75 gene that will allow exploration of the requirements for appropriate expression and function and will identify structural and functional domains of E75. Some of the future studies will best be performed by its in vitro manipulation, followed by transformation of the constructs back into Drosophila. Finally, it will be desirable to identify interacting genetic loci, interactions that may occur at the level of regulation of expression or at the level of interaction of the proteins with those encoded by other genes. Such interactive genetic loci can be identified via the isolation of mutations that act as suppressors or enhancers of the E75 mutations.
Methods
Strains, markers and chromosomes
For this aspect of the invention, the following strains, markers and chromosomes were used. Tu 2 was described by Lindsley (Lindsley, 1973. DIS 50:21). All other strains and mutations are as described (Lindsley, and Grell, 1968. Genetic Variation of Drosophila melanogaster, Publication 627, Carnegie Institute of Washington, Washington, DC). ru h WR4 es ro ca was constructed by recombination between ru h WR4 sbd 2 Tu 2 and st sbd 2 e ro ca. The st in ri pp sbd 2 chromosome was constructed by recombination of st in ri p' with sbd 2 in order to allow marking of this chromosome over WR4 and WR10, and homozygosed by crossing to TM3, backcrossing to TM3, and mating of isogenic sibling progeny. The st pP ell line was homozygosed by standard ionic procedures.
AntpW and ns R 4 are described in Scott et al. (1984) Proc. Nat'l Acad. Sci. USA 81:4115-4119. The pupal lethals X19, g26, 013B, 8m12, iX-14, 2612, m45, p4, mz416, 13m115, 052 and wq49 are described in Shearn S. (1974) Genetics 77:115-125. All strains used to construct the strains described above and other strains were obtained from the Bowling Green and Caltech stock centers.
TM1, TM3 and TM6B (Lindsley, and Grell, 1968. Genetic Variation of Drosophila melanogaster, Publication 627, Carnegie Institute of Washington, Washington, DC) are balancer chromosomes carrying recessive lethal mutations along with multiple inversions to suppress recombination. This allows the maintenance, as a heterozygote, of a recessive lethal chromosome in its original state. These chromosomes are also marked with convenient visible markers.
Quantitative Southern blot mapping for detection of mutant lesions
DNA was prepared from adult flies (about 50) by douncing in 1 ml of 10 mM Tris-HCl, pH 7.5, 60 mM NaCl, mM EDTA, 0.15 mM spermine, 0.2 mg/ml proteinase K. The homogenate was added to an equal volume of 0.2 M Tris-HCl, pH 9.0, 30 mM EDTA, 2% SDS, 0.2 mg/ml proteinase K, incubated at 37°C for 1 hour, and then extracted twice with buffer-saturated phenol and once with 24:1 chloroform/isoamyl alcohol. DNA was EtOH precipitated twice, hooking the pellet out without centrifugation. Southern blot hybridization was as described (Segraves, W. et al., 1984. J. Mol. Biol. 175:1-17). Where restriction fragment length polymorphism was not used in order to distinguish the parental chromosome from the balancer chromosome, quantitation of band intensity on genomic Southerns was achieved using a scanning densitometer. By using a control probe outside the mutant region, the amount of DNA in each track was internally controlled. Comparison of deficiency heterozygote to wild type bands, when normalized to a control band in this way, gives little deviation from the expected 1:2 ratio.
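The normalization described above, in which each band is divided by a control band from the same track before the mutant and wild type tracks are compared, may be sketched as follows. The function name, the densitometer readings and the decision threshold are hypothetical and are given only to illustrate the arithmetic.

```python
def normalized_ratio(test_band, test_control, wt_band, wt_control):
    """Band intensity in the test track relative to wild type,
    after each is normalized to a control band in the same track."""
    return (test_band / test_control) / (wt_band / wt_control)

# Hypothetical scanning-densitometer readings (arbitrary units).
ratio = normalized_ratio(test_band=480.0, test_control=1000.0,
                         wt_band=1000.0, wt_control=1000.0)

# A deficiency heterozygote carries one copy of the region instead
# of two, so a ratio near 0.5 is expected; 0.75 is an arbitrary
# illustrative midpoint between one copy (0.5) and two copies (1.0).
looks_like_deficiency_heterozygote = ratio < 0.75

print(f"{ratio:.2f}", looks_like_deficiency_heterozygote)
```

Dividing by the control band corrects for unequal loading between tracks, so that the comparison reflects copy number rather than the amount of DNA applied.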
Molecular cloning of mutant lesions
Restriction fragments of the appropriate size were isolated by preparative low melting agarose (FMC) electrophoresis of about 20 µg of restricted genomic DNA. The 6 kb WR4 XhoI fragment was cloned into XhoI-cleaved λSE6ΔBam, which is propagated as a plasmid in order to grow the vector and cannot be packaged without an insert. The 18 kb WR10 SalI fragment was cloned into the SalI site of λEMBL3, cleaved also with EcoRI for the biochemical selection method for the prevention of propagation of non-recombinant clones. The 7 kb EcoRI fragment containing the x37 breakpoint was cloned into EcoRI-cleaved λ607. Plating of recombinants on the hflA strain RY1073 prevented plaque formation by non-recombinant phage. The 14 kb x48 EcoRI fragment was cloned into the EcoRI site of λEMBL4, which had been cleaved with BamHI to utilize the "biochemical selection" for recombinants. The breakpoint fragments of x44 and the recipient fragment were cloned into λSE6ΔBam. Libraries were packaged using λ in vitro packaging extracts prepared as described in Hohn (Hohn, 1979. Methods Enzymol. 68:299-303). After demonstration that each of the libraries gave a significant number of plaques only when inserts were included in the ligation, they were screened using restriction fragments capable of detecting the breakpoint clones.
Gamma ray mutagenesis
Adult males of the strain ru h W sbd 2 Tu 2 or st in ri ~P sbd were irradiated in plastic vials with 5000 rad of gamma rays from a ¹³⁷Cs source at a dose rate of 4300 rad/minute. The irradiated males were then mated to virgins of the appropriate strain, which were allowed to lay eggs for five days.
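The exposure time implied by the stated dose and dose rate can be worked out with a short calculation. This is a sketch for orientation only; the function name is hypothetical and the specification itself does not state an exposure time.

```python
def exposure_seconds(total_dose_rad, dose_rate_rad_per_min):
    """Return the irradiation time in seconds for a given total dose.

    The dose rate is given in rad per minute, so dividing the total dose
    by the rate gives minutes, which are then converted to seconds.
    """
    if dose_rate_rad_per_min <= 0:
        raise ValueError("dose rate must be positive")
    return total_dose_rad / dose_rate_rad_per_min * 60.0
```

With the values given above, `exposure_seconds(5000, 4300)` comes to roughly 70 seconds.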
EMS mutagenesis
The primary lesion in EMS-induced mutations of bacteria and yeast is an alkylation-induced transition of guanine to adenine; most EMS-induced point mutations in Drosophila can similarly be explained on this basis.
This change would be expected to convert, on the complementary strand, a C in the opa repeat element to a T, creating an in-frame stop codon (CAGCAA to UAGCAA or CAGUAA). (Ethylnitrosourea, ENU, which has been reported to yield a higher number of mutations for a given amount of sterility, is also an alkylator; however, considerably more stringent precautions must be taken in handling this mutagen.) EMS was administered at 0.025 M to unstarved 1.5-5 day-old males in 1% sucrose solution (1.5 ml on two slips of Whatman #2 in a 350 ml milk bottle). Starvation of the males for 8 hours before EMS administration resulted in unacceptable levels of sterility, and males of the st E' e 1 strain readily fed upon the EMS/sucrose solution without starvation. Mutagenesis was monitored by crossing mutagenized males to attached-X FMA3 females.
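The reasoning about stop codons in the opa repeat can be checked with a small enumeration. The sketch below is illustrative only: the function name is hypothetical, and it assumes the input is a coding-strand DNA sequence that starts in frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def c_to_t_stop_mutants(coding_dna):
    """List single C-to-T mutants that contain an in-frame stop codon.

    Each C on the coding strand is replaced by T in turn (the coding-strand
    result of a G-to-A transition on the complementary strand). The mutant
    is split into codons from position 0, and mutants with a stop codon
    are returned as mRNA (T written as U).
    """
    mutants = []
    for i, base in enumerate(coding_dna):
        if base != "C":
            continue
        mutant = coding_dna[:i] + "T" + coding_dna[i + 1:]
        usable = len(mutant) - len(mutant) % 3  # ignore a trailing partial codon
        codons = [mutant[j:j + 3] for j in range(0, usable, 3)]
        if any(codon in STOP_CODONS for codon in codons):
            mutants.append(mutant.replace("T", "U"))
    return mutants


print(c_to_t_stop_mutants("CAGCAA"))  # prints ['UAGCAA', 'CAGUAA']
```

Both C-to-T changes in the CAGCAA unit give a stop codon, matching the two outcomes named in the text.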
Other mutants seen in this screen included a large number of ca alleles (many mosaic) seen over TM6B in the F1 and F2 generations, a dominant brown allele, and two new mutants: Wink, a third chromosome dominant mutation resembling Bar, and a third chromosome dominant Curly-like mutation. Wink is easily scored (RK1), has complete penetrance, and is quite healthy over TM6B.
In the initial screen, vials were scored as mutant if they had fewer than 25% as many deficiency heterozygote as balancer heterozygote flies. On retesting, this was revised to 50% of the level seen in control crosses. Balancer heterozygotes were approximately two-thirds as viable as deficiency heterozygotes.
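The two scoring criteria can be written out as simple predicates. This is a sketch under stated assumptions: the function names and the fly counts in the usage note are hypothetical, and the retest rule is read here as comparing the vial's deficiency-to-balancer ratio with 50% of the same ratio in control crosses.

```python
def is_mutant_initial(deficiency_het, balancer_het, threshold=0.25):
    """Initial screen: fewer than 25% as many deficiency heterozygotes
    as balancer heterozygotes scores the vial as mutant."""
    return deficiency_het < threshold * balancer_het


def is_mutant_retest(deficiency_het, balancer_het, control_ratio, threshold=0.5):
    """Retest: the deficiency/balancer ratio in the vial must fall below
    50% of the ratio observed in control crosses (assumed reading)."""
    if balancer_het <= 0:
        raise ValueError("balancer heterozygote count must be positive")
    return deficiency_het / balancer_het < threshold * control_ratio
```

If balancer heterozygotes are about two-thirds as viable as deficiency heterozygotes, the control ratio is about 1.5, so a vial with 20 deficiency and 40 balancer heterozygotes (ratio 0.5, below 0.75) would be scored as mutant on retest, while one with 35 and 40 (ratio 0.875) would not.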
In situ hybridization and cytological analysis
In situ hybridization of polytene chromosomes was carried out as described in Experimental Example I (see Methods). Cytological analysis was performed by squashing larval salivary glands in lactoacetic orcein (orcein, 50% acetic acid, 30% lactic acid).
Although the present invention has been described in some detail by way of illustration and example for purposes of clarity of understanding, it will be obvious that certain changes and modifications may be practiced within the scope of the claims.
A
ACTT ACT AGTGAAAAACATGAT AAT AAACAACTT GCCAAAAAAAAT CCAATGAAATT GACA CTT ATGTT AAAAAAATAGGTGACATTGTAACCGTTGAT GT ACACTT ACGAAGTACGT AACAAGTT CAT GA -141.
ACTGATTTCGTGAGCAGGTT-CCATAATCGCCGTATCTGTGGATCGCGCGCTCCTCCTCGCACTCGC
TGCGTCGATCGCAGCACATGTTCGAAGTGCGACAGAGTGCAAAGCGCAGAGCGCCGACGTCCACGCCGAA
+1
AAAACTCAACAAGATCCGCCCCCGAATGTTGATTTTCCTTTCATTGACTAACTGCCACTCGCAGCGCGCAG
mRNA start site ATCCTCCGCTCCGCTTGTTCCGTTCCGTTCGTTTCGTTTCGTTTCCTTCGATCT ACTTCGAGTCGCGAGT TTTAACCACTGTAGTGAGTGCCCCGTGAAAAGGATAACCCAAAAAGTGATTT CT ACTATTTT CCAATAGT +211
TTTTATCAGTGTGAAGAAAACATGTAAACTTGCCAAAAAGGGCTTTAAAAGATACAAAGCTTCAATGC
GAAGCAT AAAAT AAT AT CCCACCAGT CCT TCAAAAACCAAAACT AT CCCT AAGGCT CGAAAT TT AAAT TA
AAATTTTTTTAATAAATATTCCAAAAATATTCCCCTGAAAACGTTTATAAACCCCCAACCGAGCAAA
380 ATG TTA ATO TCC GC( MET Len MET Ser Ala 3 GAC AGT TCA GAT AGI Asp 5cr 5cr Asp Ser CGCC AAC ACT TCT GTG ATC TGC AGC Ala Lys Thr 5cr Val le Cys
ACC
Thr
GCA
Ala 542
CTA
Leu
ATG
MET
CTG
Len
GTG
Val
CCA
Pro
CCG
Pro
GCC
Gly
GAT
Asp
ACT
Scr
CCC
Pro
CCA
Pro
CAG
Gin
ACT
CC
Ala
ATT
le
AAC
Asa
AAT
Asn
GTC
Val
AC&
5cr
TTC
Lcu
ACA
71C
TAC
Tyr
CCA
Pro ATC CTA MET Len CCC CTA Cly Val
ACT
5cr
AAT
Asn
CCC
Pro
CTT
Val
CCC
Pro
ACA
Thr
CCC
Pro
COT
Gly
GCA
LAI
ACA
7br
TCC
5cr
CAA
Gin
GT
Gly
CAC
H-is
CCC
Pro
CCA
Pro
GGT
Cly
CCT
Ala
TTC
Phe
GTC
Val
AGC
5cr
ACT
Tbr
CCA
Pro
CGA
Arg
TAC
Tyr
ACC*
Arg A CC
CGC
Gly
GCA
Ala CCT CCA Ala Pro TCT CAC 5cr His TAC CAC Tyr Gib AGC CTG 5cr Len ATC GCG Met Ala GTG ATC Val le GTG TTG Val Len
GAA
Gin
CTC
Len
CAC
His
GTA
Val
ACT
5cr
TTT
Phe
CAA
Gin
CAC
Gin
GAA
Gin
AAC
Asn
GCA
Ala
TCT
5cr
GTC
Val
CAG
Gin
CCC
Pro
AAT
Asn
AGC
5cr
CCT
Pro
TCC
Ser
AGC
Scr
CAC
Gin
AGC
Ser
CC
Ala
AAG
Lys
GTC
Val
AAT
Asn
AAA
Lys
CAG
Gin
ACC
I'lr
CTG
Len
CTG
Len
ACA
Tic
TCT
5cr
TCG
5cr
CCC
Pro
ACA
AAA
Lys 54
GCC
Gly
CAT
Asp
CCC
Pro 108
ACT
Ser
CAG
Gin AAC TCC TCC GTC AAG CTC Asn 5cr 5cr Val Lys Leu CCC GTC ACC ACC ACC GAT Ala Val 5cr Th r Tbr Asp 812 CAd CAA ATG CCC CAd CAC TTC GAG TCC .CTG CCC CAC CAC CAC CCC CAd CAG GAA Gin Gin MET Pro Gin HMs Phe Gin 5cr Leu Pro His His His Pro Gin Gin Glu 162 CAC CAG CCA CAG CAG CAG CAG CAA CAA CAT CAC CTT CAG CAC CAC CCA CAT CCA His Gin Pro Gin Gin Gin Gin Gin Gin His His Leu Gin His His Pro His Pro CAT GTG ATG TAT CCG CAC GGA TAT CAG CAG GCC AAT CTG CAC CAC TCO GGT GGT His Val MET Tyr Pro His Gly Tyr* Gin Gin Ala Asn Leu His His Ser Gly Gly ATT GCT GTG GTT CCG GCG GAT TCG CGT le Ala Val Val Pro Ala Asp Scr Arg CCC CAG ACT CCC GAG TAC ATC AAG TCC Pro Gin Thr Pro Glu Tyr fle Lys Ser 216 TAC CCA GTT ATO GAT ACA ACT GTG GCT AGT TCG GTA AAO GGG GAA CCA GAA CTC Tyr Pro Val MET Asp Thr Mur Val Ala Ser Ser Val Lys Gly Glu Pro Glu Leu GTGAGTTGTG intron TTCTTTGCAG 1082 AAC ATA GAA TTC GAT GGC ACC ACA GTG CTG TGC CGC OTT TGC COG GAT AAG GCC Asn le Gin Phe Asp Gly Thr Thr Val Leu Cys Arg Val Cys Gly Asp Lys Ala GTAAGTTCGT intron ATCGTTTCAG TCC GOT TTC CAT Set Gly Phe His TAC GOC OTG CAT TCC TOC GAG GOT TGC AAG GGA TTC TTC CGC Tyr Gly Val His Set Cys Glu Gly Cys -Lys Gly Phe Phe Arg 270 COC TCC ATC CAG CAA AAG ATC CAG TAT COC CCG TOC ACC AAG AAT CAG CAG TOC Arg Ser le Gin Gin Lys le Gin Tyr Arg Pro Cys Thr Lys Asn Gin Gin Cys AGC ATT CTG COC Se le -Leu Arg 9* ATT 0CC OTO GGC fle Ala Val Gly 1352 OAA AAG GCG CGT Gln Lys Ala Arg ATC AAT CGC AAT COT TOT CAA TAT TOC CGC CTG AAA AAG TGC le Asn Arg Asn Arg Cys Gin Tyr Cys Mg Lcu Lys Lys Cys GTGAOTACCT intron 3. CCAATTGCAG, ATG AGT CGC GAT OCT GTG COT TTT OGA COC GTG-CCG AAG, COC MET Ser Arg Asp Ala Val Arg Phe Gly Mrg Val Pro Lys Mrg 324 ATC CTG GCG GCC ATG CAA CAG AGC ACC CAG AAT CGC GGC CAG le Lcn Ala Ala MEr Gin Gin Ser Thi Gin Asn Arg Gly Gin CAG CGA 0CC CTC GCC ACC GAG CTG GAT GAC CAG CCA CGC CTC CTC 0CC 0CC GTG Gin Arg Ala Leu Ala Thr Glu. 
Leu Asp Asp Gin Pro Pig Len Leu Ala Ala Val CTO COC 0CC CAC Len Mrg Ala His CTC GAG ACC TOT GAG TTC ACC* AAG GAG AAG GTC TCG OCG ATO Len Gin Mwr Cys Gin Phe Thr Lys Gin Lys Val Ser Ala MET 378 GTAAGTCTCA intron 4. ATTTCTTCAG.
COG CAG COGG CO COG OAT TGC CCC TCC TAC TCC ATG CCC ACA CTT CTG 0CC TOT Mg Gin Mg Ala Mrg Asp Cys Pro Ser Tyr Set MEr Pro Thr Len Len Ala Cys CCG CTG AAC CCC 0CC CCT GAA CTG CAA TCG GAG CAG GAG TTC TCG CAC COT TTC Pro Leu Asn Pro Ala Pro Glu Lcu Gin Ser Glu Gin Glu Phe Ser Gin Mrg Phe 1622 GCC CAC GTA ATT CGC GGC GTG ATC GAC TTT GCC GOC ATG ATT CCC GOC TTC CAG Ala HMs Val Ile Mrg Gly Val Ile Asp Phe Ala Gly MET Ile Pro Gly Phe Gin 432 CTG CTC ACC CAG GAC GAT AAC TTC ACO CTC CTG AAC GCG GGA CTC TTC GAC 0CC Leu Leu Thr Gin Asp Asp Lys Phe Thr Leu Lcu Lys Ala Gly Leu Phe Asp Ala CTG TTT OTO. CGC CTO ATC TGC ATG TTT GAC TCG TCG ATA AAC TCA ATC ATC TOT Leu Phe Vai Arg Leu le Cys METPhe Asp 5cr 5cr Hie Asn Ser fle Ile Cys CTA AAT 0CC CAG OTO ATO CGA CGG OAT OCO ATC CAC AAC OGA 0CC AAT 0CC COC Lcu Asn Cly Gin Val MET Mg Mrg Asp Ala Ile Gin Asn Gly Ala Asn Ala Mrg 486 TTC CTG GT0 CAC TCC ACC TTC AAT TTC OCO GAG CCC ATO AAC TCO ATG AAC CTG Phe Leu Val Asp Ser Tbr Phe Asn Phe Ala (flu Mrg MET Asn Ser MET Asn Leu 1892 ACA OAT 0CC GAG ATA GOC CTG TTC TOC 6CC ATC OTT CTG ATT ACG CCG OAT COC Tbr Asp'Ala Glu Bie Giy Leu Phi Cys Ala Ile Val Leu le 'lhbr Pro Asp Mrg CCC GOT TTO CGC AAC CTG GAG CTC ATC GAG AAG ATO TAC TCO CGA CTC AAC GOC Pro Oly Lcu *Mrg Asn Lcu Olu Lu Ile Glu Lys MET Tyr 5cr Mrg Leu Lys Gly 540 TGC CTO CAG TAC ATT GTC 0CC CAG AAT AGO CCC OAT CAG CCC GAG TTC CTG 0CC Cys Len Gin Tyr le Val Ala Gin Asn Mrg Pro Asp Gin Pro Giu Phe Lcu Ala AAG TT0 CTG GAG~ ACO ATO CCC OAT CTC COC ACC Cr0 AGC ACC CT0 iCAC ACC GAG Lys Len Leu Gin Thr MET Pro Asp Len Mg 7br Leu Ser Thr Leu His Tir Gln AAA CTO OTA GTT TTC CCC ACC GAG CAC AAO GAG Cr0 CTG CCC CAG CAG ATO TG Lys Len Val Val Phe Mrg TflN Glu Hfis Lys Gin Len Leu Mrg Gin Gin MET Trp 594 2162 TCC ATO GAG GAC 0CC AAC AAC AOC OAT GCC CAC CAC AAC AAO TCG CCC TCC GGC MET Glu Asp Gly Asn Asn 5cr Asp Gly Gin Gin Asn Lys 5cr Pro 5cr Oly :::AGC TOG C OAT 0CC ATO GAC CrC GAG OCO GCC AAG AGT CCC CTT GOC TCG OTA Ser Trp Ala Asp Ala MET Asp 
Val Oiu Ala Aia Lys 5cr Pro Leu Gly Ser Val TCO AGC ACT GAG TCC CCC GAC Cr0 GAC TAC GOC AOT CCO AGC. AGT TCC CAC CCA Scr 5cr Mr Gin 5cr Ala Asp Leu Asp Tyr Oly Scr Pro 5cr 5cr Ser Gin Pro 648 CAG GOC Cr0 TCT Cr0 CCC TCC CCG CCr CAG CAA CAG CCC TCG GCr Cr0 0CC AGC Gin Oly Vai 5cr Len Pro 5cr Pro Pro Gin Gin Gin Pro 5cr Ala Len Ala *:TCC OCT CCT CTG CTG GCG GCC ACC CrC TCC GGA OGA TOT CCC Cr0 CGC AAC COO Ala- Pro Leu Leu. Ala Ala Thr Lcu 5cr Oly Oly. Cys Pro Leu Mrg Asn Mrg 2432 GCC AAT TCC CCC TCC AOC GOT GAC TCC GGA OCA OCT GAG ATG GAT ATC GTT GGC *Ala Asn 5cr Gly 5cr 5cr Gly Asp 5cr Oly Ala Ala Oiu MET Asp lie Val Gly 702 TCG CAC GCA CAT CTC ACC CAG AAC GGC CTG ACA ATC ACG CCC ATT CTG CGA CAC His Ala Is Leu Thr Gin Asn Gly lceu Th Ile Thr Pro le Val Arg His GTAGTATCTT. intron 5. TTTCTTACAC CAG CAG CAG CAA CAA CAG CAG CAG CAG ATC OCA ATA CTC AAT AAT GCG CAT TCC Gin Gin Gin Gin Gin Gin Gin Gin Gin Ile Gly f3e Leu Msn Msn Ala His Ser CGC..AAC TTG AAT GGG GGA CAC GCG ATG TGC CAG CAA CAG CAG CAC CAC CCA CAA Mrg Asn Lcu Asn Cly Giy His Ala MIET Cys Gin Gin Gin Gin Gin His Pro Gin 756 G (Dm4925) CTG CAC CAC CAC TTG ACA GCC GGA GCT GCC CGC TAC AGA AAC CTA GAT TCC CCC Lcu His His His Leu Tbr Ala Gly Ala Ala Mrg Tyr Mrg Lys Lcu Asp 5cr Pro Arg 2702 ACG GAT TCG GGC ATT GAG TCG GCC AAC GAG AAG AAC GAG TGC AAG C GTG ACT Thr Asp 5cr Giy Ile Glu 5cr Cly Msn Glu Lys Aszi Gin Cys Lys Ala Val Ser TCG GGG GGA.ACT TCC TCG TCC TCC ACT CCG CGT TCC ACT GTG GAT CAT GCG CTG Ser Gly Gly Ser Ser 5cr Cys Ser 5cSr -Pro Mrg Ser Ser Val Asp Asp Ala Lcu 810 GAC TGC AGC CAT CCC GCC CCC AAT CAC AAT CAC CTG GTG CAC CAT CCC CAC CTC Asp Cys 5cr Asp Ala Ala Ala Asn His Asn Gin Val Vai Gin His Pro Gin Leu ACT CTG GTC TCC CTC TCA CCA CTT CCC TCC CCC CAG CCC TCC ACC ACC ACC CAT Ser Val Val 5cr Val 5cr Pro Val Mg Ser Pro Gin Pro 5cr Thr Scr 5cr His CTC AAG CGA CAC ATT CTC GAG CAT ATG CCC CTC CTG AAG CCC CTC CTC CAC CCT Leu Lys Mrg Gin le Val Gin Asp MET Pro Val Lcu Lys Arg Val Leu Gin Ala 864 2972 :::CCC 
CCT CTC TAC CAT ACC AAC TCC CTG ATC CAC GAG CC TAC AAG CCC CAC AAC Pro Pro Leu Tyr Aspmr~ Asn 5cr Lcu MET Asp Glu Ala Tyr Lys Pro His Lys *AAA TTC CCC CCC CTG CCC CAT CCC GAG TTC GAG ACC CCC GAG GCC CAT CCC AC :Lys Phe Mrg Ala- Len Mrg His Mrg Giu Phe Clu Thr Ala. Gin Ala Asp Ala ACT TCC ACT TCC CCC TCG AAC ACC CTC ACT CC GGC ACT CCC CCC CAC ACC CCA 5cr Thr 5cr Gly 5cr Asn 5cr Leu 5cr Ala Gly 5cr Pro Mrg Gin 5cr Pro 918 GTC CCC AAC ACT CTG CCC ACG CCC CCC CCA TCC CC CCC AGC CCC CCC CCA CGT Val Pro Asn 5cr Val Ala Thr Pro Pro Pro 5cr Ala Ala 5cr Ala Ala Ala Cly .AAT CCC CCC CAG ACC CAG CTC CAC ATC CAC CTC ACC CCC AGC ACC CCC AAC CC Asn Pro Ala Gin 5cr Gin Leu His MET His Leu Thr Mrg 5cr 5cr Pro Lys Ala 3242 TCG ATG GCC AGC TCG CAC TCG GTG CTG CCC AAG TCT CTC ATG GCC GAG CCG CGC Ser MET Ala Ser 5cr is 5cr Val Leu Ala Lys 5cr Lcu MET Ala Giu Pro Arg 972 ATG ACG CCC GAG CAG ATG AAG CGC AGC GAT ATT ATC CAA AAC TAC TTG AAG CGC MET 7hr Pro Gln Gin MIET Lys krg 5cr Asp Ile le Gin Asn Tyr Lcu Lys krg GAG AAC AGC ACA GCA GCC AGC AGC ACC ACC AAT GGC GTG CCC AAC CGC ACT CCC Giu Asn Scr Thr Ala Ala Scr 5cr ThY Thr Asa Gly Val Gly Asn krg 5cr Pro AGC AGC AGC TCC ACA CCG CCG CCG TCG GCG GTC CAG AAT CAG CAG CGT TGG GGC Ser 5cr 5cr Ser Thr Pro Pro Pro 5cr Ma Val Gin Asn Gin Gin krg Trp Cly 1026 AGC AGC TCG GTG ATC ACC ACC ACC TGC CAG CAG AGC CAG CAG TCC GTG TCG CCG Scr Scr 5cr Val le Thr 'Thr Thr Cys Gin Gin krg Gin Gin Scr Val 5cr Pro 3512 CAC AGC AAC OCT TCC AGC TCC ACT TCG AGC TCT AGC TCC ACC TCC ACT TCG TCA Hfis 5cr Asn Cly 5cr 5cr Scr 5cr Ser 5cr 5cr Ser Scr Ser 5cr 5cr Scr TCC TCC TCC ACA TCC TCC AAC TCC ACC TCC AGC TCG CCC AGC AGC TGC CAG TAT 5cr 5cr Thr 5cr 5cr Asn Cys 5cr 5cr 5cr 5cr Ala 5cr 5cr Cys Gin Tyr 1080 TTC CAC TCG CCC CAC TCC ACC AGC AAC GGC ACC ACT GCA CCG CC AGC TCC ACT Phe Gin 5cr Pro H1is 5cr Tir Scr Asn Cly Thr 5cr Ala Pro Ala 5cr 5cr TCG CGA TCG AAC ACC CCC ACO CCC CTG CTG GAA CTG CAC GTG CAC ATT CCT CAC Scr Cly Scr Asn 5cr Ala TINu Pro Len Lcu Gin Leu Gin 
Val Asp le Ala Asp TCG GC CAG CCT CTC AAT TTG TCC AAG AAA TCC CCC ACC CCC CCC CCC ACC AAC Ala Gin Pro Len MAs Len 5cr Lys Lys 5cr Pro Thr Pro Pro Pro 5cr Lys 1134 3782 CT CAC GCT CTG GTG CCC CCC CCC AAT CCC CTT CAA ACG TAT CCC ACA TTG TCC Lcu His Ala Len Val .Ala Ala Ala Asn Mla Val GIn krg Tyr Pro nur Leu CC GAC CTC ACA GTC ACA CCC TCC AAT CCC CCC TCC TCC GTC GGC CCC CCC GAG Ala Asp Val Thr Val Thr Ala 5cr Asn Cly Cly 5cr 5cr Val Gly Gly Gly Clu STCC CCC CCC CAG CAG CAG TCC CCC CCC GAG TCT CGG CTC CCC CAA TCC CCC CCTr 5r Cly krg Gin Gin Gin 5cr Ala Gly* Gin Cys Gly Len Pro Gin 5cr Gly Pro *1186 :::GAG CCC CCC CGT GCA CAA CGT AAT GCT GGA CCC GTA AGA CC GGA CCA CGT AG Gin krg kg krg Ala GIn Gly Asn Ala Gly Cly Val krg Ala Gly Gly Cly krg TGC TTT TAC CC GAG AAG TGC GAG 'AGA CAG AGA CTC GCA CTG CCA CTT CAC CCA Trp Phe Tyr Ala. Glu Lys .Trp Gln krg Gin krg Leu Gly Val. Ala Val Gin krg 4052 ACC AGO AAC CAG CAT CAC TTG GAG CCC CCC GAG TTC AAT TAA.
krg Lys Gin Asp is Leu Gin krg kg Gin Len Asn 1237 R *4
ATTATTTTACCATTTAATTGAGACGTGTACAAAGTTTGAAGCAACCAACATGCATCATTTAAAAC
TAATATTTAAAGCAACAACAAACAAAACAACTACAAGTTATTAATTTAAAAAACAAACAAACAAACAAAC
4234 AACAAAAAACCCAAGCTTGAATGGTATTACAAAAGJAAAAAGAAAAACAGAAAAAATATAAAT
ATATTTTA
GCAGTTAAACTTT AACGTAGCAAGAAACCAACAAACCCAAGGCAGC
GCTCTGATTTCGCATTAACTTTTC
4374
TTCAGCTGCTACCGAAAACGCCCCTCACCTCCCCCCCACCCAACCCTTCCTCCACACACCAACCGTCTTT
CGACCCCTGATTGTTTTATAAGTTTTAAGCTCTTGTTGTACATATTAATTACGTTTATTGGTAACTATGT
4514
TTAGCGCTTTAGTTGTAGTTGGAGCAAAACTACTTTGCTTTTTTGGATGTTTTTTGAAAAAACTGCAAAT
TATTATTATTAAATTTTTAAATACCTAAAAACAAAACAATGTGTGTGAAATTTTTTATTGTGCGATCTCC
poly A site cDm4927 and cDm4928 4654
AAGCAGAATGAAGTGCAGTTTGCAACAAATTTTAACTACGATTAAGTTGATAACGATTCATTTTTTATGA
ATTT AACT AATTTT ATGAATTT GTTAT AGTTTTCCACCCTT CTAT AGAT CTTCT ATCT GAT CAT CT AGCT 4794 ACCCGjTATTCCTGATTTCTCCTTTGGCACAAAGCTCTTCTCTATGCTAAAGAATCAAGTGGAATAAATAT
TGTTTTCTAATTTTAAAACTACCACAAAAATACGATTAAAATATACACGAAGTAATGAAAATCAAACAAA
4934 ATGCTTAAAGTTTTAGCAGCAAGCAGTAAAACGACGAT
GAAGAAGAGAAACCCAACGTTAAATATATCTG
TTGTGTACAT AGTTAAATGTTAAATTAAACACAAAAACATATTTAAAGTACAT ATAAATACACATAATT A 5074
TTAATGAAGAAACCTATGCTTAAAAGATTCAATGTTTGATTGGCATCTTAGAAAACCAAGCGAAAAATAC
a a AAAAAAAAATCAACAAACAAAAATTATGATAT ATTATTTAAAAGTAAAGTATACATTTACATTACAGAAA 5214 *AAkCAAAAGAGAAAACTTGCGGTAGCAACAAAACTATTATATTAATT ACATTTTAATT ATGCTGT ACT ATT AT GATT ATT AATT ATTAT GAFTAATTAATT ACGATTTTT ATGCTT AGACAAACCAACAAAAAACAAAT AT
GCAAAAACCATTAAAAAAAAAAACAAAAAACAAGCAAAAAAT
putative polyadenylation signal for long transcripts
B
CGACGCGTTTGGAGTGAACGTCCTCAGTTGGCACACAAAAACAAAAACACAAAACGACAGCAACAACATC
-141
GGTGGGGGGAGTACGAGCGGGATGGGGGTAATGGGGGGCACCGGGGGAGTGGAGGCCGAGAGAGCCAGAG
AGCGACCCGAAGCAACACAACACCAACACGAGGCCCAAAAAGACACTTCGGCTGGGTT
CAGCTCGTGTTG
+1
CTCTGGGTCGTTTTGTATTGCTGGTGGACGCTGCTTTCATTCGCAAATTGCTCGTCGTTGGCAGCGGTTG
mRNA start site
TGCAGAGCAAGAAAAGCGCGCGAAAAACCAAGCAAAAAATTAATACAGCTGGATCAAGCGAAAGAGATAG
AGAGCAGAGTCAACAGCAACAAATGTTCAATAGCAAATGATATCGCATATTTTTGTTGGTGCCAGTGAAG
+211
TGAGATCAAAGTGAAGTGTGCAATGTTCCTTATTAGCAAATCGTAGAGCAACCAACAATCGAGAGTTCAA
284 GTGTCATTTCGAAGCCAAAAAGCAAAATCTCTAATTCAAAT ATG GTT TGT GCA ATG CAA MET Val Cys Ala MET Gin a a.
302
GAG
Gin
CAG
Gin
CTG
Leu
CAT
mis
CAG
Gin 572
CTG
Len
CTG
L.eu
GCT
LAI
CAG
HMn
ACG
CCG
Pro
GCC
Ala
TCG
Ser
CAG
GIn
GCT
Ala
CAG
Gin
GGC
Gly
ATG
MET
AAG
Lys
GCG
Ala
CAG
GIn
GTG
Val
CAG
Gin
AAT
Asn
CAT
Hlis
AGC
Ser
CCC
Pro
CAG
Gin
CAG
Gin
CAG
Gin
GC
Gly
CAG
GIn
CAA
Gin AtC Bie
CAG
Gin
CAT
His
CAG
Gln
GGC
Gly
CTC
Lcu
CAG
Gin
AAG.
Lys
CAA
Gin Gin
ACA
GGT
Gly
CAC
mis
CTG
Len
CAG
Gin
CCG
Pro
CAG
Gin
CAG
Gin
CTG
Len
CAG
Gin
CAA
Gin
CAG
Gin
CGC
Arg
CAA
Gin
CAG
Gin
CAC
His
CAT
His
CAA
Gin
CAG
Gin
AAA
Lys
CAG
Gin
CAA
Gin
ATT
lie
CAG
Gin
CAC
His
ACG
Tit
AGA
Arg
CAA
Gin
CAT
is
GTC
Val
CAT
His
TCG
ScC Pro
CTG
LCU
CTC
Len
GCA
Ala
GCC
Ala
CAG
Gin
GCG
Ala
AAG
Lys
AAA
Lys
CAG
Gin
ACA
1br
ACA
CAT
mis
CTG
Len
CAA
Gin.
AAC
Asn
TTG
Len
ACG
Thr
CCG
Pro
CAG
Gin
GTC
Val
ATT
Ile
GAA
Gin
CCC
Nro
ATA
lie
CAA
Gin
CAC
His
AAG
Lys
GTT
Val
GCA
Ala
CAG
Gin 24
OTO
Val
CAG
Gin
CAG
Gin 78
TTG
Len
TAC
Tyr
GCA
Ala 132 ATC GTA CAA CAG CAA CAA CAA ACA. CCT GCA ACA CTA GTA AAG ACA ACA ACC ACC le Val Gin Gin Gin Gin GIn Thr Nro Ala Mr Len Val Lys Thr Mr Thr Mur AGC AAC AGC AAC AGC AAC AAC ACC CAG ACA ACA AAT AGT ATT AGT CAG CAG CAA Ser Asn Ser Asn 5cr Asn Asn Thr Gin Thr Tin' Asn 5cr lie Scr Gin Gin Gin CAG CAG CAT CAG ATT GTG TTG CAG CAC CAG CAG CCA GCC GCG GCA GCA ACA CCA Gin Gin His Gin lie Val Leu Gin Hfis Gin Gin Pro Ala Ala Ala Ala Thr Pro 186 842 AAG CCA TGT GCC GAT CTG AGC GCC AAA AAT GAC AGC GAG TCG GOC ATC GAC GAG Lys Pro Cys Ala Asp Leu 5cr Ala Lys Aso Asp Ser Giu 5cr Gly le Asp Giu GAC TUC CCC AAC AGC GAT GAG GAT TG C CCC AAT GCC AAC CCG GCG GGC ACA TCG Asp Cys Pro Asn Scr Asp Glu Asp Cys Pro Asn Ala Asn Pro Ala Gly Thr CTC GAG GAC AGC AGC TAC GAG CAG TAT CAG TGC CCC TGG AAG AAG ATA CGC TAT Lecu Giu Asp Ser Ser Tyr Giu Gin Tyr Gin Cys Pro Trp Lys Lys Ble Arg Tyr 240 GCG CGT GAG CTC CTC AAG CAG CGC GAG TTG GAG CAG CAG CAG ACC ACC GGA GCC Ala Mrg Glu Lcu Leu Lys Gin Arg Glu Let, Glu Gin Gin Gin Thr Mr Gly Gly AGC AAC GCG CAG CAG CAA GTC GAG GCG AAG CCA GCT GCA ATA CCC ACC AGC AAC Scr Asni Ala Gin Gin Gin Val .Giu Ala Lys Pro Ala Ala le Pro Mir Ser Asn 1112 ATC AAG CAG CTG CAC TGT GAT ACT CCC TTT TCG GCG CAG ACC CAC AAC GAA ATC le Lys Gin Leu Hlis Cys .Asp Ser Pro Pite Ser Ala Gin Thr is Lys Git, fle 294 GCC AAT CTC CTG CGC CAA CAG TCC CAG CAA CAA CAG GTT GTG GCC ACG CAC CAC Ala Asn Leu Leu Mg Gin Gin Ser Gin Gin Gin Gin Val Val Ala Thr Gin Gin CAC CAG CAA CAG CAG CAG CAG CAC CAG CAC CAG CAA CAA CGA ACC CAT ACC TCC Gin Gin Gin Gin Gin Gin Gin is Gin is Gin Gin Gin Mg Mrg Asp Ser GAC AGC AAC TGC TCG CTG ATG AGC AAC TCG AGC AAC TCC AGT GCC GGC AAT TGT Asp Ser Asn Cys Ser Let, MIET Ser Asn Ser Ser Asn Ser 5cr Ala G ly Asn Cys 348 :TGC ACC TGC AAC GCT GGC GAC GAC CAG CAG CTG GAG GAG ATG GAC GAG GCC CAC Cys Thr. 
Cys Asn Ala Giy Asp Asp Gin Gin Lcu Ciu Git, MET1 Asp Glu Ala Hs *.1382 GAT TCG GGC TGC GAC GAT GAA CTT TGC GAG CAG CAT CAC CAG CGA CTG GAC TCC Asp Set Gly Cys Asp Asp Glu Leu Cys Git, Gin His His Gin Mrg Leu Asp Ser ::TCC CAA CTG AAT TAC CTG TGC CAG AAG TTC GAT GAG AAA CTG GAC ACG GCG CTG Gin Leu Asn Tyr Lcu Cys Gin Lys Phe Asp Giu Lys Lzu Asp Thr Ala Leu 402 AGC AAC AGC ACC GCC AAC ACG GOG AGG AAC ACG CCA GCT GTA ACA GCT AAC GAA 5r Asn 5cr 5cr Ala Asn Thr Gly Mg Asn Thr Pro Ala Val Thr Ala Asn Giu 1544 GAT GCC GAT gtaggmag Asp Ala Asp EcR eDNA 5534 bp 193 AAACCCTTCATCACCCACCCAACGTAACAAAACCGCACAATAACCTCCTTTCCCACGACC ACAAGTCCC CCTCTCGG AAATAAC C CCC TTC AT o 7 1105CC TACCCACACTGCGTCTCGAGGTrC ACGTCCTCCAGTAAGAC CCC TCTGCCACTCCGGAAATGATCG CTCGTGCCACGCTAAGCGC 1798 TCC CACAAC TGCGCATCAC AC CT CAT TCC CAA GATCCGTAATC TTTGGCGCC TCGCAA GC CATTC TAACT CAC CAC AGTCGAAATCrC 1291 CAC ACCCTC ATCACOCCCCC T ACGCCC TGACC TCCACGCAC CTCC C ACACA ATCC AC AT C AC ACO CC TOO CC AA CCC AATOCTC T Gin Lys Val lie Thr Leo Ala Mel His Cly Cys Set Se12 e r l i h h i i PoleAnCyAnAaA l 0 C 11384 CA CCC CC GAGC GAGTCC CA TAT CA TC CC TC CC TC AACC CCA CCC TTC CCC AC CC CO ATC CTC AATG CC CC C AT CCTC CAC Agy Cly Pro Thr Amu SCl Gin Tyr GuVal Pro Cly AleTr Asn Leo Gly la Leu Ala As Cly Melu Asn Cly SCl Pro ASc Cly eLc Gin 436 1477 CAA CAC ATT AC ATC CAC CCC CTC TOAAC TCC AOC ACGAGC TCA AG CC ACC ACC CCC TC CAC GC CATC C A TC CCC CCC CCC Gin Gin lie Tyiny Asp Gin Hisp Leu TpLiew Ama Set, Tr ThrPr Ser TiyScr PrThr Thr Pro Leu s His Ca Gin. 
Amu Lu Gly Ciy Ala 17 1270 CCC CCC CC ATC CCC GCC ATGC G AT TTCC AC ACT CAT CCC ACC CCA AA CCT ATC CTT C TC CCC CC C CC G CA CTA Cly cly Val le lbe LC Ala Met CH i Leo is His Ala Ama LCly Pro AnThai ly eo lie Prolye Vat Va ly Cly Ala Cly Cly As 198 163 CCG GA CACC CCA CC CAAA CT CCA CO CC CT AATG CAG COAC ACA CTCC A ACC COATTC CTC AAT TCT ATA TCTC A.T CCCAT CAT Cly LeGly Val Cly Cs ly Cy Tly Val Cio Cly LeAly MW in His 1hz ProLe AlaSe Asp Sl et VLe Asn Set le Scr 5cr C ly Aet Asp 229 1456 CAACTT C C CT C GCCC ACC AC TTC ACA OGAA CG C AC AA AC TCC ACC AAC AC CC AAC AAG CA CAT CC CCC CT GC C Aip Gin lie Pio Ase Set Hsetl Leo Il AsnTy Set Ala Amr Pro Ser Cys AP Ala Ls Lys Ser Lys Lys Gly Prn. Ala Pro AGl Vl Gin 260 157 GC GC GCGO AT GG GA TGGO AT CT*CA CA GG AT GCAC CC AT GCCTTAT GA OT T CCGAGCCGC CC G *GT GO CCT CG CCA CG GG *G CG CT CG CG CT CG T A A C C C G A T A C T C C.G A 115 GA CC TG CT CG GC GCTTGAACGG TA TG AA GC GCGATGCCAA*AA A* AC AG CA *T CGCCCCOGGT CA -I so 00 .0 7J GAG CTC TGC CTG GTT TGC CGC GAC AGC GCC TCC CCC TAC CAC TAC AAC CCC CTC ACC TOT GAG CGC TGC AAG GGG TTC TTT CGA CGC AGC Ciu Gbu Lcu Cys Lcu Val Cys Civ Asp Arg Ala Scr Gly Tyr His Tyr Asn Ala Lcu Thr Cys Ciu Giv Cys Lys Gly Pht Phc Arg Arg Ser 291 1942 OTT ACG AAG AOC GCC GTC TAC TGC TCC AAG TTC GGO CGC GCC TGC GAA ATG GAC ATG TAC ATG AGO CGA AAG TGT CAG GAG TGC CCC CTG AAA Val Thir Lys Ser Ala Val Tyr Cys Cys Lys Pise Giv Arg Ala* CYs Ciu Met Asn Met Tyr Met Arg Arg Lys Cys Gin Ciii Cys Arg Leu Lys 322 2035 AAC TGC CTG GCC CTC GCT ATC CCC CCG GAA TCC OTC GTC CCC GAG AAC CAA TOT GCG ATG AAC CGG CGC GAA AAC AAG CC CAC AAG GAG AAC Lys Cys LUu Ala Val Civ Met Arg Pro Ciii Cys Val Val Pro Glu Asn Gin Cys Ala Met Lys Arg Arg Ciii Lys Lys Ala Gin Lys Ciii Lys 353 2128 GAC AAA ATG ACC ACT TCG CCG AGC TCT CAG CAT GGC GGC AAT GGC AGC TTG CCC TCT GGT GGC GGC CAA GAC TTT GTT AAC AAG GAG ATT CTT Asp Lys Met Thr Thr 5cr Pro Set Ser Gin His Cly Cly Asn Giy 5cr Lcu Ala Ser Gly Cly Cly Gin Asp Phe Vl Lys Lys 
Ciu lie LCu 384 j gtaggg v. glacag I 2221 GAC CTT ATC ACA TUC GAG CCG CCC CAG CAT GCC ACT ATT CCG CTA CTA CCT GAT GAA ATA TTC CCC AAG TGT CAA GCG CGC AAT ATA CCT TCC Asp *Leu Met T1W Cys Ciu Pro Pro Gin His Ala TMr lic Pro Leu Leu Pro Asp Gbu lie Lcu Ala Lys Cys Gin Ala Arg Asn lie Pro Set 415 2314 TTA ACG TAC AAT CAG TTC GCC OTT ATA TAC AAG TTA ATT TOO TAC CAG CAT CCC TAT GAG CAC CCA TCT CAA GAG CAT CTC AGO COT ATA ATG Leo Thr Tyr Asn Gin Lcu Ala Val lie Tyr Lys Leo lie Tip Tyr Gin Asp Gly Tyr Glu Gin Pro Ser Ciu Glu Asp Lcu Arg Arg Ile Met 446 2407 AGT CAA CCC CAT GAG AAC GAG AGC CAA ACG CAC OTC ACC TTT COC CAT ATA ACC GAG ATA ACC ATA CTC ACC OTC CAG TTG ATT OTT GAG TTT Gin Pro Asp Glu Asn Ciii Ser Gin list Asp Vai 5cr Phe. Arg His Ile lisr Cli lie Thr Ile Lco Tnu Vol Gin Lcu Ile Vol Ciu Plic 477 (gtgagt.. cglagl 2500 CT AAA GOT CTA CCA CCC TTT ACA AAG ATA CCC CAC GAG GAC CAC ATC ACG TTA CTA AAG CCC TGC TCG TCC GAG TC ATC ATC CTG COT ATC
C
Ala Lys Civ Lcu Pro Ala Phe Thr Lys lic Pro Gin Clii Asp Gin Ilie Thr Lcu Leo Lys Ala Cys Ser 5cr Ciii Val Met Met Leo Arg Mci 508 2593 CCA CGA CCC TAT CAC CAC ACC TCG CAC TCA ATA TTC TTC CC AAT AAT AGA TCA TAT ACC CCC CAT TCT TAC AAA ATOGCCC GGA ATG OCT CAT Ala Arit Arg Tyr Asp His Set 5cr Asp 5cr le Phe Phe Ala Asn Asn Arg Scr Tyr Thr Arg Asp Ser Tyr Lys Mct Ala Gly Mct Ala Asp 539 2686 AAC ATT GAA GAC CTC CTG CAT TTC TGC CCC CAA ATG TTC TCC ATC AAC GTG GAC AAC OTC CAA TAC CC CTT CTC ACT CCC ATT CTG ATC TTC Asn lie Clii Asp Leo Leo His Phe Cys Arg Gin Met Phe 5cr Met Lys Vol Asp Asn Vol Clii Tyr Ala Leu Leu Thr Ala lie Val lie Phe 570 2779 TCG GAC CG CCC CCC CTC GAG AAG CC CAA CTA OTC GAA C ATC CAC ACC TAC TAC ATC GAC ACC CTA CCC ATT TAT ATA CTC'AAC CCC CAC Asp Arg Pro Giv Leo Ciii Lys Ala Gin Leo Vol Clii Ala Ile Gin Scr Tyr Tyr lie Asp l1sv Leo Arg lic Tyr lie Leu Asn Arg His 601 2872 TGC CCC GAC TCA ATG AGC CTC GTC TTC TAC CCA AAC CTC CTC TCC ATC CTC ACC GAG CTG CCT ACC CTC CCC AAC CAC AAC CCC GAG ATC TOT Cys Cly Asp Set Met Ser Leo Val Phe Tyr Ala Lys Leo Leo Set lie Leo Thr Ciii Leo Arg Thr Leo Uly Asn Gin Asn Ala Ciii Mci C S 632 2965 TTC TCA CTA AAG CTC AAA A.AC CCC AAA CTG CCC AAC TTC CTC GAG GAG ATC TUOGCAC OTT CAT GCC ATC CCC CCA TCC OTC CAG 1'CC CAC CTT Phc 5cr Leo Lys Leo Lys Asn Arg Lys Leo Pro Lys Phc Leo Clii Cii lie Trp Asp Vol His Ala lie Pro Pro 5cr Val Gin Scr His Leo 663 3058 CAC ATT ACC CAG GAG GAG AAC GAG CUT CTC GAG CCC OCT GAG COT ATC CCC UCA TCC OTT CCC CCC CCC ATT ACC CCC CCC ATT CAT TGC CAC Gin lie Thr Gin Ciu Clii Asn. 
Cli Arg Leo Clii Arg Ala Ciii Arg Met Arg Ala Set Vol Cly Gly Ala lie Thr Ala Cly lic Asp Cys Asp 694 3151 TCT 0CC TCC ACT TCG GCG GCO GCA 0CC GCG GCC CAG CAT CAG CCT CAG CCT CAG CCC CAG CCC CAA CCC TCC TCC CTG ACC CAG AAC GAT TCC Ser Ala 5cr Thr 3cr Ala Ala Ala Ala Ala Ala Gin His Gin Pro Gin Pro Gin Pro Gin Pro Gin Pro 5cr Scr Leu Thr Gin Asn Asp 5cr 725 3244 CAG CAC CAG ACA CAG CCG CAG CTA CAA CCT CAG CTA CCA CCT CAG CTG CAA GOT CAA CTG CAA CCC CAG CTC CAA CCA CAG CTT CAG ACG CAA Gin His Gin Thr Gin Pro Gin Leu Gin Pro Gin Leu Pro Pro Gin Leu Gin Oly Gin Lcu Gin Pro Gin Lcu Gin Pro Gin Lcu Gin Thr Gin 756 3337 CTC CAG CCA CAG ATT CAA CCA CAG CCA CAG CTC CTT CCC GTC TCC GCT CCC GTO CCC 0CC TCC OTA ACC GCA CCT GOT TCC TTO TCC OCG OTC Leu Gin Pro Gin lie Gin Pro Gin Pro Gin Leu Leu Pro Val 5cr Ala Pro Vai Pro Ala 5cr Val Thr Ala Pro Oly 5cr Leu 5cr Ala Val 787 3430 AOT ACG AGC AGC GAA TAC ATO GGC GGA AGT OCG 0CC ATA GGA CCC ATC ACG CCG OCA ACC ACC AGC AGT ATC ACO OCT GCC OTT ACC OCT AOC Thr 5cr 5cr Giu Tyr Mct Gly Gly Scr Ala Ala lie Gly Pro lie br Pro Ala TVr 1br 5cr Ser lie Thr Ala Ala Val T% Ala 5cr 818 3523 TCC ACC ACA TCA OCO OTA CCO ATO GOC AAC OGA OTT GOA OTC GOT OTT 000 GTO 0CC GOC AAC GTC AGC ATO TAT OCO AAC 0CC CAG ACO OCO Thr Thr 5cr Ala Val Pro McI Gly Asn Oly Val Oly Val Oly Val Oly Val Oly Oly Asn Val 5cr Me[ Tyr Ala Asn Ala Gin Thr Ala 849 3616 ATO 0CC TTO ATO GOT GTA 0CC CTG CAT TCO CAC CAA GAG CAG CTT ATC 000 OGA OTO GC OTT AAG TCO GAG CAC TCG ACO ACT OCA TAO CAG Mct Ala Leu MeI Gly Val Ala Lcu His 5cr His Gin Olu Gin Lett Ile Gly Gly Val Ala Val Lys 5cr Oiu His Ser Thr Thr Ala -878 3709 GCGCAGAGTCAGCTCCACCAACATCACCACCACAACATCOA
CGTCCTGCTGGAGTAOAAAGCGCAGCTGAACCCACACAGACATAOGGGGAAATGGGAAGTI'CTC-TCCAGAGAGTTCGAGCCGA
3957 CT AAAGCTTAATTOAAAAAGCTTCAACAACAATTGGACAAACGCOTTGAGGAACCGGGAOAAAATTTAAGAAAAAAAAAACCATTGAAAATTATOAAATTT
AOTATACATTTTTTTTGGOTGOA
4329 AATCTOTTAA ATOAAACAAAAATAATGAT AATAACATTATCATCCACCATAATTAAAATCATTTAAAGT AATTAAAAACAAAACACTTTTAAAACACGCAAATGATATTTAr 4453 TTTTTTAATCATAAAGAAAOOCAACCTGAAAAAAATAT-TACAAAAACAAATAACAACATATTTTATTATGACACCCTT ATAT GTTTTCAAAACGAGAATTT AAATTCTTAOATTrCTT AT AATTT 4577 CAT CCAAAAAT ATTAOCCAOCAAAAACCTTT ATT ATT GGCATGTTTTTTAGACATGTTTT CAAAAAAAACTTTGAT ATTGAAACT AAACAAAOGAT AATGAAATOAAAGT GATTGOAGT CT T AC 4701 TCAACAAGCTAAGTTAATAATTACATTGGTAGACCTTGTGAAAiTCACCTGTAAAAACCACACAAAvrrAxi.
AAA
5197 T AAAOTOATTCTTTT ATT AT GTAAAAAOAAOACAAAAAATAT CTTACGT AOCTTTCT ACTTOAATTGT GCAATT CTT ACT ACT AATCCT AATTTAAATAT AATTT ACACACACOCAT 5321ACAACGATAACAGCCACAATAACCACAATTTTATTTAAGTCAACCTAATTTATAAATATGAATTTGTATAATOACGAACTAAAATT
AOCATOACATCATGGACATACTTGOA
5445 AAT AACTCTAT CAAACGAGCTAAATGCATTGAAGAAGAAAATTCTTGTTAAATATAGTc.FOCACTTCGACAAACGAAAATCAGGAT cDNA DHR3-9 4.2 kb 125 TGACArCCATACGACGACAACGACGACGAAGrGACTACAcrGCnCCrATA ATG TAT ACG CAA CGT METr Tyr Thr Gin 242 ATG TTT GAC A'rG TGG AGC AGC GTC ACT TCG AAA CTG GAA GCA CAC GCA AAC AAT CTC GGT CAA AGC AAC GTC CAA TCG CCG C GGA CAA AAC MET Phc Asp MLT Trp Ser 5cr Val Thr Ser Lys Lcu Giu Ala His Ala Asn Asn Leu Gly Gin Scr Asn Val Gin Scr Pro Ala Gly Gin Asn 36 4 glaaag. v. lcacag) 335 AAC TCC AGC GGT TCC ATT AAA GCT CAA ATT GAG ATA ATT CCA TGC AAA GTC TGC GOC GAC AAG TCA TCC GGC GTG CAT TAC GGA GTG ATC ACC Asn Scr Scr Gly Scr lie Lys Ala Gin lie Glu lie lie Pro Cys Lys Val Cys GIy Asp Lys Scr Scr Gly Val His Tyr Gli, Val lic Thr 67 428 TGC GAG GGC TGC AAG GGA TTC TTT CGA AGA TCG CAA AGC TCC GTG GTC AAC TAC CAG TGT CCG CGC AAC AAG CAA TGT GTG GTG GAC CGT GTT Cys Giu Gly Cys Lys Gly Phe Phc Arg Arg 5cr Gin Ser Scr Val Val Asn Tyr Gin Cys Pro Arg Asn Lys Gin Cys Val Val Asp Arg Val 98 4gtCtgL v ttgcagl 521 AAT CGC AAC CGA TGT CAA TAT TGT AGA CTG CAA AAG TGC CTA AAA CTG GGA ATG AGC CGT GAT GCT GTA AAG TTC GGC AGG ATG TCC AAG AAG Asn Arg Asn Arg Cys Gin Tyr Cys Arg Lcu Gin Lys Cys Lcu Lys Lcu Gly ME 5cr Arg Asp Ala Val Lys Phe Gly Arg MET 5cr Lys Lys 129 614 CAG CGC GAG AAG GTC GAG GAC GAG CTA CGC TTC CAT CGG GCC CAG ATG CGG OCA CAA AGC GAC GCG GCA CCG GAT AGC TCC GTA TAC GAC ACA Gin Arg Glu Lys Val Giu Asp Glu Val Arg Phe His Arg Ala Gin MET Arg Ala Gin 5cr Asp Ala Ala Pro Asp Ser Ser Val Tyr Asp Thr 160 0 tgtgcag.. 
actcag), 707 CAG ACG CCC TCG AGC AGC GAC CAG CTG CAT CAC AAC AAT TAC AAC AGC TAC AGC CCC GGC TAC TCC AAC AAC GAG.GTG GGC TAC GGC AGT CCC Gin Thr Pro Ser 5cr Scr Asp Gin Leu His His Asn Asn Tyr Asn 5cr Tyr 5cr Gly Giy Tyr 5cr Asn Asn Giu Val Gly Tyr Gly 5cr Pro 191 800 TAC GGA TAC TCG GCC TCC GTG ACGCCA CAG CAG ACC ATG CAG TAC GAC ATC TCG GCG GAC TAC GAG GAC AGC ACC ACC TAC GAG CCG CGC AGT Tyr Giy Tyr 5cr Ala 5cr Vai Tbr Pro Gin Gin Mur MET Gin Tyr Asp Ile 5cr Ala Asp Tyr Val Asp Ser Dur Thr Tyr Giu Pro Arg 5cr 222 igtaaag. ctccagi (C) 893 ACA ATA ATC GAT CCC GAA TTT ATT AGT CAC CCG GAT GGC GAT ATA AAC GAT GTO CTG ATC AAG ACG CTG GCG GAG GCG CAT GCC AAC ACA AAT Thr lie lie Asp Pro Giu Phc lic Scr His Ala Asp Gly Asp lie Asn Asp. Val Lcu le Lys Mwr Leu Ala Glu Ala His Ala Asn Thr Asn 23 4g~gag.. v ccagl (G) 986 ACC AAA CTG GAA GCT GTG CAC GAC ATG TTC CGA AAG CAG CCC GAT GGTCA CGC ATT CTC TAC TAC AAG AAT CTG GGC CAA GAG GAA CTC TOG nur Lys Leu- Giu Ala Va! His Asp MET Phe Ars Lys Gin Pro Asp Va! 5cr Arg lie Lcu Tyr Tyr Lys Asp Leu Gly Gin Giu Glu Lcu Trp 284
(G)
1079 CTG GAC TGC GCT GAG AAG CTT ACA CAA ATO ATA CAG AAC ATA ATC GAA TTT GCT AAG CTC ATA CCC GGA TTC ATG CCC CTG AGT CAG GAC GAT Lcu Asp Cys Ala Glu Lys i-eu IVr Gin MET le Gin As Ile Ile Ciu Phe Ala Lys Lcu Ilc Pro Civ Phe MET Arg i-ct Scr Gin Asp Asp 3 1gigag.. ccag) 1172 CAG ATA TTA CTG CTG AAG ACG GGC TCC TTT GAG CTG GCG ATT GTT CGC ATG TCC AGA CTC CTT GAT CTC TCA CAG AAC GCG GTT CTC TAC GOC Gin Ile i-cu i-cu Lcu Lys 11v Giv Ser Phe Clu i-cu Ala Ile Val Arg MET Ser Arg i-cu i-cu Asp Lcu 5cr Gin Asn Ala Val i-Cu Tyr Gly 3 1265 GAC GTG ATG CTG CCC CAG GAG GCG TTC TAC ACA TCC GAC TCG GAA GAG ATG CGT CTG GTG TCG CGC ATC TTC CAA ACG GCC AAG TCG ATA GCC Asp Val MET i-cu Pro Gin Glu Ala Phe Tyr Thr 5cr Asp 5cr Glu Glu MET Art i-eu Val Ser Arg Ile Phe Gin Mwr Ala Lys Scr le Ala3" j gtgcg cctlug) 1358 GAA CTC AAA CTG ACT GAA ACC CAA CTG CCC CTG TAT CAG ACC TTA GTG CTC CTC TCG CCA CAA CCC AAT GGA CTC CGT GGT AAT ACG GAA ATA Ciu i-cu L-ys Lcu Thr Ciu IVr Glu i-cu Ala i-eu Tyr Gin Ser i-eu Val i-cu i-cu Trp Pro Ciu Arg Asp Gly Val Arg Cly Asn 1'hw Giu lc 4( 1451 CAG AGG CTT TTC AAT CTG AGC ATC AAT CC ATC CGG CAC GAG CrG GAA ACC AAT CAT C CCG CTC AAG GGC GAT GTC ACC GTG CTG CAC ACA Gin Arg i-Lcu Phc Asn i-cu 5cr MET Asn Ala le Arg Gin Ciu i-eu Giu Thr Asn is Ai-a Pro i-cu i-ys Giy Asp Val liv Val i-eu Asp Thr 4 fgineg.. 
ticcag j 1544 CTC CTG AAC AAT ATA CCC AAT TTC CGC GAT ATT TCC ATC TTG CAC ATG CAA TCC CTG ACC AAG TTC AAG CTG CAC CAC CCC AAT CTC GTT TTT i-cu i-cu Asn Asn Ilc Pro Am Phc Arg Asp le Scr Ile i-cu His MET Giu Scr Lcu Ser L-ys Phe i-ys i-cu Gin His Pro Asn Val Val Phc 4' 1637 CCG CC CTG TAC AAC GAG CTC TTC TCG ATA CAT TCC CAG CAC GAC CTG ACA TAA CAAGAGCACCAGCCCTTCCTCCAGACGACCGCGGACGATCGTTGCCCAGGATr Pro Ala Lcu Tyr L-ys Ciu i-cu Phe Scr le Asp Scr Gin Gin Asp i-cu Thr -4 1742 CGTCGCATTTCGCCGTGGCCTCGGACACGGTCCAGTGGGCCGAGGCAATATTTAIAAATCCC 1866 CTCGAAAAGATTTAGTTACACTCACAGTTACTGCAGCCCTAAGATTTTTTCTTTATArTTAG 1990 GGAGCCAGGAGAATTTAAAATTACCGTAATTGTCTGGACAACAACATCTTGGAGGCTAATAr 2114 TGGTATATAAAATAAAAAAAAAAAAAAAAAAA TTTTACTCACTCTTTGCCACGTAACTTATCTA 2238 ACACAATTTCCGAACCGACGAACCAATCGTGTATTACCAAACCATCACCC 105
A
DHR3     51  CKVCGDKSSGVI-YGVI TCEGCKGFFRRSQSSVV NYQCPRNKQCVVDR VNRNRCQYCRLQKCLKLGM
EcR     264  CLVCGDRASGYHYNALTCEGCKGFFRRSVTKSA--VYCCKFGRACEMDMYMRRKCQECRLKKCLAVGM
E75A    245  CRVCGDKASGFHYGVIISCEGCKGFFRRSI QQKI -QYRP CTKNQQCSI LR I NRNRCQYCRLKKCI AVGM
kni       5  CKVCGEPAAGF HFGAFTCEGCKSFFGRSYNNI S-TI SECKNEGKCI I DKKNRTTCKACRLR KCYNVGM
hRARα    58  CFVCQDKSSGYHYGVS ACEGCKGFFRRSI QKNM--VYTCHRDKNCI I NKVTRNRCQYCRLQKCFEVGM
hTRβ    102  CVVCGDKATGYHYRCI TCEGCKGFFRRTI QKNLHPSYSCKYEGKCVI DK VT RNQCQECRF KKCI YVGM
hVDR     24  CGVCGDRATGFHFNAMTCEGCKGFRRSMKRKALFT CPFNGDCRI TK DNRRHCQACRLKRCVDIGM
COUP-TF      CVVCCDKSSGKHYGQFTCEGCKSFFKRSVRRNL-TYTCRANRNCPIDQHHRNQCQYCRLKKCLKVGM
hERR1   175  CLVCGDVASGYHYGVASCEACKAFFKRTI QGSI EYS CPASNECEI TKRR RKACQACRFTKCLRVGM
hERR2   103  CLVCGDI ASGYHYGVASCEACKAFFKRTI QGNI EYS CPATNECEI TKRRRKSCQACRFMKCLKVGM
hER     185  CAVCNDYAS GYHYGVWSCEGCKAF FKRS I QGHN DYMCP ATNQCT I DK NR RKS CQACRL RKCYEVGM
hGR     421  CLVCSDEASGCHYGVLTCGSCKVFFKRAVEGQH--NYL CAGRNDCI I DKI R RKNCPACRYRKCLQAGM
hMR     603  CLVCGDEASGCHYGVVTCGSCKVFFKRAVEGQHNYLCAGRNDCI I DKI RRKNCPACRLQKCLQAGM
hPR     567  CLI CGDE AS GCHYGVL TCGS CKVF FKRAMEGQH NYL CAGRNDCI V DK I R RKNCP ACRLR KCCQAGM
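Alignments such as panel A are commonly summarized by scoring each pair of sequences for the fraction of aligned positions that carry the same residue. The sketch below shows one way to compute such a score. It is an illustration under stated assumptions, not the method used to prepare the figure: columns with a gap ("-") in either sequence are skipped, which is only one of several conventions, and the two toy strings are hypothetical fragments rather than rows copied from the alignment.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of aligned, gap-free columns in which both residues agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns do not count either way
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


# Hypothetical ten-residue fragments, used only to exercise the function.
toy_a = "CEGCKGFFRR"
toy_b = "CEACKAFFKR"
print(percent_identity(toy_a, toy_b))  # 70.0
```

Choosing a different denominator (for example, the full alignment length including gapped columns) would give lower values for the same pair, so any comparison of scores should state which convention was used.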
B
[Panel B: sequence alignment of a second conserved region for the receptors shown in panel A (DHR3, EcR, E75A, kni, hRARα, hTRβ, hVDR, COUP-TF, hERR1, hERR2, hER, hGR, hMR, hPR). The residues are not legible in this copy.]
[Matrix of pairwise comparison values. Column labels: DHR3, EcR, kni, hRARα, hTRβ, hVDR, COUP, hERR1, hERR2, hER, hGR, hMR, hPR. Row labels: DHR3, EcR, E75A, kni, hRARα, hTRβ, hVDR, COUP, hERR1, hERR2, hER, hGR, hMR, hPR. The assignment of values to rows and columns is not recoverable from this copy; the values as extracted are:]
19 24 18 18 17 14 13 14 15 13 16 55 23 24 29 25 20 16 is 17 14 14 13 64 55 23 20 22 16 15 15 13 14 12 47 48 48 65 55 62 45 33 22 18 18 18 17 14 15 16 56 59 58 52 62 22 16 18 17 19 14 18 50 58 55 53 47 52 17 20 19 15 13 15 13 62 58 58 50 61 59 50 29 30 24 21 19 48 53 54 43 53 56 45 51 63 32 25 23 49 51 54 43 54 57 45 54 91 33 25 21 53 52 58 52 59 53 47 55 69 72 27 25 27 47 48 44 45 45 44 41 48 57 57 58 57 54 52 50 47 47 48 45 44 52 59 59 58 94 56 48 47 144 1 45 44 42 42 45 W54 54 56 191 91
EDITORIAL NOTE 56563/00 THIS SPECIFICATION DOES NOT CONTAIN PAGES NUMBERED 108-111.
P:\WPDOCS\CRN\SPECI\7523980.sp.doc-28/06/02 -111a-
Throughout this specification and the claims which follow, unless the context requires otherwise, the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated integer or group of integers or steps but not the exclusion of any other integer or group of integers or steps.
The reference to any prior art in this specification is not, and should not be taken as, an acknowledgement or any form of suggestion that the prior art forms part of the common general knowledge in Australia.

Claims (39)

1. An isolated recombinant nucleic acid which encodes an insect steroid receptor or fragment of said receptor that binds ecdysone, said nucleic acid comprising a segment that hybridizes to the complement of nucleotides 2359 to 3021 of Drosophila EcR cDNA sequence of Table 2 under hybridization conditions comprising less than 500 mM salt and at least 37°C and washing in 2X SSPE at 63°C.
2. An isolated recombinant nucleic acid of Claim 1, wherein said insect steroid receptor is EcR.
3. A cell transformed with an isolated recombinant nucleic acid of Claim 1.
4. An isolated recombinant nucleic acid having a sequence exhibiting identity over 20 contiguous nucleotides with nucleotides 2359 to 3021 of Drosophila EcR cDNA set forth in Table 2, wherein said nucleic acid encodes a polypeptide that binds ecdysone.
5. An isolated recombinant nucleic acid of Claim 4, wherein said nucleic acid encodes a polypeptide having a DNA binding domain which binds to an ecdysone-responsive DNA control element.
6. A cell transformed with an isolated recombinant nucleic acid of Claim 4.
7. A cell transformed with an isolated recombinant nucleic acid of Claim 1, wherein the encoded insect steroid receptor comprises a DNA binding domain that binds an ecdysone-responsive DNA control element.
8. A cell of Claim 7, wherein said cell also comprises an expression vector comprising said ecdysone-responsive DNA control element operably linked to a coding sequence for a polypeptide.
9. A polypeptide produced by expression of an isolated, recombinant nucleic acid having at least 60% sequence identity with a polynucleotide encoding amino acids 431 to 651 of Drosophila EcR of Table 2, comprising an insect steroid receptor or fragment thereof, wherein said polypeptide binds ecdysone and comprises a hormone-binding domain of between 200 and 250 amino acids containing at least one of an E1, E2 or E3 subregion, wherein: the E1 subregion has an amino acid sequence AKX PGFXXLT DQITLL, wherein X is any amino acid, or has an amino acid sequence having at least 10 matches at assigned amino acid positions; the E2 subregion has an amino acid sequence E(F/Y) KA L (N/S) D GL, wherein is the optional absence of an amino acid, or has an amino acid sequence having at least 9 matches at assigned amino acid positions; and the E3 subregion has an amino acid sequence LXKLLXXLPDLR, wherein X is any amino acid, or has an amino acid sequence having at least 5 matches at assigned positions.
10. A polypeptide of Claim 9, wherein said insect steroid receptor is EcR.
11. A polypeptide of Claim 9, wherein said insect steroid receptor or fragment thereof also comprises a DNA binding domain.
12. A polypeptide of Claim 9, wherein said receptor is from Drosophila melanogaster.
13. A polypeptide of Claim 9, which is further capable of binding to an ecdysone or an ecdysone agonist.
14. A polypeptide of Claim 9, which is capable of binding to a DNA control element responsive to ecdysone.
15. A polypeptide of Claim 14, wherein said polypeptide comprises a zinc finger domain.
16. A polypeptide of Claim 14, wherein said ecdysone responsive DNA control element is operably linked to a transcription unit which is responsive to said binding.
17. A polypeptide of Claim 16, wherein said ecdysone responsive DNA control element is upstream from said transcription unit.
18. A polypeptide of Claim 9 fused to a second polypeptide.
19. A polypeptide of Claim 18, wherein said second polypeptide is a heterologous polypeptide.
20. A polypeptide of Claim 19, wherein said heterologous polypeptide comprises a second steroid receptor.
21. A composition of matter comprising a polypeptide of Claim 9.
22. A cell comprising a polypeptide of Claim 9.
23. A cell of Claim 22, wherein said cell is a human cell.
24. A method for selecting DNA sequences capable of being specifically bound by an insect steroid receptor superfamily member, said method comprising the steps of: screening DNA sequences for binding to a polypeptide of Claim 14; and selecting said DNA sequences exhibiting said binding.
25. A method of Claim 24, wherein said DNA sequence is operably linked to a gene selected from the group consisting of EcR, DHR3, E74 and E75 genes.
26. A method for selecting ligands specific for binding to hormone binding domain of the polypeptide of claim 9, said method comprising the steps of: screening compounds for binding to the polypeptide; and selecting compounds exhibiting specific binding to the polypeptide.
27. A method of Claim 26, wherein said ligand is an ecdysteroid.
28. A method of Claim 26, wherein said ligand is a 20-OH ecdysone antagonist.
29. A fusion polypeptide comprising a hormone binding domain of an insect steroid receptor as claimed in claim 9, and a second polypeptide.
30. A fusion polypeptide of Claim 29, wherein said second polypeptide comprises a DNA binding domain from a second steroid receptor superfamily receptor member.
31. A nucleic acid encoding a fusion polypeptide of Claim 29.
32. A method for selecting ligands specific for binding to a ligand binding domain of an insect steroid receptor superfamily member, said method comprising the steps of: combining: a fusion polypeptide of Claim 29, wherein said fusion polypeptide comprises said ligand binding domain functionally linked to a DNA binding domain of a steroid receptor; and a nucleic acid sequence encoding a protein, wherein production of said protein by expression of said nucleic acid sequence is responsive to binding by said DNA binding domain; screening compounds for an activity of inducing expression of said second nucleic acid sequence; and selecting compounds having said activity.
33. A method of Claim 29, wherein said combining occurs within a cell.
34. A method of Claim 33, wherein said combining step results from expression upon transformation of said cell with a nucleic acid encoding said fusion polypeptide.
35. A method of Claim 32, wherein said DNA binding domain is selected from binding domains of insect steroid receptor superfamily members selected from the group consisting of EcR, DHR3, E75A and
36. A method for producing a polypeptide in a cell that lacks an ecdysone receptor, the method comprising the steps of: transfecting the cell with a first and second expression vector wherein: the first expression vector comprises the isolated, recombinant nucleic acid of claim 1; and the second expression vector comprises a DNA control element operably linked to a coding sequence for the polypeptide, the DNA control element being responsive to binding by an ecdysone receptor; wherein an ecdysone-binding receptor is produced in the cell by expression of the isolated, recombinant nucleic acid of the first expression vector; and exposing the host cell to ecdysone or an ecdysone analog that binds the ecdysone-binding receptor, whereupon the transfected cell transcribes the nucleotide sequence encoding the polypeptide and produces the polypeptide.
37. A method of Claim 36, wherein said cell is a mammalian cell.
38. A method of Claim 36, further comprising the step of introducing said cell into an intact organism.
39. A method of Claim 36, wherein said cell is a plant cell.
Isolated recombinant nucleic acids, cells transformed therewith, polypeptides, compositions or cells comprising said polypeptides, methods for selecting DNA sequences or ligands, fusion polypeptides or nucleic acids encoding same or methods of producing polypeptides, substantially as hereinbefore described with reference to the Examples and accompanying drawings.
DATED this 28th day of June, 2002
THE BOARD OF TRUSTEES OF LELAND STANFORD Jr. University
By its Patent Attorneys DAVIES COLLISON CAVE
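Two of the sequence criteria recited in the claims lend themselves to a simple computational check: the shared run of 20 contiguous nucleotides in Claim 4, and the count of matches at assigned positions of the E3 subregion in Claim 9. The sketch below is illustrative only and carries no weight for claim construction. It assumes that "identity over 20 contiguous nucleotides" means an exact 20-base match and that X positions of a subregion are not counted as assigned positions. The function names and all test sequences are hypothetical; only the string LXKLLXXLPDLR is taken from Claim 9.

```python
def shares_contiguous_run(query: str, reference: str, k: int = 20) -> bool:
    """True if some k-base window of `query` occurs verbatim in `reference`."""
    query, reference = query.upper(), reference.upper()
    windows = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    return any(query[i:i + k] in windows for i in range(len(query) - k + 1))


def count_motif_matches(motif: str, candidate: str, wildcard: str = "X") -> int:
    """Count assigned (non-wildcard) motif positions matched by `candidate`."""
    if len(motif) != len(candidate):
        raise ValueError("motif and candidate must have equal length")
    return sum(1 for m, c in zip(motif, candidate) if m != wildcard and m == c)


E3_MOTIF = "LXKLLXXLPDLR"  # E3 subregion sequence as recited in Claim 9

# Hypothetical sequences, made up for this example.
reference = "ATGCGTACGTTAGCCGATAGGCTAACGT"
query = "TTTT" + reference[3:23] + "GGGG"
print(shares_contiguous_run(query, reference))                # True
print(count_motif_matches(E3_MOTIF, "LAKLLGGLPDLR") >= 5)     # True
```

In practice the reference string would be the recited region of the cDNA and the candidate would be a window taken from an aligned hormone-binding domain; both are left to the reader because neither is reproduced legibly in this section.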
AU56563/00A 1990-02-26 2000-09-07 Identification and expression of insect steroid receptor DNA sequences Expired AU752699B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU56563/00A AU752699B2 (en) 1990-02-26 2000-09-07 Identification and expression of insect steroid receptor DNA sequences

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US485749 1990-02-26
AU56563/00A AU752699B2 (en) 1990-02-26 2000-09-07 Identification and expression of insect steroid receptor DNA sequences

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
AU49218/97A Division AU4921897A (en) 1990-02-26 1997-12-23 Identification and expression of insect steroid receptor DNA sequences

Publications (2)

Publication Number Publication Date
AU5656300A AU5656300A (en) 2000-11-23
AU752699B2 true AU752699B2 (en) 2002-09-26

Family

ID=3742131

Family Applications (1)

Application Number Title Priority Date Filing Date
AU56563/00A Expired AU752699B2 (en) 1990-02-26 2000-09-07 Identification and expression of insect steroid receptor DNA sequences

Country Status (1)

Country Link
AU (1) AU752699B2 (en)

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
NUCLEIC ACIDS RESEARCH (1989) 17(18) P 7167-7178 *

Also Published As

Publication number Publication date
AU5656300A (en) 2000-11-23

Similar Documents

Publication Publication Date Title
EP0517805B1 (en) Identification and expression of insect steroid receptor dna sequences
Segraves et al. The E75 ecdysone-inducible gene responsible for the 75B early puff in Drosophila encodes two new members of the steroid receptor superfamily.
Cho et al. Mosquito ecdysteroid receptor: analysis of the cDNA and expression during vitellogenesis
US5639616A (en) Isolated nucleic acid encoding a ubiquitous nuclear receptor
DE69131718T2 (en) LIVER-ENRICHED TRANSCRIPTION FACTOR
US6348574B1 (en) Seven transmembrane receptors
Talbot et al. Drosophila tissues with different metamorphic responses to ecdysone express different ecdysone receptor isoforms
EP0609240B1 (en) Receptors of the thyroid/steroid hormone receptor superfamily
US5599904A (en) Chimeric steroid hormone superfamily receptor proteins
EP0999271A2 (en) Retinoid receptor compositions and methods
AU689078B2 (en) Glucagon receptors
AU691868B2 (en) Ileal bile acid transporter compositions and methods
AU3728393A (en) Car receptors and related molecules and methods
US5807988A (en) Isolation, characterization, and use of the human and subunit of the high affinity receptor for immunoglobulin E
JPH06510206A (en) Novel heterodimeric nuclear receptor protein, the gene encoding it, and its uses
Segraves Jr Molecular and genetic analysis of the E75 ecdysone-responsive gene of Drosophila melanogaster
AU752699B2 (en) Identification and expression of insect steroid receptor DNA sequences
EP1017716A1 (en) Car receptors and related molecules and methods
US5548063A (en) Retinoic acid receptor alpha proteins
US6426403B1 (en) TRAF family molecules, polynucleotides encoding them, and antibodies against them
Talbot Structure, expression, and function of ecdysone receptor isoforms in Drosophila
Toulas et al. Differential initiation of translation of a single estrogen receptor mRNA could explain some estradiol resistance cases
CA1341422C (en) Retinoic acid receptor composition and method
US6989242B1 (en) Car receptors and related molecules and methods

Legal Events

Date Code Title Description
CB Opposition lodged by

Opponent name: COMMONWEALTH SCIENTIFIC AND INDUSTRIAL RESEARCH OR

SREP Specification republished
CH Opposition withdrawn

Opponent name: COMMONWEALTH SCIENTIFIC AND INDUSTRIAL RESEARCH OR

FGA Letters patent sealed or granted (standard patent)