WO1991008295A1 - Dna binding protein - Google Patents

Publication number
WO1991008295A1
Authority
WO
WIPO (PCT)
Prior art keywords
protein
gcf
dna
cell
sequence
Prior art date
Application number
PCT/US1990/006817
Other languages
French (fr)
Inventor
Ira H. Pastan
Ryoichiro Kageyama
Original Assignee
The United States Of America, As Represented By The Secretary, U.S. Department Of Commerce
Priority date
Filing date
Publication date
Application filed by The United States Of America, As Represented By The Secretary, U.S. Department Of Commerce
Publication of WO1991008295A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/5308 Immunoassay; Biospecific binding assay; Materials therefor for analytes not provided for elsewhere, e.g. nucleic acids, uric acid, worms, mites
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells

Definitions

  • the present invention relates, in general, to a DNA binding protein, and, in particular, to a DNA binding protein that represses transcription from promoters to which it binds.
  • the invention further relates to a DNA sequence encoding the DNA binding protein, to a recombinant DNA molecule that includes such a sequence and to cells transformed therewith.
  • the invention further relates to a method of regulating gene expression.
  • DNA binding proteins and cis-acting elements play an important role in controlling gene expression (for reviews see Breathnach et al. Ann. Rev. Biochem. 50:349 (1981); Dynan et al. Nature 316:774 (1985); McKnight et al. Cell 46:795 (1986)).
  • a combination of different DNA elements can generate a variety of types of transcriptional regulation.
  • multiple factors can bind to the same DNA sequence or region and produce diversity in transcriptional control.
  • One such example involves the octamer transcription factors.
  • OTF-1 (NF-A1, OBP100, NFIII)
  • OTF-2 (NF-A2)
  • B cell-specific promoters (Scheidereit et al.)
  • NF-A3 is unique to embryonal carcinoma cells and mediates transcriptional repression (Lenardo et al. Science 243:544 (1989)). Therefore, the same DNA sequence may function as either a positive or negative control element depending upon the DNA binding factors present in different cell types.
  • Sp1 is a well characterized transcription factor that binds to GC boxes (GGGCGG) and stimulates transcription from promoters which contain those sites (for reviews see Dynan et al. Nature 316:774 (1985); Kadonaga et al. Trends Biochem. 11:20 (1986)).
  • An activator, termed ETF, has been identified that also binds to GC-rich sequences including GC boxes (Kageyama et al. J. Biol. Chem. 263:6329 (1988) and P.N.A.S. USA 85:5016 (1988)).
  • GC-rich sequences are ubiquitous elements and are found in the promoter regions of many housekeeping genes and cellular oncogenes (Melton et al. P.N.A.S. USA 81:2147 (1984); Reynolds et al. Cell 38:275 (1984); Ishii et al. P.N.A.S. USA 82:4920 (1985) and Science 230:1378 (1985)).
  • the presence of a negative trans-acting factor which binds to GC-rich sequences has also been suggested.
  • the promoter of the calcium-dependent protease (CANP) gene has a high GC content, and its transcription is negatively regulated by repeated GC-rich elements (Hata et al. J. Biol. Chem. 264:6404 (1989)). Therefore, it is likely that various factors with different functions may interact with the same or similar sequences to control gene expression in a flexible manner.
  • the present invention relates to a novel trans-acting factor (and DNA sequence encoding same) that binds to GC-rich promoters and represses transcription therefrom.
  • the invention further relates to methods of using this trans-acting factor in regulating gene expression.
  • the present invention relates to a substantially pure form of a DNA binding protein that recognizes GC rich sequences and represses transcription from GC rich promoters when bound thereto. In another embodiment, the present invention relates to a DNA fragment encoding the above-described DNA binding protein.
  • the present invention relates to a recombinant DNA molecule comprising a vector, and the above-described DNA fragment.
  • the present invention relates to a host cell transformed with the above-described recombinant DNA molecule.
  • the present invention relates to a process of producing the above-described DNA binding protein.
  • the method comprises culturing the above-described host cell under conditions such that the above-described DNA fragment is expressed and the DNA binding protein thereby produced.
  • the invention relates to a method of regulating expression of a gene in a cell.
  • the method comprises inserting into a cell a nucleotide sequence encoding GC Factor, or binding fragment thereof, under conditions such that the sequence, or fragment, binds to a promoter from which the gene is transcribed so that expression of the gene is inhibited.
  • the present invention relates to a diagnostic assay for determining the metastatic potential of a cancer cell.
  • the method comprises measuring the level of expression in the cell of the GC Factor gene and correlating that level with standard values indicative of metastatic potential.
  • FIGURE 1 The Recombinant Molecule pSV2-GCF.
  • FIGURE 2 The Recombinant Molecule pT7-GCF.
  • FIGURE 3 Characterization of DNA Binding Properties of Molecularly Cloned Factors.
  • (A) DNase I footprinting analyses of the β-actin promoter. Extracts of the isopropyl-thio-β-galactoside (IPTG)-induced lysogens were prepared from two different positive clones (C and GC Factor (GCF)), and subjected to DNase I footprinting assays. A SalI-XhoI fragment of the β-actin gene promoter labeled at the SalI site was used as a probe. Amounts of the protein are indicated above each lane in microliters. Lanes 1 and 7 (-) show the control DNase digestion pattern with no protein factors added. Lane 2 indicates the reaction with a purified transcription factor ETF. The nucleotide sequence protected by the factor encoded by λGCF-1 (lanes 5 and 6) is depicted on the right.
  • IPTG: isopropyl-thio-β-galactoside
  • GCF GC Factor
  • FIGURE 4 Structural Analysis of Cloned GCF cDNA.
  • (A) Schematic structure of the cDNA clones. The boxed region denotes the open reading frame, and EcoRI sites (R) are indicated.
  • Ser/Thr sequence are underlined at the amino acid sequence.
  • Leucine and methionine residues which could represent a leucine zipper motif are boxed.
  • FIGURE 5 Structural Analysis of GCF.
  • (A) Northern blot analysis of GCF.
  • Poly(A)-RNA (20 μg) of A431 human epidermoid carcinoma cells was analyzed using a 1936 bp EcoRI fragment of λGCF-1 as a probe.
  • the sizes (kb) of markers are indicated on the left.
  • (B) Primer extension analysis. A 5'-end-labeled oligonucleotide probe complementary to the region between residues 98 and 121 was hybridized to either tRNA (lane 1) or A431 poly(A)-RNA (lane 2), and the primer-extended products were analyzed on a 5% polyacrylamide sequencing gel. The sizes (nt) of the DNA markers are shown on the left.
  • (C) In vitro translation analysis of GCF.
  • the GCF RNA synthesized in vitro with T7 polymerase was incubated in reticulocyte lysates with [35S]methionine (lane 2).
  • the [35S]methionine-labeled product was analyzed on an SDS-polyacrylamide gel.
  • Lane 1 indicates the control reaction with no RNA samples added.
  • the sizes (kd) of the protein standards are shown on the left.
  • FIGURE 6 Deletion Analysis with Bacterially Expressed Proteins for Identification of the DNA Binding Domain.
  • (A) Deletion analysis with bacterially expressed proteins. DNA fragments containing parts of GCF cDNA were cloned into the λgt11 vector, and the plaques were screened with a 75 bp oligonucleotide probe having the GC-rich sequences of the β-actin gene promoter. Open bars represent the parts of the GCF protein that are present, and thin lines indicate the deleted regions. Amino acid residue numbers of the end points are shown on the left. Percentages of positive plaques are indicated on the right. In each screening, at least 100 plaques were analyzed.
  • FIGURE 7 Cotransfection Assays of the GCF Expression Plasmids and the CAT Reporter Gene Under the Control of Various Promoters.
  • (A) Schematic structures of the GCF expression and the chloramphenicol acetyltransferase (CAT) reporter plasmids. The SV40 early promoter was placed in front of the full-length GCF cDNA in the GCF expression plasmid (pSV2-GCF). The control expression plasmid had the nonfunctional segment of the CAT gene (pSV2-C).
  • CAT chloramphenicol acetyltransferase
  • the CAT reporter plasmids contained either the wild type human EGFR promoter (pERcat6), the truncated EGFR promoter lacking the major GCF-binding sites (pERcatl ⁇ ), the chicken β-actin promoter (pβAcat), or the SV40 early promoter (pSV2cat). Closed and open circles depict strong and weak GCF-binding sites, respectively.
  • (B) CAT analysis by cotransfection assays. One μg of a CAT reporter plasmid and various amounts of pSV2-GCF were cotransfected into CV1 cells. The total DNA amounts were adjusted to 21 μg with pSV2-C.
  • the CAT activities of the control cotransfection assays using 1 μg of a CAT reporter plasmid and 20 μg of pSV2-C were assigned a value of 100%, and the CAT activities of the experiments with pSV2-GCF were calculated as percentages of the control activities.
  • the CAT reporter plasmids were pERcat6 (o) ; pERcatl ⁇
  • FIGURE 8 DNA Binding and Transcriptional Properties of GCF to the CANP Negative Control Element.
  • Lane 3 lysogen (lane 3) were subjected to the DNase I footprinting assay. A HindIII-StuI fragment of pA10cat-CANP2 labeled at the HindIII site was used as a probe. Lane 1 (-) shows the control DNase digestion pattern with no protein factors added. The location of the SV40 promoter and the CANP negative control element is shown on the left. The nucleotide sequence protected by GCF is indicated on the right.
  • FIGURE 9 In Vitro Transcription Analysis of GCF.
  • RNA products were analyzed by a primer-extension assay.
  • the present invention relates to a DNA binding protein capable of repressing transcription from promoters to which it binds.
  • the invention further relates to DNA sequences (fragments) encoding all, or unique portions (i.e., at least 5 amino acids), of the binding protein.
  • the invention also relates to recombinant molecules containing such DNA sequences, and to cells transformed therewith.
  • the present invention relates to methods of controlling expression of specific genes, for example, genes associated with particular disease states.
  • the protein of the present invention recognizes GC rich sequences present in promoters, including promoters of housekeeping genes and cellular oncogenes (for example, the EGFR, β-actin or CANP promoter).
  • the protein represses transcription from the particular promoter to which it binds.
  • the protein can have the complete sequence given in Figure 4C, in which case it is designated GCF.
  • the protein can also have the amino acid sequence of a molecule having substantially the same DNA binding and transcription repressive properties of the molecule given in Figure 4C (for example, allelic forms of GCF) .
  • the protein (or polypeptide) of the invention can have an amino acid sequence corresponding to a unique portion of the sequence given in Figure 4C (or allelic form thereof) .
  • the protein can have an amino acid sequence corresponding to the binding domain of the Figure 4C sequence (or allelic variation thereof) .
  • the protein can be present in a substantially pure form, that is, in a form substantially free of proteins and nucleic acids with which it is normally associated.
  • GCF, including GCF made in cell-free extracts using GCF RNA and GCF made using recombinant techniques, can be purified using protocols known in the art.
  • the protein can be used as an antigen, in protocols known in the art, to produce antibodies thereto, both monoclonal and polyclonal.
  • the present invention relates, as indicated above, to DNA sequences (including cDNA sequences) that encode the entire amino acid sequence given in Figure 4C (the specific DNA sequence given in Figure 4C being only one example) , or any unique portion thereof.
  • DNA sequences to which the invention relates also include those encoding proteins (or polypeptides) having substantially the same DNA binding and transcription repressive properties of GCF (for example, allelic forms of the amino acid sequence of Figure 4C) .
  • the present invention relates to a recombinant DNA molecule that includes a vector and a DNA sequence as described above (advantageously, a DNA sequence encoding the protein shown in Figure 4C or a protein having the DNA binding and/or transcription repressive properties of that protein) .
  • the vector can take the form of a virus or a plasmid (for example, pUC19) vector.
  • the DNA sequence can be present in the vector operably linked to regulatory elements, including, for example, a promoter.
  • the recombinant molecule can be suitable for transforming procaryotic or eucaryotic cells, advantageously, mammalian cells. Specific examples of recombinant molecules containing GCF cDNA are shown in Figures 1 and 2.
  • pSV2-GCF has the SV40 early promoter directing expression of GCF in eucaryotic cells.
  • pT7-GCF has the T7 promoter enabling in vitro transcription of GCF by T7 polymerase.
  • the present invention relates to a host cell transformed with the above-described recombinant molecule.
  • the host can be procaryotic (for example, bacterial) , lower eucaryotic (i.e., fungal, including yeast) or higher eucaryotic (i.e., mammalian, including human) . Transformation can be effected using methods known in the art.
  • the transformed host cells can be used as a source for the DNA sequence described above (which sequence constitutes part of the recombinant molecule) .
  • When the recombinant molecule takes the form of an expression system, the transformed cells can be used as a source for the above-described protein.
  • the DNA binding proteins and nucleic acid sequences of the present invention can be used both in a research setting (for example, to facilitate an understanding of transcription) and in a clinical setting to, for example, control expression of genes associated with particular disease states.
  • cDNA encoding all or part of GCF can be inserted into a retrovirus and the virus used to infect cancer cells thereby causing inhibition of expression of a cancer forming gene.
  • GCF can also be introduced into a cell to inhibit expression of any deleterious gene containing sequences to which GCF binds.
  • GCF DNA can be introduced into cells using a retrovirus or into human cells in tissue culture which can then be returned to the person.
  • Expression of the GCF gene in cells can be measured using known methods and a determination made, based on that measurement, as to whether or not the cells are cancer cells and whether the cells, in the case of cancer cells, are likely to grow and spread by metastases.
  • GCF RNA levels can be measured using GCF DNA.
  • antibodies to GCF (prepared by producing all or portions of the GCF protein and injecting these into various types of animals, e.g. rabbits, sheep, goats or mice) can be used to measure expression.
  • the filters were next immersed in the binding buffer (10 mM Tris pH 7.5, 50 mM NaCl, 3 mM MgCl2, and 1 mM DTT) containing 6 M guanidine hydrochloride for 5 min twice at room temperature.
  • This solution was diluted with an equal volume of the binding buffer, and the filters were submerged in this solution for 5 min.
  • This dilution step was repeated four times and then the filters were transferred to the blocking buffer (5% Carnation nonfat dry milk in the binding buffer) and left for 30 min. After washing the filters with the binding buffer, they were incubated with the labeled probe in the same buffer for > 2 hr.
  • a uniformly labeled 75 bp synthetic oligonucleotide probe containing GC-rich elements of the chicken β-actin gene promoter, composed of the region between residues -181 and -107 relative to the transcription initiation site, was made as follows. After annealing two complementary oligonucleotides,
  • the second strand was synthesized with Klenow fragment in the presence of 0.5 μM [α-32P]dCTP (the specific activity of the probe > 5 x 10* cpm/μg).
  • the filters were then washed with the binding buffer for 1 hr and subjected to autoradiography. From 1 x 10' plaques, only one GCF cDNA clone was obtained.
  • DNA binding domains were identified as follows. Several subclones were made from λGCF-1 and λGCF-4, and were digested with EcoRI (nucleotide residues 1931 and 2453), RsaI (179, 827, 1121 and 1146), S al (456) and SspI (1640). Each fragment and a PCR fragment (224-456) were purified and ligated into λgt11 either directly or by EcoRI linkers (either 8-mer, 10-mer or 12-mer to obtain in-frame fusion proteins). These phage clones were screened with the same 75 bp oligonucleotide probe and it was found that the clone N78 had the DNA binding domain.
  • the cells were collected, suspended in 1/50 volume of 0.12 M HM buffer (20 mM Hepes pH 7.9, 1 mM MgCl2, 2 mM DTT, 17% glycerol and 0.12 M KCl), subjected to three cycles of freeze and thaw, and centrifuged. The supernatant (0.7 μg/μl) was partially purified by sequence-specific oligonucleotide affinity chromatography. The affinity resin was prepared as described by
  • a DNase I footprinting assay was used.
  • a 283 bp SalI-XhoI fragment of pβAcat (Billeter et al. Mol. Cell. Biol. 8:1361 (1988)) labeled at the SalI site, a 279 bp HindIII-Sau3AI fragment labeled at the HindIII site of pGER9-l which contains the EGFR promoter region between -297 and -20, and a 221 bp HindIII-StuI fragment labeled at the HindIII site of pA10cat-CANP2 were used as probes for the β-actin, EGFR and CANP promoters, respectively. Footprinting reactions were carried out in a total volume of 50 μl of 0.12 M HM buffer containing 5 ng of the end-labeled probe, as previously described (Dynan et al. Cell 35:79 (1983)).
  • GCF GC factor
  • the binding of GCF to the SV40 promoter was also examined, but specific binding (see Figure 8) was not observed, which is further evidence that GCF is different from Sp1 and ETF.
  • From the GCF-binding sites of the β-actin and EGFR genes, it was found that GCGGGGC was responsible for the specific binding (Table 1).
  • the same or similar sequences were also found in the promoter region of the CANP gene.
  • TABLE 1 Alignment of the GCF-Binding Sites
  • DNA sequences protected by GCF in the DNase I footprinting experiments are shown here: β-actin (nucleotide residues -142 to -128 and -125 to -116), EGFR (-274 to -265 and -236 to -227), CANP (-235 to -226).
  • poly(A)-RNA of A431 was electrophoresed on a formaldehyde-1% agarose gel, and transferred to a nitrocellulose filter. A 1936 bp EcoRI fragment was used as a probe.
  • For primer extension analyses, a 5'-end-labeled oligonucleotide primer complementary to the region between residues 98 and 121 was hybridized to 10 μg of either tRNA or A431 poly(A)-RNA in 10 μl of 40% formamide, 0.4 M NaCl, 40 mM PIPES pH 6.4 and 1 mM EDTA. After hybridization at 42° for 3 hr, the reverse transcriptase reaction was carried out (Maniatis et al. Molecular Cloning; A Laboratory Manual
  • The amino-terminal end of GCF is extremely basic, with about 43% of the first 86 amino acids being either lysine, arginine, or histidine. This portion contains an unusual stretch of seven lysines (residues 23-29) that are flanked by arginines. After the basic domain, an acidic region of 45 amino acids is present that lies between residues 186 and 230. About 31% of this region is either glutamic acid or aspartic acid. In the remaining 500 amino acids of the protein, basic and acidic amino acids are evenly distributed. Another feature of GCF is the presence of four asparagine-X-serine/threonine motifs that are potential N-linked glycosylation sites (Marshall, Ann. Rev. Biochem. 41:673 (1972)).
  • The nature of the RNA species producing GCF was determined by RNA-blot analyses. As shown in Figure 5A, a 3 kb RNA transcript was detected using poly(A)-RNA from A431 cells. Assuming that the poly(A) tail is approximately 100-200 nucleotides long, the total size of the cDNA of 2825 bp agrees well with the size of the mRNA. A primer extension analysis was performed using a 24-mer oligonucleotide primer which was complementary to the region between residues 98 and 121 (Figure 5B). Two prominent bands of about 80 nucleotides and 170 nucleotides in size were obtained, indicating that transcription initiation occurs at two sites.
  • the cDNA was labelled with 32P by nick-translation and hybridized to blots containing RNA from KB cells and A431 cells.
  • RNA from GCF is processed by splicing to produce different RNA species encoding different proteins or that GCF belongs to a family of genes encoding related RNAs.
  • IPTG-induced extract was used for DNase I footprinting analysis.
  • the deletion mutant N78 showed the identical protection pattern on the β-actin gene promoter as the original clone λGCF-1.
  • For the CAT cotransfection analysis, the expression vectors carried either the full-length GCF cDNA (pSV2-GCF) or the nonfunctional segment of the CAT gene (pSV2-C), cloned under the control of the SV40 early promoter.
  • the CAT reporter plasmids contained either human EGFR promoter (-775 to -20) (pERcat6) (Johnson et al. J. Biol. Chem. 263:5693 (1988)), the truncated EGFR promoter (-105 to -20) (pERcatl ⁇ ) (Johnson et al. J. Biol. Chem.
  • pβAcat: the chicken β-actin promoter
  • pSV2cat: the SV40 early promoter
  • pA10cat-CANP1 and -CANP2: the chimeric promoters consisting of a negative regulatory element of the CANP gene and the truncated SV40 early promoter
  • the in vitro reconstituted reaction mixture contained 1 μg of a supercoiled DNA template, 100 μg of DEAE-Sepharose fraction BB, and 30 μg of fraction BC, as previously described (Kageyama et al. J. Biol. Chem. 263:6329 (1988)). After incubation for 60 min at 30°C, RNA was prepared and hybridized with the 5'-end-labeled primer 5'-TGCCATTGGGATATATCAACGGTG-3'. The primer extension products were analyzed on a 5% polyacrylamide sequencing gel.
  • GCF cDNA pSV2-GCF
  • pSV2-C control fragment
  • Although GCF represses the expression from the EGFR and β-actin promoters, their GCF-binding sites are not known to act as negative regulatory elements.
  • the CANP gene promoter was analyzed. This promoter is highly GC-rich, and its transcription is negatively regulated by repeated GC-rich elements (Hata et al. J. Biol. Chem. 264:6404 (1989)). These elements also act as inhibitory sequences on heterologous promoters in an orientation- independent manner.
  • Chimeric promoters were made, as described above, by inserting a fragment containing a negative regulatory element of the CANP gene (between -259 and -225) in either orientation into the BglII site of pA10cat (Laimins et al. P.N.A.S. USA 79:6453 (1982)), which has the CAT gene under the control of the SV40 early promoter but with the deletion of its enhancer regions (pA10cat-CANP1 and -CANP2).
  • pA10cat-CANP1 and -CANP2
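
The cotransfection arithmetic described above (1 μg of reporter, a variable amount of pSV2-GCF, pSV2-C added to a fixed 21 μg total, and activities expressed relative to a 100% control) can be sketched as follows. This is only an illustrative sketch: the function names and the example readings are hypothetical and do not come from the patent.

```python
REPORTER_UG = 1.0    # CAT reporter plasmid per transfection (from the text)
TOTAL_DNA_UG = 21.0  # total DNA per transfection (from the text)


def filler_amount(gcf_ug: float) -> float:
    """Micrograms of pSV2-C needed to bring the DNA mix up to the fixed total."""
    filler = TOTAL_DNA_UG - REPORTER_UG - gcf_ug
    if gcf_ug < 0 or filler < 0:
        raise ValueError("pSV2-GCF amount must be between 0 and 20 ug")
    return filler


def relative_cat_activity(sample: float, control: float) -> float:
    """CAT activity as a percentage of the pSV2-C-only control (control = 100%)."""
    if control <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * sample / control


if __name__ == "__main__":
    # Hypothetical raw readings (arbitrary units), keyed by ug of pSV2-GCF.
    readings = {0.0: 5000.0, 5.0: 2500.0, 20.0: 1000.0}
    control = readings[0.0]
    for gcf_ug, value in readings.items():
        pct = relative_cat_activity(value, control)
        print(f"{gcf_ug} ug pSV2-GCF + {filler_amount(gcf_ug)} ug pSV2-C: {pct}%")
```

Holding the total DNA constant with the control plasmid keeps the transfected DNA mass the same across conditions, so differences in the percentages are attributable to the amount of GCF expression plasmid rather than to DNA load.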

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Genetics & Genomics (AREA)
  • Biomedical Technology (AREA)
  • Chemical & Material Sciences (AREA)
  • Immunology (AREA)
  • Biotechnology (AREA)
  • Zoology (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Microbiology (AREA)
  • Wood Science & Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Urology & Nephrology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Hematology (AREA)
  • Biophysics (AREA)
  • Cell Biology (AREA)
  • Tropical Medicine & Parasitology (AREA)
  • Plant Pathology (AREA)
  • Food Science & Technology (AREA)
  • Medicinal Chemistry (AREA)
  • Analytical Chemistry (AREA)
  • General Physics & Mathematics (AREA)
  • Pathology (AREA)
  • Preparation Of Compounds By Using Micro-Organisms (AREA)
  • Peptides Or Proteins (AREA)

Abstract

The present invention relates to a DNA sequence encoding a DNA binding protein that represses transcription from promoters to which it binds, and to the binding protein itself. The invention also relates to a recombinant DNA molecule that includes such a sequence and to cells transformed therewith. The invention further relates to methods of utilizing the DNA sequence and protein encoded therein.

Description

DNA BINDING PROTEIN
BACKGROUND OF THE INVENTION
Technical Field
The present invention relates, in general, to a DNA binding protein, and, in particular, to a DNA binding protein that represses transcription from promoters to which it binds. The invention further relates to a DNA sequence encoding the DNA binding protein, to a recombinant DNA molecule that includes such a sequence and to cells transformed therewith. The invention further relates to a method of regulating gene expression.
Background Information
The interactions between DNA binding proteins and cis-acting elements play an important role in controlling gene expression (for reviews see Breathnach et al. Ann. Rev. Biochem. 50:349 (1981); Dynan et al. Nature 316:774 (1985); McKnight et al. Cell 46:795 (1986)). A combination of different DNA elements can generate a variety of types of transcriptional regulation. In addition, multiple factors can bind to the same DNA sequence or region and produce diversity in transcriptional control. One such example involves the octamer transcription factors. Several proteins that specifically recognize the octamer motif (ATGCAAAT) have been identified: OTF-1 (NF-A1, OBP100, NFIII) is a ubiquitous factor that is utilized by several genes (Fletcher et al. Cell 51:773 (1987); Singh et al. Nature 319:154 (1986); Sturm et al. Genes Dev. 1:1147 (1987); O'Neil et al. Science 241:1210 (1988)). OTF-2 (NF-A2) is a lymphoid-specific protein and activates transcription from B cell-specific promoters (Scheidereit et al. Cell 51:783 (1987) and Nature 336:551 (1988); Staudt et al. Nature 323:640 (1986); Müller et al. Nature 336:544 (1988)). NF-A3 is unique to embryonal carcinoma cells and mediates transcriptional repression (Lenardo et al. Science 243:544 (1989)). Therefore, the same DNA sequence may function as either a positive or negative control element depending upon the DNA binding factors present in different cell types.
Sp1 is a well characterized transcription factor that binds to GC boxes (GGGCGG) and stimulates transcription from promoters which contain those sites (for reviews see Dynan et al. Nature 316:774 (1985); Kadonaga et al. Trends Biochem. 11:20 (1986)). An activator, termed ETF, has been identified that also binds to GC-rich sequences including GC boxes (Kageyama et al. J. Biol. Chem. 263:6329 (1988) and P.N.A.S. USA 85:5016 (1988)). The sequence requirement for ETF binding is rather loose and it binds to GC-rich sequences not recognized by Sp1. Furthermore, several enhancer-binding factors have been reported which bind to GC boxes (Mitchell et al. Cell 50:847 (1987); Mermod et al. Nature 332:557 (1988)).
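
As an illustration of what recognizing a GC box means at the sequence level, the following minimal sketch scans a DNA string for the GGGCGG hexamer on both strands. It is not part of the patent's methods; the function names and the test sequence are hypothetical, and exact string matching is a simplifying assumption (real factor binding tolerates sequence variation).

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_motif(seq: str, motif: str = "GGGCGG") -> List[Tuple[int, str]]:
    """List (0-based start, strand) for exact motif matches on either strand.

    A '-' hit means the top strand carries the reverse complement of the
    motif at that position, i.e. the motif itself lies on the bottom strand.
    """
    seq = seq.upper()
    motif = motif.upper()
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits


if __name__ == "__main__":
    # Hypothetical test sequence containing one GC box on each strand.
    print(find_motif("AAGGGCGGTTCCGCCCAA"))
```

The `motif` argument can be swapped for any other short exact sequence, so the same scan applies to other GC-rich elements discussed in this document.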
GC-rich sequences are ubiquitous elements and are found in the promoter regions of many housekeeping genes and cellular oncogenes (Melton et al. P.N.A.S. USA 81:2147 (1984); Reynolds et al. Cell 38:275 (1984); Ishii et al. P.N.A.S. USA 82:4920 (1985) and Science 230:1378 (1985)). In addition to these positive regulators, the presence of a negative trans-acting factor which binds to GC-rich sequences has also been suggested. The promoter of the calcium-dependent protease (CANP) gene has a high GC content, and its transcription is negatively regulated by repeated GC-rich elements (Hata et al. J. Biol. Chem. 264:6404 (1989)). Therefore, it is likely that various factors with different functions may interact with the same or similar sequences to control gene expression in a flexible manner.
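
The notion of a GC-rich promoter region can be made concrete by computing the G+C fraction over a sliding window. The sketch below is an assumption-laden illustration: the window size, step, and function names are hypothetical, and the patent does not define a numerical threshold for "GC-rich".

```python
from typing import List, Tuple


def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def gc_windows(seq: str, window: int = 20, step: int = 1) -> List[Tuple[int, float]]:
    """Return (0-based start, GC fraction) for each full window along seq."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        (i, gc_fraction(seq[i:i + window]))
        for i in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Hypothetical toy sequence; a real promoter would be far longer.
    for start, frac in gc_windows("AAGGCC", window=4, step=2):
        print(start, frac)
```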
The present invention relates to a novel trans-acting factor (and DNA sequence encoding same) that binds to GC-rich promoters and represses transcription therefrom. The invention further relates to methods of using this trans-acting factor in regulating gene expression.
SUMMARY OF THE INVENTION
It is a general object of the invention to provide a DNA binding protein that regulates transcription.
It is a specific object of the invention to provide a trans-acting protein (and DNA sequence encoding same) that represses gene expression.
It is a further object of the invention to provide a method of regulating expression of specific genes, for example, genes associated with specific disease states.
Further objects and advantages of the present invention will be clear from the description that follows.
In one embodiment, the present invention relates to a substantially pure form of a DNA binding protein that recognizes GC rich sequences and represses transcription from GC rich promoters when bound thereto. In another embodiment, the present invention relates to a DNA fragment encoding the above-described DNA binding protein.
In a further embodiment, the present invention relates to a recombinant DNA molecule comprising a vector, and the above-described DNA fragment.
In yet another embodiment, the present invention relates to a host cell transformed with the above-described recombinant DNA molecule.
In another embodiment, the present invention relates to a process of producing the above-described DNA binding protein. The method comprises culturing the above-described host cell under conditions such that the above-described DNA fragment is expressed and the DNA binding protein thereby produced.
In a further embodiment, the invention relates to a method of regulating expression of a gene in a cell. The method comprises inserting into a cell a nucleotide sequence encoding GC Factor, or binding fragment thereof, under conditions such that the sequence, or fragment, binds to a promoter from which the gene is transcribed so that expression of the gene is inhibited.
In yet another embodiment, the present invention relates to a diagnostic assay for determining the metastatic potential of a cancer cell. The method comprises measuring the level of expression in the cell of the GC Factor gene and correlating that level with standard values indicative of metastatic potential.
BRIEF DESCRIPTION OF THE FIGURES
FIGURE 1: The Recombinant Molecule pSV2-GCF.
FIGURE 2: The Recombinant Molecule pT7-GCF.
FIGURE 3: Characterization of DNA Binding Properties of Molecularly Cloned Factors.
(A): DNase I footprinting analyses of the β-actin promoter. Extracts of the isopropyl-thio-β-galactoside (IPTG)-induced lysogens were prepared from two different positive clones (C and GC Factor (GCF)) and subjected to DNase I footprinting assays. A SalI-XhoI fragment of the β-actin gene promoter labeled at the SalI site was used as a probe. Amounts of the protein are indicated above each lane in microliters. Lanes 1 and 7 (-) show the control DNase digestion pattern with no protein factors added. Lane 2 indicates the reaction with a purified transcription factor ETF. The nucleotide sequence protected by the factor encoded by λGCF-1 (lanes 5 and 6) is depicted on the right.
(B): DNase I footprinting analyses of the epidermal growth factor receptor (EGFR) promoter. The extract of IPTG-induced λGCF-1 lysogen was subjected to the DNase I footprinting analysis. A HindIII-Sau3AI fragment of the EGFR promoter labeled at the HindIII site was used as a probe. Lane 1 (-) shows the control DNase digestion pattern with no protein factors added. The nucleotide sequence protected by the factor encoded by λGCF-1 is depicted on the right. The location of Sp1- and ETF-binding sites is indicated on the left.
FIGURE 4: Structural Analysis of Cloned GCF cDNA.
(A) : Schematic structure of the cDNA clones. The boxed region denotes the open reading frame, and EcoRI sites (R) are indicated.
(B): Charge distribution of GCF. Net charges of 10 amino acids were averaged over successive 30 amino acids. The striped and stippled areas indicate the highly basic and acidic regions, respectively.
(C): Nucleotide sequence of GCF cDNA and the deduced primary structure of the protein. The position of the primer used for the primer extension analysis (Figure 5B) is underlined in the nucleotide sequence. The possible glycosylation sites conforming to the Asn-X-Ser/Thr sequence are underlined in the amino acid sequence. Leucine and methionine residues which could represent a leucine zipper motif are boxed.
FIGURE 5: Structural Analysis of GCF.
(A) : Northern blot analysis of GCF. Poly(A)-RNA (20 μg) of A431 human epidermoid carcinoma cells was analyzed by a 1936 bp EcoRI fragment of λGCF-1 as a probe. The sizes (kb) of markers are indicated on the left.
(B): Primer extension analysis. A 5'-end-labeled oligonucleotide probe complementary to the region between residues 98 and 121 was hybridized to either tRNA (lane 1) or A431 poly(A)-RNA (lane 2), and the primer-extended products were analyzed on a 5% polyacrylamide sequencing gel. The sizes (nt) of the DNA markers are shown on the left.
(C): In vitro translation analysis of GCF. The GCF RNA synthesized in vitro with T7 polymerase was incubated in reticulocyte lysates with [35S]methionine (lane 2). The [35S]methionine-labeled product was analyzed on an SDS-polyacrylamide gel. Lane 1 indicates the control reaction with no RNA samples added. The sizes (kd) of the protein standards are shown on the left.
FIGURE 6: Deletion Analysis with Bacterially Expressed Proteins for Identification of the DNA Binding Domain.
(A): Deletion analysis with bacterially expressed proteins. DNA fragments containing parts of GCF cDNA were cloned into the λgt11 vector, and the plaques were screened with a 75 bp oligonucleotide probe having the GC-rich sequences of the β-actin gene promoter. Open bars represent the parts of the GCF protein that are present, and thin lines indicate the deleted regions. Amino acid residue numbers of the end points are shown on the left. Percentage rates of the positive plaques are indicated on the right. In each screening, at least 100 plaques were analyzed.
(B): DNase I footprinting assays on the β-actin gene promoter. Extracts of the IPTG-induced lysogen from the deletion mutant N78 containing the first 78 amino acid residues of GCF were prepared and subjected to DNase I footprinting experiments (lanes 4 and 5). The probe was a SalI-XhoI fragment labeled at the SalI site of the β-actin gene promoter. Lanes 2 and 3 present the DNase digestion pattern of extracts of the lysogen from λGCF-1. Amounts of the protein are indicated above each lane in microliters. Lanes 1 and 6 show the control experiment with no protein factors added.
FIGURE 7: Cotransfection Assays of the GCF Expression Plasmids and the CAT Reporter Gene Under the Control of Various Promoters.
(A): Schematic structures of the GCF expression and the chloramphenicol acetyltransferase (CAT) reporter plasmids. The SV40 early promoter was placed in front of the full-length GCF cDNA in the GCF expression plasmid (pSV2-GCF). The control expression plasmid had the nonfunctional segment of the CAT gene (pSV2-C). The CAT reporter plasmids contained either the wild type human EGFR promoter (pERcat6), the truncated EGFR promoter lacking the major GCF-binding sites (pERcat15), the chicken β-actin promoter (pβAcat), or the SV40 early promoter (pSV2cat). Closed and open circles depict strong and weak GCF-binding sites, respectively.
(B): CAT analysis by cotransfection assays. One μg of a CAT reporter plasmid and various amounts of pSV2-GCF were cotransfected into CV1 cells. The total DNA amounts were adjusted to 21 μg with pSV2-C. The CAT activities of the control cotransfection assays using 1 μg of a CAT reporter plasmid and 20 μg of pSV2-C were assigned to 100%, and percentage ratios of the CAT activities of the control experiments and the experiments with pSV2-GCF were calculated. The CAT reporter plasmids were pERcat6 (o); pERcat15 (•); pβAcat (Δ); and pSV2cat (o). Results are the average of at least four independent experiments. The range of values is shown by vertical bars at each point.
FIGURE 8: DNA Binding and Transcriptional Properties of GCF to the CANP Negative Control Element.
(A): DNase I footprinting analyses of the chimeric promoter containing the SV40 promoter and the CANP negative control element. The purified Sp1 (lane 2) and the extract of the λGCF-1 lysogen (lane 3) were subjected to the DNase I footprinting assay. A HindIII-StuI fragment of pA10cat-CANP2 labeled at the HindIII site was used as a probe. Lane 1 (-) shows the control DNase digestion pattern with no protein factors added. The location of the SV40 promoter and the CANP negative control element is shown on the left. The nucleotide sequence protected by GCF is indicated on the right.
(B): CAT analysis by cotransfection assays. Ten μg each of a CAT reporter plasmid and either pSV2-GCF (+) or pSV2-C (-) was cotransfected into CV1 cells. pA10cat-CANP1 and pA10cat-CANP2 have the CANP negative control element in the BglII site of pA10cat in the same orientation as in the CANP gene and in the opposite orientation, respectively. The CAT activity of the control cotransfection assay using 10 μg each of pA10cat and pSV2-C was assigned to 100%, and percentage ratios of the CAT activities of the control and other experiments were calculated. Results are the average of at least four independent experiments differing by <5%. Small circles in the SV40 promoter indicate the 21 bp repeats.
FIGURE 9: In Vitro Transcription Analysis of GCF.
One μg of supercoiled pGER15 (lanes 1 and 2), pGER6 (lanes 3, 4 and 7-9), and pβAcat (lanes 5 and 6) was used for transcription from the truncated EGFR (-105 to -20), the wild type EGFR (-775 to -20), and β-actin promoters, respectively. The transcription reaction also contained 100 μg of DEAE-Sepharose fraction BB, 30 μg of fraction BC and either no additional factors (lanes 1, 3, 5 and 7), the factor made from λGCF-1 (lanes 2, 4, 6 and 9) or the factor made from N78 (lane 8). RNA products were analyzed by a primer-extension assay.
DETAILED DESCRIPTION OF THE INVENTION
The present invention relates to a DNA binding protein capable of repressing transcription from promoters to which it binds. The invention further relates to DNA sequences (fragments) encoding all, or unique portions (i.e., at least 5 amino acids), of the binding protein. The invention also relates to recombinant molecules containing such DNA sequences, and to cells transformed therewith. In a further embodiment, the present invention relates to methods of controlling expression of specific genes, for example, genes associated with particular disease states.
The protein of the present invention recognizes GC rich sequences present in promoters, including promoters of housekeeping genes and cellular oncogenes (for example, the EGFR, β-actin or CANP promoter). The protein represses transcription from the particular promoter to which it binds. The protein can have the complete sequence given in Figure 4C, in which case it is designated GCF. The protein can also have the amino acid sequence of a molecule having substantially the same DNA binding and transcription repressive properties as the molecule given in Figure 4C (for example, allelic forms of GCF). Alternatively, the protein (or polypeptide) of the invention can have an amino acid sequence corresponding to a unique portion of the sequence given in Figure 4C (or allelic form thereof). As an example, the protein (or polypeptide) can have an amino acid sequence corresponding to the binding domain of the Figure 4C sequence (or allelic variation thereof). The protein can be present in a substantially pure form, that is, in a form substantially free of proteins and nucleic acids with which it is normally associated. GCF, including GCF made in cell-free extracts using GCF RNA and GCF made using recombinant techniques, can be purified using protocols known in the art. The protein can be used as an antigen, in protocols known in the art, to produce antibodies thereto, both monoclonal and polyclonal.
In another embodiment, the present invention relates, as indicated above, to DNA sequences (including cDNA sequences) that encode the entire amino acid sequence given in Figure 4C (the specific DNA sequence given in Figure 4C being only one example) , or any unique portion thereof. DNA sequences to which the invention relates also include those encoding proteins (or polypeptides) having substantially the same DNA binding and transcription repressive properties of GCF (for example, allelic forms of the amino acid sequence of Figure 4C) .
In another embodiment, the present invention relates to a recombinant DNA molecule that includes a vector and a DNA sequence as described above (advantageously, a DNA sequence encoding the protein shown in Figure 4C or a protein having the DNA binding and/or transcription repressive properties of that protein) . The vector can take the form of a virus or a plasmid (for example, pUC19) vector. The DNA sequence can be present in the vector operably linked to regulatory elements, including, for example, a promoter. The recombinant molecule can be suitable for transforming procaryotic or eucaryotic cells, advantageously, mammalian cells. Specific examples of recombinant molecules containing GCF cDNA are shown in Figures 1 and 2. pSV2-GCF has the SV40 early promoter directing expression of GCF in eucaryotic cells. pT7-GCF has the T7 promoter enabling in vitro transcription of GCF by T7 polymerase.
In a further embodiment, the present invention relates to a host cell transformed with the above-described recombinant molecule. The host can be procaryotic (for example, bacterial) , lower eucaryotic (i.e., fungal, including yeast) or higher eucaryotic (i.e., mammalian, including human) . Transformation can be effected using methods known in the art. The transformed host cells can be used as a source for the DNA sequence described above (which sequence constitutes part of the recombinant molecule) . When the recombinant molecule takes the form of an expression system, the transformed cells can be used as a source for the above-described protein.
The DNA binding proteins and nucleic acid sequences of the present invention can be used both in a research setting (for example, to facilitate an understanding of transcription) and in a clinical setting to, for example, control expression of genes associated with particular disease states. For example, to inhibit cell growth, cDNA encoding all or part of GCF can be inserted into a retrovirus and the virus used to infect cancer cells thereby causing inhibition of expression of a cancer forming gene. GCF can also be introduced into a cell to inhibit expression of any deleterious gene containing sequences to which GCF binds. To change the proteins made by cells, and thereby alter cell behavior, GCF DNA can be introduced into cells using a retrovirus or into human cells in tissue culture which can then be returned to the person. (See Science 244:1275 (1989) .) For diagnostic purposes, expression of the GCF gene in cells can be measured using known methods and a determination made, based on that measurement, as to whether or not the cells are cancer cells and whether the cells, in the case of cancer cells, are likely to grow and spread by metastases. To accomplish this, GCF RNA levels can be measured using GCF DNA. Alternatively, antibodies to GCF (prepared by producing all or portions of the GCF protein and injecting these into various types of animals e.g. rabbits, sheep, goats or mice) can be used to measure expression.
The invention is described in further detail in the following non-limiting Examples.
Example I
Isolation and Characterization of cDNA Clones for a GC-Rich Sequence Binding Factor
(i) Cloning and sequencing analyses of GCF cDNA: cDNA was synthesized by oligo(dT) priming of poly(A)-RNA of A431 human epidermoid carcinoma cells. Double strand cDNA was next constructed, treated with EcoRI methylase, ligated to EcoRI linkers, size-fractionated to select inserts longer than 1 kb and cloned into λgt11 vectors (Promega) (Huynh et al. DNA Cloning: A Practical Approach, vol. 1, D.M. Glover, ed. (Oxford: IRL Press), pp. 49-78 (1985)). Usually, 1-2 x 10* plaques starting from 8 μg of poly(A)-RNA were obtained. Screening was carried out according to Vinson et al. (Genes Dev. 2:801 (1988)), a modified version of the method by Singh et al. (Cell 52:415 (1988)). Briefly, Y1090 was infected by recombinant λ phages, plated and incubated at 42°C for 5 hr. The nitrocellulose filters pretreated in 10 mM IPTG were overlaid on the plates and incubated at 37°C for > 6 hr. The filters were next immersed in the binding buffer (10 mM Tris pH 7.5, 50 mM NaCl, 3 mM MgCl2 and 1 mM DTT) containing 6 M guanidine hydrochloride for 5 min twice at room temperature. This solution was diluted with an equal volume of the binding buffer, and the filters were submerged in this solution for 5 min. This dilution step was repeated four times and then the filters were transferred to the blocking buffer (5% Carnation nonfat dry milk in the binding buffer) and left for 30 min. After washing the filters with the binding buffer, they were incubated with the labeled probe in the same buffer for > 2 hr. A uniformly labeled 75 bp synthetic oligonucleotide probe containing GC-rich elements of the chicken β-actin gene promoter, composed of the region between residues -181 and -107 relative to the transcription initiation site (Kost et al. Nucl. Acids Res. 11:8287 (1983)), was made as follows. After annealing two complementary oligonucleotides,
5'AGCGATGGGGGCGOGGGGGGGGGGGGCGCGCGCCAGGCGGGGCGGGGCGGGGCGAGGGGCGGGGCGGGGCGAGGC3' and 5'GCCTCGCCCCG3', the second strand was synthesized with Klenow fragment in the presence of 0.5 μM [α-32P]dCTP (the specific activity of the probe > 5 x 10* cpm/μg). The filters were then washed with the binding buffer for 1 hr and subjected to autoradiography. From 1 x 10' plaques, only one GCF cDNA clone was obtained.
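The pairing of the short oligonucleotide with the 3' end of the 75-mer can be checked by computing a reverse complement. The following Python sketch is illustrative only and forms no part of the disclosed procedure; it uses only the 3'-terminal 27 nucleotides of the 75-mer as printed above, and the helper name reverse_complement is a hypothetical choice.

```python
# Illustrative check, not part of the original disclosure.
# Only the unambiguous 3'-terminal portion of the 75-mer is used here.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


template_3prime = "GGGGCGAGGGGCGGGGCGGGGCGAGGC"  # 3' end of the 75-mer
primer = "GCCTCGCCCCG"                            # short 11-mer oligonucleotide

# The primer anneals where the template reads as its reverse complement.
primer_site = reverse_complement(primer)
anneals_at_3prime_end = template_3prime.endswith(primer_site)

print(primer_site, anneals_at_3prime_end)
```

If the check holds, the 11-mer sits on the extreme 3' end of the 75-mer, so that extension by Klenow fragment would copy the rest of the template into the labeled second strand.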
(ii) DNA binding analysis: DNA binding domains were identified as follows. Several subclones were made from λGCF-1 and λGCF-4, and were digested with EcoRI (nucleotide residues 1931 and 2453), RsaI (179, 827, 1121 and 1146), SmaI (456) and SspI (1640). Each fragment and a PCR fragment (224-456) were purified and ligated into λgt11 either directly or by EcoRI linkers (either 8-mer, 10-mer or 12-mer to obtain in-frame fusion proteins). These phage clones were screened with the same 75 bp oligonucleotide probe and it was found that the clone N78 had the DNA binding domain. To characterize the DNA binding specificity, Y1089 lysogens containing λGCF-1 and N78 were isolated (Huynh et al. DNA Cloning: A Practical Approach, vol. 1, D.M. Glover, ed. (Oxford: IRL Press), pp. 49-78 (1985)), grown until OD600 = 0.5, incubated at 45°C for 20 min, and then induced with 10 mM IPTG. After incubation at 37°C for 1 hr, the cells were collected, suspended in 1/50 volume of 0.12 M HM buffer (20 mM Hepes pH 7.9, 1 mM MgCl2, 2 mM DTT, 17% glycerol and 0.12 M KCl), subjected to three cycles of freeze and thaw, and centrifuged. The supernatant (0.7 μg/μl) was partially purified by sequence-specific oligonucleotide affinity chromatography. The affinity resin was prepared as described by Kadonaga et al. (P.N.A.S. USA 83:5889 (1986)) and Wu et al. (Science 238:1247 (1987)) by annealing two oligonucleotides, 5'CCCGCGCGAGCTAGACGTCCGGGCAGCCCCCGGCGCAGCGCGGCCG3' and 5'CGGCCGCGCTGCGCCGGGGGCTGCCCGGACGTCTAG3'. This region contained GC-rich sequences of the EGFR promoter region. The eluate was concentrated by using Centricon 30 (Amicon). The final protein concentration was ~0.2 μg/μl. Either the supernatant or the partially purified sample from the affinity column was used for the DNA binding assay. To analyze the DNA binding activity, a DNase I footprinting assay was used. For the DNase I footprinting experiment, a 283 bp SalI-XhoI fragment of pβAcat (Billeter et al. Mol. Cell. Biol. 8:1361 (1988)) labeled at the SalI site, a 279 bp HindIII-Sau3AI fragment labeled at the HindIII site of pGER9-1 which contains the EGFR promoter region between -297 and -20 (Kageyama et al. P.N.A.S. USA 85:5016 (1988)), and a 221 bp HindIII-StuI fragment labeled at the HindIII site of pA10cat-CANP2 were used as probes for the β-actin, EGFR and CANP promoters, respectively. Footprinting reactions were carried out in a total volume of 50 μl of 0.12 M HM buffer containing 5 ng of the end-labeled probe, as previously described (Dynan et al. Cell 35:79 (1983)).
In the initial screening of the above-described library, two positive clones were obtained. Lysogens with these phage clones were prepared, as indicated above, to analyze their DNA binding specificity and extracts from the IPTG-induced lysogens were subjected to DNase I footprinting assays. As shown in Figure 3A, the extract from λGCF-1 protected the GC-rich sequences of the β-actin gene promoter (lanes 5 and 6), while the other extract did not protect any specific regions (lanes 3 and 4). These data indicate that clone λGCF-1 encodes a specific GC-rich sequence binding factor. This experiment also shows that the region protected by this factor is not identical to that protected by a transcriptional activator ETF (compare lane 2 with 5 and 6), indicating this factor is different from ETF.
To investigate whether this factor interacts with GC-rich sequences of other genes, its interaction with the epidermal growth factor receptor (EGFR) promoter region was analyzed. As shown in Figure 3B, the extract from λGCF-1 bound to at least two GC-rich regions of the EGFR promoter, between -278 and -263 and between -243 and -225. One of them (between -243 and -225) largely overlaps the ETF-binding site. Weak binding in the region between -150 and -90, which contains an Sp1-binding site, was also observed. These results demonstrate that the factor encoded by λGCF-1 is able to interact with various GC-rich sequences including some of the Sp1- and ETF-binding sites. However, because the DNA binding specificity of this factor is different from those of Sp1 and ETF, this factor was designated GCF (GC factor). The binding of GCF to the SV40 promoter was also examined, but specific binding (see Figure 8) was not observed, which is further evidence that GCF is different from Sp1 and ETF. By comparison of GCF-binding sites of the β-actin and EGFR genes, it was found that GCGGGGC was responsible for the specific binding (Table 1). The same or similar sequences were also found in the promoter region of the CANP gene.
TABLE 1
Alignment of the GCF-Binding Sites
Promoter    DNA Sequence
β-Actin     G   G   G   C   G   G   G   G   C   G
EGFR        G   C   G   C   G   G   G   C   C   G
EGFR        C   A   G   C   G   C   G   G   C   C
CANP        T   C   C   C   G   G   C   G   C   T
Consensus   N   N   G   C   G   G   G   G   C   N
                   (C)         (C) (C) (C)
DNA sequences protected by GCF in the DNase I footprinting experiments are shown here; β-actin (nucleotide residues -142 to -128 and -125 to -116), EGFR (-274 to -265 and -236 to -227), CANP (-235 to -226).
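The alignment in Table 1 can be expressed as a simple pattern for scanning other promoter sequences. The following Python sketch is an illustration under stated assumptions and is not part of the original analysis: it assumes the parenthesised C alternatives fall under positions 3, 6, 7 and 8 of the consensus, as the four aligned sites suggest, and the names GCF_SITE and find_gcf_sites are hypothetical.

```python
import re

# Consensus N N G C G G G G C N from Table 1, with C also tolerated at
# positions 3, 6, 7 and 8 (assumed placement of the bracketed alternatives).
GCF_SITE = re.compile(r"[ACGT]{2}[GC]CG[GC]{3}C[ACGT]")

# The four aligned 10-mers exactly as listed in Table 1.
TABLE1_SITES = [
    ("beta-actin", "GGGCGGGGCG"),
    ("EGFR", "GCGCGGGCCG"),
    ("EGFR", "CAGCGCGGCC"),
    ("CANP", "TCCCGGCGCT"),
]


def find_gcf_sites(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, 10-mer) for every overlapping candidate site."""
    scanner = re.compile("(?=(" + GCF_SITE.pattern + "))")
    return [(m.start(), m.group(1)) for m in scanner.finditer(seq.upper())]


for promoter, site in TABLE1_SITES:
    print(promoter, site, bool(GCF_SITE.fullmatch(site)))
```

A match of this kind only flags a candidate site. The pattern is deliberately permissive, and binding would still have to be confirmed experimentally, for example by DNase I footprinting as described above.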
Example II
Structural Analysis of GCF and Localization of GCF Gene
(i) DNA sequencing: For sequencing analysis, a chain termination method was used (Sanger et al. P.N.A.S. USA 74:5463 (1977)). For ambiguous regions, the Maxam-Gilbert technique was used to confirm the sequence (Maxam and Gilbert, Methods in Enzymol. 65:499 (1980)).
(ii) Protein and RNA analysis: For in vitro translation analyses, the full-length cDNA was cloned downstream of the T7 promoter and linearized by BspHI. From this DNA template, GCF RNA was synthesized with T7 polymerase, treated with RNase-free DNase, purified, incubated at 65°C for 5 min and subjected to in vitro translation in reticulocyte lysates with [35S]methionine. The [35S]methionine-labeled product was analyzed on an SDS-polyacrylamide gel. For RNA-blot analysis, poly(A)-RNA of A431 was electrophoresed on a formaldehyde-1% agarose gel, and transferred to a nitrocellulose filter. A 1936 bp EcoRI fragment was used as a probe. For primer extension analyses, a 5'-end-labeled oligonucleotide primer complementary to the region between residues 98 and 121 was hybridized to 10 μg of either tRNA or A431 poly(A)-RNA in 10 μl of 40% formamide, 0.4 M NaCl, 40 mM PIPES pH 6.4 and 1 mM EDTA. After hybridization at 42°C for 3 hr, the reverse transcriptase reaction was carried out (Maniatis et al. Molecular Cloning: A Laboratory Manual (Cold Spring Harbor, New York: Cold Spring Harbor Laboratory) (1982)), and the primer-extended product was electrophoresed on a 5% polyacrylamide sequencing gel.
Sequence analysis indicated that clone λGCF-1 did not contain the 3'-portion of the GCF gene; therefore, the library was rescreened using a 522 bp EcoRI fragment of λGCF-1 as a probe. From 1.5 x 10* plaques, seven positive clones were obtained. All these clones were purified and partially sequenced. Two clones were found that contained the 3' end and one of them (λGCF-4) was totally sequenced (Figure 4A).
The complete cDNA sequence and deduced amino acid sequence of GCF are presented in Figure 4C. The open reading frame starts with an ATG at residue 224 and ends with a termination codon at residue 2576, indicating that the encoded protein should contain 784 amino acids and have a Mr of 91 kd. When the cDNA was placed under the control of a T7 promoter, and RNA was prepared and translated in reticulocyte lysates, a [35S]methionine-labeled protein of approximately 100 kd was produced (Figure 5C). This size agrees well with the predicted Mr of 91 kd. One of the most striking features of this protein is its charge distribution (Figure 4B). The amino-terminal end of GCF is extremely basic, with about 43% of the first 86 amino acids being either lysine, arginine, or histidine. This portion contains an unusual stretch of seven lysines (residues 23-29) that are flanked by arginines. After the basic domain, an acidic region of 45 amino acids is present that lies between residues 186 and 230. About 31% of this region is either glutamic acid or aspartic acid. In the remaining 500 amino acids of the protein, basic and acidic amino acids are evenly distributed. Another feature of GCF is the presence of four asparagine-X-serine/threonine motifs that are potential N-linked glycosylation sites (Marshall, Ann. Rev. Biochem. 41:673 (1972)).
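The relationship between the stated open reading frame coordinates and the predicted protein length can be verified with simple arithmetic. The Python sketch below is illustrative only; it assumes that residue 2576 is the first nucleotide of the termination codon, and the variable names are hypothetical.

```python
# Illustrative arithmetic, not part of the original disclosure.
ORF_START = 224          # first nucleotide of the initiating ATG
STOP_CODON_START = 2576  # assumed: first nucleotide of the termination codon

# Coding nucleotides run from the ATG up to, but not including, the stop codon.
coding_nt = STOP_CODON_START - ORF_START
codons, remainder = divmod(coding_nt, 3)

# Mean residue mass implied by the predicted Mr of 91 kd.
mean_residue_mass = 91_000 / codons

# About 43% of the first 86 residues are stated to be Lys, Arg or His.
basic_residues = round(0.43 * 86)

print(codons, remainder, round(mean_residue_mass), basic_residues)
```

Under that assumption the coordinates give 2352 coding nucleotides, that is, 784 codons with no remainder, consistent with the 784 amino acids stated above.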
The nature of the RNA species producing GCF was determined by RNA-blot analyses. As shown in Figure 5A, a 3 kb RNA transcript was detected using poly(A)-RNA from A431 cells. Assuming that the poly(A) tail is approximately 100-200 nucleotides long, the total size of the cDNA of 2825 bp agrees well with the size of the mRNA. A primer extension analysis was performed using a 24-mer oligonucleotide primer complementary to the region between residues 98 and 121 (Figure 5B). Two prominent bands of about 80 nucleotides and 170 nucleotides in size were obtained, indicating that transcription initiation occurs at two sites. One is about 50 nucleotides upstream of nucleotide 1; the other lies about 40 nucleotides downstream from nucleotide 1. These results clearly demonstrate that the sequence shown in Figure 4 represents a full-length or nearly full-length cDNA. GCF DNA hybridizes with chromosome 2 at a region between p11.1 and p11.2.
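The size comparisons in this paragraph follow from short calculations. The Python sketch below is a worked illustration only, assuming that the 5' end of the primer pairs with nucleotide 121 of the cDNA numbering; the function name start_offset_from_nt1 is hypothetical.

```python
# Illustrative arithmetic, not part of the original disclosure.
CDNA_LENGTH = 2825             # bp, total size of the cloned cDNA
POLY_A_RANGE = (100, 200)      # nt, assumed poly(A) tail length
PRIMER_5PRIME_POSITION = 121   # primer is complementary to residues 98-121


def start_offset_from_nt1(product_length: int) -> int:
    """Offset of the mapped start site relative to nucleotide 1.

    A product of length L ends at position 121 - L + 1, so its offset from
    nucleotide 1 is 121 - L (positive: downstream, negative: upstream).
    """
    return PRIMER_5PRIME_POSITION - product_length


mrna_sizes = [CDNA_LENGTH + tail for tail in POLY_A_RANGE]
offsets = {length: start_offset_from_nt1(length) for length in (80, 170)}

print(mrna_sizes)  # expected transcript size range in nucleotides
print(offsets)     # start-site offsets for the two extension products
```

The computed transcript size of roughly 2.9 to 3.0 kb matches the 3 kb band, and the two products map about 40 nucleotides downstream and about 50 nucleotides upstream of nucleotide 1, in line with the figures given above.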
Example III
Characterization of mRNA Species
To determine the RNA species hybridizing with the 3.0 kb GCF cDNA clone, the cDNA was labelled with 32P by nick-translation and hybridized to blots containing RNA from KB cells and A431 cells.
Three mRNA species were detected, with sizes of 4.2 kb, 3.0 kb and 1.0 kb. These data indicate that either RNA from GCF is processed by splicing to produce different RNA species encoding different proteins or that GCF belongs to a family of genes encoding related RNAs.
Example IV
Identification of the DNA Binding Domain
(i) See Example I, item (ii)
Examination of the GCF sequence did not reveal any obvious homology with DNA binding domains of other transcription factors including homeo boxes and zinc finger structures. To identify the region responsible for the specific binding, several deletion mutants were made. Fragments of GCF cDNA were cloned into the λgt11 vector to make fusion proteins with β-galactosidase, the phage were plated, and the plaques screened with the same probe which had been used to isolate λGCF-1. Because only one of the two orientations is able to express the protein, theoretically 50% of the plaques should be positive if the cloned fragment encodes the DNA binding domain. Due to the inefficiency of the cloning steps, positive plaques were usually obtained at less than a 50% rate. As shown in Figure 6A, a clone containing the fragment encoding the amino-terminal 78 amino acids (N78) exhibited the DNA binding activity, while the clones containing the other regions did not. A lysogen was made from clone N78 and the IPTG-induced extract was used for DNase I footprinting analysis. As presented in Figure 6B, the deletion mutant N78 showed the same protection pattern on the β-actin gene promoter as the original clone λGCF-1. These results indicate that the 78 amino acids at the amino-terminal end of GCF are sufficient to produce specific DNA binding activity.
Example V
Transcriptional Repression by GCF
(i) CAT cotransfection analysis: For expression vectors, either the full-length GCF cDNA (pSV2-GCF) or the nonfunctional segment of the CAT gene (pSV2-C) was cloned under the control of the SV40 early promoter. The CAT reporter plasmids contained either the human EGFR promoter (-775 to -20) (pERcat6) (Johnson et al. J. Biol. Chem. 263:5693 (1988)), the truncated EGFR promoter (-105 to -20) (pERcat15) (Johnson et al. J. Biol. Chem. 263:5693 (1988)), the chicken β-actin promoter (pβAcat) (Billeter et al. Mol. Cell. Biol. 8:1361 (1988)), the SV40 early promoter (pSV2cat) (Gorman et al. Mol. Cell. Biol. 2:1044 (1982)) or the chimeric promoters consisting of a negative regulatory element of the CANP gene and the truncated SV40 early promoter (pA10cat-CANP1 and -CANP2). pA10cat-CANP1 and pA10cat-CANP2 were made as follows: after being kinased, the two oligonucleotides,
5'GATCTCAGCACCGGCCCGTGTCGCGGCGTTCCCGGCGCTCAAGCTTA3' and
5'GATCTAAGCTTGAGCGCCGGGAACGCCGCGACACGGGCCGGTGCTGA3' were hybridized and cloned into the BglII site of pA10cat. One or 10 μg of a CAT reporter plasmid and various amounts of pSV2-GCF were cotransfected into CV1 cells using the calcium phosphate coprecipitation method, as previously described (Gorman et al. Mol. Cell. Biol. 2:1044 (1982)). The total DNA amounts were adjusted to 20 or 21 μg with pSV2-C. CAT activities were determined, as previously described (Gorman et al. Mol. Cell. Biol. 2:1044 (1982)).
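The design of the two oligonucleotides can be examined by checking that they anneal to leave single-stranded 5' GATC ends, which are compatible with a BglII-cut vector. The Python sketch below is illustrative only and is not part of the disclosed construction; the helper reverse_complement and the variable names are hypothetical.

```python
# Illustrative check, not part of the original disclosure.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


top = "GATCTCAGCACCGGCCCGTGTCGCGGCGTTCCCGGCGCTCAAGCTTA"
bottom = "GATCTAAGCTTGAGCGCCGGGAACGCCGCGACACGGGCCGGTGCTGA"
OVERHANG = 4  # length of the 5' GATC extension on each strand

# Everything 3' of the overhang on one strand should pair with the other.
duplex_ok = top[OVERHANG:] == reverse_complement(bottom[OVERHANG:])
overhangs = (top[:OVERHANG], bottom[:OVERHANG])
duplex_length = len(top) - OVERHANG

# The CANP site listed in Table 1 should lie within the insert.
contains_canp_site = "TCCCGGCGCT" in top

print(duplex_ok, overhangs, duplex_length, contains_canp_site)
```

Because both ends carry the same GATC extension, the annealed fragment can ligate into the BglII site in either orientation, which is consistent with the recovery of both pA10cat-CANP1 and pA10cat-CANP2.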
(ii) In vitro transcription: The in vitro reconstituted reaction mixture contained 1 μg of a supercoiled DNA template, 100 μg of DEAE-Sepharose fraction BB, and 30 μg of fraction BC, as previously described (Kageyama et al. J. Biol. Chem. 263:6329 (1988)). After incubation for 60 min at 30°C, RNA was prepared and hybridized with the 5'-end labeled primer, 5'TGCCATTGGGATATATCAACGGTG3'. The primer extension products were analyzed on a 5% polyacrylamide sequencing gel.
DNA-mediated gene transfer experiments were carried out to assess the functional effects of GCF. GCF cDNA (pSV2-GCF) or the control fragment (pSV2-C) was inserted downstream from the SV40 early promoter, and cotransfected with reporter plasmids containing the CAT gene under the control of the wild type human EGFR promoter (pERcat6), the truncated EGFR promoter lacking the major GCF-binding sites (pERcat15), the chicken β-actin promoter (pβAcat), or the SV40 early promoter (pSV2cat) (see above and Figure 7A). As shown in Figure 7B, cotransfection with the GCF expression plasmid resulted in significant repression of expression from the wild type EGFR promoter. When the region containing the GCF-binding site was removed, GCF exhibited a weaker repression of EGFR expression.
However, the removal of the GCF-binding sites did not completely abolish the repression by GCF. This is probably because the truncated EGFR promoter, which is extremely GC-rich, still contained a weak GCF-binding site (the region between -105 and -90). Some repression by GCF was also observed when the β-actin promoter, which also contains a GCF-binding site, was used (Figure 7B). However, this repression was much less than on the EGFR promoter. On the other hand, the expression of pSV2cat, which does not bind GCF although it contains GC-rich regions and Sp1-binding sites, was not affected by GCF (Figure 7B). A control expression vector which contains a nonfunctional portion of the CAT gene (pSV2-C) did not repress the expression of any of the above genes. These results demonstrate that GCF acts as a sequence-specific repressor in vivo.
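The bookkeeping behind the cotransfection series of Figure 7B (constant total DNA, activity expressed relative to the pSV2-C control) can be sketched as follows. This Python example is illustrative only: the 1 μg, 20 μg and 21 μg figures come from the description above, while the function names are hypothetical and no measured CAT values are implied.

```python
# Illustrative sketch, not part of the original disclosure.
TOTAL_DNA_UG = 21.0  # total DNA per transfection, as stated for Figure 7B
REPORTER_UG = 1.0    # CAT reporter plasmid per transfection


def filler_psv2c_ug(gcf_ug: float) -> float:
    """Amount of pSV2-C needed to keep total DNA constant at 21 ug."""
    filler = TOTAL_DNA_UG - REPORTER_UG - gcf_ug
    if gcf_ug < 0 or filler < 0:
        raise ValueError("pSV2-GCF amount must lie between 0 and 20 ug")
    return filler


def percent_of_control(cat_activity: float, control_activity: float) -> float:
    """Express a CAT activity as a percentage of the pSV2-C-only control."""
    if control_activity <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * cat_activity / control_activity


# The control condition uses no pSV2-GCF and 20 ug of pSV2-C.
print(filler_psv2c_ug(0.0))
```

Holding the total DNA constant with pSV2-C means that any change in CAT activity across the series can be attributed to the amount of GCF expression plasmid rather than to differences in the DNA load.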
Although GCF represses the expression from the EGFR and β-actin promoters, their GCF-binding sites are not known to be negative regulatory elements. To examine whether GCF interacts with negative control elements, the CANP gene promoter was analyzed. This promoter is highly GC-rich, and its transcription is negatively regulated by repeated GC-rich elements (Hata et al. J. Biol. Chem. 264:6404 (1989)). These elements also act as inhibitory sequences on heterologous promoters in an orientation-independent manner. Chimeric promoters were made, as described above, by inserting a fragment containing a negative regulatory element of the CANP gene (between -259 and -225) in either orientation into the BglII site of pA10cat (Laimins et al. P.N.A.S. USA 79:6453 (1982)), which has the CAT gene under the control of the SV40 early promoter but with the deletion of its enhancer regions (pA10cat-CANP1 and -CANP2). To analyze the interaction between GCF and the CANP gene, a DNase I footprinting assay was performed using the chimeric promoter as a probe. As shown in Figure 8A, GCF bound to the CANP negative element containing the sequence CGGCGC, but not to the SV40 GC boxes (lane 3). In contrast, Sp1 bound to the SV40 GC boxes, but not to the CANP negative element (lane 2).
To further clarify the negative function of GCF, cotransfection experiments were carried out. As shown in Figure 8B, GCF did not influence the CAT activity of pA10cat, which does not contain GCF-binding sites. However, insertion of the CANP negative regulatory element into the promoter resulted in a 30-40% reduction of the CAT activities without the expression of GCF (Figure 8B). This is probably due to the endogenous GCF present in CV1 cells. When cotransfected with the GCF expression plasmid, the CAT activities were significantly repressed in an orientation-independent manner (Figure 8B). These results indicate that GCF acts as a repressor by interacting with the negative enhancer-like element. To obtain more direct evidence that GCF acts at the transcriptional level, the GCF effect was analyzed in a cell-free transcription system. The inhibitory activity of GCF was tested using the reconstituted A431 nuclear extracts (Kageyama et al. J. Biol. Chem. 263:6329 (1988)).
As shown in Figure 9, the factor made from λGCF-1, which contained the amino-terminal 745 amino acids, exhibited a five-fold repression of EGFR gene expression (compare lanes 3 and 4), while the same factor showed almost no repression on the truncated EGFR promoter (lanes 1 and 2). Some repression was observed for the β-actin promoter (lanes 5 and 6). Thus, these in vitro results agreed well with the in vivo transfection data and indicate that GCF acts at the transcriptional level, although the truncated EGFR promoter showed a stronger repression in vivo than in vitro.
Next, the repressor activity of a truncated form of GCF which contained the amino-terminal 78 amino acids (N78) was analyzed. This region includes only the basic DNA binding domain. The addition of N78 showed a two-fold repression on the EGFR promoter (lane 8), while the same DNA binding units of GCF exhibited a much stronger repressor activity (lane 9). Thus, although the amino-terminal DNA binding domain of 78 amino acids had a partial repressor activity, the carboxy-terminal part is also necessary for the full activity of GCF.
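For readers comparing the figures quoted above (a 30-40% reduction, a two-fold repression, a five-fold repression), the following Python sketch converts between the two ways of expressing repression. It assumes that "n-fold repression" means activity is reduced to 1/n of the unrepressed level; that reading, and the helper names, are assumptions for illustration rather than definitions given in this disclosure.

```python
# Illustrative sketch only: assumes n-fold repression means activity falls
# to 1/n of the unrepressed level. Helper names are hypothetical.


def percent_reduction_from_fold(fold: float) -> float:
    """Percent reduction in activity corresponding to an n-fold repression."""
    if fold < 1:
        raise ValueError("fold repression must be >= 1")
    return (1.0 - 1.0 / fold) * 100.0


def fold_from_percent_reduction(percent: float) -> float:
    """Fold repression corresponding to a given percent reduction in activity."""
    if not 0 <= percent < 100:
        raise ValueError("percent reduction must be in [0, 100)")
    return 1.0 / (1.0 - percent / 100.0)


if __name__ == "__main__":
    for fold in (2, 5):
        pct = percent_reduction_from_fold(fold)
        print(f"{fold}-fold repression ~ {pct:.0f}% reduction")
    for pct in (30, 40):
        fold = fold_from_percent_reduction(pct)
        print(f"{pct}% reduction ~ {fold:.2f}-fold repression")
```

Under that assumption, a two-fold repression corresponds to a 50% reduction and a five-fold repression to an 80% reduction, so the 30-40% reduction attributed to endogenous GCF is smaller than either.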
The entire contents of all references cited hereinabove are hereby incorporated by reference.
While the foregoing invention has been described in some detail for purposes of clarity and understanding, it will be clear to one skilled in the art from a reading of this disclosure that various changes in form and detail can be made without departing from the true scope of the invention.

Claims

WHAT IS CLAIMED IS:
1. A substantially pure form of a DNA binding protein that recognizes GC rich sequences and represses transcription from GC rich promoters when bound thereto.
2. The protein according to claim 1, wherein said protein has an amino acid sequence corresponding to that shown in Figure 4C, or a unique portion thereof.
3. The protein according to claim 1, wherein said protein has the amino acid sequence of a molecule having the DNA binding properties and transcription repressive properties of a protein having the sequence shown in Figure 4C.
4. A DNA fragment that encodes a DNA binding protein, which protein recognizes GC rich sequences and represses transcription from GC rich promoters when bound thereto.
5. The DNA fragment according to claim 4, wherein said DNA fragment encodes the amino acid sequence set forth in Figure 4C, or a unique portion thereof.
6. The DNA fragment according to claim 5, wherein said fragment has the sequence of bases designated 224 to 2575 in Figure 4C, or a unique portion of that sequence.
7. The DNA fragment according to claim 4, wherein said DNA fragment encodes the amino acid sequence of a molecule having the DNA binding properties and transcription repressive properties of a protein having the sequence shown in Figure 4C.
8. A recombinant DNA molecule comprising:
(i) a vector, and
(ii) said DNA fragment according to claim 4.
9. The recombinant DNA molecule according to claim 8, wherein said vector is a viral vector.
10. The recombinant DNA molecule according to claim 8, wherein said DNA fragment encodes the amino acid sequence set forth in Figure 4C, or a unique portion of that sequence.
11. A host cell transformed with the recombinant DNA molecule according to claim 8.
12. The host cell according to claim 11, wherein said cell is a eucaryotic cell.
13. A process of producing a DNA binding protein that recognizes GC rich sequences and represses transcription from GC rich promoters when bound thereto which comprises culturing the cell according to claim 11, under conditions such that said DNA fragment is expressed and said protein thereby produced, and isolating said protein.
14. A method of regulating expression of a gene in a cell comprising inserting into said cell a nucleotide sequence encoding GC Factor, or binding fragment thereof, under conditions such that said sequence, or fragment thereof, binds to a promoter from which said gene is transcribed so that expression of said gene is inhibited.
15. The method according to claim 14, wherein said cell is a cancer cell.
16. The method according to claim 14, wherein said gene is an oncogene.
17. A diagnostic assay for determining the metastatic potential of a cancer cell comprising measuring the level of expression in said cell of the GC Factor gene and correlating that level with standard values indicative of metastatic potential.
PCT/US1990/006817 1989-11-28 1990-11-28 Dna binding protein WO1991008295A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US44191289A 1989-11-28 1989-11-28
US441,912 1989-11-28

Publications (1)

Publication Number Publication Date
WO1991008295A1 true WO1991008295A1 (en) 1991-06-13

Family

ID=23754798

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1990/006817 WO1991008295A1 (en) 1989-11-28 1990-11-28 Dna binding protein

Country Status (2)

Country Link
AU (1) AU6907191A (en)
WO (1) WO1991008295A1 (en)

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
CELL. Vol. 59, issued 01 December 1989, KAGEYAMA et al., "Molecular Cloning and Characterization of a Human DNA Binding Factor that Represses Transcription", pages 815-825. *
JOURNAL OF VIROLOGY, Volume 62, No. 1, issued January 1988, PARKS et al., "Organization of the Transcriptional Control Region of the E1b Gene of Adenovirus Type 5", pages 54-67. *
MOLECULAR AND CELLULAR BIOLOGY, Volume 6, No. 11, issued November 1986, NISHIKURA, "Sequences Involved in Accurate and Efficient Transcription of Human c-myc Genes Microinjected into Frog Oocytes", pages 4093-4098. *
NATURE, Vol. 316, issued 29 August 1985, DYNAN et al., "Control of Eukaryotic Messenger RNA Synthesis by Sequence-Specific DNA-Binding Proteins", pages 774-778. *
NATURE, Vol. 325, issued 22 January 1987, LEE et al., "Activation of Transcription by Two Factors that Bind Promoter and Enhancer Sequences of the Human Metallothionein Gene and SV40", pages 368-372. *
NATURE, Vol. 332, issued 07 April 1988, MERMOD et al., "Enhancer Binding Factors AP-4 and AP-1 Act in Concert to Activate SV40 Late Transcription In Vitro", pages 557-561. *
NUCLEIC ACIDS RESEARCH, Volume 16, No. 24, issued December 1988, TAMURA et al., "Analysis of Transcription Control Elements of the Mouse Myelin Basic Protein Gene in HeLa Cell Extracts: Demonstration of a Strong NFI-Binding Motif in the Upstream Region", pages 11,441-11,459. *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997041226A2 (en) * 1996-04-29 1997-11-06 The Government Of The United States Of America, Represented By The Secretary, Department Of Health And Human Services Polynucleotides encoding the transcriptional repressor gcf2 of the epidermal growth factor receptor
WO1997041226A3 (en) * 1996-04-29 1997-12-31 Us Health Polynucleotides encoding the transcriptional repressor gcf2 of the epidermal growth factor receptor

Also Published As

Publication number Publication date
AU6907191A (en) 1991-06-26

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AU CA JP

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB GR IT LU NL SE

NENP Non-entry into the national phase

Ref country code: CA