WO1998045326A1 - Cell zinc finger polynucleotides and splice variant polypeptides encoded thereby - Google Patents
- Publication number
- WO1998045326A1 (PCT/US1998/006925)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- polynucleotide
- szfl
- seq
- sequence
- dna
- Prior art date
Classifications
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/46—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
- C07K14/47—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
Definitions
- This invention relates to newly identified polynucleotides, polypeptides encoded by such polynucleotides, the use of such polynucleotides and polypeptides, as well as the production and isolation of such polynucleotides and polypeptides. More particularly, the polynucleotides and polypeptides of the present invention have been putatively identified as being transcription factors, and in particular zinc finger transcription factors, and still more particularly as being involved in hematopoiesis .
- Hematopoiesis is a complex process in which the critical stage and lineage specific regulators establish lineage commitment through their combined action. It is important to identify new transcription factors that are expressed in hematopoietic stem/progenitor cells since each factor may control transcription of one or more genes involved in the processes of self-renewal or differentiation of the cells.
- Transcription factor proteins bind to sets of genes and stimulate or repress the transcription of these genes, thus mediating cell fate decisions and regulating differentiation. Hematopoietic development involves many such transcription factors and the phenotypes of the differentiated lineages reflect the sets of genes activated or repressed by factors 1-2 of the invention.
- the processes of self-renewal and multilineage differentiation of hematopoietic stem cells occur continuously during the lifetime of the organism and can be regulated in response to need.
- In knockout mice, a number of transcription factors have been implicated in hematopoietic stem/progenitor cell development. For example, homozygous disruption of either AML1, c-myb, PU.1, or scl/tal1 results in defects in fetal hematopoiesis in mice, which die during early embryonic development.
- These genes appear to be essential for the development of multiple hematopoietic lineages. Transcription factors which appear to be lineage-specific have also been identified. These genes include E2A and Pax-5 which are required for B cell development, rbtn2 which is necessary for primitive yolk sac-derived erythropoiesis, and NF-E2 which is essential for the development of megakaryocytes .
- Several zinc finger genes play important roles in hematopoiesis.
- the GATA3 gene appears to be involved in early development of multiple cell lineages.
- GATA3-deficient embryos die in early development as a result of impaired fetal liver hematopoiesis.
- Homologous knockout of another zinc finger gene, Ikaros, blocked lymphoid development.
- Egr1, a myeloid differentiation primary response gene, has been shown to be essential for differentiation of hematopoietic cells along the macrophage lineage.
- MZF1, which is preferentially expressed in hematopoietic cells, may inhibit myeloid differentiation by negative regulation of hematopoietic genes such as CD34 and c-myb.
- the present invention provides in one aspect a novel polypeptide which has been putatively identified as a hematopoietic factor and more particularly as Stem Cell Zinc Finger ("SZF1") which is encoded by a polynucleotide obtained from a cDNA library prepared from CD34+ human bone marrow cells.
- SZF1 codes for a protein containing a C2H2-type zinc finger and a KRAB domain.
- Two alternatively spliced transcripts were isolated: SZF1-1 and SZF1-2, respectively.
- SZF1-2 was expressed in most cell types and tissues, whereas SZF1-1 appears limited to expression in CD34+ cells.
- Because SZF1-1 appears limited to expression in CD34+ cells and not in more mature cells, it can be utilized as a marker for undifferentiated CD34+ cells.
- antibodies to SZF1-1 may be utilized to confirm a population of CD34+ cells.
- SZF1-1 appears to be a factor involved in transcription of one or more genes for undifferentiated replication of CD34+ cells.
- SZF1-2 appears to be involved in the maturation of CD34+ cells in that certain concentrations of SZF1-2 can cause hematopoietic cells to differentiate and mature rather than to merely replicate.
- the gene for each of SZF1-1 and SZF1-2 is found on chromosome 3.
- the polynucleotides encoding SZF1-1 and SZF1-2 are useful to generate probes or antibodies that may be utilized to detect the presence or absence of chromosome 3 in a sample.
- the polynucleotides according to the invention may be utilized for gene therapy in a host to replace or supplement a defective SZF1-1 or SZF1-2 gene.
- novel polypeptides as well as active fragments, analogs and derivatives thereof.
- nucleic acid molecules encoding the proteins of the present invention including mRNAs, cDNAs, genomic DNAs as well as active analogs and fragments of such proteins .
- nucleic acid molecules encoding mature polypeptides expressed by the DNA contained in ATCC Deposit Nos. .
- a process for producing such polypeptides by recombinant techniques comprising culturing recombinant prokaryotic and/or eukaryotic host cells, containing a nucleic acid sequence of the present invention, under conditions promoting expression of said proteins and subsequent recovery of said proteins .
- a process for utilizing such polypeptides for analyzing potential agonists to the polypeptides utilizes the polynucleotides to assay for compounds which bind said polynucleotides and would thus block expression of any products from said polynucleotides .
- nucleic acid probes comprising nucleic acid molecules of sufficient length to specifically hybridize to a nucleic acid sequence of the present invention.
- Figures 1A-1D collectively show the polynucleotide sequence of the SZF1-1 cDNA (SEQ ID NO: 1) and its predicted amino acid sequence (SEQ ID NO: 2).
- the nucleotide and predicted amino acid sequences of SZF1-1 are shown with numbered nucleotides and amino acids listed to the left of each product.
- RNA splicing sites as determined by sequencing comparison of cDNAs with genomic DNA are shown by each character ( A ) .
- the deduced amino acid sequence starts from the conserved Kozak translation start site (underlined) .
- Zinc fingers are underlined from the initial conserved cysteine to the final conserved histidine beginning at amino acid 247.
- KRAB-A and -B domains are indicated above each such domain.
- Phosphorylation consensus sites for potential casein kinase II (ck2) and protein kinase C (PKC) are labelled above each element .
- the peptide sequence used to generate polyclonal serum is shown in italics .
- Figures 2A-2E collectively show the polynucleotide sequence of the SZF1-2 cDNA (SEQ ID NO: 3) and its predicted amino acid sequence (SEQ ID NO: 4).
- the nucleotide and predicted amino acid sequences of SZF1-2 are shown with numbered nucleotides and amino acids listed to the left of each product.
- RNA splicing sites as determined by sequencing comparison of cDNAs with genomic DNA are shown by each character ( ⁇ ) .
- the deduced amino acid sequence starts from the conserved Kozak translation start site (underlined) . Zinc fingers are underlined from the initial conserved cysteine to the final conserved histidine beginning at amino acid 247.
- KRAB-A and -B domains are indicated above each such domain.
- Phosphorylation consensus sites for potential casein kinase II (ck2) and protein kinase C (PKC) are labelled above each element .
- Two instability motifs (ATTA) in the 3' untranslated region are underlined. The peptide sequence used to generate polyclonal serum is shown in italics .
- Figures 3A and 3B show the results from labelling and immunoprecipitation of in vitro transcribed and translated (TNT) SZF1-1 and SZF1-2 proteins.
- Figure 3A: [35S]methionine was used to label the proteins generated from TNT of full length SZF1-1 and SZF1-2 constructs. These products together with the products of a TNT reaction in which no template was added (dH2O) were resolved on a 10% SDS-polyacrylamide gel which was then exposed to film. MW markers are shown to the left.
- Figure 3B: The TNT products of SZF1-1 and SZF1-2 were immunoprecipitated (IP) with polyclonal antisera generated to a peptide which was present in the predicted ORF of both transcripts. IP reactions were also conducted after preincubation of the antisera with 25 μg of the peptide to demonstrate specificity of the antiserum (+ peptide lanes). The IP products were then processed as above.
- Figures 4A and 4B illustrate KRAB and zinc finger homologies, respectively, between SZF1-1, SZF1-2 and several related proteins.
- alignments of the KRAB domains (Figure 4A) and zinc finger domains (Figure 4B) of SZF1-1 (line 1 of each row) and SZF1-2 (line 2 of each row) were compared with the transcriptional repressor proteins ZNF133 (line 3), Kid1 (line 4) and ZNF85 (line 5).
- Protein alignments were generated with the GCG Pileup program.
- the sixth line of each row shows the amino acid residues of a consensus sequence, which only includes a particular amino acid if at least three of the compared proteins have the identical amino acid at that position.
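The consensus rule described for Figures 4A and 4B (a residue is reported only when at least three of the five compared proteins share it at that position) can be sketched as follows. This is a minimal illustration, not the GCG Pileup procedure itself; the function name `consensus`, the `.` placeholder, and the toy alignment are assumptions made for the example and are not taken from the figures.

```python
from collections import Counter


def consensus(aligned, min_count=3, blank="."):
    """Return a consensus string for equal-length aligned sequences.

    A residue is emitted only if at least `min_count` sequences carry it
    at that column; otherwise `blank` is emitted. Gaps ("-") never count.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    out = []
    for column in zip(*aligned):
        # Count residues in this column, ignoring gap characters.
        counts = Counter(c for c in column if c != "-")
        if counts:
            residue, n = counts.most_common(1)[0]
            out.append(residue if n >= min_count else blank)
        else:
            out.append(blank)
    return "".join(out)


# Hypothetical toy alignment of five sequences (not the real KRAB data).
toy = ["VTFKD", "VTFKD", "VSFED", "LTFRD", "VTW-D"]
print(consensus(toy))
```

With five sequences and a threshold of three, at most one residue can qualify per column, so no tie-breaking rule is needed.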
- Figure 5 shows the expression results of SZF1 in normal human tissue.
- Northern blots containing 2 μg of polyA+ RNA from multiple normal human tissues were hybridized with a SZF1 3' probe which contained the 3' region (nucleotides 1414-2389) of SZF1-1 downstream of the zinc fingers (top portion).
- the blots were then stripped and reprobed with β-actin (lower portion).
- the source of RNA is noted above each lane .
- MW size markers are noted to the left.
- Figure 6 illustrates RNAse protection analysis of the expression of SZF1 in hematopoietic cell lines.
- a 5 μg aliquot of total RNA from each cell line sample was hybridized with a 32P-UTP labelled 230 base antisense RNA probe which contains 165 bp of SZF1 (bases 1256-1421) attached to 65 bp of vector sequence.
- a β-actin probe was included in the hybridization reactions as an internal control for RNA loading. After hybridization, the samples were treated with RNAse A and T1 followed by electrophoresis on a 6% polyacrylamide gel and autoradiography. The SZF1 and β-actin probes are shown on the left.
- the source of the RNA added is noted above each lane: BM (total bone marrow cells); Jurkat, Molt-3, Molt-16, RPMI-8402 (T-lineage acute lymphocytic leukemia); K422, RL, REH (B-lineage acute lymphocytic leukemia); Raji (Burkitt's lymphoma); HEL (Erythroleukemia); K562 (Chronic myelogenous leukemia); ML-1, KG1a (Acute myelocytic leukemia).
- the protected SZF1 and actin fragments are indicated to the right.
- Figure 7 shows SZF1 expression with differentiation.
- HL60 cells and ML-1 cells were treated with TPA for 24 hours.
- 5 μg of total RNA isolated from control cells and cells exposed to TPA was added to the RNAse protection assay with the same 32P-UTP labelled 230 base antisense RNA probe used in Figure 6.
- the source of each RNA sample is noted above each lane.
- the SZF1 and β-actin protected bands are indicated to the left.
- Figures 8A and 8B show the results from RT-PCR analysis of SZF1 expression in CD34+ and CD34- cells.
- CD34+ and CD34- cells were purified from human bone marrow mononuclear cells by immunomagnetic separation and RNA was isolated by the guanidinium thiocyanate method. A 1 μg aliquot of each RNA sample was reverse transcribed using random hexamer priming.
- RT-PCR fragments generated by primer pairs shown in Figure 8B are 1-2 (210 bp), 3-4 (324 bp), 5-6 (392 bp) and 7-8 (133 bp for SZF1-1 or 1417 bp for SZF1-2).
- Figure 8B includes a map which indicates the location of the primer pairs on the SZF1-1 and SZF1-2 cDNAs.
- Figures 9A, 9B and 9C illustrate the genomic organization of SZF1 cDNAs and predicted motif structures.
- the SZF1-1 cDNA (empty boxes) and SZF1-2 cDNA (hatched boxes) as well as the boundary of introns and exons are schematically presented in B with the approximate location of EcoRI sites (R) indicated.
- cDNA motifs and sequences that encode predicted zinc fingers, KRAB-A and KRAB-B domains and the PEST sequences are indicated in A for SZF1-1 and in C for SZF1-2.
- the lines indicate the splicing events that occur from SZF1 to result in the observed cDNAs.
- Figures 10A and 10B show the chromosomal localization of SZF1 by FISH.
- Figure 10A results from P1 plasmids that encompass the SZF1 gene that were nick-translated with biotin-14 and hybridized to metaphase chromosome spreads from normal lymphocytes cultured with BrdU.
- the specific paired FISH signals of the SZF1 gene are shown with an arrow on chromosome 3 (DAPI-stained).
- Figure 10B shows the results of metaphases that were G-banded and photographed prior to FISH. The arrow indicates band 3p21, corresponding to the FISH signals seen in Figure 10A.
- Figure 11 illustrates the chromosome ideogram of paired signals from FISH as described in Figure 10A. Each dot represents a paired signal seen on metaphase chromosomes as described in Figure 10B.
- Figures 12A and 12B show the results of SZF1 expression from lung and hematopoietic tissues and cell types.
- Figure 12A: 5 μg of polyA+ RNA from each indicated sample was hybridized with 32P-labelled SZF1 3' (top), SZF1 5' (middle) or β-actin (bottom) probes.
- HFL normal human fetal lung
- HEL, K562, and ML-1 (hematopoietic cell lines)
- H209, H249, H1385, H82, DMS53, N417 SCLC cell lines
- H727, H385, H157, A549 NSCLC cell lines
- 965950 and 277055 Primary small cell lung cancer tissues
- gene means the segment of DNA involved in producing a polypeptide chain; it includes regions preceding and following the coding region (leader and trailer) as well as intervening sequences (introns) between individual coding segments (exons) .
- a coding sequence is "operably linked to" another coding sequence when RNA polymerase will transcribe the two coding sequences into a single mRNA, which is then translated into a single polypeptide having amino acids derived from both coding sequences .
- the coding sequences need not be contiguous to one another so long as the expressed sequences ultimately process to produce the desired protein.
- Recombinant proteins refer to proteins produced by recombinant DNA techniques; i.e., produced from cells transformed by an exogenous DNA construct encoding the desired protein.
- Synthetic proteins are those prepared by chemical synthesis .
- a DNA "coding sequence of" or a "nucleotide sequence encoding" a particular protein is a DNA sequence which is transcribed and translated into a protein when placed under the control of appropriate regulatory sequences.
- Plasmids are designated by a lower case “p” preceded and/or followed by capital letters and/or numbers.
- the starting plasmids herein are either commercially available, publicly available on an unrestricted basis, or can be constructed from available plasmids in accord with published procedures.
- equivalent plasmids to those described are known in the art and will be apparent to the ordinarily skilled artisan.
- “Digestion” of DNA refers to catalytic cleavage of the DNA with a restriction enzyme that acts only at certain sequences in the DNA.
- the various restriction enzymes used herein are commercially available and their reaction conditions, cofactors and other requirements were used as would be known to the ordinarily skilled artisan.
- For analytical purposes typically 1 μg of plasmid or DNA fragment is used with about 2 units of enzyme in about 20 μl of buffer solution.
- For the purpose of isolating DNA fragments for plasmid construction typically 5 to 50 μg of DNA are digested with 20 to 250 units of enzyme in a larger volume. Appropriate buffers and substrate amounts for particular restriction enzymes are specified by the manufacturer. Incubation times of about 1 hour at 37°C are ordinarily used, but may vary in accordance with the supplier's instructions. After digestion the reaction is electrophoresed directly on a polyacrylamide gel to isolate the desired fragment.
- Size separation of the cleaved fragments is performed using 8 percent polyacrylamide gel described by Goeddel et al., Nucleic Acids Res., 8:4057 (1980).
- Oligonucleotides refers to either a single stranded polydeoxynucleotide or two complementary polydeoxynucleotide strands which may be chemically synthesized. Such synthetic oligonucleotides have no 5' phosphate and thus will not ligate to another oligonucleotide without adding a phosphate with an ATP in the presence of a kinase. A synthetic oligonucleotide will ligate to a fragment that has not been dephosphorylated .
- Ligation refers to the process of forming phosphodiester bonds between two double stranded nucleic acid fragments (Maniatis, T., et al., Id., p. 146). Unless otherwise provided, ligation may be accomplished using known buffers and conditions with 10 units of T4 DNA ligase ("ligase") per 0.5 μg of approximately equimolar amounts of the DNA fragments to be ligated.
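The ligation guideline above (10 units of T4 DNA ligase per 0.5 μg of approximately equimolar fragments) implies a small calculation when setting up a reaction. The sketch below assumes double-stranded fragments with the same average mass per base pair, so that equal molar amounts correspond to masses proportional to fragment length. The function names and the 3000 bp / 1000 bp example are hypothetical.

```python
def equimolar_masses(total_ug, lengths_bp):
    """Split a total DNA mass so that each fragment is present in equal moles.

    Assumes mass per base pair is the same for every fragment, so the mass
    of each fragment is proportional to its length.
    """
    total_len = sum(lengths_bp)
    return [total_ug * n / total_len for n in lengths_bp]


def ligase_units(total_ug, units_per_half_ug=10.0):
    """Scale the stated ratio of 10 units per 0.5 ug to another DNA mass."""
    return units_per_half_ug * total_ug / 0.5


# Hypothetical example: a 3000 bp vector and a 1000 bp insert, 0.5 ug total.
vector_ug, insert_ug = equimolar_masses(0.5, [3000, 1000])
print(vector_ug, insert_ug, ligase_units(0.5))
```

Scaling the ligase linearly with DNA mass is a simplifying assumption; the text only states the single ratio.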
- isolated nucleic acids from SZF1-1 and SZF1-2 (SEQ ID NOS: 1 and 3) as shown in Figures 1A-D, collectively, and Figures 2A-E, collectively, which encode, respectively, the mature SZF1-1 and SZF1-2 proteins having the continuous deduced amino acid sequence shown in Figures 1A-D and 2A-E.
- isolated polynucleotides encoding the polypeptides of the present invention.
- the deposited material is a genomic clone comprising DNA encoding a polypeptide of the present invention, in a plasmid DNA vector form. As deposited with the American Type Culture Collection, 12301 Parklawn Drive, Rockville, Maryland 20852, USA, the deposited material is assigned ATCC Deposit No. .
- polynucleotides of this invention coding for the proteins of this invention were originally recovered from a genomic gene library derived from CD34+ cells.
- SZF1 was identified through the random sequencing of a portion of several hundred clones from a cDNA library prepared from human bone marrow CD34+ cells.
- One of the fragments was unique but showed homology to the Kruppel family of zinc finger proteins. We therefore sequenced the entire 1200 bp fragment.
- cDNA libraries from CD34+ cells, as well as libraries made from K562 cells and human lung (which in preliminary experiments showed expression of the transcript) were screened with the 1.2 kb fragment. Overlapping cDNA fragments which hybridized to the probe gave rise to two alternatively spliced cDNA products. Most regions were isolated several times in more than one cDNA library.
- The complete SZF1-1 nucleotide and predicted amino acid sequences are shown in Figs. 1A-1D (collectively); the complete SZF1-2 nucleotide and predicted amino acid sequences are shown in Figs. 2A-2E (collectively).
- SZF1-1 SEQ ID NO: 2
- SZF1-2 SEQ ID NO: 4
- the size of the proteins is based on the usage of the 5'-proximal ATG codon which follows the consensus Kozak site (underlined in Figs. 1A-1D and in Figs. 2A-2E) downstream of an in-frame stop codon. To test the prediction that this ATG is the most frequently used start site for translation initiation, the in vitro transcription and translation assay was performed with full length SZF1-1 and SZF1-2 constructs. As shown in Fig.
- the longest and most intense bands correspond to 48 kD for SZF1-1 and 41 kD for SZF1-2. This corresponds to the MW predicted by the full length ORF of both transcripts if the predicted initiating methionine is used. The smaller bands seen in both lanes could represent degradation products or initiation at internal methionines.
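The reasoning above (translation is predicted to start at the 5'-proximal ATG and run to the first in-frame stop codon, and the predicted size is then compared with the gel bands) can be sketched as below. The toy cDNA is invented for illustration and is not SEQ ID NO: 1 or 3. The figure of about 110 Da per residue is a common rule of thumb and an assumption here, not a value from the text.

```python
STOPS = {"TAA", "TAG", "TGA"}


def first_orf(cdna):
    """Return (start, end) of the ORF that begins at the 5'-proximal ATG.

    Indices are 0-based; `end` is exclusive and includes the stop codon.
    Returns None if there is no ATG or no in-frame stop codon.
    """
    seq = cdna.upper()
    start = seq.find("ATG")
    if start == -1:
        return None
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOPS:
            return start, i + 3
    return None


def approx_kd(n_residues, da_per_residue=110.0):
    """Rough protein mass in kD, assuming an average residue mass."""
    return n_residues * da_per_residue / 1000.0


# Hypothetical toy cDNA: in-frame TAG upstream, then GCCACC ATG G ... TAA.
toy = "CTAGGCCACCATGGCTTGTTAAACC"
start, end = first_orf(toy)
residues = (end - start) // 3 - 1  # exclude the stop codon
print(start, end, residues)
```

A real analysis would also score the Kozak context and check the upstream in-frame stop; this sketch only locates the first ATG.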
- immunoprecipitation was performed with polyclonal antisera generated to a peptide that is shared by the predicted ORF of both SZF1-1 and SZF1-2.
- From Fig. 3B it is evident that the antisera recognize the full length product of both SZF1-1 and SZF1-2.
- Both transcripts share the same amino-terminal sequence upstream of the zinc fingers and through the first three zinc fingers, but diverge after amino acid 349 toward the end of the fourth zinc finger.
- a Kruppel-associated box (KRAB) domain is present with highly conserved A (amino acids 31-73) and B (amino acids 74-93) elements. This domain was initially noted in zinc finger genes from Xenopus. The KRAB domain has been suggested to form an α-helix and may be involved in protein-protein interactions.
- the fourth zinc finger is incomplete with the final histidine replaced by a glutamine residue.
- the lack of a second histidine in the final zinc finger has been observed in other zinc finger proteins.
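The description of zinc fingers running from the initial conserved cysteine to the final conserved histidine, with the last finger ending in glutamine instead, can be illustrated with a pattern search. The spacing used below (C, 2 to 4 residues, C, 12 residues, H, 3 to 5 residues, H) is the commonly cited general C2H2 consensus and is an assumption here, not a spacing taken from Figures 1 and 2. The toy protein is invented.

```python
import re

# General C2H2 spacing; the final position accepts H (complete finger)
# or Q (the incomplete variant described in the text).
FINGER = re.compile(r"C.{2,4}C.{12}H.{3,5}?[HQ]")


def find_fingers(protein):
    """Return (1-based start, matched text, is_complete) for each hit."""
    hits = []
    for m in FINGER.finditer(protein.upper()):
        hits.append((m.start() + 1, m.group(), m.group().endswith("H")))
    return hits


# Hypothetical toy protein: one complete finger, a linker, one H->Q finger.
toy = "MA" + "YKCEECGKAFSRSSHLTTHKRIH" + "TGEKP" + "YKCEECGKAFSRSSHLTTHKRIQ"
for hit in find_fingers(toy):
    print(hit)
```

Accepting Q at the last position makes the pattern more permissive, so hits flagged as incomplete deserve manual inspection.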
- An additional 74 amino acids follow the zinc fingers at the carboxy terminus of SZF1-1.
- the cDNA sequences of SZF1-1 and SZF1-2 were used to search the available nucleotide databases to determine the most highly related genes.
- the result showed that ZNF133, Kid1 and ZNF85 were the most highly related genes with homologies of 65%, 55% and 45% at the nucleotide level, respectively.
- ZNF133, Kid1, and ZNF85 (Poncelet, D.A., University of Med, pers. comm.) are all zinc finger proteins that have been suggested to play roles in transcriptional repression.
- the region of SZF1 that encompasses the KRAB and zinc finger domains has the highest degree of homology to these transcriptional factors (Figs. 4A and 4B).
- the identity of the KRAB domain of SZF1 with that of ZNF133 is 68% for KRAB-A and 22% for KRAB-B; with Kid1 it is 74% for KRAB-A and 12% for KRAB-B; and with ZNF85 it is 70% for KRAB-A and 26% for KRAB-B (Fig. 4A).
- the zinc finger sequence of SZF1 also shows a high degree of homology to these transcription factors, with an identity of about 50% (Fig. 4B). A much lower degree of homology was found in the remainder of the SZF1 coding sequence.
- the lane containing liver RNA showed a 2 kb band instead. Some samples also showed a larger transcript of approximately 6 kb.
- the 5' probe also hybridized to the 4.2 kb band in all these tissues and in addition showed a faint band of 1.2 kb (data not shown).
- RNAse protection assay was used to further screen for expression. Although the level of expression varies, protected species are seen from every hematopoietic cell line tested, including Jurkat, K422, Raji, Molt-16, RL, HEL, RPMI-8402, K562, ML-1, KG1a, Molt-3 and REH (Fig. 6). These represent myeloid, lymphoid, and erythroid lineages.
- SZF1 expression was measured as a function of differentiation induced in HL60 and ML-1 cell lines by TPA.
- RNA isolated from control cells and cells exposed to TPA for 24 hours was used in an RNAse protection assay with actin as an internal control to allow for normalization (Fig. 7).
- Quantitation by phosphorimager scanning of the gel showed that SZF1 expression was decreased to 30% of control levels by 24 hours of induction of the HL60 cells.
- In ML-1 cells, expression of SZF1 decreased to only 16% of control levels by 24 hours of differentiation.
- SZF1 expression is greatly repressed upon differentiation, at least in these two cell lines.
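The normalization described above (the SZF1 signal is divided by the actin signal in each lane, and the treated lane is then expressed as a percentage of the control lane) can be written out as a short calculation. The band intensities below are hypothetical values chosen only to show the arithmetic. They are not the phosphorimager readings behind the reported 30% and 16%.

```python
def percent_of_control(target_ctrl, actin_ctrl, target_treated, actin_treated):
    """Actin-normalized target signal in treated cells, as % of control."""
    ctrl_ratio = target_ctrl / actin_ctrl
    treated_ratio = target_treated / actin_treated
    return 100.0 * treated_ratio / ctrl_ratio


# Hypothetical band intensities (arbitrary units).
value = percent_of_control(
    target_ctrl=1000.0, actin_ctrl=5000.0,
    target_treated=240.0, actin_treated=4000.0,
)
print(round(value, 1))
```

Dividing by actin first corrects for unequal RNA loading between lanes, which is why the internal control is included in the same hybridization.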
- RT-PCR was used to test for the presence of the two SZF1 transcripts in the polyA+ RNA from hematopoietic cell lines, lung cancer cell lines, fetal lung tissue and a normal lung epithelial cell primary culture.
- the primers used in the assay cross the SZF1-2 exon that is spliced out of the SZF1-1 transcript and would generate a 133 bp RT-PCR fragment from the SZF1-1 transcript and a 1.4 kb fragment from the SZF1-2 transcript. None of the mRNA samples tested gave rise to the 133 bp product from SZF1-1.
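The splice-spanning primer design described above can be illustrated with the fragment sizes stated for primer pair 7-8 (133 bp for SZF1-1, 1417 bp for SZF1-2). The sketch derives the length of the sequence that is absent from SZF1-1 and assigns an observed band to a transcript. The 5% size tolerance and the function names are assumptions for the example.

```python
# Expected RT-PCR product sizes for primer pair 7-8, as stated in the text.
EXPECTED_BP = {"SZF1-1": 133, "SZF1-2": 1417}


def spliced_out_length(expected=EXPECTED_BP):
    """Length of the sequence present in SZF1-2 but absent from SZF1-1."""
    return expected["SZF1-2"] - expected["SZF1-1"]


def call_transcript(observed_bp, tolerance=0.05, expected=EXPECTED_BP):
    """Assign an observed band to a transcript, or None if nothing fits.

    `tolerance` is a hypothetical fractional sizing error for a gel band.
    """
    matches = [
        name for name, size in expected.items()
        if abs(observed_bp - size) <= tolerance * size
    ]
    return matches[0] if matches else None


print(spliced_out_length())
print(call_transcript(1400), call_transcript(135), call_transcript(600))
```

Because the two expected products differ by more than a kilobase, a single primer pair distinguishes the transcripts on an ordinary gel.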
- FISH Fluorescence In situ Hybridization
- P1 plasmids with inserts that encompass the entire coding sequence of SZF1 ORFs were used as the probe.
- An example of the FISH results is shown in Figs. 10A and 10B. Clear paired signals were observed only on chromosome 3. Analysis of 34 metaphase cells showed 23 cells (67%) had at least one pair of signals. 41/44 signals were located on chromosome 3 as indicated by G-banding 24, with all of the signals on band p21. In addition, 9 out of 10 signals analyzed from the pre-G banded metaphases had the same 3p21 localization.
- The ideogram representing the results of chromosomes with paired signals is shown in Fig. 9C.
- the results indicate that SZF1 maps to human chromosome 3p21, a region that has been implicated in a number of cancers. Lung expression.
- HFL normal fetal lung tissue
- HBE normal lung epithelial cell primary culture
- SCLC small cell lung cancer
- NSCLC non-small cell lung cancer cells
- SZF1-1 and SZF1-2 Fig. 11
- Fig. 11 the cell lines showed widely varying levels in the expression of the 4.2 kb transcript.
- the transcript could not be detected.
- a 1.1 kb band was observed in addition to the 4.2 kb band. Higher stringency washing conditions did not remove this band. It is possible that the 1.1 kb band represents a species cross-hybridizing with the KRAB domain.
- One means for isolating the nucleic acid molecules encoding the proteins of the present invention is to probe a CD34+ gene library with a natural or artificially designed probe using art recognized procedures (see, for example: Current Protocols in Molecular Biology, Ausubel F.M. et al. (Eds.), Green Publishing Company Assoc. and John Wiley Interscience, New York, 1989, 1992). It is appreciated by one skilled in the art that the polynucleotides of SEQ ID NOS: 1 and 3, or fragments thereof, are particularly useful probes.
- Other particularly useful probes for this purpose are hybridizable fragments of the sequences of SEQ ID NOS: 1 and 3 (i.e., comprising at least 12 contiguous nucleotides).
- hybridization may be carried out under conditions of reduced stringency, medium stringency or even stringent conditions.
- a polymer membrane containing immobilized denatured nucleic acids is first prehybridized for 30 minutes at 45°C in a solution consisting of 0.9 M NaCl, 50 mM NaH2PO4, pH 7.0, 5.0 mM Na2EDTA, 0.5% SDS, 10X Denhardt's, and 0.5 mg/mL polyriboadenylic acid.
- Approximately 2 × 10^7 cpm (specific activity 4-9 × 10^8 cpm/μg) of 32P end-labeled oligonucleotide probe are then added to the solution. After 12-16 hours of incubation, the membrane is washed for 30 minutes at room temperature in 1X SET (150 mM NaCl, 20 mM Tris hydrochloride, pH 7.8, 1 mM Na2EDTA) containing 0.5% SDS, followed by a 30 minute wash in fresh 1X SET at Tm less 10°C for the oligonucleotide probe. The membrane is then exposed to autoradiographic film for detection of hybridization signals.
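The final wash in the protocol above is performed at Tm less 10°C for the oligonucleotide probe, but the text does not say how Tm is obtained. As a rough sketch, the widely used Wallace rule for short oligonucleotides (2°C per A or T, 4°C per G or C) is assumed here. The rule, the function names and the example probe are assumptions and are not part of the protocol as written.

```python
def wallace_tm(oligo):
    """Estimate Tm in degrees C with the Wallace rule: 2*(A+T) + 4*(G+C)."""
    s = oligo.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    if at + gc != len(s):
        raise ValueError("oligo must contain only A, C, G and T")
    return 2 * at + 4 * gc


def stringent_wash_temp(oligo, offset=10):
    """Wash temperature at 'Tm less 10 degrees C', as in the protocol."""
    return wallace_tm(oligo) - offset


# Hypothetical 18-mer probe (not taken from SEQ ID NOS: 1 or 3).
probe = "ATGGCTTGTGAACCAGTC"
print(wallace_tm(probe), stringent_wash_temp(probe))
```

The Wallace rule is only a first approximation for oligonucleotides of roughly 14 to 20 bases; salt-adjusted or nearest-neighbor estimates would give different values.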
- Stringent conditions means hybridization will occur only if there is at least 90% identity, preferably at least 95% identity and most preferably at least 97% identity between the sequences. See J. Sambrook et al., Molecular Cloning, A Laboratory Manual, 2d Ed., Cold Spring Harbor Laboratory (1989) which is hereby incorporated by reference in its entirety. Also, it is understood that a fragment of a 100 bps sequence that is 95 bps in length has 95% identity with the 100 bps sequence from which it is obtained.
- a first DNA (RNA) sequence is at least 70%, preferably at least 80%, more preferably at least 90%, and even more preferably at least 95% identical to another DNA (RNA) sequence if there is at least 70%, 80%, 90% or 95% identity, respectively, between the bases of the first sequence and the bases of the other sequence, when properly aligned with each other, for example when aligned by BLASTN.
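The identity definition above, including the statement that a 95 bp fragment has 95% identity with the 100 bp sequence it came from, can be expressed as a short calculation over two already-aligned sequences. This is a sketch of the counting rule only. It assumes the alignment is supplied (for example from BLASTN) with "-" marking gaps, and that identity is measured against the ungapped length of the reference.

```python
def percent_identity(reference, query):
    """Percent of reference bases matched by the aligned query.

    Both strings must be the same length, with "-" marking gaps.
    """
    if len(reference) != len(query):
        raise ValueError("aligned sequences must have equal length")
    ref, qry = reference.upper(), query.upper()
    matches = sum(1 for r, q in zip(ref, qry) if r == q and r != "-")
    ref_len = sum(1 for r in ref if r != "-")
    return 100.0 * matches / ref_len


# Toy reference of 100 bases and a 95-base fragment taken from it.
ref = "ACGT" * 25
frag = ref[:95] + "-" * 5
print(percent_identity(ref, frag))
```

Under this counting rule a fragment can never exceed the identity implied by its length, which matches the 95 of 100 example in the text.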
- the present invention relates to polynucleotides which differ from the reference polynucleotide in a manner such that the change or changes are silent, in that the amino acid sequence encoded by the polynucleotide remains the same.
- the present invention also relates to nucleotide changes which result in amino acid substitutions, additions, deletions, fusions and truncations in the polypeptide encoded by the reference polynucleotide. In a preferred aspect of the invention these polypeptides retain the same biological action as the polypeptide encoded by the reference polynucleotide.
- the polynucleotides of the present invention may be in the form of RNA or DNA which DNA includes cDNA, genomic DNA, and synthetic DNA.
- the DNA may be double-stranded or single-stranded, and if single stranded may be the coding strand or non-coding (anti-sense) strand.
- the coding sequences which encode the mature proteins may be identical to the coding sequences shown in Figures 1A-D and 2A-E (SEQ ID NOS: 1 and 3, respectively) or may be a different coding sequence which, as a result of the redundancy or degeneracy of the genetic code, encodes the same mature proteins as does the DNA of Figures 1A-D and 2A-E (SEQ ID NOS: 1 and 3, respectively).
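The point about degeneracy of the genetic code, that a different coding sequence can encode the same mature protein, can be checked with a small translation routine. The sketch below builds the standard genetic code table and compares two hypothetical 12-base coding sequences that differ at three third-codon positions. The sequences are invented and are not taken from SEQ ID NOS: 1 or 3.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a coding sequence, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def is_silent(reference, variant):
    """True if the variant differs in sequence but encodes the same protein."""
    return reference != variant and translate(reference) == translate(variant)


reference = "ATGGCTTGTGAA"   # hypothetical: M A C E
silent = "ATGGCCTGCGAG"      # third-position changes only
missense = "ATGGCTTGTCAA"    # GAA -> CAA changes E to Q
print(translate(reference), is_silent(reference, silent),
      is_silent(reference, missense))
```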
- the polynucleotides which encode each of the mature proteins may include, but each is not limited to: only the coding sequence for the mature protein; the coding sequence for the mature protein and additional coding sequence such as a leader sequence or a proprotein sequence; the coding sequence for the mature protein (and optionally additional coding sequence) and non-coding sequence, such as introns or non-coding sequence 5' and/or 3' of the coding sequence for the mature protein.
- polynucleotide encoding a protein encompasses a polynucleotide which includes only coding sequence for the polypeptide as well as a polynucleotide which includes additional coding and/or non-coding sequences .
- the present invention further relates to variants of the hereinabove described polynucleotides which encode for fragments, analogs and derivatives of the proteins having the deduced amino acid sequences of Figures 1A-D and 2A-E (SEQ ID NOS: 2 and 4, respectively).
- the variant of the polynucleotide may be a naturally occurring allelic variant of the polynucleotide or a non-naturally occurring variant of the polynucleotide.
- the present invention includes polynucleotides encoding the same mature proteins as shown in Figures 1A-D and 2A-E, as well as variants of such polynucleotides which variants encode for a fragment, derivative or analog of the proteins of Figures 1A-D and 2A-E, respectively.
- Such nucleotide variants include deletion variants, substitution variants and addition or insertion variants.
- the polynucleotides may have a coding sequence which is a naturally occurring allelic variant of the coding sequence shown in Figures 1A-D and 2A-E.
- an allelic variant is an alternate form of a polynucleotide sequence which may have a substitution, deletion or addition of one or more nucleotides, which does not substantially alter the function of the encoded protein.
- using directed and other evolution strategies, one may make very minor changes in DNA sequence which can result in major changes in function.
- Fragments of the full length gene of the present invention may be used as hybridization probes for a cDNA or a genomic library to isolate the full length DNA and to isolate other DNAs which have a high sequence identity to the gene.
- Probes of this type preferably have at least 10, preferably at least 15, and even more preferably at least 30 bases and may contain, for example, at least 50 or more bases. In fact, probes of this type having at least up to 150 bases or greater may be utilized.
- the probe may also be used to identify a DNA clone corresponding to a full length transcript and a genomic clone or clones that contain the complete gene including regulatory and promoter regions, exons and introns.
- An example of a screen comprises isolating the coding region of the gene by using the known DNA sequence to synthesize an oligonucleotide probe.
- Labeled oligonucleotides having a sequence complementary to that of the gene, or a portion of the gene sequences, of the present invention are used to screen a library of genomic DNA to determine which members of the library the probe hybridizes to; members that hybridize in a complementary sense have an identity as described above.
- probes can be and are preferably labeled with an analytically detectable reagent to facilitate identification of the probe.
- useful reagents include but are not limited to radioactivity, fluorescent dyes or proteins capable of catalyzing the formation of a detectable product. The probes are thus useful to isolate complementary copies of DNA from other sources or to screen such sources for related sequences.
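The probe screening described above relies on the probe being complementary to the target strand. A minimal Python sketch of that relationship follows; the function names and the short target string are hypothetical and chosen only for illustration, and the 18-base sequence is the 5' primer quoted elsewhere in this document (SEQ ID NO: 13). Perfect-match searching is a simplification and does not model hybridization conditions.

```python
from typing import List

# Illustrative only: names below are hypothetical, not taken from the patent.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def probe_sites(target: str, probe: str) -> List[int]:
    """Return 0-based start positions on `target` where `probe` pairs perfectly.

    A probe pairs with the strand that reads as its reverse complement,
    so we search the target for reverse_complement(probe).
    """
    site = reverse_complement(probe).upper()
    target = target.upper()
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)
    return positions


# Made-up target: three flanking bases on each side of the SEQ ID NO: 13 sequence.
target = "AAACTGTGTTCTTCCATTAGCGGG"
probe = reverse_complement("CTGTGTTCTTCCATTAGC")

print(probe)                       # antisense probe sequence
print(probe_sites(target, probe))  # positions where the probe would anneal
```

A real screen tolerates mismatches depending on stringency, so an exact string search is only the limiting case of 100% identity.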
- the present invention further relates to polynucleotides which hybridize to the hereinabove-described sequences if there is at least 70%, preferably at least 90%, and more preferably at least 95% identity between the sequences.
- 70% identity would include within such definition a 70 bp fragment taken from a 100 bp polynucleotide, for example.
- the present invention particularly relates to polynucleotides which hybridize under stringent conditions to the hereinabove-described polynucleotides.
- stringent conditions means hybridization will occur only if there is at least 95% and preferably at least 97% identity between the sequences.
- polynucleotides which hybridize to the hereinabove described polynucleotides in a preferred embodiment encode proteins which retain substantially the same biological function or activity as the mature proteins encoded by the DNA of Figures 1A-D and 2A-E, respectively.
- identity refers to complementarity of polynucleotide segments.
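As a rough illustration of the identity thresholds above, the following Python sketch scores a query against a reference by the best ungapped overlap and expresses the matches as a percentage of the reference length, so that a 70 bp fragment taken from a 100 bp polynucleotide scores 70%. The function names and the ungapped scoring scheme are assumptions made for illustration; the text does not specify an algorithm, and actual hybridization depends on experimental conditions rather than on a computed score.

```python
# Hypothetical helpers; the scoring scheme is an assumption, not the patent's method.

def percent_identity(query: str, reference: str) -> float:
    """Best ungapped match count between query and reference,
    as a percentage of the reference length."""
    query, reference = query.upper(), reference.upper()
    if not reference:
        raise ValueError("reference must not be empty")
    best = 0
    # Slide the query across every possible offset relative to the reference.
    for offset in range(-len(query) + 1, len(reference)):
        matches = 0
        for i, base in enumerate(query):
            j = offset + i
            if 0 <= j < len(reference) and reference[j] == base:
                matches += 1
        best = max(best, matches)
    return 100.0 * best / len(reference)


def meets_threshold(identity: float, minimum: float = 95.0) -> bool:
    """True if an identity score reaches a stated minimum (95% by default,
    the lower bound given for stringent conditions)."""
    return identity >= minimum


reference = "ACGTTGCAAG" * 10   # made-up 100 bp reference
fragment = reference[10:80]     # 70 bp fragment taken from it

print(percent_identity(fragment, reference))
print(meets_threshold(percent_identity(reference, reference)))
```

Alignment tools that allow gaps would generally be used for real sequences; the ungapped sliding comparison is kept here only because it mirrors the fragment example in the text.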
- the polynucleotide may have at least 15 bases, preferably at least 30 bases, and more preferably at least 50 bases which hybridize to any part of a polynucleotide of the present invention and which has an identity thereto, as hereinabove described, and which may or may not retain activity.
- such polynucleotides may be employed as probes for the polynucleotides of SEQ ID NOS: 1 and 3, for example, for recovery of the polynucleotide or as a diagnostic probe or as a PCR primer.
- the present invention is directed to polynucleotides having at least a 70% identity, preferably at least 90% identity and more preferably at least a 95% identity to a polynucleotide which encodes the polypeptides of SEQ ID NOS: 2 and 4, respectively, as well as fragments thereof, which fragments have at least 15 bases, preferably at least 30 bases, more preferably at least 50 bases and most preferably fragments having up to at least 150 bases or greater, which fragments are at least 90% identical, preferably at least 95% identical and most preferably at least 97% identical to any portion of a polynucleotide of the present invention.
- the present invention further relates to proteins which have the deduced amino acid sequence of Figures 1A-D and 2A-E, respectively, (SEQ ID NOS: 2 and 4, respectively) as well as fragments, analogs and derivatives of such proteins.
- fragment, when referring to each of the polypeptides of Figures 1A-D and 2A-E, respectively, (SEQ ID NOS: 2 and 4, respectively) means a protein which retains essentially the same biological function or activity as such protein.
- an analog includes a proprotein which can be activated by cleavage of the proprotein portion to produce an active mature protein.
- the proteins of the present invention may be a recombinant protein, a natural protein or a synthetic protein, preferably a recombinant protein.
- each of the polypeptides of Figures 1A-D and 2A-E may be (i) one in which one or more of the amino acid residues are substituted with a conserved or non-conserved amino acid residue (preferably a conserved amino acid residue) and such substituted amino acid residue may or may not be one encoded by the genetic code, or (ii) one in which one or more of the amino acid residues includes a substituent group, or (iii) one in which the mature protein is fused with another compound, such as a compound to increase the half-life of the protein (for example, polyethylene glycol), or (iv) one in which the additional amino acids are fused to the mature protein, such as a leader or secretory sequence or a sequence which is employed for purification of the mature protein or a proprotein sequence.
- polypeptides and polynucleotides of the present invention are preferably provided in an isolated form, and preferably are purified to homogeneity.
- isolated means that the material is removed from its original environment (e.g., the natural environment if it is naturally occurring).
- a naturally-occurring polynucleotide or polypeptide present in a living animal is not isolated, but the same polynucleotide or protein, separated from some or all of the coexisting materials in the natural system, is isolated.
- Such polynucleotides could be part of a vector and/or such polynucleotides or proteins could be part of a composition, and still be isolated in that such vector or composition is not part of its natural environment.
- the polypeptides of the present invention include the respective proteins of SEQ ID NOS: 2 and 4 (in particular the mature proteins) as well as proteins which have at least 70% similarity (preferably at least 70% identity) to the respective proteins of SEQ ID NOS: 2 and 4 and more preferably at least 90% similarity (more preferably at least 90% identity) to the respective proteins of SEQ ID NOS: 2 and 4, and still more preferably at least 95% similarity (still more preferably at least 95% identity) to the respective proteins of SEQ ID NOS: 2 and 4, and also include portions of such proteins with such portion of the protein generally containing at least 30 amino acids and more preferably at least 50 amino acids and most preferably at least up to 150 amino acids, or more.
- the mature polypeptides according to the invention may comprise or omit an N-terminal methionine amino acid residue.
- similarity between two proteins is determined by comparing the amino acid sequence and its conserved amino acid substitutes of one protein to the sequence of a second protein.
- the definition of 70% similarity would include a 70 amino acid sequence fragment of a 100 amino acid sequence, for example, or a 70 amino acid sequence obtained by sequentially or randomly deleting 30 amino acids from the 100 amino acid sequence.
- a variant, i.e. a "fragment", "analog" or "derivative" polypeptide, and reference polypeptide may differ in amino acid sequence by one or more substitutions, additions, deletions, fusions and truncations, which may be present in any combination.
- substitutions are those that vary from a reference by conservative amino acid substitutions.
- Such substitutions are those that substitute a given amino acid in a polypeptide by another amino acid of like characteristics.
- conservative substitutions are the replacements, one for another, among the aliphatic amino acids Ala, Val, Leu and Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the acidic residues Asp and Glu; substitution between the amide residues Asn and Gln; exchange of the basic residues Lys and Arg; and replacements among the aromatic residues Phe and Tyr.
- variants which retain the same biological function and activity as the reference polypeptide from which it varies.
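The conservative substitution groups and the notion of similarity described above can be sketched in a few lines of Python. The groups are those listed in the text (aliphatic, hydroxyl, acidic, amide, basic, aromatic), written in one-letter codes; the function names and the requirement that the two sequences are already aligned and of equal length are assumptions made for this illustration.

```python
# Groups follow the text: Ala/Val/Leu/Ile, Ser/Thr, Asp/Glu, Asn/Gln, Lys/Arg, Phe/Tyr.
# Function names are hypothetical.
CONSERVATIVE_GROUPS = [
    {"A", "V", "L", "I"},  # aliphatic
    {"S", "T"},            # hydroxyl
    {"D", "E"},            # acidic
    {"N", "Q"},            # amide
    {"K", "R"},            # basic
    {"F", "Y"},            # aromatic
]


def is_conservative(a: str, b: str) -> bool:
    """True if replacing residue a with residue b stays within one listed group."""
    a, b = a.upper(), b.upper()
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def percent_similarity(seq: str, ref: str) -> float:
    """Percent of aligned positions that are identical or conservatively substituted.

    Assumes seq and ref are pre-aligned and of equal length (no gaps).
    """
    if len(seq) != len(ref) or not ref:
        raise ValueError("sequences must be non-empty and of equal length")
    similar = sum(
        1
        for x, y in zip(seq.upper(), ref.upper())
        if x == y or is_conservative(x, y)
    )
    return 100.0 * similar / len(ref)


print(is_conservative("L", "I"))           # both aliphatic
print(is_conservative("K", "E"))           # basic vs acidic
print(percent_similarity("AKDF", "VRDW"))  # A/V and K/R conservative, D identical, F/W not
```

Counting identity alone would simply drop the `is_conservative` term, which is why similarity is always at least as high as identity for the same pair.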
- Fragments or portions of the proteins of the present invention may be employed for producing the corresponding full-length protein by peptide synthesis; therefore, the fragments may be employed as intermediates for producing the full-length proteins. Fragments or portions of the polynucleotides of the present invention may be used to synthesize full-length polynucleotides of the present invention.
- the present invention also relates to vectors which include polynucleotides of the present invention, host cells which are genetically engineered with vectors of the invention and the production of proteins of the invention by recombinant techniques.
- Host cells are genetically engineered (transduced or transformed or transfected) with the vectors of this invention which may be, for example, a cloning vector such as an expression vector.
- the vector may be, for example, in the form of a plasmid, a phage, etc.
- the engineered host cells can be cultured in conventional nutrient media modified as appropriate for activating promoters, selecting transformants or amplifying the genes of the present invention.
- the culture conditions such as temperature, pH and the like, are those previously used with the host cell selected for expression, and will be apparent to the ordinarily skilled artisan.
- the polynucleotides of the present invention may be employed for producing proteins by recombinant techniques.
- the polynucleotide may be included in any one of a variety of expression vectors for expressing a protein.
- Such vectors include chromosomal, nonchromosomal and synthetic DNA sequences, e.g., derivatives of SV40; bacterial plasmids; phage DNA; baculovirus; yeast plasmids; vectors derived from combinations of plasmids and phage DNA, viral DNA such as vaccinia, adenovirus, fowl pox virus, and pseudorabies.
- any other vector may be used as long as it is replicable and viable in the host.
- the appropriate DNA sequence may be inserted into the vector by a variety of procedures.
- the DNA sequence is inserted into an appropriate restriction endonuclease site(s) by procedures known in the art. Such procedures and others are deemed to be within the scope of those skilled in the art.
- the DNA sequence in the expression vector is operatively linked to an appropriate expression control sequence(s) (promoter) to direct mRNA synthesis.
- as representative examples of such promoters, there may be mentioned: LTR or SV40 promoter, the E. coli lac or trp, the phage lambda PL promoter and other promoters known to control expression of genes in prokaryotic or eukaryotic cells or their viruses.
- the expression vector also contains a ribosome binding site for translation initiation and a transcription terminator.
- the vector may also include appropriate sequences for amplifying expression.
- the expression vectors preferably contain one or more selectable marker genes to provide a phenotypic trait for selection of transformed host cells such as dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, or such as tetracycline or ampicillin resistance in E. coli.
- the vector containing the appropriate DNA sequence as hereinabove described, as well as an appropriate promoter or control sequence, may be employed to transform an appropriate host to permit the host to express the protein.
- bacterial cells such as E. coli, Streptomyces, Bacillus subtilis
- fungal cells such as yeast
- insect cells such as Drosophila S2 and Spodoptera Sf9
- animal cells such as CHO, COS or Bowes melanoma
- adenoviruses; plant cells, etc.
- the present invention also includes recombinant constructs comprising one or more of the sequences as broadly described above.
- the constructs comprise a vector, such as a plasmid or viral vector, into which a sequence of the invention has been inserted, in a forward or reverse orientation.
- the construct further comprises regulatory sequences, including, for example, a promoter, operably linked to the sequence.
- Bacterial: pQE70, pQE60, pQE-9 (Qiagen), pBluescript II KS, ptrc99a, pKK223-3, pDR540, pRIT2T (Pharmacia); Eukaryotic: pXT1, pSG5 (Stratagene), pSVK3, pBPV, pMSG, pSVL SV40 (Pharmacia).
- any other plasmid or vector may be used as long as they are replicable and viable in the host.
- Promoter regions can be selected from any desired gene using CAT (chloramphenicol acetyltransferase) vectors or other vectors with selectable markers.
- Two appropriate vectors are pKK232-8 and pCM7.
- Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt, lambda PR, PL and trp.
- Eukaryotic promoters include CMV immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the appropriate vector and promoter is well within the level of ordinary skill in the art.
- the present invention relates to host cells containing the above-described constructs.
- the host cell can be a higher eukaryotic cell, such as a mammalian cell, or a lower eukaryotic cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a bacterial cell.
- Introduction of the construct into the host cell can be effected by calcium phosphate transfection, DEAE-Dextran mediated transfection, or electroporation (Davis, L., Dibner, M., Battey, I., Basic Methods in Molecular Biology, (1986)).
- the constructs in host cells can be used in a conventional manner to produce the gene product encoded by the recombinant sequence.
- the proteins of the invention can be synthetically produced by conventional peptide synthesizers.
- Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate promoters. Depending upon the expression host a mature protein may or may not contain an N-terminal methionine. Cell-free translation systems can also be employed to produce such proteins using RNAs derived from the DNA constructs of the present invention. Appropriate cloning and expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, N.Y., (1989), the disclosure of which is hereby incorporated by reference.
- Enhancers are cis-acting elements of DNA, usually about from 10 to 300 bp, that act on a promoter to increase its transcription. Examples include the SV40 enhancer on the late side of the replication origin bp 100 to 270, a cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers.
- recombinant expression vectors will include origins of replication and selectable markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct transcription of a downstream structural sequence.
- promoters can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or heat shock proteins, among others.
- the heterologous structural sequence is assembled in appropriate phase with translation initiation and termination sequences, and preferably, a leader sequence capable of directing secretion of translated protein.
- the heterologous sequence can encode a fusion protein including an N-terminal identification peptide imparting desired characteristics, e.g., stabilization or simplified purification of expressed recombinant product.
- Useful expression vectors for bacterial use are constructed by inserting a structural DNA sequence encoding a desired protein together with suitable translation initiation and termination signals in operable reading phase with a functional promoter.
- the vector will comprise one or more phenotypic selectable markers and an origin of replication to ensure maintenance of the vector and to, if desirable, provide amplification within the host.
- Suitable prokaryotic hosts for transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be employed as a matter of choice.
- useful expression vectors for bacterial use can comprise a selectable marker and bacterial origin of replication derived from commercially available plasmids comprising genetic elements of the well known cloning vector pBR322 (ATCC 37017).
- Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden) and pGEM1 (Promega Biotec, Madison, WI, USA).
- pBR322 "backbone" sections are combined with an appropriate promoter and the structural sequence to be expressed.
- the selected promoter is induced by appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an additional period.
- Cells are typically harvested by centrifugation, disrupted by physical or chemical means, and the resulting crude extract retained for further purification.
- Microbial cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents; such methods are well known to those skilled in the art.
- mammalian cell culture systems can also be employed to express recombinant protein.
- mammalian expression systems include the COS-7 lines of monkey kidney fibroblasts, described by Gluzman, Cell, 23:175 (1981), and other cell lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa and BHK cell lines.
- Mammalian expression vectors will comprise an origin of replication, a suitable promoter and enhancer, and also any necessary ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking nontranscribed sequences.
- DNA sequences derived from the SV40 splice and polyadenylation sites may be used to provide the required nontranscribed genetic elements.
- polypeptides according to the invention can be recovered and purified from recombinant cell cultures by methods including ammonium sulfate or ethanol precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic interaction chromatography, affinity chromatography, hydroxylapatite chromatography and lectin chromatography. Protein refolding steps can be used, as necessary, in completing configuration of the mature protein. Finally, high performance liquid chromatography (HPLC) can be employed for final purification steps.
- the polypeptides of the present invention may be a naturally purified product, or a product of chemical synthetic procedures, or produced by recombinant techniques from a prokaryotic or eukaryotic host (for example, by bacterial, yeast, higher plant, insect and mammalian cells in culture).
- the proteins of the present invention may be glycosylated or may be non-glycosylated. Proteins of the invention may or may not also include an initial methionine amino acid residue.
- Antibodies generated against a protein corresponding to a sequence of the present invention can be obtained by direct injection of the respective protein (or a portion of the protein) into an animal or by administering the proteins to an animal, preferably a nonhuman. The antibody so obtained will then bind the respective protein itself. In this manner, even a sequence encoding only a fragment of the proteins can be used to generate antibodies binding the whole native proteins. Such antibodies can then be used to isolate the protein from cells expressing that protein and may also be useful as antimicrobials, or controls in assays to determine the efficacy of potential antimicrobials.
- any technique which provides antibodies produced by continuous cell line cultures can be used. Examples include the hybridoma technique (Kohler and Milstein, Nature, 256:495-497, 1975), the trioma technique, the human B-cell hybridoma technique (Kozbor et al., Immunology Today 4:72, 1983), and the EBV-hybridoma technique to produce human monoclonal antibodies (Cole et al., in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96, 1985). Techniques described for the production of single chain antibodies (U.S. Patent 4,946,778) can be adapted to produce single chain antibodies to immunogenic protein products of this invention. Also, transgenic mice may be used to express humanized antibodies to immunogenic protein products of this invention.
- Antibodies generated against a protein of the present invention may be used in screening for similar proteins from other organisms and samples.
- screening techniques are known in the art, for example, one such screening assay is described in Sambrook and Maniatis, Molecular Cloning: A Laboratory Manual (2d Ed.), vol. 2: Section 8.49, Cold Spring Harbor Laboratory, 1989, which is hereby incorporated by reference in its entirety.
- Northern hybridization was performed according to standard protocols. In brief, 5 μg of each mRNA sample was incubated with 50% formamide, 6.5% formaldehyde, and 1x MOPS (pH 7.0) at 55°C for 15 minutes. The samples were electrophoresed in 1.2% agarose/formaldehyde gels and transferred to nylon membrane (HYBOND, Amersham, UK) by the capillary method.
- Blots containing RNA samples from cell lines as well as multitissue Northern blots obtained commercially were hybridized for 2 hrs in 50% formamide, 5X SSPE, 10X Denhardt's, 2% SDS, and 100 μg/ml denatured salmon sperm DNA (Clontech) with randomly primed 32P-dCTP labelled probe and exposed to film after washing. The blots were then stripped and rehybridized with other probes.
- RNAse protection assay was performed using the MAXIscript T3 in vitro transcription kit (Ambion, Austin, Texas).
- the anti-sense RNA probe of SZF1 (bp 1256-1421) was synthesized by runoff transcription using bacteriophage T3 RNA polymerase on a pBluescript II KS- plasmid containing a fragment of SZF1 linearized at the SacI site. This results in a probe of 230 bases containing 165 bases of SZF1 that would be protected by SZF1 message.
- 5 μg of total RNA from each sample was hybridized with the 32P-UTP labelled antisense RNA probe, and then treated with RNAse A and T1.
- a β-actin probe was included in the hybridization reactions as an internal control for RNA loading. Protected fragments were resolved on an 8M-Urea, 6% acrylamide gel and exposed to film (Kodak X-OMAT).
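The fragment sizes expected on the gel follow from simple arithmetic on the probe described above. The short Python sketch below uses only the numbers stated in the text (a 230-base probe spanning bp 1256-1421); the helper name and the convention that the insert length equals the difference of the two coordinates are assumptions made for illustration.

```python
# Numbers come from the text; the helper and coordinate convention are assumptions.
PROBE_LENGTH = 230
SZF1_START_BP = 1256
SZF1_END_BP = 1421


def expected_fragments(probe_length: int, start_bp: int, end_bp: int) -> dict:
    """Sizes (in bases) relevant to reading an RNase protection gel."""
    protected = end_bp - start_bp               # portion that pairs with message
    non_target = probe_length - protected       # vector-derived bases, digested away
    return {
        "undigested_probe": probe_length,
        "protected_fragment": protected,
        "non_target_bases": non_target,
    }


sizes = expected_fragments(PROBE_LENGTH, SZF1_START_BP, SZF1_END_BP)
print(sizes)
```

The difference between the undigested probe and the protected fragment is what lets residual full-length probe be told apart from signal that reflects message on the gel.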
- M-MLV-RT: Moloney murine leukemia virus reverse transcriptase
- Three independent P1 plasmids (1629, 1630 and 1631) with approximately 85 kb SZF1 genomic fragments were obtained from a human genomic DNA P1 library (DMPC-HFF#1; Genome Systems, St. Louis, MO) by PCR screening using SZF1 oligonucleotides (5' primer CTGTGTTCTTCCATTAGC (SEQ ID NO: 13); 3' primer GGCCTTAGCCATTTGTCT (SEQ ID NO: 14)).
- P1 plasmids # 1629, 1630, and 1631 containing the entire SZF1 gene were nick-translated with biotin-14 dATP (BRL, Gaithersburg, MD).
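Basic properties of the two PCR screening primers given above (SEQ ID NOS: 13 and 14) can be checked with a short Python sketch. The melting temperature uses the Wallace rule, Tm = 2(A+T) + 4(G+C) in °C, which is a common rule of thumb for short oligonucleotides and is not something the text states was used; the function names are likewise hypothetical.

```python
from collections import Counter

# Primer sequences are quoted from the text; everything else is illustrative.
PRIMERS = {
    "SEQ ID NO: 13 (5' primer)": "CTGTGTTCTTCCATTAGC",
    "SEQ ID NO: 14 (3' primer)": "GGCCTTAGCCATTTGTCT",
}


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    counts = Counter(seq.upper())
    return 100.0 * (counts["G"] + counts["C"]) / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate melting temperature (°C) by the Wallace rule: 2(A+T) + 4(G+C)."""
    counts = Counter(seq.upper())
    return 2 * (counts["A"] + counts["T"]) + 4 * (counts["G"] + counts["C"])


for name, seq in PRIMERS.items():
    print(f"{name}: length={len(seq)}, GC={gc_percent(seq):.1f}%, Tm~{wallace_tm(seq)} C")
```

Both primers are 18 bases long with estimated melting temperatures within a few degrees of each other, which is the usual reason a primer pair can share one annealing temperature.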
- All cell lines except DMS53 were grown in RPMI 1640 supplemented with 10% fetal calf serum, 2 mM glutamine, and 100 U/ml penicillin and streptomycin (GIBCO/BRL, Grand Island, NY). DMS53 was grown in Waymouth's MB 752/1 media with the same supplements as above.
- the normal lung epithelial primary cell line (HBE) came from a normal lung tissue sample and was cultured in keratinocyte growth medium with bovine pituitary extract (BPE) at 30 ng/ml (Clonetics, Walkersville, MD).
- CD34+ and CD34- cells were isolated from normal bone marrow cells by immunomagnetic separation.
- ML-1 cells and HL60 cells were incubated with TPA (12-O-tetradecanoylphorbol-13-acetate) at a concentration of 33 nM for 24 hours.
- Total RNA was isolated from cell lines and the primary tumor samples by the guanidinium thiocyanate method and polyadenylated RNA was prepared using an mRNA isolation kit (Becton Dickinson Labware, Bedford, MA).
- Example 8 In vitro transcription and translation and immunoprecipitation.
- the lysates were diluted into PBS with 1% NP-40. Proteins were immunoprecipitated 1 hour under these conditions with either the preimmune sera (10 μl), polyclonal antisera (10 μl) that was raised against a synthesized peptide (RQKAVTAEKSSDKRQ) located upstream of the zinc fingers (shown in Fig. 1A-D and 2A-E in italics), or the same antisera (10 μl) preincubated with the peptide (25 mg) used to generate it.
- the polyclonal antisera was prepared by HRP (HRP Inc. Denver, PA) from a NZW female rabbit with three boosts of the KLH conjugated peptide.
- Antigen-antibody complexes were isolated on protein A-sepharose beads (Sigma, St. Louis, MO), and the pellets were washed five times with PBS containing 0.5% Tween-20. The [35S]methionine-labeled samples were analyzed by SDS-PAGE and autoradiography.
- ADDRESSEE CARELLA, BYRNE, BAIN, GILFILLAN,
- Leu Glu Ile Gln Leu Ser Pro Ala Gln Asn Ala Ser Ser Glu
- ATCTCTAGCC TTTGCTGTTT CCTCTCCTAC CCCACCTTTA GATTTTACTC AGAGTTCAGT
- CTCCAGCCCT ACAATCTGAG GGACACCTTT ACCAGGTCCC CTTCCTAACC CTCCAGTCCC
- MOLECULE TYPE OLIGONUCLEOTIDE
- xi SEQUENCE DESCRIPTION: SEQ ID NO: 8:
Abstract
This invention relates to newly identified polynucleotides, polypeptides encoded by such polynucleotides, the use of such polynucleotides and polypeptides, as well as the production and isolation of such polynucleotides and polypeptides. More particularly, the polynucleotides and polypeptides of the present invention have been putatively identified as being transcription factors, and in particular zinc finger transcription factors, and still more particularly as being involved in hematopoiesis.
Description
CELL ZINC FINGER POLYNUCLEOTIDES AND SPLICE VARIANT POLYPEPTIDES ENCODED THEREBY
This invention relates to newly identified polynucleotides, polypeptides encoded by such polynucleotides, the use of such polynucleotides and polypeptides, as well as the production and isolation of such polynucleotides and polypeptides. More particularly, the polynucleotides and polypeptides of the present invention have been putatively identified as being transcription factors, and in particular zinc finger transcription factors, and still more particularly as being involved in hematopoiesis.
Generally, such proteins are of importance to hematopoietic stem/progenitor cell differentiation. Hematopoiesis is a complex process in which the critical stage and lineage specific regulators establish lineage commitment through their combined action. It is important to identify new transcription factors that are expressed in hematopoietic stem/progenitor cells since each factor may control transcription of one or more genes involved in the processes of self-renewal or differentiation of the cells.
Transcription factor proteins bind to sets of genes and stimulate or repress the transcription of these genes, thus mediating cell fate decisions and regulating differentiation. Hematopoietic development involves many such transcription factors and the phenotypes of the differentiated lineages reflect the sets of genes activated or repressed by factors 1-2 of the invention.
The processes of self-renewal and multilineage differentiation of hematopoietic stem cells occur continuously during the lifetime of the organism and can be regulated in response to need. By generating knockout mice, a number of transcription factors have been implicated in hematopoietic stem/progenitor cell development. For example, homozygous disruption of either AML1, c-myb, PU.1, or scl/tal1 results in defects in fetal hematopoiesis in mice which die during early embryonic development. These genes appear to be essential for the development of multiple hematopoietic lineages. Transcription factors which appear to be lineage-specific have also been identified. These genes include E2A and Pax-5 which are required for B cell development, rbtn2 which is necessary for primitive yolk sac-derived erythropoiesis, and NF-E2 which is essential for the development of megakaryocytes.
Several zinc finger genes play important roles in hematopoiesis. Among them, the GATA3 gene appears to be involved in early development of multiple cell lineages. GATA3-deficient embryos die in early development as a result of impaired fetal liver hematopoiesis. Homologous knockout of another zinc finger gene, Ikaros, blocked lymphoid development. Disruption of GATA1 and GATA2 in mice led to lineage-selective deficits in erythropoiesis. Egr1, a myeloid differentiation primary response gene, has been shown to be essential for differentiation of hematopoietic cells along the macrophage lineage. MZF1, which is preferentially expressed in hematopoietic cells, may inhibit myeloid differentiation by negative regulation of hematopoietic genes such as CD34 and c-myb.
The present invention provides in one aspect a novel polypeptide which has been putatively identified as a hematopoietic factor and more particularly as Stem Cell Zinc Finger ("SZF1"), which is encoded by a polynucleotide obtained from a cDNA library prepared from CD34+ human bone marrow cells. SZF1 codes for a protein containing a C2H2 type zinc finger and a KRAB domain. Two alternatively spliced transcripts were isolated: SZF1-1 and SZF1-2, respectively. SZF1-2 was expressed in most cell types and tissues, whereas SZF1-1 appears limited to expression in CD34+ cells. Since SZF1-1 appears limited to expression in CD34+ cells and not in more mature cells, it can be utilized as a marker for undifferentiated CD34+ cells. In particular, antibodies to SZF1-1 may be utilized to confirm a population of CD34+ cells. Also, SZF1-1 appears to be a factor involved in transcription of one or more genes for undifferentiated replication of CD34+ cells. SZF1-2 appears to be involved in the maturation of CD34+ cells in that certain concentrations of SZF1-2 can cause hematopoietic cells to differentiate and mature rather than to merely replicate. The gene for each of SZF1-1 and SZF1-2 is found on chromosome 3. Therefore, the polynucleotides encoding SZF1-1 and SZF1-2 are useful to generate probes or antibodies that may be utilized to detect the presence or absence of chromosome 3 in a sample. The polynucleotides according to the invention may be utilized for gene therapy in a host to replace or supplement a defective SZF1-1 or SZF1-2 gene.
In accordance with one aspect of the present invention, there are provided novel polypeptides, as well as active fragments, analogs and derivatives thereof.
In accordance with another aspect of the present invention, there are provided isolated nucleic acid molecules encoding the proteins of the present invention including mRNAs, cDNAs, genomic DNAs as well as active analogs and fragments of such proteins.
In accordance with another aspect of the present invention there are provided isolated nucleic acid molecules encoding mature polypeptides expressed by the DNA contained in ATCC Deposit Nos. .
In accordance with yet a further aspect of the present invention, there is provided a process for producing such polypeptides by recombinant techniques comprising culturing recombinant prokaryotic and/or eukaryotic host cells, containing a nucleic acid sequence of the present invention, under conditions promoting expression of said proteins and subsequent recovery of said proteins.
In accordance with yet a further aspect of the present invention, there is provided a process for utilizing such polypeptides for analyzing potential agonists to the polypeptides. Another process utilizes the polynucleotides to assay for compounds which bind said polynucleotides and would thus block expression of any products from said polynucleotides.
In accordance with yet a further aspect of the present invention, there are also provided nucleic acid probes comprising nucleic acid molecules of sufficient length to specifically hybridize to a nucleic acid sequence of the present invention.
In accordance with yet a further aspect of the present invention, there is provided a process for utilizing such proteins, or polynucleotides encoding such proteins, for purposes related to scientific research, for example, to generate probes for identifying similar sequences which might encode similar proteins from other organisms by using certain regions, i.e., conserved sequence regions, of the nucleotide sequence.
These and other aspects of the present invention should be apparent to those skilled in the art from the teachings herein.
Brief Description of the Drawings
The following drawings are illustrative of an embodiment of the invention and are not meant to limit the scope of the invention as encompassed by the claims.
Figures 1A-1D collectively show the polynucleotide sequence of the SZF1-1 cDNA (SEQ ID NO: 1) and its predicted amino acid sequence (SEQ ID NO: 2). The nucleotide and predicted amino acid sequences of SZF1-1 are shown with numbered nucleotides and amino acids listed to the left of each product. RNA splicing sites as determined by sequencing comparison of cDNAs with genomic DNA are shown by each character (Λ). The deduced amino acid sequence starts from the conserved Kozak translation start site (underlined). Zinc fingers are underlined from the initial conserved cysteine to the final conserved histidine beginning at amino acid 247. KRAB-A and -B domains are indicated above each such domain. Phosphorylation consensus sites for potential casein kinase II (ck2) and protein kinase C (PKC), and PEST sequences are labelled above each element. The peptide sequence used to generate polyclonal serum is shown in italics.
Figures 2A-2E collectively show the polynucleotide sequence of the SZF1-2 cDNA (SEQ ID NO: 3) and its predicted amino acid sequence (SEQ ID NO: 4). The nucleotide and predicted amino acid sequences of SZF1-2 are shown with numbered nucleotides and amino acids listed to the left of each product. RNA splicing sites as determined by sequencing comparison of cDNAs with genomic DNA are shown by each character (Λ). The deduced amino acid sequence starts from the conserved Kozak translation start site (underlined). Zinc fingers are underlined from the initial conserved cysteine to the final conserved histidine beginning at amino acid 247. KRAB-A and -B domains are indicated above each such domain. Phosphorylation consensus sites for potential casein kinase II (ck2) and protein kinase C (PKC), and PEST sequences are labelled above each element. Two instability motifs (ATTTA) in the 3' untranslated region are underlined. The peptide sequence used to generate polyclonal serum is shown in italics.
Figures 3A and 3B show the results from labelling and immunoprecipitation of in vitro transcribed and translated (TNT) SZF1-1 and SZF1-2 proteins. Figure 3A: [35S]methionine was used to label the proteins generated from TNT of full length SZF1-1 and SZF1-2 constructs. These products, together with the products of a TNT reaction in which no template was added (dH2O), were resolved on a 10% SDS-polyacrylamide gel which was then exposed to film. MW markers are shown to the left. Figure 3B: The TNT products of SZF1-1 and SZF1-2 were immunoprecipitated (IP) with a polyclonal antiserum generated to a peptide which was present in the predicted ORF of both transcripts. IP reactions were also conducted after preincubation of the antiserum with 25 μg of the peptide to demonstrate specificity of the antiserum (+ peptide lanes). The IP products were then processed as above.
Figures 4A and 4B illustrate KRAB and zinc finger homologies, respectively, between SZF1-1, SZF1-2 and several related proteins. Specifically, the KRAB domains (Figure 4A) and zinc finger domains (Figure 4B) of SZF1-1 (line 1 of each row) and SZF1-2 (line 2 of each row) were aligned with those of the transcriptional repressor proteins ZNF133 (line 3), Kid1 (line 4) and ZNF85 (line 5). Protein alignments were generated with the GCG Pileup program. The sixth line of each row shows the amino acid residues of a consensus sequence, which only includes a particular amino acid if at least three of the compared proteins have the identical amino acid at that position.
Figure 5 shows the expression results of SZF1 in normal human tissue. Northern blots containing 2 μg of polyA+ RNA from multiple normal human tissues were hybridized with a SZF1 3' probe which contained the 3' region (nucleotides 1414-2389) of SZF1-1 downstream of the zinc fingers (top portion). The blots were then stripped and reprobed with β-actin (lower portion). The source of RNA is noted above each lane. MW size markers are noted to the left.
Figure 6 illustrates RNAse protection analysis of the expression of SZF1 in hematopoietic cell lines. A 5 μg aliquot of total RNA from each cell line sample was hybridized with a 32P-UTP labelled 230 base antisense RNA probe which contains 165 bp of SZF1 (bases 1256-1421) attached to 65 bp of vector sequence. A β-actin probe was included in the hybridization reactions as an internal control for RNA loading. After hybridization, the samples were treated with RNAse A and T1 followed by electrophoresis on a 6% polyacrylamide gel and autoradiography. The SZF1 and β-actin probes are shown on the left. The source of the RNA added is noted above each lane: BM (total bone marrow cells); Jurkat, Molt-3, Molt-16, RPMI-8402 (T-lineage acute lymphocytic leukemia); K422, RL, REH (B-lineage acute lymphocytic leukemia); Raji (Burkitt's lymphoma); HEL (erythroleukemia); K562 (chronic myelogenous leukemia); ML-1, KG1a (acute myelocytic leukemia). The protected SZF1 and actin fragments are indicated to the right.
Figure 7 shows SZF1 expression with differentiation. HL60 cells and ML-1 cells were treated with TPA for 24 hours. 5 μg of total RNA isolated from control cells and cells exposed to TPA was added to the RNAse protection assay with the same 32P-UTP labelled 230 base antisense RNA probe used in Figure 6. The source of each RNA sample is noted above each lane. The SZF1 and β-actin protected bands are indicated to the left.
Figures 8A and 8B show the results from RT-PCR analysis of SZF1 expression in CD34+ and CD34- cells. In Figure 8A, CD34+ and CD34- cells were purified from human bone marrow mononuclear cells by immunomagnetic separation and RNA was isolated by the guanidinium thiocyanate method. A 1 μg aliquot of each RNA sample was reverse transcribed using random hexamer priming. RT-PCR fragments generated by primer pairs shown in Figure 8B are 1-2 (210 bp), 3-4 (324 bp), 5-6 (392 bp) and 7-8 (133 bp for SZF1-1 or 1417 bp for SZF1-2). The RT-PCR products were resolved in a 2% agarose gel, stained with ethidium bromide and photographed. The source of each RNA and the primer pairs used are shown above each lane. DNA size markers are shown on the left. Figure 8B includes a map which indicates the location of the primer pairs on the SZF1-1 and SZF1-2 cDNAs.
Figures 9A, 9B and 9C illustrate the genomic organization of SZF1 cDNAs and predicted motif structures. The SZF1-1 cDNA (empty boxes) and SZF1-2 cDNA (hatched boxes) as well as the boundary of introns and exons are schematically presented in B with the approximate location of EcoRI sites (R) indicated. cDNA motifs and sequences that encode predicted zinc fingers, KRAB-A and KRAB-B domains and the PEST sequences are indicated in A for SZF1-1 and in C for SZF1-2. The lines indicate the splicing events that occur from SZF1 to result in the observed cDNAs.
Figures 10A and 10B show the chromosomal localization of SZF1 by FISH. For Figure 10A, P1 plasmids that encompass the SZF1 gene were nick-translated with biotin-14 and hybridized to metaphase chromosome spreads from normal lymphocytes cultured with BrdU. The specific paired FISH signals of the SZF1 gene are shown with an arrow on chromosome 3 (DAPI-stained). Figure 10B shows metaphases that were G-banded and photographed prior to FISH. The arrow indicates band 3p21, corresponding to the FISH signals seen in Figure 10A.
Figure 11 illustrates the chromosome ideogram of paired signals from FISH as described in Figure 10A. Each dot represents a paired signal seen on metaphase chromosomes as described in Figure 10B.
Figures 12A and 12B show the results of SZF1 expression from lung and hematopoietic tissues and cell types. In Figure 12A, 5 μg of polyA+ RNA from each indicated sample was hybridized with 32P-labelled SZF1 3' (top), SZF1 5' (middle) or β-actin (bottom) probes. HFL (normal human fetal lung); HBE (primary human lung epithelial cell culture); HEL, K562, and ML-1 (hematopoietic cell lines); H209, H249, H1385, H82, DMS53, N417 (SCLC cell lines); H727, H385, H157, A549 (NSCLC cell lines); 965950 and 277055 (primary small cell lung cancer tissues). Molecular weight markers are shown to the left. Figure 12B illustrates a cDNA map of SZF1 that shows regions covered by the 5' and 3' SZF1 probes.
Definitions
In order to facilitate understanding of the description and examples which follow, certain frequently occurring methods and/or terms will be described.
The term "gene" means the segment of DNA involved in producing a polypeptide chain; it includes regions preceding and following the coding region (leader and trailer) as well as intervening sequences (introns) between individual coding segments (exons) .
A coding sequence is "operably linked to" another coding sequence when RNA polymerase will transcribe the two coding sequences into a single mRNA, which is then translated into a single polypeptide having amino acids derived from both coding sequences. The coding sequences need not be contiguous to one another so long as the expressed sequences ultimately process to produce the desired protein.
"Recombinant" proteins refer to proteins produced by recombinant DNA techniques; i.e., produced from cells transformed by an exogenous DNA construct encoding the desired protein. "Synthetic" proteins are those prepared by chemical synthesis .
A DNA "coding sequence of" or a "nucleotide sequence encoding" a particular protein, is a DNA sequence which is transcribed and translated into a protein when placed under the control of appropriate regulatory sequences .
"Plasmids" are designated by a lower case "p" preceded and/or followed by capital letters and/or numbers. The starting plasmids herein are either commercially available, publicly available on an unrestricted basis, or can be constructed from available plasmids in accord with published procedures. In addition, equivalent plasmids to those described are known in the art and will be apparent to the ordinarily skilled artisan.
"Digestion" of DNA refers to catalytic cleavage of the DNA with a restriction enzyme that acts only at certain sequences in the DNA. The various restriction enzymes used herein are commercially available and their reaction conditions, cofactors and other requirements were used as would be known to the ordinarily skilled artisan. For analytical purposes, typically 1 μg of plasmid or DNA fragment is used with about 2 units of enzyme in about 20 μl of buffer solution. For the purpose of isolating DNA fragments for plasmid construction, typically 5 to 50 μg of DNA are digested with 20 to 250 units of enzyme in a larger volume. Appropriate buffers and substrate amounts for particular restriction enzymes are specified by the manufacturer. Incubation times of about 1 hour at 37 °C are ordinarily used, but may vary in accordance with the
supplier's instructions. After digestion the reaction is electrophoresed directly on a polyacrylamide gel to isolate the desired fragment.
Size separation of the cleaved fragments is performed using the 8 percent polyacrylamide gel described by Goeddel et al., Nucleic Acids Res., 8:4057 (1980).
"Oligonucleotides" refers to either a single stranded polydeoxynucleotide or two complementary polydeoxynucleotide strands which may be chemically synthesized. Such synthetic oligonucleotides have no 5' phosphate and thus will not ligate to another oligonucleotide without adding a phosphate with an ATP in the presence of a kinase. A synthetic oligonucleotide will ligate to a fragment that has not been dephosphorylated .
"Ligation" refers to the process of forming phosphodiester bonds between two double stranded nucleic acid fragments (Maniatis, T., et al . . Id. , p. 146) . Unless otherwise provided, ligation may be accomplished using known buffers and conditions with 10 units of T4 DNA ligase ("ligase") per 0.5 μg of approximately equimolar amounts of the DNA fragments to be ligated.
Unless otherwise stated, transformation was performed as described in Sambrook and Maniatis, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, 1989.
Summary of the Invention
In accordance with an aspect of the present invention, there are provided isolated nucleic acids (polynucleotides) from SZF1-1 and SZF1-2 (SEQ ID NOS: 1 and 3) as shown in Figures 1A-D, collectively, and Figures 2A-E, collectively, which encode, respectively, the mature SZF1-1 and SZF1-2 proteins having the continuous deduced amino acid sequences shown in Figures 1A-D and 2A-E.
In accordance with another aspect of the present invention, there are provided isolated polynucleotides encoding the polypeptides of the present invention. The deposited material is a genomic clone comprising DNA encoding a polypeptide of the present invention, in a plasmid DNA vector form. As deposited with the American Type Culture Collection, 12301 Parklawn Drive, Rockville, Maryland 20852, USA, the deposited material is assigned ATCC Deposit No. .
The deposits have been made under the terms of the Budapest Treaty on the International Recognition of the Deposit of Micro-organisms for Purposes of Patent Procedure. The clones will be irrevocably (without restriction or condition) released to the public upon the issuance of a patent. The deposits are provided merely as a convenience to those of skill in the art and are not an admission that any deposit would be required under 35 U.S.C. §112. The sequences of the polynucleotides contained in the respective deposited materials, as well as the amino acid sequences of the polypeptides encoded thereby, are controlling in the event of any conflict with any description of sequences herein. A license may be required to make, use or sell the deposited material, and no such license is hereby granted.
Detailed Description of the Invention
The polynucleotides of this invention coding for the proteins of this invention were originally recovered from a genomic gene library derived from CD34+ cells.
The identification of SZF1 was made through the random sequencing of a portion of several hundred clones from a cDNA library prepared from human bone marrow CD34+ cells. One of the fragments was unique but showed homology to the Kruppel family of zinc finger proteins. We therefore sequenced the entire 1200 bp fragment. Then, cDNA libraries from CD34+ cells, as well as libraries made from K562 cells and human lung (which in preliminary experiments showed expression of the transcript), were screened with the 1.2 kb fragment. Overlapping cDNA fragments which hybridized to the probe gave rise to two alternatively spliced cDNA products. Most regions were isolated several times in more than one cDNA library. The complete SZF1-1 nucleotide and predicted amino acid sequences are shown in Figs. 1A-1D (collectively); the complete SZF1-2 nucleotide and predicted amino acid sequences are shown in Figs. 2A-2E (collectively).
Two polypeptide products of 421 and 361 amino acids are predicted from the open reading frames (ORFs) of the two transcripts, termed SZF1-1 (SEQ ID NO: 2) and SZF1-2 (SEQ ID NO: 4). The size of the proteins is based on the usage of the 5'-proximal ATG codon which follows the consensus Kozak site (underlined in Figs. 1A-1D and in Figs. 2A-2E) downstream of an in-frame stop codon. To test the prediction that this ATG is the most frequently used start site for translation initiation, the in vitro transcription and translation assay was performed with full length SZF1-1 and SZF1-2 constructs. As shown in Fig. 3A, the longest and most intense bands correspond to 48 kD for SZF1-1 and 41 kD for SZF1-2. This corresponds to the MW predicted by the full length ORF of both transcripts if the predicted initiating methionine is used. The smaller bands seen in both lanes could represent degradation products or initiation at internal methionines. To confirm that the full length products represent translation from the predicted ORF, immunoprecipitation was performed with a polyclonal antiserum generated to a peptide that is shared by the predicted ORF of both SZF1-1 and SZF1-2. In Fig. 3B it is evident that the antiserum recognizes the full length product of both SZF1-1 and SZF1-2.
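As a rough cross-check of the sizes discussed above, the mass of a polypeptide can be approximated from its residue count. The sketch below assumes an average residue mass of about 110 Da, which is a common rule of thumb and not a value taken from this disclosure; the function name is illustrative only. An exact mass would require the actual amino acid composition of SEQ ID NOS: 2 and 4.

```python
# Rough molecular-weight estimate from residue count.
# Assumption: ~110 Da average residue mass (a rule of thumb,
# not a value stated in this document).
AVG_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues: int) -> float:
    """Return an approximate protein mass in kDa for a chain of n_residues."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


# ORF lengths reported in the text.
orf_lengths = {"SZF1-1": 421, "SZF1-2": 361}
estimates = {name: approx_mass_kda(n) for name, n in orf_lengths.items()}

for name, kda in estimates.items():
    print(f"{name}: ~{kda:.1f} kDa")
```

Under this assumption the estimates fall within a few kD of the 48 kD and 41 kD bands reported for Fig. 3A, which is about the agreement expected from a composition-independent approximation.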
Both transcripts share the same amino-terminal sequence upstream of the zinc fingers and through the first three zinc fingers, but diverge after amino acid 349 toward the end of the fourth zinc finger. In a region close to the amino terminal end, a Kruppel-associated box (KRAB) domain is present with highly conserved A (amino acids 31-73) and B (amino acids 74-93) elements. This domain was initially noted in zinc finger genes from Xenopus. The KRAB domain has been suggested to form an α-helix and may be involved in protein-protein interactions. Towards the carboxy end, four predicted zinc fingers of the C2H2 type are present in SZF1-2, followed by 10 amino acids before the stop codon. In SZF1-1, the fourth zinc finger is incomplete with the final histidine replaced by a glutamine residue. The lack of a second histidine in the final zinc finger has been observed in other zinc finger proteins. An additional 74 amino acids follow the zinc fingers at the carboxy terminus of SZF1-1.
Several regions of the deduced proteins suggest that they may be substrates for phosphorylation. There are three potential casein kinase II phosphorylation sites upstream of the zinc fingers. In addition, there are several regions in and near the zinc fingers of both SZF1-1 and SZF1-2 that match the consensus motif for cyclic AMP (cAMP)-dependent protein kinase and protein kinase C phosphorylation sites.
Some features suggest that the mRNA and protein products of SZF1 may have short half-lives. Two AUUUA sequences appear beginning at nucleotides 1641 and 1725 in the 3' untranslated region of SZF1-2. Those sequences confer mRNA instability and lead to a short half-life in other messages which express this feature. There is one PEST consensus region present in both predicted proteins, and a second PEST sequence near the carboxy-terminus of SZF1-1. PEST sequences, rich in proline, acidic, serine, and threonine residues, are often present in proteins with short half-lives and may signal for their rapid degradation by the proteolytic machinery of the cell.
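Locating AUUUA elements of the kind described above amounts to a simple pattern scan. The following is a minimal sketch, assuming the sequence is available as a plain string; the function name and the toy sequence are hypothetical and are not taken from SEQ ID NO: 3. RNA input is converted to the DNA alphabet so that either AUUUA or ATTTA is matched.

```python
def find_motif(sequence: str, motif: str = "ATTTA") -> list[int]:
    """Return 1-based start positions of every motif hit, including overlaps.

    Both arguments may use U or T; they are normalised to the DNA alphabet.
    """
    seq = sequence.upper().replace("U", "T")
    pattern = motif.upper().replace("U", "T")
    positions = []
    start = seq.find(pattern)
    while start != -1:
        positions.append(start + 1)  # report 1-based coordinates
        start = seq.find(pattern, start + 1)  # step by one to allow overlaps
    return positions


# Hypothetical toy sequence, used only to exercise the function.
toy_utr = "GGCATTTACCGTAUUUAGG"
print(find_motif(toy_utr))
```

Applied to a full cDNA sequence, the reported positions could be compared with the nucleotide coordinates given in the text.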
The cDNA sequences of SZF1-1 and SZF1-2 were used to search the available nucleotide databases to determine the most highly related genes. The result showed that ZNF133, Kid1 and ZNF85 were the most highly related genes, with homologies of 65%, 55% and 45% at the nucleotide level, respectively. ZNF133, Kid1, and ZNF85 (Poncelet, D.A., University of Liege, pers. comm.) are all zinc finger proteins that have been suggested to play roles in transcriptional repression. The region of SZF1 that encompasses the KRAB and zinc finger domains has the highest degree of homology to these transcriptional factors (Figs. 4A and 4B). The identity of the KRAB domain of SZF1 with that of ZNF133 is 68% for KRAB-A and 22% for KRAB-B; with Kid1 it is 74% for KRAB-A and 12% for KRAB-B; and with ZNF85 it is 70% for KRAB-A and 26% for KRAB-B (Fig. 4A). The zinc finger sequence of SZF1 also shows a high degree of homology to these transcription factors, with an identity of about 50% (Fig. 4B). A much lower degree of homology was found in the remainder of the SZF1 coding sequence.
Tissue and cell line expression of SZFl .
To examine the expression of SZF1 in normal human tissues, hybridization of SZF1 probes with multi-tissue Northern blots was performed. Both 3' and 5' probes (Figs. 12A and 12B) were generated to omit the zinc finger region to decrease the possibility of cross hybridization with other zinc finger-containing genes. Both probes would hybridize with both SZF1-1 and SZF1-2 transcripts. An example of hybridization with a 3' probe is shown in Fig. 5. A 4.2 kb band was seen in most human tissues, including heart, brain, placenta, lung, liver, kidney, pancreas, spleen, thymus, prostate, testis, ovary, small intestine, colon, and blood. The lane containing liver RNA showed a 2 kb band instead. Some samples also showed a larger transcript of approximately 6 kb. The 5' probe also hybridized to the 4.2 kb band in all these tissues and in addition showed a faint band of 1.2 kb (data not shown).
To determine if SZF1 was expressed in hematopoietic cells, Northern blotting was performed with samples from a number of hematopoietic cell lines. A discrete 4.2 kb band was seen from several samples, including the RL, HEL and K562 cell lines (data not shown). The more sensitive RNAse protection assay was used to further screen for expression. Although the level of expression varies, protected species are seen from every hematopoietic cell line tested, including Jurkat, K422, Raji, Molt-16, RL, HEL, RPMI-8402, K562, ML-1, KG1a, Molt-3 and REH (Fig. 6). These represent myeloid, lymphoid, and erythroid lineages.
Several of the hematopoietic-derived cell lines expressing SZFl differentiate in response to various agents. SZFl expression was measured as a function of differentiation induced in HL60 and ML-1 cell lines by TPA. RNA, isolated from control cells and cells exposed to TPA for 24 hours, was used in an RNAse protection assay with actin as an internal control to allow for normalization (Fig. 7) . Quantitation by phosphorimager scanning of the gel showed that SZFl expression was decreased to 30% by 24 hours of induction of the HL60 cells. In ML-1 cells, expression of SZFl decreased to only 16% of control levels by 24 hours of differentiation. Thus, SZFl expression is greatly repressed upon differentiation, at least in these two cell lines.
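The normalization to the actin internal control described above can be sketched as follows. The band intensities are invented placeholder values chosen only to illustrate the arithmetic; they are not measurements from Fig. 7, and the function name is hypothetical.

```python
def percent_of_control(target_ctrl: float, actin_ctrl: float,
                       target_treated: float, actin_treated: float) -> float:
    """Express treated-sample expression as a percentage of control.

    Each target signal is first divided by the actin signal from the same
    lane, so that differences in RNA loading cancel out.
    """
    ctrl_ratio = target_ctrl / actin_ctrl
    treated_ratio = target_treated / actin_treated
    return 100.0 * treated_ratio / ctrl_ratio


# Placeholder phosphorimager counts (illustrative only, not source data).
result = percent_of_control(
    target_ctrl=1000.0, actin_ctrl=2000.0,      # untreated lane
    target_treated=270.0, actin_treated=1800.0,  # TPA-treated lane
)
print(f"{result:.0f}% of control")
```

Dividing by the actin signal before comparing lanes is what allows a lane with less total RNA loaded to be compared fairly with the control lane.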
The Northern blotting and RNAse protection experiments conducted above do not distinguish between the two SZFl transcripts. Therefore RT-PCR was used to test for the presence of the two SZFl transcripts in the polyA+RNA from hematopoietic cell lines, lung cancer cell lines, fetal lung tissue and a normal lung epithelial cell primary culture. The primers used in the assay cross the SZFl-2 exon that is spliced out of the SZFl-1 transcript and would generate a 133 bp RT-PCR fragment from the SZFl-1 transcript and a 1.4 kb fragment from the SZFl-2 transcript . None of the mRNA samples tested gave rise to the 133 bp product from SZFl-1. Instead, most of the mRNA preparations showed the 1.4 kb product from the SZFl-2 transcript (data not shown) . Controls in which reverse transcriptase was not added to the RT reactions confirmed that the signal was not amplified from contaminating DNA (data not shown) .
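Because the primers flank the alternatively spliced segment, the size of the RT-PCR product identifies the transcript. The sketch below uses the 133 bp and 1417 bp product sizes given for primer pair 7-8 in the Figure 8 legend; the tolerance value and function name are hypothetical choices for illustration.

```python
# Product sizes reported for primer pair 7-8 in the Figure 8 legend.
SPLICED_PRODUCT_BP = 133     # segment absent (SZF1-1)
RETAINED_PRODUCT_BP = 1417   # segment present (SZF1-2)

# Length of the segment found only in the SZF1-2 product.
SEGMENT_BP = RETAINED_PRODUCT_BP - SPLICED_PRODUCT_BP


def classify_band(size_bp: float, tolerance_bp: float = 30.0) -> str:
    """Assign an observed band size to a transcript, or mark it unassigned.

    tolerance_bp is a hypothetical allowance for gel sizing error.
    """
    if abs(size_bp - SPLICED_PRODUCT_BP) <= tolerance_bp:
        return "SZF1-1"
    if abs(size_bp - RETAINED_PRODUCT_BP) <= tolerance_bp:
        return "SZF1-2"
    return "unassigned"


print(f"segment unique to SZF1-2 product: {SEGMENT_BP} bp")
for band in (135, 1400, 600):
    print(band, classify_band(band))
```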
To determine which SZF1 transcripts were present in normal human bone marrow CD34+ cells, several pairs of primers were used for RT-PCR. Fig. 8A shows that the 133 bp product from the SZF1-1 transcript is detected only in normal human marrow CD34+ cells and not in CD34- cells. Controls in which reverse transcriptase was left out of the reaction demonstrated that the signals were not from contaminating DNA (data not shown). Additional experiments confirmed that the SZF1-1 transcript was present in mRNA samples from total bone marrow (which includes CD34+ cells), but was not present after depletion of CD34+ cells (data not shown). Southern analysis confirms that the 133 bp and 1.4 kb RT-PCR fragments hybridize to specific internal oligonucleotides (data not shown).
Genomic organization of SZFl .
Three overlapping P1 clones that each contain approximately 85 kb genomic DNA fragments were obtained by PCR screening of a P1 library. Southern analysis indicated that each of the three encompasses the entire SZF1 coding sequence (data not shown). Subclones were generated from the P1 plasmids to facilitate sequencing and mapping. Mapping of the intron/exon structure of SZF1 revealed that SZF1-2 contained an intron which is spliced out of SZF1-1 (Figs. 9A, 9B and 9C). This alternative splicing results in a different carboxy terminus for the two translated protein products. All the other exons are present in both cDNAs. Sequencing of the corresponding genomic DNA subclones confirmed the 5' upstream and 3' stop codons which are present in the cDNAs.
Chromosomal localization of SZFl .
FISH (Fluorescence In Situ Hybridization) was performed to determine the chromosomal localization of SZF1. P1 plasmids with inserts that encompass the entire coding sequence of the SZF1 ORFs were used as the probe. An example of the FISH results is shown in Figs. 10A and 10B. Clear paired signals were observed only on chromosome 3. Analysis of 34 metaphase cells showed 23 cells (67%) had at least one pair of signals. 41/44 signals were located on chromosome 3 as indicated by G-banding 24, with all of the signals on band p21. In addition, 9 out of 10 signals analyzed from the pre-G banded metaphases had the same 3p21 localization. The ideogram representing the results of chromosomes with paired signals is shown in Fig. 11. The results indicate that SZF1 maps to human chromosome 3p21, a region that has been implicated in a number of cancers.
Lung expression.
Due to localization of SZF1 to human chromosome 3 band p21, its expression in lung cancer cells was investigated. Preparations made from normal fetal lung tissue (HFL), a normal lung epithelial cell primary culture (HBE), small cell lung cancer (SCLC) lines, non-small cell lung cancer (NSCLC) cells, and two primary small cell lung cancer tumors were first examined with a 945 bp 5' probe corresponding to the first 945 nucleotides of SZF1-1 and SZF1-2 (Fig. 12A). Compared with normal tissues, the cell lines showed widely varying levels in the expression of the 4.2 kb transcript. In some NSCLC and SCLC cell lines, the transcript could not be detected. After stripping the blot and rehybridizing with the 5' probe, a 1.1 kb band was observed in addition to the 4.2 kb band. Higher stringency washing conditions did not remove this band. It is possible that the 1.1 kb band represents a species cross-hybridizing with the KRAB domain.
One means for isolating the nucleic acid molecules encoding the proteins of the present invention is to probe a CD34+ gene library with a natural or artificially designed probe using art recognized procedures (see, for example: Current Protocols in Molecular Biology, Ausubel F.M. et al. (Eds.), Green Publishing Company Assoc. and John Wiley Interscience, New York, 1989, 1992). It is appreciated by one skilled in the art that the polynucleotides of SEQ ID NOS: 1 and 3, or fragments thereof (comprising at least 12 contiguous nucleotides), are particularly useful probes. Other particularly useful probes for this purpose are hybridizable fragments of the sequences of SEQ ID NOS: 1 and 3 (i.e., comprising at least 12 contiguous nucleotides).
With respect to nucleic acid sequences which hybridize to specific nucleic acid sequences disclosed herein, hybridization may be carried out under conditions of reduced stringency, medium stringency or even stringent conditions. As an example of oligonucleotide hybridization, a polymer membrane containing immobilized denatured nucleic acids is first prehybridized for 30 minutes at 45°C in a solution consisting of 0.9 M NaCl, 50 mM NaH2PO4, pH 7.0, 5.0 mM Na2EDTA, 0.5% SDS, 10X Denhardt's, and 0.5 mg/mL polyriboadenylic acid. Approximately 2 × 10^7 cpm (specific activity 4-9 × 10^8 cpm/μg) of 32P end-labeled oligonucleotide probe are then added to the solution. After 12-16 hours of incubation, the membrane is washed for 30 minutes at room temperature in 1X SET (150 mM NaCl, 20 mM Tris hydrochloride, pH 7.8, 1 mM Na2EDTA) containing 0.5% SDS, followed by a 30 minute wash in fresh 1X SET at Tm less 10°C for the oligonucleotide probe. The membrane is then exposed to autoradiographic film for detection of hybridization signals.
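The final wash temperature above is defined relative to the Tm of the oligonucleotide probe, but no method for estimating Tm is stated. As one possible illustration, the sketch below assumes the Wallace rule (2°C per A or T, 4°C per G or C), a common approximation for short oligonucleotides; this choice, the function names, and the example oligonucleotide are assumptions and are not part of the disclosure.

```python
def wallace_tm(oligo: str) -> int:
    """Estimate Tm in degrees C with the Wallace rule: 2*(A+T) + 4*(G+C).

    Assumption: this rule is used only as an illustration; the text does
    not say how Tm is to be determined.
    """
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("oligo must contain only A, C, G and T")
    return 2 * at + 4 * gc


def stringent_wash_temp(oligo: str) -> int:
    """Return the wash temperature of Tm less 10 degrees C described in the text."""
    return wallace_tm(oligo) - 10


# Hypothetical 18-mer, used only to exercise the functions.
example_oligo = "ATGCATGCATGCATGCAT"
print(wallace_tm(example_oligo), stringent_wash_temp(example_oligo))
```

In practice, Tm also depends on salt concentration and probe length, so an empirically determined value would take precedence over this estimate.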
Stringent conditions means hybridization will occur only if there is at least 90% identity, preferably at least 95% identity and most preferably at least 97% identity between the sequences. See J. Sambrook et al., Molecular Cloning, A Laboratory Manual, 2d Ed., Cold Spring Harbor Laboratory (1989), which is hereby incorporated by reference in its entirety. Also, it is understood that a fragment of a 100 bps sequence that is 95 bps in length has 95% identity with the 100 bps sequence from which it is obtained.
As used herein, a first DNA (RNA) sequence is at least 70%, preferably at least 80%, more preferably at least 90%, and even more preferably at least 95% identical to another DNA (RNA) sequence if there is at least 70%, 80%, 90% or 95% identity, respectively, between the bases of the first sequence and the bases of the other sequence, when properly aligned with each other, for example when aligned by BLASTN.
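The identity calculation described above can be illustrated with a small sketch that operates on two sequences that have already been aligned, with "-" marking gaps. The convention used here (identical bases divided by the number of aligned columns, with gaps counted as non-identical) is an assumption chosen so that a 95 base fragment scores 95% against its 100 base parent; it is not necessarily how BLASTN reports identity, and the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity between two pre-aligned sequences.

    Gaps are written as '-'. Columns where both sequences have a gap are
    ignored; a gap opposite a base counts as a non-identical column.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / compared


# A 100-base reference and a 95-base fragment of it, padded with gaps.
reference = "ACGT" * 25
fragment = reference[:95] + "-" * 5
print(percent_identity(reference, fragment))
```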
The present invention relates to polynucleotides which differ from the reference polynucleotide in a manner such that the change or changes are silent, in that the amino acid sequence encoded by the polynucleotide remains the same. The present invention also relates to nucleotide changes which result in amino acid substitutions, additions, deletions, fusions and truncations in the polypeptide encoded by the reference polynucleotide. In a preferred aspect of the invention these polypeptides retain the same biological action as the polypeptide encoded by the reference polynucleotide.
The polynucleotides of the present invention may be in the form of RNA or DNA, which DNA includes cDNA, genomic DNA, and synthetic DNA. The DNA may be double-stranded or single-stranded, and if single stranded may be the coding strand or non-coding (anti-sense) strand. The coding sequences which encode the mature proteins may be identical to the coding sequences shown in Figures 1A-D and 2A-E (SEQ ID NOS: 1 and 3, respectively) or may be a different coding sequence which, as a result of the redundancy or degeneracy of the genetic code, encodes the same mature proteins as does the DNA of Figures 1A-D and 2A-E (SEQ ID NOS: 1 and 3, respectively).
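The degeneracy referred to above can be demonstrated by translating two different coding sequences with the standard genetic code and confirming that they yield the same peptide. The two short sequences below are hypothetical examples and are not taken from SEQ ID NOS: 1 or 3; the function name is likewise illustrative.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence codon by codon ('*' marks a stop).

    Trailing bases that do not complete a codon are ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Two hypothetical sequences that differ at three third-codon positions.
seq_a = "ATGTGCCATAAA"
seq_b = "ATGTGTCACAAG"
print(translate(seq_a), translate(seq_b))
```

Both sequences encode the same four residues even though they differ at the nucleotide level, which is the sense in which a different coding sequence may encode the same mature protein.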
The polynucleotides which encode each of the mature proteins (SEQ ID NOS: 2 and 4, respectively) may include, but each is not limited to: only the coding sequence for the mature protein; the coding sequence for the mature protein and additional coding sequence such as a leader sequence or a proprotein sequence; the coding sequence for the mature protein (and optionally additional coding sequence) and non-coding sequence, such as introns or non-coding sequence 5' and/or 3' of the coding sequence for the mature protein.
Thus, the term "polynucleotide encoding a protein" encompasses a polynucleotide which includes only coding sequence for the polypeptide as well as a polynucleotide which includes additional coding and/or non-coding sequences .
The present invention further relates to variants of the hereinabove described polynucleotides which encode for fragments, analogs and derivatives of the proteins having the deduced amino acid sequences of Figures 1A-D and 2A-E (SEQ ID NOS: 2 and 4, respectively). The variant of the polynucleotide may be a naturally occurring allelic variant of the polynucleotide or a non-naturally occurring variant of the polynucleotide.
Thus, the present invention includes polynucleotides encoding the same mature proteins as shown in Figures 1A-D and 2A-E, as well as variants of such polynucleotides which variants encode for a fragment, derivative or analog of the proteins of Figures 1A-D and 2A-E, respectively. Such nucleotide variants include deletion variants, substitution variants and addition or insertion variants.
As hereinabove indicated, the polynucleotides may have a coding sequence which is a naturally occurring allelic variant of the coding sequence shown in Figures 1A-D and 2A-E. As known in the art, an allelic variant is an alternate form of a polynucleotide sequence which may have a substitution, deletion or addition of one or more nucleotides, which does not substantially alter the function of the encoded protein. Also, using directed and other evolution strategies, one may make very minor changes in DNA sequence which can result in major changes in function.
Fragments of the full length gene of the present invention may be used as hybridization probes for a cDNA or a genomic library to isolate the full length DNA and to isolate other DNAs which have a high sequence identity to the gene. Probes of this type have at least 10, preferably at least 15, and even more preferably at least 30 bases and may contain, for example, at least 50 or more bases. In fact, probes of this type having up to 150 bases or more may be utilized. The probe may also be used to identify a DNA clone corresponding to a full length transcript and a genomic clone or clones that contain the complete gene including regulatory and promoter regions, exons and introns. An example of a screen comprises isolating the coding region of the gene by using the known DNA sequence to synthesize an oligonucleotide probe. Labeled oligonucleotides having a sequence complementary to that of the gene, or of a portion of the gene sequences, of the present invention are used to screen a library of genomic DNA to determine which members of the library the probe hybridizes to, that is, which members are complementary to the probe and have an identity as described above.
It is also appreciated that such probes can be and are preferably labeled with an analytically detectable reagent to facilitate identification of the probe. Useful reagents include but are not limited to radioactivity, fluorescent dyes or proteins capable of catalyzing the formation of a detectable product. The probes are thus useful to isolate complementary copies of DNA from other sources or to screen such sources for related sequences.
The present invention further relates to polynucleotides which hybridize to the hereinabove-described sequences if there is at least 70%, preferably at least 90%, and more preferably at least 95% identity between the sequences. (As indicated above, 70% identity would include within such definition a 70 bps fragment taken from a 100 bp polynucleotide, for example.) The present invention particularly relates to polynucleotides which hybridize under stringent conditions to the hereinabove-described polynucleotides. As herein used, the term "stringent conditions" means hybridization will occur only if there is at least 95% and preferably at least 97% identity between the sequences. The polynucleotides which hybridize to the hereinabove described polynucleotides in a preferred embodiment encode proteins which retain substantially the same biological function or activity as the mature proteins encoded by the DNA of Figures 1A-D and 2A-E, respectively. In referring to identity in the case of hybridization, as known in the art, such identity refers to complementarity of polynucleotide segments.
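Because identity in the hybridization context refers to complementarity, the comparison can be sketched as scoring a probe against the reverse complement of its target site. The helper names, the 20-base target taken from the sequence listing of SEQ ID NO: 1, and the single-mismatch probe below are illustrative assumptions; real hybridization also depends on salt, temperature and probe length, which this sketch ignores.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def percent_complementarity(probe: str, target: str) -> float:
    """Percent of probe bases that pair with an equal-length target site.

    Both sequences are written 5' to 3'; the probe is compared with the
    reverse complement of the target, i.e. the strand it would anneal to.
    """
    if len(probe) != len(target):
        raise ValueError("probe and target site must have equal length")
    expected = reverse_complement(target)
    paired = sum(1 for p, e in zip(probe.upper(), expected) if p == e)
    return 100.0 * paired / len(probe)


def hybridizes_under_stringent_conditions(probe, target, threshold=95.0):
    """Apply the 'at least 95%' reading of stringent conditions."""
    return percent_complementarity(probe, target) >= threshold


target_site = "AAGAGACTGAGCCTTGAGCA"        # 20 bases from SEQ ID NO: 1
perfect_probe = reverse_complement(target_site)
one_mismatch = "A" + perfect_probe[1:]      # hypothetical probe, first base altered
```

With these inputs the perfect probe scores 100% and the single-mismatch probe scores 95%, so both pass the 95% threshold, while only the perfect probe would pass a 97% threshold.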
Alternatively, the polynucleotide may have at least 15 bases, preferably at least 30 bases, and more preferably at least 50 bases which hybridize to any part of a polynucleotide of the present invention and which has an identity thereto, as hereinabove described, and which may or may not retain activity. For example, such polynucleotides may be employed as probes for the polynucleotides of SEQ ID NOS: 1 and 3, for example, for recovery of the polynucleotide or as a diagnostic probe or as a PCR primer.
Thus, the present invention is directed to polynucleotides having at least 70% identity, preferably at least 90% identity and more preferably at least 95% identity to a polynucleotide which encodes the polypeptides of SEQ ID NOS: 2 and 4, respectively, as well as fragments thereof, which fragments have at least 15 bases, preferably at least 30 bases, more preferably at least 50 bases and most preferably up to 150 bases or more, which fragments are at least 90% identical, preferably at least 95% identical and most preferably at least 97% identical to any portion of a polynucleotide of the present invention.
The present invention further relates to proteins which have the deduced amino acid sequence of Figures 1A-D and 2A-E, respectively, (SEQ ID NOS: 2 and 4, respectively) as well as fragments, analogs and derivatives of such proteins .
The terms "fragment," "derivative" and "analog" when referring to each of the polypeptides of Figures 1A-D and 2A-E, respectively, (SEQ ID NOS: 2 and 4, respectively) mean a protein which retains essentially the same biological function or activity as such protein. Thus, an analog includes a proprotein which can be activated by cleavage of the proprotein portion to produce an active mature protein.
The proteins of the present invention may be a recombinant protein, a natural protein or a synthetic protein, preferably a recombinant protein.
The fragment, derivative or analog of each of the polypeptides of Figures 1A-D and 2A-E (SEQ ID NOS: 2 and 4, respectively) may be (i) one in which one or more of the amino acid residues are substituted with a conserved or non-conserved amino acid residue (preferably a conserved amino acid residue) and such substituted amino acid residue may or may not be one encoded by the genetic code, or (ii) one in which one or more of the amino acid residues includes a substituent group, or (iii) one in which the mature protein is fused with another compound, such as a compound to increase the half-life of the protein (for example, polyethylene glycol), or (iv) one in which additional amino acids are fused to the mature protein, such as a leader or secretory sequence or a sequence which is employed for purification of the mature protein or a proprotein sequence. Such fragments, derivatives and analogs are deemed to be within the scope of those skilled in the art from the teachings herein.
The polypeptides and polynucleotides of the present invention are preferably provided in an isolated form, and preferably are purified to homogeneity.
The term "isolated" means that the material is removed from its original environment (e.g., the natural environment if it is naturally occurring) . For example, a naturally-occurring polynucleotide or polypeptide present in a living animal is not isolated, but the same polynucleotide or protein, separated from some or all of the coexisting materials in the natural system, is isolated. Such polynucleotides could be part of a vector and/or such polynucleotides or proteins could be part of a composition, and still be isolated in that such vector or composition is not part of its natural environment.
The polypeptides of the present invention include the respective proteins of SEQ ID NOS: 2 and 4 (in particular the mature proteins) as well as proteins which have at least 70% similarity (preferably at least 70% identity) to the respective proteins of SEQ ID NOS: 2 and 4 and more preferably at least 90% similarity (more preferably at least 90% identity) to the respective proteins of SEQ ID NOS: 2 and 4, and still more preferably at least 95% similarity (still more preferably at least 95% identity) to the respective proteins of SEQ ID NOS: 2 and 4, and also include portions of such proteins with such portion of the protein generally containing at least 30 amino acids and more preferably at least 50 amino acids and most preferably up to 150 amino acids or more. The mature polypeptides according to the invention may comprise or omit an N-terminal methionine amino acid residue.
As known in the art, "similarity" between two proteins is determined by comparing the amino acid sequence and its conserved amino acid substitutes of one protein to the sequence of a second protein. The definition of 70% similarity would include a 70 amino acid sequence fragment of a 100 amino acid sequence, for example, or a 70 amino acid sequence obtained by sequentially or randomly deleting 30 amino acids from the 100 amino acid sequence.
A variant, i.e. a "fragment", "analog" or "derivative" polypeptide, and reference polypeptide may differ in amino acid sequence by one or more substitutions, additions, deletions, fusions and truncations, which may be present in any combination.
Among preferred variants are those that vary from a reference by conservative amino acid substitutions. Such substitutions are those that substitute a given amino acid in a polypeptide by another amino acid of like characteristics. Typically seen as conservative substitutions are the replacements, one for another, among the aliphatic amino acids Ala, Val, Leu and Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the acidic residues Asp and Glu; substitution between the amide residues Asn and Gln; exchange of the basic residues Lys and Arg; and replacements among the aromatic residues Phe and Tyr.
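The distinction between identity and similarity can be sketched using only the conservative groups named above. The function name, the restriction to ungapped sequences of equal length, and the variant peptide are assumptions for illustration; the reference peptide is the first ten residues of the peptide used in Example 8. Scoring matrices used by alignment programs are more elaborate than this grouping.

```python
# Conservative substitution groups as listed in the text (one-letter codes):
# aliphatic A/V/L/I, hydroxyl S/T, acidic D/E, amide N/Q, basic K/R, aromatic F/Y.
CONSERVATIVE_GROUPS = ["AVLI", "ST", "DE", "NQ", "KR", "FY"]
GROUP_OF = {aa: i for i, group in enumerate(CONSERVATIVE_GROUPS) for aa in group}


def identity_and_similarity(reference: str, variant: str) -> tuple[float, float]:
    """Return (percent identity, percent similarity) for two ungapped peptides.

    A position counts toward similarity if the residues are identical or
    belong to the same conservative group.
    """
    if len(reference) != len(variant):
        raise ValueError("peptides must have equal length")
    identical = similar = 0
    for ref, var in zip(reference.upper(), variant.upper()):
        if ref == var:
            identical += 1
            similar += 1
        elif ref in GROUP_OF and GROUP_OF[ref] == GROUP_OF.get(var):
            similar += 1
    n = len(reference)
    return 100.0 * identical / n, 100.0 * similar / n


reference_peptide = "RQKAVTAEKS"  # first ten residues of the Example 8 peptide
variant_peptide = "KQKAITADKW"    # hypothetical variant: R>K, V>I, E>D, S>W
identity, similarity = identity_and_similarity(reference_peptide, variant_peptide)
```

For this pair the three conservative changes lower identity to 60% but leave similarity at 90%; only the Ser to Trp change counts against both measures.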
Most highly preferred are variants which retain the same biological function and activity as the reference polypeptide from which they vary.
Fragments or portions of the proteins of the present invention may be employed for producing the corresponding full-length protein by peptide synthesis; therefore, the fragments may be employed as intermediates for producing the full-length proteins. Fragments or portions of the polynucleotides of the present invention may be used to synthesize full-length polynucleotides of the present invention.
The present invention also relates to vectors which include polynucleotides of the present invention, host cells which are genetically engineered with vectors of the invention and the production of proteins of the invention by recombinant techniques.
Host cells are genetically engineered (transduced or transformed or transfected) with the vectors of this invention which may be, for example, a cloning vector such as an expression vector. The vector may be, for example, in the form of a plasmid, a phage, etc. The engineered host cells can be cultured in conventional nutrient media modified as appropriate for activating promoters, selecting transformants or amplifying the genes of the present invention. The culture conditions, such as temperature, pH and the like, are those previously used with the host cell selected for expression, and will be apparent to the ordinarily skilled artisan.
The polynucleotides of the present invention may be employed for producing proteins by recombinant techniques. Thus, for example, the polynucleotide may be included in any one of a variety of expression vectors for expressing a protein. Such vectors include chromosomal, nonchromosomal and synthetic DNA sequences, e.g., derivatives of SV40; bacterial plasmids; phage DNA; baculovirus; yeast plasmids; vectors derived from combinations of plasmids and phage DNA, viral DNA such as vaccinia, adenovirus, fowl pox virus, and pseudorabies. However, any other vector may be used as long as it is replicable and viable in the host.
The appropriate DNA sequence may be inserted into the vector by a variety of procedures . In general , the DNA sequence is inserted into an appropriate restriction endonuclease site(s) by procedures known in the art. Such procedures and others are deemed to be within the scope of those skilled in the art.
The DNA sequence in the expression vector is operatively linked to an appropriate expression control sequence(s) (promoter) to direct mRNA synthesis. As representative examples of such promoters, there may be mentioned: LTR or SV40 promoter, the E. coli lac or trp, the phage lambda PL promoter and other promoters known to control expression of genes in prokaryotic or eukaryotic cells or their viruses. The expression vector also contains a ribosome binding site for translation initiation and a transcription terminator. The vector may also include appropriate sequences for amplifying expression.
In addition, the expression vectors preferably contain one or more selectable marker genes to provide a phenotypic trait for selection of transformed host cells such as dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, or such as tetracycline or ampicillin resistance in E. coli.
The vector containing the appropriate DNA sequence as hereinabove described, as well as an appropriate promoter or control sequence, may be employed to transform an appropriate host to permit the host to express the protein.
As representative examples of appropriate hosts, there may be mentioned: bacterial cells, such as E. coli , Streptomyces , Bacillus subtilis; fungal cells, such as yeast; insect cells such as Drosophila S2 and Spodoptera Sf9; animal cells such as CHO, COS or Bowes melanoma; adenoviruses; plant cells, etc. The selection of an appropriate host is deemed to be within the scope of those skilled in the art from the teachings herein.
More particularly, the present invention also includes recombinant constructs comprising one or more of the sequences as broadly described above. The constructs comprise a vector, such as a plasmid or viral vector, into which a sequence of the invention has been inserted, in a forward or reverse orientation. In a preferred aspect of this embodiment, the construct further comprises regulatory sequences, including, for example, a promoter, operably linked to the sequence. Large numbers of suitable vectors and promoters are known to those of skill in the art, and are commercially available. The following vectors are provided by way of example: Bacterial: pQE70, pQE60, pQE-9 (Qiagen), pBluescript II KS, ptrc99a, pKK223-3, pDR540, pRIT2T (Pharmacia); Eukaryotic: pXT1, pSG5 (Stratagene), pSVK3, pBPV, pMSG, pSVL SV40 (Pharmacia). However, any other plasmid or vector may be used as long as it is replicable and viable in the host.
Promoter regions can be selected from any desired gene using CAT (chloramphenicol acetyltransferase) vectors or other vectors with selectable markers. Two appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt, lambda PR, PL and trp. Eukaryotic promoters include CMV immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the appropriate vector and promoter is well within the level of ordinary skill in the art.
In a further embodiment, the present invention relates to host cells containing the above-described constructs. The host cell can be a higher eukaryotic cell, such as a mammalian cell, or a lower eukaryotic cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a bacterial cell. Introduction of the construct into the host cell can be effected by calcium phosphate transfection, DEAE-Dextran mediated transfection, or electroporation (Davis, L., Dibner, M., Battey, I., Basic Methods in Molecular Biology, (1986)).
The constructs in host cells can be used in a conventional manner to produce the gene product encoded by the recombinant sequence. Alternatively, the proteins of the invention can be synthetically produced by conventional peptide synthesizers.
Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate promoters. Depending upon the expression host, a mature protein may or may not contain an N-terminal methionine. Cell-free translation systems can also be employed to produce such proteins using RNAs derived from the DNA constructs of the present invention. Appropriate cloning and expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, N.Y., (1989), the disclosure of which is hereby incorporated by reference.
Transcription of the DNA encoding the proteins of the present invention by higher eukaryotes is increased by inserting an enhancer sequence into the vector. Enhancers are cis-acting elements of DNA, usually from about 10 to 300 bp, that act on a promoter to increase its transcription. Examples include the SV40 enhancer on the late side of the replication origin bp 100 to 270, a cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers.
Generally, recombinant expression vectors will include origins of replication and selectable markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct transcription of a downstream structural sequence. Such promoters can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or heat shock proteins, among others. The heterologous structural sequence is assembled in appropriate phase with translation initiation and termination sequences, and preferably, a leader sequence capable of directing secretion of translated protein. Optionally, the heterologous sequence can encode a fusion protein including an N-terminal identification peptide imparting desired characteristics, e.g., stabilization or simplified purification of expressed recombinant product.
Useful expression vectors for bacterial use are constructed by inserting a structural DNA sequence encoding a desired protein together with suitable translation initiation and termination signals in operable reading phase with a functional promoter. The vector will comprise one or more phenotypic selectable markers and an origin of replication to ensure maintenance of the vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for transformation include E. coli , Bacillus subtilis, Salmonella typhimurium and various species within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be employed as a matter of choice .
As a representative but nonlimiting example, useful expression vectors for bacterial use can comprise a selectable marker and bacterial origin of replication derived from commercially available plasmids comprising genetic elements of the well known cloning vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden) and pGEM1 (Promega Biotec, Madison, WI, USA). These pBR322 "backbone" sections are combined with an appropriate promoter and the structural sequence to be expressed.
Following transformation of a suitable host strain and growth of the host strain to an appropriate cell density, the selected promoter is induced by appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an additional period.
Cells are typically harvested by centrifugation, disrupted by physical or chemical means, and the resulting crude extract retained for further purification.
Microbial cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents; such methods are well known to those skilled in the art.
Various mammalian cell culture systems can also be employed to express recombinant protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney fibroblasts, described by Gluzman, Cell, 23:175 (1981), and other cell lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa and BHK cell lines. Mammalian expression vectors will comprise an origin of replication, a suitable promoter and enhancer, and also any necessary ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking nontranscribed sequences. DNA sequences derived from the SV40 splice and polyadenylation sites may be used to provide the required nontranscribed genetic elements.
The polypeptides according to the invention can be recovered and purified from recombinant cell cultures by methods including ammonium sulfate or ethanol precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic interaction chromatography, affinity chromatography, hydroxylapatite chromatography and lectin chromatography. Protein refolding steps can be used, as necessary, in completing configuration of the mature protein. Finally, high performance liquid chromatography (HPLC) can be employed for final purification steps.
The polypeptides of the present invention may be a naturally purified product, or a product of chemical synthetic procedures, or produced by recombinant techniques from a prokaryotic or eukaryotic host (for example, by bacterial, yeast, higher plant, insect and mammalian cells in culture). Depending upon the host employed in a recombinant production procedure, the proteins of the present invention may be glycosylated or may be non-glycosylated. Proteins of the invention may or may not also include an initial methionine amino acid residue.
Antibodies generated against a protein corresponding to a sequence of the present invention can be obtained by direct injection of the respective protein (or a portion of the protein) into an animal or by administering the proteins to an animal, preferably a nonhuman. The antibody so obtained will then bind the respective protein itself. In this manner, even a sequence encoding only a fragment of the proteins can be used to generate antibodies binding the whole native proteins. Such antibodies can then be used to isolate the protein from cells expressing that protein and may also be useful as antimicrobials, or controls in assays to determine the efficacy of potential antimicrobials.
For preparation of monoclonal antibodies, any technique which provides antibodies produced by continuous cell line cultures can be used. Examples include the hybridoma technique (Kohler and Milstein, Nature, 256:495-497, 1975), the trioma technique, the human B-cell hybridoma technique (Kozbor et al., Immunology Today 4:72, 1983), and the EBV-hybridoma technique to produce human monoclonal antibodies (Cole et al., in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96, 1985).
Techniques described for the production of single chain antibodies (U.S. Patent 4,946,778) can be adapted to produce single chain antibodies to immunogenic protein products of this invention. Also, transgenic mice may be used to express humanized antibodies to immunogenic protein products of this invention.
Antibodies generated against a protein of the present invention may be used in screening for similar proteins from other organisms and samples. Such screening techniques are known in the art; for example, one such screening assay is described in Sambrook and Maniatis, Molecular Cloning: A Laboratory Manual (2d Ed.), vol. 2: Section 8.49, Cold Spring Harbor Laboratory, 1989, which is hereby incorporated by reference in its entirety.
The following non-limiting examples are provided merely to illustrate a preferred embodiment of the invention.
Example 1 cDNA Library Screening and DNA Sequencing.
Several hundred clones from a CD34+ cDNA library were randomly sequenced and searched against available databases. One of the unique fragments obtained contained zinc fingers of the C2H2 type. This 1.2 kb fragment was used to screen cDNA libraries from K562 cells (CLONTECH, Palo Alto, CA), human lung (CLONTECH, Palo Alto, CA), and CD34+ cells. At least 10^6 recombinants from each library were screened by standard procedures. Positive clones were subcloned into pBluescript II KS- (Stratagene, La Jolla, CA) and sequenced using the Sequenase Version 2.0 DNA Sequencing Kit (USB, Cleveland, Ohio).
Example 2 Sequence Analysis.
Computer searches of available databases were performed using the BLAST program. Protein alignments were generated with the Genetics Computer Group Pileup program (Genetics Computer Group, Madison, WI).
Example 3 Northern Analysis.
Northern hybridization was performed according to standard protocols. In brief, 5 μg of each mRNA sample was incubated with 50% formamide, 6.5% formaldehyde, and 1x MOPS (pH 7.0) at 55°C for 15 minutes. The samples were electrophoresed in 1.2% agarose/formaldehyde gels and transferred to nylon membrane (HYBOND, Amersham, UK) by the capillary method. Blots containing RNA samples from cell lines as well as multitissue Northern blots obtained commercially (CLONTECH, Palo Alto, CA) were hybridized for 2 hrs in 50% formamide, 5X SSPE, 10X Denhardt's, 2% SDS, and 100 μg/ml denatured salmon sperm DNA (CLONTECH) with randomly primed 32P-dCTP labelled probe and exposed to film after washing. The blots were then stripped and rehybridized with other probes.
Example 4 RNase Protection Assays.
The RNase protection assay was performed using the MAXIscript T3 in vitro transcription kit (Ambion, Austin, Texas). The anti-sense RNA probe of SZF1 (bp 1256-1421) was synthesized by runoff transcription using bacteriophage T3 RNA polymerase on a pBluescript II KS- plasmid containing a fragment of SZF1 linearized at the SacI site. This yields a probe of 230 bases containing 165 bases of SZF1 that would be protected by SZF1 message. 5 μg of total RNA from each sample was hybridized with the 32P-UTP labelled antisense RNA probe, and then treated with RNase A and T1. A β-actin probe was included in the hybridization reactions as an internal control for RNA loading. Protected fragments were resolved on an 8M-urea, 6% acrylamide gel and exposed to film (Kodak X-OMAT).
Example 5 RT-PCR.
1 μg of mRNA from each sample was reverse transcribed with Moloney murine leukemia virus reverse transcriptase (M-MLV-RT) (GIBCO/BRL, Gaithersburg, MD) using random hexamers or oligo(dT)15 (Boehringer Mannheim, city, state) as primers. PCR was performed for 35 cycles of 95°C for 1 min, 45°C for 1 min, and 72°C for 2 min. Primer pairs used for PCR were:
1) 5'-CGGGATCCTAATACGACTCACTATAGGGAGACCACCATGATTGATTTTCAAATG-3' (SEQ ID NO: 5);
2) 5'-GGAATTCCTCAAGGCTCAGTC-3' (SEQ ID NO: 6);
3) 5'-AAGAGACTGAGCCTTGAG-3' (SEQ ID NO: 7);
4) 5'-ATTTCCTGCATGGAAACC-3' (SEQ ID NO: 8);
5) 5'-GGGCAACAGAATAT-3' (SEQ ID NO: 9);
6) 5'-ATAAGGTTTCTCCCCGGA-3' (SEQ ID NO: 10);
7) 5'-GAGGCGTGAAAAGT-3' (SEQ ID NO: 11);
8) 5'-CAGGAGAGTGTCATGGAA-3' (SEQ ID NO: 12).
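How primers such as these relate to the template can be sketched by locating the longest 3' portion of each primer that matches either strand of the sequence. The template below is bases 361-540 of SEQ ID NO: 1 as printed in the sequence listing; the function names, the `min_len` cutoff and the exact-match criterion are assumptions for illustration, not the procedure used in this Example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Bases 361-540 of SEQ ID NO: 1, copied from the sequence listing.
TEMPLATE_START = 361
TEMPLATE = (
    "CCAACCCTGA CCTCCACAAT GATTGATTTT CAAATGTTGA ACCAGCTTTG CAGAACTATA"
    "ATAAACCCAA GTGTGATACC CTGTCTCAAG TATTGCGGTG ATCAAATAGG ACCAGTGACT"
    "TTCGAGGATG TGGCTGTGCT TTTCACTGAG GCAGAGTGGA AGAGACTGAG CCTTGAGCAG"
).replace(" ", "")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def annealing_site(primer: str, template: str, offset: int = 1, min_len: int = 10):
    """Find the longest 3' tail of the primer that exactly matches the template.

    Returns (strand, start, length): '+' if the tail matches the listed
    strand, '-' if it is complementary to it; start is the position of the
    site on the listed strand. Returns None if no tail of min_len matches.
    """
    primer = primer.upper()
    for k in range(len(primer), min_len - 1, -1):
        tail = primer[-k:]
        for strand, site in (("+", tail), ("-", reverse_complement(tail))):
            idx = template.find(site)
            if idx != -1:
                return strand, idx + offset, k
    return None


primer_1 = "CGGGATCCTAATACGACTCACTATAGGGAGACCACCATGATTGATTTTCAAATG"  # SEQ ID NO: 5
primer_2 = "GGAATTCCTCAAGGCTCAGTC"                                   # SEQ ID NO: 6
primer_3 = "AAGAGACTGAGCCTTGAG"                                      # SEQ ID NO: 7

sites = {
    name: annealing_site(p, TEMPLATE, offset=TEMPLATE_START)
    for name, p in (("1", primer_1), ("2", primer_2), ("3", primer_3))
}
```

On this fragment, the last 18 bases of primer 1 match the listed strand beginning at the ATG at position 379, the last 14 bases of primer 2 are complementary to positions 524-537, and primer 3 matches the listed strand in full from position 520. The unmatched 5' portions of primers 1 and 2 are non-template additions.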
Example 6 FISH.
Three independent P1 plasmids (1629, 1630 and 1631) with approximately 85 kb SZF1 genomic fragments were obtained from a human genomic DNA P1 library (DMPC-HFF#1; Genome Systems, St. Louis, MO) by PCR screening using SZF1 oligonucleotides (5' primer CTGTGTTCTTCCATTAGC (SEQ ID NO: 13); 3' primer GGCCTTAGCCATTTGTCT (SEQ ID NO: 14)). P1 plasmids 1629, 1630, and 1631, containing the entire SZF1 gene, were nick-translated with biotin-14-dATP (BRL, Gaithersburg, MD). Slides with chromosome spreads were made from normal male lymphocytes cultured with BrdU. Fluorescence in situ hybridization was performed as described with modifications. 20 μl of hybridization mix (2xSSCP, 60% formamide, 10% dextran sulfate, 4 ng/μl biotinylated probe that had been co-precipitated with Cot-1 DNA [BRL, Gaithersburg, MD] and herring sperm DNA) was denatured at 70°C for 5 min, preannealed at 37°C for 60 min, placed on slides and hybridized at 37°C overnight. Slides were washed in 70% formamide/2xSSC at 43°C for 20 min, and 2 changes of 2xSSC at 37°C for 5 min each. Biotinylated probe was detected with an in situ hybridization kit (Oncor Inc., Gaithersburg, MD). In situ hybridization was also performed using metaphases which had been Giemsa banded and photographed prior to hybridization.
Example 7 Cell Culture, Isolation and RNA Extraction.
All cell lines except DMS53 were grown in RPMI 1640 supplemented with 10% fetal calf serum, 2 mM glutamine, and 100 U/ml penicillin and streptomycin (GIBCO/BRL, Grand Island, NY). DMS53 was grown in Waymouth's MB 752/1 media with the same supplements as above. The normal lung epithelial primary cell line (HBE) came from a normal lung tissue sample and was cultured in keratinocyte growth medium with bovine pituitary extract (BPE) at 30 ng/ml (Clonetics, Walkersville, MD). CD34+ and CD34- cells were isolated from normal bone marrow cells by immunomagnetic separation. To induce differentiation, ML-1 cells and HL60 cells were incubated with TPA (12-O-tetradecanoylphorbol-13-acetate) at a concentration of 33 nM for 24 hours. Total RNA was isolated from cell lines and the primary tumor samples by the guanidinium thiocyanate method and polyadenylated RNA was prepared using an mRNA isolation kit (Becton Dickinson Labware, Bedford, MA).
Example 8 In vitro transcription and translation and immunoprecipitation.
In vitro transcription and translation were performed according to the manufacturer's instructions (TNT T7/T3 coupled Reticulocyte Lysate System, Promega, Madison, WI). [35S] methionine labeled SZF1-1 and SZF1-2 proteins were produced from constructs which were generated by cloning the entire coding sequences of SZF1-1 and SZF1-2 into pCI-neo (Promega). Transcripts were generated with T7 RNA polymerase. In vitro transcripts from pneoSZF1-1 and pneoSZF1-2 were added to the reticulocyte lysate translation mixture containing 60 μCi of [35S] methionine and incubated for 1 hour at 30°C. The lysates were diluted into PBS with 1% NP-40. Proteins were immunoprecipitated for 1 hour under these conditions with either the preimmune sera (10 μl), polyclonal antisera (10 μl) raised against a synthesized peptide (RQKAVTAEKSSDKRQ) located upstream of the zinc fingers (shown in Fig. 1A-D and 2A-E in italics), or the same antisera (10 μl) preincubated with the peptide (25 mg) used to generate it. The polyclonal antisera were prepared by HRP (HRP Inc. Denver, PA) from a NZW female rabbit with three boosts of the KLH conjugated peptide. Antigen-antibody complexes were isolated on protein A-sepharose beads (Sigma, St. Louis, MO), and the pellets were washed five times with PBS containing 0.5% Tween-20. The [35S] methionine labeled samples were analyzed by SDS-PAGE and autoradiography.
Numerous modifications and variations of the present invention are possible in light of the above teachings and, therefore, within the scope of the appended claims, the invention may be practiced otherwise than as particularly described.
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT: INVENTOR'S NAME
(ii) TITLE OF INVENTION: Cell Zinc Finger Polynucleotides and
Splice Variant Polypeptides Encoded Thereby
(iii) NUMBER OF SEQUENCES: ##
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: CARELLA, BYRNE, BAIN, GILFILLAN,
CECCHI, STEWART & OLSTEIN
(B) STREET: 6 BECKER FARM ROAD
(C) CITY: ROSELAND
(D) STATE: NEW JERSEY
(E) COUNTRY: USA
(F) ZIP: 07068
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: 3.5 INCH DISKETTE
(B) COMPUTER: IBM PS/2
(C) OPERATING SYSTEM: MS-DOS
(D) SOFTWARE: ASCII
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: Unassigned
(B) FILING DATE: Concurrently
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: MULLINS, J.G.
(B) REGISTRATION NUMBER: 33,073
(C) REFERENCE/DOCKET NUMBER: 290770-10
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 201-894-1700
(B) TELEFAX: 201-994-1744
(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2382 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
GTAGCAGCAC CGTGCGTGCG TGCGCAGATG TGGGCCCCGC GGGAGCAGCT ACTGGGCTGG
60
GCTGCGGAAG CTCTGCCTGC CAAGGATTCT GCCTGGCCCT GGGAAGAGAA GCCTAGATAT
120
CTGGTTCCAG TGGCAGATGG GGCACTGGCC AGAGGGGTTA GGAGGATTTG GACTCTCCTT
180
GGGAGTCGCT GTGATACGGC CTTCAATTAT AGTTGTCTTT CCTTCTTGGA TTTCCACCAC
240
TTCCCTGAAA CCACGAGCTG GAAGCTGAGA CCAGCTGGTC ACTGCTGGGC CTGCTAGGAG
300
CCCACACGGA AGCTGCCCAC AGCCAGACAC CAGAGCCTCA AAGGCCGCCA TCTTGAAAAA
360
CCAACCCTGA CCTCCACAAT GATTGATTTT CAAATGTTGA ACCAGCTTTG CAGAACTATA
420
ATAAACCCAA GTGTGATACC CTGTCTCAAG TATTGCGGTG ATCAAATAGG ACCAGTGACT
480
TTCGAGGATG TGGCTGTGCT TTTCACTGAG GCAGAGTGGA AGAGACTGAG CCTTGAGCAG
540
AGGAACCTAT ACAAAGAAGT GATGCTGGAA AATCTCAGGA ATCTGGTCTC ATTGGAATCA
600
AAGCCAGAAG TCCATACCTG CCCTTCTTGC CCTCTGGCCT TTGGCAGTCA GCAGTTCCTC
660
AGCCAAGATG AGCTACACAA TCATCCTATT CCAGGTTTCC ATGCAGGAAA TCAACTCCAC
720
CCAGGAAATC CCTGCCCAGA GGATCAGCCA CAGTCACAAC ATCCTTCTGA TAAAAATCAC
780
AGGGGGGCTG AAGCAGAAGA TCAACGAGTG GAAGGAGGCG TCAGACCCTT GTTTTGGAGT
840
ACAAATGAAA GGGGGGCTTT AGTGGGTTTC TCTAGCCTGT TCCAGAGACC ACCAATAAGC
900
TCTTGGGGAG GCAACAGAAT ATTAGAGATA CAGCTCAGTC CAGCCCAGAA TGCAAGCTCT
960
GAGGAAGTAG ACAGAATTTC CAAGAGGGCA GAAACCCCAG GGTTTGGAGC AGTCAGGTTT
1020
GGGGAGTGTG CACTAGCTTT TAACCAGAAG TCAAACCTGT TCAGACAGAA GGCAGTCACA
1080
GCAGAAAAAT CTTCAGACAA AAGGCAGTCA CAGGTGTGCA GGGAGTGTGG GCGAGGCTTT
1140
AGCAGGAAGT CACAGCTCAT CATACACCAG AGGACACACA CAGGAGAAAA GCCTTATGTC
1200
TGCGGAGAGT GTGGGCGAGG CTTTATAGTT GAGTCAGTCC TCCGCAACCA CCTGAGTACA
1260
CACTCCGGGG AGAAACCTTA TGTGTGCAGC CATTGTGGGC GAGGCTTTAG CTGCAAGCCA
1320
TACCTCATCA GACATCAGAG GACACACACA AGGGAGAAAT CGTTTATGTG CACAGTGTGT
1380
GGGCGAGGCT TTCGTGAAAA GTCAGAGCTC ATTAAGCACC AGAGGTGTCA AGTGACGGTC
1440
CCCTTGGAGG AATGGTCTTT GCATCTGACT ACTTCCTTCT GCAACTGTGT TCTTCCATTA
1500
GCTTCCATGA CACTCTCCTG CTTTATTTTT TTCTACATCT CTAGCCTTTG CTGTTTCCTC
1560
TCCTACCCCA CCTTTAGATT TTACTCAGAG TTCAGTCTCC AGCCCTACAA TCTGAGGGAC
1620
ACCTTTACCA GGTCCCCTTC CTAACCCTCC AGTCCCAAAT CCAAGATTCT TTAACCACAC
1680
TCTAAAAGTT CTTCAGACTC AGGACTTAAA CATAGCCACG CCACCTTGGC CTTCAATGAC
1740
AGGGATCTAG CAATGCTGCA TCATCAGCCT TCCAATACCA GGTTTAAGGG TATTTTAAAC
1800
ACAGCTCCTC TTAAATCCTC CAATCTCAGT ACCCAGTGTT TTAGCCATGC TCGGGTGGCT
1860
AAATTACATC CAGGAATGGT GCCAGGGCCT TTAGCCATTT GTCTCTCCTC ACACTCCAGC
1920
CCATATGGCC CAGGTTCTGA CAGTTTGCCT TACTCCCTTG GGCTGGGGCT AGCCCTACCT
1980
GATACCCTGT GTCAATGAGT GTACCTTGGA GAGCTATCCA CTCAGGCCCC AGTGCCTCTA
2040
TTTGCTAAGG GACTCTGCCA CAGAAAAGAA GGGGAGAGAT GTTCATGTAA CCTCAAAATA
2100
CTTAGGCTTG GTTTTGATGC TAGAGAGGAA AAAGGACTTG GAGAGAGAGA AGGAATGGCT
2160
GGTCCAGAGG CTTTTGTCCA CTCCCTCTCA CTGGAAGTGG TTGATCTCCA GGGAATCCCC
2220
AAGGTTAGCC TGCTTAGGGG AAGGGCTAGG GGTACCTGGA ATGTAGGATC TCCCCCATGC
2280
CTGGCCTACC ACCCTAATGT GTCTGGAATT GGTGGGTTCT TGGTCTTGCT GACTTCAAGA
2340
ATGAAGCCGT GGACCCTCAC GGTGAGTGTT ACAATTCTTA AA 2382
(2) INFORMATION FOR SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 421 AMINO ACIDS
(B) TYPE: POLYPEPTIDE
(D) TOPOLOGY: LINEAR
(ii) MOLECULE TYPE: PROTEIN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
Met Ile Asp Phe Gln Met Leu Asn Gln Leu Cys Arg Thr Ile Ile
5 10 15
Asn Pro Ser Val Ile Pro Cys Leu Lys Tyr Cys Gly Asp Gln Ile
20 25 30
Gly Pro Val Thr Phe Glu Asp Val Ala Val Leu Phe Thr Glu Ala
35 40 45
Glu Trp Lys Arg Leu Ser Leu Glu Gln Arg Asn Leu Tyr Lys Glu
50 55 60
Val Met Leu Glu Asn Leu Arg Asn Leu Val Ser Leu Glu Ser Lys
65 70 75
Pro Glu Val His Thr Cys Pro Ser Cys Pro Leu Ala Phe Gly Ser
80 85 90
Gln Gln Phe Leu Ser Gln Asp Glu Leu His Asn His Pro Ile Pro
95 100 105
Gly Phe His Ala Gly Asn Gln Leu His Pro Gly Asn Pro Cys Pro
110 115 120
Glu Asp Gln Pro Gln Ser Gln His Pro Ser Asp Lys Asn His Arg
125 130 135
Gly Ala Glu Ala Glu Asp Gln Arg Val Glu Gly Gly Val Arg Pro
140 145 150
Leu Phe Trp Ser Thr Asn Glu Arg Gly Ala Leu Val Gly Phe Ser
155 160 165
Ser Leu Phe Gln Arg Pro Pro Ile Ser Ser Trp Gly Gly Asn Arg
170 175 180
Ile Leu Glu Ile Gln Leu Ser Pro Ala Gln Asn Ala Ser Ser Glu
185 190 195
Glu Val Asp Arg Ile Ser Lys Arg Ala Glu Thr Pro Gly Phe Gly
200 205 210
Ala Val Arg Phe Gly Glu Cys Ala Leu Ala Phe Asn Gln Lys Ser
215 220 225
Asn Leu Phe Arg Gln Lys Ala Val Thr Ala Glu Lys Ser Ser Asp
230 235 240
Lys Arg Gln Ser Gln Val Cys Arg Glu Cys Gly Arg Gly Phe Ser
245 250 255
Arg Lys Ser Gln Leu Ile Ile His Gln Arg Thr His Thr Gly Glu
260 265 270
Lys Pro Tyr Val Cys Gly Glu Cys Gly Arg Gly Phe Ile Val Glu
275 280 285
Ser Val Leu Arg Asn His Leu Ser Thr His Ser Gly Glu Lys Pro
290 295 300
Tyr Val Cys Ser His Cys Gly Arg Gly Phe Ser Cys Lys Pro Tyr
305 310 315
Leu Ile Arg His Gln Arg Thr His Thr Arg Glu Lys Ser Phe Met
320 325 330
Cys Thr Val Cys Gly Arg Gly Phe Arg Glu Lys Ser Glu Leu Ile
335 340 345
Lys His Gln Arg Cys Gln Val Thr Val Pro Leu Glu Glu Trp Ser
350 355 360
Leu His Leu Thr Thr Ser Phe Cys Asn Cys Val Leu Pro Leu Ala
365 370 375
Ser Met Thr Leu Ser Cys Phe Ile Phe Phe Tyr Ile Ser Ser Leu
380 385 390
Cys Cys Phe Leu Ser Tyr Pro Thr Phe Arg Phe Tyr Ser Glu Phe
395 400 405
Ser Leu Gln Pro Tyr Asn Leu Arg Asp Thr Phe Thr Arg Ser Pro
410 415 420
Ser
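As an illustrative cross-check that is not part of the original listing, the start of the SEQ ID NO:2 polypeptide can be reproduced by translating SEQ ID NO:1 with the standard genetic code. The minimal Python sketch below uses only the row of SEQ ID NO:1 numbered 361 to 420, copied from the listing above; the names `CODON_TABLE`, `translate`, and `FRAGMENT_361_420` are hypothetical helpers introduced here, and the choice of the first ATG in that row as the start codon is an assumption made for illustration.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate DNA codon by codon (one-letter codes), stopping at a stop codon."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Nucleotides 361-420 of SEQ ID NO:1, copied from the listing in blocks of ten.
FRAGMENT_361_420 = (
    "CCAACCCTGA" "CCTCCACAAT" "GATTGATTTT"
    "CAAATGTTGA" "ACCAGCTTTG" "CAGAACTATA"
)

start = FRAGMENT_361_420.find("ATG")    # 0-based offset of the first ATG
orf_start_nt = 361 + start              # 1-based position in SEQ ID NO:1
peptide = translate(FRAGMENT_361_420[start:])
print(orf_start_nt, peptide)
```

In one-letter code, the translated fragment corresponds to the first 14 residues listed for SEQ ID NO:2, with the ATG beginning at nucleotide 379 by the listing's own numbering.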
(2) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 3099 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
GTAGCAGCAC CGTGCGTGCG TGCGCAGATG TGGGCCCCGC GGGAGCAGCT ACTGGGCTGG
60
GCTGCGGAAG CTCTGCCTGC CAAGGATTCT GCCTGGCCCT GGGAAGAGAA GCCTAGATAT
120
CTGGTTCCAG TGGCAGATGG GGCACTGGCC AGAGGGGTTA GGAGGATTTG GACTCTCCTT
180
GGGAGTCGCT GTGATACGGC CTTCAATTAT AGTTGTCTTT CCTTCTTGGA TTTCCACCAC
240
TTCCCTGAAA CCACGAGCTG GAAGCTGAGA CCAGCTGGTC ACTGCTGGGC CTGCTAGGAG
300
CCCACACGGA AGCTGCCCAC AGCCAGACAC CAGAGCCTCA AAGGCCGCCA TCTTGAAAAA
360
CCAACCCTGA CCTCCACAAT GATTGATTTT CAAATGTTGA ACCAGCTTTG CAGAACTATA
420
ATAAACCCAA GTGTGATACC CTGTCTCAAG TATTGCGGTG ATCAAATAGG ACCAGTGACT
480
TTCGAGGATG TGGCTGTGCT TTTCACTGAG GCAGAGTGGA AGAGACTGAG CCTTGAGCAG
540
AGGAACCTAT ACAAAGAAGT GATGCTGGAA AATCTCAGGA ATCTGGTCTC ATTGGAATCA
600
AAGCCAGAAG TCCATACCTG CCCTTCTTGC CCTCTGGCCT TTGGCAGTCA GCAGTTCCTC
660
AGCCAAGATG AGCTACACAA TCATCCTATT CCAGGTTTCC ATGCAGGAAA TCAACTCCAC
720
CCAGGAAATC CCTGCCCAGA GGATCAGCCA CAGTCACAAC ATCCTTCTGA TAAAAATCAC
780
AGGGGGGCTG AAGCAGAAGA TCAACGAGTG GAAGGAGGCG TCAGACCCTT GTTTTGGAGT
840
ACAAATGAAA GGGGGGCTTT AGTGGGTTTC TCTAGCCTGT TCCAGAGACC ACCAATAAGC
900
TCTTGGGGAG GCAACAGAAT ATTAGAGATA CAGCTCAGTC CAGCCCAGAA TGCAAGCTCT
960
GAGGAAGTAG ACAGAATTTC CAAGAGGGCA GAAACCCCAG GGTTTGGAGC AGTCAGGTTT
1020
GGGGAGTGTG CACTAGCTTT TAACCAGAAG TCAAACCTGT TCAGACAGAA GGCAGTCACA
1080
GCAGAAAAAT CTTCAGACAA AAGGCAGTCA CAGGTGTGCA GGGAGTGTGG GCGAGGCTTT
1140
AGCAGGAAGT CACAGCTCAT CATACACCAG AGGACACACA CAGGAGAAAA GCCTTATGTC
1200
TGCGGAGAGT GTGGGCGAGG CTTTATAGTT GAGTCAGTCC TCCGCAACCA CCTGAGTACA
1260
CACTCCGGGG AGAAACCTTA TGTGTGCAGC CATTGTGGGC GAGGCTTTAG CTGCAAGCCA
1320
TACCTCATCA GACATCAGAG GACACACACA AGGGAGAAAT CGTTTATGTG CACAGTGTGT
1380
GGGCGAGGCT TTCGTGAAAA GTCAGAGCTC ATTAAGCACC AGAGAATTCA CACGGGGGAT
1440
AAGCCTTATG TGTGCAGAGA TTGAGGCCGA GGCTTTGTAA GGAGATCATG TCTCAACACA
1500
CACCAGAGGA TACATTCAGA TGAGAAGCCT TTTGTTTGCA GAGAGTGTGG GCGAGGCTTT
1560
CGTGCTAAAT CAACTCTCCT CCTACACCAG TGGACACATT CAGAGGTGAA ACCTCACGTG
1620
TGTGAGGAGT GTGGGCATGG ATTTAGCCAG AAGTCGTCGC TCAAATCACA TCGGAGAACA
1680
CACTCAGGGG AGAAGCCTTA TGTGTGTGGG GAATGTGGGC GGGGATTTAG CCGGAGGATA
1740
GTCCTCAATG GACACTGGAG GACACACACG GGAGAGAAGC CTTACACGTG CTTTGAGTGT
1800
GGGCGAAACT TTAGCCTCAA GTCCGCTCTT AGTGTACATC AGAGGATACA CTCTGGGGAG
1860
AAGCCTTATG CATGCACGGA GTGTGGGCAA GGCTTTATCA CGAAATCACA GCTCATCAGA
1920
CACCAGAGGA CACACACAGG AGAAAAGCCT TATGTCTGCG GAGAGTGTGG GCGAGGCTTT
1980
ATAGCTCAGT CAACCCTCCA CTACCACCGG AGTACACACT CCAAGGAAAA ACCTTATGTG
2040
TGCAGCCAGT GTGGGCGAGG CTTTTGTGAT AAATCAACTC TCCTCGCACA CGAGCAGACA
2100
CATTCAGGGG AGAAGCCTTA TGTGTGTGGG GAATGTGGGC GGGGATTTGG CCGGAAGATA
2160
CTCCTCAACA GACACTGGAG GACACACACA GGAGAGAAAC CTTACGCATG CATCGAGTGT
2220
GGGCGAAACT TTAGCCACAA GTCCACTCTC AGCTTACATC AGAGGATACA CTCGGGGGAG
2280
AAGCCTTATG CATGCGTGGA GTGTGGGCAA AGCTTTAGGA GAAAGTCACA GCTCATCATA
2340
CACCAGAAGA TACACTCGGG GAAAAGCTTT AGAGGTGCAA GGAGTGAGGA TGTGATTTTA
2400
GCAACAAGTC AGCCATCAGC CACACCAGCG GAAATGCTTA GGGAGAAGCC TTGTTTGTAA
2460
GGTAATGTGG ACAGAGCTGT ACGTGGACAT CATTACTTGT CACGTGTCAG AGGACACACT
2520
CGGGAGAAAC CTTCATGGAG TGAGAGTAAG GTGTTGGCTG GAAGTGGCCC CTTAAGAGAT
2580
ACTTGGAGTC AAATCTATCC ACTGTACGCC CACCCCACTC TTGTTCTAAG AGCTTTGGGG
2640
ACAGTCTTTT GACCCCTTAC ATTCCTTTAG ATGTGAAGAT GACAGAGATC TAACTTCTGA
2700
GAGCAGAGGT GTCAAGTGAC GGTCCCCTTG GAGGAATGGT CTTTGCATCT GACTACTTCC
2760
TTCTGCAACT GTGTTCTTCC ATTAGCTTCC ATGACACTCT CCTGCTTTAT TTTTTTCTAC
2820
ATCTCTAGCC TTTGCTGTTT CCTCTCCTAC CCCACCTTTA GATTTTACTC AGAGTTCAGT
2880
CTCCAGCCCT ACAATCTGAG GGACACCTTT ACCAGGTCCC CTTCCTAACC CTCCAGTCCC
2940
AAATCCAAGA TTCTTTAACC ACACTCTAAA AGTTCTTCAG ACTCAGGACT TAAACATAGC
3000
CACGCCACCT TGGCCTTCAA TGACAGGGAT CTAGCAATGC TGCATCATCA GCCTTCCAAT
3060
ACCAGGTTTA AGGGTATTTT AAACACAGCT CCTCTTAAA
3099
(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 361 AMINO ACIDS
(B) TYPE: POLYPEPTIDE
(D) TOPOLOGY: LINEAR
(ii) MOLECULE TYPE: PROTEIN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
Met Ile Asp Phe Gln Met Leu Asn Gln Leu Cys Arg Thr Ile Ile
5 10 15
Asn Pro Ser Val Ile Pro Cys Leu Lys Tyr Cys Gly Asp Gln Ile
20 25 30
Gly Pro Val Thr Phe Glu Asp Val Ala Val Leu Phe Thr Glu Ala
35 40 45
Glu Trp Lys Arg Leu Ser Leu Glu Gln Arg Asn Leu Tyr Lys Glu
50 55 60
Val Met Leu Glu Asn Leu Arg Asn Leu Val Ser Leu Glu Ser Lys
65 70 75
Pro Glu Val His Thr Cys Pro Ser Cys Pro Leu Ala Phe Gly Ser
80 85 90
Gln Gln Phe Leu Ser Gln Asp Glu Leu His Asn His Pro Ile Pro
95 100 105
Gly Phe His Ala Gly Asn Gln Leu His Pro Gly Asn Pro Cys Pro
110 115 120
Glu Asp Gln Pro Gln Ser Gln His Pro Ser Asp Lys Asn His Arg
125 130 135
Gly Ala Glu Ala Glu Asp Gln Arg Val Glu Gly Gly Val Arg Pro
140 145 150
Leu Phe Trp Ser Thr Asn Glu Arg Gly Ala Leu Val Gly Phe Ser
155 160 165
Ser Leu Phe Gln Arg Pro Pro Ile Ser Ser Trp Gly Gly Asn Arg
170 175 180
Ile Leu Glu Ile Gln Leu Ser Pro Ala Gln Asn Ala Ser Ser Glu
185 190 195
Glu Val Asp Arg Ile Ser Lys Arg Ala Glu Thr Pro Gly Phe Gly
200 205 210
Ala Val Arg Phe Gly Glu Cys Ala Leu Ala Phe Asn Gln Lys Ser
215 220 225
Asn Leu Phe Arg Gln Lys Ala Val Thr Ala Glu Lys Ser Ser Asp
230 235 240
Lys Arg Gln Ser Gln Val Cys Arg Glu Cys Gly Arg Gly Phe Ser
245 250 255
Arg Lys Ser Gln Leu Ile Ile His Gln Arg Thr His Thr Gly Glu
260 265 270
Lys Pro Tyr Val Cys Gly Glu Cys Gly Arg Gly Phe Ile Val Glu
275 280 285
Ser Val Leu Arg Asn His Leu Ser Thr His Ser Gly Glu Lys Pro
290 295 300
Tyr Val Cys Ser His Cys Gly Arg Gly Phe Ser Cys Lys Pro Tyr
305 310 315
Leu Ile Arg His Gln Arg Thr His Thr Arg Glu Lys Ser Phe Met
320 325 330
Cys Thr Val Cys Gly Arg Gly Phe Arg Glu Lys Ser Glu Leu Ile
335 340 345
Lys His Gln Arg Ile His Thr Gly Asp Lys Pro Tyr Val Cys Arg
350 355 360
Asp
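The relationship between the two cDNAs can be illustrated by comparing SEQ ID NO:1 and SEQ ID NO:3 position by position. The Python sketch below is a hypothetical helper, not part of the filing: it compares only the rows numbered 1381 to 1440 in the two nucleotide listings and reports the first position at which they differ. It assumes that all preceding rows are identical, as they appear to be in the listings above.

```python
def common_prefix_length(a, b):
    """Return the number of leading characters shared by strings a and b."""
    shared = 0
    for x, y in zip(a, b):
        if x != y:
            break
        shared += 1
    return shared


# Rows 1381-1440 copied from the two listings in blocks of ten.
SEQ1_1381_1440 = (
    "GGGCGAGGCT" "TTCGTGAAAA" "GTCAGAGCTC"
    "ATTAAGCACC" "AGAGGTGTCA" "AGTGACGGTC"
)
SEQ3_1381_1440 = (
    "GGGCGAGGCT" "TTCGTGAAAA" "GTCAGAGCTC"
    "ATTAAGCACC" "AGAGAATTCA" "CACGGGGGAT"
)

shared = common_prefix_length(SEQ1_1381_1440, SEQ3_1381_1440)
first_difference_nt = 1381 + shared   # 1-based position of the first mismatch
print(shared, first_difference_nt)
```

Under that assumption the two sequences agree through nucleotide 1424. With a reading frame that starts at nucleotide 379, the first mismatch falls in the third position of codon 349, which encodes Arg in both sequences, and the encoded polypeptides first differ at residue 350 (Cys in SEQ ID NO:2, a different residue in SEQ ID NO:4), consistent with the two polypeptide listings.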
(2) INFORMATION FOR SEQ ID NO: 5:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 54 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR
(ii) MOLECULE TYPE: OLIGONUCLEOTIDE
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
CGGGATCCTA ATACGACTCA CTATAGGGAG ACCACCATGA TTGATTTTCA AATG 54
(2) INFORMATION FOR SEQ ID NO: 6:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 21 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR
(ii) MOLECULE TYPE: OLIGONUCLEOTIDE
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
GGAATTCCTC AAGGCTCAGT C 21
(2) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 18 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR
(ii) MOLECULE TYPE: OLIGONUCLEOTIDE
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
AAGAGACTGA GCCTTGAG 18
(2) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 18 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR
(ii) MOLECULE TYPE: OLIGONUCLEOTIDE
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
ATTTCCTGCA TGGAAACC 18
(2) INFORMATION FOR SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 14 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR
(ii) MOLECULE TYPE: OLIGONUCLEOTIDE
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
GGGCAACAGA ATAT 14
(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 18 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR
(ii) MOLECULE TYPE: OLIGONUCLEOTIDE
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
ATAAGGTTTC TCCCCGGA 18
(2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 14 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR
(ii) MOLECULE TYPE: OLIGONUCLEOTIDE
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
GAGGCGTGAA AAGT 14
(2) INFORMATION FOR SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 18 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR
(ii) MOLECULE TYPE: OLIGONUCLEOTIDE
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
CAGGAGAGTG TCATGGAA 18
(2) INFORMATION FOR SEQ ID NO: 13:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 18 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR
(ii) MOLECULE TYPE: OLIGONUCLEOTIDE
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
CTGTGTTCTT CCATTAGC 18
(2) INFORMATION FOR SEQ ID NO: 14:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 18 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR
(ii) MOLECULE TYPE: OLIGONUCLEOTIDE
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
GGCCTTAGCC ATTTGTCT 18
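Some of the oligonucleotides above can be located on SEQ ID NO:1 by plain string search, either on the listed strand or as a reverse complement. The Python sketch below is a hedged illustration only: it uses nucleotides 481 to 720 of SEQ ID NO:1 copied from the listing, and the helper `reverse_complement` and the constant names are hypothetical. Whether SEQ ID NO:7 and SEQ ID NO:8 were used together as a primer pair is not stated in this section and is assumed here purely for the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an unambiguous DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


# Nucleotides 481-720 of SEQ ID NO:1, copied from the listing; split() drops
# the spaces and line breaks between the blocks of ten.
FRAGMENT_START_NT = 481
FRAGMENT = "".join("""
TTCGAGGATG TGGCTGTGCT TTTCACTGAG GCAGAGTGGA AGAGACTGAG CCTTGAGCAG
AGGAACCTAT ACAAAGAAGT GATGCTGGAA AATCTCAGGA ATCTGGTCTC ATTGGAATCA
AAGCCAGAAG TCCATACCTG CCCTTCTTGC CCTCTGGCCT TTGGCAGTCA GCAGTTCCTC
AGCCAAGATG AGCTACACAA TCATCCTATT CCAGGTTTCC ATGCAGGAAA TCAACTCCAC
""".split())

SEQ_ID_7 = "AAGAGACTGAGCCTTGAG"   # matches the listed strand
SEQ_ID_8 = "ATTTCCTGCATGGAAACC"   # matches as a reverse complement

fwd = FRAGMENT.find(SEQ_ID_7)                       # 0-based offset, -1 if absent
rev = FRAGMENT.find(reverse_complement(SEQ_ID_8))   # 0-based offset, -1 if absent

fwd_start_nt = FRAGMENT_START_NT + fwd
rev_end_nt = FRAGMENT_START_NT + rev + len(SEQ_ID_8) - 1
span = rev_end_nt - fwd_start_nt + 1
print(fwd_start_nt, rev_end_nt, span)
```

On this fragment, SEQ ID NO:7 begins at nucleotide 520 and the reverse complement of SEQ ID NO:8 ends at nucleotide 711, so the two oligonucleotides would bracket a 192-nucleotide region if used as a pair.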
Claims
1. An isolated polynucleotide comprising a polynucleotide having at least 95% identity to a member selected from the group consisting of:
(a) a polynucleotide encoding a polypeptide comprising amino acids 2 to 421 of SEQ ID NO:2;
(b) a polynucleotide encoding a polypeptide comprising amino acids 2 to 361 of SEQ ID NO:4; and
(c) the complement of (a) or (b).
2. The isolated polynucleotide of claim 1 wherein said member is (a) .
3. The isolated polynucleotide of claim 1 wherein said member is (b) .
4. The isolated polynucleotide of claim 1 wherein said polynucleotide encodes a member selected from the group consisting of:
(a) the polypeptide comprising amino acid 1 to 421 of SEQ ID NO:2; and
(b) the polypeptide comprising amino acid 1 to 361 of SEQ ID NO:4.
5. The isolated polynucleotide of claim 1, wherein the polynucleotide is DNA.
6. The isolated polynucleotide of claim 1 comprising a polynucleotide encoding a polypeptide comprising an amino acid sequence identical to an amino acid sequence consisting of amino acids 1 to 421 of SEQ ID NO: 2.
7. The isolated polynucleotide of claim 1 comprising a polynucleotide encoding a polypeptide comprising an amino acid sequence identical to an amino acid sequence consisting of amino acids 1 to 361 of SEQ ID NO: 4.
8. The isolated polynucleotide of claim 1, wherein said polynucleotide is RNA.
9. A method of making a recombinant vector comprising inserting an isolated polynucleotide of claim 1 into a vector, wherein said polynucleotide is the member (a) or (b) and is DNA.
10. A recombinant vector comprising a polynucleotide of claim 1, wherein said polynucleotide is the member (a) or (b) and is DNA.
11. A recombinant host cell comprising a polynucleotide of claim 1, wherein said polynucleotide is the member (a) or (b) and is DNA.
12. A method for producing a polypeptide comprising expressing from the recombinant cell of claim 11 the polypeptide encoded by said polynucleotide.
13. The isolated polynucleotide of claim 1 comprising nucleotides 82 to 1641 of SEQ ID NO:1.
14. The isolated polynucleotide of claim 1 comprising nucleotides 79 to 1641 of SEQ ID NO:1.
15. The isolated polynucleotide of claim 1 comprising nucleotides 82 to 1461 of SEQ ID NO:3.
16. The isolated polynucleotide of claim 1 comprising nucleotides 79 to 1461 of SEQ ID NO:3.
17. An isolated polynucleotide comprising a polynucleotide having at least a 95% identity to a member selected from the group consisting of:
(a) a polynucleotide encoding the same mature polypeptide encoded by the human cDNA in ATCC Deposit No. xxxxx;
(b) a polynucleotide encoding the same mature polypeptide encoded by the human cDNA in ATCC Deposit No. xxxxx; and
(c) the complement of (a) or (b).
18. An isolated polypeptide comprising: a mature polypeptide having an amino acid sequence encoded by a polynucleotide which is at least 95% identical to the polynucleotide of claim 4, wherein said member is (a) or (b).
19. The isolated polypeptide of claim 26, comprising amino acids 2 to 421 of sequence of SEQ ID NO: 2.
20. An antibody against the polypeptide of claim 18.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU68898/98A AU6889898A (en) | 1997-04-08 | 1998-04-08 | Cell zinc finger polynucleotides and splice variant polypeptides encoded thereby |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US4181197P | 1997-04-08 | 1997-04-08 | |
US60/041,811 | 1997-04-08 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1998045326A1 (en) | 1998-10-15 |
Family
ID=21918448
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1998/006925 WO1998045326A1 (en) | 1997-04-08 | 1998-04-08 | Cell zinc finger polynucleotides and splice variant polypeptides encoded thereby |
Country Status (2)
Country | Link |
---|---|
AU (1) | AU6889898A (en) |
WO (1) | WO1998045326A1 (en) |
-
1998
- 1998-04-08 WO PCT/US1998/006925 patent/WO1998045326A1/en active Application Filing
- 1998-04-08 AU AU68898/98A patent/AU6889898A/en not_active Abandoned
Non-Patent Citations (1)
Title |
---|
MILLER I. J., ET AL.: "A NOVEL, ERYTHROID CELL-SPECIFIC MURINE TRANSCRIPTION FACTOR THAT BINDS TO THE CACCC ELEMENT AND IS RELATED TO THE KRUEPPEL FAMILY OF NUCLEAR PROTEINS.", MOLECULAR AND CELLULAR BIOLOGY., AMERICAN SOCIETY FOR MICROBIOLOGY, WASHINGTON., US, vol. 13., no. 05., 1 May 1993 (1993-05-01), US, pages 2776 - 2786., XP002912508, ISSN: 0270-7306 * |
Also Published As
Publication number | Publication date |
---|---|
AU6889898A (en) | 1998-10-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | AK | Designated states | Kind code of ref document: A1 Designated state(s): AU CA JP US |
 | AL | Designated countries for regional patents | Kind code of ref document: A1 Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE |
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
 | NENP | Non-entry into the national phase | Ref country code: CA |
 | NENP | Non-entry into the national phase | Ref country code: JP Ref document number: 1998543053 Format of ref document f/p: F |
 | 122 | Ep: pct application non-entry in european phase | |