WO2001007607A2 - FULL LENGTH cDNA CLONES AND PROTEINS ENCODED THEREBY - Google Patents
- Publication number
- WO2001007607A2 (PCT/JP2000/004895)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- polynucleotide
- protein
- nucleotide sequence
- full length
- cdna
- Prior art date
Classifications
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/46—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
- C07K14/47—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
Definitions
- the present invention relates to a full length cDNA clone encoding a human protein, a protein encoded by the cDNA clone, and a method for producing them and utilizing them
- genomic sequences of more than 10 species of prokaryotes, of a lower eukaryote (yeast), and of a multicellular eukaryote (C. elegans) have already been determined
- human genome which is supposed to be composed of three thousand million base pairs
- worldwide cooperative projects are under way to analyze it, and the whole structure is predicted to be determined by 2002-2003
- the aim of the determination of genomic sequence is to reveal the functions of all genes and their regulation and to understand living organisms as a network of interactions between genes, proteins, cells or individuals through deducing the information in a genome, which is a blueprint of the highly complicated living organisms
- To understand living organisms by utilizing the genomic information from various species is not only important as an academic subject, but also socially significant from the viewpoint of industrial application
- genomic sequences by themselves cannot identify the functions of all genes. For example, in yeast, the function could be deduced for only approximately half of the 6000 genes predicted from the genomic sequence. In human, the number of genes is predicted to be approximately one hundred thousand. Therefore, it is desirable to establish "a high throughput analysis system of the gene functions" which allows us to identify rapidly and efficiently the functions of the vast number of genes obtained by genomic sequencing
- cDNA which is produced from mRNA that lacks introns, encodes a protein as a single continuous amino acid sequence and allows us to identify the primary structure of the protein easily
- ESTs (Expressed Sequence Tags)
- the information of ESTs is utilized for analyzing the structure of human genome, or for predicting the exon-regions of genomic sequences or their expression profile.
- a method to synthesize a full length cDNA is known to those skilled in the art.
- the oligo-capping method (Maruyama K. and Sugano S. (1994) Gene 138: 171-174; Suzuki Y. et al. (1997) Gene 200: 149-156) makes it possible, in principle, to synthesize a library enriched with full length cDNA.
- the synthesized cDNA is cloned and the nucleotide sequence is determined, it is possible to estimate whether the cDNA is a full length cDNA clone or not by methods such as the ATGpr (Salamov A.A., Nishikawa T., and Swindells M.B.
- An objective of the present invention is to provide a novel human protein, a polynucleotide encoding the protein, and their usage.
- the inventors have developed a method for efficiently cloning a human full length cDNA that is predicted by the ATGpr etc. to be a full length cDNA clone, from a full length-enriched cDNA library that is synthesized by the oligo-capping method. Then, the inventors determined the nucleotide sequence of the obtained cDNA clones from both 5'- and 3'- ends. By utilizing the sequences, the inventors selected clones that were expected to contain a signal sequence by the PSORT (Nakai K. and Kanehisa M. (1992) Genomics 14: 897-911), and obtained clones that do not contain a cDNA encoding a secretory protein or membrane protein.
- the full length cDNA clones of the present invention have a high full length ratio, since they were obtained by the combination of (1) construction of a full length-enriched cDNA library synthesized by the oligo-capping method, and (2) a system in which the full length ratio is evaluated from the nucleotide sequence of the 5'-end.
- the inventors have analyzed the nucleotide sequence of the full length cDNA clones obtained by the method, and deduced the amino acid sequence encoded by the nucleotide sequence. Then, the inventors have performed the BLAST search (Altschul S.F., Gish W., Miller W., Myers E.W., and Lipman D.J. (1990) J. Mol. Biol. 215: 403-410; Gish W., and States D.J. (1993) Nature Genet. 3: 266-272; http://www.ncbi.nlm.nih.gov/BLAST/) of the GenBank
- the present invention relates to the polynucleotide mentioned below, a protein encoded by the polynucleotide, and their usage. First, the present invention relates to
- a polynucleotide comprising a nucleotide sequence encoding a protein comprising an amino acid sequence selected from the amino acid sequences set forth in the SEQ ID NOs in Table 1, in which one or more amino acids are substituted, deleted, inserted, and/or added, wherein said protein is functionally equivalent to the protein comprising said amino acid sequence selected from the amino acid sequences set forth in the SEQ ID NOs in Table 1;
- a polynucleotide comprising a nucleotide sequence encoding a partial amino acid sequence of a protein encoded by the polynucleotide of (a) to (d);
- a polynucleotide comprising a nucleotide sequence with at least 70% identity to the nucleotide sequence set forth in any one of the SEQ ID NOs in Table 1.
- Table 1 shows the names of the cDNA clones isolated in the examples described later, comprising the full length cDNA of the present invention, the corresponding SEQ ID NOs. of the nucleotide sequences of the cDNA clones, and the corresponding SEQ ID NOs. of the amino acid sequences deduced from the nucleotide sequences of the cDNA clones.
- the present invention relates to the above polynucleotide, a protein encoded by the polynucleotide, and the use of them as described below.
- a vector comprising the polynucleotide of (1).
- a method for producing the protein of (2) or the peptide of (3) comprising culturing the transformant of (7) and recovering the expression product.
- An oligonucleotide comprising the nucleotide sequence set forth in any one of the SEQ ID NOs in Table 1 or the nucleotide sequence complementary to the complementary strand thereof, wherein said oligonucleotide comprises 15 nucleotides or more.
- a method for synthesizing a polynucleotide comprising: a) synthesizing a complementary strand using a cDNA library as a template, and using the primer of (10); and b) recovering the synthesized product.
- a method for detecting the polynucleotide of (1) comprising: a) incubating a target polynucleotide with the oligonucleotide of (9) under the conditions where hybridization occurs, and b) detecting the hybridization of the target polynucleotide with the oligonucleotide of (9).
- Figure 1 shows the restriction maps of vectors pME18SFL3 and pUC19FL3.
- a polynucleotide is defined as a molecule in which multiple nucleotides are polymerized. There is no limitation on the number of polymerized nucleotides. When the polymer contains a relatively low number of nucleotides, it is also described as an "oligonucleotide".
- the polynucleotide or the oligonucleotide of the present invention can be a natural or chemically synthesized product. Alternatively, it can be synthesized using a template polynucleotide by an enzymatic reaction such as PCR.
- all the cDNAs provided by the invention are full length cDNAs.
- a "full length cDNA" is defined as a cDNA which contains both the ATG codon (the translation start site) and the stop codon. Accordingly, the untranslated regions, which are originally found upstream or downstream of the protein coding region in natural mRNA, may or may not be contained.
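The definition above can be illustrated with a minimal Python sketch that checks whether a nucleotide sequence contains an ATG codon followed by an in-frame stop codon. The function names are hypothetical, the first ATG that opens a closed reading frame is assumed to be the start site, and this is only a sanity check, not the ATGpr method referred to elsewhere in this document.

```python
# Standard stop codons, written as DNA (sense strand).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_complete_orf(cdna):
    """Return (start, end) of the first ATG-initiated reading frame that is
    closed by an in-frame stop codon, or None if no such frame exists.
    `end` is the index just past the stop codon."""
    seq = cdna.upper()
    start = seq.find("ATG")
    while start != -1:
        # Walk codon by codon from the codon after ATG.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return start, i + 3
        # No in-frame stop for this ATG; try the next ATG.
        start = seq.find("ATG", start + 1)
    return None


def contains_atg_and_stop(cdna):
    """True if the sequence carries both a start codon and an in-frame stop."""
    return find_complete_orf(cdna) is not None
```

For example, `find_complete_orf("GGATGAAATTTTAGCC")` returns `(2, 14)`, which spans `ATGAAATTTTAG`; flanking untranslated bases are ignored, consistent with the definition that they may or may not be present.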
- isolated polynucleotide is a polynucleotide which is not identical to any naturally occurring nucleic acid or to that of any fragment of a naturally occurring genomic nucleic acid spanning more than three separate genes. The term therefore covers, for example,
- a separate molecule such as a cDNA, a genomic fragment, a fragment produced by polymerase chain reaction (PCR), or a restriction fragment;
- a recombinant nucleotide sequence that is part of a hybrid gene i.e., a gene encoding a fusion protein.
- a substantially pure human protein of the present invention comprises any one of the amino acid sequences of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, and SEQ ID NO: 8, as shown in Table 1. The features of these proteins and the full length cDNA clones encoding the proteins were summarized in Table 2.
- PSEC0058 has a longer 5'-end sequence than the dbEST in the GenBank.
- substantially pure as used herein in reference to a given polypeptide means that the protein or polypeptide is substantially free from other biological macromolecules.
- the substantially pure protein or polypeptide is at least 75% (e.g., at least
- Since the amino acid sequence of the protein of the present invention has been determined, it is possible to analyze the biological function(s) of the cloned gene by expressing it as a recombinant protein utilizing an appropriate expression system, or by using a specific antibody against it.
- it is possible to analyze the function of the cloned gene, for example, by expressing the protein of the invention, injecting the protein into cells (various cell lines or primary culture cells), and analyzing the changes in the cells by monitoring the changes in signals such as calcium ions, the change of the cellular growth state, or the change of the expression of a protein or mRNA whose function is known. It is also possible to analyze the function of the cloned gene by injecting into cells (various cell lines or primary culture cells) the antibody which specifically recognizes the protein of the invention, and analyzing the changes in the cells by monitoring the changes in signals such as calcium ions, the change of the cellular growth state, or the change of the expression of a protein or mRNA whose function is known.
- the function of the protein of the invention by analyzing the localization of the polypeptide within the cells or within the tissues in detail by using an antibody that recognizes the protein specifically.
- for the histochemical analysis of the whole body of an embryo, when it is difficult to obtain a human embryo, a mouse embryo, for example, can be used, since the corresponding mouse genes generally have high homology to the human genes at the amino acid level.
- simian genes have high homology to the human gene
- cells in each differentiation level, or cultured cells can be used to predict the function of the cloned gene.
- any protein encoded by the cDNA clone of the invention contains its full length amino acid sequence, it is possible to analyze its biological activity by expressing it as a recombinant protein utilizing an appropriate expression system, or by using a specific antibody against it. If the protein is associated with diseases, a specific antibody obtained by using the expressed protein can be utilized to examine the relationship between the expression level or activity of the protein and a particular disease. Alternatively, it is possible to analyze the relationship between the protein and disease by using the Online Mendelian Inheritance in Man (OMIM) (http://www.ncbi.nlm.nih.gov/Omim/), the database of human genes and diseases.
- Proteins associated with diseases are useful in drug development since they can be utilized as a diagnostic marker, a drug that regulates the level of their expression and activity, or a target of gene therapy. Especially, the protein associated with transcription or signal transduction is extremely useful in the medicinal industry because the associations of such a protein with diseases have been reported in "Transcription factor research 1999" (Fujii, Tamura, Morohashi, Kageyama, and Satake edit. (1999) Jikken-Igaku Zoukan, Vol.17, No.3), and "Gene medical” (1999) Vol.3, No.2.
- medicines can be developed as follows. If the protein is a regulatory factor of the cellular conditions such as growth and differentiation, low-molecular-weight compounds can be screened by examining the change in the cellular conditions, or the activation or repression of a particular gene in a certain cell into which the protein or the antibody of the present invention is microinjected.
- the screening can be performed, for example, as follows. First, the protein of the invention is expressed and purified in a recombinant form. Then, the purified protein is microinjected into various kinds of cell lines or primary cultured cells, and the change in the cell growth and differentiation is monitored. The induction of a particular gene that is known to be involved in a certain cellular change is detected by the amounts of mRNA and protein. Alternatively, the amount of an intracellular molecule (low-molecular-weight compounds, etc.) that is changed by the function of a gene product (protein) that is known to function in a certain cellular change is used for the detection.
- a compound (which can be either a low-molecular-weight or high-molecular-weight compound) whose activity is to be screened can be screened by the change in cellular conditions as an index by adding the compound to the culture medium.
- the screening can be achieved by merely monitoring the change of a gene product obtained in the present invention without microinjecting the gene product into a cell.
- the above screening enables developing a substance that activates or represses the function of a protein of the present invention, which regulates cellular conditions or functions. Such a substance is expected to be used as medicine.
- a transformed cell line expressing the protein of the invention is obtained. Then, the transformed cell line and the untransformed original cell line are compared for the changes in the expression of a certain gene by detecting the amount of its mRNA or protein. Alternatively, the amount of an intracellular molecule (low molecular compounds, etc.) that is changed by the function of a certain gene product (protein) is used for the detection.
- the change in the expression of a certain gene is detected by introducing a fusion gene that comprises a regulatory region of the gene and a marker gene (luciferase, beta-galactosidase, etc.) into cells, expressing the protein of the invention in the cells, and estimating the activity of a marker gene product (protein).
- the screening reveals that the affected protein or gene is associated with diseases, it is possible to perform a screening for a compound or gene that is capable of regulating the expression or activity of the affected gene either directly or indirectly by utilizing the protein provided by the invention.
- the protein of the invention is expressed and purified in a recombinant form.
- the affected protein or gene is also purified. Then, the binding ability of the protein of the invention to the affected protein or gene is examined. The change in the binding ability is monitored after a compound that is a candidate for an inhibitor is added to the reaction mixture.
- a regulatory factor of the expression of the gene encoding the protein of the invention can be screened as follows. A transcription regulatory region located 5'-upstream of the gene encoding the protein of the invention is obtained and fused with a marker gene. After the fusion gene is introduced into cells, test compounds are added to the cells for screening. The compounds obtained through such screening can be used as a drug for the diseases with which the protein of the invention is associated. Similarly, if the regulatory factor is a protein, compounds that affect the expression or activity of the protein can be used as a medicine for the diseases.
- a screening can be performed by adding a compound to the protein of the invention and monitoring the change of the compound.
- the enzymatic activity can also be utilized to screen a compound that inhibits the activity of the protein.
- Such screening can be carried out as follows. First, the protein of the invention is expressed and purified in a recombinant form. Then, compounds are added to the purified protein, and the amounts of the compound and of the reaction products are examined. Alternatively, after a compound that is a candidate for an inhibitor is added, a compound (substrate) that reacts with the purified protein is added, and the amounts of the substrate and of the reaction products are examined.
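The comparison of reaction-product amounts in the inhibitor screening described above can be sketched as simple arithmetic. The following Python example is an illustration only: the function names, compound labels, and numeric readings are hypothetical, and it assumes that the amount of reaction product measured without any candidate compound serves as the uninhibited control.

```python
def percent_inhibition(product_without_inhibitor, product_with_inhibitor):
    """Percent reduction in reaction product caused by a candidate inhibitor,
    relative to the uninhibited control reaction."""
    if product_without_inhibitor <= 0:
        raise ValueError("control reaction must yield a positive amount of product")
    return 100.0 * (1.0 - product_with_inhibitor / product_without_inhibitor)


def rank_candidates(control_product, readings):
    """Sort candidate compounds from strongest to weakest inhibition.

    `readings` maps a compound label to the amount of reaction product
    measured after that compound was added before the substrate.
    """
    scored = [
        (name, percent_inhibition(control_product, amount))
        for name, amount in readings.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

With a control reading of 200 units and 50 units in the presence of a compound, `percent_inhibition(200.0, 50.0)` gives 75.0. Units only need to be consistent between the two measurements.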
- the compounds obtained in the screening can be used as a medicine for diseases with which the protein of the invention is associated.
- a specific antibody that recognizes the protein of the invention can be used to examine the relationship between the level of the expression or activity of the protein and a particular disease. It is also possible to analyze the relationship according to the methods described in "Molecular Diagnosis of Genetic Diseases” (Elles R. edit. (1996)) in the series of “Method in Molecular Biology” (Humana Press). Proteins associated with diseases are targets of screening as mentioned, and thus are very useful in developing drugs which regulate their expression and activity. Also, the proteins are useful in the medicinal industry as a diagnostic marker of the associated disease or a target of gene therapy.
- Compounds isolated as mentioned above can be administered to patients as they are, or after being formulated into a pharmaceutical composition according to known methods.
- a pharmaceutically acceptable carrier or vehicle (specifically, sterilized water, saline, plant oil, emulsifier, or suspending agent) can be mixed with the compounds appropriately.
- the pharmaceutical compositions can be administered to patients by a method known to those skilled in the art, such as intraarterial, intravenous, or subcutaneous injections.
- the dosage may vary depending on the weight or age of a patient, or the method of administration, but those skilled in the art can choose an appropriate dosage properly.
- the compound is encoded by DNA
- the DNA can be cloned into a vector for gene therapy, and used for gene therapy.
- the dosage of the DNA and the method of its administration may vary depending on the weight or age of a patient, or the symptoms, but those skilled in the art can choose properly.
- the protein of the invention can be prepared as a recombinant protein or a natural protein.
- the recombinant protein can be prepared, for example, by inserting the DNA encoding the protein of the invention into a vector, introducing the vector into an appropriate host cell, culturing the host cell in a culture medium, and purifying the protein expressed in the transformed host cell or the culture medium, as described below.
- the natural protein can be prepared, for example, by utilizing an affinity column to which an antibody against the protein of the invention is attached, as described below (Current Protocols in Molecular Biology (1987) Ausubel et al. edit, John Wiley & Sons, Section 16.1-16.19).
- the antibody used for the affinity chromatography can be either a polyclonal antibody or a monoclonal antibody.
- in vitro translation (for example, "On the fidelity of mRNA translation in the nuclease-treated rabbit reticulocyte lysate system." Dasso M.C., and Jackson R.J. (1989) Nucleic Acids Res. 17: 3129-3144) can be used for preparing the protein of the invention.
- Proteins functionally equivalent to the proteins of the present invention can be prepared by those skilled in the art, for example, by using a method for introducing mutations into an amino acid sequence of a protein, for example, site-directed mutagenesis (Current Protocols in Molecular Biology, edit, Ausubel et al., (1987) John Wiley & Sons, Section 8.1-8.5). Such proteins can also arise from spontaneous mutations.
- the present invention includes proteins having one or more amino acid substitutions, deletions, insertions and/or additions in the amino acid sequences of the proteins of the present invention (specifically, SEQ ID NO: 2, 4, 6, and 8), as far as the proteins have functions equivalent to those of the proteins identified in the EXAMPLE.
- a substituted amino acid has a similar property to that of the original amino acid. For example, Ala, Val, Leu, Ile, Pro, Met, Phe and Trp are assumed to have similar properties to one another because they are all classified into a group of non-polar amino acids.
- substitution can be performed among non-charged amino acids such as Gly, Ser, Thr, Cys, Tyr, Asn, and Gln, acidic amino acids such as Asp and Glu, and basic amino acids such as Lys, Arg, and His.
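The grouping of amino acids described above can be written down as a small lookup. The Python sketch below is a minimal illustration that uses only the four groups named in the text; the names `GROUPS`, `group_of`, and `is_same_group_substitution` are hypothetical, and belonging to the same group is treated here as a stand-in for "similar property", not as proof of functional equivalence.

```python
# The four groups listed in the text, in three-letter code.
GROUPS = {
    "non-polar": {"Ala", "Val", "Leu", "Ile", "Pro", "Met", "Phe", "Trp"},
    "non-charged": {"Gly", "Ser", "Thr", "Cys", "Tyr", "Asn", "Gln"},
    "acidic": {"Asp", "Glu"},
    "basic": {"Lys", "Arg", "His"},
}


def group_of(residue):
    """Return the name of the group a residue belongs to."""
    for name, members in GROUPS.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def is_same_group_substitution(original, substituted):
    """True when both residues fall in the same group listed in the text."""
    return group_of(original) == group_of(substituted)
```

For instance, `is_same_group_substitution("Leu", "Ile")` is true because both residues are non-polar, whereas replacing Asp with Lys crosses from the acidic to the basic group.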
- proteins functionally equivalent to the proteins of the present invention can be isolated by using techniques of hybridization or gene amplification known to those skilled in the art. Specifically, using the hybridization technique (Current Protocols in Molecular Biology, edit, Ausubel et al., (1987) John Wiley & Sons, Section 6.3-6.4), those skilled in the art can usually isolate a DNA highly homologous to the DNA encoding the protein identified in the below mentioned EXAMPLE based on the identified nucleotide sequence (SEQ ID NO: 1, 3, 5, and 7) or a portion thereof and obtain the functionally equivalent protein from the isolated DNA.
- the present invention includes proteins encoded by the DNAs hybridizing with the DNAs encoding the proteins identified in the present EXAMPLE, as far as the proteins are functionally equivalent to the proteins identified in the present EXAMPLE.
- Organisms from which the functionally equivalent proteins are isolated include vertebrates such as human, mouse, rat, rabbit, pig and bovine, but are not limited to these animals. Washing conditions of hybridization for the isolation of DNAs encoding the functionally equivalent proteins are usually "1 X SSC, 0.1% SDS, 37°C”; more stringent conditions are "0.5 X SSC, 0.1% SDS, 42°C”; and still more stringent conditions are "0.1 X
- hybridization conditions of the present invention Namely, conditions in which the hybridization is done at "6 X SSC, 40% Formamide, 25°C", and the washing at "1 X SSC, 55°C” can be given. More preferable conditions are those in which the hybridization is done at “6 X SSC, 40% Formamide, 37°C", and the washing at "0.2 X SSC, 55°C”. Even more preferable are those in which the hybridization is done at "6 X SSC, 50% Formamide,
- the amino acid sequences of proteins isolated by using the hybridization techniques usually exhibit high homology to those of the proteins of the present invention.
- the present invention encompasses a polynucleotide comprising a nucleotide sequence that has a high identity to the nucleotide sequence of claim 1 (a).
- the present invention encompasses a peptide, or protein comprising an amino acid sequence that has a high identity to the amino acid sequence encoded by the polynucleotide of claim 1 (b).
- the term "high identity” indicates sequence identity of at least 40% or more; preferably 60% or more; and more preferably 70% or more. Alternatively, more preferable is identity of 90% or more, or 93% or more, or 95% or more, furthermore, 97% or more, or 99% or more.
- the identity can be determined by using the BLAST search algorithm.
- PCR Gene amplification technique
- Gapped BLAST is utilized as described in Altschul et al. (Nucleic Acids Res.25:3389-3402,1997).
- the default parameters of the respective programs e.g., BLASTX and BLASTN are used. See http://www.ncbi.nlm.nih.gov.
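To make the identity thresholds listed above concrete, the following Python sketch computes percent identity from a pair of sequences that have already been aligned (gaps written as "-") and reports the highest threshold that is met. It is not the BLAST algorithm and does not call any BLAST program; the function names are hypothetical, and identity is assumed here to be the number of matching columns divided by the number of aligned columns.

```python
# Thresholds named in the text, from most to least stringent.
THRESHOLDS = (99, 97, 95, 93, 90, 70, 60, 40)


def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences carry the same
    residue. Columns where both sequences have a gap are ignored."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity):
    """Return the most stringent listed threshold satisfied, or None if the
    identity is below 40%."""
    for threshold in THRESHOLDS:
        if identity >= threshold:
            return threshold
    return None
```

Two ten-residue sequences that differ at one aligned position give 90.0% identity, which meets the "90% or more" level but not the "93% or more" level.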
- the present invention also includes a partial peptide of the proteins of the invention.
- the present invention includes an antigen peptide for raising antibodies.
- the peptides to be specific for the protein of the invention comprise at least 7 amino acids, preferably 8 amino acids or more, and more preferably 9 amino acids or more.
- the peptide can be used for preparing antibodies against the protein of the invention, or competitive inhibitors of them, and also screening for a receptor that binds to the protein of the invention.
- the partial peptides of the invention can be produced, for example, by genetic engineering methods, known methods for synthesizing peptides, or digesting the protein of the invention with an appropriate peptidase.
- the present invention also relates to a polynucleotide encoding the protein of the invention.
- the polynucleotide of the invention can be provided in any form as far as it encodes the protein of the invention, and thus includes cDNA, genomic DNA, and chemically synthesized DNA, etc.
- the polynucleotide also includes a DNA comprising any nucleotide sequence that is obtained based on the degeneracy of the genetic code, as far as it encodes the protein of the invention.
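The effect of the degeneracy of the genetic code can be shown by enumerating every DNA sequence that encodes a given short peptide. The Python sketch below builds the standard genetic code table from its usual compact representation and is an illustration only; the names `encodings` and `translate` are hypothetical, and the enumeration is practical only for short peptides because the number of sequences grows multiplicatively with length.

```python
from itertools import product

# Standard genetic code in compact form: codons ordered TTT, TTC, TTA, TTG,
# TCT, ... with bases cycling in the order T, C, A, G; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Reverse lookup: amino acid (one-letter code) -> list of synonymous codons.
SYNONYMS = {}
for codon, amino_acid in CODON_TABLE.items():
    SYNONYMS.setdefault(amino_acid, []).append(codon)


def translate(dna):
    """Translate a DNA sequence codon by codon, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3].upper()] for i in range(0, usable, 3))


def encodings(peptide):
    """Return every DNA sequence that encodes the peptide (one-letter code)."""
    choices = [SYNONYMS[aa] for aa in peptide.upper()]
    return ["".join(codons) for codons in product(*choices)]
```

The dipeptide Met-Phe has two encodings (`ATGTTT` and `ATGTTC`), while the tripeptide Met-Leu-Ser has 36, because Leu and Ser each have six synonymous codons.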
- the polynucleotide of the invention can be isolated by the standard methods such as hybridization using a probe polynucleotide comprising the nucleotide sequence set forth in SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, or SEQ ID NO: 7, or the portions of them, or by PCR using primers that are synthesized based on the nucleotide sequence.
- 4 clones provided by the present invention which have been isolated in the examples mentioned below, are novel and full length cDNA. All the cDNA clones provided by the invention are characterized as follows.
- all the cDNA clones of the present invention comprise full length cDNA: they are obtained by the oligo-capping method and selected, based on the features of their 5'-end sequences, by the score of ATGpr (also described as ATGpr1), which predicts the full length ratio at the 5'-end.
- the cDNA clones of the present invention are those in which the PSORT, which predicts the presence of signal sequences, has found no signal sequence at their 5'-ends and those which have no transmembrane region in their protein coding regions.
- the selected clones were found to be not identical to any human mRNA (therefore, to be novel) by the homology search for their 5'-end sequences.
- the present invention also relates to a vector into which the DNA of the invention is inserted.
- the vector of the invention is not limited as long as it contains the inserted DNA stably.
- vectors such as pBluescript vector (Stratagene) are preferable as a cloning vector.
- expression vectors are especially useful. Any expression vector can be used as far as it is capable of expressing the protein in vitro, in E. coli, in cultured cells, or in vivo.
- pBEST vector (Promega)
- pET vector (Invitrogen)
- ligation utilizing restriction sites can be performed according to the standard method (Current Protocols in Molecular Biology (1987) Ausubel et al. edit, John Wiley & Sons, Section 11.4-11.11).
- the present invention also relates to a transformant carrying the vector of the invention.
- Any cell can be used as a host into which the vector of the invention is inserted, and various kinds of host cells can be used depending on the purposes.
- COS cells or CHO cells can be used, for example.
- Introduction of the vector into host cells can be performed, for example, by the calcium phosphate precipitation method, the electroporation method (Current Protocols in Molecular Biology (1987) Ausubel et al. edit, John Wiley & Sons, Section 9.1-9.9), the Lipofectamine method (GIBCO-BRL), or the microinjection method.
- the present invention also relates to a polynucleotide which specifically hybridizes with a polynucleotide comprising the nucleotide sequence set forth in SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, or SEQ ID NO: 7 encoding the protein of the invention, or its complementary strand, and has a length of at least 15 nucleotides.
- the term "specifically hybridize" means to hybridize with a polynucleotide of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, or SEQ ID NO: 7 encoding the protein of the invention, or its complementary strand, and not with polynucleotides encoding other proteins, under the standard conditions for hybridization, or preferably under stringent conditions.
- Such polynucleotide can be used as a probe for isolation and detection of the polynucleotide of the invention, and as a primer for amplifying the polynucleotide of the present invention.
- the polynucleotide usually has a length of 15 to 100 bp, and preferably has a length of 15 to 35 bp.
- the polynucleotide contains the entire sequence of the polynucleotide of the invention, or at least the portion of it, and has a length of at least 15 bp.
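The length limits and the reference to the complementary strand above can be illustrated with a short Python sketch that derives the reverse complement of a sequence and lists candidate oligonucleotides of a chosen length. The function names and the default length of 20 are hypothetical, and the sketch checks length only; it does not assess hybridization specificity, which the text defines in terms of behavior under standard or stringent conditions.

```python
# Base-pairing map for DNA, preserving case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

MIN_OLIGO_LENGTH = 15   # minimum length stated in the text
MAX_OLIGO_LENGTH = 35   # upper end of the preferred range in the text


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def candidate_oligos(target, length=20):
    """List every window of `length` nucleotides along the target strand."""
    if not MIN_OLIGO_LENGTH <= length <= MAX_OLIGO_LENGTH:
        raise ValueError("length must be between 15 and 35 nucleotides")
    return [target[i:i + length] for i in range(len(target) - length + 1)]


def candidate_oligos_both_strands(target, length=20):
    """Candidate windows from the target and from its complementary strand."""
    return candidate_oligos(target, length) + candidate_oligos(
        reverse_complement(target), length
    )
```

A 22-nucleotide target yields three 20-nucleotide windows per strand, so six candidates in total; a target shorter than the requested length yields none.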
- the polynucleotide of the present invention can be used for examination and diagnosis of the abnormality of the protein of the invention. For example, it is possible to examine the abnormal expression of the gene encoding the protein using the polynucleotide of the invention as a probe for Northern hybridization or as a primer for RT-PCR. Also, the polynucleotide of the invention can be used as a primer for polymerase chain reaction (PCR) such as the genomic DNA-PCR, and RT-PCR to amplify the polynucleotide encoding the protein of the invention, or the regulatory region of the expression, with which it is possible to examine and diagnose the abnormality of the sequence by RFLP analysis, SSCP, and direct sequencing, etc.
- the "polynucleotide which specifically hybridizes with a polynucleotide comprising the nucleotide sequence set forth in SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, or SEQ ID NO: 7 encoding the protein of the invention, or its complementary strand, and has a length of at least 15 nucleotides” includes an antisense polynucleotide for blocking the expression of the protein of the invention.
- the antisense polynucleotide has a length of 15 bp or more, preferably 100 bp or more, and more preferably 500 bp or more, and usually has a length of 3000 bp or less, preferably 2000 bp or less.
- the antisense polynucleotide can be used in the gene therapy of the diseases which are caused by the abnormality of the protein of the invention (abnormal function or abnormal expression).
- Said antisense polynucleotide can be prepared, for example, by the phosphorothioate method ("Physicochemical properties of phosphorothioate oligodeoxynucleotides.” Stein (1988) Nucleic Acids Res. 16: 3209-3221) based on the nucleotide sequence of the polynucleotide encoding the protein (for example, the polynucleotide set forth in SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, or SEQ ID NO: 7).
- the polynucleotide or antisense polynucleotide of the present invention can be used in gene therapy, for example, by administrating it into a patient by the in vivo or ex vivo method with virus vectors such as retrovirus vectors, adenovirus vectors, and adeno-associated virus vectors, or non-virus vectors such as liposome.
- the present invention also relates to antibodies that bind to the protein of the invention.
- antibodies of the invention include polyclonal antibodies, monoclonal antibodies, or their portions that can bind to the protein of the invention. They also include antibodies of all classes. Furthermore, special antibodies such as humanized antibodies are also included.
- the polyclonal antibody of the invention can be obtained according to the standard method by synthesizing an oligopeptide corresponding to the amino acid sequence and immunizing rabbits with the peptide (Current Protocols in Molecular Biology (1987) Ausubel et al., eds., John Wiley & Sons, Section 11.12-11.13).
- the monoclonal antibody of the invention can be obtained according to the standard method by purifying the protein expressed in E. coli, immunizing mice with the protein, and producing a hybridoma cell by fusing the spleen cells and myeloma cells (Current Protocols in Molecular Biology (1987) Ausubel et al., eds., John Wiley & Sons, Section 11.4-11.11).
- the antibody binding to the protein of the present invention can be used for purification of the protein of the invention, and also for detection and/or diagnosis of the abnormalities of the expression and structure of the protein.
- proteins can be extracted, for example, from tissues, blood, or cells, and the protein of the invention is detected by
- the antibody binding to the protein of the present invention can be utilized for treating diseases associated with the protein of the invention.
- human antibodies or humanized antibodies are preferable in terms of their low antigenicity.
- the human antibodies can be prepared by immunizing a mouse whose immune system has been replaced with that of a human ("Functional transplant of megabase human immunoglobulin loci recapitulates human antibody response in mice" Mendez M.J. et al. (1997) Nat. Genet. 15: 146-156).
- the humanized antibodies can be prepared by recombination of the hypervariable region of a monoclonal antibody (Methods in Enzymology (1991) 203: 99-121).
- The invention is illustrated more specifically with reference to the following examples, but is not to be construed as being limited thereto.
- the present invention has provided 4 novel proteins and full length cDNA clones encoding them. It is of great significance that the present invention has provided novel full length human cDNA, since only a few full length human cDNAs have been isolated. Since the full length cDNA clones of the present invention were derived from human, they can be associated with human diseases.
- the genes and proteins associated with diseases are useful as diagnostic markers. In addition, they are useful in medical development as probes for searching a compound that regulates their expression and activities, or as targets of gene therapy.
- EXAMPLE 1 Construction of a cDNA library by the oligo-capping method
- NT-2 neuron progenitor cells (Stratagene), a teratocarcinoma cell line from human embryo testis that can differentiate into neurons upon treatment with retinoic acid, were used.
- the NT-2 cells were cultured according to the manufacturer's instructions as follows.
- NT-2 cells were cultured without induction by retinoic acid treatment (NT2RM1).
- NT-2 cells were induced by adding retinoic acid, and then were cultured for 48 hours (NT2RP1).
- NT-2 cells were induced by adding retinoic acid, and then were cultured for 2 weeks (NT2RP2).
- HEMBA1 human embryo-derived tissues that were enriched with brain
- poly(A)+ RNA was purified from the mRNA using oligo-dT cellulose.
- Each poly(A)+ RNA was used to construct a cDNA library by the oligo-capping method.
- the DraIII-cleaved pUC19FL3 vector (Figure 1; for NT2RM1 and NT2RP1) or the DraIII-cleaved pME18SFL3 vector (Figure 1; GenBank AB009864, expression vector; for NT2RP2 and HEMBA1) was used for cloning the cDNA in a unidirectional manner, and cDNA libraries were obtained. Clones having an insert cDNA of 1 kb or less in length were discarded from the cDNA libraries.
- the nucleotide sequences of the 5'- and 3'-ends of the cDNA clones were analyzed with a DNA sequencer (ABI PRISM 377, PE Biosystems) after sequencing reactions were performed with DNA sequencing reagents (Dye Terminator Cycle Sequencing FS Ready Reaction Kit, dRhodamine Terminator Cycle Sequencing FS Ready Reaction Kit, or BigDye Terminator Cycle Sequencing FS Ready Reaction Kit, from PE Biosystems) according to the instructions.
- the full length-enriched cDNA libraries of NT2RP2 and HEMBA1 were constructed using the eukaryotic expression vector pME18SFL3.
- the vector contains the SRα promoter and the SV40 small t intron upstream of the cloning site, and the SV40 poly(A) addition signal sequence downstream.
- because the cloning site of pME18SFL3 has asymmetrical DraIII sites, and the ends of the cDNA fragments contain SfiI sites complementary to the DraIII sites, the cloned cDNA fragments can be inserted downstream of the SRα promoter unidirectionally. Therefore, clones containing full length cDNA can be expressed transiently by introducing the obtained plasmid directly into COS cells.
- the clones can be analyzed very easily in terms of the proteins that are the gene products of the clones, or in terms of the biological activities of the proteins.
- the fullness ratio at the 5'-end sequences of the cDNA clones in the libraries constructed by the oligo-capping method was determined as follows. Of all the clones whose 5'-end sequences were found in those of known human mRNA in the public database, a clone was judged to be "full length" if it had a longer 5'-end sequence than that of the known human mRNA or, even though the 5'-end sequence was shorter, if it contained the translation initiation codon. A clone which did not contain the translation initiation codon was judged to be "not-full length".
- the fullness ratio ((the number of full length clones)/(the number of full length and not-full length clones)) at the 5'-end of the cDNA clones from each library was determined by comparing with the known human mRNA (NT2RM1: 69%; NT2RP1: 75%; NT2RP2: 62%; HEMBA1: 53%). The result indicates that the fullness ratio at the 5'-end sequence was extremely high.
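The judgment rule and the ratio described above can be sketched in a few lines of Python. This is only an illustrative sketch: the `CloneComparison` record, its field names, and the example data are assumptions introduced here, not part of the described method's actual implementation.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class CloneComparison:
    """Hypothetical record: one clone compared with a known human mRNA."""
    longer_than_known_mrna: bool  # clone 5'-end extends beyond the known mRNA
    has_initiation_codon: bool    # clone contains the translation initiation codon


def is_full_length(clone: CloneComparison) -> bool:
    # "Full length" if the 5'-end is longer than the known mRNA, or,
    # even when shorter, if the translation initiation codon is present.
    return clone.longer_than_known_mrna or clone.has_initiation_codon


def fullness_ratio(clones: Iterable[CloneComparison]) -> float:
    # (number of full length clones) /
    # (number of full length and not-full length clones)
    clones = list(clones)
    if not clones:
        raise ValueError("at least one clone is required")
    full = sum(is_full_length(c) for c in clones)
    return full / len(clones)


# Made-up example data: two full length and two not-full length clones.
example = [
    CloneComparison(True, True),
    CloneComparison(False, True),
    CloneComparison(False, False),
    CloneComparison(False, False),
]
print(fullness_ratio(example))  # 0.5
```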
- the relationship between the cDNA libraries and the clones is shown below.
- NT2RM1: PSEC0006
- NT2RP1: PSEC0043
- NT2RP2: PSEC0058
- HEMBA1: PSEC0211
- the ATGpr developed by Salamov A.A., Nishikawa T., and Swindells M.B. in the Helix
- ESTiMateFL, developed by Nishikawa and Ota at the Helix Research Institute, is a method for selecting clones with a high fullness ratio by comparison with the 5'-end or 3'-end sequences of ESTs in the public database.
- a cDNA clone is presumed not to be full length if any ESTs exist that have longer 5'-end or 3'-end sequences than the clone.
- the method is systematized for high throughput analysis.
- a clone is judged to be full length if it has a longer 5'-end sequence than the ESTs in the public database. Even if a clone has a shorter 5'-end, it is judged to be full length, for convenience, if the difference in length is within 50 bases; otherwise it is judged not to be full length.
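The 50-base rule above can be expressed as a small Python function. This is a sketch under stated assumptions, not the ESTiMateFL implementation: the function name, the representation of each EST as an "overhang" (the number of bases by which the EST extends beyond the clone's 5'-end, negative when the clone is longer), and the `None` result for a clone with no matching ESTs are all hypothetical choices made for illustration.

```python
from typing import Optional, Sequence

TOLERANCE_BASES = 50  # "within 50 bases", as stated in the text


def judge_5p_full_length(
    est_overhangs: Sequence[int],
    tolerance: int = TOLERANCE_BASES,
) -> Optional[bool]:
    """Judge a clone against ESTs at the 5'-end.

    est_overhangs: for each matching EST, the number of bases by which the
    EST extends beyond the clone's 5'-end (negative or zero if the clone
    reaches at least as far as the EST).
    Returns None when there are no ESTs to compare with (an assumption;
    the text does not describe this case).
    """
    if not est_overhangs:
        return None
    # The clone is judged full length if no EST extends more than
    # `tolerance` bases beyond the clone's 5'-end.
    return max(est_overhangs) <= tolerance


print(judge_5p_full_length([-12, -3]))  # clone longer than every EST: True
print(judge_5p_full_length([30, -5]))   # shorter, but within 50 bases: True
print(judge_5p_full_length([51]))       # shorter by more than 50 bases: False
```

Expressing the comparison as a single maximum keeps the rule easy to apply in bulk, which fits the high-throughput use described in the text.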
- the accuracy of the prediction by comparing cDNA clones with ESTs is improved with increasing number of ESTs to be compared.
- the method is effective in excluding clones with a high probability of being not-full length from the cDNA clones that are synthesized by the oligo-capping method and that have 5'-end sequences with a fullness ratio of about 60%.
- ESTiMateFL can be used efficiently to estimate the fullness ratio at the 3'-end sequence of cDNA from an unknown human mRNA for which a significant number of ESTs exist in the public database. The results are summarized in Tables 4 and 5.
- the number of full length clones, the number of not-full length clones, and the fullness ratio indicate, respectively, the number of clones that contain the N-terminus of the ORF, the number of clones that do not contain the N-terminus of the ORF, and the ratio (the number of full length clones)/(the number of full length and not-full length clones).
- Table 4 The fullness ratio at the 5'-end sequence of the cDNA clones that were judged to be full length by comparing the ORF of the known human mRNA and that were obtained by the oligo-capping method, wherein the ratio was evaluated by comparing the cDNA clones with ESTs. Columns: maximal ATGpr1 score; number of full length clones; number of not-full length clones; fullness ratio.
- Table 5 The fullness ratio at the 5'-end sequence of the cDNA clones that were judged to be not-full length by comparing the ORF of the known human mRNA and that were obtained by the oligo-capping method, wherein the ratio was evaluated by comparing the cDNA clones with ESTs. Columns: maximal ATGpr1 score; number of full length clones; number of not-full length clones; fullness ratio.
- PSEC0006-PSEC0058 were selected by the presence of an ORF (open reading frame: a region translated into amino acids) in the 5'-end sequence.
- the clones were not selected by the ATGpr score of the data of the 5'-end sequence (one pass sequencing).
- PSEC0211 was selected as a clone having a maximal ATGpr1 score of 0.7 or higher and containing an ORF in the 5'-end sequence.
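The score-based selection criterion stated for PSEC0211 can be written as a simple predicate. This is a minimal sketch: the function name and parameters are hypothetical, and only the 0.7 threshold and the requirement of an ORF in the 5'-end sequence come from the text.

```python
ATGPR1_THRESHOLD = 0.7  # threshold stated in the text


def selected_by_atgpr(
    max_atgpr1_score: float,
    has_orf_in_5p_end: bool,
    threshold: float = ATGPR1_THRESHOLD,
) -> bool:
    # A clone passes this criterion only if its maximal ATGpr1 score is at
    # least the threshold AND its 5'-end sequence contains an ORF.
    return has_orf_in_5p_end and max_atgpr1_score >= threshold


print(selected_by_atgpr(0.7, True))   # True
print(selected_by_atgpr(0.17, True))  # False: below the threshold
print(selected_by_atgpr(0.9, False))  # False: no ORF in the 5'-end sequence
```

A clone that fails this predicate is not necessarily excluded: as the text notes, clones may instead be selected by the presence of an ORF in the 5'-end sequence.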
- nucleotide sequences of the full length cDNA and the deduced amino acid sequences were determined.
- the nucleotide sequences were finally determined by completely overlapping the partial nucleotide sequences determined by the three methods mentioned below.
- the amino acid sequences were deduced from the determined cDNA sequences. The results are shown in the SEQUENCE LISTING.
- the four clones are defined as "the clones that are predicted to be full length cDNA clones by the ATGpr, etc. among a human cDNA library that was constructed by the oligo-capping method and that has high full length ratio."
- PSEC0058, which has a low ATGpr1 score (ATGpr1 0.17)
- PSEC0058 is not a clone selected by the ATGpr.
- the comparison between PSEC0058 and corresponding ESTs revealed that PSEC0058 is longer than any of the ESTs.
Landscapes
- Chemical & Material Sciences (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Organic Chemistry (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Biochemistry (AREA)
- Biophysics (AREA)
- Zoology (AREA)
- Genetics & Genomics (AREA)
- Medicinal Chemistry (AREA)
- Gastroenterology & Hepatology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Toxicology (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Preparation Of Compounds By Using Micro-Organisms (AREA)
- Peptides Or Proteins (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
- Investigating Or Analysing Biological Materials (AREA)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU60227/00A AU6022700A (en) | 1999-07-23 | 2000-07-21 | Full length cdna clones and proteins encoded thereby |
JP2001512876A JP2003516717A (en) | 1999-07-23 | 2000-07-21 | Full-length cDNA clone and protein encoding it |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP20981799 | 1999-07-23 | ||
JP11/209817 | 1999-07-23 | ||
US15952899P | 1999-10-18 | 1999-10-18 | |
US60/159,528 | 1999-10-18 |
Publications (3)
Publication Number | Publication Date |
---|---|
WO2001007607A2 (en) | 2001-02-01 |
WO2001007607A3 (en) | 2001-05-17 |
WO2001007607A8 (en) | 2001-06-21 |
Family
ID=26517680
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2000/004895 WO2001007607A2 (en) | 1999-07-23 | 2000-07-21 | FULL LENGTH cDNA CLONES AND PROTEINS ENCODED THEREBY |
Country Status (3)
Country | Link |
---|---|
JP (1) | JP2003516717A (en) |
AU (1) | AU6022700A (en) |
WO (1) | WO2001007607A2 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1293569A2 (en) * | 2001-09-14 | 2003-03-19 | Helix Research Institute | Full-length cDNAs |
EP1308459A2 (en) * | 2001-11-05 | 2003-05-07 | Helix Research Institute | Full-length cDNA sequences |
US6943241B2 (en) | 2001-11-05 | 2005-09-13 | Research Association For Biotechnology | Full-length cDNA |
US6979557B2 (en) | 2001-09-14 | 2005-12-27 | Research Association For Biotechnology | Full-length cDNA |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1999020750A1 (en) * | 1997-10-22 | 1999-04-29 | Helix Research Institute | METHOD FOR SCREENING FULL-LENGTH cDNA CLONES |
2000
- 2000-07-21 WO PCT/JP2000/004895 patent/WO2001007607A2/en active Application Filing
- 2000-07-21 JP JP2001512876A patent/JP2003516717A/en active Pending
- 2000-07-21 AU AU60227/00A patent/AU6022700A/en not_active Abandoned
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1999020750A1 (en) * | 1997-10-22 | 1999-04-29 | Helix Research Institute | METHOD FOR SCREENING FULL-LENGTH cDNA CLONES |
Non-Patent Citations (2)
Title |
---|
DATABASE EMBL SEQUENCES [Online] Accession No. AA631935, 31 October 1997 (1997-10-31) CHEN X. ET AL.: "fmfc2 regional genomic DNA specific cDNA library, H. sapiens cDNA clone CR18-9." XP002150622 * |
SHERIDAN K.M. & MALTESE W.A.: "Expression of Rab3a GTPase and other synaptic proteins is induced in differentiated NT2N neurons." J. MOL. NEUROSCIENCE, vol. 10, April 1998 (1998-04), pages 121-128, XP000952944 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1293569A2 (en) * | 2001-09-14 | 2003-03-19 | Helix Research Institute | Full-length cDNAs |
EP1293569A3 (en) * | 2001-09-14 | 2004-03-31 | Research Association for Biotechnology | Full-length cDNAs |
US6979557B2 (en) | 2001-09-14 | 2005-12-27 | Research Association For Biotechnology | Full-length cDNA |
EP1308459A2 (en) * | 2001-11-05 | 2003-05-07 | Helix Research Institute | Full-length cDNA sequences |
EP1308459A3 (en) * | 2001-11-05 | 2003-07-09 | Research Association for Biotechnology | Full-length cDNA sequences |
US6943241B2 (en) | 2001-11-05 | 2005-09-13 | Research Association For Biotechnology | Full-length cDNA |
Also Published As
Publication number | Publication date |
---|---|
WO2001007607A3 (en) | 2001-05-17 |
AU6022700A (en) | 2001-02-13 |
WO2001007607A8 (en) | 2001-06-21 |
JP2003516717A (en) | 2003-05-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7560541B2 (en) | Heart20049410 full-length cDNA and polypeptides | |
EP1067182A2 (en) | Secretory protein or membrane protein | |
US20040248256A1 (en) | Secreted proteins and polynucleotides encoding them | |
PT1650221E (en) | Novel compounds | |
KR100270348B1 (en) | Transcription factor aprf | |
WO2001007607A2 (en) | FULL LENGTH cDNA CLONES AND PROTEINS ENCODED THEREBY | |
EP1197554A1 (en) | Proliferation differentiation factor | |
JP3517988B2 (en) | Human McCard-Joseph disease-related protein, cDNA and gene encoding the protein, vector containing the DNA or gene, host cell transformed with the expression vector, method for diagnosing and treating McCard-Joseph disease | |
JPH11215987A (en) | Tsa 305 gene | |
JP2000308488A (en) | Protein which is related to blood vessel neogenesis and gene coding for the protein | |
WO2001060855A1 (en) | A novel human cell cycle control-related protein and a sequence encoding the same | |
US6908765B1 (en) | Polypeptide—human SR splicing factor 52 and a polynucleotide encoding the same | |
WO2002022676A1 (en) | A longevity guarantee protein and its encoding sequence and use | |
WO2001029228A1 (en) | A novel polypeptide, a human casein kinase 48 and the polynucleotide encoding the polypeptide | |
WO2001030818A1 (en) | A novel polypeptide-rna binding protein 33 and polynucleotide encoding said polypeptide | |
US20060003375A1 (en) | Novel polypeptide - human retinoic acid-responsive protein 53.57 and a polynucleotide encoding the same | |
WO2001031001A1 (en) | A novel polypeptide, a translation initiation factor helper factor 28 and the polynucleotide encoding the polypeptide | |
WO2001029075A1 (en) | A novel polypeptide-g-protein activating protein 129 and the polynucleotide encoding the polypeptide | |
WO2001032698A1 (en) | A novel polypeptide-human autoimmune disease associated protein 16 and the polynucleotide encoding said polypeptide | |
WO2004074302A2 (en) | Autosomal recessive polycystic kidney disease nucleic acids and peptides | |
WO2001038389A1 (en) | A new polypeptide-ribosomal protein l14.22 and the polynucleotide encoding it | |
CA2470178A1 (en) | Mammalian simp protein, gene sequence and uses thereof in cancer therapy | |
WO2001030823A1 (en) | A novel polypeptide-human circular canal protein 69 and the polynucleotide encoding said polypeptide | |
WO2001038380A1 (en) | A novel polypeptide - human cell nucleus regulatory protein 56 and a polynucleotide encoding the same | |
WO2001027283A1 (en) | A novel polypeptide, a human reverse transcriptase like protein 16 and the polynucleotide encoding the polypeptide |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states | Kind code of ref document: A2 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
AL | Designated countries for regional patents | Kind code of ref document: A2 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
AK | Designated states | Kind code of ref document: A3 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
AL | Designated countries for regional patents | Kind code of ref document: A3 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
AK | Designated states | Kind code of ref document: C1 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
AL | Designated countries for regional patents | Kind code of ref document: C1 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
WR | Later publication of a revised version of an international search report | ||
REG | Reference to national code | Ref country code: DE Ref legal event code: 8642 |
122 | Ep: pct application non-entry in european phase |