WO2005087949A1 - Systematic mapping of adenosine to inosine editing sites in the human transcriptome - Google Patents

Systematic mapping of adenosine to inosine editing sites in the human transcriptome

Info

Publication number
WO2005087949A1
Authority
WO
WIPO (PCT)
Prior art keywords
proteins
desc
acc
cancer
diseases
Prior art date
Application number
PCT/IL2005/000286
Other languages
French (fr)
Inventor
Erez Levanon
Eli Eisenberg
Rodrigo Yelin
Sergey Nemzer
Ronen Shemesh
Original Assignee
Compugen Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Compugen Ltd.
Publication of WO2005087949A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10 Sequence alignment; Homology search

Definitions

  • the present invention is of a method for detecting RNA editing sites, as well as uses of this method (for example for diagnostic uses).
  • the present invention also comprises the located RNA editing sites themselves.
  • RNA editing by members of the double-stranded RNA-specific ADAR family leads to site-specific conversion of adenosine to inosine (A-to-I) in precursor messenger RNAs 1 .
  • Editing by ADARs is believed to occur in all metazoa, and is essential for mammalian development 2-5 .
  • ADAR-mediated RNA editing is essential for normal life and development in both invertebrates and vertebrates .
  • ADAR-deficient invertebrates show only behavioural defects, while ADAR1 knock-out mice die embryonically and ADAR2 null mice live to term but die prematurely 4,5 .
  • High editing levels were found in inflamed tissues, in agreement with a proposed antiviral function of ADARs and their transcriptional regulation by interferon 8 .
  • Altered editing patterns were found in epileptic mice 9 , suicide victims suffering chronic depression 10 and in malignant gliomas 11 .
  • SNPs single nucleotide polymorphisms
  • mutations are erroneously identified as editing events by this method.
  • the background art does not teach or suggest many RNA editing sites, as previous attempts to locate such sites were neither sufficiently systematic nor sufficiently successful to uncover the vast majority of RNA editing sites.
  • the present invention is of a method for searching for RNA editing sites. According to preferred embodiments, the method features searching for ADAR editing sites in the human transcriptome.
  • the method of the present invention was validated by searching millions of available expressed sequences to map A-to-I editing sites. A much larger number of A-to-I editing sites were mapped in many different genes, with an estimated accuracy of 95%, raising the number of known editing sites by two orders of magnitude.
  • the method was experimentally validated by verifying the occurrence of editing in 28 novel substrates.
  • A-to-I editing in humans primarily occurs in non-coding regions, typically in Alu repeats. Within Alu sequences, specific hotspots for editing were identified. Remarkably, a significant fraction of editing events result in the stabilization of the double-stranded RNA (dsRNA) structure, while only 3% have a neutral effect on pairing.
  • ADAR substrates are usually imperfect dsRNA stems formed by base pairing of an exon containing the adenosine to be edited with a complementary portion of the pre-mRNA (up to several thousand nucleotides apart).
  • the search for mismatches was restricted to potential double-stranded regions, in order to remove most of the noise and facilitate the identification of true editing sites.
  • human ESTs and cDNAs were aligned to the genome and assembled into clusters representing genes or partial genes, as described in Shoshan et al 18 .
  • the method of the present invention aligned the expressed part of the gene with the corresponding genomic region, looking for reverse complement alignments longer than 30nt with identity levels higher than 85% (see figure 1).
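The reverse-complement criterion above can be illustrated with a minimal sketch. It assumes an ungapped comparison of two equal-length DNA strings, which is a simplification of a real alignment; the function names and default parameters are hypothetical and only mirror the 30nt and 85% thresholds quoted in the text.

```python
# Hypothetical sketch: ungapped reverse-complement pairing check.
# A real pipeline would use a gapped local aligner; this only shows the thresholds.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def is_candidate_dsrna(expressed: str, genomic: str,
                       min_len: int = 30, min_identity: float = 0.85) -> bool:
    """True if `expressed` pairs with `genomic` as a reverse complement
    over more than `min_len` nt with identity above `min_identity`."""
    if len(expressed) != len(genomic) or len(expressed) <= min_len:
        return False
    partner = reverse_complement(genomic)
    matches = sum(a == b for a, b in zip(expressed, partner))
    return matches / len(expressed) > min_identity
```

In this sketch an exon segment and an oppositely oriented intronic segment would be passed as `expressed` and `genomic`; pairs that pass are candidate double-stranded regions in which mismatches are then searched.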
  • About 429,000 candidate dsRNAs were found in 14,512 different genes, mostly resulting from alignment of an exon to an intron.
  • additional filters are preferably featured. Since sequencing errors tend to cluster in certain regions, especially in low-complexity areas and towards sequence ends, preferably an optional filter discards all single-letter repeats longer than 4nt, as well as 150nt at both ends of each sequence. In addition, all 50nt-wide windows in which the total number of mismatches is 6 or more were considered as having low sequencing quality and were discarded according to another optional filter. However, 4 or more identical sequential mismatches were masked in the count for mismatches in a given window.
  • This exception (according to a preferred embodiment of the filter) is intended to retain sequences with many sequential editing sites, which were found to occur in previously documented examples. Mismatches supported by less than 5% of available sequences were also discarded according to another optional but preferred filter, and, finally, known SNPs of genomic origin were removed. Employing those criteria one finds that the putative editing sites tend to group together, a fact which is also supported by the few available known cases. Thus, all mismatches that occur less than three times in an exon were ignored according to still another optional but preferred filter. The above described filtering (cleaning) procedure resulted almost exclusively in A-to-G mismatches (see figure 2).
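The window-based quality filter and its masking exception can be sketched as follows. This is an illustrative reading, not the patented implementation: it assumes mismatches are given as (position, type) pairs, and it interprets "identical sequential mismatches" as consecutive entries of the same type in position order. Function names are hypothetical.

```python
# Hypothetical sketch of the 50nt-window filter with the run-masking exception.
def mask_runs(mismatches, run_len=4):
    """Drop runs of >= run_len consecutive mismatches of the same type
    (e.g. many sequential A>G sites), keeping everything else."""
    ordered = sorted(mismatches)          # list of (position, type)
    kept, i = [], 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j][1] == ordered[i][1]:
            j += 1
        if j - i < run_len:               # short run: still counts as noise
            kept.extend(ordered[i:j])
        i = j
    return kept


def bad_windows(mismatches, seq_len, window=50, threshold=6, run_len=4):
    """Return start offsets of windows holding >= threshold counted mismatches."""
    counted = [pos for pos, _ in mask_runs(mismatches, run_len)]
    bad = []
    for start in range(0, max(seq_len - window, 0) + 1):
        n = sum(start <= p < start + window for p in counted)
        if n >= threshold:
            bad.append(start)
    return bad
```

A sequence with six A>G mismatches in a row is retained by this sketch, while six mismatches of mixed types within 50nt flag the window as low quality.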
  • the method of the present invention resulted in the identification of 12,723 putative editing sites, belonging to 1,637 different genes.
  • Detailed information on the RNA editing sites and the respective transcript annotations is disclosed in the "flank_clean" and "Ann_clean" files in the attached CD-ROM.
  • the same approach applied to G-to-A mismatches yielded only 242 sites.
  • Sequencing errors, SNPs and mutations, which were determined to be significant sources of noise in the analysis, are expected to produce at least as many G-to-As as A-to-Gs (see figure 2). This signal-to-noise ratio (242/12637) suggests that the false positive rate for the method according to the present invention is very low.
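The ratio quoted above is simple arithmetic; a short sketch makes the magnitude explicit. The function name is hypothetical, and treating the G-to-A count as an estimate of the noise level among A-to-G calls is the reasoning given in the text, not an independent measurement.

```python
# Hypothetical helper: noise estimate from the control (G-to-A) mismatch count.
def noise_to_signal(control_sites: int, candidate_sites: int) -> float:
    """Fraction of candidate sites that noise alone would be expected to explain."""
    return control_sites / candidate_sites


ratio = noise_to_signal(242, 12637)
print(f"{ratio:.1%}")
```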
  • the method comprises the detection of editing in liver, lung, kidney, prostate, and uterine tissues (see Example 1). Such editing was not previously known to occur.
  • the present invention comprises the use of RNA editing in any one or more of liver, lung, kidney, prostate or uterine tissues, more preferably the detection of RNA editing, and most preferably for diagnosis.
  • Preferred embodiments of the present invention also optionally and preferably comprise a kit for detecting RNA editing in any one or more of liver, lung, kidney, prostate or uterine tissues, as well as a method for detecting RNA editing in one or more of these tissues.
  • Preferred embodiments of the present invention also optionally and preferably comprise a method of treating a disease in a subject by modulating RNA editing in any one or more of liver, lung, kidney, prostate or uterine tissues. For most genes, editing was found in all tissues, with varying relative abundance, but generally the unedited signal dominated the edited signal.
  • a method of identifying an RNA editing substrate comprising: identifying nucleic acid sequence exhibiting a base pair mismatch in a stem region thereof, the nucleic acid sequence being the RNA editing substrate.
  • the stem region is identified by: detecting an exon capable of forming a double stranded region in the nucleic acid sequence, wherein the exon features an adenosine.
  • the method further comprises filtering the nucleic acid sequence to remove a section of repeated nucleotides before the identifying the nucleic acid sequence. More preferably, the section comprises at least four repeated nucleotides.
  • the method further comprises filtering the nucleic acid sequence wherein at least a portion of the nucleic acid sequence is discarded if the portion features more than a threshold number of mismatches before the identifying the nucleic acid sequence.
  • the portion comprises at least about 20 nucleotides and the threshold number comprises at least about three mismatches. More preferably, if the portion features at least about two identical sequential mismatches, the portion is not discarded.
  • the portion comprises at least about 50 nucleotides and the threshold number comprises at least about six mismatches. Preferably, if the portion features at least about four identical sequential mismatches, the portion is not discarded.
  • the RNA editing substrate is detected in a tissue comprising at least one of liver, lung, kidney, prostate, or uterine tissue.
  • the method further comprises: diagnosing a disease or pathological condition in a subject by detecting RNA editing in at least one of the tissues.
  • the diagnosing is performed by determining whether RNA editing in a nucleotide sequence of the subject differs from a normal nucleotide sequence.
  • a kit for diagnosing a subject comprising at least one component for detecting RNA editing as described herein.
  • the at least one component comprises an oligonucleotide.
  • the oligonucleotide hybridizes to the nucleotide sequence for detecting RNA editing.
  • the oligonucleotide comprises a pair of oligonucleotides for amplifying at least a portion of the nucleotide sequence for detecting RNA editing.
  • a computer readable storage medium comprising data stored in a retrievable manner, the data including sequence information of RNA editing substrates as set forth in files "flan_for_all" and "flan_clean" of enclosed CD-ROM1, and corresponding sequence annotations as set forth in the file "Ann_for_all" and "Ann_clean" of enclosed CD-ROM1.
  • sequence information of RNA editing substrates as set forth in files "flan_for_all" and "flan_clean" of enclosed CD-ROM1
  • corresponding sequence annotations as set forth in the file "Ann_for_all" and "Ann_clean" of enclosed CD-ROM1.
  • any identified RNA editing site as described herein or as derivable from the methods described herein, optionally as described herein, for diagnostic assays, drug targets, expressed sequences suitable for therapeutic proteins, and gene therapy to fix aberrant and/or pathological RNA editing.
  • a diagnostic assay comprising an assay for determining an RNA editing pattern in a sample taken from an individual, optionally as described herein.
  • the method is performed on a multi-probe chip, the chip comprising a plurality of probes for detecting a presence or an absence of at least one RNA editing site in the sample, optionally as described herein.
  • a diagnostic method for determining an RNA editing pattern in a sample taken from an individual comprising: determining an RNA editing pattern in the sample to form a test pattern; and comparing the test pattern to a standard pattern, optionally as described herein.
  • the standard pattern is optionally related to disease or pathology, and/or to normalcy or "health".
  • the method further comprises: at least partially diagnosing the individual according to the comparison.
  • the disease comprises cancer.
  • a method for detecting cancer in a subject or a disposition or tendency or susceptibility thereto comprising analyzing RNA editing in the subject, optionally as described herein.
  • Inset b shows the distribution of mismatches resulting from applying the algorithm to random expressed sequences covering about 20% of the transcriptome.
  • Insets c and d show the distributions for known SNPs and mutations, respectively. A-to-G mismatches do not stand out in the distributions b-d.
  • Figure 3 Editing in the F11 receptor (JAM1) gene. Top: some of the publicly available expressed sequences covering this gene, together with the corresponding genomic sequence. The evidence for editing is highlighted. Bottom: Results of sequencing experiments. Matching DNA and cDNA RNA sequences for a number of tissues. Editing is characterized by a trace of guanosine in the cDNA RNA sequence, where the DNA sequence exhibits only adenosine signals (highlighted).
  • Figure 4 multiple alignment of the genomic sequence and the expressed sequences within the NARF gene, undergoing RNA editing. The nucleotide positions involved in RNA editing are marked. The Genbank accession numbers of the sequences appear at the right of the alignment.
  • Figure 5 multiple alignment of the genomic sequence and the expressed sequences within the HSPC274 gene, undergoing RNA editing. The nucleotide positions involved in RNA editing are marked. The Genbank accession numbers of the sequences appear at the right of the alignment.
  • Figure 6 multiple alignment of the genomic sequence and the expressed sequences within the FLJ25952 hypothetical protein, undergoing RNA editing.
  • the editing site is at position 601, where the codon UAU(Y) is edited into UGU(C). Structures for the other substrates are given in figures below.
  • B Conservation levels at the editing genomic locus. The two red bars at the bottom mark the editing region and the intronic sequence almost perfectly pairing with it to form the hairpin structure shown in (A). The editing site is marked in black within the left red bar. The high conservation level of the intronic sequence, suggesting a functional importance, supports its identification as necessary for the editing process.
  • Figure 9 Distributions of the different types of simple substitution SNPs. (a) all SNPs (b) SNPs inferred from expressed data only (c) SNPs within Alu repetitive elements (d) SNPs within Alu elements inferred from expressed data only.
  • FIG. 10 An editing site in the eukaryotic translation initiation factor (eIF3k) locus, erroneously identified as SNPs.
  • eIF3k eukaryotic translation initiation factor
  • A some of the publicly available expressed sequences covering this gene, together with the corresponding genomic sequence. The location of the dbSNP SNP record is indicated at the bottom. The editing location is highlighted in green for non-edited sequences and in red for edited sequences.
  • B Experimental results: sequencing matching human DNA and cDNA RNA sequences.
  • Editing is characterized by a trace of guanosine (black) in the cDNA RNA sequence, where the DNA sequence exhibits only adenosine signals (green).
  • Figure 11 Editing sites in the ribosomal protein S19 (RPS19) locus, erroneously identified as SNPs.
  • RPS19 ribosomal protein S19 locus
  • Figure 12 shows illustrative sequencing results for an exemplary RNA editing site for the BLCAP gene as described below.
  • Figures 13-16 show secondary structure as predicted by MFOLD for CYFIP2 (Figure 13), FLNA (Figure 14), BLCAP (Figure 15) and IGFBP7 (Figure 16), respectively.
  • Figure 17 shows the content of Appendix 5 (mouse and chicken sequences).
  • Preferred embodiments of the present invention comprise a method for detecting RNA editing, as well as methods of using such detection (for example for diagnosis), and/or methods for treating a disease by modulating RNA editing. According to preferred embodiments of the present invention, these activities are performed with regard to RNA editing in any one or more of liver, lung, kidney, prostate or uterine tissues. Altered editing patterns have been found to be associated with inflammation 16 , epilepsy 17 , depression 18 , ALS and malignant gliomas 19 .
  • differential RNA editing is used to diagnose the following diseases: inflammation, depression, ALS, cancer and epilepsy.
  • the level of RNA editing preferably is lower than in normal samples.
  • a single gene was found (in cancerous tissue samples) to have lower levels of RNA editing than for normal (non-cancerous) tissue samples.
  • RNA editing is lower in cancerous tissue than in non-cancerous tissue, although at least the level of editing may be modulated (raised or lowered) in cancerous tissue as compared to normal tissue.
  • the cancer comprises brain cancer.
  • modulated RNA editing is preferably found in one or more of the following genes for diagnosing cancer, more preferably brain cancer, most preferably malignant glioma (and also most preferably a lowered level of RNA editing): BLCAP; FLNA; CYFIP2; or IGFBP7.
  • the presence of differential RNA editing in cancerous tissue, preferably brain cancer, most preferably malignant glioma may optionally and preferably be determined by comparing RNA editing in cancerous tissue to such editing in normal tissue, most preferably to detect a different level of RNA editing in cancerous tissue which is optionally and most preferably a lower level of RNA editing.
  • Illustrative cancers that may optionally be diagnosed with the present invention include but are not limited to bile duct, bladder, bone, bowel (colon and/or rectal cancer), brain (including but not limited to acoustic neuroma, astrocytoma, central nervous system lymphoma, ependymoma, haemangioblastoma, medulloblastoma, meningioma, mixed gliomas, malignant glioma, oligodendroglioma, pineal region tumours or pituitary tumors), breast, carcinoid (including but not limited to carcinoid cancers of the neuroendocrine system, optionally including but not limited to cancers of the appendix, small intestine, lung or pancreas), cervical, eye, gall bladder, esophageal, cancers of the head and neck, kidney, larynx, leukemia (acute lymphoblastic, acute myeloid, chronic lymphocytic, chronic my
  • differential levels of RNA editing may be determined for a gene, a plurality of genes, an entire tissue (or a plurality of tissues), a genetic locus (or a plurality of such loci) or for a tissue sample.
  • the subject could optionally give a urine sample, after which RNA editing could be determined for any of these items.
  • the ratio of adenosine to inosine could optionally be measured in the urine sample, and compared to that of a normal subject (without prostate cancer).
  • tissue samples for use with the present invention include but are not limited to blood, serum, plasma, blood cells, urine, sputum, saliva, stool, spinal fluid or CSF, lymph fluid, the external secretions of the skin, respiratory, intestinal, and genitourinary tracts, tears, milk, neuronal tissue and/or any other tissue of the brain, CNS and/or peripheral nervous system, lung tissue, any human organ or tissue, including any tumor or normal tissue, any sample obtained by lavage (for example of the bronchial system or of the breast ductal system), and also samples of in vivo cell culture constituents.
  • For the RNA editing sites and related sequences described herein, as well as for all such editing sites discoverable according to the methods of the present invention, there are many potential applications, including but not limited to diagnostic assays, drug targets, expressed sequences suitable for therapeutic proteins, and gene therapy to fix aberrant and/or pathological RNA editing.
  • diagnostic assays optionally and preferably a suitable method and/or assay would include determining an RNA editing pattern in an individual subject, and comparing this test pattern to a known standard pattern.
  • the standard pattern could optionally be related to disease or pathology, and/or to normalcy or "health".
  • the comparison could then preferably be used to at least assist in the diagnosis of the individual, for example to determine whether the individual is suffering from (or alternatively lacks) a particular disease or pathological state.
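The comparison of a test pattern with a standard pattern can be sketched as below. This is a minimal illustration under stated assumptions: editing patterns are represented as mappings from a site identifier to an editing level between 0 and 1, the site identifiers are made up, and the tolerance value is arbitrary. None of these details come from the source.

```python
# Hypothetical sketch: compare per-site editing levels against a standard pattern.
def compare_editing_patterns(test, standard, tolerance=0.1):
    """Return {site: difference} for sites whose editing level deviates
    from the standard by more than `tolerance` (negative = lower editing)."""
    deviating = {}
    for site, expected in standard.items():
        observed = test.get(site)
        if observed is None:              # site not measured in this sample
            continue
        delta = observed - expected
        if abs(delta) > tolerance:
            deviating[site] = round(delta, 3)
    return deviating
```

A negative difference corresponds to the lowered editing level that the text describes as the preferred indicator in cancerous tissue; any decision rule built on top of such differences would need its own validation.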
  • a diagnostic assay could optionally be adapted from a chip-based assay for detecting SNPs (single nucleotide polymorphisms), as described for example with regard to US Patent No. 6,368,799, hereby incorporated by reference as if fully set forth herein.
  • a non-limiting description of an exemplary, illustrative assay for detecting RNA editing patterns is provided below.
  • PCR may be used to amplify any samples before the assay is performed.
  • the assay is preferably performed on a specially constructed array.
  • a simple array for characterizing binary RNA editing sites could optionally be constructed with a pair of probes respectively hybridizing to the two mRNA forms (edited and not edited).
  • each editing site would be represented by two positions on the array, a first position featuring a non-edited sequence, and a second position featuring a sequence that was edited (i.e. the changed nucleotide which is indicative of editing).
  • analysis is more accurate using specialized arrays of probes tiled based on the respective edited/non-edited forms.
  • Tiling refers to the use of groups of related immobilized probes, some of which show perfect complementarity to a reference sequence and others of which show mismatches from the reference sequence.
  • the array would contain two groups of probes tiled based on two reference sequences constituting the respective edited/non-edited forms.
  • the first group of probes preferably includes at least a first set of one or more probes which span the editing site and are exactly complementary to one of the edited or non-edited forms.
  • the group of probes can also contain second, third and fourth additional sets of probes, which contain probes identical to probes in the first probe set except at one position referred to as an interrogation position.
  • the one probe that shows perfect hybridization is a probe from the second, third or fourth probe sets whose interrogation position aligns with the editing site and is occupied by a base complementary to the other form (for example, if the first probe set is related to the edited form, then the second probe set is preferably related to the non-edited form).
  • the probe group is hybridized with a sample in which only some of the mRNAs are edited, the above patterns are superimposed.
  • the probe group shows distinct and characteristic hybridization patterns depending on the editing level at the given site.
  • the array also contains a second group of probes tiled using the same principles as the first group but with a reference sequence constituting the non-edited form.
  • the first probe set in the second group spans the edited site and shows perfect complementarity to the non-edited form.
  • Hybridization of the second probe group yields a mirror image of hybridization patterns from the first group.
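The probe sets described above can be sketched for a single editing site. The sketch assumes DNA-alphabet probes, a fixed flank on each side of the site, and that inosine is read as guanosine, so the edited form carries G where the non-edited form carries A. The function names and the flank length are hypothetical.

```python
# Hypothetical sketch: four probes per site, differing only at the interrogation position.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def tiled_probes(reference: str, site: int, flank: int = 10) -> dict:
    """Return {base: probe}; each probe is complementary to the reference
    window around `site`, with `base` placed at the interrogation position."""
    if site - flank < 0 or site + flank >= len(reference):
        raise ValueError("site too close to the end of the reference")
    window = reference[site - flank: site + flank + 1]
    perfect = reverse_complement(window)  # interrogation position is the centre
    return {base: perfect[:flank] + base + perfect[flank + 1:] for base in "ACGT"}
```

With a non-edited reference, the probe carrying T at the interrogation position is the perfect match, and the probe carrying C is the perfect match for the edited form; relative signal at these two probes is what would reflect the editing level in a mixed sample.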
  • Arrays can also be designed to analyze many different editing sites in many different genes and/or in the same gene simultaneously simply by including multiple subarrays of probes. Each subarray has first and second groups of probes designed for analyzing a particular editing site according to the strategy described above. Chips that are suitable for the above arrays may optionally be manufactured according to the method of Affymetrix, California USA (see US Patent Nos.
  • the chips are manufactured from quartz wafers, which are washed with silane to enable high density array spotting.
  • Probe synthesis is performed on the chip, by using a linker that binds to the silane. Nucleotides are first added to the linker, and then synthesis continues by elongation of the probes. All probes are synthesized in parallel, by using photolithographic masks. These masks permit light to shine on various parts of the chip in sequence, so that as each nucleotide is added in sequence, only those probes for which the particular nucleotide is appropriate at that point in the sequence have the nucleotide added.
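The masked, parallel synthesis described above can be illustrated with a small simulation. It assumes equal-length probes and one mask per (position, nucleotide) step, and it ignores synthesis direction and chemistry; the function names are hypothetical and this is not the manufacturer's procedure.

```python
# Hypothetical sketch: derive mask steps and replay them to rebuild the probes.
def synthesis_masks(probes):
    """Return a list of (position, base, exposed_probe_indices) steps."""
    steps = []
    for pos in range(len(probes[0])):
        for base in "ACGT":
            exposed = [i for i, p in enumerate(probes) if p[pos] == base]
            if exposed:                   # skip masks that expose nothing
                steps.append((pos, base, exposed))
    return steps


def run_synthesis(n_probes, steps):
    """Elongate all probes in parallel by applying each mask in order."""
    grown = [""] * n_probes
    for _, base, exposed in steps:
        for i in exposed:
            grown[i] += base
    return grown
```

Each step adds one nucleotide only to the probes exposed by that mask, so the number of steps grows with probe length (at most four per position) rather than with the number of probes.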
  • the validation set was composed of two subsets: (i) 20 genes for which the EST data suggested many putative editing events, 18 of these genes were confirmed to be edited, (ii) 13 genes were chosen randomly from the list of 1,595 predicted genes, 9 of which were successfully amplified and sequenced. 8 out of these 9 genes were confirmed edited.
  • ADAR-mediated editing of an A in an A-U base pair produces the less stable I-U pair, while A-C mismatches can be edited into the more stable I-C pairs. Looking at the best complementary alignment of the editing regions, it was found that in 78% of the editing cases an A-U pair is destabilized, while in 19% an A-C pair is stabilized. Editing of either A-A or A-G pairs occurs in only 3% of the cases. This suggests that editing is aimed at stabilization and destabilization only, and does not occur in situations where it has no major effect on dsRNA stability.
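The classification of editing events by the base opposite the edited adenosine can be written as a small lookup. The mapping follows the text (A-U destabilized, A-C stabilized, A-A and A-G neutral); the function name and the input format, a list of opposing bases, are assumptions for illustration.

```python
# Hypothetical sketch: tally the effect of A-to-I editing by opposing base.
from collections import Counter

EFFECT_BY_OPPOSING_BASE = {
    "U": "destabilizing",   # A-U pair becomes the less stable I-U pair
    "C": "stabilizing",     # A-C mismatch becomes the more stable I-C pair
    "A": "neutral",         # A-A becomes I-A
    "G": "neutral",         # A-G becomes I-G
}


def tally_effects(opposing_bases):
    """Count editing events per effect class."""
    return Counter(EFFECT_BY_OPPOSING_BASE[b] for b in opposing_bases)
```

Applied to 100 sites distributed as in the text (78 opposite U, 19 opposite C, 3 opposite A or G), the tally reproduces the 78%, 19% and 3% split.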
  • the editing mechanism seems to prefer stabilization over destabilization: 22% of the editing events target a mismatched base-pair, while the average frequency of such mismatched base-pairs in the sites adjacent to the editing sites is only 10%, since these sites are all located in double-stranded regions.
  • stabilization editing i.e., editing of A-C to I-C
  • This work increases the number of editing substrates by two orders of magnitude, in accordance with prior estimates 7 . This allows a large-scale analysis of the editing phenomenon.
  • ESTs and cDNAs were obtained from NCBI GenBank version 136 (June 2003; www.ncbi.nlm.nih.gov/dbEST). The genomic sequences were taken from the human genome build 33 (June 2003; www.ncbi.nlm.nih.gov/genome/guide/human).
  • Total RNA and genomic DNA (gDNA) isolated simultaneously from the same tissue sample were purchased from Biochain Institute (Hayward, CA). In this work we used samples of liver, prostate, uterus, kidney, lung normal and tumor, brain tumor (glioma), cerebellum and frontal lobe.
  • RNA from tissue culture cells was isolated using Trifast (PeqLab, Germany) and poly-A selected using magnetic oligo dT beads (Dynal, Germany). 1 µg of poly A RNA was reverse transcribed using random hexamers as primers and RNAseH deficient M-MLV reverse transcriptase (Promega, Madison, WI). Genomic DNA from tissue culture cells was isolated according to Ausubel et al.
  • First strand cDNAs or corresponding genomic regions were amplified with suitable primers using Pfu polymerase, to minimize mutation rates during amplification.
  • Amplified fragments were A-tailed using Taq polymerase, gel purified and cloned into pGem-T easy (Promega, Madison, WI).
  • After transformation in E. coli individual plasmids were sequenced and aligned using ClustalW.
  • Contig Express software from Vector NTI 6.0 Suite (Informax, Inc.) was used for multiple alignment of the electropherograms (see Supplementary Information).
  • the extent of A-I editing is variable, e.g.
  • the validation set was composed of two subsets: (i) 18 genes for which the EST data suggested many putative editing events, 16 of these genes were confirmed to be edited, (ii) 13 genes were chosen randomly from the list of 1,595 predicted genes, 9 of which were successfully amplified and sequenced. 8 out of these 9 genes were confirmed edited.
  • EXAMPLE 2: Editing sites and the ALU sequence. ALU is a complex and diverse family of genomic repeats that are unique to the primates. Due to their ubiquity, it is probable that two oppositely oriented ALUs will be present in the same gene, and thus they are likely to form dsRNAs and putative editing sites. The editing sites were compared with the ALU repeat, to examine their similarities and differences. In order to simplify the following analysis, a "generic" ALU consensus sequence was used as an example: the consensus of the Alu-J subfamily.
  • the exact sequence that was used is gnl
  • the ALU consensus sequence is 290 nt in length, and contains 67 A's (23.1% of sequence).
  • positions between the 67 different As is shown. It is shown that there are preferred positions for editing events in the alignment to ALU (p-value calculated using the Z-test). Note that positions 27 & 28 account for 11.7% of the total number of positions analyzed (2,615), and 18.75% of the positions aligned to A (1632). This is a large bias suggesting that these 2 positions are in a place very favorable for RNA editing. In contrast, position 44 (only 16 bases apart) has a count of just 7, showing that this position is unfavorable for predicted editing. Such very close positions with significantly different counts serve as ideal controls for each other as there was no prior selection that preferred any of them.
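The two percentages quoted above imply the same underlying count, which a short calculation can check. The Z statistic below is only a sketch: it assumes a binomial null in which editing events fall uniformly across the 67 adenosines of the consensus, which is one plausible reading of "Z-test" and not necessarily the test the authors ran. The function name is hypothetical.

```python
# Hypothetical sketch: consistency check and a binomial Z statistic for a hotspot.
from math import sqrt


def hotspot_z(observed, total, n_hot, n_positions):
    """Z score of `observed` hits at `n_hot` positions, assuming hits are
    spread uniformly over `n_positions` (an assumed null model)."""
    p = n_hot / n_positions
    expected = total * p
    return (observed - expected) / sqrt(total * p * (1 - p))


hot_count = round(0.1875 * 1632)          # count implied by 18.75% of 1632
share_of_all = hot_count / 2615           # should be close to 11.7%
z = hotspot_z(hot_count, 1632, 2, 67)
```

Under this assumed null, two positions out of 67 would be expected to collect about 49 of the 1632 events, far fewer than the count implied by the text.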
  • EXAMPLE 3: In the following, the effect of RNA editing on the stability of its dsRNA substrates is considered. For each predicted site, a search is performed for its best opposite-strand alignment within the genomic region covered by the same gene cluster, and the effect of the editing on this alignment is examined. First, the fraction of editing sites which (before editing) match their opposite strand sequence was calculated: 78.2% of the nucleotides in the editing sites match the opposite strand, and 21.8% are mismatched. This frequency of mismatches is actually much higher than could be expected by chance, given that the editing region as a whole is matched with average identity level of about 90%.
  • G is strongly underrepresented in the upstream preceding site, and overrepresented in the site following the editing site.
  • the site opposed to the editing site is in most cases U, where editing changes the stable A-U pair into the less stable I-U pair.
  • the vast majority are C sites, where editing changes the less stable A-C pair into the more stable I-C pair. Changes that do not have a significant effect on the dsRNA stability, i.e., change of A-A pairs into I-A pairs or change of A-G pairs into I-G pairs are rare.
  • Example 4: Various exemplary EST libraries are described herein in which the fraction of ESTs showing RNA editing is significantly higher than the average. First, all ESTs that are edited at one or more sites out of the 12,723 sites in the database were counted, and this number was compared to the total number of ESTs covering these sites that do not exhibit editing (after the cleaning procedure is applied). It was found that 6690 ESTs are edited and 4657 are not, giving an average editing to non-editing ratio of 6690:4657 or about 3:2. For each library, this ratio was calculated separately. The libraries most significantly deviating from the 3:2 ratio (p-value calculated by the Fisher's Exact Test) are listed below.
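The per-library comparison against the overall 6690:4657 ratio can be sketched with a two-sided Fisher's exact test on a 2x2 table (library edited, library non-edited, remaining edited, remaining non-edited). The implementation below is a standard-library sketch with hypothetical function names; the library counts used in any example are made up, and the source does not state how the 2x2 table was constructed.

```python
# Hypothetical sketch: two-sided Fisher's exact test via the hypergeometric distribution.
from math import exp, lgamma


def _log_comb(n, k):
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """p-value for the table [[a, b], [c, d]]: sum of the probabilities of
    all tables with the same margins that are no more likely than the observed."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def log_p(x):
        return _log_comb(row1, x) + _log_comb(row2, col1 - x) - _log_comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    observed = log_p(a)
    total = sum(exp(log_p(x)) for x in range(lo, hi + 1)
                if log_p(x) <= observed + 1e-9)
    return min(1.0, total)


def library_p_value(lib_edited, lib_unedited, all_edited=6690, all_unedited=4657):
    """Compare one library with the remainder of the data set."""
    return fisher_exact_two_sided(lib_edited, lib_unedited,
                                  all_edited - lib_edited,
                                  all_unedited - lib_unedited)
```

A library whose edited to non-edited ratio is close to 3:2 yields a large p-value in this sketch, while a strongly skewed library yields a very small one; when many libraries are tested, some correction for multiple testing would normally be considered.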
  • RNA editing within coding regions of genes. EXAMPLE 5a: RNA editing in the NARF gene
  • exon 8A is an alternative ALU based exon.
  • the strongest site might be transition of A>G at position 19 altering a STOP codon into Trp (TAG>TGG; X>W).
  • Transition of A>G at position 24 replaces Thr with Ala (ACG>GCG; T>A).
  • Transition of A>G at position 33 replaces Ile with Val (ATC>GTC; I>V).
  • Transition of A>G at position 46 replaces Gln with Arg (CAG>CGG; Q>R).
  • Transition of A>G at position 70 replaces Arg with Gly
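  • The amino acid consequences listed above can be checked with a short sketch that applies an A>G change (inosine is read as guanosine) to a codon and translates it with the standard genetic code. The function name and the zero-based position argument are illustrative choices; only the codons spelled out in the text are used in the example.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def edit_effect(codon, pos):
    """Return (amino acid before, amino acid after) for an A>G change at codon[pos]."""
    if codon[pos] != "A":
        raise ValueError("editing site must be an adenosine")
    edited = codon[:pos] + "G" + codon[pos + 1:]
    return CODON_TABLE[codon], CODON_TABLE[edited]

# Codons quoted in the text; "*" denotes a stop codon.
for codon, pos in [("TAG", 1), ("ACG", 0), ("ATC", 0), ("CAG", 1)]:
    before, after = edit_effect(codon, pos)
    print(f"{codon}: {before} > {after}")
```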
  • Gencarta Compugen, Tel-Aviv, Israel
  • Gencarta platform includes a rich pool of annotations, sequence information (particularly of spliced sequences), chromosomal information, alignments, and additional information such as gene ontology terms, expression profiles, functional analyses, known and predicted proteins and detailed homology reports. A brief description of the methodology used to obtain annotative sequence information is given infra (for a detailed description see U.S. Pat. Appl. 10/426,002).
  • An ontology refers to the body of knowledge in a specific knowledge domain or discipline such as molecular biology, microbiology, immunology, virology, plant sciences, pharmaceutical chemistry, medicine, neurology, endocrinology, genetics, ecology, genomics, proteomics, cheminformatics, pharmacogenomics, bioinformatics, computer sciences, statistics, mathematics, chemistry, physics and artificial intelligence.
  • An ontology includes domain-specific concepts, referred to herein as sub-ontologies. A sub-ontology may be classified into smaller and narrower categories.
  • the ontological annotation approach is effected as follows.
  • biomolecular (i.e., polynucleotide or polypeptide) sequences are computationally clustered according to a progressive homology range, thereby generating a plurality of clusters each being of a predetermined homology of the homology range.
  • Progressive homology is used to identify meaningful homologies among biomolecular sequences and to thereby assign new ontological annotations to sequences, which share requisite levels of homologies.
  • a biomolecular sequence is assigned to a specific cluster if it displays a predetermined homology to at least one member of the cluster (i.e., single linkage).
  • a “progressive homology range” refers to a range of homology thresholds, which progress via predetermined increments from a low homology level (e.g. 35 %) to a high homology level (e.g. 99 %).
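  • A minimal sketch of clustering over a progressive homology range is given below, assuming that pairwise percent identities are already available. It applies single linkage at each threshold with a union-find structure. The function names, the increment of 16, and the toy identity values are hypothetical and are not taken from the source.

```python
def single_linkage_clusters(names, identity, threshold):
    """Group sequences so each member has identity >= threshold to at least one other member."""
    parent = {n: n for n in names}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if identity(a, b) >= threshold:
                parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), set()).add(n)
    return sorted(sorted(c) for c in clusters.values())

def progressive_clusters(names, identity, low=35, high=99, step=16):
    """Cluster at each threshold of the range low..high (percent identity)."""
    return {t: single_linkage_clusters(names, identity, t)
            for t in range(low, high + 1, step)}

# Toy pairwise identities (hypothetical values, in percent).
PAIRS = {("A", "B"): 98, ("B", "C"): 60, ("A", "C"): 40, ("C", "D"): 30}

def toy_identity(a, b):
    return PAIRS.get((a, b), PAIRS.get((b, a), 0))

result = progressive_clusters(["A", "B", "C", "D"], toy_identity)
```

Low thresholds yield few, broad clusters and high thresholds yield many, tight clusters, so an annotation can be transferred at the level of homology it requires.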
  • one or more ontologies are assigned to each cluster. Ontologies are derived from an annotation preassociated with at least one biomolecular sequence of each cluster; and/or generated by analyzing (e.g., text-mining) at least one biomolecular sequence of each cluster thereby annotating biomolecular sequences.
  • Hierarchical annotation refers to any ontology and subontology, which can be hierarchically ordered, such as, a tissue expression hierarchy, a developmental expression hierarchy, a pathological expression hierarchy, a cellular expression hierarchy, an intracellular expression hierarchy, a taxonomical hierarchy, a functional hierarchy and so forth.
  • the hierarchical annotation approach is effected as follows. First, a dendrogram representing the hierarchy of interest is computationally constructed. A “dendrogram” refers to a branching diagram containing multiple nodes and representing a hierarchy of categories based on degree of similarity or number of shared characteristics.
  • Each of the multiple nodes of the dendrogram is annotated by at least one keyword describing the node, and enabling literature and database text mining, such as by using publicly available text mining software.
  • a list of keywords can be obtained from the GO Consortium (www.geneontology.org). However, measures are taken to include as many keywords as possible, including keywords which might be out of date.
  • for tissue annotation, a hierarchy is built using all tissue/library sources available in GenBank, while considering the following parameters: ignoring GenBank synonyms, building anatomical hierarchies, enabling flexible distinction between tissue types (normal versus pathology) and tissue classification levels (organs, systems, cell types, etc.).
  • each of the biomolecular sequences is assigned to at least one specific node of the dendrogram.
  • the biomolecular sequences can be annotated biomolecular sequences, unannotated biomolecular sequences or partially annotated biomolecular sequences.
  • Annotated biomolecular sequences can be retrieved from pre-existing annotated databases as described hereinabove. For example, in GenBank, relevant annotational information is provided in the definition and keyword fields. In this case, classification of the annotated biomolecular sequences to the dendrogram nodes is directly effected.
  • a search for suitable annotated biomolecular sequences is performed using a set of keywords which are designed to classify the biomolecular sequences to the hierarchy (i.e., same keywords that populate the dendrogram)
  • extraction of additional annotational information is effected prior to classification to dendrogram nodes. This can be effected by sequence alignment, as described hereinabove.
  • annotational information can be predicted from structural studies.
  • nucleic acid sequences can be transformed to amino acid sequences to thereby enable more accurate annotational prediction.
  • each of the assigned biomolecular sequences is recursively classified to nodes hierarchically higher than the specific nodes, such that the root node of the dendrogram encompasses the full biomolecular sequence set, which can be classified according to a certain hierarchy, while the offspring of any node represent a partitioning of the parent set.
  • a biomolecular sequence found to be specifically expressed in "rhabdomyosarcoma" will be classified also to a higher hierarchy level, which is "sarcoma", and then to "Mesenchymal cell tumors" and finally to a highest hierarchy level "Tumor".
  • a sequence found to be differentially expressed in endometrium cells will be classified also to a higher hierarchy level, which is "uterus", and then to "women genital system" and to "genital system" and finally to a highest hierarchy level "genitourinary system".
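  • The recursive classification to higher nodes can be sketched as a walk from the assigned node to the root of the dendrogram. The parent map below encodes only the tumor example given in the text; the function names and sequence identifiers are hypothetical.

```python
# Child -> parent links for the tumor example; None marks the root node.
PARENT = {
    "rhabdomyosarcoma": "sarcoma",
    "sarcoma": "Mesenchymal cell tumors",
    "Mesenchymal cell tumors": "Tumor",
    "Tumor": None,
}

def classify_with_ancestors(node, parent=PARENT):
    """Return the node and every node above it, from the specific node up to the root."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path

def annotate(assignments, parent=PARENT):
    """Map each node to the sequences classified to it or to any node below it."""
    members = {n: set() for n in parent}
    for seq, node in assignments.items():
        for n in classify_with_ancestors(node, parent):
            members[n].add(seq)
    return members

# Hypothetical sequences assigned to specific nodes.
members = annotate({"seq1": "rhabdomyosarcoma", "seq2": "sarcoma"})
```

With this layout the root node holds the full sequence set and each child holds a subset of its parent, so retrieval at any requested level is a single lookup.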
  • the retrieval can be performed according to each one of the requested levels.
  • Annotating gene expression according to relative abundance: Spatial and temporal gene annotations are also assigned by comparing relative abundance in libraries of different origins. This approach can be used to find genes which are differentially expressed in tissues, pathologies and different developmental stages. In principle, the presentation of a contig in at least two tissues of interest is determined and significant over or under representation of the contig in one of the at least two tissues is assessed to identify differential expression.
  • splice variants Significant over or under representation is analyzed by statistical pairing.
  • Annotating spatial and temporal expression can also be effected on splice variants. This is effected as follows. First, a contigue which includes exonal sequence presentation of the at least two splice variants of the gene of interest is obtained. This contigue is assembled from a plurality of expressed sequences. Then, at least one contigue sequence region unique to a portion (i.e., at least one and not all) of the at least two splice variants of the gene of interest is identified. Identification of such a unique sequence region is effected using computer alignment software.
  • the number of the plurality of expressed sequences in the tissue having the at least one contigue sequence region is compared with the number of the plurality of expressed sequences not-having the at least one contigue sequence region, to thereby compare the expression level of the at least two splice variants of the gene of interest in the tissue.
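  • The counting step for splice variants can be illustrated as follows. This sketch replaces the computer alignment mentioned above with exact substring matching, which is a simplification made only for brevity; the record layout, function name and toy sequences are hypothetical.

```python
def variant_expression(ests, region, tissue):
    """Count ESTs from `tissue` that contain / do not contain the variant-specific region."""
    with_region = without_region = 0
    for est in ests:
        if est["tissue"] != tissue:
            continue
        # Assumption: exact substring match stands in for a real alignment.
        if region in est["sequence"]:
            with_region += 1
        else:
            without_region += 1
    return with_region, without_region

# Toy expressed sequences (hypothetical).
ESTS = [
    {"tissue": "brain", "sequence": "AAACCCGGG"},
    {"tissue": "brain", "sequence": "AAAGGG"},
    {"tissue": "liver", "sequence": "AAACCCGGG"},
    {"tissue": "brain", "sequence": "TTCCCGA"},
]

brain_counts = variant_expression(ESTS, "CCC", "brain")
liver_counts = variant_expression(ESTS, "CCC", "liver")
```

The two counts per tissue give the relative expression of the variants that carry the unique region versus those that do not.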
  • Sequence annotations obtained using the above-described methodologies and other approaches are disclosed in a data table in the files "Ann_for_all" and "Ann_clean" of the enclosed CD-ROM.
  • the data table shows a collection of annotations for biomolecular sequences, which were identified according to the teachings of the present invention using transcript data based on GenBank version 136 (June 15, 2003; ftp://ftp.ncbi.nih.gov/genbank/release.notes/gb136.release.notes) and the NCBI genome assembly of April 2003.
  • Each feature in the data table is identified by "#”.
  • #INDICATION - This field designates the indications (i.e., diseases, disorders, pathological conditions) and therapies that the polypeptide of the present invention can be utilized for. Specifically, an indication lists the disorders or diseases in which the polypeptide of the present invention can be clinically used.
  • a therapy describes a postulated mode of action of the polypeptide for the above-mentioned indication.
  • an indication can be "Cancer, general” while the therapy will be “Anticancer”.
  • Each Protein of the present invention was assigned a SwissProt/TrEMBL human protein accession as described in section "Assignment of SwissProt/ TrEMBL accessions to Gencarta contigs" hereinbelow.
  • Example- #INDICATION Alopecia general; Antianginal; Anticancer, immunological; Anticancer, other; Atherosclerosis; Buerger's syndrome; Cancer, general; Cancer, head and neck; Cancer, renal; Cardiovascular; Cirrhosis, hepatic; Cognition enhancer; Dermatological; Fibrosis, pulmonary; Gene therapy; Hepatic dysfunction, general; Hepatoprotective; Hypolipaemic/Antiatherosclerosis; Infarction, cerebral; Neuroprotective; Ophthalmological; Peripheral vascular disease; Radio/chemoprotective; Recombinant growth factor; Respiratory; Retinopathy, diabetic; Symptomatic antidiabetic; Urological; Assignment of SwissProt/TrEMBL accessions to Gencarta contigs - Gencarta contigs were assigned a Swisspro
  • SwissProt/TrEMBL data (SwissProt version 41.13 June 2003, TrEMBL and TrEMBL_new version 23.17 June 2003) were parsed and for each SwissProt/TrEMBL accession (excluding SwissProt/TrEMBL entries that are annotated as partial or fragment proteins) cross-references to EMBL and GenBank were obtained.
  • the alignment quality of the SwissProt/TrEMBL protein to their assigned mRNA sequences was checked by frame+p2n alignment analysis.
  • a good alignment was considered as having the following properties: • For partial mRNAs (those that have the phrase "partial cds" in the mRNA description or are annotated as "3'" or "5'"): an overall identity of 97 % and coverage of 80 % of the SwissProt/TrEMBL protein. • All other mRNA sequences were considered fully coding mRNAs, and for them an overall identity of 97 % and coverage of over 95 % of the SwissProt/TrEMBL protein were required. The mRNAs were searched in the LEADS database for their corresponding contigs, and the contigs that included these mRNA sequences were assigned the SwissProt/TrEMBL accession.
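  • The acceptance criteria above can be expressed as a small predicate. This is a sketch under stated assumptions: the identity and the partial-mRNA coverage cutoffs are read as "at least" (the text does not say whether they are inclusive), and the function names and description strings are hypothetical.

```python
def is_partial(description):
    """Partial mRNAs are flagged by 'partial cds' or a 3'/5' annotation in the description."""
    text = description.lower()
    return "partial cds" in text or "3'" in text or "5'" in text

def good_alignment(description, identity, coverage):
    """Apply the identity and coverage cutoffs (both in percent) to one protein/mRNA pair."""
    if identity < 97:          # assumption: 97 % is an inclusive minimum
        return False
    if is_partial(description):
        return coverage >= 80  # assumption: 80 % is an inclusive minimum
    return coverage > 95       # fully coding mRNAs: coverage of over 95 %

# Hypothetical mRNA descriptions.
partial_ok = good_alignment("hypothetical gene X mRNA, partial cds", 98.0, 85.0)
full_rejected = good_alignment("hypothetical gene X mRNA, complete cds", 98.0, 85.0)
```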
  • PCL 1 a public protein that has a curated GO annotation
  • PCL 2 a public protein that has over 85 % identity to a public protein with a curated GO annotation
  • PCL 3 a public protein that exhibits 50 - 85 % identity to a public protein with a curated GO annotation
  • PCL 4 a public protein that has under 50 % identity to a public protein with a curated GO annotation.
  • a homology search against all public proteins was done. If the Protein of the present invention has over 95 % identity to a public protein with PCL X, then the Protein of the present invention gets the same confidence level as the public protein. This confidence level is marked as "#CL X". If the Protein of the present invention has over 85 % identity but not over 95 % to a public protein with PCL X, then the Protein of the present invention gets a confidence level lower by 1 than the confidence level of the public protein.
  • If the Protein of the present invention has over 70 % identity but not over 85 % to a public protein with PCL X, then the Protein of the present invention gets a confidence level lower by 2 than the confidence level of the public protein. If the Protein of the present invention has over 50 % identity but not over 70 % to a public protein with PCL X, then the Protein of the present invention gets a confidence level lower by 3 than the confidence level of the public protein. If the Protein of the present invention has over 30 % identity but not over 50 % to a public protein with PCL X, then the Protein of the present invention gets a confidence level lower by 4 than the confidence level of the public protein.
  • a Protein of the present invention may get confidence level of 2 also if it has a true InterPro domain that is linked to a GO annotation (http://www.geneontology.org/external2go/interpro2go/).
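  • The identity bands above amount to a lookup from percent identity to a confidence penalty. The sketch below assumes that a "lower" confidence level corresponds to a numerically larger #CL value (PCL 1, a curated annotation, being the most confident), and that proteins with 30 % identity or less receive no level; both points are interpretations, and the names are hypothetical.

```python
# (identity cutoff in percent, penalty) pairs, checked from the strictest band down.
PENALTIES = [(95, 0), (85, 1), (70, 2), (50, 3), (30, 4)]

def confidence_level(pcl, identity):
    """Return the #CL value for a protein with `identity` percent to a public protein of level `pcl`."""
    for cutoff, penalty in PENALTIES:
        if identity > cutoff:
            # Assumption: lower confidence is encoded as a larger number.
            return pcl + penalty
    return None  # assumption: no level is assigned at 30 % identity or below

same_level = confidence_level(1, 96.0)
one_lower = confidence_level(1, 90.0)
```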
  • confidence level is above "1”
  • GO annotations of higher levels of the GO hierarchy are assigned (e.g. for "#CL 3" the GO annotations provided, is as appears plus the 2 GO annotations above it in the hierarchy).
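  • One way to read the "#CL 3" example is that a protein with confidence level n receives its assigned GO annotation plus the n - 1 annotations above it. The sketch below follows that reading, which is a generalization from the single example given; it also treats the hierarchy as a simple chain with one parent per term, whereas the real GO graph allows several parents. All term names are placeholders.

```python
# Placeholder hierarchy: each term maps to the term directly above it.
GO_PARENT = {
    "specific_term": "intermediate_term",
    "intermediate_term": "general_term",
    "general_term": "root_term",
    "root_term": None,
}

def go_terms_for_level(term, cl, parent=GO_PARENT):
    """Return the assigned term plus up to (cl - 1) terms above it in the hierarchy."""
    terms = [term]
    for _ in range(cl - 1):
        term = parent.get(term)
        if term is None:  # reached the top of the hierarchy
            break
        terms.append(term)
    return terms

cl3_terms = go_terms_for_level("specific_term", 3)
```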
  • "#DB” marks the database on which the GO assignment relies on.
  • the "sp", as in Example 10a, relates to SwissProt/TrEMBL Protein knowledgebase, available from http://www.expasy.ch/sprot/.
  • InterPro refers to the InterPro combined database, available from http://www.ebi.ac.uk/interpro/, which contains information regarding protein families, collected from the following databases: SwissProt (http://www.ebi.ac.uk/swissprot/), Prosite (http://www.expasy.ch/prosite/), Pfam (http://www.sanger.ac.uk/Software/Pfam ), Prints
  • PROLOC means that the method used for predicting the Gene Ontology cellular component is based on Proloc prediction, where the database is the statistical data the Proloc software employs to predict the subcellular localization of proteins.
  • "Viral protein database” All viral proteins (Total 294,805 proteins) were downloaded from NCBI GenBank on 1/10/2003. All the Baculoviridae and Entomopoxvirinae proteins, which are known to infect only insects, were removed and then a non-redundant set was prepared using 95 % identity as a cutoff (Holm L, Sander C.
  • "#DB sp #EN NRG2_HUMAN" means that the GO assignment in this case was based on a protein from the SwissProt/TrEMBL database, while the closest homologue (that has a GO assignment) to the assigned protein is depicted in SwissProt entry "NRG2_HUMAN". "#DB interpro #EN IPR001609" means that the GO assignment in this case was based on the InterPro database, and the protein had an InterPro domain, IPR001609, that the assigned GO was based on. In Proloc predictions this field will have a Proloc annotation "#EN Proloc". In predictions based on viral proteins this field will have the gi viral protein accession, "#EN 1491997".
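  • Records of this form can be split into their "#"-prefixed fields with a short parser. This is a minimal sketch that assumes each field is a "#NAME value" pair and that values never contain "#"; the function name is hypothetical and the layout of the actual data files is not verified here.

```python
import re

# A field is "#" + name, whitespace, then everything up to the next "#".
FIELD = re.compile(r"#(\w+)\s+([^#]*)")

def parse_fields(record):
    """Split a record made of '#NAME value' pairs into a dictionary."""
    return {key: value.strip() for key, value in FIELD.findall(record)}

# Example strings quoted in the text.
swissprot = parse_fields("#DB sp #EN NRG2_HUMAN")
interpro = parse_fields("#DB interpro #EN IPR001609")
```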
  • novel RNA editing sites may be used for improved diagnosis and/or treatment when used singly or in combination with the previously described genes.
  • the novel splice variants may distinguish between healthy and diseased phenotype.
  • Another example is in cases of autosomal recessive genetic diseases.
  • #DRUG_DRUG_INTERACTION refers to proteins involved in a biological process which mediates the interaction between at least two consumed drugs. Novel splice variants of known proteins involved in interaction between drugs may be used, for example, to modulate such drug-drug interactions. Examples of proteins involved in drug-drug interactions are presented in Table 9 together with the corresponding internal gene contig name, which makes it possible to locate the new splice variants within the data files "Ann_for_all" and "Ann_clean" in the attached CD-ROM.
  • tissue-specific genes i.e., genes upregulated in a specific tissue or tissues.
  • tissue proliferation i.e., differentiation and/or tissue damage.
  • proteins also have therapeutic significance as described above.
  • tissue-name the "tissue name" field specifies the list of tissues for which tissue-specific genes/variants were searched, as follows: amniotic+placenta; Blood; Bone; Bone marrow; Brain; Cervix+uterus; Colon; Endocrine, adrenal gland; Endocrine, pancreas; Endocrine, parathyroid+thyroid; Gastrointestinal tract; Genitourinary; Head and neck; Immune, T-cells; Kidney; Liver; Lung; Lymph node; Mammary gland; Muscle; Ovary; Prostate; Skin; Thymus.
  • #TAA This field denotes genes or transcript sequences over-expressed in cancer.
  • tissue-name specifies the list of tissues for which tissue-tumor specific genes/variants were searched, as follows: All tumor types; All epithelial tumors; prostate-tumor; lung-tumor; head and neck-tumor; stomach-tumor; colon-tumor; mammary-tumor; kidney-tumor; ovary-tumor; uterus/cervix-tumor; thyroid-tumor; adrenal-tumor; pancreas-tumor; liver-tumor; skin-tumor; brain-tumor; bone-tumor; bone marrow-tumor; blood-cancer; T-cells-tumor; lymph nodes-tumor; muscle-tumor.
  • #TAAT - This field denotes splice variants over expressed in cancer.
  • the annotation format is as follows: #TAAT tissue-name start nucleotide - end nucleotide, where the "start nucleotide - end nucleotide" field denotes the location on the transcript of the unique exon/s of this transcript which are over expressed in cancer.
  • EXAMPLE 7 The following sections list examples of proteins (subsection i), based on their molecular function, which participate in a variety of diseases (listed in subsection ii), which diseases can be diagnosed/treated using information derived from naturally occurring transcripts having RNA editing sites, such as those uncovered by the present invention.
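  • A #TAAT line in the format just described can be parsed as sketched below. The example line is hypothetical (the tissue name is taken from the list above, the coordinates are invented), and the exact spacing used in the data files is an assumption.

```python
import re

# "#TAAT", a tissue name (may contain spaces and hyphens), then "start - end".
TAAT = re.compile(r"^#TAAT\s+(.+?)\s+(\d+)\s*-\s*(\d+)\s*$")

def parse_taat(line):
    """Return (tissue name, start nucleotide, end nucleotide) from one #TAAT line."""
    match = TAAT.match(line)
    if match is None:
        raise ValueError(f"not a #TAAT annotation: {line!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))

# Hypothetical annotation line.
tissue, start, end = parse_taat("#TAAT head and neck-tumor 120 - 450")
```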
  • the present invention is of biomolecular sequences, which can be classified to functional groups based on known activity of homologous sequences.
  • This functional group classification allows the identification of diseases and conditions, which may be diagnosed and treated based on the novel sequence information and annotations of the present invention.
  • This functional group classification includes the following groups: Proteins involved in Drug-Drug interactions: The phrase "proteins involved in drug-drug interactions" refers to proteins involved in a biological process which mediates the interaction between at least two consumed drugs. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to modulate drug-drug interactions.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such drug-drug interactions. Examples of such proteins include, but are not limited to the cytochrome P450 protein family, which is involved in the metabolism of many drugs. Examples of proteins involved in drug-drug interactions are listed in Table 9, below. Proteins involved in the metabolism of a pro-drug to a drug: The phrase "proteins involved in the metabolism of a pro-drug to a drug" refers to proteins that activate an inactive pro-drug by chemically changing it into a biologically active compound. Preferably, the metabolizing enzyme is expressed in the target tissue thus reducing systemic side effects.
  • compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to modulate the metabolism of a pro-drug into a drug.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such conditions. Examples of these proteins include, but are not limited to esterases hydrolyzing the cholesterol lowering drug simvastatin into its hydroxy acid active form.
  • MDR proteins The phrase "MDR proteins" refers to Multi Drug Resistance proteins that are responsible for the resistance of a cell to a range of drugs, usually by exporting these drugs outside the cell.
  • the MDR proteins are ATP-binding cassette (ABC) proteins.
  • drug resistance is associated with resistance to chemotherapy.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases in which the transport of molecules and macromolecules such as neurotransmitters, hormones, sugar etc. is abnormal leading to various pathologies.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
  • MDR proteins include, but are not limited to the multi-drug resistant transporter MDR1/P-glycoprotein, which is the gene product of MDR1, belonging to the ATP-binding cassette (ABC) superfamily of membrane transporters. This protein was shown to increase the resistance of malignant cells to therapy by exporting the therapeutic agent out of the cell.
  • Hydrolases acting on amino acids The phrase "hydrolases acting on amino acids" refers to hydrolases acting on a pair of amino acids.
  • compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the transfer of a glycosyl chemical group from one molecule to another is abnormal; thus, a beneficial effect may be achieved by modulation of such reaction.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
  • Transaminases refers to enzymes transferring an amine group from one compound to another.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases in which the transfer of an amine group from one molecule to another is abnormal; thus, a beneficial effect may be achieved by modulation of such reaction.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
  • Examples of such transaminases include, but are not limited to two liver enzymes, frequently used as markers for liver function - SGOT (Serum Glutamic-Oxaloacetic Transaminase - AST) and SGPT (Serum Glutamic-Pyruvic Transaminase - ALT).
  • Immunoglobulins refers to proteins that are involved in the immune and complement systems such as antigens and autoantigens, immunoglobulins, MHC and HLA proteins and their associated proteins.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases involving the immune system such as inflammation, autoimmune diseases, infectious diseases, and cancerous processes.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
  • members of the complement family such as C3 and C4, whose blood levels are used for evaluation of autoimmune diseases and allergy state, and C1 inhibitor, whose absence is associated with angioedema.
  • new variants of these genes are expected to be markers for similar events.
  • Mutation in variants of the complement family may be associated with other immunological syndromes, such as increased bacterial infection that is associated with mutation in C3.
  • C1 inhibitor was shown to provide safe and effective inhibition of complement activation after reperfused acute myocardial infarction and may reduce myocardial injury [Eur. Heart J. 2002, 23(21): 1670-7], thus, its variant may have the same or improved effect.
  • transcription factor binding refers to proteins involved in transcription process by binding to nucleic acids, such as transcription factors, RNA and DNA binding proteins, zinc fingers, helicase, isomerase, histones, and nucleases.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases involving transcription factors binding proteins. Such treatment may be based on a transcription factor that can be used for modulation of gene expression associated with the disease.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
  • Examples of such diseases include, but are not limited to breast cancer associated with ErbB-2 expression that was shown to be successfully modulated by a transcription factor [Proc. Natl. Acad. Sci. U S A. 2000, 97(4): 1495-500].
  • Examples of novel transcription factors used for therapeutic protein production include, but are not limited to those described for Erythropoietin production [J. Biol. Chem. 2000, 275(43):33850-60] and zinc fingers protein transcription factors (ZFP-TF) variants [J. Biol. Chem. 2000, 275(43):33850-60].
  • Small GTPase regulatory/interacting proteins refers to proteins capable of regulating or interacting with GTPase such as RAB escort protein, guanyl-nucleotide exchange factor, guanyl-nucleotide exchange factor adaptor, GDP-dissociation inhibitor, GTPase inhibitor, GTPase activator, guanyl-nucleotide releasing factor, GDP-dissociation stimulator, regulator of G-protein signaling, RAS interactor, RHO interactor, RAB interactor, and RAL interactor.
  • compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which G-protein-mediated signal transduction is abnormal, either as a cause, or as a result of the disease.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to diseases related to prenylation. Modulation of prenylation was shown to affect therapy of diseases such as osteoporosis, ischemic heart disease, and inflammatory processes.
  • Calcium binding proteins refers to proteins involved in calcium binding, preferably, calcium binding proteins, ligand binding or carriers, such as diacylglycerol kinase, Calpain, calcium-dependent protein serine/threonine phosphatase, calcium sensing proteins, calcium storage proteins.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat calcium involved diseases.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
  • diseases include, but are not limited to diseases related to hypercalcemia, hypertension, cardiovascular disease, muscle diseases, gastro-intestinal diseases, uterus relaxing, and uterus.
  • An example for therapy use of calcium binding proteins variant may be treatment of emergency cases of hypercalcemia, with secreted variants of calcium storage proteins.
  • Oxidoreductase The term “oxidoreductase” refers to enzymes that catalyze the removal of hydrogen atoms and electrons from the compounds on which they act.
  • oxidoreductases acting on the following groups of donors: CH-OH, CH-CH, CH-NH2, CH-NH; oxidoreductases acting on NADH or NADPH, nitrogenous compounds, sulfur group of donors, heme group, hydrogen group, diphenols and related substances as donors; oxidoreductases acting on peroxide as acceptor, superoxide radicals as acceptor, oxidizing metal ions, CH2 groups; oxidoreductases acting on reduced ferredoxin as donor; oxidoreductases acting on reduced flavodoxin as donor; and oxidoreductases acting on the aldehyde or oxo group of donors.
  • compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases caused by abnormal activity of oxidoreductases.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to malignant and autoimmune diseases in which the enzyme DHFR (DiHydroFolateReductase), which participates in folate metabolism and is essential for de novo glycine and purine synthesis, is the target for the widely used drug Methotrexate (MTX).
  • DHFR DiHydroFolateReductase
  • Receptors refers to protein-binding sites on a cell's surface or interior that recognize and bind to a specific messenger molecule, leading to a biological response, such as signal transducers, complement receptors, ligand-dependent nuclear receptors, transmembrane receptors, GPI-anchored membrane-bound receptors, various coreceptors, internalization receptors, receptors to neurotransmitters, hormones and various other effectors and ligands.
  • compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases caused by abnormal activity of receptors, preferably, receptors to neurotransmitters, hormones and various other effectors and ligands.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, chronic myelomonocytic leukemia caused by growth factor ⁇ receptor deficiency [Rao D. S., et al, (2001) Mol.
  • nuclear receptors variants may be based on a secreted version of receptors such as the thyroid nuclear receptor, which, by binding plasma free thyroid hormone and reducing its levels, may have a therapeutic effect in cases of thyrotoxicosis.
  • Secreted soluble TNF receptor is an example for a molecule, which can be used to treat conditions in which downregulation of TNF levels or activity is beneficial, including, but not limited to, Rheumatoid Arthritis, Juvenile Rheumatoid Arthritis, Psoriatic Arthritis and Ankylosing Spondylitis.
  • Protein serine/threonine kinases refers to proteins which phosphorylate serine/threonine residues, mainly involved in signal transduction, such as transmembrane receptor protein serine/threonine kinase, 3-phosphoinositide-dependent protein kinase, DNA-dependent protein kinase, G-protein-coupled receptor phosphorylating protein kinase, SNF1A/AMP-activated protein kinase, casein kinase, calmodulin regulated protein kinase, cyclic-nucleotide dependent protein kinase, cyclin-dependent protein kinase, eukaryotic translation initiation factor 2α kinase, galactosyltransferase-associated kinase, glycogen synthase kinase 3, protein kinase C, receptor signaling
  • compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases ameliorated by modulating kinase activity.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to schizophrenia.
  • 5-HT(2A) serotonin receptor is the principal molecular target for LSD-like hallucinogens and atypical antipsychotic drugs.
  • Serine/threonine kinases specific for the 5-HT(2A) serotonin receptor may serve as drug targets for a disease such as schizophrenia.
  • Other diseases that may be treated through serine/threonine kinase modulation include Peutz-Jeghers syndrome (PJS, a rare autosomal-dominant disorder characterized by hamartomatous polyposis of the gastrointestinal tract and melanin pigmentation of the skin and mucous membranes [Hum.
  • Channel/pore class transporters refers to proteins that mediate the transport of molecules and macromolecules across membranes, such as α-type channels, porins, and pore-forming toxins.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the transport of molecules and macromolecules is abnormal, therefore leading to various pathologies.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
  • Examples of such diseases include, but are not limited to diseases of the nervous system such as Parkinson's disease, diseases of the hormonal system, diabetes and infectious diseases such as bacterial and fungal infections.
  • α-hemolysin, which is produced by S. aureus, creates ion-conductive pores in the cell membrane, thereby diminishing its integrity.
  • Hydrolases, acting on acid anhydrides refers to hydrolytic enzymes that are acting on acid anhydrides, such as hydrolases acting on acid anhydrides in phosphorus-containing anhydrides or in sulfonyl-containing anhydrides, hydrolases catalyzing transmembrane movement of substances, and involved in cellular and subcellular movement.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases in which the hydrolase-related activities are abnormal.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
  • Examples of such diseases include, but are not limited to glaucoma treated with carbonic anhydrase inhibitors (e.g. Dorzolamide), peptic ulcer disease treated with H+/K+ ATPase inhibitors that were shown to affect disease by blocking gastric carbonic anhydrase (e.g. Omeprazole).
  • Transferases, transferring phosphorus-containing groups refers to enzymes that catalyze the transfer of phosphate from one molecule to another, such as phosphotransferases using the following groups as acceptors: alcohol group, carboxyl group, nitrogenous group, phosphate; phosphotransferases with regeneration of donors catalyzing intramolecular transfers; diphosphotransferases; nucleotidyltransferase; and phosphotransferases for other substituted phosphate groups.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases in which the transfer of a phosphorus-containing functional group to a modulated moiety is abnormal.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to acute MI [Ann. Emerg. Med. 2003, 42(3):343-50], cancer [Oral. Dis. 2003, 9(3):119-28; J. Surg. Res. 2003, 113(1):102-8] and Alzheimer's disease [Am. J. Pathol.
  • Examples for possible utilities of such transferases for drug improvement include, but are not limited to aminoglycoside treatment (antibiotics) to which resistance is mediated by aminoglycoside phosphotransferases [Front. Biosci. 1999, 1;4:D9-21]. Using aminoglycoside phosphotransferase variants or inhibiting these enzymes may reduce aminoglycoside resistance. Since aminoglycosides can be toxic to some patients, demonstrating the expression of aminoglycoside phosphotransferases in a patient can argue against treating that patient with aminoglycosides and thereby avoid exposing the patient to needless risk.
  • Phosphoric monoester hydrolases refers to hydrolytic enzymes that are acting on ester bonds, such as nuclease, sulfuric ester hydrolase, carboxylic ester hydrolase, thiolester hydrolase, phosphoric monoester hydrolase, phosphoric diester hydrolase, triphosphoric monoester hydrolase, diphosphoric monoester hydrolase, and phosphoric triester hydrolase.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases in which the hydrolytic cleavage of a covalent bond with accompanying addition of water (-H being added to one product of the cleavage and -OH to the other), is abnormal.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to diabetes, CNS diseases such as Parkinson's disease, and cancer.
  • Enzyme inhibitors The term "enzyme inhibitors" refers to inhibitors and suppressors of other proteins and enzymes, such as inhibitors of: kinases, phosphatases, chaperones, guanylate cyclase, DNA gyrase, ribonuclease, proteasome inhibitors, diazepam-binding inhibitor, ornithine decarboxylase inhibitor, GTPase inhibitors, dUTP pyrophosphatase inhibitor, phospholipase inhibitor, proteinase inhibitor, protein biosynthesis inhibitors, and α-amylase inhibitors.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which beneficial effect may be achieved by modulating the activity of inhibitors and suppressors of proteins and enzymes.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
  • Examples of such diseases include, but are not limited to α-1 antitrypsin (a natural serine protease inhibitor, which protects the lung and liver from proteolysis) deficiency associated with emphysema, COPD and liver cirrhosis.
  • α-1 antitrypsin is also used for diagnostics in cases of unexplained liver and lung disease.
  • A variant of this enzyme may act as a protease inhibitor or a diagnostic target for related diseases.
  • Electron transporters refers to ligand binding or carrier proteins involved in electron transport such as flavin-containing electron transporter, cytochromes, electron donors, electron acceptors, electron carriers, and cytochrome-c oxidases.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which beneficial effect may be achieved by i ll
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
  • Examples of such diseases include, but are not limited to cyanide toxicity, resulting from cyanide binding to ubiquitous metalloenzymes rendering them inactive, and interfering with the electron transport.
  • Novel electron transporters to which cyanide can bind may serve as drug targets for new cyanide antidotes.
  • Transferases, transferring glycosyl groups refers to enzymes that catalyze the transfer of a glycosyl chemical group from one molecule to another such as murein lytic endotransglycosylase E, and sialyltransferase.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases in which the transfer of a glycosyl chemical group is abnormal.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
  • Ligases, forming carbon-oxygen bonds refers to enzymes that catalyze the linkage between carbon and oxygen such as ligase forming aminoacyl-tRNA and related compounds.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases in which the linkage between carbon and oxygen in an energy dependent process is abnormal.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
  • Ligases refers to enzymes that catalyze the linkage of two molecules, generally utilizing ATP as the energy donor, also called synthetase.
  • Ligases include enzymes such as β-alanyl-dopamine hydrolase, carbon-oxygen bonds forming ligase, carbon-sulfur bonds forming ligase, carbon-nitrogen bonds forming ligase, carbon-carbon bonds forming ligase, and phosphoric ester bonds forming ligase.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases in which the joining together of two molecules in an energy dependent process is abnormal.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
  • Examples of such diseases include, but are not limited to neurological disorders such as Parkinson's disease [Science. 2003, 302(5646):819-22; J. Neurol. 2003, 250 Suppl.
  • Hydrolases, acting on glycosyl bonds refers to hydrolytic enzymes that are acting on glycosyl bonds such as hydrolases hydrolyzing N-glycosyl compounds, S-glycosyl compounds, and O-glycosyl compounds.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the hydrolase-related activities are abnormal.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
  • Examples of such diseases include cancerous diseases [J. Natl. Cancer Inst. 2003,
  • Kinases refers to enzymes which phosphorylate serine/threonine or tyrosine residues, mainly involved in signal transduction. Examples of kinases include enzymes such as 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase,
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases which may be ameliorated by modulating kinase activity.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, acute lymphoblastic leukemia associated with spleen tyrosine kinase deficiency [Goodman P.
  • Nucleotide binding refers to ligand binding or carrier proteins, involved in physical interaction with a nucleotide, preferably, any compound consisting of a nucleoside that is esterified with [ortho]phosphate or an oligophosphate at any hydroxyl group on the glycose moiety, such as purine nucleotide binding proteins.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases that are associated with abnormal nucleotide binding.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to gout (a syndrome characterized by a high urate level in the blood). Since urate is a breakdown metabolite of purines, reducing serum purine levels could have a therapeutic effect in gout.
  • Tubulin binding refers to binding proteins that bind tubulin such as microtubule binding proteins.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases which are associated with abnormal tubulin activity or structure. Binding the products of the genes of this family, or antibodies reactive therewith, can modulate a plurality of tubulin activities as well as change microtubule structure. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
  • Alzheimer's disease associated with t-complex polypeptide 1 deficiency [Schuller E., et al., (2001) Life Sci., 69(3):263-70], neurodegeneration associated with apoE deficiency [Masliah E., et al., (1995) Exp. Neurol., 136(2):107-22], progressive axonopathy associated with dysfunctional neurofilaments [Griffiths I. R., et al., (1989) Neuropathol. Appl.
  • Receptor signaling proteins refers to receptor proteins involved in signal transduction such as receptor signaling protein serine/threonine kinase, receptor signaling protein tyrosine kinase, receptor signaling protein tyrosine phosphatase, aryl hydrocarbon receptor nuclear translocator, hematopoietin/interferon-class (D200-domain) cytokine receptor signal transducer, transmembrane receptor protein tyrosine kinase signaling protein, transmembrane receptor protein serine/threonine kinase signaling protein, receptor signaling protein serine/threonine kinase signaling protein, receptor signaling protein serine/threonine phosphatase signaling protein, small GTPase regulatory/interacting protein, receptor signaling protein tyrosine kinase signaling protein, and receptor signaling protein serine/threonine phosphatase.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the signal-transduction is abnormal, either as a cause, or as a result of the disease.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, complete hypogonadotropic hypogonadism associated with GnRH receptor deficiency [Kottler M. L., et al., (2000) J. Clin. Endocrinol.
  • Molecular function unknown refers to various proteins with unknown molecular function, such as cell surface antigens.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which regulation of the recognition, participation or binding of cell surface antigens to other moieties may have therapeutic effect.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, autoimmune diseases, various infectious diseases, and cancerous diseases which involve non-cell-surface antigen recognition and activity.
  • Enzyme activators refers to enzyme regulators such as activators of: kinases, phosphatases, sphingolipids, chaperones, guanylate cyclase, tryptophan hydroxylase, proteases, phospholipases, caspases, proprotein convertase 2 activator, cyclin-dependent protein kinase 5 activator, superoxide-generating NADPH oxidase activator, sphingomyelin phosphodiesterase activator, monophenol monooxygenase activator, proteasome activator, and GTPase activator.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which beneficial effect may be achieved by modulating the activity of activators of proteins and enzymes.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to all complement related diseases, as most complement proteins activate other complement proteins by cleavage.
  • Transferases, transferring one-carbon groups refers to enzymes that catalyze the transfer of a one-carbon chemical group from one molecule to another such as methyltransferase, amidinotransferase, hydroxymethyl-, formyl- and related transferase, carboxyl- and carbamoyltransferase.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases in which the transfer of a one-carbon chemical group from one molecule to another is abnormal so that a beneficial effect may be achieved by modulation of such reaction.
  • Transferases refers to enzymes that catalyze the transfer of a chemical group, preferably, a phosphate or amine from one molecule to another. It includes enzymes such as transferases, transferring one-carbon groups, aldehyde or ketonic groups, acyl groups, glycosyl groups, alkyl or aryl (other than methyl) groups, nitrogenous, phosphorus-containing groups, sulfur-containing groups, lipoyltransferase, deoxycytidyl transferases.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases in which the transfer of a chemical group from one molecule to another is abnormal.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to cancerous diseases such as prostate cancer [Urology. 2003, 62(5 Suppl 1):55-62] or lung cancer [Invest. New Drugs. 2003, 21(4):435-43; JAMA. 2003, 22;290(16):2149-58], psychiatric disorders [Am. J.
  • Chaperones refers to functional classes of unrelated families of proteins that assist the correct non-covalent assembly of other polypeptide-containing structures in vivo, but are not components of these assembled structures when they are performing their normal biological function.
  • The group of chaperones includes proteins such as ribosomal chaperone, peptidylprolyl isomerase, lectin-binding chaperone, nucleosome assembly chaperone, chaperonin ATPase, cochaperone, heat shock protein, HSP70/HSP90 organizing protein, fimbrial chaperone, metallochaperone, tubulin folding, and HSC70-interacting protein.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases which are associated with abnormal protein activity, structure, degradation or accumulation of proteins.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to neurological syndromes [J. Neuropathol. Exp. Neurol. 2003, 62(7):751-64; Antioxid Redox Signal. 2003, 5(3):337-48; J. Neurochem. 2003, 86(2):394-404], neurological diseases such as Parkinson's disease [Hum. Genet.
  • Cell adhesion molecule refers to proteins that serve as adhesion molecules between adjoining cells such as membrane-associated protein with guanylate kinase activity, cell adhesion receptor, neuroligin, calcium-dependent cell adhesion molecule, selectin, calcium-independent cell adhesion molecule, and extracellular matrix protein.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases in which adhesion between adjoining cells is involved, typically conditions in which the adhesion is abnormal.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
  • Examples of such diseases include, but are not limited to cancer, in which abnormal adhesion may cause and enhance the process of metastasis, and abnormal growth and development of various tissues, in which modulating adhesion among adjoining cells can improve the condition.
  • Leucocyte-endothelial interactions characterized by adhesion molecules involved in interactions between cells lead to tissue injury and ischemia reperfusion disorders in which activated signals generated during ischemia may trigger an exuberant inflammatory response during reperfusion, provoking greater tissue damage than the initial ischemic insult [Crit. Care Med.
  • The blockade of leucocyte-endothelial adhesive interactions has the potential to reduce vascular and tissue injury. This blockade may be achieved using a soluble variant of the adhesion molecule. States of septic shock and ARDS involve large recruitment of neutrophil cells to the damaged tissues. Neutrophil cells bind to the endothelial cells in the target tissues through adhesion molecules. Neutrophils possess multiple effector mechanisms that can produce endothelial and lung tissue injury, and interfere with pulmonary gas transfer by disruption of surfactant activity [Eur. J. Surg. 2002, 168(4):204-14].
  • The use of a soluble variant of the adhesion molecule may decrease the adhesion of neutrophils to the damaged tissues.
  • Examples of such diseases include, but are not limited to, Wiskott-Aldrich syndrome associated with WAS deficiency [Westerberg L., et al., (2001) Blood, 98(4):1086-94], asthma associated with intercellular adhesion molecule-1 deficiency [Tang M. L. and Fiscus L. C., (2001) Pulm. Pharmacol. Ther., 14(3):203-10], intra-atrial thrombogenesis associated with increased von Willebrand factor activity [Fukuchi M., et al., (2001) J. Am. Coll. Cardiol.,
  • Motor proteins refers to proteins that generate force or energy by the hydrolysis of ATP and that function in the production of intracellular movement or transportation. Examples of such proteins include microfilament motor, axonemal motor, microtubule motor, and kinetochore motor (dynein, kinesin, or myosin).
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases in which force or energy generation is impaired.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, malignant diseases in which microtubules are drug targets for a family of anticancer drugs, myodystrophies and myopathies [Trends Cell Biol. 2002, 12(12):585-91], neurological disorders [Neuron. 2003,
  • Defense/immunity proteins refers to proteins that are involved in the immune and complement systems such as acute-phase response proteins, antimicrobial peptides, antiviral response proteins, blood coagulation factors, complement components, immunoglobulins, major histocompatibility complex antigens and opsonins.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases involving the immunological system including inflammation, autoimmune diseases, infectious diseases, as well as cancerous processes or diseases which are manifested by abnormal coagulation processes, which may include abnormal bleeding or excessive coagulation.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, late (C5-9) complement component deficiency associated with opsonin receptor allotypes [Fijen C. A., et al., (2000) Clin. Exp.
  • Intracellular transporters refers to proteins that mediate the transport of molecules and macromolecules inside the cell, such as intracellular nucleoside transporter, vacuolar assembly proteins, vesicle transporters, vesicle fusion proteins, type II protein secretors.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases in which the transport of molecules and macromolecules is abnormal leading to various pathologies.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
  • Transporters refers to proteins that mediate the transport of molecules and macromolecules, such as channels, exchangers, and pumps.
  • Transporters include proteins such as: amine/polyamine transporter, lipid transporter, neurotransmitter transporter, organic acid transporter, oxygen transporter, water transporter, carriers, intracellular transporters, protein transporters, ion transporters, carbohydrate transporter, polyol transporter, amino acid transporters, vitamin/cofactor transporters, siderophore transporter, drug transporter, channel/pore class transporter, group translocator, auxiliary transport proteins, permeases, murein transporter, organic alcohol transporter, nucleobase, nucleoside, and nucleotide and nucleic acid transporters.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases in which the transport of molecules and macromolecules such as neurotransmitters, hormones, sugar etc. is impaired leading to various pathologies.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, glycogen storage disease caused by glucose-6-phosphate transporter deficiency [Hiraiwa H., and Chou J. Y.
  • These transporters may have the capability to bind the compound in the serum they would normally bind on the membrane.
  • A secreted form of ATP7B, a transporter involved in Wilson's disease, is expected to bind plasma copper and therefore have a desired therapeutic effect in Wilson's disease.
  • Lyases refers to enzymes that catalyze the formation of double bonds by removing chemical groups from a substrate without hydrolysis or catalyze the addition of chemical groups to double bonds. It includes enzymes such as carbon-carbon lyase, carbon-oxygen lyase, carbon-nitrogen lyase, carbon-sulfur lyase, carbon-halide lyase, and phosphorus-oxygen lyase.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the double bonds formation catalyzed by these enzymes is impaired.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
  • Examples of such diseases include, but are not limited to, autoimmune diseases [JAMA. 2003, 290(13):1721-8; JAMA. 2003, 290(13):1713-20], diabetes [Diabetes. 2003, 52(9):2274-8], neurological disorders such as epilepsy [J. Neurosci. 2003, 23(24):8471-9], Parkinson's disease [J. Neurosci. 2003, 23(23):8302-9; Lancet. 2003, 362(9385):712] or Creutzfeldt-Jakob disease [Clin. Neurophysiol.
  • Actin binding proteins refers to proteins that bind actin, such as actin cross-linking, actin bundling, F-actin capping, actin monomer binding, actin lateral binding, actin depolymerizing, actin monomer sequestering, actin filament severing, actin modulating, membrane associated actin binding, actin thin filament length regulation, and actin polymerizing proteins.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases in which actin binding is impaired.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, neuromuscular diseases such as muscular dystrophy [Neurology. 2003, 61(3):404-6], cancerous diseases [Urology. 2003, 61(4):845-50; J. Cutan. Pathol. 2002, 29(7):430; Cancer. 2002, 94(6):1777-86; Clin. Cancer Res.
  • Protein binding proteins The phrase "protein binding proteins" refers to proteins involved in diverse biological functions through binding other proteins.
  • Examples of such biological functions include intermediate filament binding, LIM-domain binding, LRR-domain binding, clathrin binding, ARF binding, vinculin binding, KU70 binding, troponin C binding, PDZ-domain binding, SH3-domain binding, fibroblast growth factor binding, membrane-associated protein with guanylate kinase activity interacting, Wnt-protein binding, DEAD/H-box RNA helicase binding, β-amyloid binding, myosin binding, TATA-binding protein binding, DNA topoisomerase I binding, polypeptide hormone binding, RHO binding, FH1-domain binding, syntaxin-1 binding, HSC70-interacting, transcription factor binding, metarhodopsin binding, tubulin binding, JUN kinase binding, RAN protein binding, protein signal sequence binding, importin α export receptor, poly-glutamine tract binding, protein carrier, ⁇ -catenin binding, protein C-terminus binding, lipoprotein binding, cytoskeletal protein binding protein, nuclear localization sequence binding
  • compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases which are associated with impaired protein binding.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, neurological and psychiatric diseases [J. Neurosci. 2003, 23(25):8788-99; Neurobiol. Dis. 2003, 14(1):146-56; J. Neurosci. 2003, 23(17):6956-64; Am. J. Pathol.
  • Ligand binding or carrier proteins refers to proteins involved in diverse biological functions such as: pyridoxal phosphate binding, carbohydrate binding, magnesium binding, amino acid binding, cyclosporin A binding, nickel binding, chlorophyll binding, biotin binding, penicillin binding, selenium binding, tocopherol binding, lipid binding, drug binding, oxygen transporter, electron transporter, steroid binding, juvenile hormone binding, retinoid binding, heavy metal binding, calcium binding, protein binding, glycosaminoglycan binding, folate binding, odorant binding, lipopolysaccharide binding and nucleotide binding.
  • compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases which are associated with impaired function of these proteins.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, neurological disorders [J. Med. Genet. 2003, 40(10):733-40; J. Neuropathol. Exp. Neurol. 2003, 62(9):968-75; J. Neurochem. 2003, 87(2):427-36], autoimmune diseases (N. Engl. J. Med.
  • ATPases refers to enzymes that catalyze the hydrolysis of ATP to ADP, releasing energy that is used in the cell. This group includes enzymes such as plasma membrane cation-transporting ATPase, ATP-binding cassette (ABC) transporter, magnesium-ATPase, hydrogen-/sodium-translocating ATPase or ATPase translocating any other elements, arsenite-transporting ATPase, protein-transporting ATPase, DNA translocase, P-type ATPase, and hydrolase, acting on acid anhydrides involved in cellular and subcellular movement.
  • ABC ATP-binding cassette
  • compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases which are associated with impaired hydrolysis of ATP to ADP or impaired use of the resulting energy.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, infectious diseases such as Helicobacter pylori ulcers [BMC Gastroenterology 2003, 3:31 (published 6 November 2003)], neurological, muscular and psychiatric diseases [Int. J. Neurosci.
  • Carboxylic ester hydrolases refers to hydrolytic enzymes acting on carboxylic ester bonds such as N-acetylglucosaminylphosphatidylinositol deacetylase, 2-acetyl-1-alkylglycerophosphocholine esterase, aminoacyl-tRNA hydrolase, arylesterase, carboxylesterase, cholinesterase, gluconolactonase, sterol esterase, acetylesterase, carboxymethylenebutenolidase, protein-glutamate methylesterase, lipase, and 6-phosphogluconolactonase.
  • compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the hydrolytic cleavage of a covalent bond with accompanying addition of water (-H being added to one product of the cleavage and -OH to the other) is abnormal so that a beneficial effect may be achieved by modulation of such reaction.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, the autoimmune neuromuscular disease myasthenia gravis, which is treated with cholinesterase inhibitors.
  • Hydrolase, acting on ester bonds refers to hydrolytic enzymes acting on ester bonds such as nucleases, sulfuric ester hydrolase, carboxylic ester hydrolases, thiolester hydrolase, phosphoric monoester hydrolase, phosphoric diester hydrolase, triphosphoric monoester hydrolase, diphosphoric monoester hydrolase, and phosphoric triester hydrolase.
  • compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the hydrolytic cleavage of a covalent bond with accompanying addition of water (-H being added to one product of the cleavage and -OH to the other) is abnormal.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
  • Hydrolases refers to hydrolytic enzymes such as GPI-anchor transamidase, peptidases, hydrolases, acting on ester bonds, glycosyl bonds, ether bonds, carbon-nitrogen (but not peptide) bonds, acid anhydrides, acid carbon-carbon bonds, acid halide bonds, acid phosphorus-nitrogen bonds, acid sulfur-nitrogen bonds, acid carbon-phosphorus bonds, acid sulfur-sulfur bonds.
  • compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases in which the hydrolytic cleavage of a covalent bond with accompanying addition of water (-H being added to one product of the cleavage and -OH to the other) is abnormal.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, cancerous diseases [Cancer.
  • Enzymes refers to naturally occurring or synthetic macromolecular substances composed mostly of protein, that catalyze, to various degrees of specificity, at least one (bio)chemical reaction at relatively low temperatures.
  • RNA that has catalytic activity (ribozyme) is often also regarded as enzymatic.
  • enzymes are mainly proteinaceous and are often easily inactivated by heating or by protein-denaturing agents.
  • the substances upon which they act are known as substrates, for which the enzyme possesses a specific binding or active site.
  • the group of enzymes includes various proteins possessing enzymatic activities such as mannosylphosphate transferase, para-hydroxybenzoate:polyprenyltransferase, Rieske iron-sulfur protein, imidazoleglycerol-phosphate synthase, sphingosine hydroxylase, tRNA 2'-phosphotransferase, sterol C-24(28) reductase, C-8 sterol isomerase, C-22 sterol desaturase, C-14 sterol reductase, C-3 sterol dehydrogenase (C-4 sterol decarboxylase), 3-keto sterol reductase, C-4 methyl sterol oxidase, dihydronicotinamide riboside quinone reductase, glutamate phosphate reductase, DNA repair enzyme, telomerase, α-ketoacid dehydrogenase, ⁇ -alanyl-do
  • compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases which can be ameliorated by modulating the activity of various enzymes which are involved both in enzymatic processes inside cells as well as in cell signaling.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
  • Cytoskeletal proteins refers to proteins involved in the structure formation of the cytoskeleton.
  • compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases which are caused by or due to abnormalities in the cytoskeleton, including cancerous cells, and diseased cells such as cells that do not propagate, grow or function normally.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, liver diseases such as cholestatic diseases [Lancet. 2003, 362(9390):1112-9], vascular diseases [J. Cell Biol.
  • Structural proteins refers to proteins involved in the structure formation of the cell, such as structural proteins of ribosome, cell wall structural proteins, structural proteins of cytoskeleton, extracellular matrix structural proteins, extracellular matrix glycoproteins, amyloid proteins, plasma proteins, structural proteins of eye lens, structural protein of chorion (sensu Insecta), structural protein of cuticle (sensu Insecta), puparial glue protein (sensu Diptera), structural proteins of bone, yolk proteins, structural proteins of muscle, structural protein of vitelline membrane (sensu Insecta), structural proteins of peritrophic membrane (sensu Insecta), and structural proteins of nuclear pores.
  • compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases which are caused by abnormalities in the cytoskeleton, including cancerous cells, and diseased cells such as cells that do not propagate, grow or function normally.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, blood vessels diseases such as aneurysms [Cardiovasc. Res. 2003, 60(1):205-13], joint diseases [Rheum. Dis. Clin. North Am.
  • Ligands refers to proteins that bind to another chemical entity to form a larger complex, involved in various biological processes, such as signal transduction, metabolism, growth and differentiation, etc.
  • This group of proteins includes opioid peptides, baboon receptor ligand, branchless receptor ligand, breathless receptor ligand, ephrin, frizzled receptor ligand, frizzled-2 receptor ligand, heartless receptor ligand, Notch receptor ligand, patched receptor ligand, punt receptor ligand, Ror receptor ligand, saxophone receptor ligand, SE20 receptor ligand, sevenless receptor ligand, smooth receptor ligand, thickveins receptor ligand, Toll receptor ligand, Torso receptor ligand, death receptor ligand, scavenger receptor ligand, neuroligin, integrin ligand, hormones, pheromones, growth factors, and sulfonylurea receptor ligand.
  • compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases involved in impaired hormone function or diseases which involve abnormal secretion of proteins which may be due to abnormal presence, absence or impaired normal response to normal levels of secreted proteins.
  • Those secreted proteins include hormones, neurotransmitters, and various other proteins secreted by cells to the extracellular environment.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
  • Examples of such diseases include, but are not limited to, analgesia inhibited by orphanin FQ/nociceptin [Shane R., et al., (2001) Brain Res., 907(1-2):109-16], stroke protected by estrogen [Alkayed N. J., et al., (2001) J. Neurosci., 21(19):7543-50], atherosclerosis associated with growth hormone deficiency [Elhadd T. A., et al., (2001) J. Clin. Endocrinol. Metab., 86(9):4223-32], diabetes inhibited by α-galactosylceramide [Hong S., et al., (2001) Nat.
  • Signal transducer refers to proteins such as activin inhibitors, receptor-associated proteins, α-2 macroglobulin receptors, morphogens, quorum sensing signal generators, quorum sensing response regulators, receptor signaling proteins, ligands, receptors, two-component sensor molecules, and two-component response regulators.
  • compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which signal transduction is impaired, either as a cause or as a result of the disease.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, altered sexual dimorphism associated with signal transducer and activator of transcription 5b [Udy G. B., et al., (1997) Proc. Natl. Acad. Sci.
  • RNA polymerase II transcription factors refers to proteins such as specific and non-specific RNA polymerase II transcription factors, enhancer binding, ligand-regulated transcription factor, and general RNA polymerase II transcription factors.
  • Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases involving impaired function of RNA polymerase II transcription factors.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, cardiac diseases [Cell Cycle.
  • RNA binding proteins refers to RNA binding proteins involved in splicing and translation regulation such as tRNA binding proteins, RNA helicases, double-stranded RNA and single-stranded RNA binding proteins, mRNA binding proteins, snRNA cap binding proteins, 5S RNA and 7S RNA binding proteins, poly-pyrimidine tract binding proteins, snRNA binding proteins, and AU-specific RNA binding proteins.
  • compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases involving transcription and translation factors such as helicases, isomerases, histones and nucleases, diseases where there is impaired transcription, splicing, post-transcriptional processing, translation or stability of the RNA.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, cancerous diseases such as lymphomas [Tumori. 2003, 89(3):278-84], prostate cancer [Prostate. 2003, 57(1):80-92] or lung cancer [J. Pathol. 2003, 200(5):640-6], blood diseases, such as Fanconi anemia [Curr. Hematol.
  • cardiovascular diseases such as atherosclerosis [J. Thromb. Haemost
  • Nucleic acid binding proteins refers to proteins involved in RNA and DNA synthesis and expression regulation such as transcription factors, RNA and DNA binding proteins, zinc fingers, helicase, isomerase, histones, nucleases, ribonucleoproteins, and transcription and translation factors.
  • compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases involving DNA or RNA binding proteins such as: helicases, isomerases, histones and nucleases, for example diseases where there is abnormal replication or transcription of DNA and RNA respectively.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, neurological diseases such as retinitis pigmentosa [Am. J. Ophthalmol. 2003, 13 (4):678-87], parkinsonism [Proc. Natl. Acad.
  • Proteins involved in Metabolism The phrase “proteins involved in metabolism” refers to proteins involved in the totality of the chemical reactions and physical changes that occur in living organisms, comprising anabolism and catabolism; may be qualified to mean the chemical reactions and physical processes undergone by a particular substance, or class of substances, in a living organism.
  • This group includes proteins involved in the reactions of cell growth and maintenance such as: metabolism resulting in cell growth, carbohydrate metabolism, energy pathways, electron transport, nucleobase, nucleoside, nucleotide and nucleic acid metabolism, protein metabolism and modification, amino acid and derivative metabolism, protein targeting, lipid metabolism, aromatic compound metabolism, one-carbon compound metabolism, coenzymes and prosthetic group metabolism, sulfur metabolism, phosphorus metabolism, phosphate metabolism, oxygen and radical metabolism, xenobiotic metabolism, nitrogen metabolism, fat body metabolism (sensu Insecta), protein localization, catabolism, biosynthesis, toxin metabolism, methylglyoxal metabolism, cyanate metabolism, glycolate metabolism, carbon utilization and antibiotic metabolism.
  • compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat diseases involving cell metabolism.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
  • metabolism-related diseases include, but are not limited to, multisystem mitochondrial disorder caused by mitochondrial DNA cytochrome C oxidase II deficiency [Campos Y., et al., (2001) Ann. Neurol. 50(3):409-13], conduction defects and ventricular dysfunction in the heart associated with heterogeneous connexin43 expression [Gutstein D.
  • Cell growth and/or maintenance proteins refers to proteins involved in any biological process required for cell survival, growth and maintenance, including proteins involved in biological processes such as cell organization and biogenesis, cell growth, cell proliferation, metabolism, cell cycle, budding, cell shape and cell size control, sporulation (sensu Saccharomyces), transport, ion homeostasis, autophagy, cell motility, chemi-mechanical coupling, membrane fusion, cell-cell fusion, and stress response.
  • compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins may be used to treat or prevent diseases such as cancer, degenerative diseases, for example neurodegenerative diseases or conditions associated with aging, or alternatively, diseases wherein apoptosis which should have taken place, does not take place.
  • Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases, detection of predisposition to a disease, and determination of the stage of a disease.
  • diseases include, but are not limited to, ataxia-telangiectasia associated with ataxia-telangiectasia mutated deficiency [Hande et al., (2001) Hum. Mol. Genet., 10(5):519-28], osteoporosis associated with osteonectin deficiency [Delany et al., (2000) J. Clin.
  • Variants of proteins which accumulate an element/compound Variant proteins whose wild-type version naturally binds a certain compound or element inside the cell, such as for storage, may have a therapeutic effect as secreted variants.
  • Ferritin accumulates iron inside the cells.
  • a secreted variant of this protein is expected to bind plasma iron and reduce its levels, thereby having therapeutic effects in hemodisorders which are characterized by high levels of free iron in the blood.
  • Autoantigens refers to "self" proteins which evoke an autoimmune response. Examples of autoantigens are listed in Table 8, below. Secreted splice variants of such autoantigens can be used to treat such autoimmune disorders.
  • the secreted variants of the present invention may treat these multiple symptoms.
  • Therapeutic mechanisms of such variants may include: (i) sequestration of auto-antibodies to thereby reduce their circulating levels; (ii) antigen-specific immunotherapy, based on the observation that prior systemic administration of a protein antigen could inhibit the subsequent generation of the immune response to the same antigen (this has been demonstrated in mouse models for myasthenia gravis and type I diabetes).
  • any novel variant of autoantigens may be used for "specific immunoadsorption" - leading to a specific immunodepletion of an antibody when used in immunoadsorption columns.
  • Variants of autoantigens are also of diagnostic value. The diagnosis of many autoimmune disorders is based on looking for specific autoantibodies to autoantigens known to be associated with an autoimmune condition. Most of the diagnostic techniques are based on having a recombinant form of the autoantigen and using it to screen for serum autoantibodies. However, these antibodies may bind the variants of the present invention with a similar or augmented affinity.
  • TPO is a known autoantigen in thyroid autoimmunity.
  • TPOzanelli also takes part in the autoimmune process and can bind the same antibodies as TPO [Biochemistry. 2001 Feb 27; 40(8):2572-9].
  • the nucleic acid sequences of the present invention, the proteins encoded thereby and the cells and antibodies described hereinabove can be used in screening assays, therapeutic or prophylactic methods of treatment, or predictive medicine (e.g., diagnostic and prognostic assays, including those used to monitor clinical trials, and pharmacogenetics).
  • the nucleic acids of the present invention can be used to: (i) express a protein of the invention in a host cell in culture or in an intact multicellular organism following, e.g., gene therapy; (ii) detect an mRNA; or (iii) detect an alteration in a gene to which a nucleic acid of the invention specifically binds; or to modulate such a gene's activity.
  • the nucleic acids and proteins of the present invention can also be used to treat disorders characterized by either insufficient or excessive production of those nucleic acids or proteins, a failure in a biochemical pathway in which they normally participate in a cell, or other aberrant or unwanted activity relative to the wild type protein (e.g., inappropriate enzymatic activity or unproductive protein folding).
  • the proteins of the invention are useful in screening for naturally occurring protein substrates or other compounds (e.g., drugs) that modulate protein activity.
  • the antibodies of the invention can also be used to detect and isolate the proteins of the invention, to regulate their bioavailability, or otherwise modulate their activity. Exemplary uses, and the methods by which they can be achieved, are described in detail below. Possible utilities for variants of drug targets Finding a variant of a known drug target can be advantageous in cases where the known drug has a major side effect, the therapeutic efficacy of the known drug is moderate, or a known drug has failed clinical trials due to one of the above.
  • a drug which is specific to a new protein variant of the target or to the target only (without affecting the novel variant) is likely to have lower side effects as compared to the original drug, higher therapeutic efficacy, and broader or different range of activities.
  • COX3 which is a variant of COX1
  • COX3 is known to bind COX inhibitors with a different affinity than COX1.
  • This molecule is also associated with different physiological processes than COX1. Therefore, a compound specific to COX1 or compounds specific to
  • COX3 would have lower side effects (by not affecting the other variants), and higher therapeutic efficacy to larger populations.
  • Diseases that may be treated/diagnosed using the teaching of the present invention Inflammatory diseases Examples of inflammatory diseases include, but are not limited to, chronic inflammatory diseases and acute inflammatory diseases. Inflammatory diseases associated with hypersensitivity Examples of hypersensitivity include, but are not limited to, Types I-IV hypersensitivity, immediate hypersensitivity, antibody mediated hypersensitivity, immune complex mediated hypersensitivity, T lymphocyte mediated hypersensitivity and DTH. An example of type I or immediate hypersensitivity is asthma.
  • type II hypersensitivity examples include, but are not limited to, rheumatoid diseases, rheumatoid autoimmune diseases, rheumatoid arthritis [Krenn V. et al, Histol Histopathol 2000 Jul;15
  • paraneoplastic neurological diseases cerebellar atrophy, paraneoplastic cerebellar atrophy, non-paraneoplastic stiff man syndrome, cerebellar atrophies, progressive cerebellar atrophies, encephalitis, Rasmussen's encephalitis, amyotrophic lateral sclerosis, Sydenham chorea, Gilles de la Tourette syndrome, polyendocrinopathies, autoimmune polyendocrinopathies [Antoine JC. and Honnorat J. Rev Neurol (Paris) 2000 Jan;156 (1):23], neuropathies, dysimmune neuropathies [Nobile-Orazio E.
  • vasculitises necrotizing small vessel vasculitises, microscopic polyangiitis, Churg and Strauss syndrome, glomerulonephritis, pauci-immune focal necrotizing glomerulonephritis, crescentic glomerulonephritis [Noel LH. Ann Med Interne (Paris). 2000 May;151 (3):178], antiphospholipid syndrome [Flamholz R. et al, J Clin Apheresis 1999;14 (4):171], heart failure, agonist-like β-adrenoceptor antibodies in heart failure [Wallukat G.
  • Type IV or T cell mediated hypersensitivity include, but are not limited to, rheumatoid diseases, rheumatoid arthritis [Tisch R, McDevitt HO. Proc Natl Acad Sci U S A 1994 Jan 18;91 (2):437], systemic diseases, systemic autoimmune diseases, systemic lupus erythematosus [Datta SK., Lupus 1998;7 (9):591], glandular diseases, glandular autoimmune diseases, pancreatic diseases, pancreatic autoimmune diseases, Type 1 diabetes [Castano L. and Eisenbarth GS. Ann. Rev. Immunol.
  • autoimmune diseases include, but are not limited to, cardiovascular diseases, rheumatoid diseases, glandular diseases, gastrointestinal diseases, cutaneous diseases, hepatic diseases, neurological diseases, muscular diseases, nephric diseases, diseases related to reproduction, connective tissue diseases and systemic diseases.
  • autoimmune cardiovascular and blood diseases include, but are not limited to atherosclerosis [Matsuura E.
  • autoimmune rheumatoid diseases include, but are not limited to rheumatoid arthritis [Krenn V. et al, Histol Histopathol 2000 Jul;15 (3):791; Tisch R, McDevitt HO. Proc Natl Acad Sci U S A 1994 Jan 18;91 (2):437] and ankylosing spondylitis [Jan Voswinkel et al, Arthritis Res 2001; 3 (3):189].
  • autoimmune glandular diseases include, but are not limited to, autoimmune diseases of the pancreas, Type 1 diabetes [Castano L. and Eisenbarth GS. Ann. Rev. Immunol. 8:647; Zimmet P.
  • autoimmune gastrointestinal diseases include, but are not limited to, chronic inflammatory intestinal diseases [Garcia Herola A. et al, Gastroenterol Hepatol. 2000 Jan;23 (1):16], celiac disease [Landau YE. and Shoenfeld Y. Harefuah 2000 Jan 16;138 (2):122], colitis, ileitis, Crohn's disease and ulcerative colitis.
  • autoimmune cutaneous diseases include, but are not limited to, autoimmune bullous skin diseases, such as, but not limited to, pemphigus vulgaris, bullous pemphigoid and pemphigus foliaceus.
  • autoimmune hepatic diseases include, but are not limited to, hepatitis, autoimmune chronic active hepatitis [Franco A. et al, Clin Immunol Immunopathol 1990 Mar;54 (3):382], primary biliary cirrhosis [Jones DE. Clin Sci (Colch) 1996 Nov;91 (5):551; Strassburg CP. et al, Eur J Gastroenterol Hepatol.
  • autoimmune neurological diseases include, but are not limited to, multiple sclerosis [Cross AH. et al, J Neuroimmunol 2001 Jan 1;112 (1-2):1], Alzheimer's disease [Oron L. et al, J Neural Transm Suppl. 1997;49:77], myasthenia gravis [Infante AJ. and Kraig E, Int Rev Immunol 1999;18 (1-2):83; Oshima M.
  • autoimmune muscular diseases include, but are not limited to, myositis, autoimmune myositis and primary Sjogren's syndrome [Feist E.
  • autoimmune nephric diseases include, but are not limited to, nephritis and autoimmune interstitial nephritis [Kelly CJ. J Am Soc Nephrol 1990 Aug;1 (2):140], glomerular nephritis.
  • autoimmune diseases related to reproduction include, but are not limited to, repeated fetal loss [Tincani A. et al, Lupus 1998;7 Suppl 2:S107-9].
  • autoimmune connective tissue diseases include, but are not limited to, ear diseases, autoimmune ear diseases [Yoo TJ. et al, Cell Immunol 1994 Aug;157 (1):249] and autoimmune diseases of the inner ear [Gloddek B. et al, Ann N Y Acad Sci 1997 Dec 29;830:266].
  • autoimmune systemic diseases include, but are not limited to, systemic lupus erythematosus [Erikson J. et al, Immunol Res 1998;17 (1-2):49] and systemic sclerosis [Renaudineau Y. et al, Clin Diagn Lab Immunol. 1999 Mar;6 (2):156; Chan OT.
  • infectious diseases include, but are not limited to, chronic infectious diseases, subacute infectious diseases, acute infectious diseases, viral diseases, bacterial diseases, protozoan diseases, parasitic diseases, fungal diseases, mycoplasma diseases, and prion diseases.
  • Graft rejection diseases Examples of diseases associated with transplantation of a graft include, but are not limited to, graft rejection, chronic graft rejection, subacute graft rejection, hyperacute graft rejection, acute graft rejection, and graft versus host disease.
  • Allergic diseases include, but are not limited to, asthma, hives, urticaria, pollen allergy, dust mite allergy, venom allergy, cosmetics allergy, latex allergy, chemical allergy, drug allergy, insect bite allergy, animal dander allergy, stinging plant allergy, poison ivy allergy and food allergy.
  • Cancerous diseases include but are not limited to carcinoma, lymphoma, blastoma, sarcoma, and leukemia. Particular examples of cancerous diseases include, but are not limited to: Myeloid leukemia such as Chronic myelogenous leukemia. Acute myelogenous leukemia with maturation.
  • Acute promyelocytic leukemia Acute nonlymphocytic leukemia with increased basophils, Acute monocytic leukemia.
  • Acute myelomonocytic leukemia with eosinophilia malignant lymphoma, such as Burkitt's Non-Hodgkin's
  • Lymphocytic leukemia such as acute lymphoblastic leukemia.
  • Chronic lymphocytic leukemia Myeloproliferative diseases, such as Solid tumors Benign Meningioma, Mixed tumors of salivary gland, Colonic adenomas; Adenocarcinomas, such as Small cell lung cancer, Kidney, Uterus, Prostate, Bladder, Ovary, Colon, Sarcomas, Liposarcoma, myxoid, Synovial sarcoma, Rhabdomyosarcoma (alveolar), Extraskeletal myxoid chondrosarcoma, Ewing's tumor; others include Testicular and ovarian dysgerminoma, Retinoblastoma, Wilms' tumor, Neuroblastoma, Malignant melanoma, Mesothelioma, breast, skin, prostate, and ovarian.
  • nucleic acid sequences of the present invention having RNA editing sites, and the proteins encoded thereby and the cells and antibodies described hereinabove can be used in, for example, screening assays, therapeutic or prophylactic methods of treatment, or predictive medicine (e.g., diagnostic and prognostic assays, including those used to monitor clinical trials, and pharmacogenetics).
  • the nucleic acids of the invention can be used to: (i) express a protein of the invention in a host cell (in culture or in an intact multicellular organism following, e.g., gene therapy, given, of course, that the transcript in question contains more than untranslated sequence); (ii) detect an mRNA; or (iii) detect an alteration in a gene to which a nucleic acid of the invention specifically binds; or to modulate such a gene's activity.
  • the nucleic acids and proteins of the invention can also be used to treat disorders characterized by either insufficient or excessive production of those nucleic acids or proteins, a failure in a biochemical pathway in which they normally participate in a cell, or other aberrant or unwanted activity relative to the wild type protein (e.g., inappropriate enzymatic activity or unproductive protein folding).
  • the proteins of the invention are especially useful in screening for naturally occurring protein substrates or other compounds (e.g., drugs) that modulate protein activity.
  • the antibodies of the invention can also be used to detect and isolate the proteins of the invention, to regulate their bioavailability, or otherwise modulate their activity.
  • EXAMPLE 8 Examples of annotation This section presents examples of annotations, assigned to transcripts having RNA editing, as described in Example 1 above.
  • the arbitrary name of each fragment is composed as follows: Compugen contig name (see Table 1)_segment number_editing site location within the segment. AA554866_1_1403 #SEQLIST AK024183
  • AI138826_1_253 #SEQLIST BG952531 AI537687 AI138826_1_274 #SEQLIST AL702589 AI537687 AI138826_1_279 #SEQLIST BG952531 AI537687 AI138826_1_281 #SEQLIST AI537687
  • H20403_31_502 #GENE_SYMBOL KIAA1936 #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_P #GO_Acc 6118 #GO_Desc electron transport #SEQLIST HSM802030
  • HSIFNABR_30_459 #GENE_SYMBOL IFNABR ;IFNAR2 ;IFNARB #GO_F #GO_Acc 3800 #GO_Desc antiviral response protein activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 4896 #GO_Desc hematopoietin/interferon-class (D200-domain) cytokine receptor activity #GO_F #GO_Acc 4905 #GO_Desc interferon-alpha/beta receptor activity #GO_P #GO_Acc 6357 #GO_Desc regulation of transcription from Pol II promoter #GO_P #GO_Acc 7166 #GO_Desc cell surface receptor linked signal transduction #GO_P #GO_Acc 7259 #GO_Desc JAK-STAT cascade #GO_P #GO_Acc 8283 #GO_Desc cell proliferation #GO_P #GO_Acc 9615 #GO_Desc response to viruses INDICATION Anti-inflammatory ;
  • ; Cirrhosis hepatic ; Cytokine ; Diabetes, Type II ; Fibromyalgia ; Fibrosis, pulmonary ; Gene therapy ; Hepatoprotective ; Immunoconjugate, other ; Immunodeficiency, general ; Immunoglobulin, non-MAb ; Immunological ; Immunomodulator, anti-infective ; Immunostimulant, anti-AIDS ; Immunostimulant, other ; Immunosuppressant ; Infection, HIV/AIDS ; Infection, coronavirus ; Infection, coronavirus, prophylaxis ; Infection, general ; Infection, hepatitis virus, general ; Infection, hepatitis-B virus
  • ; Infection, hepatitis-C virus ; Infection, herpes simplex virus ; Infection, herpes virus, general ; Infection, human papilloma virus ; Infection, otological ; Infection, staphylococcal prophylaxis ; Infection, streptococcal prophylaxis ; Infection, varicella zoster virus ; Inflammation, brain ; Keratoconjunctivitis ; Macular degeneration ; Monoclonal antibody, other ; Multiple sclerosis treatment ; Multiple sclerosis, general ; Musculoskeletal ; Neurological ; Non-antisense oligonucleotides ; Ophthalmological ; Pemphigus ; Prophylactic vaccine ; Recombinant interferon ; Recombinants, other ; Respiratory ; Rhinitis, allergic, general ; Sepsis ; Septic shock treatment ; Sjogren's syndrome
  • HSIFNABR_33_58 #GENE_SYMBOL IFNABR ; IFNAR2 ; IFNARB #GO_F #GO_Acc 3800 #GO_Desc antiviral response protein activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 4896 #GO_Desc hematopoietin/interferon-class (D200-domain) cytokine receptor activity #GO_F #GO_Acc 4905 #GO_Desc interferon-alpha/beta receptor activity #GO_P #GO_Acc 6357 #GO_Desc regulation of transcription from Pol II promoter #GO_P #GO_Acc 7166 #GO_Desc cell surface receptor linked signal transduction #GO_P #GO_Acc 7259 #GO_Desc JAK-STAT cascade #GO_P #GO_Acc 8283 #GO_Desc cell proliferation #GO_P #GO_Acc 9615 #GO_Desc response to viruses INDICATION Anti-inflammatory ; Anti
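The fragment names used in this Example (contig name, segment number and editing site location, joined by underscores) can be split back into their parts mechanically. The following is a minimal sketch only; the function and class names are hypothetical, and splitting from the right is an assumption made so that a contig name that itself contained an underscore would still parse.

```python
from typing import NamedTuple


class EditingSiteName(NamedTuple):
    contig: str    # Compugen contig name
    segment: int   # segment number
    position: int  # editing site location within the segment


def parse_site_name(name: str) -> EditingSiteName:
    """Split 'contig_segment_position' into its three parts.

    Splitting from the right is an assumption: it keeps the contig name
    intact even if it were to contain an underscore.
    """
    contig, segment, position = name.rsplit("_", 2)
    return EditingSiteName(contig, int(segment), int(position))


if __name__ == "__main__":
    print(parse_site_name("HSIFNABR_30_459"))
```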
  • the attached CD-ROM1 contains 4 files as follows:
  • RNA Editing in Coding Regions This Example relates to locating RNA editing sites which affect proteins, and hence which are located in the coding region. To locate such editing sites, this Example describes the use of conservation data between human and mouse.
  • the current method uses LEADS (the previously described sequence discovery engine and database) to find all potential mismatches between RNA and DNA, and maps them to the human genome and also separately to the mouse genome (see the results in Appendices 2 and 3). Flanking regions around each mismatch of 200 bp (100 bp on each side of the mismatch) were then obtained, and were aligned between human and mouse sequences. The method looked for aligned sequences in which the same type of mismatch occurs in conserved regions at the same location for both the human and mouse sequences.
  • the method then preferably includes locating all potential loops that are conserved between human and mouse sequences, and searching for editing sites in this region with the EST data.
  • a list of A->G putative editing sites detected according to this method appears in Appendix 2, while a corresponding list for putative C->T sites appears in Appendix 3.
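The human-mouse comparison described above can be illustrated with a minimal sketch. It assumes that the human and mouse flanking regions have already been aligned without gaps, so that the mismatch sits at the same index in all four strings; the function names and the 85% conservation threshold are hypothetical and are not taken from the text.

```python
def identity(a: str, b: str) -> float:
    """Fraction of identical columns between two equal-length aligned strings."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def conserved_mismatch(human_dna: str, human_rna: str,
                       mouse_dna: str, mouse_rna: str,
                       site: int, min_identity: float = 0.85) -> bool:
    """True if both species show the same DNA/RNA mismatch at `site`
    and the genomic flanks are conserved (threshold is an assumption)."""
    human_pair = (human_dna[site], human_rna[site])
    mouse_pair = (mouse_dna[site], mouse_rna[site])
    is_mismatch = human_pair[0] != human_pair[1]
    same_type = human_pair == mouse_pair
    return is_mismatch and same_type and identity(human_dna, mouse_dna) >= min_identity


if __name__ == "__main__":
    # Toy 14-nt flanks; the text uses 100 bp on each side of the mismatch.
    print(conserved_mismatch("GATTACAGATTACA", "GATTACGGATTACA",
                             "GATTACAGATCACA", "GATTACGGATCACA", site=6))
```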
  • One example of a validated RNA editing site that was predicted according to the present invention is as follows, for the BLCAP gene: in the DNA sequence, there is only "A", but in the RNA one can see both "A" and "G", which is the hallmark of editing. This case is the first non-ion channel protein that undergoes editing in its coding sequence. In this case, at the protein level there is a Y->C substitution. The sequence change is shown in the illustration in Figure 12.
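The effect of such an edit at the protein level can be checked with the standard genetic code, since inosine is read as guanosine. The sketch below is illustrative only: the text gives the amino acid change (Y->C) but not the codon, so the codon TAT used here is an assumption (either tyrosine codon, TAT or TAC, gives cysteine when its middle A is read as G).

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def edit_codon(codon: str, offset: int) -> tuple:
    """Return (unedited, edited) amino acids for an A-to-I edit at `offset`.

    Inosine is read as G, so the edited codon carries G at that position.
    """
    if codon[offset] != "A":
        raise ValueError("A-to-I editing requires an adenosine at the edited position")
    edited = codon[:offset] + "G" + codon[offset + 1:]
    return CODON_TABLE[codon], CODON_TABLE[edited]


if __name__ == "__main__":
    print(edit_codon("TAT", 1))  # assumed tyrosine codon
```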
  • the method of the present invention was optionally and preferably performed as follows: 1. Marking bad sequences. 2. Marking regions with higher sequencing error probability
  • the method of the present invention preferably has the following detailed stages. 1. Marking "bad" sequences involves removing sequences which are defective and/or otherwise could be problematic or create noise in the method of the present invention. Examples include sequences with an excessively high error rate in a node. These sequences might be simply "bad", or wrongly clustered, and are preferably discarded from the rest of the analysis.
  • a refseq is an RNA that appears as an RNA and as a refseq (reference sequence, derived from a project by NCBI - see www.ncbi.nlm.nih.gov/RefSeq/ for an example).
  • small_node_seq_bnd seq_bnd for small nodes
  • small_node_size maximal size for small nodes. 2. Marking "polluted" regions - regions where the sequencing error probability seems higher for some reason.
  • RNA editing sites calculating the probabilities of columns with disagreements, given a model of no-editing site, and extracting the ones for which the probability is below a given bound.
  • This method involves the use of the null hypothesis, with a threshold for determining acceptance of an RNA editing site.
  • Multiplicative probability factors for the groupings setting a new sequencing-error probability for the different conditions. For example, for clone-disagreement, this factor should be much larger than 1, indicating that this is probably a sequencing error.
  • ADAR1 targets Using comparative genomics and expressed sequences analysis, four additional human substrates were identified and experimentally verified: FLNA, BLCAP, CYFIP2 and IGFBP7, more than the sum total previously reported. Editing of three of these substrates was also verified in mouse, and two substrates were validated in chicken as well (see
  • the method of the present invention is designed to find genomic sites at which the expressed nucleotide diverges from the genomic one. Such occurrences could be interpreted as either SNPs or editing, and it is therefore not surprising to find that all of the editing sites reported here are erroneously recorded as SNPs in dbSNP (dbSNP id's: BLCAP - rs11557677; FLNA - rs3179473; CYFIP2 - rs3207362; IGFBP7 - rs1133243 and rs11555284). All of these presumed SNPs have no evidence for genomic polymorphism, and were included in dbSNP based on expressed data alone.
  • the full sequencing data are given in Figure 7 below; additional data is also provided.
  • the full-length BLCAP (bladder cancer associated protein) cDNA contains a complete open reading frame (ORF) encoding a protein composed of 87 amino acids.
  • Comparison of mouse and human BLCAP genomic loci revealed an intronless organization of the coding region in both species as well as a highly conserved structure having 91% and 100% identity at the DNA (coding region) and protein levels.
  • the function of this differentially expressed protein is not yet known but it is expressed mainly in brain tissues and B cells(/2) and appears to be down-regulated during bladder cancer progression( 3).
  • An editing site within the BLCAP coding sequence, located at chr20:36,833,001, was identified, inducing a Y->C substitution at the 2nd amino-acid of the final protein.
  • the FLNA (filamin A alpha) protein is a 280-kD (2647 a.a.) protein that crosslinks actin filaments into orthogonal networks in the cortical cytoplasm(i ⁇ ) and participates in the anchoring of membrane proteins with the actin cytoskeleton(75).
  • the resulting remodelling of the cytoskeleton is central to the modulation of cell shape and cell migration.
  • One editing site within the FLNA transcript (chrX: 152,047,854) was identified, resulting in Q->R substitution at amino-acid 2341 in the human and mouse proteins and 2283 in the chicken homologue.
  • the human editing region is predicted to form a 32bp long dsRNA structure with a conserved region within the intron ~200bp downstream to the editing site.
  • the edited amino acid lies within the 22nd rod-like region in the protein, which has been shown to be important for interaction with integrin beta(i ⁇ 5).
  • the same region binds to Rac1(77), which is also known to interact with CYFIP2(i ⁇ °).
  • the CYFIP2 (cytoplasmic FMR1 interacting protein 2) transcript encodes a protein of 1253 amino-acids.
  • CYFIP2 is a member of a highly conserved protein family found in both invertebrates and vertebrates. Human CYFIP2 shares approximately 99% sequence identity with its mouse orthologs(7 ). It is expressed mainly in brain tissues, immune-system cells and kidney(72).
  • One editing site within the CYFIP2 transcript (chr5: 156,717,703) was identified, resulting in a K->E substitution at amino-acid 320 in both the human and mouse proteins. Editing was also observed at the corresponding predicted position in the chicken cDNA.
  • CYFIP2 is a p53 inducible protein(20), thus possibly a pro-apoptotic gene.
  • ADAR1 knock out mice show elevated apoptosis in most tissues thus possibly providing a link between the phenotype of these mice and a potential pro-apoptotic editing target (10).
  • No obvious dsRNA structure in the CYFIP2 pre-mRNA including the editing region could be identified, except for a weak, local pairing.
  • the IGFBP7 (insulin-like growth factor binding protein 7) transcript encodes a protein 282 amino acids in length, and is expressed in a wide range of tissues (12).
  • IGFBP7 is a member of a family of soluble proteins that bind insulin-like growth factors (IGFs) with high affinity.
  • the editing site overlaps with an intron of an antisense transcript BC039519, pairing with which could also trigger editing by ADARs(22).
  • the editing site in the FLNA transcript is located two nucleotides upstream to a splicing site, resembling the R/G editing site of glutamate receptor.
  • seven of the eight nucleotides around the editing site are identical in the two substrates. This might suggest that FLNA, like glutamate receptor, can be edited by ADAR2.
  • the proximity of the editing site in the glutamate receptor to the splicing site has led to speculations on a possible link between editing and splicing.
  • GluR-B mRNA molecules in ADAR2 null mice exhibit almost no editing at the Q/R site, accompanied by inefficient removal of the adjacent intron 11 (8).
  • analysis of the available EST data suggests a positive correlation between editing of the last codon in the exon of FLNA and aberrant retention of the following intron, again suggesting a link between editing and splicing. Editing typically happens in only a fraction of the sequences. Since the coverage of expressed sequences is scarce for many genes, editing sites might be missed by the method of the present invention. For example, the method did not detect editing of the serotonin receptor, which is supported by only one sequence, or editing of KCNA1, which is not supported by any sequence.
  • ADAR2 knockout mice show behavioural phenotypes(25). Therefore it was hypothesized that A-to-I RNA editing has a pivotal role in nervous system functions(23). Notably, while all four novel substrates presented here do not encode ion-channels, at least two of them have functions in the CNS.
  • CYFIP2 interacts with the Fragile-X mental retardation protein(7P), as well as with the FMRP-related proteins FXR1P and FXR2P, and is present in synaptosomal extracts(iP).
  • the Drosophila homologue has also been shown to be required for normal axonal growth and synapsis formation (18, 24).
  • our experimental results suggest that the editing of CYFIP2 is brain specific.
  • FLNA binds a plethora of transmembrane receptors and ion channels(75).
  • the genomic position is marked as a true event (i.e., it is assumed to be an SNP or editing site).
  • sequences of low alignment quality (>10% mismatches)
  • genomic regions where the MA is of low quality
  • the probability of a sequencing or alignment error at a certain position is estimated based on the type of the sequence (RefSeq, RNA or EST) and the quality of the MA at the genomic region (error probabilities: clean regions - RefSeq: 2E-6; RNA: 5E-4; EST: 3E-4; polluted regions - RefSeq: 8E-4; RNA: 5E-3; EST: 5E-2).
  • the probability cut-off against which the different model probabilities are compared is 10^-6 divided by the number of supporting sequences.
  • the prior probability of an SNP is 10^-4. Applying this algorithm to the human and mouse transcriptomes resulted in two lists of putative SNPs/editing events.
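The comparison of a column's probability under the no-editing model against the cut-off can be sketched as follows. This is a simplified illustration, not the model of the invention: it assumes that errors in different sequences are independent, so that the probability of all disagreeing sequences being errors is the product of their per-sequence error probabilities, and it takes "the number of supporting sequences" as a plain parameter. The SNP prior is not modelled, and all function names are hypothetical; only the numeric values come from the text.

```python
import math

# Error probabilities quoted in the text, by region quality and sequence type.
ERROR_PROB = {
    "clean": {"RefSeq": 2e-6, "RNA": 5e-4, "EST": 3e-4},
    "polluted": {"RefSeq": 8e-4, "RNA": 5e-3, "EST": 5e-2},
}
BASE_CUTOFF = 1e-6  # divided by the number of supporting sequences


def null_probability(mismatching_types, region="clean"):
    """Probability that every disagreeing sequence is an error.

    Independence between sequences is an assumption of this sketch.
    """
    return math.prod(ERROR_PROB[region][t] for t in mismatching_types)


def rejects_null(mismatching_types, n_supporting, region="clean"):
    """True if the column is too unlikely to be explained by errors alone."""
    return null_probability(mismatching_types, region) < BASE_CUTOFF / n_supporting


if __name__ == "__main__":
    # Two ESTs disagree with the genome in a clean region, 10 sequences in total.
    print(rejects_null(["EST", "EST"], n_supporting=10))
    # The same column in a polluted region is much easier to explain by errors.
    print(rejects_null(["EST", "EST"], n_supporting=10, region="polluted"))
```

Under these assumptions a single disagreeing EST never passes the cut-off, while two or more disagreeing sequences in a clean region can.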
  • RNA and genomic DNA isolated simultaneously from the same tissue sample were purchased from Biochain Institute (Hayward, CA). In this work we used samples of liver, prostate, uterus, kidney, lung normal and tumor, brain tumor (glioma), cerebellum and frontal lobe. The total RNA underwent oligo-dT primed reverse transcription using Superscript II
  • PCR reactions were done using Abgene ReddyMixTM kit (Takara Bio, Shiga, Japan) using the primers and annealing conditions as detailed in the following.
  • First strand cDNAs or corresponding genomic regions were amplified with suitable primers using Pfu polymerase, to minimize mutation rates during amplification.
  • Amplified fragments were A-tailed using Taq polymerase, gel purified and cloned into pGem-T easy (Promega, Madison, WI).
  • E. coli individual plasmids were sequenced and aligned using ClustalW.
  • Sequencher 4.2 Suite (Gene Codes Corporation) was used for multiple-alignment of the electropherograms.
  • the extent of A-to-I editing is variable, e.g.
  • the level of the guanosine trace is sometimes only a fraction of the adenine trace, while on some occasions the conversion from A to I is almost complete.
  • FLNA - DNA F GACCTGAGACACGAGAAAAACTCC
  • R CGGTCTTACACTCTTTCCCTGC
  • IGFBP7 - RNA F GAGGGCGAGCCGTGC
  • CYFIP2 - DNA F GCGAAGGCAGCCACCCCAAC
  • CYFIP2 - ( DNA & RNA ) F TCGGCGATATGCAGATAGAAC R: GGGACACACACAGAAGCCAAG
  • SNPs in dbSNP were found in the course of sequencing the human genome, by algorithmic search for single nucleotide differences between aligned sequence reads of genomic sequence.
  • This approach has been successful in identifying common SNPs, namely those with a frequency of greater than 1% in a diverse panel of individuals representative of different populations.
  • this approach has concentrated on developing a dense map, with uniform coverage across the existing draft of the human genome1.
  • Sources for erroneous SNP identifications include sequencing errors, mutations and duplications.
  • SNPs in these databases were not seen, meaning that they are either of very low frequency, mis-mapped, or not polymorphic at all4.
  • SNPs were identified using expressed data: aligning millions of available expressed sequence tags (ESTs), one can search clusters of ESTs for possible SNPs5-7.
  • these methods have yielded only tens of thousands of SNPs, not a significant number compared to the millions of SNPs in dbSNP; their importance is due to the fact that the resulting SNPs have an increased likelihood of residing in a coding region or untranslated region of a gene.
  • SNPs in these regions, or generally in regulatory (rSNP) and expressed regions (cSNP) are considered much more important than those in non-functional regions (i.e., most of the SNPs) which are considered of low probability to contribute to phenotype.
  • sequences undergoing A-to-I RNA editing will read G instead of the genomic A, and this could be erroneously interpreted as an A/G SNP.
  • SNPs are only a small fraction (0.5%) of the total number of SNPs, they are a significant fraction (15%) of SNPs in coding sequences, including 17% of the non-synonymous SNPs. Thus, curation of this subset of SNPs is of great importance.
  • over-representation of A/G expressed SNPs was therefore checked within Alu repetitive elements, in which A-to-I RNA editing is enhanced.
  • Figure 9 shows the distribution of the different types of simple substitution SNPs.
  • A/G SNPs account for 33% of all single substitution SNPs, and for 35% of single substitution SNPs within Alu repeats.
  • A/G expressed SNPs are highly over-represented in Alu repeats: whereas only 27% of all expressed single-substitution SNPs are of type A/G, 70% of those which reside within an Alu repeat are A/G SNPs. Since the annotation of the SNPs does not distinguish between strands, it might be necessary to look at the statistics of A/G and C/T SNPs combined.
  • A/G and C/T SNPs account for 66% of all single substitution SNPs, and for 69% of single substitution SNPs within Alu repeats.
  • A/G and C/T expressed SNPs are highly over-represented in Alu repeats: whereas only 59% of all expressed single-substitution SNPs are of type A/G or C/T, 86% of those which reside within an Alu repeat are SNPs of these types.
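The over-representation quoted above can be expressed as a simple fold-enrichment, i.e. the fraction within Alu repeats divided by the overall fraction. The sketch below only restates the percentages given in the text; the function name and the choice of fold-enrichment as the summary statistic are illustrative assumptions.

```python
def fold_enrichment(fraction_in_alu: float, fraction_overall: float) -> float:
    """Ratio of the fraction inside Alu repeats to the overall fraction."""
    return fraction_in_alu / fraction_overall


# (fraction within Alu repeats, overall fraction), as quoted in the text.
FIGURES = {
    "A/G, all SNPs": (0.35, 0.33),
    "A/G, expressed SNPs": (0.70, 0.27),
    "A/G + C/T, all SNPs": (0.69, 0.66),
    "A/G + C/T, expressed SNPs": (0.86, 0.59),
}

if __name__ == "__main__":
    for label, (in_alu, overall) in FIGURES.items():
        print(f"{label}: {fold_enrichment(in_alu, overall):.2f}x")
```

On these figures the enrichment is close to 1 for SNPs in general but well above 1 for expressed SNPs, which is the pattern the text attributes to editing within Alu repeats.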
  • A-to-I editing occurs in dsRNA regions
  • A-to-I editing occurs mainly within Alu repeats
  • editing sites tend to cluster, and to show a combinatorial nature: different sequences will be edited in different subsets of the cluster.
  • Such a combinatorial behavior is not expected for SNPs, since the short distance between the sites does not allow for many recombinations.
  • the above characteristics were used in a recently published algorithm to search for RNA editing (Levanon et al. 2004).
  • the set of putative editing sites (predicted accuracy > 95%, experimental validation of a random subset shows accuracy ~90%) was used for aligning each predicted editing site against the database of expressed SNPs using the BLAST algorithm, retaining only alignments longer than 32nt with identity levels higher than 95%. 562 expressed SNPs that were mapped on predicted A-to-I editing sites were found, a list of which is given below. However, since most of these SNPs are located within Alu elements, only 102 of these SNPs have an unambiguous mapping onto the genome in dbSNP. The list of these 102 SNPs is given in Table 10.
  • the RefSeq sequence onto which the SNP is mapped (if any), and the location within the RefSeq sequence are given. In addition, it is indicated whether the SNP resides within an Alu repeat. 56 out of 102 SNPs are mapped onto a RefSeq sequence, 37 of which (66%) are mapped to the UTR of the RefSeq, and the remaining 19 (34%) are located within introns of the RefSeq sequence. None of the 102 SNPs are mapped onto RefSeq coding sequences. 96 out of the 102 SNPs in the table (94%) are located within Alu repeats.
  • transcripts that contain SNPs from the list of 102 candidates and that are relatively easy to sequence, having a long, unique flanking region outside the Alu in the same exon, were selected.
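The retention criteria stated above (alignments longer than 32 nt with identity above 95%) can be applied to BLAST results with a short filter. The sketch assumes that the hits are available in BLAST's 12-column tabular output, in which the third column is percent identity and the fourth is alignment length; the text does not say which output format was used, and the identifiers in the example are invented for illustration.

```python
def filter_blast_hits(lines, min_length=32, min_identity=95.0):
    """Keep hits strictly longer than min_length with identity above min_identity.

    Expects BLAST tabular rows (tab-separated; columns 3 and 4 are percent
    identity and alignment length). Comment and blank lines are skipped.
    """
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        pident, length = float(fields[2]), int(fields[3])
        if length > min_length and pident > min_identity:
            hits.append((query, subject, pident, length))
    return hits


if __name__ == "__main__":
    # Hypothetical rows; the site and SNP identifiers are placeholders.
    rows = [
        "site_1\tsnp_A\t100.00\t40\t0\t0\t1\t40\t1\t40\t1e-15\t75.0",
        "site_2\tsnp_B\t94.00\t50\t3\t0\t1\t50\t1\t50\t1e-10\t60.0",
        "site_3\tsnp_C\t100.00\t32\t0\t0\t1\t32\t1\t32\t1e-9\t55.0",
    ]
    print(filter_blast_hits(rows))
```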
  • PCR products of matching DNA and RNA samples were sequenced in a number of tissues as described in the methods section below.
  • the occurrence of editing was determined by the presence of an unambiguous trace of guanosine in positions for which the genomic DNA clearly indicated the presence of an adenosine (figure 10 and figure 11).
  • RNA and genomic DNA were isolated simultaneously from the same tissue sample using TriZol reagent (Invitrogen, Carlsbad, CA).
  • We used samples of liver, prostate, uterus, kidney, lung normal and tumor, brain tumor (glioma), cerebellum and frontal lobe.
  • the total RNA underwent oligo-dT primed reverse transcription using M-MLV Reverse Transcriptase (Invitrogen, Carlsbad, CA) according to manufacturer instructions and also as described above.
  • cDNA and gDNA were used as templates for PCR reactions. We aimed at high sequencing quality and thus amplified rather short genomic sequences (roughly 200nt). The amplified regions chosen for validation were selected only if the fragment to be amplified maps to the genome at a single site. PCR reactions were done using Abgene ReddyMixTM kit (Takara Bio, Shiga, Japan) using the primers and annealing conditions as detailed in the following. PCR fragments were purified from agarose gel using QIAquick Gel Extraction Kit (QIAGEN) followed by sequencing using ABI Prism 3100 Genetic Analyzer (Applied Biosystems). Results All sites tested have been shown to be true editing sites and not SNPs.
  • RNA editing sites consists of 12,723 sites22. This is a conservative estimation, using a strict set of parameters. There are a number of indications that there are actually many more sites, as previously described. Accordingly, the number of erroneously assigned EST-based SNPs is probably much higher than the 121 examples that are described herein.
  • RNA editing in the human transcriptome such as the C-to-U editing of apoB transcripts by APOBEC-1 (apolipoprotein B mRNA editing catalytic polypeptide 1).
  • A-to-I pre-mRNA editing in Drosophila is primarily involved in adult nervous system function and integrity.
  • GluR-B a base-paired intron-exon structure determines position and efficiency.
  • RNA editing Nature 399, 75-80 (1999). Wong, S. K., Sato, S. & Lazinski, D. W. Substrate recognition by ADAR1 and ADAR2. Rna 7, 846-58 (2001). Lei, M., Liu, Y. & Samuel, C. E. Adenovirus VAI RNA antagonizes the RNA-editing activity of the ADAR adenosine deaminase. Virology 245, 188-96 (1998). Tonkin, L. A. & Bass, B. L. Mutations in RNAi rescue aberrant chemotaxis of ADAR mutants. Science 302, 1725 (2003). Jiang, R. et al. Genome-wide evaluation of the public SNP databases. Pharmacogenomics 4, 779-89 (2003). Antonarakis, S. E., Krawczak, M. & Cooper, D. C. in The Genetic Basis of Human
  • RNA editing deaminase ADAR1 gene for embryonic erythropoiesis. Science 290, 1765-8 (2000). Higuchi, M. et al. Point mutation in an AMPA receptor gene rescues lethality in mice deficient in the RNA-editing enzyme ADAR2. Nature 406, 78-81 (2000). Patterson, J. B. & Samuel, C. E. Expression and regulation by interferon of a double-stranded-RNA-specific adenosine deaminase from human cells: evidence for two forms of the deaminase. Mol Cell Biol 15, 5376-88 (1995). Brusa, R. et al.
  • RNA hairpins in noncoding regions of human brain and Caenorhabditis elegans mRNA are edited by adenosine deaminases that act on RNA.
  • Appendix 2 list of potential A->G see file labeled "Appendix 2.txt" in CDROM2.
  • Appendix 3 list of potential C->T see file labeled "Appendix 3.txt" in CDROM2.
  • Appendix 4 list of protein names and corresponding contig names from Appendices 2 and 3
  • Bold face type indicates the two validated examples.
  • H24858 SLC25A10 solute carrier family 25 mitochondrial carrier; dicarboxylate transporter, member 10
  • HSALDAR ALDOA aldolase A fructose-bisphosphate
  • HUMGRK5 A GRK5 G protein-coupled receptor kinase 5

Abstract

A method for detecting an RNA editing site, and methods of use and assays thereof.

Description

SYSTEMATIC MAPPING OF ADENOSINE TO INOSINE EDITING SITES IN THE HUMAN TRANSCRIPTOME
RELATIONSHIP TO EXISTING APPLICATIONS The present application claims priority from US Provisional Patent Application No. 60/552,311, filed 12 March 2004, US Provisional Patent Application No. 60/583,591, filed 30 June 2004 and from US Provisional Patent Application No. 60/631,458, filed 30 November 2004, the contents of which are hereby incorporated by reference.
FIELD OF THE INVENTION The present invention is of a method for detecting RNA editing sites, as well as uses of this method (for example for diagnostic uses). The present invention also comprises the located RNA editing sites themselves.
BACKGROUND OF THE INVENTION RNA editing by members of the double-stranded RNA-specific ADAR family leads to site-specific conversion of adenosine to inosine (A-to-I) in the precursor messenger RNAs1. Editing by ADARs is believed to occur in all metazoa, and is essential for mammalian development2-5. ADAR-mediated RNA editing is essential for normal life and development in both invertebrates and vertebrates. ADAR-deficient invertebrates show only behavioural defects, while ADAR1 knock-out mice die embryonically and ADAR2 null mice live to term but die prematurely4,5. High editing levels were found in inflamed tissues, in agreement with a proposed antiviral function of ADARs and their transcriptional regulation by interferon8. Altered editing patterns were found in epileptic mice9, suicide victims suffering chronic depression10 and in malignant gliomas11. Until recently only a handful of edited human genes were documented, most of which were discovered serendipitously12. A systematic experimental search for inosine-containing RNAs has yielded 19 additional cases13, and one further example was found using a cross-genome comparison approach14.
However, quantitation of inosine in total RNA suggests that editing affects a much larger fraction of the mammalian transcriptome7. Large-scale identification of editing substrates by bioinformatics tools was previously considered practically impossible15. In principle, editing may be detected using the large-scale database of ESTs16 (expressed sequence tags) and RNAs, which currently holds over 5 million human records. Editing sites show up when a sequence is aligned with the genome: while the DNA reads A, sequencing identifies the inosine in the edited site as guanosine (G). However, the poor sequencing quality of the sequence database (up to 3% sequencing errors17) precludes a straightforward application of this approach. Moreover, millions of single nucleotide polymorphisms (SNPs) and mutations are erroneously identified as editing events by this method. Currently, only a limited number of human ADAR substrates are known, while indirect evidence suggests a substantial fraction of all pre-mRNAs being affected6,7.
SUMMARY OF THE INVENTION The background art does not teach or suggest many RNA editing sites, as previous attempts to locate such sites were neither sufficiently systematic nor sufficiently successful to uncover the vast majority of RNA editing sites. The present invention is of a method for searching for RNA editing sites. According to preferred embodiments, the method features searching for ADAR editing sites in the human transcriptome. The method of the present invention was validated by searching millions of available expressed sequences to map A-to-I editing sites. A much larger number of A-to-I editing sites were mapped in many different genes, with an estimated accuracy of 95%, raising the number of known editing sites by two orders of magnitude. The method was experimentally validated by verifying the occurrence of editing in 28 novel substrates. A-to-I editing in humans primarily occurs in non-coding regions, typically in Alu repeats. Within Alu sequences, specific hotspots for editing were identified. Remarkably, a significant fraction of editing events result in the stabilization of the double-stranded RNA (dsRNA) structure, while only 3% have a neutral effect on pairing. ADAR substrates are usually imperfect dsRNA stems formed by base pairing of an exon containing the adenosine to be edited with a complementary portion of the pre-mRNA (up to several thousand nucleotides apart). According to preferred embodiments of the present invention, the search for mismatches was restricted to potential double-stranded regions, in order to remove most of the noise and facilitate the identification of true editing sites. For this purpose, human ESTs and cDNAs were aligned to the genome and assembled into clusters representing genes or partial genes, as described in Shoshan et al18.
Then, the method of the present invention aligned the expressed part of the gene with the corresponding genomic region, looking for reverse complement alignments longer than 30nt with identity levels higher than 85% (see figure 1). About 429,000 candidate dsRNAs were found in 14,512 different genes, mostly resulting from alignment of an exon to an intron. In order to further decrease the number of random mismatches, SNPs and mutations, the algorithm then cleaned the sequences supporting the stem region. According to other preferred embodiments of the present invention, additional filters are preferably featured. Since sequencing errors tend to cluster in certain regions, especially in low complexity areas and towards sequence ends, preferably an optional filter discards all single-letter repeats longer than 4nt, as well as 150nt at both ends of each sequence. In addition, all 50nt-wide windows in which the total number of mismatches is 6 or more were considered as having low sequencing quality and were discarded according to another optional filter. However, 4 or more identical sequential mismatches were masked in the count for mismatches in a given window. This exception (according to a preferred embodiment of the filter) is intended to retain sequences with many sequential editing sites, which were found to occur in previously documented examples. Mismatches supported by less than 5% of available sequences were also discarded according to another optional but preferred filter, and, finally, known SNPs of genomic origin were removed. Employing those criteria one finds that the putative editing sites tend to group together, a fact which is also supported by the few available known cases. Thus, all mismatches that occur less than three times in an exon were ignored according to still another optional but preferred filter. The above described filtering (cleaning) procedure resulted almost exclusively in A-to-G mismatches (see figure 2).
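The window-based quality filter described above, together with its exception for runs of identical mismatches, can be sketched as follows. This is an illustration under stated assumptions rather than the implementation of the invention: mismatches are represented as hypothetical (position, type) pairs, and "sequential" is read as neighbouring entries in the position-sorted mismatch list. The window size (50 nt), threshold (6) and run length (4) are the values given in the text.

```python
from typing import List, Tuple

# (position along the expressed sequence, mismatch type such as "A>G")
Mismatch = Tuple[int, str]


def countable_mismatches(mismatches: List[Mismatch], min_run: int = 4) -> List[Mismatch]:
    """Drop runs of min_run or more consecutive mismatches of one type.

    "Consecutive" means neighbouring entries in the position-sorted list;
    this reading of the text is an assumption.
    """
    ordered = sorted(mismatches)
    kept: List[Mismatch] = []
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j][1] == ordered[i][1]:
            j += 1
        if j - i < min_run:  # short run: counts toward window quality
            kept.extend(ordered[i:j])
        i = j
    return kept


def low_quality_windows(mismatches: List[Mismatch], seq_len: int,
                        window: int = 50, threshold: int = 6) -> List[int]:
    """Start offsets of windows holding `threshold` or more countable mismatches."""
    positions = [pos for pos, _ in countable_mismatches(mismatches)]
    last_start = max(seq_len - window, 0)
    return [
        start
        for start in range(last_start + 1)
        if sum(start <= pos < start + window for pos in positions) >= threshold
    ]


if __name__ == "__main__":
    noisy = [(200, "A>G"), (205, "C>T"), (210, "G>A"),
             (215, "T>C"), (220, "A>C"), (225, "G>T")]
    edited = [(pos, "A>G") for pos in range(200, 230, 5)]
    print(len(low_quality_windows(noisy, 400)))   # mixed mismatches are flagged
    print(len(low_quality_windows(edited, 400)))  # a run of A>G is masked
```

Six mismatches of mixed types within 50 nt mark the region as low quality, whereas six sequential A>G mismatches at the same positions are masked and the region is retained, which is the behaviour the exception is intended to produce.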
Employing this procedure, the method of the present invention resulted in the identification of 12,723 putative editing sites, belonging to 1,637 different genes. Detailed information of the RNA editing sites and the respective transcripts annotation is disclosed in the "flank_clean" and "Ann_clean" files in the attached CD-ROM. The same approach applied to G-to-A mismatches yielded only 242 sites. Sequencing errors, SNPs and mutations, which were determined to be significant sources of noise in the analysis, are expected to produce at least as many G-to-As as A-to-Gs (see figure 2). This signal-to-noise ratio (242/12637) suggests that the false positive rate for the method according to the present invention is very low.
According to preferred embodiments of the present invention, the method comprises the detection of editing in liver, lung, kidney, prostate, and uterine tissues (see Example 1). Such editing was not previously known to occur. Optionally and preferably, the present invention comprises the use of RNA editing in any one or more of liver, lung, kidney, prostate or uterine tissues, more preferably the detection of RNA editing, and most preferably for diagnosis. Preferred embodiments of the present invention also optionally and preferably comprise a kit for detecting RNA editing in any one or more of liver, lung, kidney, prostate or uterine tissues, as well as a method for detecting RNA editing in one or more of these tissues. Preferred embodiments of the present invention also optionally and preferably comprise a method of treating a disease in a subject by modulating RNA editing in any one or more of liver, lung, kidney, prostate or uterine tissues. For most genes, editing was found in all tissues, with varying relative abundance, but generally the unedited signal dominated the edited signal.
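As a worked illustration of the signal-to-noise ratio quoted above, the arithmetic can be written out as follows. The variable names are hypothetical, and the figures are exactly those given in the ratio (242/12637); treating the G-to-A count as a stand-in for the noise among A-to-G mismatches is the assumption stated in the text.

```python
# Control: G-to-A mismatches that pass the same filters.
g_to_a_sites = 242
# Denominator exactly as quoted in the ratio above.
a_to_g_sites = 12637

noise_fraction = g_to_a_sites / a_to_g_sites
print(f"{noise_fraction:.3f}")  # prints 0.019
```

Under that assumption, the noise level is on the order of 2% of the A-to-G sites.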
According to preferred embodiments of the present invention, there is provided a method of identifying an RNA editing substrate, the method comprising: identifying a nucleic acid sequence exhibiting a base pair mismatch in a stem region thereof, the nucleic acid sequence being the RNA editing substrate. Optionally, the stem region is identified by: detecting an exon capable of forming a double stranded region in the nucleic acid sequence, wherein the exon features an adenosine. Preferably, the method further comprises filtering the nucleic acid sequence to remove a section of repeated nucleotides before identifying the nucleic acid sequence. More preferably, the section comprises at least four repeated nucleotides.
According to preferred embodiments of the present invention, the method further comprises filtering the nucleic acid sequence wherein at least a portion of the nucleic acid sequence is discarded if the portion features more than a threshold number of mismatches before identifying the nucleic acid sequence. Preferably, the portion comprises at least about 20 nucleotides and the threshold number comprises at least about three mismatches. More preferably, if the portion features at least about two identical sequential mismatches, the portion is not discarded. Optionally, the portion comprises at least about 50 nucleotides and the threshold number comprises at least about six mismatches. Preferably, if the portion features at least about four identical sequential mismatches, the portion is not discarded.
According to preferred embodiments of the present invention, the RNA editing substrate is detected in a tissue comprising at least one of liver, lung, kidney, prostate, or uterine tissue. Preferably, the method further comprises: diagnosing a disease or pathological condition in a subject by detecting RNA editing in at least one of the tissues.
More preferably, the diagnosing is performed by determining whether RNA editing in a nucleotide sequence of the subject differs from a normal nucleotide sequence.
According to preferred embodiments of the present invention, there is provided a kit for diagnosing a subject, comprising at least one component for detecting RNA editing as described herein. Preferably, the at least one component comprises an oligonucleotide. More preferably, the oligonucleotide hybridizes to the nucleotide sequence for detecting RNA editing. Optionally and more preferably, the oligonucleotide comprises a pair of oligonucleotides for amplifying at least a portion of the nucleotide sequence for detecting RNA editing.
According to preferred embodiments of the present invention, there is provided a use of a polynucleotide having a nucleic acid sequence set forth in the files "flan_for_all" and "flan_clean" of the enclosed CD-ROM1 and fragments and homologs thereof for the diagnosis and/or treatment of the diseases listed herein and in the files "Ann_for_all" and "Ann_clean" of the enclosed CD-ROM1.
According to preferred embodiments of the present invention, there is provided a computer readable storage medium comprising data stored in a retrievable manner, the data including sequence information of RNA editing substrates as set forth in files "flan_for_all" and "flan_clean" of enclosed CD-ROM1, and corresponding sequence annotations as set forth in the files "Ann_for_all" and "Ann_clean" of enclosed CD-ROM1.
According to preferred embodiments of the present invention, there is provided use of any identified RNA editing site, as described herein or as derivable from the methods described herein, optionally as described herein, for diagnostic assays, drug targets, expressed sequences suitable for therapeutic proteins, and gene therapy to fix aberrant and/or pathological RNA editing.
According to preferred embodiments of the present invention, there is provided a diagnostic assay, comprising an assay for determining an RNA editing pattern in a sample taken from an individual, optionally as described herein. Optionally and preferably, the method is performed on a multi-probe chip, the chip comprising a plurality of probes for detecting a presence or an absence of at least one RNA editing site in the sample, optionally as described herein.
According to preferred embodiments of the present invention, there is provided a diagnostic method for determining an RNA editing pattern in a sample taken from an individual, comprising: determining an RNA editing pattern in the sample to form a test pattern; and comparing the test pattern to a standard pattern, optionally as described herein. Optionally, the standard pattern is related to disease or pathology, and/or to normalcy or "health". Preferably the method further comprises: at least partially diagnosing the individual according to the comparison. More preferably, the disease comprises cancer.
According to preferred embodiments of the present invention, there is provided a method for detecting cancer in a subject or a disposition or tendency or susceptibility thereto, comprising analyzing RNA editing in the subject, optionally as described herein.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention is herein described, by way of example only, with reference to the accompanying drawings. With specific reference now to the drawings in detail, it is stressed that the particulars shown are by way of example and for purposes of illustrative discussion of the preferred embodiments of the present invention only, and are presented in the cause of providing what is believed to be the most useful and readily understood description of the principles and conceptual aspects of the invention. In this regard, no attempt is made to show structural details of the invention in more detail than is necessary for a fundamental understanding of the invention, the description taken with the drawings making apparent to those skilled in the art how the several forms of the invention may be embodied in practice. In the drawings:
Figure 1: ADAR-mediated editing. a: pre-mRNA as transcribed from DNA. The gene contains two Alu repeats with opposite orientations, one of which overlaps with an exon. b: The two oppositely oriented Alu sequences form a dsRNA structure. c: An enzyme of the ADAR family edits some of the adenosines in the dsRNA structure into inosines.
Figure 2: Distribution of mismatches between the DNA and the expressed RNA sequences that pass the cleaning algorithm. a: results of algorithm application to dsRNAs only. A-to-G mismatches clearly dominate the distribution. Notably, T-to-C mismatches are also overrepresented, likely being editing events that were aligned to the opposite strand. Inset b shows the distribution of mismatches resulting from applying the algorithm to random expressed sequences covering about 20% of the transcriptome. Insets c and d show the distributions for known SNPs and mutations, respectively. A-to-G mismatches do not stand out in the distributions b-d.
Figure 3: Editing in the F11 receptor (JAM1) gene.
Top: some of the publicly available expressed sequences covering this gene, together with the corresponding genomic sequence. The evidence for editing is highlighted. Bottom: Results of sequencing experiments. Matching DNA and cDNA RNA sequences for a number of tissues. Editing is characterized by a trace of guanosine in the cDNA RNA sequence, where the DNA sequence exhibits only adenosine signals (highlighted). 23 additional examples are provided in the Supplementary Information. Note the variety of tissues showing editing, and the variance in the relative intensity of the edited guanosine signal.
Figure 4: multiple alignment of the genomic sequence and the expressed sequences within the NARF gene, undergoing RNA editing. The nucleotide positions involved in RNA editing are marked. The GenBank accession numbers of the sequences appear at the right of the alignment.
Figure 5: multiple alignment of the genomic sequence and the expressed sequences within the HSPC274 gene, undergoing RNA editing. The nucleotide positions involved in RNA editing are marked. The GenBank accession numbers of the sequences appear at the right of the alignment.
Figure 6: multiple alignment of the genomic sequence and the expressed sequences within the FLJ25952 hypothetical protein, undergoing RNA editing. The nucleotide positions involved in RNA editing are marked. The GenBank accession numbers of the sequences appear at the right of the alignment.
Figure 7: Editing in FLNA transcripts. (A) Some of the publicly available expressed sequences covering this gene, together with the corresponding genomic sequence. A total of 226 sequences are available for this locus, 23 of which are edited. (B) Results of sequencing experiments. Matching human DNA and cDNA RNA sequences for human brain and lung tissues. Editing is characterized by a trace of guanosine in the cDNA RNA sequence, where the DNA sequence exhibits only adenosine signals.
Sequencing data for more tissues is available as supporting material. Note the variety of tissues showing editing, and the variance in the relative intensity of the edited guanosine signal. (C) Sequences of individually cloned fragments from matching DNA and RNA of mouse brain tissues and chicken brain and liver tissues. Only part of the data is shown. A total of 20 mouse brain cDNA clones, 10 chicken brain and 9 chicken liver cDNA clones were sequenced, out of which 4, 7, and 1 sequences showed editing events, respectively.
Figure 8: Hairpin structure in BLCAP transcripts. (A) The predicted secondary structure for the BLCAP substrate, based on lowest free-energy predictions using the program MFOLD(27) (www.ibc.wustl.edu/~zuker/ma ). The editing site is at position 601, where the codon UAU(Y) is edited into UGU(C). Structures for the other substrates are given in figures below. (B) Conservation levels at the editing genomic locus. The two red bars at the bottom mark the editing region and the intronic sequence almost perfectly pairing with it to form the hairpin structure shown in (A). The editing site is marked in black within the left red bar. The high conservation level of the intronic sequence, suggesting a functional importance, supports its identification as necessary for the editing process.
Figure 9: Distributions of the different types of simple substitution SNPs. (a) all SNPs (b) SNPs inferred from expressed data only (c) SNPs within Alu repetitive elements (d) SNPs within Alu elements inferred from expressed data only. The enrichment of A/G SNPs in the last panel is probably due to editing sites within the Alu elements which were erroneously interpreted as SNPs.
Figure 10: An editing site in the eukaryotic translation initiation factor (eIF3k) locus, erroneously identified as a SNP. (A) some of the publicly available expressed sequences covering this gene, together with the corresponding genomic sequence.
The location of the dbSNP SNP record is indicated at the bottom. The editing location is highlighted in green for non-edited sequences and in red for edited sequences. (B) Experimental results: sequencing matching human DNA and cDNA RNA sequences. Editing is characterized by a trace of guanosine (black) in the cDNA RNA sequence, where the DNA sequence exhibits only adenosine signals (green).
Figure 11: Editing sites in the ribosomal protein S19 (RPS19) locus, erroneously identified as SNPs. (A) some of the publicly available expressed sequences covering this gene, together with the corresponding genomic sequence. The locations of the dbSNP SNP records are indicated at the bottom. The editing location is highlighted in green for non-edited sequences and in red for edited sequences. (B) Experimental results: sequencing matching human DNA and cDNA RNA sequences. Editing is characterized by a trace of guanosine (black) in the cDNA RNA sequence, where the DNA sequence exhibits only adenosine signals (green). It should be noted that the results show that rs3207020, not found in this exemplary illustrative set, is also an editing site rather than a SNP.
Figure 12 shows illustrative sequencing results for an exemplary RNA editing site for the BLCAP gene as described below.
Figures 13-16 show secondary structure as predicted by MFOLD for CYFIP2 (Figure 13), FLNA (Figure 14), BLCAP (Figure 15) and IGFBP7 (Figure 16), respectively.
Figure 17 shows the content of Appendix 5 (mouse and chicken sequences).
DESCRIPTION OF PREFERRED EMBODIMENTS
Preferred embodiments of the present invention comprise a method for detecting RNA editing, as well as methods of using such detection (for example for diagnosis), and/or methods for treating a disease by modulating RNA editing. According to preferred embodiments of the present invention, these activities are performed with regard to RNA editing in any one or more of liver, lung, kidney, prostate or uterine tissues.
Altered editing patterns have been found to be associated with inflammation16, epilepsy17, depression18, ALS and malignant gliomas19. Optionally and preferably, differential RNA editing (more preferably differential levels of editing) is used to diagnose the following diseases: inflammation, depression, ALS, cancer and epilepsy. For ALS, cancer and epilepsy, the level of RNA editing preferably is lower than in normal samples. For malignant glioma, a single gene was found (in cancerous tissue samples) to have lower levels of RNA editing than for normal (non-cancerous) tissue samples. Without wishing to be limited by a single hypothesis, it is believed that RNA editing is lower in cancerous tissue than in non-cancerous tissue, although at least the level of editing may be modulated (raised or lowered) in cancerous tissue as compared to normal tissue. Optionally and preferably, the cancer comprises brain cancer.
According to preferred embodiments, modulated RNA editing is preferably found in one or more of the following genes for diagnosing cancer, more preferably brain cancer, most preferably malignant glioma (and also most preferably a lowered level of RNA editing): BLCAP; FLNA; CYFIP2; or IGFBP7. As described in the text in Example 11, the presence of RNA editing was confirmed experimentally.
The presence of differential RNA editing in cancerous tissue, preferably brain cancer, most preferably malignant glioma, may optionally and preferably be determined by comparing RNA editing in cancerous tissue to such editing in normal tissue, most preferably to detect a different level of RNA editing in cancerous tissue which is optionally and most preferably a lower level of RNA editing. Illustrative cancers that may optionally be diagnosed with the present invention include but are not limited to bile duct, bladder, bone, bowel (colon and/or rectal cancer), brain (including but not limited to acoustic neuroma, astrocytoma, central nervous system lymphoma, ependymoma, haemangioblastoma, medulloblastoma, meningioma, mixed gliomas, malignant glioma, oligodendroglioma, pineal region tumours or pituitary tumors), breast, carcinoid (including but not limited to carcinoid cancers of the neuroendocrine system, optionally including but not limited to cancers of the appendix, small intestine, lung or pancreas), cervical, eye, gall bladder, esophageal, cancers of the head and neck, kidney, larynx, leukemia (acute lymphoblastic, acute myeloid, chronic lymphocytic, chronic myeloid), liver, lung, lymphoma (Hodgkin's or non-Hodgkin's), melanoma, mesothelioma, myeloma, neuroendocrine (including carcinoid and non-carcinoid, GEPs- gastroenteropancreatic tumors, including insulinomas, gastrinomas, glucagonomas, vipomas or somatostatinomas; and adrenal tumors), ovary, pancreas, penis, prostate, skin, soft tissue sarcomas, spinal cord, stomach, testes, thyroid, vagina and/or vulva, and uterine. According to optional embodiments of the present invention, differential levels of RNA editing may be determined for a gene, a plurality of genes, an entire tissue (or a plurality of tissues), a genetic locus (or a plurality of such loci) or for a tissue sample. 
For example, to detect differential levels of RNA editing that may be indicative of prostate cancer, the subject could optionally give a urine sample, after which RNA editing could be determined for any of these items. For example, for determining overall RNA editing levels of material in the urine sample, the ratio of adenosine to inosine could optionally be measured in the urine sample, and compared to that of a normal subject (without prostate cancer). Other illustrative examples of tissue samples for use with the present invention include but are not limited to blood, serum, plasma, blood cells, urine, sputum, saliva, stool, spinal fluid or CSF, lymph fluid, the external secretions of the skin, respiratory, intestinal, and genitourinary tracts, tears, milk, neuronal tissue and/or any other tissue of the brain, CNS and/or peripheral nervous system, lung tissue, any human organ or tissue, including any tumor or normal tissue, any sample obtained by lavage (for example of the bronchial system or of the breast ductal system), and also samples of in vivo cell culture constituents.
It should be noted that for all of the RNA editing sites and related sequences described herein, as well as for all such editing sites discoverable according to the methods of the present invention, there are many potential applications, including but not limited to, diagnostic assays, drug targets, expressed sequences suitable for therapeutic proteins, and gene therapy to fix aberrant and/or pathological RNA editing. For example, for diagnostic assays, optionally and preferably a suitable method and/or assay would include determining an RNA editing pattern in an individual subject, and comparing this test pattern to a known standard pattern. The standard pattern could optionally be related to disease or pathology, and/or to normalcy or "health".
The comparison could then preferably be used to at least assist in the diagnosis of the individual, for example to determine whether the individual is suffering from (or alternatively lacks) a particular disease or pathological state. For example, such a diagnostic assay could optionally be adapted from a chip-based assay for detecting SNPs (single nucleotide polymorphisms), as described for example with regard to US Patent No. 6,368,799, hereby incorporated by reference as if fully set forth herein.
A non-limiting description of an exemplary, illustrative assay for detecting RNA editing patterns is provided below. Optionally and preferably, PCR may be used to amplify any samples before the assay is performed. The assay is preferably performed on a specially constructed array. A simple array for characterizing binary RNA editing sites (i.e., sites in which there is or is not RNA editing) could optionally be constructed with a pair of probes respectively hybridizing to the two mRNA forms (edited and not edited). In this case, each editing site would be represented by two positions on the array, a first position featuring a non-edited sequence, and a second position featuring a sequence that was edited (i.e., the changed nucleotide which is indicative of editing). However, analysis is more accurate using specialized arrays of probes tiled based on the respective edited/non-edited forms. Tiling refers to the use of groups of related immobilized probes, some of which show perfect complementarity to a reference sequence and others of which show mismatches from the reference sequence. For example, for the above type of editing site, the array would contain two groups of probes tiled based on two reference sequences constituting the respective edited/non-edited forms. The first group of probes preferably includes at least a first set of one or more probes which span the editing site and are exactly complementary to one of the edited or non-edited forms.
The group of probes can also contain second, third and fourth additional sets of probes, which contain probes identical to probes in the first probe set except at one position referred to as an interrogation position. When such a probe group is hybridized with the edited or non-edited form constituting the reference sequence, all probes in the first probe set show perfect hybridization and all of the probes in the other probe sets show non-specific hybridization levels due to mismatches. When such a probe group is hybridized with the other form, a different pattern is obtained. That is, all but one probe set in the array shows a mismatch to the target and produces only non-specific hybridization. The one probe that shows perfect hybridization is a probe from the second, third or fourth probe sets whose interrogation position aligns with the editing site and is occupied by a base complementary to the other form (for example, if the first probe set is related to the edited form, then the second probe set is preferably related to the non-edited form). When the probe group is hybridized with a sample in which only some of the mRNAs are edited, the above patterns are superimposed. Thus, the probe group shows distinct and characteristic hybridization patterns depending on the editing level at the given site. Preferably, the array also contains a second group of probes tiled using the same principles as the first group but with a reference sequence constituting the non-edited form. That is, the first probe set in the second group spans the edited site and shows perfect complementarity to the non-edited form. Hybridization of the second probe group yields a mirror image of hybridization patterns from the first group. By analyzing the hybridization patterns from both probe groups, one can determine with high accuracy the editing level in an individual.
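The construction of one tiled probe group can be sketched as follows. This is a simplified illustration only: it assumes a single probe per probe set, a DNA (cDNA) target in which the edited adenosine reads as G, and probes that are the reverse complement of the target; the function names and the example sequence are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def probe_group(reference: str, site: int) -> dict:
    """Build four probes for one reference form spanning an editing site.

    reference: target sequence window (5'->3') that spans the editing site
    site:      0-based index of the interrogation position in `reference`

    The result maps each base that the target may carry at the site to the
    probe that is perfectly complementary to that variant of the target.
    The entry for reference[site] plays the role of the first probe set;
    the other three entries play the role of the additional probe sets.
    """
    probes = {}
    for base in "ACGT":
        variant = reference[:site] + base + reference[site + 1:]
        probes[base] = reverse_complement(variant)
    return probes


# Hypothetical non-edited and edited forms of the same window.
non_edited = "GCTAAGC"
edited = "GCTGAGC"  # A-to-I editing is read as G in cDNA
group = probe_group(non_edited, site=3)
```

In this sketch, `group["A"]` matches the non-edited form perfectly, while `group["G"]` is the one probe from the additional sets that matches the edited form. A second group built from `edited` would give the mirror-image pattern.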
Arrays can also be designed to analyze many different editing sites in many different genes and/or in the same gene simultaneously simply by including multiple subarrays of probes. Each subarray has first and second groups of probes designed for analyzing a particular editing site according to the strategy described above.
Chips that are suitable for the above arrays may optionally be manufactured according to the method of Affymetrix, California USA (see US Patent Nos. 6,630,308 and 6,506,558), as follows. The chips are manufactured from quartz wafers, which are washed with silane to enable high density array spotting. Probe synthesis is performed on the chip, by using a linker that binds to the silane. Nucleotides are first added to the linker, and then synthesis continues by elongation of the probes. All probes are synthesized in parallel, by using photolithographic masks. These masks permit light to shine on various parts of the chip in sequence, so that as each nucleotide is added in sequence, only those probes for which the particular nucleotide is appropriate at that point in the sequence have the nucleotide added.
To experimentally validate the predicted editing sites, 30 genes were selected and matching DNA and RNA samples retrieved from the same specimen were sequenced, for up to five tissues. Editing events were positively verified in 26 previously unknown editing substrates (see Example 1). PCR products were either cloned followed by sequencing of individual clones, or sequenced as a population, without cloning. When the PCR products were cloned, editing occurrence was detected by comparing the sequences of several clones with the genomic sequence. When PCR products were directly sequenced, the occurrence of editing was determined by the presence of an unambiguous trace of guanosine in positions for which the genomic DNA (gDNA) clearly indicated the presence of adenosine (figure 3).
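The comparison of cloned cDNA sequences with the genomic sequence, as described above, can be sketched as follows. The sketch assumes that each clone has already been aligned to the genomic sequence without gaps and on the same strand; the function names and the toy sequences are hypothetical.

```python
def a_to_g_positions(genomic: str, cdna: str) -> list:
    """Return 0-based positions where the genome has A and the cDNA has G."""
    if len(genomic) != len(cdna):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (g, c) in enumerate(zip(genomic, cdna)) if g == "A" and c == "G"]


def editing_fraction(genomic: str, clones: list) -> dict:
    """Map each candidate editing position to the fraction of clones showing G."""
    counts = {}
    for clone in clones:
        for pos in a_to_g_positions(genomic, clone):
            counts[pos] = counts.get(pos, 0) + 1
    return {pos: n / len(clones) for pos, n in sorted(counts.items())}


# Toy example: four clones over a six-nucleotide genomic window.
genomic = "TTACGA"
clones = ["TTGCGA", "TTACGA", "TTGCGG", "TTACGA"]
fractions = editing_fraction(genomic, clones)
```

Here position 2 shows a guanosine in two of four clones and position 5 in one of four, mirroring the observation that only a fraction of the sequences at a given site is edited.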
Two genes were validated using cell-lines known to have varying levels of ADAR activity. Interestingly, the observed levels of A-to-I conversions correlated well with the reported ADAR activities in these cell-lines19. Typically, additional editing sites, not present in the experimental list described herein, were found in the same region. The validation set was composed of two subsets: (i) 20 genes for which the EST data suggested many putative editing events, 18 of which were confirmed to be edited; and (ii) 13 genes chosen randomly from the list of 1,595 predicted genes, 9 of which were successfully amplified and sequenced. Of these 9 genes, 8 were confirmed to be edited. Note that the success rate in the random subset (89%) used in the present experiments is a lower bound to the true accuracy of the list, as either low editing efficiency at a given site or limited variety of tissues could prevent the detection of editing events.
Interestingly, 92% of sites occur within an Alu repeat, and an additional 1.3% lie within the primate-L1 repeat, in accordance with previous reports13. Without wishing to be limited by a single hypothesis, this is explicable by the fact that only long paired RNA molecules were scanned for editing, a structure more likely formed between repetitive elements. The distribution of editing sites within the Alu sequence exhibits a number of preferred edited adenosines, as well as adenosines unlikely to be edited. In particular, two specific A sites within the Alu repeat, in positions 27 and 28 of Alu, account for ~12% of all editing events (see Example 2). It was determined that G is underrepresented in the nucleotide upstream to the edited A, and overrepresented in the nucleotide following the editing site (see Example 3), in accordance with previous reports20,21. However, the fact that most of the sites occur within Alu repeats strongly biases the identification of additional significant patterns characterizing the editing site.
Typically, editing is seen in only a fraction of the supporting expressed sequences (ESTs or cDNAs). In fact, for 83% of the sites only one sequence exhibits editing. This suggests that editing does not occur with equal frequency in all tissues and conditions, and is of probabilistic nature, again without wishing to be limited by a single hypothesis. The present experimental data also support this finding. No specific expression pattern or Gene Ontology (GO) classification for the edited genes was found. The EST libraries were analyzed to search for specific libraries showing an altered editing pattern. The libraries with the most significant over-editing pattern came from thymus, brain, pancreas, spleen, lung and prostate (see Example 4). Some of these observations support previous reports.
Editing can extend the proteomic diversity by changing the identity of a particular codon22, as the ribosome reads inosine as guanosine. One novel example of such editing was found in the NARF gene (see Example 5). However, Morse et al. have predicted that most pre-mRNA editing in the brain is located in non-coding regions6. In agreement with this, virtually all of the editing sites identified herein are located in non-coding regions: of the sites that can be aligned with RefSeq sequences, 12% were located in the 5' UTR, 54% in the 3' UTR and 33% are in RefSeq introns. Some of the sites annotated as introns might actually be within an alternative exon not covered by the RefSeq sequence. Editing can also alter or create splicing sites, as is known to happen to ADAR2 itself23.
It was suggested that one of the functions of RNA editing is the destabilization of dsRNAs. This prediction was tested as follows. ADAR-mediated editing of an A in an A-U base pair produces the less stable I-U pair, while A-C mismatches can be edited into the more stable I-C pairs.
Looking at the best complementary alignment of the editing regions, it was found that in 78% of the editing cases an A-U pair is destabilized, while in 19% an A-C pair is stabilized. Editing of either A-A or A-G pairs occurs in only 3% of the cases. This suggests that editing is aimed at stabilization and destabilization only, and does not occur in situations where it has no major effect on dsRNA stability. Furthermore, the editing mechanism seems to prefer stabilization over destabilization: 22% of the editing events target a mismatched base-pair, while the average frequency of such mismatched base-pairs in the sites adjacent to the editing sites is only 10%, since these sites are all located in double-stranded regions. The preference of stabilization editing (i.e., editing of A-C to I-C) is in agreement with a previous report24.
This work increases the number of editing substrates by two orders of magnitude, in accordance with prior estimates7. This allows a large-scale analysis of the editing phenomenon. The widespread occurrence of editing makes it a significant contributor to the diversity of the transcriptome, producing presumably more different transcripts than produced by alternative splicing, although it affects only a small number of nucleotides. Interestingly, the large-scale editing in human is found to be strongly associated with Alu repeats, which are unique to primates. Thus, one does not expect the corresponding sites to be found in non-primate mammals. However, other repeats present in these organisms may be associated with the same phenomenon. The pronounced concentration of editing sites in Alu repeats suggests that A-to-I editing might act as an anti-transposition mechanism by inhibiting the integration of transcribed Alu back into the genome. This is in agreement with the suggestion that editing is an anti-viral mechanism25, as the retrotransposition procedure of many repetitive elements is very similar to some stages of the retroviral infection.
Alternatively, Tonkin et al. suggest26 that editing regulates RNAi by protecting the dsRNA from degradation. These results suggest that these possible mechanisms may be of wide applicability.
Finally, we note that there are probably many more sites than those listed in this work, since: (i) Editing happens in only a fraction of the sequences. Since the expressed sequence coverage of many genes is scarce, many editing sites might be absent from GenBank sequences. (ii) The filtering parameters were chosen to minimize the noise, but inevitably miss many true sites. (iii) The experimental evidence shows that a typical editing substrate contains more editing sites than the number predicted according to the method of the present invention. Thus, the 12,220 sites described herein may still represent only a portion of the actual editing repertoire, without wishing to be limited by a single hypothesis. The large-scale mapping of editing sites enables the identification of new properties of non-coding regions, and may facilitate the association of mutations in these regions with known pathologies. An additional 31,888 RNA editing sites predicted with an accuracy of 82% are listed in the "Ann for all" and "flank for all" files in the attached CD-ROM.
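The pairing analysis described earlier in this section (editing of an A-U pair destabilizes the stem, editing of an A-C mismatch stabilizes it, and editing of A-A or A-G has no major effect) can be sketched as a simple tally. The mapping and function name below are hypothetical, and the input is assumed to be the base facing each edited adenosine in the best complementary alignment.

```python
from collections import Counter

# Effect of A-to-I editing, keyed by the base opposite the edited adenosine.
EFFECT_BY_PARTNER = {
    "U": "destabilizing",  # A-U becomes the less stable I-U pair
    "C": "stabilizing",    # A-C mismatch becomes the more stable I-C pair
    "A": "neutral",        # A-A: no major effect on pairing
    "G": "neutral",        # A-G: no major effect on pairing
}


def classify_sites(partners) -> Counter:
    """Count editing sites by their effect on dsRNA stability.

    `partners` holds one base per editing site; T is treated as U so that
    DNA-alphabet alignments can be used directly.
    """
    return Counter(
        EFFECT_BY_PARTNER[base.upper().replace("T", "U")] for base in partners
    )


# Toy input: six editing sites and the base facing each of them.
tally = classify_sites(["U", "u", "T", "C", "A", "G"])
```

Dividing each count by the total number of sites gives percentages of the kind reported above for the full data set.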
EXAMPLES The Examples below relate to exemplary experiments performed for experimental verification of the method of the present invention, as well as of illustrative, non-limiting applications of that method.
Materials and Methods
Human ESTs and cDNAs were obtained from NCBI GenBank version 136 (June 2003; www.ncbi.nlm.nih.gov/dbEST). The genomic sequences were taken from the human genome build 33 (June 2003; www.ncbi.nlm.nih.gov/genome/guide/human). Total RNA and genomic DNA (gDNA) isolated simultaneously from the same tissue sample were purchased from Biochain Institute (Hayward, CA). In this work we used samples of liver, prostate, uterus, kidney, lung (normal and tumor), brain tumor (glioma), cerebellum and frontal lobe. The total RNA underwent oligo-dT primed reverse transcription using Superscript II (Invitrogen, Carlsbad, CA) according to the manufacturer's instructions. The cDNA and gDNA (at 0.1 μg/μl) were used as templates for PCR reactions. Regions were chosen for validation only if the fragment to be amplified maps to the genome at a single site. PCR reactions were done using TaKaRa Ex Taq™ Hot Start (Takara Bio, Shiga, Japan) with the primers and annealing conditions detailed in the Supplementary Information. The PCR products were run on 2% agarose gels; only if a single clear band of the correct approximate size was obtained was it excised and sent to Hy-labs laboratories (Rehovot, Israel) for purification and direct sequencing without cloning.
Poly-A RNA from tissue culture cells was isolated using Trifast (PeqLab, Germany) and poly-A selected using magnetic oligo dT beads (Dynal, Germany). 1 μg of poly-A RNA was reverse transcribed using random hexamers as primers and RNase H-deficient M-MLV reverse transcriptase (Promega, Madison, WI). Genomic DNA from tissue culture cells was isolated according to Ausubel et al. First strand cDNAs or corresponding genomic regions were amplified with suitable primers using Pfu polymerase, to minimize mutation rates during amplification. Amplified fragments were A-tailed using Taq polymerase, gel purified and cloned into pGem-T easy (Promega, Madison, WI). After transformation in E. 
coli, individual plasmids were sequenced and aligned using ClustalW. We used Contig Express software from Vector NTI 6.0 Suite (Informax, Inc.) for multiple alignment of the electropherograms (see Supplementary Information). Typically, the extent of A-to-I editing is variable: the level of the guanosine trace is sometimes only a fraction of the adenosine trace, while on some occasions the conversion from A to I is almost complete. For each gene tested, the three tissues in which the expression was the highest were sequenced. The RT-PCR and gDNA-PCR products of one of these tissues were sequenced from both ends to ensure the consistency of the resulting electropherograms.
EXAMPLE 1
Validation of the predicted RNA editing sites
To experimentally validate the predicted editing sites, 28 genes were selected and sequenced in matching DNA and RNA samples retrieved from the same specimen, for up to five tissues. Editing events were positively verified in 24 previously unknown editing substrates. When PCR products were directly sequenced, the occurrence of editing was determined by the presence of an unambiguous trace of guanosine in positions for which the genomic DNA (gDNA) clearly indicated the presence of adenosine (figure 3). As described above, these experiments provide direct evidence for editing in liver, lung, kidney, prostate, and uterus. The validation set was composed of two subsets: (i) 18 genes for which the EST data suggested many putative editing events; 16 of these genes were confirmed to be edited. (ii) 13 genes chosen randomly from the list of 1,595 predicted genes, 9 of which were successfully amplified and sequenced; 8 out of these 9 genes were confirmed to be edited.
Table 1a
# | CGEN name * | Refseq accession | Gene name | PCR annealing temp [Celsius] | Reverse primer
1 | D31352 | NM_002393 | MDM4 | **61 | GAAAGTGAAAAAATGAGGCAACTACAGA
2 | HSIFNABR | NM_000874.2 | IFNAR2 | **62 | GCACAGCGGCTCACCCC
3 | F02268 | NM_032634 | PIGO | **61 | GGCCAAGGTGGAAGGATCG
4 | AK098080 | AK098080 | - | **61 | CAACCTCCACTCCCAAGTAGCTG
5 | H93599 | BF446580 | - | **60 | GCACTTTCGGAGTCCAAGGTG
6 | T23533 | NM_003678 | C22orf19 | **61 | GGCTGGAGTACAGTACAGGCACG
7 | T39606 | NM_016946 | F11R | **62 | GAGCTGGAG I I I I GCTCTTGTTGC
8 | HUMFOLMES | NM_000791 | DHFR | **60 | CAGCCTTGAACTCGTGGGC
9 | Z46114 | NM_182852 | CCNB1IP1 | **63 | TCCCGACCAGCGTGGCC
10 | Z44562 | NM_014700 | Rab11-FIP3 | **62 | GTCCTGGAACTCACAAAGCCACG
11 | Z44313 | NM_025137 | FLJ21439 | **63 | GTGATGAAGTCTCACTCTGTCGCCC
12 | F07156 | NM_173827 | FLJ38991 | **61 | AATGCTGGGTGAGGTGGCTC
13 | R12689 | NM_025097 | FLJ21106 | **61 | GGAATGCCATGGTGTGATCTTG
14 | H20403 | BC035077 | KIAA1936 | **61 | TGATCTGCCTGCCTCGGC
15 | AH 38826 | AH 38826 | AH 38826 | **61 | TCCCACTCTGTTGCCAGGC
16 | T89426 | NM_004401 | DFFA | 64 | CAGGTTCATGCCATTCTCCTGCC
17 | AA633281 | AK091450 | DKFZP434F011 | **60 | TTTCCAGCAGGTTGATACGCTC
18 | AB011129 | - | ZNF500 | **60 | CCCAGTTCCTCCACCCACAG
19 | HSNICRB | NM_000750 | HSNICRB | 63 | GCTTTAGAATGATTGGGTGGTTAGGG
20 | Z40049 | NM_015540 | DKFZP727M111 | **61 | GATTGTAGCACTGATAAGCAAGGAGC
21 | T05339 | NM_017623 | CNNM3 | **62 | CCCACTGTTTGCCATGAAGAAAG
22 | AA554866 | NM_003743 | NCOA1 | **61 | GATGTTAAGTATCTTGGCCGGGAG
23 | Z39650 | NM_006268 | DPF2 | **62 | GGTTCAGGCAATTCTCCTGCC
24 | F09475 | NM_020429 | SMURF1 | 61 | CGATCATGGCTCACTGCAGC
** Cycling conditions: 95°C 2 min; 35 cycles of 94°C 45 sec, annealing temperature 30 sec, 72°C 45 sec; 72°C 10 min
Table 1b
# | CGEN name * | Refseq accession | Tissues tested
1 | D31352 | NM_002393 | Kidney, Colon, Cerebellum, Frontal Lobe (brain)
2 | HSIFNABR | NM_000874.2 | Kidney, Colon, Cerebellum, Glioma
3 | F02268 | NM_032634 | Kidney, Frontal Lobe (brain), Glioma
4 | AK098080 | AK098080 | Kidney, Lung, Cerebellum, Frontal Lobe (brain)
5 | H93599 | BF446580 | Kidney, Lung, Cerebellum, Glioma, Colon
6 | T23533 | NM_003678 | Kidney, Frontal Lobe (brain), Glioma, Colon
7 | T39606 | NM_016946 | Kidney, Frontal Lobe (brain), Lung, Liver, Colon
8 | HUMFOLMES | NM_000791 | Cerebellum, Liver, Uterus
9 | Z46114 | NM_182852 | Kidney, Lung, Cerebellum
10 | Z44562 | NM_014700 | Kidney, Cerebellum, Frontal Lobe (brain)
11 | Z44313 | NM_025137 | Kidney, Lung, Cerebellum
12 | F07156 | NM_173827 | Liver, Lung tumor, Glioma
13 | R12689 | NM_025097 | Glioma, Lung, Frontal Lobe (brain)
14 | H20403 | BC035077 | Lung, Cerebellum, Frontal Lobe (brain)
15 | AH38826 | AH 38826 | Lung, Kidney, Glioma
16 | T89426 | NM_004401 | Kidney, Lung, Cerebellum
17 | AA633281 | AK091450 | Cerebellum
18 | AB011129 | - | Uterus, Liver, Frontal Lobe (brain)
19 | HSNICRB | NM_000750 | Cerebellum, Frontal Lobe (brain), Prostate
20 | Z40049 | NM_015540 | Prostate, Cerebellum
21 | T05339 | NM_017623 | Uterus, Liver, Frontal Lobe (brain)
22 | AA554866 | NM_003743 | Liver, Cerebellum, Uterus, Prostate
23 | Z39650 | NM_006268 | Frontal Lobe (brain), Cerebellum, Prostate, Uterus, Liver
24 | F09475 | NM_020429 | Cerebellum, Prostate, Liver, Uterus
Detailed annotation of the transcripts appears in Example 8 below and in the attached CD-ROM.
EXAMPLE 2
Editing sites and the ALU sequence
ALU is a complex and diverse family of genomic repeats that are unique to the primates. Due to their ubiquity, it is probable that two oppositely oriented ALUs will be present in the same gene, and thus they are likely to form dsRNAs and putative editing sites. The editing sites were compared with the ALU repeat, to examine their similarities and differences. In order to simplify the following analysis, a "generic" ALU consensus sequence was used as an example: the consensus of the Alu-J subfamily. The exact sequence that was used is gnl|alu|HSU14567AU. The 12,220 predicted editing sites were compared to the ALU sequence using the BLASTN program. The best same-strand hits to ALU were used. More than 92% of the sequences in the database had a significant match (with an E-score below 1e-10) to the ALU consensus sequence. Only hits with at least 80% identity, with no gaps in the alignment, and containing the predicted editing site within the alignment were retained. 2,615 such hits were found, each assigning one editing site to a specific position on the ALU sequence. The ALU consensus sequence is 290 nt in length, and contains 67 A's (23.1% of the sequence). Of the 2,615 counts of predicted editing positions with alignments to ALU, 1,623 (62%) are in A positions. The remaining sites are almost exclusively located in G positions (i.e., a site which corresponds to a G in the ALU consensus, but actually shows A in the DNA, is edited to G). This and more information is summarized in the following table 2: Table 2
Figure imgf000021_0001
positions between the 67 different As is shown. It is shown that there are preferred positions for editing events in the alignment to ALU (p-value calculated using the Z-test). Note that positions 27 and 28 account for 11.7% of the total number of positions analyzed (2,615), and 18.75% of the positions aligned to A (1,632). This is a large bias, suggesting that these 2 positions are in a place very favorable for RNA editing. In contrast, position 44 (only 16 bases apart) has a count of just 7, showing that this position is unfavorable for predicted editing. Such very close positions with significantly different counts serve as ideal controls for each other, as there was no prior selection that preferred any of them.
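The positional bias described above can be illustrated with a small sketch. The text states only that a Z-test was used; the null model below (every one of the 67 A positions is equally likely to carry an A-aligned site) and the function names are assumptions made for illustration, and the normal approximation to the binomial is a choice of this sketch, not necessarily the exact test applied in the analysis.

```python
import math

def position_z_score(count, total, n_positions):
    """Z-score of one position's count under a uniform null.

    Assumption: each of the `total` sites falls on any of the
    `n_positions` positions with equal probability 1/n_positions.
    """
    p = 1.0 / n_positions
    expected = total * p
    sd = math.sqrt(total * p * (1.0 - p))
    return (count - expected) / sd

def two_sided_p(z):
    """Two-sided p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Figures quoted in the text: 1,632 A-aligned sites, 67 A positions,
# and a count of 7 at position 44.
TOTAL_A_HITS = 1632
N_A_POSITIONS = 67
z44 = position_z_score(7, TOTAL_A_HITS, N_A_POSITIONS)
p44 = two_sided_p(z44)
print(f"position 44: z = {z44:.2f}, two-sided p = {p44:.2g}")
```

Under this assumed null the expected count per A position is about 24, so a count of 7 lies several standard deviations below expectation.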
Table 3
Figure imgf000021_0002
Figure imgf000022_0001
Figure imgf000023_0001
EXAMPLE 3
In the following, the effect of RNA editing on the stability of its dsRNA substrates is considered. For each predicted site, a search was performed for its best opposite-strand alignment within the genomic region covered by the same gene cluster, and the effect of the editing on this alignment was examined. First, the fraction of editing sites which (before editing) match their opposite-strand sequence was calculated: 78.2% of the nucleotides in the editing sites match the opposite strand, and 21.8% are mismatched. This frequency of mismatches is actually much higher than could be expected by chance, given that the editing region as a whole is matched with an average identity level of about 90%. Indeed, the same analysis for the neighboring sites yields only 10.8% mismatches for the site upstream of the editing site, and 8.3% mismatches for the site downstream of the editing site. Thus the number of matching editing sites is actually lower than expected assuming a uniform distribution. Next, the distribution of nucleotides in the sites neighboring the editing sites, as well as in the site located opposite the editing site on the other strand, was examined. The distributions are presented in the following table 4: Table 4
Figure imgf000024_0001
G is strongly underrepresented in the site preceding the editing site, and overrepresented in the site following the editing site. However, one should be cautious in analyzing these patterns, as almost all sites are located within highly similar ALU repeats. The site opposite the editing site is in most cases U, where editing changes the stable A-U pair into the less stable I-U pair. Among the cases in which the edited site is mismatched, the vast majority are C sites, where editing changes the less stable A-C pair into the more stable I-C pair. Changes that do not have a significant effect on the dsRNA stability, i.e., change of A-A pairs into I-A pairs or change of A-G pairs into I-G pairs, are rare. This suggests editing is directed at regulating the dsRNA stability. Moreover, the strong bias towards mismatches in the editing sites suggests editing is preferred where it stabilizes the dsRNA.
EXAMPLE 4
Various exemplary EST libraries are described herein in which the fraction of ESTs showing RNA editing is significantly higher than the average. First, all ESTs that are edited at one or more sites out of the 12,723 sites in the database were counted, and this number was compared to the total number of ESTs covering these sites that do not exhibit editing (after the cleaning procedure is applied). It was found that 6690 ESTs are edited and 4657 are not, giving an average editing to non-editing ratio of 6690:4657, or about 3:2. For each library, this ratio was calculated separately. The libraries most significantly deviating from the 3:2 ratio (p-value calculated by Fisher's Exact Test) are listed below.
Figure imgf000025_0001
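As an illustration of the per-library comparison in Example 4, the following sketch implements a two-sided Fisher's Exact Test on a 2x2 table of edited versus non-edited EST counts for one library against all remaining ESTs. The totals 6690 and 4657 come from the text; the library counts (40 edited, 5 non-edited) are hypothetical values chosen only to show the call, and the layout of the table (library versus rest) is an assumption, since the text does not spell it out.

```python
import math

def _log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's Exact Test p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def log_p(x):
        return _log_comb(row1, x) + _log_comb(row2, col1 - x) - _log_comb(n, col1)

    observed = log_p(a)
    total = 0.0
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        lp = log_p(x)
        if lp <= observed + 1e-7:  # small tolerance for floating-point ties
            total += math.exp(lp)
    return min(1.0, total)

# Overall counts from the text.
EDITED_TOTAL, UNEDITED_TOTAL = 6690, 4657
# Hypothetical library: 40 edited ESTs and 5 non-edited ESTs.
lib_edited, lib_unedited = 40, 5
p_library = fisher_exact_two_sided(
    lib_edited, lib_unedited,
    EDITED_TOTAL - lib_edited, UNEDITED_TOTAL - lib_unedited,
)
print(f"hypothetical library p-value: {p_library:.2g}")
```

A library whose counts follow the overall 3:2 ratio yields a p-value close to 1, while a strongly skewed library such as the hypothetical one above yields a small p-value.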
EXAMPLE 5
RNA editing within coding regions of genes
EXAMPLE 5a: RNA editing in NARF gene
In the nuclear prelamin A recognition factor (NARF) gene, exon 8A is an alternative ALU-based exon. In the alternative exon a few putative editing sites were identified (figure 4), some of which are silent, but some cause replacement of amino acids and/or prevent premature termination of the protein. The strongest site might be the transition of A>G at position 19, altering a STOP codon into Trp (TAG>TGG; X>W). Transition of A>G at position 24 replaces Thr with Ala (ACG>GCG; T>A). Transition of A>G at position 33 replaces Ile with Val (ATC>GTC; I>V). Transition of A>G at position 46 replaces Gln with Arg (CAG>CGG; Q>R). Transition of A>G at position 70 replaces Arg with Gly (AGA>GGA; R>G). Transitions of A>G at positions 2, 32 and 78 do not affect the amino acid (GAA>GAG; E>E, GTA>GTG; V>V, TCA>TCG; S>S, respectively) (Figure 4).
EXAMPLE 5b: RNA editing in HSPC274 gene
In the HSPC274 (C20orf30) gene, exon 2 is an alternative ALU-based exon. The alternative exon was found to contain a few putative editing sites (figure 5), some of which are silent, but others cause replacement of amino acids. In the strongest site no amino acid is changed (position 10, CTA>CTG; L>L). However, transition of A>G at position 66 replaces His with Arg (CAC>CGC; H>R). A change of an amino acid is noted also in position 80 (AGC>GGC; S>G) (Figure 5) (the ORF was taken from the AF161392 sequence).
EXAMPLE 5c: RNA editing in FLJ25952 hypothetical protein
The hypothetical protein FLJ25952 was found to contain a few putative editing sites (figure 6). The sequence in genome positions 44-46 is tcaaaa (SK). Editing changes the sequence in one of a number of ways: tcgaaa (SK), tcaaga (SR), or tcggga (SG). Other potential editing sites in this exon are: position 120, agt>ggt (S>G); position 127, aac>agc (N>S); and position 130, tac>tgc (Y>C) (Figure 6).
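The codon changes listed in Examples 5a to 5c can be checked with a short sketch that applies an A-to-G substitution (inosine is read as guanosine) to a DNA-sense codon and translates both versions with the standard genetic code. The function name and the 0-based position argument are illustrative assumptions; the positions quoted in the examples are exon coordinates, not indices within a codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

def edit_effect(codon, index):
    """Return (original_aa, edited_aa) for A-to-I editing inside one codon.

    `index` is the 0-based position of the edited adenosine within the
    codon; the edited base is translated as G.
    """
    codon = codon.upper()
    if codon[index] != "A":
        raise ValueError(f"no adenosine at index {index} of {codon}")
    edited = codon[:index] + "G" + codon[index + 1:]
    return CODON_TABLE[codon], CODON_TABLE[edited]

# Codon changes quoted in the text.
for codon, index in [("TAG", 1), ("ACG", 0), ("ATC", 0), ("CAG", 1),
                     ("AGA", 0), ("GAA", 2), ("CAC", 1), ("AGC", 0)]:
    before, after = edit_effect(codon, index)
    kind = "silent" if before == after else "recoding"
    print(f"{codon}: {before}>{after} ({kind})")
```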
EXAMPLE 6
Annotation of transcripts having RNA editing sites
Newly uncovered naturally occurring transcripts having RNA editing sites were annotated using the Gencarta (Compugen, Tel-Aviv, Israel) platform. The Gencarta platform includes a rich pool of annotations, sequence information (particularly of spliced sequences), chromosomal information, alignments, and additional information such as gene ontology terms, expression profiles, functional analyses, known and predicted proteins and detailed homology reports. A brief description of the methodology used to obtain annotative sequence information is summarized infra (for a detailed description see U.S. Pat. Appl. 10/426,002).
The ontological annotation approach - An ontology refers to the body of knowledge in a specific knowledge domain or discipline such as molecular biology, microbiology, immunology, virology, plant sciences, pharmaceutical chemistry, medicine, neurology, endocrinology, genetics, ecology, genomics, proteomics, cheminformatics, pharmacogenomics, bioinformatics, computer sciences, statistics, mathematics, chemistry, physics and artificial intelligence. An ontology includes domain-specific concepts, referred to herein as sub-ontologies. A sub-ontology may be classified into smaller and narrower categories. The ontological annotation approach is effected as follows. First, biomolecular (i.e., polynucleotide or polypeptide) sequences are computationally clustered according to a progressive homology range, thereby generating a plurality of clusters, each being of a predetermined homology of the homology range. Progressive homology is used to identify meaningful homologies among biomolecular sequences and to thereby assign new ontological annotations to sequences which share requisite levels of homology. Essentially, a biomolecular sequence is assigned to a specific cluster if it displays a predetermined homology to at least one member of the cluster (i.e., single linkage). 
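The single-linkage rule described above can be sketched with a union-find structure: two sequences end up in the same cluster whenever a chain of pairwise homologies at or above the threshold connects them. The sequence names, pairwise scores and the threshold values in the loop are hypothetical, and the sketch makes no claim about how the Gencarta platform computes homology.

```python
def single_linkage_clusters(names, similarity, threshold):
    """Cluster sequences by single linkage at one homology threshold.

    `similarity` maps a pair (a, b) to a percent-identity score; pairs
    that are absent are treated as having no detectable homology.
    """
    parent = {name: name for name in names}

    def find(x):
        # Find the cluster representative, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in similarity.items():
        if score >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(sorted(members) for members in clusters.values())

# Hypothetical sequences and pairwise identities (percent).
names = ["s1", "s2", "s3", "s4"]
similarity = {("s1", "s2"): 90, ("s2", "s3"): 60, ("s3", "s4"): 40}

# Repeating the clustering over increasing thresholds gives progressively
# tighter clusters; the thresholds here are illustrative only.
for threshold in (35, 50, 80, 99):
    print(threshold, single_linkage_clusters(names, similarity, threshold))
```

Note that s1 and s3 share a cluster at the 50% threshold even though no direct s1-s3 score is given, which is the defining property of single linkage.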
A "progressive homology range" refers to a range of homology thresholds, which progress via predetermined increments from a low homology level (e.g. 35 %) to a high homology level (e.g. 99 %). Following generation of clusters, one or more ontologies are assigned to each cluster. Ontologies are derived from an annotation preassociated with at least one biomolecular sequence of each cluster, and/or generated by analyzing (e.g., text-mining) at least one biomolecular sequence of each cluster, thereby annotating biomolecular sequences.
The hierarchical annotation approach - "Hierarchical annotation" refers to any ontology and sub-ontology which can be hierarchically ordered, such as a tissue expression hierarchy, a developmental expression hierarchy, a pathological expression hierarchy, a cellular expression hierarchy, an intracellular expression hierarchy, a taxonomical hierarchy, a functional hierarchy and so forth. The hierarchical annotation approach is effected as follows. First, a dendrogram representing the hierarchy of interest is computationally constructed. A "dendrogram" refers to a branching diagram containing multiple nodes and representing a hierarchy of categories based on degree of similarity or number of shared characteristics. Each of the multiple nodes of the dendrogram is annotated by at least one keyword describing the node, and enabling literature and database text mining, such as by using publicly available text mining software. A list of keywords can be obtained from the GO Consortium (www.geneontology.org). However, measures are taken to include as many keywords as possible, and to include keywords which might be out of date. 
For example, for tissue annotation, a hierarchy is built using all available tissue/libraries sources available in the GenBank, while considering the following parameters: ignoring GenBank synonyms, building anatomical hierarchies, enabling flexible distinction between tissue types (normal versus pathology) and tissue classification levels (organs, systems, cell types, etc.). In a second step, each of the biomolecular sequences is assigned to at least one specific node of the dendrogram. The biomolecular sequences can be annotated biomolecular sequences, unannotated biomolecular sequences or partially annotated biomolecular sequences. Annotated biomolecular sequences can be retrieved from pre-existing annotated databases as described hereinabove. For example, in GenBank, relevant annotational information is provided in the definition and keyword fields. In this case, classification of the annotated biomolecular sequences to the dendrogram nodes is directly effected. A search for suitable annotated biomolecular sequences is performed using a set of keywords which are designed to classify the biomolecular sequences to the hierarchy (i.e., same keywords that populate the dendrogram) In cases where the biomolecular sequences are unannotated or partially annotated, extraction of additional annotational information is effected prior to classification to dendrogram nodes. This can be effected by sequence alignment, as described hereinabove. Alternatively, annotational information can be predicted from structural studies. Where needed, nucleic acid sequences can be transformed to amino acid sequences to thereby enable more accurate annotational prediction. 
Finally, each of the assigned biomolecular sequences is recursively classified to nodes hierarchically higher than the specific nodes, such that the root node of the dendrogram encompasses the full biomolecular sequence set, which can be classified according to a certain hierarchy, while the offspring of any node represent a partitioning of the parent set. For example, a biomolecular sequence found to be specifically expressed in "rhabdomyosarcoma" will be classified also to a higher hierarchy level, which is "sarcoma", and then to "Mesenchymal cell tumors" and finally to the highest hierarchy level, "Tumor". In another example, a sequence found to be differentially expressed in endometrium cells will be classified also to a higher hierarchy level, which is "uterus", and then to "women genital system" and to "genital system" and finally to the highest hierarchy level, "genitourinary system". The retrieval can be performed according to each one of the requested levels.
Annotating gene expression according to relative abundance - Spatial and temporal gene annotations are also assigned by comparing relative abundance in libraries of different origins. This approach can be used to find genes which are differentially expressed in tissues, pathologies and different developmental stages. In principle, the presentation of a contig in at least two tissues of interest is determined, and significant over- or under-representation of the contig in one of the at least two tissues is assessed to identify differential expression. Significant over- or under-representation is analyzed by statistical pairing. Annotating spatial and temporal expression can also be effected on splice variants. This is effected as follows. First, a contig which includes exonal sequence presentation of the at least two splice variants of the gene of interest is obtained. 
This contig is assembled from a plurality of expressed sequences. Then, at least one contig sequence region unique to a portion (i.e., at least one and not all) of the at least two splice variants of the gene of interest is identified. Identification of such a unique sequence region is effected using computer alignment software. Finally, the number of the plurality of expressed sequences in the tissue having the at least one contig sequence region is compared with the number of the plurality of expressed sequences not having the at least one contig sequence region, to thereby compare the expression level of the at least two splice variants of the gene of interest in the tissue. Sequence annotations obtained using the above-described methodologies and other approaches are disclosed in a data table in the files "Ann_for_all" and "Ann_clean" of the enclosed CD-ROM.
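The recursive classification to higher hierarchy levels described in the hierarchical annotation approach above can be sketched as a walk up a child-to-parent map. The two chains below are taken from the examples in the text; the lowercase node spellings, the dictionary layout and the function names are assumptions made for illustration only.

```python
# Child -> parent links for the two example hierarchies given in the text.
PARENT = {
    "rhabdomyosarcoma": "sarcoma",
    "sarcoma": "mesenchymal cell tumors",
    "mesenchymal cell tumors": "tumor",
    "endometrium": "uterus",
    "uterus": "women genital system",
    "women genital system": "genital system",
    "genital system": "genitourinary system",
}

def classify_upward(node, parent=PARENT):
    """Return the node followed by all of its ancestors up to the root."""
    path = [node]
    seen = {node}
    while path[-1] in parent:
        nxt = parent[path[-1]]
        if nxt in seen:
            raise ValueError(f"cycle detected at {nxt!r}")
        path.append(nxt)
        seen.add(nxt)
    return path

def assign_to_hierarchy(sequence_to_node, parent=PARENT):
    """Map every hierarchy level to the set of sequences classified under it."""
    members = {}
    for sequence, node in sequence_to_node.items():
        for level in classify_upward(node, parent):
            members.setdefault(level, set()).add(sequence)
    return members

# Hypothetical sequence identifiers.
members = assign_to_hierarchy({"seqA": "rhabdomyosarcoma", "seqB": "sarcoma"})
print(sorted(members["tumor"]))
```

Because each sequence is added to every ancestor of its node, a query at any requested level returns all sequences classified at or below that level.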
The data table shows a collection of annotations for biomolecular sequences, which were identified according to the teachings of the present invention using transcript data based on GenBank version 136 (June 15, 2003; ftp://ftp.ncbi.nih.gov/genbank/release.notes/gb136.release.notes) and the NCBI genome assembly of April 2003. Each feature in the data table is identified by "#".
#INDICATION - This field designates the indications (i.e., diseases, disorders, pathological conditions) and therapies that the polypeptide of the present invention can be utilized for. Specifically, an indication lists the disorders or diseases in which the polypeptide of the present invention can be clinically used. A therapy describes a postulated mode of action of the polypeptide for the above-mentioned indication. For example, an indication can be "Cancer, general" while the therapy will be "Anticancer". Each Protein of the present invention was assigned a SwissProt/TrEMBL human protein accession as described in section "Assignment of SwissProt/TrEMBL accessions to Gencarta contigs" hereinbelow. The information contained in this field is the indication concatenated to the therapies that were accumulated for the SwissProt and/or TrEMBL human protein from drug databases, such as PharmaProject (PJB Publications Ltd 2003 http://www.pjbpubs.com/cms.asp?pageid=340) and public databases, such as LocusLink (http://www.genelynx.org/cgi-bin/resource?res=locuslink) and Swissprot (http://www.ebi.ac.uk/swissprot/index.html). The field may include more than one term, wherein a ";" separates adjacent terms.
Example - #INDICATION Alopecia, general; Antianginal; Anticancer, immunological; Anticancer, other; Atherosclerosis; Buerger's syndrome; Cancer, general; Cancer, head and neck; Cancer, renal; Cardiovascular; Cirrhosis, hepatic; Cognition enhancer; Dermatological; Fibrosis, pulmonary; Gene therapy; Hepatic dysfunction, general; Hepatoprotective; Hypolipaemic/Antiatherosclerosis; Infarction, cerebral; Neuroprotective; Ophthalmological; Peripheral vascular disease; Radio/chemoprotective; Recombinant growth factor; Respiratory; Retinopathy, diabetic; Symptomatic antidiabetic; Urological;
Assignment of SwissProt/TrEMBL accessions to Gencarta contigs - Gencarta contigs were assigned a SwissProt/TrEMBL human accession as follows. SwissProt/TrEMBL data (SwissProt version 41.13 June 2003, TrEMBL and TrEMBL_new version 23.17 June 2003) were parsed, and for each SwissProt/TrEMBL accession (excluding SwissProt/TrEMBL entries that are annotated as partial or fragment proteins) cross-references to EMBL and GenBank were obtained. The alignment quality of the SwissProt/TrEMBL proteins to their assigned mRNA sequences was checked by frame+p2n alignment analysis. A good alignment was considered as having the following properties:
• For partial mRNAs (those that have the phrase "partial cds" in the mRNA description or are annotated as "3'" or "5'"): an overall identity of 97% and coverage of 80 % of the SwissProt/TrEMBL protein.
• All the remaining mRNA sequences were considered as fully coding mRNAs, and for them: an overall identity of 97% and coverage of the SwissProt/TrEMBL protein of over 95 %.
The mRNAs were searched in the LEADS database for their corresponding contigs, and the contigs that included these mRNA sequences were assigned the SwissProt/TrEMBL accession.
#PHARM - This field indicates possible pharmacological activities of the polypeptide. 
Each polypeptide was assigned with a SwissProt and/or TrEMBL human protein accession, as described above. The information contained in this field is the proposed pharmacological activity that was associated to the SwissProt and/or TrEMBL human protein from drug databases such as PharmaProject (PJB Publications Ltd 2003 http://www.pjbpubs.com/cms.asp?pageid=340) and public databases, such as LocusLink and Swissprot. Note that in some cases this field can include opposite terms in cases where the protein can have contradicting activities - such as:
1. Stimulant - inhibitor
2. Agonist - antagonist
3. Activator- inhibitor
4. Immunosuppressant - Immunostimulant
In these cases the pharmacology was indicated as "modulator". For example, if the predicted polypeptide has potential agonistic/antagonistic effects (e.g. Fibroblast growth factor agonist and Fibroblast growth factor antagonist) then the annotation for this code will be "Fibroblast growth factor modulator". A documented example of such contradicting activities has been described for the soluble tumor necrosis factor receptors [Mohler et al., J. Immunology 151, 1548-1561]. Essentially, Mohler and co-workers showed that the soluble receptor can act as a carrier of TNF (i.e., agonistic effect) and as an antagonist of TNFR activity.
#THERAPEUTIC_PROTEIN - This field predicts a therapeutic role for a protein represented by the contig. A contig was assigned this field if there was information in the drug database or the public databases (e.g., described hereinabove) that this protein, or part thereof, is used or can be used as a drug. This field is accompanied by the SwissProt accession of the therapeutic protein which this contig most likely represents. Example: #THERAPEUTIC_PROTEIN UROK_HUMAN
#SEQLIST - This field lists all ESTs and/or mRNA sequences supporting the RNA editing position, derived from GenBank version 136 (June 15 2003 ftp://ftp.ncbi.nih.gov/genbank/release.notes/gb136.release.notes).
GO annotations were predicted as described in "The ontological annotation approach" section hereinabove. Functional annotations of transcripts based on Gene Ontology (GO) are indicated by the following format. "#GO_P", annotations related to Biological Process, "#GO_F", annotations related to Molecular Function, and The Gene Ontology and gene association files were updated using the following databases: SWISS-PROT and TrEMBL release Dec. 18, 2002; Medline databases of April 6, 2001; and the following files from the Gene Ontology Consortium, which were downloaded on Oct. 22, 2003: gene_association.fb; gene_association.mgi; gene_association.sgd; gene_association.wb; and gene_association.goa_sptr.
For each category the following features are optionally addressed:
"#GO_Acc" represents the accession number of the assigned GO entry, corresponding to the following "#GO_Desc" field.
"#GO_Desc" represents the description of the assigned GO entry, corresponding to the mentioned "#GO_Acc" field. The assignment of the Immune response GO annotation (#GO_Acc 6955 #GO_Desc immune response) to transcripts and proteins of the present invention was based on a homology to a viral protein, as described in U.S. Pat. Appl. 60/480,752.
"#CL" represents the confidence level of the GO assignment, where #CL1 is the highest and #CL5 is the lowest possible confidence level. This field appears only when the GO assignment is based on a SwissProt/TrEMBL protein accession or InterPro accession (and not on Proloc predictions or viral protein predictions). Preliminary confidence levels (PCL) were calculated for all public proteins as follows:
PCL 1: a public protein that has a curated GO annotation,
PCL 2: a public protein that has over 85 % identity to a public protein with a curated GO annotation,
PCL 3: a public protein that exhibits 50 - 85 % identity to a public protein with a curated GO annotation,
PCL 4: a public protein that has under 50 % identity to a public protein with a curated GO annotation.
For each Protein of the present invention a homology search against all public proteins was done. If the Protein of the present invention has over 95 % identity to a public protein with PCL X, then the Protein of the present invention gets the same confidence level as the public protein. This confidence level is marked as "#CL X". If the Protein of the present invention has over 85 % identity but not over 95 % to a public protein with PCL X, then the Protein of the present invention gets a confidence level lower by 1 than the confidence level of the public protein. If the Protein of the present invention has over 70 % identity but not over 85 % to a public protein with PCL X, then the Protein of the present invention gets a confidence level lower by 2 than the confidence level of the public protein. If the Protein of the present invention has over 50 % identity but not over 70 % to a public protein with PCL X, then the Protein of the present invention gets a confidence level lower by 3 than the confidence level of the public protein. If the Protein of the present invention has over 30 % identity but not over 50 % to a public protein with PCL X, then the Protein of the present invention gets a confidence level lower by 4 than the confidence level of the public protein. A Protein of the present invention may also get a confidence level of 2 if it has a true InterPro domain that is linked to a GO annotation (http://www.geneontology.org/external2go/interpro2go/). When the confidence level is above "1", GO annotations of higher levels of the GO hierarchy are assigned (e.g. for "#CL 3" the GO annotations provided are as they appear plus the 2 GO annotations above them in the hierarchy).
"#DB" marks the database on which the GO assignment relies. 
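The confidence-level rules above can be condensed into a small lookup sketch. Two points are interpretations rather than statements from the text: a confidence level "lower by N" is read as a numerically larger #CL value (since #CL1 is the highest confidence), and the result is capped at 5 because #CL5 is described as the lowest possible level. The function names and the handling of the exact 85 % boundary in the PCL rule are likewise assumptions.

```python
def preliminary_confidence_level(has_curated_go, identity_to_curated=None):
    """PCL of a public protein (1 is the most confident, 4 the least).

    `identity_to_curated` is the percent identity to a public protein
    that has a curated GO annotation; it is ignored when the protein
    itself has a curated GO annotation.
    """
    if has_curated_go:
        return 1
    if identity_to_curated is None:
        return None
    if identity_to_curated > 85:
        return 2
    if identity_to_curated >= 50:
        return 3
    return 4

# (identity cutoff in percent, number of levels by which confidence drops)
_PENALTIES = [(95, 0), (85, 1), (70, 2), (50, 3), (30, 4)]

def protein_confidence_level(pcl, identity, worst=5):
    """#CL for a protein, given the PCL of its best public homolog.

    Returns None when identity is 30 % or lower, a case for which the
    text gives no rule. `worst` caps the result (assumption).
    """
    for cutoff, penalty in _PENALTIES:
        if identity > cutoff:
            return min(pcl + penalty, worst)
    return None

for pcl, identity in [(1, 97), (1, 90), (2, 75), (3, 60), (1, 20)]:
    print(pcl, identity, protein_confidence_level(pcl, identity))
```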
The "sp", as in Example 10a, relates to the SwissProt/TrEMBL Protein knowledgebase, available from http://www.expasy.ch/sprot/. "InterPro", as in Example 10c, refers to the InterPro combined database, available from http://www.ebi.ac.uk/interpro/, which contains information regarding protein families, collected from the following databases: SwissProt (http://www.ebi.ac.uk/swissprot/), Prosite (http://www.expasy.ch/prosite/), Pfam (http://www.sanger.ac.uk/Software/Pfam), Prints (http://www.bioinf.man.ac.uk/dbbrowser/PRINTS/), Prodom (http://prodes.toulouse.inra.fr/prodom/), and Smart (http://smart.embl-heidelberg.de/). PROLOC means that the method used for predicting the Gene Ontology cellular component is based on Proloc prediction, where the database is the statistical data the Proloc software employs to predict the subcellular localization of proteins.
"Viral protein database" - All viral proteins (total 294,805 proteins) were downloaded from NCBI GenBank on 1/10/2003. All the Baculoviridae and Entomopoxvirinae proteins, which are known to infect only insects, were removed, and then a non-redundant set was prepared using 95 % identity as a cutoff (Holm L, Sander C. Removing near-neighbor redundancy from large protein sequence collections. Bioinformatics 1998 Jun;14(5):423-9). This resulted in 97,979 proteins. The cluster members of each of the viral proteins are described in U.S. Pat. Appl. 60/480,752.
"#EN" represents the accession of the entity in the database (#DB), corresponding to the accession of the protein/domain on which the GO prediction was based. If the GO assignment is based on a protein from the SwissProt/TrEMBL Protein database, this field will have the locus name of the protein. Examples: "#DB sp #EN NRG2_HUMAN" means that the GO assignment in this case was based on a protein from the SwissProt/TrEMBL database, while the closest homologue (that has a GO assignment) to the assigned protein is depicted in SwissProt entry "NRG2_HUMAN". "#DB interpro #EN IPR001609" means that the GO assignment in this case was based on the InterPro database, and the protein had an InterPro domain, IPR001609, that the assigned GO was based on. In Proloc predictions this field will have a Proloc annotation, "#EN Proloc". In predictions based on viral proteins this field will have the gi viral protein accession, e.g. "#EN 1491997". 
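A minimal sketch of how "#"-tagged fields such as those quoted above might be split into tag and value pairs follows. The exact layout of the data table on the CD-ROM is not given in the text, so the assumption here is only that each field starts with "#", followed by a tag made of letters and underscores, whitespace, and a value running up to the next "#"; the function name is hypothetical.

```python
import re

# A tag is '#' plus letters/underscores; the value runs to the next '#'.
FIELD_RE = re.compile(r"#\s*([A-Za-z_]+)\s+([^#]*)")

def parse_fields(line):
    """Split one annotation line into a list of (tag, value) pairs."""
    return [(tag, value.strip()) for tag, value in FIELD_RE.findall(line)]

# Strings quoted in the text.
print(parse_fields("#DB sp #EN NRG2_HUMAN"))
print(parse_fields("#DB interpro #EN IPR001609"))
print(parse_fields("#GO_Acc 6955 #GO_Desc immune response"))
```

A list of pairs is returned rather than a dictionary because the same tag may plausibly occur more than once on a line.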
#GENE_SYMBOL - For each Gencarta contig a HUGO gene symbol was assigned in two ways: (i) After assigning a SwissProt/TrEMBL protein to each contig (see Assignment of Swissprot/TremBl accessions to Gencarta contigs), all the gene symbols that appear for the SwissProt entry were parsed and added as a gene symbol annotation to the gene. (ii) LocusLink information - LocusLink was downloaded from NCBI ftp://ftp.ncbi.nih.gov/refseq/LocusLink/ (files loc2acc, loc2ref, and LL.out_hs). The data was integrated, producing a file containing the gene symbol for every sequence. Gencarta contigs were assigned a gene symbol if they contain a sequence from this file that has a gene symbol. Example: #GENE_SYMBOL MMP15
#DIAGNOSTICS - Secreted/membrane-bound proteins get an annotation of "can be used as a diagnostic markers", preferably for the corresponding list of indications appearing in the #INDICATION field, described hereinabove. All proteins that were identified as secreted or membrane-bound proteins (as described in the GO field section), excluding membrane-bound proteins of intracellular components such as the nuclear membrane, will be assigned this field. In addition, contigs representing known diagnostic markers (such as those listed in Table 6, below), and all transcripts and proteins deriving from these contigs, will be assigned this field and will get the above-mentioned annotation followed by "as indicated in the Diagnostic markers table".
Table 6
Figure imgf000035_0001
Figure imgf000036_0001
Figure imgf000037_0001
Figure imgf000038_0001
Figure imgf000039_0001
Figure imgf000040_0001
Figure imgf000041_0001
Figure imgf000042_0001
Figure imgf000043_0001
Figure imgf000044_0001
Figure imgf000045_0001
Figure imgf000046_0001
Note: (i) A small portion of these "markers" are also drug targets, whether for already approved drugs (such as alpha-1 antitrypsin) or for drugs under development (e.g., GOT). (ii) Some of these "markers" are also used as therapeutic proteins (e.g., Erythropoietin). (iii) All markers are found in the blood/serum unless otherwise specified.
#DISEASE_RELATED_CLINICAL_PHENOTYPE - This field denotes the possibility of using biomolecular sequences of the present invention for the diagnosis and/or treatment of genetic diseases such as those listed at the following URL: http://www.geneclinics.org/serylet/access?id=:8888891&kev=X9D790O5relAz&db=genetes ts&res=&fcn=b&g =g&genesearch^rae&testtype=both&ls~l&type=e&qrv=&submit=Sea rch and in Table 7, below. This list includes genetic diseases and genes which may be used for the detection and/or treatment thereof. As such, newly uncovered variants of these genes, including novel RNA editing sites, may be used for improved diagnosis and/or treatment when used singly or in combination with the previously described genes. For example, in genetic diseases where the diseased phenotype has a different splice variant profile than the healthy phenotype, as seen in thalassemia and in Duchenne muscular dystrophy, the novel splice variants may distinguish between the healthy and the diseased phenotype. Another example is autosomal recessive genetic diseases. Some publicly available sequences were sequenced from malfunctioning alleles derived from healthy carriers of the disease, and therefore contain the mutation that leads to the disease. Identification of novel RNA editing sites based on sequence alignment can assist in identifying disease-causing mutations.
Table 7
Figure imgf000047_0001
Figure imgf000048_0001
Figure imgf000049_0001
Figure imgf000050_0001
Figure imgf000051_0001
Figure imgf000052_0001
Figure imgf000053_0001
Figure imgf000054_0001
Figure imgf000055_0001
Figure imgf000056_0001
Figure imgf000057_0001
Figure imgf000058_0001
Figure imgf000059_0001
Figure imgf000060_0001
Figure imgf000061_0001
Figure imgf000063_0001
Figure imgf000064_0001
Figure imgf000065_0001
Figure imgf000066_0001
Figure imgf000067_0001
Figure imgf000068_0001
Figure imgf000069_0001
Figure imgf000070_0001
Figure imgf000071_0001
Figure imgf000072_0001
Figure imgf000073_0001
Figure imgf000074_0001
Figure imgf000075_0001
Figure imgf000076_0001
Figure imgf000077_0001
Figure imgf000078_0001
Figure imgf000079_0001
Figure imgf000080_0001
Figure imgf000081_0001
Figure imgf000082_0001
Figure imgf000083_0001
Figure imgf000084_0001
Figure imgf000085_0001
Figure imgf000086_0001
Figure imgf000087_0001
#AUTOANTIGEN_P _AUTOIMMUNE_DISEASE - Secreted splice variants of known autoantigens associated with a specific autoimmune syndrome, such as those listed in Table 8, below ("contig" column), can be used as therapeutic tools for the treatment of such disorders, as described hereinabove. It is also contemplated that variants of autoantigens are of diagnostic value. Novel splice variants of the genes listed in Table 8 may prove to be true autoantigens; their use for the detection of autoantibodies is therefore expected to result in a more sensitive and specific test.
Table 8
Contig Disease Description
Figure imgf000087_0002
Figure imgf000088_0001
Figure imgf000089_0001
Figure imgf000090_0001
#DRUG_DRUG_INTERACTION: refers to proteins involved in a biological process which mediates the interaction between at least two consumed drugs. Novel splice variants of known proteins involved in interaction between drugs may be used, for example, to modulate such drug-drug interactions. Examples of proteins involved in drug-drug interactions are presented in Table 9 together with the corresponding internal gene contig name, which makes it possible to locate the new splice variants within the data files "Ann_for_all" and "Ann_clean" in the attached CD-ROM.
Table 9 Contig Gene Symbol Description
Figure imgf000090_0002
Figure imgf000091_0001
Figure imgf000092_0001
Figure imgf000093_0001
Figure imgf000094_0001
Figure imgf000095_0001
Figure imgf000096_0001
Figure imgf000097_0001
Figure imgf000098_0001
Figure imgf000099_0001
Figure imgf000100_0001
Figure imgf000101_0001
Differentially expressed biomolecular sequences - field description
#TS - This field denotes tissue-specific genes, i.e., genes upregulated in a specific tissue or tissues. As described hereinabove, such genes may be used as markers for tissue proliferation, differentiation and/or tissue damage. These proteins also have therapeutic significance, as described above. The annotation format is as follows: #TS tissue-name - where the "tissue name" field specifies the list of tissues for which tissue-specific genes/variants were searched, as follows: amniotic+placenta; Blood; Bone; Bone marrow; Brain; Cervix+uterus; Colon; Endocrine, adrenal gland; Endocrine, pancreas; Endocrine, parathyroid+thyroid; Gastrointestinal tract; Genitourinary; Head and neck; Immune, T-cells; Kidney; Liver; Lung; Lymph node; Mammary gland; Muscle; Ovary; Prostate; Skin; Thymus.
#TAA - This field denotes genes or transcript sequences over-expressed in cancer. The annotation format is as follows: #TAA tissue-name - where the "tissue name" field specifies the list of tissues for which tumor-specific genes/variants were searched, as follows: All tumor types; All epithelial tumors; prostate-tumor; lung-tumor; head and neck-tumor; stomach-tumor; colon-tumor; mammary-tumor; kidney-tumor; ovary-tumor; uterus/cervix-tumor; thyroid-tumor; adrenal-tumor; pancreas-tumor; liver-tumor; skin-tumor; brain-tumor; bone-tumor; bone marrow-tumor; blood-cancer; T-cells-tumor; lymph nodes-tumor; muscle-tumor.
#TAAT - This field denotes splice variants over-expressed in cancer. The annotation format is as follows: #TAAT tissue-name start nucleotide - end nucleotide, where the "start nucleotide - end nucleotide" field gives the start and end positions, on the transcript, of the unique exon/s of this transcript which are over-expressed in cancer.
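As an illustration of the three annotation formats above, the following sketch parses #TS, #TAA and #TAAT lines. It is only a sketch under stated assumptions: the class `ExpressionTag` and the function `parse_expression_tag` are hypothetical names, one annotation per line is assumed, and the nucleotide range is assumed to be written as two integers separated by a hyphen with optional spaces. Tissue names may contain spaces and hyphens (for example "head and neck-tumor"), so the range is matched from the end of the line.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExpressionTag:
    """Hypothetical record for one #TS / #TAA / #TAAT annotation."""
    field: str                   # "TS", "TAA" or "TAAT"
    tissue: str                  # tissue or tumor name as written
    start: Optional[int] = None  # only set for #TAAT
    end: Optional[int] = None    # only set for #TAAT


# Assumed layout: "#TAAT <tissue-name> <start> - <end>"
_TAAT = re.compile(r"^#TAAT\s+(.+?)\s+(\d+)\s*-\s*(\d+)$")
# Assumed layout: "#TS <tissue-name>" or "#TAA <tissue-name>"
_SIMPLE = re.compile(r"^#(TS|TAA)\s+(.+)$")


def parse_expression_tag(line: str) -> ExpressionTag:
    """Parse one differential-expression annotation line."""
    text = line.strip()
    match = _TAAT.match(text)
    if match:
        start, end = int(match.group(2)), int(match.group(3))
        if start > end:
            raise ValueError(f"start after end in: {line!r}")
        return ExpressionTag("TAAT", match.group(1), start, end)
    match = _SIMPLE.match(text)
    if match:
        return ExpressionTag(match.group(1), match.group(2).strip())
    raise ValueError(f"not a #TS/#TAA/#TAAT annotation: {line!r}")


if __name__ == "__main__":
    # The coordinates below are made-up illustration values.
    print(parse_expression_tag("#TS Bone marrow"))
    print(parse_expression_tag("#TAA lung-tumor"))
    print(parse_expression_tag("#TAAT head and neck-tumor 120 - 450"))
```

Trying the #TAAT pattern first keeps the two simpler tags from swallowing a trailing coordinate range into the tissue name.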
EXAMPLE 7
The following sections list examples of proteins (subsection i), grouped by molecular function, which participate in a variety of diseases (listed in subsection ii). These diseases can be diagnosed/treated using information derived from naturally occurring transcripts having RNA editing sites, such as those uncovered by the present invention. The present invention is of biomolecular sequences which can be classified into functional groups based on the known activity of homologous sequences. This functional group classification allows the identification of diseases and conditions which may be diagnosed and treated based on the novel sequence information and annotations of the present invention.
(i) This functional group classification includes the following groups:
Proteins involved in Drug-Drug interactions: The phrase "proteins involved in drug-drug interactions" refers to proteins involved in a biological process which mediates the interaction between at least two consumed drugs. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to modulate drug-drug interactions. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such drug-drug interactions. Examples of such proteins include, but are not limited to, the cytochrome P450 protein family, which is involved in the metabolism of many drugs. Examples of proteins involved in drug-drug interactions are listed in Table 9.
Proteins involved in the metabolism of a pro-drug to a drug: The phrase "proteins involved in the metabolism of a pro-drug to a drug" refers to proteins that activate an inactive pro-drug by chemically changing it into a biologically active compound.
Preferably, the metabolizing enzyme is expressed in the target tissue, thus reducing systemic side effects. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to modulate the metabolism of a pro-drug into a drug. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such conditions. Examples of these proteins include, but are not limited to, esterases hydrolyzing the cholesterol-lowering drug simvastatin into its hydroxy acid active form.
MDR proteins: The phrase "MDR proteins" refers to Multi Drug Resistance proteins that are responsible for the resistance of a cell to a range of drugs, usually by exporting these drugs outside the cell. Preferably, the MDR proteins are ATP-binding cassette (ABC) proteins. Preferably, drug resistance is associated with resistance to chemotherapy. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the transport of molecules and macromolecules such as neurotransmitters, hormones, sugar etc. is abnormal, leading to various pathologies. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of MDR proteins include, but are not limited to, the multi-drug resistance transporter MDR1/P-glycoprotein, which is the gene product of MDR1 and belongs to the ATP-binding cassette (ABC) superfamily of membrane transporters. This protein was shown to increase the resistance of malignant cells to therapy by exporting the therapeutic agent out of the cell.
Hydrolases acting on amino acids: The phrase "hydrolases acting on amino acids" refers to hydrolases acting on a pair of amino acids. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the transfer of a glycosyl chemical group from one molecule to another is abnormal; thus, a beneficial effect may be achieved by modulation of such a reaction. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such uses include, but are not limited to, reperfusion of clotted blood vessels by TPA (Tissue Plasminogen Activator), which converts the abundant, but inactive, zymogen plasminogen to plasmin by hydrolyzing a single Arg-Val bond in plasminogen.
Transaminases: The term "transaminases" refers to enzymes transferring an amine group from one compound to another. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the transfer of an amine group from one molecule to another is abnormal; thus, a beneficial effect may be achieved by modulation of such a reaction. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such transaminases include, but are not limited to, two liver enzymes frequently used as markers for liver function - SGOT (Serum Glutamic-Oxaloacetic Transaminase - AST) and SGPT (Serum Glutamic-Pyruvic Transaminase - ALT).
Immunoglobulins: The term "immunoglobulins" refers to proteins that are involved in the immune and complement systems, such as antigens and autoantigens, immunoglobulins, MHC and HLA proteins and their associated proteins. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases involving the immune system, such as inflammation, autoimmune diseases, infectious diseases, and cancerous processes. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases and of molecules that may be targets for diagnostics include, but are not limited to, members of the complement family such as C3 and C4, whose blood levels are used for evaluation of autoimmune diseases and allergy state, and C1 inhibitor, whose absence is associated with angioedema. Thus, new variants of these genes are expected to be markers for similar events. Mutations in variants of the complement family may be associated with other immunological syndromes, such as the increased bacterial infection that is associated with mutation in C3. C1 inhibitor was shown to provide safe and effective inhibition of complement activation after reperfused acute myocardial infarction and may reduce myocardial injury [Eur. Heart J. 2002, 23(21): 1670-7]; thus, its variants may have the same or an improved effect.
Transcription factor binding: The phrase "transcription factor binding" refers to proteins involved in the transcription process by binding to nucleic acids, such as transcription factors, RNA and DNA binding proteins, zinc fingers, helicase, isomerase, histones, and nucleases.
Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases involving transcription factor binding proteins. Such treatment may be based on a transcription factor that can be used for modulation of gene expression associated with the disease. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, breast cancer associated with ErbB-2 expression, which was shown to be successfully modulated by a transcription factor [Proc. Natl. Acad. Sci. U S A. 2000, 97(4): 1495-500]. Examples of novel transcription factors used for therapeutic protein production include, but are not limited to, those described for Erythropoietin production [J. Biol. Chem. 2000, 275(43):33850-60; J. Biol. Chem. 2000, 275(43):33850-60] and zinc finger protein transcription factor (ZFP-TF) variants [J. Biol. Chem. 2000, 275(43):33850-60].
Small GTPase regulatory/interacting proteins: The phrase "Small GTPase regulatory/interacting proteins" refers to proteins capable of regulating or interacting with GTPase, such as RAB escort protein, guanyl-nucleotide exchange factor, guanyl-nucleotide exchange factor adaptor, GDP-dissociation inhibitor, GTPase inhibitor, GTPase activator, guanyl-nucleotide releasing factor, GDP-dissociation stimulator, regulator of G-protein signaling, RAS interactor, RHO interactor, RAB interactor, and RAL interactor.
Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which G-proteases mediated signal transduction is abnormal, either as a cause or as a result of the disease. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, diseases related to prenylation. Modulation of prenylation was shown to affect therapy of diseases such as osteoporosis, ischemic heart disease, and inflammatory processes. Small GTPase regulatory/interacting proteins are a major component in the prenylation post-translational modification, and are required for the normal activity of prenylated proteins. Thus, their variants may be used for therapy of prenylation-associated diseases.
Calcium binding proteins: The phrase "calcium binding proteins" refers to proteins involved in calcium binding, preferably calcium binding proteins, ligand binding or carriers, such as diacylglycerol kinase, Calpain, calcium-dependent protein serine/threonine phosphatase, calcium sensing proteins, and calcium storage proteins. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat calcium-involved diseases. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, diseases related to hypercalcemia, hypertension, cardiovascular disease, muscle diseases, gastro-intestinal diseases, uterus relaxing, and uterus.
An example of therapeutic use of calcium binding protein variants may be the treatment of emergency cases of hypercalcemia with secreted variants of calcium storage proteins.
Oxidoreductase: The term "oxidoreductase" refers to enzymes that catalyze the removal of hydrogen atoms and electrons from the compounds on which they act. Preferably, these are oxidoreductases acting on the following groups of donors: CH-OH, CH-CH, CH-NH2, CH-NH; oxidoreductases acting on NADH or NADPH, nitrogenous compounds, sulfur group of donors, heme group, hydrogen group, diphenols and related substances as donors; oxidoreductases acting on peroxide as acceptor, superoxide radicals as acceptor, oxidizing metal ions, CH2 groups; oxidoreductases acting on reduced ferredoxin as donor; oxidoreductases acting on reduced flavodoxin as donor; and oxidoreductases acting on the aldehyde or oxo group of donors. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases caused by abnormal activity of oxidoreductases. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, malignant and autoimmune diseases in which the enzyme DHFR (DiHydroFolateReductase), which participates in folate metabolism and is essential for de novo glycine and purine synthesis, is the target for the widely used drug Methotrexate (MTX).
Receptors: The term "receptors" refers to protein-binding sites on a cell's surface or interior that recognize and bind to a specific messenger molecule, leading to a biological response, such as signal transducers, complement receptors, ligand-dependent nuclear receptors, transmembrane receptors, GPI-anchored membrane-bound receptors, various coreceptors, internalization receptors, and receptors to neurotransmitters, hormones and various other effectors and ligands. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases caused by abnormal activity of receptors, preferably receptors to neurotransmitters, hormones and various other effectors and ligands. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, chronic myelomonocytic leukemia caused by growth factor β receptor deficiency [Rao D. S., et al., (2001) Mol. Cell Biol., 21(22):7796-806], thrombosis associated with protease-activated receptor deficiency [Sambrano G. R., et al., (2001) Nature, 413(6851):26-7], hypercholesterolemia associated with low density lipoprotein receptor deficiency [Koivisto U. M., et al., (2001) Cell, 105(5):575-85], familial Hibernian fever associated with tumor necrosis factor receptor deficiency [Simon A., et al., (2001) Ned Tijdschr Geneeskd, 145(2):77-8], colitis associated with immunoglobulin E receptor expression [Dombrowicz D., et al., (2001) J. Exp. Med., 193(1):25-34], and Alagille syndrome associated with Jagged1 [Stankiewicz P. et al., (2001) Am. J. Med.
Genet., 103(2):166-71], breast cancer associated with mutated BRCA2 and androgen, hypertension associated with β and α adrenergic receptors, and diabetes associated with the insulin receptor. Therapeutic applications of nuclear receptor variants may be based on secreted versions of receptors, such as the thyroid nuclear receptor, which, by binding free thyroid hormone in the plasma and reducing its level, may have a therapeutic effect in cases of thyrotoxicosis. A secreted version of the glucocorticoid nuclear receptor, by binding free cortisol in the plasma and thus reducing its level, may have a therapeutic effect in cases of Cushing's disease (a disease associated with high cortisol levels in the plasma). Secreted soluble TNF receptor is an example of a molecule which can be used to treat conditions in which downregulation of TNF levels or activity is beneficial, including, but not limited to, Rheumatoid Arthritis, Juvenile Rheumatoid Arthritis, Psoriatic Arthritis and Ankylosing Spondylitis.
Protein serine/threonine kinases: The phrase "protein serine/threonine kinases" refers to proteins which phosphorylate serine/threonine residues and are mainly involved in signal transduction, such as transmembrane receptor protein serine/threonine kinase, 3-phosphoinositide-dependent protein kinase, DNA-dependent protein kinase, G-protein-coupled receptor phosphorylating protein kinase, SNF1A/AMP-activated protein kinase, casein kinase, calmodulin regulated protein kinase, cyclic-nucleotide dependent protein kinase, cyclin-dependent protein kinase, eukaryotic translation initiation factor 2α kinase, galactosyltransferase-associated kinase, glycogen synthase kinase 3, protein kinase C, receptor signaling protein serine/threonine kinase, ribosomal protein S6 kinase, and IκB kinase.
Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases ameliorated by modulating kinase activity. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, schizophrenia. The 5-HT(2A) serotonin receptor is the principal molecular target for LSD-like hallucinogens and atypical antipsychotic drugs. It has been shown that a major mechanism for the attenuation of this receptor's signaling following agonist activation typically involves the phosphorylation of serine and/or threonine residues by various kinases. Therefore, serine/threonine kinases specific for the 5-HT(2A) serotonin receptor may serve as drug targets for a disease such as schizophrenia. Other diseases that may be treated through modulation of serine/threonine kinases are Peutz-Jeghers syndrome (PJS, a rare autosomal-dominant disorder characterized by hamartomatous polyposis of the gastrointestinal tract and melanin pigmentation of the skin and mucous membranes) [Hum. Mutat. 2000, 16(1):23-30], breast cancer [Oncogene. 1999, 18(35):4968-73], Type 2 diabetes insulin resistance [Am. J. Cardiol. 2002, 90(5A):11G-18G], and Fanconi anemia [Blood. 2001, 98(13):3650-7].
Channel/pore class transporters: The phrase "Channel/pore class transporters" refers to proteins that mediate the transport of molecules and macromolecules across membranes, such as α-type channels, porins, and pore-forming toxins.
Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the transport of molecules and macromolecules is abnormal, leading to various pathologies. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, diseases of the nervous system such as Parkinson's disease, diseases of the hormonal system, diabetes, and infectious diseases such as bacterial and fungal infections. One specific example is α-hemolysin, which is produced by S. aureus and creates ion-conductive pores in the cell membrane, thereby diminishing its integrity.
Hydrolases, acting on acid anhydrides: The phrase "hydrolases, acting on acid anhydrides" refers to hydrolytic enzymes that act on acid anhydrides, such as hydrolases acting on acid anhydrides in phosphorus-containing anhydrides or in sulfonyl-containing anhydrides, hydrolases catalyzing transmembrane movement of substances, and hydrolases involved in cellular and subcellular movement. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the hydrolase-related activities are abnormal. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, glaucoma treated with carbonic anhydrase inhibitors (e.g. Dorzolamide), and peptic ulcer disease treated with H+/K+ ATPase inhibitors that were shown to affect disease by blocking gastric carbonic anhydrase (e.g.
Omeprazole).
Transferases, transferring phosphorus-containing groups: The phrase "transferases, transferring phosphorus-containing groups" refers to enzymes that catalyze the transfer of phosphate from one molecule to another, such as phosphotransferases using the following groups as acceptors: alcohol group, carboxyl group, nitrogenous group, phosphate; phosphotransferases with regeneration of donors catalyzing intramolecular transfers; diphosphotransferases; nucleotidyltransferase; and phosphotransferases for other substituted phosphate groups. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the transfer of a phosphorus-containing functional group to a modulated moiety is abnormal. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, acute MI [Ann. Emerg. Med. 2003, 42(3):343-50], cancer [Oral. Dis. 2003, 9(3):119-28; J. Surg. Res. 2003, 113(1):102-8] and Alzheimer's disease [Am. J. Pathol. 2003, 163(3):845-58]. Examples of possible utilities of such transferases for drug improvement include, but are not limited to, treatment with aminoglycosides (antibiotics), to which resistance is mediated by aminoglycoside phosphotransferases [Front. Biosci. 1999, 1;4:D9-21]. Using aminoglycoside phosphotransferase variants or inhibiting these enzymes may reduce aminoglycoside resistance. Since aminoglycosides can be toxic to some patients, demonstrating the expression of aminoglycoside phosphotransferases in a patient can argue against treating that patient with aminoglycosides and putting the patient at risk in vain.
Phosphoric monoester hydrolases: The phrase "phosphoric monoester hydrolases" refers to hydrolytic enzymes that act on ester bonds, such as nuclease, sulfuric ester hydrolase, carboxylic ester hydrolase, thiolester hydrolase, phosphoric monoester hydrolase, phosphoric diester hydrolase, triphosphoric monoester hydrolase, diphosphoric monoester hydrolase, and phosphoric triester hydrolase. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the hydrolytic cleavage of a covalent bond with accompanying addition of water (-H being added to one product of the cleavage and -OH to the other) is abnormal. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, diabetes, CNS diseases such as Parkinson's disease, and cancer.
Enzyme inhibitors: The term "enzyme inhibitors" refers to inhibitors and suppressors of other proteins and enzymes, such as inhibitors of: kinases, phosphatases, chaperones, guanylate cyclase,
DNA gyrase, ribonuclease, proteasome inhibitors, diazepam-binding inhibitor, ornithine decarboxylase inhibitor, GTPase inhibitors, dUTP pyrophosphatase inhibitor, phospholipase inhibitor, proteinase inhibitor, protein biosynthesis inhibitors, and α-amylase inhibitors. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which a beneficial effect may be achieved by modulating the activity of inhibitors and suppressors of proteins and enzymes. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, deficiency of α-1 antitrypsin (a natural serine protease inhibitor, which protects the lung and liver from proteolysis), which is associated with emphysema, COPD and liver cirrhosis. α-1 antitrypsin is also used for diagnostics in cases of unexplained liver and lung disease. A variant of this protein may act as a protease inhibitor or as a diagnostic target for related diseases.
Electron transporters: The term "Electron transporters" refers to ligand binding or carrier proteins involved in electron transport, such as flavin-containing electron transporter, cytochromes, electron donors, electron acceptors, electron carriers, and cytochrome-c oxidases. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which a beneficial effect may be achieved by
modulating the activity of electron transporters. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, cyanide toxicity, resulting from cyanide binding to ubiquitous metalloenzymes, rendering them inactive and interfering with the electron transport. Novel electron transporters to which cyanide can bind may serve as drug targets for new cyanide antidotes. Transferases, transferring glycosyl groups: The phrase "transferases, transferring glycosyl groups" refers to enzymes that catalyze the transfer of a glycosyl chemical group from one molecule to another such as murein lytic endotransglycosylase E, and sialyltransferase. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the transfer of a glycosyl chemical group is abnormal. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Ligases, forming carbon-oxygen bonds: The phrase "ligases, forming carbon-oxygen bonds" refers to enzymes that catalyze the linkage between carbon and oxygen such as ligase forming aminoacyl-tRNA and related compounds. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the linkage between carbon and oxygen in an energy dependent process is abnormal. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
Ligases: The term "ligases" refers to enzymes that catalyze the linkage of two molecules, generally utilizing ATP as the energy donor, also called synthetases. Examples of ligases are enzymes such as β-alanyl-dopamine hydrolase, carbon-oxygen bonds forming ligase, carbon-sulfur bonds forming ligase, carbon-nitrogen bonds forming ligase, carbon-carbon bonds forming ligase, and phosphoric ester bonds forming ligase. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the joining together of two molecules in an energy dependent process is abnormal. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, neurological disorders such as Parkinson's disease [Science. 2003, 302(5646):819-22; J. Neurol. 2003, 250 Suppl. 3:11125-11129] or epilepsy [Nat. Genet. 2003, 35(2):125-7], cancerous diseases [Cancer Res. 2003, 63(17):5428-37; Lab. Invest. 2003, 83(9):1255-65], renal diseases [Am. J. Pathol.
2003, 163(4):1645-52], infectious diseases [Arch. Virol. 2003, 148(9):1851-62] and Fanconi anemia [Nat. Genet. 2003, 35(2):165-70]. Hydrolases, acting on glycosyl bonds: The phrase "hydrolases, acting on glycosyl bonds" refers to hydrolytic enzymes that act on glycosyl bonds such as hydrolases hydrolyzing N-glycosyl compounds, S-glycosyl compounds, and O-glycosyl compounds. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the hydrolase-related activities are abnormal. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include cancerous diseases [J. Natl. Cancer Inst. 2003, 95(17):1263-5; Carcinogenesis. 2003, 24(7):1281-2; author reply 1283], vascular diseases [J. Thorac. Cardiovasc. Surg. 2003, 126(2):344-57], gastrointestinal diseases such as colitis [J. Immunol. 2003, 171(3):1556-63] or liver fibrosis [World J. Gastroenterol. 2002, 8(5):901-7]. Kinases: The term "kinases" refers to enzymes which phosphorylate serine/threonine or tyrosine residues, mainly involved in signal transduction. Examples of kinases include enzymes such as 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase,
NAD+ kinase, acetylglutamate kinase, adenosine kinase, adenylate kinase, adenylsulfate kinase, arginine kinase, aspartate kinase, choline kinase, creatine kinase, cytidylate kinase, deoxyadenosine kinase, deoxycytidine kinase, deoxyguanosine kinase, dephospho-CoA kinase, diacylglycerol kinase, dolichol kinase, ethanolamine kinase, galactokinase, glucokinase, glutamate 5-kinase, glycerol kinase, glycerone kinase, guanylate kinase, hexokinase, homoserine kinase, hydroxyethylthiazole kinase, inositol/phosphatidylinositol kinase, ketohexokinase, mevalonate kinase, nucleoside-diphosphate kinase, pantothenate kinase, phosphoenolpyruvate carboxykinase, phosphoglycerate kinase, phosphomevalonate kinase, protein kinase, pyruvate dehydrogenase (lipoamide) kinase, pyruvate kinase, ribokinase, ribose-phosphate pyrophosphokinase, selenide, water dikinase, shikimate kinase, thiamine pyrophosphokinase, thymidine kinase, thymidylate kinase, uridine kinase, xylulokinase, 1D-myo-inositol-trisphosphate 3-kinase, phosphofructokinase, pyridoxal kinase, sphinganine kinase, riboflavin kinase, 2-dehydro-3-deoxygalactonokinase, 2-dehydro-3-deoxygluconokinase, 4-diphosphocytidyl-2C-methyl-D-erythritol kinase, GTP pyrophosphokinase, L-fuculokinase, L-ribulokinase, L-xylulokinase, isocitrate dehydrogenase (NADP+) kinase, acetate kinase, allose kinase, carbamate kinase, cobinamide kinase, diphosphate-purine nucleoside kinase, fructokinase, glycerate kinase, hydroxymethylpyrimidine kinase, hygromycin-B kinase, inosine kinase, kanamycin kinase, phosphomethylpyrimidine kinase, phosphoribulokinase, polyphosphate kinase, propionate kinase, pyruvate, water dikinase, rhamnulokinase, tagatose-6-phosphate kinase, tetraacyldisaccharide 4'-kinase, thiamine-phosphate kinase, undecaprenol kinase, uridylate kinase, N-acylmannosamine kinase, D-erythro-sphingosine kinase.
Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases which may be ameliorated by modulating kinase activity. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, acute lymphoblastic leukemia associated with spleen tyrosine kinase deficiency [Goodman P. A., et al., (2001) Oncogene, 20(30):3969-78], ataxia telangiectasia associated with ATM kinase deficiency [Boultwood J., (2001) J. Clin. Pathol., 54(7):512-6], congenital haemolytic anaemia associated with erythrocyte pyruvate kinase deficiency [Zanella A., et al., (2001) Br. J. Haematol., 113(1):43-8], mevalonic aciduria caused by mevalonate kinase deficiency [Houten S. M., et al., (2001) Eur. J. Hum. Genet., 9(4):253-9], and acute myelogenous leukemia associated with over-expressed death-associated protein kinase [Guzman M. L., et al., (2001) Blood, 97(7):2177-9]. Nucleotide binding: The term "nucleotide binding" refers to ligand binding or carrier proteins, involved in physical interaction with a nucleotide, preferably, any compound consisting of a nucleoside that is esterified with [ortho]phosphate or an oligophosphate at any hydroxyl group on the glycose moiety, such as purine nucleotide binding proteins. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases that are associated with abnormal nucleotide binding. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
Examples of such diseases include, but are not limited to, gout (a syndrome characterized by high urate level in the blood). Since urate is a breakdown metabolite of purines, reducing serum purine levels could have a therapeutic effect in gout. Tubulin binding: The term "tubulin binding" refers to binding proteins that bind tubulin such as microtubule binding proteins. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases which are associated with abnormal tubulin activity or structure. Binding the products of the genes of this family, or antibodies reactive therewith, can modulate a plurality of tubulin activities as well as change microtubulin structure. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, Alzheimer's disease associated with t-complex polypeptide 1 deficiency [Schuller E., et al., (2001) Life Sci., 69(3):263-70], neurodegeneration associated with apoE deficiency [Masliah E., et al., (1995) Exp. Neurol., 136(2):107-22], progressive axonopathy associated with dysfunctional neurofilaments [Griffiths I. R., et al., (1989) Neuropathol. Appl. Neurobiol., 15(1):63-74], familial frontotemporal dementia associated with tau deficiency [Pastor P., et al., (2001) Ann. Neurol., 49(2):263-7], and colon cancer suppressed by APC [White R. L., (1997) Pathol. Biol. (Paris), 45(3):240-4]. An example of a drug whose target is tubulin is the anticancer drug Taxol. Drugs having a similar mechanism of action (interfering with tubulin polymerization) may be developed based on tubulin binding proteins.
Receptor signaling proteins: The phrase "receptor signaling proteins" refers to receptor proteins involved in signal transduction such as receptor signaling protein serine/threonine kinase, receptor signaling protein tyrosine kinase, receptor signaling protein tyrosine phosphatase, aryl hydrocarbon receptor nuclear translocator, hematopoietin/interferon-class (D200-domain) cytokine receptor signal transducer, transmembrane receptor protein tyrosine kinase signaling protein, transmembrane receptor protein serine/threonine kinase signaling protein, receptor signaling protein serine/threonine kinase signaling protein, receptor signaling protein serine/threonine phosphatase signaling protein, small GTPase regulatory/interacting protein, receptor signaling protein tyrosine kinase signaling protein, and receptor signaling protein serine/threonine phosphatase. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the signal-transduction is abnormal, either as a cause, or as a result of the disease. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, complete hypogonadotropic hypogonadism associated with GnRH receptor deficiency [Kottler M. L., et al., (2000) J. Clin. Endocrinol. Metab., 85(9):3002-8], severe combined immunodeficiency disease associated with IL-7 receptor deficiency [Puel A. and Leonard W. J., (2000) Curr. Opin. Immunol., 12(4):468- 7], schizophrenia associated with N-methyl-D-aspartate receptor deficiency [Mohn A.R., et al., (1999) Cell, 98(4):427-36], Yersinia-associated arthritis associated with tumor necrosis factor receptor p55 deficiency [Zhao Y.
X., et al., (1999) Arthritis Rheum., 42(8):1662-72], and Dwarfism of Sindh caused by growth hormone-releasing hormone receptor deficiency [Maheshwari H. G., et al., (1998) J. Clin. Endocrinol. Metab., 83(11):4065-74]. Molecular function unknown: The phrase "molecular function unknown" refers to various proteins with unknown molecular function, such as cell surface antigens. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which regulation of the recognition, or participation or binding of cell surface antigens to other moieties may have therapeutic effect. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, autoimmune diseases, various infectious diseases, cancer diseases which involve non cell surface antigens recognition and activity. Enzyme activators: The term "enzyme activators" refers to enzyme regulators such as activators of: kinases, phosphatases, sphingolipids, chaperones, guanylate cyclase, tryptophan hydroxylase, proteases, phospholipases, caspases, proprotein convertase 2 activator, cyclin-dependent protein kinase 5 activator, superoxide-generating NADPH oxidase activator, sphingomyelin phosphodiesterase activator, monophenol monooxygenase activator, proteasome activator, and GTPase activator. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which beneficial effect may be achieved by modulating the activity of activators of proteins and enzymes.
Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, all complement related diseases, as most complement proteins activate other complement proteins by cleavage. Transferases, transferring one-carbon groups: The phrase "transferases, transferring one-carbon groups" refers to enzymes that catalyze the transfer of a one-carbon chemical group from one molecule to another such as methyltransferase, amidinotransferase, hydroxymethyl-, formyl- and related transferase, carboxyl- and carbamoyltransferase. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the transfer of a one-carbon chemical group from one molecule to another is abnormal so that a beneficial effect may be achieved by modulation of such reaction. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Transferases: The term "transferases" refers to enzymes that catalyze the transfer of a chemical group, preferably, a phosphate or amine from one molecule to another. It includes enzymes such as transferases, transferring one-carbon groups, aldehyde or ketonic groups, acyl groups, glycosyl groups, alkyl or aryl (other than methyl) groups, nitrogenous, phosphorus-containing groups, sulfur-containing groups, lipoyltransferase, deoxycytidyl transferases.
Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the transfer of a chemical group from one molecule to another is abnormal. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, cancerous diseases such as prostate cancer [Urology. 2003, 62(5 Suppl 1):55-62] or lung cancer [Invest. New Drugs. 2003, 21(4):435-43; JAMA. 2003, 22;290(16):2149-58], psychiatric disorders [Am. J. Med. Genet. 2003, 15;123B(1):64-9], colorectal disease such as Crohn's disease [Dis. Colon Rectum. 2003, 46(11):1498-507] or celiac diseases [N. Engl. J. Med. 2003, 349(17):1673-4; author reply 1673-4], neurological diseases such as Parkinson's disease [J. Chem Neuroanat. 2003, 26(2):143-51], Alzheimer disease [Hum. Mol. Genet. 2003 21] or Charcot-Marie-Tooth Disease [Mol. Biol. Evol. 2003 31]. Chaperones: The term "chaperones" refers to functional classes of unrelated families of proteins that assist the correct non-covalent assembly of other polypeptide-containing structures in vivo, but are not components of these assembled structures when they are performing their normal biological function. The group of chaperones includes proteins such as ribosomal chaperone, peptidylprolyl isomerase, lectin-binding chaperone, nucleosome assembly chaperone, chaperonin ATPase, cochaperone, heat shock protein, HSP70/HSP90 organizing protein, fimbrial chaperone, metallochaperone, tubulin folding, and HSC70-interacting protein.
Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases which are associated with abnormal protein activity, structure, degradation or accumulation of proteins. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, neurological syndromes [J. Neuropathol. Exp. Neurol. 2003, 62(7):751-64; Antioxid Redox Signal. 2003, 5(3):337-48; J. Neurochem. 2003, 86(2):394-404], neurological diseases such as Parkinson's disease [Hum. Genet. 2003, 6; Neurol Sci. 2003, 24(3):159-60; J. Neurol. 2003, 250 Suppl. 3:11125-11129], ataxia [J. Hum. Genet. 2003;48(8):415-9] or Alzheimer diseases [J. Mol. Neurosci. 2003, 20(3):283-6; J. Alzheimers Dis. 2003, 5(3):171-7], cancerous diseases [Semin. Oncol. 2003, 30(5):709-16], prostate cancer [Semin. Oncol. 2003, 30(5):709-16], metabolic diseases [J Neurochem. 2003, 87(1):248-56], infectious diseases, such as prion infection [EMBO J. 2003, 22(20):5435-5445]. Chaperones may also be used for manipulating the binding of therapeutic proteins to their receptors, thereby improving their therapeutic effect. Cell adhesion molecule: The phrase "cell adhesion molecule" refers to proteins that serve as adhesion molecules between adjoining cells such as membrane-associated protein with guanylate kinase activity, cell adhesion receptor, neuroligin, calcium-dependent cell adhesion molecule, selectin, calcium-independent cell adhesion molecule, and extracellular matrix protein.
Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which adhesion between adjoining cells is involved, typically conditions in which the adhesion is abnormal. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, cancer in which abnormal adhesion may cause and enhance the process of metastasis and abnormal growth and development of various tissues in which modulation of adhesion among adjoining cells can improve the condition. Leucocyte-endothelial interactions characterized by adhesion molecules involved in interactions between cells lead to tissue injury and ischemia reperfusion disorders in which activated signals generated during ischemia may trigger an exuberant inflammatory response during reperfusion, provoking greater tissue damage than the initial ischemic insult [Crit. Care Med. 2002, 30(5 Suppl):S214-9]. The blockade of leucocyte-endothelial adhesive interactions has the potential to reduce vascular and tissue injury. This blockade may be achieved using a soluble variant of the adhesion molecule. States of septic shock and ARDS involve large recruitment of neutrophil cells to the damaged tissues. Neutrophil cells bind to the endothelial cells in the target tissues through adhesion molecules. Neutrophils possess multiple effector mechanisms that can produce endothelial and lung tissue injury, and interfere with pulmonary gas transfer by disruption of surfactant activity [Eur. J. Surg. 2002, 168(4):204-14]. In such cases, the use of a soluble variant of the adhesion molecule may decrease the adhesion of neutrophils to the damaged tissues.
Examples of such diseases include, but are not limited to, Wiskott-Aldrich syndrome associated with WAS deficiency [Westerberg L., et al., (2001) Blood, 98(4):1086-94], asthma associated with intercellular adhesion molecule-1 deficiency [Tang M. L. and Fiscus L. C., (2001) Pulm. Pharmacol. Ther., 14(3):203-10], intra-atrial thrombogenesis associated with increased von Willebrand factor activity [Fukuchi M., et al., (2001) J. Am. Coll. Cardiol., 37(5):1436-42], junctional epidermolysis bullosa associated with laminin 5-β-3 deficiency [Robbins P. B., et al., (2001) Proc. Natl. Acad. Sci., 98(9):5193-8], and hydrocephalus caused by neural adhesion molecule L1 deficiency [Rolf B., et al., (2001) Brain Res., 891(1-2):247-52]. Motor proteins: The term "motor proteins" refers to proteins that generate force or energy by the hydrolysis of ATP and that function in the production of intracellular movement or transportation. Examples of such proteins include microfilament motor, axonemal motor, microtubule motor, and kinetochore motor (dynein, kinesin, or myosin). Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which force or energy generation is impaired. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, malignant diseases where microtubules are drug targets for a family of anticancer drugs such as myodystrophies and myopathies [Trends Cell Biol. 2002, 12(12):585-91], neurological disorders [Neuron. 2003,
25;40(1):25-40; Trends Biochem. Sci. 2003, 28(10):558-65; Med. Genet. 2003, 40(9):671-5], and hearing impairment [Trends Biochem. Sci. 2003, 28(10):558-65]. Defense/immunity proteins: The term "defense/immunity proteins" refers to proteins that are involved in the immune and complement systems such as acute-phase response proteins, antimicrobial peptides, antiviral response proteins, blood coagulation factors, complement components, immunoglobulins, major histocompatibility complex antigens and opsonins. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases involving the immunological system including inflammation, autoimmune diseases, infectious diseases, as well as cancerous processes or diseases which are manifested by abnormal coagulation processes, which may include abnormal bleeding or excessive coagulation. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, late (C5-9) complement component deficiency associated with opsonin receptor allotypes [Fijen C. A., et al., (2000) Clin. Exp. Immunol., 120(2):338-45], combined immunodeficiency associated with defective expression of MHC class II genes [Griscelli C., et al., (1989) Immunodefic. Rev. 1(2):135-53], loss of antiviral activity of CD4 T cells caused by neutralization of endogenous TNFα [Pavic I., et al., (1993) J. Gen. Virol., 74 (Pt 10):2215-23], autoimmune diseases associated with natural resistance-associated macrophage protein deficiency [Evans C. A., et al., (2001) Neurogenetics, 3(2):69-78], Epstein-Barr virus-associated lymphoproliferative disease inhibited by combined GM-CSF and IL-2 therapy [Baiocchi R. A., et al., (2001) J. Clin.
Invest., 108(6):887-94], multiple sclerosis in which recombinant proteins from the interferons family are the treatment of choice and sepsis in which activated protein C is a therapeutic protein itself. Intracellular transporters: The term "intracellular transporters" refers to proteins that mediate the transport of molecules and macromolecules inside the cell, such as intracellular nucleoside transporter, vacuolar assembly proteins, vesicle transporters, vesicle fusion proteins, type II protein secretors. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the transport of molecules and macromolecules is abnormal leading to various pathologies. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Transporters: The term "transporters" refers to proteins that mediate the transport of molecules and macromolecules, such as channels, exchangers, and pumps. Transporters include proteins such as: amine/polyamine transporter, lipid transporter, neurotransmitter transporter, organic acid transporter, oxygen transporter, water transporter, carriers, intracellular transports, protein transporters, ion transporters, carbohydrate transporter, polyol transporter, amino acid transporters, vitamin/cofactor transporters, siderophore transporter, drug transporter, channel/pore class transporter, group translocator, auxiliary transport proteins, permeases, murein transporter, organic alcohol transporter, nucleobase, nucleoside, and nucleotide and nucleic acid transporters.
Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the transport of molecules and macromolecules such as neurotransmitters, hormones, sugar etc. is impaired leading to various pathologies. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, glycogen storage disease caused by glucose-6-phosphate transporter deficiency [Hiraiwa H., and Chou J. Y. (2001) DNA Cell Biol., 20(8):447-53], Tangier disease associated with ATP-binding cassette transporter-1 deficiency [McNeish J., et al., (2000) Proc. Natl. Acad. Sci., 97(8):4245-50], systemic primary carnitine deficiency associated with organic cation transporter deficiency [Tang N. L., et al., (1999) Hum. Mol. Genet., 8(4):655-60], Wilson disease associated with copper-transporting ATPases deficiency [Payne A. S., et al., (1998) Proc. Natl. Acad. Sci. 95(18):10854-9], and atelosteogenesis associated with diastrophic dysplasia sulphate transporter deficiency [Newbury-Ecob R., (1998) J. Med. Genet., 35(1):49-53], central nervous system diseases treated by inhibiting neurotransmitter transporters (e.g. depression, treated with serotonin transporter inhibitors such as Prozac), and cystic fibrosis mediated by the chloride channel CFTR. Other transporter related diseases are cancer [Oncogene. 2003, 22(38):6005-12] and especially cancer resistant to treatment [Oncologist. 2003, 8(5):411-24; J. Med. Invest. 2003, 50(3-4):126-35], infectious diseases, especially fungal infections [Annu. Rev. Phytopathol. 2003, 41:641-67], neurological diseases, such as Parkinson [FASEB J.
2003, Sep 4 [Epub ahead of print]], diabetes where the ATP-sensitive potassium channel in beta cells is the target for insulin secretagogues, hypertension where calcium channels are the target for calcium blockers, and cardiovascular diseases, including hypercholesterolemia [Am. J. Cardiol. 2003, 92(4B):10K-16K]. There are about 30 membrane transporter genes linked to a known genetic clinical syndrome. Secreted versions of splice variants of transporters may be therapeutic, as is the case with soluble receptors. These transporters may have the capability to bind in the serum the compound they would normally bind on the membrane. For example, a secreted form of ATP7B, a transporter involved in Wilson's disease, is expected to bind plasma copper, and therefore to have a desired therapeutic effect in Wilson's disease.
Lyases: The term "lyases" refers to enzymes that catalyze the formation of double bonds by removing chemical groups from a substrate without hydrolysis or catalyze the addition of chemical groups to double bonds. It includes enzymes such as carbon-carbon lyase, carbon-oxygen lyase, carbon-nitrogen lyase, carbon-sulfur lyase, carbon-halide lyase, and phosphorus-oxygen lyase. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the double bonds formation catalyzed by these enzymes is impaired. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, autoimmune diseases [JAMA. 2003, 290(13):1721-8; JAMA. 2003, 290(13):1713-20], diabetes [Diabetes. 2003, 52(9):2274-8], neurological disorders such as epilepsy [J. Neurosci. 2003, 23(24):8471-9], Parkinson [J. Neurosci. 2003, 23(23):8302-9; Lancet. 2003, 362(9385):712] or Creutzfeldt-Jakob disease [Clin. Neurophysiol. 2003, 114(9):1724-8], and cancerous diseases [J. Pathol. 2003, 201(1):37-45; Cancer Res. 2003, 63(16):4952-9; Eur. J. Cancer. 2003, 39(13):1899-903]. Actin binding proteins: The phrase "actin binding proteins" refers to proteins binding actin such as actin cross-linking, actin bundling, F-actin capping, actin monomer binding, actin lateral binding, actin depolymerizing, actin monomer sequestering, actin filament severing, actin modulating, membrane associated actin binding, actin thin filament length regulation, and actin polymerizing proteins.
Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which actin binding is impaired. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, neuromuscular diseases such as muscular dystrophy [Neurology. 2003, 61(3):404-6], cancerous diseases [Urology. 2003, 61(4):845-50; J. Cutan. Pathol. 2002, 29(7):430; Cancer. 2002, 94(6):1777-86; Clin. Cancer Res. 2001, 7(8):2415-24; Breast Cancer Res. Treat. 2001, 65(1):11-21], renal diseases such as glomerulonephritis [J. Am. Soc. Nephrol. 2002, 13(2):322-31; Eur. J. Immunol. 2001, 31(4):1221-7], and gastrointestinal diseases such as Crohn's disease [J. Cell Physiol. 2000, 182(2):303-9]. Protein binding proteins: The phrase "protein binding proteins" refers to proteins involved in diverse biological functions through binding other proteins.
Examples of such biological functions include intermediate filament binding, LIM-domain binding, LRR-domain binding, clathrin binding, ARF binding, vinculin binding, KU70 binding, troponin C binding, PDZ-domain binding, SH3-domain binding, fibroblast growth factor binding, membrane-associated protein with guanylate kinase activity interacting, Wnt-protein binding, DEAD/H-box RNA helicase binding, β-amyloid binding, myosin binding, TATA-binding protein binding, DNA topoisomerase I binding, polypeptide hormone binding, RHO binding, FH1-domain binding, syntaxin-1 binding, HSC70-interacting, transcription factor binding, metarhodopsin binding, tubulin binding, JUN kinase binding, RAN protein binding, protein signal sequence binding, importin α export receptor, poly-glutamine tract binding, protein carrier, β-catenin binding, protein C-terminus binding, lipoprotein binding, cytoskeletal protein binding protein, nuclear localization sequence binding, protein phosphatase 1 binding, adenylate cyclase binding, eukaryotic initiation factor 4E binding, calmodulin binding, collagen binding, insulin-like growth factor binding, lamin binding, profilin binding, tropomyosin binding, actin binding, peroxisome targeting sequence binding, SNARE binding, and cyclin binding. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases which are associated with impaired protein binding. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, neurological and psychiatric diseases [J. Neurosci. 2003, 23(25):8788-99; Neurobiol. Dis. 2003, 14(1):146-56; J. Neurosci. 2003, 23(17):6956-64; Am. J. Pathol.
2003, 163(2):609-19], and cancerous diseases [Cancer Res. 2003, 63(15):4299-304; Semin. Thromb. Hemost. 2003, 29(3):247-58; Proc. Natl. Acad. Sci. U S A. 2003, 100(16):9506-11].
Ligand binding or carrier proteins: The phrase "ligand binding or carrier proteins" refers to proteins involved in diverse biological functions such as: pyridoxal phosphate binding, carbohydrate binding, magnesium binding, amino acid binding, cyclosporin A binding, nickel binding, chlorophyll binding, biotin binding, penicillin binding, selenium binding, tocopherol binding, lipid binding, drug binding, oxygen transporter, electron transporter, steroid binding, juvenile hormone binding, retinoid binding, heavy metal binding, calcium binding, protein binding, glycosaminoglycan binding, folate binding, odorant binding, lipopolysaccharide binding and nucleotide binding. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases which are associated with impaired function of these proteins. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, neurological disorders [J. Med. Genet. 2003, 40(10):733-40; J. Neuropathol. Exp. Neurol. 2003, 62(9):968-75; J. Neurochem. 2003, 87(2):427-36], autoimmune diseases [N. Engl. J. Med. 2003, 349(16):1526-33; JAMA. 2003, 290(13):1721-8], gastroesophageal reflux disease [Dig. Dis. Sci. 2003, 48(9):1832-8], cardiovascular diseases [J. Vasc. Surg. 2003, 38(4):827-32], cancerous diseases [Oncogene. 2003, 22(43):6699-703; Br. J. Haematol. 2003, 123(2):288-96], respiratory diseases [Circulation. 2003, 108(15):1839-44], and ophthalmic diseases [Ophthalmology. 2003, 110(10):2040-4; Am. J. Ophthalmol. 2003, 136(4):729-32].
ATPases: The term "ATPases" refers to enzymes that catalyze the hydrolysis of ATP to ADP, releasing energy that is used in the cell. This group includes enzymes such as plasma membrane cation-transporting ATPase, ATP-binding cassette (ABC) transporter, magnesium-ATPase, hydrogen-/sodium-translocating ATPase or ATPase translocating any other elements, arsenite-transporting ATPase, protein-transporting ATPase, DNA translocase, P-type ATPase, and hydrolase, acting on acid anhydrides involved in cellular and subcellular movement. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases which are associated with impaired hydrolysis of ATP to ADP or impaired use of the resulting energy. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, infectious diseases such as Helicobacter pylori ulcers [BMC Gastroenterology 2003, 3:31 (published 6 November 2003)], neurological, muscular and psychiatric diseases [Int. J. Neurosci. 2003, 113(12):1705-1717; Int. J. Neurosci. 2003, 113(11):1579-1591; Ann. Neurol. 2003, 54(4):494-500], Amyotrophic Lateral Sclerosis [Other Motor Neuron Disord. 2003, 4(2):96-9], cardiovascular diseases [J. Nippon. Med. Sch. 2003, 70(5):384-92; Endocrinology. 2003, 144(10):4478-83], metabolic diseases [Mol. Pathol. 2003, 56(5):302-4; Neurosci. Lett. 2003, 350(2):105-8], and peptic ulcer disease, which is treated with inhibitors (e.g. omeprazole) of the gastric H+-K+ ATPase responsible for acid secretion in the gastric mucosa.
Carboxylic ester hydrolases: The phrase "carboxylic ester hydrolases" refers to hydrolytic enzymes acting on carboxylic ester bonds such as N-acetylglucosaminylphosphatidylinositol deacetylase, 2-acetyl-1-alkylglycerophosphocholine esterase, aminoacyl-tRNA hydrolase, arylesterase, carboxylesterase, cholinesterase, gluconolactonase, sterol esterase, acetylesterase, carboxymethylenebutenolidase, protein-glutamate methylesterase, lipase, and 6-phosphogluconolactonase. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the hydrolytic cleavage of a covalent bond with accompanying addition of water (-H being added to one product of the cleavage and -OH to the other) is abnormal, so that a beneficial effect may be achieved by modulation of such a reaction. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, the autoimmune neuromuscular disease myasthenia gravis, which is treated with cholinesterase inhibitors.
Hydrolase, acting on ester bonds: The phrase "hydrolase, acting on ester bonds" refers to hydrolytic enzymes acting on ester bonds such as nucleases, sulfuric ester hydrolase, carboxylic ester hydrolases, thiolester hydrolase, phosphoric monoester hydrolase, phosphoric diester hydrolase, triphosphoric monoester hydrolase, diphosphoric monoester hydrolase, and phosphoric triester hydrolase.
Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the hydrolytic cleavage of a covalent bond with accompanying addition of water (-H being added to one product of the cleavage and -OH to the other) is abnormal. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
Hydrolases: The term "hydrolases" refers to hydrolytic enzymes such as GPI-anchor transamidase, peptidases, and hydrolases acting on ester bonds, glycosyl bonds, ether bonds, carbon-nitrogen (but not peptide) bonds, acid anhydrides, acid carbon-carbon bonds, acid halide bonds, acid phosphorus-nitrogen bonds, acid sulfur-nitrogen bonds, acid carbon-phosphorus bonds, or acid sulfur-sulfur bonds. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which the hydrolytic cleavage of a covalent bond with accompanying addition of water (-H being added to one product of the cleavage and -OH to the other) is abnormal. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, cancerous diseases [Cancer.
2003, 98(9):1842-8; Cancer. 2003, 98(9):1822-9], neurological diseases such as Parkinson's disease [J. Neurol. 2003, 250 Suppl 3:III15-III24; J. Neurol. 2003, 250 Suppl 3:III2-III10], endocrinological diseases such as pancreatitis [Pancreas. 2003, 27(4):291-6] or childhood genetic diseases [Eur. J. Pediatr. 1997, 156(12):935-8], coagulation diseases [BMJ. 2003, 327(7421):974-7], cardiovascular diseases [Ann. Intern. Med. 2003, 139(8):670-82], autoimmunity diseases [J. Med. Genet. 2003, 40(10):761-6], and metabolic diseases [Am. J. Hum. Genet. 2001, 69(5):1002-12].
Enzymes: The term "enzymes" refers to naturally occurring or synthetic macromolecular substances, composed mostly of protein, that catalyze, with varying degrees of specificity, at least one (bio)chemical reaction at relatively low temperatures. The action of RNA that has catalytic activity (ribozyme) is often also regarded as enzymatic. Nevertheless, enzymes are mainly proteinaceous and are often easily inactivated by heating or by protein-denaturing agents. The substances upon which they act are known as substrates, for which the enzyme possesses a specific binding or active site.
The group of enzymes includes various proteins possessing enzymatic activities such as mannosylphosphate transferase, para-hydroxybenzoate:polyprenyltransferase, Rieske iron-sulfur protein, imidazoleglycerol-phosphate synthase, sphingosine hydroxylase, tRNA 2'-phosphotransferase, sterol C-24(28) reductase, C-8 sterol isomerase, C-22 sterol desaturase, C-14 sterol reductase, C-3 sterol dehydrogenase (C-4 sterol decarboxylase), 3-keto sterol reductase, C-4 methyl sterol oxidase, dihydronicotinamide riboside quinone reductase, glutamate phosphate reductase, DNA repair enzyme, telomerase, α-ketoacid dehydrogenase, β-alanyl-dopamine synthase, RNA editase, aldo-keto reductase, alkylbase DNA glycosidase, glycogen debranching enzyme, dihydropterin deaminase, dihydropterin oxidase, dimethylnitrosamine demethylase, ecdysteroid UDP-glucosyl/UDP glucuronosyl transferase, glycine cleavage system, helicase, histone deacetylase, mevaldate reductase, monooxygenase, poly(ADP-ribose) glycohydrolase, pyruvate dehydrogenase, serine esterase, sterol carrier protein X-related thiolase, transposase, tyramine-β hydroxylase, para-aminobenzoic acid (PABA) synthase, glu-tRNA(gln) amidotransferase, molybdopterin cofactor sulfurase, lanosterol 14-α-demethylase, aromatase, 4-hydroxybenzoate octaprenyltransferase, 7,8-dihydro-8-oxoguanine-triphosphatase, CDP-alcohol phosphotransferase, 2,5-diamino-6-(ribosylamino)-4(3H)-pyrimidinone 5'-phosphate deaminase, diphosphoinositol polyphosphate phosphohydrolase, γ-glutamyl carboxylase, small protein conjugating enzyme, small protein activating enzyme, 1-deoxyxylulose-5-phosphate synthase, 2'-phosphotransferase, 2-octoprenyl-3-methyl-6-methoxy-1,4-benzoquinone hydroxylase, 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase, 3,4-dihydroxy-2-butanone-4-phosphate synthase, 4-amino-4-deoxychorismate lyase, 4-diphosphocytidyl-2C-methyl-D-erythritol synthase, ADP-L-glycero-D-manno-heptose synthase, D-erythro-7,8-dihydroneopterin triphosphate 2'-epimerase,
N-ethylmaleimide reductase, O-antigen ligase, O-antigen polymerase, UDP-2,3-diacylglucosamine hydrolase, arsenate reductase, carnitine racemase, cobalamin [5'-phosphate] synthase, cobinamide phosphate guanylyltransferase, enterobactin synthetase, enterochelin esterase, enterochelin synthetase, glycolate oxidase, integrase, lauroyl transferase, peptidoglycan synthetase, phosphopantetheinyltransferase, phosphoglucosamine mutase, phosphoheptose isomerase, quinolinate synthase, siroheme synthase, N-acylmannosamine-6-phosphate 2-epimerase, N-acetyl-anhydromuramoyl-L-alanine amidase, carbon-phosphorus lyase, heme-copper terminal oxidase, disulfide oxidoreductase, phthalate dioxygenase reductase, sphingosine-1-phosphate lyase, molybdopterin oxidoreductase, dehydrogenase, NADPH oxidase, naringenin-chalcone synthase, N-ethylammeline chlorohydrolase, polyketide synthase, aldolase, kinase, phosphatase, CoA-ligase, oxidoreductase, transferase, hydrolase, lyase, isomerase, ligase, ATPase, sulfhydryl oxidase, lipoate-protein ligase, δ-1-pyrroline-5-carboxylate synthetase, lipoic acid synthase, and tRNA dihydrouridine synthase. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases which can be ameliorated by modulating the activity of various enzymes which are involved both in enzymatic processes inside cells and in cell signaling. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases.
Examples of such diseases include, but are not limited to, diabetes, where alpha-glucosidase is the target for drugs which delay glucose absorption, osteoporosis, where farnesyl diphosphate synthase is the target for bisphosphonates, thyroid autoimmune disease associated with thyroid peroxidase, mucopolysaccharidoses associated with defects in lysosomal enzymes, Tay-Sachs disease associated with defects in β-hexosaminidase, and hypertension, where Angiotensin Converting Enzyme is the target for the common hypertension drugs, ACE inhibitors.
Cytoskeletal proteins: The term "cytoskeletal proteins" refers to proteins involved in the structure formation of the cytoskeleton. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases which are caused by or due to abnormalities in the cytoskeleton, including cancerous cells, and diseased cells such as cells that do not propagate, grow or function normally. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, liver diseases such as cholestatic diseases [Lancet. 2003, 362(9390):1112-9], vascular diseases [J. Cell Biol. 2003, 162(6):1111-22], endocrinological diseases [Cancer Res. 2003, 63(16):4836-41], neuromuscular disorders such as muscular dystrophy [Neuromuscul. Disord. 2003, 13(7-8):579-88] or myopathy [Neuromuscul. Disord. 2003, 13(6):456-67], neurological disorders such as Alzheimer's disease [J. Alzheimers Dis. 2003, 5(3):209-28], cardiac disorders [J. Am. Coll. Cardiol. 2003, 42(2):319-27], skin disorders [J. Am. Coll. Cardiol. 2003, 42(2):319-27], and cancer [Proteomics. 2003, 3(6):979-90].
Structural proteins: The term "structural proteins" refers to proteins involved in the structure formation of the cell, such as structural proteins of ribosome, cell wall structural proteins, structural proteins of cytoskeleton, extracellular matrix structural proteins, extracellular matrix glycoproteins, amyloid proteins, plasma proteins, structural proteins of eye lens, structural protein of chorion (sensu Insecta), structural protein of cuticle (sensu Insecta), puparial glue protein (sensu Diptera), structural proteins of bone, yolk proteins, structural proteins of muscle, structural protein of vitelline membrane (sensu Insecta), structural proteins of peritrophic membrane (sensu Insecta), and structural proteins of nuclear pores. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases which are caused by abnormalities in the cytoskeleton, including cancerous cells, and diseased cells such as cells that do not propagate, grow or function normally. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, blood vessel diseases such as aneurysms [Cardiovasc. Res. 2003, 60(1):205-13], joint diseases [Rheum. Dis. Clin. North Am. 2003, 29(3):631-45], muscular diseases such as muscular dystrophies [Curr. Opin. Clin. Nutr. Metab. Care. 2003, 6(4):435-9], neuronal diseases such as encephalitis [Neurovirol. 2003, 9(2):274-83], retinitis pigmentosa [Dev. Ophthalmol. 2003, 37:109-25], and infectious diseases [J. Virol. Methods. 2003, 109(1):75-83; FEMS Immunol. Med. Microbiol. 2003, 35(2):125-30; J. Exp. Med. 2003, 197(5):633-42].
Ligands: The term "ligands" refers to proteins that bind to another chemical entity to form a larger complex, involved in various biological processes, such as signal transduction, metabolism, growth and differentiation, etc. This group of proteins includes opioid peptides, baboon receptor ligand, branchless receptor ligand, breathless receptor ligand, ephrin, frizzled receptor ligand, frizzled-2 receptor ligand, heartless receptor ligand, Notch receptor ligand, patched receptor ligand, punt receptor ligand, Ror receptor ligand, saxophone receptor ligand, SE20 receptor ligand, sevenless receptor ligand, smooth receptor ligand, thickveins receptor ligand, Toll receptor ligand, Torso receptor ligand, death receptor ligand, scavenger receptor ligand, neuroligin, integrin ligand, hormones, pheromones, growth factors, and sulfonylurea receptor ligand. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases involving impaired hormone function or diseases which involve abnormal secretion of proteins, which may be due to abnormal presence or absence of secreted proteins or to an impaired response to normal levels of secreted proteins. Those secreted proteins include hormones, neurotransmitters, and various other proteins secreted by cells to the extracellular environment. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, analgesia inhibited by orphanin FQ/nociceptin [Shane R., et al., (2001) Brain Res., 907(1-2):109-16], stroke protected by estrogen [Alkayed N. J., et al., (2001) J. Neurosci., 21(19):7543-50], atherosclerosis associated with growth hormone deficiency [Elhadd T. A., et al., (2001) J. Clin. Endocrinol.
Metab., 86(9):4223-32], diabetes inhibited by α-galactosylceramide [Hong S., et al., (2001) Nat. Med., 7(9):1052-6], and Huntington's disease [Rao D. S., et al., (2001) Mol. Cell Biol., 21(22):7796-806].
Signal transducers: The term "signal transducers" refers to proteins such as activin inhibitors, receptor-associated proteins, α-2 macroglobulin receptors, morphogens, quorum sensing signal generators, quorum sensing response regulators, receptor signaling proteins, ligands, receptors, two-component sensor molecules, and two-component response regulators. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases in which signal transduction is impaired, either as a cause or as a result of the disease. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, altered sexual dimorphism associated with signal transducer and activator of transcription 5b [Udy G. B., et al., (1997) Proc. Natl. Acad. Sci. U S A, 94(14):7239-44], multiple sclerosis associated with sgp130 deficiency [Padberg F., et al., (1999) J. Neuroimmunol., 99(2):218-23], intestinal inflammation associated with elevated signal transducer and activator of transcription 3 activity [Suzuki A., et al., (2001) J. Exp. Med., 193(4):471-81], carcinoid tumor inhibited by increased signal transducer and activators of transcription 1 and 2 [Zhou Y., et al., (2001) Oncology, 60(4):330-8], and esophageal cancer associated with loss of EGF-STAT1 pathway [Watanabe G., et al., (2001) Cancer J., 7(2):132-9].
RNA polymerase II transcription factors: The phrase "RNA polymerase II transcription factors" refers to proteins such as specific and non-specific RNA polymerase II transcription factors, enhancer binding, ligand-regulated transcription factor, and general RNA polymerase II transcription factors. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases involving impaired function of RNA polymerase II transcription factors. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, cardiac diseases [Cell Cycle.
2003, 2(2):99-104], xeroderma pigmentosum [Bioessays. 2001, 23(8):671-3; Biochim. Biophys. Acta. 1997, 1354(3):241-51], muscular atrophy [J. Cell Biol. 2001, 152(1):75-85], neurological diseases such as Alzheimer's disease [Front Biosci. 2000, 5:D244-57], cancerous diseases such as breast cancer [Biol. Chem. 1999, 380(2):117-28], and autoimmune disorders [Clin. Exp. Immunol. 1997, 109(3):488-94].
RNA binding proteins: The phrase "RNA binding proteins" refers to RNA binding proteins involved in splicing and translation regulation such as tRNA binding proteins, RNA helicases, double-stranded RNA and single-stranded RNA binding proteins, mRNA binding proteins, snRNA cap binding proteins, 5S RNA and 7S RNA binding proteins, poly-pyrimidine tract binding proteins, snRNA binding proteins, and AU-specific RNA binding proteins. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases involving transcription and translation factors such as helicases, isomerases, histones and nucleases, or diseases in which transcription, splicing, post-transcriptional processing, translation or stability of the RNA is impaired. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, cancerous diseases such as lymphomas [Tumori. 2003, 89(3):278-84], prostate cancer [Prostate. 2003, 57(1):80-92] or lung cancer [J. Pathol. 2003, 200(5):640-6], blood diseases such as Fanconi anemia [Curr. Hematol.
Rep. 2003, 2(4):335-40], cardiovascular diseases such as atherosclerosis [J. Thromb. Haemost.
2003, 1(7):1381-90], muscle diseases [Trends Cardiovasc. Med. 2003, 13(5):188-95] and brain and neuronal diseases [Trends Cardiovasc. Med. 2003, 13(5):188-95; Neurosci. Lett. 2003, 342(1-2):41-4].
Nucleic acid binding proteins: The phrase "nucleic acid binding proteins" refers to proteins involved in RNA and DNA synthesis and expression regulation such as transcription factors, RNA and DNA binding proteins, zinc fingers, helicase, isomerase, histones, nucleases, ribonucleoproteins, and transcription and translation factors. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases involving DNA or RNA binding proteins such as helicases, isomerases, histones and nucleases, for example diseases in which there is abnormal replication of DNA or abnormal transcription of DNA into RNA. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such diseases include, but are not limited to, neurological diseases such as retinitis pigmentosa [Am. J. Ophthalmol. 2003, 13 (4):678-87], parkinsonism [Proc. Natl. Acad.
Sci. U S A. 2003, 100(18):10347-52], Alzheimer's disease [J. Neurosci. 2003, 23(17):6914-27] and Canavan disease [Brain Res Bull. 2003, 61(4):427-35], cancerous diseases such as leukemia [Anticancer Res. 2003, 23(4):3419-26] or lung cancer [J. Pathol. 2003, 200(5):640-6], myopathy [Neuromuscul Disord. 2003, 13(7-8):559-67] and liver diseases [J. Pathol. 2003, 200(5):553-60].
Proteins involved in metabolism: The phrase "proteins involved in metabolism" refers to proteins involved in the totality of the chemical reactions and physical changes that occur in living organisms, comprising anabolism and catabolism; the term may be qualified to mean the chemical reactions and physical processes undergone by a particular substance, or class of substances, in a living organism. This group includes proteins involved in the reactions of cell growth and maintenance such as: metabolism resulting in cell growth, carbohydrate metabolism, energy pathways, electron transport, nucleobase, nucleoside, nucleotide and nucleic acid metabolism, protein metabolism and modification, amino acid and derivative metabolism, protein targeting, lipid metabolism, aromatic compound metabolism, one-carbon compound metabolism, coenzymes and prosthetic group metabolism, sulfur metabolism, phosphorus metabolism, phosphate metabolism, oxygen and radical metabolism, xenobiotic metabolism, nitrogen metabolism, fat body metabolism (sensu Insecta), protein localization, catabolism, biosynthesis, toxin metabolism, methylglyoxal metabolism, cyanate metabolism, glycolate metabolism, carbon utilization and antibiotic metabolism. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat diseases involving cell metabolism. Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases. Examples of such metabolism-related diseases include, but are not limited to, multisystem mitochondrial disorder caused by mitochondrial DNA cytochrome C oxidase II deficiency [Campos Y., et al., (2001) Ann. Neurol.,
50(3):409-13], conduction defects and ventricular dysfunction in the heart associated with heterogeneous connexin43 expression [Gutstein D. E., et al., (2001) Circulation, 104(10):1194-9], atherosclerosis associated with growth suppressor p27 deficiency [Diez-Juan A., and Andres V. (2001) FASEB J., 15(11):1989-95], colitis associated with glutathione peroxidase deficiency [Esworthy R. S., et al., (2001) Am. J. Physiol. Gastrointest. Liver Physiol., 281(3):G848-55], systemic lupus erythematosus associated with deoxyribonuclease I deficiency [Yasutomo K., et al., (2001) Nat. Genet., 28(4):313-4], alcoholic pancreatitis [Pancreas. 2003, 27(4):281-5], amyloidosis and diseases that are related to amyloid metabolism, such as FMF, atherosclerosis, diabetes (especially the long term consequences of diabetes), and neurological diseases such as Creutzfeldt-Jakob disease, Parkinson's disease or Rasmussen's encephalitis.
Cell growth and/or maintenance proteins: The phrase "cell growth and/or maintenance proteins" refers to proteins involved in any biological process required for cell survival, growth and maintenance, including proteins involved in biological processes such as cell organization and biogenesis, cell growth, cell proliferation, metabolism, cell cycle, budding, cell shape and cell size control, sporulation (sensu Saccharomyces), transport, ion homeostasis, autophagy, cell motility, chemi-mechanical coupling, membrane fusion, cell-cell fusion, and stress response. Pharmaceutical compositions including such proteins or protein encoding sequences, antibodies directed against such proteins or polynucleotides capable of altering expression of such proteins, may be used to treat or prevent diseases such as cancer, degenerative diseases, for example neurodegenerative diseases or conditions associated with aging, or alternatively, diseases in which apoptosis that should have taken place does not take place.
Antibodies and polynucleotides such as PCR primers and molecular probes designed to identify such proteins or protein encoding sequences may be used for diagnosis of such diseases, detection of predisposition to a disease, and determination of the stage of a disease. Examples of such diseases include, but are not limited to, ataxia-telangiectasia associated with ataxia-telangiectasia mutated deficiency [Hande et al., (2001) Hum. Mol. Genet., 10(5):519-28], osteoporosis associated with osteonectin deficiency [Delany et al., (2000) J. Clin. Invest., 105(7):915-23], arthritis caused by membrane-bound matrix metalloproteinase deficiency [Holmbeck et al., (1999) Cell, 99(1):81-92], defective stratum corneum and early neonatal death associated with transglutaminase 1 deficiency [Matsuki et al., (1998) Proc. Natl. Acad. Sci. U S A, 95(3):1044-9], and Alzheimer's disease associated with estrogen [Simpkins et al., (1997) Am. J. Med., 103(3A):19S-25S].
Variants of proteins which accumulate an element/compound
Variant proteins whose wild type version naturally binds a certain compound or element inside the cell, such as for storage, may have a therapeutic effect as secreted variants. For example, ferritin accumulates iron inside cells. A secreted variant of this protein is expected to bind plasma iron and reduce its levels, and thereby to have therapeutic effects in blood disorders which are characterized by high levels of free iron in the blood.
Autoantigens
Autoantigens refer to "self" proteins which evoke an autoimmune response. Examples of autoantigens are listed in Table 8, below. Secreted splice variants of such autoantigens can be used to treat such autoimmune disorders. Since autoimmune disorders are occasionally accompanied by different autoimmune manifestations (including but not limited to multiple endocrine syndromes, i.e., syndrome), the secreted variants of the present invention may treat these multiple symptoms.
Therapeutic mechanisms of such variants may include: (i) sequestration of auto-antibodies to thereby reduce their circulating levels; (ii) antigen-specific immunotherapy, based on the observation that prior systemic administration of a protein antigen could inhibit the subsequent generation of the immune response to the same antigen (this has been demonstrated in mouse models of myasthenia gravis and type I diabetes). In addition, any novel variant of autoantigens (not necessarily secreted) may be used for "specific immunoadsorption", leading to a specific immunodepletion of an antibody when used in immunoadsorption columns. Variants of autoantigens are also of diagnostic value. The diagnosis of many autoimmune disorders is based on looking for specific autoantibodies to autoantigens known to be associated with an autoimmune condition. Most of the diagnostic techniques are based on having a recombinant form of the autoantigen and using it to screen for serum autoantibodies. However, these antibodies may bind the variants of the present invention with a similar or augmented affinity. For example, TPO is a known autoantigen in thyroid autoimmunity. It has been shown that its variant TPOzanelli also takes part in the autoimmune process and can bind the same antibodies as TPO [Biochemistry. 2001, 40(8):2572-9].
The nucleic acid sequences of the present invention, the proteins encoded thereby and the cells and antibodies described hereinabove can be used in screening assays, therapeutic or prophylactic methods of treatment, or predictive medicine (e.g., diagnostic and prognostic assays, including those used to monitor clinical trials, and pharmacogenetics).
More specifically, the nucleic acids of the present invention can be used to: (i) express a protein of the invention in a host cell in culture or in an intact multicellular organism following, e.g., gene therapy; (ii) detect an mRNA; or (iii) detect an alteration in a gene to which a nucleic acid of the invention specifically binds; or to modulate such a gene's activity. The nucleic acids and proteins of the present invention can also be used to treat disorders characterized by either insufficient or excessive production of those nucleic acids or proteins, a failure in a biochemical pathway in which they normally participate in a cell, or other aberrant or unwanted activity relative to the wild type protein (e.g., inappropriate enzymatic activity or unproductive protein folding). The proteins of the invention are useful in screening for naturally occurring protein substrates or other compounds (e.g., drugs) that modulate protein activity. The antibodies of the invention can also be used to detect and isolate the proteins of the invention, to regulate their bioavailability, or otherwise modulate their activity. Exemplary uses, and the methods by which they can be achieved, are described in detail below.

Possible utilities for variants of drug targets

Finding a variant of a known drug target can be advantageous in cases where the known drug has a major side effect, the therapeutic efficacy of the known drug is moderate, or a known drug has failed clinical trials for one of these reasons. A drug which is specific to a new protein variant of the target, or to the target only (without affecting the novel variant), is likely to have fewer side effects than the original drug, higher therapeutic efficacy, and a broader or different range of activities. For example, COX3, which is a variant of COX1, is known to bind COX inhibitors with a different affinity than COX1. This molecule is also associated with different physiological processes than COX1. 
Therefore, a compound specific to COX1 or compounds specific to COX3 would have fewer side effects (by not affecting the other variants) and higher therapeutic efficacy in larger populations.

(ii) Diseases that may be treated/diagnosed using the teaching of the present invention

Inflammatory diseases

Examples of inflammatory diseases include, but are not limited to, chronic inflammatory diseases and acute inflammatory diseases.

Inflammatory diseases associated with hypersensitivity

Examples of hypersensitivity include, but are not limited to, Types I-IV hypersensitivity, immediate hypersensitivity, antibody mediated hypersensitivity, immune complex mediated hypersensitivity, T lymphocyte mediated hypersensitivity and DTH. An example of type I or immediate hypersensitivity is asthma. Examples of type II hypersensitivity include, but are not limited to, rheumatoid diseases, rheumatoid autoimmune diseases, rheumatoid arthritis [Krenn V. et al, Histol Histopathol 2000 Jul;15
(3):791], spondylitis, ankylosing spondylitis [Jan Voswinkel et al, Arthritis Res 2001; 3 (3):
189], systemic diseases, systemic autoimmune diseases, systemic lupus erythematosus [Erikson J. et al, Immunol Res 1998;17 (1-2):49], sclerosis, systemic sclerosis [Renaudineau
Y. et al, Clin Diagn Lab Immunol. 1999 Mar;6 (2):156; Chan OT. et al, Immunol Rev 1999
Jun;169:107], glandular diseases, glandular autoimmune diseases, pancreatic autoimmune diseases, diabetes, Type I diabetes [Zimmet P. Diabetes Res Clin Pract 1996 Oct;34
Suppl:S125], thyroid diseases, autoimmune thyroid diseases, Graves' disease [Orgiazzi J. Endocrinol Metab Clin North Am 2000 Jun;29 (2):339], thyroiditis, spontaneous autoimmune thyroiditis [Braley-Mullen H. and Yu S, J Immunol 2000 Dec 15; 165
(12):7262], Hashimoto's thyroiditis [Toyoda N. et al, Nippon Rinsho 1999 Aug;57
(8):1810], myxedema, idiopathic myxedema [Mitsuma T. Nippon Rinsho. 1999 Aug;57
(8):1759], autoimmune reproductive diseases, ovarian diseases, ovarian autoimmunity [Garza KM. et al, J Reprod Immunol 1998 Feb;37 (2):87], autoimmune anti-sperm infertility [Diekman AB. et al, Am J Reprod Immunol. 2000 Mar;43 (3):134], repeated fetal loss [Tincani A. et al, Lupus 1998;7 Suppl 2:S107-9], neurodegenerative diseases, neurological diseases, neurological autoimmune diseases, multiple sclerosis [Cross AH. et al, J Neuroimmunol 2001 Jan 1;112 (1-2):1], Alzheimer's disease [Oron L. et al, J Neural Transm Suppl. 1997;49:77], myasthenia gravis [Infante AJ. and Kraig E, Int Rev Immunol 1999;18 (1-2):83], motor neuropathies [Kornberg AJ. J Clin Neurosci. 2000 May;7 (3):191], Guillain-Barré syndrome, neuropathies and autoimmune neuropathies [Kusunoki S. Am J Med Sci. 2000 Apr;319 (4):234], myasthenic diseases, Lambert-Eaton myasthenic syndrome [Takamori M. Am J Med Sci. 2000 Apr;319 (4):204], paraneoplastic neurological diseases, cerebellar atrophy, paraneoplastic cerebellar atrophy, non-paraneoplastic stiff man syndrome, cerebellar atrophies, progressive cerebellar atrophies, encephalitis, Rasmussen's encephalitis, amyotrophic lateral sclerosis, Sydenham chorea, Gilles de la Tourette syndrome, polyendocrinopathies, autoimmune polyendocrinopathies [Antoine JC. and Honnorat J. Rev Neurol (Paris) 2000 Jan;156 (1):23], neuropathies, dysimmune neuropathies [Nobile-Orazio E. et al, Electroencephalogr Clin Neurophysiol Suppl 1999;50:419], neuromyotonia, acquired neuromyotonia, arthrogryposis multiplex congenita [Vincent A. et al, Ann N Y Acad Sci. 1998 May 13;841:482], cardiovascular diseases, cardiovascular autoimmune diseases, atherosclerosis [Matsuura E. et al, Lupus. 1998;7 Suppl 2:S135], myocardial infarction [Vaarala O. Lupus. 1998;7 Suppl 2:S132], thrombosis [Tincani A. et al, Lupus 1998;7 Suppl 2:S107-9], granulomatosis, Wegener's granulomatosis, arteritis, Takayasu's arteritis and Kawasaki syndrome [Praprotnik S. 
et al, Wien Klin Wochenschr 2000 Aug 25;112 (15-16):660], anti-factor VIII autoimmune disease [Lacroix-Desmazes S. et al, Semin Thromb Hemost. 2000;26 (2):157], vasculitises, necrotizing small vessel vasculitises, microscopic polyangiitis, Churg and Strauss syndrome, glomerulonephritis, pauci-immune focal necrotizing glomerulonephritis, crescentic glomerulonephritis [Noel LH. Ann Med Interne (Paris). 2000 May;151 (3):178], antiphospholipid syndrome [Flamholz R. et al, J Clin Apheresis 1999;14 (4):171], heart failure, agonist-like β-adrenoceptor antibodies in heart failure [Wallukat G. et al, Am J Cardiol. 1999 Jun 17;83 (12A):75H], thrombocytopenic purpura [Moccia F. Ann Ital Med Int. 1999 Apr-Jun;14 (2):114], hemolytic anemia, autoimmune hemolytic anemia [Efremov DG. et al, Leuk Lymphoma 1998 Jan;28 (3-4):285], gastrointestinal diseases, autoimmune diseases of the gastrointestinal tract, intestinal diseases, chronic inflammatory intestinal disease [Garcia Herola A. et al, Gastroenterol Hepatol. 2000 Jan;23 (1):16], celiac disease [Landau YE. and Shoenfeld Y. Harefuah 2000 Jan 16;138 (2):122], autoimmune diseases of the musculature, myositis, autoimmune myositis, Sjogren's syndrome [Feist E. et al, Int Arch Allergy Immunol 2000 Sep;123 (1):92], smooth muscle autoimmune disease [Zauli D. et al, Biomed Pharmacother 1999 Jun;53 (5-6):234], hepatic diseases, hepatic autoimmune diseases, autoimmune hepatitis [Manns MP. J Hepatol 2000 Aug;33 (2):326] and primary biliary cirrhosis [Strassburg CP. et al, Eur J Gastroenterol Hepatol. 1999 Jun;11 (6):595]. Examples of type IV or T cell mediated hypersensitivity include, but are not limited to, rheumatoid diseases, rheumatoid arthritis [Tisch R, McDevitt HO. 
Proc Natl Acad Sci U S A 1994 Jan 18;91 (2):437], systemic diseases, systemic autoimmune diseases, systemic lupus erythematosus [Datta SK., Lupus 1998;7 (9):591], glandular diseases, glandular autoimmune diseases, pancreatic diseases, pancreatic autoimmune diseases, Type 1 diabetes [Castano L. and Eisenbarth GS. Ann. Rev. Immunol. 8:647], thyroid diseases, autoimmune thyroid diseases, Graves' disease [Sakata S. et al, Mol Cell Endocrinol 1993 Mar;92 (1):77], ovarian diseases [Garza KM. et al, J Reprod Immunol 1998 Feb;37 (2):87], prostatitis, autoimmune prostatitis [Alexander RB. et al, Urology 1997 Dec;50 (6):893], polyglandular syndrome, autoimmune polyglandular syndrome, Type I autoimmune polyglandular syndrome [Hara T. et al, Blood. 1991 Mar 1;77 (5):1127], neurological diseases, autoimmune neurological diseases, multiple sclerosis, neuritis, optic neuritis [Soderstrom M. et al, J Neurol Neurosurg Psychiatry 1994 May;57 (5):544], myasthenia gravis [Oshima M. et al, Eur J Immunol 1990 Dec;20 (12):2563], stiff-man syndrome [Hiemstra HS. et al, Proc Natl Acad Sci U S A 2001 Mar 27;98 (7):3988], cardiovascular diseases, cardiac autoimmunity in Chagas' disease [Cunha-Neto E. et al, J Clin Invest 1996 Oct 15;98 (8):1709], autoimmune thrombocytopenic purpura [Semple JW. et al, Blood 1996 May 15;87 (10):4245], anti-helper T lymphocyte autoimmunity [Caporossi AP. et al, Viral Immunol 1998;11 (1):9], hemolytic anemia [Sallah S. et al, Ann Hematol 1997 Mar;74 (3):139], hepatic diseases, hepatic autoimmune diseases, hepatitis, chronic active hepatitis [Franco A. et al, Clin Immunol Immunopathol 1990 Mar;54 (3):382], biliary cirrhosis, primary biliary cirrhosis [Jones DE. Clin Sci (Colch) 1996 Nov;91 (5):551], nephric diseases, nephric autoimmune diseases, nephritis, interstitial nephritis [Kelly CJ. J Am Soc Nephrol 1990 Aug;1 (2):140], connective tissue diseases, ear diseases, autoimmune connective tissue diseases, autoimmune ear disease [Yoo TJ. 
et al, Cell Immunol 1994 Aug;157 (1):249], disease of the inner ear [Gloddek B. et al, Ann N Y Acad Sci 1997 Dec 29;830:266], skin diseases, cutaneous diseases, dermal diseases, bullous skin diseases, pemphigus vulgaris, bullous pemphigoid and pemphigus foliaceus. Examples of delayed type hypersensitivity include, but are not limited to, contact dermatitis and drug eruption.

Autoimmune diseases

Examples of autoimmune diseases include, but are not limited to, cardiovascular diseases, rheumatoid diseases, glandular diseases, gastrointestinal diseases, cutaneous diseases, hepatic diseases, neurological diseases, muscular diseases, nephric diseases, diseases related to reproduction, connective tissue diseases and systemic diseases. Examples of autoimmune cardiovascular and blood diseases include, but are not limited to, atherosclerosis [Matsuura E. et al, Lupus. 1998;7 Suppl 2:S135], myocardial infarction [Vaarala O. Lupus. 1998;7 Suppl 2:S132], thrombosis [Tincani A. et al, Lupus 1998;7 Suppl 2:S107-9], Wegener's granulomatosis, Takayasu's arteritis, Kawasaki syndrome [Praprotnik S. et al, Wien Klin Wochenschr 2000 Aug 25;112 (15-16):660], anti-factor VIII autoimmune disease [Lacroix-Desmazes S. et al, Semin Thromb Hemost. 2000;26 (2):157], necrotizing small vessel vasculitis, microscopic polyangiitis, Churg and Strauss syndrome, pauci-immune focal necrotizing and crescentic glomerulonephritis [Noel LH. Ann Med Interne (Paris). 2000 May;151 (3):178], antiphospholipid syndrome [Flamholz R. et al, J Clin Apheresis 1999;14 (4):171], antibody-induced heart failure [Wallukat G. et al, Am J Cardiol. 1999 Jun 17;83 (12A):75H], thrombocytopenic purpura [Moccia F. Ann Ital Med Int. 1999 Apr-Jun;14 (2):114; Semple JW. et al, Blood 1996 May 15;87 (10):4245], autoimmune hemolytic anemia [Efremov DG. et al, Leuk Lymphoma 1998 Jan;28 (3-4):285; Sallah S. et al, Ann Hematol 1997 Mar;74 (3):139], cardiac autoimmunity in Chagas' disease [Cunha-Neto E. 
et al, J Clin Invest 1996 Oct 15;98 (8):1709] and anti-helper T lymphocyte autoimmunity [Caporossi AP. et al, Viral Immunol 1998;11 (1):9]. Examples of autoimmune rheumatoid diseases include, but are not limited to, rheumatoid arthritis [Krenn V. et al, Histol Histopathol 2000 Jul;15 (3):791; Tisch R, McDevitt HO. Proc Natl Acad Sci U S A 1994 Jan 18;91 (2):437] and ankylosing spondylitis [Jan Voswinkel et al, Arthritis Res 2001;3 (3):189]. Examples of autoimmune glandular diseases include, but are not limited to, autoimmune diseases of the pancreas, Type 1 diabetes [Castano L. and Eisenbarth GS. Ann. Rev. Immunol. 8:647; Zimmet P. Diabetes Res Clin Pract 1996 Oct;34 Suppl:S125], autoimmune thyroid diseases, Graves' disease [Orgiazzi J. Endocrinol Metab Clin North Am 2000 Jun;29 (2):339; Sakata S. et al, Mol Cell Endocrinol 1993 Mar;92 (1):77], spontaneous autoimmune thyroiditis [Braley-Mullen H. and Yu S, J Immunol 2000 Dec 15;165 (12):7262], Hashimoto's thyroiditis [Toyoda N. et al, Nippon Rinsho 1999 Aug;57 (8):1810], idiopathic myxedema [Mitsuma T. Nippon Rinsho. 1999 Aug;57 (8):1759], ovarian autoimmunity [Garza KM. et al, J Reprod Immunol 1998 Feb;37 (2):87], autoimmune anti-sperm infertility, autoimmune prostatitis and Type I autoimmune polyglandular syndrome. Examples of autoimmune gastrointestinal diseases include, but are not limited to, chronic inflammatory intestinal diseases [Garcia Herola A. et al, Gastroenterol Hepatol. 2000 Jan;23 (1):16], celiac disease [Landau YE. and Shoenfeld Y. Harefuah 2000 Jan 16;138 (2):122], colitis, ileitis, Crohn's disease and ulcerative colitis. Examples of autoimmune cutaneous diseases include, but are not limited to, autoimmune bullous skin diseases, such as, but not limited to, pemphigus vulgaris, bullous pemphigoid and pemphigus foliaceus. Examples of autoimmune hepatic diseases include, but are not limited to, hepatitis, autoimmune chronic active hepatitis [Franco A. 
et al, Clin Immunol Immunopathol 1990 Mar;54 (3):382], primary biliary cirrhosis [Jones DE. Clin Sci (Colch) 1996 Nov;91 (5):551; Strassburg CP. et al, Eur J Gastroenterol Hepatol. 1999 Jun;11 (6):595] and autoimmune hepatitis [Manns MP. J Hepatol 2000 Aug;33 (2):326]. Examples of autoimmune neurological diseases include, but are not limited to, multiple sclerosis [Cross AH. et al, J Neuroimmunol 2001 Jan 1;112 (1-2):1], Alzheimer's disease [Oron L. et al, J Neural Transm Suppl. 1997;49:77], myasthenia gravis [Infante AJ. and Kraig E, Int Rev Immunol 1999;18 (1-2):83; Oshima M. et al, Eur J Immunol 1990 Dec;20 (12):2563], neuropathies, motor neuropathies [Kornberg AJ. J Clin Neurosci. 2000 May;7 (3):191], Guillain-Barré syndrome and autoimmune neuropathies [Kusunoki S. Am J Med Sci. 2000 Apr;319 (4):234], myasthenia, Lambert-Eaton myasthenic syndrome [Takamori M. Am J Med Sci. 2000 Apr;319 (4):204], paraneoplastic neurological diseases, cerebellar atrophy, paraneoplastic cerebellar atrophy and stiff-man syndrome [Hiemstra HS. et al, Proc Natl Acad Sci U S A 2001 Mar 27;98 (7):3988], non-paraneoplastic stiff man syndrome, progressive cerebellar atrophies, encephalitis, Rasmussen's encephalitis, amyotrophic lateral sclerosis, Sydenham chorea, Gilles de la Tourette syndrome and autoimmune polyendocrinopathies [Antoine JC. and Honnorat J. Rev Neurol (Paris) 2000 Jan;156 (1):23], dysimmune neuropathies [Nobile-Orazio E. et al, Electroencephalogr Clin Neurophysiol Suppl 1999;50:419], acquired neuromyotonia, arthrogryposis multiplex congenita [Vincent A. et al, Ann N Y Acad Sci. 1998 May 13;841:482], neuritis, optic neuritis [Soderstrom M. et al, J Neurol Neurosurg Psychiatry 1994 May;57 (5):544], multiple sclerosis and neurodegenerative diseases. Examples of autoimmune muscular diseases include, but are not limited to, myositis, autoimmune myositis and primary Sjogren's syndrome [Feist E. 
et al, Int Arch Allergy Immunol 2000 Sep;123 (1):92] and smooth muscle autoimmune disease [Zauli D. et al, Biomed Pharmacother 1999 Jun;53 (5-6):234]. Examples of autoimmune nephric diseases include, but are not limited to, nephritis, autoimmune interstitial nephritis [Kelly CJ. J Am Soc Nephrol 1990 Aug;1 (2):140] and glomerular nephritis. Examples of autoimmune diseases related to reproduction include, but are not limited to, repeated fetal loss [Tincani A. et al, Lupus 1998;7 Suppl 2:S107-9]. Examples of autoimmune connective tissue diseases include, but are not limited to, ear diseases, autoimmune ear diseases [Yoo TJ. et al, Cell Immunol 1994 Aug;157 (1):249] and autoimmune diseases of the inner ear [Gloddek B. et al, Ann N Y Acad Sci 1997 Dec 29;830:266]. Examples of autoimmune systemic diseases include, but are not limited to, systemic lupus erythematosus [Erikson J. et al, Immunol Res 1998;17 (1-2):49] and systemic sclerosis [Renaudineau Y. et al, Clin Diagn Lab Immunol. 1999 Mar;6 (2):156; Chan OT. et al, Immunol Rev 1999 Jun;169:107].

Infectious diseases

Examples of infectious diseases include, but are not limited to, chronic infectious diseases, subacute infectious diseases, acute infectious diseases, viral diseases, bacterial diseases, protozoan diseases, parasitic diseases, fungal diseases, mycoplasma diseases, and prion diseases.

Graft rejection diseases

Examples of diseases associated with transplantation of a graft include, but are not limited to, graft rejection, chronic graft rejection, subacute graft rejection, hyperacute graft rejection, acute graft rejection, and graft versus host disease.

Allergic diseases

Examples of allergic diseases include, but are not limited to, asthma, hives, urticaria, pollen allergy, dust mite allergy, venom allergy, cosmetics allergy, latex allergy, chemical allergy, drug allergy, insect bite allergy, animal dander allergy, stinging plant allergy, poison ivy allergy and food allergy. 
Cancerous diseases

Examples of cancer include, but are not limited to, carcinoma, lymphoma, blastoma, sarcoma, and leukemia. Particular examples of cancerous diseases include, but are not limited to: Myeloid leukemia, such as Chronic myelogenous leukemia, Acute myelogenous leukemia with maturation, Acute promyelocytic leukemia, Acute nonlymphocytic leukemia with increased basophils, Acute monocytic leukemia, Acute myelomonocytic leukemia with eosinophilia; malignant lymphoma, such as Burkitt's, Non-Hodgkin's; Lymphocytic leukemia, such as Acute lymphoblastic leukemia, Chronic lymphocytic leukemia; Myeloproliferative diseases, such as Solid tumors Benign Meningioma, Mixed tumors of salivary gland, Colonic adenomas; Adenocarcinomas, such as Small cell lung cancer, Kidney, Uterus, Prostate, Bladder, Ovary, Colon, Sarcomas, Liposarcoma, myxoid, Synovial sarcoma, Rhabdomyosarcoma (alveolar), Extraskeletal myxoid chondrosarcoma, Ewing's tumor; others include Testicular and ovarian dysgerminoma, Retinoblastoma, Wilms' tumor, Neuroblastoma, Malignant melanoma, Mesothelioma, breast, skin, prostate, and ovarian.
Thus, the nucleic acid sequences of the present invention, having RNA editing sites, and the proteins encoded thereby and the cells and antibodies described hereinabove can be used in, for example, screening assays, therapeutic or prophylactic methods of treatment, or predictive medicine (e.g., diagnostic and prognostic assays, including those used to monitor clinical trials, and pharmacogenetics). More specifically, the nucleic acids of the invention can be used to: (i) express a protein of the invention in a host cell (in culture or in an intact multicellular organism following, e.g., gene therapy, given, of course, that the transcript in question contains more than untranslated sequence); (ii) detect an mRNA; or (iii) detect an alteration in a gene to which a nucleic acid of the invention specifically binds; or to modulate such a gene's activity. The nucleic acids and proteins of the invention can also be used to treat disorders characterized by either insufficient or excessive production of those nucleic acids or proteins, a failure in a biochemical pathway in which they normally participate in a cell, or other aberrant or unwanted activity relative to the wild type protein (e.g., inappropriate enzymatic activity or unproductive protein folding). The proteins of the invention are especially useful in screening for naturally occurring protein substrates or other compounds (e.g., drugs) that modulate protein activity. The antibodies of the invention can also be used to detect and isolate the proteins of the invention, to regulate their bioavailability, or otherwise modulate their activity.
EXAMPLE 8

Examples of annotation

This section presents examples of annotations assigned to transcripts having RNA editing, as described in Example 1 above. The arbitrary name of each fragment is composed as follows: Compugen contig name (see Table 1)_segment number_editing site location within the segment.

AA554866_1_1403 #SEQLIST AK024183
AA554866_1_1404 #SEQLIST AK024183
AA554866_1_722 #SEQLIST AK024183
AA554866_1_723 #SEQLIST AK024183
AA554866_1_736 #SEQLIST AK024183
AA554866_1_819 #SEQLIST AK024183
AA633281_5_1043 #GENE_SYMBOL LOC285773 #SEQLIST AK096219
AA633281_5_1108 #GENE_SYMBOL LOC285773 #SEQLIST AK096219
AA633281_5_1109 #GENE_SYMBOL LOC285773 #SEQLIST AK096219
AA633281_5_1127 #GENE_SYMBOL LOC285773 #SEQLIST AK096219
AA633281_5_178 #GENE_SYMBOL LOC285773 #SEQLIST AK096219
AI138826_1_253 #SEQLIST BG952531 AI537687
AI138826_1_274 #SEQLIST AL702589 AI537687
AI138826_1_279 #SEQLIST BG952531 AI537687
AI138826_1_281 #SEQLIST AI537687
AK098080_1_1101 #SEQLIST AK098080 AK098080_1_1241 #SEQLIST AK098080
AK098080_1_1264 #SEQLIST AK098080
AK098080_1_1293 #SEQLIST AK098080
AK098080_1_1360 #SEQLIST AK098080 AK098080_1_1366 #SEQLIST AK098080
AK098080_1_1420 #SEQLIST AK098080
AK098080_1_1477 #SEQLIST AK098080
AK098080_3_112 #SEQLIST AK098080
AK098080_3_137 #SEQLIST AK098080 AK098080_3_155 #SEQLIST AK098080
AK098080_3_223 #SEQLIST AK098080
AK098080_3_238 #SEQLIST AK098080
AK098080_3_4 #SEQLIST AK098080
AK098080_3_40 #SEQLIST AK098080 AK098080_3_5 #SEQLIST AK098080
D31352_35_416 #GENE_SYMBOL MDM4 ;MDMX #GO_F #GO_Acc 16874 #GO_Desc ligase activity #GO_F #GO_Acc 5524 #GO_Desc ATP binding #GO_P #GO_Acc 8285 #GO_Desc negative regulation of cell proliferation #TAAT EPI #SEQLIST AA835888
D31352_35_512 #GENE_SYMBOL MDM4 ;MDMX #GO_F #GO_Acc 16874 #GO_Desc ligase activity #GO_F #GO_Acc 5524 #GO_Desc ATP binding #GO_P #GO_Acc 8285 #GO_Desc negative regulation of cell proliferation #TAAT EPI #SEQLIST AA835888
D31352_35_526 #GENE_SYMBOL MDM4 ;MDMX #GO_F #GO_Acc 16874 #GO_Desc ligase activity #GO_F #GO_Acc 5524 #GO_Desc ATP binding #GO_P #GO_Acc 8285 #GO_Desc negative regulation of cell proliferation #TAAT EPI #SEQLIST AA835888
D31352_40_152 #GENE_SYMBOL MDM4 ;MDMX #GO_F #GO_Acc 16874 #GO_Desc ligase activity #GO_F #GO_Acc 5524 #GO_Desc ATP binding #GO_P #GO_Acc 8285 #GO_Desc negative regulation of cell proliferation #TAAT EPI #SEQLIST AK097188
D31352_40_21 #GENE_SYMBOL MDM4 ;MDMX #GO_F #GO_Acc 16874 #GO_Desc ligase activity #GO_F #GO_Acc 5524 #GO_Desc ATP binding #GO_P #GO_Acc 8285 #GO_Desc negative regulation of cell proliferation #TAAT EPI #SEQLIST AK097188
D31352_40_34 #GENE_SYMBOL MDM4 ;MDMX #GO_F #GO_Acc 16874 #GO_Desc ligase activity #GO_F #GO_Acc 5524 #GO_Desc ATP binding #GO_P #GO_Acc 8285 #GO_Desc negative regulation of cell proliferation #TAAT EPI #SEQLIST AK097188
D31352_40_65 #GENE_SYMBOL MDM4 ;MDMX #GO_F #GO_Acc 16874 #GO_Desc ligase activity #GO_F #GO_Acc 5524 #GO_Desc ATP binding #GO_P #GO_Acc 8285 #GO_Desc negative regulation of cell proliferation #TAAT EPI #SEQLIST BQ228724 AK097188
D31352_51_1175 #GENE_SYMBOL MDM4 ;MDMX #GO_F #GO_Acc 16874 #GO_Desc ligase activity #GO_F #GO_Acc 5524 #GO_Desc ATP binding #GO_P #GO_Acc 8285 #GO_Desc negative regulation of cell proliferation #TAAT EPI #SEQLIST CD239400 AA961420 AA363417 AW959392 AW393514
D31352_51_120 #GENE_SYMBOL MDM4 ;MDMX #GO_F #GO_Acc 16874 #GO_Desc ligase activity #GO_F #GO_Acc 5524 #GO_Desc ATP binding #GO_P #GO_Acc 8285 #GO_Desc negative regulation of cell proliferation #TAAT EPI #SEQLIST BF530400 BM464523 BM914390
D31352_51_139 #GENE_SYMBOL MDM4 ;MDMX #GO_F #GO_Acc 16874 #GO_Desc ligase activity #GO_F #GO_Acc 5524 #GO_Desc ATP binding #GO_P #GO_Acc 8285 #GO_Desc negative regulation of cell proliferation #TAAT EPI #SEQLIST BF804670 BF530400 BF678862 BM914390
D31352_51_179 #GENE_SYMBOL MDM4 ;MDMX #GO_F #GO_Acc 16874 #GO_Desc ligase activity #GO_F #GO_Acc 5524 #GO_Desc ATP binding #GO_P #GO_Acc 8285 #GO_Desc negative regulation of cell proliferation #TAAT EPI #SEQLIST BF804670 BF530400 BM914390
D31352_51_31 #GENE_SYMBOL MDM4 ;MDMX #GO_F #GO_Acc 16874 #GO_Desc ligase activity #GO_F #GO_Acc 5524 #GO_Desc ATP binding #GO_P #GO_Acc 8285 #GO_Desc negative regulation of cell proliferation #TAAT EPI #SEQLIST BF804670 H15088 H15405 BF678862 BX437328 BM914390
D31352_51_40 #GENE_SYMBOL MDM4 ;MDMX #GO_F #GO_Acc 16874 #GO_Desc ligase activity #GO_F #GO_Acc 5524 #GO_Desc ATP binding #GO_P #GO_Acc 8285 #GO_Desc negative regulation of cell proliferation #TAAT EPI #SEQLIST BF530400 H15088 H15405 BF678862 BE247751
D31352_51_41 #GENE_SYMBOL MDM4 ;MDMX #GO_F #GO_Acc 16874 #GO_Desc ligase activity #GO_F #GO_Acc 5524 #GO_Desc ATP binding #GO_P #GO_Acc 8285 #GO_Desc negative regulation of cell proliferation #TAAT EPI #SEQLIST BF530400 H15088 H15405 BE247751 BF678862 BM914390
D31352_51_59 #GENE_SYMBOL MDM4 ;MDMX #GO_F #GO_Acc 16874 #GO_Desc ligase activity #GO_F #GO_Acc 5524 #GO_Desc ATP binding #GO_P #GO_Acc 8285 #GO_Desc negative regulation of cell proliferation #TAAT EPI #SEQLIST H15088 H15405 BM464523 BF678862 BM914390
D31352_51_60 #GENE_SYMBOL MDM4 ;MDMX #GO_F #GO_Acc 16874 #GO_Desc ligase activity #GO_F #GO_Acc 5524 #GO_Desc ATP binding #GO_P #GO_Acc 8285 #GO_Desc negative regulation of cell proliferation #TAAT EPI #SEQLIST H15088 H15405 BM464523 BF678862 BX437328 BM914390
F02268_33_1162 #GENE_SYMBOL PIGO #GO_F #GO_Acc 16787 #GO_Desc hydrolase activity #GO_P #GO_Acc 6506 #GO_Desc GPI anchor biosynthesis #GO_P #GO_Acc 9117 #GO_Desc nucleotide metabolism #SEQLIST CA771947 BQ631298 AW003696 AI991777
F02268_33_1163 #GENE_SYMBOL PIGO #GO_F #GO_Acc 16787 #GO_Desc hydrolase activity #GO_P #GO_Acc 6506 #GO_Desc GPI anchor biosynthesis #GO_P #GO_Acc 9117 #GO_Desc nucleotide metabolism #SEQLIST AW003696 AI991777
F02268_33_1185 #GENE_SYMBOL PIGO #GO_F #GO_Acc 16787 #GO_Desc hydrolase activity #GO_P #GO_Acc 6506 #GO_Desc GPI anchor biosynthesis #GO_P #GO_Acc 9117 #GO_Desc nucleotide metabolism #SEQLIST AW003696 AI991777
F02268_33_1264 #GENE_SYMBOL PIGO #GO_F #GO_Acc 16787 #GO_Desc hydrolase activity #GO_P #GO_Acc 6506 #GO_Desc GPI anchor biosynthesis #GO_P #GO_Acc 9117 #GO_Desc nucleotide metabolism #SEQLIST CA771947 BQ631298 CA949577
F07156_22_101 #GENE_SYMBOL FLJ38991 #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 8535 #GO_Desc cytochrome c oxidase biogenesis #SEQLIST AK092641
F07156_22_112 #GENE_SYMBOL FLJ38991 #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 8535 #GO_Desc cytochrome c oxidase biogenesis #SEQLIST AK092641
F07156_22_115 #GENE_SYMBOL FLJ38991 #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 8535 #GO_Desc cytochrome c oxidase biogenesis #SEQLIST AA493559
F07156_22_219 #GENE_SYMBOL FLJ38991 #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 8535 #GO_Desc cytochrome c oxidase biogenesis #SEQLIST AA493559 AK092641
F07156_22_563 #GENE_SYMBOL FLJ38991 #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 8535 #GO_Desc cytochrome c oxidase biogenesis #SEQLIST AK092641 CD102556
F07156_22_571 #GENE_SYMBOL FLJ38991 #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 8535 #GO_Desc cytochrome c oxidase biogenesis #SEQLIST AK092641 CD102556
F07156_22_572 #GENE_SYMBOL FLJ38991 #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 8535 #GO_Desc cytochrome c oxidase biogenesis #SEQLIST AK092641
F07156_22_59 #GENE_SYMBOL FLJ38991 #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 8535 #GO_Desc cytochrome c oxidase biogenesis #SEQLIST AK092641
F07156_22_74 #GENE_SYMBOL FLJ38991 #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 8535 #GO_Desc cytochrome c oxidase biogenesis #SEQLIST AK092641
F07156_22_98 #GENE_SYMBOL FLJ38991 #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 8535 #GO_Desc cytochrome c oxidase biogenesis #SEQLIST AW820808 AK092641
F07156_23_11 #GENE_SYMBOL FLJ38991 #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 8535 #GO_Desc cytochrome c oxidase biogenesis #SEQLIST BQ878947 AK092641
F07156_23_120 #GENE_SYMBOL FLJ38991 #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 8535 #GO_Desc cytochrome c oxidase biogenesis #SEQLIST BQ878947 AK092641 CD102556
F07156_23_145 #GENE_SYMBOL FLJ38991 #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 8535 #GO_Desc cytochrome c oxidase biogenesis #SEQLIST AK092641
F07156_23_147 #GENE_SYMBOL FLJ38991 #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 8535 #GO_Desc cytochrome c oxidase biogenesis #SEQLIST BQ878947 AK092641 CD102556
F07156_23_3 #GENE_SYMBOL FLJ38991 #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 8535 #GO_Desc cytochrome c oxidase biogenesis #SEQLIST AK092641
F07156_23_65 #GENE_SYMBOL FLJ38991 #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 8535 #GO_Desc cytochrome c oxidase biogenesis #SEQLIST NM173827 BQ878947 AK096310
F07156_23_68 #GENE_SYMBOL FLJ38991 #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 8535 #GO_Desc cytochrome c oxidase biogenesis #SEQLIST BQ878947 AK092641 CD102556
F09475_167_764 #GENE_SYMBOL PAF400 ;TRRAP #GO_F #GO_Acc 4428 #GO_Desc inositol/phosphatidylinositol kinase activity #GO_F #GO_Acc 4672 #GO_Desc protein kinase activity #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_F #GO_Acc 5507 #GO_Desc copper ion binding #GO_P #GO_Acc 30330 #GO_Desc DNA damage response, signal transduction by p53 class mediator #GO_P #GO_Acc 6118 #GO_Desc electron transport #GO_P #GO_Acc 7165 #GO_Desc signal transduction #TAA blood-tumor #SEQLIST AU151903 AK001533
F09475_167_811 #GENE_SYMBOL PAF400 ;TRRAP #GO_F #GO_Acc 4428 #GO_Desc inositol/phosphatidylinositol kinase activity #GO_F #GO_Acc 4672 #GO_Desc protein kinase activity #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_F #GO_Acc 5507 #GO_Desc copper ion binding #GO_P #GO_Acc 30330 #GO_Desc DNA damage response, signal transduction by p53 class mediator #GO_P #GO_Acc 6118 #GO_Desc electron transport #GO_P #GO_Acc 7165 #GO_Desc signal transduction #TAA blood-tumor #SEQLIST AU151903 AK001533
F09475_167_838 #GENE_SYMBOL PAF400 ;TRRAP #GO_F #GO_Acc 4428 #GO_Desc inositol/phosphatidylinositol kinase activity #GO_F #GO_Acc 4672 #GO_Desc protein kinase activity #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_F #GO_Acc 5507 #GO_Desc copper ion binding #GO_P #GO_Acc 30330 #GO_Desc DNA damage response, signal transduction by p53 class mediator #GO_P #GO_Acc 6118 #GO_Desc electron transport #GO_P #GO_Acc 7165 #GO_Desc signal transduction #TAA blood-tumor #SEQLIST AU151903 AK001533
F09475_167_904 #GENE_SYMBOL PAF400 ;TRRAP #GO_F #GO_Acc 4428 #GO_Desc inositol/phosphatidylinositol kinase activity #GO_F #GO_Acc 4672 #GO_Desc protein kinase activity #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_F #GO_Acc 5507 #GO_Desc copper ion binding #GO_P #GO_Acc 30330 #GO_Desc DNA damage response, signal transduction by p53 class mediator #GO_P #GO_Acc 6118 #GO_Desc electron transport #GO_P #GO_Acc 7165 #GO_Desc signal transduction #TAA blood-tumor #SEQLIST AU151903 AK001533
H20403_31_127 #GENE_SYMBOL KIAA1936 #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_P #GO_Acc 6118 #GO_Desc electron transport #SEQLIST AK057769 H20403_31_13 #GENE_SYMBOL KIAA1936 #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_P #GO_Acc 6118 #GO_Desc electron transport #SEQLIST AK057769 H20403_31_155 #GENE_SYMBOL KIAA1936 #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_P #GO_Acc 6118 #GO_Desc electron transport #SEQLIST AK057769 H20403_31_159 #GENE_SYMBOL KIAA1936 #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_P #GO_Acc 6118 #GO_Desc electron transport #SEQLIST BM352024 AK092149 H20403_31_197 #GENE_SYMBOL KIAA1936 #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_P #GO_Acc 6118 #GO_Desc electron transport #SEQLIST BC035077 AK057769 BM352024 CA778069
H20403_31_237 #GENE_SYMBOL KIAA1936 #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_P #GO_Acc 6118 #GO_Desc electron transport #SEQLIST AK057769 BM352024 H20403_31_382 #GENE_SYMBOL KIAA1936 #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_P #GO_Acc 6118 #GO_Desc electron transport #SEQLIST AK057769 H20403_31_412 #GENE_SYMBOL KIAA1936 #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_P #GO_Acc 6118 #GO_Desc electron transport #SEQLIST AK057769 H20403_31_449 #GENE_SYMBOL KIAA1936 #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_P #GO_Acc 6118 #GO_Desc electron transport #SEQLIST HSM802030 AK057769 AK092149 BM314127
H20403_31_502 #GENE_SYMBOL KIAA1936 #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_P #GO_Acc 6118 #GO_Desc electron transport #SEQLIST HSM802030 BM314127
H20403_31_58 #GENE_SYMBOL KIAA1936 #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_P #GO_Acc 6118 #GO_Desc electron transport #SEQLIST BC035077 AK095369 BM352024
H20403_31_66 #GENE_SYMBOL KIAA1936 #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_P #GO_Acc 6118 #GO_Desc electron transport #SEQLIST AK057769 H20403_31_82 #GENE_SYMBOL KIAA1936 #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_P #GO_Acc 6118 #GO_Desc electron transport #SEQLIST AV683429 BM352024 AV698569 AV684422
H20403_31_92 #GENE_SYMBOL KIAA1936 #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_P #GO_Acc 6118 #GO_Desc electron transport #SEQLIST BM352024 CA778069 H20403_31_97 #GENE_SYMBOL KIAA1936 #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_P #GO_Acc 6118 #GO_Desc electron transport #SEQLIST AK057769 BM352024 CA778069
H20403_32_18 #GENE_SYMBOL KIAA1936 #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_P #GO_Acc 6118 #GO_Desc electron transport #SEQLIST HSM802030 H20403_32_209 #GENE_SYMBOL KIAA1936 #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_P #GO_Acc 6118 #GO_Desc electron transport #SEQLIST BQ052326 BQ054291
H20403_32_264 #GENE_SYMBOL KIAA1936 #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_P #GO_Acc 6118 #GO_Desc electron transport #SEQLIST BQ052326 BQ054291
H20403_32_272 #GENE_SYMBOL KIAA1936 #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_P #GO_Acc 6118 #GO_Desc electron transport #SEQLIST BQ052326 BQ054291 H20403_32_346 #GENE_SYMBOL KIAA1936 #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_P #GO_Acc 6118 #GO_Desc electron transport #SEQLIST BQ643535 BQ646533 BQ653258 B 005955
H20403_32_356 #GENE_SYMBOL KIAA1936 #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_P #GO_Acc 6118 #GO_Desc electron transport #SEQLIST BQ643535 BQ646533 BQ653258 B 005955
HSIFNABR_30_353 #GENE_SYMBOL IFNABR ;IFNAR2 ;IFNARB #GO_F #GO_Acc 3800 #GO_Desc antiviral response protein activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 4896 #GO_Desc hematopoietin/interferon-class (D200-domain) cytokine receptor activity #GO_F #GO_Acc 4905 #GO_Desc interferon-alpha/beta receptor activity #GO_P #GO_Acc 6357 #GO_Desc regulation of transcription from Pol II promoter #GO_P #GO_Acc 7166 #GO_Desc cell surface receptor linked signal transduction #GO_P #GO_Acc 7259 #GO_Desc JAK-STAT cascade #GO_P #GO_Acc 8283 #GO_Desc cell proliferation #GO_P #GO_Acc 9615 #GO_Desc response to viruses #INDICATION Anti-inflammatory ;Antiallergic, non-asthma ;Antiarthritic, immunological ;Antiasthma ;Anticancer, immunological ;Anticancer, interferon ;Anticancer, other ;Antidiabetic ;Antifungal ;Antiviral, anti-HIV ;Antiviral, interferon ;Antiviral, other ;Arthritis, rheumatoid ;Asthma ;Behcet's disease ;Biotechnology, other ;Cancer, bone ;Cancer, brain ;Cancer, cervical ;Cancer, general ;Cancer, head and neck ;Cancer, leukaemia, chronic myelogenous ;Cancer, leukaemia, hairy cell ;Cancer, lymphoma, non-Hodgkin's ;Cancer, melanoma ;Cancer, myeloma ;Cancer, renal ;Cancer, sarcoma, Kaposi's ;Cancer, skin, general ;Chronic fatigue syndrome ;Cirrhosis, hepatic ;Cytokine ;Diabetes, Type II ;Fibromyalgia ;Fibrosis, pulmonary ;Gene therapy ;Hepatoprotective ;Immunoconjugate, other ;Immunodeficiency, general ;Immunoglobulin, non-MAb ;Immunological ;Immunomodulator, anti-infective ;Immunostimulant, anti-AIDS ;Immunostimulant, other ;Immunosuppressant ;Infection, HIV/AIDS ;Infection, coronavirus ;Infection, coronavirus, prophylaxis ;Infection, general ;Infection, hepatitis virus, general ;Infection, hepatitis-B virus ;Infection, hepatitis-C virus ;Infection, herpes simplex virus ;Infection, herpes virus, general ;Infection, human papilloma virus ;Infection, otological ;Infection, staphylococcal prophylaxis ;Infection, streptococcal prophylaxis ;Infection,
varicella zoster virus ;Inflammation, brain ;Keratoconjunctivitis ;Macular degeneration ;Monoclonal antibody, other ;Multiple sclerosis treatment ;Multiple sclerosis, general ;Musculoskeletal ;Neurological ;Non-antisense oligonucleotides ;Ophthalmological ;Pemphigus ;Prophylactic vaccine ;Recombinant interferon ;Recombinants, other ;Respiratory ;Rhinitis, allergic, general ;Sepsis ;Septic shock treatment ;Sjogren's syndrome ;Stomatological ;Transplant rejection, general ;Ulcer, aphthous #PHARM Interferon (type I) receptor antagonist #TAA lymph nodes-tumor #SEQLIST HUMIFNAL BQ710666 NM000874 HSU29584 BU784848 BQ709397 HUMIFNAJ HUMIFNAK HSIFNABR_30_360 #GENE_SYMBOL IFNABR ;IFNAR2 ;IFNARB #GO_F #GO_Acc 3800
#GO_Desc antiviral response protein activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 4896 #GO_Desc hematopoietin/interferon-class (D200-domain) cytokine receptor activity #GO_F #GO_Acc 4905 #GO_Desc interferon-alpha/beta receptor activity #GO_P #GO_Acc 6357 #GO_Desc regulation of transcription from Pol II promoter #GO_P #GO_Acc 7166 #GO_Desc cell surface receptor linked signal transduction #GO_P #GO_Acc 7259 #GO_Desc JAK-STAT cascade #GO_P #GO_Acc 8283 #GO_Desc cell proliferation #GO_P #GO_Acc 9615 #GO_Desc response to viruses #INDICATION Anti-inflammatory ;Antiallergic, non-asthma ;Antiarthritic, immunological ;Antiasthma ;Anticancer, immunological ;Anticancer, interferon ;Anticancer, other ;Antidiabetic ;Antifungal ;Antiviral, anti-HIV ;Antiviral, interferon ;Antiviral, other ;Arthritis, rheumatoid ;Asthma ;Behcet's disease ;Biotechnology, other ;Cancer, bone ;Cancer, brain ;Cancer, cervical ;Cancer, general ;Cancer, head and neck ;Cancer, leukaemia, chronic myelogenous ;Cancer, leukaemia, hairy cell ;Cancer, lymphoma, non-Hodgkin's ;Cancer, melanoma ;Cancer, myeloma ;Cancer, renal ;Cancer, sarcoma, Kaposi's ;Cancer, skin, general ;Chronic fatigue syndrome ;Cirrhosis, hepatic ;Cytokine ;Diabetes, Type II ;Fibromyalgia ;Fibrosis, pulmonary ;Gene therapy ;Hepatoprotective ;Immunoconjugate, other ;Immunodeficiency, general ;Immunoglobulin, non-MAb ;Immunological ;Immunomodulator, anti-infective ;Immunostimulant, anti-AIDS ;Immunostimulant, other ;Immunosuppressant ;Infection, HIV/AIDS ;Infection, coronavirus ;Infection, coronavirus, prophylaxis ;Infection, general ;Infection, hepatitis virus, general ;Infection, hepatitis-B virus ;Infection, hepatitis-C virus ;Infection, herpes simplex virus ;Infection, herpes virus, general ;Infection, human papilloma virus ;Infection, otological ;Infection, staphylococcal prophylaxis ;Infection, streptococcal prophylaxis ;Infection, varicella zoster virus ;Inflammation, brain ;Keratoconjunctivitis ;Macular
degeneration ;Monoclonal antibody, other ;Multiple sclerosis treatment ;Multiple sclerosis, general ;Musculoskeletal ;Neurological ;Non-antisense oligonucleotides ;Ophthalmological ;Pemphigus ;Prophylactic vaccine ;Recombinant interferon ;Recombinants, other ;Respiratory ;Rhinitis, allergic, general ;Sepsis ;Septic shock treatment ;Sjogren's syndrome ;Stomatological ;Transplant rejection, general ;Ulcer, aphthous #PHARM Interferon (type I) receptor antagonist #TAA lymph nodes-tumor #SEQLIST HUMIFNAL NM000874 HSU29584 BU784848 HUMIFNAJ HUMIFNAK
HSIFNABR_30_459 #GENE_SYMBOL IFNABR ;IFNAR2 ;IFNARB #GO_F #GO_Acc 3800 #GO_Desc antiviral response protein activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 4896 #GO_Desc hematopoietin/interferon-class (D200-domain) cytokine receptor activity #GO_F #GO_Acc 4905 #GO_Desc interferon-alpha/beta receptor activity #GO_P #GO_Acc 6357 #GO_Desc regulation of transcription from Pol II promoter #GO_P #GO_Acc 7166 #GO_Desc cell surface receptor linked signal transduction #GO_P #GO_Acc 7259 #GO_Desc JAK-STAT cascade #GO_P #GO_Acc 8283 #GO_Desc cell proliferation #GO_P #GO_Acc 9615 #GO_Desc response to viruses #INDICATION Anti-inflammatory ;Antiallergic, non-asthma ;Antiarthritic, immunological ;Antiasthma ;Anticancer, immunological ;Anticancer, interferon ;Anticancer, other ;Antidiabetic ;Antifungal ;Antiviral, anti-HIV ;Antiviral, interferon ;Antiviral, other ;Arthritis, rheumatoid ;Asthma ;Behcet's disease ;Biotechnology, other ;Cancer, bone ;Cancer, brain ;Cancer, cervical ;Cancer, general ;Cancer, head and neck ;Cancer, leukaemia, chronic myelogenous ;Cancer, leukaemia, hairy cell ;Cancer, lymphoma, non-Hodgkin's ;Cancer, melanoma ;Cancer, myeloma ;Cancer, renal ;Cancer, sarcoma, Kaposi's ;Cancer, skin, general ;Chronic fatigue syndrome
;Cirrhosis, hepatic ;Cytokine ;Diabetes, Type II ;Fibromyalgia ;Fibrosis, pulmonary ;Gene therapy ;Hepatoprotective ;Immunoconjugate, other ;Immunodeficiency, general ;Immunoglobulin, non-MAb ;Immunological ;Immunomodulator, anti-infective ;Immunostimulant, anti-AIDS ;Immunostimulant, other ;Immunosuppressant ;Infection, HIV/AIDS ;Infection, coronavirus ;Infection, coronavirus, prophylaxis ;Infection, general ;Infection, hepatitis virus, general ;Infection, hepatitis-B virus
;Infection, hepatitis-C virus ;Infection, herpes simplex virus ;Infection, herpes virus, general ;Infection, human papilloma virus ;Infection, otological ;Infection, staphylococcal prophylaxis ;Infection, streptococcal prophylaxis ;Infection, varicella zoster virus ;Inflammation, brain ;Keratoconjunctivitis ;Macular degeneration ;Monoclonal antibody, other ;Multiple sclerosis treatment ;Multiple sclerosis, general ;Musculoskeletal ;Neurological ;Non-antisense oligonucleotides ;Ophthalmological ;Pemphigus ;Prophylactic vaccine ;Recombinant interferon ;Recombinants, other ;Respiratory ;Rhinitis, allergic, general ;Sepsis ;Septic shock treatment ;Sjogren's syndrome ;Stomatological ;Transplant rejection, general ;Ulcer, aphthous #PHARM Interferon (type I) receptor antagonist #TAA lymph nodes-tumor #SEQLIST HUMIFNAL NM000874 HUMIFNAJ HUMIFNAK HSIFNABR_30_505 #GENE_SYMBOL IFNABR ;IFNAR2 ;IFNARB #GO_F #GO_Acc 3800
#GO_Desc antiviral response protein activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 4896 #GO_Desc hematopoietin/interferon-class (D200-domain) cytokine receptor activity #GO_F #GO_Acc 4905 #GO_Desc interferon-alpha/beta receptor activity #GO_P #GO_Acc 6357 #GO_Desc regulation of transcription from Pol II promoter #GO_P #GO_Acc 7166 #GO_Desc cell surface receptor linked signal transduction #GO_P #GO_Acc 7259 #GO_Desc JAK-STAT cascade #GO_P #GO_Acc 8283 #GO_Desc cell proliferation #GO_P #GO_Acc 9615 #GO_Desc response to viruses #INDICATION Anti-inflammatory ;Antiallergic, non-asthma ;Antiarthritic, immunological ;Antiasthma ;Anticancer, immunological ;Anticancer, interferon ;Anticancer, other ;Antidiabetic ;Antifungal ;Antiviral, anti-HIV ;Antiviral, interferon ;Antiviral, other ;Arthritis, rheumatoid ;Asthma ;Behcet's disease ;Biotechnology, other ;Cancer, bone ;Cancer, brain ;Cancer, cervical ;Cancer, general ;Cancer, head and neck ;Cancer, leukaemia, chronic myelogenous ;Cancer, leukaemia, hairy cell ;Cancer, lymphoma, non-Hodgkin's ;Cancer, melanoma ;Cancer, myeloma ;Cancer, renal ;Cancer, sarcoma, Kaposi's ;Cancer, skin, general ;Chronic fatigue syndrome ;Cirrhosis, hepatic ;Cytokine ;Diabetes, Type II ;Fibromyalgia ;Fibrosis, pulmonary ;Gene therapy ;Hepatoprotective ;Immunoconjugate, other ;Immunodeficiency, general ;Immunoglobulin, non-MAb ;Immunological ;Immunomodulator, anti-infective ;Immunostimulant, anti-AIDS ;Immunostimulant, other ;Immunosuppressant ;Infection, HIV/AIDS ;Infection, coronavirus ;Infection, coronavirus, prophylaxis ;Infection, general ;Infection, hepatitis virus, general ;Infection, hepatitis-B virus ;Infection, hepatitis-C virus ;Infection, herpes simplex virus ;Infection, herpes virus, general ;Infection, human papilloma virus ;Infection, otological ;Infection, staphylococcal prophylaxis ;Infection, streptococcal prophylaxis ;Infection, varicella zoster virus ;Inflammation, brain ;Keratoconjunctivitis ;Macular
degeneration ;Monoclonal antibody, other ;Multiple sclerosis treatment ;Multiple sclerosis, general ;Musculoskeletal ;Neurological ;Non-antisense oligonucleotides ;Ophthalmological ;Pemphigus ;Prophylactic vaccine ;Recombinant interferon ;Recombinants, other ;Respiratory ;Rhinitis, allergic, general ;Sepsis ;Septic shock treatment ;Sjogren's syndrome ;Stomatological ;Transplant rejection, general ;Ulcer, aphthous #PHARM Interferon (type I) receptor antagonist #TAA lymph nodes-tumor #SEQLIST HUMIFNAL NM000874 HSU29584 BM990677 AI874312 BM314290 AI827888 BU784848 AA973866 A 157678 HUMIFNAJ HUMIFNAK HSIFNABR_33_125 #GENE_SYMBOL IFNABR ;IFNAR2 ;IFNARB #GO_F #GO_Acc 3800 #GO_Desc antiviral response protein activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 4896 #GO_Desc hematopoietin/interferon-class (D200-domain) cytokine receptor activity #GO_F #GO_Acc 4905 #GO_Desc interferon-alpha/beta receptor activity #GO_P #GO_Acc 6357 #GO_Desc regulation of transcription from Pol II promoter #GO_P #GO_Acc 7166 #GO_Desc cell surface receptor linked signal transduction #GO_P #GO_Acc 7259 #GO_Desc JAK-STAT cascade #GO_P #GO_Acc 8283 #GO_Desc cell proliferation #GO_P #GO_Acc 9615 #GO_Desc response to viruses #INDICATION Anti-inflammatory ;Antiallergic, non-asthma ;Antiarthritic, immunological ;Antiasthma ;Anticancer, immunological ;Anticancer, interferon ;Anticancer, other ;Antidiabetic ;Antifungal ;Antiviral, anti-HIV ;Antiviral, interferon ;Antiviral, other ;Arthritis, rheumatoid ;Asthma ;Behcet's disease ;Biotechnology, other ;Cancer, bone ;Cancer, brain ;Cancer, cervical ;Cancer, general ;Cancer, head and neck ;Cancer, leukaemia, chronic myelogenous ;Cancer, leukaemia, hairy cell ;Cancer, lymphoma, non-Hodgkin's ;Cancer, melanoma ;Cancer, myeloma ;Cancer, renal ;Cancer, sarcoma, Kaposi's ;Cancer, skin, general ;Chronic fatigue syndrome ;Cirrhosis, hepatic ;Cytokine ;Diabetes, Type II ;Fibromyalgia ;Fibrosis, pulmonary ;Gene therapy ;Hepatoprotective
;Immunoconjugate, other ;Immunodeficiency, general ;Immunoglobulin, non-MAb ;Immunological ;Immunomodulator, anti-infective ;Immunostimulant, anti-AIDS ;Immunostimulant, other ;Immunosuppressant ;Infection, HIV/AIDS ;Infection, coronavirus ;Infection, coronavirus, prophylaxis ;Infection, general ;Infection, hepatitis virus, general ;Infection, hepatitis-B virus ;Infection, hepatitis-C virus ;Infection, herpes simplex virus ;Infection, herpes virus, general ;Infection, human papilloma virus ;Infection, otological ;Infection, staphylococcal prophylaxis ;Infection, streptococcal prophylaxis ;Infection, varicella zoster virus ;Inflammation, brain ;Keratoconjunctivitis ;Macular degeneration ;Monoclonal antibody, other ;Multiple sclerosis treatment ;Multiple sclerosis, general ;Musculoskeletal ;Neurological ;Non-antisense oligonucleotides ;Ophthalmological ;Pemphigus ;Prophylactic vaccine ;Recombinant interferon ;Recombinants, other ;Respiratory ;Rhinitis, allergic, general ;Sepsis ;Septic shock treatment ;Sjogren's syndrome ;Stomatological ;Transplant rejection, general ;Ulcer, aphthous #PHARM Interferon (type I) receptor antagonist #TAA lymph nodes-tumor #SEQLIST BU844823 BU856569 BF985432 HSIFNABR_33_134 #GENE_SYMBOL IFNABR ;IFNAR2 ;IFNARB #GO_F #GO_Acc 3800
#GO_Desc antiviral response protein activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 4896 #GO_Desc hematopoietin/interferon-class (D200-domain) cytokine receptor activity #GO_F #GO_Acc 4905 #GO_Desc interferon-alpha/beta receptor activity #GO_P #GO_Acc 6357 #GO_Desc regulation of transcription from Pol II promoter #GO_P #GO_Acc 7166 #GO_Desc cell surface receptor linked signal transduction #GO_P #GO_Acc 7259 #GO_Desc JAK-STAT cascade #GO_P #GO_Acc 8283 #GO_Desc cell proliferation #GO_P #GO_Acc 9615 #GO_Desc response to viruses #INDICATION Anti-inflammatory ;Antiallergic, non-asthma ;Antiarthritic, immunological ;Antiasthma ;Anticancer, immunological ;Anticancer, interferon ;Anticancer, other ;Antidiabetic ;Antifungal ;Antiviral, anti-HIV ;Antiviral, interferon ;Antiviral, other ;Arthritis, rheumatoid ;Asthma ;Behcet's disease ;Biotechnology, other ;Cancer, bone ;Cancer, brain ;Cancer, cervical ;Cancer, general ;Cancer, head and neck ;Cancer, leukaemia, chronic myelogenous ;Cancer, leukaemia, hairy cell ;Cancer, lymphoma, non-Hodgkin's ;Cancer, melanoma ;Cancer, myeloma ;Cancer, renal ;Cancer, sarcoma, Kaposi's ;Cancer, skin, general ;Chronic fatigue syndrome ;Cirrhosis, hepatic ;Cytokine ;Diabetes, Type II ;Fibromyalgia ;Fibrosis, pulmonary ;Gene therapy ;Hepatoprotective ;Immunoconjugate, other ;Immunodeficiency, general ;Immunoglobulin, non-MAb ;Immunological ;Immunomodulator, anti-infective ;Immunostimulant, anti-AIDS ;Immunostimulant, other ;Immunosuppressant ;Infection, HIV/AIDS ;Infection, coronavirus ;Infection, coronavirus, prophylaxis ;Infection, general ;Infection, hepatitis virus, general ;Infection, hepatitis-B virus ;Infection, hepatitis-C virus ;Infection, herpes simplex virus ;Infection, herpes virus, general ;Infection, human papilloma virus ;Infection, otological ;Infection, staphylococcal prophylaxis ;Infection, streptococcal prophylaxis ;Infection, varicella zoster virus ;Inflammation, brain ;Keratoconjunctivitis ;Macular
degeneration ;Monoclonal antibody, other ;Multiple sclerosis treatment ;Multiple sclerosis, general ;Musculoskeletal ;Neurological ;Non-antisense oligonucleotides ;Ophthalmological ;Pemphigus ;Prophylactic vaccine ;Recombinant interferon ;Recombinants, other ;Respiratory ;Rhinitis, allergic, general ;Sepsis ;Septic shock treatment ;Sjogren's syndrome ;Stomatological ;Transplant rejection, general ;Ulcer, aphthous #PHARM Interferon (type I) receptor antagonist #TAA lymph nodes-tumor #SEQLIST BX337778 BF985432
HSIFNABR_33_140 #GENE_SYMBOL IFNABR ;IFNAR2 ;IFNARB #GO_F #GO_Acc 3800 #GO_Desc antiviral response protein activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 4896 #GO_Desc hematopoietin/interferon-class (D200-domain) cytokine receptor activity #GO_F #GO_Acc 4905 #GO_Desc interferon-alpha/beta receptor activity #GO_P #GO_Acc 6357 #GO_Desc regulation of transcription from Pol II promoter #GO_P #GO_Acc 7166 #GO_Desc cell surface receptor linked signal transduction #GO_P #GO_Acc 7259 #GO_Desc JAK-STAT cascade #GO_P #GO_Acc 8283 #GO_Desc cell proliferation #GO_P #GO_Acc 9615 #GO_Desc response to viruses #INDICATION Anti-inflammatory ;Antiallergic, non-asthma ;Antiarthritic, immunological ;Antiasthma ;Anticancer, immunological ;Anticancer, interferon ;Anticancer, other
;Antidiabetic ;Antifungal ;Antiviral, anti-HIV ;Antiviral, interferon ;Antiviral, other ;Arthritis, rheumatoid ;Asthma ;Behcet's disease ;Biotechnology, other ;Cancer, bone ;Cancer, brain ;Cancer, cervical ;Cancer, general ;Cancer, head and neck ;Cancer, leukaemia, chronic myelogenous ;Cancer, leukaemia, hairy cell ;Cancer, lymphoma, non-Hodgkin's ;Cancer, melanoma ;Cancer, myeloma ;Cancer, renal ;Cancer, sarcoma, Kaposi's ;Cancer, skin, general ;Chronic fatigue syndrome
;Cirrhosis, hepatic ;Cytokine ;Diabetes, Type II ;Fibromyalgia ;Fibrosis, pulmonary ;Gene therapy ;Hepatoprotective ;Immunoconjugate, other ;Immunodeficiency, general ;Immunoglobulin, non-MAb ;Immunological ;Immunomodulator, anti-infective ;Immunostimulant, anti-AIDS ;Immunostimulant, other ;Immunosuppressant ;Infection, HIV/AIDS ;Infection, coronavirus ;Infection, coronavirus, prophylaxis ;Infection, general ;Infection, hepatitis virus, general ;Infection, hepatitis-B virus
;Infection, hepatitis-C virus ;Infection, herpes simplex virus ;Infection, herpes virus, general ;Infection, human papilloma virus ;Infection, otological ;Infection, staphylococcal prophylaxis ;Infection, streptococcal prophylaxis ;Infection, varicella zoster virus ;Inflammation, brain ;Keratoconjunctivitis ;Macular degeneration ;Monoclonal antibody, other ;Multiple sclerosis treatment ;Multiple sclerosis, general ;Musculoskeletal ;Neurological ;Non-antisense oligonucleotides ;Ophthalmological ;Pemphigus ;Prophylactic vaccine ;Recombinant interferon ;Recombinants, other ;Respiratory ;Rhinitis, allergic, general ;Sepsis ;Septic shock treatment ;Sjogren's syndrome ;Stomatological ;Transplant rejection, general ;Ulcer, aphthous #PHARM Interferon (type I) receptor antagonist #TAA lymph nodes-tumor #SEQLIST AA234804 BU844823 BM467267 AL047793 BU856569 BU171428 BE295623 N48443 BU173182 BE019398 BF985432 BG752933 BX337778 BE895279 HSIFNABR_33_145 #GENE_SYMBOL IFNABR ;IFNAR2 ;IFNARB #GO_F #GO_Acc 3800
#GO_Desc antiviral response protein activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 4896 #GO_Desc hematopoietin/interferon-class (D200-domain) cytokine receptor activity #GO_F #GO_Acc 4905 #GO_Desc interferon-alpha/beta receptor activity #GO_P #GO_Acc 6357 #GO_Desc regulation of transcription from Pol II promoter #GO_P #GO_Acc 7166 #GO_Desc cell surface receptor linked signal transduction #GO_P #GO_Acc 7259 #GO_Desc JAK-STAT cascade #GO_P #GO_Acc 8283 #GO_Desc cell proliferation #GO_P #GO_Acc 9615 #GO_Desc response to viruses #INDICATION Anti-inflammatory ;Antiallergic, non-asthma ;Antiarthritic, immunological ;Antiasthma ;Anticancer, immunological ;Anticancer, interferon ;Anticancer, other ;Antidiabetic ;Antifungal ;Antiviral, anti-HIV ;Antiviral, interferon ;Antiviral, other ;Arthritis, rheumatoid ;Asthma ;Behcet's disease ;Biotechnology, other ;Cancer, bone ;Cancer, brain ;Cancer, cervical ;Cancer, general ;Cancer, head and neck ;Cancer, leukaemia, chronic myelogenous ;Cancer, leukaemia, hairy cell ;Cancer, lymphoma, non-Hodgkin's ;Cancer, melanoma ;Cancer, myeloma ;Cancer, renal ;Cancer, sarcoma, Kaposi's ;Cancer, skin, general ;Chronic fatigue syndrome ;Cirrhosis, hepatic ;Cytokine ;Diabetes, Type II ;Fibromyalgia ;Fibrosis, pulmonary ;Gene therapy ;Hepatoprotective ;Immunoconjugate, other ;Immunodeficiency, general ;Immunoglobulin, non-MAb ;Immunological ;Immunomodulator, anti-infective ;Immunostimulant, anti-AIDS ;Immunostimulant, other ;Immunosuppressant ;Infection, HIV/AIDS ;Infection, coronavirus ;Infection, coronavirus, prophylaxis ;Infection, general ;Infection, hepatitis virus, general ;Infection, hepatitis-B virus ;Infection, hepatitis-C virus ;Infection, herpes simplex virus ;Infection, herpes virus, general ;Infection, human papilloma virus ;Infection, otological ;Infection, staphylococcal prophylaxis ;Infection, streptococcal prophylaxis ;Infection, varicella zoster virus ;Inflammation, brain ;Keratoconjunctivitis ;Macular
degeneration ;Monoclonal antibody, other ;Multiple sclerosis treatment ;Multiple sclerosis, general ;Musculoskeletal ;Neurological ;Non-antisense oligonucleotides ;Ophthalmological ;Pemphigus ;Prophylactic vaccine ;Recombinant interferon ;Recombinants, other ;Respiratory ;Rhinitis, allergic, general ;Sepsis ;Septic shock treatment ;Sjogren's syndrome ;Stomatological ;Transplant rejection, general ;Ulcer, aphthous #PHARM Interferon (type I) receptor antagonist #TAA lymph nodes-tumor #SEQLIST BU173182 AA234804 BU844823 BM467267 AL047793 BX337778 BU856569 BU171428 BG752933 BF985432 BE895279 BE295623 HSIFNABR_33_158 #GENE_SYMBOL IFNABR ;IFNAR2 ;IFNARB #GO_F #GO_Acc 3800 #GO_Desc antiviral response protein activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 4896 #GO_Desc hematopoietin/interferon-class (D200-domain) cytokine receptor activity #GO_F #GO_Acc 4905 #GO_Desc interferon-alpha/beta receptor activity #GO_P #GO_Acc 6357 #GO_Desc regulation of transcription from Pol II promoter #GO_P #GO_Acc 7166 #GO_Desc cell surface receptor linked signal transduction #GO_P #GO_Acc 7259 #GO_Desc JAK-STAT cascade #GO_P #GO_Acc 8283 #GO_Desc cell proliferation #GO_P #GO_Acc 9615 #GO_Desc response to viruses #INDICATION Anti-inflammatory ;Antiallergic, non-asthma ;Antiarthritic, immunological ;Antiasthma ;Anticancer, immunological ;Anticancer, interferon ;Anticancer, other ;Antidiabetic ;Antifungal ;Antiviral, anti-HIV ;Antiviral, interferon ;Antiviral, other ;Arthritis, rheumatoid ;Asthma ;Behcet's disease ;Biotechnology, other ;Cancer, bone ;Cancer, brain ;Cancer, cervical ;Cancer, general ;Cancer, head and neck ;Cancer, leukaemia, chronic myelogenous ;Cancer, leukaemia, hairy cell ;Cancer, lymphoma, non-Hodgkin's ;Cancer, melanoma ;Cancer, myeloma ;Cancer, renal ;Cancer, sarcoma, Kaposi's ;Cancer, skin, general ;Chronic fatigue syndrome ;Cirrhosis, hepatic ;Cytokine ;Diabetes, Type II ;Fibromyalgia ;Fibrosis, pulmonary ;Gene therapy ;Hepatoprotective
;Immunoconjugate, other ;Immunodeficiency, general ;Immunoglobulin, non-MAb ;Immunological ;Immunomodulator, anti-infective ;Immunostimulant, anti-AIDS ;Immunostimulant, other ;Immunosuppressant ;Infection, HIV/AIDS ;Infection, coronavirus ;Infection, coronavirus, prophylaxis ;Infection, general ;Infection, hepatitis virus, general ;Infection, hepatitis-B virus
;Infection, hepatitis-C virus ;Infection, herpes simplex virus ;Infection, herpes virus, general ;Infection, human papilloma virus ;Infection, otological ;Infection, staphylococcal prophylaxis ;Infection, streptococcal prophylaxis ;Infection, varicella zoster virus ;Inflammation, brain ;Keratoconjunctivitis ;Macular degeneration ;Monoclonal antibody, other ;Multiple sclerosis treatment ;Multiple sclerosis, general ;Musculoskeletal ;Neurological ;Non-antisense oligonucleotides ;Ophthalmological
;Pemphigus ;Prophylactic vaccine ;Recombinant interferon ;Recombinants, other ;Respiratory ;Rhinitis, allergic, general ;Sepsis ;Septic shock treatment ;Sjogren's syndrome ;Stomatological ;Transplant rejection, general ;Ulcer, aphthous #PHARM Interferon (type I) receptor antagonist #TAA lymph nodes-tumor #SEQLIST BM467267 HSIFNABR_33_48 #GENE_SYMBOL IFNABR ;IFNAR2 ;IFNARB #GO_F #GO_Acc 3800
#GO_Desc antiviral response protein activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 4896 #GO_Desc hematopoietin/interferon-class (D200-domain) cytokine receptor activity #GO_F #GO_Acc 4905 #GO_Desc interferon-alpha/beta receptor activity #GO_P #GO_Acc 6357 #GO_Desc regulation of transcription from Pol II promoter #GO_P #GO_Acc 7166 #GO_Desc cell surface receptor linked signal transduction #GO_P #GO_Acc 7259 #GO_Desc JAK-STAT cascade #GO_P #GO_Acc 8283 #GO_Desc cell proliferation #GO_P #GO_Acc 9615 #GO_Desc response to viruses #INDICATION Anti-inflammatory ;Antiallergic, non-asthma ;Antiarthritic, immunological ;Antiasthma ;Anticancer, immunological ;Anticancer, interferon ;Anticancer, other ;Antidiabetic ;Antifungal ;Antiviral, anti-HIV ;Antiviral, interferon ;Antiviral, other ;Arthritis, rheumatoid ;Asthma ;Behcet's disease ;Biotechnology, other ;Cancer, bone ;Cancer, brain ;Cancer, cervical ;Cancer, general ;Cancer, head and neck ;Cancer, leukaemia, chronic myelogenous ;Cancer, leukaemia, hairy cell ;Cancer, lymphoma, non-Hodgkin's ;Cancer, melanoma ;Cancer, myeloma ;Cancer, renal ;Cancer, sarcoma, Kaposi's ;Cancer, skin, general ;Chronic fatigue syndrome ;Cirrhosis, hepatic ;Cytokine ;Diabetes, Type II ;Fibromyalgia ;Fibrosis, pulmonary ;Gene therapy ;Hepatoprotective ;Immunoconjugate, other ;Immunodeficiency, general ;Immunoglobulin, non-MAb ;Immunological ;Immunomodulator, anti-infective ;Immunostimulant, anti-AIDS ;Immunostimulant, other ;Immunosuppressant ;Infection, HIV/AIDS ;Infection, coronavirus ;Infection, coronavirus, prophylaxis ;Infection, general ;Infection, hepatitis virus, general ;Infection, hepatitis-B virus ;Infection, hepatitis-C virus ;Infection, herpes simplex virus ;Infection, herpes virus, general ;Infection, human papilloma virus ;Infection, otological ;Infection, staphylococcal prophylaxis ;Infection, streptococcal prophylaxis ;Infection, varicella zoster virus ;Inflammation, brain ;Keratoconjunctivitis ;Macular
degeneration ;Monoclonal antibody, other ;Multiple sclerosis treatment ;Multiple sclerosis, general ;Musculoskeletal ;Neurological ;Non-antisense oligonucleotides ;Ophthalmological ;Pemphigus ;Prophylactic vaccine ;Recombinant interferon ;Recombinants, other ;Respiratory ;Rhinitis, allergic, general ;Sepsis ;Septic shock treatment ;Sjogren's syndrome ;Stomatological ;Transplant rejection, general ;Ulcer, aphthous #PHARM Interferon (type I) receptor antagonist #TAA lymph nodes-tumor #SEQLIST BU844823 BU856569
HSIFNABR_33_58 #GENE_SYMBOL IFNABR ;IFNAR2 ;IFNARB #GO_F #GO_Acc 3800 #GO_Desc antiviral response protein activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 4896 #GO_Desc hematopoietin/interferon-class (D200-domain) cytokine receptor activity #GO_F #GO_Acc 4905 #GO_Desc interferon-alpha/beta receptor activity #GO_P #GO_Acc 6357 #GO_Desc regulation of transcription from Pol II promoter #GO_P #GO_Acc 7166 #GO_Desc cell surface receptor linked signal transduction #GO_P #GO_Acc 7259 #GO_Desc JAK-STAT cascade #GO_P #GO_Acc 8283 #GO_Desc cell proliferation #GO_P #GO_Acc 9615 #GO_Desc response to viruses #INDICATION Anti-inflammatory ;Antiallergic, non-asthma ;Antiarthritic, immunological ;Antiasthma ;Anticancer, immunological ;Anticancer, interferon ;Anticancer, other
;Antidiabetic ;Antifungal ;Antiviral, anti-HIV ;Antiviral, interferon ;Antiviral, other ;Arthritis, rheumatoid ;Asthma ;Behcet's disease ;Biotechnology, other ;Cancer, bone ;Cancer, brain ;Cancer, cervical ;Cancer, general ;Cancer, head and neck ;Cancer, leukaemia, chronic myelogenous ;Cancer, leukaemia, hairy cell ;Cancer, lymphoma, non-Hodgkin's ;Cancer, melanoma ;Cancer, myeloma ;Cancer, renal ;Cancer, sarcoma, Kaposi's ;Cancer, skin, general ;Chronic fatigue syndrome ;Cirrhosis, hepatic ;Cytokine ;Diabetes, Type II ;Fibromyalgia ;Fibrosis, pulmonary ;Gene therapy ;Hepatoprotective ;Immunoconjugate, other ;Immunodeficiency, general ;Immunoglobulin, non-MAb ;Immunological ;Immunomodulator, anti-infective ;Immunostimulant, anti-AIDS ;Immunostimulant, other ;Immunosuppressant ;Infection, HIV/AIDS ;Infection, coronavirus ;Infection, coronavirus, prophylaxis ;Infection, general ;Infection, hepatitis virus, general ;Infection, hepatitis-B virus ;Infection, hepatitis-C virus ;Infection, herpes simplex virus ;Infection, herpes virus, general ;Infection, human papilloma virus ;Infection, otological ;Infection, staphylococcal prophylaxis ;Infection, streptococcal prophylaxis ;Infection, varicella zoster virus ;Inflammation, brain ;Keratoconjunctivitis ;Macular degeneration ;Monoclonal antibody, other ;Multiple sclerosis treatment ;Multiple sclerosis, general ;Musculoskeletal ;Neurological ;Non-antisense oligonucleotides ;Ophthalmological ;Pemphigus ;Prophylactic vaccine ;Recombinant interferon ;Recombinants, other ;Respiratory ;Rhinitis, allergic, general ;Sepsis ;Septic shock treatment ;Sjogren's syndrome ;Stomatological ;Transplant rejection, general ;Ulcer, aphthous #PHARM Interferon (type I) receptor antagonist #TAA lymph nodes-tumor #SEQLIST BG827697 BM467267 BX337778 HSIFNABR_33_70 #GENE_SYMBOL IFNABR ;IFNAR2 ;IFNARB #GO_F #GO_Acc 3800 #GO_Desc antiviral response protein activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 4896 #GO_Desc
hematopoietin/interferon-class (D200-domain) cytokine receptor activity #GO_F #GO_Acc 4905 #GO_Desc interferon-alpha/beta receptor activity #GO_P #GO_Acc 6357 #GO_Desc regulation of transcription from Pol II promoter #GO_P #GO_Acc 7166 #GO_Desc cell surface receptor linked signal transduction #GO_P #GO_Acc 7259 #GO_Desc JAK-STAT cascade #GO_P #GO_Acc 8283 #GO_Desc cell proliferation #GO_P #GO_Acc 9615 #GO_Desc response to viruses #INDICATION Anti-inflammatory ;Antiallergic, non-asthma ;Antiarthritic, immunological ;Antiasthma ;Anticancer, immunological ;Anticancer, interferon ;Anticancer, other ;Antidiabetic ;Antifungal ;Antiviral, anti-HIV ;Antiviral, interferon ;Antiviral, other ;Arthritis, rheumatoid ;Asthma ;Behcet's disease ;Biotechnology, other ;Cancer, bone ;Cancer, brain ;Cancer, cervical ;Cancer, general ;Cancer, head and neck ;Cancer, leukaemia, chronic myelogenous ;Cancer, leukaemia, hairy cell ;Cancer, lymphoma, non-Hodgkin's ;Cancer, melanoma ;Cancer, myeloma ;Cancer, renal ;Cancer, sarcoma, Kaposi's ;Cancer, skin, general ;Chronic fatigue syndrome ;Cirrhosis, hepatic ;Cytokine ;Diabetes, Type II ;Fibromyalgia ;Fibrosis, pulmonary ;Gene therapy ;Hepatoprotective ;Immunoconjugate, other ;Immunodeficiency, general ;Immunoglobulin, non-MAb ;Immunological ;Immunomodulator, anti-infective ;Immunostimulant, anti-AIDS ;Immunostimulant, other ;Immunosuppressant ;Infection, HIV/AIDS ;Infection, coronavirus ;Infection, coronavirus, prophylaxis ;Infection, general ;Infection, hepatitis virus, general ;Infection, hepatitis-B virus ;Infection, hepatitis-C virus ;Infection, herpes simplex virus ;Infection, herpes virus, general ;Infection, human papilloma virus ;Infection, otological ;Infection, staphylococcal prophylaxis ;Infection, streptococcal prophylaxis ;Infection, varicella zoster virus ;Inflammation, brain ;Keratoconjunctivitis ;Macular degeneration ;Monoclonal antibody, other ;Multiple sclerosis treatment ;Multiple sclerosis, general ;Musculoskeletal
{Neurological ;Non-antisense oligonucleotides {Ophthalmological {Pemphigus {Prophylactic vaccine {Recombinant interferon {Recombinants, other {Respiratory {Rhinitis, allergic, general {Sepsis {Septic shock treatment {Sjogren's syndrome {Stomatological {Transplant rejection, general {Ulcer, aphthous #PHARM Interferon (type I) receptor antagonist #TAA lymph nodes-tumor #SEQLIST BI769986 BU173182 BU844823 BM467267 AL047793 BU856569 BF985432
HSNICRB_27_756 #GENE_SYMBOL CHRNB4 #GO_F #GO_Acc 15464 #GO_Desc acetylcholine receptor activity #GO_F #GO_Acc 30594 #GO_Desc neurotransmitter receptor activity #GO_F #GO_Acc 4889 #GO_Desc nicotinic acetylcholine-activated cation-selective channel activity #GO_P #GO_Acc 6811 #GO_Desc ion transport #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 7271 #GO_Desc synaptic transmission, cholinergic #TS brain #SEQLIST NM000750 HSU48861
HSNICRB_27_789 #GENE_SYMBOL CHRNB4 #GO_F #GO_Acc 15464 #GO_Desc acetylcholine receptor activity #GO_F #GO_Acc 30594 #GO_Desc neurotransmitter receptor activity #GO_F #GO_Acc 4889 #GO_Desc nicotinic acetylcholine-activated cation-selective channel activity #GO_P #GO_Acc 6811 #GO_Desc ion transport #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 7271 #GO_Desc synaptic transmission, cholinergic #TS brain #SEQLIST NM000750 HSU48861 BQ718501
HSNICRB_27_795 #GENE_SYMBOL CHRNB4 #GO_F #GO_Acc 15464 #GO_Desc acetylcholine receptor activity #GO_F #GO_Acc 30594 #GO_Desc neurotransmitter receptor activity #GO_F #GO_Acc 4889 #GO_Desc nicotinic acetylcholine-activated cation-selective channel activity #GO_P #GO_Acc 6811 #GO_Desc ion transport #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 7271 #GO_Desc synaptic transmission, cholinergic #TS brain #SEQLIST BQ718501
HSNICRB_27_798 #GENE_SYMBOL CHRNB4 #GO_F #GO_Acc 15464 #GO_Desc acetylcholine receptor activity #GO_F #GO_Acc 30594 #GO_Desc neurotransmitter receptor activity #GO_F #GO_Acc 4889 #GO_Desc nicotinic acetylcholine-activated cation-selective channel activity #GO_P #GO_Acc 6811 #GO_Desc ion transport #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 7271 #GO_Desc synaptic transmission, cholinergic #TS brain #SEQLIST NM000750 HSU48861
HUMFOLMES_18_200 #GENE_SYMBOL DHFR #GO_F #GO_Acc 4146 #GO_Desc dihydrofolate reductase activity #GO_F #GO_Acc 4386 #GO_Desc helicase activity #GO_F #GO_Acc 8233 #GO_Desc peptidase activity #GO_P #GO_Acc 6545 #GO_Desc glycine biosynthesis #GO_P #GO_Acc 6730 #GO_Desc one-carbon compound metabolism #GO_P #GO_Acc 9165 #GO_Desc nucleotide biosynthesis INDICATION Acne {Anaesthesia {Anaesthetic, local {Anthelmintic ;Anti-inflammatory {Anti-inflammatory, topical {Antiacne {Antiarthritic, immunological {Antiarthritic, other {Antibacterial, other {Anticancer, antimetabolite {Anticancer, immunological {Anticancer, other {Antifungal {Antihypertensive, diuretic {Antimalarial {Antimycobacterial {Antipruritic/inflamm, non-allergic {Antipsoriasis {Arthritis, rheumatoid {Cancer, breast {Cancer, colorectal {Cancer, endometrial {Cancer, general {Cancer, head and neck {Cancer, leukaemia, acute lymphocytic {Cancer, lung, non-small cell {Cancer, lymphoma, T-cell {Cancer, mesothelioma {Cancer, pancreatic {Cancer, prostate {Cancer, renal {Cancer, sarcoma, general {Cardiostimulant {Eczema, atopic {Immunoconjugate, other {Immunosuppressant {Immunotoxin {Infection, Candida, general {Infection, Pneumocystis jiroveci {Infection, general {Infection, malaria {Infection, malaria prophylaxis {Infection, pneumococcal prophylaxis {Infection, respiratory tract, general {Infection, urinary tract {Musculoskeletal {Ophthalmological {Protozoacide {Pruritus {Psoriasis {Quinolone antibacterial {Stomatological {Trimethoprim and analogues {Vulnerary #TAA GEN #SEQLIST BM475776 BC000192 BU529014 BQ939840 BM710202
HUMFOLMES_18_209 #GENE_SYMBOL DHFR #GO_F #GO_Acc 4146 #GO_Desc dihydrofolate reductase activity #GO_F #GO_Acc 4386 #GO_Desc helicase activity #GO_F #GO_Acc 8233 #GO_Desc peptidase activity #GO_P #GO_Acc 6545 #GO_Desc glycine biosynthesis #GO_P #GO_Acc 6730 #GO_Desc one-carbon compound metabolism #GO_P #GO_Acc 9165 #GO_Desc nucleotide biosynthesis INDICATION Acne {Anaesthesia {Anaesthetic, local {Anthelmintic {Anti- inflammatory {Anti-inflammatory, topical {Antiacne {Antiarthritic, immunological {Antiarthritic, other {Antibacterial, other {Anticancer, antimetabolite {Anticancer, immunological {Anticancer, other {Antifungal {Antihypertensive, diuretic {Antimalarial {Antimycobacterial {Antipruritic/inflamm, non- allergic {Antipsoriasis {Arthritis, rheumatoid {Cancer, breast {Cancer, colorectal {Cancer, endometrial {Cancer, general {Cancer, head and neck {Cancer, leukaemia, acute lymphocytic {Cancer, lung, non- small cell {Cancer, lymphoma, T-cell {Cancer, mesothelioma {Cancer, pancreatic {Cancer, prostate {Cancer, renal {Cancer, sarcoma, general {Cardiostimulant {Eczema, atopic {Immunoconjugate, other {Immunosuppressant {Immunotoxin {Infection, Candida, general {Infection, Pneumocystis jiroveci {Infection, general {Infection, malaria {Infection, malaria prophylaxis {Infection, pneumococcal prophylaxis {Infection, respiratory tract, general {Infection, urinary tract {Musculoskeletal {Ophthalmological {Protozoacide {Pruritus {Psoriasis {Quinolone antibacterial {Stomatological {Trimethoprim and analogues {Vulnerary #TAA GEN #SEQLIST BG612364 BC000192 BU529014 BQ939840
HUMFOLMES_18_298 #GENE_SYMBOL DHFR #GO_F #GO_Acc 4146 #GO_Desc dihydrofolate reductase activity #GO_F #GO_Acc 4386 #GO_Desc helicase activity #GO_F #GO_Acc 8233 #GO_Desc peptidase activity #GO_P #GO_Acc 6545 #GO_Desc glycine biosynthesis #GO_P #GO_Acc 6730 #GO_Desc one-carbon compound metabolism #GO_P #GO_Acc 9165 #GO_Desc nucleotide biosynthesis INDICATION Acne {Anaesthesia {Anaesthetic, local {Anthelmintic {Anti- inflammatory {Anti-inflammatory, topical {Antiacne {Antiarthritic, immunological {Antiarthritic, other {Antibacterial, other {Anticancer, antimetabolite {Anticancer, immunological {Anticancer, other {Antifungal {Antihypertensive, diuretic {Antimalarial {Antimycobacterial {Antipruritic/inflamm, non- allergic {Antipsoriasis {Arthritis, rheumatoid {Cancer, breast {Cancer, colorectal {Cancer, endometrial {Cancer, general {Cancer, head and neck {Cancer, leukaemia, acute lymphocytic {Cancer, lung, non- small cell {Cancer, lymphoma, T-cell {Cancer, mesothelioma {Cancer, pancreatic {Cancer, prostate {Cancer, renal {Cancer, sarcoma, general {Cardiostimulant {Eczema, atopic {Immunoconjugate, other {Immunosuppressant {Immunotoxin {Infection, Candida, general {Infection, Pneumocystis jiroveci {Infection, general {Infection, malaria {Infection, malaria prophylaxis {Infection, pneumococcal prophylaxis {Infection, respiratory tract, general {Infection, urinary tract {Musculoskeletal
{Ophthalmological {Protozoacide {Pruritus {Psoriasis {Quinolone antibacterial {Stomatological {Trimethoprim and analogues {Vulnerary #TAA GEN #SEQLIST BM710202 HUMFOLMES_18_382 #GENE_SYMBOL DHFR #GO_F #GO_Acc 4146 #GO_Desc dihydrofolate reductase activity #GO_F #GO_Acc 4386 #GO_Desc helicase activity #GO_F #GO_Acc 8233 #GO_Desc peptidase activity #GO_P #GO_Acc 6545 #GO_Desc glycine biosynthesis #GO_P
#GO_Acc 6730 #GO_Desc one-carbon compound metabolism #GO_P #GO_Acc 9165 #GO_Desc nucleotide biosynthesis INDICATION Acne {Anaesthesia {Anaesthetic, local {Anthelmintic {Anti- inflammatory {Anti-inflammatory, topical {Antiacne {Antiarthritic, immunological {Antiarthritic, other {Antibacterial, other {Anticancer, antimetabolite {Anticancer, immunological {Anticancer, other {Antifungal {Antihypertensive, diuretic {Antimalarial {Antimycobacterial {Antipruritic/inflamm, non- allergic {Antipsoriasis {Arthritis, rheumatoid {Cancer, breast {Cancer, colorectal {Cancer, endometrial {Cancer, general {Cancer, head and neck {Cancer, leukaemia, acute lymphocytic {Cancer, lung, non- small cell {Cancer, lymphoma, T-cell {Cancer, mesothelioma {Cancer, pancreatic {Cancer, prostate {Cancer, renal {Cancer, sarcoma, general {Cardiostimulant {Eczema, atopic {Immunoconjugate, other {Immunosuppressant {Immunotoxin {Infection, Candida, general {Infection, Pneumocystis jiroveci {Infection, general {Infection, malaria {Infection, malaria prophylaxis {Infection, pneumococcal prophylaxis {Infection, respiratory tract, general {Infection, urinary tract {Musculoskeletal {Ophthalmological {Protozoacide {Pruritus {Psoriasis {Quinolone antibacterial {Stomatological {Trimethoprim and analogues {Vulnerary #TAA GEN #SEQLIST BC000192 AA059177 AA001097 AW301288 BE540884 HUMFOLMES_19_384 #GENE_SYMBOL DHFR #GO_F #GO_Acc 4146 #GO_Desc dihydrofolate reductase activity #GO_F #GO_Acc 4386 #GO_Desc helicase activity #GO_F #GO_Acc 8233 #GO_Desc peptidase activity #GO_P #GO_Acc 6545 #GO_Desc glycine biosynthesis #GO_P #GO_Acc 6730 #GO_Desc one-carbon compound metabolism #GO_P #GO_Acc 9165 #GO_Desc nucleotide biosynthesis INDICATION Acne {Anaesthesia {Anaesthetic, local {Anthelmintic ;Anti- inflammatory {Anti-inflammatory, topical {Antiacne {Antiarthritic, immunological {Antiarthritic, other {Antibacterial, other {Anticancer, antimetabolite {Anticancer, immunological {Anticancer, other {Antifungal 
{Antihypertensive, diuretic {Antimalarial {Antimycobacterial {Antipruritic/inflamm, non- allergic {Antipsoriasis {Arthritis, rheumatoid {Cancer, breast {Cancer, colorectal {Cancer, endometrial {Cancer, general {Cancer, head and neck {Cancer, leukaemia, acute lymphocytic {Cancer, lung, non- small cell {Cancer, lymphoma, T-cell {Cancer, mesothelioma {Cancer, pancreatic {Cancer, prostate {Cancer, renal {Cancer, sarcoma, general {Cardiostimulant {Eczema, atopic {Immunoconjugate, other {Immunosuppressant {Immunotoxin {Infection, Candida, general {Infection, Pneumocystis jiroveci {Infection, general {Infection, malaria {Infection, malaria prophylaxis {Infection, pneumococcal prophylaxis {Infection, respiratory tract, general {Infection, urinary tract {Musculoskeletal {Ophthalmological {Protozoacide {Pruritus {Psoriasis {Quinolone antibacterial {Stomatological {Trimethoprim and analogues {Vulnerary #TAA GEN #SEQLIST BQ676457 BU168890 AA135018 HUMFOLMES_19_386 #GENE_SYMBOL DHFR #GO_F #GO_Acc 4146 #GO_Desc dihydrofolate reductase activity #GO_F #GO_Acc 4386 #GO_Desc helicase activity #GO_F #GO_Acc 8233 #GO_Desc peptidase activity #GO_P #GO_Acc 6545 #GO_Desc glycine biosynthesis #GO_P #GO_Acc 6730 #GO_Desc one-carbon compound metabolism #GO_P #GO_Acc 9165 #GO_Desc nucleotide biosynthesis INDICATION Acne {Anaesthesia {Anaesthetic, local {Anthelmintic {Anti- inflammatory {Anti-inflammatory, topical {Antiacne {Antiarthritic, immunological {Antiarthritic, other {Antibacterial, other {Anticancer, antimetabolite {Anticancer, immunological {Anticancer, other {Antifungal {Antihypertensive, diuretic {Antimalarial {Antimycobacterial {Antipruritic/inflamm, non- allergic {Antipsoriasis {Arthritis, rheumatoid {Cancer, breast {Cancer, colorectal {Cancer, endometrial {Cancer, general {Cancer, head and neck {Cancer, leukaemia, acute lymphocytic {Cancer, lung, non- small cell {Cancer, lymphoma, T-cell {Cancer, mesothelioma {Cancer, pancreatic {Cancer, prostate {Cancer, renal {Cancer, 
sarcoma, general {Cardiostimulant {Eczema, atopic {Immunoconjugate, other {Immunosuppressant {Immunotoxin {Infection, Candida, general {Infection, Pneumocystis jiroveci {Infection, general {Infection, malaria {Infection, malaria prophylaxis {Infection, pneumococcal prophylaxis {Infection, respiratory tract, general {Infection, urinary tract {Musculoskeletal
{Ophthalmological {Protozoacide {Pruritus {Psoriasis {Quinolone antibacterial {Stomatological {Trimethoprim and analogues {Vulnerary #TAA GEN #SEQLIST BG492408 BQ676457 BF570033 BM479366 BU618464 BG494342 BE566659 W03820 BQ425504 BU168890 BG033264 BG497976 CB162469 BC000192 AW301288 AA135018
HUMFOLMES_19_393 #GENE_SYMBOL DHFR #GO_F #GO_Acc 4146 #GO_Desc dihydrofolate reductase activity #GO_F #GO_Acc 4386 #GO_Desc helicase activity #GO_F #GO_Acc 8233 #GO_Desc peptidase activity #GO_P #GO_Acc 6545 #GO_Desc glycine biosynthesis #GO_P #GO_Acc 6730 #GO_Desc one-carbon compound metabolism #GO_P #GO_Acc 9165 #GO_Desc nucleotide biosynthesis INDICATION Acne {Anaesthesia {Anaesthetic, local {Anthelmintic {Anti-inflammatory {Anti-inflammatory, topical {Antiacne {Antiarthritic, immunological {Antiarthritic, other {Antibacterial, other {Anticancer, antimetabolite {Anticancer, immunological {Anticancer, other {Antifungal {Antihypertensive, diuretic {Antimalarial {Antimycobacterial {Antipruritic/inflamm, non-allergic {Antipsoriasis {Arthritis, rheumatoid {Cancer, breast {Cancer, colorectal {Cancer, endometrial {Cancer, general {Cancer, head and neck {Cancer, leukaemia, acute lymphocytic {Cancer, lung, non-small cell {Cancer, lymphoma, T-cell {Cancer, mesothelioma {Cancer, pancreatic {Cancer, prostate {Cancer, renal {Cancer, sarcoma, general {Cardiostimulant {Eczema, atopic {Immunoconjugate, other {Immunosuppressant {Immunotoxin {Infection, Candida, general {Infection, Pneumocystis jiroveci {Infection, general {Infection, malaria {Infection, malaria prophylaxis {Infection, pneumococcal prophylaxis {Infection, respiratory tract, general {Infection, urinary tract {Musculoskeletal {Ophthalmological {Protozoacide {Pruritus {Psoriasis {Quinolone antibacterial {Stomatological {Trimethoprim and analogues {Vulnerary #TAA GEN #SEQLIST CB162469 BQ671948 BG497976 AA135018
HUMFOLMES_20_18 #GENE_SYMBOL DHFR #GO_F #GO_Acc 4146 #GO_Desc dihydrofolate reductase activity #GO_F #GO_Acc 4386 #GO_Desc helicase activity #GO_F #GO_Acc 8233 #GO_Desc peptidase activity #GO_P #GO_Acc 6545 #GO_Desc glycine biosynthesis #GO_P #GO_Acc 6730 #GO_Desc one-carbon compound metabolism #GO_P #GO_Acc 9165 #GO_Desc nucleotide biosynthesis INDICATION Acne {Anaesthesia {Anaesthetic, local {Anthelmintic {Anti- inflammatory {Anti-inflammatory, topical {Antiacne {Antiarthritic, immunological {Antiarthritic, other {Antibacterial, other {Anticancer, antimetabolite {Anticancer, immunological {Anticancer, other {Antifungal {Antihypertensive, diuretic {Antimalarial {Antimycobacterial {Antipruritic/inflamm, non- allergic {Antipsoriasis {Arthritis, rheumatoid {Cancer, breast {Cancer, colorectal {Cancer, endometrial {Cancer, general {Cancer, head and neck {Cancer, leukaemia, acute lymphocytic {Cancer, lung, non- small cell {Cancer, lymphoma, T-cell {Cancer, mesothelioma {Cancer, pancreatic {Cancer, prostate {Cancer, renal {Cancer, sarcoma, general {Cardiostimulant {Eczema, atopic {Immunoconjugate, other {Immunosuppressant {Immunotoxin {Infection, Candida, general {Infection, Pneumocystis jiroveci {Infection, general {Infection, malaria {Infection, malaria prophylaxis {Infection, pneumococcal prophylaxis {Infection, respiratory tract, general {Infection, urinary tract {Musculoskeletal
{Ophthalmological {Protozoacide {Pruritus {Psoriasis {Quinolone antibacterial {Stomatological {Trimethoprim and analogues {Vulnerary #TAA GEN #SEQLIST BG497976 AA135018 HUMFOLMES_20_19 #GENE_SYMBOL DHFR #GO_F #GO_Acc 4146 #GO_Desc dihydrofolate reductase activity #GO_F #GO_Acc 4386 #GO_Desc helicase activity #GO_F #GO_Acc 8233 #GO_Desc peptidase activity #GO_P #GO_Acc 6545 #GO_Desc glycine biosynthesis #GO_P
#GO_Acc 6730 #GO_Desc one-carbon compound metabolism #GO_P #GO_Acc 9165 #GO_Desc nucleotide biosynthesis INDICATION Acne {Anaesthesia {Anaesthetic, local {Anthelmintic {Anti-inflammatory {Anti-inflammatory, topical {Antiacne {Antiarthritic, immunological {Antiarthritic, other {Antibacterial, other {Anticancer, antimetabolite {Anticancer, immunological {Anticancer, other {Antifungal {Antihypertensive, diuretic {Antimalarial {Antimycobacterial {Antipruritic/inflamm, non-allergic {Antipsoriasis {Arthritis, rheumatoid {Cancer, breast {Cancer, colorectal {Cancer, endometrial {Cancer, general {Cancer, head and neck {Cancer, leukaemia, acute lymphocytic {Cancer, lung, non-small cell {Cancer, lymphoma, T-cell {Cancer, mesothelioma {Cancer, pancreatic {Cancer, prostate {Cancer, renal {Cancer, sarcoma, general {Cardiostimulant {Eczema, atopic {Immunoconjugate, other {Immunosuppressant {Immunotoxin {Infection, Candida, general {Infection, Pneumocystis jiroveci {Infection, general {Infection, malaria {Infection, malaria prophylaxis {Infection, pneumococcal prophylaxis {Infection, respiratory tract, general {Infection, urinary tract {Musculoskeletal {Ophthalmological {Protozoacide {Pruritus {Psoriasis {Quinolone antibacterial {Stomatological {Trimethoprim and analogues {Vulnerary #TAA GEN #SEQLIST BQ676457 BU168890
HUMFOLMES_20_44 #GENE_SYMBOL DHFR #GO_F #GO_Acc 4146 #GO_Desc dihydrofolate reductase activity #GO_F #GO_Acc 4386 #GO_Desc helicase activity #GO_F #GO_Acc 8233 #GO_Desc peptidase activity #GO_P #GO_Acc 6545 #GO_Desc glycine biosynthesis #GO_P #GO_Acc 6730 #GO_Desc one-carbon compound metabolism #GO_P #GO_Acc 9165 #GO_Desc nucleotide biosynthesis INDICATION Acne {Anaesthesia {Anaesthetic, local {Anthelmintic {Anti-inflammatory {Anti-inflammatory, topical {Antiacne {Antiarthritic, immunological {Antiarthritic, other {Antibacterial, other {Anticancer, antimetabolite {Anticancer, immunological {Anticancer, other {Antifungal {Antihypertensive, diuretic {Antimalarial {Antimycobacterial {Antipruritic/inflamm, non-allergic {Antipsoriasis {Arthritis, rheumatoid {Cancer, breast {Cancer, colorectal {Cancer, endometrial {Cancer, general {Cancer, head and neck {Cancer, leukaemia, acute lymphocytic {Cancer, lung, non-small cell {Cancer, lymphoma, T-cell {Cancer, mesothelioma {Cancer, pancreatic {Cancer, prostate {Cancer, renal {Cancer, sarcoma, general {Cardiostimulant {Eczema, atopic {Immunoconjugate, other {Immunosuppressant {Immunotoxin {Infection, Candida, general {Infection, Pneumocystis jiroveci {Infection, general {Infection, malaria {Infection, malaria prophylaxis {Infection, pneumococcal prophylaxis {Infection, respiratory tract, general {Infection, urinary tract {Musculoskeletal {Ophthalmological {Protozoacide {Pruritus {Psoriasis {Quinolone antibacterial {Stomatological {Trimethoprim and analogues {Vulnerary #TAA GEN #SEQLIST BG492408 BQ676457 BF570033 BU168890 AA135018
HUMFOLMES_20_55 #GENE_SYMBOL DHFR #GO_F #GO_Acc 4146 #GO_Desc dihydrofolate reductase activity #GO_F #GO_Acc 4386 #GO_Desc helicase activity #GO_F #GO_Acc 8233 #GO_Desc peptidase activity #GO_P #GO_Acc 6545 #GO_Desc glycine biosynthesis #GO_P #GO_Acc 6730 #GO_Desc one-carbon compound metabolism #GO_P #GO_Acc 9165 #GO_Desc nucleotide biosynthesis INDICATION Acne {Anaesthesia {Anaesthetic, local {Anthelmintic {Anti- inflammatory {Anti-inflammatory, topical {Antiacne {Antiarthritic, immunological {Antiarthritic, other {Antibacterial, other {Anticancer, antimetabolite {Anticancer, immunological {Anticancer, other {Antifungal {Antihypertensive, diuretic {Antimalarial {Antimycobacterial {Antipruritic/inflamm, non- allergic {Antipsoriasis {Arthritis, rheumatoid {Cancer, breast {Cancer, colorectal {Cancer, endometrial {Cancer, general {Cancer, head and neck {Cancer, leukaemia, acute lymphocytic {Cancer, lung, non- small cell {Cancer, lymphoma, T-cell {Cancer, mesothelioma {Cancer, pancreatic {Cancer, prostate {Cancer, renal {Cancer, sarcoma, general {Cardiostimulant {Eczema, atopic {Immunoconjugate, other {Immunosuppressant {Immunotoxin {Infection, Candida, general {Infection, Pneumocystis jiroveci {Infection, general {Infection, malaria {Infection, malaria prophylaxis {Infection, pneumococcal prophylaxis {Infection, respiratory tract, general {Infection, urinary tract {Musculoskeletal
{Ophthalmological {Protozoacide {Pruritus {Psoriasis {Quinolone antibacterial {Stomatological {Trimethoprim and analogues {Vulnerary #TAA GEN #SEQLIST CB162469 HUMFOLMES_20_63 #GENE_SYMBOL DHFR #GO_F #GO_Acc 4146 #GO_Desc dihydrofolate reductase activity #GO_F #GO_Acc 4386 #GO_Desc helicase activity #GO_F #GO_Acc 8233 #GO_Desc peptidase activity #GO_P #GO_Acc 6545 #GO_Desc glycine biosynthesis #GO_P
#GO_Acc 6730 #GO_Desc one-carbon compound metabolism #GO_P #GO_Acc 9165 #GO_Desc nucleotide biosynthesis INDICATION Acne {Anaesthesia {Anaesthetic, local {Anthelmintic {Anti- inflammatory {Anti-inflammatory, topical {Antiacne {Antiarthritic, immunological {Antiarthritic, other {Antibacterial, other {Anticancer, antimetabolite {Anticancer, immunological {Anticancer, other {Antifungal {Antihypertensive, diuretic {Antimalarial {Antimycobacterial {Antipruritic/inflamm, non- allergic {Antipsoriasis {Arthritis, rheumatoid {Cancer, breast {Cancer, colorectal {Cancer, endometrial {Cancer, general {Cancer, head and neck {Cancer, leukaemia, acute lymphocytic {Cancer, lung, non- small cell {Cancer, lymphoma, T-cell {Cancer, mesothelioma {Cancer, pancreatic {Cancer, prostate {Cancer, renal {Cancer, sarcoma, general {Cardiostimulant {Eczema, atopic {Immunoconjugate, other {Immunosuppressant {Immunotoxin {Infection, Candida, general {Infection, Pneumocystis jiroveci {Infection, general {Infection, malaria {Infection, malaria prophylaxis {Infection, pneumococcal prophylaxis {Infection, respiratory tract, general {Infection, urinary tract {Musculoskeletal {Ophthalmological {Protozoacide {Pruritus {Psoriasis {Quinolone antibacterial {Stomatological {Trimethoprim and analogues {Vulnerary #TAA GEN #SEQLIST BQ676457 BU618464 BU168890 AA135018 HUMFOLMES_20_64 #GENE_SYMBOL DHFR #GO_F #GO_Acc 4146 #GO_Desc dihydrofolate reductase activity #GO_F #GO_Acc 4386 #GO_Desc helicase activity #GO_F #GO_Acc 8233 #GO_Desc peptidase activity #GO_P #GO_Acc 6545 #GO_Desc glycine biosynthesis #GO_P #GO_Acc 6730 #GO_Desc one-carbon compound metabolism #GO_P #GO_Acc 9165 #GO_Desc nucleotide biosynthesis #INDICATION Acne {Anaesthesia {Anaesthetic, local {Anthelmintic {Anti- inflammatory {Anti-inflammatory, topical {Antiacne {Antiarthritic, immunological {Antiarthritic, other {Antibacterial, other {Anticancer, antimetabolite {Anticancer, immunological {Anticancer, other {Antifungal {Antihypertensive, 
diuretic {Antimalarial {Antimycobacterial {Antipruritic/inflamm, non- allergic {Antipsoriasis {Arthritis, rheumatoid {Cancer, breast {Cancer, colorectal {Cancer, endometrial {Cancer, general {Cancer, head and neck {Cancer, leukaemia, acute lymphocytic {Cancer, lung, non- small cell {Cancer, lymphoma, T-cell {Cancer, mesothelioma {Cancer, pancreatic {Cancer, prostate {Cancer, renal {Cancer, sarcoma, general {Cardiostimulant {Eczema, atopic {Immunoconjugate, other {Immunosuppressant {Immunotoxin {Infection, Candida, general {Infection, Pneumocystis jiroveci {Infection, general {Infection, malaria {Infection, malaria prophylaxis {Infection, pneumococcal prophylaxis {Infection, respiratory tract, general {Infection, urinary tract {Musculoskeletal
{Ophthalmological {Protozoacide {Pruritus {Psoriasis {Quinolone antibacterial {Stomatological {Trimethoprim and analogues {Vulnerary #TAA GEN #SEQLIST BG492408 CB162469 BQ676457 BU618464 BU168890
R12689_20_635 #GENE_SYMBOL FLJ21106 #SEQLIST AW192675
R12689_20_640 #GENE_SYMBOL FLJ21106 #SEQLIST AW192675
R12689_20_654 #GENE_SYMBOL FLJ21106 #SEQLIST BM876242 BM804070
R12689_20_662 #GENE_SYMBOL FLJ21106 #SEQLIST BM876242 BF984644 BM804070 AW192675
R12689_20_734 #GENE_SYMBOL FLJ21106 #SEQLIST BM804070
R12689_20_735 #GENE_SYMBOL FLJ21106 #SEQLIST AW192675
R12689_20_765 #GENE_SYMBOL FLJ21106 #SEQLIST BF984644 AW192675
R12689_50_767 #GENE_SYMBOL FLJ21106 #SEQLIST BE788256
R12689_50_844 #GENE_SYMBOL FLJ21106 #SEQLIST BI086997 BI087009
R12689_50_863 #GENE_SYMBOL FLJ21106 #SEQLIST BI086997 BE788256 BI087009
R12689_50_889 #GENE_SYMBOL FLJ21106 #SEQLIST BI086997 BI087009
R12689_50_899 #GENE_SYMBOL FLJ21106 #SEQLIST BI086997 BI087009
R12689_50_903 #GENE_SYMBOL FLJ21106 #SEQLIST BE788256
T39606_44_100 #GENE_SYMBOL F11R ;JAM1 ;JCAM #GO_F #GO_Acc 3754 #GO_Desc chaperone activity #GO_F #GO_Acc 4792 #GO_Desc thiosulfate sulfurtransferase activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 5194 #GO_Desc cell adhesion molecule activity #GO_F #GO_Acc 5515 #GO_Desc protein binding #GO_P #GO_Acc 6928 #GO_Desc cell motility #GO_P #GO_Acc 6954 #GO_Desc inflammatory response #GO_P #GO_Acc 7155 #GO_Desc cell adhesion #GO_P #GO_Acc 8272 #GO_Desc sulfate transport #TAA mammary-tumor #SEQLIST AI093487 BQ305460 CD369735 AF191495 BF871813 AI263270
R01692 AA101562 BM727609 AW339937 AI333843 BQ305305 AW936086 AF207907 AI193265 BQ305324 AI955238 T86963 BC001533 BG992611 AW630757 AF172398 BF873312 AW190875 NM144504 BQ305722 BQ306615 BM681047 BF771639 AI632044 AI002582 BG428150 AI241578 BG994189 T84017 AK026665 AF111713 BQ305313 AW338261 AA149993 BE350662 BG997475 BQ307221 AI469580 AW999404 BF770534
T39606_44_120 #GENE_SYMBOL F11R ;JAM1 ;JCAM #GO_F #GO_Acc 3754 #GO_Desc chaperone activity #GO_F #GO_Acc 4792 #GO_Desc thiosulfate sulfurtransferase activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 5194 #GO_Desc cell adhesion molecule activity #GO_F #GO_Acc 5515 #GO_Desc protein binding #GO_P #GO_Acc 6928 #GO_Desc cell motility #GO_P #GO_Acc 6954 #GO_Desc inflammatory response #GO_P #GO_Acc 7155 #GO_Desc cell adhesion #GO_P #GO_Acc 8272 #GO_Desc sulfate transport #TAA mammary-tumor #SEQLIST AI093487 BQ305460 BM727609 CD369735 BQ305722 BQ306615 BM681047 BQ305313 BQ305305 AW936086 AA149993 AL550948 BQ305324 AI002582 BG997475 BQ307221 BG994189 BG992611
T39606_44_121 #GENE_SYMBOL F11R ;JAM1 {JCAM #GO_F #GO_Acc 3754 #GO_Desc chaperone activity #GO_F #GO_Acc 4792 #GO_Desc thiosulfate sulfurtransferase activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 5194 #GO_Desc cell adhesion molecule activity #GO_F #GO_Acc 5515 #GO_Desc protein binding #GO_P #GO_Acc 6928 #GO_Desc cell motility #GO_P #GO_Acc 6954 #GO_Desc inflammatory response #GO_P #GO_Acc 7155 #GO_Desc cell adhesion #GO_P #GO_Acc 8272 #GO_Desc sulfate transport #TAA mammary-tumor #SEQLIST AI093487 BQ326107 BQ305460 AF172398 CD369735 BQ305722
BQ306615 BM681047 BG428150 AI002582 BG994189 AI263270 AA101562 BM727609 AW339937 AF111713 BQ305313 BQ305305 AF207907 AW936086 AA149993 AW338261 BQ305324 BG997475 BQ307221 AW999404 BC001533 BF770534 BG992611 T39606_44_20 #GENE_SYMBOL F11R ;JAM1 ;JCAM #GO_F #GO_Acc 3754 #GO_Desc chaperone activity #GO_F #GO_Acc 4792 #GO_Desc thiosulfate sulfurtransferase activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 5194 #GO_Desc cell adhesion molecule activity #GO_F #GO_Acc 5515 #GO_Desc protein binding #GO_P #GO_Acc 6928 #GO_Desc cell motility #GO_P #GO_Acc 6954 #GO_Desc inflammatory response #GO_P #GO_Acc 7155 #GO_Desc cell adhesion #GO_P #GO_Acc 8272 #GO_Desc sulfate transport #TAA mammary-tumor #SEQLIST BF839692 AF172398 AW190875 CD369735 BM681047 AA101520 BG428150 BG994189 AA101562 BM727609 AI333843 AA702259 AF111713 AK026665 AI193265 AA149993 BE350662 BG997475 AW999404 BF770534
T39606_44_29 #GENE_SYMBOL F11R ;JAM1 ;JCAM #GO_F #GO_Acc 3754 #GO_Desc chaperone activity #GO_F #GO_Acc 4792 #GO_Desc thiosulfate sulfurtransferase activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 5194 #GO_Desc cell adhesion molecule activity #GO_F #GO_Acc 5515 #GO_Desc protein binding #GO_P #GO_Acc 6928 #GO_Desc cell motility #GO_P #GO_Acc 6954 #GO_Desc inflammatory response #GO_P #GO_Acc 7155 #GO_Desc cell adhesion #GO_P #GO_Acc 8272 #GO_Desc sulfate transport #TAA mammary-tumor #SEQLIST AI093487 BF839692 BM681047 BF771639 BG428150 BG994189 AI263270 AA101562 BM727609 AK026665 AF111713 BE350662 AW338261 AA149993 BG997475 BF770534 BC001533 AW999404
T39606_44_30 #GENE_SYMBOL F11R ;JAM1 ;JCAM #GO_F #GO_Acc 3754 #GO_Desc chaperone activity #GO_F #GO_Acc 4792 #GO_Desc thiosulfate sulfurtransferase activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 5194 #GO_Desc cell adhesion molecule activity #GO_F #GO_Acc 5515 #GO_Desc protein binding #GO_P #GO_Acc 6928 #GO_Desc cell motility #GO_P #GO_Acc 6954 #GO_Desc inflammatory response #GO_P #GO_Acc 7155 #GO_Desc cell adhesion #GO_P #GO_Acc 8272 #GO_Desc sulfate transport #TAA mammary-tumor #SEQLIST AI093487 BQ305460 AF172398 AW190875 BQ305722 BM681047 BG428150 AI002582 BG994189 R01692 AI608881 AA101562 BM727609 AI333843 AF111713 AK026665 BQ305313 BQ305305 AF207907 AI193265 AA149993 AW338261 BQ305324 BG997475 BQ307221 AI469580 AW999404 BF770534
T39606_44_68 #GENE_SYMBOL F11R ;JAM1 ;JCAM #GO_F #GO_Acc 3754 #GO_Desc chaperone activity #GO_F #GO_Acc 4792 #GO_Desc thiosulfate sulfurtransferase activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 5194 #GO_Desc cell adhesion molecule activity #GO_F #GO_Acc 5515 #GO_Desc protein binding #GO_P #GO_Acc 6928 #GO_Desc cell motility #GO_P #GO_Acc 6954 #GO_Desc inflammatory response #GO_P #GO_Acc 7155 #GO_Desc cell adhesion #GO_P #GO_Acc 8272 #GO_Desc sulfate transport #TAA mammary-tumor #SEQLIST AI093487 BQ305460 CD369735 AA244018 BF871813 AI263270 R01692 AI608881 AA101562 BM727609 AW339937 AI333843 BQ305305 AW936086 AF207907 AI193265 BQ305324 T86963 BC001533 AW630757 AF172398 BF873312 AW190875 BQ305722 BQ306615 BM681047 BG999100 BF771639 AI632044 AI002582 BG428150 AI241578 BG994189 BF871808 T84017 BF813471 AK026665 AF111713 BQ305313 AW338261 AA149993 BE350662 BF871973 BG997475 BQ307221 AI469580 AW999404 BF770534
T39606_66_167 #GENE_SYMBOL F11R ;JAM1 ;JCAM #GO_F #GO_Acc 3754 #GO_Desc chaperone activity #GO_F #GO_Acc 4792 #GO_Desc thiosulfate sulfurtransferase activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 5194 #GO_Desc cell adhesion molecule activity #GO_F #GO_Acc 5515 #GO_Desc protein binding #GO_P #GO_Acc 6928 #GO_Desc cell motility #GO_P #GO_Acc 6954 #GO_Desc inflammatory response #GO_P #GO_Acc 7155 #GO_Desc cell adhesion #GO_P #GO_Acc 8272 #GO_Desc sulfate transport #TAA mammary-tumor #SEQLIST BG289042 AI333187
T39606_66_193 #GENE_SYMBOL F11R ;JAM1 ;JCAM #GO_F #GO_Acc 3754 #GO_Desc chaperone activity #GO_F #GO_Acc 4792 #GO_Desc thiosulfate sulfurtransferase activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 5194 #GO_Desc cell adhesion molecule activity #GO_F #GO_Acc 5515 #GO_Desc protein binding #GO_P #GO_Acc 6928 #GO_Desc cell motility #GO_P #GO_Acc 6954 #GO_Desc inflammatory response #GO_P #GO_Acc 7155 #GO_Desc cell adhesion #GO_P #GO_Acc 8272 #GO_Desc sulfate transport #TAA mammary-tumor #SEQLIST BG289042
T39606_66_242 #GENE_SYMBOL F11R ;JAM1 ;JCAM #GO_F #GO_Acc 3754 #GO_Desc chaperone activity #GO_F #GO_Acc 4792 #GO_Desc thiosulfate sulfurtransferase activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 5194 #GO_Desc cell adhesion molecule activity #GO_F #GO_Acc 5515 #GO_Desc protein binding #GO_P #GO_Acc 6928 #GO_Desc cell motility #GO_P #GO_Acc 6954 #GO_Desc inflammatory response #GO_P #GO_Acc 7155 #GO_Desc cell adhesion #GO_P #GO_Acc 8272 #GO_Desc sulfate transport #TAA mammary-tumor #SEQLIST BG289042 T39606_66_277 #GENE_SYMBOL F11R ;JAM1 ;JCAM #GO_F #GO_Acc 3754 #GO_Desc chaperone activity #GO_F #GO_Acc 4792 #GO_Desc thiosulfate sulfurtransferase activity #GO_F
#GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 5194 #GO_Desc cell adhesion molecule activity #GO_F #GO_Acc 5515 #GO_Desc protein binding #GO_P #GO_Acc 6928 #GO_Desc cell motility #GO_P #GO_Acc 6954 #GO_Desc inflammatory response #GO_P #GO_Acc 7155 #GO_Desc cell adhesion #GO_P #GO_Acc 8272 #GO_Desc sulfate transport #TAA mammary-tumor #SEQLIST BG289042 AI333187
T39606_66_287 #GENE_SYMBOL F11R ;JAM1 {JCAM #GO_F #GO_Acc 3754 #GO_Desc chaperone activity #GO_F #GO_Acc 4792 #GO_Desc thiosulfate sulfurtransferase activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 5194 #GO_Desc cell adhesion molecule activity #GO_F #GO_Acc 5515 #GO_Desc protein binding #GO_P #GO_Acc 6928 #GO_Desc cell motility #GO_P #GO_Acc 6954 #GO_Desc inflammatory response #GO_P #GO_Acc 7155 #GO_Desc cell adhesion #GO_P #GO_Acc 8272 #GO_Desc sulfate transport #TAA mammary-tumor #SEQLIST AI333187
T39606_66_314 #GENE_SYMBOL F11R ;JAM1 {JCAM #GO_F #GO_Acc 3754 #GO_Desc chaperone activity #GO_F #GO_Acc 4792 #GO_Desc thiosulfate sulfurtransferase activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 5194 #GO_Desc cell adhesion molecule activity #GO_F #GO_Acc 5515 #GO_Desc protein binding #GO_P #GO_Acc 6928 #GO_Desc cell motility #GO_P #GO_Acc 6954 #GO_Desc inflammatory response #GO_P #GO_Acc 7155 #GO_Desc cell adhesion #GO_P #GO_Acc 8272 #GO_Desc sulfate transport #TAA mammary-tumor #SEQLIST BG289042 AI333187
T39606_66_320 #GENE_SYMBOL F11R ;JAM1 ;JCAM #GO_F #GO_Acc 3754 #GO_Desc chaperone activity #GO_F #GO_Acc 4792 #GO_Desc thiosulfate sulfurtransferase activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 5194 #GO_Desc cell adhesion molecule activity #GO_F #GO_Acc 5515 #GO_Desc protein binding #GO_P #GO_Acc 6928 #GO_Desc cell motility #GO_P #GO_Acc 6954 #GO_Desc inflammatory response #GO_P #GO_Acc 7155 #GO_Desc cell adhesion #GO_P #GO_Acc 8272 #GO_Desc sulfate transport #TAA mammary-tumor #SEQLIST AI333187
T39606_66_329 #GENE_SYMBOL F11R ;JAM1 {JCAM #GO_F #GO_Acc 3754 #GO_Desc chaperone activity #GO_F #GO_Acc 4792 #GO_Desc thiosulfate sulfurtransferase activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 5194 #GO_Desc cell adhesion molecule activity #GO_F #GO_Acc 5515 #GO_Desc protein binding #GO_P #GO_Acc 6928 #GO_Desc cell motility #GO_P #GO_Acc 6954 #GO_Desc inflammatory response #GO_P #GO_Acc 7155 #GO_Desc cell adhesion #GO_P #GO_Acc 8272 #GO_Desc sulfate transport #TAA mammary-tumor #SEQLIST AI333187
T39606_66_340 #GENE_SYMBOL F11R ;JAM1 ;JCAM #GO_F #GO_Acc 3754 #GO_Desc chaperone activity #GO_F #GO_Acc 4792 #GO_Desc thiosulfate sulfurtransferase activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 5194 #GO_Desc cell adhesion molecule activity #GO_F #GO_Acc 5515 #GO_Desc protein binding #GO_P #GO_Acc 6928 #GO_Desc cell motility #GO_P #GO_Acc 6954 #GO_Desc inflammatory response #GO_P #GO_Acc 7155 #GO_Desc cell adhesion #GO_P #GO_Acc 8272 #GO_Desc sulfate transport #TAA mammary-tumor #SEQLIST BG289042 AI333187
T39606_66_387 #GENE_SYMBOL F11R ;JAM1 ;JCAM #GO_F #GO_Acc 3754 #GO_Desc chaperone activity #GO_F #GO_Acc 4792 #GO_Desc thiosulfate sulfurtransferase activity #GO_F #GO_Acc 4872 #GO_Desc receptor activity #GO_F #GO_Acc 5194 #GO_Desc cell adhesion molecule activity #GO_F #GO_Acc 5515 #GO_Desc protein binding #GO_P #GO_Acc 6928 #GO_Desc cell motility #GO_P #GO_Acc 6954 #GO_Desc inflammatory response #GO_P #GO_Acc 7155 #GO_Desc cell adhesion #GO_P #GO_Acc 8272 #GO_Desc sulfate transport #TAA mammary-tumor #SEQLIST BG289042 AI333187
T89426_18_10 #GENE_SYMBOL DFF1 ;DFF45 ;DFFA #GO_F #GO_Acc 16329 #GO_Desc apoptosis regulator activity #GO_F #GO_Acc 4537 #GO_Desc caspase-activated deoxyribonuclease activity #GO_P #GO_Acc 6309 #GO_Desc DNA fragmentation #GO_P #GO_Acc 6915 #GO_Desc apoptosis #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 7242 #GO_Desc intracellular signaling cascade #TAA EPI #SEQLIST BE302337 AI620675 BX117311 AA487452 AI620663 BG392751 AI049623 CA437921 BU556879
T89426_18_105 #GENE_SYMBOL DFF1 ;DFF45 ;DFFA #GO_F #GO_Acc 16329 #GO_Desc apoptosis regulator activity #GO_F #GO_Acc 4537 #GO_Desc caspase-activated deoxyribonuclease activity #GO_P #GO_Acc 6309 #GO_Desc DNA fragmentation #GO_P #GO_Acc 6915 #GO_Desc apoptosis #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 7242 #GO_Desc intracellular signaling cascade #TAA EPI #SEQLIST BU161766 BE302337 BC007721 AA487452 BQ064502 AU121243 CA437921
T89426_18_126 #GENE_SYMBOL DFF1 ;DFF45 ;DFFA #GO_F #GO_Acc 16329 #GO_Desc apoptosis regulator activity #GO_F #GO_Acc 4537 #GO_Desc caspase-activated deoxyribonuclease activity #GO_P #GO_Acc 6309 #GO_Desc DNA fragmentation #GO_P #GO_Acc 6915 #GO_Desc apoptosis #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 7242 #GO_Desc intracellular signaling cascade #TAA EPI #SEQLIST BQ064502
T89426_18_136 #GENE_SYMBOL DFF1 ;DFF45 ;DFFA #GO_F #GO_Acc 16329 #GO_Desc apoptosis regulator activity #GO_F #GO_Acc 4537 #GO_Desc caspase-activated deoxyribonuclease activity #GO_P #GO_Acc 6309 #GO_Desc DNA fragmentation #GO_P #GO_Acc 6915 #GO_Desc apoptosis #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 7242 #GO_Desc intracellular signaling cascade #TAA EPI #SEQLIST W56054 BQ064502
T89426_18_139 #GENE_SYMBOL DFF1 ;DFF45 ;DFFA #GO_F #GO_Acc 16329 #GO_Desc apoptosis regulator activity #GO_F #GO_Acc 4537 #GO_Desc caspase-activated deoxyribonuclease activity #GO_P #GO_Acc 6309 #GO_Desc DNA fragmentation #GO_P #GO_Acc 6915 #GO_Desc apoptosis #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 7242 #GO_Desc intracellular signaling cascade #TAA EPI #SEQLIST BU161766 AA487452 AU121243
T89426_18_144 #GENE_SYMBOL DFF1 ;DFF45 ;DFFA #GO_F #GO_Acc 16329 #GO_Desc apoptosis regulator activity #GO_F #GO_Acc 4537 #GO_Desc caspase-activated deoxyribonuclease activity #GO_P #GO_Acc 6309 #GO_Desc DNA fragmentation #GO_P #GO_Acc 6915 #GO_Desc apoptosis #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 7242 #GO_Desc intracellular signaling cascade #TAA EPI #SEQLIST BU161766 BC007721 AA837509 AU145314
T89426_18_167 #GENE_SYMBOL DFF1 ;DFF45 ;DFFA #GO_F #GO_Acc 16329 #GO_Desc apoptosis regulator activity #GO_F #GO_Acc 4537 #GO_Desc caspase-activated deoxyribonuclease activity #GO_P #GO_Acc 6309 #GO_Desc DNA fragmentation #GO_P #GO_Acc 6915 #GO_Desc apoptosis #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 7242 #GO_Desc intracellular signaling cascade #TAA EPI #SEQLIST BU161766 AA487452
T89426_18_168 #GENE_SYMBOL DFF1 ;DFF45 ;DFFA #GO_F #GO_Acc 16329 #GO_Desc apoptosis regulator activity #GO_F #GO_Acc 4537 #GO_Desc caspase-activated deoxyribonuclease activity #GO_P #GO_Acc 6309 #GO_Desc DNA fragmentation #GO_P #GO_Acc 6915 #GO_Desc apoptosis #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 7242 #GO_Desc intracellular signaling cascade #TAA EPI #SEQLIST AA487452 BQ064502 BU556879
T89426_18_178 #GENE_SYMBOL DFF1 ;DFF45 ;DFFA #GO_F #GO_Acc 16329 #GO_Desc apoptosis regulator activity #GO_F #GO_Acc 4537 #GO_Desc caspase-activated deoxyribonuclease activity #GO_P #GO_Acc 6309 #GO_Desc DNA fragmentation #GO_P #GO_Acc 6915 #GO_Desc apoptosis #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 7242 #GO_Desc intracellular signaling cascade #TAA EPI #SEQLIST BU161766 AA487452 BQ064502
T89426_18_239 #GENE_SYMBOL DFF1 ;DFF45 ;DFFA #GO_F #GO_Acc 16329 #GO_Desc apoptosis regulator activity #GO_F #GO_Acc 4537 #GO_Desc caspase-activated deoxyribonuclease activity #GO_P #GO_Acc 6309 #GO_Desc DNA fragmentation #GO_P #GO_Acc 6915 #GO_Desc apoptosis #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 7242 #GO_Desc intracellular signaling cascade #TAA EPI #SEQLIST BU161766
T89426_18_28 #GENE_SYMBOL DFF1 ;DFF45 ;DFFA #GO_F #GO_Acc 16329 #GO_Desc apoptosis regulator activity #GO_F #GO_Acc 4537 #GO_Desc caspase-activated deoxyribonuclease activity #GO_P #GO_Acc 6309 #GO_Desc DNA fragmentation #GO_P #GO_Acc 6915 #GO_Desc apoptosis #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 7242 #GO_Desc intracellular signaling cascade #TAA EPI #SEQLIST BU161766 AI620675 BX117311 AA487452 AI620663 AU121243 BI115576
T89426_18_29 #GENE_SYMBOL DFF1 ;DFF45 ;DFFA #GO_F #GO_Acc 16329 #GO_Desc apoptosis regulator activity #GO_F #GO_Acc 4537 #GO_Desc caspase-activated deoxyribonuclease activity #GO_P #GO_Acc 6309 #GO_Desc DNA fragmentation #GO_P #GO_Acc 6915 #GO_Desc apoptosis #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 7242 #GO_Desc intracellular signaling cascade #TAA EPI #SEQLIST BU161766 AI620675 BX117311 AI620663
T89426_18_3 #GENE_SYMBOL DFF1 ;DFF45 ;DFFA #GO_F #GO_Acc 16329 #GO_Desc apoptosis regulator activity #GO_F #GO_Acc 4537 #GO_Desc caspase-activated deoxyribonuclease activity #GO_P #GO_Acc 6309 #GO_Desc DNA fragmentation #GO_P #GO_Acc 6915 #GO_Desc apoptosis #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 7242 #GO_Desc intracellular signaling cascade #TAA EPI #SEQLIST BE302337 AA487452 BQ064502 AU121243 BQ214663 BM006015
T89426_18_67 #GENE_SYMBOL DFF1 ;DFF45 ;DFFA #GO_F #GO_Acc 16329 #GO_Desc apoptosis regulator activity #GO_F #GO_Acc 4537 #GO_Desc caspase-activated deoxyribonuclease activity #GO_P #GO_Acc 6309 #GO_Desc DNA fragmentation #GO_P #GO_Acc 6915 #GO_Desc apoptosis #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 7242 #GO_Desc intracellular signaling cascade #TAA EPI #SEQLIST BU161766 BE302337 AA837509 AA487452 W56054 AI620663 AI049623 AI620675 BX117311 BE383228 CA437921 BU556879
T89426_18_71 #GENE_SYMBOL DFF1 ;DFF45 ;DFFA #GO_F #GO_Acc 16329 #GO_Desc apoptosis regulator activity #GO_F #GO_Acc 4537 #GO_Desc caspase-activated deoxyribonuclease activity #GO_P #GO_Acc 6309 #GO_Desc DNA fragmentation #GO_P #GO_Acc 6915 #GO_Desc apoptosis #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 7242 #GO_Desc intracellular signaling cascade #TAA EPI #SEQLIST BE315367
T89426_18_95 #GENE_SYMBOL DFF1 ;DFF45 ;DFFA #GO_F #GO_Acc 16329 #GO_Desc apoptosis regulator activity #GO_F #GO_Acc 4537 #GO_Desc caspase-activated deoxyribonuclease activity #GO_P #GO_Acc 6309 #GO_Desc DNA fragmentation #GO_P #GO_Acc 6915 #GO_Desc apoptosis #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 7242 #GO_Desc intracellular signaling cascade #TAA EPI #SEQLIST BU161766 AA487452 BQ064502 BM006015
T89426_18_97 #GENE_SYMBOL DFF1 ;DFF45 ;DFFA #GO_F #GO_Acc 16329 #GO_Desc apoptosis regulator activity #GO_F #GO_Acc 4537 #GO_Desc caspase-activated deoxyribonuclease activity #GO_P #GO_Acc 6309 #GO_Desc DNA fragmentation #GO_P #GO_Acc 6915 #GO_Desc apoptosis #GO_P #GO_Acc 7165 #GO_Desc signal transduction #GO_P #GO_Acc 7242 #GO_Desc intracellular signaling cascade #TAA EPI #SEQLIST BU161766 BE302337 AA837509 BE383228 AA487452 BQ064502 AU121243 BU556879 BM006015
Z39650_36_1450 #GENE_SYMBOL REQ ;UBID4 #GO_F #GO_Acc 3677 #GO_Desc DNA binding #GO_P #GO_Acc 6355 #GO_Desc regulation of transcription, DNA-dependent #GO_P #GO_Acc 8624 #GO_Desc induction of apoptosis by extracellular signals #TAA pancreas-tumor #SEQLIST AK056223
Z39650_36_1513 #GENE_SYMBOL REQ ;UBID4 #GO_F #GO_Acc 3677 #GO_Desc DNA binding #GO_P #GO_Acc 6355 #GO_Desc regulation of transcription, DNA-dependent #GO_P #GO_Acc 8624 #GO_Desc induction of apoptosis by extracellular signals #TAA pancreas-tumor #SEQLIST AK056223
Z39650_36_1538 #GENE_SYMBOL REQ ;UBID4 #GO_F #GO_Acc 3677 #GO_Desc DNA binding #GO_P #GO_Acc 6355 #GO_Desc regulation of transcription, DNA-dependent #GO_P #GO_Acc 8624 #GO_Desc induction of apoptosis by extracellular signals #TAA pancreas-tumor #SEQLIST AK056223 Z39650_36_1593 #GENE_SYMBOL REQ ;UBID4 #GO_F #GO_Acc 3677 #GO_Desc DNA binding #GO_P #GO_Acc 6355 #GO_Desc regulation of transcription, DNA-dependent #GO_P #GO_Acc 8624 #GO_Desc induction of apoptosis by extracellular signals #TAA pancreas-tumor #SEQLIST AK056223
Z40049_54_274 #GENE_SYMBOL DKFZP434E1120 ;DKFZP727M111 ;KIAA1403 #GO_F #GO_Acc 4009 #GO_Desc ATP-binding cassette (ABC) transporter activity #GO_F #GO_Acc 5524 #GO_Desc ATP binding #GO_P #GO_Acc 6810 #GO_Desc transport #TAA brain-tumor #SEQLIST AB037824
Z40049_54_287 #GENE_SYMBOL DKFZP434E1120 ;DKFZP727M111 ;KIAA1403 #GO_F #GO_Acc 4009 #GO_Desc ATP-binding cassette (ABC) transporter activity #GO_F #GO_Acc 5524 #GO_Desc ATP binding #GO_P #GO_Acc 6810 #GO_Desc transport #TAA brain-tumor #SEQLIST AB037824
Z40049_54_328 #GENE_SYMBOL DKFZP434E1120 ;DKFZP727M111 ;KIAA1403 #GO_F #GO_Acc 4009 #GO_Desc ATP-binding cassette (ABC) transporter activity #GO_F #GO_Acc 5524 #GO_Desc ATP binding #GO_P #GO_Acc 6810 #GO_Desc transport #TAA brain-tumor #SEQLIST AB037824
Z40049_54_351 #GENE_SYMBOL DKFZP434E1120 ;DKFZP727M111 ;KIAA1403 #GO_F #GO_Acc 4009 #GO_Desc ATP-binding cassette (ABC) transporter activity #GO_F #GO_Acc 5524 #GO_Desc ATP binding #GO_P #GO_Acc 6810 #GO_Desc transport #TAA brain-tumor #SEQLIST AB037824
Z40049_54_360 #GENE_SYMBOL DKFZP434E1120 ;DKFZP727M111 ;KIAA1403 #GO_F #GO_Acc 4009 #GO_Desc ATP-binding cassette (ABC) transporter activity #GO_F #GO_Acc 5524 #GO_Desc ATP binding #GO_P #GO_Acc 6810 #GO_Desc transport #TAA brain-tumor #SEQLIST AB037824
Z40049_54_378 #GENE_SYMBOL DKFZP434E1120 ;DKFZP727M111 ;KIAA1403 #GO_F #GO_Acc 4009 #GO_Desc ATP-binding cassette (ABC) transporter activity #GO_F #GO_Acc 5524 #GO_Desc ATP binding #GO_P #GO_Acc 6810 #GO_Desc transport #TAA brain-tumor #SEQLIST AB037824
Z44562_19_627 #GENE_SYMBOL KIAA0665 ;Rab11-FIP3 #GO_F #GO_Acc 17137 #GO_Desc Rab interactor activity #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_F #GO_Acc 5509 #GO_Desc calcium ion binding #GO_F #GO_Acc 8017 #GO_Desc microtubule binding #GO_P #GO_Acc 6118 #GO_Desc electron transport #GO_P #GO_Acc 7017 #GO_Desc microtubule-based process #GO_P #GO_Acc 915 #GO_Desc cytokinesis, actomyosin ring formation #SEQLIST AK094817
Z44562_19_727 #GENE_SYMBOL KIAA0665 ;Rab11-FIP3 #GO_F #GO_Acc 17137 #GO_Desc Rab interactor activity #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_F #GO_Acc 5509 #GO_Desc calcium ion binding #GO_F #GO_Acc 8017 #GO_Desc microtubule binding #GO_P #GO_Acc 6118 #GO_Desc electron transport #GO_P #GO_Acc 7017 #GO_Desc microtubule-based process #GO_P #GO_Acc 915 #GO_Desc cytokinesis, actomyosin ring formation #SEQLIST AK094817
Z44562_19_737 #GENE_SYMBOL KIAA0665 ;Rab11-FIP3 #GO_F #GO_Acc 17137 #GO_Desc Rab interactor activity #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_F #GO_Acc 5509 #GO_Desc calcium ion binding #GO_F #GO_Acc 8017 #GO_Desc microtubule binding #GO_P #GO_Acc 6118 #GO_Desc electron transport #GO_P #GO_Acc 7017 #GO_Desc microtubule-based process #GO_P #GO_Acc 915 #GO_Desc cytokinesis, actomyosin ring formation #SEQLIST AK094817 Z44562_19_743 #GENE_SYMBOL KIAA0665 ;Rab11-FIP3 #GO_F #GO_Acc 17137 #GO_Desc Rab interactor activity #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_F #GO_Acc 5509 #GO_Desc calcium ion binding #GO_F #GO_Acc 8017 #GO_Desc microtubule binding #GO_P #GO_Acc 6118 #GO_Desc electron transport #GO_P #GO_Acc 7017 #GO_Desc microtubule-based process #GO_P #GO_Acc 915 #GO_Desc cytokinesis, actomyosin ring formation #SEQLIST AK094817
Z44562_19_868 #GENE_SYMBOL KIAA0665 ;Rab11-FIP3 #GO_F #GO_Acc 17137 #GO_Desc Rab interactor activity #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_F #GO_Acc 5509 #GO_Desc calcium ion binding #GO_F #GO_Acc 8017 #GO_Desc microtubule binding #GO_P #GO_Acc 6118 #GO_Desc electron transport #GO_P #GO_Acc 7017 #GO_Desc microtubule-based process #GO_P #GO_Acc 915 #GO_Desc cytokinesis, actomyosin ring formation #SEQLIST AK094817 Z44562_19_877 #GENE_SYMBOL KIAA0665 ;Rab11-FIP3 #GO_F #GO_Acc 17137 #GO_Desc Rab interactor activity #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_F #GO_Acc 5509 #GO_Desc calcium ion binding #GO_F #GO_Acc 8017 #GO_Desc microtubule binding #GO_P #GO_Acc 6118 #GO_Desc electron transport #GO_P #GO_Acc 7017 #GO_Desc microtubule-based process #GO_P #GO_Acc 915 #GO_Desc cytokinesis, actomyosin ring formation #SEQLIST AK094817
Z44562_19_887 #GENE_SYMBOL KIAA0665 ;Rab11-FIP3 #GO_F #GO_Acc 17137 #GO_Desc Rab interactor activity #GO_F #GO_Acc 5489 #GO_Desc electron transporter activity #GO_F #GO_Acc 5509 #GO_Desc calcium ion binding #GO_F #GO_Acc 8017 #GO_Desc microtubule binding #GO_P #GO_Acc 6118 #GO_Desc electron transport #GO_P #GO_Acc 7017 #GO_Desc microtubule-based process #GO_P #GO_Acc 915 #GO_Desc cytokinesis, actomyosin ring formation #SEQLIST AK094817
Z46114_21_670 #GO_F #GO_Acc 3700 #GO_Desc transcription factor activity #GO_P #GO_Acc 6355 #GO_Desc regulation of transcription, DNA-dependent #TAA GEN #SEQLIST AF007129 AB011129 BC014054 Z46114_21_672 #GO_F #GO_Acc 3700 #GO_Desc transcription factor activity #GO_P #GO_Acc 6355 #GO_Desc regulation of transcription, DNA-dependent #TAA GEN #SEQLIST BC033151 AF007129 AB011129 BC014054
Z46114_21_737 #GO_F #GO_Acc 3700 #GO_Desc transcription factor activity #GO_P #GO_Acc 6355 #GO_Desc regulation of transcription, DNA-dependent #TAA GEN #SEQLIST AB011129
EXAMPLE 9 Description of the data on CD-ROM1 and CD-ROM1 content
The attached CD-ROM1 contains 4 files as follows:
1. "Ann_for_all" containing annotations for all putative editing sites listed in the "flan_for_all" file.
2. "flan_for_all" containing the actual sequences of all putative editing sites. The total number of editing sites in this file is 31,888. Prediction reliability is 82%.
3. "flan_clean" containing sequences of the more reliable editing sites. The total number of editing sites in this file is 12,200. Prediction reliability is 97%.
4. "Ann_clean" containing annotations for the more reliable editing sites listed in "flan_clean".
Examples of the data in the files:
"Ann_clean" and "Ann_for_all" :
AA058640_3_2272 #GENE_SYMBOL KCNE4 #GO_F #GO_Acc 5249 #GO_Desc voltage-gated potassium channel activity #GO_P #GO_Acc 6813 #GO_Desc potassium ion transport #TS lung
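The annotation records shown above follow a simple tagged layout: a site identifier followed by "#"-prefixed fields. As an illustration only, a record of this kind might be split into its fields with a short Python sketch such as the following. The function name parse_annotation and the output dictionary keys are hypothetical and are not part of the files on the CD-ROM; the sketch assumes that gene symbols are separated by ";" and that each #GO_Acc/#GO_Desc pair belongs to the most recent #GO_F or #GO_P tag.

```python
import re

def parse_annotation(record: str) -> dict:
    """Split one annotation record into its site ID and tagged fields.

    Assumed layout (inferred from the example records only):
    <site_id> #TAG value #TAG value ...
    """
    site_id, _, rest = record.strip().partition(" ")
    parsed = {"id": site_id, "gene_symbols": [], "go_terms": [], "other": {}}
    category = None   # most recent GO category tag (GO_F or GO_P)
    accession = None  # most recent GO accession number
    # Each match is a tag name plus the text up to the next '#'.
    for tag, value in re.findall(r"#(\w+)\s*([^#]*)", rest):
        value = value.strip()
        if tag == "GENE_SYMBOL":
            parsed["gene_symbols"] = [s.strip() for s in value.split(";") if s.strip()]
        elif tag in ("GO_F", "GO_P"):
            category = tag
        elif tag == "GO_Acc":
            accession = value
        elif tag == "GO_Desc":
            parsed["go_terms"].append((category, accession, value))
        elif tag == "SEQLIST":
            parsed["other"][tag] = value.split()
        else:
            # Any other tag (for example TS or TAA) is kept as plain text.
            parsed["other"][tag] = value
    return parsed

example = ("AA058640_3_2272 #GENE_SYMBOL KCNE4 #GO_F #GO_Acc 5249 "
           "#GO_Desc voltage-gated potassium channel activity #GO_P "
           "#GO_Acc 6813 #GO_Desc potassium ion transport #TS lung")
result = parse_annotation(example)
```

Tags that the sketch does not recognise are stored unchanged, so no field of a record is dropped.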
"flan_clean" and "flan_for_all"
>AA001587_2_1320 AH 27847 CA GTTCATATGAAATCAAAAAAG TCAG TAGCCAAAGCMCCCTGAGCAAAAAAGAAAAA GCTGG GCATCACACTACTTAACTTCAAAACATTaCAGGGCTATGGTAACCCAAAGAGCATAGT ATTGATATAAACAGACACACAGACC TGGMCAGMTAGAAAACCCAGAAATAAATACATATAT TTACCA
Example 10
Finding RNA Editing in Coding Regions
This Example relates to locating RNA editing sites that affect proteins and are therefore located in coding regions. To locate such editing sites, this Example describes the use of conservation data between human and mouse. The previously described method for identifying editing sites, while possible to use, was found mainly to locate editing sites in non-coding and non-conserved regions (mainly Alu-type repeats). The current method uses LEADS (the previously described sequence discovery engine and database) to find all potential mismatches between RNA and DNA, and maps them to the human genome and, separately, to the mouse genome (see the results in Appendices 2 and 3). Flanking regions of 200 bp around each mismatch (100 bp on each side of the mismatch) were then obtained and aligned between the human and mouse sequences. The method looked for aligned sequences in which the same type of mismatch occurs in conserved regions at the same location in both the human and mouse sequences. Such A->G mismatches and C->T mismatches are suspected to be RNA editing sites. In addition, the method preferably includes locating all potential loops that are conserved between human and mouse sequences, and searching for editing sites in these regions with the EST data. A list of putative A->G editing sites detected according to this method appears in Appendix 2, while a corresponding list of putative C->T sites appears in Appendix 3.
One example of a validated RNA editing site that was predicted according to the present invention is as follows, for the BLCAP gene: the DNA sequence contains only "A", but in the RNA both "A" and "G" are seen, which is the hallmark of editing. This is the first case of a non-ion-channel protein that undergoes editing in its coding sequence. At the protein level, the editing results in a Y->C substitution. The sequence change is shown in the illustration in Figure 12.
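The human-mouse comparison described in this Example may be illustrated by the following Python sketch. It is a simplified sketch under stated assumptions and not the LEADS implementation: the two flanking windows are assumed to be already aligned without gaps and of equal length, so that the mismatch lies at the same position in both. The thresholds (alignment longer than 50 nt, identity above 85%) are those given in the Methods of Example 11. All function names are hypothetical.

```python
def flank(genome: str, pos: int, half: int = 100) -> str:
    """Return the window of up to `half` bases on each side of 0-based `pos`."""
    return genome[max(0, pos - half): pos + half + 1]

def identity(a: str, b: str) -> float:
    """Fraction of identical positions in two equal-length, ungapped sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

# Mismatch types treated as candidate editing events in this Example,
# written as (genomic base, expressed base).
CANDIDATE_TYPES = {("A", "G"), ("C", "T")}

def is_conserved_candidate(human_flank: str, mouse_flank: str,
                           human_mismatch: tuple, mouse_mismatch: tuple,
                           min_len: int = 50, min_identity: float = 0.85) -> bool:
    """True if both species show the same candidate mismatch type in a
    sufficiently long and sufficiently conserved window."""
    if human_mismatch != mouse_mismatch or human_mismatch not in CANDIDATE_TYPES:
        return False
    if len(human_flank) != len(mouse_flank):
        return False  # this sketch does not handle gapped alignments
    return (len(human_flank) > min_len
            and identity(human_flank, mouse_flank) > min_identity)
```

A gapped alignment, as would be produced by a real alignment tool, would require mapping the mismatch position through the alignment instead of relying on equal-length windows.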
The method of the present invention was optionally and preferably performed as follows:
1. Marking bad sequences.
2. Marking regions with higher sequencing error probability.
3. Marking clone/library-related sequences.
4. Finding locations which might be RNA editing sites.
5. Finding a good probability model for these sites.
The method of the present invention preferably has the following detailed stages.
1. Marking "bad" sequences involves removing sequences which are defective and/or could otherwise be problematic or create noise in the method of the present invention, for example sequences with an excessively high error rate in a node. These sequences might be simply "bad", or wrongly clustered, and are preferably discarded from the rest of the analysis. This stage also discards the RNAs from which RefSeqs are derived. This is done to remove duplicates, since such a sequence appears both as an RNA and as a RefSeq (reference sequence, derived from a project by NCBI; see www.ncbi.nlm.nih.gov/RefSeq/ for an example). The relevant parameters are:
seq_bnd = the maximal allowed ratio of mistakes or mismatches (see above) for a sequence (0.1).
small_node_seq_bnd = seq_bnd for small nodes.
small_node_size = the maximal size for small nodes.
2. Marking "polluted" regions, that is, regions where the sequencing error probability seems higher for some reason. Three possible reasons are used in this optional embodiment of the present invention:
A. "Dirty windows" - regions of the multiple alignment with too many columns (sequence positions) with disagreements. These regions might indicate assembly or clustering problems. The parameters are:
win_size = the size of the windows to check.
win_jump = the jump between windows.
win_bnd = the maximal allowed ratio of "bad" columns within a window.
B. Consecutive errors - a region in a sequence with several consecutive disagreements with the consensus sequence. The parameter is:
polluted_seq = the number of consecutive such letters (3).
C. Repetitions - the letter after a sufficiently long one-letter repetition. The parameter is:
polluted_rep = the size of the repetition.
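The three "polluted region" criteria of stage 2 may be sketched in Python as follows. This is an illustrative reading of the text only, not the actual implementation. Only polluted_seq = 3 is given above, so the window and repetition values are left to the caller. Criterion C is interpreted here as marking every position that is immediately preceded by at least polluted_rep identical letters, which is an assumption.

```python
from typing import List, Set

def dirty_windows(bad_columns: List[bool], win_size: int,
                  win_jump: int, win_bnd: float) -> Set[int]:
    """Criterion A: mark all columns of every window in which the ratio of
    columns with disagreements exceeds win_bnd."""
    polluted: Set[int] = set()
    last_start = max(len(bad_columns) - win_size, 0)
    for start in range(0, last_start + 1, win_jump):
        window = bad_columns[start:start + win_size]
        if window and sum(window) / len(window) > win_bnd:
            polluted.update(range(start, start + len(window)))
    return polluted

def consecutive_errors(mismatch: List[bool], polluted_seq: int = 3) -> Set[int]:
    """Criterion B: mark runs of at least polluted_seq consecutive
    disagreements with the consensus within one sequence."""
    polluted: Set[int] = set()
    run_start = None
    for i, bad in enumerate(mismatch + [False]):  # sentinel closes a trailing run
        if bad and run_start is None:
            run_start = i
        elif not bad and run_start is not None:
            if i - run_start >= polluted_seq:
                polluted.update(range(run_start, i))
            run_start = None
    return polluted

def after_repetition(seq: str, polluted_rep: int) -> Set[int]:
    """Criterion C (assumed reading): mark each position that directly follows
    a one-letter run of length polluted_rep or more."""
    polluted: Set[int] = set()
    run = 1  # length of the one-letter run ending at the previous position
    for i in range(1, len(seq)):
        if run >= polluted_rep:
            polluted.add(i)
        run = run + 1 if seq[i] == seq[i - 1] else 1
    return polluted
```

Positions returned by any of the three functions would then be given the "polluted" error probabilities in stage 4 rather than the "clean" ones.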
3. Marking "grouping" of sequences within columns - for each column with a disagreement, sequences of the same clone or library are checked and marked as a clone (or library, with the optional distinction between pooled and one-person libraries) with agreement or disagreement, depending on whether the sequences have the same nucleotide at the specific position. This stage depends on checking different libraries.
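The grouping of stage 3 may be illustrated by the following minimal Python sketch, which assumes that each sequence in a column is described by a (sequence ID, clone ID, nucleotide) triple. The function name and the input layout are hypothetical. The same grouping could be applied to library identifiers instead of clone identifiers.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

def clone_agreement(
    column: Iterable[Tuple[str, Optional[str], str]]
) -> Dict[str, str]:
    """For one alignment column, label every clone that contributes two or
    more sequences as 'agreement' (all its sequences show the same
    nucleotide) or 'disagreement' (they differ)."""
    nucleotides = defaultdict(set)
    counts = defaultdict(int)
    for _seq_id, clone_id, nt in column:
        if clone_id is None:
            continue  # sequence with no known clone of origin
        nucleotides[clone_id].add(nt)
        counts[clone_id] += 1
    return {
        clone: ("agreement" if len(nts) == 1 else "disagreement")
        for clone, nts in nucleotides.items()
        if counts[clone] >= 2
    }

# Hypothetical column: clone c1 agrees, clone c2 disagrees.
labels = clone_agreement([
    ("s1", "c1", "G"), ("s2", "c1", "G"),
    ("s3", "c2", "A"), ("s4", "c2", "G"),
    ("s5", None, "G"), ("s6", "c3", "A"),
])
```

Clones represented by a single sequence carry no agreement information and are therefore omitted from the result.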
4. Finding tentative RNA editing sites - calculating the probabilities of columns with disagreements, given a model with no editing site, and extracting the columns for which the probability is below a given bound. This stage uses the no-editing model as a null hypothesis, with a threshold for determining acceptance of an RNA editing site. The parameters optionally include:
An array of probabilities for sequencing errors = the entries are indexed by the type of sequence (dna/refseq/rna/est), the quality of the specific position (clean/dirty as given by the cleaning stage, or polluted according to stage 2) and mismatch/indel (clean mismatch probabilities: dna 2E-7, refseq 2E-6, rna 5E-4, est 3E-4; polluted probabilities: 8E-5, 8E-4, 5E-3, 5E-2).
Multiplicative probability factors for the groupings = setting a new sequencing-error probability for the different conditions. For example, for clone disagreement this factor should be much larger than 1, indicating that this is probably a sequencing error (clone_agreement_factor = 500).
col_err_bnd/(1 + depth_factor * the column size) = the probability bound for a column (0.004).
5. Finding a good alternative model for the column - looking for a model with an editing site which gives a sufficiently high probability (higher than col_err_bnd) for the tentative editing site. If no such model is found, the editing site is discarded. The parameters are:
snp_prob = the a priori probability of an SNP (1E-4).
allele_prob = the probability of an additional allele (above 2).
Example 11
Experimental Verification of ADAR1 targets
Using comparative genomics and expressed sequence analysis, four additional human substrates were identified and experimentally verified: FLNA, BLCAP, CYFIP2 and IGFBP7, more than the sum total previously reported. Editing of three of these substrates was also verified in mouse, and two substrates were validated in chicken as well (see Appendix 5; relevant nucleotides are marked in bold (G) or underlined (C)). None of these substrates is an ion channel, but two of them have functions in the nervous system. The editing pattern observed suggests that some of the novel proteins identified here may be involved in different physiological processes and could solve the mystery of the missing ADAR1 targets.
The method of the present invention is designed to find genomic sites at which the expressed nucleotide diverges from the genomic one. Such occurrences could be interpreted as either SNPs or editing, and it is therefore not surprising to find that all of the editing sites reported here are erroneously recorded as SNPs in dbSNP (dbSNP IDs: BLCAP - rs11557677; FLNA - rs3179473; CYFIP2 - rs3207362; IGFBP7 - rs1133243 and rs11555284). None of these presumed SNPs has evidence for genomic polymorphism, and all were included in dbSNP based on expressed data alone. It follows that the possibility of editing rather than genomic polymorphism should be taken into account for all dbSNP records based solely on expressed data.
To experimentally validate the predicted editing sites, matching DNA and RNA samples retrieved from the same specimen were sequenced, for up to six tissues of human and mouse. Additionally, brain and liver cDNA and genomic DNA were sequenced for the chicken FLNA and BLCAP genes. Editing events in all predicted substrates were verified in human and mouse, except for the mouse IGFBP7 gene, which was not amplified successfully. In addition, editing of BLCAP and FLNA in chicken tissues was verified. PCR products were either cloned, followed by sequencing of individual clones (mouse and chicken), or sequenced as a population without cloning (human). When the PCR products were cloned, the occurrence of editing was detected by comparing the sequences of several clones with the genomic sequence. When PCR products were directly sequenced, the occurrence of editing was determined by the presence of an unambiguous trace of guanosine in positions for which the genomic DNA clearly indicated the presence of an adenosine. The full sequencing data are given in Figure 7 below; additional data are also provided.
The full-length BLCAP (bladder cancer associated protein) cDNA contains a complete open reading frame (ORF) encoding a protein composed of 87 amino acids.
Comparison of mouse and human BLCAP genomic loci revealed an intronless organization of the coding region in both species as well as a highly conserved structure, with 91% and 100% identity at the DNA (coding region) and protein levels, respectively. The function of this differentially expressed protein is not yet known, but it is expressed mainly in brain tissues and B cells (12) and appears to be down-regulated during bladder cancer progression (13). An editing site within the BLCAP coding sequence, located at chr20:36,833,001 (here and in what follows, the UCSC coordinates on the July 2003 build of the human genome are used), was identified, inducing a Y->C substitution at the 2nd amino acid of the final protein. There is a highly conserved region within the intron, about 500 bp upstream of the editing site, which potentially pairs with the editing region to form an almost perfect, 48 bp long dsRNA hairpin structure (Figure 8). Notably, the present experimental results show evidence for an additional U->C editing site at chr20:36,832,971, resulting in an L->P substitution (data not shown).
The FLNA (filamin A alpha) protein is a 280-kD (2647 a.a.) protein that crosslinks actin filaments into orthogonal networks in the cortical cytoplasm (14) and participates in the anchoring of membrane proteins to the actin cytoskeleton (15). The resulting remodelling of the cytoskeleton is central to the modulation of cell shape and cell migration. One editing site within the FLNA transcript (chrX:152,047,854) was identified, resulting in a Q->R substitution at amino acid 2341 in the human and mouse proteins and 2283 in the chicken homologue. The human editing region is predicted to form a 32 bp long dsRNA structure with a conserved region within the intron ~200 bp downstream of the editing site. The edited amino acid lies within the 22nd rod-like region of the protein, which has been shown to be important for interaction with integrin beta (16).
The same region binds to Rac1 (17), which is also known to interact with CYFIP2 (18).
The CYFIP2 (cytoplasmic FMR1 interacting protein 2) transcript encodes a protein of 1253 amino acids. CYFIP2 is a member of a highly conserved protein family found in both invertebrates and vertebrates. Human CYFIP2 shares approximately 99% sequence identity with its mouse ortholog (19). It is expressed mainly in brain tissues, immune-system cells and kidney (12). One editing site within the CYFIP2 transcript (chr5:156,717,703) was identified, resulting in a K->E substitution at amino acid 320 in both the human and mouse proteins. Editing was also observed at the corresponding predicted position in the chicken cDNA. Although a strong editing signal was observed for human cerebellum cDNA, only a residual signal was observed in human lung, prostate and uterus tissues. This pattern is in agreement with the results in mouse and chicken: all eight mouse brain clones and four out of nine chicken brain clones were edited, while none of the eight chicken liver clones were edited. CYFIP2 is a p53-inducible protein (20), and thus possibly a pro-apoptotic gene.
Interestingly, ADAR1 knockout mice show elevated apoptosis in most tissues, thus possibly providing a link between the phenotype of these mice and a potential pro-apoptotic editing target (10). No obvious dsRNA structure in the CYFIP2 pre-mRNA including the editing region could be identified, except for a weak, local pairing.
The IGFBP7 (insulin-like growth factor binding protein 7) transcript encodes a protein of 282 amino acids in length, and is expressed in a wide range of tissues (12). IGFBP7 is a member of a family of soluble proteins that bind insulin-like growth factors (IGFs) with high affinity. Their principal functions are to regulate IGF availability in body fluids and tissues and to modulate IGF binding to its receptors (21). Two editing sites within the IGFBP7 transcript (chr4:57,891,828 and chr4:57,891,776) were identified, resulting in R->G and K->R substitutions at amino acids 78 and 95, respectively (although the genomic region of IGFBP7 was not amplified, the editing signal can be seen in the RNA sequences, as described in greater detail below). The editing region seemingly pairs with a region within the coding sequence, 200 bp upstream, to form a 140 bp long dsRNA structure. In addition, the editing site overlaps with an intron of an antisense transcript, BC039519, pairing with which could also trigger editing by ADARs (22).
Notably, the editing site in the FLNA transcript is located two nucleotides upstream of a splicing site, resembling the R/G editing site of the glutamate receptor. In addition, seven of the eight nucleotides around the editing site are identical in the two substrates. This might suggest that FLNA, like the glutamate receptor, can be edited by ADAR2. The proximity of the editing site in the glutamate receptor to the splicing site has led to speculations on a possible link between editing and splicing.
Indeed, GluR-B mRNA molecules in ADAR2 null mice exhibit almost no editing at the Q/R site, accompanied by inefficient removal of the adjacent intron 11 (8). Interestingly, analysis of the available EST data suggests a positive correlation between editing of the last codon in the exon of FLNA and aberrant retention of the following intron, again suggesting a link between editing and splicing.
Editing typically happens in only a fraction of the sequences. Since the coverage of expressed sequences is sparse for many genes, editing sites might be missed by the method of the present invention. For example, the method did not detect editing of the serotonin receptor, which is supported by only one sequence, or editing of KCNA1, which is not supported by any sequence. In addition, the search parameters used here were rather strict, resulting in a small but accurate set of candidates. Therefore, more editing sites may be located in the future, as a result of improvements to the algorithm, more liberal parameters, and the continuous growth of the public EST databases.
The three human proteins affected by ADAR editing found so far are all ion channels. In addition, ADAR2 knockout mice, as well as adr knockout flies, show behavioural phenotypes (25). Therefore it was hypothesized that A-to-I RNA editing has a pivotal role in nervous system functions (23). Notably, while none of the four novel substrates presented here encodes an ion channel, at least two of them have functions in the CNS. CYFIP2 interacts with the Fragile-X mental retardation protein (19), as well as with the FMRP-related proteins FXR1P and FXR2P, and is present in synaptosomal extracts (19). The Drosophila homologue has also been shown to be required for normal axonal growth and synapse formation (18, 24). In addition, our experimental results suggest that the editing of CYFIP2 is brain specific. Most notably, FLNA binds a plethora of transmembrane receptors and ion channels (15).
Mutations in FLNA are associated with periventricular nodular heterotopia, a disorder of neuronal migration characterized by nodules of heterotopic neurons abutting the lateral cerebral ventricles (25). The neurological features of this condition range from an asymptomatic state to severe drug-resistant epilepsy, which is reminiscent of the phenotype of ADAR2 knockouts (8). However, while CYFIP2 seems to be edited mainly in the brain, editing of FLNA, as well as of BLCAP and IGFBP7, is observed in a broad range of tissues, in accordance with the expression spectrum of ADAR1. Thus, while this work provides additional support for the importance of RNA editing for CNS functions, some of the novel targets identified here may be involved in different physiological processes and could solve the mystery of the missing ADAR1 targets (26).
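Each of the amino-acid substitutions reported above (Y->C, Q->R, K->E, R->G and K->R) is consistent with a single A->G change within a codon, since inosine is read as guanosine during translation and by reverse transcriptase. The following Python sketch, which is illustrative only and not part of the described method, enumerates the codons of the standard genetic code in which one A->G change produces a given substitution. The function names are hypothetical, and the actual codons at the reported sites are not taken from the text.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def a_to_g_outcomes(codon: str):
    """List (position, edited codon, original residue, edited residue) for
    every single A->G change in the codon."""
    codon = codon.upper().replace("U", "T")
    outcomes = []
    for i, base in enumerate(codon):
        if base == "A":
            edited = codon[:i] + "G" + codon[i + 1:]
            outcomes.append((i, edited, CODON_TABLE[codon], CODON_TABLE[edited]))
    return outcomes

def codons_for(src: str, dst: str):
    """Codon changes in which one A->G edit turns residue src into dst."""
    return sorted(
        f"{codon}->{edited}"
        for codon in CODON_TABLE
        for _pos, edited, before, after in a_to_g_outcomes(codon)
        if before == src and after == dst
    )

tyrosine_to_cysteine = codons_for("Y", "C")
```

For example, codons_for("Y", "C") lists the two tyrosine codons (TAT and TAC) whose middle adenosine, when read as guanosine, yields a cysteine codon.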
Methods
Multiple Alignment
Human ESTs and cDNAs were obtained from dbEST (NCBI GenBank version 139; www.ncbi.nlm.nih.gov/dbEST). The genomic sequences were taken from the human genome build 34 (www.ncbi.nlm.nih.gov/genome/guide/human). Details of our MA model can be found in Sorek et al(7).
More details and parameters of the algorithm
In order to identify conserved editing sites, we first produced exhaustive lists of potential editing sites for mouse and human. Such potential sites are found by aligning expressed sequences (ESTs and RNAs) against their respective genomes, and finding positions where the expressed nucleotide differs from the genomic one. This process has to be done with some care, as most of these mismatches are due to sequencing errors or problems in the alignment, inclusion of which would result in an enormous list of useless candidates. We used the following algorithm to identify "true" events of disagreement between the genome and the expressed sequences, which could be either single-nucleotide polymorphisms (SNPs) or editing sites. The algorithm is based on a probabilistic model for the various sources of mismatches. It first aligns all available expressed sequences to the genome and clusters them into genes. For each gene, it looks for columns in the multiple alignment (MA) matrix that include mismatches, and estimates the probability of the observed nucleotide distribution being caused by either sequencing and alignment errors, or SNPs and RNA editing. If the probability of the nucleotide distribution being a result of sequencing or alignment errors does not exceed the cut-off but the probability of an SNP/editing event does, the genomic position is marked as a true event (i.e., it is assumed to be an SNP or an editing site). We mask sequences of low alignment quality (>10% mismatches), genomic regions where the MA is of low quality (mismatches in >20% of columns), and all single-letter repetitions and consecutive mismatches of length 3 or more.
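The decision rule for a single MA column can be sketched in Python as follows. The per-sequence error probabilities, the SNP prior of 10^-4 and the cut-off of 10^-6 divided by the number of supporting sequences are the values stated in this Methods section. Everything else is an assumption made only for illustration: the function names, the treatment of mismatching sequences as independent errors, the binomial form of the SNP/editing likelihood with an allele fraction of 0.5, and the reading of "supporting sequences" as the sequences that carry the mismatch. This is not the MA model of Sorek et al.

```python
from math import comb, prod

# Per-sequence error probabilities stated in the Methods, by region quality and sequence type.
ERROR_PROB = {
    "clean": {"RefSeq": 2e-6, "RNA": 5e-4, "EST": 3e-4},
    "polluted": {"RefSeq": 8e-4, "RNA": 5e-3, "EST": 5e-2},
}
SNP_PRIOR = 1e-4    # prior probability of an SNP (from the text)
BASE_CUTOFF = 1e-6  # divided by the number of supporting sequences (from the text)


def p_all_errors(mismatch_types, region):
    """Probability that every mismatching sequence is an error.

    Assumption: errors in different sequences are independent.
    """
    return prod(ERROR_PROB[region][t] for t in mismatch_types)


def p_snp_or_editing(n_mismatch, n_total, allele_fraction=0.5):
    """SNP prior times a binomial likelihood of seeing n_mismatch of n_total.

    Assumption: the binomial form and the 0.5 allele fraction are illustrative.
    """
    likelihood = (
        comb(n_total, n_mismatch)
        * allele_fraction ** n_mismatch
        * (1 - allele_fraction) ** (n_total - n_mismatch)
    )
    return SNP_PRIOR * likelihood


def is_true_event(mismatch_types, n_total, region="clean"):
    """Mark a column when errors are implausible but an SNP/editing event is plausible."""
    k = len(mismatch_types)
    if k == 0:
        return False
    cutoff = BASE_CUTOFF / k  # assumption: "supporting" means mismatch-carrying
    return (
        p_all_errors(mismatch_types, region) <= cutoff
        and p_snp_or_editing(k, n_total) > cutoff
    )
```

Under these assumptions, three mismatching ESTs out of six in a clean region are marked as an event, whereas a single mismatching EST is not, because a lone sequencing error remains too likely.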
The probability of a sequencing or alignment error at a certain position is estimated based on the type of the sequence (RefSeq, RNA or EST) and the quality of the MA at the genomic region (error probabilities: clean regions - RefSeq: 2E-6; RNA: 5E-4; EST: 3E-4; polluted regions - RefSeq: 8E-4; RNA: 5E-3; EST: 5E-2). The probability cut-off against which the different model probabilities are compared is 10^-6 divided by the number of supporting sequences. The prior probability of an SNP is 10^-4. Applying this algorithm to the human and mouse transcriptomes resulted in two lists of putative SNPs/editing events. Subsequently, the sites found in the human genome were aligned against those found in the mouse genome, retaining only alignments longer than 50 nt with identity levels higher than 85% and with nucleotide mismatches occurring at identical positions within the two sequences. Genomic sites that are duplicated in either genome were eliminated, and only non-synonymous events in the coding sequence were retained.
Experimental protocols
For human sequences, total RNA and genomic DNA (gDNA) isolated simultaneously from the same tissue sample were purchased from Biochain Institute (Hayward, CA). In this work we used samples of liver, prostate, uterus, kidney, lung normal and tumor, brain tumor (glioma), cerebellum and frontal lobe. The total RNA underwent oligo-dT primed reverse transcription using Superscript II
(Invitrogen, Carlsbad, CA) according to manufacturer instructions. The cDNA and gDNA (at 0.1 μg/μl) were used as templates for PCR reactions. High sequencing quality was desired and thus rather short genomic sequences (roughly 200 nt) were amplified. Regions were chosen for validation only if the fragment to be amplified maps to the genome at a single site. PCR reactions were done using Abgene ReddyMix™ kit (Takara Bio, Shiga, Japan) using the primers and annealing conditions as detailed in the following. The PCR products were run on 2% agarose gels, and only if a single clear band of the correct approximate size was obtained was it excised and sent to Hy-labs laboratories (Rehovot, Israel) for purification and direct sequencing without cloning. For mouse and chicken sequences, poly-A RNA was isolated from brain and liver samples using Trifast (PeqLab, Germany) and poly-A selected using magnetic oligo dT beads (Dynal, Germany). 1 μg of poly-A RNA was reverse transcribed using random hexamers as primers and RNAseH deficient M-MLV reverse transcriptase (Promega, Madison, WI). Genomic DNA from the same samples was isolated according to Ausubel et al(2). First strand cDNAs or corresponding genomic regions were amplified with suitable primers using Pfu polymerase, to minimize mutation rates during amplification. Amplified fragments were A-tailed using Taq polymerase, gel purified and cloned into pGem-T easy (Promega, Madison, WI). After transformation in E. coli, individual plasmids were sequenced and aligned using ClustalW. Sequencher 4.2 Suite (Gene Codes Corporation) was used for multiple alignment of the electropherograms. Typically, the extent of A-to-I editing is variable: the level of the guanosine trace is sometimes only a fraction of the adenosine trace, while on some occasions the conversion from A to I is almost complete. For each gene tested, we sequenced the three tissues in which the expression was the highest.
The RT-PCR and gDNA-PCR of one of these tissues were sequenced from both ends to ensure the consistency of the resulting electropherograms.
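The variable extent of editing seen in the sequencing traces can be summarised as the share of the guanosine signal at the edited position. The following is a minimal sketch, assuming the adenosine and guanosine peak heights at that position have already been read from the electropherogram. The function name and the G/(A+G) estimator are illustrative assumptions and are not part of the protocol described above.

```python
def editing_level(a_peak: float, g_peak: float) -> float:
    """Estimate the edited fraction at one position as G / (A + G).

    a_peak and g_peak are peak heights (arbitrary units) of the adenosine
    and guanosine traces at the candidate site in the cDNA read.
    """
    if a_peak < 0 or g_peak < 0:
        raise ValueError("peak heights must be non-negative")
    total = a_peak + g_peak
    if total == 0:
        raise ValueError("no A or G signal at this position")
    return g_peak / total
```

A value near 0 corresponds to an essentially unedited site and a value near 1 to almost complete conversion; the matching gDNA read is expected to show adenosine only.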
Primers
The following primer pairs were used for amplification of cDNAs and genomic DNA fragments:
Human primers
BLCAP (RNA & DNA)
F: 7AATTGTGC7AAGGCTTCCGTT
R: TCCCATTAGGTCGGTTCCTG
CYFIP2 - DNA
F: TCAGCATCTCACGAGCTGTGT
R: GAGACATTACGGCAGGCACTC
CYFIP2 - RNA
F: TCTACCTAATGGATGGAAATGTCAGTAAC
R: ATCCCGGATCTGAACCATCTG
FLNA - DNA
F: GACCTGAGACACGAGAAAAACTCC
R: CGGTCTTACACTCTTTCCCTGC
FLNA - RNA
F: GATCTCTTTTGAGGACCGCAAGG
R: TGGTCAATTTCTGTGACATAGCACTCC
IGFBP7 - RNA
F: GAGGGCGAGCCGTGC
R: TATTCTCCAGCATCTTCCTTACTTAGAG
Mouse primers
BLCAP - DNA
F: CTGTTTGTTGTGTTGACTTTTC
R: GAGTGGCTGAACCACAGAGCG
BLCAP - RNA
F: GGCGTCGCCCGCCTGGGC
R: GAGTGGCTGAACCACAGAGCG
CYFIP2 - DNA
F: GCGAAGGCAGCCACCCCAAC
R: GACTTGTTCTCTTCATAGTGAGC
CYFIP2 - RNA
F: CCAAGAAGAGAATCAACCTTAGC
R: GACTTGTTCTCTTCATAGTGAGC
FLNA - DNA
F: GGTGACGCCCGCCGCCTTAC
R: GCCCAGGGCCAAGACCTG
FLNA - RNA
F: GGTGACGCCCGCCGCCTTAC
R: AAGATGCTGGCTGGTTGACC
Chicken primers
CYFIP2 (DNA & RNA)
F: TCGGCGATATGCAGATAGAAC
R: GGGACACACACAGAAGCCAAG
FLNA - DNA
F: TCTGATGATGCTCGCAGGC
R: GGTCTCAGAGAACAAGGACG
FLNA - RNA
F: GCCCTTTGCCCCGTTCAG
R: GGTCTCAGAGAACAAGGACG
Example 12 Mis-identified SNPs
Consistent variation between a number of sequences and the genome has been interpreted as a sign of an SNP. It has been pointed out recently that RNA editing, the site-selective modification of specific nucleotides within the pre-mRNA, is a widespread phenomenon in the mammalian transcriptome. Accordingly, many of the above-described disagreements between expressed and genomic sequences could be attributed to editing events rather than genomic polymorphisms. The distribution of expressed SNPs in dbSNP was analyzed and was found to support this hypothesis, suggesting that hundreds of dbSNP records are actually editing sites. The recently published list of adenosine to inosine (A-to-I) editing sites was then searched; it was found to contain 102 sites erroneously annotated as SNPs. Five of these were experimentally validated by sequencing matching DNA and RNA samples. Editing sites show up when a sequence is aligned with the genome: while the DNA reads A, sequencing identifies the inosine in the edited site as guanosine (G). All five cases have turned out to be editing sites. The largest repository of SNPs is dbSNP, in which virtually all known SNPs are deposited. Most of the SNPs in dbSNP were found in the course of sequencing the human genome, by algorithmic search for single nucleotide differences between aligned sequence reads of genomic sequence. This approach has been successful in identifying common SNPs, namely those with a frequency of greater than 1% in a diverse panel of individuals representative of different populations. Moreover, this approach has concentrated on developing a dense map, with uniform coverage across the existing draft of the human genome(1). However, many other SNPs come from other origins, and are of varying accuracy. Sources of erroneous SNP identifications include sequencing errors, mutations and duplications.
A recent confirmation study has reported that a large fraction (> 40%) of SNPs in these databases were not seen, meaning that they are either of very low frequency, mis-mapped, or not polymorphic at all(4). In addition, SNPs were identified using expressed data: by aligning millions of available expressed sequence tags (ESTs), one can search clusters of ESTs for possible SNPs(5-7). Although this method has yielded only tens of thousands of SNPs, not a significant number compared to the millions of SNPs in dbSNP, its importance is due to the fact that the resulting SNPs have an increased likelihood of residing in a coding region or untranslated region of a gene. SNPs in these regions, or generally in regulatory (rSNP) and expressed regions (cSNP), are considered much more important than those in non-functional regions (i.e., most of the SNPs), which are considered to have a low probability of contributing to phenotype.
Large scale EST searches for SNPs were utilized in other organisms as well, such as rat, and Arabidopsis thaliana(9). This method is the most efficient method for identification of SNPs in organisms which do not have a sequenced genome(10), and has been applied to many organisms, e.g. the Bombyx mori silkworm(11). The abundance of RNA editing sites, and the fact that the EST signature of an SNP is virtually the same as the EST signature of an editing site, raise the question of whether some of the SNPs predicted by EST data are actually RNA editing sites. In the following we describe our search for editing sites that were erroneously deposited in dbSNP as SNPs. We find over a hundred such sites, and claim that the actual number is much higher.
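The shared EST signature discussed above, in which the genome reads A and the expressed sequence reads G, can be illustrated with a short Python sketch. It assumes the two sequences have already been aligned without gaps and are of equal length; the function name and the strand argument are hypothetical and introduced only for this example. The positions it returns are candidates only, since an A/G SNP produces exactly the same signature.

```python
def candidate_editing_positions(genomic: str, expressed: str, strand: str = "+"):
    """Return 0-based positions where the genome reads A and the expressed read G.

    For a transcript on the minus strand, the same event appears as a
    genomic T read as C when both sequences are given on the plus strand.
    Assumes a gapless, pre-computed alignment of equal-length strings.
    """
    if len(genomic) != len(expressed):
        raise ValueError("sequences must be aligned to the same length")
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    ref, alt = ("A", "G") if strand == "+" else ("T", "C")
    return [
        i
        for i, (g, e) in enumerate(zip(genomic.upper(), expressed.upper()))
        if g == ref and e == alt
    ]
```

For example, `candidate_editing_positions("ACGATA", "ACGGTG")` reports positions 3 and 5, the two places where a genomic A is read as G.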
Description of Experiments
dbSNP consists of a total of 6,134,414 non-redundant documented RefSNP clusters. Most of these were validated by comparing DNA of different individuals, but for 30,879 clusters the only evidence of polymorphism is mismatches between DNA and expressed data (expressed SNPs). 5,672,327 of the SNPs (92.5%) are simple single nucleotide substitutions, including virtually all expressed SNPs (30,774; 99.7%). The mismatch between DNA and RNA observed for the expressed SNPs may be not a result of an SNP but rather a signature of RNA editing. In particular, sequences undergoing A-to-I RNA editing will read G instead of the genomic A, and this could be erroneously interpreted as an A/G SNP. While the expressed SNPs are only a small fraction (0.5%) of the total number of SNPs, they are a significant fraction (15%) of SNPs in coding sequences, including 17% of the non-synonymous SNPs. Thus, curation of this subset of SNPs is of great importance. In order to test the possibility that editing sites are mistakenly reported as SNPs, over-representation of A/G expressed SNPs was checked within Alu repetitive elements, in which A-to-I RNA editing is enhanced. Figure 9 shows the distribution of the different types of simple substitution SNPs. A/G SNPs account for 33% of all single substitution SNPs, and for 35% of single substitution SNPs within Alu repeats. In contrast, A/G expressed SNPs are highly over-represented in Alu repeats: whereas only 27% of all expressed single-substitution SNPs are of type A/G, 70% of those which reside within an Alu repeat are A/G SNPs. Since the annotation of the SNPs does not distinguish between strands, it might be necessary to look at the statistics of A/G and C/T SNPs combined. These types of SNPs account for 66% of all single substitution SNPs, and for 69% of single substitution SNPs within Alu repeats.
In contrast, A/G and C/T expressed SNPs are highly over-represented in Alu repeats: whereas only 59% of all expressed single-substitution SNPs are of type A/G or C/T, 86% of those which reside within an Alu repeat are SNPs of these types. The above results strongly suggest that many of the expressed SNPs are actually not SNPs at all but rather attest to RNA editing. It is possible to distinguish between an editing site and an SNP according to one or more factors: (a) A-to-I editing occurs in dsRNA regions; (b) A-to-I editing occurs mainly within Alu repeats; (c) editing sites tend to cluster, and to show a combinatorial nature: different sequences will be edited in different subsets of the cluster. Such combinatorial behavior is not expected for SNPs, since the short distance between the sites does not allow for many recombinations. The above characteristics were used in a recently published algorithm to search for RNA editing (Levanon et al. 2004). Each site in the resulting set of putative editing sites (predicted accuracy > 95%; experimental validation of a random subset shows accuracy ~90%) was aligned against the database of expressed SNPs using the BLAST algorithm, retaining only alignments longer than 32 nt with identity levels higher than 95%. 562 expressed SNPs that were mapped on predicted A-to-I editing sites were found, a list of which is given below. However, since most of these SNPs are located within Alu elements, only 102 of these SNPs have an unambiguous mapping onto the genome in dbSNP. The list of these 102 SNPs is given in Table 10. For each dbSNP record, the RefSeq sequence onto which the SNP is mapped (if any) and the location within the RefSeq sequence are given. In addition, it is indicated whether the SNP resides within an Alu repeat. 56 out of 102 SNPs are mapped onto a RefSeq sequence, 37 of which (66%) are mapped to the UTR of the RefSeq, and the remaining 19 (34%) are located within introns of the RefSeq sequence.
None of the 102 SNPs are mapped onto RefSeq coding sequences. 96 out of the 102 SNPs in the table (94%) are located within Alu repeats. In order to validate these results, four transcripts that contain SNPs were selected from the list of 102 candidates. These transcripts are relatively easy to sequence, having a long, unique flanking region outside the Alu in the same exon. PCR products of matching DNA and RNA samples were sequenced in a number of tissues as described in the methods section below. The occurrence of editing was determined by the presence of an unambiguous trace of guanosine in positions for which the genomic DNA clearly indicated the presence of an adenosine (figure 10 and figure 11).
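The over-representation argument and the Table 10 percentages quoted in this section can be re-derived with simple arithmetic. The sketch below uses only percentages and counts stated in the text; the fold-enrichment ratio (share within Alu repeats divided by share overall) is an illustrative summary statistic chosen for this example and is not used in the analysis itself.

```python
def fold_enrichment(pct_within_alu: float, pct_overall: float) -> float:
    """Ratio of the share of a substitution type inside Alu repeats to its overall share."""
    return pct_within_alu / pct_overall


# Percentages quoted in the text (share of single-substitution SNPs).
enrichment = {
    "A/G, all SNPs": fold_enrichment(35, 33),
    "A/G, expressed SNPs": fold_enrichment(70, 27),
    "A/G + C/T, all SNPs": fold_enrichment(69, 66),
    "A/G + C/T, expressed SNPs": fold_enrichment(86, 59),
}

# Counts quoted in the text for the 102 unambiguously mapped SNPs.
mapped_to_refseq, in_utr, in_intron, in_alu, total = 56, 37, 19, 96, 102
table10_pct = {
    "UTR share of RefSeq-mapped": round(100 * in_utr / mapped_to_refseq),
    "intron share of RefSeq-mapped": round(100 * in_intron / mapped_to_refseq),
    "within Alu": round(100 * in_alu / total),
}

if __name__ == "__main__":
    for label, value in enrichment.items():
        print(f"{label}: {value:.2f}-fold")
    for label, value in table10_pct.items():
        print(f"{label}: {value}%")
```

The ratio stays close to 1 for SNPs in general but rises to roughly 2.6 for expressed A/G SNPs, which is the contrast the text relies on.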
Methods
Build 119 (January 2004) of dbSNP (http://www.ncbi.nlm.nih.gov/SNP/) was used for sequence information. Experimental protocol: Total RNA and genomic DNA (gDNA) were isolated simultaneously from the same tissue sample using TriZol reagent (Invitrogen, Carlsbad, CA). We used samples of liver, prostate, uterus, kidney, lung normal and tumor, brain tumor (glioma), cerebellum and frontal lobe. The total RNA underwent oligo-dT primed reverse transcription using M-MLV Reverse Transcriptase (Invitrogen, Carlsbad, CA) according to manufacturer instructions and also as described above. The cDNA and gDNA (at 20 ng) were used as templates for PCR reactions. We aimed at high sequencing quality and thus amplified rather short genomic sequences (roughly 200 nt). Regions were chosen for validation only if the fragment to be amplified maps to the genome at a single site. PCR reactions were done using Abgene ReddyMix™ kit (Takara Bio, Shiga, Japan) using the primers and annealing conditions as detailed in the following. PCR fragments were purified from agarose gel using the QIAquick Gel Extraction Kit (QIAGEN), followed by sequencing using an ABI Prism 3100 Genetic Analyzer (Applied Biosystems).
Results
All sites tested have been shown to be true editing sites and not SNPs. One of the amplified transcripts included more than one SNP in our list, and thus 7 out of the predicted 102 were validated (dbSNP id numbers: rs1136573, rs3170195, rs3180172, rs3207022, rs3180175, rs3192564, rs1057026). In addition, these experiments have yielded one more false SNP, not present in the current list: rs3207020. The results for two of these transcripts are presented in figures 3.
Table 10 List of562 dbSNP records which coincide with putative editing sites: rsl043567, rsl043568, rsl043571, rsl046652, rsl046940, rsl047405, rsl048112, rsl048113, rsl048114, rsl054846, rsl055462, rsl056257, rsl056622, rsl057026, rsl057448, rsl058916, rsl062354, rsl065104, rsl065105, rsl071787, rsl071788, rsl071789, rsl071790, rsl071797, rsl071813, rsl071822, rsl071825, rsl071830, rsl071834, rsl 127505, rsl 127860, rsl 127866, rsl 1l27867, rsl 127877, rsl 127980, rsl 128029, rsl 128032, rsl 128033, rsl 128034, rsl1l28240, rsl 128241, rsl 128242, rsl 128489, rsl129048, rsl 129243, rsl 129501, rsl1l29502, rsl129702, rsl 129792, rsl 129805, rsl129806, rsl 129821, rsl 130931, rsl 1l31019, rsll31133, rsll31673, rsll31733, rsll31734, rrssll1 13311990077,, rrssll 1 13322006633,, rrssll 1l32067, rsl132381, rsl132382, rsl 132405, rsl 132416, rsl 132418, rsl 132419, rsl 1l32420, rsl132421, rsl132422, rsl 132424, rsl 132641, rsl 133075, rsl 133168, rsl1l33212, rsl 133213, rsl133265, rsl133266, rsl133516, rsl 133519, rsl 133521, rsl 1l33543, rsl133757, rsl133857, rsl 133872, rsl 133912, rsl 133913, rsl 133994, rsl1l34203, rsl 134254, rsl134255, rsl 134257, rsl134290, rrssll1 13344332255,, rrssll 1 13344332266,, rrssll1l34327, rsl134410, rsl134412, rsl 134415, rsl134416, rsl 134473, rsl 134475, rsl 1l34661, rsl134853, rsl134854, rsl134855, rsl135039, rsl 135049, rsl 135131, rsl 1l35138, rsl 135180, rsl 135182, rsl 135183, rsl135184, rsl 135185, rsl 135215, rsl 1l35260, rsl 135261, rsl135262, rsl 135271, rsl 135272, rsl 135307, rsl 135308, rsl 1l35398, rsl 135524, rsl135525, rsl 135917, rsl 136245, rrssll1 13366228888,, rrssll1 13366228899,, rrssll1l36307, rsl136368, rsl136389, rsl 136572, rsl 136573, rsl 136574, rsl 136781, rsl 1l36852, rsl136981, rsl136988, rsl 136989, rsl 136991, rsl 137089, rsl 137121, rsl 1l37122, rsl137123, rsl137148, rsl 137153, rsl 137217, rsl" 137219, rsl 137272, rsl1l37289, rsl137294, rsl137341, rsl137342, rsl137343, rsl? 
137364, rsl 137366, rsl 1l37367, rsl 137369, rsl 137374, rsl137376, rsl137388, rrssll1? 13377444444,, rrssll 1 13377445522,, rrssll 1l37501, rsl 137505, rsl137506, rsl137507, rsl137508, rsl" 137509, rsl 137540, rsl 1l37541, rsl 137542, rsl 137603, rsl137681, rsl137742, rsl 137743, rsl 137744, rsl1l37752, rsl137896, rsl137922, rsl137923, rsl137924, rsl ] 137925, rsl 137963, rsl1l37964, rsl138052, rsl138063, rsl138070, rsl138077, rsl ] 138089, rsl 1 138092, rsl 1l38098, rsll38107, rsll38153, rsll38179, rsll38180, rrssll1] 13388220066,, rrssll1" 13388220099,, rrssll1138213, rsl138217, rsl 138224, rsl138232, rsl138235, rsl] 138236, rsl " 138237, rsl 1l38247, rsl 138248, rsl 138251, rsl138301, rsl 138352, rsl] 138353, rsl" 138362, rsl 1l38363, rsl138398, rsl138400, rsl138552, rsl138558, rsl] 138564, rsl " 138591, rsl1l38696, rsl138782, rsl138884, rsl138980, rsl138982, rsl 139015, rsl 139234, rsl 139342, rsl 139343, rsl 139360, rsl 139366, rsl 139888, rsl 139889, rsl 140177, rsl 140524, rsl 140576, rsl 140579, rsl 140580, rsl 140967, rsl 141133, rsl 141172, rsl 141174, rsl 141183, rsl 141188, rsl 141189, rsl 141337, rsl 141483, rsl 141495, rsl 141496, rsl 141569, rsl 141633, rsl 141637, rsl 141660, rsl 141851, rsl 141953, rsl 141955, rsl 141956, rsl 141959, rsl 142000, rsl 142016, rsl 142043, rsl 142084, rsl 142190, rsl 142237, rsl 142243, rsl 142286, rsl 142369, rsl 142371, rsl 142374, rsll42375, rsl 142436, rsl 142454, rsl 142481, rsl 142688, rsl 142689, rsl 142690, rsl 142786, rsl 142961, rsl 142963, rsl 142964, rsl 142965, rsll42972, rsll42981, rsll43039, rsll43108, rsll43113, rsll43134, rsll43144, rsl 143166, rsl 143352, rsl 143419, rsl 143429, rsl 143431, rsl 143502, rsl 143514, rsl 143516, rsl2766, rsl5663, rs3168729, rs3170195, rs3177099, rs3177217, rs3177271, rs3177272, rs3177496, rs3177527, rs3177576, rs3177577, rs3177640, rs3178026, rs3178027, rs3178028, rs3178324, rs3178325, rs3178432, rs3178641, rs3178642, rs3178643, rs3178765, rs3178766, rs3178767, 
rs3178829, rs3178830, rs3178831, rs3178832, rs3178842, rs3178860, ιs3178861, rs3179093, rs3179094, rs3179111, rs3179252, rs3179256, rs3179304, rs3179319, rs3179320, rs3179321, rs3179331, rs3179332, rs3179392, rs3179444, rs3179463, rs3179464, rs3179490, rs3179491, rs3179492, rs3179500, rs3179511, rs3179513, rs3179532, rs3179540, rs3179546, rs3179600, rs3179601, rs3179669, rs3179670, rs3179746, rs3179853, rs3179929, rs3179986, rs3179997, rs3180007, rs3180008, rs3180009, rs3180105, rs3180172, rs3180173, rs3180174, rs3180175, rs3180210, rs3180319, rs3180343, rs3180344, rs3180785, rs3180917, rs3180918, rs3180919, rs3180920, rs3180929, rs3180985, rs3181003, rs3183391, rs3184536, rs3184960, rs3184962, rs3184963, rs3185509, rs3187117, rs3187438, rs3187439, rs3187924, rs3188088, rs3188089, rs3190706, rs3190707, rs3190709, rs3190712, rs3191288, rs3191290, rs3191291, rs3191790, rs3191888, rs3191889, rs3192195, ιs3192551, rs3192555, rs3192558, rs3192563, rs3192564, rs3192568, rs3193134, rs3193157, rs3193158, rs3193161, rs3193574, rs3194513, rs3194748, rs3194749, rs3194901, rs3194902, rs3195344, rs3195365, rs3195390, rs3195738, rs3195946, rs3195947, rs3195948, rs3195978, rs3196171, rs3196473, rs3196474, rs3196476, rs3196528, rs3196529, rs3196530, rs3196602, rs3196603, rs3196605, rs3196606, rs3197213, rs3197215, rs3197216, rs3197505, rs3197641, rs3197705, rs3197766, rs3197854, rs3197855, rs3199569, rs3199570, K3199591, rs3199660, rs3199929, rs3199930, rs3200347, rs3200563, rs3200564, rs3200627, rs3200630, rs3200638, rs3200641, rs3200935, rs3200962, rs3201007, rs3201050, rs3201053, rs3201095, rs3201096, rs3201097, rs3201140, rs3201149, rs3201168, rs3201230, rs3201364, rs3201486, rs3201546, rs3201595, rs3201596, rs3201666, rs3201696, rs3201697, rs3201738, rs3201830, rs3201843, rs3201853, rs3201855, rs3201947, rs3201973, rs3201976, rs3201977, rs3201985, rs3202152, rs3202209, rs3202212, rs3202622, rs3203047, rs3203048, rs3203065, rs3203956, rs3204240, rs3204265, rs3204546, 
rs3204580, rs3204583, rs3204584, rs3205133, rs3205134, rs3205152, rs3205153, rs3205267, rs3205462, rs3205463, rs3205600, rs3205700, rs3205823, rs3205839, rs3205958, rs3206016, rs3206017, rs3206054, rs3206176, rs3206225, rs3206226, rs3206227, rs3206228, rs3206229, rs3206288, rs3206290, rs3206764, rs3206774, rs3206775, rs3207022, rs3207024, rs3207025, rs3207094, rs3207098, rs3207101, rs3207102, rs3207263, rs3207264, rs3207265, rs3207266, rs3207293, rs3207377, rs3207686, rs3207949, rs3208030, rs3208031, rs3208113, rs3208934, rs3208935, rs3209041, rs3209042, rs3209049, rs3209089, rs3209090, rs3209123, rs3209361, rs3209362, rs3210738, rs3211289
Table 10
dbSNP acc. number   RefSeq acc. number   Location on RefSeq sequence   Within Alu?
rs1046652   -   -   Yes
rs1046940   NM_014797   mrna-utr   Yes
rs1047405   -   -   No
rs1048112   -   -   Yes
rs1048113   -   -   Yes
rs1048114   -   -   Yes
rs1054846   -   -   Yes
rs1055462   -   -   Yes
rs1056257   -   -   Yes
rs1057026   NM_004401   mrna-utr   Yes
rs1057448   NM_002156   intron   Yes
rs1058916   -   -   Yes
rs1062354   -   -   Yes
rs1127505   -   -   Yes
rs1127866   NM_016946   mrna-utr   Yes
rs1128032   NM_032285   mrna-utr   Yes
rs1128033   NM_032285   mrna-utr   Yes
rs1129048   -   -   Yes
rs1130931   NM_003298   mrna-utr   Yes
rs1132416   -   -   Yes
rs1132418   -   -   Yes
rs1132419   -   -   Yes
rs1132420   -   -   Yes
rs1133168   NM_012434   mrna-utr   Yes
rs1133521   -   -   No
rs1133912   NM_006749   mrna-utr   Yes
rs1133913   NM_006749   mrna-utr   Yes
rs1134410   -   -   Yes
rs1134412   -   -   Yes
rs1134473   NM_014887   intron   Yes
rs1134475   NM_014887   intron   Yes
rs1135260   NM_005897   mrna-utr   Yes
rs1135261   NM_005897   mrna-utr   Yes
rs1135262   NM_005897   mrna-utr   Yes
rs1135307   NM_016040   intron   Yes
rs1135308   NM_016040   intron   Yes
rs1135398   NM_002156   intron   Yes
rs1135917   -   -   Yes
rs1136245   NM_000551   mrna-utr   Yes
rs1136307   NM_003661   mrna-utr   Yes
rs1136573   NM_013234   intron   Yes
rs1137089   -   -   Yes
rs1137121   NM_017664   intron   Yes
rs1137123   NM_017664   intron   Yes
rs1137153   -   -   Yes
rs1137217   NM_016628   intron   Yes
rs1137294   NM_015092   intron   Yes
rs1137366   NM_019593   intron   Yes
rs1137374   -   -   No
rs1137501   NM_014396   mrna-utr   Yes
rs1137540   NM_080670   mrna-utr   Yes
rs1137541   NM_080670   mrna-utr   Yes
rs1137542   NM_080670   mrna-utr   Yes
rs1137603   NM_032188   intron   Yes
rs1137752   NM_006085   mrna-utr   Yes
rs1137922   -   -   Yes
rs1137923   -   -   Yes
rs1138107   -   -   Yes
rs1138206   NM_032285   mrna-utr   Yes
rs1138209   NM_001230   mrna-utr   Yes
rs1138213   NM_001230   mrna-utr   Yes
rs1138247   NM_001230   mrna-utr   Yes
rs1138352   NM_014678   mrna-utr   Yes
rs1138353   NM_014678   mrna-utr   Yes
rs1138362   NM_007375   intron   Yes
rs1138363   NM_007375   intron   Yes
rs1138782   NM_033661   mrna-utr   Yes
rs1138884   -   -   Yes
rs1139343   -   -   No
rs1141637   -   -   Yes
rs1141660   -   -   Yes
rs1142000   NM_032188   intron   Yes
rs1142190   NM_001230   mrna-utr   Yes
rs1142369   NM_000367   mrna-utr   Yes
rs1142981   -   -   Yes
rs12766   -   -   No
rs3168729   NM_014887   intron   Yes
rs3170195   -   -   Yes
rs3178829   NM_016237   intron   Yes
rs3178830   NM_016237   intron   Yes
rs3179532   NM_032285   mrna-utr   Yes
rs3180172   -   -   Yes
rs3180173   -   -   Yes
rs3180175   -   -   Yes
rs3180343   NM_014033   mrna-utr   Yes
rs3180985   NM_014033   mrna-utr   Yes
rs3181003   NM_024715   mrna-utr   Yes
rs3190706   NM_018294   mrna-utr   Yes
rs3190707   NM_018294   mrna-utr   Yes
rs3192564   -   -   Yes
rs3195738   NM_006749   mrna-utr   Yes
rs3195946   -   -   Yes
rs3195948   -   -   Yes
rs3201546   NM_006085   mrna-utr   Yes
rs3206225   -   -   Yes
rs3207022   -   -   Yes
rs3207266   -   -   Yes
rs3207377   -   -   Yes
rs3207949   NM_005759   mrna-utr   Yes
rs3208934   -   -   Yes
rs3209123   -   -   No
rs3210738   -   -   Yes
Discussion
The published dataset of RNA editing sites consists of 12,723 sites(22). This is a conservative estimate, obtained using a strict set of parameters. As previously described, there are a number of indications that there are actually many more sites. Accordingly, the number of erroneously assigned EST-based SNPs is probably much higher than the 121 examples that are described herein. In addition, there are other types of RNA editing in the human transcriptome, such as the C-to-U editing of apoB transcripts by APOBEC-1 (apolipoprotein B mRNA editing catalytic polypeptide 1). The total number of substrates for this enzyme family is probably much larger than the one known target, since other members of the family, such as AID (Activation-induced Cytidine Deaminase), have as yet unknown targets(24,25). The possibility that editing events of these types are also recorded as EST-based SNPs should be taken into account. This possibility can be exploited by using dbSNP as a starting point in the search for new editing targets. On the other hand, for careful genotyping analyses, one might want to ignore all SNPs of expressed origin, to be on the safe side (or at least remove all A/G and C/T SNPs). A less drastic solution would be to use the known properties of editing sites (e.g., they tend to cluster, and to appear in dsRNAs and in Alu repeats), and remove only those EST-based SNPs that satisfy these properties.
Reference list
Polson, A. G., Crain, P. F., Pomerantz, S. C., McCloskey, J. A. & Bass, B. L. The mechanism of adenosine to inosine conversion by the double-stranded RNA unwinding/modifying activity: a high-performance liquid chromatography-mass spectrometry analysis. Biochemistry 30, 11507-14 (1991).
Tonkin, L. A. et al. RNA editing by ADARs is important for normal behavior in Caenorhabditis elegans. Embo J 21, 6025-35 (2002).
Palladino, M. J., Keegan, L. P., O'Connell, M. A. & Reenan, R. A. A-to-I pre-mRNA editing in Drosophila is primarily involved in adult nervous system function and integrity. Cell 102, 437-49 (2000).
Wang, Q., Khillan, J., Gadue, P. & Nishikura, K. Requirement of the RNA editing deaminase ADAR1 gene for embryonic erythropoiesis. Science 290, 1765-8 (2000).
Higuchi, M. et al. Point mutation in an AMPA receptor gene rescues lethality in mice deficient in the RNA-editing enzyme ADAR2. Nature 406, 78-81 (2000).
Morse, D. P., Aruscavage, P. J. & Bass, B. L. RNA hairpins in noncoding regions of human brain and Caenorhabditis elegans mRNA are edited by adenosine deaminases that act on RNA. Proc Natl Acad Sci USA 99, 7906-11 (2002).
Paul, M. S. & Bass, B. L. Inosine exists in mRNA at tissue-specific levels and is most abundant in brain mRNA. Embo J 17, 1120-7 (1998).
Kim, U., Wang, Y., Sanford, T., Zeng, Y. & Nishikura, K. Molecular cloning of cDNA for double-stranded RNA adenosine deaminase, a candidate enzyme for nuclear RNA editing. Proc Natl Acad Sci USA 91, 11457-61 (1994).
Brusa, R. et al. Early-onset epilepsy and postnatal lethality associated with an editing-deficient GluR-B allele in mice. Science 270, 1677-80 (1995).
Gurevich, I. et al. Altered editing of serotonin 2C receptor pre-mRNA in the prefrontal cortex of depressed suicide victims. Neuron 34, 349-56 (2002).
Maas, S., Patt, S., Schrey, M. & Rich, A. Underediting of glutamate receptor GluR-B mRNA in malignant gliomas. Proc Natl Acad Sci USA 98, 14687-92 (2001).
Bass, B. L. RNA editing by adenosine deaminases that act on RNA. Annu Rev Biochem 71, 817-46 (2002).
Morse, D. P. & Bass, B. L. Long RNA hairpins that contain inosine are present in Caenorhabditis elegans poly(A)+ RNA. Proc Natl Acad Sci USA 96, 6048-53 (1999).
Hoopengardner, B., Bhalla, T., Staber, C. & Reenan, R. Nervous system targets of RNA editing identified by comparative genomics. Science 301, 832-6 (2003).
Seeburg, P. H. A-to-I editing: new and old sites, functions and speculations. Neuron 35, 17-20 (2002).
Boguski, M. S., Lowe, T. M. & Tolstoshev, C. M. dbEST-database for "expressed sequence tags". Nat Genet 4, 332-3 (1993).
Hillier, L. D. et al. Generation and analysis of 280,000 human expressed sequence tags. Genome Res 6, 807-28 (1996).
Shoshan, A. et al. in Proceedings of SPIE: Microarrays: Optical technologies and informatics (eds. Bittner, M. L., Chen, Y., Dorsel, A. N. & R., D. E.) 86-95 (SPIE, 2001).
Maas, S. et al. Structural requirements for RNA editing in glutamate receptor pre-mRNAs by recombinant double-stranded RNA adenosine deaminase. J Biol Chem 271, 12221-6 (1996).
Polson, A. G. & Bass, B. L. Preferential selection of adenosines for modification by double-stranded RNA adenosine deaminase. Embo J 13, 5701-11 (1994).
Lehmann, K. A. & Bass, B. L. Double-stranded RNA adenosine deaminases ADAR1 and ADAR2 have overlapping specificities. Biochemistry 39, 12875-84 (2000).
Higuchi, M. et al. RNA editing of AMPA receptor subunit GluR-B: a base-paired intron-exon structure determines position and efficiency. Cell 75, 1361-70 (1993).
Rueter, S. M., Dawson, T. R. & Emeson, R. B. Regulation of alternative splicing by RNA editing. Nature 399, 75-80 (1999).
Wong, S. K., Sato, S. & Lazinski, D. W. Substrate recognition by ADAR1 and ADAR2. Rna 7, 846-58 (2001).
Lei, M., Liu, Y. & Samuel, C. E. Adenovirus VAI RNA antagonizes the RNA-editing activity of the ADAR adenosine deaminase. Virology 245, 188-96 (1998).
Tonkin, L. A. & Bass, B. L. Mutations in RNAi rescue aberrant chemotaxis of ADAR mutants. Science 302, 1725 (2003).
Jiang, R. et al. Genome-wide evaluation of the public SNP databases. Pharmacogenomics 4, 779-89 (2003).
Antonarakis, S. E., Krawczak, M. & Cooper, D. C. in The Genetic Basis of Human Cancer (eds. Vogelstein, B. & Kinzler, K. W.) 7-41 (McGraw-Hill, New York, 2002).
A. G. Polson, P. F. Crain, S. C. Pomerantz, J. A. McCloskey, B. L. Bass, Biochemistry 30, 11507 (1991).
D. D. Kim et al, Genome Res 14, 1719 (2004).
E. Y. Levanon et al, Nat Biotechnol 22, 1001 (2004).
D. P. Morse, P. J. Aruscavage, B. L. Bass, Proc Natl Acad Sci USA 99, 7906 (2002).
M. Higuchi et al, Cell 75, 1361 (1993).
C. M. Burns et al, Nature 387, 303 (1997).
B. Hoopengardner, T. Bhalla, C. Staber, R. Reenan, Science 301, 832 (2003).
M. Higuchi et al, Nature 406, 78 (2000).
J. C. Hartner et al, J Biol Chem (2003).
Q. Wang et al, J Biol Chem 279, 4952 (2004).
M. S. Boguski, T. M. Lowe, C. M. Tolstoshev, Nat Genet 4, 332 (1993).
A. I. Su et al, Proc Natl Acad Sci USA 101, 6062 (2004).
I. Gromova, P. Gromov, J. E. Celis, Int J Cancer 98, 539 (2002).
J. H. Hartwig, T. P. Stossel, J Biol Chem 250, 5696 (1975).
A. van der Flier, A. Sonnenberg, Biochim Biophys Acta 1538, 99 (2001).
M. A. Travis et al, FEBS Lett 569, 185 (2004).
Y. Ohta, N. Suzuki, S. Nakamura, J. H. Hartwig, T. P. Stossel, PNAS 96, 2122 (1999).
A. Schenck et al, Neuron 38, 887 (2003).
A. Schenck, B. Bardoni, A. Moro, C. Bagni, J. L. Mandel, Proc Natl Acad Sci USA 98, 8844 (2001).
E. Saller et al, Embo J 18, 4424 (1999).
Y. Oh et al, J Biol Chem 271, 30322 (1996).
N. T. Peters, J. A. Rohrbach, B. A. Zalewski, C. M. Byrkett, J. C. Vaughn, Rna 9, 698 (2003).
M. J. Palladino, L. P. Keegan, M. A. O'Connell, R. A. Reenan, Cell 102, 437 (2000).
S. Bogdan, O. Grewe, M. Strunk, A. Mertens, C. Klambt, Development 131, 3981 (2004).
J. W. Fox et al, Neuron 21, 1315 (1998).
J. C. Hartner et al, J Biol Chem 279, 4894 (2004).
M. Zuker, Nucleic Acids Res 31, 3406 (2003).
Taylor, J. G., Choi, E. H., Foster, C. B. & Chanock, S. J. Using genetic variation to study human disease. Trends Mol Med 7, 507-12 (2001).
Saunders, A. M. et al. Association of apolipoprotein E allele epsilon 4 with late-onset familial and sporadic Alzheimer's disease. Neurology 43, 1467-72 (1993).
Altshuler, D. et al. The common PPARgamma Pro12Ala polymorphism is associated with decreased risk of type 2 diabetes. Nat Genet 26, 76-80 (2000).
Jiang, R. et al. Genome-wide evaluation of the public SNP databases. Pharmacogenomics 4, 779-89 (2003).
Buetow, K. H., Edmonson, M. N. & Cassidy, A. B. Reliable identification of large numbers of candidate SNPs from public EST data. Nat Genet 21, 323-5 (1999).
Picoult-Newberg, L. et al. Mining SNPs from EST databases. Genome Res 9, 167-74 (1999).
Irizarry, K. et al. Genome-wide analysis of single-nucleotide polymorphisms in human expressed sequences. Nat Genet 26, 233-6 (2000).
Guryev, V., Berezikov, E., Malik, R., Plasterk, R. H. & Cuppen, E. Single nucleotide polymorphisms associated with rat expressed sequences. Genome Res 14, 1438-43 (2004).
Schmid, K. J. et al. Large-scale identification and analysis of genome-wide single-nucleotide polymorphisms for mapping in Arabidopsis thaliana. Genome Res 13, 1250-7 (2003).
Chevreux, B. et al. Using the miraEST assembler for reliable and automated mRNA transcript assembly and SNP detection in sequenced ESTs. Genome Res 14, 1147-59 (2004).
Cheng, T. C. et al. Mining single nucleotide polymorphisms from EST data of silkworm, Bombyx mori, inbred strain Dazao. Insect Biochem Mol Biol 34, 523-30 (2004).
Polson, A. G., Crain, P. F., Pomerantz, S. C., McCloskey, J. A. & Bass, B. L. The mechanism of adenosine to inosine conversion by the double-stranded RNA unwinding/modifying activity: a high-performance liquid chromatography-mass spectrometry analysis. Biochemistry 30, 11507-14 (1991).
Palladino, M. J., Keegan, L. P., O'Connell, M. A. & Reenan, R. A. A-to-I pre-mRNA editing in Drosophila is primarily involved in adult nervous system function and integrity. Cell 102, 437-49 (2000).
Wang, Q., Khillan, J., Gadue, P. & Nishikura, K. Requirement of the RNA editing deaminase ADAR1 gene for embryonic erythropoiesis. Science 290, 1765-8 (2000).
Higuchi, M. et al. Point mutation in an AMPA receptor gene rescues lethality in mice deficient in the RNA-editing enzyme ADAR2. Nature 406, 78-81 (2000).
Patterson, J. B. & Samuel, C. E. Expression and regulation by interferon of a double-stranded-RNA-specific adenosine deaminase from human cells: evidence for two forms of the deaminase. Mol Cell Biol 15, 5376-88 (1995).
Brusa, R. et al. Early-onset epilepsy and postnatal lethality associated with an editing-deficient GluR-B allele in mice. Science 270, 1677-80 (1995).
Gurevich, I. et al. Altered editing of serotonin 2C receptor pre-mRNA in the prefrontal cortex of depressed suicide victims. Neuron 34, 349-56 (2002).
Maas, S., Patt, S., Schrey, M. & Rich, A. Underediting of glutamate receptor GluR-B mRNA in malignant gliomas. Proc Natl Acad Sci USA 98, 14687-92 (2001).
Morse, D. P., Aruscavage, P. J. & Bass, B. L. RNA hairpins in noncoding regions of human brain and Caenorhabditis elegans mRNA are edited by adenosine deaminases that act on RNA. Proc Natl Acad Sci USA 99, 7906-11 (2002).
Bass, B. L. RNA editing by adenosine deaminases that act on RNA. Annu Rev
Biochem 71, 817-46 (2002). Levanon, E. Y. et al. Systematic identification of abundant A-to-I editing sites in the human franscriptome. Nat Biotechnol 22, 1001-5 (2004). Lander, E. S. et al. Initial sequencing and analysis of the human genome. Nature 409, 860-921 (2001). Muramatsu, M. et al. Specific expression of activation-induced cytidine deaminase (AID), a novel member of the RΝA-editing deaminase family in germinal center B cells. J Biol Chem 274, 18470-6 (1999). Begum, Ν. A. et al. Uracil DΝA glycosylase activity is dispensable for immunoglobulin class switch. Science 305, 1160-3 (2004). R. Sorek, G. Ast, D. Graur, Genome Res 12, 1060 (Jul, 2002). F. M. Ausubel et al, Current protocols in molecular biology (John Wiley & Sons, Inc, New York, 1987), pp. M. Zuker, Nucleic Acids Res 31, 3406 (Jul 1, 2003). CD-ROM content The following lists the file content of CD-ROMl and CD-ROM2, which is enclosed herewith and filed with the application. These files are incoφorated herein by reference and thus form a part of the filed application. File information is provided as: File name/bite size/date of creation/machine format/ operating system.
Files on CD-ROM1:
1. "Ann_for_all"/8192884 bytes/Mar 8 2004/Internet Explorer/PC
2. "flan_for_all"/7414476 bytes/Mar 8 2004/Internet Explorer/PC
3. "flan_clean"/2866695 bytes/Mar 8 2004/Internet Explorer/PC
4. "Ann_clean"/7367337 bytes/Mar 8 2004/Internet Explorer/PC
Files on CD-ROM2:
1. "Appendix1"/12288 bytes/March 13 2005/Internet Explorer/PC
2. "Appendix2"/36864 bytes/Mar 13 2005/Internet Explorer/PC
APPENDIX 1 - parameters for use with the method of Example 10
win_size=50;
snp_win_size=1;
seq_bnd=0.1;
small_node_size=15;
small_node_seq_bnd=0.25;
win_bnd=0.2;
win_max=3;
col_err_bnd=1e-6;
depth_factor=0.004;
snp_prob=0.0001;
allele_prob=0.005;
polluted_seq=3;
polluted_rep=5;
win_jump=25;
est_cln_err_prob=0.003;
est_err_prob=0.008;
est_pol_err_prob=0.05;
est_cln_indel_prob=0.008;
est_indel_prob=0.02;
est_pol_indel_prob=0.07;
rna_cln_err_prob=0.00008;
rna_err_prob=0.0005;
rna_pol_err_prob=0.005;
rna_cln_indel_prob=0.0001;
rna_indel_prob=0.001;
rna_pol_indel_prob=0.01;
ref_cln_err_prob=0.000002;
ref_err_prob=0.000005;
ref_pol_err_prob=0.0008;
ref_cln_indel_prob=0.000004;
ref_indel_prob=0.000008;
ref_pol_indel_prob=0.002;
dna_err_prob=0.0000002;
dna_pol_err_prob=0.00008;
dna_indel_prob=0.0000004;
dna_pol_indel_prob=0.0002;
clone_agreement_factor=1;
clone_disagreement_factor=500;
lib_agreement_factor=1;
lib_disagreement_factor=5;
pool_agreement_factor=1;
pool_disagreement_factor=5;
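The parameter list above is a flat sequence of name=value pairs separated by semicolons. As a minimal sketch of how such a list might be read into a lookup table, the following Python fragment parses a short excerpt of it. The function name parse_params and the use of Python are assumptions made for illustration only; the application does not state how the parameters are consumed.

```python
def parse_params(text):
    """Parse 'name=value;' pairs into a dict (hypothetical helper)."""
    params = {}
    for item in text.split(";"):
        item = item.strip()
        if not item:
            continue  # skip the empty piece after the final semicolon
        name, _, value = item.partition("=")
        name, value = name.strip(), value.strip()
        # Whole numbers stay integers; values such as 0.2 or 1e-6 become floats.
        try:
            params[name] = int(value)
        except ValueError:
            params[name] = float(value)
    return params


# Excerpt of the Appendix 1 list, used here only as sample input.
sample = "win_size=50; win_bnd=0.2; col_err_bnd=1e-6; clone_disagreement_factor=500;"
params = parse_params(sample)
print(params["win_size"], params["col_err_bnd"])
```

Keeping integers and floats distinct is a design choice of this sketch: window sizes and counts can then be used directly as loop bounds, while probabilities remain floating point.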
Appendix 2 - list of potential A->G: see file labeled "Appendix 2.txt" in CD-ROM2.
Appendix 3 - list of potential C->T: see file labeled "Appendix 3.txt" in CD-ROM2.
Appendix 4 - list of protein names and corresponding contig names from Appendices 2 and 3
Bold face type indicates the two validated examples.
AA132529 FLJ23841 hypothetical protein FLJ23841
AA306074 CCDC9 coiled-coil domain containing 9
AA327208 ELMO3 engulfment and cell motility 3 (ced-12 homolog, C. elegans)
AW352296 RPS2 ribosomal protein S2
D11535 RNU46 RNA, U46 small nuclear
D11535 RPS8 ribosomal protein S8
D11580 EIF2S1 eukaryotic translation initiation factor 2, subunit 1 alpha, 35kDa
D11781 RPS20 ribosomal protein S20
D11894 RPLP2 ribosomal protein, large P2
D11923 NONO non-POU domain containing, octamer-binding
D12080 MAPRE1 microtubule-associated protein, RP/EB family, member 1
D12228 DBH dopamine beta-hydroxylase (dopamine beta-monooxygenase)
F09155 ZSIG11 putative secreted protein ZSIG11
F10461 FBXL6 F-box and leucine-rich repeat protein 6
F10798 EFEMP2 EGF-containing fibulin-like extracellular matrix protein 2
H24858 SLC25A10 solute carrier family 25 (mitochondrial carrier; dicarboxylate transporter), member 10
HSACTA ACTA2 actin, alpha 2, smooth muscle, aorta
HSALDAR ALDOA aldolase A, fructose-bisphosphate
HSGLURC GRIA3 glutamate receptor, ionotrophic, AMPA 3
HSGRSFLIP GRIA2 glutamate receptor, ionotropic, AMPA 2
HSHNRNPL HNRPL heterogeneous nuclear ribonucleoprotein L
HSU10323 ILF2 interleukin enhancer binding factor 2, 45kDa
HUMA2MGRAP LRPAP1 low density lipoprotein receptor-related protein associated protein 1
HUMCYTOK KRT8 keratin 8
HUMGOGG GCNT1 glucosaminyl (N-acetyl) transferase 1, core 2 (beta-1,6-N-acetylglucosaminyltransferase)
HUMGRK5A GRK5 G protein-coupled receptor kinase 5
HUMHISH3B H3F3A H3 histone, family 3A
HUMMAC25X IGFBP7 insulin-like growth factor binding protein 7
HUMP13KIN PIK3R1 phosphoinositide-3-kinase, regulatory subunit, polypeptide 1 (p85 alpha)
HUMPEDF SERPINF1 serine (or cysteine) proteinase inhibitor, clade F (alpha-2 antiplasmin, pigment epithelium derived factor), member 1
HUMPPARP0 RPLP0 ribosomal protein, large, P0
HUMTEN TNC tenascin C (hexabrachion)
M62008 CYFIP2 cytoplasmic FMR1 interacting protein 2
M62188 GDI2 GDP dissociation inhibitor 2
M62202 GNAS GNAS complex locus
M62222 ATP1A1 ATPase, Na+/K+ transporting, alpha 1 polypeptide
M77973 RPS27A ribosomal protein S27a
M78082 UBC ubiquitin C
M78215 ATF4 activating transcription factor 4 (tax-responsive enhancer element B67)
M78245 PSME2 proteasome (prosome, macropain) activator subunit 2 (PA28 beta)
M78458 SCFD2 sec1 family domain containing 2
M78802 FY Duffy blood group
M78828 SLC20A2 solute carrier family 20 (phosphate transporter), member 2
M78847 CD2BP2 CD2 antigen (cytoplasmic tail) binding protein 2
M79195 BLCAP bladder cancer associated protein
M79263 RPS29 ribosomal protein S29
M85323 EIF4G2 eukaryotic translation initiation factor 4 gamma, 2
M85323 LOC144017 hypothetical protein LOC144017
M85347 IMPDH2 IMP (inosine monophosphate) dehydrogenase 2
N40611 BCL7A B-cell CLL/lymphoma 7A
R21435 PDLIM4 PDZ and LIM domain 4
R35837 PRKCL1 protein kinase C-like 1
R47298 ODF2 outer dense fiber of sperm tails 2
R49784 MAPBPIP mitogen-activated protein-binding protein-interacting protein
R67998 DHX15 DEAH (Asp-Glu-Ala-His) box polypeptide 15
S51033 MPG N-methylpurine-DNA glycosylase
T05030 CTNND1 catenin (cadherin-associated protein), delta 1
T05030 TMX2 thioredoxin-related transmembrane protein 2
T05149 H2AFY H2A histone family, member Y
T05196 FBLN1 fibulin 1
T05358 EIF4B eukaryotic translation initiation factor 4B
T05832 HNRPU heterogeneous nuclear ribonucleoprotein U (scaffold attachment factor A)
T08066 PFN2 profilin 2
T08109 OTUB1 OTU domain, ubiquitin aldehyde binding 1
T08798 PSMB4 proteasome (prosome, macropain) subunit, beta type, 4
T09075 SDBCAG84 serologically defined breast cancer antigen 84
T09311 C20orf116 chromosome 20 open reading frame 116
T09419 PRDX5 peroxiredoxin 5
T10707 NPAS2 neuronal PAS domain protein 2
T10707 RPL31 ribosomal protein L31
T10707 TBC1D8 TBC1 domain family, member 8 (with GRAM domain)
T11079 CPA2 carboxypeptidase A2 (pancreatic)
T12009 ACTB actin, beta
T12126 FLNA filamin A, alpha (actin binding protein 280)
T20007 CARHSP1 calcium regulated heat stable protein 1, 24kDa
T23728 CDC2L1 cell division cycle 2-like 1 (PITSLRE proteins)
T23728 CDC2L2 cell division cycle 2-like 2
T39409 RPL9 ribosomal protein L9
T40220 RNU69 RNA, U69 small nucleolar
T40220 RPL39 ribosomal protein L39
T41202 RNU65 RNA, U65 small nucleolar
T41202 RPL12 ribosomal protein L12
T47019 S100A6 S100 calcium binding protein A6 (calcyclin)
T47040 UBA52 ubiquitin A-52 residue ribosomal protein fusion product 1
T47670 LOC56965 hypothetical protein from EUROIMAGE 1977056
T50110 eIF3k eukaryotic translation initiation factor 3 subunit k
T50839 BAF53A BAF53
T54527 GLCCI1 glucocorticoid induced transcript 1
T55401 HSPC155 hypothetical protein HSPC155
T64817 U5-116KD U5 snRNP-specific protein, 116 kD
T71977 CIAO1 WD40 protein Ciaol
T72286 LOC134997 peptidylprolyl isomerase A processed pseudogene
T75212 DDIT3 DNA-damage-inducible transcript 3
T79353 FLJ33008 hypothetical protein FLJ33008
T96621 HLA-DQB2 major histocompatibility complex, class II, DQ beta 2
Z17354 TLN1 talin 1
Z17844 MVP major vault protein
Z19187 GPAA1 GPAA1P anchor attachment protein 1 homolog (yeast)
Z19371 ARHGDIA Rho GDP dissociation inhibitor (GDI) alpha
Z19398 XPO7 exportin 7
Z24803 TFIP11 tuftelin interacting protein 11
Z25267 EIF3S6 eukaryotic translation initiation factor 3, subunit 6 48kDa
Z25337 DHPS deoxyhypusine synthase
Z38708 DKFZP566K1924 DKFZP566K1924 protein
Z39558 ABCF2 ATP-binding cassette, sub-family F (GCN20), member 2
Z39644 PAFAH1B3 platelet-activating factor acetylhydrolase, isoform Ib, gamma subunit 29kDa
Z40194 HPS4 Hermansky-Pudlak syndrome 4
Z40243 MGC3234 hypothetical protein MGC3234
Z41538 THAP7 THAP domain containing 7
Z43318 MARK3 MAP/microtubule affinity-regulating kinase 3
Z44003 PPP4C protein phosphatase 4 (formerly X), catalytic subunit
Z44701 SMYD5 SMYD family member 5
Appendix 5 - Mouse and Chicken data
See Figure 17

Claims

WHAT IS CLAIMED IS:
1. A method of identifying an RNA editing substrate, the method comprising: identifying nucleic acid sequence exhibiting a base pair mismatch in a stem region thereof, said nucleic acid sequence being the RNA editing substrate.
2. The method of claim 1, wherein said stem region is identified by: detecting an exon capable of forming a double stranded region in said nucleic acid sequence, wherein said exon features an adenosine.
3. The method of claim 2, further comprising filtering said nucleic acid sequence to remove a section of repeated nucleotides before said identifying said nucleic acid sequence.
4. The method of claim 3, wherein said section comprises at least four repeated nucleotides.
5. The method of claim 2, further comprising filtering said nucleic acid sequence wherein at least a portion of said nucleic acid sequence is discarded if said portion features more than a threshold number of mismatches before said identifying said nucleic acid sequence.
6. The method of claim 5, wherein said portion comprises at least about 20 nucleotides and said threshold number comprises at least about three mismatches.
7. The method of claim 6, wherein if said portion features at least about two identical sequential mismatches, said portion is not discarded.
8. The method of claim 5, wherein said portion comprises at least about 50 nucleotides and said threshold number comprises at least about six mismatches.
9. The method of claim 8, wherein if said portion features at least about four identical sequential mismatches, said portion is not discarded.
10. The method of any of claims 1-9, wherein the RNA editing substrate is detected in a tissue comprising at least one of liver, lung, kidney, prostate, or uterine tissue.
11. The method of claim 10, further comprising: diagnosing a disease or pathological condition in a subject by detecting RNA editing in at least one of said tissues.
12. The method of claim 11, wherein said diagnosing is performed by determining whether RNA editing in a nucleotide sequence of said subject differs from a normal nucleotide sequence.
13. A kit for diagnosing a subject, comprising at least one component for detecting RNA editing according to claim 11.
14. The kit of claim 13, wherein said at least one component comprises an oligonucleotide.
15. The kit of claim 14, wherein said oligonucleotide hybridizes to said nucleotide sequence for detecting RNA editing.
16. The method of claim 14, wherein said oligonucleotide comprises a pair of oligonucleotides for amplifying at least a portion of said nucleotide sequence for detecting RNA editing.
17. Use of a polynucleotide having a nucleic acid sequence set forth in the files "flan_for_all" and "flan_clean" of the enclosed CD-ROM and fragments and homologs thereof for the diagnosis and/or treatment of the diseases listed herein and in the file "Ann_for_all" and "Ann_clean" of the enclosed CD-ROM.
18. A computer readable storage medium comprising data stored in a retrievable manner, said data including sequence information of RNA editing substrates as set forth in files "flan_for_all" and "flan_clean" of enclosed CD-ROM, and corresponding sequence annotations as set forth in the file "Ann_for_all" and "Ann_clean" of enclosed CD-ROM.
19. Use of any identified RNA editing site, as described herein or as derivable from the methods described herein, optionally according to any one of claims 1-10, for diagnostic assays, drug targets, expressed sequences suitable for therapeutic proteins, and gene therapy to fix aberrant and/or pathological RNA editing.
20. A diagnostic assay, comprising an assay for determining an RNA editing pattern in a sample taken from an individual, optionally according to any one of claims 1-10.
21. The assay of claim 20, performed on a multi-probe chip, said chip comprising a plurality of probes for detecting a presence or an absence of at least one RNA editing site in said sample, optionally according to any one of claims 1-10.
22. A diagnostic method for determining an RNA editing pattern in a sample taken from an individual, comprising: determining an RNA editing pattern in the sample to form a test pattern; and comparing said test pattern to a standard pattern, optionally according to any one of claims 1-10.
23. The method of claim 22, wherein said standard pattern is optionally related to disease or pathology, and/or to normalcy or "health".
24. The method of claim 22, further comprising: at least partially diagnosing the individual according to said comparison.
25. The method of claim 24, wherein said disease comprises cancer.
26. A method for detecting cancer in a subject or a disposition or tendency or susceptibility thereto, comprising analyzing RNA editing in the subject, optionally according to any one of claims 1-10.
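
The filtering recited in claims 3 to 9 can be illustrated with a short sketch. The Python fragment below is an assumption-laden illustration, not the applicants' implementation: the function names are hypothetical, the two sequences are assumed to be already aligned and of equal length, "identical sequential mismatches" is interpreted as successive mismatches of the same type (for example A to G followed by A to G), and a window is discarded when its mismatch count exceeds the threshold. The defaults follow claims 6 and 7 (a window of about 20 nucleotides, three mismatches, two identical sequential mismatches); claims 8 and 9 would correspond to a 50-nucleotide window with threshold=6 and run_exempt=4.

```python
import re


def repeat_spans(seq, min_run=4):
    """Return (start, end) spans of single-nucleotide runs of length >= min_run."""
    pattern = re.compile(r"(.)\1{%d,}" % (min_run - 1))
    return [m.span() for m in pattern.finditer(seq)]


def mismatches(ref, seq):
    """List (position, ref_base, seq_base) for each differing aligned position."""
    return [(i, r, s) for i, (r, s) in enumerate(zip(ref, seq)) if r != s]


def longest_identical_run(mm):
    """Length of the longest run of successive mismatches of the same type."""
    best = run = 0
    prev = None
    for _, r, s in mm:
        run = run + 1 if (r, s) == prev else 1
        prev = (r, s)
        best = max(best, run)
    return best


def keep_window(ref, seq, threshold=3, run_exempt=2):
    """Keep a window unless it has too many mismatches without an identical run."""
    mm = mismatches(ref, seq)
    if len(mm) <= threshold:
        return True
    return longest_identical_run(mm) >= run_exempt


ref = "ACGT" * 5                      # 20-nucleotide reference window
edited = ref.replace("A", "G", 4)     # four A-to-G mismatches in a row
noisy = "CCGTATGTACATACGGACGT"        # four mismatches, each of a different type

print(keep_window(ref, edited))       # clustered identical mismatches are kept
print(keep_window(ref, noisy))        # scattered mixed mismatches are discarded
print(repeat_spans("ACGTAAAAACGT"))   # span of the run of five A's
```

Under these assumptions, a window with many mismatches of mixed type is treated as low-quality sequence, whereas a window with a run of identical mismatches is retained as a candidate editing cluster.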
PCT/IL2005/000286 2004-03-12 2005-03-13 Systematic mapping of adenosine to inosine editing sites in the human transcriptome WO2005087949A1 (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US55231104P 2004-03-12 2004-03-12
US60/552,311 2004-03-12
US58359104P 2004-06-30 2004-06-30
US60/583,591 2004-06-30
US63145804P 2004-11-30 2004-11-30
US60/631,458 2004-11-30

Publications (1)

Publication Number Publication Date
WO2005087949A1 (en) 2005-09-22

Family

ID=34963740

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IL2005/000286 WO2005087949A1 (en) 2004-03-12 2005-03-13 Systematic mapping of adenosine to inosine editing sites in the human transcriptome

Country Status (1)

Country Link
WO (1) WO2005087949A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008152146A1 (en) * 2007-06-13 2008-12-18 Biocortech Peripherical tissue sample containing cells expressing the 5htr2c and/or adars as markers of the alteration of the mechanism of the 5htr2c mrna editing and its applications
WO2021231679A1 (en) * 2020-05-15 2021-11-18 Korro Bio, Inc. Methods and compositions for the adar-mediated editing of gap junction protein beta 2 (gjb2)
US11649454B2 (en) 2016-06-22 2023-05-16 Proqr Therapeutics Ii B.V. Single-stranded RNA-editing oligonucleotides
US11781134B2 (en) 2014-12-17 2023-10-10 Proqr Therapeutics Ii B.V. Targeted RNA editing
US11851656B2 (en) 2016-09-01 2023-12-26 Proqr Therapeutics Ii B.V. Chemically modified single-stranded RNA-editing oligonucleotides

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002084496A1 (en) * 2001-04-16 2002-10-24 Sunncomm, Inc. Apparatus and method for authentication of computer-readable medium
WO2004011594A2 (en) * 2002-07-26 2004-02-05 Biocortech Novel method for analyzing nucleic acid and use thereof for evaluating the degree of mrna editing of the serotonin 5-ht2c receptor

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
BASS BRENDA L: "RNA editing by adenosine deaminases that act on RNA.", ANNUAL REVIEW OF BIOCHEMISTRY. 2002, vol. 71, 2002, pages 817 - 846, XP002336892, ISSN: 0066-4154 *
HOOPENGARDNER BARRY ET AL: "Nervous system targets of RNA editing identified by comparative genomics.", SCIENCE. 8 AUG 2003, vol. 301, no. 5634, 8 August 2003 (2003-08-08), pages 832 - 836, XP002336890, ISSN: 1095-9203 *
LEVANON EREZ Y ET AL: "Systematic identification of abundant A-to-I editing sites in the human transcriptome.", NATURE BIOTECHNOLOGY. AUG 2004, vol. 22, no. 8, August 2004 (2004-08-01), pages 1001 - 1005, XP002336893, ISSN: 1087-0156 *
MORSE D P ET AL: "Long RNA hairpins that contain inosine are present in Caenorhabditis elegans poly(A)+ RNA.", PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA. 25 MAY 1999, vol. 96, no. 11, 25 May 1999 (1999-05-25), pages 6048 - 6053, XP002336889, ISSN: 0027-8424 *
SEEBURG PETER H: "A-to-I editing: new and old sites, functions and speculations.", NEURON. 3 JUL 2002, vol. 35, no. 1, 3 July 2002 (2002-07-03), pages 17 - 20, XP002336888, ISSN: 0896-6273 *
SOREK ROTEM ET AL: "Alu-containing exons are alternatively spliced.", GENOME RESEARCH. JUL 2002, vol. 12, no. 7, July 2002 (2002-07-01), pages 1060 - 1067, XP002336891, ISSN: 1088-9051 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008152146A1 (en) * 2007-06-13 2008-12-18 Biocortech Peripherical tissue sample containing cells expressing the 5htr2c and/or adars as markers of the alteration of the mechanism of the 5htr2c mrna editing and its applications
EP3272881A1 (en) * 2007-06-13 2018-01-24 Alcediag Peripherical tissue sample containing cells expressing the 5htr2c and/or adars as markers of the alteration of the mechanism of the 5htr2c mrna editing and its applications
US11781134B2 (en) 2014-12-17 2023-10-10 Proqr Therapeutics Ii B.V. Targeted RNA editing
US11649454B2 (en) 2016-06-22 2023-05-16 Proqr Therapeutics Ii B.V. Single-stranded RNA-editing oligonucleotides
US11851656B2 (en) 2016-09-01 2023-12-26 Proqr Therapeutics Ii B.V. Chemically modified single-stranded RNA-editing oligonucleotides
WO2021231679A1 (en) * 2020-05-15 2021-11-18 Korro Bio, Inc. Methods and compositions for the adar-mediated editing of gap junction protein beta 2 (gjb2)

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

WWW Wipo information: withdrawn in national office

Country of ref document: DE

122 Ep: pct application non-entry in european phase